I recently came across a piece by Carl Brooks over at IT Tech News Daily that caught my eye, title was Cloud Storage Often Results in Data Loss. The piece has an effective title (good for search engine: SEO optimization) as it stood out from many others I saw on that particular day. What caught my eye on Carl’s piece is that it reads as if the facts based on a quick survey point to clouds resulting in data loss, as opposed to being an opinion that some cloud usage can result in data loss.
My opinion is that if not used properly including ignoring best practices, any form of data storage medium or media could result or be blamed for data loss. For some people they have lost data as a result of using cloud storage services just as other people have lost data or access to information on other storage mediums and solutions. For example, data has been lost on tape, Hard Disk Drives (HDDs), Solid State Devices (SSD), Hybrid HDDs (HHDD), RAID and non RAID, local and remote and even optical based storage systems large and small. In some cases, there have been errors or problems with the medium or media, in other cases storage systems have lost access to, or lost data due to hardware, firmware, software, or configuration including due to human error among other issues.
Technology failure: Not if, rather when and how to decrease impact
Any technology regardless of what it is or who it is from along with its architecture design and implementation can fail. It is not if, rather when and how gracefully along with what safeguards to decrease the impact, in addition to containing or isolating faults differentiates various products or solutions. How they automatically repair and self heal to keep running or support accessibility and maintain data integrity are important as is how those options are used. Granted a failure may not be technology related per say, rather something associated with human intervention, configuration, change management (or lack thereof) along with accidental or intentional activities.
Walking the talk
I have used public cloud storage services for several years including SaaS and AaaS as well as IaaS (See more XaaS here) and knock on wood, have not lost any data yet, loss of access sure, however not data being lost.
In the several years of using cloud based storage and services there has been some loss of access, however no loss of data. Those service disruptions or loss of access to data and services ranged from a few minutes to a little over an hour. In those scenarios, if I could not have waited for cloud storage to become accessible, I could have accessed a local copy if it were available.
Had a major disruption occurred where it would have been several days before I could gain access to that information, or if it were actually lost, I have a data insurance policy. That data insurance policy is part of my business continuance (BC) and disaster recovery (DR) strategy. My BC and DR strategy is a multi layered approach combining local, offline and offsite as along with online cloud data protection and archiving.
Assuming my cloud storage service could get data back to a given point (RPO) in a given amount of time (RTO), I have some options. One option is to wait for the service or information to become available again assuming a local copy is no longer valid or available. Another option is to start restoration from a master gold copy and then roll forward changes from the cloud services as that information becomes available. In other words, I am using cloud storage as another resource that is for both protecting what is local, as well as complimenting how I locally protect things.
Minimize or cut data loss or loss of access
Anything important should be protected locally and remotely meaning leveraging cloud and a master or gold backup copy.
To cut the cost of protecting information, I also leverage archives, which mean not all data gets protected the same. Important data is protected more often reducing RPO exposure and speed up RTO during restoration. Other data that is not as important is protected, however on a different frequency with other retention cycles, in other words, tiered data protection. By implementing tiered data protection, best practices, and various technologies including data footprint reduction (DFR) such as archive, compression, dedupe in addition to local disk to disk (D2D), disk to disk to cloud (D2D2C), along with routine copies to offline media (removable HDDs or RHDDs) that go offsite, Im able to stretch my data protection budget further. Not only is my data protection budget stretched further, I have more options to speed up RTO and better detail for recovery and enhanced RPOs.
If you are looking to avoid losing data, or loss of access, it is a simple equation in no particular order:
- Strategy and design
- Best practices and processes
- Various technologies
- Quality products
- Robust service delivery
- Configuration and implementation
- SLO and SLA management metrics
- People skill set and knowledge
- Usage guidelines or terms of service (ToS)
Unfortunately, clouds like other technologies or solutions get a bad reputation or blamed when something goes wrong. Sometimes it is the technology or service that fails, other times it is a combination of errors that resulted in loss of access or lost data. With clouds as has been the case with other storage mediums and systems in the past, when something goes wrong and if it has been hyped, chances are it will become a target for blame or finger pointing vs. determining what went wrong so that it does not occur again. For example cloud storage has been hyped as easy to use, dont worry, just put your data there, you can get out of the business of managing storage as the cloud will do that magically for you behind the scenes.
The reality is that while cloud storage solutions can offload functions, someone is still responsible for making decisions on its usage and configuration that impact availability. What separates various providers is their ability to design in best practices, isolate and contain faults quickly, have resiliency integrated as part of a solution along with various SLAs aligned to what the service level you are expecting in an easy to use manner.
Does that mean the more you pay the more reliable and resilient a solution should be?
No, not necessarily, as there can still be risks including how the solution is used.
Does that mean low cost or for free solutions have the most risk?
No, not necessarily as it comes down to how you use or design around those options. In other words, while cloud storage services remove or mask complexity, it still comes down to how you are going to use a given service.
Shared responsibility for cloud (and non cloud) storage data protection
Anything important enough that you cannot afford to lose, or have quick access to should be protected in different locations and on various mediums. In other words, balance your risk. Cloud storage service provider toned to take responsibility to meet service expectations for a given SLA and SLOs that you agree to pay for (unless free).
As the customer you have the responsibility of following best practices supplied by the service provider including reading the ToS. Part of the responsibility as a customer or consumer is to understand what are the ToS, SLA and SLOs for a given level of service that you are using. As a customer or consumer, this means doing your homework to be ready as a smart educated buyer or consumer of cloud storage services.
If you are a vendor or value added reseller (VAR), your opportunity is to help customers with the acquisition process to make informed decision. For VARs and solution providers, this can mean up selling customers to a higher level of service by making them aware of the risk and reward benefits as opposed to focus on cost. After all, if a order taker at McDonalds can ask Would you like to super size your order, why cant you as a vendor or solution provider also have a value oriented up sell message.
Taking action, what you should (or not) do
Don’t be scared of clouds, however do your homework, be ready, look before you leap and follow best practices. Look into the service level agreements (SLAs) associated with a given cloud storage product or service. Follow best practices about how you or someone else will protect what data is put into the cloud.
For critical data or information, consider having a copy of that data in the cloud as well as at or in another place, which could be in a different cloud or local or offsite and offline. Keep in mind the theme for critical information and data is not if, rather when so what can be done to decrease the risk or impact of something happening, in other words, be ready.
Data put into the cloud can be lost, or, loss of access to it can occur for some amount of time just as happens with using non cloud storage such as tape, disk or ssd. What impacts or minimizes your risk of using traditional local or remote as well as cloud storage are the best practices, how configured, protected, secured and managed. Another consideration is the type and quality of the storage product or cloud service can have a big impact. Sure, a quality product or service can fail; however, you can also design and configure to decrease those impacts.
Bottom line, do not be scared of cloud storage, however be ready, do your homework, review best practices, understand benefits and caveats, risk and reward. For those who want to learn more about cloud storage (public, private and hybrid) along with data protection, data management, data footprint reduction among other related topics and best practices, I happen to know of some good resources. Those resources in addition to the links provided above are titled Cloud and Virtual Data Storage Networking (CRC Press) that you can learn more about here as well as find at Amazon among other venues. Also, check out Enterprise Systems Backup and Recovery: A Corporate Insurance Policy by Preston De Guise (aka twitter @backupbear ) which is a great resource for protecting data.
[Cross-posted at StorageIO Blog]