RTO Archives

January 30, 2014November 26, 2023

Welcome to the Data Protection Diaries

Updated 1/10/2018

Storage I/O trends

Welcome to the Data Protection Diaries

This is a series of posts about data protection which includes security (logical and physical), backup/restore, business continuance (BC), disaster recovery (DR), business resiliency (BR) along with high availability (HA), archiving and related topic themes, technologies and trends.

Think of data protection like protect, preserve and serve information across cloud, virtual and physical environments spanning traditional servers, storage I/O networking along with mobile (ok, some IoT as well), SOHO/SMB to enterprise.

Getting started, taking a step back

Recently I have done a series of webinars and Google+ hangouts as part of the BackupU initiative brought to you by Dell Software (that’s a disclosure btw ;) ) that are vendor and technology neutral. Instead of the usual vendor product or technology focused seminars and events, these are about getting back to the roots, the fundamentals of what to protect when and why, then decide your options as well as different approaches (e.g. what tools to use when).

In addition over the past year (ok, years) I have also been doing other data protection related events, seminars, workshops, articles, tips, posts across cloud, virtual and physical from SOHO/SMB to enterprise. These are in addition to the other data infrastructure server and storage I/O stuff (e.g. SSD, object storage, software defined, big data, little data, buzzword bingo and others).

Keep in mind that in the data center or information factory everything is not the same as there are different applications, threat risk scenarios, availability and durability among other considerations. In this series like the cloud conversations among others, I’m going to be pulling various data protection themes together hopefully to make it easier for others to find, as well as where I know where to get them.

data protection diaries
Some notes for an upcoming post in this series using my Livescribe about data protection

Data protection topics, trends, technologies and related themes

Part 1 – Data Infrastructure Data Protection Fundamentals
Part 2 – Reliability, Availability, Serviceability ( RAS) Data Protection Fundamentals
Part 3 – Data Protection Access Availability RAID Erasure Codes ( EC) including LRC
Part 4 – Data Protection Recovery Points (Archive, Backup, Snapshots, Versions)
Part 5 – Point In Time Data Protection Granularity Points of Interest
Part 6 – Data Protection Security Logical Physical Software Defined
Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends
Part 8 – Data Protection Diaries Walking Data Protection Talk
Part 9 – who’s Doing What ( Toolbox Technology Tools)
Part 10 – Data Protection Resources Where to Learn More
Data Protection Diaries series
Data Infrastructure server storage I/O network Recommended Reading List Book Shelf
Software Defined Data Infrastructure Essentials (CRC 2017) Book

Here are some more posts to checkout pertaining to data protection trends, technologies and perspectives:

Iron Mountain: Information Lifecycle Management Strategy: Which Data Types Have Value?
StorageIOblog: S3motion Buckets Containers Objects AWS S3 Cloud and EMCcode
BackupU resources (Sponsored by Dell): Various Application, Virtualization and Related Data Protection links and resources (vendor and product neutral)
BackupU resources (Sponsored by Dell): Various Application, Virtualization and Related Data Protection webinars and G+ chats (vendor and product neutral)
StorageIOblog: AWS S3 Cross Region Replication storage enhancements
StorageIOblog: March 2015 Server StorageIO Update Newsletter Focus on Data Protection
StorageIOblog: Are your restores ready for World Backup Day 2015?
StorageIOblog: Data Protection Gumbo = Protect Preserve and Serve Information
CyberTrend: Comments on Software Defined Data Center and Virtualization
Processor: Comments on Enterprise Backup Solution Buying Tips
Processor: Comments on Considerations For Failed & Old Drives
Processor: Comments on Quickly Detect & Avoid Hard Drive Failures
Processor: Comments on What Resilient & Highly Available Mean
Processor: Comments on What Abandoned Data Is Costing Your Company
Processor: Comments on Match Application Needs & Infrastructure Capabilities
CyberTrend: Comments on Enterprise Backup
Processor: Comments on Buying Tips: Enterprise Backup Solutions
Processor: Comments on Re-evaluate Server Security
5 Tips for Factoring Software into Disaster Recovery Plans
Remote office backup, archiving and disaster recovery for networking pros
Securing your information assets and data, what about your storage?
Bridging the gap: Choosing storage-over-distance network technology
Wide area network resiliency best practices
3 Questions to Help SMBs Plan a Backup Strategy
Until the focus expands to data protection, backup is staying alive!
Part II Until the focus expands to data protection – What to do about it
Part III Until the focus expands to data protection – Taking action
Data Protection Modernization, More than swapping out media
Data Protection Diaries – My data protection needs and wants
Only you can prevent cloud data loss
Cloud storage: Dont be scared, however look before you leap
Cloud conversations: Has Nirvanix shutdown caused cloud confidence concerns?
Virtual, Cloud and IT Availability, its a shared responsibility and common sense
Cloud conversations: Loss of data access vs. data loss
backup, restore, BC, DR and archiving
Presentation/Webinar – Hybrid Clouds: Bridging the Gap Between Public and Private Environments
Presentation/Webinar – Modernizing Data Protection For Cloud, Virtual and Physical Environments
Missing MH370 should remind us, do you know where your digital assets are?
Rings Of Security For Protection Or Bling For Appearance?
Securing your information assets and data, what about your storage?
My copies were corrupted: The 3-2-1 rule
Data Protection Diaries: March 31 World Backup Day is Restore Data Test Time

Ok, nuff said (for now)

Cheers
Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

January 29, 2013December 29, 2025

In the data center or information factory, not everything is the same

StorageIO Industry trends and perspectives image

Sometimes what should be understood, or that is common sense or that you think everybody should know needs to be stated. After all, there could be somebody who does not know what some assume as common sense or what others know for various reasons. At times, there is simply the need to restate or have a reminder of what should be known.

Consequently, in the data center or information factory, either traditional, virtual, converged, private, hybrid or public cloud, everything is not the same. When I say not everything is the same, is that different applications with various service level objectives (SLO’s) and service level agreements (SLA’s). These are based on different characteristics from performance, availability, reliability, responsiveness, cost, security, privacy among others. Likewise, there are different size and types of organizations with various requirements from enterprise to SMB, ROBO and SOHO, business or government, education or research.

There are also different threat risks for various applications or information services within in an organization, or across different industry sectors. Thus various needs for meeting availability SLA’s, recovery time objectives (RTO’s) and recovery point objectives (RPO’s) for data protection ranging from backup/restore, to high-availability (HA), business continuance (BC), disaster recovery (DR) and archiving. Let us not forget about logical and physical security of information, assets and people, processes and intellectual property.

Some data centers or information factories are compute intensive while others are data centric, some are IO or activity intensive with a mix of compute and storage. On the other hand, some data centers such as a communications hub may be network centric with very little data sticking or being stored.

Even within in a data center or information factory, various applications will have different profiles, protection requirements for big data and little data. There can also be a mix of old legacy applications and new systems developed in-house, purchased, open-source based or accessed as a service. The servers and storage may be software defined (a new buzzword that has already jumped the shark), virtualized or operated in a private, hybrid or community cloud if not using a public service.

Here are some related posts tied to everything is not the same:
Optimize Data Storage for Performance and Capacity
Is SSD only for performance?
Cloud conversations: Gaining cloud confidence from insights into AWS outages
Data Center Infrastructure Management (DCIM) and IRM
Saving Money with Green IT: Time To Invest In Information Factories
Everything Is Not Equal in the Datacenter, Part 1
Everything Is Not Equal in the Datacenter, Part 2
Everything Is Not Equal in the Datacenter, Part 3

Thus, not all things are the same in the data center, or information factories, both those under traditional management paradigms, as well as those supporting public, private, hybrid or community clouds.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

January 7, 2013December 29, 2025

Cloud conversations: Gaining cloud confidence from insights into AWS outages (Part II)

StorageIO industry trends cloud, virtualization and big data

This is the second in a two-part industry trends and perspective looking at learning from cloud incidents, view part I here.

There is good information, insight and lessons to be learned from cloud outages and other incidents.

Sorry cynics no that does not mean an end to clouds, as they are here to stay. However when and where to use them, along with what best practices, how to be ready and configure for use are part of the discussion. This means that clouds may not be for everybody or all applications, or at least today. For those who are into clouds for the long haul (either all in or partially) including current skeptics, there are many lessons to be learned and leveraged.

In order to gain confidence in clouds, some questions that I routinely am asked include are clouds more or less reliable than what you are doing? Depends on what you are doing, and how you will be using the cloud services. If you are applying HA and other BC or resiliency best practices, you may be able to configure and isolate from the more common situations. On the other hand, if you are simply using the cloud services as a low-cost alternative selecting the lowest price and service class (SLAs and SLOs), you might get what you paid for. Thus, clouds are a shared responsibility, the service provider has things they need to do, and the user or person designing how the service will be used have some decisions making responsibilities.

Keep in mind that high availability (HA), resiliency, business continuance (BC) along with disaster recovery (DR) are the sum of several pieces. This includes people, best practices, processes including change management, good design eliminating points of failure and isolating or containing faults, along with how the components or technology used (e.g. hardware, software, networks, services, tools). Good technology used in goods ways can be part of a highly resilient flexible and scalable data infrastructure. Good technology used in the wrong ways may not leverage the solutions to their full potential.

While it is easy to focus on the physical technologies (servers, storage, networks, software, facilities), many of the cloud services incidents or outages have involved people, process and best practices so those need to be considered.

These incidents or outages bring awareness, a level set, that this is still early in the cloud evolution lifecycle and to move beyond seeing clouds as just a way to cut cost, and seeing the importance and value HA, resiliency, BC and DR. This means learning from mistakes, taking action to correct or fix errors, find and cut points of failure are part of a technology maturing or the use of it. These all tie into having services with service level agreements (SLAs) with service level objectives (SLOs) for availability, reliability, durability, accessibility, performance and security among others to protect against mayhem or other things that can and do happen.

Images licensed for use by StorageIO via
Atomazul / Shutterstock.com

The reason I mentioned earlier that AWS had another incident is that like their peers or competitors who have incidents in the past, AWS appears to be going through some growing, maturing, evolution related activities. During summer 2012 there was an AWS incident that affected Netflix (read more here: AWS and the Netflix Fix?). It should also be noted that there were earlier AWS outages where Netflix (read about Netflix architecture here) leveraged resiliency designs to try and prevent mayhem when others were impacted.

Is AWS a lightning rod for things to happen, a point of attraction for Mayhem and others?

Granted given their size, scope of services and how being used on a global basis AWS is blazing new territory and experiences, similar to what other information services delivery platforms did in the past. What I mean is that while taken for granted today, open systems Unix, Linux, Windows-based along with client-server, midrange or distributed systems, not to mention mainframe hardware, software, networks, processes, procedures, best practices all went through growing pains.

There are a couple of interesting threads going on over in various LinkedIn Groups based on some reporters stories including on speculation of what happened, followed with some good discussions of what actually happened and how to prevent recurrence of them in the future.

Over in the Cloud Computing, SaaS & Virtualization group forum, this thread is based on a Forbes article (Amazon AWS Takes Down Netflix on Christmas Eve) and involves conversations about SLAs, best practices, HA and related themes. Have a look at the story the thread is based on and some of the assertions being made, and ensuing discussions.

Also over at LinkedIn, in the Cloud Hosting & Service Providers group forum, this thread is based on a story titled Why Netflix’ Christmas Eve Crash Was Its Own Fault with a good discussion on clouds, HA, BC, DR, resiliency and related themes.

Over at the Virtualization Practice, there is a piece titled Is Amazon Ruining Public Cloud Computing? with comments from me and Adrian Cockcroft (@Adrianco) a Netflix Architect (you can read his blog here). You can also view some presentations about the Netflix architecture here.

What this all means

Saying you get what you pay for would be too easy and perhaps not applicable.

There are good services free, or low-cost, just like good free content and other things, however vice versa, just because something costs more, does not make it better.

Otoh, there are services that charge a premium however may have no better if not worse reliability, same with content for fee or perceived value that is no better than what you get free.

Additional related material

Some closing thoughts:

Clouds are real and can be used safely; however, they are a shared responsibility.
Only you can prevent cloud data loss, which means do your homework, be ready.
If something can go wrong, it probably will, particularly if humans are involved.
Prepare for the unexpected and clarify assumptions vs. realities of service capabilities.
Leverage fault isolation and containment to prevent rolling or spreading disasters.
Look at cloud services beyond lowest cost or for cost avoidance.
What is your organizations culture for learning from mistakes vs. fixing blame?
Ask yourself if you, your applications and organization are ready for clouds.
Ask your cloud providers if they are ready for you and your applications.
Identify what your cloud concerns are to decide what can be done about them.
Do a proof of concept to decide what types of clouds and services are best for you.

Do not be scared of clouds, however be ready, do your homework, learn from the mistakes, misfortune and errors of others. Establish and leverage known best practices while creating new ones. Look at the past for guidance to the future, however avoid clinging to, and bringing the baggage of the past to the future. Use new technologies, tools and techniques in new ways vs. using them in old ways.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

January 7, 2013December 29, 2025

Cloud conversations: Gaining cloud confidence from insights into AWS outages

StorageIO industry trends cloud, virtualization and big data

This is the first of a two-part industry trends and perspectives series looking at how to learn from cloud outages (read part II here).

In case you missed it, there were some public cloud outages during the recent Christmas 2012-holiday season. One incident involved Microsoft Xbox (view the Microsoft Azure status dashboard here) users were impacted, and the other was another Amazon Web Services (AWS) incident. Microsoft and AWS are not alone, most if not all cloud services have had some type of incident and have gone on to improve from those outages. Google has had issues with different applications and services including some in December 2012 along with a Gmail incident that received covered back in 2011.

For those interested, here is a link to the AWS status dashboard and a link to the AWS December 24 2012 incident postmortem. In the case of the recent AWS incident which affected users such as Netflix, the incident (read the AWS postmortem and Netflix postmortem) was tied to a human error. This is not to say AWS has more outages or incidents vs. others including Microsoft, it just seems that we hear more about AWS when things happen compared to others. That could be due to AWS size and arguably market leading status, diversity of services and scale at which some of their clients are using them.

Btw, if you were not aware, Microsoft Azure is more than just about supporting SQLserver, Exchange, SharePoint or Office, it is also an IaaS layer for running virtual machines such as Hyper-V, as well as a storage target for storing data. You can use Microsoft Azure storage services as a target for backing up or archiving or as general storage, similar to using AWS S3 or Rackspace Cloud files or other services. Some backup and archiving AaaS and SaaS providers including Evault partner with Microsoft Azure as a storage repository target.

When reading some of the coverage of these recent cloud incidents, I am not sure if I am more amazed by some of the marketing cloud washing, or the cloud bashing and uniformed reporting or lack of research and insight. Then again, if someone repeats a myth often enough for others to hear and repeat, as it gets amplified, the myth may assume status of reality. After all, you may know the expression that if it is on the internet then it must be true?

Images licensed for use by StorageIO via
Atomazul / Shutterstock.com

Have AWS and public cloud services become a lightning rod for when things go wrong?

Here is some coverage of various cloud incidents:

Huffington post coverage of February 2011 Google Gmail incident
Microsoft Azure coverage by Allthingsd.com
Neowin.net covering Microsoft Xbox incident
Google’s Gmail blog coverage of Gmail outage
Forbes article Amazon AWS Takes Down Netflix on Christmas Eve
Over at Performance Critical Apps they assert the AWS incident was Netflix fault
From The Virtualization Practice: Amazon Ruining Public Cloud Computing?
Here is Netflix architect Adrian Cockcroft discussing the recent incident
From StorageIOblog Amazon Web Services (AWS) and the Netflix Fix?
From CRN, here are some cloud service availability status via Nasuni

The above are a small sampling of different stories, articles, columns, blogs, perspectives about cloud services outages or other incidents. Assuming the services are available, you can Google or Bing many others along with reading postmortems to gain insight into what happened, the cause, effect and how to prevent in the future.

Do these recent incidents show a trend of increased cloud outages? Alternatively, do they say that the cloud services are being used more and on a larger basis, thus the impacts become more known?

Perhaps it is a mix of the above, and like when a magnetic storage tape gets lost or stolen, it makes for good news or copy, something to write about. Granted there are fewer tapes actually lost than in the past, and far fewer vs. lost or stolen laptops and other devices with data on them. There are probably other reasons such as the lightning rod effect given how much industry hype around clouds that when something does happen, the cynics or foes come out in force, sometimes with FUD.

Similar to traditional hardware or software based product vendors, some service providers have even tried to convince me that they have never had an incident, lost or corrupted or compromised any data, yeah, right. Candidly, I put more credibility and confidence in a vendor or solution provider who tells me that they have had incidents and taken steps to prevent them from recurring. Granted those steps might be made public while others might be under NDA, at least they are learning and implementing improvements.

As part of gaining insights, here are some links to AWS, Google, Microsoft Azure and other service status dashboards where you can view current and past situations.

AWS service status dashboard
Bluehost server status dashboard
Google App status dashboard
HP cloud service status console (requires login)
Microsoft Azure service status dashboard
Microsoft Xbox service status dashboard
Rackspace service status dashboards

What is your take on IT clouds? Click here to cast your vote and see what others are thinking about clouds.

Ok, nuff said for now (check out part II here )

Disclosure: I am a customer of AWS for EC2, EBS, S3 and Glacier as well as a customer of Bluehost for hosting and Rackspace for backups. Other than Amazon being a seller of my books (and my blog via Kindle) along with running ads on my sites and being an Amazon Associates member (Google also has ads), none of those mentioned are or have been StorageIO clients.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

July 26, 2012December 29, 2025

Data protection modernization, more than swapping out media

backup, restore, BC, DR and archiving

Have you modernized your data protection strategy and environment?

If not, are you thinking about updating your strategy and environment?

Why modernize your data protection including backup restore, business continuance (BC), high availability (HA) and disaster recovery (DR) strategy and environment?

backup, restore, BC, DR and archiving

Is it to leverage new technology such as disk to disk (D2D) backups, cloud, virtualization, data footprint reduction (DFR) including compression or dedupe?

Perhaps you have or are considering data protection modernization because somebody told you to or you read about it or watched a video or web cast? Or, perhaps your backup and restore are broke so its time to change media or try something different.

Lets take a step back for a moment and ask the question of what is your view of data protection modernization?

Perhaps it is modernizing backup by replacing tape with disk, or disk with clouds?

Maybe it is leveraging data footprint reduction (DFR) techniques including compression and dedupe?

How about instead of swapping out media, changing backup software?

Or what about virtualizing servers moving from physical machines to virtual machines?

On the other hand maybe your view of modernizing data protection is around using a different product ranging from backup software to a data protection appliance, or snapshots and replication.

The above and others certainly fall under the broad group of backup, restore, BC, DR and archiving, however there is another area which is not as much technology as it is techniques, best practices, processes and procedure based. That is, revisit why data and applications are being protected against what applicable threat risks and associated business risks.

backup, restore, BC, DR and archiving

This means reviewing service needs and wants including backup, restore, BC, DR and archiving that in turn drive what data and applications to protect, how often, how many copies and where those are located, along with how long they will be retained.

backup, restore, BC, DR and archiving

Modernizing data protection is more than simply swapping out old or broken media like flat tires on a vehicle.

To be effective, data protection modernization involves taking a step back from the technology, tools and buzzword bingo topics to review what is being protected and why. It also means revisiting service level expectations and clarify wants vs. needs which translates to what if for free that is what is wanted, however for a cost then what is required.

backup, restore, BC, DR and archiving

Certainly technologies and tools play a role, however simply using new tools and techniques without revisiting data protection challenges at the source will result in new problems that resemble old problems.

backup, restore, BC, DR and archiving

Hence to support growth with a constrained or shrinking budget while maintaining or enhancing service levels, the trick is to remove complexity and costs.

backup, restore, BC, DR and archiving

This means not treating all data and applications the same, stretch your available resources to be more effective without compromise on service is mantra of modernizing data protection.

Ok, nuff said for now, plenty more to discuss later.

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

October 18, 2011December 29, 2025

The blame game: Does cloud storage result in data loss?

I recently came across a piece by Carl Brooks over at IT Tech News Daily that caught my eye, title was Cloud Storage Often Results in Data Loss. The piece has an effective title (good for search engine: SEO optimization) as it stood out from many others I saw on that particular day.

What caught my eye on Carls piece is that it reads as if the facts based on a quick survey point to clouds resulting in data loss, as opposed to being an opinion that some cloud usage can result in data loss.

My opinion is that if not used properly including ignoring best practices, any form of data storage medium or media could result or be blamed for data loss. For some people they have lost data as a result of using cloud storage services just as other people have lost data or access to information on other storage mediums and solutions. For example, data has been lost on tape, Hard Disk Drives (HDDs), Solid State Devices (SSD), Hybrid HDDs (HHDD), RAID and non RAID, local and remote and even optical based storage systems large and small. In some cases, there have been errors or problems with the medium or media, in other cases storage systems have lost access to, or lost data due to hardware, firmware, software, or configuration including due to human error among other issues.

Technology failure: Not if, rather when and how to decrease impact
Any technology regardless of what it is or who it is from along with its architecture design and implementation can fail. It is not if, rather when and how gracefully along with what safeguards to decrease the impact, in addition to containing or isolating faults differentiates various products or solutions. How they automatically repair and self heal to keep running or support accessibility and maintain data integrity are important as is how those options are used. Granted a failure may not be technology related per say, rather something associated with human intervention, configuration, change management (or lack thereof) along with accidental or intentional activities.

Walking the talk
I have used public cloud storage services for several years including SaaS and AaaS as well as IaaS (See more XaaS here) and knock on wood, have not lost any data yet, loss of access sure, however not data being lost.

I follow my advice and best practices when selecting cloud providers looking for good value, service level agreements (SLAs) and service level objectives (SLOs) over low cost or for free services.

In the several years of using cloud based storage and services there has been some loss of access, however no loss of data. Those service disruptions or loss of access to data and services ranged from a few minutes to a little over an hour. In those scenarios, if I could not have waited for cloud storage to become accessible, I could have accessed a local copy if it were available.

Had a major disruption occurred where it would have been several days before I could gain access to that information, or if it were actually lost, I have a data insurance policy. That data insurance policy is part of my business continuance (BC) and disaster recovery (DR) strategy. My BC and DR strategy is a multi layered approach combining local, offline and offsite as along with online cloud data protection and archiving.

Assuming my cloud storage service could get data back to a given point (RPO) in a given amount of time (RTO), I have some options. One option is to wait for the service or information to become available again assuming a local copy is no longer valid or available. Another option is to start restoration from a master gold copy and then roll forward changes from the cloud services as that information becomes available. In other words, I am using cloud storage as another resource that is for both protecting what is local, as well as complimenting how I locally protect things.

Minimize or cut data loss or loss of access
Anything important should be protected locally and remotely meaning leveraging cloud and a master or gold backup copy.

To cut the cost of protecting information, I also leverage archives, which mean not all data gets protected the same. Important data is protected more often reducing RPO exposure and speed up RTO during restoration. Other data that is not as important is protected, however on a different frequency with other retention cycles, in other words, tiered data protection. By implementing tiered data protection, best practices, and various technologies including data footprint reduction (DFR) such as archive, compression, dedupe in addition to local disk to disk (D2D), disk to disk to cloud (D2D2C), along with routine copies to offline media (removable HDDs or RHDDs) that go offsite, Im able to stretch my data protection budget further. Not only is my data protection budget stretched further, I have more options to speed up RTO and better detail for recovery and enhanced RPOs.

If you are looking to avoid losing data, or loss of access, it is a simple equation in no particular order:

Strategy and design
Best practices and processes
Various technologies
Quality products
Robust service delivery
Configuration and implementation
SLO and SLA management metrics
People skill set and knowledge
Usage guidelines or terms of service (ToS)

Unfortunately, clouds like other technologies or solutions get a bad reputation or blamed when something goes wrong. Sometimes it is the technology or service that fails, other times it is a combination of errors that resulted in loss of access or lost data. With clouds as has been the case with other storage mediums and systems in the past, when something goes wrong and if it has been hyped, chances are it will become a target for blame or finger pointing vs. determining what went wrong so that it does not occur again. For example cloud storage has been hyped as easy to use, don’t worry, just put your data there, you can get out of the business of managing storage as the cloud will do that magically for you behind the scenes.

The reality is that while cloud storage solutions can offload functions, someone is still responsible for making decisions on its usage and configuration that impact availability. What separates various providers is their ability to design in best practices, isolate and contain faults quickly, have resiliency integrated as part of a solution along with various SLAs aligned to what the service level you are expecting in an easy to use manner.

Does that mean the more you pay the more reliable and resilient a solution should be?
No, not necessarily, as there can still be risks including how the solution is used.

Does that mean low cost or for free solutions have the most risk?
No, not necessarily as it comes down to how you use or design around those options. In other words, while cloud storage services remove or mask complexity, it still comes down to how you are going to use a given service.

Shared responsibility for cloud (and non cloud) storage data protection
Anything important enough that you cannot afford to lose, or have quick access to should be protected in different locations and on various mediums. In other words, balance your risk. Cloud storage service provider toned to take responsibility to meet service expectations for a given SLA and SLOs that you agree to pay for (unless free).

As the customer you have the responsibility of following best practices supplied by the service provider including reading the ToS. Part of the responsibility as a customer or consumer is to understand what are the ToS, SLA and SLOs for a given level of service that you are using. As a customer or consumer, this means doing your homework to be ready as a smart educated buyer or consumer of cloud storage services.

If you are a vendor or value added reseller (VAR), your opportunity is to help customers with the acquisition process to make informed decision. For VARs and solution providers, this can mean up selling customers to a higher level of service by making them aware of the risk and reward benefits as opposed to focus on cost. After all, if a order taker at McDonalds can ask Would you like to super size your order, why cant you as a vendor or solution provider also have a value oriented up sell message.

Additional related links to read more and sources of information:

Choosing the Right Local/Cloud Hybrid Backup for SMBs
E2E Awareness and insight for IT environments
Poll: What Do You Think of IT Clouds?
Convergence: People, Processes, Policies and Products
What do VARs and Clouds as well as MSPs have in common?
Industry adoption vs. industry deployment, is there a difference?
Cloud conversations: Loss of data access vs. data loss
Clouds and Data Loss: Time for CDP (Commonsense Data Protection)?
Clouds are like Electricity: Dont be scared
Wit and wisdom for BC and DR
Criteria for choosing the right business continuity or disaster recovery consultant
Local and Cloud Hybrid Backup for SMBs
Is cloud disaster recovery appropriate for SMBs?
Laptop data protection: A major headache with many cures
Disaster recovery in the cloud explained
Backup in the cloud: Large enterprises wary, others climbing on board
Cloud and Virtual Data Storage Networking (CRC Press, 2011)
Enterprise Systems Backup and Recovery: A Corporate Insurance Policy

Poll: Who is responsible for cloud storage data loss?

Taking action, what you should (or not) do
Dont be scared of clouds, however do your homework, be ready, look before you leap and follow best practices. Look into the service level agreements (SLAs) associated with a given cloud storage product or service. Follow best practices about how you or someone else will protect what data is put into the cloud.

For critical data or information, consider having a copy of that data in the cloud as well as at or in another place, which could be in a different cloud or local or offsite and offline. Keep in mind the theme for critical information and data is not if, rather when so what can be done to decrease the risk or impact of something happening, in other words, be ready.

Data put into the cloud can be lost, or, loss of access to it can occur for some amount of time just as happens with using non cloud storage such as tape, disk or ssd. What impacts or minimizes your risk of using traditional local or remote as well as cloud storage are the best practices, how configured, protected, secured and managed. Another consideration is the type and quality of the storage product or cloud service can have a big impact. Sure, a quality product or service can fail; however, you can also design and configure to decrease those impacts.

Wrap up
Bottom line, do not be scared of cloud storage, however be ready, do your homework, review best practices, understand benefits and caveats, risk and reward. For those who want to learn more about cloud storage (public, private and hybrid) along with data protection, data management, data footprint reduction among other related topics and best practices, I happen to know of some good resources. Those resources in addition to the links provided above are titled Cloud and Virtual Data Storage Networking (CRC Press) that you can learn more about here as well as find at Amazon among other venues. Also, check out Enterprise Systems Backup and Recovery: A Corporate Insurance Policy by Preston De Guise (aka twitter @backupbear ) which is a great resource for protecting data.

Ok, nuff said for now

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

October 16, 2010April 27, 2025

What do you do when your service provider drops the ball

Do you have a web, internet, backup or other IT cloud service provider of some type?

Do you pay for it, or is it a free service?

Do you take your service provider for granted?

Does your service provider take you or your data for granted?

Does your provider offer some form of service level objectives (SLO)?

For example, Recovery Time Objectives (RTO), Recovery Point Objectives (RPO), Quality of Service (QOS) or if a backup service alternate forms of recovery among others?

So what happens when there is a service disruption, do you threaten to leave the provider and if so, how much does that (or would it) cost you to move?

A couple of weeks ago I was using on a Delta airlines flight from LAX to MSP returning from a west coast speaking engagement event.

During the late evening three hour flight, I was using the gogo inflight wifi service to get caught up on some emails, blog items along with other work items in addition to doing a few twitter tweets while flying high over the real clouds from my virtual office.

During that time, I saw a twitter tweet from Devang Panchigar (@storageNerve) commenting that his hosting service provider Bluehost was down or offline. This caught my attention as Bluehost is also my service provider and a quick check verified that my sites and services were still working. I subsequently sent a tweet to Devang indicating that Bluehost or at least from looking at my sites and services were still functioning, or at least for the time being as I was about to find out. Long story short, about 20 to 25 minutes later, I noticed that I could not longer get to any of my sites, low and behold my Bluehost services were also now offline.

Overall, I have been pleased with Bluehost as a service provider including finding their call support staff very accommodating and easy to work with when I have questions or need something taken care of. Normally I would have simply called Bluehost to see what was going on, however being at about 38,000 feet above the clouds, a quick conversation was not going to be possible. Instead, I checked some forums that revealed Bluehost was experiencing some electrical power issues with their data center (I believe in Utah). Looking at some of the forums as well as various twitter comments, I also decided to check to see if Bluehost CEO Matt Heaton blog was functioning (it was).

It would have been too easy to do one of those irate customer type posts telling them how bad they were, how I was dropping them like a hot potato and then doing a blog post telling everyone to never use them again or along those lines that are far to common and often get deleted as spam.

Instead, I took a different approach (you could have read it here however I just checked and it has been deleted). My comment on Matts blog post took a week or so to be moderated (now since deleted). Essentially my post took the opposite approach of going off on the usual customer tirade instead commenting how ironic that a hosting service for my web site which contains content information about resilient data infrastructure themes was offline.

Now I realize that I am not paying for a high end no downtime always available hosting service, however I also realize that I am paying for a more premium package vs. a basic subscription or even a for free service. While I was not happy about the one hour of downtime around midnight, it was comforting to know that no data was lost and my sites were only offline for a short period of time.

What does all of this mean?

There have been some widely publicized and discussed internet and cloud service related disruptions.

I hope Bluehost continues to improve on their services to stay out of the news for a major disruption as well as minimize or eliminate downtime for their for fee based services.

I also hope that Bluehost CEO Matt Heaton continues to listen to what his customers have to say while improving his services to keep us as customers instead of taking us for granted as some providers or vendors do.

Thanks again to Devang for the tip that there was a service disruption, after all, sometimes we take services for granted and in other situations some service providers take their customers for granted.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

Welcome to the Data Protection Diaries

Getting started, taking a step back

Data protection topics, trends, technologies and related themes

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this: