This is the first of a two-part industry trends and perspectives series looking at how to learn from cloud outages (read part II here).

In case you missed it, there were some public cloud outages during the 2012 Christmas holiday season. One incident affected Microsoft Xbox users (view the Microsoft Azure status dashboard here), and the other was another Amazon Web Services (AWS) incident. Microsoft and AWS are not alone; most if not all cloud services have had some type of incident and have gone on to improve from those outages. Google has had issues with different applications and services, including some in December 2012 along with a Gmail incident that received coverage back in 2011.

For those interested, here is a link to the AWS status dashboard and a link to the AWS December 24, 2012 incident postmortem. The recent AWS incident, which affected users such as Netflix, was tied to human error (read the AWS postmortem and Netflix postmortem). This is not to say AWS has more outages or incidents than others, including Microsoft; it just seems that we hear more about AWS when things happen. That could be due to AWS's size and arguably market-leading status, the diversity of its services, and the scale at which some of its clients use them.

Btw, if you were not aware, Microsoft Azure is about more than just supporting SQL Server, Exchange, SharePoint or Office. It is also an IaaS layer for running virtual machines (e.g. Hyper-V based), as well as a storage target for storing data. You can use Microsoft Azure storage services as a target for backing up or archiving or as general storage, similar to using AWS S3, Rackspace Cloud Files or other services. Some backup and archiving AaaS and SaaS providers, including EVault, partner with Microsoft Azure as a storage repository target.

When reading some of the coverage of these recent cloud incidents, I am not sure which amazes me more: the marketing cloud washing, or the cloud bashing, uninformed reporting and lack of research and insight. Then again, if a myth gets repeated often enough for others to hear and repeat, it gets amplified and may assume the status of reality. After all, you may know the expression that if it is on the internet then it must be true?

Images licensed for use by StorageIO via Atomazul / Shutterstock.com

Have AWS and public cloud services become a lightning rod for when things go wrong?

Here is some coverage of various cloud incidents:

The above are a small sampling of different stories, articles, columns, blogs and perspectives about cloud service outages and other incidents. Assuming the services are available, you can Google or Bing many others, along with reading postmortems to gain insight into what happened, the cause, the effect and how to prevent a recurrence.

Do these recent incidents show a trend of increased cloud outages? Alternatively, do they show that cloud services are being used more and at a larger scale, so the impacts become more widely known?

Perhaps it is a mix of both. Much like when a magnetic storage tape gets lost or stolen, a cloud outage makes for good news or copy, something to write about. Granted, fewer tapes are actually lost than in the past, and far fewer than the number of lost or stolen laptops and other devices with data on them. There are probably other reasons, such as the lightning rod effect: given how much industry hype surrounds clouds, when something does happen the cynics or foes come out in force, sometimes with FUD.

Similar to traditional hardware or software product vendors, some service providers have even tried to convince me that they have never had an incident, or lost, corrupted or compromised any data. Yeah, right. Candidly, I put more credibility and confidence in a vendor or solution provider who tells me that they have had incidents and taken steps to prevent them from recurring. Granted, some of those steps might be made public while others might be under NDA, but at least they are learning and implementing improvements.

As part of gaining insights, here are some links to AWS, Google, Microsoft Azure and other service status dashboards where you can view current and past situations.

What is your take on IT clouds? Click here to cast your vote and see what others are thinking about clouds.

Ok, nuff said for now (check out part II here)

Disclosure: I am a customer of AWS for EC2, EBS, S3 and Glacier as well as a customer of Bluehost for hosting and Rackspace for backups. Other than Amazon being a seller of my books (and my blog via Kindle) along with running ads on my sites and being an Amazon Associates member (Google also has ads), none of those mentioned are or have been StorageIO clients.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved
