Until the focus expands to data protection, backup is staying alive!
This is the first of a three-part series discussing how and why vendors are keeping backup alive; read part two here.
Some vendors, Value Added Resellers (VARs), pundits (consultants, analysts, media, bloggers) and their followers not only want backup declared dead, they also want to attend (or send flowers to) the wake and funeral, not to mention see proof of burial, so to speak.
Yet many of these same vendors, VARs and their pundits are also helping or causing backup to stay alive.
Sure, there is plenty of discussion, including industry adoption and customer deployment, around modernizing backup and data protection that also ties into disaster recovery (DR), business continuance (BC), high availability (HA) and business resiliency (BR).
On the other hand, the usual themes revolve around product or technology deployment to modernize backup by simply swapping out hardware (e.g. disk for tape, cloud for disk), applying data footprint reduction (DFR) including archiving, compression and dedupe, or, in another common scenario, switching from one vendor's tool to another's.
How are vendors helping backup stay alive?
One of the routine things I hear from vendors among others is that backup needs to move out of the 70's, 80's or 90's, the era when the John Travolta movie Saturday Night Fever and the Bee Gees song "Stayin' Alive" appeared, and into the current era (click here to hear the song via Amazon).
Some vendors keep talking about and using the term backup instead of expanding the conversation to data protection, which includes backup/restore, business continuance (BC) and disaster recovery (DR) along with archiving and security. Now let's be clear: we cannot expect something like backup to be removed from the vocabulary overnight as it has been around for decades, hence it will take time.
IMHO: The biggest barrier to moving away from backup is the industry, including vendors, their pundits, press/media, VARs and customers, who continue to insist on using or referring to backup vs. expanding the conversation to data protection. – GS @StorageIO
Until there's a broad focus on shifting to and using the term data protection, including backup, BC, DR and archiving, people will simply keep referring to what they know, read or hear (e.g. backup). On the other hand, if the industry starts putting more focus on using data protection alongside backup, people will start following suit, using the two together, and over time backup as a term can fade away.
Taking a step back to move forward
Some of the modernizing backup discussion is actually focused on taking a step back to reconsider why, when, where, how and with what different applications, systems and data get protected. Certainly there are various industry trends, challenges and opportunities, some of which are shown below, including more data to protect, preserve and serve for longer periods of time.
Likewise there are various threat risks or scenarios to protect information assets from or against, not all of which are headline-news-making events.
Not all threat risks are headline news making events
There is an old saying in and around backup/restore, BC, DR, BR and HA about never letting a disaster go to waste. What this means is that, in case you have never noticed, there is usually a flurry of marketing and awareness activity, including conversations about why you should do something about BC, DR and other data protection activities, right around or shortly after a disaster scenario. However, not all disasters or incidents are headline-news-making events, and hence there should be more awareness every day vs. just during disaster season or situations. In addition, this also means expanding the focus to other situations that are likely to occur, including among others those in the following figure.
Continue reading part two of this series here to see what can be done about shifting the conversation about modernizing data protection. Also check out conversations about trends, themes, technologies, techniques and perspectives in my ongoing data protection diaries discussions (e.g. www.storageioblog.com/data-protection-diaries-main/).
This is a series of posts about data protection, which includes security (logical and physical), backup/restore, business continuance (BC), disaster recovery (DR) and business resiliency (BR) along with high availability (HA), archiving and related themes, technologies and trends.
Think of data protection as protect, preserve and serve for information across cloud, virtual and physical environments, spanning traditional servers, storage and I/O networking along with mobile (ok, some IoT as well), from SOHO/SMB to enterprise.
Getting started, taking a step back
Recently I have done a series of webinars and Google+ hangouts as part of the BackupU initiative brought to you by Dell Software (that’s a disclosure btw ;) ) that are vendor and technology neutral. Instead of the usual vendor product or technology focused seminars and events, these are about getting back to the roots, the fundamentals of what to protect when and why, then decide your options as well as different approaches (e.g. what tools to use when).
In addition over the past year (ok, years) I have also been doing other data protection related events, seminars, workshops, articles, tips, posts across cloud, virtual and physical from SOHO/SMB to enterprise. These are in addition to the other data infrastructure server and storage I/O stuff (e.g. SSD, object storage, software defined, big data, little data, buzzword bingo and others).
Keep in mind that in the data center or information factory everything is not the same, as there are different applications, threat risk scenarios, availability and durability needs among other considerations. In this series, like the cloud conversations among others, I'm going to be pulling various data protection themes together, hopefully making them easier for others to find, as well as so I know where to get them.
Some notes for an upcoming post in this series using my Livescribe about data protection
Data protection topics, trends, technologies and related themes
BackupU resources (Sponsored by Dell): Various Application, Virtualization and Related Data Protection links and resources (vendor and product neutral)
BackupU resources (Sponsored by Dell): Various Application, Virtualization and Related Data Protection webinars and G+ chats (vendor and product neutral)
Sometimes what should be understood, or what is common sense, or what you think everybody should know needs to be stated. After all, there could be somebody who does not know what some assume is common sense, or what others know, for various reasons. At times, there is simply a need to restate or offer a reminder of what should be known.
Consequently, in the data center or information factory, whether traditional, virtual, converged, private, hybrid or public cloud, everything is not the same. When I say everything is not the same, I mean that there are different applications with various service level objectives (SLOs) and service level agreements (SLAs). These are based on different characteristics from performance, availability, reliability, responsiveness, cost, security and privacy among others. Likewise, there are different sizes and types of organizations with various requirements, from enterprise to SMB, ROBO and SOHO, business or government, education or research.
There are also different threat risks for various applications or information services within an organization, or across different industry sectors. Thus there are various needs for meeting availability SLAs, recovery time objectives (RTOs) and recovery point objectives (RPOs) for data protection, ranging from backup/restore to high availability (HA), business continuance (BC), disaster recovery (DR) and archiving. Let us not forget about logical and physical security of information, assets, people, processes and intellectual property.
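To make those SLAs, RTOs and RPOs a bit more concrete, here is a minimal, hypothetical sketch (not tied to any particular product or provider) that turns an availability percentage into an annual downtime budget and checks a backup interval against an RPO target:

```python
# Hypothetical illustration: translate an availability SLA into an annual
# downtime budget, and check a backup interval against an RPO target.
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years for simplicity

def downtime_budget_hours(availability_pct: float) -> float:
    """Maximum allowed downtime per year for a given availability SLA."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

def meets_rpo(backup_interval_hours: float, rpo_hours: float) -> bool:
    """Worst-case data loss is roughly the time since the last protection copy."""
    return backup_interval_hours <= rpo_hours

print(f"99.9% availability allows ~{downtime_budget_hours(99.9):.2f} hours/year down")
print(f"99.99% availability allows ~{downtime_budget_hours(99.99):.2f} hours/year down")
print("Does a nightly backup meet a 4-hour RPO?", meets_rpo(24, 4))  # False
```

The point of the arithmetic is that an extra "nine" of availability, or a tighter RPO, changes what technologies and how many copies you need, which is why not everything should be protected the same.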
Some data centers or information factories are compute intensive while others are data centric; some are I/O or activity intensive with a mix of compute and storage. On the other hand, some data centers, such as a communications hub, may be network centric with very little data sticking around or being stored.
Even within a data center or information factory, various applications will have different profiles and protection requirements for big data and little data. There can also be a mix of old legacy applications and new systems developed in-house, purchased, open-source based or accessed as a service. The servers and storage may be software defined (a new buzzword that has already jumped the shark), virtualized or operated in a private, hybrid or community cloud, if not using a public service.
Thus, not all things are the same in the data center, or information factories, both those under traditional management paradigms, as well as those supporting public, private, hybrid or community clouds.
This is the second in a two-part industry trends and perspective looking at learning from cloud incidents, view part I here.
There is good information, insight and lessons to be learned from cloud outages and other incidents.
Sorry, cynics, no, that does not mean an end to clouds, as they are here to stay. However, when and where to use them, along with what best practices to apply and how to be ready and configured for use, are part of the discussion. This means that clouds may not be for everybody or all applications, at least not today. For those who are into clouds for the long haul (either all in or partially), including current skeptics, there are many lessons to be learned and leveraged.
In order to gain confidence in clouds, one question that I routinely am asked is: are clouds more or less reliable than what you are doing today? That depends on what you are doing, and how you will be using the cloud services. If you are applying HA and other BC or resiliency best practices, you may be able to configure and isolate yourself from the more common situations. On the other hand, if you are simply using the cloud services as a low-cost alternative, selecting the lowest price and service class (SLAs and SLOs), you might get what you paid for. Thus, clouds are a shared responsibility: the service provider has things they need to do, and the user or person designing how the service will be used has some decision-making responsibilities.
Keep in mind that high availability (HA), resiliency and business continuance (BC), along with disaster recovery (DR), are the sum of several pieces. This includes people, best practices, processes including change management, good design that eliminates points of failure and isolates or contains faults, along with the components or technology used (e.g. hardware, software, networks, services, tools). Good technology used in good ways can be part of a highly resilient, flexible and scalable data infrastructure. Good technology used in the wrong ways may not leverage the solutions to their full potential.
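As a minimal sketch of that fault isolation and containment idea (the region names and fetch function here are hypothetical placeholders, not any real provider's API), a resilient read path might try isolated endpoints in order, with backoff, rather than letting one failure take everything down:

```python
# Hypothetical sketch: contain faults by trying isolated endpoints in order
# instead of letting a single failure become a total outage.
import time

REGIONS = ["primary-region", "secondary-region", "dr-region"]  # illustrative names

def fetch_from(region: str, key: str) -> bytes:
    """Placeholder for a per-region read; raises on failure."""
    raise ConnectionError(f"{region} unavailable")  # simulate an outage

def resilient_read(key: str, retries_per_region: int = 2) -> bytes:
    for region in REGIONS:              # fault containment: one region at a time
        for attempt in range(retries_per_region):
            try:
                return fetch_from(region, key)
            except ConnectionError:
                time.sleep(2 ** attempt)  # back off before retrying
    raise RuntimeError(f"all regions failed for {key}")  # surface, don't hide
```

The design choice is the point: the retries, the ordering and the final error are decisions the user of the service makes, not something the provider does for you.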
While it is easy to focus on the physical technologies (servers, storage, networks, software, facilities), many of the cloud services incidents or outages have involved people, process and best practices so those need to be considered.
These incidents or outages bring awareness, a level set, that this is still early in the cloud evolution lifecycle, and a reminder to move beyond seeing clouds as just a way to cut cost and to see the importance and value of HA, resiliency, BC and DR. This means learning from mistakes and taking action to correct or fix errors; finding and removing points of failure is part of a technology, or the use of it, maturing. These all tie into having services with service level agreements (SLAs) and service level objectives (SLOs) for availability, reliability, durability, accessibility, performance and security among others, to protect against mayhem or other things that can and do happen.
The reason I mentioned earlier that AWS had another incident is that, like their peers or competitors who have had incidents in the past, AWS appears to be going through some growing, maturing and evolution related activities. During summer 2012 there was an AWS incident that affected Netflix (read more here: AWS and the Netflix Fix?). It should also be noted that there were earlier AWS outages where Netflix (read about the Netflix architecture here) leveraged resiliency designs to try to prevent mayhem when others were impacted.
Is AWS a lightning rod for things to happen, a point of attraction for Mayhem and others?
Granted, given their size, scope of services and how they are being used on a global basis, AWS is blazing new territory and gaining new experiences, similar to what other information services delivery platforms did in the past. What I mean is that, while taken for granted today, open systems Unix, Linux and Windows-based along with client-server, midrange or distributed systems, not to mention mainframe hardware, software, networks, processes, procedures and best practices, all went through growing pains.
There are a couple of interesting threads going on over in various LinkedIn Groups based on some reporters' stories, including speculation on what happened, followed by some good discussions of what actually happened and how to prevent recurrences in the future.
Over in the Cloud Computing, SaaS & Virtualization group forum, this thread is based on a Forbes article (Amazon AWS Takes Down Netflix on Christmas Eve) and involves conversations about SLAs, best practices, HA and related themes. Have a look at the story the thread is based on and some of the assertions being made, and ensuing discussions.
Also over at LinkedIn, in the Cloud Hosting & Service Providers group forum, this thread is based on a story titled Why Netflix’ Christmas Eve Crash Was Its Own Fault with a good discussion on clouds, HA, BC, DR, resiliency and related themes.
Over at the Virtualization Practice, there is a piece titled Is Amazon Ruining Public Cloud Computing? with comments from me and Adrian Cockcroft (@Adrianco) a Netflix Architect (you can read his blog here). You can also view some presentations about the Netflix architecture here.
What this all means
Saying you get what you pay for would be too easy and perhaps not applicable.
There are good free or low-cost services, just like good free content and other things; however, vice versa, just because something costs more does not make it better.
On the other hand, there are services that charge a premium however may have no better, if not worse, reliability; the same goes for for-fee content or perceived value that is no better than what you get for free.
Clouds are real and can be used safely; however, they are a shared responsibility.
Only you can prevent cloud data loss, which means do your homework, be ready.
If something can go wrong, it probably will, particularly if humans are involved.
Prepare for the unexpected and clarify assumptions vs. realities of service capabilities.
Leverage fault isolation and containment to prevent rolling or spreading disasters.
Look at cloud services beyond lowest cost or for cost avoidance.
What is your organization's culture for learning from mistakes vs. fixing blame?
Ask yourself if you, your applications and organization are ready for clouds.
Ask your cloud providers if they are ready for you and your applications.
Identify what your cloud concerns are to decide what can be done about them.
Do a proof of concept to decide what types of clouds and services are best for you.
Do not be scared of clouds, however be ready, do your homework, learn from the mistakes, misfortune and errors of others. Establish and leverage known best practices while creating new ones. Look at the past for guidance to the future, however avoid clinging to, and bringing the baggage of the past to the future. Use new technologies, tools and techniques in new ways vs. using them in old ways.
This is the first of a two-part industry trends and perspectives series looking at how to learn from cloud outages (read part II here).
In case you missed it, there were some public cloud outages during the recent Christmas 2012 holiday season. One incident involved Microsoft Xbox (view the Microsoft Azure status dashboard here), where users were impacted, and the other was another Amazon Web Services (AWS) incident. Microsoft and AWS are not alone; most if not all cloud services have had some type of incident and have gone on to improve from those outages. Google has had issues with different applications and services, including some in December 2012, along with a Gmail incident that received coverage back in 2011.
For those interested, here is a link to the AWS status dashboard and a link to the AWS December 24 2012 incident postmortem. In the case of the recent AWS incident which affected users such as Netflix, the incident (read the AWS postmortem and Netflix postmortem) was tied to a human error. This is not to say AWS has more outages or incidents vs. others including Microsoft; it just seems that we hear more about AWS when things happen compared to others. That could be due to AWS's size and arguably market-leading status, the diversity of its services, and the scale at which some of their clients are using them.
Btw, if you were not aware, Microsoft Azure is about more than just supporting SQL Server, Exchange, SharePoint or Office; it is also an IaaS layer for running virtual machines, such as Hyper-V based VMs, as well as a storage target for storing data. You can use Microsoft Azure storage services as a target for backing up or archiving or as general storage, similar to using AWS S3 or Rackspace Cloud Files or other services. Some backup and archiving AaaS and SaaS providers, including Evault, partner with Microsoft Azure as a storage repository target.
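For a feel of what using such a service as a backup target looks like, here is a minimal, hypothetical sketch using AWS S3 via the boto3 Python library; the bucket and file names are made up, and Azure Blob storage or Rackspace Cloud Files could play the same role:

```python
# Minimal sketch of using a public cloud object store as a backup target,
# here AWS S3 via boto3. Bucket and file names are hypothetical examples.
import boto3

s3 = boto3.client("s3")  # credentials come from the usual AWS config/environment

def backup_to_cloud(local_path: str, bucket: str, key: str) -> None:
    """Copy a local backup file up to an object storage bucket."""
    s3.upload_file(local_path, bucket, key)

def restore_from_cloud(bucket: str, key: str, local_path: str) -> None:
    """Pull a backup object back down for restore."""
    s3.download_file(bucket, key, local_path)

backup_to_cloud("backups/archive-2013-01.tar.gz",
                "example-backup-bucket",
                "archives/archive-2013-01.tar.gz")
```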
When reading some of the coverage of these recent cloud incidents, I am not sure if I am more amazed by some of the marketing cloud washing, or by the cloud bashing and uninformed reporting lacking research and insight. Then again, if someone repeats a myth often enough for others to hear and repeat, as it gets amplified, the myth may assume the status of reality. After all, you may know the expression that if it is on the internet then it must be true?
From CRN, here is some cloud service availability status information via Nasuni
The above is a small sampling of different stories, articles, columns, blogs and perspectives about cloud services outages or other incidents. Assuming the services are available, you can Google or Bing many others, along with reading postmortems to gain insight into what happened, the cause and effect, and how to prevent a recurrence in the future.
Do these recent incidents show a trend of increased cloud outages? Alternatively, do they say that the cloud services are being used more and on a larger basis, thus the impacts become more known?
Perhaps it is a mix of the above, and like when a magnetic storage tape gets lost or stolen, it makes for good news or copy, something to write about. Granted, there are fewer tapes actually lost than in the past, and far fewer vs. lost or stolen laptops and other devices with data on them. There are probably other reasons, such as the lightning rod effect: given how much industry hype there is around clouds, when something does happen, the cynics or foes come out in force, sometimes with FUD.
Similar to traditional hardware or software based product vendors, some service providers have even tried to convince me that they have never had an incident, never lost, corrupted or compromised any data. Yeah, right. Candidly, I put more credibility and confidence in a vendor or solution provider who tells me that they have had incidents and taken steps to prevent them from recurring. Granted, some of those steps might be made public while others might be under NDA; at least they are learning and implementing improvements.
As part of gaining insights, here are some links to AWS, Google, Microsoft Azure and other service status dashboards where you can view current and past situations.
Disclosure: I am a customer of AWS for EC2, EBS, S3 and Glacier as well as a customer of Bluehost for hosting and Rackspace for backups. Other than Amazon being a seller of my books (and my blog via Kindle) along with running ads on my sites and being an Amazon Associates member (Google also has ads), none of those mentioned are or have been StorageIO clients.
Have you modernized your data protection strategy and environment?
If not, are you thinking about updating your strategy and environment?
Why modernize your data protection including backup restore, business continuance (BC), high availability (HA) and disaster recovery (DR) strategy and environment?
Is it to leverage new technology such as disk to disk (D2D) backups, cloud, virtualization, data footprint reduction (DFR) including compression or dedupe?
Perhaps you have modernized or are considering data protection modernization because somebody told you to, or because you read about it or watched a video or web cast? Or perhaps your backup and restore are broken, so it's time to change media or try something different.
Let's take a step back for a moment and ask the question: what is your view of data protection modernization?
Perhaps it is modernizing backup by replacing tape with disk, or disk with clouds?
How about instead of swapping out media, changing backup software?
Or what about virtualizing servers moving from physical machines to virtual machines?
On the other hand maybe your view of modernizing data protection is around using a different product ranging from backup software to a data protection appliance, or snapshots and replication.
The above and others certainly fall under the broad group of backup, restore, BC, DR and archiving, however there is another area which is not as much technology as it is techniques, best practices, processes and procedure based. That is, revisit why data and applications are being protected against what applicable threat risks and associated business risks.
This means reviewing service needs and wants, including backup, restore, BC, DR and archiving, that in turn drive what data and applications to protect, how often, how many copies to keep and where those are located, along with how long they will be retained.
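As a purely illustrative sketch of what such a review can produce, a tiered protection policy maps classes of applications and data to different RPOs, RTOs, copy counts and retention periods instead of treating everything the same; the tier names and numbers below are hypothetical:

```python
# Hypothetical tiered data protection policy: different applications get
# different RPOs, RTOs, copy counts and retention. Numbers are illustrative.
protection_policy = {
    "tier-1-transactional": {
        "rpo_hours": 0.25,       # near-continuous (snapshots/replication)
        "rto_hours": 1,
        "copies": 3,             # e.g. local, offsite and cloud
        "retention_days": 2555,  # ~7 years for compliance-bound data
    },
    "tier-2-business": {
        "rpo_hours": 24,         # nightly backups
        "rto_hours": 8,
        "copies": 2,
        "retention_days": 365,
    },
    "tier-3-reference": {
        "rpo_hours": 168,        # weekly, then archived
        "rto_hours": 72,
        "copies": 1,
        "retention_days": 90,
    },
}

for tier, svc in protection_policy.items():
    print(f"{tier}: RPO {svc['rpo_hours']}h, RTO {svc['rto_hours']}h, "
          f"{svc['copies']} copies, keep {svc['retention_days']} days")
```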
Modernizing data protection is more than simply swapping out old or broken media like flat tires on a vehicle.
To be effective, data protection modernization involves taking a step back from the technology, tools and buzzword bingo topics to review what is being protected and why. It also means revisiting service level expectations and clarifying wants vs. needs; that is, what would be wanted if it were free, vs. what is actually required once there is a cost attached.
Certainly technologies and tools play a role, however simply using new tools and techniques without revisiting data protection challenges at the source will result in new problems that resemble old problems.
Hence to support growth with a constrained or shrinking budget while maintaining or enhancing service levels, the trick is to remove complexity and costs.
This means not treating all data and applications the same; stretching your available resources to be more effective without compromising on service is the mantra of modernizing data protection.
I recently came across a piece by Carl Brooks over at IT Tech News Daily that caught my eye, titled Cloud Storage Often Results in Data Loss. The piece has an effective title (good for search engine optimization, or SEO) as it stood out from many others I saw on that particular day.
What caught my eye in Carl's piece is that it reads as if the facts, based on a quick survey, point to clouds resulting in data loss, as opposed to being an opinion that some cloud usage can result in data loss.
My opinion is that if not used properly, including ignoring best practices, any form of data storage medium or media could result in, or be blamed for, data loss. Some people have lost data as a result of using cloud storage services, just as other people have lost data or access to information on other storage mediums and solutions. For example, data has been lost on tape, Hard Disk Drives (HDDs), Solid State Devices (SSDs), Hybrid HDDs (HHDDs), RAID and non-RAID, local and remote, and even optical based storage systems large and small. In some cases, there have been errors or problems with the medium or media; in other cases, storage systems have lost access to, or lost, data due to hardware, firmware, software or configuration issues, including human error, among other causes.
Technology failure: not if, rather when, and how to decrease the impact
Any technology, regardless of what it is or who it is from, along with its architecture, design and implementation, can fail. It is not if, rather when and how gracefully, along with what safeguards decrease the impact; containing or isolating faults is what differentiates various products or solutions. How they automatically repair and self-heal to keep running or support accessibility and maintain data integrity is important, as is how those options are used. Granted, a failure may not be technology related per se; rather it may be associated with human intervention, configuration, change management (or lack thereof), along with accidental or intentional activities.
I follow my advice and best practices when selecting cloud providers looking for good value, service level agreements (SLAs) and service level objectives (SLOs) over low cost or for free services.
In several years of using cloud based storage and services, there has been some loss of access, however no loss of data. Those service disruptions or losses of access to data and services ranged from a few minutes to a little over an hour. In those scenarios, if I could not have waited for cloud storage to become accessible, I could have accessed a local copy if it were available.
Had a major disruption occurred where it would have been several days before I could gain access to that information, or if it were actually lost, I have a data insurance policy. That data insurance policy is part of my business continuance (BC) and disaster recovery (DR) strategy. My BC and DR strategy is a multi-layered approach combining local, offline and offsite along with online cloud data protection and archiving.
Assuming my cloud storage service could get data back to a given point (RPO) in a given amount of time (RTO), I have some options. One option is to wait for the service or information to become available again, assuming a local copy is no longer valid or available. Another option is to start restoration from a master gold copy and then roll forward changes from the cloud services as that information becomes available. In other words, I am using cloud storage as another resource, both for protecting what is local as well as complementing how I locally protect things.
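That restore option reduces to a simple pattern; the following is a minimal, hypothetical sketch (the helper names and the in-memory dictionary stand in for real backup tooling) of rebuilding from a gold copy and then rolling forward incremental changes:

```python
# Hypothetical sketch: restore from a local master (gold) copy, then roll
# forward changes retrieved from the cloud service as they become available.
from typing import Iterable, Tuple

def restore_gold_copy(target: dict, gold_copy: dict) -> None:
    """Rebuild state from the local master backup."""
    target.clear()
    target.update(gold_copy)

def roll_forward(target: dict, changes: Iterable[Tuple[str, str]]) -> None:
    """Apply incremental (key, value) changes captured since the gold copy."""
    for key, value in changes:
        target[key] = value

data: dict = {}
restore_gold_copy(data, {"doc1": "v1", "doc2": "v1"})
roll_forward(data, [("doc2", "v2"), ("doc3", "v1")])  # changes from the cloud
print(data)  # {'doc1': 'v1', 'doc2': 'v2', 'doc3': 'v1'}
```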
Minimize or cut data loss or loss of access
Anything important should be protected locally and remotely, meaning leveraging cloud along with a master or gold backup copy.
To cut the cost of protecting information, I also leverage archives, which means not all data gets protected the same. Important data is protected more often, reducing RPO exposure and speeding up RTO during restoration. Other data that is not as important is still protected, however on a different frequency with other retention cycles; in other words, tiered data protection. By implementing tiered data protection, best practices, and various technologies including data footprint reduction (DFR) such as archiving, compression and dedupe, in addition to local disk to disk (D2D), disk to disk to cloud (D2D2C), along with routine copies to offline media (removable HDDs or RHDDs) that go offsite, I'm able to stretch my data protection budget further. Not only is my data protection budget stretched further, I have more options to speed up RTOs as well as better granularity for recovery and enhanced RPOs.
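To see why DFR stretches a budget, here is some back-of-the-envelope arithmetic; the ratios are hypothetical and vary widely by data type and backup method:

```python
# Back-of-the-envelope DFR illustration: how compression and dedupe stretch
# a protection budget. Ratios are hypothetical and vary by data type.
protected_tb = 50.0        # logical data being protected
compression_ratio = 2.0    # 2:1 compression
dedupe_ratio = 5.0         # 5:1 dedupe across repeated full backups

physical_tb = protected_tb / (compression_ratio * dedupe_ratio)
print(f"{protected_tb} TB protected lands in ~{physical_tb} TB of physical storage")
# 50.0 TB protected lands in ~5.0 TB of physical storage
```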
If you are looking to avoid losing data, or loss of access, it is a simple equation in no particular order:
Strategy and design
Best practices and processes
Various technologies
Quality products
Robust service delivery
Configuration and implementation
SLO and SLA management metrics
People skill set and knowledge
Usage guidelines or terms of service (ToS)
Unfortunately, clouds, like other technologies or solutions, get a bad reputation or get blamed when something goes wrong. Sometimes it is the technology or service that fails; other times it is a combination of errors that resulted in loss of access or lost data. With clouds, as has been the case with other storage mediums and systems in the past, when something goes wrong and it has been hyped, chances are it will become a target for blame or finger pointing vs. determining what went wrong so that it does not occur again. For example, cloud storage has been hyped as easy to use: don't worry, just put your data there, you can get out of the business of managing storage as the cloud will do that magically for you behind the scenes.
The reality is that while cloud storage solutions can offload functions, someone is still responsible for making decisions on usage and configuration that impact availability. What separates various providers is their ability to design in best practices, isolate and contain faults quickly, and have resiliency integrated as part of a solution, along with various SLAs aligned to the service level you are expecting, in an easy to use manner.
Does that mean the more you pay the more reliable and resilient a solution should be? No, not necessarily, as there can still be risks including how the solution is used.
Does that mean low cost or for free solutions have the most risk? No, not necessarily as it comes down to how you use or design around those options. In other words, while cloud storage services remove or mask complexity, it still comes down to how you are going to use a given service.
Shared responsibility for cloud (and non-cloud) storage data protection
Anything important enough that you cannot afford to lose, or need quick access to, should be protected in different locations and on various mediums. In other words, balance your risk. Cloud storage service providers need to take responsibility for meeting the service expectations of a given SLA and SLOs that you agree to pay for (unless free).
As the customer, you have the responsibility of following best practices supplied by the service provider, including reading the ToS. Part of the responsibility as a customer or consumer is to understand what the ToS, SLA and SLOs are for the given level of service that you are using. As a customer or consumer, this means doing your homework to be ready as a smart, educated buyer or consumer of cloud storage services.
If you are a vendor or value added reseller (VAR), your opportunity is to help customers with the acquisition process to make informed decisions. For VARs and solution providers, this can mean upselling customers to a higher level of service by making them aware of the risk and reward benefits, as opposed to focusing on cost. After all, if an order taker at McDonald's can ask "Would you like to supersize your order?", why can't you as a vendor or solution provider also have a value-oriented upsell message?
Additional related links to read more and sources of information:
Poll: Who is responsible for cloud storage data loss?
Taking action, what you should (or should not) do
Don't be scared of clouds, however do your homework, be ready, look before you leap and follow best practices. Look into the service level agreements (SLAs) associated with a given cloud storage product or service. Follow best practices about how you or someone else will protect what data is put into the cloud.
For critical data or information, consider having a copy of that data in the cloud as well as in another place, which could be a different cloud, or local, or offsite and offline. Keep in mind the theme for critical information and data is not if, rather when, so the question is what can be done to decrease the risk or impact of something happening; in other words, be ready.
Data put into the cloud can be lost, or loss of access to it can occur for some amount of time, just as happens with non-cloud storage such as tape, disk or SSD. What impacts or minimizes your risk of using traditional local or remote as well as cloud storage are the best practices and how things are configured, protected, secured and managed. Another consideration is that the type and quality of the storage product or cloud service can have a big impact. Sure, a quality product or service can fail; however, you can also design and configure to decrease those impacts.
Wrap up
Bottom line, do not be scared of cloud storage; however, be ready, do your homework, review best practices, and understand the benefits and caveats, risk and reward. For those who want to learn more about cloud storage (public, private and hybrid) along with data protection, data management, data footprint reduction among other related topics and best practices, I happen to know of some good resources. Those resources, in addition to the links provided above, include Cloud and Virtual Data Storage Networking (CRC Press), which you can learn more about here as well as find at Amazon among other venues. Also check out Enterprise Systems Backup and Recovery: A Corporate Insurance Policy by Preston De Guise (aka @backupbear on Twitter), which is a great resource for protecting data.
Do you have a web, internet, backup or other IT cloud service provider of some type?
Do you pay for it, or is it a free service?
Do you take your service provider for granted?
Does your service provider take you or your data for granted?
Does your provider offer some form of service level objectives (SLO)?
For example, Recovery Time Objectives (RTO), Recovery Point Objectives (RPO), Quality of Service (QoS) or, if a backup service, alternate forms of recovery, among others?
So what happens when there is a service disruption, do you threaten to leave the provider and if so, how much does that (or would it) cost you to move?
A couple of weeks ago I was on a Delta Airlines flight from LAX to MSP returning from a west coast speaking engagement event.
During the late evening three-hour flight, I was using the Gogo inflight WiFi service to get caught up on some emails and blog items along with other work items, in addition to doing a few Twitter tweets while flying high over the real clouds from my virtual office.
During that time, I saw a tweet from Devang Panchigar (@storageNerve) commenting that his hosting service provider Bluehost was down or offline. This caught my attention as Bluehost is also my service provider, and a quick check verified that my sites and services were still working. I subsequently sent a tweet to Devang indicating that Bluehost, or at least my sites and services, were still functioning, or at least for the time being, as I was about to find out. Long story short, about 20 to 25 minutes later, I noticed that I could no longer get to any of my sites; lo and behold, my Bluehost services were also now offline.
Overall, I have been pleased with Bluehost as a service provider, including finding their call support staff very accommodating and easy to work with when I have questions or need something taken care of. Normally I would have simply called Bluehost to see what was going on; however, being at about 38,000 feet above the clouds, a quick conversation was not going to be possible. Instead, I checked some forums that revealed Bluehost was experiencing some electrical power issues with their data center (I believe in Utah). Looking at some of the forums as well as various Twitter comments, I also decided to check whether Bluehost CEO Matt Heaton's blog was functioning (it was).
It would have been too easy to do one of those irate customer type posts telling them how bad they were, how I was dropping them like a hot potato, and then doing a blog post telling everyone to never use them again, or along those lines, which are far too common and often get deleted as spam.
Instead, I took a different approach (you could have read it here, however I just checked and it has been deleted). My comment on Matt's blog post took a week or so to be moderated (and has since been deleted). Essentially my post took the opposite approach of going off on the usual customer tirade, instead commenting on how ironic it was that a hosting service for my web site, which contains content about resilient data infrastructure themes, was offline.
Now I realize that I am not paying for a high-end, no-downtime, always-available hosting service; however, I also realize that I am paying for a more premium package vs. a basic subscription or even a free service. While I was not happy about the hour of downtime around midnight, it was comforting to know that no data was lost and my sites were only offline for a short period of time.
I hope Bluehost continues to improve their services to stay out of the news for a major disruption, as well as to minimize or eliminate downtime for their fee-based services.
I also hope that Bluehost CEO Matt Heaton continues to listen to what his customers have to say while improving his services to keep us as customers instead of taking us for granted as some providers or vendors do.
Thanks again to Devang for the tip that there was a service disruption, after all, sometimes we take services for granted and in other situations some service providers take their customers for granted.
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved