Application Data Availability 4 3 2 1 Data Protection

This is part two of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application performance, availability, capacity, economic (PACE) attributes that have an impact on data value as well as availability.


Availability (Accessibility, Durability, Consistency)

Just as there are many different aspects and focus areas for performance, there are also several facets to availability. Note that application performance requires availability, and availability relies on some level of performance.

Availability is a broad and encompassing area that includes data protection to protect, preserve, and serve (backup/restore, archive, BC, BR, DR, HA) data and applications. There are logical and physical aspects of availability, including data protection as well as security, including key management (managing your keys, authentication, and certificates) and permissions, among other things.

Availability = accessibility (can you get to your application and data) + durability (is the data intact and consistent). This includes basic Reliability, Availability, Serviceability (RAS), as well as high availability, accessibility, and durability. “Durable” has multiple meanings, so context is important. In one context, durable means how data infrastructure resources hold up to, survive, and tolerate wear and tear from use (i.e., endurance), for example, Flash SSDs or mechanical devices such as Hard Disk Drives (HDDs). Another context for durable refers to data, meaning how many copies exist in various places.
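
The accessibility and durability pieces combine in simple ways. As a rough sketch (using hypothetical component figures, not measurements from any particular environment), components in series multiply their availabilities, while redundant components multiply their unavailabilities:

```python
# Sketch: composite availability from hypothetical component availabilities.

def series_availability(availabilities):
    """Combined availability when all components must be up (in series)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def parallel_availability(availabilities):
    """Combined availability of redundant components (any one suffices)."""
    unavailable = 1.0
    for a in availabilities:
        unavailable *= (1.0 - a)
    return 1.0 - unavailable

# Example: server, network, and storage each at 99.9% in series
print(round(series_availability([0.999, 0.999, 0.999]), 5))  # 0.997
# Two redundant paths at 99.9% each
print(round(parallel_availability([0.999, 0.999]), 6))       # 0.999999
```

Note how a chain of "three nines" components yields less than three nines overall, while redundancy improves on any single component.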

Server, storage, and I/O network availability topics include:

  • Resiliency and self-healing to tolerate failure or disruption
  • Hardware, software, and services configured for resiliency
  • Accessibility to reach or be reached for handling work
  • Durability and consistency of data to be available for access
  • Protection of data, applications, and assets including security

Additional server, I/O, data infrastructure, and storage topics include:

  • Backup/restore, replication, snapshots, sync, and copies
  • Basic Reliability, Availability, Serviceability (RAS), HA, failover, BC, BR, and DR
  • Alternative paths, redundant components, and associated software
  • Applications that are fault-tolerant, resilient, and self-healing
  • Non-disruptive upgrades, code (application or software) loads, and activation
  • Immediate data consistency and integrity vs. eventual consistency
  • Virus, malware, and other data corruption or loss prevention

From a data protection standpoint, the fundamental rule or guideline is 4 3 2 1, which means having at least four copies consisting of at least three versions (different points in time), at least two of which are on different systems or storage devices, and at least one of those off-site (on-line, off-line, cloud, or other). There are many variations of the 4 3 2 1 rule, shown in the following figure along with approaches on how to manage the technology to use. We will go deeper into this subject in later chapters. For now, remember the following.

4 3 2 1 data protection (via Software Defined Data Infrastructure Essentials)

4    At least four copies of data (or more). Enables durability in case a copy goes bad, is deleted or corrupted, or a device or site fails.
3    At least three versions (or more) of the data to retain. Enables various recovery points in time to restore, resume, or restart from.
2    Data located on two or more systems (devices or media). Enables protection against device, system, server, file system, or other fault/failure.

1    At least one of those copies off-premises and not live (isolated from the active primary copy). Enables resiliency across sites, as well as a space, time, and distance gap for protection.
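
The 4 3 2 1 guideline above can be expressed as a simple policy check. Here is a minimal sketch, assuming hypothetical copy records with version, system, and off-site attributes (adjust the fields to match your environment):

```python
# Sketch: checking a set of protection copies against the 4 3 2 1 guideline.
# Copy records here are hypothetical examples, not a real catalog format.

def meets_4321(copies):
    """copies: list of dicts with 'version', 'system', 'offsite' (bool) keys."""
    total = len(copies)                               # 4: at least four copies
    versions = len({c["version"] for c in copies})    # 3: at least three versions
    systems = len({c["system"] for c in copies})      # 2: on two or more systems
    offsite = sum(1 for c in copies if c["offsite"])  # 1: at least one off-site
    return total >= 4 and versions >= 3 and systems >= 2 and offsite >= 1

copies = [
    {"version": "mon", "system": "primary", "offsite": False},
    {"version": "tue", "system": "backup-appliance", "offsite": False},
    {"version": "wed", "system": "backup-appliance", "offsite": False},
    {"version": "wed", "system": "cloud-vault", "offsite": True},
]
print(meets_4321(copies))  # True
```

Drop the off-site cloud copy from the list and the check fails, which is exactly the gap the "1" in 4 3 2 1 is there to close.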

Capacity and Space (What Gets Consumed and Occupied)

In addition to being available and accessible in a timely manner (performance), data (and applications) occupy space. That space includes memory in servers, along with consumable processor (CPU) time and I/O (performance), including over networks.

Data and applications also consume storage space where they are stored. In addition to basic data space, there is also space consumed for metadata as well as protection copies (and overhead), application settings, logs, and other items. Another aspect of capacity includes network IP ports and addresses, software licenses, server, storage, and network bandwidth or service time.
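
As a rough sketch of how those overheads add up, the following estimates raw capacity needed from base data plus hypothetical metadata, protection copy, and snapshot overhead ratios (the ratios are placeholders; substitute figures from your own environment):

```python
# Sketch: rough raw-capacity estimate for a given amount of application data.
# All overhead ratios below are hypothetical placeholders.

def raw_capacity_needed(data_tb, metadata_pct=0.05, protection_copies=2,
                        snapshot_pct=0.20):
    base = data_tb * (1 + metadata_pct)          # data plus metadata overhead
    protected = base * (1 + protection_copies)   # primary plus protection copies
    return protected * (1 + snapshot_pct)        # snapshot/log space on top

print(round(raw_capacity_needed(10), 2))  # 37.8
```

The point of the arithmetic: 10 TB of "data" can easily translate into nearly 4x that in consumed capacity once metadata, protection copies, and snapshots are counted.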

Server, storage, and I/O network capacity topics include:

  • Consumable time-expiring resources (processor time, I/O, network bandwidth)
  • Network IP and other addresses
  • Physical resources of servers, storage, and I/O networking devices
  • Software licenses based on consumption or number of users
  • Primary and protection copies of data and applications
  • Active and standby data infrastructure resources and sites
  • Data footprint reduction (DFR) tools and techniques for space optimization
  • Policies, quotas, thresholds, limits, and capacity QoS
  • Application and database optimization

DFR includes various techniques, technologies, and tools to reduce the impact or overhead of protecting, preserving, and serving more data for longer periods of time. There are many different approaches to implementing a DFR strategy, since there are various applications and data.

Common DFR techniques and technologies include archiving, backup modernization, copy data management (CDM), cleanup, compression, and consolidation, data management, deletion and dedupe, storage tiering, RAID (including parity-based, erasure codes, local reconstruction codes [LRC], Reed-Solomon, and Ceph Shingled Erasure Code [SHEC], among others), along with protection configurations and thin provisioning, among others.

DFR can be implemented in various complementary locations from row-level compression in database or email to normalized databases, to file systems, operating systems, appliances, and storage systems using various techniques.

Also, keep in mind that not all data is the same; some is sparse, some is dense, some can be compressed or deduped while others cannot. Likewise, some data may not be compressible or dedupable. However, identical copies can be identified with links created to a common copy.
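
To illustrate why identical copies reduce well, here is a minimal block-level dedupe sketch using content hashing. Real dedupe engines vary considerably (fixed vs. variable block, inline vs. post-process); this only shows the counting idea behind storing one instance plus references:

```python
# Sketch: block-level dedupe ratio via content hashing (illustrative only).
import hashlib

def dedupe_ratio(data: bytes, block_size: int = 4096) -> float:
    """Ratio of logical blocks to unique blocks; higher means more reducible."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    unique = {hashlib.sha256(b).digest() for b in blocks}
    return len(blocks) / len(unique) if unique else 1.0

dense = bytes(range(256)) * 64   # 16 KiB of repeating content
print(dedupe_ratio(dense))       # 4.0 (four identical 4 KiB blocks)
```

Data with no repeated blocks would score 1.0, matching the point above that some data simply is not dedupable.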

Economics (People, Budgets, Energy and other Constraints)

If one constant in life and technology is change, then the other constant is concern about economics or costs. There is a cost to enable and maintain a data infrastructure on premises or in the cloud, which exists to protect, preserve, and serve data and information applications.

However, there should also be a benefit to having the data infrastructure to house data and support applications that provide information to users of the services. A common economic focus is what something costs, either as up-front capital expenditure (CapEx) or as an operating expenditure (OpEx) expense, along with recurring fees.

In general, economic considerations include:

  • Budgets (CapEx and OpEx), both up front and in recurring fees
  • Whether you buy, lease, rent, subscribe, or use free and open sources
  • People time needed to integrate and support even free open-source software
  • Costs including hardware, software, services, power, cooling, facilities, tools
  • People time includes base salary, benefits, training and education

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

SDDC Data Infrastructure

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What this all means and wrap-up

Keep in mind that with Application Data Value Characteristics, everything is not the same across various organizations, data centers, and data infrastructures spanning legacy, cloud, and other software defined data center (SDDC) environments. All applications have some element of performance, availability, capacity, and economic (PACE) needs as well as resource demands. With data storage there is often a focus on storage efficiency and utilization, which is where data footprint reduction (DFR) techniques, tools, trends, and technologies address capacity requirements. However, with data storage there is also an expanding focus on storage effectiveness, also known as productivity, tied to performance, along with availability including 4 3 2 1 data protection. Continue reading the next post (Part III Application Data Characteristics Types Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Six plus data center software defined management dashboards tools

Software defined data infrastructure management insight tools


Updated 1/17/2018

Managing data infrastructures involves using software defined management dashboard tools. Recently I found in my inbox a link to a piece, 6 Dashboards for Managing Every Modern Data Center, that caught my attention. I was hoping to see what the six different data center dashboard solutions or tools were; instead I found a list of dashboard considerations for modern data centers and data infrastructures.

Turns out the piece was nothing more than a list of six items featured as part of the vendor's (Sunbird) pitch about what to look for in a dashboard (e.g., their product). Sure, there were some of the usual key performance indicators (KPIs) associated with or related to IT Service Management (ITSM), Data Center Infrastructure (Insight/Information) Management (DCIM), Configuration and Change Management Databases (CMDB), and availability, capacity, and Performance Management Databases (PMDB), among others.

  • Space
  • Inventory
  • Connectivity
  • Change
  • Environment
  • Power

Dashboard Discussions

Keep in mind, however, that there are many different types of dashboards (and consoles); some are active with analytics including correlation, while others are passive, simply displaying information. The focus area also varies, from physical data center facilities to applications, to data infrastructures or components such as servers, storage, I/O networks, cloud, virtual, and containers, among other parts of modern data centers.

Data Infrastructures and SDDI, SDDC, SDI
Data Infrastructures (hardware, software, services, servers, storage, I/O and networks)

This is where some context comes into play, as there are different types of dashboards for various audiences, technologies, and focus areas (e.g., domains) across data infrastructures (and other entities). For example, do a Google search of "dashboard" and see what appears, or "IT dashboard," "data center dashboard" vs. "datacenter dashboard," among others.

Additional KPIs include:

  • Performance, Availability, Capacity and Economic (PACE) attributes
  • Service Level Objectives (SLOs), Service Level Agreements (SLAs)
  • Recovery Time Objectives (RTOs), Recovery Point Objectives (RPOs)
  • IT Service Management (ITSM) and Data Center Infrastructure Management (DCIM)
  • Configuration and Change Management (e.g. things part of CMDB)
  • Performance, availability and capacity (e.g. things part of PMDB)
  • Various focus and layers, cross domain functionality views
  • Costs management including subscriptions, licenses and others
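
Several of the KPIs above reduce to simple arithmetic a dashboard can surface. A minimal sketch (with hypothetical downtime figures) computing measured availability for a period and comparing it to an SLO:

```python
# Sketch: measured availability vs. an SLO. Figures are hypothetical.

def availability_pct(downtime_minutes, period_minutes):
    """Percentage of the period the service was up."""
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes

month_minutes = 30 * 24 * 60   # 43,200 minutes in a 30-day month
measured = availability_pct(downtime_minutes=50, period_minutes=month_minutes)
slo = 99.9
print(round(measured, 3), measured >= slo)  # 99.884 False
```

Fifty minutes of downtime in a month sounds small, yet it already misses a 99.9% objective, which is why dashboards track these numbers continuously rather than by gut feel.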

IT Data Center and Data Infrastructure Dashboard Options

For those of you who have made it this far, while not a comprehensive list, the following are some examples of vendors, services, or solutions that either are, or have an association with, data center as well as data infrastructure management. Some dashboards or tools are homogeneous in that they only work within a given area of focus, such as a particular cloud, service provider, vendor, or solution set. Others are heterogeneous or federated, working across different services, solutions, vendors, and domain focus areas. Think of these as software defined management (SDM), or software defined data infrastructure (SDDI) management, or software defined data center (SDDC) management, among other variations for the modern information factory.

There is a mix of tools that run on site (e.g., on premises) or via cloud services (e.g., manage your on-site resources from the cloud). Likewise, some are for fee, others subscription, and some are open source. In addition, some of the tools are turnkey while others are do it yourself (DiY) or allow you to customize. Also keep in mind that depending on what your tradecraft (skills, experience, expertise) interest area is, these may or may not be applicable to you while relevant to others. For example, some such as Spiceworks tend to be more helpdesk focused, while others focus on other data center or data infrastructure areas.

There are dashboards for or from AWS, Canonical (Ubuntu), Dell including EMC, Google, HPE, IBM, Microsoft System Center and Azure, NetApp, OpenStack, Oracle, Rackspace, Red Hat, RightScale, ServiceNow, SoftLayer, SUSE and VMware among others.

Blue Medora (various data infrastructure monitoring)
Cloudkitty (open source cloud rating and chargeback)
Collectd (data infrastructure collection and monitoring)
cPanel and WHM (web and hosting dashboards)

Dashbuilder (customize your dashboard)
Datadog (super easy to get access, download, install, configure and use)
Domo (various data infrastructure monitoring tools)
Extrahop (still waiting to be able to download and try their bits vs. watching a demo)
Firescope (data infrastructure insight and awareness)
Freezer (open source dashboard tools)
Komprise (interesting solution, would like try, however lots of gated material)
Nagios (data infrastructure monitoring)
Openit (data infrastructure tracking, report, monitoring)
Opvizor (data infrastructure monitoring and reporting)


Panorama9 (various data infrastructure monitoring and reporting)
Quest (various tools)
Red Hat CloudForms (OpenStack and cloud management)
RRDtool (data collection, logging and display)
Sisense (insight and awareness tools)
Solarwinds Server Application Monitor (SAM) among other tools
Teamquest (various monitoring, management, capacity planning tools)
Turbonomic (software defined data infrastructure insight tools)
Virtual Instruments (various monitoring and insight awareness along with analytics)

In addition to the above, there are tools such as Splunk among others that also provide insight and awareness to help avoid flying blind while managing your data center or data infrastructure.

Where to learn more

Learn more via the following links.

  • Data Infrastructure Primer and Overview (Its Whats Inside The Data Center)
  • E2E Awareness and insight for IT environments
  • Server and Storage I/O Benchmarking and Performance Resources
  • Data Center Infrastructure Management (DCIM) and IRM
  • The Value of Infrastructure Insight – Enabling Informed Decision Making
  • More storage and IO metrics that matter
  • Whats a data infrastructure?
  • Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


    What this all means

Without insight and awareness you are flying blind; how can you make informed decisions about your information factory, data infrastructures, and data centers along with applications? There are different focus areas for various audiences up and down the stack layers in data infrastructures and data centers. Key is having insight and awareness, including knowing what some of the different tool options are.

    Ok, nuff said, for now.

    Gs


    In the data center or information factory, not everything is the same


    Sometimes what should be understood, or that is common sense or that you think everybody should know needs to be stated. After all, there could be somebody who does not know what some assume as common sense or what others know for various reasons. At times, there is simply the need to restate or have a reminder of what should be known.


Consequently, in the data center or information factory, whether traditional, virtual, converged, private, hybrid, or public cloud, everything is not the same. When I say not everything is the same, I mean that different applications have various service level objectives (SLOs) and service level agreements (SLAs), based on different characteristics from performance, availability, reliability, responsiveness, cost, security, and privacy, among others. Likewise, there are different sizes and types of organizations with various requirements, from enterprise to SMB, ROBO, and SOHO, business or government, education or research.

    Various levels of HA, BC and DR

There are also different threat risks for various applications or information services within an organization, or across different industry sectors. Thus there are various needs for meeting availability SLAs, recovery time objectives (RTOs), and recovery point objectives (RPOs) for data protection, ranging from backup/restore to high availability (HA), business continuance (BC), disaster recovery (DR), and archiving. Let us not forget about logical and physical security of information, assets, people, processes, and intellectual property.
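
RPO and RTO objectives like those above can be checked with simple time arithmetic. A minimal sketch with hypothetical times; in practice the inputs come from backup catalogs and tested restore runs, not constants:

```python
# Sketch: checking a recovery point objective (RPO). Times are hypothetical.
from datetime import datetime, timedelta

def rpo_met(last_protect_time, now, rpo):
    """Is the newest protection copy recent enough to satisfy the RPO?"""
    return (now - last_protect_time) <= rpo

now = datetime(2018, 1, 17, 12, 0)
last_backup = datetime(2018, 1, 17, 2, 0)   # nightly backup at 02:00
print(rpo_met(last_backup, now, timedelta(hours=24)))  # True
print(rpo_met(last_backup, now, timedelta(hours=4)))   # False
```

The same nightly backup satisfies a 24-hour RPO and fails a 4-hour one, which is the "everything is not the same" point: the protection schedule has to follow from the application's objective, not the other way around.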


    Some data centers or information factories are compute intensive while others are data centric, some are IO or activity intensive with a mix of compute and storage. On the other hand, some data centers such as a communications hub may be network centric with very little data sticking or being stored.


Even within a data center or information factory, various applications will have different profiles and protection requirements for big data and little data. There can also be a mix of old legacy applications and new systems developed in-house, purchased, open-source based, or accessed as a service. The servers and storage may be software defined (a new buzzword that has already jumped the shark), virtualized, or operated in a private, hybrid, or community cloud if not using a public service.

    Here are some related posts tied to everything is not the same:
    Optimize Data Storage for Performance and Capacity
    Is SSD only for performance?
    Cloud conversations: Gaining cloud confidence from insights into AWS outages
    Data Center Infrastructure Management (DCIM) and IRM
    Saving Money with Green IT: Time To Invest In Information Factories
    Everything Is Not Equal in the Datacenter, Part 1
    Everything Is Not Equal in the Datacenter, Part 2
    Everything Is Not Equal in the Datacenter, Part 3


    Thus, not all things are the same in the data center, or information factories, both those under traditional management paradigms, as well as those supporting public, private, hybrid or community clouds.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)


    Cloud conversations: Gaining cloud confidence from insights into AWS outages (Part II)


    This is the second in a two-part industry trends and perspective looking at learning from cloud incidents, view part I here.

    There is good information, insight and lessons to be learned from cloud outages and other incidents.

Sorry, cynics: no, that does not mean an end to clouds, as they are here to stay. However, when and where to use them, what best practices to apply, and how to be ready and configured for use are part of the discussion. This means that clouds may not be for everybody or all applications, or at least not today. For those who are into clouds for the long haul (either all in or partially), including current skeptics, there are many lessons to be learned and leveraged.

To gain confidence in clouds, a question I am routinely asked is: are clouds more or less reliable than what you are doing? It depends on what you are doing and how you will be using the cloud services. If you are applying HA and other BC or resiliency best practices, you may be able to configure for and isolate from the more common situations. On the other hand, if you are simply using the cloud services as a low-cost alternative, selecting the lowest price and service class (SLAs and SLOs), you might get what you paid for. Thus, clouds are a shared responsibility: the service provider has things they need to do, and the user or person designing how the service will be used has decision-making responsibilities.

Keep in mind that high availability (HA), resiliency, business continuance (BC), and disaster recovery (DR) are the sum of several pieces. This includes people, best practices, processes including change management, good design eliminating points of failure and isolating or containing faults, along with how the components or technology are used (e.g., hardware, software, networks, services, tools). Good technology used in good ways can be part of a highly resilient, flexible, and scalable data infrastructure. Good technology used in the wrong ways may not leverage the solutions to their full potential.

    While it is easy to focus on the physical technologies (servers, storage, networks, software, facilities), many of the cloud services incidents or outages have involved people, process and best practices so those need to be considered.

    These incidents or outages bring awareness, a level set, that this is still early in the cloud evolution lifecycle and to move beyond seeing clouds as just a way to cut cost, and seeing the importance and value HA, resiliency, BC and DR. This means learning from mistakes, taking action to correct or fix errors, find and cut points of failure are part of a technology maturing or the use of it. These all tie into having services with service level agreements (SLAs) with service level objectives (SLOs) for availability, reliability, durability, accessibility, performance and security among others to protect against mayhem or other things that can and do happen.
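
To make availability SLAs concrete, the "nines" in a service agreement translate directly into an allowed downtime budget. A small sketch of that conversion:

```python
# Sketch: converting an SLA availability percentage ("nines") into an
# allowed downtime budget per year.

def allowed_downtime_minutes_per_year(availability_pct):
    year_minutes = 365 * 24 * 60   # 525,600 minutes
    return year_minutes * (1 - availability_pct / 100.0)

for nines in (99.0, 99.9, 99.99):
    print(nines, round(allowed_downtime_minutes_per_year(nines), 1))
# 99.0  -> about 5,256 minutes (roughly 3.6 days) per year
# 99.9  -> about 525.6 minutes (roughly 8.8 hours)
# 99.99 -> about 52.6 minutes
```

Each extra nine shrinks the budget by a factor of ten, which is why the price and design effort behind higher service classes climb so steeply.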


    The reason I mentioned earlier that AWS had another incident is that like their peers or competitors who have incidents in the past, AWS appears to be going through some growing, maturing, evolution related activities. During summer 2012 there was an AWS incident that affected Netflix (read more here: AWS and the Netflix Fix?). It should also be noted that there were earlier AWS outages where Netflix (read about Netflix architecture here) leveraged resiliency designs to try and prevent mayhem when others were impacted.

    Is AWS a lightning rod for things to happen, a point of attraction for Mayhem and others?

Granted, given their size, scope of services, and how they are being used on a global basis, AWS is blazing new territory and experiences, similar to what other information services delivery platforms did in the past. What I mean is that while taken for granted today, open systems Unix, Linux, and Windows-based along with client-server, midrange, or distributed systems, not to mention mainframe hardware, software, networks, processes, procedures, and best practices, all went through growing pains.

There are a couple of interesting threads going on over in various LinkedIn Groups based on some reporters' stories, including speculation about what happened, followed by some good discussions of what actually happened and how to prevent recurrence in the future.

    Over in the Cloud Computing, SaaS & Virtualization group forum, this thread is based on a Forbes article (Amazon AWS Takes Down Netflix on Christmas Eve) and involves conversations about SLAs, best practices, HA and related themes. Have a look at the story the thread is based on and some of the assertions being made, and ensuing discussions.

    Also over at LinkedIn, in the Cloud Hosting & Service Providers group forum, this thread is based on a story titled Why Netflix’ Christmas Eve Crash Was Its Own Fault with a good discussion on clouds, HA, BC, DR, resiliency and related themes.

    Over at the Virtualization Practice, there is a piece titled Is Amazon Ruining Public Cloud Computing? with comments from me and Adrian Cockcroft (@Adrianco) a Netflix Architect (you can read his blog here). You can also view some presentations about the Netflix architecture here.

    What this all means

    Saying you get what you pay for would be too easy and perhaps not applicable.

There are good free or low-cost services, just like good free content and other things; however, vice versa, just because something costs more does not make it better.

On the other hand, there are services that charge a premium however may have no better, if not worse, reliability; the same goes for for-fee content or perceived value that is no better than what you get free.

    Additional related material

    Some closing thoughts:

    • Clouds are real and can be used safely; however, they are a shared responsibility.
    • Only you can prevent cloud data loss, which means do your homework, be ready.
    • If something can go wrong, it probably will, particularly if humans are involved.
    • Prepare for the unexpected and clarify assumptions vs. realities of service capabilities.
    • Leverage fault isolation and containment to prevent rolling or spreading disasters.
    • Look at cloud services beyond lowest cost or for cost avoidance.
• What is your organization's culture for learning from mistakes vs. fixing blame?
    • Ask yourself if you, your applications and organization are ready for clouds.
    • Ask your cloud providers if they are ready for you and your applications.
    • Identify what your cloud concerns are to decide what can be done about them.
    • Do a proof of concept to decide what types of clouds and services are best for you.

    Do not be scared of clouds, however be ready, do your homework, learn from the mistakes, misfortune and errors of others. Establish and leverage known best practices while creating new ones. Look at the past for guidance to the future, however avoid clinging to, and bringing the baggage of the past to the future. Use new technologies, tools and techniques in new ways vs. using them in old ways.

    Ok, nuff said.

    Cheers gs


    Cloud conversations: Gaining cloud confidence from insights into AWS outages


    This is the first of a two-part industry trends and perspectives series looking at how to learn from cloud outages (read part II here).

In case you missed it, there were some public cloud outages during the recent Christmas 2012 holiday season. One incident involved Microsoft Xbox (view the Microsoft Azure status dashboard here), whose users were impacted, and the other was another Amazon Web Services (AWS) incident. Microsoft and AWS are not alone; most if not all cloud services have had some type of incident and have gone on to improve from those outages. Google has had issues with different applications and services, including some in December 2012 along with a Gmail incident that received coverage back in 2011.

For those interested, here is a link to the AWS status dashboard and a link to the AWS December 24, 2012 incident postmortem. In the case of the recent AWS incident, which affected users such as Netflix, the incident (read the AWS postmortem and Netflix postmortem) was tied to a human error. This is not to say AWS has more outages or incidents vs. others including Microsoft; it just seems that we hear more about AWS when things happen compared to others. That could be due to AWS's size and arguably market-leading status, diversity of services, and the scale at which some of their clients are using them.

Btw, if you were not aware, Microsoft Azure is more than just about supporting SQL Server, Exchange, SharePoint or Office; it is also an IaaS layer for running virtual machines such as Hyper-V, as well as a storage target for storing data. You can use Microsoft Azure storage services as a target for backup or archiving, or as general storage, similar to using AWS S3, Rackspace Cloud Files or other services. Some backup and archiving AaaS and SaaS providers, including Evault, partner with Microsoft Azure as a storage repository target.

When reading some of the coverage of these recent cloud incidents, I am not sure if I am more amazed by some of the marketing cloud washing, or by the cloud bashing and uninformed reporting or lack of research and insight. Then again, if someone repeats a myth often enough for others to hear and repeat, as it gets amplified the myth may assume the status of reality. After all, you may know the expression that if it is on the internet then it must be true?


    Have AWS and public cloud services become a lightning rod for when things go wrong?

    Here is some coverage of various cloud incidents:

The above are a small sampling of different stories, articles, columns, blogs and perspectives about cloud services outages or other incidents. Assuming the services are available, you can Google or Bing many others, along with reading postmortems to gain insight into what happened, the cause and effect, and how to prevent a recurrence.

Do these recent incidents show a trend of increased cloud outages? Alternatively, do they show that cloud services are being used more and on a larger basis, and thus the impacts become more widely known?

Perhaps it is a mix of the above; like when a magnetic storage tape gets lost or stolen, it makes for good news or copy, something to write about. Granted, fewer tapes are actually lost than in the past, and far fewer vs. lost or stolen laptops and other devices with data on them. There are probably other reasons, such as the lightning rod effect: given how much industry hype there is around clouds, when something does happen, the cynics or foes come out in force, sometimes with FUD.

Similar to traditional hardware- or software-based product vendors, some service providers have even tried to convince me that they have never had an incident, never lost, corrupted or compromised any data; yeah, right. Candidly, I put more credibility and confidence in a vendor or solution provider who tells me that they have had incidents and taken steps to prevent them from recurring. Granted, some of those steps might be made public while others might be under NDA; at least they are learning and implementing improvements.

    As part of gaining insights, here are some links to AWS, Google, Microsoft Azure and other service status dashboards where you can view current and past situations.

    What is your take on IT clouds? Click here to cast your vote and see what others are thinking about clouds.

Ok, nuff said for now (check out part II here).

Disclosure: I am a customer of AWS for EC2, EBS, S3 and Glacier, as well as a customer of Bluehost for hosting and Rackspace for backups. Other than Amazon being a seller of my books (and my blog via Kindle), running ads on my sites, and my being an Amazon Associates member (Google also has ads), none of those mentioned are or have been StorageIO clients.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Data protection modernization, more than swapping out media


    Have you modernized your data protection strategy and environment?

    If not, are you thinking about updating your strategy and environment?

    Why modernize your data protection including backup restore, business continuance (BC), high availability (HA) and disaster recovery (DR) strategy and environment?


    Is it to leverage new technology such as disk to disk (D2D) backups, cloud, virtualization, data footprint reduction (DFR) including compression or dedupe?

Perhaps you have modernized, or are considering data protection modernization, because somebody told you to, or you read about it or watched a video or web cast. Or perhaps your backup and restore are broken, so it's time to change media or try something different.

Let's take a step back for a moment and ask the question: what is your view of data protection modernization?

    Perhaps it is modernizing backup by replacing tape with disk, or disk with clouds?

    Maybe it is leveraging data footprint reduction (DFR) techniques including compression and dedupe?


    How about instead of swapping out media, changing backup software?

    Or what about virtualizing servers moving from physical machines to virtual machines?

    On the other hand maybe your view of modernizing data protection is around using a different product ranging from backup software to a data protection appliance, or snapshots and replication.

The above and others certainly fall under the broad group of backup, restore, BC, DR and archiving; however, there is another area which is not so much technology as it is techniques, best practices, processes and procedures. That is, revisit why data and applications are being protected, against what applicable threat risks and associated business risks.


This means reviewing service needs and wants, including backup, restore, BC, DR and archiving, which in turn drive what data and applications to protect, how often, how many copies and where those are located, along with how long they will be retained.
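As a rough illustration of the idea, those service needs (what to protect, how often, how many copies, where, and for how long) can be captured as a simple tiered policy table. The tier names and values below are hypothetical placeholders for discussion, not recommendations from any specific product:

```python
# Hypothetical sketch: service needs and wants expressed as tiered
# protection policies. All tier names, frequencies, copy counts,
# locations, and retention values are illustrative assumptions.
PROTECTION_POLICIES = {
    "critical": {"frequency_hours": 1, "copies": 4,
                 "sites": ["local", "offsite", "cloud"], "retention_days": 365},
    "important": {"frequency_hours": 24, "copies": 3,
                  "sites": ["local", "cloud"], "retention_days": 90},
    "general": {"frequency_hours": 168, "copies": 2,
                "sites": ["cloud"], "retention_days": 30},
}

def policy_for(app_tier: str) -> dict:
    """Look up the protection policy for an application tier,
    defaulting to the least aggressive tier if the tier is unknown."""
    return PROTECTION_POLICIES.get(app_tier, PROTECTION_POLICIES["general"])
```

The point of a table like this is that the policy drives the technology choices, not the other way around.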


    Modernizing data protection is more than simply swapping out old or broken media like flat tires on a vehicle.

To be effective, data protection modernization involves taking a step back from the technology, tools and buzzword-bingo topics to review what is being protected and why. It also means revisiting service level expectations and clarifying wants vs. needs: if it were free, what would be wanted, versus, given a cost, what is actually required.


Certainly technologies and tools play a role; however, simply using new tools and techniques without revisiting data protection challenges at the source will result in new problems that resemble old problems.


    Hence to support growth with a constrained or shrinking budget while maintaining or enhancing service levels, the trick is to remove complexity and costs.


This means not treating all data and applications the same; stretching your available resources to be more effective without compromising on service is the mantra of modernizing data protection.

    Ok, nuff said for now, plenty more to discuss later.


    The blame game: Does cloud storage result in data loss?

I recently came across a piece by Carl Brooks over at IT Tech News Daily that caught my eye, titled Cloud Storage Often Results in Data Loss. The piece has an effective title (good for search engine optimization, or SEO) as it stood out from many others I saw on that particular day.


What caught my eye in Carl's piece is that it reads as if the facts, based on a quick survey, point to clouds resulting in data loss, as opposed to being an opinion that some cloud usage can result in data loss.


My opinion is that if not used properly, including ignoring best practices, any form of data storage medium or media could result in, or be blamed for, data loss. Some people have lost data as a result of using cloud storage services, just as other people have lost data or access to information on other storage mediums and solutions. For example, data has been lost on tape, Hard Disk Drives (HDDs), Solid State Devices (SSDs), Hybrid HDDs (HHDDs), RAID and non-RAID, local and remote, and even optical-based storage systems large and small. In some cases there have been errors or problems with the medium or media; in other cases storage systems have lost access to, or lost, data due to hardware, firmware, software or configuration issues, including human error, among other causes.


Technology failure: Not if, rather when, and how to decrease the impact
Any technology, regardless of what it is or who it is from, along with its architecture, design and implementation, can fail. It is not if, rather when, and how gracefully; what safeguards decrease the impact, and how faults are contained or isolated, is what differentiates various products or solutions. How they automatically repair and self-heal to keep running or support accessibility and maintain data integrity is important, as is how those options are used. Granted, a failure may not be technology related per se, rather something associated with human intervention, configuration, change management (or lack thereof), along with accidental or intentional activities.

    Walking the talk
I have used public cloud storage services for several years, including SaaS and AaaS as well as IaaS (see more XaaS here), and, knock on wood, have not lost any data yet; loss of access, sure, however no data has been lost.

    I follow my advice and best practices when selecting cloud providers looking for good value, service level agreements (SLAs) and service level objectives (SLOs) over low cost or for free services.

In the several years of using cloud-based storage and services there has been some loss of access, however no loss of data. Those service disruptions, or loss of access to data and services, ranged from a few minutes to a little over an hour. In those scenarios, if I could not have waited for cloud storage to become accessible, I could have accessed a local copy if it were available.

Had a major disruption occurred where it would have been several days before I could gain access to that information, or if it were actually lost, I have a data insurance policy. That data insurance policy is part of my business continuance (BC) and disaster recovery (DR) strategy. My BC and DR strategy is a multi-layered approach combining local, offline and offsite along with online cloud data protection and archiving.

Assuming my cloud storage service could get data back to a given point (RPO) in a given amount of time (RTO), I have some options. One option is to wait for the service or information to become available again, assuming a local copy is no longer valid or available. Another option is to start restoration from a master gold copy and then roll forward changes from the cloud services as that information becomes available. In other words, I am using cloud storage as another resource, both for protecting what is local and for complementing how I locally protect things.
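A minimal sketch of that roll-forward option, assuming changes are simple timestamped key/value records (an illustrative record shape, not any particular service's format): restore the gold copy first, then replay only the changes newer than the snapshot.

```python
def roll_forward_restore(gold_copy: dict, change_log: list, snapshot_time: float) -> dict:
    """Restore from a master (gold) copy, then replay changes from the
    cloud service that are newer than the snapshot. Each change is an
    illustrative (timestamp, key, value) tuple."""
    restored = dict(gold_copy)  # start from the local master copy
    for ts, key, value in sorted(change_log):  # replay in time order
        if ts > snapshot_time:  # only apply changes after the snapshot
            restored[key] = value
    return restored
```

The same pattern applies whether the "change log" is a replication stream, incremental backups, or application journals.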

    Minimize or cut data loss or loss of access
Anything important should be protected locally and remotely, meaning leveraging cloud as well as a master or gold backup copy.

To cut the cost of protecting information, I also leverage archives, which means not all data gets protected the same. Important data is protected more often, reducing RPO exposure and speeding up RTO during restoration. Other data that is not as important is still protected, however on a different frequency with other retention cycles; in other words, tiered data protection. By implementing tiered data protection, best practices, and various technologies, including data footprint reduction (DFR) such as archiving, compression and dedupe, in addition to local disk to disk (D2D), disk to disk to cloud (D2D2C), along with routine copies to offline media (removable HDDs, or RHDDs) that go offsite, I'm able to stretch my data protection budget further. Not only is my data protection budget stretched further, I have more options to speed up RTO, better detail for recovery, and enhanced RPOs.
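As a back-of-the-envelope helper, DFR savings can be estimated from assumed N:1 compression and dedupe ratios. Actual ratios vary widely by data type and change rate, so treat this as a planning sketch only:

```python
def effective_footprint(raw_gb: float, compression_ratio: float, dedupe_ratio: float) -> float:
    """Approximate stored capacity after data footprint reduction (DFR).
    Ratios are expressed as N:1 (e.g., 2.0 means 2:1). Real-world
    reduction depends heavily on the data, so results are rough."""
    return raw_gb / (compression_ratio * dedupe_ratio)
```

For example, 100 GB of protection copies at an assumed 2:1 compression and 5:1 dedupe would land around 10 GB stored, which is the kind of arithmetic behind stretching a protection budget.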

    If you are looking to avoid losing data, or loss of access, it is a simple equation in no particular order:

    • Strategy and design
    • Best practices and processes
    • Various technologies
    • Quality products
    • Robust service delivery
    • Configuration and implementation
    • SLO and SLA management metrics
    • People skill set and knowledge
    • Usage guidelines or terms of service (ToS)

Unfortunately, clouds, like other technologies or solutions, get a bad reputation or get blamed when something goes wrong. Sometimes it is the technology or service that fails; other times it is a combination of errors that resulted in loss of access or lost data. With clouds, as has been the case with other storage mediums and systems in the past, when something goes wrong and it has been hyped, chances are it will become a target for blame or finger pointing vs. determining what went wrong so that it does not occur again. For example, cloud storage has been hyped as easy to use: don't worry, just put your data there, you can get out of the business of managing storage as the cloud will do that magically for you behind the scenes.

The reality is that while cloud storage solutions can offload functions, someone is still responsible for making decisions on their usage and configuration that impact availability. What separates various providers is their ability to design in best practices, isolate and contain faults quickly, and integrate resiliency as part of a solution, along with various SLAs aligned to the service level you are expecting, in an easy to use manner.

    Does that mean the more you pay the more reliable and resilient a solution should be?
    No, not necessarily, as there can still be risks including how the solution is used.

    Does that mean low cost or for free solutions have the most risk?
    No, not necessarily as it comes down to how you use or design around those options. In other words, while cloud storage services remove or mask complexity, it still comes down to how you are going to use a given service.

    Shared responsibility for cloud (and non cloud) storage data protection
Anything important enough that you cannot afford to lose, or need quick access to, should be protected in different locations and on various mediums. In other words, balance your risk. Cloud storage service providers need to take responsibility for meeting service expectations for a given SLA and SLOs that you agree to pay for (unless free).

As the customer, you have the responsibility of following best practices supplied by the service provider, including reading the ToS. Part of the responsibility as a customer or consumer is to understand what the ToS, SLA and SLOs are for a given level of service that you are using. As a customer or consumer, this means doing your homework to be ready as a smart, educated buyer or consumer of cloud storage services.

If you are a vendor or value-added reseller (VAR), your opportunity is to help customers with the acquisition process to make informed decisions. For VARs and solution providers, this can mean up-selling customers to a higher level of service by making them aware of the risk and reward benefits as opposed to focusing on cost. After all, if an order taker at McDonald's can ask "Would you like to supersize your order?", why can't you as a vendor or solution provider also have a value-oriented up-sell message?

    Additional related links to read more and sources of information:

    Choosing the Right Local/Cloud Hybrid Backup for SMBs
    E2E Awareness and insight for IT environments
    Poll: What Do You Think of IT Clouds?
    Convergence: People, Processes, Policies and Products
    What do VARs and Clouds as well as MSPs have in common?
    Industry adoption vs. industry deployment, is there a difference?
    Cloud conversations: Loss of data access vs. data loss
    Clouds and Data Loss: Time for CDP (Commonsense Data Protection)?
    Clouds are like Electricity: Dont be scared
    Wit and wisdom for BC and DR
    Criteria for choosing the right business continuity or disaster recovery consultant
    Local and Cloud Hybrid Backup for SMBs
    Is cloud disaster recovery appropriate for SMBs?
    Laptop data protection: A major headache with many cures
    Disaster recovery in the cloud explained
    Backup in the cloud: Large enterprises wary, others climbing on board
    Cloud and Virtual Data Storage Networking (CRC Press, 2011)
    Enterprise Systems Backup and Recovery: A Corporate Insurance Policy

    Poll:  Who is responsible for cloud storage data loss?

    Taking action, what you should (or not) do
Don't be scared of clouds; however, do your homework, be ready, look before you leap and follow best practices. Look into the service level agreements (SLAs) associated with a given cloud storage product or service. Follow best practices for how you or someone else will protect what data is put into the cloud.

For critical data or information, consider having a copy of that data in the cloud as well as in another place, which could be a different cloud, local, or offsite and offline. Keep in mind the theme for critical information and data is not if, rather when, so consider what can be done to decrease the risk or impact of something happening; in other words, be ready.
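One way to sanity-check that balance is the common 3-2-1 guideline (at least three copies, on two different media types, with one offsite). A minimal sketch, with illustrative media and location labels:

```python
def meets_321(copies: list) -> bool:
    """Check a simple 3-2-1 guideline: at least 3 copies of the data,
    on at least 2 different media types, with at least 1 copy offsite
    (cloud or another site). Each copy is an illustrative
    (media_type, location) tuple; 'local' marks an onsite copy."""
    media_types = {media for media, _ in copies}
    offsite = [loc for _, loc in copies if loc != "local"]
    return len(copies) >= 3 and len(media_types) >= 2 and len(offsite) >= 1
```

A primary disk copy, a local removable-media copy, and a cloud copy would pass; two copies on the same local disk would not.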

Data put into the cloud can be lost, or loss of access to it can occur for some amount of time, just as happens with non-cloud storage such as tape, disk or SSD. What impacts or minimizes your risk of using traditional local or remote as well as cloud storage are the best practices: how it is configured, protected, secured and managed. Another consideration is that the type and quality of the storage product or cloud service can have a big impact. Sure, a quality product or service can fail; however, you can also design and configure to decrease those impacts.

    Wrap up
Bottom line: do not be scared of cloud storage; however, be ready, do your homework, review best practices, understand benefits and caveats, risk and reward. For those who want to learn more about cloud storage (public, private and hybrid) along with data protection, data management, data footprint reduction, among other related topics and best practices, I happen to know of some good resources. Those resources, in addition to the links provided above, include Cloud and Virtual Data Storage Networking (CRC Press), which you can learn more about here as well as find at Amazon among other venues. Also check out Enterprise Systems Backup and Recovery: A Corporate Insurance Policy by Preston De Guise (aka twitter @backupbear), which is a great resource for protecting data.

    Ok, nuff said for now


What do you do when your service provider drops the ball?

    Do you have a web, internet, backup or other IT cloud service provider of some type?

    Do you pay for it, or is it a free service?

    Do you take your service provider for granted?

    Does your service provider take you or your data for granted?

    Does your provider offer some form of service level objectives (SLO)?

For example, Recovery Time Objectives (RTO), Recovery Point Objectives (RPO), Quality of Service (QoS) or, if a backup service, alternate forms of recovery, among others?

So what happens when there is a service disruption? Do you threaten to leave the provider, and if so, how much does that (or would it) cost you to move?
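As a simple illustration of what an RPO objective implies, the worst-case exposure with periodic backups is roughly the backup interval plus the time a backup takes to complete. This is a simplified model that ignores continuous replication and other techniques:

```python
def worst_case_rpo_hours(backup_interval_hours: float,
                         backup_duration_hours: float = 0.0) -> float:
    """Worst-case recovery point exposure with periodic backups:
    data written just after a backup starts is not protected until
    the next backup completes. Simplified planning model only."""
    return backup_interval_hours + backup_duration_hours
```

So a nightly backup that takes two hours implies up to roughly 26 hours of potentially unprotected changes, which is the kind of number to weigh against what a provider's SLOs actually promise.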

A couple of weeks ago I was on a Delta Airlines flight from LAX to MSP, returning from a west coast speaking engagement.

During the late-evening, three-hour flight, I was using the Gogo inflight wifi service to get caught up on some emails, blog items and other work, in addition to doing a few twitter tweets while flying high over the real clouds from my virtual office.

During that time, I saw a twitter tweet from Devang Panchigar (@storageNerve) commenting that his hosting service provider Bluehost was down or offline. This caught my attention as Bluehost is also my service provider, and a quick check verified that my sites and services were still working. I subsequently sent a tweet to Devang indicating that Bluehost, or at least my sites and services, were still functioning, at least for the time being, as I was about to find out. Long story short, about 20 to 25 minutes later, I noticed that I could no longer get to any of my sites; lo and behold, my Bluehost services were also now offline.


Overall, I have been pleased with Bluehost as a service provider, including finding their call support staff very accommodating and easy to work with when I have questions or need something taken care of. Normally I would have simply called Bluehost to see what was going on; however, being at about 38,000 feet above the clouds, a quick conversation was not going to be possible. Instead, I checked some forums, which revealed Bluehost was experiencing some electrical power issues with their data center (I believe in Utah). Looking at some of the forums as well as various twitter comments, I also decided to check whether Bluehost CEO Matt Heaton's blog was functioning (it was).

It would have been too easy to do one of those irate-customer posts telling them how bad they were, how I was dropping them like a hot potato, and then doing a blog post telling everyone to never use them again, or along those lines; such posts are far too common and often get deleted as spam.

Instead, I took a different approach. My comment on Matt's blog post took a week or so to be moderated (and has since been deleted). Essentially, my post took the opposite approach of the usual customer tirade, instead commenting how ironic it was that the hosting service for my web site, which contains content about resilient data infrastructure themes, was offline.

Now, I realize that I am not paying for a high-end, no-downtime, always-available hosting service; however, I also realize that I am paying for a more premium package vs. a basic subscription or a free service. While I was not happy about the one hour of downtime around midnight, it was comforting to know that no data was lost and my sites were only offline for a short period of time.

    What does all of this mean?

    There have been some widely publicized and discussed internet and cloud service related disruptions.

I hope Bluehost continues to improve its services to stay out of the news for a major disruption, as well as to minimize or eliminate downtime for its fee-based services.

    I also hope that Bluehost CEO Matt Heaton continues to listen to what his customers have to say while improving his services to keep us as customers instead of taking us for granted as some providers or vendors do.

Thanks again to Devang for the tip that there was a service disruption. After all, sometimes we take services for granted, and in other situations some service providers take their customers for granted.

    Ok, nuff said.


Vendors Who Don't Want to Be Virtualized?


This past week I did a couple of keynote and round table discussions in Plano (Dallas) at Jaspers and in Boston at Smith and Wollensky with a theme of BC/DR for virtualized environments. In both locations, where we had great participant involvement and discussions, audience members discussed the various merits of and their experiences with server virtualization, and one of the many common themes was vendors who do not support their vertical applications in virtualized environments.

Say it ain't so, Joe (or Jane), especially with so many vendors tripping over themselves to show how their software can be stuffed into a VM in order to jump on the VM bandwagon. How could it be that some vendors don't want to be virtualized?

It's true: there are some independent software vendors (ISVs) whose vertical packages are commonly deployed in environments of all sizes who, for various reasons, do not want nor support their software running in a virtualized environment.

The reasons some vendors of vertical-specific applications do not support their software in virtualized environments vary: quality of service (QoS), performance, contention, response time or availability concerns; the desire to continue selling physical servers and other hardware with their applications; or the desire to keep their application on a server platform where they can control QoS by ensuring that no other applications or changes are made to the server and associated operating system environment.

Yet another example can be that the vendor has simply not had a chance to test, or to test in various permutations, and thus takes the route of not supporting their solutions in a virtualized, or what they may perceive as a consolidated, environment.

This is in no way a new trend; for decades, vendors of vertical software have often taken a stance of not allowing other applications to be installed on a server where their software is installed, in order to maintain QoS and service level agreement (SLA) levels and support guarantees.

In some cases, such as specialized applications including hospital patient care or related systems, this can make sense, as can complying with regulatory requirements. However, there are plenty of other applications where vendors drag their feet or resist supporting virtualized environments without realizing that not all virtualized environments need to be consolidated. That is, a stepping stone or baby step can be to first install their software on a VM that has a dedicated physical machine (PM), to validate that there are no instabilities or QoS impacts of running in a VM.

After some period of time and once comfort levels are established, the application and its associated VM could be placed alongside some number of other VMs in an incremental and methodical manner to determine what, if any, impacts occur.

The bottom line is this: not all applications and servers lend themselves to being consolidated, for various reasons; however, many of those applications and servers can be virtualized to enable management transparency, including facilitating movement to other servers during upgrades or maintenance as well as BC/DR (e.g., life beyond consolidation), a topic that I cover in more detail in my new book "The Green and Virtual Data Center" (Auerbach).

Likewise, there are some applications that truly, for security, QoS, availability, politics, software or hardware dependencies, or compatibility among other reasons, should be left alone for now. However, there are also many applications where vendors need to re-think why they do not support a virtualized server environment and better articulate those issues to their customers, or start the testing and qualification as well as put together best-practices guides on how to deploy their applications into virtualized environments.

Thanks to all of those who ventured out this week in Plano and Boston and participated in the discussion; I look forward to seeing and hearing from you again in the not-so-distant future.

    Ok, nuff said.
