post Archives

January 14, 2013 / December 29, 2025

EMC VMAX 10K, looks like high-end storage systems are still alive (part III)

StorageIO industry trends cloud, virtualization and big data

This is the third in a multi-part series of posts (read first post here and second post here) looking at what else EMC announced today in addition to an enhanced VMAX 10K, and dispelling the myth that large storage arrays are dead (at least for now).

In addition to the VMAX 10K specific updates, EMC also announced the release of a new version of their Enginuity storage software (firmware, storage operating system). Enginuity is supported across all VMAX platforms and features the following:

Replication enhancements include TimeFinder clone refresh, restore and four-site SRDF for the VMAX 10K, along with thick or thin support. This capability enables functionality across VMAX 10K, 40K or 20K using synchronous or asynchronous modes and extends the earlier three-site support to four-site and mixed modes. Note that larger VMAX systems already had the extended replication feature support, with the VMAX 10K now on par with those. Note that the VMAX can be enhanced with VPLEX in front of storage systems (local or wide area, in region HA and out of region DR) and RecoverPoint behind the systems supporting bi-synchronous (two-way), synchronous and asynchronous data protection (CDP, replication, snapshots).

Unisphere for VMAX 1.5 manages DMX along with VMware VAAI UNMAP and space reclamation, block zero and hardware clone enhancements, IPv6, Microsoft Windows Server 2012 support and VFCache 1.5.

Support for mix of 2.5 inch and 3.5 inch DAEs (disk array enclosures) along with new SAS drive support (high-performance and high-capacity, and various flash-based SSD or EFD).

The addition of a fourth dynamic tier within FAST for supporting third-party virtualized storage, along with compression of inactive, cold or stale data (manual or automatic) with a 2 to 1 data footprint reduction (DFR) ratio. Note that EMC was one of the early vendors to put compression into its storage systems on a block LUN basis in the CLARiiON (now VNX), along with NetApp and IBM (via their Storwize acquisition). The new fourth tier also means that third-party storage does not have to be the lowest tier in terms of performance or functionality.

Federated Tiered Storage (FTS) is now available on all EMC block storage systems including those with third-party storage attached in virtualization mode (e.g. VMAX). In addition to supporting tiering across its own products, and those of other vendors that have been virtualized when attached to a VMAX, ANSI T10 Data Integrity Field (DIF) is also supported. Read more about T10 DIF here, and here.

Front-end performance enhancements with host I/O limits (Quality of Service or QoS) for multi tenant and cloud environments to balance or prioritize IO across ports and users. This feature can balance based on thresholds for IOPS, bandwidth or both from the VMAX. Note that this feature is independent of any operating system based tool, utility, pathing driver or feature such as VMware DRS and Storage I/O control. Storage groups are created and mapped to specific host ports on the VMAX with the QoS performance thresholds applied to meet specific service level requirements or objectives.
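To make the idea of IOPS and bandwidth thresholds per storage group a bit more concrete, here is a minimal token-bucket sketch. Keep in mind this is only an illustration of the general rate limiting technique, not how Enginuity implements host I/O limits; the class name, parameters and limit values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HostIOLimit:
    """Hypothetical per-storage-group limit on IOPS, bandwidth, or both."""
    max_iops: Optional[float] = None        # None means no IOPS threshold
    max_mb_per_sec: Optional[float] = None  # None means no bandwidth threshold
    _io_tokens: float = field(init=False, default=0.0)
    _mb_tokens: float = field(init=False, default=0.0)
    _last: float = field(init=False, default=0.0)

    def __post_init__(self):
        # Start with a full one-second budget for each configured threshold.
        self._io_tokens = self.max_iops or 0.0
        self._mb_tokens = self.max_mb_per_sec or 0.0

    def admit(self, now: float, io_size_mb: float) -> bool:
        """Return True if an I/O of io_size_mb may proceed at time now (seconds)."""
        elapsed = max(0.0, now - self._last)
        self._last = now
        # Refill each bucket in proportion to elapsed time, capped at one second's worth.
        if self.max_iops is not None:
            self._io_tokens = min(self.max_iops, self._io_tokens + elapsed * self.max_iops)
        if self.max_mb_per_sec is not None:
            self._mb_tokens = min(self.max_mb_per_sec,
                                  self._mb_tokens + elapsed * self.max_mb_per_sec)
        # Reject if either configured threshold would be exceeded.
        if self.max_iops is not None and self._io_tokens < 1.0:
            return False
        if self.max_mb_per_sec is not None and self._mb_tokens < io_size_mb:
            return False
        # Charge the I/O against the configured buckets.
        if self.max_iops is not None:
            self._io_tokens -= 1.0
        if self.max_mb_per_sec is not None:
            self._mb_tokens -= io_size_mb
        return True


# Hypothetical storage groups, each with its own service level thresholds.
limits = {
    "tenant_gold": HostIOLimit(max_iops=5000, max_mb_per_sec=400),
    "tenant_bronze": HostIOLimit(max_iops=500),
}
```

A real array would queue or delay I/O rather than simply reject it; the point here is only that the thresholds are tracked per storage group and independent of anything running on the host.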

For discussion (or entertainment) purposes, how about the question of whether Enginuity qualifies or can be considered as a storage hypervisor (or storage virtualization or virtual storage)? After all, the VMAX is now capable of having third-party storage from other vendors attached to it, something that HDS has done for many years now. For those who feel a storage hypervisor, virtual storage or storage virtualization requires software running on Intel or other commodity based processors, guess what the VMAX uses for CPU processors (granted, you can’t simply download Enginuity software and run it on a Dell, HP, IBM, Oracle or SuperMicro server).

I am guessing some of EMC's competitors and their surrogates, or others who like to play the storage hypervisor card game, will be quick to tell you it is not, based on various reasons or product comparisons; however, you be the judge.

Back to the question of whether traditional high-end storage arrays are dead or dying (from part one in this series).

IMHO, as mentioned, not yet.

Granted, like other technologies that have been declared dead or dying yet are still in use (technology zombies), they continue to be enhanced, finding new customers, or existing customers using them in new ways. Their roles are evolving, thus they are still alive.

For some environments as has been the case over the past decade or so, there will be a continued migration from large legacy enterprise class storage systems to midrange or modular storage arrays with a mix of SSD and HDD. Thus, watch out for having a death grip not letting go of the past, while being careful about flying blind into the future. Do not be scared, be ready, do your homework with clouds, virtualization and traditional physical resources.

Likewise, there will be the continued migration for some from traditional mid-range class storage arrays to all flash-based appliances. Yet others will continue to leverage all the above in different roles aligned to where their specific features best serve the applications and needs of an organization.

In the case of high-end storage systems such as EMC VMAX (formerly known as DMX and Symmetrix before that) based on its Enginuity software, the hardware platforms will continue to evolve as will the software functionality. This means that these systems will evolve to handle more workloads, as well as move into new environments, from service providers to mid-range organizations, where the systems were previously out of reach.

Smaller environments have grown larger as have their needs for storage systems while higher end solutions have scaled down to meet needs in different markets. What this means is a convergence of where smaller environments have bigger data storage needs and can afford the capabilities of scaled down or Right-sized storage systems such as the VMAX 10K.

Thus while some of the high-end systems may fade away faster than others, for those that continue to evolve being able to move into different adjacent markets or usage scenarios, they will be around for some time, at least in some environments.

Avoid confusing what is new and cool falling under industry adoption vs. what is productive and practical for customer deployment. Systems like the VMAX 10K are not for all environments or applications; however, for those who are open to exploring alternative solutions and approaches, it could open new opportunities.

If there is a high-end storage system platform (e.g. Enginuity) that continues to evolve, re-invent itself in terms of moving into or finding new uses and markets the EMC VMAX would be at or near the top of such list. For the other vendors of high-end storage system that are also evolving, you can have an Atta boy or Atta girl as well to make you feel better, loved and not left out or off of such list. ;)

Ok, nuff said for now.

Disclosure: EMC is not a StorageIO client; however, they have been in the past directly and via acquisitions that they have done. I am however a customer of EMC via my Iomega IX4 NAS (I never did get the IX2 that I supposedly won at EMCworld ;) ) that I bought on Amazon.com and indirectly via VMware products that I have, oh, and they did send me a copy of the new book Human Face of Big Data (read more here).

Ok, nuff said (for now).

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

January 14, 2013 / December 29, 2025

EMC VMAX 10K, looks like high-end storage systems are still alive (part II)


This is the second in a multi-part series of posts (read first post here) looking at whether large enterprise and legacy storage systems are dead, along with what today's EMC VMAX 10K updates mean.

Thus on January 14, 2013 it is time for a new EMC Virtual Matrix (VMAX) model 10,000 (10K) storage system. EMC has been promoting their January 14 live virtual event for a while now. The significance of January is that it (along with May or June) is when many new systems, solutions or upgrades are announced on a staggered basis.

Historically speaking, January and February, along with May and June, are when you have seen many of the larger announcements from EMC being made. Case in point, back in February of 2012 VFCache was released, then in May 2012 at EMCworld in Las Vegas there were 42 announcements made, with others later in the year.

Click here to see images of the car stuffing or click here to watch a video.

Let’s not forget that back in February of 2012 VFCache was released, and going back to January 2011 there was the record-setting event in New York City complete with 26 people being compressed, deduped, single instanced, optimized, stacked and tiered into a Mini Cooper (Coop) automobile (read and view more here).

Now back to the VMAX 10K enhancements

The VMAX 10K is an example of a company, product family and specific storage system model that is still alive. Although this announcement by EMC is VMAX 10K centric, there is also a new version of the Enginuity software (firmware, storage operating system, valueware) that runs across all VMAX based systems including VMAX 20K and VMAX 40K. Read here, here, here and here to learn more about VMAX and Enginuity systems in general.

Some main themes of this announcement include Tier 1 reliability, availability and serviceability (RAS) storage systems functionality at tier 2 pricing for traditional, virtual and cloud data centers.

Some other themes of this announcement by EMC:

Flexible, scalable and resilient with performance to meet dynamic needs

Support private, public and hybrid cloud along with federated storage models

Simplified decision-making, acquisition, installation and ongoing management

Enable traditional, virtual and cloud workloads

Complement its siblings VMAX 40K, 20K and SP (Service Provider) models

Note that the VMAX SP is a model configured and optimized for easy self-service and private cloud, storage as a service (SaaS), IT as a Service (ITaaS) and public cloud service providers needing multi-tenant capabilities with service catalogs and associated tools.

So what is new with the VMAX 10K?

It is twice as fast (per EMC performance results) as the earlier VMAX 10K by leveraging faster 2.8GHz Intel Westmere vs. earlier 2.5GHz Westmere processors. In addition to faster cores, there are more of them, from 4 to 6 on directors, and from 8 to 12 on VMAX 10K engines. The PCIe (Gen 2) IO busses remain unchanged as does the RapidIO interconnect. RapidIO is used for connecting nodes and engines, while PCIe is used for adapter and device connectivity. Memory stays the same at up to 128GB of global DRAM cache, along with dual virtual matrix interfaces (how the nodes are connected). Note that there is no increase in the amount of DRAM based cache memory in this new VMAX 10K model.

This should prompt the question: for traditional storage systems such as VMAX that are cache centric or dependent on cache for performance, how much are they now dependent on CPUs and their associated L1 / L2 caches? Also, how much has the Enginuity code under the covers been enhanced to leverage the multiple cores and threads, thus shifting from being cache memory dependent to processor hungry?
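As a back-of-the-envelope way to look at that question, the sketch below multiplies core count by clock speed for the old and new engines using the figures above. Treat this as a crude proxy under my own assumptions: cores times GHz ignores cache, memory bandwidth and I/O paths, and the "twice as fast" figure is EMC's claim, not something measured here.

```python
def aggregate_ghz(cores: int, clock_ghz: float) -> float:
    """Crude compute proxy: number of cores multiplied by clock speed."""
    return cores * clock_ghz


old_engine = aggregate_ghz(cores=8, clock_ghz=2.5)    # earlier VMAX 10K engine
new_engine = aggregate_ghz(cores=12, clock_ghz=2.8)   # updated VMAX 10K engine

raw_ratio = new_engine / old_engine   # gain from more and faster cores alone
claimed_speedup = 2.0                 # "twice as fast" per EMC performance results
residual = claimed_speedup / raw_ratio  # what is left for software or other factors

print(f"raw core-GHz ratio: {raw_ratio:.2f}x")
print(f"residual factor to reach 2x: {residual:.2f}x")
```

By this rough measure the hardware alone accounts for about 1.68x, which leaves roughly another 1.19x to come from somewhere else, such as Enginuity making better use of cores and threads.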

Also new with the updated VMAX 10K:

Support for dense 2.5 inch drives, along with mixed 2.5 inch and 3.5 inch form factor devices with a maximum of 1,560 HDDs. This means support for 2.5 inch 1TB 7,200 RPM SAS HDDs, along with fast SAS HDDs, SLC/MLC and eMLC solid state devices (SSD), also known as enterprise flash drives (EFD). Note that with higher density storage configurations, good disk enclosures become more important to counter or prevent the effects of drive vibration, something that leading vendors are paying attention to and so should customers.

With the VMAX 10K, EMC is also adding support for certain 3rd party racks or cabinets to be used for mounting the product. This means being able to mount the VMAX main system and DAE components into selected cabinets or racks to meet specific customer, colo or other environment needs for increased flexibility.

For security, VMAX 10K also supports Data at Rest Encryption (D@RE), which is implemented within the VMAX platform. All data is encrypted on every drive, of every drive type (drive independent), within the VMAX platform to avoid performance impacts. It uses AES 256 fixed block encryption with FIPS 140-2 validation (#1610) and embedded or external key management including RSA Key Manager. Note that since the storage system based encryption is done within the VMAX platform or controller, not only is the encrypt / decrypt off-loaded from servers, it also means that any device from SSD to HDD to third-party storage arrays can be encrypted. This is in contrast to drive based approaches such as self encrypting devices (SED) or other full drive encryption approaches. With embedded key management, encryption keys are kept and managed within the VMAX system, while external mode leverages RSA key management as part of a broader security solution approach.

In terms of addressing ease of decision-making and acquisition, EMC has bundled the core Enginuity software suite (virtual provisioning, FTS and FLM, DCP (dynamic cache partitioning), host I/O limits, Optimizer/virtual LUN and integrated RecoverPoint splitter). In addition, there are bundles for optimization (FAST VP, EMC Unisphere for VMAX with heat map and dashboards), availability (TimeFinder for VMAX 10K) and migration (Symmetrix migration suite, Open Replicator, Open Migrator, SRDF/DM, Federated Live Migration). Additional optional software includes RecoverPoint CDP, CRR and CLR, Replication Manager, PowerPath, SRDF/S, SRDF/A and SRDF/DM, Storage Configuration Advisor, Open Replicator with Dynamic Mobility and the ControlCenter/ProSphere package.

Who needs a VMAX 10K or where can it be used?

As the entry-level model of the VMAX family, certain organizations who are growing and looking for an alternative to traditional mid-range storage systems should be a primary opportunity. Assuming the VMAX 10K can sell at tier-2 prices with a focus of tier-1 reliability, feature functionality, and simplification while allowing their channel partners to make some money, then EMC can have success with this product. The challenge however will be helping their direct and channel partner sales organizations to avoid competing with their own products (e.g. high-end VNX) vs. those of others.

Consolidation of servers with virtualization, along with storage system consolidation to remove complexity in management and costs, should be another opportunity given the ability to virtualize third-party storage. I would expect EMC and their channel partners to place the VMAX 10K with its storage virtualization of third-party storage as an alternative to HDS VSP (aka USP/USPV) and the HP XP P9000 (Hitachi based) products, or for block storage needs the NetApp V-Series among others. There could be some scenarios where the VMAX 10K could be positioned as an alternative to the IBM V7000 (SVC based) for virtualizing third-party storage, or for larger environments, some of the software based appliances where there are scaling with stability (performance, availability, capacity, ease of management, feature functionality) concerns.

Another area where the VMAX 10K could see action, which will fly in the face of some industry thinking, is deployment in new and growing managed service providers (MSP), public cloud, and community clouds (private consortiums) looking for an alternative to open source based or traditional mid-range solutions. Otoh, I can't wait to hear somebody think outside of both the old and new boxes about how a VMAX 10K could be used beyond traditional applications or functionality. For example, filling it up with a few SSDs, and then balancing with 1TB 2.5 inch SAS HDDs and 3.5 inch 3TB (or larger when available) HDDs as an active archive target leveraging the built-in data compression.
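For a feel of what such an active archive mix might add up to, here is a small sizing sketch. The drive counts, the 200GB SSD size and the 80 percent cold data fraction are hypothetical values of mine; the 1,560 drive maximum and the 2 to 1 compression ratio for cold data are the figures EMC cites. RAID, sparing and formatting overhead are ignored, so real usable capacity would be lower.

```python
MAX_DRIVES = 1560  # maximum drive count cited for the updated VMAX 10K


def raw_capacity_tb(drive_mix):
    """Sum raw capacity for a list of (drive_count, capacity_tb) tuples."""
    return sum(count * tb for count, tb in drive_mix)


def logical_capacity_tb(raw_tb, cold_fraction, compression_ratio):
    """Logical data that fits when cold_fraction of the data is compressed.

    Each logical TB needs (1 - cold_fraction) TB uncompressed plus
    cold_fraction / compression_ratio TB compressed.
    """
    physical_per_logical = (1 - cold_fraction) + cold_fraction / compression_ratio
    return raw_tb / physical_per_logical


# Hypothetical mix: a few SSDs, then 2.5 inch 1TB and 3.5 inch 3TB HDDs.
mix = [
    (16, 0.2),   # 16 x 200GB SSD (assumed size)
    (400, 1.0),  # 400 x 1TB 2.5 inch SAS HDD
    (600, 3.0),  # 600 x 3TB 3.5 inch HDD
]

total_drives = sum(count for count, _ in mix)
assert total_drives <= MAX_DRIVES, "mix exceeds the cited drive maximum"

raw_tb = raw_capacity_tb(mix)
logical_tb = logical_capacity_tb(raw_tb, cold_fraction=0.8, compression_ratio=2.0)

print(f"{total_drives} drives, {raw_tb:.1f} TB raw, about {logical_tb:.1f} TB logical")
```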

How about if EMC were to support cloud optimized HDDs such as the Seagate Constellation Cloud Storage (CS) HDDs that were announced late in 2012, as well as the newer enterprise class HDDs, for opening up new markets? Also keep in mind that some of the new 2.5 inch SAS 10,000 (10K) RPM HDDs have the same performance capabilities as traditional 3.5 inch 15,000 (15K) RPM drives in a smaller footprint to help drive and support increased density of performance and capacity with improved energy effectiveness.

How about attaching a VMAX 10K with the right type of cost-effective (aligned to a given scenario) SSD or HDDs or third-party storage to a cluster or grid of servers that are running OpenStack including Swift, CloudStack, Basho Riak CS, Cleversafe, Scality, Caringo, Ceph or even EMC's own ATMOS (that supports external storage) for cloud storage or object based storage solutions? Granted, that would be thinking outside of the current or new box thinking, which is to move away from RAID based systems in favor of low-cost JBOD storage in servers; however, what the heck, let’s think in pragmatic ways.

Will EMC be able to open new markets and opportunities by making the VMAX and its Enginuity software platform and functionality more accessible and affordable, leveraging the VMAX 10K as well as the VMAX SP? Time will tell. After all, I recall similar questions or conversations back in the mid to late 90s, and then again several times during the 2000s, not to mention predictions of the demise of the large traditional storage systems.

Continue reading about what else EMC announced on January 14, 2013 in addition to VMAX 10K updates here in the next post in this series. Also check out Chuck's EMC blog to see what he has to say.

Ok, nuff said (for now).

Cheers gs


January 14, 2013 / December 29, 2025

EMC VMAX 10K, looks like high-end storage systems are still alive


This is the first in a multi-part series of posts looking at whether large enterprise and legacy storage systems are dead, along with what today's EMC VMAX 10K updates mean.

EMC has announced an upgrade, refresh or new version of their previously announced Virtual Matrix (VMAX) 10,000 (10K), part of the VMAX family of enterprise class storage systems formerly known as DMX (Direct Matrix) and Symmetrix. I will get back to more coverage on the VMAX 10K and other EMC enhancements in a few moments in parts two and three of this series.

Have you heard the industry myth about the demise or outright death of traditional storage systems? This has been particularly the case for high-end enterprise class systems, which by the way were first declared dead back in the mid-1990s, then at the hands of emerging mid-range storage systems.

Enterprise class storage systems include EMC VMAX, Fujitsu Eternus DX8700, HDS, HP XP P9000 based on the HDS high-end product (OEM from HDS parent Hitachi Ltd.). Note that some HPers or their fans might argue that the P10000 (formerly known as 3PAR) declared as tier 1.5 should also be on the list; I will leave that up to you to decide.

Let us not forget the IBM DS8000 series (whose predecessors were known as the ESS and VSS before that), although some IBMers will tell you that XIV should also be in this list. High-end enterprise class storage systems such as those mentioned above are not alone in being declared dead at the hands of new all solid-state devices (SSD) and their startup vendors, or mixed and hybrid-based solutions.

Some are even declaring the traditional mid-range storage systems, which were supposed to have killed off the enterprise systems a decade ago, dead at the hands of new SSD appliances or systems, storage hypervisors and virtual storage appliances (VSA) (hmm, DejaVu?).

The mid-range storage systems include among others block (SAN and DAS) and file (NAS) systems from Data Direct Networks (DDN), Dell Compellent, EqualLogic and MD series (NetApp Engenio based), EMC VNX and Isilon, Fujitsu Eternus, and HDS HUS mid-range formerly known as AMS. Let us not forget about HP 3PAR or P2000 (DotHill based) or P6000 (EVA, which is probably being put out to rest). Then there are the various IBM products (their own and what they OEM from others), NEC, NetApp (FAS and Engenio), Oracle and Starboard (formerly known as Reldata). Note that there are many startups that could be in the above list as well if they were not declaring the above to be dead, thus causing themselves to also be extinct, how ironic ;).

What are some industry trends that I am seeing?

Some vendors and products might be nearing the ends of their useful lives

Some vendors, their products and portfolios continue to evolve and expand

Some vendors and their products are moving into new or adjacent markets

Some vendors are refining where and what to sell, when and to whom

Some vendors are moving up market, some down market

Some vendors are moving into new markets, others are moving out of markets

Some vendors are declaring others dead to create a new market for their products

One size or approach or technology does not fit all needs, avoid treating all the same

Leverage multiple tools and technology in creative ways

Maximize return on innovation (the new ROI) by using various tools, technologies in ways to boost productivity, effectiveness while removing complexity and cost

Realization that cutting cost can result in reduced resiliency, thus look for and remove complexity with benefit of removing costs without compromise

Storage arrays are moving into new roles, including as back-end storage for cloud, object and other software stacks running on commodity servers to replace JBOD (DejaVu anyone?).

Keep in mind that there is a difference between industry adoption (what is talked about) and customer deployment (what is actually bought and used). Likewise there is technology based on GQ (looks and image) and G2 (functionality, experience).

There is also an industry myth that SSD cannot or has not been successful in traditional storage systems which in some cases has been true with some products or vendors. Otoh, some vendors such as EMC, NetApp and Oracle (among others) are having good success with SSD in their storage systems. Some SSD startup vendors have been more successful on both the G2 and GQ front, while some focus on the GQ or image may not be as successful (or at least yet) in the industry adoption vs. customer deployment game.

For the above mentioned storage systems vendors and products (among others), or at least for most of them, there is still plenty of life in them. Granted, their role and usage is changing, including in some cases being found as back-end storage systems behind servers running virtualization, cloud, object storage and other storage software stacks. Likewise, some of the new and emerging storage systems (hardware, software, valueware, services) and vendors have bright futures while others may end up on the where are they now list.

Are high-end enterprise class or other storage arrays and systems dead at the hands of new startups, virtual storage appliances (VSA), storage hypervisors, storage virtualization, virtual storage and SSD?

Are large storage arrays dead at the hands of SSD?

Have SSDs been unsuccessful with storage arrays (with poll)?

Here are links to two polls where you can cast your vote.

Cast your vote and see results on whether large storage arrays and systems are dead here.

Cast your vote and see results on whether SSD has not been successful in storage systems.

So what about it, are enterprise or large storage arrays and systems dead?

Perhaps in some tabloids or industry myths (or that some wish for) or in some customer environments, as well as for some vendors or their products that can be the case.

However, IMHO for many other environments (and vendors) the answer is no, granted some will continue to evolve from legacy high-end enterprise class storage systems to mid-range or to appliance or VSA or something else.

There is still life in many of the storage systems architectures, platforms and products that have been declared dead for over a decade.

Continue reading about the specifics of the EMC VMAX 10K announcement in the next post in this series here. Also check out Chuck's EMC blog to see what he has to say.

Ok, nuff said (for now).

Cheers gs


January 7, 2013 / December 29, 2025

Cloud conversations: Gaining cloud confidence from insights into AWS outages (Part II)


This is the second in a two-part industry trends and perspective looking at learning from cloud incidents, view part I here.

There is good information, insight and lessons to be learned from cloud outages and other incidents.

Sorry cynics, no, that does not mean an end to clouds, as they are here to stay. However when and where to use them, along with what best practices to apply, how to be ready and how to configure for use are part of the discussion. This means that clouds may not be for everybody or all applications, at least today. For those who are into clouds for the long haul (either all in or partially) including current skeptics, there are many lessons to be learned and leveraged.

In order to gain confidence in clouds, one question that I am routinely asked is: are clouds more or less reliable than what you are doing? It depends on what you are doing, and how you will be using the cloud services. If you are applying HA and other BC or resiliency best practices, you may be able to configure for and isolate from the more common situations. On the other hand, if you are simply using the cloud services as a low-cost alternative, selecting the lowest price and service class (SLAs and SLOs), you might get what you paid for. Thus, clouds are a shared responsibility: the service provider has things they need to do, and the user or person designing how the service will be used has some decision-making responsibilities.

Keep in mind that high availability (HA), resiliency, business continuance (BC) along with disaster recovery (DR) are the sum of several pieces. This includes people, best practices, processes including change management, good design eliminating points of failure and isolating or containing faults, along with how the components or technology (e.g. hardware, software, networks, services, tools) are used. Good technology used in good ways can be part of a highly resilient, flexible and scalable data infrastructure. Good technology used in the wrong ways may not leverage the solutions to their full potential.
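To put a few numbers behind the sum of several pieces idea, here is a minimal sketch of the standard series and parallel availability formulas. The component availabilities are made-up values, and the math assumes failures are independent, which is optimistic given that people and process issues tend to hit several components at once.

```python
from math import prod


def series(*availabilities: float) -> float:
    """All components are needed: availabilities multiply together."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Any one component is enough: the service fails only if all of them fail."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical single-path stack: server, storage and network all required.
single_path = series(0.99, 0.999, 0.995)

# Same stack with the weakest piece (the server) duplicated to remove
# that point of failure.
redundant_servers = parallel(0.99, 0.99)
improved_path = series(redundant_servers, 0.999, 0.995)

print(f"single path:   {single_path:.4%}")
print(f"with two servers: {improved_path:.4%}")
```

Notice that the chain ends up less available than its weakest link, and that removing one point of failure helps only until the next weakest piece dominates.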

While it is easy to focus on the physical technologies (servers, storage, networks, software, facilities), many of the cloud services incidents or outages have involved people, process and best practices so those need to be considered.

These incidents or outages bring awareness, a level set, that this is still early in the cloud evolution lifecycle, and that it is time to move beyond seeing clouds as just a way to cut cost and to see the importance and value of HA, resiliency, BC and DR. This means that learning from mistakes, taking action to correct or fix errors, and finding and cutting points of failure are part of a technology, or the use of it, maturing. These all tie into having services with service level agreements (SLAs) with service level objectives (SLOs) for availability, reliability, durability, accessibility, performance and security among others to protect against mayhem or other things that can and do happen.
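When comparing availability SLOs, it can help to translate the percentages into how much downtime they actually allow. The sketch below does that arithmetic for a 365-day year; the percentages are common examples, not the terms of any particular provider's SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted in the period by an availability objective."""
    return period_hours * (1 - availability_pct / 100)


for slo in (99.0, 99.9, 99.95, 99.99):
    hours = allowed_downtime_hours(slo)
    print(f"{slo}% availability allows about {hours:.2f} hours of downtime per year")
```

For example, 99.9 percent works out to about 8.76 hours per year, which is a useful number to have in hand when deciding if the lowest price service class is good enough.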

Images licensed for use by StorageIO via
Atomazul / Shutterstock.com

The reason I mentioned earlier that AWS had another incident is that, like their peers or competitors who have had incidents in the past, AWS appears to be going through some growing, maturing, evolution related activities. During summer 2012 there was an AWS incident that affected Netflix (read more here: AWS and the Netflix Fix?). It should also be noted that there were earlier AWS outages where Netflix (read about Netflix architecture here) leveraged resiliency designs to try and prevent mayhem when others were impacted.

Is AWS a lightning rod for things to happen, a point of attraction for Mayhem and others?

Granted, given their size, scope of services and how they are being used on a global basis, AWS is blazing new territory and experiences, similar to what other information services delivery platforms did in the past. What I mean is that while taken for granted today, open systems Unix, Linux, Windows-based along with client-server, midrange or distributed systems, not to mention mainframe hardware, software, networks, processes, procedures and best practices, all went through growing pains.

There are a couple of interesting threads going on over in various LinkedIn Groups based on some reporters' stories, including speculation about what happened, followed by some good discussions of what actually happened and how to prevent a recurrence in the future.

Over in the Cloud Computing, SaaS & Virtualization group forum, this thread is based on a Forbes article (Amazon AWS Takes Down Netflix on Christmas Eve) and involves conversations about SLAs, best practices, HA and related themes. Have a look at the story the thread is based on and some of the assertions being made, and ensuing discussions.

Also over at LinkedIn, in the Cloud Hosting & Service Providers group forum, this thread is based on a story titled Why Netflix’ Christmas Eve Crash Was Its Own Fault with a good discussion on clouds, HA, BC, DR, resiliency and related themes.

Over at the Virtualization Practice, there is a piece titled Is Amazon Ruining Public Cloud Computing? with comments from me and Adrian Cockcroft (@Adrianco) a Netflix Architect (you can read his blog here). You can also view some presentations about the Netflix architecture here.

What this all means

Saying you get what you pay for would be too easy and perhaps not applicable.

There are good services free, or low-cost, just like good free content and other things, however vice versa, just because something costs more, does not make it better.

Otoh, there are services that charge a premium however may have no better if not worse reliability, same with content for fee or perceived value that is no better than what you get free.


Some closing thoughts:

Clouds are real and can be used safely; however, they are a shared responsibility.
Only you can prevent cloud data loss, which means do your homework, be ready.
If something can go wrong, it probably will, particularly if humans are involved.
Prepare for the unexpected and clarify assumptions vs. realities of service capabilities.
Leverage fault isolation and containment to prevent rolling or spreading disasters.
Look at cloud services beyond lowest cost or for cost avoidance.
What is your organization's culture for learning from mistakes vs. fixing blame?
Ask yourself if you, your applications and organization are ready for clouds.
Ask your cloud providers if they are ready for you and your applications.
Identify what your cloud concerns are to decide what can be done about them.
Do a proof of concept to decide what types of clouds and services are best for you.

Do not be scared of clouds, however be ready, do your homework, learn from the mistakes, misfortune and errors of others. Establish and leverage known best practices while creating new ones. Look at the past for guidance to the future, however avoid clinging to, and bringing the baggage of the past to the future. Use new technologies, tools and techniques in new ways vs. using them in old ways.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

January 7, 2013 | December 29, 2025

Cloud conversations: Gaining cloud confidence from insights into AWS outages

StorageIO industry trends cloud, virtualization and big data

This is the first of a two-part industry trends and perspectives series looking at how to learn from cloud outages (read part II here).

In case you missed it, there were some public cloud outages during the recent Christmas 2012 holiday season. One incident impacted Microsoft Xbox users (view the Microsoft Azure status dashboard here), and the other was another Amazon Web Services (AWS) incident. Microsoft and AWS are not alone; most if not all cloud services have had some type of incident and have gone on to improve from those outages. Google has had issues with different applications and services, including some in December 2012 along with a Gmail incident that received coverage back in 2011.

For those interested, here is a link to the AWS status dashboard and a link to the AWS December 24, 2012 incident postmortem. In the case of the recent AWS incident, which affected users such as Netflix, the incident (read the AWS postmortem and Netflix postmortem) was tied to a human error. This is not to say AWS has more outages or incidents than others, including Microsoft; it just seems that we hear more about AWS when things happen. That could be due to AWS's size and arguably market-leading status, its diversity of services and the scale at which some of its clients are using them.

Btw, if you were not aware, Microsoft Azure is about more than just supporting SQL Server, Exchange, SharePoint or Office; it is also an IaaS layer for running Hyper-V based virtual machines, as well as a storage target for storing data. You can use Microsoft Azure storage services as a target for backing up or archiving or as general storage, similar to using AWS S3, Rackspace Cloud Files or other services. Some backup and archiving AaaS and SaaS providers, including Evault, partner with Microsoft Azure as a storage repository target.

When reading some of the coverage of these recent cloud incidents, I am not sure if I am more amazed by some of the marketing cloud washing, or by the cloud bashing and uninformed reporting or lack of research and insight. Then again, if someone repeats a myth often enough for others to hear and repeat, as it gets amplified, the myth may assume the status of reality. After all, you may know the expression that if it is on the internet then it must be true?

Images licensed for use by StorageIO via
Atomazul / Shutterstock.com

Have AWS and public cloud services become a lightning rod for when things go wrong?

Here is some coverage of various cloud incidents:

Huffington Post coverage of February 2011 Google Gmail incident
Microsoft Azure coverage by Allthingsd.com
Neowin.net covering Microsoft Xbox incident
Google’s Gmail blog coverage of Gmail outage
Forbes article Amazon AWS Takes Down Netflix on Christmas Eve
Over at Performance Critical Apps they assert the AWS incident was Netflix's fault
From The Virtualization Practice: Amazon Ruining Public Cloud Computing?
Here is Netflix architect Adrian Cockcroft discussing the recent incident
From StorageIOblog Amazon Web Services (AWS) and the Netflix Fix?
From CRN, here are some cloud service availability status via Nasuni

The above are a small sampling of different stories, articles, columns, blogs and perspectives about cloud service outages or other incidents. Assuming the services are available, you can Google or Bing many others, along with reading postmortems to gain insight into what happened, the cause and effect, and how to prevent a recurrence.

Do these recent incidents show a trend of increased cloud outages? Alternatively, do they say that the cloud services are being used more and on a larger basis, thus the impacts become more known?

Perhaps it is a mix of the above, and like when a magnetic storage tape gets lost or stolen, it makes for good news or copy, something to write about. Granted, there are fewer tapes actually lost than in the past, and far fewer vs. lost or stolen laptops and other devices with data on them. There are probably other reasons, such as the lightning rod effect: given how much industry hype there is around clouds, when something does happen the cynics or foes come out in force, sometimes with FUD.

Similar to traditional hardware or software based product vendors, some service providers have even tried to convince me that they have never had an incident, or lost, corrupted or compromised any data. Yeah, right. Candidly, I put more credibility and confidence in a vendor or solution provider who tells me that they have had incidents and taken steps to prevent them from recurring. Granted, some of those steps might be made public while others might be under NDA; at least they are learning and implementing improvements.

As part of gaining insights, here are some links to AWS, Google, Microsoft Azure and other service status dashboards where you can view current and past situations.

AWS service status dashboard
Bluehost server status dashboard
Google App status dashboard
HP cloud service status console (requires login)
Microsoft Azure service status dashboard
Microsoft Xbox service status dashboard
Rackspace service status dashboards

What is your take on IT clouds? Click here to cast your vote and see what others are thinking about clouds.

Ok, nuff said for now (check out part II here)

Disclosure: I am a customer of AWS for EC2, EBS, S3 and Glacier as well as a customer of Bluehost for hosting and Rackspace for backups. Other than Amazon being a seller of my books (and my blog via Kindle) along with running ads on my sites and being an Amazon Associates member (Google also has ads), none of those mentioned are or have been StorageIO clients.

Cheers gs


January 5, 2013 | December 29, 2025

The Human Face of Big Data, a Book Review

My copy of the new book The Human Face of Big Data created by Rick Smolan and Jennifer Erwitt arrived yesterday compliments of EMC (the lead sponsor). In addition to EMC, the other sponsors of the book are Cisco, VMware, FedEx, Originate and Tableau software.

To say this is a big book would be an understatement; then again, big data is a big topic with a lot of diversity if you open your eyes and think in a pragmatic way, as you will see once you open the book and look through its pages. This is physically a big book (11 x 14 inches) with lots of pictures, text, stories, factoids and thought-stimulating information on the many facets and dimensions of big data across 224 pages.

While Big Data as a buzzword and industry topic theme might be new, along with some of the related technologies, techniques and focus areas, other aspects have been around for some time. Big data means many things to various people depending on their focus or areas of interest, ranging from analytics to images, videos and other big files. A common theme is the fact that there is no such thing as an information or data recession, and that people and data are living longer, getting larger, and we are all addicted to information for various reasons.

Big data needs to be protected and preserved as it has value, or its value can increase over time as new ways to leverage it are discovered, which also leads to changing data access and life cycle patterns. With many faces, facets and areas of interest applying to various spheres of influence, big data is not limited to programmatic, scientific, analytical or research uses, yet there are many current use cases in those areas.

Big data is not limited to videos for security surveillance, entertainment, telemetry, audio, social media, energy exploration, geosciences, seismic, forecasting or simulation, yet those have been areas of focus for years. Some big data files or objects are millions of bytes (MBytes), billions of bytes (GBytes) or trillions of bytes (TBytes) in size that, when put into file systems or object repositories, add up to Exabytes (EB – one million TBytes) or Zettabytes (ZB – 1000 EBs). Now if you think those numbers are far-fetched, simply look back to when you thought a TByte, a GByte, let alone a MByte, was big or a far-fetched future. Remember, there is no such thing as a data or information recession; people and data are living longer and getting larger.
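To put those unit names in perspective, here is a minimal sketch of the arithmetic, assuming decimal (SI) units where each step up is a factor of 1,000. The function names and the example repository (two billion objects averaging 500 MBytes each) are hypothetical and only for illustration.

```python
# Decimal (SI) storage units: each step up is a factor of 1,000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def to_bytes(value, unit):
    """Convert a value in the given decimal unit to bytes."""
    return value * 1000 ** UNITS.index(unit)

def convert(value, from_unit, to_unit):
    """Convert a value between two decimal units."""
    return to_bytes(value, from_unit) / 1000 ** UNITS.index(to_unit)

def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    for unit in reversed(UNITS):
        size = 1000 ** UNITS.index(unit)
        if num_bytes >= size:
            return f"{num_bytes / size:.1f} {unit}"
    return f"{num_bytes} B"

# Hypothetical repository: two billion objects averaging 500 MB each.
total = 2_000_000_000 * to_bytes(500, "MB")
print(humanize(total))          # 1.0 EB
print(convert(1, "EB", "TB"))   # 1000000.0
```

In other words, plenty of files that are individually modest in size can still add up to an Exabyte.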

Big data is more than Hadoop, MapReduce, SAS or other programmatic and analytical focused tools, solutions or platforms, yet those all have been and will be significant focus areas in the future. This also means big data is more than data warehouse, data mart, data mining, social media and event or activity log processing, which are also main parts with continued roles going forward. Just as there are large MByte, GByte or TByte sized files or objects, there are also millions and billions of smaller files, objects or pieces of information that are part of the big data universe.

You can take a narrow product, platform, tool, process, approach, application, sphere of influence or domain of interest view towards big data, or a pragmatic view of the various faces and facets. Of course you can also spin everything that is not little-data to be big data, and that is where some of the BS about big data comes from. Big data is not exclusive to data scientists, researchers, academia, governments or analysts, yet there are areas of focus where those are important. What this means is that there are other areas of big data that do not need a data science, computer science, mathematics, statistics, doctoral (PhD) or other advanced degree or training; in other words, big data is for everybody.

Cover image of Human Face of Big Data Book

Back to how big this book is in both physical size, as well as rich content. Note the size of The Human Face of Big Data book in the adjacent image that for comparison purposes has a copy of my last book Cloud and Virtual Data Storage Networking (CRC), along with a 2.5 inch hard disk drive (HDD) and a growler. The Growler is from Lift Bridge Brewery (Stillwater, MN), after all, reading a big book about big data can create the need for a big beer to address a big thirst for information ;).

The Human Face of Big Data is more than a coffee table or picture book, as it is full of information, factoids and perspectives on how information and data surround us every day. Check out the image below and note the 2.5 inch HDD sitting on the top right hand corner of the page above the text. Open up a copy of The Human Face of Big Data and you will see examples of how data and information are all around us, and our dependence upon them.

A look inside the book The Human Face of Big Data image

Book Details:
Copyright 2012
Against All Odds Productions
ISBN 978-1-4549-0827-2
Hardcover 224 pages, 11 x 0.9 x 14 inches
4.8 pounds, English

There is also an applet to view related videos and images found in the book at HumanFaceofBigData.com/viewer in addition to other material on the companion site www.HumanFacesofBigData.com.

Get your copy of The Human Face of Big Data at Amazon.com by clicking here or at other venues including by clicking on the following image (Amazon.com).

Some added and related material:
Little data, big data and very big data (VBD) or big BS?
How many degrees separate you and your information?
Hardware, Software, what about Valueware?
Changing Lifecycles and Data Footprint Reduction (Data doesnt have to lose value over time)
Garbage data in, garbage information out, big data or big garbage?
Industry adoption vs. industry deployment, is there a difference?
Is There a Data and I/O Activity Recession?
Industry trend: People plus data are aging and living longer
Supporting IT growth demand during economic uncertain times
No Such Thing as an Information Recession

For those who can see big data in a broad and pragmatic way, perhaps aided by the visualization this book provides, it brings forth the idea that there are and will be many opportunities. Then again, for those who have a narrow or specific view of what is or is not big data, there is so much of it around, of various types and focus areas, that you too will see some benefits.

Do you want to play in or be part of a big data puddle, pond, or lake, or sail and explore the oceans of big data and all the different aspects found in, under and around those bigger, broader bodies of water?

Bottom line, this is a great book and read regardless of whether you are involved with data and information related topics or themes; the format and design lend themselves to any audience. Broaden your horizons, open your eyes, ears and thinking to the many facets and faces of big data that are all around us by getting your copy of The Human Face of Big Data (Click here to go to Amazon for your copy).

Ok, nuff said.

Cheers gs


December 18, 2012 | November 26, 2023

December 2012 StorageIO Update news letter

December 2012 News letter

Welcome to the December 2012 year end edition of the StorageIO Update news letter including a new format and added content.

You can get access to this news letter via various social media venues (some are shown below) in addition to StorageIO web sites and subscriptions.

Click on the following links to view the December 2012 edition as brief (short HTML sent via Email) version, or the full HTML or PDF versions.

Visit the news letter page to view previous editions of the StorageIO Update.

You can subscribe to the news letter by clicking here.

Enjoy this edition of the StorageIO Update news letter, let me know your comments and feedback.

Nuff said for now

Cheers
Gs


December 13, 2012 | December 29, 2025

Hardware, Software, what about Valueware?

I am surprised nobody has figured out how to use the term valueware to describe their hardware, software or services solutions, particularly around cloud, big data, little data, converged solution stacks or bundles, virtualization and related themes.

Cloud and virtualization building blocks transformed into Valueware

Note that I’m referring to IT hardware and not what you would usually find at a TrueValue hardware store (disclosure, I like to shop there for things to innovate with and address the non IT to do project list).

Instead of value add software or what might otherwise be called an operating system (OS), or middleware, glue, hypervisor, shims or agents, I wonder who will be first to use valueware? Or who will be the first to say they were the first to articulate the value of their industry unique and revolutionary solution using valueware?

For those not familiar, converged solution stack bundles combine server, storage and networking hardware along with management software and other tools in a prepackaged solution from the same or multiple vendors. Examples include Dell VIS (not to be confused with their reference architectures or fish in Dutch), VCE or EMC vBlocks, IBM Puresystems, NetApp FlexPods and Oracle Exaboxes among others.

Why is it that the IT or ICT (for my European friends) industries are not using valueware?

Is Valueware not being used because it has not been brought to their attention yet or part of anybody’s buzzword bingo list or read about in an industry trade rag (publication) or blog (other than here) or on twitter?

Is it because, in the opinion of some marketers or in the view of their research focus groups, the term value is associated with being cheap or low-cost? If that is the case, I wonder how many of those marketing focus groups actually include active IT or ICT professionals. If those research marketing focus groups contacted practicing IT or ICT pros, then there would be a lower degree of separation to the information, vs. professional focus group or survey participants who may have a larger degree of separation from practitioners.

Depending on who uses valueware first and how it is used, if it becomes popular or trendy, rest assured there would be a bandwagon racing to the train station to jump on board the marketing innovation train.

On the other hand, using valueware could be an innovative way to help articulate soft product value (read more about hard and soft product here). For those not familiar, hard product does not simply mean hardware; it includes many technologies (including hardware, software, networks, services) that are combined with best practices and other things to create a soft product (solution experience).

Whatever the reason, I am assuming that valueware is not going to be used by creative marketers so let us have some fun with it instead.

Let me rephrase that, let us leave valueware alone, instead look at the esteemed company it is in or with (some are for fun, some are for real).

APIware (having some fun with those who see the world via APIs)
Cloudware (not to be confused with cloud washing)
Firmware (software tied to hardware, is it hardware or software? ;) )
Hardware (something software, virtualization and clouds run on)
Innovationware (not to be confused with a data protection company called Innovation)
Larryware (anything Uncle Larry wants it to be)
Marketware (related to marketecture)
Middleware (software to add value or glue other software together)
Netware (RIP Ray Noorda)
Peopleware (those who use or support IT and cloud services)
Santaware (come on, tis the season right)
Sleepware (disks and servers spin down to sleep using IPM techniques)
Slideware (software defined marketing presentations)
Software (something that runs on hardware)
Solutionware (could be a variation of implementation of soft product)
Stackware (something that can also be done with Tupperware)
Tupperware (something that can be used for food storage)
Valueware (valueware.us points to this page, unless somebody wants to buy or rent it ;) )
Vaporware (does vaporware actually exist?)

More variations can be added to the above list, for example substituting ware for wear. However, I will leave that up to your own creativity and innovation skills.

Let’s see if anybody starts to use Valueware as part of their marketware or value proposition slideware pitches, and if you do use it, let me know, be happy to give you a shout out.

Ok, nuff said.

Cheers gs


December 12, 2012 | December 29, 2025

Predictions, did Mayans have it right, or did we read it wrong?

It is late in the day December 12, 2012 and best I can tell, we are still here, and for some, by the time you read this it will be a few days or weeks later, which means that either the Mayan calendar had it wrong, or we misinterpreted it. Some would say that December 12, 2012 is not the important date, that it is really December 21, 2012 that the world will end. Ok, let's wait and see what happens in a few more days.

However, taking a step back from the Mayan calendar, it dawned on me that some predictions, such as today's Mayan calendar forecast, are similar to others that happen around this time of the year. Those are the annual information technology (IT) related predictions made by pundits or anybody else with an opinion, most of which turn out to be not even close. Granted, many predictions make good press and media things to read or listen to for entertainment. In some cases, these predictions are variations of what was predicted last year in 2011, the year before in 2010, the year before that and so forth.

I’m still working on my predictions for 2013 and forward-looking into 2014, however I keep getting interrupted fending off vendors and their PR surrogates calling or emailing asking me if they can make contributions, or write my list for me (how thoughtful of them ;) ). For now one of my predictions is that I hope to get my predictions for 2013 done before 2013, however if you need something to hold you over, check this out from last year, or this from a few months ago.

I will also say that for 2013, those who see or view cloud, virtualization, big data (and little data) in pragmatic terms will be very prosperous. On the other hand, those who have narrow or constrained views will be envious of the others. Likewise, expect plenty of new additions to the buzzword bingo lineup, with software defined having strong representation.

Like the Mayan calendar predictions, with annual technology predictions, are we reading them wrong, or are they simply wrong and who if anybody cares, or are they just garbage in and garbage out, or big data garbage in, big data garbage out results?

In the meantime, I need to check that my local and cloud backups are working, try a restore test, have plenty of cash on hand, gas tanks full, cerveza in the fridge, propane for the generator and other things ready if the Mayans had it right, just off by a few days ;) .

Ok, nuff said (for now).

Cheers gs


December 6, 2012 | December 29, 2025

Storage comments from the field and customers in the trenches

When I was in Europe presenting some sessions at conferences and doing some seminars last month, I met and spoke with one of the attendees at the StorageExpo Holland event. Hans Breemer came up to visit with me after one of my presentations, which included SSD is in your future: When, where, with what and how, and Cloud and Virtual Data Storage Networking industry trends and perspectives. Note you can find additional material from various conferences and events on backup, restore, BC, DR and archiving via the resources menu on the StorageIO web site.

As I always do, I invite attendees to feel free to follow up via email, twitter, Linked In, Google+ or other venue with questions, comments, discussions and what they are seeing or running into in their environments.

Some of the many different items discussed during my StorageExpo presentations included:

Recently Hans followed up and sent me some comments and asked if I would be willing to share them with others, such as whoever happens to read this. I also suggested to Hans that he start a blog (here is link to his new blog), and that I would be happy to post his comments, shown below, for others to see and join in the conversation.

Hans Breemer wrote:

Hi Greg,

we met each other recently at the Dutch Storage Expo after one of your sessions. We briefly discussed the current trends in the storage market, and the “risks” or “threats” (read: challenges) it means to “us”, the storage guys. Often neglected by the sales guys…

Please allow me a few lines to elaborate a bit more and share some thoughts from the field. :-)

1. Bigger is not better?

Each iteration in the new disk technologies (SATA or SAS) means we get less IOPS for the bucks. Pound for pound that is. Of course the absolute amount of IOPS we can get from a HDD increases all the time. Where 175 IOPS was top speed a few years ago, we sometimes see figures close to 220 IOPS per physical drive now. This looks good in the brochure, just as the increased capacity does. However, what the brochure doesn’t tell us is that if we look at the IOPS/capacity ratio, we’re walking backwards. A few years ago we could easily sell over 1000 IOPS/TB. Currently we can’t anymore. We’re happy to reach 500 IOPS/TB. I know this has always been like that. However with the introduction of SATA in the enterprise storage world, I feel things have gotten even worse.

2. But how about SSD’s then?

True and agree. In the world of HDD’s growing bigger and bigger, we actually need SSD’s, and this technology is the way forward from an IOPS perspective. SSD’s have a great future ahead of them (despite being with us already for some time). I do doubt that at the moment SSD’s already have the economical ability to fill the gap though. They offer many thousands of IOPS, and for dedicated high-end solutions they offer what we weren’t able to deliver for decades. More IOPS than you need! But what about the “1000 IOPS/TB” market? Let’s call it the middle market.

3. SSD’s as a lubricant?

You must have heard every vendor about Adaptive Storage Tiering, Auto Tiering etc. All based on the theorem that most of our IO’s come from a relatively small disk section. Thus we can improve the total performance of our array by only adding a few percent of SSD. Smart technology identifies the hot tracks on our disks, and promotes these to SSD’s. We can even demote cold tracks to big SATA drives. Think green, think ecological footprint, etc. For many applications this works well. Regular Windows servers, file servers and VMware ESX servers actually seem to like adaptive storage tiering, and I think I know why, a positive tradeoff of using VMDK’s. (I might share a few lines about FAST VP do’s and don’ts next time if you don’t mind)

4. How about the middle market then you might ask? or, SSD’s as a band-aid?

For the middle market, the above developments are sort of a disaster. Think SAP running on Sun Solaris, think the average Microsoft SQL Server, think Oracle databases. These are the typical applications that need “middle market” IOPS. Many of these applications have a freakish IO pattern. OLTP during daytime, backup in the evening and batch jobs at night. Not to mention end of month runs, DTA (Dev-Test-Acceptance) streets that sleep for two weeks or are constantly upgraded or restored. These applications hardly benefit from “smart technologies”. The IO behavior is too random, too unpredictable, leading to saturated SATA pools, and EFD’s that are hardly doing more IO’s than the FC drives they’re supposed to relieve. Add more SSD’s we’re told. Use less SATA we’re told. But it hardly works. Recently we acquired a few new Vmax arrays without EFD or FASTVP, for the sole purpose of hosting these typical middle market applications. Affordable, predictable performance. But then again, our existing Vmax 20k had full size 600GB 15k rpm drives, with the Vmax 40k we’re “encouraged” to use small form factor 600GB 10k rpm drives. Again a small step backwards?

5. The storage tiering debacle.

Last but not least, some words I’d like to share with you about storage tiering. We’re encouraged (again) to sell storage in different tiers. Makes sense. To some extent it does yes. Host your most IO eager application on expensive, SSD based storage. And host your DTA or other less business critical application on FC or SATA quality HDD’s. But what if the less business critical application needs to be backed up in the evening, and while doing so completely saturates your SATA pool? Or what if the Dev server creates just as many IO’s as the Prod environment does? People don’t seem to care. To have people realize how much IO’s they actually need and use, we are reporting IO graphs for all servers in our environment. Our tiering model is based on IOPS/TB and IO response time.

Tier X would be expensive, offering 800 IOPS/TB @ avg 10ms
Tier Y would be the cheaper option offering 400 IOPS/TB @ avg 15 ms

The next step will be to implement front end controls and actually limit a host to some ceiling. For instance, 2 times the limit described in the tier description, thus allowing for peak loads and backups.

Do we need to? I think so…

Greg, this small message is slowly turning into a plea. And that is actually what it is, a plea to our storage vendors, and to our evangelists. If they want us to deliver, I feel they should talk to us, and listen to us (and you!).

Cheers,

Hans Breemer

ps, I love my job, this world and my role to translate promises and demands into solutions that work for my customers. I do take care though not to create solutions that will not work, despite what the brochure said.

pps, please feel free to share the above if needed.
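As an aside before my response, the IOPS/TB arithmetic in points 1 and 5 of Hans's note can be sketched as follows. This is a rough illustration only: the per-drive IOPS figures, the two tier definitions and the 2x burst allowance come from his note, while the drive capacities, the 5 TB allocation and the function names are assumptions made up for the example.

```python
def iops_per_tb(drive_iops, drive_capacity_gb):
    """IOPS available per terabyte of raw capacity for a single drive."""
    return drive_iops / (drive_capacity_gb / 1000)

def host_iops_ceiling(tier_iops_per_tb, allocated_tb, burst_factor=2):
    """Front-end limit for a host: burst_factor times its tier entitlement."""
    return tier_iops_per_tb * allocated_tb * burst_factor

# Per-drive IOPS are from the note; the capacities are assumed values
# picked only to show how density falls as drives grow.
older = iops_per_tb(175, 146)   # assumed 146 GB drive
newer = iops_per_tb(220, 600)   # assumed 600 GB drive
print(round(older), round(newer))   # 1199 367

# Tier definitions (IOPS/TB) from the note; 5 TB is a hypothetical allocation.
TIERS = {"X": 800, "Y": 400}
print(host_iops_ceiling(TIERS["X"], allocated_tb=5))   # 8000
print(host_iops_ceiling(TIERS["Y"], allocated_tb=5))   # 4000
```

Under those assumed capacities, the faster drive delivers more IOPS in absolute terms yet less than a third of the IOPS per TB, which is the walking backwards effect described in the note.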

Here is my response to Hans:

Hello Hans good to hear from you and thanks for the comments.

Great perspectives and in the course of talking with your peers around the world, you are not alone in your thinking.

Often I see disconnects between customers and vendors. Vendors (often driven by their market research) believe they know what the customer needs and issues are, and many actually do. However I often see a reliance on market research data with many degrees of separation as opposed to direct and candid insight. Likewise some vendors spend more time talking about how they listen to the customer vs. how much time they actually do so.

On the other hand, I routinely see customers fall into the trap of communicating wants (nice to haves) instead of articulating needs (what is required). Then there is confusing industry adoption with customer deployment, not to mention concerns over vendor, technology or services lock-in.

Hope all else is well.

Cheers
gs

Check out Hans' new blog and feel free to leave your comments and perspectives here or via other venues.

Ok, nuff said.

Cheers gs


December 3, 2012 | December 29, 2025

HPs big December 3rd storage announcement

HP has been talking about and promoting for several weeks (ok, months) their upcoming December 3rd storage announcements from the HP Discover event in Frankfurt, Germany.

Well, it's now afternoon, which means the early Monday morning December 3rd embargoes have been lifted, so I can now talk about what HP shared last Friday about today's announcements. Basically what I received was a series of press releases as well as a link to their updated web site providing information about today's announcements.

HP Redefines Storage Simplicity with Single Architecture for Enterprises of All Sizes
HP Expands 3PAR Storage Portfolio
HP Extends Enterprise-class Backup and Recovery Across Entire StoreOnce Portfolio
New HP Storage Services Transform How Organizations Manage, Store and Protect Data
HP StoreAll Storage Brings Real-time Intelligence to Big Data Retention and Cloud Storage Customers
HP Introduces HP Capacity on Demand

HP has enhanced the 3PAR aka P10000 with new models, including for entry-level as well as for higher performance enterprise needs. This also should raise the question for many longtime EVA (excuse me, P6000) customers: have they hit the end of the line? For scale out storage, HP has the StoreAll solutions (think about products formerly marketed as certain X9000 models based on Ibrix) with enhancements for analytics, bulk and various types of big data. In addition HP has enhanced its backup and recovery capabilities and Dedupe products, including integration with Autonomy (here and here), along with capacity on demand services.

New 3PAR (P10000 models)

New StoreAll storage system

On the surface, and from what I have been able to see so far, this looks like a good set of incremental enhancements from HP. Not much else to say until I can get some time to dig deeper for more details; however, check out Calvin Zito (aka @hpstorageguy), the HP storage blogger, who should have more information from HP.

Ok, nuff said (for now).

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

November 26, 2012 / December 29, 2025

Ceph Day Amsterdam 2012 (Object and cloud storage)

Recently while I was in Europe presenting some sessions at conferences and doing some seminars, I was invited by Ed Saipetch (@edsai) of Inktank.com to attend the first Ceph Day in Amsterdam.

Ceph day image

As luck or fate would have it, I was in Nijkerk, which is about an hour's train ride from Amsterdam Central station, and I had a free day in my schedule. After a morning train ride and a nice walk from Amsterdam Central I arrived at the Tobacco Theatre (a former tobacco trading venue) where Ceph Day was underway, in time for a lunch of kroketten sandwiches.

Let's take a quick step back and address, for those not familiar, what Ceph (the name is short for cephalopod) is and why it was worth spending a day to attend this event. Ceph is an open source, distributed, scale out (e.g. cluster or grid) object storage software platform running on industry standard hardware.

Ceph is used for deploying object storage, cloud storage and managed services, general purpose storage for research, commercial, scientific, high performance computing (HPC) or high productivity computing (commercial), along with backup or data protection and archiving destinations. Other software similar in functionality or capabilities to Ceph includes OpenStack Swift, Basho Riak CS, Cleversafe, Scality and Caringo among others. There are also the tin wrapped software (e.g. appliances or pre-packaged) solutions such as Dell DX (Caringo), DataDirect Networks (DDN) WOS, EMC ATMOS and Centera, Amplidata and HDS HCP among others. From a service standpoint, these solutions can be used to build services similar to Amazon S3 and Glacier, Rackspace Cloud Files and Cloud Block, DreamHost DreamObjects and HP Cloud storage among others.

At the heart of Ceph is RADOS, a distributed object store that consists of peer nodes functioning as object storage devices (OSD). Data can be accessed via REST (Amazon S3 like) APIs, libraries, CephFS and gateways, with information being spread across nodes and OSDs using the CRUSH algorithm (note Sage Weil is one of the authors of CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data). Ceph scales in terms of performance, availability and capacity by adding extra nodes with hard disk drives (HDDs) or solid state devices (SSDs). One of the presentations was about DreamHost, an early adopter of Ceph, which uses it for its DreamObjects (cloud storage) offering.
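To give a feel for how data can be spread across OSDs without a central lookup table, here is a minimal sketch. Note that this is not CRUSH itself: it uses plain rendezvous (highest random weight) hashing as a much simpler stand-in, it ignores failure domains such as keeping copies on separate nodes, and the function and OSD names are made up for illustration.

```python
import hashlib


def place_object(object_name, osds, replicas=3):
    """Pick `replicas` distinct OSDs for an object via rendezvous hashing.

    Simplified stand-in for CRUSH (assumption): every client can compute
    the same placement from the object name and the OSD list alone.
    """
    if replicas > len(osds):
        raise ValueError("not enough OSDs for the requested replica count")

    def score(osd):
        # Hash the (object, OSD) pair; the highest scores win.
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osds, key=score, reverse=True)[:replicas]


# Hypothetical cluster of 12 OSDs named osd.0 .. osd.11
osds = [f"osd.{i}" for i in range(12)]
print(place_object("bucket/photo.jpg", osds))
```

The point of the sketch is that placement is computed, not looked up, and that removing an OSD which does not hold a given object leaves that object's placement unchanged.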

In addition to storage nodes, there is also an odd number of monitor nodes to coordinate and manage the Ceph cluster, along with optional gateways for file access. In the above figure (via DreamHost), load balancers sit in front of gateways that interact with the storage nodes. The storage node in this example is a physical server with 12 x 3TB HDDs, each configured as an OSD.
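As a rough illustration of why an odd number of monitors is the usual choice, the sketch below assumes the monitors need a strict majority to agree before the cluster map can change. The function names are mine, not part of any Ceph tool.

```python
def quorum_size(monitors):
    """Smallest strict majority of `monitors` (assumed majority quorum)."""
    return monitors // 2 + 1


def tolerated_failures(monitors):
    """How many monitors can fail while a majority is still reachable."""
    return monitors - quorum_size(monitors)


for count in (3, 4, 5):
    print(f"{count} monitors: quorum of {quorum_size(count)}, "
          f"tolerates {tolerated_failures(count)} failure(s)")
```

Under that assumption, going from 3 to 4 monitors does not let the cluster survive any more failures, while going from 3 to 5 does.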

In the DreamHost example above, there are 90 storage nodes plus 3 management nodes. The total raw storage capacity (no RAID) is about 3PB (12 x 3TB = 36TB per node, x 90 nodes = 3.24PB). Instead of using RAID or mirroring, each object's data is replicated or copied to three (e.g. N=3) different OSDs (on separate nodes), where N is adjustable for a given level of data protection, for a usable storage capacity of about 1PB.

Note that for more usable capacity at lower availability, N could be set lower, while a larger value of N would give more durability or data protection at a higher storage capacity overhead cost. In addition to using JBOD configurations with replication, Ceph can also be configured with a combination of RAID and replication, providing more flexibility for larger environments to balance performance, availability, capacity and economics.

One of the benefits of Ceph is the flexibility to configure it how you want or need for different applications. This can be a cost-effective, hardware light configuration using JBOD or internal HDDs in small form factor, generally available servers, or high density servers and storage enclosures with optional RAID adapters along with SSD. This flexibility differs from some cloud and object storage systems or software tools that take a stance of avoiding RAID rather than providing options and the flexibility to configure and use the technology how you see fit.
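The capacity arithmetic for the DreamHost example can be sketched as follows. This is a back of the envelope calculation only: it assumes decimal units (1PB = 1,000TB) and ignores file system overhead and any free space headroom, and the function names are just for illustration.

```python
def raw_capacity_tb(nodes, drives_per_node, drive_tb):
    """Raw capacity in TB with no RAID: nodes x drives x drive size."""
    return nodes * drives_per_node * drive_tb


def usable_capacity_tb(raw_tb, replicas):
    """Usable capacity when every object is stored `replicas` times."""
    return raw_tb / replicas


raw = raw_capacity_tb(nodes=90, drives_per_node=12, drive_tb=3)
print(f"raw: {raw / 1000:.2f} PB")

# Compare the replica count used in the example (N=3) with its neighbors.
for n in (2, 3, 4):
    usable = usable_capacity_tb(raw, n)
    print(f"N={n}: usable {usable / 1000:.2f} PB")
```

Lowering N from 3 to 2 raises the usable capacity from about 1PB to about 1.6PB, at the cost of one less copy of each object.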

Here are some links to presentations from Ceph Day:
Introduction and Welcome by Wido den Hollander
Ceph: A Unified Distributed Storage System by Sage Weil
Ceph in the Cloud by Wido den Hollander
DreamObjects: Cloud Object Storage with Ceph by Ross Turk
Cluster Design and Deployment by Greg Farnum
Notes on Librados by Sage Weil

While at Ceph Day, I was able to spend a few minutes with Sage Weil, Ceph creator and founder of inktank.com, to record a podcast (listen here) about what Ceph is, where and when to use it, along with other related topics. Also while at the event I had a chance to sit down with Curtis (aka Mr. Backup) Preston, where we did a simulcast video and podcast. The simulcast involved Curtis recording this video with me as a guest discussing Ceph, cloud and object storage, backup, data protection and related themes while I recorded this podcast.

One of the interesting things about Ceph Day was what I did not hear compared with related conferences such as SNW. The focus was on where and how to use, configure and deploy Ceph, along with various configuration options and replication or copy modes, as opposed to going off on erasure codes or other tangents. In other words, instead of focusing on data protection protocols and algorithms, or on what is wrong with the competition or other architectures, the Ceph Day focus was on removing objections to cloud and object storage and on enablement.

Where do you get Ceph? You can get it here, as well as via 42on.com and inktank.com.

Thanks again to Sage Weil for taking time out of his busy schedule to record a podcast talking about Ceph, as well as to 42on.com and Inktank for hosting, and for the invitation to attend the first Ceph Day in Amsterdam.

Returning to Amsterdam central station after Ceph Day

Ok, nuff said.

Cheers gs

November 26, 2012 / December 29, 2025

Seven databases in seven weeks, a book review of NoSQL databases

Seven Databases in Seven Weeks (A Guide to Modern Databases and the NoSQL Movement) is a book written by Eric Redmond (@coderoshi) and Jim Wilson (@hexlib), part of The Pragmatic Programmers (@pragprog) series, that takes a look at several database systems, most of them non-SQL based.

Coverage includes PostgreSQL, Riak, Apache HBase, MongoDB, Apache CouchDB, Neo4J and Redis with plenty of code and architecture examples. Also covered are relational vs. key value, columnar and document based systems among others.

The details: Seven Databases in Seven Weeks
Paperback: 352 pages
Publisher: Pragmatic Bookshelf (May 18, 2012)
Language: English
ISBN-10: 1934356921
ISBN-13: 978-1934356920
Product Dimensions: 7.5 x 0.8 x 9 inches

Buzzwords (or keywords) include availability, consistency, performance and related themes. Others include MongoDB, Cassandra, Redis, Neo4J, JSON, CouchDB, Hadoop, HBase, Amazon Dynamo, Map Reduce, Riak (Basho) and Postgres along with data models including relational, key value, columnar, document and graph along with big data, little data, cloud and object storage.

While this book is not a how to tutorial or installation guide, it does give a deep dive into the different databases covered. The benefit is gaining an understanding of what the different databases are good for, their strengths and weaknesses, and where and when to use or choose them for various needs.

A look inside my copy of Seven Databases in Seven Weeks

Who should read this book: application developers, programmers, cloud, big data and IT/ICT architects, planners and designers, along with database, server, virtualization and storage professionals. What I like about the book is that it is a great intro and overview with sufficient depth to understand what these different solutions can and cannot do, and when, where and why to use these tools for different situations, in a quick read format.

Would I recommend buying it? Yes. I bought a copy myself on Amazon.com; get your copy by clicking here.

Ok, nuff said

Cheers gs

November 24, 2012 / December 29, 2025

Garbage data in, garbage information out, big data or big garbage?

Do you know the computer technology saying, garbage data in results in garbage information out?

In other words, even with the best algorithms and hardware, bad, junk or garbage data put in results in garbage information delivered. Of course, you might have data analysis and cleaning software to look for, find and remove bad or garbage data; however, that's for a different post on another day.

If garbage data in results in garbage information out, does garbage big data in result in big garbage out?

I’m sure my sales and marketing friends or their surrogates will jump at the opportunity to tell me why and how big data is the solution to the decades old garbage data in problem.

Likewise they will probably tell me big data is the solution to problems that have not even occurred or been discovered yet, yeah right.

However, garbage data does not discriminate or show preference towards big data or little data. In fact, it can infiltrate all types of data and systems.

Let's shift gears from big and little data to how all of that information is protected, backed up, replicated and copied for HA, BC, DR, compliance, regulatory or other reasons. I wonder how much garbage data is really out there, and how many garbage backups, snapshots, replicas or other copies of data exist? Sounds like a good reason to modernize data protection.

If we don't know where the garbage data is, how can we know if there is a garbage copy of the data for protection on some other tape, disk or cloud? That also means plenty of garbage data to compact (e.g. compress and dedupe) to cut its data footprint impact, particularly in tough economic times.

Does this mean then that the cloud is the new destination for garbage data in different shapes or forms, from online primary to backup and archive?

Does that then make the cloud the new virtual garbage dump for big and little data?

Hmm, I think I need to empty my desktop trash bin and email deleted items, among other digital housekeeping chores, now.

On the other hand, I just had a thought about orphaned data and orphaned storage; however, let's let those sleeping dogs lie where they rest for now.

Ok, nuff said.

Cheers gs

The Human Face of Big Data, a Book Review