Can RAID extend the life of nand flash SSD?

Can RAID extend nand flash SSD life?

Imho, the short answer is YES, under some circumstances.

There is a myth and some FUD that RAID (Redundant Array of Independent Disks) can shorten the life or durability of nand flash SSD (Solid State Device) vs. HDD (Hard Disk Drives) due to extra IOPs. The reality is that depending on how it is configured, the RAID level, the implementation and other factors, nand flash SSD life can be extended, as I discuss in this video.

Video

Nand flash SSD cells and wear

First, there is a myth that because nand flash SSD does not have moving parts like hard disk drives (HDDs), it does not wear out or break. That is just a myth, in that nand flash by its nature wears out with write usage. This is due to how data is stored in cells that have a rated number of program erase (P/E) cycles that varies by type of medium. For example, Single Level Cell (SLC) has a longer P/E life duration vs. Multi-Level Cell (MLC) and eMLC, which store multiple bits per cell.

There are a number of factors that contribute to nand flash wear, also known as duty cycle or durability, tied to P/E cycles. For example, some storage systems or controllers do a better job than others at the lower level flash translation layer (FTL), as well as with controllers, firmware, caching using DRAM and IO optimization such as write ordering or grouping.
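
To make the P/E point concrete, here is a rough back-of-the-envelope sketch of how rated P/E cycles, device capacity and write volume relate to wear. The function name, the write amplification parameter and all of the numbers are illustrative assumptions, not vendor ratings or measurements.

```python
def estimated_lifetime_days(capacity_gb, rated_pe_cycles,
                            host_writes_gb_per_day, write_amplification=1.0):
    """Rough days until the rated P/E cycles are consumed.

    Simplified model (an assumption): wear is spread evenly over all
    cells, and every GB written by the host turns into
    write_amplification GB written to the nand.
    """
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write rate and write amplification must be positive")
    total_nand_writes_gb = capacity_gb * rated_pe_cycles
    nand_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_nand_writes_gb / nand_writes_gb_per_day


# Hypothetical 800GB device rated for 3,000 P/E cycles, 1,000GB written per day.
print(estimated_lifetime_days(800, 3000, 1000, write_amplification=2.0))
print(estimated_lifetime_days(800, 3000, 1000, write_amplification=1.0))
```

In this simple model, halving the write amplification (for example via better write grouping or caching) doubles the estimated life, which is why the controller and FTL implementation matter.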

Now what about this RAID and SSD thing?

OK, first as a recap, keep in mind that there are many RAID levels along with variations and enhancements, as well as where or how they are implemented, ranging from software to hardware and from adapters to controllers to storage systems.

In the case of RAID 1 or mirroring, just like replication or any other one to one or one to many copy operation, a write to one device is echoed to another. In the case of RAID 5, data and parity are spread across drives, with the parity rotated across all drives in an equal manner.

Where some FUD, myths or misunderstandings come into play is that not all RAID 5 implementations, as an example, are the same. Some do a better job of buffering or caching data in battery protected mirrored DRAM memory until a full stripe write can occur, or if needed, a partial write.
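
As a hedged illustration of why write gathering matters, the sketch below counts how many chunks physically land on drives for the same amount of user data in a RAID 5 group. It assumes a simplified model: an unbuffered small write updates one data chunk plus one parity chunk, while a gathered full stripe write puts down N data chunks plus a single parity chunk. The function names are hypothetical and real implementations vary.

```python
def unbuffered_device_writes(data_chunks):
    """Each small write is handled alone: 1 data chunk + 1 parity chunk."""
    return data_chunks * 2


def full_stripe_device_writes(data_chunks, data_drives):
    """Writes are gathered into full stripes: N data chunks + 1 parity chunk."""
    if data_chunks % data_drives != 0:
        raise ValueError("this sketch only models whole stripes")
    stripes = data_chunks // data_drives
    return stripes * (data_drives + 1)


# 8 chunks of user data written to a 4+1 RAID 5 group.
print(unbuffered_device_writes(8))        # 16 chunk writes reach the drives
print(full_stripe_device_writes(8, 4))    # 10 chunk writes reach the drives
```

Same user data, fewer device writes, hence fewer P/E cycles consumed when the implementation can buffer up full stripes.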

Another attribute is the chunk or shard size (how much data is sent to each drive member) along with the stripe width (how many drives). Some systems have narrow stripes of say 3+1 or 4+1 or 5+1 while others can be 14+1 or 15+1 or wider. Thus, data can be written across a wider number of drives reducing the P/E consumption or use of a single drive depending on implementation.
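
Some simple math shows the stripe width effect. This sketch assumes full stripe writes with parity rotated evenly, so that each drive in the group receives one chunk per stripe; the function name and the 1,000GB workload are made up for illustration.

```python
def per_drive_writes_gb(user_data_gb, data_drives):
    """GB written to each drive in an N+parity group.

    Assumes full stripe writes and evenly rotated parity: every drive
    gets one chunk per stripe, which is 1/N of the user data in it.
    """
    return user_data_gb / data_drives


# The same 1,000GB of user writes on a narrow vs. a wide RAID 5 group.
print(round(per_drive_writes_gb(1000, 3), 1))    # 3+1 group
print(round(per_drive_writes_gb(1000, 14), 1))   # 14+1 group
```

Under these assumptions each drive in the 14+1 group absorbs roughly a fifth of what each drive in the 3+1 group does for the same workload.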

How about RAID 6 (dual parity)?

Same thing, it is a matter of how good the implementation is, how the write gathering is done and so forth.

What about RAID wearing out nand flash SSD?

While it is possible that it has or can occur depending on type of RAID implementation, lack of caching or optimization, configuration, type of SSD, RAID level and other things, in general I will say myth busted.

Want some proof?

I could go through a long technical proof point citing lots of facts, figures, experts and so forth, leaving you all silenced and dazed similar to the students listening to Ben Stein in Ferris Bueller’s Day Off (Click here to see what I mean) asking “anybody, anybody, Bueller?”

Image via nostagjicmoviesandthings.blogspot.com

How about some simple SSD and storage math?

On a very conservative basis, my estimate is that around 250PB of nand flash SSD drives are shipped and installed on a revenue basis attached to or in storage systems and appliances. That combines what Dell + DotHill + EMC + Fujitsu + HDS + HP + IBM (including TMS) + NEC + NetApp + Oracle ship, among other legacy along with new all flash as well as hybrid vendors (e.g. Cloudbyte, FusionIO (via their Nexgen acquisition), Kaminario, Greenbytes, Nutanix or Nimble, Purestorage, Starboard or Solidfire, Tegile or Tintri, Violin or Whiptail among others).

It is also a safe assumption that customers configure and use those and other storage systems with some form of RAID. Thus if things were as bad as some researchers, vendors and their pundits have made them out to be, wouldn’t we be hearing of those issues?

Is it just a RAID 5 problem and that RAID 6 magically corrects the problem?

Well, that depends on apples to apples vs. apples to oranges comparisons.

For example if you are using a 14+2 (16 drive) RAID 6 to compare to say a 3+1 (4 drive) RAID 5 that is not a fair comparison. Granted, it is a handy one if you are a vendor that supports wider RAID groups, stripes and ranks vs. those who do not. However also keep in mind that some legacy vendors actually also support wide stripes and RAID groups.

So in some cases the magic is not in the RAID level, rather the implementation or how configured or lack thereof.

Video

Watch this TechTarget produced video recorded live while I was at EMCworld 2013 to learn more.

Otherwise, ok, nuff said (for now).

Cheers
Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

NetApp EF540, something familiar, something new

NetApp announced the other day a new all nand flash solid-state device (SSD) storage system called the EF540 that is available now. The EF540 has some things new and cool, along with some things familiar, tried, true and proven.

What is new is that the EF540 is an all nand flash multi-level cell (MLC) SSD storage system. What is old is that the EF540 is based on the NetApp E-Series (read more here and here) and SANtricity software with hundreds of thousands of installed systems. As a refresher, the E-Series are the storage system technologies and solutions obtained via the Engenio acquisition from LSI in 2011.

Image via www.ntapgeek.com

The EF540 expands the NetApp SSD flash portfolio, which includes products such as FlashCache (read cache aka PAM) for controllers in ONTAP based storage systems. Other items in the NetApp flash portfolio include FlashPool SSD drives for persistent read and write storage in ONTAP based systems. Complementing FlashCache and FlashPool is the server-side PCIe caching card and software FlashAccel. NetApp is claiming to have revenue shipped 36PB of flash, complementing over 3 Exabytes (EB) of storage, while continuing to ship a large amount of SAS and SATA HDDs.

NetApp also previewed its future FlashRay storage system that should appear in beta later in 2013 and general availability in 2014.

In addition to SSD and flash related announcements, NetApp also announced enhancements to its ONTAP FAS/V6200 series including the FAS/V6220, FAS/V6250 and FAS/V6290.

Some characteristics of the NetApp EF540 and SANtricity include:

  • Two models with 12 or 24 x 6Gb/s SAS 800GB MLC SSD devices
  • Up to 9.6TB or 19.2TB physical storage in a 2U (3.5 inch) tall enclosure
  • Dual controllers for redundancy, load-balancing and availability
  • IOP performance of over 300,000 4KByte random 100% reads under 1ms
  • 6GByte/sec performance of 512KByte sequential reads, 5.5GByte/sec random reads
  • Multiple RAID levels (0, 1, 10, 3, 5, 6) and flexible group sizes
  • 12GB of DRAM cache memory in each controller (mirrored)
  • 4 x 8GFC host server-side ports per controller
  • Optional expansion host ports (6Gb SAS, 8GFC, 10Gb iSCSI, 40Gb IBA/SRP)
  • Snapshots and replication (synchronous and asynchronous) including to HDD systems
  • Can be used for traditional IOP intensive little-data, or bandwidth for big-data
  • Proactive SSD wear monitoring and notification alerts
  • Utilizes SANtricity version 10.84
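
The capacity figures in the list above follow directly from the drive counts, and the same math gives a feel for what the different RAID levels cost in usable space. This is a simplified sketch that assumes a single RAID group spanning all drives, with no hot spares or formatting overhead; the function names are hypothetical and this is not how SANtricity reports capacity.

```python
DRIVE_GB = 800  # per-drive capacity from the list above


def raw_capacity_gb(drive_count):
    return drive_count * DRIVE_GB


def usable_capacity_gb(drive_count, raid_level):
    """Usable GB for one RAID group across all drives (simplified)."""
    parity_drives = {"0": 0, "3": 1, "5": 1, "6": 2}
    if raid_level in ("1", "10"):
        data_drives = drive_count // 2       # half the drives hold mirror copies
    elif raid_level in parity_drives:
        data_drives = drive_count - parity_drives[raid_level]
    else:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    return data_drives * DRIVE_GB


print(raw_capacity_gb(12), raw_capacity_gb(24))   # 9600 19200 (9.6TB and 19.2TB)
print(usable_capacity_gb(24, "6"))                # 17600
print(usable_capacity_gb(24, "10"))               # 9600
```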

EMC and NetApp (along with other vendors) continue to sell large numbers of HDDs as well as large amounts of SSD. Both EMC and NetApp are taking similar approaches of leveraging PCIe flash cards as cache, adding software functionality to complement underlying storage systems. The benefit is that the cache approach is less disruptive for many environments while allowing improved return on investment (ROI) of existing assets.

                                   | EMC        | NetApp
Storage systems with HDD and SSD   | VMAX, VNX  | FAS/V, E-Series
Storage systems with SSD cache     | FastCache  | FlashCache
All SSD based storage              | VMAX, VNX  | EF540
All new SSD system in development  | Project X  | FlashRay
Server side PCIe SSD cache         | VFCache    | FlashAccel
Partner ecosystems                 | Yes        | Yes

The best IO is the one that you do not have to do, however the next best are those that have the least cost or effect, which is where SSD comes into play. SSD is like real estate in that location matters in terms of providing benefit, as well as how much space or capacity is needed.

What does this all mean?
The NetApp EF540 based on the E-Series storage system architecture is like one of its primary competitors (e.g. EMC VNX, also available as an all-flash model). The similarity is that both have been competitors and both have been around for over a decade with hundreds of thousands of installed systems. Both also continue to evolve their code base, leveraging new hardware and software functionality. These improvements have resulted in improved performance, availability, capacity, energy effectiveness and cost reduction.

From a performance perspective, there are plenty of public workloads and benchmarks including Microsoft ESRP and SPC among others to confirm its performance. Watch for NetApp to release EF540 SPC results given their history of doing so with other E-Series based systems. With those or other results, compare and contrast to other solutions looking not just at IOPS or MB/sec (bandwidth), also latency, functionality and cost.

What does the EF540 compete with?
The EF540 competes with all flash-based SSD solutions (Violin, Solidfire, Purestorage, Whiptail, Kaminario, IBM/TMS, up-coming EMC Project “X” (aka XtremIO)) among others. Some of those systems use general-purpose servers combined with SSD drives and PCIe cards along with management software, where others leverage customized platforms with software. To a lesser extent, competition will also be mixed mode SSD and HDD solutions along with some PCIe target SSD cards for some situations.

What to watch and look for:
It will be interesting to view and contrast public price performance results using SPC or Microsoft ESRP among others to see how the EF540 compares. In addition, it will be interesting to compare other storage based, as well as SSD systems beyond the number of IOPS. What will be interesting is to keep an eye on latency, as well as bandwidth, feature functionality and associated costs.

Given that the NetApp E-Series are OEMed or sold by third parties, let’s see if something looking similar or identical to the EF540 appears at any of those or new partners. This includes traditional general purpose and little-data environments, along with cloud, managed service provider, high performance compute and high productivity compute (HPC), super computer (SC), big data and big bandwidth among others.

The EF540 could also appear as a storage or IO accelerator for large scale-out, clustered, grid and object storage systems for meta data, indices and key value stores among other uses, either direct attached to servers, or via shared iSCSI, SAS, FC and InfiniBand (IBA) SCSI RDMA Protocol (SRP).

Keep an eye on how the startups that have been primarily Just a Bunch Of SSD (JBOS) in a box start talking about adding new features and functionality such as snapshots, replication or price reductions. Also, keep an eye and ear open to what EMC does with project “X” along with NetApp FlashRay among other improvements.

For NetApp customers, prospects, partners, E-Series OEMs and their customers with the need for IO consolidation, or performance optimization for big-data, little-data and related applications, the EF540 opens up new opportunities and should be good news. For EMC competitors, they now have new competition, which also signals an expanding market with new opportunities in adjacent areas for growth. This also further signals the need for diverse SSD portfolios and product options to meet different customer application needs, along with increased functionality vs. lowest cost for high capacity fast nand SSD storage.

Disclosure: NetApp, Engenio (when LSI), EMC and TMS (now IBM) have been clients of StorageIO.

Ok, nuff said

Cheers gs

EMC VMAX 10K, looks like high-end storage systems are still alive (part II)

This is the second in a multi-part series of posts (read first post here) looking at whether large enterprise and legacy storage systems are dead, along with what today’s EMC VMAX 10K updates mean.

Thus on January 14 2013 it is time for a new EMC Virtual Matrix (VMAX) model 10,000 (10K) storage system. EMC has been promoting their January 14 live virtual event for a while now. The significance of January is that it (along with May or June) is when many new systems, solutions or upgrades are announced on a staggered basis.

Historically speaking, January and February, along with May and June, are when you have seen many of the larger announcements from EMC being made. Case in point, back in February of 2012 VFCache was released, then in May 2012 at EMCworld in Las Vegas there were 42 announcements made, with others later in the year.

Go back to January 2011 and there was the record-setting event in New York City complete with 26 people being compressed, deduped, single instanced, optimized, stacked and tiered into a Mini Cooper (Coop) automobile (read and view more here).

Click here to see images of the car stuffing or click here to watch a video.

Now back to the VMAX 10K enhancements

An example of a company, product family and specific storage system model still being alive is the VMAX 10K. Although this announcement by EMC is VMAX 10K centric, there is also a new version of the Enginuity software (firmware, storage operating system, valueware) that runs across all VMAX based systems including VMAX 20K and VMAX 40K. Read here, here, here and here to learn more about VMAX and Enginuity systems in general.

Some main themes of this announcement include Tier 1 reliability, availability and serviceability (RAS) storage systems functionality at tier 2 pricing for traditional, virtual and cloud data centers.

Some other themes of this announcement by EMC:

  • Flexible, scalable and resilient with performance to meet dynamic needs
  • Support private, public and hybrid cloud along with federated storage models
  • Simplified decision-making, acquisition, installation and ongoing management
  • Enable traditional, virtual and cloud workloads
  • Complement its siblings VMAX 40K, 20K and SP (Service Provider) models

Note that the VMAX SP is a model configured and optimized for easy self-service and private cloud, storage as a service (SaaS), IT as a Service (ITaaS) and public cloud service providers needing multi-tenant capabilities with service catalogs and associated tools.

So what is new with the VMAX 10K?

It is twice as fast (per EMC performance results) as the earlier VMAX 10K by leveraging faster 2.8GHz Intel Westmere vs. earlier 2.5GHz Westmere processors. In addition to faster cores, there are more of them, going from 4 to 6 on directors and from 8 to 12 on VMAX 10K engines. The PCIe (Gen 2) IO busses remain unchanged, as does the RapidIO interconnect. RapidIO is used for connecting nodes and engines, while PCIe is used for adapter and device connectivity. Memory stays the same at up to 128GB of global DRAM cache, along with dual virtual matrix interfaces (how the nodes are connected). Note that there is no increase in the amount of DRAM based cache memory in this new VMAX 10K model.

This should prompt the question: for traditional storage systems such as VMAX that are cache centric or cache dependent for performance, how much are they now dependent on CPUs and their associated L1 / L2 cache? Also, how much has the Enginuity code under the covers been enhanced to leverage the multiple cores and threads, thus shifting from being cache memory dependent to processor hungry?

Also new with the updated VMAX 10K:

  • Support for dense 2.5 inch drives, along with mixed 2.5 inch and 3.5 inch form factor devices with a maximum of 1,560 HDDs. This means support for 2.5 inch 1TB 7,200 RPM SAS HDDs, along with fast SAS HDDs, SLC/MLC and eMLC solid state devices (SSD) also known as enterprise flash drives (EFD). Note that with higher density storage configurations, good disk enclosures become more important to counter or prevent the effects of drive vibration, something that leading vendors are paying attention to and so should customers.
  • EMC is also with the VMAX 10K adding support for certain 3rd party racks or cabinets to be used for mounting the product. This means being able to mount the VMAX main system and DAE components into selected cabinets or racks to meet specific customer, colo or other environment needs for increased flexibility.
  • For security, VMAX 10K also supports Data at Rest Encryption (D@RE), which is implemented within the VMAX platform. All data is encrypted on every drive and every drive type (drive independent) within the VMAX platform to avoid performance impacts. It uses AES 256 fixed block encryption with FIPS 140-2 validation (#1610) and embedded or external key management, including RSA Key Manager. Note that since the storage system based encryption is done within the VMAX platform or controller, not only is the encrypt / decrypt off-loaded from servers, it also means that any device from SSD to HDD to third-party storage arrays can be encrypted. This is in contrast to drive based approaches such as self encrypting devices (SED) or other full drive encryption approaches. With embedded key management, encryption keys are kept and managed within the VMAX system, while external mode leverages RSA key management as part of a broader security solution approach.
  • In terms of addressing ease of decision-making and acquisition, EMC has bundled the core Enginuity software suite (virtual provisioning, FTS and FLM, DCP (dynamic cache partitioning), host I/O limits, Optimizer/virtual LUN and integrated RecoverPoint splitter). In addition there are bundles for optimization (FAST VP, EMC Unisphere for VMAX with heat map and dashboards), availability (TimeFinder for VMAX 10K) and migration (Symmetrix migration suite, Open Replicator, Open Migrator, SRDF/DM, Federated Live Migration). Additional optional software includes RecoverPoint CDP, CRR and CLR, Replication Manager, PowerPath, SRDF/S, SRDF/A and SRDF/DM, Storage Configuration Advisor, Open Replicator with Dynamic Mobility and ControlCenter/ProSphere package.

Who needs a VMAX 10K or where can it be used?

As the entry-level model of the VMAX family, certain organizations who are growing and looking for an alternative to traditional mid-range storage systems should be a primary opportunity. Assuming the VMAX 10K can sell at tier-2 prices with a focus of tier-1 reliability, feature functionality, and simplification while allowing their channel partners to make some money, then EMC can have success with this product. The challenge however will be helping their direct and channel partner sales organizations to avoid competing with their own products (e.g. high-end VNX) vs. those of others.

Consolidation of servers with virtualization, along with storage system consolidation to remove complexity in management and costs, should be another opportunity with the ability to virtualize third-party storage. I would expect EMC and their channel partners to place the VMAX 10K with its storage virtualization of third-party storage as an alternative to HDS VSP (aka USP/USPV) and the HP XP P9000 (Hitachi based) products, or for block storage needs the NetApp V-Series among others. There could be some scenarios where the VMAX 10K could be positioned as an alternative to the IBM V7000 (SVC based) for virtualizing third-party storage, or for larger environments, some of the software based appliances where there are scaling with stability (performance, availability, capacity, ease of management, feature functionality) concerns.

Another area where the VMAX 10K could see action, which will fly in the face of some industry thinking, is for deployment in new and growing managed service providers (MSP), public cloud, and community clouds (private consortiums) looking for an alternative to open source based, or traditional mid-range solutions. OTOH, I can’t wait to hear somebody think outside of both the old and new boxes about how a VMAX 10K could be used beyond traditional applications or functionality. For example, filling it up with a few SSDs, and then balancing with 1TB 2.5 inch SAS HDDs and 3.5 inch 3TB (or larger when available) HDDs as an active archive target leveraging the built-in data compression.

How about if EMC were to support cloud optimized HDDs such as the Seagate Constellation Cloud Storage (CS) HDDs that were announced late in 2012 as well as the newer enterprise class HDDs for opening up new markets? Also keep in mind that some of the new 2.5 inch SAS 10,000 (10K) HDDs have the same performance capabilities as traditional 3.5 inch 15,000 (15K) RPM drives in a smaller footprint to help drive and support increased density of performance and capacity with improved energy effectiveness.

How about attaching a VMAX 10K with the right type of cost-effective (aligned to a given scenario) SSD or HDDs or third-party storage to a cluster or grid of servers that are running OpenStack including Swift, CloudStack, Basho Riak CS, Cleversafe, Scality, Caringo, Ceph or even EMC’s own ATMOS (that supports external storage) for cloud storage or object based storage solutions? Granted, that would be thinking outside of the current or new box thinking to move away from RAID based systems in favor of low-cost JBOD storage in servers, however what the heck, let’s think in pragmatic ways.

Will EMC be able to open new markets and opportunities by making the VMAX and its Enginuity software platform and functionality more accessible and affordable, leveraging the VMAX 10K as well as the VMAX SP? Time will tell. After all, I recall back in the mid to late 90s, and then again several times during the 2000s, similar questions or conversations, not to mention predictions of the demise of the large traditional storage systems.

Continue reading about what else EMC announced on January 14 2013 in addition to VMAX 10K updates here in the next post in this series. Also check out Chuck’s EMC blog to see what he has to say.

Ok, nuff said (for now).

Cheers gs

IBM buys flash solid state device (SSD) industry veteran TMS

How much flash (or DRAM) based Solid State Device (SSD) do you want or need?

IBM recently took a flash step announcing it wants and needs more SSD capabilities in different packaging and functionality capabilities to meet the demands and opportunities of customers, business partners and prospects by acquiring Texas Memory Systems (TMS).

Unlike most of the current generation of SSD vendors, which (besides those actually making the dies (chips or semiconductors) or SSD drives) are startups or relatively new, TMS is the industry veteran. Where most of the current SSD vendors’ experience (as companies) is measured in months or at best years, TMS has seen several generations and SSD adoption cycles during its multi-decade existence.

What this means is that TMS has been around during past dynamic random access memory (DRAM) based SSD cycles or eras, as well as being an early adopter and player in the current nand flash SSD era or cycle.

Granted, some in the industry do not consider the previous DRAM based generation of products as being SSD, and vice versa, some DRAM era SSD aficionados do not consider nand flash as being real SSD. Needless to say that there are many faces or facets to SSD ranging in media (DRAM, and nand flash among others) along with packaging for different use cases and functionality.

IBM along with some other vendors recognizes that the best type of IO is the one that you do not have to do. However the reality is that some type of Input Output (IO) operations need to be done with computer systems. Hence the second best type of IO is the one that can be done with the least impact to applications in a cost-effective way to meet specific service level objective (SLO) requirements. This includes leveraging main memory or DRAM as cache or buffers, server-based PCIe SSD flash cards as cache or target devices, internal and external SSD drives, SSD drives and flash cards in traditional storage systems or appliances, and purpose-built SSD storage systems.

While TMS does not build the actual nand flash single level cell (SLC) or multi-level cell (MLC) SSD drives (like those built by Intel, Micron, Samsung, SanDisk, Seagate, STEC and Western Digital (WD) among others), TMS does incorporate nand flash chips or components that are also used by others who make nand flash PCIe cards and storage systems.

IMHO this is a good move for both TMS and IBM, both of whom have been StorageIO clients in the past (here, here and here), that was a disclosure btw ;), as it gives TMS, their partners and customers a clear path and a large organization able to invest in the technologies and solutions on a go forward basis. In other words, TMS, who had looked to be bought, gets certainty about their future, as do their clients.

IBM, who has used SSD based components such as PCIe flash SSD cards and SSD based drives from various suppliers, gets a PCIe SSD card of their own, along with purpose-built mature SSD storage systems that have lineages to both DRAM and nand flash-based experiences. Thus IBM controls some of their own SSD intellectual property (IP) for PCIe cards that can go, in theory, into their servers, as well as storage systems and appliances that use Intel based (e.g. xSeries from IBM) and IBM Power processor based servers as a platform. Examples include the DS8000 (Power processor) and the Intel based XIV, SONAS, V7000, SVC, ProtecTIER and PureSystems (some are Power based).

In addition IBM also gets a field proven purpose-built all SSD storage system to compete with those from startups (Kaminario, Purestorage, Solidfire, Violin and Whiptail among others), as well as those being announced from competitors such as EMC (e.g. project X and project thunder) in addition to SSD drives that can go into servers and storage systems.

The question should not be if SSD is in your future, rather where will you be using it, in the server or a storage system, as a cache or a target, as a PCIe target or cache card or as a drive or as a storage system. This also means the question of how much SSD do you need along with what type (flash or DRAM), for what applications and how configured among other topics.

Storage and Memory Hierarchy diagram where SSD fits

What this means is that there are many locations and places where SSD fits, one type of product or model does not fit or meet all requirements and thus IBM with their acquisition of TMS, along with presumed partnership with other SSD based components will be able to offer a diverse SSD portfolio.

The industry trend involves vendors such as Cisco, Dell, EMC, IBM, HP, NetApp, Oracle and others, all of whom are either physical server and storage vendors or, in the case of EMC, a virtual server vendor partnered with Cisco (vBlock and VCE) and Lenovo for physical servers.

Different types and locations for SSD

Thus it only makes sense for those vendors to offer diverse SSD product and solution offerings to meet different customer and application needs vs. having a single solution that users adapt to. In other words, if all you have is a hammer, everything needs to look like a nail; however if you have a tool box of various technologies, then it comes down to being able to leverage them, including articulating what to use when, where, why and how for different situations.

I think this is a good move for both IBM and TMS. Now let’s watch how IBM and TMS can go beyond the press release, slide decks and webex briefings covering why it is a good move to justify their acquisition and plans, and see the results of what is actually accomplished near and long-term.

Read added industry trends and perspective commentary about IBM buying TMS here and here, as well as check out these related posts and content:

How much SSD do you need vs. want?
What is the best kind of IO? The one you do not have to do
Is SSD dead? No, however some vendors might be
Has SSD put Hard Disk Drives (HDDs) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
EMC VFCache respinning SSD and intelligent caching (Part I)
SSD options for Virtual (and Physical) Environments: Part I Spinning up to speed on SSD
Speaking of speeding up business with SSD storage
Part I: PureSystems, something old, something new, something from big blue
The Many Faces of Solid State Devices/Disks (SSD)
SSD and Green IT moving beyond green washing

Meanwhile, congratulations to both IBM and TMS, ok, nuff said (for now).

Cheers Gs

How much SSD do you need vs. want?

I have been getting asked by IT customers, VARs and even vendors how much solid state device (SSD) storage is needed or should be installed to address IO performance needs, to which my standard answer is it depends.

I am also being asked if there is a rule of thumb (RUT) of how much SSD you should have, either in terms of the number of devices or a percentage; IMHO, the answer is it depends. Sure, there are different RUTs floating around based on different environments, applications and workloads, however are they applicable to your needs?

What I would recommend is instead of focusing on percentages, RUTs, or other SWAG estimates or PIROMA calculations, look at your current environment and decide where the activity or issues are. If you know how many fast hard disk drives (HDD) are needed to get to a certain performance level and amount of used capacity, that is a good starting point.
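
As a quick sketch of that starting point, compare how many HDDs a workload needs for performance vs. how many it needs for capacity. The function name and the per-drive figures below are placeholders (assumptions); plug in numbers from your own environment.

```python
import math


def hdds_needed(target_iops, used_capacity_gb, iops_per_hdd, gb_per_hdd):
    """Return (drives needed for performance, drives needed for capacity)."""
    for_performance = math.ceil(target_iops / iops_per_hdd)
    for_capacity = math.ceil(used_capacity_gb / gb_per_hdd)
    return for_performance, for_capacity


# Hypothetical workload: 20,000 IOPS against 6,000GB of used capacity,
# on drives assumed to deliver 200 IOPS and hold 600GB each.
perf_drives, cap_drives = hdds_needed(20000, 6000, 200, 600)
print(perf_drives, cap_drives)   # 100 10
```

When the performance count is far larger than the capacity count, as in this made-up case, you are buying spindles for IOPS rather than space, which is a sign of where SSD may fit.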

If you do not have that information, use tools from your server, storage or third-party provider to gain insight into your activity to help size SSD. Also, if you have a database environment and are not familiar with the tools, talk with your DBA’s to have them run some reports that show performance information the two of you can discuss to zero in on hot spots or opportunities for SSD.

Keep in mind when looking at SSD what it is that you are trying to address by installing SSD. For example, is there a specific or known performance bottleneck resulting in poor response time or latency, or is there a general problem or perceived opportunity?

Is there a lack of bandwidth for large data transfers, or is there a constraint on how many IO operations per second (IOPS), transactions or other activity can be done in a given amount of time? In other words, the more you know about where or what the bottleneck is, including if you can trace it back to a single file, object, database, database table or other item, the closer you are to answering how much SSD you will need.

As an example, if using third-party tools or those provided by SSD vendors or other sources you decide that your IO bottlenecks are database transaction logs and system paging files, then having enough SSD space capacity to fit those is part of the solution. However, what happens when you remove the first set of bottlenecks? What new ones will appear, and will you have enough space capacity on your SSD to accommodate the next in line hot spot?
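To make the "next in line hot spot" question concrete, here is a small Python sketch that ranks candidate hot spots by IO activity per GB and places them on SSD until space runs out. The greedy ranking, the function name and every size and IOPS figure are assumptions for illustration only; real numbers would come from your own tools and reports.

```python
def pick_hot_spots(candidates, ssd_capacity_gb):
    """Greedy placement: busiest data per GB first.

    candidates is a list of (name, size_gb, iops) tuples.
    Returns (names placed on SSD, GB left free).
    """
    ranked = sorted(candidates, key=lambda c: c[2] / c[1], reverse=True)
    placed, free_gb = [], ssd_capacity_gb
    for name, size_gb, _iops in ranked:
        if size_gb <= free_gb:
            placed.append(name)
            free_gb -= size_gb
    return placed, free_gb

# Hypothetical hot spots: (name, size in GB, observed IOPS)
candidates = [
    ("transaction logs", 40, 6000),
    ("paging files", 32, 2500),
    ("index tablespace", 120, 4000),
    ("archive tables", 400, 300),
]

for capacity in (100, 200):
    placed, free_gb = pick_hot_spots(candidates, capacity)
    print(f"{capacity} GB SSD -> {placed}, {free_gb} GB free")
```

With these made-up figures the smaller device holds the logs and paging files, but has no room for the index tablespace that becomes the next bottleneck, while the larger one does.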

Keep in mind that you may want more SSD; however, what can you get budget approval to buy now without having more proof and a business case? Get some extra SSD space capacity to use for what you are confident can address other bottlenecks, or enable new capabilities.

On the other hand, if you can only afford enough SSD to get started, make sure you also protect it. If you decide that two SSD devices (PCIe cache or target cards, drives or appliances) will take care of your performance and capacity needs, make sure to keep availability in mind. This means having extra SSD devices for RAID 1 mirroring, replication or other forms of data protection and availability. Keep in mind that while traditional hard disk drive (HDD) storage is often gauged on cost per capacity, or dollar per GByte or dollar per TByte, with SSD measure its value on cost to performance. For example, how many IOPS, or how much response time improvement or bandwidth, are obtained to meet your specific needs per dollar spent?
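Here is a minimal Python sketch of the two yardsticks side by side. The prices, capacities and IOPS figures are hypothetical placeholders, not quotes for any real product; the point is only that the same two devices rank in opposite order depending on which metric you use.

```python
def cost_per_gb(price, capacity_gb):
    """Dollars per GByte of space capacity."""
    return price / capacity_gb

def cost_per_iops(price, iops):
    """Dollars per IO operation per second of performance."""
    return price / iops

# Hypothetical devices; every number here is an assumption for illustration.
devices = {
    "HDD": {"price": 300.0, "capacity_gb": 600, "iops": 200},
    "SSD": {"price": 1200.0, "capacity_gb": 200, "iops": 20_000},
}

for name, d in devices.items():
    per_gb = cost_per_gb(d["price"], d["capacity_gb"])
    per_iops = cost_per_iops(d["price"], d["iops"])
    print(f"{name}: ${per_gb:.2f} per GB, ${per_iops:.2f} per IOPS")
```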

Related links
What is the best kind of IO? The one you do not have to do
Is SSD dead? No, however some vendors might be
Speaking of speeding up business with SSD storage
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
EMC VFCache respinning SSD and intelligent caching (Part I)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD

Ok, nuff said for now

Cheers Gs

Are large storage arrays dead at the hands of SSD?

Storage I/O trends

An industry trends and perspective.

Are large storage arrays dead at the hands of SSD? Short answer: no, not yet.
There is still a place for traditional storage arrays or appliances, particularly those with extensive features, functionality and reliability, availability and serviceability (RAS). In other words, there is still a place for large (and small) storage arrays or appliances including those with SSDs.

Is there a place for newer flash SSD storage systems, appliances and architectures? Yes
Similar to how traditional midrange storage arrays or appliances have found their roles vs. traditional higher end so-called enterprise arrays. Think as an example EMC CLARiiON/VNX or HP EVA/P6000 or HDS AMS/HUS or NetApp FAS or IBM DS5000 or IBM V7000 among others vs. EMC Symmetrix/DMX/VMAX or HP P10000/3Par or HDS VSP/USP or IBM DS8000. In addition to traditional enterprise or high-end storage systems and midrange, also known as modular, there are also specialized appliances or targets such as for backup/restore and archiving. Also do not forget the IO performance SSD appliances like those from TMS among others that have been around for a while.

Is the role of large storage systems changing or evolving? Yes
Given their scale and ability to do large amounts of work in a dense footprint, for some the role of these systems is still mission critical tier 1 application and data support. For other environments, their role continues to evolve being used for high-density tier 2 bulk or even near-line storage for on-line access at scale.

Does this mean there is competition between the old and new systems? Yes
In some circumstances, as we have seen already with SSD solutions. Some will be positioned as competing or as replacements while others as complementing. For example, in the PCIe flash SSD card segment EMC VFCache is positioned as complementing Dell, EMC, HDS, HP, IBM, NetApp, Oracle or other storage vs. FusionIO, which positions itself as a replacement for the above and others. Another scenario is how some SSD vendors have and continue to position their all-flash SSD arrays using either drives or PCIe cards to complement and coexist with other storage systems in an environment (e.g. data center level tiering) vs. as a replacement. Also keep in mind SSD solutions that support a mix of flash devices and traditional HDDs for capacity and cost savings or cloud access in the same solution.

Does this mean that the industry has adopted all SSD appliances as the state of the art?
Avoid confusing industry adoption or talk with industry and customer deployment. They are similar; however, one is focused on what the industry talks about or discusses as state of the art or the future while the other is what customers are doing. Certainly some of the new flash SSD appliance and storage startups such as Solidfire, Nexgen, Violin, Whiptail or veteran TMS among others have promising futures, some of which may actually be in play with the current SSD market shakeout and consolidation.

Does that mean everybody is going SSD?
SSD customer adoption and deployment continues to grow, however so too does the deployment of high-capacity HDDs.

Do SSDs need HDDs, do HDDs need SSDs? Yes
Granted, there are environments where needs can be addressed by all of one or the other. However, at least near term, there is a very strong market for tiering and a mix of SSD, some fast HDDs and lots of high-capacity HDDs to meet various needs including performance, availability, capacity, energy and economics. After all, there is no such thing as a data or information recession, yet budgets are tight or being reduced. Likewise, people and data are living longer.

What does this mean?
If there were no such thing as a data recession and budgets were a non-issue, perhaps everything could move to all flash SSD storage systems. However, we also know that people and data are living longer along with changing data life-cycle patterns. There is also the need for performance to close the traditional data center IO performance to space capacity gap and bottlenecks as well as to store and keep data longer.

There will continue to be a need for a mix of high-capacity and high performance. More IO will continue to gravitate towards the IO appliances; however, more data will settle in for longer-term retention and continued access as data life-cycles continue to evolve. Watch for more SSD and cache in the large systems, along with higher density SAS-NL (SAS Near Line e.g. high capacity) type drives appearing in those systems.

If you like shiny new toys or technology (SNTs) to buy, sell or talk about, there will be plenty of those to continue industry adoption, while for those who are focused on industry deployment, there will be a mix of new, and continued evolution for implementation.

Related links
Industry adoption vs. industry deployment, is there a difference?

Industry trend: People plus data are aging and living longer

No Such Thing as an Information Recession

Changing Lifecycles & Data Footprint Reduction
What is the best kind of IO? The one you do not have to do
Is SSD dead? No, however some vendors might be
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
IT and storage economics 101, supply and demand
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
Researchers and marketers don’t agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD

Ok, nuff said for now

Cheers Gs

More Storage IO momentus HHDD and SSD moments part II

This follows the first of a two-part series on my latest experiences with Hybrid Hard Disk Drives (HHDD’s) and Solid State Devices (SSD’s). In my last momentus moment post I discussed what I have done with HHDD’s, setting the stage for expanded SSD use. I have the newer HHDD’s, e.g. Seagate Momentus XT II 750GB (8GB SLC nand flash) installed and have since bought another from Amazon as well as having some of the older 500GB (4GB SLC nand flash) in various systems. Those are all functioning great; however, I am still waiting and looking forward to the rumored firmware enhancements to boost write capabilities.

This brings me up to the latest momentus moment which now includes SSD’s.

Well, it’s two years later and I now have a 256GB (usable capacity is lower) Samsung SSD that I bought from Amazon.com and installed in one of my laptops, and just as when I made the first switch to HHDD’s, I also have a backup copy/clone to fall back to in case of emergency.

Was it worth the wait? Yes, particularly using the HHDD’s to bridge the gap and enable some productivity gain which more than paid for them based on some different projects. I’m already seeing productivity improvements that will make future upgrades easier to justify (to myself).

I deviated from my strategy a bit and installed the SSD about six months earlier than I was planning to because of a physical barrier. That physical barrier was that my new traveling laptop only accepts 7mm height 2.5 inch small form factor devices and the 750GB HHDD that I had planned on installing was 2.5mm too thick, which pushed up the SSD installation.

What will become of the 750GB HHDD? It’s being redeployed to help speed up file serving, backups and other functions.

Will I replace the HHDD’s in my other workstations and laptops now with SSD’s? Across the board no, not yet, however there is one other system that is a prime candidate to maybe upgrade in a month or two (maybe less).

Will I stick with the Samsung SSD’s or look at other options? I’m keeping my options open and using this as a gauge to test and compare other options in a real world working environment as opposed to a lab bench test simulation. In other words, taking the next step past the lab test and product reviews, gaining comfort and confidence and then trying out with real use activity.

What will happen in the future as I install more SSD’s and have surplus HHDD’s? Redeploy them of course into file or NAS servers and backup targets, where they in turn will replace HDD’s that will either get retired, or be redeployed to replace older, smaller capacity, higher cost to handle HDD’s used for offsite protection.

I tried using the software that came with the SSD to do the cloning and should have known better; however, I wanted to see what the latest version of Ghost was like (it was a waste of time, to be polite). Instead I used Seagate DiscWizard (aka Acronis) which requires at least one Seagate product (source or target) for cloning.

Cloning from the Seagate HHDD that had previously been cloned from the Hitachi HDD that came with the laptop was a non-issue. However, I wanted to see what would happen if I attached the Samsung SSD to the Seagate GoFlex cable and cloned directly from the Hitachi HDD; it worked. Hence another reason to have some of the Seagate GoFlex cables (USB and eSATA) like the ones I bought at Amazon.com around in your toolbox.

While I do not have concrete empirical numbers to share, cloning from a HDD to a SSD is, shall we say, fast; however, what’s really fun to watch is cloning from a HHDD to a SSD using an eSATA (GoFlex) connector adapter. The reason I say that it is fun is that you don’t have to sit and wait for hours. It’s not minutes to move 100s of GBs; however, you can very much see the progress bar move at a good pace.

Also, put the HHDD on an eSATA port and try it out as a backup or data dump target if you have the need for speed, capacity and cost effectiveness; yes, it’s fast, has lots of capacity and so forth. Now if Seagate and Synology or EMC Iomega would get their acts together and add support for the HHDD’s in those different unified SMB and SOHO NAS solutions, that would be way cool.

Will I be racing to put SSD’s in my other laptops or workstations soon? Probably not as there are things in the works and working their way into and through the market place that I wanted to wait for, and thus will wait for now, that is unless a more interesting opportunity pops up.

Related links on SSD, HHDD and HDD
More Storage IO momentus HHDD and SSD moments part I
More Storage IO momentus HHDD and SSD moments part II
IO IO it is off to Storage and IO metrics we go
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Other Momentus moments posts here here, here, here and here
SSD and Storage System Performance
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
Why SSD based arrays and storage appliances can be a good idea (Part II)
IT and storage economics 101, supply and demand
Researchers and marketers don’t agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?
SSD options for Virtual (and Physical) Environments Part IV: What type of SSD is best for your needs

Ok, nuff said for now.

Cheers Gs

More Storage IO momentus HHDD and SSD moments part I

This is the first of a two-part series on my latest experiences with HHDD’s and SSD’s.

About two years ago I wanted to start installing solid state devices (SSD’s) into my workstations and laptops. Like many others, I found that the high price for the limited capacity of the then-current generation of SSD’s did not make for a good business decision based on my needs. Don’t get me wrong, I have been a huge fan of SSD for decades as an IT user, vendor, analyst, consultant and consumer and still am. In fact I have some SSD’s used for different purposes as well as many Hard Disk Drives (HDD) and Hybrid Hard Disk Drives (HHDD’s). Almost two years ago when I first tested the HHDD’s, I did a first post in this ongoing series, and this two-part post is part of that string of experiences observed evolving from HDD’s to HHDD’s to SSD’s.


Image courtesy of Seagate.com

As a refresher, HHDD’s like the Seagate Momentus XT combine a traditional 7,200 RPM 2.5 inch 500GB or 750GB HDD with an integrated single level cell (SLC) nand flash SSD within the actual device. The SSD in the HHDD’s is part of the HDD’s controller complementing the existing DRAM buffer by adding 4GB (500GB models) or 8GB (750GB models) of fast nand flash SSD cache. This means that no external special controller, adapter, data movement or migration software are required to get the performance boost over a traditional HDD and the capacity above a SSD at an affordable cost. In other words, the HHDD’s bridge the gap between those who need large capacity and some performance increases, without having to spend a lot on a lower capacity SSD.
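To show why a few GB of flash cache in front of a large HDD can help, here is a toy Python simulation of a read cache. It assumes a simple least recently used (LRU) policy, which is my assumption for illustration and not a description of how the Momentus XT firmware actually decides what to keep in flash; the block numbers and workload are also made up.

```python
from collections import OrderedDict

def read_hit_ratio(accesses, cache_blocks):
    """Fraction of reads served from a cache holding cache_blocks blocks (LRU)."""
    if not accesses:
        return 0.0
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(accesses)

# Hypothetical workload: a small hot set read over and over, then a cold scan.
hot = [0, 1, 2, 3] * 25           # 100 reads across 4 hot blocks
cold = list(range(100, 150))      # 50 one-time reads
workload = hot + cold

for size in (2, 4):
    print(f"cache of {size} blocks: hit ratio {read_hit_ratio(workload, size):.2f}")
```

In this sketch a cache that is big enough for the hot set serves most reads from flash, while one that is too small gets no benefit at all, which is the same sizing question as with any SSD cache.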

However, based on my needs or business requirements two years ago, I found the justification to get all the extra performance of SSD not quite there. Two years ago my thinking was that it would be about two, maybe three years before the right point for a mix of performance, availability (or reliability e.g. duty cycles), capacity and economics aligned.

Note that this was based on my specific needs and requirements as opposed to my wants or wishes (I wanted SSD back then, however my budget needed to go elsewhere). My requirements and performance needs are probably not the same as yours or others might be. I also wanted to see the incremental technology, product and integration improvements ranging from duty cycle or program/erase cycles (P/E) with newer firmware and flash translation layers (FTLs) among other things. Particularly with multilevel cell (MLC) or enhanced multilevel cell (eMLC) which helps bring the cost down while boosting the capacity, I’m seeing enough to have more confidence in those devices. Note that for the past couple of years I have used single level cell (SLC) nand flash SSD technology in my HHDD’s, the same SSD flash technology that has been found in enterprise class storage.

I wanted SSD’s two years ago in my laptops and workstations to improve productivity, which involves a lot of content creation in addition to consumption; however, as mentioned above, there were barriers. So instead of sitting on the sidelines, waiting for SSD’s to either become lower cost, or more capacity for a given cost, or wishing somebody would send me some free stuff (that may or may not have worked), I took a different route. That route was to try the HHDD’s such as Seagate Momentus XT.

Disclosure: Seagate sent me my first HHDD for first testing and verifications before I bought several more from Amazon.com and installed them in all laptops, workstations and a server (not all servers have the HHDD’s, or at least not yet).

The main reason I went with the HHDD’s two years ago and continue to use them today is to bridge the gap and gain some benefit vs. waiting and wishing and talking about what SSD’s would enable me to do in the future while missing out on productivity enhancements.

The HHDD’s also appealed to me in that my laptops are space constrained for putting in two drives and playing the hybrid configuration game of installing both a small SSD and HDD and migrating data back and forth. Sure, I could do that in the office or carry an extra external device around; however, been there, done that in the past and I want to move away from those types of models where possible.

Related links on SSD, HHDD and HDD
More Storage IO momentus HHDD and SSD moments part I
More Storage IO momentus HHDD and SSD moments part II
IO IO it is off to Storage and IO metrics we go
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Other Momentus moments posts here here, here, here and here
SSD and Storage System Performance
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
Why SSD based arrays and storage appliances can be a good idea (Part II)
IT and storage economics 101, supply and demand
Researchers and marketers don’t agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?
SSD options for Virtual (and Physical) Environments Part IV: What type of SSD is best for your needs

Ok, nuff said for now, let’s resume this discussion in part II.

Cheers Gs

Why SSD based arrays and storage appliances can be a good idea (Part II)

This is the second of a two-part post about why storage arrays and appliances with SSD drives can be a good idea; here is a link to the first post.

So again, why would putting drive form factor SSDs into existing storage systems, arrays and appliances be a bad idea?

Benefits of SSD drives in storage systems, arrays and appliances:

  • Familiarity with customers who buy and use these devices
  • Reduces time to market enabling customers to innovate via deployment
  • Establish comfort and confidence with SSD technology for customers
  • Investment protection of currently installed technology (hardware and software)
  • Interoperability with existing interfaces, infrastructure, tools and policies
  • Reliability, availability and serviceability (RAS) depending on vendor implementation
  • Features and functionality (replicate, snapshot, policy, tiering, application integration)
  • Known entity in terms of hardware, software, firmware and microcode (good or bad)
  • Share SSD technology across more servers or accessing applications
  • Good performance assuming no controller, hardware or software bottlenecks
  • Wear leveling and other SSD flash management if implemented
  • Can end performance bottlenecks if backend (drives) are a problem
  • Coexist or complemented with server-based SSD caching

Note, the mere presence of SSD drives in a storage system, array or appliance will not guarantee that the above items are enabled, nor that they are used to their full potential. Different vendors and products will implement SSD drive support to various degrees of extensibility, so look beyond the check box of features and functionality. Dig in and understand how extensive and robust the SSD implementation is to meet your specific requirements.

Caveats of SSD drives in storage systems, arrays and appliances:

  • May not use full performance potential of nand flash SLC technology
  • Latency can be an issue for those who need extreme speed or performance
  • May not be the most innovative newest technology on the block
  • Fun for startup vendors, marketers and their fans to poke fun at
  • Not all vendors add value or optimization for endurance of drive SSD
  • Seen as not being technologically advanced vs. legacy or mature systems

Note that different vendors will have various performance characteristics, some good for IOPS, others for bandwidth or throughput, while others for latency or capacity. Look at different products to see how they will vary to meet your particular needs.

Cost comparisons are tricky. SSD in HDD form factors certainly cost more than raw flash dies, however PCIe cards and FTL (flash translation layer) controllers also cost more than flash chips by themselves. In other words, apples to apples comparisons are needed. In the future, ideally the baseboard or motherboard vendors will revise the layout to support nand flash (or its replacement) with DRAM DIMM type modules along with associated FTL and BIOS to handle the flash program/erase cycles (P/E) and wear leveling management, something that DRAM does not have to encounter. While that provides great location or locality of reference (figure 1), it is also a more complex approach that takes time and industry cooperation.

Figure 1: Locality of reference for memory and storage

Certainly, for best performance, just like real estate, location matters and thus locality of reference comes into play. That is, put the data as close to the server as possible; however, when sharing is needed, then a different approach or a companion technique is required.
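A back of the envelope way to see why location matters is to weight the latency of each location by how often an IO is served from it. The Python sketch below does that for two locations; the latency values and hit ratios are hypothetical placeholders, not measurements of any product.

```python
def average_latency_us(local_hit_ratio, local_us, shared_us):
    """Average IO latency when a fraction of IOs is served close to the server."""
    return local_hit_ratio * local_us + (1.0 - local_hit_ratio) * shared_us

# Assumed figures: 100 microseconds for flash local to the server,
# 1,000 microseconds for a trip to a shared storage system.
LOCAL_US = 100.0
SHARED_US = 1000.0

for ratio in (0.0, 0.5, 0.9):
    avg = average_latency_us(ratio, LOCAL_US, SHARED_US)
    print(f"{ratio:.0%} served locally -> about {avg:.0f} microseconds on average")
```

The more of the active data that sits close to the server, the lower the average, yet whatever still has to be shared continues to pay the longer trip, which is why a companion technique is needed rather than a single answer.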

Here are some general thoughts about SSD:

  • Some customers and organizations get the value and role of SSD
  • Some see where SSD can replace HDD, others see where it complements
  • Yet others are seeing the potential, however are moving cautiously
  • For many environments better than current performance is good enough
  • Environments with the need for speed need every bit of performance they can get
  • Storage systems and arrays or appliances continue to evolve including the media they use
  • Simply looking at how some storage arrays, systems and appliances have evolved, you can get an idea of how they might look in the future, which could include not only SAS as a backend or target, but also PCIe. After all, it was not that long ago that backend drive connections went from proprietary to open parallel SCSI or SSA to Fibre Channel loop (or switched) to SAS.
  • Engineers and marketers tend to gravitate to newer products and technology, which is good, as we need continued innovation on that front.
  • Customers and business people tend to gravitate towards deriving greatest value out of what is there for as long as possible.
  • Of course, both of the latter two points are not always the case and can be flip flopped.
  • Ultrahigh end environments and corner case applications will continue to push the limits and are target markets for some of the newer products and vendors.
  • Likewise, enterprise, mid market and other mainstream environments (outside of their corner case scenarios) will continue to push known technology to its limits as long as they can derive some business benefit value.

While not perfect, SSD in a HDD form factor with a SAS or SATA interface properly integrated by vendors into storage systems (or arrays or appliances) is a good fit for many environments today. Likewise, for some environments, new from the ground up SSD based solutions that leverage flash DIMM or daughter cards or PCIe flash cards are a fit. So too are PCIe flash cards either as a target, or as cache to complement storage systems (arrays and appliances). Certainly, drive slots in arrays take up space for SSD; however, so too does occupying PCIe space, particularly in high density servers that require every available socket and slot for compute and DRAM memory. Thus, there are pros and cons, features and benefits of various approaches, and which is best will depend on your needs and perhaps preferences, which may or may not be binary.

I agree that for some applications and solutions, non drive form factor SSD make sense while in others, compatibility has its benefits. Yet in other situations nand flash such as SLC combined with HDD and DRAM tightly integrated such as in my Momentus XT HHDD is good for laptops, however probably not a good fit for enterprise yet. Thus, SSD options and placements are not binary, of course, sometimes opinions and perspectives will be.

For some situations PCIe based cards in servers or appliances make sense, either as a target or as cache. Likewise, for other scenarios drive format SSD make sense in servers and storage systems, appliances, arrays or other solutions. Thus while all of those approaches are used for storing binary digital data, the solutions of what to use when and where often will not be binary, that is unless your approach is to use one tool or technique for everything.

Here are some related links to learn more about SSD, where and when to use what:
Why SSD based arrays and storage appliances can be a good idea (Part I)
IT and storage economics 101, supply and demand
Researchers and marketers don’t agree on future of nand flash SSD
Speaking of speeding up business with SSD storage
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments: Part I Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments, Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?

Ok, nuff said for now.

Cheers gs

Why SSD based arrays and storage appliances can be a good idea (Part I)

This is the first of a two-part series, you can read part II here.

Robin Harris (aka @storagemojo) recently in a blog post asks a question and thinks solid state devices (SSDs) using SAS or SATA interface in traditional hard disk drive (HDD) form factors are a bad idea in storage arrays (e.g. storage systems or appliances). My opinion is that as with many things about storing, processing or moving binary digital data (e.g. 1s and 0s) the answer is not always clear. That is, there may not be a right or wrong answer; instead it depends on the situation, use or perhaps abuse scenario. For some applications or vendors, adding SSD packaged in HDD form factors to existing storage systems, arrays and appliances makes perfect sense, likewise for others it does not, thus it depends (more on that in a bit). While we are talking about SSD, Ed Haletky (aka @texiwill) recently asked a related question of Fix the App or Add Hardware, which could easily be morphed into a discussion of Fix the SSD, or Add Hardware. Hmmm, maybe a future post idea exists there.

Let’s take a step back for a moment and look at the bigger picture of what prompts the question of what type of SSD to use where and when, as well as why various vendors want you to look at things a particular way. There are many options for using SSD that is packaged in various ways to meet diverse needs including here and here (see figure 1).

Figure 1: Various packaging and deployment options for SSD

The growing number of startup and established vendors with SSD enabled storage solutions vying to win your hearts, minds and budget is looking like the annual NCAA basketball tournament (aka March Madness and march metrics here and here). Some vendors have added or are adding SSD with SAS or SATA interfaces that plug into existing enclosures (drive slots). These SSDs have the same form factor as 2.5 inch small form factor (SFF) or 3.5 inch HDDs with a SAS or SATA interface for physical and connectivity interoperability. Other vendors have added PCIe based SSD cards to their storage systems or appliances as a cache (read or read and write) or a target device similar to how these cards are installed in servers.

Simply adding SSD either in a drive form factor or as a PCIe card to a storage system or appliance is only part of a solution. Sure, the hardware should be faster than a traditional spinning HDD based solution. However, what differentiates the various approaches and solutions is what is done with the storage systems or appliances software (aka operating system, storage applications, management, firmware or micro code).

So are SSD based storage systems, arrays and appliances a bad idea?

If you are a startup or established vendor able to start from scratch with a clean sheet design not having to worry about interoperability and customer investment protection (technology, people skills, software tools, etc), then you would want to do something different. For example, leverage off the shelf components such as a PCIe flash SSD card in an industry standard server combined with your software for a solution. You could also use extra DRAM memory in those servers combined with PCIe flash SSD cards perhaps even with embedded HDDs for a backing or preservation medium.

Other approaches might use a mix of DRAM and PCIe flash cards, as either a cache or target, combined with some drive form factor SSDs. In other words, there is no right or wrong approach; sure, there are different technical merits that have advantages for various applications or environments. Likewise, people have preferences, particularly the technology focused who tend to like one approach vs. another. Thus, we have many options to leverage, use or abuse.

In his post, Robin asks a good question: if nand flash SSD were being put into a new storage system, why not use the PCIe backplane vs. using nand flash on DIMM vs. using drive formats, all of which are different packaging options (Figure 1). Some startups have gone the all backplane approach, some have gone with the drive form factor, some have gone with a mix and some even use HDDs in the background. Likewise some traditional storage system and array vendors who support a mix of SSD and HDD drive form factor devices also leverage PCIe cards, either as a server-based cache (e.g. EMC VFCache) or installed as a performance accelerator module (e.g. NetApp PAM) in their appliances.

While most vendors who put SSD drive form factor drives into their storage systems or appliances (or servers for that matter) use them as data targets for creating LUNs or file systems, others use them for internal functionality. By internal functionality I mean that instead of the SSD appearing as another drive or target, they are used exclusively by the storage system or appliance for caching or similar purposes. On storage systems, this can be to increase the size of persistent cache, as EMC does on the CLARiiON and VNX (e.g. FAST Cache). Another use is on backup or dedupe target appliances where SSDs are used to store dictionary, index or meta data repositories as opposed to being a general data pool.

Part two of this post looks at the benefits and caveats of SSD in storage arrays.

Here are some related links to learn more about SSD, where and when to use what:
Why SSD based arrays and storage appliances can be a good idea (Part II)
IT and storage economics 101, supply and demand
Researchers and marketers don’t agree on future of nand flash SSD
Speaking of speeding up business with SSD storage
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments: Part I Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments, Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?

Ok, nuff said for now, check part II.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Researchers and marketers don’t agree on future of nand flash SSD

Marketers, particularly those involved with anything resembling Solid State Devices (SSD), will tell you SSD is the future, as will some researchers along with their fans and pundits. Some will tell you that the future only has room for SSD, with the current flavor du jour being nand flash (both Single Level Cell aka SLC and Multi Level Cell aka MLC), and that any other form of storage medium (e.g. Hard Disk Drives or HDD and tape summit resources) is dead, so avoid wasting your money on them.

Of course others and their fans or supporters who do not have an SSD play or product will tell you to forget about them, they are not ready yet.

Then there are those who take no sides per se, simply providing comments and perspectives along with things to be considered that also get used by others to spin stories for or against.

For the record, I have been a fan and user of various forms of SSD along with other variations of tiered storage mediums, using them where they fit best, for several decades as a customer in IT, as a vendor, analyst and advisory consultant. Thus my perspective and opinion is that SSDs do in fact have a very bright future. However I also believe that other storage mediums are not dead yet, although their roles are evolving while their technologies continue to be developed. In other words, use the right technology and tool, packaged and deployed in the best, most effective way for the task at hand.

Memory and tiered storage hierarchy

Consequently while some SSD vendors, their fans, supporters, pundits and others might be put off by some recent UCSD research that does not paint SSD and in particular nand flash in the best long-term light, it caught my attention and here is why. First I have already seen in different venues where some are using the research as a tool, club or weapon against SSD and in particular nand flash, which should be no surprise. Secondly I have also seen those who don’t agree with the research at best dismiss the findings. Others are using it as a conversation or topic piece for their columns or other venues such as here.

The reason the UCSD research caught my eye was that it appeared to be looking at how nand SSD technology will evolve from where it is today to where it will be in ten years or so.

While ten years may seem like a long time, just look back at how fast things evolved over the past decade. Granted the UCSD research is open to discussion, debate and dismissal, as is clear from the comments on this article here. However the research does give a counter point or perspective to some of the hype, which can mean that somewhere between the two extremes lies reality and where things are headed or need to be discussed. While I do not agree with all the observations or opinions of the research, it does give stimulus for discussing things including best practices around deployment vs. simply talking about adoption.

It has taken many decades for people to become comfortable or familiar with the pros and cons of HDD or tape for that matter.

Likewise some are familiar (good or bad) with DRAM based SSD of earlier generations. On the other hand, while many people use various forms of nand flash SSD ranging from what is inside their cell phone or SD cards for cameras to USB thumb drives to SSD on drives, on PCIe cards or in storage systems and appliances, there is still an evolving comfort and confidence level for business and enterprise storage use. Some have embraced it, some have dismissed it, and many if not most are intrigued and want to know more, using nand flash SSD in some shape or form while gaining confidence.

Part of gaining confidence is moving beyond the industry hype, looking at and understanding what the pros and cons are and how to leverage or work around the constraints. A long time ago a wise person told me that it is better to know the good, bad and ugly about a product, service or technology so that you could leverage the best, and configure, plan and manage around the bad to avoid or minimize the ugly. Based on that philosophy I find many IT customers and even some VARs and vendors wanting to know the good, the bad and the ugly, not for hanging a vendor or their technology and products out to dry, rather so that they can be comfortable in knowing when, where, why and how to use them to be most effective.

Industry Trends and Perspectives

Granted, getting some of the not so good information may need an NDA (Non Disclosure Agreement) or other confidentiality discussions; after all, what vendor or solution provider wants to show or let anything less than favorable out into the blogosphere, twittersphere, googleplus, tabloids, news sphere or other competitive landscape venues.

Ok, let's bring this back to the UCSD research report titled The Bleak Future of NAND Flash Memory

UCSD research report: The Bleak Future of NAND Flash Memory
Click here or on the above image to read the UCSD research report

I’m not concerned that the UCSD research was less than favorable as some others might be, after all, it is looking out into the future and if a concern, provides a glimpse of what to keep an eye on.

Likewise, looking back, the research report could be taken as simply a barometer of what could happen if no improvements or new technologies evolve. For example, the HDD would have hit the proverbial brick wall, also known as the superparamagnetic barrier, many years ago if new recording methods and materials had not been deployed including a shift to perpendicular recording, something that was recently added to tape.

Tomorrow's SSDs and storage mediums will still be based on nand flash including SLC, MLC, eMLC along with other variants, not to mention phase change memory (PCM) and other possible contenders.

Today's SSDs have shifted from being DRAM based with HDD or even flash-based persistent backing storage to nand flash-based, both SLC and MLC with enhanced or enterprise MLC appearing. Likewise the density of SSDs continues to increase, meaning more data packed into the same die or footprint and more dies stacked in a chip package to boost capacity while decreasing cost. However a big differentiator among SSDs is what happens behind the scenes, namely the quality of the firmware and low-level page management at the flash translation layer (FTL). Hence the saying that anybody with a soldering iron and the ability to pull together off the shelf FTLs and packaging can create some form of an SSD. How effective a product will be is based on the intelligence and robustness of the combination of the dies, FTL, controller and associated firmware and device drivers along with other packaging options plus the testing, validation and verification they undergo.

Various packaging options and where SSD can be deployed
Various SSD locations, types, packaging and usage scenario options

Good SSD vendors and solution providers I believe will be able to discuss your concerns around endurance, duty cycles, data integrity and other related topics to establish confidence with current and future issues, granted you may have to go under NDA to gain that insight. On the other hand, those who feel threatened or are not able or interested in addressing or demonstrating confidence for the long haul will be more likely to dismiss studies, research, reports, opinions or discussions that dig deeper into creating confidence via understanding of how things work so that customers can more fully leverage those technologies.

Some will view and use reports such as the one from UCSD as a club or weapon against SSD and in particular against nand flash to help their cause or campaign, while others will use it to stimulate controversy and page hit views. My reason for bringing up the topic and discussion is to stimulate thinking and help increase awareness and confidence in technologies such as SSD near and long-term. Regardless of whether your view is that SSD will replace HDD, or that they will continue to coexist as tiered storage mediums into the future, gaining confidence in the technologies along with when, where and how to use them are important steps in shifting from industry adoption to customer deployment.

What say you?

Is SSD the best thing, and are you dumb or foolish if you do not embrace it totally (a fan, pundit or cheerleader view)?

Or is SSD great when and where used in the right place so embrace it?

How will SSD continue to evolve including nand and other types of memories?

Are you comfortable with SSD as a long term data storage medium, or for today, is it simply a good way to address performance bottlenecks?

On the other hand, is SSD interesting, however you are not comfortable or confident with the technology, yet you want to learn more, in other words a skeptic's view?

Or perhaps the true cynic's view, which is that SSDs are nothing but the latest buzzword bandwagon fad technology?

Ok, nuff said for now, other than here is some extra related SSD material:
SSD options for Virtual (and Physical) Environments: Part I Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments, Part II: The call to duty, SSD endurance
Part I: EMC VFCache respinning SSD and intelligent caching
Part II: EMC VFCache respinning SSD and intelligent caching
IT and storage economics 101, supply and demand
2012 industry trends perspectives and commentary (predictions)
Speaking of speeding up business with SSD storage
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Are Hard Disk Drives (HDDs) getting too big?
Industry adoption vs. industry deployment, is there a difference?
Data Center I/O Bottlenecks Performance Issues and Impacts
EMC VPLEX: Virtual Storage Redefined or Respun?
EMC interoperability support matrix

Cheers
gs


EMC VFCache respinning SSD and intelligent caching (Part II)

This is the second of a two part series pertaining to EMC VFCache; you can read the first part here.

In this part of the series, let's look at some common questions along with comments and perspectives.

Common questions, answers, comments and perspectives:

Why would EMC not just go into the same market space and mode as FusionIO, a model that many other vendors seem eager to follow? IMHO many vendors are following or chasing FusionIO, thus most are selling in the same way, perhaps to the same customers. Some of those vendors could very easily, if they have not already, make a quick change to their playbook, adding some new moves to reach a broader audience. Another smart move here is that by taking a companion or complementary approach, EMC can continue selling existing storage systems to customers, who keep those investments, while also supporting competitors' products. In addition, for those customers who are slow to adopt SSD based techniques, this is a relatively easy and low risk way to gain confidence. Granted the disk drive was declared dead several years (and yes also several decades) ago, however it is and will stay alive for many years due to SSD helping to close the IO storage and performance gap.

Storage IO performance and capacity gap
Data center and storage IO performance capacity gap (Courtesy of Cloud and Virtual Data Storage Networking (CRC Press))

Has this been done before? There have been other vendors who have done LUN caching appliances in the past going back over a decade. Likewise there are PCIe RAID cards that support flash SSD as well as DRAM based caching. Even NetApp has had similar products and functionality with their PAM cards.

Does VFCache work with other PCIe SSD cards such as FusionIO? No, VFCache is a combination of software IO intercept and intelligent cache driver along with a PCIe SSD flash card (which, as EMC has indicated, could be supplied by different manufacturers). Thus VFCache to be VFCache requires the EMC IO intercept and intelligent cache software driver.

Does VFCache work with other vendors' storage? Yes, refer to the EMC support matrix. The product has been architected and designed to install into and coexist with a customer's existing environment, which means supporting different EMC block storage systems as well as those from other vendors. Keep in mind that a main theme of VFCache is to complement, coexist with, enhance and protect customers' investments in storage systems to improve their effectiveness and productivity as opposed to replacing them.

Does VFCache introduce a new point of vendor lockin or stickiness? Some will see or place this as a new form of vendor lockin. Others, assuming that EMC supports different vendors' storage systems downstream, offers options for different PCIe flash cards and keeps the solution affordable, will assert it is no more lockin than other solutions. In fact by supporting third party storage systems as opposed to replacing them, smart sales people and marketeers will place VFCache as being more open and interoperable than some other PCIe flash card vendors' approaches. Keep in mind that avoiding vendor lockin is a shared responsibility (read more here).

Does VFCache work with NAS? VFCache does not work with NAS (NFS or CIFS) attached storage.

Does VFCache work with databases? Yes, VFCache is well suited for little data (e.g. database) and traditional OLTP or general business application process that may not be covered or supported by other so called big data focused or optimized solutions. Refer to this EMC document (and this document here) for more information.

Does VFCache only work with little data? While VFCache is well suited for little data (e.g. databases, SharePoint, file and web servers, traditional business systems), it is also able to work with other forms of unstructured data.

Does VFCache need VMware? No. While VFCache works with VMware vSphere including a vCenter plug in, it does not need a hypervisor and is as practical in a physical machine (PM) as it is in a virtual machine (VM).

Does VFCache work with Microsoft Windows? Yes, refer to the EMC support matrix for specific server operating systems and hypervisor version support.

Does VFCache work with other unix platforms? Refer to the EMC support matrix for specific server operating systems and hypervisor version support.

How are reads handled with VFCache? The VFCache software (driver if you prefer) intercepts IO requests to LUNs that are being cached, performing a quick lookup to see if there is a valid cache entry in the physical VFCache PCIe card. If there is a cache hit, the IO is resolved from the closer or local PCIe card cache, making for a lower latency or faster response time IO. In the case of a cache miss, the VFCache driver simply passes the IO request onto the normal SCSI or block (e.g. iSCSI, SAS, FC, FCoE) stack for processing by the downstream storage system (or appliance). Note that when the requested data is retrieved from the storage system, the VFCache driver will, based on what its caching algorithms determine, place a copy of the data in the PCIe read cache. Thus the real power of VFCache is the software implementing the cache lookup and cache management functions to leverage the PCIe card that complements the underlying block storage systems.
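
As a rough illustration of the read path described above (lookup, hit served locally, miss passed downstream and then copied into cache), here is a minimal Python sketch. It is not EMC's driver or API. The ReadCache class, the array_read function, the per-LUN selection set and the simple least recently used eviction are all assumptions made for illustration; the actual VFCache caching algorithms are not described here.

```python
from collections import OrderedDict


class ReadCache:
    """Toy read cache keyed by (lun, block). Hypothetical, not EMC's driver."""

    def __init__(self, capacity_blocks, backend_read, cached_luns):
        self.capacity = capacity_blocks
        self.backend_read = backend_read     # callable(lun, block) -> bytes
        self.cached_luns = set(cached_luns)  # LUNs selected for caching
        self.entries = OrderedDict()         # insertion order doubles as LRU order
        self.hits = 0
        self.misses = 0

    def read(self, lun, block):
        if lun not in self.cached_luns:
            # LUN is not being cached: pass the request straight downstream.
            return self.backend_read(lun, block)
        key = (lun, block)
        if key in self.entries:
            # Cache hit: resolve locally and mark the entry as recently used.
            self.hits += 1
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: fetch from the storage system, then keep a copy.
        self.misses += 1
        data = self.backend_read(lun, block)
        self.entries[key] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used (assumption)
        return data


backend_calls = []


def array_read(lun, block):
    """Stand-in for the downstream block storage system."""
    backend_calls.append((lun, block))
    return f"lun{lun}-blk{block}".encode()


cache = ReadCache(capacity_blocks=2, backend_read=array_read, cached_luns={0})
first = cache.read(0, 10)   # miss, goes to the array
second = cache.read(0, 10)  # hit, served from the cache
other = cache.read(1, 10)   # LUN 1 is not cached, always goes to the array
```

The point of the sketch is that the value is in the lookup and management logic, not in the flash card by itself.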

How are writes handled with VFCache? Unless put into a write cache mode, which is not the default, VFCache software simply passes the IO operation onto the IO stack for downstream processing by the storage system or appliance attached via a block interface (e.g. iSCSI, SAS, FC, FCoE). Note that as part of the caching algorithms, the VFCache software will make determinations of what to keep in cache based on IO activity requests, similar to how cache management results in better cache effectiveness in a storage system. Given EMC's long history of working with intelligent cache algorithms, one would expect some of that DNA exists or will be leveraged further in future versions of the software. Ironically this is where other vendors with long cache effectiveness histories such as IBM, HDS and NetApp among others should also be scratching their collective heads saying wow, we can or should be doing that as well (or better).

Can VFCache be used as a write cache? Yes, while its default mode is to be used as a persistent read cache to complement server and application buffers in DRAM along with enhancing the effectiveness of downstream storage system (or appliance) caches, VFCache can also be configured as a persistent write cache.

Does VFCache include FAST automated tiering between different storage systems? The first version is only a caching tool. However think about it a bit: where the software sits, what storage systems it can work with and its ability to learn and understand IO paths and patterns, and you can get an idea of where EMC could evolve it to, similar to what they have done with RecoverPoint among other tools.
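
To make the default (non write cache) behavior concrete, here is a small Python sketch of a write that is passed downstream while the read cache is kept from serving old data. The PassThroughWriteCache class is hypothetical, and dropping the cached copy on a write is my assumption for illustration; how the actual VFCache driver updates or invalidates entries is not covered here.

```python
class PassThroughWriteCache:
    """Toy model: writes go to the storage system, reads may be cached."""

    def __init__(self, backend):
        self.backend = backend  # dict standing in for the downstream array
        self.cache = {}         # stands in for the PCIe flash read cache

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        data = self.backend[key]
        self.cache[key] = data
        return data

    def write(self, key, data):
        # The write is passed downstream and completed by the array.
        self.backend[key] = data
        # Assumption: drop any cached copy so a later read is not stale.
        self.cache.pop(key, None)


array = {("lun0", 7): b"old"}
vfc = PassThroughWriteCache(array)
before = vfc.read(("lun0", 7))   # caches b"old"
vfc.write(("lun0", 7), b"new")   # goes to the array, cached copy dropped
after = vfc.read(("lun0", 7))    # re-read from the array
```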

Changing data access patterns and lifecycles
Evolving data access patterns and life cycles (more retention and reads)

Does VFCache mean an all or nothing approach with EMC? While the complete VFCache solution comes from EMC (e.g. PCIe card and software), the solution will work with other block attached storage as well as existing EMC storage systems for investment protection.

Does VFCache support NAS based storage systems? The first release of VFCache only supports block based access, however the server that VFCache is installed in could certainly be functioning as a general purpose NAS (NFS or CIFS) server (see supported operating systems in EMC interoperability notes) in addition to being a database or other application server.

Does VFCache require that all LUNs be cached? No, you can select which LUNs are cached and which ones are not.

Does VFCache run in an active / active mode? In the first release it is active / passive, refer to EMC release notes for details.

Can VFCache be installed in multiple physical servers accessing the same shared storage system? Yes, however refer to EMC release notes on details about active / active vs. active / passive configuration rules for ensuring data integrity.

Who else is doing things like this? There are caching appliance vendors as well as others such as NetApp and IBM who have used SSD flash caching cards in their storage systems or virtualization appliances. However keep in mind that VFCache is placing the caching function closer to the application that is accessing it, thereby improving on the locality of reference (e.g. storage and IO effectiveness).

Does VFCache work with SSD drives installed in EMC or other storage systems? Check the EMC product support matrix for specific tested and certified solutions, however in general if the SSD drive is installed in a storage system that is supported as a block LUN (e.g. iSCSI, SAS, FC, FCoE) in theory it should be possible to work with VFCache. To emphasize, visit the EMC support matrix.

What type of flash is being used?

What type of nand flash SSD memory is EMC using in the PCIe card? The first release of VFCache is leveraging enterprise class SLC (Single Level Cell) nand flash, which has been used in other EMC products for its endurance and long duty cycle, to minimize or eliminate concerns of wear and tear while meeting read and write performance. EMC has indicated that they will also, as part of an industry trend, leverage MLC along with Enterprise MLC (EMLC) technologies on a go forward basis.

Doesn't nand flash SSD cache wear out? While nand flash SSD can wear out over time due to extensive write use, the VFCache approach mitigates this by being primarily a read cache, reducing the number of program / erase cycles (P/E cycles) that occur with write operations, as well as initially leveraging longer duty cycle SLC flash. EMC also has several years of experience implementing wear leveling algorithms in storage system controllers to increase duty cycle and reduce wear on SLC flash, which will play forward as MLC or Enterprise MLC (EMLC) techniques are leveraged. This differs from vendors who are positioning their SLC or MLC based flash PCIe SSD cards for mainly write operations, which will cause more P/E cycles to occur at a faster rate, reducing the duty or useful life of the device.
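
To put the read cache vs. write heavy wear argument into rough numbers, here is a back of the envelope Python sketch. The 300GB capacity matches the first release card mentioned in this post; everything else (the 100,000 P/E cycle rating, the write amplification of 2 and both daily write volumes) is a hypothetical figure chosen only to show the arithmetic, not a specification from EMC or any other vendor. The sketch also assumes ideal wear leveling across all cells.

```python
def estimated_life_years(capacity_gb, pe_cycles, daily_writes_gb, write_amplification):
    """Years until the rated P/E cycles are consumed, assuming ideal wear leveling."""
    total_writable_gb = capacity_gb * pe_cycles
    physical_writes_per_year_gb = daily_writes_gb * write_amplification * 365
    return total_writable_gb / physical_writes_per_year_gb


CAPACITY_GB = 300          # first release card size
PE_CYCLES = 100_000        # hypothetical SLC rating
WRITE_AMPLIFICATION = 2.0  # hypothetical

# Hypothetical workloads: cache fills for a read cache vs. a write target.
read_mostly = estimated_life_years(CAPACITY_GB, PE_CYCLES, 500, WRITE_AMPLIFICATION)
write_heavy = estimated_life_years(CAPACITY_GB, PE_CYCLES, 5_000, WRITE_AMPLIFICATION)

print(f"read mostly cache: {read_mostly:.1f} years")
print(f"write heavy target: {write_heavy:.1f} years")
```

Whatever numbers you plug in, ten times the daily writes means one tenth the estimated life, which is why how the card is used matters as much as what type of flash is on it.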

How much capacity does the VFCache PCIe card contain? The first release supports a 300GB card and EMC has indicated that added capacity and configuration options are in their plans.

Does this mean disks are dead? Contrary to popular industry folklore (or wish) the hard disk drive (HDD) has plenty of life left, part of which has been increased by being complemented by VFCache.

Various options and locations for SSD along with different usage scenarios
Various SSD locations, types, packaging and usage scenario options

Can VFCache work in blade servers? The VFCache software is transparent to blade, rack mount, tower or other types of servers. The hardware part of VFCache is a PCIe card, which means that the blade server or system will need to be able to accommodate a PCIe card to complement the PCIe based mezzanine IO card (e.g. iSCSI, SAS, FC, FCoE) used for accessing storage. What this means is that for blade systems or server vendors such as IBM who have a PCIe expansion module for their H series blade systems (it consumes a slot normally used by a server blade), PCIe cache cards like those being initially released by EMC could work, however check with the EMC interoperability matrix, as well as your specific blade server vendor for PCIe expansion capabilities. Given that EMC leverages Cisco UCS for their vBlocks, one would assume that those systems will also see VFCache modules. NetApp partners with Cisco using UCS in their FlexPods so you see where that could go as well, along with potential support from other server vendors including Dell, HP, IBM and Oracle among others.

What about benchmarks? EMC has released some technical documents that show performance improvements in Oracle environments such as this here. Hopefully we will see EMC also release other workloads for different applications including Microsoft Exchange Solution Reviewed Program (ESRP) along with SPC similar to what IBM recently did with their systems among others.

How do the first EMC supplied workload simulations compare vs. other PCIe cards? This is tough to gauge as many SSD solutions and in particular PCIe cards are doing apples to oranges comparisons. For example to generate a high IOPs rating for marketing purposes, most SSD solutions are stress performance tested at 512 bytes, that is 1/2 of a KByte or 1/8 of a small 4Kbyte IO. Note that operating systems such as Windows are moving to 4Kbyte page allocation size to align with growing IO sizes, with databases moving from the old average of 4Kbytes to 8Kbytes and larger. What is important to consider is the average IO size and activity profile (e.g. reads vs. writes, random vs. sequential) for your applications. If your application is doing ultra small 1/2 Kbyte IOs, or even smaller 64 byte IOs (which should be handled by better application or file system caching in DRAM), then the smaller IO size and record setting examples will apply. However if your applications are more mainstream or larger, then those smaller IO size tests should be taken with a grain of salt. Also keep latency in mind, as many target or opportunity applications for VFCache are response time sensitive or can benefit from the improved productivity they enable.

What is locality of reference? Locality of reference refers to how close data is to where it is being requested or accessed from. The closer the data is to the application requesting it, the faster the response time or the quicker the work gets done. For example in the figure below L1/L2/L3 on board processor caches are the fastest, yet smallest while closest to the application running on the server. At the other extreme further down the stack, storage becomes large capacity, lower cost, however lower performing.
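
To see why the IO size behind an IOPS claim matters, here is a simple Python sketch of the arithmetic. The 400 MiB per second figure and the iops_for_bandwidth helper are hypothetical, chosen only to show the relationship; the sketch also assumes the device is purely bandwidth limited and ignores per IO overhead, which real devices are not free of.

```python
def iops_for_bandwidth(bandwidth_bytes_per_sec, io_size_bytes):
    """IOs per second that fit in a given bandwidth at a given IO size."""
    return bandwidth_bytes_per_sec // io_size_bytes


BANDWIDTH = 400 * 1024 * 1024  # hypothetical 400 MiB per second

results = {size: iops_for_bandwidth(BANDWIDTH, size) for size in (512, 4096, 8192)}

for size, iops in results.items():
    print(f"{size:>5} byte IOs: {iops:,} IOPS")
```

The same device moving the same amount of data shows eight times the IOPS at 512 bytes as it does at 4Kbytes, so always ask what IO size (and read vs. write mix) is behind a number.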

Locality of reference data and storage memory

What does cache effectiveness vs. cache utilization mean? Cache utilization is an indicator of how much of the available cache capacity is being used, however it does not indicate whether the cache is being well used or not. For example, cache could be 100 percent used, however there could be a low hit rate. Thus cache effectiveness is a gauge of how well the available cache is being used to improve performance in terms of more work being done (IOPS or bandwidth) or lower latency and response time.

Isn't more cache better? More cache is not better, it is how the cache is being used. This is a message that I would be disappointed in HDS if they were not to bring up as a point of messaging (or rebuttal) given their history of emphasizing cache effectiveness vs. size or quantity (Hu, that is a hint btw ;).

What is the performance impact of VFCache on the host server? EMC is claiming 5 percent or less CPU consumption at worst, which they say is several times less than the competition's worst scenario, as well as 512MB to 1GB of DRAM on the server vs. several times that for their competitors. The difference could be expected to come from more functions being off loaded, including flash translation layer (FTL), wear leveling and other optimization being handled by the PCIe card vs. being handled in the server's memory and using host server CPU cycles.

How does this compare to what NetApp or IBM does? NetApp, IBM and others have done caching with SSD in their storage systems, or leveraged third party PCIe SSD cards from different vendors to be installed in servers to be used as a storage target. Some vendors such as LSI have done caching on the PCIe cards (e.g. CacheCade, which in theory has a similar software caching concept to VFCache) to improve performance and effectiveness across JBOD and SAS devices.
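
Here is a small Python sketch of the difference between the two measures. The function names, block counts, hit counts and the 100 and 5,000 microsecond latencies are all hypothetical values picked to make the contrast visible, not measurements of VFCache or any other product.

```python
def cache_utilization(used_blocks, total_blocks):
    """Fraction of cache capacity holding data (says nothing about usefulness)."""
    return used_blocks / total_blocks


def hit_rate(hits, misses):
    """Fraction of reads resolved from cache."""
    total = hits + misses
    return hits / total if total else 0.0


def average_latency_us(rate, hit_latency_us, miss_latency_us):
    """Expected read latency for a given hit rate."""
    return rate * hit_latency_us + (1.0 - rate) * miss_latency_us


HIT_US, MISS_US = 100.0, 5000.0  # hypothetical cache vs. array latencies

# Cache A: 100 percent full, but only 10 percent of reads hit.
a_util = cache_utilization(1000, 1000)
a_latency = average_latency_us(hit_rate(10, 90), HIT_US, MISS_US)

# Cache B: half full, but 90 percent of reads hit.
b_util = cache_utilization(500, 1000)
b_latency = average_latency_us(hit_rate(90, 10), HIT_US, MISS_US)

print(f"A: {a_util:.0%} utilized, {a_latency:.0f} us average read latency")
print(f"B: {b_util:.0%} utilized, {b_latency:.0f} us average read latency")
```

Cache A looks busier on a utilization report, yet cache B is the one doing more useful work.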

What about stale (old or invalid) reads, how does VFCache handle or protect against those? Stale reads are handled via the VFCache management software tool or driver which leverages caching algorithms to decide what is valid or invalid data.

How much does VFCache cost? Refer to EMC announcement pricing, however EMC has indicated that they will be competitive with the market (supply and demand).

If a server shuts down or reboots, what happens to the data in the VFCache? Being that the data is in non volatile SLC nand flash memory, information is not lost when the server reboots or loses power in the case of a shutdown, thus it is persistent. While exact details are not known as of this time, it is expected that the VFCache driver and software do some form of cache coherency and validity check to guard against stale reads or discard any other invalid cache entries.

Industry trends and perspectives

What will EMC do with VFCache in the future and on a larger scale such as an appliance? EMC via its own internal development and via acquisitions has demonstrated ability to use various clustered techniques such as RapidIO for VMAX nodes, InfiniBand for connecting Isilon  nodes. Given an industry trend with several startups using PCIe flash cards installed in a server that then functions as a IO storage system, it seems likely given EMCs history and experience with different storage systems, caching, and interconnects that they could do something interesting. Perhaps Oracle Exadata III (Exadata I was HP, Exadata II was Sun/Oracle) could be an EMC based appliance (That is pure speculation btw)?

EMC has already shown how it can use SSD drives as a cache extension in VNX and CLARiiON servers ( FAST CACHE ) in addition to as a target or storage tier combined with Fast for tiering. Given their history with caching algorithms, it would not be surprising to see other instantiations of the technology deployed in complimentary ways.

Finally, EMC is showing that it can use nand flash SSD in different ways and various packaging forms to apply to diverse applications or customer environments. The companion or complementary approach EMC is currently taking contrasts with some other vendors who are taking an all or nothing, it is all SSD because disk is dead approach. Given the large installed base of disk based systems EMC as well as other vendors have in place, not to mention the investment by those customers, it makes sense to allow those customers the option of when, where and how they can leverage SSD technologies to coexist with and complement their environments. Thus with VFCache, EMC is using SSD as a cache enabler to address the decades old and growing storage IO to capacity performance gap in a force multiplier model that spreads the cost over more TBytes, PBytes or EBytes while increasing the overall benefit, in other words effectiveness and productivity.

Additional related material:
Part I: EMC VFCache respinning SSD and intelligent caching
IT and storage economics 101, supply and demand
2012 industry trends perspectives and commentary (predictions)
Speaking of speeding up business with SSD storage
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Are Hard Disk Drives (HDDs) getting too big?
Unified storage systems showdown: NetApp FAS vs. EMC VNX
Industry adoption vs. industry deployment, is there a difference?
Two companies on parallel tracks moving like trains offset by time: EMC and NetApp
Data Center I/O Bottlenecks Performance Issues and Impacts
From bits to bytes: Decoding Encoding
Who is responsible for vendor lockin
EMC VPLEX: Virtual Storage Redefined or Respun?
EMC interoperability support matrix

Ok, nuff said for now, I think I see some storm clouds rolling in

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

EMC VFCache respinning SSD and intelligent caching (Part I)

This is the first part of a two part series covering EMC VFCache, you can read the second part here.

EMC formally announced VFCache (aka Project Lightning), an IO accelerator product that comprises a PCIe nand flash card (aka Solid State Device or SSD) and intelligent cache management software. In addition EMC is also talking about the next phase of the flash business unit and Project Thunder. The approach EMC is taking with VFCache should not be a surprise given their history of starting out with memory and SSD and evolving it into an intelligent cache optimized storage solution.

Storage IO performance and capacity gap
Data center and storage IO performance capacity gap (Courtesy of Cloud and Virtual Data Storage Networking (CRC Press))

Could we see the future of where EMC will take VFCache along with other possible solutions already being hinted at by the EMC flash business unit by looking where they have been already?

Likewise by looking at the past can we see the future or how VFCache and sibling product solutions could evolve?

After all, EMC is no stranger to caching with both nand flash SSD (e.g. FAST CACHE, FAST and SSD drives) and DRAM across their product portfolio, not to mention caching being a core part of the company's founding products, which evolved to include HDDs and more recently nand flash SSDs among others.

Industry trends and perspectives

Unlike others who also offer PCIe SSD cards such as FusionIO with a focus on eliminating SANs or other storage (read their marketing), EMC not surprisingly is marching to a different beat. The beat EMC is marching to, or perhaps leading by example for others to follow, is that of going mainstream and using PCIe SSD cards as a cache to complement their own as well as other vendors' storage systems vs. replacing them. This is similar to what EMC and other mainstream storage vendors have done in the past, such as with SSD drives being used as a flash cache extension on CLARiiON or VNX based systems as well as a target or storage tier.

Various options and locations for SSD along with different usage scenarios
Various SSD locations, types, packaging and usage scenario options

Other vendors including IBM, NetApp and Oracle among others have also leveraged various packaging options of Single Level Cell (SLC) or Multi Level Cell (MLC) flash as caches in the past. A different example of SSD being used as a cache is the Seagate Momentus XT, which is a desktop, workstation and consumer type device. Seagate has shipped over a million of the Momentus XT, which use SLC flash as a cache to complement and enhance the integrated HDD performance (a 750GB model with 8GB of SLC memory is in the laptop I'm using to type this).

One of the premises of caching solutions such as those mentioned above is to address the changing data access patterns and life cycles shown in the figure below.

Changing data access patterns and lifecycles
Evolving data access patterns and life cycles (more retention and reads)

Put a different way, instead of focusing on just big data, corner cases (granted some of those are quite large) or ultra large cloud scale out solutions, EMC with VFCache is also addressing their core business, which includes little data. What will be interesting to watch and listen to is how some vendors will start to jump up and down saying that they have been doing or enabling what EMC is announcing for some time. In some cases those vendors will be rightfully making noise about something that they should have made noise about before.

EMC is bringing the SSD message to the mainstream business and storage marketplace, showing how it is a complement to, vs. a replacement of, existing storage systems. By doing so, they will show how to spread the cost of SSD out across a larger storage capacity footprint, boosting the effectiveness and productivity of those systems. This means that customers who install the VFCache product can accelerate the performance of both their existing EMC storage systems and those from other vendors, preserving their technology along with people skills investment.

Key points of VFCache

  • Combines PCIe SLC nand flash card (300GB) with intelligent caching management software driver for use in virtualized and traditional servers

  • Making SSD complementary to existing installed block based disk (and or SSD) storage systems to increase their effectiveness

  • Providing investment protection while boosting productivity of existing EMC and third party storage in customer sites

  • Brings caching closer to the application where the data is accessed while leveraging larger scale direct attached and SAN block storage

  • Focusing message for SSD back on to little data as well as big data for mainstream broad customer adoption scenarios

  • Leveraging the benefit and strength of SSD as a read cache and the scalability of underlying downstream disk for data storage

  • Reducing concerns around SSD endurance or duty cycle wear and tear by using as a read cache

  • Offloads some read requests from underlying storage systems, enabling them to do more work for other servers

Additional related material:
Part II: EMC VFCache respinning SSD and intelligent caching
IT and storage economics 101, supply and demand
2012 industry trends perspectives and commentary (predictions)
Speaking of speeding up business with SSD storage
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Are Hard Disk Drives (HDDs) getting too big?
Unified storage systems showdown: NetApp FAS vs. EMC VNX
Industry adoption vs. industry deployment, is there a difference?
Two companies on parallel tracks moving like trains offset by time: EMC and NetApp
Data Center I/O Bottlenecks Performance Issues and Impacts
From bits to bytes: Decoding Encoding
Who is responsible for vendor lockin
EMC VPLEX: Virtual Storage Redefined or Respun?
EMC interoperability support matrix

Ok, nuff said for now, I think I see some storm clouds rolling in

Cheers gs

StorageIO Momentus Hybrid Hard Disk Drive (HHDD) Moments

This is the third in a series of posts that I have done about Hybrid Hard Disk Drives (HHDDs) along with pieces about Hard Disk Drives (HDDs) and Solid State Devices (SSDs). Granted the HDD received its AARP card several years ago when it turned 50 and is routinely declared dead (or read here), even though it continues to evolve alongside maturing SSD, with both expanding into different markets as well as usage roles.

For those who have not read previous posts about Hybrid Hard Disk Drives (HHDDs) and the Seagate Momentus XT you can find them here and here.

Since my last post, I have been using the HHDDs extensively and recently installed the latest firmware. The new HHDD firmware released by Seagate for the Momentus XT (SD25), like its predecessor SD24, cleaned up some annoyances and improved overall stability. Here is a Seagate post by Mark Wojtasiak discussing SD25 and feedback obtained via the Momentus XT forum from customers.

If you have never done a HDD firmware update, it's not as bad or intimidating as might be expected. The Seagate firmware update tools make it very easy, assuming you have a recent good backup of your data (one that can be restored) and about 10 to 15 minutes of time for a couple of reboots.

Speaking of stability, the Momentus XT HHDDs have been performing well helping to speed up accessing large documents on various projects including those for my new book. Granted an SSD would be faster across the board, however the large capacity at the price point of the HHDD is what makes it a hybrid value proposition. As I have said in previous posts, if you have the need for speed all of the time and time is money, get an SSD. Likewise if you need as much capacity as you can get and performance is not your primary objective, then leverage the high capacity HDDs. On the other hand, if you need a balance of some performance boost with capacity boost and a good value, then check out the HHDDs.

Image of Momentus XT courtesy of www.Seagate.com

Let's shift gears from the product or technology to common questions that I get asked about HHDDs.

Common questions I get asked about HHDDs include:

What is a Hybrid Hard Disk Drive?

A Hybrid Hard Disk Drive includes a combination of rotating HDD and solid state flash persistent memory along with volatile dynamic random access memory (DRAM) in an integrated package or product. The value proposition and benefit is a balance of performance and capacity at a good price for those environments, systems or applications that do not need all SSD performance (and cost) but do need some performance in addition to large capacity.

How does the Seagate Momentus XT differ from other Hybrid Disks?
One approach is to take a traditional HDD and pair it with a SSD using a controller packaged in various ways. For example on a large scale, HDDs and SSDs coexist in the same tiered storage system being managed by the controllers, storage processors or nodes in the solution including automated tiering and cache promotion or demotion. The main difference however between other storage systems, tiering and pairing and HHDDs is that in the case of the Momentus XT the HDD, SLC flash (SSD functionality) and RAM cache and their management are all integrated within the disk drive enclosure.

Do I use SSDs and HDDs or just HHDDs?
I have HHDDs installed internally in my laptops. I also have HDDs which are installed in servers, NAS and disk to disk (D2D) backup devices and Digital Video Recorders (DVRs) along with external SSD and Removable Hard Disk Drives (RHDDs). The RHDDs are used for archive and master or gold copy data protection that go offsite, complementing how I also use cloud backup services as part of my data protection strategy.

What are the technical specifications of a HHDD such as the Seagate Momentus XT?
3Gb/s SATA interface, 2.5 inch 500GB 7,200 RPM HDD with 32MB RAM cache and integrated 4GByte SLC flash, all managed via an internal drive processor. Power consumption varies depending on what the device is doing such as initial power up, idle, normal or other operating modes. You can view the Seagate Momentus XT 500GB (ST95005620AS which is what I have) specifications here as well as the product manual here.


One of my HHDDs on a note pad (paper) and other accessories

Do you need a special controller or management software?
Generally speaking no, the HHDDs that I have been using plugged and played into my existing laptops' internal bays, replacing the HDDs that came with those systems. No extra software was needed for Windows, and no data movement or migration tools were needed other than when initially copying from the source HDD to the new HHDD. The HHDDs do their own caching, read ahead and write behind independent of the operating system or controller. Now the reason I say generally speaking is that like many devices, some operating systems or controllers may be able to leverage advanced features, so check your particular system capabilities.

How come the storage system vendors are not talking about these HHDDs?
Good question, which I assume has a lot to do with the investment (people, time, engineering, money and marketing) that they have made or are making in controller and storage system software functionality to effectively create hybrid tiered storage systems using SSDs and HDDs on different scales. There have been some packaged HHDD systems or solutions brought to market by different vendors that combine HDD and SSD into a single physical package glued together with some software and controllers or processors to appear as a single system. I would not be surprised to see discrete HHDDs (where the HDD, flash SSD and RAM are all one integrated product) appear in lower end NAS or multifunction storage systems as well as for backup, dedupe or other systems that require large amounts of capacity and a performance boost now and then.

Why do I think this? Simple, say you have five HHDDs each with 500GB of capacity configured as a RAID5 set resulting in 2TByte of capacity. Using the Momentus XT as a hypothetical example yields 5 x 4GByte or 20GByte of flash cache to help accelerate write operations during data dumps, backups or other updates. Granted that is an overly simplified example and storage systems can be found with hundreds of GBytes of cache, however think in terms of value or low cost balancing performance and capacity to cost for different usage scenarios. For example, applications such as bulk or scale out file and object storage including cloud or big data, entertainment, server (Citrix/Xen, Microsoft/Hyper-V, VMware/vSphere) and desktop virtualization or VDI, Disk to Disk (D2D) backup, and business analytics among others. The common tenet of those applications and usage scenarios is a combination of I/O and storage consolidation in a cost effective manner, addressing the continuing storage capacity to I/O performance gap.
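
For those who like to check the math, here is a minimal sketch of the arithmetic in that example. The function names are hypothetical, and it assumes RAID 5 gives up one drive's worth of capacity to parity while flash simply adds up across the drives.

```python
def raid5_usable_gb(drives, drive_gb):
    """RAID 5 keeps one drive's worth of capacity for parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drives - 1) * drive_gb


def total_flash_gb(drives, flash_gb_per_drive):
    """Flash adds up across drives; each drive caches only its own data."""
    return drives * flash_gb_per_drive


usable = raid5_usable_gb(5, 500)      # 2000 GB, i.e. 2 TByte usable
flash = total_flash_gb(5, 4)          # 20 GB of flash across the set
print(usable, flash, f"{flash / usable:.1%}")
```

Keep in mind that the flash is not a shared pool in this scenario. Each HHDD manages its own 4GByte independently of the RAID controller.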

Data Center and I/O Bottlenecks

Storage and I/O performance gap

Do you have to backup HHDDs?
Yes, just as you would want to backup or protect any SSD or HDD device or system.

How does data get moved between the SSD and the HDD?
Other than the initial data migration from the old HDD (or SSD) to the HHDD, unless you are starting with a new system, once your data and applications exist on the HHDD, the device's internal processor automatically manages the RAM, flash and HDD activity. Unlike in a tiered storage system where data blocks or files may be moved between different types of storage devices, inside the HHDD all data gets written to the HDD, however the flash and RAM are used as buffers for caching depending on activity needs. If you have sat through a NetApp or HDS discussion about using cache for tiering, what the HHDDs do is similar in concept, however on a smaller scale at the device level, and potentially even in a complementary mode in the future. Other functions performed inside the HHDD by its processor include reading and writing, managing the caches, bad block replacement or revectoring on the HDD, wear leveling of the SLC flash and other routine tasks such as integrity checks and diagnostics. Unlike paired storage solutions where data gets moved between tiers or types of devices, once data is stored in the HHDD, it is managed by the device similar to how a SSD or HDD would move blocks of data to and from the specific media along with leveraging RAM cache as a buffer.
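
To make the caching vs. tiering distinction concrete, here is a toy model, not Seagate's actual firmware logic, in which every block always lives on the HDD and a small flash area holds copies of recently used blocks. The class name, the least recently used eviction policy and the choice to cache blocks on write are all assumptions made for illustration.

```python
from collections import OrderedDict


class HybridDrive:
    """Toy model: the HDD holds everything, flash caches recently used blocks."""

    def __init__(self, flash_blocks):
        self.hdd = {}                    # authoritative copy of every block
        self.flash = OrderedDict()       # small cache, least recently used first
        self.flash_blocks = flash_blocks
        self.hits = 0
        self.misses = 0

    def _cache(self, lba, data):
        self.flash[lba] = data
        self.flash.move_to_end(lba)              # mark as most recently used
        if len(self.flash) > self.flash_blocks:
            self.flash.popitem(last=False)       # evict least recently used

    def write(self, lba, data):
        self.hdd[lba] = data             # data always lands on the HDD
        self._cache(lba, data)           # assumption: a copy is also kept in flash

    def read(self, lba):
        if lba in self.flash:
            self.hits += 1
            self.flash.move_to_end(lba)
            return self.flash[lba]
        self.misses += 1
        data = self.hdd[lba]             # slow path: go to the platters
        self._cache(lba, data)
        return data


drive = HybridDrive(flash_blocks=2)
for lba in (1, 2, 3):
    drive.write(lba, f"block-{lba}")     # block 1 is evicted from flash
drive.read(1)                            # miss: served from the HDD
drive.read(1)                            # hit: now served from flash
```

Note that nothing is ever moved off the HDD in this model. Evicting a block from flash only discards a copy, which is the difference from tiering where the block itself relocates.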

Where is the controller that manages the SSD and HDD?
The HHDD itself is the controller per se, in that the internal processor that manages the HDD also directly accesses the RAM and flash.

What type of flash is used and will it wear out?
The XT uses SLC (single level cell) flash, which with wear leveling has a good duty cycle (life span) and is what is typically found in higher end flash SSD solutions vs. lower cost MLC (multi level cell) flash.

Have I lost any data from it yet?
No, at least nothing that was not my own fault from saving the wrong file in the wrong place and having to recover from one of my recent D2D copies or the cloud. Oh, regarding what have I done with the HDDs that were replaced by the HHDDs? They are now an extra gold master backup copy as of a particular point in time and are being kept in a safe secure facility, encrypted of course.

Have you noticed a performance improvement?
Yes, performance will vary, however in many cases I have seen performance comparable to SSD on both reads and writes as long as the HDD keeps up with the flash and RAM cache. Even as larger amounts of data are written, I have seen better performance compared to HDDs. The caveat however is that initially you may see little to marginal performance improvement, however over time, particularly on the same files, performance tends to improve. Working on large documents of tens to hundreds of MBytes in size, I noticed good performance when doing saves compared to working with them on a HDD.
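
One way to picture why performance improves over time is the usual weighted average of cache and disk access times as the hit ratio climbs. The sketch below is not a measurement of the Momentus XT. The 0.1 ms and 10 ms figures are illustrative assumptions, as is the function name.

```python
def average_read_ms(hit_ratio, cache_ms=0.1, hdd_ms=10.0):
    """Expected read time when a fraction hit_ratio of reads come from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * hdd_ms


# A cold cache behaves like a plain HDD; a warm cache closes most of the gap.
for hit_ratio in (0.0, 0.5, 0.9):
    print(f"{hit_ratio:.0%} hits -> {average_read_ms(hit_ratio):.2f} ms")
```

With an empty cache every read pays the HDD price, which matches the little to marginal improvement seen at first. As the same files are accessed again the hit ratio rises and the average drops.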

What do the HHDDs cost?
Amazon.com has the 500GB model for about $100 which is about $40 to $50 less than when I bought my most recent one last fall. I have heard from other people that you can find them at even lower prices at other venues. In the theme of disclosures, I bought one of my HHDDs from Amazon and Seagate gave me one to test.

Will I buy more HHDDs or switch to SSDs?
Where applicable I will add SSDs as well as HDDs, however where possible and practical, I will also add HHDDs perhaps even replacing the HDDs in my NAS system with HHDDs at some time or maybe trying them in a DVR.

What is the down side to the HHDDs?
I'm generating and saving more data on the devices at a faster rate, even though when I installed them I was wondering if I would ever fill up a 500GB drive. I still have hundreds of GBytes free or available for use, however I also am able to carry more reference data or information than in the past. In addition to more reference data including videos, audio, images, slide decks and other content, I have also been able to keep more versions or copies of documents, which has been handy on the book project. Data that changes gets backed up D2D as well as to my cloud provider, including while traveling. Leveraging compression and dedupe, given that many chapters or other content are similar, not as much data actually gets transmitted when doing cloud backups, which has been handy when doing a backup from an airplane flying over the clouds. A wish for the XT type of HHDD that I have is for vendors such as Seagate to add Self Encrypting Disk (SED) capabilities to them along with applying continued intelligent power management (IPM) enhancements.

Why do I like the HHDD?
Simple, it solves both business and technology challenges while being an enabler, it gives me a balance of performance for productivity and capacity in a cost effective manner while being transparent to the systems it works with.

Here are some related links to additional material:
Data Center I/O Bottlenecks Performance Issues and Impacts
Has SSD put Hard Disk Drives (HDDs) On Endangered Species List?
Seagate Momentus XT SD 25 firmware
Seagate Momentus XT SD25 firmware update coming this week
A Storage I/O Momentus Moment
Another StorageIO Hybrid Momentus Moment
As the Hard Disk Drive (HDD) continues to spin
Funeral for a Friend
Seagate Momentus XT product specifications
Seagate Momentus XT product manual
Technology Tiering, Servers Storage and Snow Removal
Self Encrypting Disks (SEDs)

Ok, nuff said for now

Cheers Gs

Greg Schulz – Author The Green and Virtual Data Center (CRC), Resilient Storage Networks (Elsevier) and coming summer 2011 Cloud and Virtual Data Storage Networking (CRC)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved