Why SSD based arrays and storage appliances can be a good idea (Part I)

This is the first of a two-part series, you can read part II here.

Robin Harris (aka @storagemojo) recently asked in a blog post whether solid state devices (SSDs) using SAS or SATA interfaces in traditional hard disk drive (HDD) form factors are a bad idea in storage arrays (e.g. storage systems or appliances). My opinion is that as with many things about storing, processing or moving binary digital data (e.g. 1s and 0s), the answer is not always clear. That is, there may not be a right or wrong answer; instead it depends on the situation, use or perhaps abuse scenario. For some applications or vendors, adding SSDs packaged in HDD form factors to existing storage systems, arrays and appliances makes perfect sense; likewise for others it does not, thus it depends (more on that in a bit). While we are talking about SSD, Ed Haletky (aka @texiwill) recently asked a related question of Fix the App or Add Hardware, which could easily be morphed into a discussion of Fix the SSD, or Add Hardware. Hmmm, maybe a future post idea exists there.

Let's take a step back for a moment and look at the bigger picture of what prompts the question of what type of SSD to use where and when, as well as why various vendors want you to look at things a particular way. There are many options for using SSD packaged in various ways to meet diverse needs, including here and here (see figure 1).

Various SSD packaging options
Figure 1: Various packaging and deployment options for SSD

The growing number of startup and established vendors with SSD enabled storage solutions vying to win your hearts, minds and budget is looking like the annual NCAA basketball tournament (aka March Madness and march metrics here and here). Some vendors have added or are adding SSDs with SAS or SATA interfaces that plug into existing enclosures (drive slots). These SSDs have the same form factor as 2.5 inch small form factor (SFF) or 3.5 inch HDDs, with a SAS or SATA interface for physical and connectivity interoperability. Other vendors have added PCIe based SSD cards to their storage systems or appliances as a cache (read or read and write) or a target device, similar to how these cards are installed in servers.

Simply adding SSD, either in a drive form factor or as a PCIe card, to a storage system or appliance is only part of a solution. Sure, the hardware should be faster than a traditional spinning HDD based solution. However, what differentiates the various approaches and solutions is what is done with the storage system's or appliance's software (aka operating system, storage applications, management, firmware or micro code).

So are SSD based storage systems, arrays and appliances a bad idea?

If you are a startup or established vendor able to start from scratch with a clean sheet design not having to worry about interoperability and customer investment protection (technology, people skills, software tools, etc), then you would want to do something different. For example, leverage off the shelf components such as a PCIe flash SSD card in an industry standard server combined with your software for a solution. You could also use extra DRAM memory in those servers combined with PCIe flash SSD cards perhaps even with embedded HDDs for a backing or preservation medium.

Other approaches might use a mix of DRAM and PCIe flash cards, as either a cache or a target, combined with some drive form factor SSDs. In other words, there is no right or wrong approach; sure, there are different technical merits that have advantages for various applications or environments. Likewise, people have preferences, particularly those who are technology focused and tend to like one approach vs. another. Thus, we have many options to leverage, use or abuse.

In his post, Robin asks a good question: if nand flash SSD is being put into a new storage system, why not use the PCIe backplane vs. nand flash on DIMMs vs. drive form factors, all of which are different packaging options (Figure 1). Some startups have gone the all backplane approach, some have gone with the drive form factor, some have gone with a mix and some are even using HDDs in the background. Likewise, some traditional storage system and array vendors who support a mix of SSD and HDD drive form factor devices also leverage PCIe cards, either as a server-based cache (e.g. EMC VFCache) or installed as a performance accelerator module (e.g. NetApp PAM) in their appliances.

While most vendors who put SSD drive form factor devices into their storage systems or appliances (or servers for that matter) use them as data targets for creating LUNs or file systems, others use them for internal functionality. By internal functionality I mean that instead of the SSD appearing as another drive or target, it is used exclusively by the storage system or appliance for caching or similar purposes. On storage systems, this can be to increase the size of persistent cache, such as EMC does on the CLARiiON and VNX (e.g. FAST Cache). Another use is on backup or dedupe target appliances where SSDs are used to store dictionary, index or metadata repositories as opposed to being a general data pool.

Part two of this post looks at the benefits and caveats of SSD in storage arrays.

Here are some related links to learn more about SSD, where and when to use what:
Why SSD based arrays and storage appliances can be a good idea (Part II)
IT and storage economics 101, supply and demand
Researchers and marketers don’t agree on future of nand flash SSD
Speaking of speeding up business with SSD storage
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments: Part I Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments, Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?

Ok, nuff said for now, check part II.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

StorageIO books by Greg Schulz added to Intel Recommended Reading Lists

My two most recent books, The Green and Virtual Data Center and Cloud and Virtual Data Storage Networking, both published by CRC Press/Taylor and Francis, have been added to the Intel Recommended Reading List for Developers.

Intel Recommended Reading

If you are not familiar with the Intel Recommended Reading List for Developers, it is a leading comprehensive list of different books across various technology domains covering hardware, software, servers, storage, networking, facilities, management, development and more.

Cloud and Virtual Data Storage Networking
Intel Recommended Reading List

So what are you waiting for? Check out the Intel Recommended Reading List for Developers, where you can find a diverse lineup of books, two of which I'm honored to have join the esteemed list. Here is a link to a free chapter download from Cloud and Virtual Data Storage Networking.

Ok, nuff said for now.

cheers
gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Researchers and marketers don't agree on future of nand flash SSD

Marketers, particularly those involved with anything resembling Solid State Devices (SSD), will tell you SSD is the future, as will some researchers along with their fans and pundits. Some will tell you that the future only has room for SSD, with the current flavor du jour being nand flash (both Single Level Cell aka SLC and Multi Level Cell aka MLC), and that any other form of storage medium (e.g. Hard Disk Drives or HDD and tape summit resources) is dead and you should avoid wasting your money on them.

Of course others and their fans or supporters who do not have an SSD play or product will tell you to forget about SSDs, they are not ready yet.

Then there are those who take no sides per se, simply providing comments and perspectives along with things to be considered, which also get used by others to spin stories for or against.

For the record, I have been a fan and user of various forms of SSD along with other variations of tiered storage mediums for several decades, using them where they fit best, as a customer in IT, as a vendor, and as an analyst and advisory consultant. Thus my perspective and opinion is that SSDs do in fact have a very bright future. However I also believe that other storage mediums are not dead yet, although their roles are evolving while their technologies continue to be developed. In other words, use the right technology and tool, packaged and deployed in the best, most effective way for the task at hand.

Memory and tiered storage hierarchy

Consequently, while some SSD vendors, their fans, supporters, pundits and others might be put off by some recent UCSD research that does not paint SSD, and in particular nand flash, in the best long-term light, it caught my attention and here is why. First, I have already seen in different venues where some are using the research as a tool, club or weapon against SSD and in particular nand flash, which should be no surprise. Second, I have also seen those who do not agree with the research at best dismiss the findings. Others are using it as a conversation or topic piece for their columns or other venues such as here.

The reason the UCSD research caught my eye was that it appeared to be looking at how nand SSD technology will evolve from where it is today to where it will be in ten years or so.

While ten years may seem like a long time, just look back at how fast things evolved over the past decade. Granted, the UCSD research is open to discussion, debate and dismissal, as is clear in the comments of this article here. However, the research does give a counter point or perspective to some of the hype, which can mean that somewhere between the two extremes exists reality and where things are headed or need to be discussed. While I do not agree with all the observations or opinions of the research, it does give stimulus for discussing things, including best practices around deployment vs. simply talking about adoption.

It has taken many decades for people to become comfortable or familiar with the pros and cons of HDD or tape for that matter.

Likewise some are familiar (good or bad) with DRAM based SSDs of earlier generations. On the other hand, while many people use various forms of nand flash SSD, ranging from what is inside their cell phone or SD cards for cameras to USB thumb drives to SSDs in drive form factors, on PCIe cards or in storage systems and appliances, there is still an evolving comfort and confidence level for business and enterprise storage use. Some have embraced, some have dismissed, while many if not most are intrigued, wanting to know more, and are using nand flash SSD in some shape or form while gaining confidence.

Part of gaining confidence is moving beyond the industry hype, looking at and understanding the pros and cons and how to leverage or work around the constraints. A long time ago a wise person told me that it is better to know the good, bad and ugly about a product, service or technology so that you can leverage the best, then configure, plan and manage around the bad to avoid or minimize the ugly. Based on that philosophy, I find many IT customers and even some VARs and vendors wanting to know the good, the bad and the ugly, not to hang a vendor or their technology and products out to dry, rather so that they can be comfortable in knowing when, where, why and how to use them most effectively.

Industry Trends and Perspectives

Granted, to get some of the not so good information you may need an NDA (Non Disclosure Agreement) or other confidential discussions, as after all, what vendor or solution provider wants to let anything less than favorable out into the blogosphere, twittersphere, Google+, tabloids, news sphere or other competitive landscape venues?

Ok, let's bring this back to the UCSD research report titled The Bleak Future of NAND Flash Memory.

UCSD research report: The Bleak Future of NAND Flash Memory
Click here or on the above image to read the UCSD research report

I'm not concerned that the UCSD research was less than favorable, as some others might be; after all, it is looking out into the future and, if a concern, provides a glimpse of what to keep an eye on.

Likewise, looking back, the research report could be taken as simply a barometer of what could happen if no improvements or new technologies evolve. For example, the HDD would have hit the proverbial brick wall, also known as the superparamagnetic barrier, many years ago if new recording methods and materials had not been deployed, including a shift to perpendicular recording, something that was recently added to tape.

Tomorrow's SSDs and storage mediums will still be based on nand flash including SLC, MLC and eMLC along with other variants, not to mention phase change memory (PCM) and other possible contenders.

Today's SSDs have shifted from being DRAM based with HDD or even flash-based persistent backing storage to nand flash-based, both SLC and MLC, with enhanced or enterprise MLC appearing. Likewise the density of SSDs continues to increase, meaning more data packed into the same die or footprint and more dies stacked in a chip package to boost capacity while decreasing cost. However, what is also happening behind the scenes, and what is a big differentiator with SSDs, is the quality of the firmware and low-level page management at the flash translation layer (FTL). Hence the saying that anybody with a soldering iron and the ability to pull together off the shelf FTLs and packaging can create some form of an SSD. How effective a product will be is based on the intelligence and robustness of the combination of the dies, FTL, controller and associated firmware and device drivers, along with other packaging options, plus the testing, validation and verification they undergo.
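As a rough illustration of why the FTL matters, here is a minimal, hypothetical sketch of wear leveling, i.e. remapping logical writes onto the least-worn physical blocks. Real SSD firmware (garbage collection, ECC, bad block management, over provisioning) is far more involved, so treat this as a concept demo only, not any vendor's actual implementation.

```python
# Minimal, hypothetical flash translation layer (FTL) sketch showing wear
# leveling only: logical writes get remapped onto the least-worn physical block.

class SimpleFTL:
    def __init__(self, physical_blocks=8):
        self.erase_count = [0] * physical_blocks   # wear per physical block
        self.mapping = {}                           # logical block -> physical block
        self.free = set(range(physical_blocks))     # unallocated physical blocks

    def write(self, logical_block, data):
        # Retire the old physical block (simulating an erase) if remapping.
        if logical_block in self.mapping:
            old = self.mapping[logical_block]
            self.erase_count[old] += 1
            self.free.add(old)
        # Wear leveling: place the write on the least-worn free block.
        target = min(self.free, key=lambda b: self.erase_count[b])
        self.free.remove(target)
        self.mapping[logical_block] = target
        return target

ftl = SimpleFTL()
for i in range(20):
    ftl.write(i % 4, b"payload")   # hammer the same 4 logical blocks
print(ftl.erase_count)              # wear ends up spread across physical blocks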

Various packaging options and where SSD can be deployed
Various SSD locations, types, packaging and usage scenario options

Good SSD vendors and solution providers, I believe, will be able to discuss your concerns around endurance, duty cycles, data integrity and other related topics to establish confidence with current and future issues, granted you may have to go under NDA to gain that insight. On the other hand, those who feel threatened or are not able or interested in addressing or demonstrating confidence for the long haul will be more likely to dismiss studies, research, reports, opinions or discussions that dig deeper into creating confidence via understanding of how things work so that customers can more fully leverage those technologies.

Some will view and use reports such as the one from UCSD as a club or weapon against SSD, and in particular against nand flash, to help their cause or campaign, while others will use it to stimulate controversy and page hit views. My reason for bringing up the topic and discussion is to stimulate thinking and help increase awareness and confidence in technologies such as SSD near and long-term. Regardless of whether your view is that SSD will replace HDD, or that they will continue to coexist as tiered storage mediums into the future, gaining confidence in the technologies along with when, where and how to use them are important steps in shifting from industry adoption to customer deployment.

What say you?

Is SSD the best thing ever, and are you dumb or foolish if you do not embrace it totally (the fan, pundit or cheerleader view)?

Or is SSD great when and where used in the right place so embrace it?

How will SSD continue to evolve including nand and other types of memories?

Are you comfortable with SSD as a long term data storage medium, or for today, is it simply a good way to address performance bottlenecks?

On the other hand, is SSD interesting, however you are not comfortable with or do not yet have confidence in the technology, yet you want to learn more, in other words a skeptic's view?

Or perhaps the true cynic's view, which is that SSDs are nothing but the latest buzzword bandwagon fad technology?

Ok, nuff said for now, other than here is some extra related SSD material:
SSD options for Virtual (and Physical) Environments: Part I Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments, Part II: The call to duty, SSD endurance
Part I: EMC VFCache respinning SSD and intelligent caching
Part II: EMC VFCache respinning SSD and intelligent caching
IT and storage economics 101, supply and demand
2012 industry trends perspectives and commentary (predictions)
Speaking of speeding up business with SSD storage
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Are Hard Disk Drives (HDDs) getting too big?
Industry adoption vs. industry deployment, is there a difference?
Data Center I/O Bottlenecks Performance Issues and Impacts
EMC VPLEX: Virtual Storage Redefined or Respun?
EMC interoperability support matrix

Cheers
gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

EMC VFCache respinning SSD and intelligent caching (Part II)

This is the second of a two part series pertaining to EMC VFCache, you can read the first part here.

In this part of the series, let's look at some common questions along with comments and perspectives.

Common questions, answers, comments and perspectives:

Why would EMC not just go into the same market space and mode as FusionIO, a model that many other vendors seem eager to follow? IMHO many vendors are following or chasing FusionIO, thus most are selling in the same way, perhaps to the same customers. Some of those vendors could very easily, if they have not already, make a quick change to their playbook, adding some new moves to reach a broader audience.

Another smart move here is that by taking a companion or complementary approach, EMC can continue selling existing storage systems to customers, keeping those investments, while also supporting competitors' products. In addition, for those customers who are slow to adopt SSD based techniques, this is a relatively easy and low risk way to gain confidence. Granted, the disk drive was declared dead several years (and yes, also several decades) ago, however it is and will stay alive for many years, due in part to SSD helping to close the IO storage and performance gap.

Storage IO performance and capacity gap
Data center and storage IO performance capacity gap (Courtesy of Cloud and Virtual Data Storage Networking (CRC Press))

Has this been done before? There have been other vendors who have done LUN caching appliances in the past going back over a decade. Likewise there are PCIe RAID cards that support flash SSD as well as DRAM based caching. Even NetApp has had similar products and functionality with their PAM cards.

Does VFCache work with other PCIe SSD cards such as FusionIO? No. VFCache is a combination of software IO intercept and intelligent cache driver along with a PCIe SSD flash card (which, as EMC has indicated, could be supplied from different manufacturers). Thus for VFCache to be VFCache requires the EMC IO intercept and intelligent cache software driver.

Does VFCache work with other vendors' storage? Yes. Refer to the EMC support matrix, however the product has been architected and designed to install into and coexist with a customer's existing environment, which means supporting different EMC block storage systems as well as those from other vendors. Keep in mind that a main theme of VFCache is to complement, coexist with, enhance and protect customers' investments in storage systems to improve their effectiveness and productivity, as opposed to replacing them.

Does VFCache introduce a new point of vendor lockin or stickiness? Some will see or position this as a new form of vendor lockin; others, assuming that EMC supports different vendors' storage systems downstream, offers options for different PCIe flash cards and keeps the solution affordable, will assert it is no more lockin than other solutions. In fact, by supporting third party storage systems as opposed to replacing them, smart sales people and marketeers will position VFCache as being more open and interoperable than some other PCIe flash card vendors' approaches. Keep in mind that avoiding vendor lockin is a shared responsibility (read more here).

Does VFCache work with NAS? VFCache does not work with NAS (NFS or CIFS) attached storage.

Does VFCache work with databases? Yes, VFCache is well suited for little data (e.g. databases) and traditional OLTP or general business application processing that may not be covered or supported by other so called big data focused or optimized solutions. Refer to this EMC document (and this document here) for more information.

Does VFCache only work with little data? While VFCache is well suited for little data (e.g. databases, SharePoint, file and web servers, traditional business systems), it is also able to work with other forms of unstructured data.

Does VFCache need VMware? No. While VFCache works with VMware vSphere, including a vCenter plug-in, it does not need a hypervisor and is as practical in a physical machine (PM) as it is in a virtual machine (VM).

Does VFCache work with Microsoft Windows? Yes. Refer to the EMC support matrix for specific server operating system and hypervisor version support.

Does VFCache work with other Unix platforms? Refer to the EMC support matrix for specific server operating system and hypervisor version support.

How are reads handled with VFCache? The VFCache software (driver if you prefer) intercepts IO requests to LUNs that are being cached, performing a quick lookup to see if there is a valid cache entry on the physical VFCache PCIe card. If there is a cache hit, the IO is resolved from the closer or local PCIe card cache, making for a lower latency or faster response time IO. In the case of a cache miss, the VFCache driver simply passes the IO request onto the normal SCSI or block (e.g. iSCSI, SAS, FC, FCoE) stack for processing by the downstream storage system (or appliance). Note that when the requested data is retrieved from the storage system, the VFCache driver will, based on caching algorithm determinations, place a copy of the data in the PCIe read cache. Thus the real power of VFCache is the software implementing the cache lookup and cache management functions to leverage the PCIe card that complements the underlying block storage systems.

How are writes handled with VFCache? Unless put into a write cache mode, which is not the default, the VFCache software simply passes the IO operation onto the IO stack for downstream processing by the storage system or appliance attached via a block interface (e.g. iSCSI, SAS, FC, FCoE). Note that as part of the caching algorithms, the VFCache software will make determinations of what to keep in cache based on IO activity requests, similar to how cache management results in better cache effectiveness in a storage system. Given EMC's long history of working with intelligent cache algorithms, one would expect some of that DNA exists or will be leveraged further in future versions of the software. Ironically, this is where other vendors with long cache effectiveness histories such as IBM, HDS and NetApp among others should also be scratching their collective heads saying, wow, we can or should be doing that as well (or better).
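To make the read and write flows described above more concrete, here is a minimal sketch of a read cache with write-through behavior. This is my own illustrative Python pseudocode under the assumptions above, not EMC's driver; the downstream functions simply stand in for the normal block IO stack (iSCSI, SAS, FC, FCoE) to the storage system.

```python
# Illustrative read cache with write-through behavior, loosely modeled on the
# flows described above. Not EMC code; downstream_read/downstream_write are
# placeholders for the normal block IO path to the storage system.

class ReadCache:
    def __init__(self, downstream_read, downstream_write, capacity=1024):
        self.cache = {}                    # LBA -> data held on the local flash card
        self.capacity = capacity
        self.downstream_read = downstream_read
        self.downstream_write = downstream_write

    def read(self, lba):
        if lba in self.cache:              # cache hit: served from local flash
            return self.cache[lba]
        data = self.downstream_read(lba)   # cache miss: go to the storage system
        self._populate(lba, data)          # keep a copy for subsequent reads
        return data

    def write(self, lba, data):
        # Write-through (the default mode described): the storage system stays
        # authoritative, and the cache entry is refreshed to avoid stale reads.
        self.downstream_write(lba, data)
        self._populate(lba, data)

    def _populate(self, lba, data):
        if len(self.cache) >= self.capacity:
            self.cache.pop(next(iter(self.cache)))   # naive eviction for brevity
        self.cache[lba] = data
```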

Can VFCache be used as a write cache? Yes, while its default mode is to be used as a persistent read cache to complement server and application buffers in DRAM, along with enhancing the effectiveness of downstream storage system (or appliance) caches, VFCache can also be configured as a persistent write cache.

Does VFCache include FAST automated tiering between different storage systems? The first version is only a caching tool. However, think about it a bit: where the software sits, what storage systems it can work with, its ability to learn and understand IO paths and patterns, and you can get an idea of where EMC could evolve it, similar to what they have done with RecoverPoint among other tools.

Changing data access patterns and lifecycles
Evolving data access patterns and life cycles (more retention and reads)

Does VFCache mean an all or nothing approach with EMC? While the complete VFCache solution comes from EMC (e.g. PCIe card and software), the solution will work with other block attached storage as well as existing EMC storage systems for investment protection.

Does VFCache support NAS based storage systems? The first release of VFCache only supports block based access, however the server that VFCache is installed in could certainly be functioning as a general purpose NAS (NFS or CIFS) server (see supported operating systems in the EMC interoperability notes) in addition to being a database or other application server.

Does VFCache require that all LUNs be cached? No, you can select which LUNs are cached and which ones are not.

Does VFCache run in an active/active mode? In the first release it is active/passive; refer to EMC release notes for details.

Can VFCache be installed in multiple physical servers accessing the same shared storage system? Yes, however refer to the EMC release notes for details about active/active vs. active/passive configuration rules for ensuring data integrity.

Who else is doing things like this? There are caching appliance vendors as well as others such as NetApp and IBM who have used SSD flash caching cards in their storage systems or virtualization appliances. However, keep in mind that VFCache places the caching function closer to the application that is accessing the data, thereby improving locality of reference (e.g. storage and IO effectiveness).

Does VFCache work with SSD drives installed in EMC or other storage systems? Check the EMC product support matrix for specific tested and certified solutions, however in general, if the SSD drive is installed in a storage system that is supported as a block LUN (e.g. iSCSI, SAS, FC, FCoE), in theory it should be possible for it to work with VFCache. Emphasis: visit the EMC support matrix.

What type of nand flash SSD memory is EMC using in the PCIe card? The first release of VFCache leverages enterprise class SLC (Single Level Cell) nand flash, which has been used in other EMC products for its endurance and long duty cycle to minimize or eliminate concerns of wear and tear while meeting read and write performance. EMC has indicated that, as part of an industry trend, they will also leverage MLC along with Enterprise MLC (EMLC) technologies on a go forward basis.

Doesn't nand flash SSD cache wear out? While nand flash SSD can wear out over time due to extensive write use, the VFCache approach mitigates this by being primarily a read cache, reducing the number of program/erase cycles (P/E cycles) that occur with write operations, as well as by initially leveraging longer duty cycle SLC flash. EMC also has several years of experience implementing wear leveling algorithms in their storage system controllers to increase duty cycle and reduce wear on SLC flash, which will carry forward as MLC or Enterprise MLC (EMLC) techniques are leveraged. This differs from vendors who are positioning their SLC or MLC based flash PCIe SSD cards mainly for write operations, which will cause more P/E cycles to occur at a faster rate, reducing the duty or useful life of the device.
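For a rough sense of why a read-mostly role eases endurance concerns, here is a back of the envelope lifetime estimate. The capacity, P/E cycle rating, write amplification and daily write figures are made-up assumptions for illustration, not EMC or vendor specifications.

```python
# Back-of-the-envelope flash endurance estimate. All numbers are illustrative
# assumptions, not vendor specifications.

capacity_gb = 300            # card capacity
pe_cycles = 100_000          # assumed SLC program/erase rating
write_amplification = 1.2    # assumed FTL overhead

def lifetime_years(host_writes_gb_per_day):
    total_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_writes_gb / host_writes_gb_per_day / 365

# A mostly-read cache vs. a write-heavy target workload
print(f"50 GB/day of writes:   ~{lifetime_years(50):.0f} years")
print(f"2000 GB/day of writes: ~{lifetime_years(2000):.0f} years")
```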

How much capacity does the VFCache PCIe card contain? The first release supports a 300GB card and EMC has indicated that added capacity and configuration options are in their plans.

Does this mean disks are dead? Contrary to popular industry folklore (or wishes), the hard disk drive (HDD) has plenty of life left, part of which has been extended by being complemented by VFCache.

Various options and locations for SSD along with different usage scenarios
Various SSD locations, types, packaging and usage scenario options

Can VFCache work in blade servers? The VFCache software is transparent to blade, rack mount, tower or other types of servers. The hardware part of VFCache is a PCIe card, which means that the blade server or system will need to be able to accommodate a PCIe card to complement the PCIe based mezzanine IO card (e.g. iSCSI, SAS, FC, FCoE) used for accessing storage. What this means is that for blade systems or server vendors such as IBM, who have a PCIe expansion module for their H series blade systems (it consumes a slot normally used by a server blade), PCIe cache cards like those being initially released by EMC could work; however, check with the EMC interoperability matrix as well as your specific blade server vendor for PCIe expansion capabilities. Given that EMC leverages Cisco UCS for their vBlocks, one would assume that those systems will also see VFCache modules. NetApp partners with Cisco using UCS in their FlexPods, so you can see where that could go as well, along with potential support from other server vendors including Dell, HP, IBM and Oracle among others.

What about benchmarks? EMC has released some technical documents that show performance improvements in Oracle environments, such as this one here. Hopefully we will see EMC also release workloads for different applications, including the Microsoft Exchange Solution Reviewed Program (ESRP) along with SPC, similar to what IBM recently did with their systems among others.

How do the first EMC supplied workload simulations compare vs. other PCIe cards? This is tough to gauge, as many SSD solutions, and in particular PCIe cards, are doing apples to oranges comparisons. For example, to generate a high IOPS rating for marketing purposes, most SSD solutions are stress performance tested at 512 bytes, or 1/2 of a KByte, which is only 1/8 of a small 4Kbyte IO. Note that operating systems such as Windows are moving to a 4Kbyte page allocation size to align with growing IO sizes, with databases moving from the old average of 4Kbytes to 8Kbytes and larger. What is important to consider is the average IO size and activity profile (e.g. reads vs. writes, random vs. sequential) for your applications. If your application is doing ultra small 1/2 Kbyte IOs, or even smaller 64 byte IOs (which should be handled by better application or file system caching in DRAM), then the smaller IO size and record setting examples will apply. However, if your applications are more mainstream or larger, then those smaller IO size tests should be taken with a grain of salt. Also keep latency in mind, as many target or opportunity applications for VFCache are response time sensitive or can benefit from the improved productivity they enable.
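One way to keep those apples to oranges comparisons honest is to translate an IOPS claim back into bandwidth for a given IO size; the IOPS numbers in this quick sketch are hypothetical, not any vendor's published results.

```python
# Convert an IOPS claim into throughput for a given IO size, to show why a
# number measured with 512 byte IOs says little about 8 KByte database IOs.
# The IOPS figures below are made-up examples, not vendor benchmark results.

def throughput_mb_per_sec(iops, io_size_bytes):
    return iops * io_size_bytes / (1024 * 1024)

for io_size, iops in [(512, 1_000_000), (4096, 250_000), (8192, 140_000)]:
    print(f"{io_size:>5} byte IOs at {iops:>9,} IOPS "
          f"= {throughput_mb_per_sec(iops, io_size):8.1f} MB/sec")
```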

What is locality of reference? Locality of reference refers to how close data is to where it is being requested or accessed from. The closer the data is to the application requesting it, the faster the response time, or the quicker the work gets done. For example, in the figure below, L1/L2/L3 on board processor caches are the fastest, yet smallest, while closest to the application running on the server. At the other extreme, further down the stack, storage becomes larger capacity and lower cost, however lower performing.

Locality of reference data and storage memory

What does cache effectiveness vs. cache utilization mean? Cache utilization is an indicator of how much of the available cache capacity is being used, however it does not indicate whether the cache is being well used or not. For example, the cache could be 100 percent used, however there could be a low hit rate. Thus cache effectiveness is a gauge of how well the available cache is being used to improve performance in terms of more work being done (IOPS or bandwidth) or lower latency and response time.
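That distinction can be put into numbers: a fully utilized cache with a poor hit rate does little for response time. Here is a simple model using illustrative (assumed) latencies for the cache and the downstream storage.

```python
# Cache effectiveness in terms of average response time. Utilization tells you
# the cache is full; the hit rate tells you whether it is doing useful work.
# Latency figures are illustrative assumptions only.

cache_latency_us = 100       # e.g. local flash cache access
backend_latency_us = 5000    # e.g. HDD-based storage system access

def avg_latency_us(hit_rate):
    return hit_rate * cache_latency_us + (1 - hit_rate) * backend_latency_us

for hit_rate in (0.10, 0.50, 0.90):
    print(f"100% utilized cache, {hit_rate:.0%} hit rate "
          f"-> ~{avg_latency_us(hit_rate):.0f} microseconds average")
```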

Isn't more cache better? More cache is not necessarily better; it is how the cache is being used. This is a message that I would be disappointed in HDS if they did not bring up as a point of messaging (or rebuttal), given their history of emphasizing cache effectiveness vs. size or quantity (Hu, that is a hint btw ;).

What is the performance impact of VFCache on the host server? EMC is saying at most 5 percent or less CPU consumption, which they claim is several times less than the competition's worst scenario, as well as claiming 512MB to 1GB of DRAM on the server vs. several times that of their competitors. The difference could be expected to come from more offload functioning, including the flash translation layer (FTL), wear leveling and other optimization being handled by the PCIe card vs. being handled in the server's memory and using host server CPU cycles.

How does this compare to what NetApp or IBM does? NetApp, IBM and others have done caching with SSD in their storage systems, or leveraged third party PCIe SSD cards from different vendors to be installed in servers and used as a storage target. Some vendors such as LSI have done caching on PCIe cards (e.g. CacheCade, which in theory has a similar software caching concept to VFCache) to improve performance and effectiveness across JBOD and SAS devices.

What about stale (old or invalid) reads, how does VFCache handle or protect against those? Stale reads are handled via the VFCache management software tool or driver, which leverages caching algorithms to decide what is valid or invalid data.

How much does VFCache cost? Refer to EMC announcement pricing, however EMC has indicated that they will be competitive with the market (supply and demand).

If a server shuts down or reboots, what happens to the data in the VFCache? Being that the data is in non-volatile SLC nand flash memory, information is not lost when the server reboots or loses power in the case of a shutdown, thus it is persistent. While exact details are not known as of this time, it is expected that the VFCache driver and software do some form of cache coherency and validity checking to guard against stale reads or discard any other invalid cache entries.

Industry trends and perspectives

What will EMC do with VFCache in the future and on a larger scale such as an appliance? EMC, via its own internal development and via acquisitions, has demonstrated the ability to use various clustered techniques, such as RapidIO for VMAX nodes and InfiniBand for connecting Isilon nodes. Given an industry trend with several startups using PCIe flash cards installed in a server that then functions as an IO storage system, it seems likely, given EMC's history and experience with different storage systems, caching, and interconnects, that they could do something interesting. Perhaps Oracle Exadata III (Exadata I was HP, Exadata II was Sun/Oracle) could be an EMC based appliance (that is pure speculation btw)?

EMC has already shown how it can use SSD drives as a cache extension in VNX and CLARiiON systems (FAST Cache), in addition to as a target or storage tier combined with FAST for tiering. Given their history with caching algorithms, it would not be surprising to see other instantiations of the technology deployed in complementary ways.

Finally, EMC is showing that it can use nand flash SSD in different ways and various packaging forms to apply to diverse applications or customer environments. The companion or complementary approach EMC is currently taking contrasts with some other vendors who are taking an all or nothing, it's all SSD as disk is dead approach. Given the large installed base of disk based systems EMC as well as other vendors have in place, not to mention the investment by those customers, it makes sense to allow those customers the option of when, where and how they can leverage SSD technologies to coexist with and complement their environments. Thus with VFCache, EMC is using SSD as a cache enabler to address the decades old and growing storage IO to capacity performance gap, in a force multiplier model that spreads the cost over more TBytes, PBytes or EBytes while increasing the overall benefit, in other words effectiveness and productivity.

Additional related material:
Part I: EMC VFCache respinning SSD and intelligent caching
IT and storage economics 101, supply and demand
2012 industry trends perspectives and commentary (predictions)
Speaking of speeding up business with SSD storage
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Are Hard Disk Drives (HDDs) getting too big?
Unified storage systems showdown: NetApp FAS vs. EMC VNX
Industry adoption vs. industry deployment, is there a difference?
Two companies on parallel tracks moving like trains offset by time: EMC and NetApp
Data Center I/O Bottlenecks Performance Issues and Impacts
From bits to bytes: Decoding Encoding
Who is responsible for vendor lockin
EMC VPLEX: Virtual Storage Redefined or Respun?
EMC interoperability support matrix

Ok, nuff said for now, I think I see some storm clouds rolling in

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

EMC VFCache respinning SSD and intelligent caching (Part I)

This is the first part of a two part series covering EMC VFCache, you can read the second part here.

EMC formally announced VFCache (aka Project Lightning), an IO accelerator product that comprises a PCIe nand flash card (aka Solid State Device or SSD) and intelligent cache management software. In addition, EMC is also talking about the next phase of the flash business unit and Project Thunder. The approach EMC is taking with VFCache should not be a surprise given their history of starting out with memory and SSD and evolving it into intelligent cache optimized storage solutions.

Storage IO performance and capacity gap
Data center and storage IO performance capacity gap (Courtesy of Cloud and Virtual Data Storage Networking (CRC Press))

Could we see the future of where EMC will take VFCache, along with other possible solutions already being hinted at by the EMC flash business unit, by looking at where they have been already?

Likewise, by looking at the past, can we see the future, or how VFCache and sibling product solutions could evolve?

After all, EMC is no stranger to caching, with both nand flash SSD (e.g. FLASH CACHE, FAST and SSD drives) and DRAM based caching across their product portfolio, not to mention caching being a core part of their company founding products that evolved into HDD and more recently nand flash SSD based systems among others.

Industry trends and perspectives

Unlike others who also offer PCIe SSD cards, such as FusionIO, with a focus on eliminating SANs or other storage (read their marketing), EMC not surprisingly is marching to a different beat. The beat EMC is marching to, or perhaps leading by example for others to follow, is that of going mainstream and using PCIe SSD cards as a cache to complement their own as well as other vendors' storage systems vs. replacing them. This is similar to what EMC and other mainstream storage vendors have done in the past, such as with SSD drives being used as a flash cache extension on CLARiiON or VNX based systems, as well as a target or storage tier.

Various options and locations for SSD along with different usage scenarios
Various SSD locations, types, packaging and usage scenario options

Other vendors including IBM, NetApp and Oracle among others have also leveraged various packaging options of Single Level Cell (SLC) or Multi Level Cell (MLC) flash as caches in the past. A different example of SSD being used as a cache is the Seagate Momentus XT, which is a desktop, workstation and consumer type device. Seagate has shipped over a million of the Momentus XT drives, which use SLC flash as a cache to complement and enhance the performance of the integrated HDD (a 750GB model with 8GB of SLC memory is in the laptop I'm using to type this with).

One of the premises of solutions such as those mentioned above for caching is to address changing data access patterns and life cycles, shown in the figure below.

Changing data access patterns and lifecycles
Evolving data access patterns and life cycles (more retention and reads)

Put a different way, instead of focusing on just big data or corner cases (granted, some of those are quite large) or ultra large cloud scale out solutions, EMC with VFCache is also addressing their core business, which includes little data. What will be interesting to watch and listen to is how some vendors will start to jump up and down saying that they have done or enabled what EMC is announcing for some time. In some cases those vendors will be rightfully making noise about something that they should have made noise about before.

EMC is bringing the SSD message to the mainstream business and storage marketplace, showing how it is a complement to, vs. a replacement of, existing storage systems. By doing so, they will show how to spread the cost of SSD out across a larger storage capacity footprint, boosting the effectiveness and productivity of those systems. This means that customers who install the VFCache product can accelerate the performance of both their existing EMC storage as well as storage systems from other vendors, preserving their technology along with people skills investment.

 

Key points of VFCache

  • Combines PCIe SLC nand flash card (300GB) with intelligent caching management software driver for use in virtualized and traditional servers

  • Making SSD complementary to existing installed block based disk (and/or SSD) storage systems to increase their effectiveness

  • Providing investment protection while boosting productivity of existing EMC and third party storage in customer sites

  • Brings caching closer to the application where the data is accessed while leveraging larger scale direct attached and SAN block storage

  • Focusing the SSD message back on little data as well as big data for mainstream, broad customer adoption scenarios

  • Leveraging the benefits and strengths of SSD as a read cache along with the scalability of underlying downstream disk for data storage

  • Reducing concerns around SSD endurance or duty cycle wear and tear by using as a read cache

  • Offloads underlying storage systems from some read requests, enabling them to do more work for other servers

Additional related material:
Part II: EMC VFCache respinning SSD and intelligent caching
IT and storage economics 101, supply and demand
2012 industry trends perspectives and commentary (predictions)
Speaking of speeding up business with SSD storage
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Are Hard Disk Drives (HDDs) getting too big?
Unified storage systems showdown: NetApp FAS vs. EMC VNX
Industry adoption vs. industry deployment, is there a difference?
Two companies on parallel tracks moving like trains offset by time: EMC and NetApp
Data Center I/O Bottlenecks Performance Issues and Impacts
From bits to bytes: Decoding Encoding
Who is responsible for vendor lockin
EMC VPLEX: Virtual Storage Redefined or Respun?
EMC interoperability support matrix

Ok, nuff said for now, I think I see some storm clouds rolling in

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

AWS (Amazon) storage gateway, first, second and third impressions

Amazon Web Services (AWS) today announced the beta of their new storage gateway functionality that enables access to Amazon S3 (Simple Storage Service) from your different applications using an appliance installed at your data center site. With this beta launch, Amazon joins other startup vendors who provide standalone gateway appliance products (e.g. Nasuni, etc.) along with those who have disappeared from the market (e.g. Cirtas). In addition to gateway vendors, there are also those with cloud access added to their software tools (e.g. Jungle Disk, which accesses both Rackspace and Amazon S3, along with the Commvault Simpana Cloud connector among others). There are also vendors that have added cloud access gateway functionality as part of their storage systems, such as TwinStrata among others. Even EMC (and here) has gotten into the game, adding qualified cloud access support to some of their products.

What is a cloud storage gateway?

Before going further, let's take a step back and address what for some may be a fundamental question: what is a cloud storage gateway?

Cloud services such as storage are accessed via some type of network, either the public Internet or a private connection. The type of cloud service being accessed (figure 1) will determine what is needed. For example, some services can be accessed using a standard Web browser, while others require plug-in or add-on modules. Some cloud services may require downloading an application, agent, or other tool for accessing the cloud service or resources, while others provide an on-site or on-premises appliance or gateway.

Generic cloud access example via Cloud and Virtual Data Storage Networking (CRC Press)
Figure 1: Accessing and using clouds (From Cloud and Virtual Data Storage Networking (CRC Press))

Cloud access software and gateways or appliances are used for making cloud storage accessible to local applications. The gateways, as well as enabling cloud access, provide replication, snapshots, and other storage services functionality. Cloud access gateways or server-based software include tools from BAE, Citrix, Gladinet, Mezeo, Nasuni, Openstack, and Twinstrata among others. In addition to cloud gateway appliances or cloud points of presence (cpops), access to public services is also supported via various software tools. Many data protection tools including backup/restore, archiving, replication, and other applications have added (or are planning to add) support for access to various public services such as Amazon, Google, Iron Mountain, Microsoft, Nirvanix, or Rackspace among several others.

Some of the tools have added native support for one or more of the cloud services, leveraging various application programming interfaces (APIs), while other tools or applications rely on third-party access gateway appliances or a combination of native support and appliances. Another option for accessing cloud resources is to use tools (Figure 2) supplied by the service provider, which may be their own, from a third-party partner, or open source, as well as using their APIs to customize your own tools.

Generic cloud access example via Cloud and Virtual Data Storage Networking (CRC Press)
Figure 2: Cloud access tools (From Cloud and Virtual Data Storage Networking (CRC Press))

For example, I can use my Amazon S3 or Rackspace storage accounts using their web and other provided tools for basic functionality. However, for doing backups and restores, I use the tools provided by the service provider, which then deal with two different cloud storage services. The tool presents an interface for defining what to back up, protect, and restore, as well as enabling shared (public or private) storage devices and network drives. In addition to providing an interface (Figure 2), the tool also speaks the specific APIs and protocols of the different services, including PUT (create or update a container), POST (update header or metadata), LIST (retrieve information), HEAD (metadata information access), GET (retrieve data from a container), and DELETE (remove container) functions. Note that the real behavior and API functionality will vary by service provider. The importance of mentioning the above example is that when you look at some cloud storage service providers, you will see mention of PUT, POST, LIST, HEAD, GET, and DELETE operations as well as services such as capacity and availability. Some services will include an unlimited number of operations, while others will have fees for doing updates, listing, or retrieving your data in addition to basic storage fees. By being aware of cloud primitive functions such as PUT or POST and GET or LIST, you can have a better idea of what they are used for as well as how they play into evaluating different services, pricing, and service plans.
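As a hedged illustration of those primitives, the sketch below maps them onto plain HTTP verbs. The endpoint, container name and header are placeholders, and a real provider such as S3 also requires signed, authenticated requests and provider-specific headers, so treat this as conceptual only rather than any provider's exact API.

```python
# Conceptual illustration of object storage primitives using plain HTTP verbs.
# The endpoint and object names are hypothetical; running this against a real
# service requires that provider's authentication and request signing.

import requests

BASE = "https://objects.example-provider.com/my-container"   # hypothetical endpoint

def demo():
    requests.put(BASE)                                   # PUT: create or update a container
    requests.put(f"{BASE}/report.pdf", data=b"...")      # PUT: store an object in the container
    requests.post(f"{BASE}/report.pdf",                  # POST: update header or metadata
                  headers={"x-meta-owner": "greg"})
    requests.head(f"{BASE}/report.pdf")                  # HEAD: metadata only, no data transfer
    requests.get(BASE)                                   # GET/LIST: list container contents
    requests.get(f"{BASE}/report.pdf")                   # GET: retrieve data from the container
    requests.delete(f"{BASE}/report.pdf")                # DELETE: remove the object
```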

Depending on the type of cloud service, various protocols or interfaces may be used, including iSCSI, NAS NFS, HTTP or HTTPS, FTP, REST, SOAP, and BitTorrent, and APIs and PaaS mechanisms including .NET or SQL database commands, in addition to XML, JSON, or other formatted data. VMs can be moved to a cloud service using file transfer tools or upload capabilities of the provider. For example, a VM such as a VMDK or VHD is prepared locally in your environment and then uploaded to a cloud provider for execution. Cloud services may provide an access program or utility that allows you to configure when, where, and how data will be protected, similar to other backup or archive tools.

Some traditional backup or archive tools have added support, either directly or via third parties, for accessing IaaS cloud storage services such as Amazon, Rackspace, and others. Third-party access appliances or gateways enable existing tools to read and write data to a cloud environment by presenting a standard interface such as NAS (NFS and/or CIFS) or iSCSI (block) that gets mapped to the back-end cloud service format. For example, if you subscribe to Amazon S3, storage is allocated as objects and various tools are used to access or utilize it. The cloud access software or appliance understands how to communicate with the IaaS storage APIs and abstracts those from how they are used. Access software tools or gateways, in addition to translating or mapping between cloud APIs and the formats your applications use, provide functionality including security with encryption, bandwidth optimization, and data footprint reduction such as compression and de-duplication. Other functionality includes reporting and management tools that support various interfaces, protocols and standards including SNMP, the SNIA Storage Management Initiative Specification (SMI-S), and the Cloud Data Management Interface (CDMI).

First impression: Interesting, good move Amazon, I was ready to install and start testing it today

The good news here is that Amazon is taking steps to make it easier for your existing applications and IT environments to use and leverage clouds for private and hybrid adoption models, with Amazon branded and managed services, technology and associated tools.

This means leveraging your existing Amazon accounts to simplify procurement, management and ongoing billing, as well as leveraging their infrastructure. As a standalone gateway appliance (e.g. it does not have to be bundled as part of a specific backup, archive, replication or other data management tool), the idea is that you can insert the technology into your existing data center between your servers and storage to begin sending a copy of data off to Amazon S3. In addition to sending data to S3, the integrated functionality with other AWS services should make it easier to integrate with Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) capabilities, including snapshots for data protection.

Thus my first impression of the AWS storage gateway at a high level is good and interesting, which led to looking a bit deeper, which in turn led to a second impression.

Second impression: Hmm, what does it really do and require, time to slow down and do more home work

Digging deeper and going through the various publicly available material (note that I can only comment on or discuss what is announced or publicly available) results in a second impression of wanting and needing to dig deeper based on some of the caveats. Now granted, and in fairness to Amazon, this is of course a beta release, and on first impression it can be easy to miss the notice that it is in fact a beta, so keep in mind things can and hopefully will change.

Pricing aside, which means as with any cloud or managed storage service you will want to do a cost analysis model just as you would for procuring physical storage, look into the cost of the monthly gateway fee along with the associated physical server running the VMware ESXi configuration that you will need to supply. Chances are that if you are an average sized SMB, you have a physical machine (PM) lying around that you can throw a copy of ESXi onto, if you don't already have room for some more VMs on an existing one.

You will also need to assess the costs for using the S3 storage, including space capacity charges, access and other fees, as well as charges for doing snapshots or using other functionality. Again, these are not unique to Amazon or their cloud gateway and should be best practices for any service or solution that you are considering. Amazon makes it easy, by the way, to see their base pricing for different tiers of availability, geographic locations and optional fees.
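One simple way to run that cost model is to fold the gateway fee plus capacity, request and transfer charges into a single monthly estimate. Every rate in this sketch is a placeholder; substitute the provider's current published pricing and your own gateway fee, capacity and activity figures.

```python
# Skeleton of a monthly cloud storage cost model. All rates are placeholders,
# not actual AWS pricing; plug in current published rates and your own numbers.

def monthly_cost(stored_gb, put_requests, get_requests, egress_gb,
                 gateway_fee=125.0,            # placeholder monthly gateway fee
                 per_gb=0.10,                  # placeholder $/GB-month stored
                 per_1k_puts=0.01,             # placeholder $ per 1,000 PUTs
                 per_10k_gets=0.01,            # placeholder $ per 10,000 GETs
                 egress_per_gb=0.12):          # placeholder $/GB transferred out
    return (gateway_fee
            + stored_gb * per_gb
            + put_requests / 1000 * per_1k_puts
            + get_requests / 10000 * per_10k_gets
            + egress_gb * egress_per_gb)

print(f"Estimated monthly cost: ${monthly_cost(2000, 500_000, 50_000, 100):,.2f}")
```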

Speaking of accessing the cloud, and cloud conversations, you will also want to keep in mind what your networking bandwidth service requirements will be for moving data to Amazon, if you are not already doing so.

Another thing to consider with the AWS storage gateway is that it does not replace your local storage (that is, unless you move your applications to Amazon EC2 and EBS); rather, it makes a copy of whatever you save locally to a remote Amazon S3 storage pool. This can be good for high availability (HA), business continuance (BC), disaster recovery (DR) and compliance among other data management needs. However, in your cost model you also need to keep in mind that you are not replacing your local storage, you are adding to it via the cloud, which should be seen as complementing and enhancing your private, now to be hybrid, environment.

 

Walking the cloud data protection talk

FWIW, I leverage a similar model where I use a service (Jungle Disk) where critical copies of my data get sent to that service, which in turn places copies at Rackspace (Jungle Disk's parent) and Amazon S3. What data goes where depends on different policies that I have established. I also have local backup copies as well as a master gold disaster copy stored in a secure offsite location. The idea is that when needed, I can get a good copy restored from my cloud providers quickly, regardless of where I am, if the local copy is not good. On the other hand, experience has already demonstrated that without sufficient network bandwidth services, if I need to bring back 100s of GBytes or TBytes of data quickly, I'm going to be better off bringing my master gold copy back onsite, then applying fewer, smaller updates from the cloud service. In other words, the technologies complement each other.

By the way, a lesson learned here is that once my first copy is made, with data footprint reduction (DFR) techniques applied (e.g. compression, dedupe, optimization, etc.), later copies occur very fast. However, subsequent restores of those large files or volumes also take longer to retrieve from the cloud vs. sending up changed versions. Thus be aware of backup vs. restore times, something that will apply to any cloud provider and can be mitigated by appliances that do local caching. However, also keep in mind that if a disaster occurs, your local appliance may be affected and its cache rendered useless.
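To see why the master gold copy matters, compare the wall clock time to pull different amounts of data back over different network links; the sizes, link speeds and efficiency factor below are illustrative assumptions only.

```python
# Rough restore-time estimate: how long to pull a given amount of data back
# from a cloud service over a given network link. All figures are illustrative.

def restore_hours(data_gb, link_mbps, efficiency=0.8):
    usable_mbps = link_mbps * efficiency          # protocol and contention overhead
    seconds = (data_gb * 8 * 1000) / usable_mbps  # GB -> megabits
    return seconds / 3600

for data_gb, link in [(50, 20), (500, 20), (2000, 100)]:
    print(f"{data_gb:>5} GB over a {link:>3} Mbps link: "
          f"~{restore_hours(data_gb, link):.1f} hours")
```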

Getting back to AWS storage gateway and my second impression is that at first it sounded great.

However, then I realized it only supports iSCSI. FWIW, nothing wrong with iSCSI, I like it and recommend using it where applicable, even though I'm not using it here. I would like to have seen NAS (either NFS and/or CIFS) support for the gateway, making it easier, in my scenario, for different applications, servers and systems to use and leverage the AWS services, something that I can do with my other gateways provided via different software tools. Granted, for those environments that are already using iSCSI for the servers that will be using the AWS storage gateway, this is a non issue, while for others it is a consideration, including the cost (time) to prepare your environment for using the capability.

Depending on the amount of storage you have in your environment, the next item that caught my eye may or may not be an issue: the iSCSI gateway supports volumes of up to 1TB each and up to 12 of them, hence a maximum of 12TB under management. This can be worked around by using multiple gateways; however, the increased complexity balanced against the benefit of the functionality is something to consider.

Third impression: Dig deeper, learn more, address various questions

This leads up to my third impression: the need to dig deeper into what the AWS storage gateway can and cannot do for various environments. I can see where it can be a fit for some environments, while for others, at least in its beta version, it will be a non-starter. In the meantime, do your homework and look around at other options; ironically, by launching a gateway service, Amazon may reinvigorate the marketplace for some of the standalone or embedded cloud gateway solution providers.

What is needed for using AWS storage gateway

In addition to having an S3 account, you will need to acquire, for a monthly fee, the storage gateway appliance, which is software installed into a VMware ESXi hypervisor virtual machine (VM). The requirements are a VMware ESXi hypervisor (v4.1) on a physical machine (PM) with at least 7.5GB of RAM and four (4) virtual processors assigned to the appliance VM, along with 75GB of disk space for the Open Virtualization Archive (OVA) image installation and data. You will also need a properly sized network connection to Amazon, as well as iSCSI initiators on either Windows Server 2008, Windows 7 or Red Hat Enterprise Linux.

Note that the AWS storage gateway beta is optimized for block write sizes greater than 4KB, and Amazon warns that smaller IO sizes can cause overhead resulting in lost storage space. This is a consideration for systems that have not yet changed their file systems and volumes to use the larger allocation sizes.
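As a rough illustration of why small writes waste space when everything is rounded up to a larger allocation block, here is a simple sketch; the 4KB block size comes from the beta notes above, while the write sizes are arbitrary examples.

    import math

    # Space consumed vs. space requested when every write is rounded up
    # to a whole allocation block (e.g. 4KB).
    def allocated_bytes(write_sizes, block_size=4096):
        return sum(math.ceil(size / block_size) * block_size for size in write_sizes)

    writes = [512, 1024, 3000, 9000]   # example write sizes in bytes
    print(allocated_bytes(writes), "bytes allocated for", sum(writes), "bytes written")
    # 24576 bytes allocated for 13536 bytes written

The smaller the writes relative to the allocation size, the larger the gap between what you wrote and what you end up storing (and paying for).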

Some closing thoughts, tips and comments:

  • Congratulations to Amazon for introducing and launching an AWS branded storage gateway.
  • Amazon brings the value of trust to a cloud relationship.
  • Initially I was excited about the idea of a gateway that any of my systems could use to reach my S3 storage pools, vs. using gateway access functions that are part of different tools such as my backup software or via Amazon web tools. Likewise I was excited by the idea of having an easy to install and use gateway that would allow me to grow in a cost effective way.
  • Keep in mind that this solution, at least in its beta version, DOES NOT replace your existing iSCSI based storage; instead it complements what you already have.
  • I hope Amazon listens carefully to what its customers and prospects want vs. need in order to evolve the functionality.
  • This announcement should reinvigorate some of the cloud appliance vendors as well as those who have embedded functionality for accessing Amazon and other providers.
  • Keep bandwidth services and optimization in mind both for sending data as well as for retrieving it during a disaster or a small file restore.
  • In concept, the AWS storage gateway is not all that different from appliances that do snapshots and other local and remote data protection, such as those from Actifio, EMC (Recoverpoint) or Falconstor, or dedicated gateways such as those from Nasuni among others.
  • Here is a link to additional AWS storage gateway frequently asked questions (FAQs).
  • If the AWS storage gateway were available with a NAS interface, I would probably be activating it this afternoon, even with some of the other requirements and cost aside.
  • I'm still formulating my fourth impression, which is going to take some time. Perhaps if I can get Amazon to help sell more of my books so that I can afford to test the entire solution leveraging my existing S3, EC2 and EBS accounts, I might do so in the future; otherwise, for now, I will continue to research.
  • To learn more about the AWS storage gateway beta, check out this free Amazon webcast on February 23, 2012.

To learn more about cloud based data protection, data footprint reduction, cloud gateways, access and management, check out my book Cloud and Virtual Data Storage Networking (CRC Press), which is of course available on Amazon Kindle as well as in hard cover print, also available at Amazon.com.

Ok, nuff said for now, I need to get back to some other things while thinking about this all some more.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

My Server and Storage IO holiday break projects

Happy new years!

Following up from a flurry of posts in the closing days of 2011, including industry trends and perspective predictions for 2012 and 2013, top blog posts from 2011, top all time posts, along with a couple of other items here and here, it's time to get back to 2012 activity. Also, if you missed it, here is the Fall (December) 2011 StorageIO news letter.

Actually I have been busy working on some other projects the past several weeks, most of which are under NDA so not much else can be said about them; however, there are some other things I'm working on that will show themselves in the weeks and months to come. Here is a link to a webinar and live chat that I did the first week of January on CDP (Continuous Data Protection) and how it can be applied to many different environments.

But let's take a step back for a moment and let me share with you some of the things I did or started during the holiday break between Christmas and New Year's.

Like many others, I found time to relax and get away from normal work activities during the recent holiday season.

However, like many of you who may also be techies or geeks (or wannabe geeks) at heart, I could not get away from server, storage, IO, networking, data protection, video and other things completely. I used some time to work on a few projects that I had wanted to do or that I had started before the holidays, and here is a synopsis.

Increased storage capacity on a DVR by about 5x. To get this to work, I modified a 3.5 inch enclosure with a power supply to accept a 2.5 inch 1.5TB SATA HDD with an eSATA connection; the easy part was then attaching it to the external eSATA port on my DVR. The hard part was waiting for the DVR to reconfigure and start recording information again. Also, as part of upgrading the external storage on the DVR, I got the media share option to do more than basic things, leveraging audio and video real-time transcoding using the Tversity software along with various codecs on a media server.

Another project involved upgrading a 500GB HHDD to a 750GB HHDD and doing some testing. Shortly before the holidays I received a new 750GB Seagate Momentus XT II HHDD to compare to my existing 500GB previous generation model. I have been using the 750GB HHDD for over a month now and it is amazing to see so much space in a laptop that also has good performance. Some follow-up activities are to go back and analyze some performance data that I collected before and after the upgrade. This includes workload simulation of reads and writes, random and sequential, at different IO sizes, as well as comparing Windows startup and shutdown speed and impact, to build on what I did last summer (see this post). More on these in the not so distant future.
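For those curious about what that kind of workload simulation looks like, here is a minimal, hypothetical sketch in Python that times sequential vs. random reads of different IO sizes against a scratch file. It is not a real benchmark tool (no cache control, single threaded, reads only); it simply illustrates varying the access pattern and IO size.

    import os, random, time

    def make_scratch_file(path, size_mb=256):
        # Create a scratch file filled with random data.
        with open(path, "wb") as f:
            f.write(os.urandom(size_mb * 1024 * 1024))

    def read_test(path, io_size, count, sequential=True):
        # Time `count` reads of `io_size` bytes at sequential or random offsets.
        file_size = os.path.getsize(path)
        with open(path, "rb") as f:
            start = time.perf_counter()
            offset = 0
            for _ in range(count):
                if not sequential:
                    offset = random.randrange(0, file_size - io_size)
                f.seek(offset)
                f.read(io_size)
                offset += io_size
            elapsed = time.perf_counter() - start
        return count / elapsed   # rough reads per second

    make_scratch_file("scratch.bin")
    for size in (4096, 65536):
        print(size, "sequential", round(read_test("scratch.bin", size, 2000, True)))
        print(size, "random", round(read_test("scratch.bin", size, 2000, False)))

Real testing also needs to account for file system and device caches, write workloads and queue depths, which is where purpose-built tools come in.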

Speaking of clouds, I had a chance to do some more testing with my Amazon EC2 and EBS accounts, in addition to cleaning up my S3 pool and my other cloud backup and storage provider accounts. This also involved refining some data protection backup/restore and archive frequency and retention settings. In addition to refinements for cloud based backup, I'm also in the process of transitioning from Imation Odyssey Removable Hard Disk Drives (RHDD) to much larger capacity 2.5 inch portable RHDDs that are used for offsite bulk copies. Part of the migration includes seeing that end of year master or gold backups and archives were made and safely secured elsewhere, in addition to having data sent to the cloud.

Another project involved doing some more testing and simulations with my SSD along with more Windows boot and shutdown tests mentioned above. More on these results in a future post.

Some time (actually not very much) was also spent adding some new shares to my Iomega IX4 NAS, which is filling up, so I also did some more research on what I will upgrade or replace it with. While the Iomega has been great (knock on wood), Synology is also looking interesting as a future solution; however, I am keeping my options open for now. Right now I'm leaning towards keeping the IX4 and adding another NAS filer, using the two for different purposes.

Some other server, storage and IO projects included upgrading some networking components and finishing the decommissioning of old drives, securing them for safe disposal when the time comes.

I also was able to spend time on non tech items including outside enjoying the nice weather, cutting up some fallen trees and roasting them on a bonfire among other things.

Photos: tree cleanup and taking a break; roasting logs and walking on frozen water.

Ok, nuff said for now, time to get back to work.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

A conversation from SNW 2011 with Jenny Hamel

Here (.qt) and here (.wmv) is a video of an interview that I did with Jenny Hamel (@jennyhamelsd6) during the Fall 2011 SNW event in Orlando, Florida.


Topics covered during the discussion include:

  • Importance of metrics that matter for gaining and maintaining IT situational awareness
  • The continued journey of IT to improve customer service delivery in a cost-effective manner
  • Reducing cost and complexity without negatively impacting customer service experience
  • Participating in SNW and SNIA for over ten years on three different continents

Industry Trends and Perspectives

  • Industry trends, buzzword bingo (SSD, cloud, big data, virtualization), adoption vs. deployment
  • Increasing efficiency along with effectiveness and productivity
  • Stretching budgets to do more without degrading performance or availability
  • How customers can navigate their way around various options, products and services
  • Importance of networking at events such as SNW along with information exchange and learning
  • Why data footprint reduction is similar to packing smartly when going on a journey
  • Cloud and Virtual Data Storage Networking (now available on Kindle and other epub formats)

View the video from SNW fall 2011 here (.qt) or here (.wmv).


Check out other videos and pod casts here or at StorageioTV.com

Speaking of industry trends, check out the top 25 new posts from 2011, along with the top 25 all time posts and my comments (predictions) for 2012 and 2013.

Ok, nuff said for now

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Top storageio cloud virtualization networking and data protection posts

I'm in the process of wrapping up 2011 and getting ready for 2012. Here is a list of the top 25 all time posts from StorageIOblog covering cloud, virtualization, servers, storage, green IT, networking and data protection. Looking back, here are the 2010 and 2011 industry trends, thoughts and perspective predictions, along with, looking forward, a 2012 preview here.

Top 25 all time posts about storage, cloud, virtualization, networking, green IT and data protection

Check out the companion post to this which is the top 25 2011 posts located here as well as 2012 and 2013 predictions preview here.

Ok, nuff said for now

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

2012 industry trends perspectives and commentary (predictions)

2011 is almost over, so it's time to wrap up the year as well as get ready for 2012.

Here is a link to a post of the top 25 new posts that appeared on StorageIOblog in 2011.

As a companion to the above, here is a link to the all time top 25 posts from StorageIOblog.

Looking back, here is a post about industry trends, thoughts and perspective predictions for 2010 and 2011 (preview 2012 and 2013 thoughts and perspectives here).

I'm still finalizing my 2012 and 2013 predictions and perspectives, which are a work in progress; however, here is a synopsis:

  • Addressing storage woes at the source: Time to start treating the source of data management and protection including backup challenges instead of or in addition to addressing downstream target destination topics.
  • Big data and big bandwidth meet big backup: 2011 was abuzz with big data and big bandwidth, so 2012 will see the realization that big backup needs to be addressed. Also in 2012 there will be continued realization that many have been doing big data and big bandwidth, and thus also big backups, for many years if not decades before the current big buzzword became popular.
  • Little data does not get left out of the discussion even though younger brother big data gets all of the press and praise. Little data may not be the shining diva it once was, however the revenue annuity stream will keep many software, tools, server and storage vendors afloat while customers continue to rely on the little data darling to run their business.
  • Cloud confusion finds clarity on the horizon: Granted, there will be plenty more cloud FUD and hype, cloud washing and cleaning going around; however, 2012 and beyond will also find organizations realizing where and how to use different types of clouds (public, private, hybrid) to meet various needs, from SaaS and AaaS to PaaS to IaaS and other variations of XaaS. Part of the clarification that will help remove the confusion will be that there are many different types of cloud architectures, products, stacks, solutions, services and products to address various needs. Another part of the clarification will be discussion of what needs to be added to clouds to make them more viable for both new as well as old or existing applications. This means organizations will determine what they need to do to move their existing applications to some form of a cloud model while understanding how clouds coexist with and complement what they are currently doing. Cloud conversations will also shift from a low cost or free focus, expanding to discussions around value, trust, quality of service (QoS), SLOs, SLAs, security, reliability and related themes.

Industry Trends and Perspectives

  • Cloud and virtualization stack battles: The golden rule of virtualization and clouds is that whoever controls the management and software stacks controls the gold. Hence, watch for more positioning around management and enablement stacks as well as solutions to see who gains control of the gold.
  • Data protection modernization: Building off of the first point above, data protection modernization the past several years has been focused on treating the symptoms of downstream problems at the target or destination. This has involved swapping out or moving media around and applying data footprint reduction (DFR) techniques downstream to give near term tactical relief, as has been the case with backup, restore, BC and DR for many years. Now the focus will start to expand to how to address the source of the problem, which is an expanding data footprint upstream or at the source, using different data footprint reduction tools and techniques. This also means using different metrics, including keeping performance and response time in perspective as part of reduction rates vs. ratios (a small sketch after this list illustrates the difference), while leveraging different techniques and tools from the data footprint reduction tool box. In other words, it is time to stop swapping out media like changing tires that keep going flat on a car; find and fix the problem, and change the way data is protected (and when) to cut the impact downstream. This will not happen overnight; however, with virtualization and cloud activities underway, now is a good time to start modernizing data protection.
  • End to End (E2E) management tools: Continue focus around E2E tools and capabilities to gain situational awareness across different technology layers.
  • FCoE and Fibre Channel continue to mature: One sure sign that Fibre Channel over Ethernet (FCoE) is continuing to evolve, mature and gain initial traction is the increase in activity declaring it dead or dumb or similar things. FCoE is still in its infancy while Fibre Channel (FC) is in the process of transitioning to 16Gb with a roadmap that will enable it to continue for many more years. As FCoE continues to ramp up over next several years (remember, FC took several years to get where it is today), continued FC enhancements will give options for those wishing to stick with it while gaining confidence with FCoE, iSCSI, SAS and NAS.
  • Hard drive shortages drive revenues and profits: Some have declared that the recent HDD shortages due to the Thailand flooding will cause Solid State Devices (SSD) using flash memory to dramatically grow in adoption and deployment. I think that both single level cell (SLC) and multi level cell (MLC) flash SSDs will continue to grow in deployments counted in units shipped as well as revenues, and hopefully also margin or profits. However, I also think that with the HDD shortage and continued demand, vendors will use the opportunity to stabilize some of their pricing, meaning less discounting while managing the inventory, which should mean more margin or profit in a quarter or two. What will be interesting to watch will be whether SSD vendors drop their margins in an effort to increase units shipped and deployed to show market revenue and adoption growth while HDD margins rise.
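Here is the small, hypothetical sketch of the reduction rate vs. ratio point made in the data protection modernization item above: a great reduction ratio achieved slowly can remove less footprint per hour than a modest ratio achieved quickly. The numbers are purely illustrative.

    # Data footprint reduction: ratio vs. effective rate (illustrative numbers only).
    def reduction_ratio(original_gb, reduced_gb):
        return original_gb / reduced_gb

    def reduction_rate_gb_per_hour(original_gb, reduced_gb, hours):
        # How much footprint was actually removed per hour of processing.
        return (original_gb - reduced_gb) / hours

    # Tool A: 10:1 ratio but takes 10 hours; Tool B: 4:1 ratio in 2 hours.
    print(reduction_ratio(1000, 100), reduction_rate_gb_per_hour(1000, 100, 10))  # 10.0 and 90 GB/hour
    print(reduction_ratio(1000, 250), reduction_rate_gb_per_hour(1000, 250, 2))   # 4.0 and 375 GB/hour

Which matters more depends on whether the goal is minimizing stored capacity or fitting the work inside a protection window.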

Industry Trends and Perspectives

  • QoS, SLA/SLOs part of cloud conversations: Low cost or cost avoidance will continue to be the focus of some cloud conversations. However with metrics and measurements to make informed decisions, discussions will expand to QoS, SLO, SLAs, security, mean time to restore or return information, privacy, trust and value also enter into the picture. In other words, clouds are growing up and maturing for some, while their existing capabilities become discovered by others.
  • Clouds are a shared responsibility model: The cloud blame game when something goes wrong will continue, however there will also be a realization that as with any technology or tool, there is a shared responsibility. This means that customers accept responsibility for how they will use a tool, technologies or service, the provider assumes responsibility, and both parties have a collective responsibility.
  • Return on innovation is the new ROI: For years, no, make that decades, a popular buzz term has been return on investment, the companion of total cost of ownership. Both ROI and TCO as you know and like (or hate) will continue to be used; however, for situations that are difficult to monetize, a new variation exists. That new variation is return on innovation, which is the measure of intangible benefits derived from how hard products are used to derive value from soft products and services delivered.
  • Solid State Devices (SSD) confidence: One of the barriers to flash SSD adoption has been cost per capacity, with another being confidence in reliability and data consistency over time (aka duty cycle wear and tear). Many enterprise class solutions have used single level cell (SLC) flash SSD, which has better endurance, duty cycle or wear handling capabilities; however, that benefit comes at the cost of a higher price per capacity. Consequently, vendors are pushing multi level cell (MLC) flash SSD, which reduces the cost per capacity but needs extra controller and firmware functionality to manage the wear leveling and duty cycle. In some ways, MLC flash is to SSD memory what SATA high-capacity desktop drives were to HDDs in the enterprise storage space about 8 to 9 years ago. What I mean by that is that higher cost, high performance disk drives were the norm, then lower cost, higher capacity SATA drives appeared, resulting in enhancements to make them more enterprise capable while boosting the confidence of customers to use the technology. The same thing is happening with flash SSD in that SLC is more expensive and for many has higher confidence, while MLC is lower cost, higher capacity and gaining the enhancements to take on a role for flash SSD similar to what high-capacity SATA did in the HDD space. In addition to confidence with SSD, new packaging variations will continue to evolve as well.
  • Virtualization beyond consolidation: The current wave of consolidation of desktop using VDI, server and storage aggregation will continue, however a trend that has grown for a couple of years now that will take more prominence in 2012 and 2013 is realization that not everything can be consolidated, however many things can be virtualized. This means for some applications the focus will not be how many VMs to run per PM, rather, how a PM can be more effectively used to boost performance and agility for some applications during part of the day, while being used for other things at different times. For example a high performance database that normally would not be consolidated would be virtualized to enable agility for maintenance, BC, DR load balancing and placed on a fast PM with lots of fast memory, CPU and IO capabilities dedicated to it. However during off hours when little to no database activity is occurring, then other VMs would be moved onto that PM then moved off before the next busy cycle.

Industry Trends and Perspectives

  • Will applications be ready to leverage cloud: Some applications and functionality can more easily be moved to cloud environments vs. others. A question that organizations will start to ask is what prevents their applications or business functionality from going to or using cloud resources in addition to asking cloud providers what new capabilities will they extend to support old environments.
  • Zombie list grows: More items will be declared dead meaning that they are either still alive, or have reached stability to the point where some want to see them dead so that their preferred technology or topic can take root.
  • Some other topics and trends include continued growing awareness that metrics and measurements matter for cloud, virtualization, data and storage networking. This also means a growing awareness that there are more metrics that matter for storage than cost per GByte or TByte, including IOPS, latency or response time, bandwidth, IO size, random and sequential access, along with availability (a simple comparison sketch follows this list). 2012 and 2013 will see continued respect being given to NAS at both the high end and low end of the market, from the enterprise down to the consumer space. Speaking of consumer and SOHO (Small Office Home Office), now that SMB has generally been given respect, or at least attention, by many vendors, the new frontier will be to move further down market to the lower end of the SMB, which is SOHO, just above the consumer space. Of course some vendors have already closed the gap (or at least on paper, PowerPoint, WebEx or YouTube video) from consumer to enterprise. Of course buzzword bingo will continue to be a popular game.
  • Oh, btw, DevOps will also appear in your vocabulary if it has not already.
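To make the cost per capacity vs. cost per IOPS point concrete, here is a hypothetical comparison sketch; the prices and performance figures are made-up placeholders, not quotes for any real product.

    # Compare storage devices on cost per GB and cost per IOPS (illustrative numbers).
    devices = {
        "example 2TB HDD": {"price": 150.0, "capacity_gb": 2000, "iops": 150},
        "example 200GB SSD": {"price": 400.0, "capacity_gb": 200, "iops": 20000},
    }

    for name, d in devices.items():
        per_gb = d["price"] / d["capacity_gb"]
        per_iops = d["price"] / d["iops"]
        print(name, "costs", round(per_gb, 3), "per GB and", round(per_iops, 4), "per IOPS")

On a cost per GB basis the HDD wins easily, while on a cost per IOPS basis the SSD wins by an even wider margin, which is why the right metric depends on whether the workload is capacity or performance bound.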

Watch for more on these and other topics in the weeks and months to come, and if you want to read more now, then get a copy of Cloud and Virtual Data Storage Networking. Also check out the top 25 new posts of 2011 as well as some of the all time most popular posts at StorageIOblog.com, which can also be seen on various other venues that pick up the full RSS feed or archive feed. Also check out the StorageIO news letter for more industry trends perspectives and commentary.

Ok, nuff said for now

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Top 2011 cloud virtualization storage and networking posts

I'm in the process of wrapping up 2011 and getting ready for 2012; here is a list of the top 25 new posts from this past year at StorageIOblog.

Looking back, here is a post about industry trends, thoughts and perspective predictions for 2010 and 2011 (preview 2012 and 2013 thoughts and perspectives here).

Here are the top 25 new blog posts from 2011

Check out the companion posts of the top 25 all time posts here as well as 2012 and 2013 predictions preview here.

Ok, nuff said for now

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Fall (December) 2011 StorageIO News Letter

StorageIO News Letter Image
Fall (December) 2011 News letter

Welcome to the Fall (December) 2011 edition of the Server and StorageIO Group (StorageIO) news letter. This follows the Summer 2011 edition.

You can get access to this news letter via various social media venues (some are shown below) in addition to StorageIO web sites and subscriptions.

 

Click on the following links to view the Fall (December) 2011 edition as an HTML or PDF or, to go to the news letter page to view previous editions.

Follow via Google Feedburner here or via email subscription here.

You can also subscribe to the news letter by simply sending an email to newsletter@storageio.com

Enjoy this edition of the StorageIO newsletter, let me know your comments and feedback.

Nuff said for now

Cheers
Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

Cloud and Virtual Data Storage Networking now on Kindle

It only makes sense that a book about Clouds, Virtualization, Data Storage and Networking be available via a cloud service in electronic format. Today Amazon and my publisher (CRC Press Taylor and Francis) released a Kindle version of my new book Cloud and Virtual Data Storage Networking which joins the previously released hardcopy version also available at Amazon.com among other venues.

Cloud and Virtual Data Storage Networking book on Kindle

Cloud and Virtual Data Storage Networking has been declared The New Enterprise Tech Bible by noted industry blogger and host of the Nekkid Tech (@NekkidTech) pod cast Greg Knieriemen (@Knieriemen). Check out Episode #11 (The Enterprise Tech Bible) of the Nekkid Tech pod cast show here.

Comments and reviews about Cloud and Virtual Data Storage Networking can be found at Amazon.com, along with those from Stephen Guendert, PhD (@DrSteveGuendert) at CMG MeasureIT (@cmgnews), who says: "Greg's latest book is the ibuprofen that will make these cloud computing information overload headaches go away. Cloud and Virtual Data Storage Networking is the single source you can read to get a clear understanding of the fundamentals of the cloud."

Greg Brunton, EDS, an HP Company, commented: "With all the chatter in the market about cloud storage and how it can solve all your problems, the industry needed a clear breakdown of the facts and how to use cloud storage effectively. Greg's latest book does exactly that."

Google preview of Cloud and Virtual Data Storage Networking book

Want to know more besides viewing the Google preview above?

Check out this free PDF download of Chapter 1 and view a PDF flyer with more information about the book, including discount codes for ordering via CRC Press, or visit the StorageIO books page. In addition to the Amazon Kindle version, other ebook formats (including PDF) are available here (bookdepository.com) and here (CRCnetBase), including individual chapters.

View this post, which has links to more information about cloud conversations and discussions.

Ok, nuff said for now

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

What industry pundits love and loathe about data storage

Drew Robb has a good article about what IT industry pundits, including vendors, analysts and advisors, love and loathe about data storage, including comments from me.

In the article Drew asks: What do you really love about storage and what are your pet peeves?

One of my comments and perspectives is that I like Hybrid Hard Disk Drives (HHDDs) in addition to traditional Hard Disk Drives (HDDs) along with Solid State Devices (SSDs). As much as I like HHDDs, I also believe that, as with any technology, they are not the best solution for everything; however, they can be used in more ways than are commonly seen. Here is the fifth installment of a series on HHDDs that I have done since June 2010, when I received my first HHDD, a Seagate Momentus XT. You can read the other installments of my momentus moments here, here, here and here.

Seagate Momentus XT
HHDD with integrated nand flash SSD photo courtesy Seagate.com

Molly Rector, VP of marketing at tape summit resources vendor Spectra Logic, mentioned that what she does not like is companies that base their business plan on patent law trolling. I would have expected something different, along the lines of countering or correcting people who say tape sucks, tape is dead, or that tape is the cause of anything wrong with storage, thus clearing the air or putting up a fight for tape summit resources. Go figure…

Another of my comments involved clouds, of which there are plenty of conversations taking place. I do like clouds (I even recently wrote a book involving them); however, I'm a fan of using them where applicable to coexist with and enhance other IT resources. Don't be scared of clouds; however, be ready, do your homework, listen, learn, and do proof of concepts to decide best practices, when, where, what and how to use them.

Speaking of clouds, click here to read about who is responsible for cloud data loss and cast your vote, along with viewing what do you think about IT clouds in general here.

Mike Karp (aka twitter @storagewonk), an analyst with Ptak Noel, mentions that midrange environments don't get respect from big (or even startup) vendors.

I would take that a step further by saying that, compared to six or so years ago, SMBs are getting night and day better respect along with attention from most vendors; however, what is lacking is respect for the SOHO sector (e.g. the lower end of SMB, down to or just above consumer).

Granted, some who have traditionally sold into those sectors, such as server vendors including Dell and HP, get it or at least see the potential, along with traditional enterprise vendor EMC via its Iomega unit. Yet I still see many vendors, including startups, in general discounting, shrugging off or sneering at the SOHO space, similar to those who dissed or did not respect the SMB space several years ago. Like the SMB space, SOHO requires different products, packaging, pricing and routes to market via channel or etail mechanisms, which means change for some vendors. Those vendors who embraced the SMB and realized what needed to change to adapt to those markets will also stand to do better with SOHO.

Here is the reason that I think SOHO needs respect.

Simple: SOHOs grow up to become SMBs, SMBs grow up to become SMEs, SMEs grow up to become enterprises, not to mention that the amount of data being generated, moved, processed and stored continues to grow. The net result is that SMB along with SOHO storage demands will continue to grow, and those vendors who can adjust to support those markets will also stand to gain new customers that in turn can become prospects for other solution offerings.

Cloud conversations

Not surprisingly, Eran Farajun of Asigra, which was doing cloud backups decades before they were known as clouds, loves backup (and restores). However, I am surprised that Eran did not jump on the it's time to modernize and re-architect data protection theme. Oh well, I will have to have a chat with Eran about that sometime.

What was surprising were comments from Panzura, which has a good distributed (read also cloud) file system that can be used for various things including online reference data. Panzura has a solution that normally I would not even think about in the context of being pulled into a Datadomain or dedupe appliance type discussion (e.g. tape sucks or other similar themes). So it is odd that they are playing to the tape sucks camp and theme vs. playing to where the technology can really shine, which IMHO is in the global, distributed, scale out and cloud file system space. Oh well, I guess you go with what you know or what has worked in the past to get some attention.

Molly Rector of Spectra also mentioned that she likes High Performance Computing; I am surprised that she did not throw in high productivity computing as well, in conjunction with big data, big bandwidth, green, dedupe, power, disk, tape and related buzzword bingo terms.

There are also some comments from me about cost cutting.

While I see the need for organizations to cut costs during tough economic times, I'm not a fan of simply cutting cost for the sake of cost cutting, as opposed to finding and removing the complexity that in turn removes the cost of doing work. In other words, I'm a fan of finding and removing waste, becoming more effective and productive, and removing the cost of doing a particular piece of work. This in the end meets the aim of the bean counters to cut costs, however it can be done in a way that does not degrade service levels or the customer service experience. For example, instead of looking to cut backup costs, do you know where the real costs of doing data protection exist (hint: swapping out media is treating the symptoms), and if so, what can be done to streamline those from the source of the problem downstream to the target (e.g. media or medium)? In other words, redesign, review and modernize how data protection is done; leverage data footprint reduction (DFR) techniques including archive, compression, consolidation, data management, dedupe and other technologies in effective and creative ways. After all, return on innovation is the new ROI.
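As one example of a DFR technique, here is a minimal, hypothetical sketch of fixed-block deduplication: identical chunks are stored once and referenced thereafter. Real dedupe engines add variable-length chunking, persistent indexes and compression; this only illustrates the basic idea.

    import hashlib

    def dedupe(data, chunk_size=4096):
        # Split data into fixed-size chunks and keep only the unique ones.
        store = {}     # chunk hash -> chunk bytes
        recipe = []    # ordered list of hashes needed to rebuild the data
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)
            recipe.append(digest)
        return store, recipe

    # Example: data with many repeated blocks dedupes very well.
    data = (b"A" * 4096) * 100 + (b"B" * 4096) * 50
    store, recipe = dedupe(data)
    print(len(data), "bytes in,", sum(len(c) for c in store.values()), "bytes stored")

The same principle applied at the source of the data, rather than only at the backup target, is what treating the problem upstream looks like.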

Check out Drew's article here to read more on the above topics and themes.

Ok, nuff said for now

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved