What is the best kind of IO? The one you do not have to do

Updated 2/10/2018

What is the best kind of IO? If no IO (input/output) operation is the best IO, then the second best IO is the one that can be done as close to the application and processor as possible, with the best locality of reference. The third best IO is the one that can be done in less time, or at the least cost or impact to the requesting application, which means moving further down the memory and storage stack (figure 1).

Storage and IO or I/O locality of reference and storage hierarchy
Figure 1 memory and storage hierarchy

The problem with IOs is that they are basic operations to get data into and out of a computer or processor, so they are required; however, they also have an impact on performance, response or wait time (latency). IOs require CPU or processor time and memory to set up and then process the results, as well as IO and networking resources to move data to its destination or retrieve it from where it is stored. While IOs cannot be eliminated, their impact can be greatly reduced by doing fewer of them via caching and grouped reads or writes (pre-fetch, write behind), among other techniques and technologies.
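
As a rough illustration of doing fewer IOs via grouped writes, here is a minimal Python sketch. The CountingDevice and WriteBehindBuffer names, the 512 byte write size and the 4KB flush threshold are all made up for this example, and a real write behind cache also has to deal with durability and ordering, which this ignores.

```python
class CountingDevice:
    """Stand-in for a storage device that counts the IOs it receives."""

    def __init__(self):
        self.io_count = 0
        self.data = bytearray()

    def write(self, payload: bytes) -> None:
        self.io_count += 1
        self.data += payload


class WriteBehindBuffer:
    """Groups small writes and sends them to the device as one larger IO."""

    def __init__(self, device, flush_bytes=4096):
        self.device = device
        self.flush_bytes = flush_bytes   # assumed threshold, tune per workload
        self.pending = bytearray()

    def write(self, payload: bytes) -> None:
        self.pending += payload
        if len(self.pending) >= self.flush_bytes:
            self.flush()

    def flush(self) -> None:
        if self.pending:                 # skip the IO when nothing is buffered
            self.device.write(bytes(self.pending))
            self.pending.clear()


def demo():
    """Issue the same 64 small writes directly and through the buffer."""
    direct = CountingDevice()
    for _ in range(64):
        direct.write(b"x" * 512)

    grouped = CountingDevice()
    buffered = WriteBehindBuffer(grouped, flush_bytes=4096)
    for _ in range(64):
        buffered.write(b"x" * 512)
    buffered.flush()
    return direct.io_count, grouped.io_count


print(demo())
```

The same amount of data reaches the device either way; the buffered path simply gets there in fewer, larger IOs, at the price of holding data in memory a little longer.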

Think of it this way: instead of going on multiple errands, sometimes you can group multiple destinations together, making for a shorter, more efficient trip; however, that optimization may also take longer. Hence sometimes it makes sense to go on a couple of quick, short, low latency trips vs. one single larger one that takes half a day but accomplishes many things. Of course, how far you have to go on those trips (e.g. locality) makes a difference in how many you can do in a given amount of time.

What is locality of reference?

Locality of reference refers to how close (e.g. location) data exists to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, then level 1 (L1), level 2 (L2) or level 3 (L3) onboard cache, followed by dynamic random access memory (DRAM). Then would come memory, also known as storage, on PCIe cards such as nand flash solid state devices (SSD), or accessible via an adapter on a direct attached storage (DAS), SAN or NAS device. In the case of a PCIe nand flash SSD card, even though physically the nand flash SSD is closer to the processor, there is still the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as cache or buffers for data along with meta or control information to further optimize and improve locality of reference. In other words, they help with cache hits, cache use and cache effectiveness vs. simply boosting cache utilization.

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

What can you do to cut the impact of IO?

  • Establish baseline performance and availability metrics for comparison
  • Realize that IOs are a fact of IT virtual, physical and cloud life
  • Understand what is a bad IO along with its impact
  • Identify why an IO is bad, expensive or causing an impact
  • Find and fix the problem, either with software, application or database changes
  • Throw more software caching tools, hypervisors or hardware at the problem
  • Hardware includes faster processors with more DRAM and fast internal buses
  • Leverage local PCIe flash SSD cards for caching or as targets
  • Utilize storage systems or appliances that have intelligent caching and storage optimization capabilities (performance, availability, capacity).
  • Compare changes and improvements to baseline, quantify improvement

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

More Storage IO momentus HHDD and SSD moments part II

This follows the first of a two-part series on my latest experiences with Hybrid Hard Disk Drives (HHDD’s) and Solid State Devices (SSD’s). In my last momentus moment post I discussed what I have done with HHDD’s, setting the stage for expanded SSD use. I have the newer HHDD’s, e.g. Seagate Momentus XT II 750GB (8GB SLC nand flash), installed and have since bought another from Amazon, as well as having some of the older 500GB (4GB SLC nand flash) models in various systems. Those are all functioning great; however, I am still waiting and looking forward to the rumored firmware enhancements to boost write capabilities.

This brings me up to the latest momentus moment which now includes SSD’s.

Well, it’s two years later and I now have a 256GB (usable capacity is lower) Samsung SSD that I bought from Amazon.com and installed in one of my laptops. Just as when I made the first switch to HHDD’s, I also have a backup copy/clone to fall back to in case of emergency.

Was it worth the wait? Yes, particularly using the HHDD’s to bridge the gap and enable some productivity gain which more than paid for them based on some different projects. I’m already seeing productivity improvements that will make future upgrades easier to justify (to myself).

I deviated from my strategy a bit and installed the SSD about six months earlier than I was planning to because of a physical barrier. That physical barrier was that my new traveling laptop only accepts 7mm height 2.5 inch small form factor devices, and the 750GB HHDD that I had planned on installing was 2.5mm too thick, which pushed up the SSD installation.

What will become of the 750GB HHDD? It’s being redeployed to help speed up file serving, backups and other functions.

Will I replace the HHDD’s in my other workstations and laptops now with SSD’s? Across the board no, not yet, however there is one other system that is a prime candidate to maybe upgrade in a month or two (maybe less).

Will I stick with the Samsung SSD’s or look at other options? I’m keeping my options open and using this as a gauge to test and compare other options in a real world working environment as opposed to a lab bench test simulation. In other words, taking the next step past the lab test and product reviews, gaining comfort and confidence and then trying out with real use activity.

What will happen in the future as I install more SSD’s and have surplus HHDD’s? Redeploy them of course into file or NAS servers and backup targets, where they in turn will replace HDD’s that will either get retired, or be redeployed to replace older, smaller capacity, higher cost to handle HDD’s used for offsite protection.

I tried using the software that came with the SSD to do the cloning and should have known better; however, I wanted to see what the latest version of Ghost was like (it was a waste of time, to be polite). Instead I used Seagate DiscWizard (aka Acronis), which requires at least one Seagate product (source or target) for cloning.

Cloning from the Seagate HHDD that had been previously cloned from the Hitachi HDD that came with the laptop was a non-issue. However, I wanted to see what would happen if I attached the Samsung SSD to the Seagate GoFlex cable and cloned directly from the Hitachi HDD; it worked. Hence another reason to have some of the Seagate GoFlex cables (USB and eSATA), like the ones I bought at Amazon.com, around in your toolbox.

While I do not have concrete empirical numbers to share, cloning from a HDD to a SSD is, shall we say, fast; however, what’s really fun to watch is cloning from a HHDD to a SSD using an eSATA (GoFlex) connector adapter. The reason I say that it is fun is that you don’t have to sit and wait for hours. It’s not minutes to move 100s of GBs, however you can very much see the progress bar move at a good pace.

Also, put the HHDD on an eSATA port and try it out as a backup or data dump target if you have the need for speed, capacity and cost effectiveness; yes, it’s fast, has lots of capacity and so forth. Now if Seagate and Synology or EMC Iomega would get their acts together and add support for the HHDD’s in those different unified SMB and SOHO NAS solutions, that would be way cool.

Will I be racing to put SSD’s in my other laptops or workstations soon? Probably not, as there are things in the works and working their way into and through the market place that I want to wait for, and thus will wait for now, unless a more interesting opportunity pops up.

Related links on SSD, HHDD and HDD
More Storage IO momentus HHDD and SSD moments part I
More Storage IO momentus HHDD and SSD moments part II
IO IO it is off to Storage and IO metrics we go
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Other Momentus moments posts
SSD and Storage System Performance
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
Why SSD based arrays and storage appliances can be a good idea (Part II)
IT and storage economics 101, supply and demand
Researchers and marketers dont agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?
SSD options for Virtual (and Physical) Environments Part IV: What type of SSD is best for your needs

Ok, nuff said for now.

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

More Storage IO momentus HHDD and SSD moments part I

This is the first of a two-part series on my latest experiences with HHDD’s and SSD’s.

About two years ago I wanted to start installing solid state devices (SSD’s) into my workstations and laptops. Like many others, I found the expensive price for the limited capacity gains of the then generation SSD’s did not make for a good business decision based on my needs. Don’t get me wrong, I have been a huge fan of SSD for decades as an IT user, vendor, analyst, consultant and consumer and still am. In fact I have some SSD’s used for different purposes as well as many Hard Disk Drives (HDD’s) and Hybrid Hard Disk Drives (HHDD’s). Almost two years ago when I first tested the HHDD’s, I did a first post in this ongoing series, and this two-part post is part of that string of experiences observed evolving from HDD’s to HHDD’s to SSD’s.


Image courtesy of Seagate.com

As a refresher, HHDD’s like the Seagate Momentus XT combine a traditional 7,200 RPM 2.5 inch 500GB or 750GB HDD with an integrated single level cell (SLC) nand flash SSD within the actual device. The SSD in the HHDD’s is part of the HDD’s controller, complementing the existing DRAM buffer by adding 4GB (500GB models) or 8GB (750GB models) of fast nand flash SSD cache. This means that no external special controller, adapter, data movement or migration software is required to get the performance boost over a traditional HDD and the capacity above a SSD at an affordable cost. In other words, the HHDD’s bridge the gap for those who need large capacity and some performance increases, without having to spend a lot on a lower capacity SSD.

However, based on my needs or business requirements two years ago, I found the justification to get all the extra performance of SSD not quite there. Back then my thinking was that it would be about two, maybe three years before the right mix of performance, availability (or reliability e.g. duty cycles), capacity and economics aligned.

Note that this was based on my specific needs and requirements as opposed to my wants or wishes (I wanted SSD back then, however my budget needed to go elsewhere). My requirements and performance needs are probably not the same as yours or others might be. I also wanted to see the incremental technology, product and integration improvements ranging from duty cycle or program/erase cycles (P/E) with newer firmware and flash translation layers (FTLs) among other things. Particularly with multilevel cell (MLC) or enhanced multilevel cell (eMLC) which helps bring the cost down while boosting the capacity, I’m seeing enough to have more confidence in those devices. Note that for the past couple of years I have used single level cell (SLC) nand flash SSD technology in my HHDD’s, the same SSD flash technology that has been found in enterprise class storage.

I wanted SSD’s two years ago in my laptops and workstations to improve productivity, which involves a lot of content creation in addition to consumption; however, as mentioned above, there were barriers. So instead of sitting on the sidelines, waiting for SSD’s to either become lower cost, or more capacity for a given cost, or wishing somebody would send me some free stuff (that may or may not have worked), I took a different route. That route was to try the HHDD’s such as Seagate Momentus XT.

Disclosure: Seagate sent me my first HHDD for first testing and verifications before buying several more from Amazon.com and installing them in all laptops, workstations and a server (not all servers have the HHDD’s, or at least yet).

The main reason I went with the HHDD’s two years ago and continue to use them today is to bridge the gap and gain some benefit vs. waiting and wishing and talking about what SSD’s would enable me to do in the future while missing out on productivity enhancements.

The HHDD’s also appealed to me in that my laptops are too space constrained for putting in two drives and playing the hybrid configuration game of installing both a small SSD and HDD and migrating data back and forth. Sure, I could do that in the office or carry an extra external device around; however, been there, done that in the past, and I want to move away from those types of models where possible.

Ok, nuff said for now, let’s resume this discussion in part II.

Cheers Gs

EMC VFCache respinning SSD and intelligent caching (Part II)

This is the second of a two-part series pertaining to EMC VFCache; you can read the first part here.

In this part of the series, let’s look at some common questions along with comments and perspectives.

Common questions, answers, comments and perspectives:

Why would EMC not just go into the same market space and mode as FusionIO, a model that many other vendors seem eager to follow? IMHO many vendors are following or chasing FusionIO, thus most are selling in the same way, perhaps to the same customers. Some of those vendors could very easily, if they have not already, make a quick change to their playbook, adding some new moves to reach a broader audience.

Another smart move here is that by taking a companion or complementary approach, EMC can continue selling existing storage systems to customers and keep those investments while also supporting competitors’ products. In addition, for those customers who are slow to adopt SSD based techniques, this is a relatively easy and low risk way to gain confidence. Granted, the disk drive was declared dead several years (and yes, also several decades) ago; however, it is and will stay alive for many years due to SSD helping to close the IO storage and performance gap.

Storage IO performance and capacity gap
Data center and storage IO performance capacity gap (Courtesy of Cloud and Virtual Data Storage Networking (CRC Press))

Has this been done before? There have been other vendors who have done LUN caching appliances in the past going back over a decade. Likewise there are PCIe RAID cards that support flash SSD as well as DRAM based caching. Even NetApp has had similar products and functionality with their PAM cards.

Does VFCache work with other PCIe SSD cards such as FusionIO? No, VFCache is a combination of a software IO intercept and intelligent cache driver along with a PCIe SSD flash card (which, as EMC has indicated, could be supplied by different manufacturers). Thus VFCache, to be VFCache, requires the EMC IO intercept and intelligent cache software driver.

Does VFCache work with other vendors’ storage? Yes. Refer to the EMC support matrix; however, the product has been architected and designed to install and coexist in a customer’s existing environment, which means supporting different EMC block storage systems as well as those from other vendors. Keep in mind that a main theme of VFCache is to complement, coexist with, enhance and protect customers’ investments in storage systems to improve their effectiveness and productivity, as opposed to replacing them.

Does VFCache introduce a new point of vendor lock-in or stickiness? Some will see or place this as a new form of vendor lock-in; others, assuming that EMC supports different vendors’ storage systems downstream, offers options for different PCIe flash cards and keeps the solution affordable, will assert it is no more lock-in than other solutions. In fact, by supporting third party storage systems as opposed to replacing them, smart sales people and marketeers will place VFCache as being more open and interoperable than some other PCIe flash card vendors’ approaches. Keep in mind that avoiding vendor lock-in is a shared responsibility (read more here).

Does VFCache work with NAS? VFCache does not work with NAS (NFS or CIFS) attached storage.

Does VFCache work with databases? Yes, VFCache is well suited for little data (e.g. database) and traditional OLTP or general business application processing that may not be covered or supported by other so-called big data focused or optimized solutions. Refer to this EMC document (and this document here) for more information.

Does VFCache only work with little data? While VFCache is well suited for little data (e.g. databases, SharePoint, file and web servers, traditional business systems), it is also able to work with other forms of unstructured data.

Does VFCache need VMware? No. While VFCache works with VMware vSphere, including a vCenter plug in, it does not need a hypervisor and is as practical in a physical machine (PM) as it is in a virtual machine (VM).

Does VFCache work with Microsoft Windows? Yes. Refer to the EMC support matrix for specific server operating systems and hypervisor version support.

Does VFCache work with other Unix platforms? Refer to the EMC support matrix for specific server operating systems and hypervisor version support.

How are reads handled with VFCache? The VFCache software (driver if you prefer) intercepts IO requests to LUNs that are being cached, performing a quick lookup to see if there is a valid cache entry in the physical VFCache PCIe card. If there is a cache hit, the IO is resolved from the closer or local PCIe card cache, making for a lower latency or faster response time IO. In the case of a cache miss, the VFCache driver simply passes the IO request on to the normal SCSI or block (e.g. iSCSI, SAS, FC, FCoE) stack for processing by the downstream storage system (or appliance). Note that when the requested data is retrieved from the storage system, the VFCache driver will, based on what its caching algorithms determine, place a copy of the data in the PCIe read cache. Thus the real power of VFCache is the software implementing the cache lookup and cache management functions to leverage the PCIe card that complements the underlying block storage systems.
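
The read path described above can be sketched in a few lines of Python. To be clear, this is not EMC’s driver or algorithm: the ReadCacheSketch class, the backend_read callable and the simple least recently used (LRU) eviction are assumptions picked only to show the lookup, hit, miss and populate steps, plus the idea that only selected LUNs are cached.

```python
from collections import OrderedDict


class ReadCacheSketch:
    """Toy read cache in front of a slower block backend (illustrative only)."""

    def __init__(self, backend_read, capacity_blocks, cached_luns):
        self.backend_read = backend_read      # hypothetical callable: (lun, lba) -> bytes
        self.capacity_blocks = capacity_blocks
        self.cached_luns = set(cached_luns)   # only these LUNs are cached
        self.entries = OrderedDict()          # (lun, lba) -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, lun, lba):
        if lun not in self.cached_luns:
            # LUN is not being cached: pass the request straight down the stack.
            return self.backend_read(lun, lba)
        key = (lun, lba)
        if key in self.entries:
            # Cache hit: resolve locally and mark the entry as recently used.
            self.hits += 1
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: fetch from the backend, then keep a copy for next time.
        self.misses += 1
        data = self.backend_read(lun, lba)
        self.entries[key] = data
        if len(self.entries) > self.capacity_blocks:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return data


backend_calls = []


def slow_backend(lun, lba):
    """Stand-in for the downstream storage system; records every request."""
    backend_calls.append((lun, lba))
    return f"lun{lun}-block{lba}".encode()


cache = ReadCacheSketch(slow_backend, capacity_blocks=2, cached_luns=[0])
cache.read(0, 10)   # miss, goes to the backend
cache.read(0, 10)   # hit, served from the cache
cache.read(1, 10)   # LUN 1 is not cached, always goes to the backend
print(cache.hits, cache.misses, len(backend_calls))
```

The interesting part of a real product is the policy hidden behind that one eviction line, i.e. deciding what is worth keeping, which is where cache effectiveness comes from.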

How are writes handled with VFCache? Unless put into a write cache mode, which is not the default, VFCache software simply passes the IO operation on to the IO stack for downstream processing by the storage system or appliance attached via a block interface (e.g. iSCSI, SAS, FC, FCoE). Note that as part of the caching algorithms, the VFCache software will make determinations of what to keep in cache based on IO activity requests, similar to how cache management results in better cache effectiveness in a storage system. Given EMC’s long history of working with intelligent cache algorithms, one would expect some of that DNA exists or will be leveraged further in future versions of the software. Ironically this is where other vendors with long cache effectiveness histories such as IBM, HDS and NetApp among others should also be scratching their collective heads saying wow, we can or should be doing that as well (or better).

Can VFCache be used as a write cache? Yes. While its default mode is to be used as a persistent read cache to complement server and application buffers in DRAM, along with enhancing the effectiveness of downstream storage system (or appliance) caches, VFCache can also be configured as a persistent write cache.
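
To make the difference between the default pass through behavior and a write cache mode concrete, here is a small Python sketch. It is not EMC’s implementation: the WriteModeSketch class, the dict standing in for the storage system and the explicit flush() call are all assumptions for illustration.

```python
class WriteModeSketch:
    """Contrasts pass-through writes with a write-back cache (illustrative only)."""

    def __init__(self, backend, write_back=False):
        self.backend = backend          # dict standing in for the downstream storage system
        self.write_back = write_back    # False mimics the pass-through default
        self.dirty = {}                 # blocks held in cache, not yet on the backend

    def write(self, block, data):
        if self.write_back:
            self.dirty[block] = data    # held in cache, destaged later
        else:
            self.backend[block] = data  # handed straight to the storage system

    def read(self, block):
        # Newest copy wins: check not-yet-destaged data before the backend.
        if block in self.dirty:
            return self.dirty[block]
        return self.backend[block]

    def flush(self):
        """Destage all dirty blocks to the backend."""
        self.backend.update(self.dirty)
        self.dirty.clear()


storage = {}
pass_through = WriteModeSketch(storage)
pass_through.write(7, b"new")
print(7 in storage)            # the backend already has the block

storage2 = {}
write_back = WriteModeSketch(storage2, write_back=True)
write_back.write(7, b"new")
print(7 in storage2)           # still only in the cache
write_back.flush()
print(7 in storage2)           # destaged after the flush
```

The window between write() and flush() is why a write cache needs persistent media such as nand flash: whatever sits in the cache at that point is the only copy.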

Does VFCache include FAST automated tiering between different storage systems? The first version is only a caching tool; however, think a bit about where the software sits, what storage systems it can work with, and its ability to learn and understand IO paths and patterns, and you can get an idea of where EMC could evolve it to, similar to what they have done with RecoverPoint among other tools.

Changing data access patterns and lifecycles
Evolving data access patterns and life cycles (more retention and reads)

Does VFCache mean all or nothing approach with EMC? While the complete VFCache solution comes from EMC (e.g. PCIe card and software), the solution will work with other block attached storage as well as existing EMC storage systems for investment protection.

Does VFCache support NAS based storage systems? The first release of VFCache only supports block based access; however, the server that VFCache is installed in could certainly be functioning as a general purpose NAS (NFS or CIFS) server (see supported operating systems in EMC interoperability notes) in addition to being a database or other application server.

Does VFCache require that all LUNs be cached? No, you can select which LUNs are cached and which ones are not.

Does VFCache run in an active / active mode? In the first release it is active / passive; refer to EMC release notes for details.

Can VFCache be installed in multiple physical servers accessing the same shared storage system? Yes, however refer to EMC release notes on details about active / active vs. active / passive configuration rules for ensuring data integrity.

Who else is doing things like this? There are caching appliance vendors as well as others such as NetApp and IBM who have used SSD flash caching cards in their storage systems or virtualization appliances. However, keep in mind that VFCache is placing the caching function closer to the application that is accessing it, thereby improving on the locality of reference (e.g. storage and IO effectiveness).

Does VFCache work with SSD drives installed in EMC or other storage systems? Check the EMC product support matrix for specific tested and certified solutions, however in general if the SSD drive is installed in a storage system that is supported as a block LUN (e.g. iSCSI, SAS, FC, FCoE) in theory it should be possible to work with VFCache. Emphasis, visit the EMC support matrix.

What type of flash is being used?

What type of nand flash SSD memory is EMC using in the PCIe card? The first release of VFCache is leveraging enterprise class SLC (Single Level Cell) nand flash, which has been used in other EMC products for its endurance and long duty cycle, to minimize or eliminate concerns of wear and tear while meeting read and write performance. EMC has indicated that they will also, as part of an industry trend, leverage MLC along with Enterprise MLC (eMLC) technologies on a go forward basis.

Doesn’t nand SSD flash cache wear out? While nand flash SSD can wear out over time due to extensive write use, the VFCache approach mitigates this by being primarily a read cache, reducing the number of program / erase cycles (P/E cycles) that occur with write operations, as well as initially leveraging longer duty cycle SLC flash. EMC also has several years of experience implementing wear leveling algorithms in its storage system controllers to increase duty cycle and reduce wear on SLC flash, which will play forward as MLC or Enterprise MLC (eMLC) techniques are leveraged. This differs from vendors who are positioning their SLC or MLC based flash PCIe SSD cards for mainly write operations, which will cause more P/E cycles to occur at a faster rate, reducing the duty or useful life of the device.

How much capacity does the VFCache PCIe card contain? The first release supports a 300GB card and EMC has indicated that added capacity and configuration options are in their plans.

Does this mean disks are dead? Contrary to popular industry folklore (or wish), the hard disk drive (HDD) has plenty of life left, part of which has been increased by being complemented by VFCache.

Various options and locations for SSD along with different usage scenarios
Various SSD locations, types, packaging and usage scenario options

Can VFCache work in blade servers? The VFCache software is transparent to blade, rack mount, tower or other types of servers. The hardware part of VFCache is a PCIe card, which means that the blade server or system will need to be able to accommodate a PCIe card to complement the PCIe based mezzanine IO card (e.g. iSCSI, SAS, FC, FCoE) used for accessing storage. What this means is that for blade systems or server vendors such as IBM who have a PCIe expansion module for their H series blade systems (it consumes a slot normally used by a server blade), PCIe cache cards like those being initially released by IBM could work; however, check the EMC interoperability matrix, as well as with your specific blade server vendor, for PCIe expansion capabilities. Given that EMC leverages Cisco UCS for their vBlocks, one would assume that those systems will also see VFCache modules. NetApp partners with Cisco using UCS in their FlexPods, so you can see where that could go as well, along with potential support from other server vendors including Dell, HP, IBM and Oracle among others.

What about benchmarks? EMC has released some technical documents that show performance improvements in Oracle environments such as this here. Hopefully we will see EMC also release results for other workloads and applications, including Microsoft Exchange Solution Reviewed Program (ESRP) along with SPC, similar to what IBM recently did with their systems among others.

How do the first EMC supplied workload simulations compare vs. other PCIe cards? This is tough to gauge as many SSD solutions, and in particular PCIe cards, are doing apples to oranges comparisons. For example, to generate a high IOPS rating for marketing purposes, most SSD solutions are stress performance tested at 512 bytes, which is 1/2 of a KByte or 1/8 of a small 4 KByte IO. Note that operating systems such as Windows are moving to a 4 KByte page allocation size to align with growing IO sizes, with databases moving from the old average of 4 KBytes to 8 KBytes and larger. What is important to consider is the average IO size and activity profile (e.g. reads vs. writes, random vs. sequential) for your applications. If your application is doing ultra small 1/2 KByte IOs, or even smaller 64 byte IOs (which should be handled by better application or file system caching in DRAM), then the smaller IO size and record setting examples will apply. However, if your applications are more mainstream or larger, then those smaller IO size tests should be taken with a grain of salt. Also keep latency in mind, as many target or opportunity applications for VFCache are response time sensitive or can benefit from the improved productivity that lower latency enables.
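
To see why IO size matters when reading an IOPS claim, a little arithmetic helps. The Python sketch below assumes a hypothetical device limited purely by 400 MBytes per second of bandwidth (a made up figure, not a vendor number, and real devices also hit per IO overheads), and shows how many IOs per second that same bandwidth equals at different IO sizes.

```python
def iops_for_bandwidth(bandwidth_bytes_per_sec, io_size_bytes):
    """IOs per second needed to fill a given bandwidth at one IO size."""
    return bandwidth_bytes_per_sec // io_size_bytes


# Assumed 400 MBytes/sec for illustration only.
HYPOTHETICAL_BANDWIDTH = 400 * 1024 * 1024

for io_size in (512, 4096, 8192):
    iops = iops_for_bandwidth(HYPOTHETICAL_BANDWIDTH, io_size)
    print(f"{io_size:>5} byte IOs: {iops:>7} IOPS")
```

Under that assumption the 512 byte result is eight times the 4 KByte one for the same amount of data moved, which is why an IOPS number without its IO size says very little.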

What is locality of reference? Locality of reference refers to how close data is to where it is being requested or accessed from. The closer the data is to the application requesting it, the faster the response time or the quicker the work gets done. For example, in the figure below, L1/L2/L3 on board processor caches are the fastest yet smallest, while closest to the application running on the server. At the other extreme, further down the stack, storage becomes large capacity and lower cost, however lower performing.

Locality of reference data and storage memory

What does cache effectiveness vs. cache utilization mean? Cache utilization is an indicator of how much of the available cache capacity is being used; however, it does not indicate whether the cache is being well used or not. For example, cache could be 100 percent used, yet there could be a low hit rate. Thus cache effectiveness is a gauge of how well the available cache is being used to improve performance in terms of more work being done (IOPS or bandwidth) or lower latency and response time.
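
A quick way to keep the two apart is to compute them from separate counters. The Python sketch below uses hit rate as one simple stand-in for effectiveness (it is not the only possible measure), and the counter values are invented purely to show a full cache that helps little next to a half empty one that helps a lot.

```python
def cache_utilization(used_blocks, total_blocks):
    """Fraction of the cache capacity that currently holds data."""
    return used_blocks / total_blocks


def cache_hit_rate(hits, misses):
    """Fraction of lookups served from the cache; 0.0 if there were no lookups."""
    lookups = hits + misses
    return hits / lookups if lookups else 0.0


# Hypothetical counters, not measurements from any real system.
examples = {
    "full but cold": {"used": 1000, "total": 1000, "hits": 50, "misses": 950},
    "half but hot": {"used": 500, "total": 1000, "hits": 900, "misses": 100},
}

for name, c in examples.items():
    utilization = cache_utilization(c["used"], c["total"])
    hit_rate = cache_hit_rate(c["hits"], c["misses"])
    print(f"{name}: utilization {utilization:.0%}, hit rate {hit_rate:.0%}")
```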

Isn’t more cache better? More cache is not necessarily better; what matters is how the cache is being used. This is a message that I would be disappointed in HDS if they were not to bring up as a point of messaging (or rebuttal), given their history of emphasizing cache effectiveness vs. size or quantity (Hu, that is a hint btw ;).

What is the performance impact of VFCache on the host server? EMC is saying 5 percent or less CPU consumption, which they claim is several times less than the competition’s worst scenario, as well as claiming 512MB to 1GB of DRAM on the server vs. several times that for their competitors. The difference could be expected to come from more functions being offloaded, including flash translation layer (FTL), wear leveling and other optimization being handled by the PCIe card vs. being handled in the server’s memory and using host server CPU cycles.

How does this compare to what NetApp or IBM does? NetApp, IBM and others have done caching with SSD in their storage systems, or have leveraged third party PCIe SSD cards from different vendors installed in servers to be used as a storage target. Some vendors such as LSI have done caching on the PCIe cards (e.g. CacheCade, which in theory has a similar software caching concept to VFCache) to improve performance and effectiveness across JBOD and SAS devices.

What about stale (old or invalid) reads, how does VFCache handle or protect against those? Stale reads are handled via the VFCache management software tool or driver which leverages caching algorithms to decide what is valid or invalid data.

How much does VFCache cost? Refer to EMC announcement pricing, however EMC has indicated that they will be competitive with the market (supply and demand).

If a server shuts down or reboots, what happens to the data in the VFCache? Because the data is in non volatile SLC nand flash memory, information is not lost when the server reboots or loses power in the case of a shutdown, thus it is persistent. While exact details are not known as of this time, it is expected that the VFCache driver and software do some form of cache coherency and validity check to guard against stale reads and discard any invalid cache entries.

Industry trends and perspectives

What will EMC do with VFCache in the future and on a larger scale such as an appliance? EMC, via its own internal development and via acquisitions, has demonstrated the ability to use various clustering techniques such as RapidIO for VMAX nodes and InfiniBand for connecting Isilon nodes. Given an industry trend with several startups using PCIe flash cards installed in a server that then functions as an IO storage system, it seems likely given EMC's history and experience with different storage systems, caching, and interconnects that they could do something interesting. Perhaps Oracle Exadata III (Exadata I was HP, Exadata II was Sun/Oracle) could be an EMC based appliance (that is pure speculation btw)?

EMC has already shown how it can use SSD drives as a cache extension in VNX and CLARiiON systems (FAST Cache) in addition to as a target or storage tier combined with FAST for tiering. Given their history with caching algorithms, it would not be surprising to see other instantiations of the technology deployed in complementary ways.

Finally, EMC is showing that it can use nand flash SSD in different ways and in various packaging forms to apply to diverse applications or customer environments. The companion or complementary approach EMC is currently taking contrasts with some other vendors who are taking an all or nothing, "it's all SSD as disk is dead" approach. Given the large installed base of disk based systems EMC as well as other vendors have in place, not to mention the investment by those customers, it makes sense to allow those customers the option of when, where and how they can leverage SSD technologies to coexist with and complement their environments. Thus with VFCache, EMC is using SSD as a cache enabler to address the decades old and growing storage IO to capacity performance gap in a force multiplier model that spreads the cost over more TBytes, PBytes or EBytes while increasing the overall benefit, in other words effectiveness and productivity.

Additional related material:
Part I: EMC VFCache respinning SSD and intelligent caching
IT and storage economics 101, supply and demand
2012 industry trends perspectives and commentary (predictions)
Speaking of speeding up business with SSD storage
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Are Hard Disk Drives (HDDs) getting too big?
Unified storage systems showdown: NetApp FAS vs. EMC VNX
Industry adoption vs. industry deployment, is there a difference?
Two companies on parallel tracks moving like trains offset by time: EMC and NetApp
Data Center I/O Bottlenecks Performance Issues and Impacts
From bits to bytes: Decoding Encoding
Who is responsible for vendor lockin
EMC VPLEX: Virtual Storage Redefined or Respun?
EMC interoperability support matrix

Ok, nuff said for now, I think I see some storm clouds rolling in

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

EMC VFCache respinning SSD and intelligent caching (Part I)

This is the first part of a two part series covering EMC VFCache, you can read the second part here.

EMC formally announced VFCache (aka Project Lightning), an IO accelerator product that comprises a PCIe nand flash card (aka Solid State Device or SSD) and intelligent cache management software. In addition, EMC is also talking about the next phase of the flash business unit and Project Thunder. The approach EMC is taking with VFCache should not be a surprise given their history of starting out with memory and SSD and evolving it into an intelligent cache optimized storage solution.

Storage IO performance and capacity gap
Data center and storage IO performance capacity gap (Courtesy of Cloud and Virtual Data Storage Networking (CRC Press))

Could we see the future of where EMC will take VFCache along with other possible solutions already being hinted at by the EMC flash business unit by looking where they have been already?

Likewise by looking at the past can we see the future or how VFCache and sibling product solutions could evolve?

After all, EMC is no stranger to caching, with both nand flash SSD (e.g. FAST Cache, FAST and SSD drives) and DRAM based caches across their product portfolio, not to mention caching being a core part of the company's founding products, which evolved to include HDDs and more recently nand flash SSDs among others.

Industry trends and perspectives

Unlike others who also offer PCIe SSD cards, such as FusionIO with a focus on eliminating SANs or other storage (read their marketing), EMC not surprisingly is marching to a different beat. The beat EMC is marching to, or perhaps leading by example for others to follow, is that of going mainstream and using PCIe SSD cards as a cache to complement their own as well as other vendors' storage systems vs. replacing them. This is similar to what EMC and other mainstream storage vendors have done in the past, such as with SSD drives being used as a flash cache extension on CLARiiON or VNX based systems as well as a target or storage tier.

Various options and locations for SSD along with different usage scenarios
Various SSD locations, types, packaging and usage scenario options

Other vendors including IBM, NetApp and Oracle among others have also leveraged various packaging options of Single Level Cell (SLC) or Multi Level Cell (MLC) flash as caches in the past. A different example of SSD being used as a cache is the Seagate Momentus XT, which is a desktop, workstation and consumer type device. Seagate has shipped over a million of the Momentus XT, which uses SLC flash as a cache to complement and enhance the integrated HDD performance (a 750GB model with 8GB of SLC memory is in the laptop I'm using to type this).

One of the premises of caching solutions such as those mentioned above is to address the changing data access patterns and life cycles shown in the figure below.

Changing data access patterns and lifecycles
Evolving data access patterns and life cycles (more retention and reads)

Put a different way, instead of focusing on just big data, corner cases (granted some of those are quite large) or ultra large cloud scale out solutions, EMC with VFCache is also addressing their core business, which includes little data. What will be interesting to watch and listen to is how some vendors will start to jump up and down saying that they have been doing or enabling what EMC is announcing for some time. In some cases those vendors will be rightfully making noise about something that they should have made noise about before.

EMC is bringing the SSD message to the mainstream business and storage marketplace, showing how it is a complement to vs. a replacement of existing storage systems. By doing so, they will show how to spread the cost of SSD out across a larger storage capacity footprint, boosting the effectiveness and productivity of those systems. This means that customers who install the VFCache product can accelerate the performance of both their existing EMC as well as storage systems from other vendors, preserving their technology along with people skills investment.

Key points of VFCache

  • Combines PCIe SLC nand flash card (300GB) with intelligent caching management software driver for use in virtualized and traditional servers

  • Making SSD complementary to existing installed block based disk (and or SSD) storage systems to increase their effectiveness

  • Providing investment protection while boosting productivity of existing EMC and third party storage in customer sites

  • Brings caching closer to the application where the data is accessed while leveraging larger scale direct attached and SAN block storage

  • Focusing message for SSD back on to little data as well as big data for mainstream broad customer adoption scenarios

  • Leveraging the benefit and strength of SSD as a read cache and the scalability of underlying downstream disk for data storage

  • Reducing concerns around SSD endurance or duty cycle wear and tear by using as a read cache

  • Off loads underlying storage systems from some read requests enabling them to do more work for other servers

Additional related material:
Part II: EMC VFCache respinning SSD and intelligent caching

Ok, nuff said for now, I think I see some storm clouds rolling in

Cheers gs

IT and storage economics 101, supply and demand

In my 2012 (and 2013) industry trends and perspectives predictions I mentioned that some storage systems vendors who managed their costs could benefit from the current Hard Disk Drive (HDD) shortage. Most in the industry would say that is simply repeating what they have already said, however I have an alternate scenario. My scenario is that vendors who already make good (or great) margins on their HDD sales and who can manage their costs, including inventories, stand to make even more margin. There is a popular myth that there is no money or margin in HDDs or for those who sell them, which might be true for some.

Without going into any details, let's just say it is a popular myth, just like saying that there is no money in hardware or that all software and people services are pure profit. Ok, let's let sleeping dogs lie (at least for now).

Why will some storage vendors make more margin off of HDDs when everybody is supposed to be adopting or deploying solid state devices (SSD), or Hybrid Hard Disk Drives (HHDD) in the case of workstations, desktops or laptops? Simple, SSD adoption (and deployment) is still growing and there are a lot of demand generation incentives available. Likewise HDD demand continues to be strong, and with supplies affected, economics 101 says that some will raise their prices, manage their expenses and make more profits, which can be used to help fund or stimulate increased SSD or other initiatives.

Storage, IT and general Economics 101

Economics 101 or basics introduces the concept of supply and demand along with revenue minus costs = profits or margin. If there is no demand yet a supply of a product exists, then techniques such as discounting, bundling or other forms of adding value can be used to incentivize customers to make a purchase. Bundling can include offering some other product, service or offering, which could be as simple as an extended warranty, to motivate buyers. Beyond discounts, coupons, two for one deals, future buying credits, gift cards or memberships for frequent buyers (or flyers) are other forms of stimulating sales activity.

Likewise, if there is ample supply of or competition for a product or alternative in a given market, vendors or those selling the products, including value added resellers (VARs), may sacrifice margin (profits) to meet revenue as well as unit shipped (e.g. expand their customer and installed base footprint) goals.

Currently in the IT industry and specifically around data storage, even with increased and growing demand, adoption and deployment of SSD, there is also a large supply in different categories. For example, there are several fabrication facilities (FABs) that produce the silicon dies (e.g. chips) that form nand flash SSD memories, including Intel, Micron, the joint Intel and Micron Fab (IMF) and Samsung. Even with continued strong demand growth, the various FABs seem to have enough capacity, at least for now. Likewise manufacturers of SSD drive form factor products with SAS or SATA interfaces for attaching to existing servers, storage or appliances, including Intel, Micron, Samsung, Seagate, STEC and SanDisk among others, seem to be able to meet demand. Even PCIe SSD card vendors have come under pressure of supply and demand. For example, the high flying startup FusionIO recently saw its margins affected due to competition which includes Adaptec, LSI, Texas Memory Systems (TMS) and soon EMC among others. In the SSD appliance and storage system space there are even more vendors, with what amounts to about one every month or so coming out of stealth. Needless to say there will be some shakeout in the not so distant future.

On the other hand, if there is demand however limited supply, and assuming that the market will support it, prices can be increased from whatever discounted levels had applied. Assuming that costs are kept in line, any subsequent increase in average selling price (ASP) minus costs should result in higher margins.
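The revenue minus costs arithmetic above can be sketched as follows. All of the numbers and names are hypothetical and chosen only to show how a higher ASP can raise profit even when fewer units ship; they are not figures from any vendor.

```python
def sales_summary(asp: float, unit_cost: float, units: int) -> dict:
    """Revenue, costs, profit and margin percentage for one selling scenario."""
    revenue = asp * units
    costs = unit_cost * units
    profit = revenue - costs
    return {
        "revenue": revenue,
        "costs": costs,
        "profit": profit,
        "margin_pct": 100.0 * profit / revenue,
    }


# Hypothetical scenario: supply tightens, ASP rises 20 percent, units drop 10 percent.
before = sales_summary(asp=100.0, unit_cost=80.0, units=1000)
after = sales_summary(asp=120.0, unit_cost=80.0, units=900)

for label, result in (("before", before), ("after", after)):
    print(f"{label}: revenue {result['revenue']:,.0f}, "
          f"profit {result['profit']:,.0f}, margin {result['margin_pct']:.1f}%")
```

In this made up case fewer units are shipped, yet both profit and margin percentage go up because costs per unit were held steady while the ASP increased.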

Another variation is when there is strong demand and a shortage of supply, such as what is occurring with hard disk drives (HDD) due to recent flooding in Thailand. Not only do prices increase, there can also be changes to warranties or other services and incentives. Note that some HDD manufacturers such as Western Digital were more affected by the flooding than Seagate. Likewise the Thailand flooding was not limited to just HDDs, having also affected other electronic chip and component suppliers. Even though HDDs have been declared dead by many in the SSD camps along with their supporters, a record number of HDDs are produced every year. Note that economics 101 also tells us that even though more devices are produced and sold, that may not show a profit based on their cost and price. Like the CPU processor chips produced by AMD, Broadcom, IBM and Intel among others that are high volume with varying margins, the HDD and nand flash SSD market is also high volume with different margins.

As an example, Seagate recently announced strong profits due to a number of factors, even though enterprise drive supply and shipments were down while desktop drives were up. The many industry pundits who proclaimed a disaster for those involved with HDDs due to the shortage forgot about economics 101 (supply and demand). Sure, marketing 101 says that HDDs are dead and that if there is a shortage then more people will buy SSDs, however that also assumes that a) people are ready to buy more SSDs (e.g. demand), b) vendors or manufacturers have supply and c) those same vendors or manufacturers are willing to give up margin while reducing costs to boost profits.

Note that costs typically include selling, general and administrative, cost of goods, manufacturing, transportation and shipping, insurance, and research and development among others. If it has been a while since you looked at one, take a few minutes sometime to look at public companies and their quarterly Securities and Exchange Commission (SEC) financial filings. Those public filing documents are a treasure trove of information for those who sift through them, and are where many reporters, analysts and researchers find information for what they are working or speculating on. These documents show total sales, costs, profits and losses among other things. Something that vendors may not show in these public filings, which means you have to read between the lines or get the information elsewhere, is how many units were actually shipped or the ASP, which would give an idea of the amount of discounting that is occurring. Likewise sales and marketing expenses often get lumped into or under selling, general and administrative (SGA). A fun or interesting metric is to look at SGA dollars spent as a percentage of revenue and profits.

What I find interesting is to get an estimate of what it is costing an organization to do or sustain a given level of revenue and margin. For example, while some larger vendors may seem to spend more on selling and marketing, on a percentage basis they can easily be outspent by smaller startups. Granted, the larger vendor may be spending more actual dollars, however those are spread out over a larger sales and revenue base.
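The dollars vs. percentage point above can be sketched in a few lines. The two vendors and their figures are hypothetical, made up purely to show the calculation, and are not taken from any SEC filing.

```python
def sga_pct_of_revenue(sga_dollars: float, revenue_dollars: float) -> float:
    """SGA spending expressed as a percentage of revenue."""
    return 100.0 * sga_dollars / revenue_dollars


# Hypothetical quarterly figures in millions of dollars (assumptions, not real data).
vendors = {
    "large vendor": {"revenue": 5000.0, "sga": 1000.0},
    "small startup": {"revenue": 50.0, "sga": 40.0},
}

for name, figures in vendors.items():
    pct = sga_pct_of_revenue(figures["sga"], figures["revenue"])
    print(f"{name}: SGA ${figures['sga']:,.0f}M = {pct:.0f}% of revenue")
```

Here the large vendor spends 25 times more actual dollars, yet the startup is spending four times as much of every revenue dollar on SGA.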

What does this all mean?

Look at multiple metrics that have both a future trend or forecast as well as a trailing or historical perspective. Look at both percentages and dollar amounts, as well as both revenue and margin, while also keeping units or number of devices (or copies) sold in perspective. For example, it's interesting to know if a vendor's sales were down (or up) 10% quarter over quarter, versus the same quarter a year ago, or year over year. It is also interesting to keep the margin in perspective along with SGA costs in addition to the cost of product acquired for sale. Also important, if sales were down yet margins are up, is to gauge how many devices or copies were sold to get a sense of an expanding footprint, which could also be a sign of future annuity (follow up sales opportunities). What I'm watching over the next couple of quarters is how some vendors leverage the Thailand flooding and HDD as well as other electronic component supply shortages to meet demand by managing discounts, costs and other items that contribute to enhanced margins.

Rest assured there is a lot more to IT and storage economics, including advanced topics such as Return on Investment (ROI) or Return on Innovation (The new ROI) and Total Cost of Ownership (TCO) among others that maybe we will discuss in the future.

Ok, nuff fun for now, let's get back to work.

Cheers gs

My Server and Storage IO holiday break projects

Happy New Year!

Following up from a flurry of posts in the closing days of 2011 including industry trends perspective predictions for 2012 and 2013, top blog posts from 2011, top all time posts, along with a couple of other items here and here, it's time to get back to 2012 activity. Also if you missed it, here is the Fall (December) 2011 StorageIO newsletter.

Actually I have been busy working on some other projects the past several weeks most of which are NDA so not much else can be said about them, however there are some other things I’m working on that will show themselves in the weeks and months to come. Here is a link to a webinar and live chat that I did the first week of January on CDP (Continuous Data Protection) and how it can be applied to many different environments.

But let's take a step back for a moment and let me share with you some of the things I did or started during the holiday break between Christmas and New Year's.

Like many others, I found time to relax and get away from normal work activities during the recent holiday season.

However, like many of you who may also be techies or geeks or wanna be geeks at heart, I could not get away from server, storage, IO, networking, data protection, video and other things completely. I used some time to work on a few projects that I had wanted to do or that I had started before the holidays, and here is a synopsis.

Increased storage capacity on a DVR by about 5x. In order to get this to work, I modified a 3.5 inch enclosure with a power supply to accept a 2.5 inch 1.5TB SATA HDD with an eSATA connection. The easy part was then attaching it to the external eSATA port on my DVR. The hard part was then waiting for the DVR to reconfigure and start recording information again. Also part of upgrading the external storage on the DVR was getting the media share option to do more than basic things, leveraging audio and video real-time transcoding using the TVersity software along with various codecs on a media server.

Another project involved upgrading a 500GB HHDD to a 750GB HHDD and doing some testing. Shortly before the holidays I received a new 750GB Seagate Momentus XT II HHDD to compare to my existing 500GB previous generation model. I have been using the 750GB HHDD for over a month now and it is amazing to see so much space in a laptop that also has good performance. Some follow-up activities are to go back and analyze some performance data that I collected before and after the upgrade. This includes both workload simulation of reads, writes, random and sequential access at different IO sizes as well as comparing Windows startup and shutdown speed and impact, to build on what I did last summer (see this post). More on these in the not so distant future.

Speaking of clouds, I had a chance to do some more testing with my Amazon EC2 and EBS accounts as well as cleaning up my S3 pool and my other cloud backup and storage provider accounts. This also involved refining some data protection backup/restore and archive frequency and retention settings. In addition to refinements for cloud based backup, I’m also in the process of transitioning from Imation Odyssey Removable Hard Disk Drives (RHDD) to much larger capacity 2.5 inch portable RHDDs that are used for offsite bulk copies. Part of the migration includes seeing that end of year master or gold backups and archives were made and safely secured elsewhere in addition to having data sent to the cloud.

Another project involved doing some more testing and simulations with my SSD along with more of the Windows boot and shutdown tests mentioned above. More on these results in a future post.

Some time (actually not very much) was also spent adding some new shares to my Iomega IX4 NAS, which is filling up, so I also did some more research on what I will upgrade or replace it with. While Iomega has been great (knock on wood), Synology is also looking interesting as a future solution, however I am keeping my options open for now. Right now I’m leaning towards keeping the IX4 and adding another NAS filer, using the two for different purposes.

Some other server, storage and IO projects also included upgrading some networking components, and to finish decommissioning old drives making them secure for safe disposal when the time comes.

I also was able to spend time on non tech items including outside enjoying the nice weather, cutting up some fallen trees and roasting them on a bonfire among other things.

Tree cleanup, on break

Roasting logs, walking on frozen water

Ok, nuff said for now, time to get back to work.

Cheers gs

What industry pundits love and loathe about data storage

Drew Robb has a good article about what IT industry pundits including vendors, analysts, and advisors love and loathe about data storage, including comments from myself.

In the article Drew asks: What do you really love about storage and what are your pet peeves?

One of my comments and perspectives is that I like Hybrid Hard Disk Drives (HHDDs) in addition to traditional Hard Disk Drives (HDD) along with Solid State Devices (SSDs). As much as I like HHDDs, I also believe that, as with any technology, they are not the best solution for everything, however they can also be used in more ways than are commonly seen. Here is the fifth installment of a series on HHDDs that I have done since June 2010 when I received my first HHDD, a Seagate Momentus XT. You can read the other installments of my momentus moments here, here, here and here.

Seagate Momentus XT
HHDD with integrated nand flash SSD photo courtesy Seagate.com

Molly Rector, VP of marketing at tape vendor Spectra Logic, mentioned that what she does not like is companies that base their business plan on patent law trolling. I would have expected something different, along the lines of countering or correcting people that say tape sucks, tape is dead, or that tape is the cause of anything wrong with storage, thus clearing the air or putting up a fight for tape. Go figure…

Another of my comments involved clouds, of which there are plenty of conversations taking place. I do like clouds (I even recently wrote a book involving them), however I'm a fan of using them where applicable to coexist with and enhance other IT resources. Don't be scared of clouds, however be ready, do your homework, listen, learn, and do proof of concepts to decide best practices as well as when, where, what and how to use them.

Speaking of clouds, click here to read about who is responsible for cloud data loss and cast your vote, along with viewing what do you think about IT clouds in general here.

Mike Karp (aka twitter @storagewonk), an analyst with Ptak Noel, mentions that midrange environments don't get respect from big (or even startup) vendors.

I would take that a step further by saying compared to six or so years ago, SMB are getting night and day better respect along with attention by most vendors, however what is lacking is respect of the SOHO sector (e.g. lower end of SMB down to or just above consumer).

Granted, some that have traditionally sold into those sectors, such as server vendors including Dell and HP, get it or at least see the potential, along with traditional enterprise vendor EMC via its Iomega business. Yet I still see many vendors including startups in general discounting, shrugging off or sneering at the SOHO space, similar to those who dissed or did not respect the SMB space several years ago. Similar to the SMB space, SOHO requires different products, packaging, pricing and routes to market via channel or etail mechanisms, which means change for some vendors. Those vendors who embraced the SMB and realized what needed to change to adapt to those markets will also stand to do better with the SOHO.

Here is the reason that I think SOHO needs respect.

Simple, SOHOs grow up to become SMBs, SMBs grow up to become SMEs, and SMEs grow up to become enterprises, not to mention that the amount of data being generated, moved, processed and stored continues to grow. The net result is that SMB along with SOHO storage demands will continue to grow, and those vendors who can adjust to support those markets will also stand to gain new customers that in turn can become plans for other solution offerings.

Cloud conversations

Not surprisingly, Eran Farajun of Asigra, which has been doing cloud backups decades before they were known as clouds, loves backup (and restores). However I am surprised that Eran did not jump on the "it's time to modernize and re-architect data protection" theme. Oh well, will have to have a chat with Eran on that sometime.

What was surprising were comments from Panzura, who has a good distributed (e.g. read also cloud) file system that can be used for various things including online reference data. Panzura has a solution that normally I would not even think about in the context of being pulled into a Data Domain or dedupe appliance type discussion (e.g. tape sucks or other similar themes). So it is odd that they are playing to the tape sucks camp and theme vs. playing to where the technology can really shine, which IMHO is in the global, distributed, scale out and cloud file system space. Oh well, I guess you go with what you know or what has worked in the past to get some attention.

Molly Rector of Spectra also mentioned that she likes High Performance Computing, surprised that she did not throw in high productivity computing as well in conjunction with big data, big bandwidth, green, dedupe, power, disk, tape and related buzzword bingo terms.

Also there are some comments from myself about cost cutting.

While I see the need for organizations to cut costs during tough economic times, I'm not a fan of simply cutting cost for the sake of cost cutting as opposed to finding and removing complexity, which in turn removes costs of doing work. In other words, I'm a fan of finding and removing waste, becoming more effective and productive along with removing the cost of doing a particular piece of work. This in the end meets the aim of bean counters to cut costs, however it can be done in a way that does not degrade service levels or customer service experience. For example, instead of looking to cut backup costs, do you know where the real costs of doing data protection exist (hint, swapping out media is treating the symptoms) and if so, what can be done to streamline those from the source of the problem downstream to the target (e.g. media or medium)? In other words, redesign, review and modernize how data protection is done, and leverage data footprint reduction (DFR) techniques including archive, compression, consolidation, data management, dedupe and other technologies in effective and creative ways. After all, return on innovation is the new ROI.

Check out Drew's article here to read more on the above topics and themes.

Ok, nuff said for now

Cheers gs

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

New Seagate Momentus XT Hybrid drive (SSD and HDD)

Seagate recently announced the next generation Momentus XT Hybrid Hard Disk Drive (HHDD) with a capacity of 750GB in a 2.5 inch form factor and MSRP of $245.00 USD including integrated NAND flash solid state device (SSD). As a refresher, the Momentus XT is a HHDD in that it includes a 4GB nand flash SSD integrated with a 500GB (or larger) 7,200 RPM hard disk drive (HDD) in a single 2.5 inch package.

Seagate Momentus XT
HHDD with integrated nand flash SSD photo courtesy Seagate.com

This is the fifth installment of a series that I have done since June 2010 when I received my first HHDD a Seagate Momentus XT. You can read the other installments of my momentus moments here, here, here and here.

What is new with the new generation?
Besides extra storage space capacity up to 750GB (was 500GB), there is twice as much single level cell (SLC) nand flash memory (8GB vs. 4GB in previous generation) along with an enhanced interface using 6Gb per second SATA that supports native command queuing (NCQ) for better performance. Note that NCQ was available on the previous generation Momentus XT that used a 3Gb SATA interface. Other enhancements include a larger block or sector size of 4096 bytes vs. traditional 512 bytes on previous generation storage devices.
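To put the sector size change in perspective, here is a minimal sketch of the arithmetic. It assumes the 750GB capacity is counted in decimal (vendor) gigabytes and that every sector is either 512 or 4096 bytes; the function and variable names are mine for illustration, not anything from Seagate.

```python
def sector_count(capacity_bytes: int, sector_bytes: int) -> int:
    """Number of whole sectors that fit in the given capacity."""
    return capacity_bytes // sector_bytes


# Assumption: 750GB means 750 * 10^9 bytes (decimal, as drive vendors count).
CAPACITY_BYTES = 750 * 10**9

legacy_sectors = sector_count(CAPACITY_BYTES, 512)     # traditional sector size
larger_sectors = sector_count(CAPACITY_BYTES, 4096)    # new generation sector size

print(f"512 byte sectors:  {legacy_sectors:,}")
print(f"4096 byte sectors: {larger_sectors:,}")
print(f"Roughly {legacy_sectors / larger_sectors:.0f}x fewer blocks to track")
```

Under those assumptions the drive has about one eighth as many blocks to address and manage, since 4096 is eight times 512.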

This bigger sector size results in less overhead with managing data blocks on large capacity storage devices. Also new are caching enhancements: FAST Factor Flash Management, FAST Factor Boot and Adaptive Memory Technology. Not to be confused with EMC Fully Automated Storage Tiering (the other FAST), Seagate FAST is technology that exists inside the storage drive itself. FAST Factor Boot enables systems to boot and be productive with speeds similar to SSD, or several times faster than traditional HDDs.

The FAST Factor Flash Management provides the integrated intelligence to maximize use of the nand flash or SSD capabilities along with the spinning HDD to boost performance and keep up compatibility with different systems and their operating systems. In addition to performance and interoperability, data integrity and SSD flash endurance are also enhanced for investment protection. The Adaptive Memory technology is a self learning algorithm that gives SSD like performance for often used applications and data, to close the storage capacity to performance gap that has increased along with data center bottlenecks.

Some questions and discussion comments:

When to use SSD vs. HDD vs. HHDD?
If you need the full speed of SSD to boost performance across all data access, and cost is not an issue for the available capacity, that is where you should be focused. However, if you are looking for the lowest total cost of storage capacity with no need for performance, then lower cost high capacity HDDs should be on your shopping list. On the other hand, if you want a mix of performance and capacity at an effective price, then HHDDs should be considered.

Why the price jump compared to first generation HHDD?
IMHO, it has a lot to do with current market conditions, supply and demand.

With recent floods in Thailand and forecasted HDD and other technology shortages, the law of supply and demand applies. This means that the supply may be constrained for some products, causing demand to rise for others. Your particular vendor or supplier may have inventory, however they will be less likely to discount heavily while there are shortages or market opportunities to keep prices high. There are already examples of this if you check around on various sites to compare prices now vs. a few months ago. Granted, it is the holiday shopping season for both people and organizations spending the last of their available budgets, so there is more demand for available supplies.

What kind of performance or productivity have I seen with HHDDs?
While I have not yet tested and compared the second generation or new devices, I can attest to the performance improvements resulting in better productivity over the past year using Seagate Momentus XT HHDDs compared to traditional HDDs. Here is a post that you can follow to see some boot performance comparisons as part of some virtual desktop infrastructure (VDI) sizing testing I did earlier this year that included both HHDD and HDD.

| Metric                  | HHDD desktop 1 | HDD desktop 1   | HHDD desktop 2 |
| Avg. IOPS               | 334            | 69 to 113       | 186 to 353     |
| Avg. MByte sec          | 5.36           | 1.58 to 2.13    | 2.76 to 5.2    |
| Percent IOPS read       | 94             | 80 to 88        | 92             |
| Percent MBs read        | 87             | 63 to 77        | 84             |
| Mbytes read             | 530            | 201 to 245      | 504            |
| Mbytes written          | 128            | 60 to 141       | 100            |
| Avg. read latency       | 2.24ms         | 8.2 to 9.5ms    | 1.3ms          |
| Avg. write latency      | 10.41ms        | 20.5 to 14.96ms | 8.6ms          |
| Boot duration (seconds) | 120            | 120 to 240      | 120            |

Click here to read the entire post about the above table
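One thing the numbers above let you derive is the average I/O size, which is simply throughput divided by I/O rate. The following is a rough sketch using the HHDD desktop 1 averages; the helper name is mine, and it assumes 1 MByte equals 1024 KBytes, which the original measurements do not state.

```python
def avg_io_size_kb(mbytes_per_sec: float, iops: float) -> float:
    """Average I/O size in KBytes (assumes 1 MByte = 1024 KBytes)."""
    return mbytes_per_sec * 1024 / iops


# Averages from the HHDD desktop 1 column: 5.36 MByte/sec at 334 IOPS.
hhdd_desktop1_kb = avg_io_size_kb(5.36, 334)

print(f"HHDD desktop 1 average I/O size: {hhdd_desktop1_kb:.1f} KBytes")
```

That works out to a bit over 16 KBytes per I/O for this boot workload. The same calculation is a handy sanity check when comparing your own measurements against vendor benchmarks that use a different I/O size.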

When will I jump on the SSD bandwagon?
Great question, I have actually been on the SSD train for several decades using them, selling them, covering, analyzing and consulting around them along with other storage mediums including HDD, HHDD, cloud and tape. I have some SSDs and will eventually put them into my laptops, workstations and servers as primary storage when the opportunity makes sense.

Will HHDDs help backup and other data protection tasks?
Yes, in fact I initially used my Momentus XTs as backup or data protection targets, as well as for moving large amounts of data between systems faster than what my network could support.

Why not use a SSD?
If you need the performance and can afford the price, go SSD!

On the other hand, if you are looking to add a small 64GB, 128GB or even 256GB SSD while retaining a larger capacity, slower and lower cost HDD, an HHDD should be considered as an option. By using an HHDD instead of both an SSD and an HDD, you cut the need to figure out how to install both in space constrained laptops, desktops or workstations. In addition, you cut the need to either manually move data between the different devices or acquire software or drivers to do that for you.

How much does the new Seagate Momentus XT HHDD cost?
The Manufacturer's Suggested Retail Price (MSRP) is listed at $245 for the 750GB version.

Does the Momentus XT HHDD need any special drivers, adapters or software?
No, they are plug and play. There is no need for caching or performance acceleration drivers, utilities or other software. Likewise, there is no need for tiering or data movement tools.

How do you install an HHDD into an existing system?
It is similar to installing a new HDD to replace an existing one, if you are familiar with that process. If not, it goes like this (or use your own preferred approach).

  • Attach a new HHDD to an existing system using a cable
  • Utilize a disk clone or image tool to make a copy of the existing HDD to HHDD
  • Note that the system may not be able to be used during the copy, so plan ahead.
  • After the clone or image copy is made, shut down the system, remove the existing HDD and replace it with the HHDD that was connected to the system during the copy (remember to remove the copy cable).
  • Reboot the system to verify all is well, note that it will take a few reboots before the HHDD will start to learn your data and files along with how they are used.
  • Regarding your old HDD, save it, put it in a safe place and use it as a disaster recovery (DR) backup. For example if you have a safe deposit box or somewhere else safe, put it there for when you will need it in the future.


Seagate Momentus XT and USB to SATA cable

Can an HHDD fit into an existing slot in a laptop, workstation or server?
Yes. In fact, unlike an HDD and SSD combination that requires multiple slots or forces one device to be external, HHDDs like the Momentus XT simply use the space where your current HDD is installed.

How do you move data to it?
Beyond the first installation described above, the HHDD appears as just another local device meaning you can move data to or from it like any other HDD, SSD or CD.

Do you need automated tiering software?
No, not unless you need it for some other reason or if you want to use an HHDD as the lower cost, larger capacity option as a companion to a smaller SSD.

Do I have any of the new or second generation HHDDs?
Not yet, maybe soon, and I will do another momentus moment post when that time arrives. For the time being, I will continue to use the first generation Momentus XT HHDDs.

Bottom line (for now): if you are considering large capacity HDDs, check out the HHDDs for an added performance boost, including faster boot times as well as quicker access to other data.

On the other hand, if you want an SSD but your budget restricts you to a smaller capacity version, look into how an HHDD can be a viable option for some of your needs.

Ok, nuff said

Cheers gs

Speaking of speeding up business with SSD storage

Solid state devices (SSD) are a popular topic gaining both industry adoption and customer deployment to speed up storage performance. Here is a link to a recent conversation that I had with John Hillard to discuss industry trends and perspectives pertaining to using SSD to boost performance and productivity for SMB and other environments.

I/O consolidation from Cloud and Virtual Data Storage Networking (CRC Press) www.storageio.com/book3.html

SSDs can be a great way for organizations to do IO consolidation to reduce costs in place of using many hard disk drives (HDDs) grouped together to achieve a certain level of performance. By consolidating the IOs off of many HDDs that often end up being under utilized from a space capacity basis, organizations can boost performance for applications while reducing, or reusing HDD based storage capacity for other purposes including growth.
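As a back-of-the-envelope illustration of IO consolidation, the sketch below counts how many devices it takes to reach a performance target when sizing by IOPS alone. All of the numbers (target IOPS, per-device IOPS) are hypothetical placeholders I picked for the example, not measurements, so substitute figures from your own environment.

```python
import math


def drives_needed(target_iops: int, iops_per_drive: int) -> int:
    """Devices required to meet an IOPS target, ignoring capacity and RAID overhead."""
    return math.ceil(target_iops / iops_per_drive)


# Hypothetical values for illustration only.
TARGET_IOPS = 10_000        # what the application needs
HDD_IOPS = 150              # assumed per-HDD capability
SSD_IOPS = 20_000           # assumed per-SSD capability

hdd_count = drives_needed(TARGET_IOPS, HDD_IOPS)
ssd_count = drives_needed(TARGET_IOPS, SSD_IOPS)

print(f"HDDs needed for {TARGET_IOPS} IOPS: {hdd_count}")
print(f"SSDs needed for {TARGET_IOPS} IOPS: {ssd_count}")
```

With those assumed figures, dozens of HDDs bought for performance (and left mostly empty) could be replaced by a single SSD, freeing the HDD capacity for other uses. A real sizing exercise would also account for capacity, availability and read vs. write mix.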

Here is some related material and comments:
Has SSD put Hard Disk Drives (HDDs) On Endangered Species List?
SSD and Storage System Performance
Are Hard Disk Drives (HDDs) getting too big?
Solid state devices and the hosting industry
Achieving Energy Efficiency using FLASH SSD
Using SSD flash drives to boost performance

Four ways to use SSD storage
4 trends that shape how agencies handle storage
Giving storage its due

You can read a transcript of the conversation and listen to the pod cast here, or download the MP3 audio here.

Ok, nuff said about SSD (for now)

Cheers Gs

2011 Summer momentus hybrid hard disk drive (HHDD) moment

This is the fourth in a series of posts (others are here, here and here) that I have been doing for over a year now, taking a moment now and then to share some of my experiences with using hybrid hard disk drives (HHDD) alongside my hard disk drives (HDD) and solid state drives (SSD).

It has been several months now since applying the latest firmware (SD25), which resulted in even better stability that was further enhanced when upgrading a few months ago to Windows 7 on all systems with the Seagate Momentus XT HHDD installed in them. One additional older system was recently upgraded from a slower, lower capacity 3.5 inch form factor SATA HDD to a physically smaller 2.5 inch HHDD. The net result is that the system now boots in a fraction of the time, shuts down faster, work on it is much more productive and capacity was increased by three and a half times.

Why use an HHDD when you could get an SSD?

With flash SSD devices continuing to become more affordable for a given price capacity point, why did I not simply install some of those devices instead of using the HHDDs?

With the money saved from buying the 500GB Momentus XT on Amazon.com (under $100 USD) vs. buying a smaller capacity SSD, I was also able to double the amount of DRAM in that system furthering its useful life plus buying some time to decide what to replace it with while having extra funds for other projects.

Sure I would like to have more and larger capacity SSDs to go along with those I already have, however there is balancing budget with needs and improving productivity (needs vs. wants).

To expand more on why the HHDD at this time vs. SSD: I want some more SSD devices to coexist with those I already have and use for different functions. Looking to stretch my budget further, the HHDDs are a great balance of being almost, and in some cases as, fast as SSDs while at the cost of a high capacity HDD. In other words, I'm getting the best of both worlds, which is a 7,200 RPM 2.5 inch 500GB HDD (e.g. for space capacity) that has 4GB of single level cell (SLC) flash (e.g. SSD) and 32MB of DRAM as buffers to help speed up read and write operations.

Given what I'm using them for, I do not need the consistent higher performance of an SSD across all of my data, which brings up the other benefit: I'm able to retain more data on the device as a buffer or cache instead of having to go to a NAS or other storage repository to get it. Even though the amount of data being stored on the HHDD is increasing, not all of it gets backed up locally or to my cloud provider, as there is already a copy (or copies) elsewhere. Instead, a small subset of data that is changing or very important gets routinely protected locally and remotely to the cloud, enabling easier and faster restores when needed. Now if you have a large budget or someone is willing to buy or give you one, sure, go ahead and get one of the high capacity SSDs (preferably SLC based if concerned about endurance), however there are some good MLC ones out there as well.

Step back a bit, what is an HHDD?

Hybrid hard disk drives (HHDDs) such as the Seagate Momentus XT are, as their name implies, a combination of large- to medium-capacity HDDs with FLASH SSDs. The result is a mix of performance and capacity in a cost effective footprint. HHDDs have not seen much penetration in the enterprise space and may not see much more, given how many vendors are investing in the firmware and associated software technology to achieve hybrid results using a mix of SSDs and high capacity disk drives, along with the lack of awareness that HHDDs exist.

Where HHDDs could have some additional traction is in secondary or near-line solutions that need some performance enhancements while having a large amount of capacity in a cost-effective footprint. For now, HHDDs are appearing mainly in desktops, laptops, and workstations that need lots of capacity with some performance but without the high price of SSDs. Before I installed the HHDDs in my laptops, I initially used one as a backup and data movement device, and I found that large, gigabyte-sized files could be transferred as fast as with SSDs and much faster than via my WiFi based network and NAS. The easiest way to characterize where HHDDs fit is where you want an SSD for performance, but your applications do not always need speed and you need a large amount of storage capacity at an affordable price.

SSDs are part of the future, however HDDs have a lot of life in them, including increased capacities. Both are best used where their strengths can be maximized, thus HHDDs are a great complement or stepping stone for some applications. Note, Seagate recently announced that they have shipped over one million HHDDs in just over a year's time.

I do find it interesting though when I hear from those who claim that the HDD is dead and that SSD is the future yet they do not have SSDs in their systems let alone do they have or talk about HHDDs, hmmmm.

Ok, nuff said for now.

Cheers gs

Measuring Windows performance impact for VDI planning

Here is a link to a recent guest post that I was invited to do over at The Virtualization Practice (TVP) pertaining to measuring the impact of Windows Boot performance and what that means for planning for Virtual Desktop Infrastructure (VDI) initiatives.

With adoption of Virtual Desktop Infrastructure (VDI) initiatives being a popular theme associated with cloud and dynamic infrastructure environments, a related discussion point is the impact on networks, servers and storage during boot or startup activity, and how to avoid bottlenecks. VDI solution vendors include Citrix, Microsoft and VMware along with various server, storage, networking and management tools vendors.

A common storage and network related topic involving VDI is boot storms, when many workstations or desktops all start up at the same time. However, any discussion around VDI and its impact on networks, servers and storage should also be expanded from read centric boots to write intensive shutdown or maintenance activity as well.

Having an understanding of what your performance requirements are is important to adequately design a configuration that will meet your Quality of Service (QoS) and service level objectives (SLOs) for VDI deployment, in addition to knowing what to look for in candidate server, storage and networking technologies. For example, knowing how your different desktop applications and workloads perform on a normal basis provides a baseline to compare with during busy periods or times of trouble. Another benefit is that when shopping for storage systems, for example, and reviewing various benchmarks, knowing what your actual performance and application characteristics are helps to align the applicable technology to your QoS and SLO needs while avoiding apples to oranges benchmark comparisons.

Check out the entire piece including some test results using the hIOmon tool from hyperIO to gather actual workstation performance numbers.

Keep in mind that the best benchmark is your actual applications running as close as possible to their typical workload and usage scenarios.

Also keep in mind that fast workstations need fast networks, fast servers and fast storage.

Ok, nuff said for now.

Cheers gs

Are Hard Disk Drives (HDDs) getting too big?

Let's start out by clarifying something in terms of context or scope: big means storage capacity, as opposed to the physical packaging size of hard disk drives (HDDs), which is getting smaller.

So are HDDs in terms of storage capacity getting too big?

The question of whether HDD storage capacity is getting too big to manage comes up every few years, and it is the topic of Rick Vanover's (aka twitter @RickVanover) Episode 27 podcast: Are hard drives getting too big?

Veeam community podcast guest appearance

As I discuss in this pod cast with Rick Vanover of Veeam, with the 2TB and even larger future 4TB, 8 to 9TB, 18TB, 36TB and 48 to 50TB drives not many years away, sure they are getting bigger (in terms of capacity), however we have been here before (or at least some of us have). We discuss how back in the late 90s HDDs were going from 5.25 inch to 3.5 inch (now they are going from 3.5 inch to 2.5 inch), and 9GB drives were big and seen as a scary proposition by some for doing RAID rebuilds, drive copies or backups among other things, not to mention putting too many eggs (or data) in one basket.

In some instances vendors have been able to combine various technologies, algorithms and other techniques to RAID rebuild a 1TB or 2TB drive in the same or less amount of time as it used to take to process a 9GB HDD. However those improvements are not enough and more will be needed leveraging faster processors, IO busses and back planes, HDDs with more intelligence and performance, different algorithms and design best practices among other techniques that I discussed with Rick. After all, there is no such thing as a data recession with more information to be generated, processed, moved, stored, preserved and served in the future.
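To see why rebuild speed has to scale along with capacity, here is a simple sketch that treats a rebuild as a straight sequential copy (time equals capacity divided by sustained rate). The 5 MByte per second rate for the older 9GB drive is a made-up figure for illustration, and real rebuilds also depend on RAID level, controller load and competing application IO.

```python
def rebuild_hours(capacity_gb: float, rate_mb_per_sec: float) -> float:
    """Best-case rebuild time in hours for a purely sequential copy (1GB = 1000MB)."""
    return capacity_gb * 1000 / rate_mb_per_sec / 3600


def rate_needed_mb_per_sec(capacity_gb: float, hours: float) -> float:
    """Sustained rate required to rebuild the given capacity within the given hours."""
    return capacity_gb * 1000 / (hours * 3600)


# Hypothetical: a 9GB drive rebuilt at an assumed 5 MByte/sec.
old_drive_hours = rebuild_hours(9, 5)

# Rate a 2TB (2000GB) drive would need to finish in that same time.
needed_rate = rate_needed_mb_per_sec(2000, old_drive_hours)

print(f"9GB rebuild at 5 MB/sec: {old_drive_hours:.1f} hours")
print(f"2TB rebuild in the same time needs about {needed_rate:.0f} MB/sec")
```

Under these assumptions a 2TB drive needs a sustained rate a couple of hundred times higher to match the old rebuild window, which is why smarter algorithms and designs matter in addition to raw speed.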

If you are interested in data storage, check out Rick's pod cast and hear some of our other discussion points, including how SSD will help keep the HDD alive, similar to how HDDs are offloading tape from its traditional backup role, each with its changing or expanding focus among other things.

On a related note, here is a post about RAID remaining relevant yet continuing to evolve. We also talk about Hybrid Hard Disk Drives (HHDD), where in a single sealed HDD device there is flash and DRAM along with a spinning disk, all managed by the drive's internal processor with no external special software or hardware needed.

Listen to comments by Greg Schulz of StorageIO on HDD, HHDD, SSD, RAID and more

Put on your head phones (or not) and check out Rick's pod cast here (or on the head phone image above).

Thanks again Rick, really enjoyed being a guest on your show.

What's your take: are HDDs getting too big in terms of capacity, or do we need to leverage other tools, technology and techniques to be more effective in managing an expanding data footprint, including use of data footprint reduction (DFR) techniques?

Ok, nuff said for now.

Cheers gs

Using Removable Hard Disk Drives (RHDDs)

Removable hard disk drives (RHDD) are a form of removable media, a category that also includes magnetic tape, and they address many common use cases. Usage scenarios include enabling bulk data portability for larger environments, or D2D backup where the media needs to be physically moved offsite for small and mid sized environments. RHDDs include, among others, those from Imation such as the Odyssey (which is what I use) and the Prostor RDX (OEMed by Imation and others). Because they are removable and portable, RHDD, tape and other forms of portable media, including those that use flash, should presumably have some extra packaging protection to safeguard against static shock, in addition to supporting encryption capabilities.

Compared to disks including RHDD, tape for most and particularly larger environments should have an overall lower media cost for parking, preserving and, when needed, serving inactive or archived data (e.g. the changing role of tape from day to day backup to archive). Of course your real costs will vary by use, in addition to how tape is combined with data footprint reduction and other technologies.

A big benefit of RHDDs is that they are random access, meaning data can be searched and found quickly, vs. tape media, which has great sequential or streaming capabilities if you have a system that can support that ability. The other benefit of RHDD is that, depending on their implementation, they should plug and play with your systems, appearing as disk without any extra drivers, configuration or software tools, making for ease of use. Being removable, they can be used for portability such as sending data to a cloud or MSP as part of an initial bulk copy, or sending data offsite or taking it home as part of an offsite backup, data protection or BC/DR strategy, as well as being used for archiving. The warning with RHDD is that their cost per TByte will generally be higher than tape, as well as having to have a docking station or specific drive interface depending on the specific product and configuration.

RHDD are a great complement to traditional fixed or non removable disk, Hybrid Hard Disk Drive (HHDD) and Solid State Device (SSD) based storage, and can coexist with cloud or MSP backup and archive solutions. The smaller the environment, the more affordable RHDD becomes vs. tape for backup and archive operations or when portability is required. Even if using a cloud or managed service provider (MSP) for backup, network bandwidth costs, availability or performance may limit the amount of data that can be moved in a cost effective way. For example, you can place an archive and gold or master copy of your static data on a RHDD that is kept in a safe place on site or off-site, and then send data that is routinely changed to the cloud or MSP provider (think full local and offsite plus partial full and incremental in the cloud).

By leveraging archiving and data footprint reduction (DFR) techniques including dedupe and compression, you can stretch your budget by sending less data to cloud or MSP services while using removable media for data protection. You would be surprised how many TBytes of data can be kept in a safe deposit box. For my own business, I have used RHDDs for several years to keep gold master copies as well as archives offsite as part of a disk to disk (D2D) or D2D2RHDD strategy. The data protection strategy is also complemented by sending active data to a cloud backup MSP (encrypted of course). It might be belt and suspenders, however it is also eating my own dog food, practicing what I talk about, and the approach has proven itself a few times.

Here are some related links to more material:
Removable disk drives vs. tape storage for small businesses
The pros and cons of removable disk storage for small businesses
Removable storage media appealing to SMBs, but with caveats
StorageIO Momentus Hybrid Hard Disk Drive (HHDD) Moments

Ok, nuff said for now

Cheers Gs

Greg Schulz – Author The Green and Virtual Data Center (CRC), Resilient Storage Networks (Elsevier) and coming summer 2011 Cloud and Virtual Data Storage Networking (CRC)