StorageIO Momentus Hybrid Hard Disk Drive (HHDD) Moments

This is the third in a series of posts that I have done about Hybrid Hard Disk Drives (HHDDs) along with pieces about Hard Disk Drives (HDDs) and Solid State Devices (SSDs). Granted, the HDD received its AARP card several years ago when it turned 50 and is routinely declared dead (or read here), even though it continues to evolve alongside maturing SSDs, with both expanding into different markets and usage roles.

For those who have not read previous posts about Hybrid Hard Disk Drives (HHDDs) and the Seagate Momentus XT you can find them here and here.

Since my last post, I have been using the HHDDs extensively and recently installed the latest firmware. The new HHDD firmware release from Seagate for the Momentus XT (SD25), like its predecessor SD24, cleaned up some annoyances and improved overall stability. Here is a Seagate post by Mark Wojtasiak discussing SD25 and feedback obtained from customers via the Momentus XT forum.

If you have never done an HDD firmware update, it's not as bad or intimidating as might be expected. The Seagate firmware update tools make it very easy, assuming you have a recent good backup of your data (one that can be restored) and about 10 to 15 minutes for a couple of reboots.

Speaking of stability, the Momentus XT HHDDs have been performing well, helping to speed up access to large documents on various projects including those for my new book. Granted, an SSD would be faster across the board, however the large capacity at the price point of the HHDD is what makes it a hybrid value proposition. As I have said in previous posts, if you have the need for speed all of the time and time is money, get an SSD. Likewise, if you need as much capacity as you can get and performance is not your primary objective, then leverage the high capacity HDDs. On the other hand, if you need a balance of some performance boost with capacity boost and a good value, then check out the HHDDs.

Image of Momentus XT courtesy of www.Seagate.com

Let's shift gears from the product or technology to common questions that I get asked about HHDDs:

What is a Hybrid Hard Disk Drive?

A Hybrid Hard Disk Drive combines a rotating HDD, persistent solid state flash memory and volatile dynamic random access memory (DRAM) in an integrated package or product. The value proposition and benefit is a balance of performance and capacity at a good price for those environments, systems or applications that do not need all-SSD performance (and cost) but do need some performance boost in addition to large capacity.

How does the Seagate Momentus XT differ from other Hybrid Disks?
One approach is to take a traditional HDD and pair it with an SSD using a controller packaged in various ways. For example, on a large scale, HDDs and SSDs coexist in the same tiered storage system, managed by the controllers, storage processors or nodes in the solution, including automated tiering and cache promotion or demotion. The main difference between those storage systems, tiering and pairing approaches and HHDDs is that in the case of the Momentus XT the HDD, SLC flash (SSD functionality), RAM cache and their management are all integrated within the disk drive enclosure.

Do I use SSDs and HDDs or just HHDDs?
I have HHDDs installed internally in my laptops. I also have HDDs installed in servers, NAS and disk to disk (D2D) backup devices and Digital Video Recorders (DVRs), along with external SSDs and Removable Hard Disk Drives (RHDDs). The RHDDs are used for archive and master or gold copy data protection that goes offsite, complementing how I also use cloud backup services as part of my data protection strategy.

What are the technical specifications of a HHDD such as the Seagate Momentus XT?
3Gb/s SATA interface, 2.5 inch 500GB 7,200 RPM HDD with 32MB RAM cache and integrated 4GByte SLC flash, all managed via an internal drive processor. Power consumption varies depending on what the device is doing, such as initial power up, idle, normal or other operating modes. You can view the Seagate Momentus XT 500GB (ST95005620AS, which is what I have) specifications here as well as the product manual here.


One of my HHDDs on a note pad (paper) and other accessories

Do you need a special controller or management software?
Generally speaking no. The HHDDs that I have been using plugged and played into my existing laptops' internal bays, replacing the HDDs that came with those systems. No extra software was needed for Windows, and no data movement or migration tools were needed other than when initially copying from the source HDD to the new HHDD. The HHDDs do their own caching, read ahead and write behind independent of the operating system or controller. The reason I say generally speaking is that, as with many devices, some operating systems or controllers may be able to leverage advanced features, so check your particular system capabilities.

How come the storage system vendors are not talking about these HHDDs?
Good question. I assume it has a lot to do with the investment (people, time, engineering, money and marketing) that they have made or are making in controller and storage system software functionality to effectively create hybrid tiered storage systems using SSDs and HDDs on different scales. There have been some packaged HHDD systems or solutions brought to market by different vendors that combine HDD and SSD into a single physical package, glued together with some software and controllers or processors to appear as a single system. I would not be surprised to see discrete HHDDs (where the HDD, flash SSD and RAM are all one integrated product) appear in lower end NAS or multifunction storage systems as well as in backup, dedupe or other systems that require large amounts of capacity and a performance boost now and then.

Why do I think this? Simple: say you have five HHDDs, each with 500GB of capacity, configured as a RAID5 set resulting in 2TByte of usable capacity. Using the Momentus XT as a hypothetical example yields 5 x 4GByte or 20GByte of flash cache to help accelerate write operations during data dumps, backups or other updates. Granted, that is an overly simplified example and storage systems can be found with hundreds of GByte of cache, however think in terms of value or low cost, balancing performance and capacity to cost for different usage scenarios. Examples include applications such as bulk or scale out file and object storage including cloud or big data, entertainment, server (Citrix/Xen, Microsoft/Hyper-V, VMware/vSphere) and desktop virtualization or VDI, Disk to Disk (D2D) backup and business analytics, among others. The common tenet of those applications and usage scenarios is a combination of I/O and storage consolidation in a cost effective manner, addressing the continuing storage capacity to I/O performance gap.
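For those who like to check the math, here is a minimal Python sketch of the RAID5 example above. The function names are made up for illustration, and it assumes the usual RAID5 rule of thumb that one drive's worth of space goes to parity; it ignores formatting and file system overhead.

```python
def raid5_usable_gb(drive_count: int, drive_gb: int) -> int:
    """Usable capacity of a RAID5 set: one drive's worth of space holds parity."""
    if drive_count < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (drive_count - 1) * drive_gb


def aggregate_flash_gb(drive_count: int, flash_gb_per_drive: int) -> int:
    """Total flash across all drives, since each HHDD brings its own cache."""
    return drive_count * flash_gb_per_drive


# Five 500GB HHDDs, each with 4GB of SLC flash (the hypothetical example above)
usable = raid5_usable_gb(5, 500)
flash = aggregate_flash_gb(5, 4)
print(f"Usable capacity: {usable} GB")
print(f"Aggregate flash: {flash} GB ({flash / usable:.1%} of usable capacity)")
```

The point of the ratio is simply that the flash is a small cache in front of a lot of capacity, not a replacement for an all-SSD tier.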

Data Center and I/O Bottlenecks

Storage and I/O performance gap

Do you have to back up HHDDs?
Yes, just as you would want to back up or protect any SSD or HDD device or system.

How does data get moved between the SSD and the HDD?
Other than the initial data migration from the old HDD (or SSD) to the HHDD (unless you are starting with a new system), once your data and applications exist on the HHDD, the device's internal processor automatically manages the RAM, flash and HDD activity. Unlike in a tiered storage system where data blocks or files may be moved between different types of storage devices, inside the HHDD all data gets written to the HDD, while the flash and RAM are used as buffers for caching depending on activity needs. If you have sat through or listened to a NetApp or HDS discussion about using cache for tiering, what the HHDDs do is similar in concept, however on a smaller scale at the device level, and potentially even in a complementary mode in the future. Other functions performed inside the HHDD by its processor include reading and writing, managing the caches, bad block replacement or revectoring on the HDD, wear leveling of the SLC flash and other routine tasks such as integrity checks and diagnostics. Unlike paired storage solutions where data gets moved between tiers or types of devices, once data is stored in the HHDD, it is managed by the device similar to how an SSD or HDD would move blocks of data to and from the specific media along with leveraging RAM cache as a buffer.
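To make the caching idea above a bit more concrete, here is a toy Python model. To be clear, this is not Seagate's algorithm, which I have not seen; the class name, the promote-after-N-reads rule and the least recently used eviction are all assumptions picked to illustrate the general idea that every block lives on the HDD while the flash only holds copies of frequently read blocks.

```python
from collections import OrderedDict
from typing import Tuple


class ToyHybridDrive:
    """Toy model only: the 'hdd' holds every block, 'flash' holds copies of hot blocks."""

    def __init__(self, flash_blocks: int, promote_after: int = 2):
        self.hdd = {}                  # block number -> data (authoritative copy)
        self.flash = OrderedDict()     # cached copies, least recently used first
        self.read_counts = {}          # block number -> reads served from the HDD
        self.flash_blocks = flash_blocks
        self.promote_after = promote_after  # assumed promotion threshold

    def write(self, block: int, data: bytes) -> None:
        self.hdd[block] = data         # all data ends up on the HDD
        if block in self.flash:
            self.flash[block] = data   # keep any cached copy consistent

    def read(self, block: int) -> Tuple[bytes, str]:
        """Return the data and which medium served it ('flash' or 'hdd')."""
        if block in self.flash:
            self.flash.move_to_end(block)        # mark as recently used
            return self.flash[block], "flash"
        data = self.hdd[block]
        self.read_counts[block] = self.read_counts.get(block, 0) + 1
        if self.read_counts[block] >= self.promote_after:
            self.flash[block] = data             # copy the hot block into flash
            if len(self.flash) > self.flash_blocks:
                self.flash.popitem(last=False)   # evict the least recently used copy
        return data, "hdd"


drive = ToyHybridDrive(flash_blocks=2)
drive.write(7, b"chapter draft")
sources = [drive.read(7)[1] for _ in range(3)]
print(sources)
```

In this toy model the first reads of a block come from the HDD and later reads of the same block come from flash, which lines up with performance on the same files tending to improve over time.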

Where is the controller that manages the SSD and HDD?
The HHDD itself is the controller per se, in that the internal processor that manages the HDD also directly accesses the RAM and flash.

What type of flash is used and will it wear out?
The XT uses SLC (single level cell) flash, which with wear leveling has a good duty cycle (life span) and is what is typically found in higher end flash SSD solutions vs. lower cost MLC (multi level cell) flash.

Have I lost any data from it yet?
No, at least nothing that was not my own fault from saving the wrong file in the wrong place and having to recover from one of my recent D2D copies or the cloud. Oh, and what have I done with the HDDs that were replaced by the HHDDs? They are now an extra gold master backup copy as of a particular point in time and are being kept in a safe, secure facility, encrypted of course.

Have you noticed a performance improvement?
Yes. Performance will vary, however in many cases I have seen performance comparable to SSD on both reads and writes as long as the HDDs keep up with the flash and RAM cache. Even as larger amounts of data are written, I have seen better performance than with HDDs. The caveat is that initially you may see little to marginal performance improvement, however over time, particularly on the same files, performance tends to improve. Working on large documents, tens to hundreds of MBytes in size, I noticed good performance when doing saves compared to working with them on an HDD.

What do the HHDDs cost?
Amazon.com has the 500GB model for about $100 which is about $40 to $50 less than when I bought my most recent one last fall. I have heard from other people that you can find them at even lower prices at other venues. In the theme of disclosures, I bought one of my HHDDs from Amazon and Seagate gave me one to test.

Will I buy more HHDDs or switch to SSDs?
Where applicable I will add SSDs as well as HDDs, however where possible and practical I will also add HHDDs, perhaps even replacing the HDDs in my NAS system with HHDDs at some point or maybe trying them in a DVR.

What is the down side to the HHDDs?
I'm generating and saving more data on the devices at a faster rate. When I installed them I wondered if I would ever fill up a 500GB drive. I still have hundreds of GBytes free or available for use, however I am also able to carry more reference data or information than in the past. In addition to more reference data including videos, audio, images, slide decks and other content, I have also been able to keep more versions or copies of documents, which has been handy on the book project. Data that changes gets backed up D2D as well as to my cloud provider, including while traveling. Leveraging compression and dedupe, given that many chapters or other content are similar, not as much data actually gets transmitted when doing cloud backups, which has been handy when doing a backup from an airplane flying over the clouds. A wish for the XT type of HHDD that I have is for vendors such as Seagate to add Self Encrypting Disk (SED) capabilities to them along with applying continued intelligent power management (IPM) enhancements.

Why do I like the HHDD?
Simple: it solves both business and technology challenges while being an enabler. It gives me a balance of performance for productivity and capacity in a cost effective manner while being transparent to the systems it works with.

Here are some related links to additional material:
Data Center I/O Bottlenecks Performance Issues and Impacts
Has SSD put Hard Disk Drives (HDDs) On Endangered Species List?
Seagate Momentus XT SD 25 firmware
Seagate Momentus XT SD25 firmware update coming this week
A Storage I/O Momentus Moment
Another StorageIO Hybrid Momentus Moment
As the Hard Disk Drive (HDD) continues to spin
Funeral for a Friend
Seagate Momentus XT product specifications
Seagate Momentus XT product manual
Technology Tiering, Servers Storage and Snow Removal
Self Encrypting Disks (SEDs)

Ok, nuff said for now

Cheers Gs

Greg Schulz – Author The Green and Virtual Data Center (CRC), Resilient Storage Networks (Elsevier) and coming summer 2011 Cloud and Virtual Data Storage Networking (CRC)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

Clarifying Clustered Storage Confusion

Clustered storage can be iSCSI, Fibre Channel block based or NAS (NFS or CIFS or proprietary file system) file system based. Clustered storage can also be found in virtual tape library (VTL) including dedupe solutions along with other storage solutions such as those for archiving, cloud, medical or other specialized grids among others.

Recently in the IT and data storage industry, there has been a flurry of merger and acquisition (M&A) activity (here and here) as well as new product enhancement or announcement activity around clustered storage. For example, there is HP buying clustered file system vendor IBRIX, complementing their previous acquisitions of another clustered file system vendor (PolyServe) a few years ago and of iSCSI block clustered storage software vendor LeftHand earlier this year. Another recent acquisition is that of LSI buying clustered NAS vendor ONstor, not to mention Dell buying iSCSI block clustered storage vendor EqualLogic about a year and a half ago, along with other vendor acquisitions or announcements involving storage and clustering.

Where the confusion enters into play is the term cluster, which means many things to different people, even more so when clustered storage is combined with NAS or file based storage. For example, clustered NAS may imply a clustered file system when in reality a solution may only be multiple NAS filers, NAS heads, controllers or storage processors configured for availability or failover.

What this means is that an NFS or CIFS file system may only be active on one node at a time, however in the event of a failover, the file system shifts from one NAS hardware device (e.g. NAS head or filer) to another. On the other hand, a clustered file system enables an NFS, CIFS or other file system to be active on multiple nodes (e.g. NAS heads, controllers, etc.) concurrently. The concurrent access may be for small random reads and writes, for example supporting a popular website or file serving application, or it may be for parallel reads or writes to a large sequential file.

Clustered storage is no longer exclusive to the confines of high-performance sequential and parallel scientific computing or ultra large environments. Small files and I/O (read or write), including meta-data information, are also being supported by a new generation of multipurpose, flexible, clustered storage solutions that can be tailored to support different applications workloads.

There are many different types of clustered and bulk storage systems. Clustered storage solutions may be block (iSCSI or Fibre Channel), NAS or file serving, virtual tape library (VTL), or archiving and object- or content-addressable storage. Clustered storage in general is similar to using clustered servers, providing scale beyond the limits of a single traditional system: scale for performance, scale for availability, and scale for capacity, enabling growth in a modular fashion by adding performance and intelligence capabilities along with capacity.

For smaller environments, clustered storage enables modular pay-as-you-grow capabilities to address specific performance or capacity needs. For larger environments, clustered storage enables growth beyond the limits of a single storage system to meet performance, capacity, or availability needs.

Applications that lend themselves to clustered and bulk storage solutions include:

  • Unstructured data files, including spreadsheets, PDFs, slide decks, and other documents
  • Email systems, including Microsoft Exchange Personal (.PST) files stored on file servers
  • Users’ home directories and online file storage for documents and multimedia
  • Web-based managed service providers for online data storage, backup, and restore
  • Rich media data delivery, hosting, and social networking Internet sites
  • Media and entertainment creation, including animation rendering and post processing
  • High-performance databases such as Oracle with NFS direct I/O
  • Financial services and telecommunications, transportation, logistics, and manufacturing
  • Project-oriented development, simulation, and energy exploration
  • Low-cost, high-performance caching for transient and look-up or reference data
  • Real-time performance including fraud detection and electronic surveillance
  • Life sciences, chemical research, and computer-aided design

Clustered storage solutions go beyond meeting the basic requirements of supporting large sequential parallel or concurrent file access. Clustered storage systems can also support random access of small files for highly concurrent online and other applications. Scalable and flexible clustered file servers that leverage commonly deployed servers, networking, and storage technologies are well suited for new and emerging applications, including bulk storage of online unstructured data, cloud services, and multimedia, where extreme scaling of performance (IOPS or bandwidth), low latency, storage capacity, and flexibility at a low cost are needed.

The bandwidth-intensive and parallel-access performance characteristics associated with clustered storage are generally known; what is not so commonly known is the breakthrough to support small and random IOPS associated with database, email, general-purpose file serving, home directories, and meta-data look-up (Figure 1). Note that a clustered storage system, and in particular, a clustered NAS may or may not include a clustered file system.

Figure 1 – Generic clustered storage model (Courtesy “The Green and Virtual Data Center” (CRC))

More nodes, ports, memory, and disks do not guarantee more performance for applications. Performance depends on how those resources are deployed and how the storage management software enables those resources to avoid bottlenecks. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion when processing diverse application workloads. Other things to consider include support for industry-standard interfaces, protocols, and technologies.
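As a rough illustration of why more nodes do not automatically mean more performance, here is a deliberately simple Python sketch. The function name, the per node IOPS figure and the idea of a single "funnel" limit (for example, all meta-data look-ups going through one node) are hypothetical numbers and assumptions for illustration, not measurements of any particular product.

```python
from typing import Optional


def cluster_iops(nodes: int, per_node_iops: int,
                 funnel_limit_iops: Optional[int] = None) -> int:
    """Toy estimate of aggregate IOPS; a shared funnel caps the total if present."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    raw = nodes * per_node_iops          # ideal linear scaling
    if funnel_limit_iops is None:
        return raw
    return min(raw, funnel_limit_iops)   # the bottleneck sets the ceiling


# Hypothetical figures: 10,000 IOPS per node, 50,000 IOPS funnel limit
for n in (2, 4, 8, 16):
    ideal = cluster_iops(n, 10_000)
    funneled = cluster_iops(n, 10_000, funnel_limit_iops=50_000)
    print(f"{n:>2} nodes: ideal {ideal:>7,} IOPS, funneled {funneled:>7,} IOPS")
```

Real systems are far messier than a simple cap, however the shape is the takeaway: past the bottleneck, adding nodes adds cost without adding performance.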

Scalable and flexible clustered file server and storage systems provide the potential to leverage the inherent processing capabilities of constantly improving underlying hardware platforms. For example, software-based clustered storage systems that do not rely on proprietary hardware can be deployed on industry-standard high-density servers and blade centers and utilize third-party internal or external storage.

Clustered storage is no longer exclusive to niche applications or scientific and high-performance computing environments. Organizations of all sizes can benefit from ultra scalable, flexible, clustered NAS storage that supports application performance needs from small random I/O to meta-data lookup and large-stream sequential I/O that scales with stability to grow with business and application needs.

Additional considerations for clustered NAS storage solutions include the following.

  • Can memory, processors, and I/O devices be varied to meet application needs?
  • Is there support for large file systems supporting many small files as well as large files?
  • What is the performance for small random IOPS and bandwidth for large sequential I/O?
  • How is performance enabled across different applications in the same cluster instance?
  • Are I/O requests, including meta-data look-up, funneled through a single node?
  • How does a solution scale as the number of nodes and storage devices is increased?
  • How disruptive and time-consuming is adding new or replacing existing storage?
  • Is proprietary hardware needed, or can industry-standard servers and storage be used?
  • What data management features, including load balancing and data protection, exist?
  • What storage interface can be used: SAS, SATA, iSCSI, or Fibre Channel?
  • What types of storage devices are supported: SSD, SAS, Fibre Channel, or SATA disks?

As with most storage systems, it is not the total number of hard disk drives (HDDs), the quantity and speed of tiered-access I/O connectivity, the types and speeds of the processors, or even the amount of cache memory that determines performance. The performance differentiator is how a manufacturer combines the various components to create a solution that delivers a given level of performance with lower power consumption.

To avoid performance surprises, be leery of performance claims based solely on speed and quantity of HDDs or the speed and number of ports, processors and memory. How the resources are deployed and how the storage management software enables those resources to avoid bottlenecks are more important. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion.

Learn more about clustered storage (block, file, VTL/dedupe, archive), clustered NAS, clustered file system, grids and cloud storage among other topics in the following links:

"The Many faces of NAS – Which is appropriate for you?"

Article: Clarifying Storage Cluster Confusion
Presentation: Clustered Storage: “From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0”
Video Interview: How to Scale Data Storage Systems with Clustering
Guidelines for controlling clustering
The benefits of clustered storage

Along with other material on the StorageIO Tips and Tools or portfolio archive or events pages.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved