RAID and IOPS and IO observations

Storage I/O trends

There are at least two different meanings for IOPs, which for those not familiar with the information technology (IT) and data storage meaning is Input/output Operations Per second (e.g. data movement activity). Another meaning for IOP that is the international organization for a participatory society (iopsociety.org), and their fundraising activity found here.

I recently came across a piece (here and here) talking about RAID and IOPs that had some interesting points; however, some generalizations could use some more comments. One of the interesting comments and assertions is that RAID writes increase with the number of drives in the parity scheme. Granted the specific implementation and configuration could result in an it depends type response.

StorageIO industry trends cloud, virtualization and big data

Here are some more perspectives to the piece (here and here) as the sites comments seem to be restricted.

Keep in mind that such as with RAID 5 (or 6) performance, your IO size will have a bearing on if you are doing those extra back-end IOs. For example if you are writing a 32KB item that is accomplished by a single front-end IO from an applications server, and your storage system, appliance, adapter, software implementing and performing the RAID (or erasure coding for that matter) has a chunk size of say 8KB (e.g. the amount of data written to each back-end drive). Then a 5 drive R5 (e.g. 4+1) would in fact have five back-end IOPS (32KB / 8KB = 4 + 1 (8KB Parity)).

StorageIO industry trends cloud, virtualization and big data

Otoh of the front end IOP were only 16KB (using whole numbers for simplicity, otherwise round-up), in the case of a write, there would be three back-end writes with the R5 (e.g. 2 + 1). Keep in mind the controller/software managing the RAID would (or should) try to schedule back-end IO with cache, read-head, write-behind, write-back, other forms of optimization etc.

In the piece (here and here), a good point is the understanding and factoring in IOPS is important, as is also latency or response time in addition to bandwidth or throughput, along with availability, they are all inter-related.

Also very important is to keep in mind the size of the IOP, read and write, random, sequential etc.

RAID along with erasure coding is a balancing act between performance, availability, space capacity and economics aligned to different application needs.

RAID 0 (R0) actually has a big impact on performance, no penalty on writes; however, it has no availability protection benefit and in fact can be a single point of failure (e.g. loss of a HDD or SSD) impacts the entire R0 group. However, for static items, or items that are being journaled and protected on some other medium/RAID/protection scheme, R0 is used more than people realize for scratch/buffer/transient/read cache types of applications. Keep in mind that it is a balance of all performance and capacity with the exposure of no availability as opposed to other approaches. Thus, do not be scared of R0, however also do not get burned or hurt with it either, treat it with respect and can be effective for something’s.

Also mentioned in the piece was that SSD based servers will perform vastly better than SATA or SAS based ones. I am assuming that the authors meant to say better than SAS or SATA DAS based HDDs?

Storage I/O trends

Keep in mind that unless you are using a PCIe nand flash SSD card as a target or cache or RAID card, most SSD drives today are either SAS or SATA (being the more common) along with moving from 3Gb SAS or SATA to 6Gb SAS & SATA.

Also while HDD and SSDs can do a given number of reads or writes per second, those will vary based on the size of the IO, read, write, random, sequential. However what can have the biggest impact and where I have seen too many people or environments get into a performance jam is when assuming that those IOP numbers per HDD or SSD are a given. For example assuming that 100-140, IOPs (regardless of size, type, etc.) can be achieved as a limiting factor is the type of interface and controller/adapter being used.

I have seen fast HDDs and SSDs deliver sub-par performance or not meeting expectations fast interfaces such as iSCSI/SAS/SATA/FC/FCoE/IBA or other interfaces due to bottlenecks in the adapter card, storage system / appliance / controller / software. In some cases you may see more effective IOPs or reads, writes or both, while on other implementations you may see lower than expected due to internal implementation bottlenecks or architectural designs. Hint, watch out for solutions where the vendor tries to blame poor performance on the access network (e.g. SAS, iSCSI, FC, etc.) particular if you know that those are not bottlenecks.

Here are some related content:
Are Hard Disk Drives (HDDs) getting too big?
How can direct attached storage (DAS) make a comeback if it never left?
EMC VFCache re spinning SSD and intelligent caching
SSD and Green IT moving beyond green washing
Optimize Data Storage for Performance and Capacity Efficiency
Is SSD dead? No, however some vendors might be
RAID Relevance Revisited
Industry Trends and Perspectives: RAID Rebuild Rates
What is the best kind of IO? The one you do not have to do
More storage and IO metrics that matter
IBM buys flash solid state device (SSD) industry veteran TMS

In terms of fund-raising, if you feel so compelled, send a gift, donation, sponsorship, project, buy some books, piece of work, assignment, research project, speaking, keynote, web cast, video or seminar event my way and just like professional fund-raisers, or IOPS vendors, StorageIO accept visa, Master Card, American express, Pay Pal, check and traditional POs.

As for this site and comments, outside of those caught in the spam trap, courteous perspectives and discussions are welcome.

Ok, nuff said.

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Are Hard Disk Drives (HDDs) getting too big?

Lets start out by clarifying something, that is in terms of context or scope, big means storage capacity as opposed to the physical packaging size of a hard disk drive (HDD) which are getting smaller.

So are HDDs in terms of storage capacity getting too big?

This question of if HDDs storage capacity getting too big to manage comes up every few years and it is the topic of Rick Vanovers (aka twitter @RickVanover Episode 27 Pod cast: Are hard drives getting to big?

Veeam community podcast guest appearance

As I discuss in this pod cast with Rick Vannover of Veeam, with the 2TB and even larger future 4TB, 8 to 9TB, 18TB, 36TB and 48 to 50TB drives not many years away, sure they are getting bigger (in terms of capacity) however we have been here before (or at least some of us have). We discuss how back in the late 90s HDDs were going from 5.25 inch to 3.5 inch (now they are going from 3.5 inch to 2.5 inch), and 9GB were big and seen as a scary proposition by some for doing RAID rebuilds, drive copy or backups among other things, not to mention if putting to many eggs (or data) in one basket.

In some instances vendors have been able to combine various technologies, algorithms and other techniques to RAID rebuild a 1TB or 2TB drive in the same or less amount of time as it used to take to process a 9GB HDD. However those improvements are not enough and more will be needed leveraging faster processors, IO busses and back planes, HDDs with more intelligence and performance, different algorithms and design best practices among other techniques that I discussed with Rick. After all, there is no such thing as a data recession with more information to be generated, processed, moved, stored, preserved and served in the future.

If you are interested in data storage, check out Ricks pod cast and hear some of our other discussion points including how SSD will help keep the HDD alive similar to how HDDs are offloading tape from their traditional backup role, each with its changing or expanding focus among other things.

On a related note, here is post about RAID remaining relevant yet continuing to evolve. We also talk about Hybrid Hard Disk Drives (HHDD) where in a single sealed HDD device there is flash and dram along with a spinning disk all managed by the drives internal processor with no external special software or hardware needed.

Listen to comments by Greg Schulz of StorageIO on HDD, HHDD, SSD, RAID and more

Put on your head phones (or not) and check out Ricks pod cast here (or on the head phone image above).

Thanks again Rick, really enjoyed being a guest on your show.

Whats your take, are HDDs getting to big in terms of capacity or do we need to leverage other tools, technology and techniques to be more effective in managing expanding data footprint including use of data footprint reduction (DFR) techniques?

Ok, nuff said for now.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

Industry Trends and Perspectives: RAID Rebuild Rates

This is part of an ongoing series of short industry trends and perspectives blog posts briefs.

These short posts compliment other longer posts along with traditional industry trends and perspective white papers, research reports, solution brief content found at www.storageioblog.com/reports.

There is continued concern about how long large capacity disk drives take to be rebuilt in RAID sets particularly as the continued shift from 1TB to 2TB occurs. It should not be a surprise that a disk with more capacity will take longer to rebuild or copy as well as with more drives; the likely hood of one failing statistically increases.

Not to diminish the issue, however also to avoid saying the sky is falling, we have been here before! In the late 90s and early 2000s there was a similar concern with the then large 9GB, 18GB let alone emerging 36GB and 72GB drives. There have been improvements in RAID as well as rebuild algorithms along with other storage system software or firmware enhancements not to mention boost in processor or IO bus performance.

However not all storage systems are equal even if they use the same underlying processors, IO busses, adapters or disk drives. Some vendors have made significant improvements in their rebuild times where each generation of software or firmware can reconstruct a failed drive faster. Yet for others, each subsequent iteration of larger capacity disk drives brings increased rebuild times.

If disk drive rebuild times are a concern, ask your vendor or solution provider what they are doing as well as have done over the past several years to boost their performance. Look for signs of continued improvement in rebuild and reconstruction performance as well as decrease in error rates or false drive rebuilds.

Related and companion material:
Blog: RAID data protection remains relevant
Blog: Optimize Data Storage for Performance and Capacity Efficiency

That is all for now, hope you find this ongoing series of current and emerging Industry Trends and Perspectives interesting.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved