What is the best kind of IO? The one you do not have to do

May 3, 2012 – 3:38 pm

If no IO (input/output) operation is the best IO, then the second best IO is the one that can be done as close as possible to the application and processor, with the best locality of reference. The third best IO is the one that can be done in less time, or at the least cost or impact to the requesting application, which means moving further down the memory and storage stack (figure 1).

Figure 1: Memory and storage hierarchy (storage and IO locality of reference)

The problem with IOs is that they are basic operations for getting data into and out of a computer or processor, so they are required; however, they also have an impact on performance, response or wait time (latency). IOs require CPU or processor time and memory to set up and then process the results, as well as IO and networking resources to move data to its destination or retrieve it from where it is stored. While IOs cannot be eliminated, their impact can be greatly reduced by doing fewer of them via caching and grouped reads or writes (pre-fetch, write behind), among other techniques and technologies.
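To make the idea of doing fewer IOs concrete, here is a minimal Python sketch of a read cache combined with a write-behind buffer. Everything in it is hypothetical and for illustration only: `CountingBackend` stands in for any slower device or service, and the class names, `cache_size` and `flush_threshold` are assumptions, not part of any real product or driver.

```python
from collections import OrderedDict


class CountingBackend:
    """Hypothetical slow store that counts how many IOs actually reach it."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0
        self.writes = 0

    def read(self, block_id):
        self.reads += 1
        return self.blocks.get(block_id, b"")

    def write_many(self, items):
        # One grouped IO carries several blocks at once.
        self.writes += 1
        self.blocks.update(items)


class CachedStore:
    """LRU read cache plus write-behind buffer in front of a backend."""

    def __init__(self, backend, cache_size=4, flush_threshold=3):
        self.backend = backend
        self.cache = OrderedDict()
        self.cache_size = cache_size
        self.pending = {}
        self.flush_threshold = flush_threshold

    def read(self, block_id):
        if block_id in self.pending:        # newest data, not yet flushed
            return self.pending[block_id]
        if block_id in self.cache:          # cache hit: no backend IO
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        data = self.backend.read(block_id)  # cache miss: one backend IO
        self._remember(block_id, data)
        return data

    def write(self, block_id, data):
        self.pending[block_id] = data       # buffer instead of writing now
        self._remember(block_id, data)
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.pending:
            self.backend.write_many(dict(self.pending))
            self.pending.clear()

    def _remember(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used
```

With these assumed settings, three application writes reach the backend as a single grouped IO, and repeated reads of the same block cost one backend read. A real write-behind cache also has to deal with power loss and data consistency, which this sketch ignores.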

Think of it this way: instead of going on multiple errands, sometimes you can group multiple destinations together into one more efficient trip; however, that single optimized trip may itself take longer to complete. Hence sometimes it makes sense to go on a couple of quick, short, low latency trips vs. one larger one that takes half a day but accomplishes many things. Of course, how far you have to go on those trips (i.e. locality) makes a difference in how many you can do in a given amount of time.
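The errand trade-off can be put into rough numbers. The sketch below uses a deliberately simple model that is my assumption, not something from a real system: each trip (batch) pays a fixed setup cost, each item adds a per-item cost, and results only come back when the whole batch completes. The function name and parameters are hypothetical.

```python
import math


def batch_costs(items, batch_size, setup_ms, per_item_ms):
    """Return (total_ms, first_result_ms) under the simple model above."""
    batches = math.ceil(items / batch_size)
    # Every batch pays the setup cost once; every item pays its own cost.
    total_ms = batches * setup_ms + items * per_item_ms
    # The first result arrives only when the first batch has finished.
    first_result_ms = setup_ms + min(batch_size, items) * per_item_ms
    return total_ms, first_result_ms


if __name__ == "__main__":
    for size in (1, 4, 8):
        total, first = batch_costs(items=8, batch_size=size,
                                   setup_ms=5, per_item_ms=1)
        print(f"batch size {size}: total {total} ms, first result {first} ms")
```

Under these assumed costs, larger batches lower the total time while raising the wait for the first result, which is the same tension as one long trip vs. several short ones.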

What is locality of reference?

Locality of reference refers to how close (i.e. location) data is to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, then level 1 (L1), level 2 (L2) or level 3 (L3) onboard cache, followed by dynamic random access memory (DRAM). Then would come memory, also known as storage, on PCIe cards such as NAND flash solid state devices (SSD), or storage accessible via an adapter on a direct attached storage (DAS), SAN or NAS device. In the case of a PCIe NAND flash SSD card, even though the NAND flash is physically closer to the processor, there is still the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as cache or buffers for data along with metadata or control information to further optimize and improve locality of reference. In other words, the goal is to help with cache hits, cache use and cache effectiveness vs. simply boosting cache utilization.
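One way to see why cache effectiveness (hits) matters more than cache utilization (how full the cache is) is the common weighted-average estimate of access time. The sketch below is a simplified model: it assumes a miss costs only the slower tier's latency, and the latency figures in the example are made-up round numbers, not measurements of any device.

```python
def effective_latency_us(hit_ratio, cache_latency_us, backend_latency_us):
    """Average access time when a fraction of requests hit the faster tier."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return hit_ratio * cache_latency_us + miss_ratio * backend_latency_us


if __name__ == "__main__":
    # Assumed latencies: 1 microsecond for the cache, 100 for the slower tier.
    for ratio in (0.5, 0.9, 0.99):
        avg = effective_latency_us(ratio, 1.0, 100.0)
        print(f"hit ratio {ratio:.2f}: about {avg:.2f} us average")
```

With these assumed numbers, the average drops from about 50.5 microseconds at a 50% hit ratio to about 10.9 at 90%. A cache can be completely full and still deliver few hits if it holds the wrong data.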

What can you do to cut the impact of IO?

  • Establish baseline performance and availability metrics for comparison
  • Realize that IOs are a fact of IT virtual, physical and cloud life
  • Understand what is a bad IO along with its impact
  • Identify why an IO is bad, expensive or causing an impact
  • Find and fix the problem with software, application or database changes
  • Throw more software caching tools, hypervisors or hardware at the problem
  • Hardware includes faster processors with more DRAM and fast internal busses
  • Leverage local PCIe flash SSD cards for caching or as targets
  • Utilize storage systems or appliances that have intelligent caching and storage optimization capabilities (performance, availability, capacity)
  • Compare changes and improvements to baseline, quantify improvement
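The first and last items (establish a baseline, then quantify the improvement) can be sketched in a few lines of Python. The helper names `measure_ms`, `summarize` and `improvement_pct` are hypothetical, and wall-clock timing of a callable is only one assumed way to collect samples; in practice the numbers would more likely come from your operating system, hypervisor or storage system tools.

```python
import statistics
import time


def measure_ms(operation, runs=20):
    """Time a callable several times; return elapsed milliseconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Reduce raw samples to a few figures worth keeping as a baseline."""
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def improvement_pct(baseline_ms, current_ms):
    """Percent reduction in response time relative to the baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (baseline_ms - current_ms) / baseline_ms * 100.0
```

For example, if the baseline mean response time was 20 ms and it is 5 ms after a change, `improvement_pct(20.0, 5.0)` reports a 75% reduction. Comparing the median and maximum as well as the mean helps show whether a change improved typical response time or only trimmed the outliers.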

Related links on storage IO metrics and SSD performance
Is SSD dead? No, however some vendors might be
Storage and IO metrics that matter
IO IO it is off to Storage and IO metrics we go
SSD and Storage System Performance
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
IT and storage economics 101, supply and demand
Researchers and marketers dont agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?
SSD options for Virtual (and Physical) Environments Part IV: What type of SSD is best for your needs

Ok, nuff said for now

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO All Rights Reserved