I/O, I/O how well do you know about good or bad server and storage I/Os?

February 1, 2015 – 12:57 am

How well do you know about good or bad server and storage I/Os?


The best IO is one you do not have to do.

What about all the cloud, virtual, software defined and legacy based applications that still need to do I/O?

If no IO operation is the best IO, then the second best IO is the one that can be done as close to the application and processor as possible with the best locality of reference.

Also keep in mind that aggregation (e.g. consolidation) can cause aggravation (server storage I/O performance bottlenecks).

Example of aggregation (consolidation) causing aggravation (server storage I/O blender bottlenecks)

And the third best?

It’s the one that can be done in the least time, or at the least cost or effect to the requesting application, which means moving further down the memory and storage stack.

Leveraging flash SSD and cache technologies to find and fix server storage I/O bottlenecks

On the other hand, any IOP, whether for block, file or object storage, that comes with some context is better than one without, particularly when it involves metrics that matter (here, here and here [webinar]).

Server Storage I/O optimization and effectiveness

The problem with IOs is that they are a basic operation for getting data into and out of a computer or processor, so there’s no way to avoid all of them, unless you have a very large budget. Even if you have a budget that can afford an all flash SSD solution, you may still meet bottlenecks or other barriers.

IOs require CPU or processor time and memory to set up and then process the results, as well as IO and networking resources to move data to its destination or retrieve it from where it is stored. While IOs cannot be eliminated, their impact can be greatly reduced or optimized by, among other techniques, doing fewer of them via caching and by grouping reads or writes (pre-fetch, write-behind).
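To make the idea of doing fewer IOs a bit more concrete, here is a minimal Python sketch of a read cache combined with a write-behind buffer. Everything in it is an illustrative assumption rather than how any particular product works: CountingBackend is a hypothetical stand-in for a slow device that simply counts operations, CachedStore is a made-up name, and the cache size and flush threshold are arbitrary.

```python
from collections import OrderedDict


class CountingBackend:
    """Hypothetical stand-in for a slow device; it only counts I/O operations."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0
        self.writes = 0

    def read(self, block_id):
        self.reads += 1
        return self.blocks.get(block_id, b"")

    def write_many(self, items):
        # In this toy model one grouped write counts as a single I/O.
        self.writes += 1
        self.blocks.update(items)


class CachedStore:
    """LRU read cache plus a write-behind buffer in front of a backend."""

    def __init__(self, backend, cache_size=4, flush_threshold=3):
        self.backend = backend
        self.cache = OrderedDict()      # block_id -> data, oldest entry first
        self.cache_size = cache_size
        self.pending = {}               # writes not yet sent to the backend
        self.flush_threshold = flush_threshold

    def read(self, block_id):
        if block_id in self.pending:    # newest data is still buffered
            return self.pending[block_id]
        if block_id in self.cache:      # cache hit: no backend I/O needed
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        data = self.backend.read(block_id)  # cache miss: one backend read
        self._remember(block_id, data)
        return data

    def write(self, block_id, data):
        self.pending[block_id] = data
        self._remember(block_id, data)
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Group all buffered writes into one backend operation.
        if self.pending:
            self.backend.write_many(self.pending)
            self.pending = {}

    def _remember(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used block


backend = CountingBackend()
store = CachedStore(backend)
for block in range(6):
    store.write(block, b"x")    # six application writes
store.flush()
for _ in range(10):
    store.read(5)               # ten application reads of a cached block
print(backend.writes, backend.reads)
```

In this toy run the six writes and ten reads issued by the application reach the backend as two grouped writes and no reads. The trade-off is the one a real write-behind cache also has to manage: buffered writes are at risk until they are flushed.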

server storage I/O STI and SUT

Think of it this way: Instead of going on multiple errands, sometimes you can group multiple destinations together making for a shorter, more efficient trip. However, that optimization may also mean your drive will take longer. So, sometimes it makes sense to go on a couple of quick, short, low-latency trips instead of one larger one that takes half a day even as it accomplishes many tasks. Of course, how far you have to go on those trips (i.e., their locality) makes a difference about how many you can do in a given amount of time.

Locality of reference (or proximity)

What is locality of reference?

This refers to how close data is (i.e., its place) to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, ready to be acted on immediately. This would be followed by levels 1, 2, and 3 (L1, L2, and L3) onboard caches, followed by main memory, or DRAM. After that comes solid-state memory, typically NAND flash, either on PCIe cards or accessible on a direct attached storage (DAS), SAN, or NAS device.

server storage I/O locality of reference

Even though a PCIe NAND flash card is close to the processor, there still remains the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as cache or buffers for data along with meta or control information to further optimize and improve locality of reference. In other words, this information is used to help with cache hits, cache use, and cache effectiveness vs. simply boosting cache use.
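One rough way to reason about cache effectiveness is the average latency an application sees for a given hit ratio. The sketch below is a simplified two-tier model, and the function name and the latency figures in the usage lines are illustrative assumptions, not measurements of any device.

```python
def effective_latency_us(hit_ratio, hit_latency_us, miss_latency_us):
    """Average access latency for a two-tier setup (cache in front of a slower tier).

    hit_ratio is the fraction of accesses served from the faster tier (0.0 to 1.0).
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0.0 and 1.0")
    return hit_ratio * hit_latency_us + (1.0 - hit_ratio) * miss_latency_us


# Assumed numbers for illustration only: 1 microsecond for a DRAM cache hit,
# 100 microseconds when the request has to go to the flash behind it.
for ratio in (0.5, 0.9, 0.99):
    print(ratio, effective_latency_us(ratio, 1.0, 100.0))
```

With these assumed figures the misses dominate the average even at high hit ratios, which is why improving the quality of cache hits can matter more than simply adding cache.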

SSD to the rescue?

What can you do to cut the impact of IOs?

There are many steps one can take, starting with establishing baseline performance and availability metrics.

The metrics that matter include IOPs, latency, bandwidth, and availability. Then, leverage those metrics to gain insight into your application’s performance.
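These metrics are related, which is part of why an IOP figure without context says little. As a small sketch: bandwidth is IOPs multiplied by the IO size, and Little's law ties IOPs and average latency to the average number of IOs in flight. The function names and the workload numbers below are assumptions chosen for illustration.

```python
def bandwidth_mib_per_s(iops, io_size_kib):
    """Bandwidth implied by an I/O rate and an I/O size (KiB per I/O)."""
    return iops * io_size_kib / 1024.0


def avg_outstanding_ios(iops, avg_latency_ms):
    """Little's law: average I/Os in flight = arrival rate x average latency."""
    return iops * avg_latency_ms / 1000.0


# Assumed workloads: the same 10,000 IOPs at two different I/O sizes.
print(bandwidth_mib_per_s(10_000, 8))     # small 8 KiB I/Os
print(bandwidth_mib_per_s(10_000, 64))    # larger 64 KiB I/Os
# Assumed 0.5 ms average latency at 10,000 IOPs.
print(avg_outstanding_ios(10_000, 0.5))
```

The same IOP count moves eight times as much data at 64 KiB as at 8 KiB, so a baseline should record IO size and latency alongside the activity rate.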

Understand that IOs are a fact of applications doing work (storing, retrieving, managing data) no matter whether systems are virtual, physical, or running up in the cloud. But it’s important to understand just what a bad IO is, along with its impact on performance. Try to identify those that are bad, and then find and fix the problem, either with software, application, or database changes. Perhaps you need to throw more software caching tools, hypervisors, or hardware at the problem. Hardware may include faster processors with more DRAM and faster internal busses.

Leveraging local PCIe flash SSD cards for caching or as targets is another option.

You may want to use storage systems or appliances that rely on intelligent caching and storage optimization capabilities to help with performance, availability, and capacity.

Where to gain insight into your server storage I/O environment

There are many tools that can be used to gain insight into your server storage I/O environment across cloud, virtual, software defined and legacy, as well as from different layers (e.g. applications, database, file systems, operating systems, hypervisors, server, storage, I/O networking). Many applications, along with databases, have either built-in or optional tools from their provider, third parties, or other sources that can give information about the work being done. Likewise there are tools to dig deeper into the data infrastructure to see what is happening at the various layers, as shown in the following figures.

Gaining application and operating system level performance insight via different tools

Insight and awareness via operating system tools on Windows and Linux

In the above example, Spotlight on Windows (SoW), which you can download for free from Dell here, is shown along with Ubuntu utilities. You could also use other tools to look at server storage I/O performance, including Windows Perfmon among others.
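If you would rather script a quick baseline on Linux than use a graphical tool, the kernel exposes per-device counters in /proc/diskstats. The sketch below parses two samples of such a line and derives an IOP rate. The helper names are hypothetical, the two sample lines are made up for illustration, and on a real system you would read the file twice a few seconds apart.

```python
def parse_diskstats_line(line):
    """Pick a few counters out of one /proc/diskstats line.

    Columns used: 3rd = device name, 4th = reads completed, 6th = sectors read,
    8th = writes completed, 10th = sectors written.
    """
    fields = line.split()
    return {
        "device": fields[2],
        "reads_completed": int(fields[3]),
        "sectors_read": int(fields[5]),
        "writes_completed": int(fields[7]),
        "sectors_written": int(fields[9]),
    }


def iops_between(before, after, interval_s):
    """Average read plus write operations per second between two samples."""
    ops = (after["reads_completed"] - before["reads_completed"]
           + after["writes_completed"] - before["writes_completed"])
    return ops / interval_s


# Made-up samples of the same device taken 5 seconds apart (illustration only).
sample_1 = "   8       0 sda 1000 10 80000 500 2000 20 160000 900 0 700 1400"
sample_2 = "   8       0 sda 1500 12 120000 650 2600 25 208000 1100 0 900 1750"

before = parse_diskstats_line(sample_1)
after = parse_diskstats_line(sample_2)
print(before["device"], iops_between(before, after, 5.0))
```

Tools such as iostat report similar counters with far more detail; the point of the sketch is only that a baseline is a difference between two samples over a known interval.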

Hypervisor performance using VMware ESXi / vSphere built-in tools

Using Visual ESXtop to dig deeper into virtual server storage I/O performance

Gaining insight into virtual server storage I/O cache performance

Wrap up and summary

There are many approaches to address (e.g. find and fix) vs. simply move or mask data center and server storage I/O bottlenecks. Having insight into and awareness of how your environment and its applications behave is important for knowing where to focus resources. Also keep in mind that a bit of flash SSD or DRAM cache in the applicable place can go a long way, while a lot of cache will also cost you cash. Even if you can’t eliminate I/Os, look for ways to decrease their impact on your applications and systems.

Check out the following resource links to learn more about server storage I/O performance, flash SSD, benchmarking and related topics.

How many IOP’s can a HDD or SSD do?
Server Storage I/O Benchmarking and other resources (various links)
Can we get a side of context with them IOP’s?
Cloud and Object Storage resources (various links)
When and Where to Use NAND Flash SSD for Virtual Servers
Server Storage I/O flash and SSD resources (various links)
Revisiting RAID storage remains relevant and resources (various links)
Server and Storage I/O Networking Performance Management (webinar)
Data Center Monitoring – Metrics that Matter for Effective Management (webinar)

Flash back to reality – Flash SSD Myths and Realities (Industry trends & benchmarking tips), (MSP CMG presentation)

Keep in mind: SSDs, including flash and DRAM among others, are in your future; the question is where, when, with what, how much and whose technology or packaging.

Ok, nuff said (for now)

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2017 Server StorageIO and UnlimitedIO LLC All Rights Reserved

  1. 2 Responses to “I/O, I/O how well do you know about good or bad server and storage I/Os?”

  2. Good article, but in terms of consolidation in very large environments, the I/O dilemma is primarily about understanding workloads. Keep in mind that the VMware approach is to consolidate everything regardless. So I would say that I/O problems are more related to the admin instead. I’ve managed large environments, and we handle I/O very well because we do our homework, and it’s very consolidated.
    Regards

    Alex

    By Alex on Mar 22, 2015

  3. Thanks Alex
    Yes, aggregation (consolidation), including using server virtualization, can cause aggravation (bottlenecks or other problems), similar to consolidating workloads on physical servers or storage. Likewise, I have been involved with large environments, including server performance engineering and capacity planning. That’s why I concur that understanding the application workload characteristics is important for both large and small environments, regardless of whether you are a server, system, network, database, storage or converged admin.

    ga

    By Greg Schulz on Mar 22, 2015
