EPA Energy Star for Data Center Storage Update

Following up on a recent post about Green IT, energy efficiency and optimization for servers, storage and more, here are some additional thoughts and perspectives, along with industry activity, around the U.S. Environmental Protection Agency (EPA) Energy Star for Servers, Data Center Storage and Data Centers.

First, a quick update: Energy Star for Servers is in place, with work now underway on expanding and extending beyond the first specification. Second, the Energy Star for Data Center Storage definition is well underway, including a recent workshop to refine the initial specification along with discussion of follow-on drafts.

Energy Star for Data Centers is also currently undergoing definition. It is focused more on macro or facility energy (notice I did not say electricity) efficiency, as opposed to productivity or effectiveness, items that the Server and Storage specifications are working towards.

Among all of the different industry trade or special interest groups, at least on the storage front the Storage Networking Industry Association (SNIA) Green Storage Initiative (GSI) and its Technical Work Groups (TWG) have been busily working for the past couple of years on taxonomies, metrics and other items in support of EPA Energy Star for Data Center Storage.

A challenge for SNIA, along with others working on related material pertaining to storage and efficiency, is the multi-role functionality of storage. That is, some storage simply stores data with little to no performance requirements while other storage is actively used for reading and writing. In addition, there are various categories and architectures, not to mention hardware and software feature functionality, and vendors with different product focus and interests.

Unlike servers that are either on and doing work, or off or in low power mode, storage is either doing active work (e.g. moving data), storing in-active or idle data, or a combination of both. Hence for some, energy efficiency is about how much data can be stored in a given footprint with the least amount of power, known as an in-active or idle measurement.

On the other hand, storage efficiency is also about using the least amount of energy to produce the most amount of work or activity, for example IOPS or bandwidth per watt per footprint.

Thus the challenge, and the need for at least a two dimensional model that looks at and reflects different types or categories of storage aligned for active or in-active (e.g. storing) data, enabling apples to apples vs. apples to oranges comparisons.

This is not all that different from how EPA looks at motor vehicle categories of economy cars, sport utility, work or heavy utility among others when doing different types of work, or, in idle.

What does this have to do with servers and storage?

Simple, when a server powers down where does its data go? That’s right, to a storage system using disk, SSD (RAM or flash), tape or optical for persistency. Likewise, when there is work to be done, where does the data get read into computer memory from, or written to? That’s right, a storage system. Hence the need to look at storage in a multi-role manner.

The storage industry is diverse, with some vendors or products focused on performance or activity, others on long term, low cost persistent storage for archive and backup, not to mention some doing a bit of both. Hence the analogy of herding cats towards a common goal when different parties have various interests that may conflict yet support the needs of various customer storage usage requirements.

Figure 1 shows a simplified, streamlined storage taxonomy that has been put together by SNIA representing various types, categories and functions of data center storage. The green shaded areas are a good step in the right direction to simplify yet move towards realistic and achievable benefits for storage consumers.


Figure 1 Source: EPA Energy Star for Data Center Storage web site document

The importance of the streamlined SNIA taxonomy is that it helps differentiate or characterize various types and tiers of storage products (Figure 2), facilitating apples to apples comparisons instead of apples to oranges. For example, on-line primary storage needs to be looked at in terms of how much work or activity per energy footprint it delivers.


Figure 2: Tiered Storage Example

On the other hand, storage for retaining large amounts of data that is in-active or idle for long periods of time should be looked at on a capacity per energy footprint basis. While final metrics are still being fleshed out, some examples could be active storage gauged by IOPS or work or bandwidth per watt of energy per footprint, while other storage for idle or in-active data could be looked at on a capacity per energy footprint basis.

Which benchmarks or workloads will be used for simulating or measuring work or activity is still being discussed, with proposals coming from various sources. For example, the SNIA GSI TWG is developing measurements and discussing metrics, as have the Storage Performance Council (SPC) and SPEC among others, including use of simulation tools such as IOmeter, VMware VMmark, TPC, Bonnie, or perhaps even Microsoft ESRP.
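To make the two dimensions concrete, here is a minimal sketch of how an active metric (IOPS per watt) and an idle metric (capacity per watt) could be computed side by side. The class name, field names and all of the numbers are hypothetical and for illustration only; they are not taken from the EPA or SNIA drafts, whose final metrics are still being defined.

```python
from dataclasses import dataclass


@dataclass
class StorageSample:
    """One hypothetical measurement of a storage system."""
    name: str
    avg_watts: float         # average power draw during the measurement
    iops: float = 0.0        # I/O operations per second under load
    usable_tb: float = 0.0   # usable capacity in terabytes


def iops_per_watt(sample: StorageSample) -> float:
    """Active metric: work done per unit of power."""
    if sample.avg_watts <= 0:
        raise ValueError("avg_watts must be positive")
    return sample.iops / sample.avg_watts


def tb_per_watt(sample: StorageSample) -> float:
    """Idle metric: capacity stored per unit of power."""
    if sample.avg_watts <= 0:
        raise ValueError("avg_watts must be positive")
    return sample.usable_tb / sample.avg_watts


# Made-up numbers: a performance-oriented array and a capacity-oriented archive.
primary = StorageSample("primary", avg_watts=2500.0, iops=50000.0, usable_tb=40.0)
archive = StorageSample("archive", avg_watts=1000.0, iops=500.0, usable_tb=500.0)

for s in (primary, archive):
    print(f"{s.name}: {iops_per_watt(s):.2f} IOPS/W, {tb_per_watt(s):.3f} TB/W")
```

With these made-up inputs the primary system wins on IOPS per watt and the archive system wins on capacity per watt, which is the point: each should be compared against peers in its own category.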

Tenets of Energy Star for Data Center Storage over time hopefully will include:

  • Reflective of different types, categories, price-bands and storage usage scenarios
  • Measure storage efficiency for active work along with in-active or idle usage
  • Provide insight for both storage performance efficiency and effective capacity
  • Baseline or raw storage capacity along with effective enhanced optimized capacity
  • Easy to use metrics with more in-depth background or disclosure information

Ultimately the specification should help IT storage buyers and decision makers to compare and contrast different storage systems that are best suited and applicable to their usage scenarios.

This means measuring work or activity per energy footprint at a given capacity and data protection level to meet service requirements, along with energy use during in-active or idle periods. This also means showing storage that is capacity focused in terms of how much data can be stored in a given energy footprint.

One thing that will be tricky, however, will be differentiating GBytes per watt in terms of capacity from GBytes per watt in terms of performance and bandwidth.
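One way to keep the two meanings of "GBytes per watt" apart is to carry the unit along with the value. The sketch below is only an illustration of that idea; the `Metric` type, the function names and the sample numbers are assumptions of mine, not part of any Energy Star or SNIA specification.

```python
from typing import NamedTuple


class Metric(NamedTuple):
    """A value that always travels with its unit label."""
    value: float
    unit: str


def capacity_per_watt(capacity_gb: float, watts: float) -> Metric:
    """GBytes stored per watt (an idle or capacity view)."""
    return Metric(capacity_gb / watts, "GB/W")


def bandwidth_per_watt(gb_per_sec: float, watts: float) -> Metric:
    """GBytes moved per second per watt (an active or performance view)."""
    return Metric(gb_per_sec / watts, "(GB/s)/W")


# Same hypothetical system, same power draw, two very different numbers.
stored = capacity_per_watt(capacity_gb=100_000.0, watts=1_000.0)
moved = bandwidth_per_watt(gb_per_sec=2.0, watts=1_000.0)

print(f"capacity:  {stored.value:g} {stored.unit}")
print(f"bandwidth: {moved.value:g} {moved.unit}")
```

Quoting either number as plain "GB per watt" without the unit would hide which dimension is being claimed.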

Stay tuned for more on Energy Star for Data Centers, Servers and Data Center Storage.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

SPC and Storage Benchmarking Games

There is a post over in one of the LinkedIn discussion forums about Storage Performance Council (SPC) benchmarks being misleading that I just posted a short response to. Here’s the full post, as LinkedIn has a short post response limit.

While the SPC is far from perfect, it is, at least for block storage, arguably better than doing nothing.

For the most part, SPC has become a de facto standard for at least block storage benchmarks, independent of using IOmeter or other tools or vendor specific simulations, similar to how MSFT ESRP is for Exchange, TPC for database, SPEC for NFS and so forth. In fact, SPC even recently rather quietly rolled out a new set of what could be considered the basis for Green storage benchmarks. I would argue that SPC results in themselves are not misleading, particularly if you take the time to look at both the executive and full disclosures and look beyond the summary.

Some vendors have taken advantage of the SPC results, playing games with discounting on prices (something that’s allowed under SPC rules) to show and make apples to oranges comparisons on cost per IOP, or other ploys. This practice is nothing new to the IT industry or other industries for that matter, hence benchmark games.

Where the misleading SPC issue can come into play is for those who simply look at what a vendor is claiming without looking at the rest of the story, or without taking the time to look at the results and make apples to apples comparisons instead of believing the apples to oranges comparison. After all, the results are there for a reason. That reason is for those really interested to dig in and sift through the material; granted, not everyone wants to do that.

For example, some vendors can show a highly discounted list price to get a better IOP per cost on an apples to oranges basis; however, when prices are normalized, the results can be quite different. However, here’s the real gem for those who dig into the SPC results, including looking at the configurations: latency under workload is also reported.

The reason that latency is a gem is that generally speaking, latency does not lie.

What this means is that if vendor A doubles the amount of cache, doubles the number of controllers, doubles the number of disk drives, plays games with application storage unit (ASU) capacity, or utilizes fast interfaces from 10 GbE iSCSI to 8Gb FC, FCoE or SAS to get a better cost per IOP number with discounting, look at the latency numbers. There have been some recent examples of this where vendor A has a better cost per IOP while achieving a higher number of IOPS at a lower cost compared to vendor B, which is what is typically reported in a press release or news story. (See a blog entry that also points to a CMG presentation discussion around this topic here.)

Then go and look at the two results. Vendor B may be at list price while vendor A is severely discounted, which is not a bad thing, as that then becomes the starting price from which customers should begin negotiations. However, to be fair, normalize the pricing for fun, look at how much more equipment vendor A may need while having to discount to offset the cost of the increased amount of hardware, then look at latency.
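Here is a rough sketch of what normalizing the pricing can look like. Every figure below is invented for illustration and does not come from any published SPC result; the class and field names are also my own assumptions.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """Hypothetical summary of one vendor's benchmark submission."""
    vendor: str
    list_price: float       # undiscounted price of the tested configuration
    discount: float         # fraction off list, e.g. 0.50 means 50% off
    iops: float             # reported I/O operations per second
    avg_latency_ms: float   # reported average response time under load

    def submitted_price(self) -> float:
        """Price after the vendor's discount is applied."""
        return self.list_price * (1.0 - self.discount)

    def cost_per_iop(self, normalized: bool = False) -> float:
        """Cost per IOP as submitted, or normalized to list price."""
        price = self.list_price if normalized else self.submitted_price()
        return price / self.iops


# Made-up numbers: vendor A discounts heavily, vendor B submits at list.
a = BenchmarkResult("A", list_price=1_000_000.0, discount=0.50,
                    iops=200_000.0, avg_latency_ms=8.0)
b = BenchmarkResult("B", list_price=600_000.0, discount=0.0,
                    iops=150_000.0, avg_latency_ms=4.0)

for r in (a, b):
    print(f"vendor {r.vendor}: "
          f"${r.cost_per_iop():.2f}/IOP as submitted, "
          f"${r.cost_per_iop(normalized=True):.2f}/IOP at list, "
          f"{r.avg_latency_ms:.1f} ms latency")
```

In this invented case vendor A looks cheaper per IOP as submitted, yet vendor B comes out ahead once both are compared at list price, and vendor B also has the lower latency.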

In some of the recent record reported results, the latency results are actually better for vendor B than for vendor A. Why does latency matter? Beyond showing what a controller can actually do in terms of leveraging the number of disks, cache, interface ports and so forth, the big kicker is for those talking about SSD (RAM or flash), in that SSD generally is about latency. To fully and effectively utilize SSD, which is a low latency device, you want a controller that can do a decent job of handling IOPS; however, you also need that controller to handle those IOPS with low latency under heavy workload conditions.

Thus the SPC, again while far from perfect, is not necessarily misleading, at least for a thumbnail sketch and comparison; more often than not it’s how the results are utilized that is misleading. Now, in the quest of the SPC administrators to try and gain more members and broader industry participation and thus secure their own future, is the SPC organization or administration opening itself up to being used more and more as a marketing tool in ways that potentially compromise all the credibility (I know, some will dispute the validity of SPC, however that’s reserved for a different discussion ;) )?

There is a bit of déjà vu here for those involved with RAID and storage who recall how the RAID Advisory Board (RAB), in its quest to gain broader industry adoption and support, succumbed to marketing pressures and use, or what some would describe as misuse, and is now a member of the “Where are they now” club!

Don’t get me wrong here; I like the SPC tests/results/format, and there is a lot of good information in the SPC. The various vendor folks who work very hard behind the scenes to make the SPC actually work and continue to evolve it also all deserve a great big kudos, an “atta boy” or “atta girl” for the fine work that they have been doing, work that I hope does not become lost in the quest to gain market adoption for the SPC.

Ok, so this should all then raise the question of what is the best benchmark. Simple, the one that most closely resembles your actual applications, workload, conditions, configuration and environment.

Ok, nuff said.

Cheers gs


Power, Cooling, Floor-space, Environmental (PCFE) and Green Metrics

The Metrics and Measurement page on www.greendatastorage.com has been updated along with other pages covering IT data center PCFE and green topics for servers, storage, networks and facilities. Have a look.

Ok, nuff said.

Cheers gs


Do Disk based VTLs draw less power than Tape?

The tape is dead debates rage on as they have for decades, which makes for good press and discussion or debate during slow times, similar to coverage of what Britney Spears or Paris Hilton are or are not wearing.

In the on-going debates and greenwashing over which technology or vendor is greener to prevent global warming, some recent tape is dead flare-ups have occurred, including one hinting that tape libraries can draw more power than a disk based VTL with de-dupe. These are discussed over on the blog of Tony Pearson of IBM fame as well as on the TechTarget StorageSoup site by Beth Pariseau.

I posted some comments on those sites along with a link to a StorageIO Industry Trends and Perspective report titled “Energy Savings without Performance Compromise” as an example (look for an updated version of the comparison charts in the report in the not so distant future). The report looks at how different storage tiers including on-line disk, MAID, MAID 2.0 and tape libraries vary to address different PCFE (power, cooling, floor-space, environment) issues while supporting various service levels including performance, availability, capacity and energy use.

Additional related material can be found at www.storageio.com and www.greendatastorage.com, including the Industry Trends and Perspective report “Business Benefits of Data Footprint Reduction”, which in general covers archiving, compression (on-line and off-line) along with de-duplication.

Ok, nuff said.

Cheers gs
