Is SSD dead? No, however some vendors might be

Storage I/O trends


In a recent conversation with Dave Raffo about the nand flash solid state disk (SSD) market, we talked about industry trends and perspectives, and where the market is now as well as where it is headed. One of my comments is, has been, and will remain that the industry has still not reached anywhere near the full potential for deployment of SSD for enterprise, SMB and other data storage needs. Granted, there is broad adoption in terms of discussion or conversation, and plenty of early adopters.

SSD, and in particular nand flash, is anything but dead; in fact, in the big broad picture of things, it is still very early in the game. Sure, for those who cover and crave the newest, latest and greatest technology to talk about, nand flash SSD might seem old, yesterday's news, long in the tooth and time for something else. However, for those who are focused on deployment vs. adoption, such as customers, nand flash SSD in its many packaging options has still not reached its full potential.

Despite the hype and fanfare from CEOs or their evangelists, along with loyal followers of startups that help drive industry adoption (e.g. what is talked about), there is still lots of upside growth in customer driven industry deployment (actually buying, installing and using) of nand flash SSD.

What about broad customer deployments?

Sure, there are the marquee customer success stories for which you need a high-capacity SAS or SATA drive to hold all the YouTube videos, slide decks and press releases.

However, have we truly reached broad customer deployment or broad industry adoption?

Hence, I see more startups coming into the market space, and some exiting on their own, via mergers and acquisition or other means.

Will we see a feeding frenzy or IPO craze as with earlier technology hype cycles? IMHO, there will be some companies that get the big deal, and some that survive as new players running as a business vs. running to be acquired or go IPO. Others will survive by evolving into something else, while others will join the where-are-they-now list.

If you are an SSD startup CEO, CxO or marketer, their PR person, evangelist or loyal follower, do not worry, as the SSD market and even nand flash is far from being dead. On the other hand, if you think that it has hit its full stride, you are either missing the bigger picture, or too busy patting yourselves on the back for a job well done. There is much more opportunity out there, and not even all the low hanging fruit has been picked yet.

Check out the conversation with Dave Raffo along with comments from others here.

Related links on storage IO metrics and SSD performance
What is the best kind of IO? The one you do not have to do
Is SSD dead? No, however some vendors might be
Storage and IO metrics that matter
IO IO it is off to Storage and IO metrics we go
SSD and Storage System Performance
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
IT and storage economics 101, supply and demand
Researchers and marketers dont agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?
SSD options for Virtual (and Physical) Environments Part IV: What type of SSD is best for your needs

Ok, nuff said for now

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

More storage and IO metrics that matter

It is great to see more conversations and coverage around storage metrics that matter beyond simply focusing on cost per GByte or TByte (e.g. space capacity). Likewise, it is also good to see conversations expanding beyond data footprint reduction (DFR) from a space capacity savings or reduction ratio to also address data movement and transfer rates. Also good to see is the increase in discussion around input/output operations per second (IOPS), tying into conversations from virtualization, VDI and cloud to Solid State Devices (SSD).

Other storage and IO metrics that matter include latency or response time, which is how fast work is done, or time spent waiting. Latency also ties to IOPS in that as more work arrives to be done (IOPS) of various sizes, random or sequential, reads or writes, queue depths are an indicator of how well work is flowing. Another storage and IO metric that matters is availability, because without it, performance or capacity can be affected. Likewise, without performance, availability can be affected.
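
As a rough illustration of how these metrics relate to each other, Little's Law ties queue depth to throughput and response time (queue depth is roughly IOPS times latency). The following minimal sketch is mine, not from the original post, and the workload numbers are made up for illustration.

```python
# Rough illustration of how IOPS, latency and queue depth relate (Little's Law):
# outstanding IOs (queue depth) ~= arrival rate (IOPS) x response time.
# The workload figures below are hypothetical examples, not measured values.

def average_queue_depth(iops: float, latency_seconds: float) -> float:
    return iops * latency_seconds

# 5,000 IOPS at 1 ms average response time -> about 5 IOs in flight
print(average_queue_depth(5_000, 0.001))

# Same 5,000 IOPS but response time grows to 10 ms -> about 50 IOs queued up
print(average_queue_depth(5_000, 0.010))
```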

Needless to say that I am just scratching the surface here with storage and IO metrics that matter for physical, virtual and cloud environments from servers to networks to storage.

Here is a link to a post I did called IO, IO, it is off to storage and IO metrics we go that ties in themes of performance measurements and solid-state disk (SSD) among others. Also check out this piece about why VASA (VMware vSphere APIs for Storage Awareness) is important to have for your VMware CASA, along with Windows boot storage and IO performance for VDI and traditional planning purposes.

Check out this post about metrics and measurements that matter along with this conversation about IOPs, capacity, bandwidth and purchasing discussion topics.

Related links on storage IO metrics and SSD performance
What is the best kind of IO? The one you do not have to do
Is SSD dead? No, however some vendors might be
Storage and IO metrics that matter
IO IO it is off to Storage and IO metrics we go
SSD and Storage System Performance
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
IT and storage economics 101, supply and demand
Researchers and marketers dont agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?
SSD options for Virtual (and Physical) Environments Part IV: What type of SSD is best for your needs

Ok, nuff said for now

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

What is the best kind of IO? The one you do not have to do


data infrastructure server storage I/O trends

Updated 2/10/2018

What is the best kind of IO? If no IO (input/output) operation is the best IO, then the second best IO is the one that can be done as close to the application and processor as possible, with the best locality of reference. Then the third best IO is the one that can be done in less time, or with the least cost or impact to the requesting application, which means moving further down the memory and storage stack (Figure 1).

Figure 1: Memory and storage I/O locality of reference and hierarchy

The problem with IOs is that they are the basic operations for getting data into and out of a computer or processor, so they are required; however, they also have an impact on performance, response or wait time (latency). IOs require CPU or processor time and memory to set up and then process the results, as well as IO and networking resources to move data to its destination or retrieve it from where it is stored. While IOs cannot be eliminated, their impact can be greatly improved or optimized by doing fewer of them via caching, grouped reads or writes (pre-fetch, write-behind) among other techniques and technologies.
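
To make the point about doing fewer IOs concrete, here is a minimal read-cache sketch (my own illustration, not from the original post); the access pattern and cache size are made-up examples.

```python
# Minimal read-cache sketch: repeated reads are served from memory,
# so only cache misses become back-end (disk/SSD) IOs.
from collections import OrderedDict

class ReadCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()   # block -> payload, kept in LRU order
        self.backend_reads = 0

    def read(self, block: int) -> str:
        if block in self.data:                 # cache hit: no back-end IO
            self.data.move_to_end(block)
            return self.data[block]
        self.backend_reads += 1                # cache miss: one back-end IO
        payload = f"data-for-block-{block}"    # stand-in for a real device read
        self.data[block] = payload
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)      # evict the least recently used block
        return payload

cache = ReadCache(capacity=8)
for block in [1, 2, 3, 1, 2, 3, 4, 1, 2, 3] * 10:   # hypothetical hot working set
    cache.read(block)
print(cache.backend_reads)   # far fewer back-end IOs than the 100 logical reads issued
```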

Think of it this way: instead of going on multiple errands, sometimes you can group multiple destinations together, making for a shorter, more efficient trip; however, that optimization may also take longer. Hence sometimes it makes sense to go on a couple of quick, short, low latency trips vs. one single larger trip that takes half a day but accomplishes many things. Of course, how far you have to go on those trips (e.g. locality) makes a difference in how many you can do in a given amount of time.

What is locality of reference?

Locality of reference refers to how close (e.g., location) data is to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, then level 1 (L1), level 2 (L2) or level 3 (L3) onboard cache, followed by dynamic random access memory (DRAM). Then would come memory, also known as storage, on PCIe cards such as a nand flash solid state device (SSD), or storage accessible via an adapter on a direct attached storage (DAS), SAN or NAS device. In the case of a PCIe nand flash SSD card, even though physically the nand flash SSD is closer to the processor, there is still the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as a cache or buffer for data along with meta or control information to further optimize and improve locality of reference. In other words, they help with cache hits, cache use and cache effectiveness vs. simply boosting cache utilization.
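
To put rough numbers on locality of reference, the effective access time can be estimated from the cache hit ratio and the latency of each tier. This is a minimal sketch with illustrative, assumed latencies rather than measurements from any particular product.

```python
# Effective access time = hit_ratio * cache_latency + (1 - hit_ratio) * backend_latency.
# Latency figures below are rough, hypothetical orders of magnitude for illustration only.

def effective_latency_us(hit_ratio: float, cache_us: float, backend_us: float) -> float:
    return hit_ratio * cache_us + (1.0 - hit_ratio) * backend_us

dram_cache_us = 0.1        # DRAM buffer on a PCIe flash card (illustrative)
flash_us = 100.0           # nand flash read across the PCIe bus (illustrative)
hdd_us = 10_000.0          # HDD-based storage behind an adapter (illustrative)

for hit in (0.0, 0.5, 0.9, 0.99):
    print(f"hit ratio {hit:>4}: flash-backed {effective_latency_us(hit, dram_cache_us, flash_us):8.2f} us, "
          f"HDD-backed {effective_latency_us(hit, dram_cache_us, hdd_us):10.2f} us")
```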

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What This All Means

What can you do to cut the impact of IOs?

  • Establish baseline performance and availability metrics for comparison
  • Realize that IOs are a fact of IT virtual, physical and cloud life
  • Understand what is a bad IO along with its impact
  • Identify why an IO is bad, expensive or causing an impact
  • Find and fix the problem, either with software, application or database changes
  • Throw more software caching tools, hypervisors or hardware at the problem
  • Hardware includes faster processors with more DRAM and fast internal busses
  • Leverage local PCIe flash SSD cards for caching or as targets
  • Utilize storage systems or appliances that have intelligent caching and storage optimization capabilities (performance, availability, capacity)
  • Compare changes and improvements to the baseline and quantify the improvement (see the sketch below)
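
For example, the baseline vs. post-change comparison in the last item can be as simple as the following sketch; the metric names and numbers are hypothetical placeholders, not output from any particular tool.

```python
# Compare a post-change measurement against a baseline and report the delta.
# The sample numbers are hypothetical; substitute your own measured metrics.

baseline = {"iops": 4_200, "latency_ms": 12.0, "mbps": 310}
after    = {"iops": 9_800, "latency_ms": 3.5,  "mbps": 540}

for metric in baseline:
    before, now = baseline[metric], after[metric]
    change_pct = (now - before) / before * 100
    print(f"{metric:>10}: {before:>8} -> {now:>8}  ({change_pct:+.1f}%)")
```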

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Speaking of speeding up business with SSD storage

Solid state devices (SSD) are a popular topic gaining both industry adoption and customer deployment to speed up storage performance. Here is a link to a recent conversation that I had with John Hillard to discuss industry trends and perspectives pertaining to using SSD to boost performance and productivity for SMB and other environments.

I/O consolidation from Cloud and Virtual Data Storage Networking (CRC Press) www.storageio.com/book3.html

SSDs can be a great way for organizations to do IO consolidation and reduce costs, in place of using many hard disk drives (HDDs) grouped together to achieve a certain level of performance. By consolidating the IOs off of many HDDs that often end up being underutilized on a space capacity basis, organizations can boost performance for applications while reducing, or reusing, HDD based storage capacity for other purposes including growth.
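
As a back-of-the-envelope illustration of IO consolidation (my own sketch with assumed per-device IOPS figures, not numbers from the conversation), compare how many devices are needed to reach a target IOPS level:

```python
# How many devices are needed to hit a target IOPS level?
# Per-device IOPS figures are rough, assumed values for illustration only.
import math

target_iops = 20_000

hdd_iops_each = 180      # assumed 15K RPM HDD doing small random IOs
ssd_iops_each = 20_000   # assumed enterprise flash SSD doing small random IOs

hdds_needed = math.ceil(target_iops / hdd_iops_each)
ssds_needed = math.ceil(target_iops / ssd_iops_each)

print(f"HDDs needed: {hdds_needed}")   # ~112 drives just to meet the performance target
print(f"SSDs needed: {ssds_needed}")   # 1, with others added for redundancy or capacity
```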

Here is some related material and comments:
Has SSD put Hard Disk Drives (HDDs) On Endangered Species List?
SSD and Storage System Performance
Are Hard Disk Drives (HDDs) getting too big?
Solid state devices and the hosting industry
Achieving Energy Efficiency using FLASH SSD
Using SSD flash drives to boost performance

Four ways to use SSD storage
4 trends that shape how agencies handle storage
Giving storage its due

You can read a transcript of the conversation and listen to the podcast here, or download the MP3 audio here.

Ok, nuff said about SSD (for now)

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

Congratulations to IBM for releasing XIV SPC results

Over the past several years I have done an annual post about IBM and their XIV storage system, and this is the fourth in what has become a series. You can read the first one here, the second one here, and last year's here and here after the announcement of the IBM V7000.

IBM XIV Gen3
IBM recently announced the generation 3 or Gen3 version of XIV, along with releasing for the first time public performance comparison benchmarks using the Storage Performance Council (SPC) throughput-oriented SPC2 workload.

The XIV Gen3 is positioned by IBM as having up to four (4) times the performance of earlier generations of the storage system. In terms of speeds and feeds, the Gen3 XIV supports up to 180 2TB SAS hard disk drives (HDD) that provide up to 161TB of usable storage space capacity. For connectivity, the Gen3 XIV supports up to 24 8Gb Fibre Channel (8GFC) ports or, for iSCSI, 22 1Gb Ethernet (1GbE) ports, with a total of up to 360GBytes of system cache. In addition to the large cache to boost performance, other enhancements include leveraging multi core processors along with an internal InfiniBand network to connect nodes, replacing the former 1GbE interconnect. Note, InfiniBand is only used to interconnect the various nodes in the XIV cluster and is not used for attachment to application servers, which is handled via iSCSI and Fibre Channel.

IBM and SPC storage performance history
IBM has a strong history of, if not leading the industry in, benchmarking and workload simulation of their storage systems, including Storage Performance Council (SPC) among others. The exception for IBM over the past couple of years has been the lack of SPC benchmarks for XIV. Last year when IBM released their new V7000 storage system, benchmarks including SPC were available close to, if not at, the product launch. I have in the past commented about IBM's lack of SPC benchmarks for XIV to confirm their marketing claims, given their history of publishing results for all of their other storage systems. Now that IBM has recently released SPC2 results for the XIV, it is only fitting that I compliment them for doing so.

Benchmark brouhaha
Performance workload simulation results can often lead to apples and oranges comparisons, benchmark brouhaha battles or storage performance games. For example, a few years back NetApp submitted an SPC performance result on behalf of their competitor EMC. Now to be clear on something, I'm not saying that SPC is the best or definitive benchmark or comparison tool for storage or other purposes, as it is not. However, it is representative, and most storage vendors have released some SPC results for their storage systems in addition to TPC and Microsoft ESRP among others. SPC2 is focused on streaming such as video, backup or other throughput centric applications, whereas SPC1 is centered on IOPS or transactional activity. The metrics for SPC2 are Megabytes per second (MBps) for large file processing (LFP), large database query (LDQ) and video on demand delivery (VOD) for a given price and protection level.

What is the best benchmark?
Simple: your own application with as close to actual workload activity as possible. If that is not possible, then a simulation or workload that most closely resembles your needs.

Does this mean that XIV is still relevant?
Yes

Does this mean that XIV G3 should be used for every environment?
Generally speaking, no. However, its performance enhancements should allow it to be considered for more applications than in the past. Plus, with the public comparisons now available, that should help to silence questions (including those from me) about what the system can really do vs. marketing claims.

How does XIV compare to some other IBM storage systems using SPC2 comparisons?

System  | SPC2 MBps | Cost per SPC2 MBps | Storage GBytes | Price tested | Discount | Protection
DS5300  | 5,634.17  | $74.13             | 16,383         | $417,648     | 0%       | RAID 5
V7000   | 3,132.87  | $71.32             | 29,914         | $223,422     | 38-39%   | RAID 5
XIV G3  | 7,467.99  | $152.34            | 154,619        | $1,137,641   | 63-64%   | Mirror
DS8800  | 9,705.74  | $270.38            | 71,537         | $2,624,257   | 40-50%   | RAID 5

In the above comparisons, the DS5300 (NetApp/Engenio based) is a dual controller system (4GB of cache per controller) with 128 x 146.8GB 15K HDDs configured as RAID 5, with no discount applied to the price submitted. The V7000 system, which is based on the IBM SVC along with other enhancements, consists of dual controllers each with 8GB of cache and 120 x 10K 300GB HDDs configured as RAID 5, with just under a 40% discount off list price for the system tested. For the XIV Gen3 system tested, the discount off list price for the submission is about 63%, with 15 nodes, a total of 360GB of cache and 180 x 2TB 7.2K SAS HDDs configured as mirrors. The DS8800 system with dual controllers has 256GB of cache and 768 x 146GB 15K HDDs configured as RAID 5, with a discount between 40 and 50% off of list.

What the various metrics do not show is the benefit of various features and functionality, which should be considered relative to your particular needs. Likewise, if your applications are not centered around bandwidth or throughput, then the above performance comparisons would not be relevant. Also note that the systems above have various discount prices as submitted, which can be a hint to a smart shopper as to where to begin negotiations. You can also do some analysis of the various systems based on their performance, configuration, physical footprint, functionality and cost, plus the links below take you to the complete reports with more information.
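
As one example of that kind of analysis, the sketch below recomputes cost per SPC2 MBps and cost per GByte from the figures in the table above; the derived ratios are my own arithmetic, not part of the SPC reports.

```python
# Recompute price-performance ratios from the SPC2 comparison table above.
# Prices and MBps are as submitted; the derived ratios are illustrative.

systems = {
    "DS5300": {"mbps": 5_634.17, "price": 417_648,   "gbytes": 16_383},
    "V7000":  {"mbps": 3_132.87, "price": 223_422,   "gbytes": 29_914},
    "XIV G3": {"mbps": 7_467.99, "price": 1_137_641, "gbytes": 154_619},
    "DS8800": {"mbps": 9_705.74, "price": 2_624_257, "gbytes": 71_537},
}

for name, s in systems.items():
    cost_per_mbps = s["price"] / s["mbps"]
    cost_per_gb = s["price"] / s["gbytes"]
    print(f"{name:8} ${cost_per_mbps:7.2f} per SPC2 MBps   ${cost_per_gb:6.2f} per GByte")
```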

DS8800 SPC2 executive summary and full disclosure report

XIV SPC2 executive summary and full disclosure report

DS5300 SPC2 executive summary and full disclosure report

V7000 SPC2 executive summary and full disclosure report

Bottom line: benchmarks and performance comparisons are just that, comparisons that may or may not be relevant to your particular needs. Consequently, they should be used as a tool combined with other information to see how a particular solution might be a fit for your specific needs. The best benchmark, however, is your own application running as close as possible to a realistic workload to get a representative perspective of a system's capabilities.

Ok, nuff said
Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

March Metrics and Measuring Social Media

What metrics matter for social media and networking?

Of course the answer should be it depends.

     

For example, would that be number of followers or how many posts, tweets or videos you post?

How about the number of page hits, pages read or unique visitors to a site, perhaps time on site?

Or, how about the number of times a visitor returns to a site or shares the link or information with others?

What about click through rates, page impressions, revenue per page and related metrics?

Maybe the metric is your blog ranking or number of points on your favorite community site such as Storage Monkeys or Wikibon among others?

Another metric could be the number of comments received, particularly if your venue is more interactive for debate or discussion purposes compared to a site with many viewers who prefer to read (lurk). Almost forgot: number of LinkedIn contacts or Facebook friends, along with YouTube and other videos or podcasts, as well as who is on your blog roll.

Let's not forget how many are following or being followed, along with RSS subscribers, as metrics.

To say that there are many different metrics along with reasons or interests around them would be an understatement to say the least.

Why do metrics matter in social networking?

One reason metrics are used (even by those who do not admit it) is to compare status amongst peers or others in your sphere of influence or in adjacent areas.

Who Are You and Your Influences
Some spheres of influence and influences

In addition, metrics also matter for those looking to land or obtain advertising sponsors for their sites, or perhaps to help gain exposure if looking for a new job or career move. Metrics also matter to gauge the effectiveness or return on investment with social media, which could range from how many followers you have to how far your brand's reach extends into other realms and venues.

In the case of twitter, for some the key metric is the number of followers (e.g. popularity) or those being followed, with other metrics being the number of posts or tweets along with retweets and list inclusions. For blogs and web sites, incoming links along with site activity among other metrics factor into various ranking sites. Web site activity can be measured in several ways, including total hits or visits, pages read and unique visitors among others.

Having been involved with social media from a blogging along with twitter perspective for a couple of years, not to mention being a former server and storage capacity planner, I find metrics to be interesting. In addition to the metrics themselves, what is also interesting is how they are used differently for various purposes, including gauging cause and effect or return on social networking investment.

Regardless of your motives or objectives with metrics, here is a quick synopsis of some tools and sites that I have come across that you may already be using, or if not, that you might be interested in.

What are some metrics?

If you are interested in your twitter effectiveness, see your report card at tweet grade. Another twitter site that provides a twitter grade based on numerous factors is Twitter Grader while Klout.com characterizes your activity on four different planes similar to a Gartner Magic quadrant. Over at the customer collective they have an example of a more thorough gauge of effectiveness looking at several different metrics some of which are covered here.

Sample Metrics

Customer Collective Metrics and Rankings

Similar to Technorati, Tekrati, or other directory and index sites, Wefollow is a popular venue for tracking twitter tweeps based on various hash tags, for example IT or storage among many others. Tweet level provides a composite ranking determined by influence, popularity, engagement and trust. Talkreview.com provides various metrics for blogs and websites, including unique visitor traffic estimates, while Compete.com shows estimated site visitor traffic with the option to compare to others. Interested to see how your website or blog is performing in terms of effectiveness and reach? In addition to Compete.com, check out talkreviews.com or Blog grader, which looks at and reports on various blog metrics and information.

The sites and tools mentioned are far from an exhaustive listing of sites or metrics for various purposes, rather a sampling of what is available to meet different needs. For example, there are Alexa, Google and Yahoo rankings among many others.

Wefollow as an example or discussion topic

One of the things that I find interesting is the diversity in the metrics and rankings. For example, if you were to look at wefollow for a particular category in the top 10 or 20, then use one or more of the other tools to see how the various rankings change.

A month or so ago I was curious to see if some of the sites could be gamed beyond running up the number of posts, tweets, followers or followings along with retweets, which some sites appear to be influenced by. As part of determining what metrics matter and which to ignore or keep in the back pocket for when needed, I looked at and experimented with wefollow.

For those who might have been aware of what I was doing, I went from barely being visible, for example in the storage category, to jumping into the top 5. Then with some changes, I was able to disappear from the top 5 and show up elsewhere, and then when all was said and done, return to top rankings.

Does this mean I put a lot of stock or value in wefollow, or simply use it as a gauge and metric along with all of the others? The answer is that it is just that, another metric and tool that can be used for gauging effectiveness and reach, or if you prefer, status or whatever your preference and objective are.

How did I change my rankings on wefollow? Simple: I experimented with using various tags in different combinations, sometimes only one, sometimes many, however keeping them relevant, and then waited several days. I'm sure if you are inclined and have plenty of time on your hands, you can figure out or find out how the actual algorithms work; however for me right now, I have other projects to pursue.

What is the best metric?

That is going to depend on your objectives or what you are trying to accomplish.

As with other measurements and metrics, those for social media provide different points of reference from how many followers to amount of influence.

Depending on your objective, effectiveness may be gauged by number of followers or those being followed, number of posts or the number of times being quoted or referenced by others including in lists.

In some cases rankings that compare with others are based on those sites knowing about you which may mean having to register so that you can be found.

Bottom line, metrics matter however what they mean and their importance will vary depending on objectives, preferences or for accomplishing different things.

One of the interesting things about social networking and media sites is that if you do not like a particular ranking, list, grade or status, then either work to change the influence of those scores, or come up with your own.

What is your take on metrics that matter, unless of course they do not matter to you?

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Green IT, Green Gap, Tiered Energy and Green Myths

There are many different aspects of Green IT along with several myths or misperceptions not to mention missed opportunities.

There is a Green Gap or disconnect between environmentally aware, focused messaging and core IT data center issues. For example, when I ask IT professionals whether they have or are under direction to implement green IT initiatives, the number averages in the 10-15% range.

However, when I ask the same audiences who has or sees power, cooling, floor space, supporting growth, or addressing environmental health and safety (EHS) related issues, the average is 75 to 90%. What this means is that there is a disconnect between what is perceived as being green and the opportunities for IT organizations to make improvements from an economic and efficiency standpoint, including boosting productivity.

 

Some IT Data Center Green Myths
Is “green IT” a convenient or inconvenient truth or a legend?

When it comes to green and virtual environments, there are plenty of myths and realities, some of which vary depending on market or industry focus, price band, and other factors.

For example, there are lines of thinking that only ultra large data centers are subject to PCFE-related issues, or that all data centers need to be built along the Columbia River basin in Washington State, or that virtualization eliminates vendor lock-in, or that hardware is more expensive to power and cool than it is to buy.

The following are some myths and realities as of today, some of which may be subject to change from reality to myth or from myth to reality as time progresses.

Myth: Green and PCFE issues are applicable only to large environments.

Reality: I commonly hear that green IT applies only to the largest of companies. The reality is that PCFE issues or green topics are relevant to environments of all sizes, from the largest of enterprises to the small/medium business, to the remote office branch office, to the small office/home office or “virtual office,” all the way to the digital home and consumer.

 

Myth: All computer storage is the same, and powering disks off solves PCFE issues.

Reality: There are many different types of computer storage, with various performance, capacity, power consumption, and cost attributes. Although some storage can be powered off, other storage that is needed for online access does not lend itself to being powered off and on. For storage that needs to be always online and accessible, energy efficiency is achieved by doing more with less—that is, boosting performance and storing more data in a smaller footprint using less power.

 

Myth: Servers are the main consumer of electrical power in IT data centers.

Reality: In the typical IT data center, on average, 50% of electrical power is consumed by cooling, with the balance used for servers, storage, networking, and other aspects. However, in many environments, particularly processing or computation intensive environments, servers in total (including power for cooling and to power the equipment) can be a major power draw.

 

Myth: IT data centers produce 2 to 8% of all global Carbon Dioxide (CO2) and carbon emissions.

Reality: This might perhaps be true, given some creative accounting and marketing math to help build a justification case or to scare you into doing something. However, the reality is that in the United States, for example, IT data centers consume around 2 to 4% of electrical power (depending on when you read this), and less than 80% of all U.S. CO2 emissions are from electrical power generation, so the math does not quite add up. The reality is this: if no action is taken to improve IT data center energy efficiency, continued demand growth will shift IT power-related emissions from myth to reality, not to mention cause constraints on IT and business sustainability from an economic and productivity standpoint.

Myth: Server consolidation with virtualization is a silver bullet to address PCFE issues.

Reality: Server virtualization for consolidation is only part of an overall solution that should be combined with other techniques, including lower power, faster and more energy efficient servers, and improved data and storage management techniques.

 

Myth: Hardware costs more to power than to purchase.

Reality: Currently, for some low-cost servers, standalone disk storage, or entry level networking switches and desktops, this may be true, particularly where energy costs are excessively high and the devices are kept and used continually for three to five years. A general rule of thumb is that the actual cost of most IT hardware will be a fraction of the price of associated management and software tool costs plus facilities and cooling costs. For the most part, at least as of this writing, it is mainly small standalone individual hard disk drives or small entry level volume servers that can cost less to buy than to power when used over a three to five year time frame in locations that have very high electrical costs.

 

Regarding this last myth, for the more commonly deployed external storage systems across all price bands and categories, generally speaking, except for extremely inefficient and hot running legacy equipment, the reality is that it is still cheaper to power the equipment than to buy it. Having said that, there are some qualifiers that should also be used as key indicators to keep the equation balanced. These qualifiers include the acquisition cost, if any, for new, expanded, or remodeled habitats or space to house the equipment, the price of energy in a given region including surcharges, as well as cooling, length of time, and continuous time the device will be used.

For larger businesses, IT equipment in general still costs more to purchase than to power, particularly with newer, more energy efficient devices. However, given rising energy prices, or the need to build new facilities, this could change moving forward, particularly if a move toward energy efficiency is not undertaken.

There are many variables when purchasing hardware, including acquisition cost, the energy efficiency of the device, power and cooling costs for a given location and habitat, and facilities costs. For example, if a new storage solution is purchased for $100,000, yet new habitat or facilities must be built for three to five times the cost of the equipment, those costs must be figured into the purchase cost.

Likewise, if the price of a storage solution decreases dramatically, but the device consumes a lot of electrical power and needs a large cooling capacity while operating in a region with expensive electricity costs, that, too, will change the equation and the potential reality of the myth.
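
To make the purchase-versus-power comparison concrete, here is a rough sketch where every price, wattage and electricity rate is an assumed, illustrative value rather than a figure from any vendor or report:

```python
# Compare acquisition cost against energy cost over a device's service life.
# Every figure below is an assumed, illustrative value.

def energy_cost(watts: float, years: float, cents_per_kwh: float, cooling_overhead: float = 1.0) -> float:
    """Electricity cost over the period, with a cooling multiplier (2.0 = 1 W of cooling per 1 W of IT load)."""
    kwh = watts / 1000.0 * 24 * 365 * years * cooling_overhead
    return kwh * cents_per_kwh / 100.0

storage_system_price = 100_000          # assumed purchase price
power_draw_watts = 2_500                # assumed average draw
cost_to_power = energy_cost(power_draw_watts, years=5, cents_per_kwh=12, cooling_overhead=2.0)

print(f"Acquisition: ${storage_system_price:,.0f}")
print(f"5-year power + cooling: ${cost_to_power:,.0f}")   # roughly $26K at these assumptions
```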

 

Tiered Energy Sources
Given that IT resources and facilities require energy to power equipment as well as keep it cool, electricity is a popular topic associated with Green IT, economics and efficiency, with lots of metrics and numbers tossed around. With that in mind, the U.S. national average CO2 emission is 1.34 lb/kWh of electrical power. Granted, this number will vary depending on the region of the country and the source of fuel for the power-generating station or power plant.
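
For a rough sense of scale, emissions can be estimated by multiplying the energy used by the emission factor. In this sketch the 1.34 lb/kWh factor is the national average cited above, while the 500 watt server draw is an assumed, illustrative figure.

```python
# Estimate annual CO2 emissions from electrical power use.
# The 1.34 lb/kWh factor is the U.S. national average mentioned above;
# the 500 W server draw is an assumed, illustrative figure.

CO2_LB_PER_KWH = 1.34

server_watts = 500
annual_kwh = server_watts / 1000.0 * 24 * 365          # ~4,380 kWh per year
annual_co2_lb = annual_kwh * CO2_LB_PER_KWH            # ~5,870 lb of CO2 per year

print(f"{annual_kwh:,.0f} kWh/year -> {annual_co2_lb:,.0f} lb CO2/year")
```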

Like tiered IT resources (servers, storage, I/O networks, virtual machines and facilities), of which there are various tiers or types of technologies to meet various needs, there are also multiple types of energy sources. Different tiers of energy sources vary by their cost, availability and environmental characteristics among others. For example, in the US there are different types of coal, and not all coal is as dirty, when combined with emissions air scrubbers, as you might be led to believe; however, there are other energy sources to consider as well.

Coal continues to be a dominant fuel source for electrical power generation both in the United States and abroad, with other fuel sources, including oil, gas, natural gas, liquid propane gas (LPG or propane), nuclear, hydro, thermo or steam, wind and solar. Within a category of fuel, for example, coal, there are different emissions per ton of fuel burned. Eastern U.S. coal is higher in CO2 emissions per kilowatt hour than western U.S. lignite coal. However, eastern coal has more British thermal units (Btu) of energy per ton of coal, enabling less coal to be burned in smaller physical power plants.

If you have ever noticed that coal power plants in the United States seem to be smaller in the eastern states than in the Midwest and western states, it’s not an optical illusion. Because eastern coal burns hotter, producing more Btu, smaller boilers and stockpiles of coal are needed, making for smaller power plant footprints. On the other hand, as you move into the Midwest and western states of the United States, coal power plants are physically larger, because more coal is needed to generate 1 kWh, resulting in bigger boilers and vent stacks along with larger coal stockpiles.

On average, a gallon of gasoline produces about 20 lb of CO2, depending on usage and efficiency of the engine as well as the nature of the fuel in terms of octane or amount of Btu. Aviation fuel and diesel fuel differ from gasoline, as does natural gas or various types of coal commonly used in the generation of electricity. For example, natural gas is less expensive than LPG but also provides fewer Btu per gallon or pound of fuel. This means that more natural gas is needed as a fuel to generate a given amount of power.

Recently, while researching small 10 to 12 kW standby generators for my office, I learned about some of the differences between propane and natural gas. What I found was that with natural gas as fuel, a given generator produced about 10.5 kW, whereas the same unit attached to an LPG or propane fuel source produced 12 kW. The trade off was that to get as much power as possible out of the generator, the higher cost LPG was the better choice. To use lower cost fuel but get less power out of the device, the choice would be natural gas. If more power was needed, then a larger generator could be deployed to use natural gas, with the trade off of requiring a larger physical footprint.

Oil and gas are not used as much as fuel sources for electrical power generation in the United States as in other countries such as the United Kingdom. Gasoline, diesel, and other petroleum based fuels are used for some power plants in the United States, including standby or peaking plants. In the electrical power generation and transmission (G and T) industry, as in IT where different tiers of servers and storage are used for different applications, there are different tiers of power plants using different fuels with various costs. Peaking and standby plants are brought online when there is heavy demand for electrical power, during disruptions when a lower cost or more environmentally friendly plant goes offline for planned maintenance, or in the event of a trip or unplanned outage.

CO2 is commonly discussed with respect to green and associated emissions; however, there are other so-called greenhouse gases, including Nitrogen Dioxide (NO2) and water vapor among others. Carbon makes up only a fraction of CO2. To be specific, only about 27% of a pound of CO2 is carbon; the balance is not. Consequently, carbon emissions tax schemes (ETS), as opposed to CO2 tax schemes, need to account for the amount of carbon per ton of CO2 being put into the atmosphere. In some parts of the world, including the EU and the UK, ETS are either already in place or in initial pilot phases to provide incentives to improve energy efficiency and use.

Meanwhile, in the United States there are voluntary programs for buying carbon offset credits, along with initiatives such as the Carbon Disclosure Project. The Carbon Disclosure Project (www.cdproject.net) is a not-for-profit organization that facilitates the flow of information pertaining to emissions by organizations, so that investors can make informed decisions and business assessments from an economic and environmental perspective. Another voluntary program is the United States EPA Climate Leaders initiative, where organizations commit to reduce their GHG emissions to a given level over a specific period of time.

Regardless of your stance or perception on green issues, the reality is that for business and IT sustainability, a focus on ecological and, in particular, the corresponding economic aspects cannot be ignored. There are business benefits to aligning the most energy efficient and low power IT solutions combined with best practices to meet different data and application requirements in an economic and ecologically friendly manner.

Green initiatives need to be seen in a different light, as business enablers as opposed to ecological cost centers. For example, many local utilities and state energy or environmentally concerned organizations are providing funding, grants, loans, or other incentives to improve energy efficiency. Some of these programs can help offset the costs of doing business and going green. Instead of being seen as the cost to go green, by addressing efficiency, the by-products are economic as well as ecological.

Put a different way, a company can spend carbon credits to offset its environmental impact, similar to paying a fine for noncompliance, or it can achieve efficiency and obtain incentives. There are many solutions and approaches to address these different issues, which will be looked at in the coming chapters.

What does this all mean?
There are real things that can be done today that can be effective toward achieving a balance of performance, availability, capacity, and energy effectiveness to meet particular application and service needs.

Sustaining for economic and ecological purposes can be achieved by balancing performance, availability, capacity, and energy to applicable application service level and physical floor space constraints along with intelligent power management. Energy economics should be considered as much a strategic resource part of IT data centers as are servers, storage, networks, software, and personnel.

The bottom line is that without electrical power, IT data centers come to a halt. Rising fuel prices, strained generating and transmission facilities for electrical power, and a growing awareness of environmental issues are forcing businesses to look at PCFE issues. IT data centers to support and sustain business growth, including storing and processing more data, need to leverage energy efficiency as a means of addressing PCFE issues. By adopting effective solutions, economic value can be achieved with positive ecological results while sustaining business growth.

Some additional links include:

Want to learn or read more?

Check out Chapter 1 (Green IT and the Green Gap, Real or Virtual?) in my book “The Green and Virtual Data Center” (CRC) here or here.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

2010 and 2011 Trends, Perspectives and Predictions: More of the same?

2011 is not a typo; I figured that since I'm getting caught up on some things, why not get a jump on next year as well.

Since 2009 went by so fast and I'm finally getting around to doing an obligatory 2010 predictions post, let's take a look at both 2010 and 2011.

Actually, I'm just getting around to doing a post here, having already done interviews and articles for others that will soon be released.

Based on prior trends and looking at forecasts, a simple prediction is that some of the items for 2010 will apply for 2011 as well, given that some of this year's items may have been predicted by some in 2008, 2007, 2006, 2005 or, well ok, you get the picture. :)

Predictions are fun and funny in that for some, they are taken very seriously, while for others, at best they are taken with a grain of salt depending on where you sit. This applies both for the reader as well as who is making the predictions along with various motives or incentives.

Some are serious, some not so much…

For some, predictions are a great way of touting or promoting favorite wares (hard, soft or services) or getting yet another plug (YAP is a TLA BTW) in to meet coverage or exposure quota.

Meanwhile for others, predictions are a chance to brush up on new terms for the upcoming season of buzzword bingo games (did you pick up on YAP).

In honor of the Vancouver winter games, I'm expecting some cool Olympic sized buzzword bingo games, with a new slippery fast one being federation. Some buzzwords will take a break in 2010 as well as 2011, having been worked pretty hard the past few years, while others that have been on break will reappear well rested, rejuvenated, and ready for duty.

Let's also clarify something regarding predictions, and this is that they can be from at least two different perspectives. One view is that of a trend, what will be talked about or discussed in the industry. The other is in terms of what will actually be bought, deployed and used.

What can be confusing is sometimes the two perspectives are intermixed or assumed to be one and the same and for 2010 I see that trend continuing. In other words, there is adoption in terms of customers asking and investigating technologies vs. deployment where they are buying, installing and using those technologies in primary situations.

It is safe to say that there is still no such thing as an information, data or processing recession. Ok, surprise surprise; my dogs could have probably made that prediction during a nap. However what this means is more data will need to be moved, processed and stored for longer periods of time and at a lower cost without degrading performance or availability.

This means, denser technologies that enable a lower per unit cost of service without negatively impacting performance, availability, capacity or energy efficiency will be needed. In other words, watch for an expanded virtualization discussion around life beyond consolidation for servers, storage, desktops and networks with a theme around productivity and virtualization for agility and management enablement.

Certainly there will be continued mergers and acquisitions on both a small as well as large scale, ranging from liquidation sales or bargain hunting to a mega blockbuster or two. I'm thinking in terms of outside-of-the-box deals, the type that will have people wondering, perhaps confused, as to why such a deal would be done until the whole picture is revealed and thought out.

In other words, outside of perhaps IBM, HP, Oracle, Intel or Microsoft among a few others, no vendor is too large not to be acquired, merged with, or even involved in a reverse merger. I'm also thinking in terms of vendors filling in niche areas as well as building out their larger portfolio and IT stacks for integrated solutions.

Ok, let's take a look at some easy ones, lay-ups or slam dunks:

  • More cluster, cloud conversations and confusion (public vs. private, service vs. product vs. architecture)
  • More server, desktop, IO and storage consolidation (excuse me, server virtualization)
  • Data footprint impact reduction ranging from deletion to archive to compress to dedupe among others
  • SSD and in particular flash continues to evolve with more conversations around PCM
  • Growing awareness of social media as yet another tool for customer relations management (CRM)
  • Security, data loss/leak prevention, digital forensics, PCI (payment card industry) and compliance
  • Focus expands from gaming/digital surveillance/security and energy to healthcare
  • Fibre Channel over Ethernet (FCoE) mainstream in discussions with some initial deployments
  • Continued confusion of Green IT and carbon reduction vs. economic and productivity (Green Gap)
  • No such thing as an information, data or processing recession, granted budgets are strained
  • Server, Storage or Systems Resource Analysis (SRA) with event correlation
  • SRA tools that provide and enable automation along with situational awareness

The green gap of confusion will continue, with carbon or environment centric stories and messages continuing to take a back seat while people realize the other dimension of green, which is productivity.

As previously mentioned, virtualization of servers and storage continues to be popular, with an expanding focus from just consolidation to one around agility, flexibility and enabling production, high performance or other systems that do not lend themselves to consolidation to be virtualized.

6Gb SAS interfaces as well as more SAS disk drives continue to gain popularity. I have said in the past there was a long shot that 8GFC disk drives might appear. We might very well see those in higher end systems, while SAS drives continue to pick up the high performance spinning disk role in mid range systems.

Granted, some types of disk drives will give way over time to others; for example, high performance 3.5” 15.5K Fibre Channel disks will give way to 2.5” 15.5K SAS, boosting densities and energy efficiency while maintaining performance. SSD will help to offload hot spots as it has in the past, enabling disks to be more effectively used in their applicable roles or tiers, with a net result of enhanced optimization, productivity and economics, all of which have environmental benefits (e.g. the other Green IT, closing the Green Gap).

What I don't see occurring, at least in 2010

  • An information or data recession requiring less server, storage, I/O networking or software resources
  • OSD (object based disk storage without a gateway) at least in the context of T10
  • Mainframes, magnetic tape, disk drives, PCs, or Windows going away (at least physically)
  • Cisco cracking top 3, no wait, top 5, no make that top 10 server vendor ranking
  • More respect for growing and diverse SOHO market space
  • iSCSI taking over for all I/O connectivity, however I do see iSCSI expand its footprint
  • FCoE and flash based SSD reaching tipping point in terms of actual customer deployments
  • Large increases in IT Budgets and subsequent wild spending rivaling the dot com era
  • Backup, security, data loss prevention (DLP), data availability or protection issues going away
  • Brett Favre and the Minnesota Vikings winning the super bowl

What will be predicted at the end of 2010 for 2011 (some of these will be déjà vu)

  • Many items that were predicted this year, last year, the year before that and so on…
  • Dedupe moving into primary and online active storage, rekindling of dedupe debates
  • Demise of cloud in terms of hype and confusion being replaced by federation
  • Clustered, grid, bulk and other forms of scale out storage grow in adoption
  • Disk, Tape, RAID, Mainframe, Fibre Channel, PCs, Windows being declared dead (again)
  • 2011 will be the year of Holographic storage and T10 OSD (an annual prediction by some)
  • FCoE kicks into broad and mainstream deployment adoption reaching tipping point
  • 16Gb (16GFC) Fibre Channel gets more attention stirring FCoE vs. FC vs. iSCSI debates
  • 100GbE gets more attention along with 4G adoption in order to move more data
  • Demise of iSCSI at the hands of SAS at low end, FCoE at high end and NAS from all angles

Gaining ground in 2010 however not yet in full stride (at least from customer deployment)

  • On the connectivity front, iSCSI, 6Gb SAS, 8Gb Fibre Channel, FCoE and 100GbE
  • SSD/flash based storage everywhere, however continued expansion
  • Dedupe everywhere including primary storage – it's still far from its full potential
  • Public and private clouds along with pNFS as well as scale out or clustered storage
  • Policy based automated storage tiering and transparent data movement or migration
  • Microsoft HyperV and Oracle based server virtualization technologies
  • Open source based technologies along with heterogeneous encryption
  • Virtualization life beyond consolidation addressing agility, flexibility and ease of management
  • Desktop virtualization using Citrix, Microsoft and VMware along with Microsoft Windows 7

Buzzword bingo hot topics and themes (in no particular order) include:

  • 2009 and previous year carry over items including cloud, iSCSI, HyperV, Dedupe, open source
  • Federation takes over some of the work of cloud, virtualization, clusters and grids
  • E2E, End to End management preferably across different technologies
  • SAS, Serial Attached SCSI for server to storage systems and as disk to storage interface
  • SRA, E2E, event correlation and other situational awareness related IRM tools
  • Virtualization, Life beyond consolidation enabling agility, flexibility for desktop, server and storage
  • Green IT, Transitions from carbon focus to economic with efficiency enabling productivity
  • FCoE, Continues to evolve and mature with more deployments however still not at tipping point
  • SSD, Flash based mediums continue to evolve however tipping point is still over the horizon
  • IOV, I/O Virtualization for both virtual and non virtual servers
  • Other new or recycled buzzword bingo candidates include PCoIP, 4G,

RAID will again be pronounced dead or no longer relevant, yet it is being found in more diverse deployments from consumer to the enterprise. In other words, RAID may be boring and thus no longer relevant to talk about, yet it is being used everywhere and enhanced in evolutionary ways, perhaps for some even revolutionary ones.

Tape keeps being declared dead (e.g. on the zombie technology list), yet it is being enhanced, purchased and utilized at higher rates, with more data stored than at any point in history. Instead of being killed off by the disk drive, tape is being kept around for both traditional uses as well as taking on new roles where it is best suited, such as long term or bulk off-line storage of data in ultra dense and energy efficient, not to mention economical, manners.

What I am seeing and hearing is that customers using tape are able to reduce the number of drives or transports, yet due to leveraging disk buffers or caches including from VTL and dedupe devices, they are able to operate their devices at higher utilization, thus requiring fewer devices with more data stored on media than in the past.

Likewise, even though I have been a fan of SSD for about 20 years and am bullish on its continued adoption, I do not see SSD killing off the spinning disk drive anytime soon. Disk drives are helping tape take on this new role by being a buffer or cache in the form of VTLs, disk based backup and bulk storage enhanced with compression, dedupe, thin provisioning and replication among other functionality.

There you have it, my predictions, observations and perspectives for 2010 and 2011. It is a broad and diverse list however I also get asked about and see a lot of different technologies, techniques and trends tied to IT resources (servers, storage, I/O and networks, hardware, software and services).

Let's see how they play out.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

EPA Server and Storage Workshop Feb 2, 2010

EPA Energy Star

Following up on a recent previous post pertaining to US EPA Energy Star(r) for Servers, Data Center Storage and Data Centers, there will be a workshop held Tuesday February 2, 2010 in San Jose, CA.

Here is the note (Italics added by me for clarity) from the folks at EPA with information about the event and how to participate.

 

Dear ENERGY STAR® Servers and Storage Stakeholders:

Representatives from the US EPA will be in attendance at The Green Grid Technical Forum in San Jose, CA in early February, and will be hosting information sessions to provide updates on recent ENERGY STAR servers and storage specification development activities.  Given the timing of this event with respect to ongoing data collection and comment periods for both product categories, EPA intends for these meetings to be informal and informational in nature.  EPA will share details of recent progress, identify key issues that require further stakeholder input, discuss timelines for the completion, and answer questions from the stakeholder community for each specification.

The sessions will take place on February 2, 2010, from 10:00 AM to 4:00 PM PT, at the San Jose Marriott.  A conference line and Webinar will be available for participants who cannot attend the meeting in person.  The preliminary agenda is as follows:

Servers (10:00 AM to 12:30 PM)

  • Draft 1 Version 2.0 specification development overview & progress report
    • Tier 1 Rollover Criteria
    • Power & Performance Data Sheet
    • SPEC efficiency rating tool development
  • Opportunities for energy performance data disclosure

 

Storage (1:30 PM to 4:00 PM)

  • Draft 1 Version 1.0 specification development overview & progress report
  • Preliminary stakeholder feedback & lessons learned from data collection 

A more detailed agenda will be distributed in the coming weeks.  Please RSVP to storage@energystar.gov or servers@energystar.gov no later than Friday, January 22.  Indicate in your response whether you will be participating in person or via Webinar, and which of the two sessions you plan to attend.

Thank you for your continued support of ENERGY STAR.

 

End of EPA Transmission

For those attending the event, I look forward to seeing you there in person on Tuesday before flying down to San Diego, where I will be presenting on Wednesday the 3rd at The Green Data Center Conference.

Cheers
Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Recent tips, videos, articles and more update V2010.1

Realizing that some prefer blogs to websites, Twitter or other venues, here are some recent links to articles, tips, videos, webcasts and other content that has appeared in different venues since August 2009.

  • i365 Guest Interview: Experts Corner: Q&A with Greg Schulz December 2009
  • SearchCIO Midmarket: Remote-location disaster recovery risks and solutions December 2009
  • BizTech Magazine: High Availability: A Delicate Balancing Act November 2009
  • ESJ: What Comprises a Green, Efficient and Effective Virtual Data Center? November 2009
  • SearchSMBStorage: Determining what server to use for SMB November 2009
  • SearchStorage: Performance metrics: Evaluating your data storage efficiency October 2009
  • SearchStorage: Optimizing capacity and performance to reduce data footprint October 2009
  • SearchSMBStorage: How often should I conduct a disaster recovery (DR) test? October 2009
  • SearchStorage: Addressing storage performance bottlenecks in storage September 2009
  • SearchStorage AU: Is tape the right backup medium for smaller businesses? August 2009
  • ITworld: The new green data center: From energy avoidance to energy efficiency August 2009
  • Videos and podcasts include:
    December 2009 Video: Green Storage: Metrics and measurement for management insight
    Discussion between Greg Schulz and Mark Lewis of TechTarget about the importance of metrics and measurement to gauge productivity and efficiency for Green IT and enabling virtual information factories. Click here to watch the video.

    December 2009 Podcast: iSCSI SANs can be a good fit for SMB storage
    Discussion between Greg Schulz and Andrew Burton of TechTarget about iSCSI and other related technologies for SMB storage. Click here to listen to the podcast.

    December 2009 Podcast: RAID Data Protection Discussion
    Discussion between Greg Schulz and Andrew Burton of TechTarget about RAID data protection techniques and technologies. Click here to listen to the podcast.

    December 2009 Podcast: Green IT, Efficiency and Productivity Discussion
    Discussion between Greg Schulz and Jon Flower of Adaptec about Green IT, energy efficiency, intelligent power management (IPM) also known as MAID 2.0, and other forms of optimization techniques including SSD. Click here to listen to the podcast sponsored by Adaptec.

    November 2009 Podcast: Reducing your data footprint impact
    Even though many enterprise data storage environments are coping with tightened budgets and reduced spending, overall net storage capacity is increasing. In this interview, Greg Schulz, founder and senior analyst at StorageIO Group, discusses how storage managers can reduce their data footprint. Schulz touches on the importance of managing your data footprint on both online and offline storage, as well as the various tools for doing so, including data archiving, thin provisioning and data deduplication. Click here to listen to the podcast.

    October 2009 Podcast: Enterprise data storage technologies rise from the dead
    In this interview, Greg Schulz, founder and senior analyst of the Storage I/O group, classifies popular technologies such as solid-state drives (SSDs), RAID and Fibre Channel (FC) as “zombie” technologies. Why? These are already set to become part of standard storage infrastructures, says Schulz, and are too old to be considered fresh. But while some consider these technologies to be stale, users should expect to see them in their everyday lives. Click here to listen to the podcast.

    Check out the Tips, Tools and White Papers, and News pages for additional commentary, coverage and related content or events.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

    Saving Money with Green IT: Time To Invest In Information Factories

    There is a good and timely article titled Green IT Can Save Money, Too over at Business Week that has a familiar topic and theme for those who read this blog or other content, articles, reports, books, white papers, videos, podcasts or in-person speaking and keynote sessions that I have done.

    I posted a short version of this over there, here is the full version that would not fit in their comment section.

    Short of calling it Green IT 2.0 or the perfect storm, there is a resurgence and more importantly IMHO a growing awareness of the many facets of Green IT along with Green in general having an economic business sustainability aspect.

    While the Green Gap and confusion still exist, that is, the difference between what people think or perceive and actual opportunities or issues, growing awareness will close or at least narrow it. For example, when I regularly talk with IT professionals from organizations of various sizes and industry focuses across the globe and ask them about having to go green, the response is in the 7-15% range (these numbers are changing), with most believing that Green is only about carbon footprint.

    On the other hand, when I ask them if they have power, cooling, floor space or other footprint constraints, including frozen or reduced budgets, recycling along with ewaste disposition or RoHS requirements, not to mention the need to sustain business growth without negatively impacting quality of service or customer experience, the response jumps to 65-75% (these numbers are changing) if not higher.

    That is the essence of the green gap or disconnect!

    Granted, carbon dioxide (CO2) reduction is important, along with NO2, water vapor and other related issues; however, there is also the need to do more with what is available, stretching resources and footprints to be more productive within a shrinking footprint. Keep in mind that there is no such thing as an information, data or processing recession, with all indicators pointing towards the need to move, manage and store larger amounts of data on a go forward basis. Thus the need to do more within a given footprint or constraint, maximizing resources, energy, productivity and available budgets.

    Innovation is the ability to do more with less at a lower cost without compromising quality of service or negatively impacting customer experience. Regardless of whether you are a manufacturer or a service provider, including in IT, by innovating with a diverse Green IT focus to become more efficient and optimized, your customers in turn become more enabled and competitive.

    By shifting from an avoidance model, where cost cutting or containment is the near-term tactical focus, to an efficiency and productivity model via optimization, net unit costs should be lowered while the overall service experience improves. This means treating IT as an information factory, one that needs investment in people, processes and technologies (hardware, software, services) along with management metric and indicator tools.

    The net result is that environmental or perceived green issues are addressed and self-funded via investment in Green IT technology that boosts productivity (e.g. closing or narrowing the Green Gap). Thus, the environmental concerns that organizations have or need to address for different reasons, yet lack funding for, get addressed via funding to boost business productivity, which has tangible ROI characteristics similar to other lean manufacturing approaches.

    Here are some additional links to learn more about these and other related themes:

    Have a read over at Business Week about how Green IT Can Save Money, Too while thinking about how investing in IT infrastructure productivity (Information Factories) by becoming more efficient and optimized helps the business top and bottom line, not to mention the environment as well.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Optimize Data Storage for Performance and Capacity Efficiency

    This post builds on a recent article I did that can be read here.

    Even with tough economic times, there is no such thing as a data recession! Thus the importance of optimizing data storage efficiency, addressing both performance and capacity without impacting availability, in a cost-effective way that lets you do more with what you have.

    What this means is that even though budgets are tight or have been cut, resulting in reduced spending, overall net storage capacity is up year over year by double digits if not higher in some environments.

    Consequently, there is continued focus on stretching available IT and storage related resources or footprints further while eliminating barriers or constraints. IT footprint constraints can be physical in a cabinet or rack as well as floorspace, power or cooling thresholds and budget among others.

    Constraints can be due to lack of performance (bandwidth, IOPS or transactions), poor response time or lack of availability for some environments. Yet for other environments, constraints can be lack of capacity, limited primary or standby power or cooling constraints. Other constraints include budget, staffing or lack of infrastructure resource management (IRM) tools and time for routine tasks.

    Look before you leap
    Before jumping into an optimization effort, gain insight (if you do not already have it) as to where the bottlenecks exist, along with the cause and effect of moving or reconfiguring storage resources. For example, boosting capacity use to more fully utilize storage resources can create a performance issue or data center bottleneck elsewhere.

    An alternative scenario is that, in the quest to boost performance, storage is seen as being under-utilized, yet when capacity use is increased, lo and behold, response time deteriorates. The result can be a vicious cycle, hence the need to address the underlying issue rather than moving problems around, by using tools to gain insight into resource usage, both space and activity or performance.
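As a quick illustration of that vicious cycle, here is a toy queueing sketch. The 5 ms service time is an assumption for illustration only, and real arrays with caches and parallelism behave differently, however the shape of the curve is the point: response time climbs non-linearly as utilization rises.

```python
# Toy queueing model: response time vs. device utilization.
# Assumes a single queue with a fixed 5 ms service time (illustrative only);
# real storage arrays with caches and parallelism behave differently.

service_time_ms = 5.0  # assumed average per-IO service time

for utilization in (0.30, 0.50, 0.70, 0.85, 0.95):
    # Basic M/M/1 relationship: R = S / (1 - U)
    response_ms = service_time_ms / (1.0 - utilization)
    print(f"{utilization:.0%} busy -> ~{response_ms:.1f} ms average response time")
```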

    Gaining insight means looking at capacity use along with performance and availability activity, and how they consume power, cooling and floor space. Consequently, an important step is gaining insight and knowledge of how your resources are being used to deliver various levels of service.

    Tools include storage or system resource management (SRM) tools that report on storage space capacity usage, performance and availability, with some now adding energy usage metrics, along with storage or system resource analysis (SRA) tools.

    Cooling Off
    Power and cooling are commonly talked about as constraints, either from a cost standpoint, or availability of primary or secondary (e.g. standby) energy and cooling capacity to support growth. Electricity is essential for powering IT equipment including storage enabling devices to do their specific tasks of storing data, moving data, processing data or a combination of these attributes.

    Thus, power gets consumed, some work or effort to move and store data takes place, and the by-product is heat that needs to be removed. In a typical IT data center, cooling on average can account for about 50% of the energy used, with some sites using less.

    With cooling being a large consumer of electricity, a small percentage change in how cooling consumes energy can yield large results. Addressing cooling energy consumption can help with budget or cost issues, or free up cooling capacity to support the installation of additional storage or other IT equipment.
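As a rough back-of-the-envelope sketch of why small cooling gains matter (the IT load, cooling share and energy rate below are assumptions for illustration, not measurements from any particular site):

```python
# Back-of-the-envelope cooling cost estimate (all inputs are assumptions).

it_load_kw = 100.0          # average IT equipment draw
cooling_share = 0.50        # cooling as a share of total facility energy (typical figure cited above)
energy_cost_per_kwh = 0.12  # USD per kWh
hours_per_year = 24 * 365

# If cooling is ~50% of the total, the cooling draw roughly matches the IT load.
cooling_kw = it_load_kw * cooling_share / (1.0 - cooling_share)

annual_cooling_cost = cooling_kw * hours_per_year * energy_cost_per_kwh
print(f"Annual cooling cost: ${annual_cooling_cost:,.0f}")
print(f"A 10% cooling efficiency gain saves roughly ${annual_cooling_cost * 0.10:,.0f} per year")
```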

    Keep in mind that effective cooling relies on removing heat from as close to the source as possible to avoid over-cooling, which requires more energy. If you have not done so, have a facilities review or assessment performed; these can range from a quick walk around to a more in-depth review and thermal airflow analysis. Means of removing heat close to the source include techniques such as intelligent, precision or smart cooling, also known by other marketing names.

    Powering Up, or, Powering Down
    Speaking of energy or power, in addition to addressing cooling, there are a couple of ways of addressing power consumption by storage equipment (Figure 1). The most commonly discussed approach towards efficiency is energy avoidance, involving powering down storage when it is not in use, such as first-generation MAID, at the cost of performance.

    For off-line storage, tape and other removable media provide low-cost capacity per watt with little to no energy needed when not in use. Second-generation (e.g. MAID 2.0) solutions with intelligent power management (IPM) capabilities have become more prevalent, enabling performance or energy savings on a more granular or selective basis, often as a standard feature in common storage systems.
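Here is a minimal sketch of the kind of estimate involved in weighing IPM or MAID style power-down. The drive wattages, idle fraction and energy rate are illustrative assumptions, not vendor specifications.

```python
# Rough estimate of energy saved by spinning down idle drives (MAID/IPM style).
# Drive wattages, idle fraction and energy rate are illustrative assumptions only.

drives = 192
active_watts = 12.0         # assumed draw per drive while spinning
idle_watts = 2.0            # assumed draw per drive when spun down / in a low-power state
idle_fraction = 0.60        # assumed share of time drives can be powered down
energy_cost_per_kwh = 0.12
hours_per_year = 24 * 365

baseline_kwh = drives * active_watts * hours_per_year / 1000.0
ipm_kwh = drives * hours_per_year / 1000.0 * (
    active_watts * (1.0 - idle_fraction) + idle_watts * idle_fraction
)
savings_usd = (baseline_kwh - ipm_kwh) * energy_cost_per_kwh

print(f"Baseline: {baseline_kwh:,.0f} kWh/yr  With IPM: {ipm_kwh:,.0f} kWh/yr")
print(f"Estimated annual savings: ${savings_usd:,.0f}")
```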

    Figure 1: How various RAID levels and configuration impact or benefit footprint constraints

    Another approach to energy efficiency, seen in Figure 1, is doing more work for active applications per watt of energy to boost productivity. This can be done by using the same amount of energy to do more work, or by doing the same amount of work with less energy.

    For example, instead of using larger-capacity disks to improve capacity-per-watt metrics, active or performance-sensitive storage should be looked at on an activity basis such as IOPS, transactions, videos, emails or throughput per watt. Hence, a fast disk drive doing work can be more energy-efficient in terms of productivity than a higher-capacity, slower disk drive for active workloads, while for idle or inactive data the inverse should hold true.
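A simple sketch of aligning the metric to the task follows; the drive specs below are rough assumptions for illustration only, not vendor data.

```python
# Compare two hypothetical drives on activity-per-watt vs capacity-per-watt.
# The specs below are rough illustrative assumptions, not vendor data.

drives = {
    "15K RPM 146GB": {"iops": 180, "capacity_gb": 146, "watts": 15.0},
    "7200 RPM 1TB": {"iops": 80, "capacity_gb": 1000, "watts": 11.0},
}

for name, d in drives.items():
    print(f"{name}: {d['iops'] / d['watts']:.1f} IOPS/watt, "
          f"{d['capacity_gb'] / d['watts']:.0f} GB/watt")

# The fast drive wins on IOPS per watt (active workloads); the big drive wins
# on GB per watt (idle or capacity-centric data) - align the metric to the task.
```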

    On a go-forward basis, the trend already being seen with some servers and storage systems is to do more work while using less energy. Thus a larger gap between useful work (for active or non-idle storage) and the amount of energy consumed yields a better efficiency rating, or take the inverse if you prefer smaller numbers.

    Reducing Data Footprint Impact
    Data footprint impact reduction tools or techniques for both on-line as well as off-line storage include archiving, data management, compression, deduplication, space-saving snapshots, thin provisioning along with different RAID levels among other approaches. From a storage access standpoint, you can also include bandwidth optimization, data replication optimization, protocol optimizers along with other network technologies including WAFS/WAAS/WADM to help improve efficiency of data movement or access.

    Thin provisioning for capacity-centric environments can be used to achieve a higher effective storage use level by essentially overbooking storage, similar to how airlines oversell seats on a flight. If you have good historical information and insight into how storage capacity is used and over-allocated, thin provisioning enables improved effective storage use for some applications.

    However, with thin provisioning, avoid introducing performance bottlenecks by leveraging solutions that work closely with tools providing historical trending information (capacity and performance).
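Here is a minimal sketch of the overbooking math; all capacities and the alert threshold are hypothetical.

```python
# Thin provisioning "overbooking" sketch - all capacities are hypothetical.

physical_tb = 100.0   # installed usable capacity
allocated_tb = 250.0  # capacity promised (provisioned) to applications
written_tb = 80.0     # capacity actually consumed so far

overcommit_ratio = allocated_tb / physical_tb  # how far storage is overbooked
pool_used = written_tb / physical_tb           # how full the real pool is

print(f"Overcommit ratio: {overcommit_ratio:.1f}:1, physical pool {pool_used:.0%} full")

# Like airline overbooking, this only works with good historical insight;
# alert well before the physical pool fills so growth does not outrun real capacity.
if pool_used > 0.75:
    print("Warning: plan additional physical capacity or rebalance workloads")
```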

    For a technology that some have tried to declare dead in order to prop up other new or emerging solutions, RAID remains relevant given its widespread deployment and the transparent reliance on it in organizations of all sizes. RAID also plays a role in storage performance, availability, capacity and energy constraints, as well as serving as a relief tool.

    The trick is to align the applicable RAID configuration to the task at hand, meeting specific performance, availability, capacity or energy requirements along with economic ones. For some environments a one-size-fits-all approach may work, while others may configure storage using different RAID levels along with different numbers of drives in RAID sets to meet specific requirements.


    Figure 2:  How various RAID levels and configuration impact or benefit footprint constraints

    Figure 2 shows a summary and tradeoffs of various RAID levels. In addition to the RAID level, the number of disks also has an impact on performance or capacity; for example, by creating a larger RAID 5 or RAID 6 group, the parity overhead can be spread out, however there is a tradeoff. Tradeoffs can include performance bottlenecks on writes or during drive rebuilds, along with potential exposure to additional drive failures.

    All of this comes back to a balancing act to align to your specific needs: some will go with a RAID 10 stripe and mirror to avoid risk, even going so far as triple mirroring along with replication. On the other hand, some will go with RAID 5 or RAID 6 to meet cost or availability requirements, and some I have talked with even run RAID 0 for data and applications that need the raw speed yet can be restored rapidly from some other medium.
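For reference, a quick sketch of how usable capacity and protection overhead shake out across common RAID levels. The drive count and size are arbitrary examples; real systems also reserve space for spares, metadata and so forth.

```python
# Usable capacity and protection overhead for a few common RAID layouts.
# Drive count and size are arbitrary examples; real systems also reserve
# space for spares, metadata and so forth.

def raid_usable(drives, size_tb, level):
    if level == "RAID 0":
        return drives * size_tb        # striping only, no protection
    if level == "RAID 5":
        return (drives - 1) * size_tb  # one drive's worth of parity per group
    if level == "RAID 6":
        return (drives - 2) * size_tb  # two drives' worth of parity per group
    if level == "RAID 10":
        return drives * size_tb / 2    # mirrored pairs
    raise ValueError(level)

drives, size_tb = 12, 2.0
for level in ("RAID 0", "RAID 5", "RAID 6", "RAID 10"):
    usable = raid_usable(drives, size_tb, level)
    overhead = 1 - usable / (drives * size_tb)
    print(f"{level}: {usable:.0f} TB usable of {drives * size_tb:.0f} TB raw "
          f"({overhead:.0%} protection overhead)")

# Larger parity groups spread the overhead thinner, at the cost of longer
# rebuilds and greater exposure while running degraded:
for n in (6, 12, 24):
    print(f"RAID 5 with {n} drives: {1 / n:.0%} of raw capacity used for parity")
```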

    Let's bring it all together with an example
    Figure 3 shows a generic before-and-after optimization example for a mixed workload environment; you can increase or decrease the applicable capacity and performance to meet your specific needs. In figure 3, the storage configuration consists of one storage system set up for high performance (left) and another for high-capacity secondary storage (right) for disk-to-disk backup and other near-line needs; again, you can scale the approach up or down to your specific need.

    For the performance side (left), 192 x 146GB 15K RPM disks (28TB raw) provide good performance, however with low capacity use. This translates into a low capacity-per-watt value, with reasonable IOPS per watt and some performance hot spots.

    On the capacity-centric side (right), there are 192 x 1TB disks (192TB raw) with good space utilization, however some performance hot spots or bottlenecks and constrained growth, not to mention low IOPS per watt with reasonable capacity per watt. In the before scenario, the joint power draw (both arrays) is about 15 kW (15,000 watts), which translates to about $16,000 in annual energy costs (cooling excluded), assuming an energy cost of 12 cents per kWh.
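As a quick sanity check on that annual cost figure (straightforward arithmetic, nothing vendor-specific):

```python
# Quick check of the annual energy cost cited above.

power_kw = 15.0             # combined draw of both arrays (about 15,000 watts)
hours_per_year = 24 * 365   # 8,760 hours
energy_cost_per_kwh = 0.12  # 12 cents per kWh

annual_kwh = power_kw * hours_per_year           # 131,400 kWh
annual_cost = annual_kwh * energy_cost_per_kwh   # roughly $15,800, i.e. about $16,000

print(f"{annual_kwh:,.0f} kWh per year -> ${annual_cost:,.0f} per year (cooling excluded)")
```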

    Note, your specific performance, availability, capacity and energy mileage will vary based on particular vendor solution, configuration along with your application characteristics.


    Figure 3: Baseline before and after storage optimization (raw hardware) example

    Building on the example in figure 3, a combination of techniques along with technologies yields a net increase in performance, capacity and perhaps feature functionality (depending on the specific solution). In addition, floor space, power, cooling and associated footprints are also reduced. For example, the resulting solution shown (middle) comprises 4 x 250GB flash SSD devices, along with 32 x 450GB 15.5K RPM and 124 x 2TB 7200RPM disks, enabling a 53TB (raw) capacity increase along with a performance boost.

    The previous example is based on raw or baseline capacity metrics, meaning that further optimization techniques should yield additional benefits. These examples should also help address the question or myth that it costs more to power storage than to buy it, to which the answer should be: it depends.

    If you can buy the above solution for, say, under $50,000 (its approximate three-year cost to power), let alone under $100,000 (three-year cost to power and cool), which would also be a good acquisition, then the claim that it costs more to power storage than to buy it holds true. However, if a solution as described above costs more, then the story changes, with other variables including energy costs for your particular location reinforcing the notion that your mileage will vary.

    Another tip is that more is not always better.

    That is, more disks, ports, processors, controllers or cache do not always equate to better performance. Performance is the result of how those and other pieces work together, ideally demonstrated with your specific application workload rather than what is on a product data sheet.

    Additional general tips include:

    • Align the applicable tool, technique or technology to the task at hand
    • Look to optimize for both performance and capacity, active and idle storage
    • Consolidated applications and servers need fast servers
    • Fast servers need fast I/O and storage devices to avoid bottlenecks
    • For active storage, use an activity-per-watt metric such as IOPS or transactions per watt
    • For inactive or idle storage, a capacity per watt per footprint metric would apply
    • Gain insight and control of how storage resources are used to meet service requirements

    It should go without saying, however sometimes what is understood needs to be restated.

    In the quest to become more efficient and optimized, avoid introducing performance, quality of service or availability issues by moving problems.

    Likewise, look beyond storage space capacity, also considering performance as applicable, to become efficient.

    Finally, it is all relative in that what might be applicable to one environment or application need may not apply to another.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved