Inaugural StorageIO Newsletter

Welcome to the winter 2010 edition of the Server and StorageIO (StorageIO) newsletter. This inaugural edition coincides with our fifth year in business, along with recent website and blog enhancements.

In an age of social media including Facebook, Twitter, blogs, and video, some might ask why a newsletter. After all, is that not old school or non-social media?

For those who are immersed in Twitter, blogs, Facebook, feeds, and other Web 2.0 means of communication, a traditional newsletter might not be in vogue.

Winter 2010 Newsletter
(Inaugural Edition)

However, there is still a large percentage of the population, which also means a vast number of visitors and guests of StorageIO websites and blogs, or viewers of articles and other content, who do not use Twitter, Facebook, LinkedIn, or RSS feeds. So I realize that there is still a role for a newsletter.

Thus, it makes sense to bring information to those of you who prefer a traditional newsletter format via email or other subscription; the newsletter is available in HTML or PDF formats.

You can access this newsletter via various social media venues (some are shown below) in addition to StorageIO websites and subscriptions. Click on the following links to view the inaugural newsletter as HTML or PDF, or to go to the newsletter page.

Follow via Google Feedburner here or via email subscription here.

You can also subscribe to the newsletter by simply sending an email to newsletter@storageio.com.

Enjoy this inaugural edition of the StorageIO newsletter, and let me know your comments and feedback.

Also, a very big thank you to everyone who has helped make StorageIO a success!

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Green IT, Green Gap, Tiered Energy and Green Myths

There are many different aspects of Green IT, along with several myths or misperceptions, not to mention missed opportunities.

There is a Green Gap, or disconnect, between environmentally aware or focused messaging and core IT data center issues. For example, when I ask IT professionals whether they have or are under direction to implement green IT initiatives, the number averages in the 10-15% range.

However, when I ask the same audiences who has or sees power, cooling, floor space, supporting growth, or environmental health and safety (EHS) related issues, the average is 75 to 90%. This reflects a disconnect between what is perceived as being green and the opportunities for IT organizations to make improvements from an economic and efficiency standpoint, including boosting productivity.

Some IT Data Center Green Myths
Is “green IT” a convenient or inconvenient truth or a legend?

When it comes to green and virtual environments, there are plenty of myths and realities, some of which vary depending on market or industry focus, price band, and other factors.

For example, there are lines of thinking that only ultra-large data centers are subject to PCFE (power, cooling, floor space, and environmental) related issues, or that all data centers need to be built along the Columbia River basin in Washington State, or that virtualization eliminates vendor lock-in, or that hardware is more expensive to power and cool than it is to buy.

The following are some myths and realities as of today, some of which may be subject to change from reality to myth or from myth to reality as time progresses.

Myth: Green and PCFE issues are applicable only to large environments.

Reality: I commonly hear that green IT applies only to the largest of companies. The reality is that PCFE issues and green topics are relevant to environments of all sizes, from the largest of enterprises to small/medium businesses, remote office/branch offices, the small office/home office or "virtual office," all the way to the digital home and consumer.

Myth: All computer storage is the same, and powering disks off solves PCFE issues.

Reality: There are many different types of computer storage, with various performance, capacity, power consumption, and cost attributes. Although some storage can be powered off, other storage that is needed for online access does not lend itself to being powered off and on. For storage that needs to be always online and accessible, energy efficiency is achieved by doing more with less—that is, boosting performance and storing more data in a smaller footprint using less power.


Myth: Servers are the main consumer of electrical power in IT data centers.

Reality: In the typical IT data center, on average, 50% of electrical power is consumed by cooling, with the balance used for servers, storage, networking, and other aspects. However, in many environments, particularly processing or computation intensive environments, servers in total (including power for cooling and to power the equipment) can be a major power draw.


Myth: IT data centers produce 2 to 8% of all global Carbon Dioxide (CO2) and carbon emissions.

Reality: This might perhaps be true, given some creative accounting and marketing math used to help build a justification case or to scare you into doing something. However, the reality is that in the United States, for example, IT data centers consume around 2 to 4% of electrical power (depending on when you read this), and less than 80% of all U.S. CO2 emissions are from electrical power generation, so the math does not quite add up. The reality is that if no action is taken to improve IT data center energy efficiency, continued demand growth will shift IT power-related emissions from myth to reality, not to mention constrain IT and business sustainability from an economic and productivity standpoint.
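Using the figures cited above, a quick back-of-the-envelope check (a rough illustration, not an authoritative accounting) shows why the 2 to 8% claim is suspect:

```python
# Rough sanity check using the U.S. figures cited above (illustrative only).
dc_power_share = 0.04    # upper end: data centers use about 2-4% of U.S. electricity
power_co2_share = 0.80   # cited upper bound: <80% of U.S. CO2 from power generation

dc_co2_share = dc_power_share * power_co2_share
print(f"Implied data center share of U.S. CO2: {dc_co2_share:.1%}")  # 3.2%
```

Even with both figures at their upper bounds, the implied share is about 3.2%, well below the 8% sometimes claimed.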

Myth: Server consolidation with virtualization is a silver bullet to address PCFE issues.

Reality: Server virtualization for consolidation is only part of an overall solution that should be combined with other techniques, including lower power, faster and more energy efficient servers, and improved data and storage management techniques.


Myth: Hardware costs more to power than to purchase.

Reality: Currently, for some low-cost servers, standalone disk storage, or entry-level networking switches and desktops, this may be true, particularly where energy costs are excessively high and the devices are kept and used continually for three to five years. A general rule of thumb is that the actual cost of most IT hardware will be a fraction of the price of associated management and software tool costs plus facilities and cooling costs. For the most part, at least as of this writing, it is mainly small standalone hard disk drives or small entry-level volume servers, used in locations with very high electrical costs over a three- to five-year time frame, that can cost more to power than to buy.

Regarding this last myth: for the more commonly deployed external storage systems across all price bands and categories, generally speaking, except for extremely inefficient and hot-running legacy equipment, the reality is that it is still cheaper to power the equipment than to buy it. Having said that, there are some qualifiers that should also be used as key indicators to keep the equation balanced. These qualifiers include the acquisition cost, if any, for new, expanded, or remodeled habitats or space to house the equipment; the price of energy in a given region, including surcharges; as well as cooling, the length of time, and the continuous time the device will be used.

For larger businesses, IT equipment in general still costs more to purchase than to power, particularly with newer, more energy efficient devices. However, given rising energy prices, or the need to build new facilities, this could change moving forward, particularly if a move toward energy efficiency is not undertaken.
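To see how the power-versus-purchase comparison plays out, here is a simple sketch; the wattage, electricity rate, and purchase price are hypothetical assumptions, not figures from any particular product:

```python
# Power-vs-purchase sketch for a low-cost, always-on device.
# All numbers are hypothetical assumptions for illustration.
avg_watts = 400        # assumed average draw of an entry-level server
years = 4              # within the three-to-five-year window discussed
price_per_kwh = 0.25   # an assumed very high regional electricity rate

energy_cost = (avg_watts / 1000) * 24 * 365 * years * price_per_kwh
print(f"{years}-year energy cost: ${energy_cost:,.0f}")  # $3,504
```

Against a hypothetical $2,000 purchase price, powering this device costs more than buying it; swap in a $20,000 storage system at the same draw and the comparison flips.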

There are many variables when purchasing hardware, including acquisition cost, the energy efficiency of the device, power and cooling costs for a given location and habitat, and facilities costs. For example, if a new storage solution is purchased for $100,000, yet new habitat or facilities must be built for three to five times the cost of the equipment, those costs must be figured into the purchase cost.

Likewise, if the price of a storage solution decreases dramatically, but the device consumes a lot of electrical power and needs a large cooling capacity while operating in a region with expensive electricity costs, that, too, will change the equation and the potential reality of the myth.


Tiered Energy Sources
Given that IT resources and facilities require energy to power equipment as well as keep it cool, electricity and energy are popular topics associated with Green IT, economics, and efficiency, with lots of metrics and numbers tossed around. With that in mind, the U.S. national average CO2 emission is 1.34 lb/kWh of electrical power. Granted, this number will vary depending on the region of the country and the source of fuel for the power-generating station or power plant.
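The 1.34 lb/kWh average makes it easy to estimate the emissions footprint of always-on equipment; a small sketch (the 500 W device wattage is a hypothetical example):

```python
# Annual CO2 estimate at the cited U.S. average of 1.34 lb CO2 per kWh.
CO2_LB_PER_KWH = 1.34   # national average; varies by region and fuel source

def annual_co2_lb(avg_watts: float) -> float:
    """Approximate pounds of CO2 per year for a device drawing avg_watts continuously."""
    kwh_per_year = (avg_watts / 1000) * 24 * 365
    return kwh_per_year * CO2_LB_PER_KWH

print(f"{annual_co2_lb(500):,.0f} lb CO2 per year for a 500 W device")  # 5,869 lb
```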

Like tiered IT resources (servers, storage, I/O networks, virtual machines, and facilities), of which there are various tiers or types of technologies to meet various needs, there are also multiple types of energy sources. Different tiers of energy sources vary by cost, availability, and environmental characteristics, among others. For example, in the US there are different types of coal, and not all coal is as dirty as you might be led to believe when combined with emissions air scrubbers; however, there are other energy sources to consider as well.

Coal continues to be a dominant fuel source for electrical power generation both in the United States and abroad, with other fuel sources including oil, gas, natural gas, liquid propane gas (LPG or propane), nuclear, hydro, thermal or steam, wind, and solar. Within a category of fuel, for example coal, there are different emissions per ton of fuel burned. Eastern U.S. coal is higher in CO2 emissions per kilowatt-hour than western U.S. lignite coal. However, eastern coal has more British thermal units (Btu) of energy per ton of coal, enabling less coal to be burned in smaller physical power plants.

If you have ever noticed that coal power plants in the United States seem to be smaller in the eastern states than in the Midwest and western states, it’s not an optical illusion. Because eastern coal burns hotter, producing more Btu, smaller boilers and stockpiles of coal are needed, making for smaller power plant footprints. On the other hand, as you move into the Midwest and western states of the United States, coal power plants are physically larger, because more coal is needed to generate 1 kWh, resulting in bigger boilers and vent stacks along with larger coal stockpiles.

On average, a gallon of gasoline produces about 20 lb of CO2, depending on usage and efficiency of the engine as well as the nature of the fuel in terms of octane or amount of Btu. Aviation fuel and diesel fuel differ from gasoline, as does natural gas or various types of coal commonly used in the generation of electricity. For example, natural gas is less expensive than LPG but also provides fewer Btu per gallon or pound of fuel. This means that more natural gas is needed as a fuel to generate a given amount of power.

Recently, while researching small 10 to 12 kW standby generators for my office, I learned about some of the differences between propane and natural gas. What I found was that with natural gas as fuel, a given generator produced about 10.5 kW, whereas the same unit attached to an LPG or propane fuel source produced 12 kW. The trade-off was that to get as much power as possible out of the generator, the higher-cost LPG was the better choice. To use lower-cost fuel but get less power out of the device, the choice would be natural gas. If more power was needed, then a larger generator could be deployed to use natural gas, with the trade-off of requiring a larger physical footprint.
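The generator numbers above can be framed as a simple trade-off calculation (fuel prices omitted since they vary by region; the output figures are the ones from my research):

```python
# Same standby generator, two fuels: output figures from the text above.
output_kw = {"natural_gas": 10.5, "propane_lpg": 12.0}

extra_kw = output_kw["propane_lpg"] - output_kw["natural_gas"]
gain = extra_kw / output_kw["natural_gas"]
print(f"Propane yields {extra_kw:.1f} kW more, a {gain:.0%} gain from the same unit")
```

Whether that roughly 14% output gain justifies the higher fuel cost is exactly the tiered-energy decision described above.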

Oil and gas are not used as much as fuel sources for electrical power generation in the United States as in other countries such as the United Kingdom. Gasoline, diesel, and other petroleum-based fuels are used for some power plants in the United States, including standby or peaking plants. In the electrical power generation and transmission (G&T) industry, as in IT, where different tiers of servers and storage are used for different applications, there are different tiers of power plants using different fuels with various costs. Peaking and standby plants are brought online when there is heavy demand for electrical power, during disruptions when a lower-cost or more environmentally friendly plant goes offline for planned maintenance, or in the event of a trip or unplanned outage.

CO2 is commonly discussed with respect to green topics and associated emissions; however, there are other so-called greenhouse gases, including nitrogen dioxide (NO2) and water vapor, among others. Carbon makes up only a fraction of CO2. To be specific, only about 27% of a pound of CO2 is carbon; the balance is not. Consequently, carbon-based emissions tax or trading schemes (ETS), as opposed to CO2-based tax schemes, need to account for the amount of carbon per ton of CO2 being put into the atmosphere. In some parts of the world, including the EU and the UK, ETS are either already in place or in initial pilot phases, to provide incentives to improve energy efficiency and use.
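The roughly 27% figure falls straight out of molecular mass arithmetic; a quick check:

```python
# Why only about 27% of a pound of CO2 is carbon: molecular mass arithmetic.
C = 12.011       # atomic mass of carbon
O = 15.999       # atomic mass of oxygen
CO2 = C + 2 * O  # 44.009

carbon_fraction = C / CO2
print(f"Carbon fraction of CO2: {carbon_fraction:.1%}")                  # 27.3%
print(f"Carbon per metric ton of CO2: {1000 * carbon_fraction:.0f} kg")  # 273 kg
```

This is why a carbon-based scheme and a CO2-based scheme tax very different tonnage for the same emissions.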

Meanwhile, in the United States there are voluntary programs for buying carbon offset credits, along with initiatives such as the Carbon Disclosure Project (www.cdproject.net), a not-for-profit organization that facilitates the flow of information pertaining to emissions by organizations, so that investors can make informed decisions and business assessments from an economic and environmental perspective. Another voluntary program is the United States EPA Climate Leaders initiative, where organizations commit to reduce their GHG emissions to a given level over a specific period of time.

Regardless of your stance or perception on green issues, the reality is that for business and IT sustainability, a focus on ecological and, in particular, the corresponding economic aspects cannot be ignored. There are business benefits to aligning the most energy efficient and low power IT solutions combined with best practices to meet different data and application requirements in an economic and ecologically friendly manner.

Green initiatives need to be seen in a different light: as business enablers as opposed to ecological cost centers. For example, many local utilities and state energy or environmentally concerned organizations are providing funding, grants, loans, or other incentives to improve energy efficiency. Some of these programs can help offset the costs of doing business and going green. Instead of being seen as the cost to go green, by addressing efficiency, the by-products are economic as well as ecological.

Put a different way, a company can spend carbon credits to offset its environmental impact, similar to paying a fine for noncompliance, or it can achieve efficiency and obtain incentives. There are many solutions and approaches to address these different issues, which will be looked at in the coming chapters.

What does this all mean?
There are real things that can be done today that can be effective toward achieving a balance of performance, availability, capacity, and energy effectiveness to meet particular application and service needs.

Sustainability for economic and ecological purposes can be achieved by balancing performance, availability, capacity, and energy against applicable application service levels and physical floor space constraints, along with intelligent power management. Energy economics should be considered as much a strategic resource for IT data centers as are servers, storage, networks, software, and personnel.

The bottom line is that without electrical power, IT data centers come to a halt. Rising fuel prices, strained generating and transmission facilities for electrical power, and a growing awareness of environmental issues are forcing businesses to look at PCFE issues. To support and sustain business growth, including storing and processing more data, IT data centers need to leverage energy efficiency as a means of addressing PCFE issues. By adopting effective solutions, economic value can be achieved with positive ecological results while sustaining business growth.

Some additional links include:

Want to learn or read more?

Check out Chapter 1 (Green IT and the Green Gap, Real or Virtual?) in my book “The Green and Virtual Data Center” (CRC) here or here.

Ok, nuff said.


Is MAID Storage Dead? I Don't Think So!

While some vendors are doing better than others, and first-generation MAID (Massive or Monolithic Array of Idle Disks) might be dead or about to be deceased, spun down, or put into a long-term sleep mode, it is safe to say that second-generation MAID (e.g. MAID 2.0), also known as intelligent power management (IPM), is alive and doing well.

In fact, IPM is not unique to disk storage or disk drives; it is also a technique found in the current generation of processors, such as those from Intel (e.g. Nehalem) and others.

Other names for IPM include adaptive voltage scaling (AVS), adaptive voltage scaling optimized (AVSO) and adaptive power management (APM) among others.

The basic concept is to vary the amount of power being used to match the amount of work and service level needed at a point in time, on a granular basis.

For example, first-generation MAID or drive spin-down, as deployed by vendors such as Copan (which is rumored to be in the process of being spun down as a company; see the blog post by a former Copan employee), was binary. That is, a disk drive was either on or off, and the granularity was the entire storage system. In the case of Copan, the granularity was that a maximum of 25% of the disks could ever be spun up at any point in time. As a point of reference, when I ask IT customers why they don't use MAID or IPM-enabled technology, they commonly cite concerns about performance or, more importantly, the perception of bad performance.

CPU chips have been taking the lead with the ability to vary voltage and clock speed, enabling or disabling electronic circuitry to align with the amount of work needing to be done at a point in time. This more granular approach allows the CPU to run at faster rates when needed, and at slower rates when possible, to conserve energy (here, here and here).

A common example is a laptop with technology such as SpeedStep or battery-stretching saving modes. Disk drives have been following this approach, varying their power usage by adjusting to different spin speeds along with enabling or disabling electronic circuitry.
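To make the concept concrete, here is a toy sketch of IPM-style state selection: pick the lowest-power state that still meets the required service level. The states, performance ratios, and wattages are hypothetical illustrations and do not reflect any vendor's actual implementation.

```python
# Toy IPM sketch: pick the lowest-power state meeting the current service level.
# States, performance ratios, and wattages are hypothetical illustrations.
POWER_STATES = [
    # (name, relative_performance, watts) for a single disk drive
    ("full_speed",     1.00, 12.0),
    ("reduced_rpm",    0.60,  7.0),
    ("heads_unloaded", 0.10,  4.0),
    ("spun_down",      0.00,  1.0),
]

def select_state(required_performance: float) -> str:
    """Return the name of the lowest-power state meeting the requirement."""
    candidates = [s for s in POWER_STATES if s[1] >= required_performance]
    return min(candidates, key=lambda s: s[2])[0]

print(select_state(0.5))   # reduced_rpm: enough performance, less power
print(select_state(0.0))   # spun_down: an idle volume can be fully spun down
```

Unlike binary first-generation spin-down, this kind of selection can differ per LUN or volume group, which is the granularity point made below.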

On a granular basis, second-generation MAID with IPM-enabled technology can operate on a LUN or volume group basis across different RAID levels and types of disk drives, depending on the specific vendor implementation. Some examples of vendors implementing various forms of IPM for second-generation MAID include Adaptec, EMC, Fujitsu (Eternus), HDS (AMS), HGST (disk drives), Nexsan, and Xyratex, among many others.

Something else taking place in the industry is that vendors seem to be shying away from the term MAID, as there is some stigma associated with the performance issues of some first-generation products.

This is not all that different from what took place about 15 years ago, when the first purpose-built monolithic RAID arrays appeared on the market, such as the SF2, aka the South San Francisco Forklift company product called Failsafe (here and here), which was bought by MTI, with patents later sold to EMC.

Failsafe, or what many at DEC referred to as Fail Some, was a large refrigerator-sized device with 5.25" disk drives configured as RAID 5 with dedicated hot spare disk drives. Thus its performance was OK for the time doing random reads; however, writes in the pre-write-back-cache RAID 5 days were less than spectacular.

Failsafe and other early RAID implementations (and here) received a black eye from some due to performance, availability, and other issues, until best practices and additional enhancements such as multiple RAID levels, along with cache, appeared in follow-on products.

What that trip down memory (or nightmare) lane has to do with MAID, and particularly with first-generation products that did their part to help establish a new technology, is that those early RAID products also gave way to second, third, fourth, fifth, sixth, and beyond generations of RAID products.

The same can be expected here, as we are seeing more vendors jump in on the second generation of MAID, also known as drive spin-down, with more in the wings.

Consequently, don't judge MAID based solely on the first-generation products, which could be thought of as advanced-technology production proof-of-concept solutions that paved the way for follow-on future solutions.

Just like RAID has become so ubiquitous that it has been declared dead, making it another zombie technology (dead, however still being developed, produced, bought, and put to use), follow-on IPM-enabled generations of technology will be more transparent. That is, similar to finding multiple RAID levels in most storage, look for IPM features including variable drive speeds, power settings, and performance options on a go-forward basis. These newer solutions may not carry the MAID name; however, the spirit and function of intelligent power management without performance compromise does live on.

Ok, nuff said.


Optimize Data Storage for Performance and Capacity Efficiency

This post builds on a recent article I did that can be read here.

Even with tough economic times, there is no such thing as a data recession! Thus the importance of optimizing data storage efficiency, addressing both performance and capacity without impacting availability, in a cost-effective way, to do more with what you have.

What this means is that even though budgets are tight or have been cut resulting in reduced spending, overall net storage capacity is up year over year by double digits if not higher in some environments.

Consequently, there is continued focus on stretching available IT and storage related resources or footprints further while eliminating barriers or constraints. IT footprint constraints can be physical in a cabinet or rack as well as floorspace, power or cooling thresholds and budget among others.

Constraints can be due to lack of performance (bandwidth, IOPS or transactions), poor response time or lack of availability for some environments. Yet for other environments, constraints can be lack of capacity, limited primary or standby power or cooling constraints. Other constraints include budget, staffing or lack of infrastructure resource management (IRM) tools and time for routine tasks.

Look before you leap
Before jumping into an optimization effort, gain insight if you do not already have it as to where the bottlenecks exist, along with the cause and effect of moving or reconfiguring storage resources. For example, boosting capacity use to more fully use storage resources can result in a performance issue or data center bottlenecks for other environments.

An alternative scenario is that in the quest to boost performance, storage is seen as being under-utilized, yet when capacity use is increased, lo and behold, response time deteriorates. The result can be a vicious cycle; hence the need to address the issue, as opposed to moving problems around, by using tools to gain insight on resource usage, both space and activity or performance.

Gaining insight means looking at capacity use along with performance and availability activity, and how these resources use power, cooling, and floor space. Consequently, an important step is to gain insight and knowledge of how your resources are being used to deliver various levels of service.

Tools include storage or system resource management (SRM) tools that report on storage space capacity usage, performance and availability with some tools now adding energy usage metrics along with storage or system resource analysis (SRA) tools.

Cooling Off
Power and cooling are commonly talked about as constraints, either from a cost standpoint, or availability of primary or secondary (e.g. standby) energy and cooling capacity to support growth. Electricity is essential for powering IT equipment including storage enabling devices to do their specific tasks of storing data, moving data, processing data or a combination of these attributes.

Thus, power gets consumed, some work or effort to move and store data takes place, and the by-product is heat that needs to be removed. In a typical IT data center, cooling on average can account for about 50% of energy used, with some sites using less.

With cooling being a large consumer of electricity, a small percentage change in how cooling consumes energy can yield large results. Addressing cooling energy consumption can help with budget or cost issues, or it can free up cooling capacity to support installation of extra storage or other IT equipment.

Keep in mind that effective cooling relies on removing heat from as close to the source as possible, to avoid over-cooling, which requires more energy. If you have not done so, have a facilities review or assessment performed; this can range from a quick walk-around to a more in-depth review and thermal airflow analysis. Means of removing heat close to the source include techniques such as intelligent, precision, or smart cooling, also known by other marketing names.

Powering Up, or, Powering Down
Speaking of energy or power: in addition to addressing cooling, there are a couple of ways of addressing power consumption by storage equipment (Figure 1). The most commonly discussed approach towards efficiency is energy avoidance, which involves powering down storage when it is not in use, such as with first-generation MAID, at the cost of performance.

For off-line storage, tape and other removable media give low-cost capacity per watt with low to no energy needed when not in use. Second generation (e.g. MAID 2.0) solutions with intelligent power management (IPM) capabilities have become more prevalent enabling performance or energy savings on a more granular or selective basis often as a standard feature in common storage systems.

Figure 1: Balancing energy avoidance and energy efficiency options

Another approach to energy efficiency, also seen in Figure 1, is doing more work for active applications per watt of energy to boost productivity. This can be done by using the same amount of energy while doing more work, or by doing the same amount of work with less energy.

For example, instead of using larger-capacity disks to improve capacity-per-watt metrics, active or performance-sensitive storage should be looked at on an activity basis, such as IOPS, transactions, videos, emails, or throughput per watt. Hence, a fast disk drive doing work can be more energy-efficient in terms of productivity than a higher-capacity, slower disk drive for active workloads, whereas for idle or inactive data the inverse should hold true.
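A small comparison makes the point; the IOPS, capacity, and wattage figures below are hypothetical, chosen only to illustrate activity-based versus capacity-based metrics:

```python
# Activity (IOPS/watt) vs. capacity (GB/watt) metrics for two hypothetical drives.
drives = {
    "fast 15K RPM 450GB": (180, 450, 15.0),   # (IOPS, capacity GB, watts)
    "slow 7.2K RPM 2TB":  (80, 2000, 11.0),
}

for name, (iops, gb, watts) in drives.items():
    print(f"{name}: {iops / watts:5.1f} IOPS/watt, {gb / watts:6.1f} GB/watt")
```

The fast drive wins on IOPS per watt for active workloads; the slow, high-capacity drive wins on GB per watt for idle or inactive data, matching the inverse relationship described above.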

On a go-forward basis, the trend already being seen with some servers and storage systems is to do more work while using less energy. Thus a larger gap between useful work (for active or non-idle storage) and the amount of energy consumed yields a better efficiency rating, or take the inverse if you prefer smaller numbers.

Reducing Data Footprint Impact
Data footprint impact reduction tools or techniques for both on-line as well as off-line storage include archiving, data management, compression, deduplication, space-saving snapshots, thin provisioning along with different RAID levels among other approaches. From a storage access standpoint, you can also include bandwidth optimization, data replication optimization, protocol optimizers along with other network technologies including WAFS/WAAS/WADM to help improve efficiency of data movement or access.

Thin provisioning for capacity-centric environments can be used to achieve a higher effective storage use level by essentially overbooking storage, similar to how airlines oversell seats on a flight. If you have good historical information and insight into how storage capacity is used and over-allocated, thin provisioning enables improved effective storage use for some applications.

However, with thin provisioning, avoid introducing performance bottlenecks by leveraging solutions that work closely with tools that provide historical trending information (capacity and performance).
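The airline-style overbooking can be expressed as an oversubscription ratio; the volume sizes below are hypothetical:

```python
# Thin provisioning "overbooking" sketch with hypothetical numbers.
physical_tb = 50.0
allocated_tb = [20, 30, 25, 15]   # capacity promised to each volume
written_tb = [8, 12, 10, 5]       # what applications have actually written

oversubscription = sum(allocated_tb) / physical_tb
utilization = sum(written_tb) / physical_tb
print(f"Oversubscription: {oversubscription:.1f}x")   # 1.8x
print(f"Physical utilization: {utilization:.0%}")     # 70%
```

The risk is visible in the numbers: 90 TB has been promised against 50 TB of physical capacity, so written data must be trended before it grows into the gap.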

For a technology that some have tried to declare dead in order to prop up other new or emerging solutions, RAID remains relevant given its widespread deployment and the transparent reliance on it in organizations of all sizes. RAID also plays a role in storage performance, availability, capacity, and energy constraints, as well as serving as a relief tool.

The trick is to align the applicable RAID configuration to the task at hand, meeting specific performance, availability, capacity, or energy requirements along with economic ones. Some environments may use a one-size-fits-all approach, while others may configure storage using different RAID levels, along with different numbers of drives in RAID sets, to meet specific requirements.


Figure 2:  How various RAID levels and configuration impact or benefit footprint constraints

Figure 2 shows a summary and trade-offs of various RAID levels. In addition to the RAID level, the number of disks can also have an impact on performance or capacity. For example, by creating a larger RAID 5 or RAID 6 group, the parity overhead can be spread out; however, there is a trade-off. Trade-offs can be performance bottlenecks on writes or during drive rebuilds, along with potential exposure to drive failures.
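The parity-spreading trade-off can be quantified; a quick sketch assuming single-parity RAID 5 and dual-parity RAID 6 groups built from 1TB drives:

```python
# Usable capacity and parity overhead for RAID 5/6 groups of various widths.
def raid_usable(drives: int, parity: int, drive_tb: float = 1.0):
    """Return (usable TB, parity overhead fraction) for one RAID group."""
    return (drives - parity) * drive_tb, parity / drives

for width in (4, 8, 16):
    u5, o5 = raid_usable(width, parity=1)   # RAID 5
    u6, o6 = raid_usable(width, parity=2)   # RAID 6
    print(f"{width:2d} drives: RAID5 {u5:4.0f} TB ({o5:5.1%} parity), "
          f"RAID6 {u6:4.0f} TB ({o6:5.1%} parity)")
```

Wider groups shrink the parity overhead but stretch rebuild times and widen the failure exposure window, which is the trade-off noted above.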

All of this comes back to a balancing act to align to your specific needs. Some will go with a RAID 10 stripe and mirror to avoid risks, even going so far as to do triple mirroring along with replication. On the other hand, some will go with RAID 5 or RAID 6 to meet cost or availability requirements, and some I have talked with even run RAID 0 for data and applications that need the raw speed yet can be restored rapidly from some other medium.

Let's bring it all together with an example
Figure 3 shows a generic example of a before-and-after optimization for a mixed workload environment; granted, you can increase or decrease the applicable capacity and performance to meet your specific needs. In Figure 3, the storage configuration consists of one storage system set up for high performance (left) and another for high-capacity secondary use (right), such as disk-to-disk backup and other near-line needs; again, you can scale the approach up or down to your specific need.

For the performance side (left), 192 x 146GB 15K RPM disks (28TB raw) provide good performance, however with low capacity use. This translates into a low capacity-per-watt value, however with reasonable IOPS per watt and some performance hot spots.

On the capacity centric side (right), there are 192 x 1TB disks (192TB raw) with good space utilization, however with some performance hot spots or bottlenecks, constrained growth, and low IOPS per watt with reasonable capacity per watt. In the before scenario, the joint power draw (both arrays) is about 15 kW (15,000 watts), which translates to about $16,000 in annual energy costs (cooling excluded) assuming an energy cost of 12 cents per kWh.
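The annual energy cost figure above follows from a simple back-of-the-envelope calculation; the 15,000 watt draw and 12 cents per kWh rate are the example figures used here, and your rates will differ:

```python
def annual_energy_cost(watts: float, dollars_per_kwh: float = 0.12) -> float:
    """Annual electricity cost for a constant power draw, cooling excluded."""
    kwh_per_year = watts / 1000 * 24 * 365   # convert watts to kWh over a year
    return kwh_per_year * dollars_per_kwh

cost = annual_energy_cost(15_000)            # both arrays combined
print(f"${cost:,.0f} per year")              # on the order of $16,000
```

Cooling typically adds a further multiple on top of this, which is why it is called out as excluded.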

Note, your specific performance, availability, capacity and energy mileage will vary based on particular vendor solution, configuration along with your application characteristics.


Figure 3: Baseline before and after storage optimization (raw hardware) example

Building on the example in figure 3, a combination of techniques and technologies yields a net performance, capacity and perhaps feature functionality increase (depending on the specific solution). In addition, floor space, power, cooling and associated footprints are also reduced. For example, the resulting solution shown (middle) comprises 4 x 250GB flash SSD devices, along with 32 x 450GB 15.5K RPM and 124 x 2TB 7200RPM disks, enabling a 53TB (raw) capacity increase along with a performance boost.
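To sanity-check raw totals like those in the example, the tally is simple multiplication. This sketch uses the drive counts listed above with decimal (GB = 1000MB) units; actual gains depend on which configurations you compare and how the vendor counts capacity:

```python
def raw_tb(count: int, gb_each: float) -> float:
    """Raw capacity of a set of identical drives, in decimal TB."""
    return count * gb_each / 1000

# "Before": performance array plus capacity array
before = raw_tb(192, 146) + raw_tb(192, 1000)
# "After": SSD tier + fast 15.5K RPM tier + high-capacity 7200 RPM tier
after = raw_tb(4, 250) + raw_tb(32, 450) + raw_tb(124, 2000)
print(f"before: {before:.1f}TB raw, after: {after:.1f}TB raw")
```

The same three-line tally works for any tiering proposal, which makes it easy to compare candidate configurations before looking at performance or energy.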

The previous examples are based on raw or baseline capacity metrics, meaning that further optimization techniques should yield additional benefits. These examples should also help address the question, or myth, that it costs more to power storage than to buy it; the answer is that it depends.

If you can buy the above solution for, say, under $50,000 (its approximate three-year cost to power), let alone $100,000 (power and cooling), then the notion that powering storage costs more than buying it holds true, and it would also be a good acquisition. However, if a solution as described above costs more, the story changes; other variables, including energy costs for your particular location, reinforce the notion that your mileage will vary.

Another tip is that more is not always better.

That is, more disks, ports, processors, controllers or cache do not always equate to better performance. Performance is the sum of how those and other pieces work together in a demonstrable way, ideally measured with your specific application workload rather than what is on a product data sheet.

Additional general tips include:

  • Align the applicable tool, technique or technology to task at hand
  • Look to optimize for both performance and capacity, active and idle storage
  • Consolidated applications and servers need fast servers
  • Fast servers need fast I/O and storage devices to avoid bottlenecks
  • For active storage use an activity per watt metric, such as IOPS or transactions per watt
  • For inactive or idle storage, a capacity per watt per footprint metric would apply
  • Gain insight and control of how storage resources are used to meet service requirements
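The activity-per-watt and capacity-per-watt metrics in the list above are simple ratios. A minimal sketch, with hypothetical per-drive figures used purely for illustration:

```python
def iops_per_watt(iops: float, watts: float) -> float:
    """Activity metric: appropriate for active, performance-centric storage."""
    return iops / watts

def tb_per_watt(capacity_tb: float, watts: float) -> float:
    """Capacity metric: appropriate for idle or inactive storage."""
    return capacity_tb / watts

# Hypothetical figures: a fast 15K RPM drive vs a high-capacity 7200 RPM drive.
fast_drive = iops_per_watt(iops=200, watts=15)      # active efficiency
big_drive = tb_per_watt(capacity_tb=2.0, watts=8)   # idle density
print(f"{fast_drive:.1f} IOPS/watt vs {big_drive:.2f} TB/watt")
```

The point of keeping the two metrics separate is that a drive (or array) that scores well on one will often score poorly on the other, which is exactly the active-vs-idle distinction the list draws.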

It should go without saying, however sometimes what is understood needs to be restated.

In the quest to become more efficient and optimized, avoid introducing performance, quality of service or availability issues by moving problems.

Likewise, look beyond storage space capacity, also considering performance as applicable, to become efficient.

Finally, it is all relative in that what might be applicable to one environment or application need may not apply to another.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Storage Efficiency and Optimization – The Other Green

For those of you in the New York City area, I will be presenting live and in person at the Storage Decisions conference on September 23, 2009: The Other Green, Storage Efficiency and Optimization.

Throw out the "green" buzzword, and you're still left with the task of saving or maximizing use of space, power, and cooling while stretching available IT dollars to support growth and business sustainability. For some environments the solution may be consolidation, while others need to maintain quality of service response time, performance and availability, necessitating faster, energy efficient technologies to achieve optimization objectives.

To address these and other related issues, you can turn to the cloud, virtualization, intelligent power management, data footprint reduction and data management, not to mention various types of tiered storage and performance optimization techniques. The session will look at various techniques and strategies to optimize both on-line active or primary as well as near-line or secondary storage environments during tough economic times, and to position for future growth; after all, there is no such thing as a data recession!

Topics, technologies and techniques that will be discussed include among others:

  • Energy efficiency (strategic) vs. energy avoidance (tactical), what's different between them
  • Optimization and the need for speed vs. the need for capacity, finding the right balance
  • Metrics & measurements for management insight, what the industry is doing (or not doing)
  • Tiered storage and tiered access including SSD, FC, SAS, tape, clouds and more
  • Data footprint reduction (archive, compress, dedupe) and thin provision among others
  • Best practices, financial incentives and what you can do today

This is a free event for IT professionals, however I hear space is limited; learn more and register here.

For those interested in broader IT data center and infrastructure optimization, check out the ongoing seminar series The Infrastructure Optimization and Planning Best Practices (V2.009) – Doing more with less without sacrificing storage, system or network capabilities. The seminar series continues September 22, 2009 with a stop in Chicago. This is also a free seminar; register and learn more here or here.

Ok, nuff said.

Cheers gs


Storage Optimization: Performance, Availability, Capacity, Effectiveness

Storage I/O trends

With the IT and storage industry shying away from green hype, green washing and other green noise, there is also a growing realization that the new green is about effectively boosting efficiency to improve productivity and profitability or to sustain business and IT growth during tough economic times.

This past week, while doing some presentations (I'll post a link soon to the downloads) at the 2008 San Francisco installment of the Storage Decisions event focused on storage professionals, as well as a keynote talk at the value added reseller (VAR) channel focused storage strategies event, a common theme was boosting productivity, improving efficiency, stretching budgets and enabling existing personnel and resources to do more with the same or less.

During these and other presentations, keynotes, sessions and seminars, both here in the U.S. and recently in Europe, the common themes have been boosting efficiency and closing the green gap. That gap lies between industry and marketing rhetoric around green hype, green noise and green washing, issues that either do not resonate with, or cannot be funded by, IT organizations, and where many IT organizations' issues actually exist: power, cooling, floor space or footprint, EH&S (environmental health and safety) and economics.

The green gap (here, and here, and here) is that, due to green hype around carbon footprints and related themes, many IT organizations around the world have not realized that boosting energy efficiency for active and on-line applications, data and workloads (e.g. doing more I/O operations per second (IOPS), transactions, files or messages processed per watt of energy) to address power, cooling and floor space is in fact a form of addressing green issues, both economic and environmental.

Likewise, for inactive or idle data there is a bit more of a linkage, in that green can mean powering things off. However, there is also a disconnect in that many perceive storage as green only if it can be powered off, which, while true for inactive or idle data and applications, is not true for all data and application types.

As mentioned already, for active workloads green means doing more with the same or less power, cooling and floor space impact; that is, doing more work per unit of energy. In that theme, for active workloads a slow, large capacity disk may in fact not be energy efficient if it impedes productivity and results in more energy being used to get the same amount of work done. For example, larger capacity SATA disk drives are often positioned as being the most green or energy efficient, which can be true for idle, inactive or non performance (time) sensitive applications where more data is stored in a denser footprint.

However, for active workloads, lower capacity 15.5K RPM 300GB and 400GB Fibre Channel (FC) and SAS disk drives that deliver more IOPS or bandwidth per watt of energy can get more work done in the same amount of time.

There is also a perception that FC and SAS disk drives use more power than SATA disk drives which in some cases can be true, however current generations of high performance 10K RPM and 15.5K RPM drives have very similar power draw on a raw spindle or device basis. What differs is the amount of capacity per watt for idle or inactive applications, or, the number of IOPS or amount of performance for active configurations.

On the other hand, solutions not normally perceived as being green compared to tape or IPM and MAID (1st generation and MAID 2.0), such as SSD (flash and RAM), fast SAS and FC disks, and tiered storage systems that can do more IOPS or bandwidth per watt of energy, are in fact green and energy efficient for getting work done. Thus, there are two sides to optimizing storage for energy efficiency: how much work gets done per unit of energy when active (e.g. more miles per gallon per amount of work done), and how little energy is used when not doing work.

Thus, a new form of being green, one that sustains business growth while boosting productivity, is Gaining Realistic Economic Efficiency Now, which as a by-product helps both business bottom lines and the environment by doing more with less. These are themes that are addressed in my new book "The Green and Virtual Data Center" (Auerbach), which will be formally launched and released for general availability just after the 1st of the year (hopefully sooner); however, you can beat the rush and order your copy now at Amazon and other fine venues around the world.

Ok, nuff said.

Cheers gs


Intelligent Power Management (IPM) and second generation MAID 2.0 on the rise

Storage I/O trends

In case you missed it, today Adaptec announced that they are the 1st vendor "This Week" to add support for Intelligent Power Management (IPM) to their storage systems. Adaptec joins a growing list of vendors who are deploying, or announcing programs for, some variation of IPM and second generation MAID 2.0 capability, including support for different types of tiered disk drives in various combinations of Fibre Channel, SAS and SATA.

As a quick refresh, Massive or Monolithic Arrays of Idle or Inactive Disks (MAID) was popularized by 1st generation MAID vendor Copan, whose systems spin down disk drives to avoid energy usage. One of the challenges with 1st generation MAID is poor performance, since at most 25% of the disk drives can be spinning at any time to transfer data when needed.

This is a balancing act between achieving energy avoidance and associated benefits vs. maintaining performance to move data when needed, particularly for large restorations to support BC/DR or other purposes. Granted, 1st generation MAID systems like those from Copan were positioned as alternatives to high-performance disk storage systems to amplify potential energy savings on one hand, or as alternatives to magnetic tape by providing random restore capability on the other. The reality is that 1st generation MAID systems are finding their niche not as on-line primary or even on-line secondary storage, nor as a direct replacement for tape or disk based libraries supporting large-scale BC/DR, but rather in a sweet spot between secondary and near-line disk libraries and virtual tape libraries, with a target application of very infrequently accessed data.
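The 25% spinning limit described above implies a direct tradeoff between energy avoidance and available throughput. A simplified model makes the tradeoff concrete; the per-drive wattage and bandwidth figures are hypothetical and used only to illustrate the shape of the curve:

```python
def maid_tradeoff(total_drives: int, spinning_fraction: float,
                  watts_per_drive: float = 10.0, mbs_per_drive: float = 100.0):
    """Return (power draw in watts, aggregate streaming bandwidth in MB/s)
    for a MAID shelf with only a fraction of drives spinning. A sketch only."""
    spinning = int(total_drives * spinning_fraction)
    return spinning * watts_per_drive, spinning * mbs_per_drive

# 1st generation MAID: at most 25% of drives spinning at once
low_power, low_bw = maid_tradeoff(400, 0.25)
# All drives spinning, e.g. during a large BC/DR restore
full_power, full_bw = maid_tradeoff(400, 1.0)
print(f"25% spinning: {low_power:.0f} W, {low_bw:.0f} MB/s")
print(f"all spinning: {full_power:.0f} W, {full_bw:.0f} MB/s")
```

The MAID 2.0 premise described below amounts to making `spinning_fraction` dynamic: low during idle periods, rising to full when data needs to be accessed, rather than capped by the architecture.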

Second generation MAID, aka MAID 2.0, is an evolution of the general technologies and capabilities, extending functionality and flexibility while addressing quality of service (QoS), performance, availability, capacity and energy consumption using IPM, also known as Adaptive Power Management (APM) or dynamic bandwidth switching or scaling (DBS), among other names. The basic premise is to add flexibility, building on 1st generation characteristics including data protection, resiliency and proactive part or drive monitoring. Another basic premise of IPM and MAID 2.0 solutions is to allow performance and subsequent energy usage to vary: cutting performance and energy usage during inactive times, yet allowing full performance without energy-saving penalties when data needs to be accessed.

Second generation MAID solutions can be characterized by multiple power saving modes as well as flexible performance to adjust to changing workload and application needs. Another characteristic is the ability to work across different types of disk drives including Fibre Channel, SAS and SATA as opposed to only SATA drives found in 1st generation solutions as well as for the IPM or MAID 2.0 functionality to exist in a standard storage system or array instead of in a purpose-built dedicated storage system. Other capabilities include support for more granular power settings down to a RAID group or LUN level instead of across an entire array or storage system as well as support for different RAID levels among other features.

Examples of vendors who have either announced product or made statements of direction with regard to MAID 2.0 and IPM enabled storage systems include:

Adaptec (Today), Datadirect, EMC, Fujitsu, HDS, HGST (Hitachi Disk Drives), NEC, Nexsan, and Xyratex among others on a growing list of solutions.

For applications and data storage needs that require good performance and QoS over a range of changing usage conditions, balancing good performance when needed to efficiently get work done and boost productivity against saving or avoiding energy when little or no work needs to be done, take a look at current and emerging IPM and MAID 2.0 enabled storage systems as part of a tiered storage strategy to address power, cooling, floor space and EH&S (PCFE) related issues.

To learn more, check out the StorageIO Industry Trends and Perspective white paper Intelligent Power Management (IPM) and MAID 2.0, and visit www.thegreenandvirtualdatacenter.com as well as www.storageio.com.

Ok, nuff said.

Cheers gs


Airport Parking, Tiered Storage and Latency

Storage I/O trends

Ok, so what do airport parking, tiered storage and latency have in common? Based on some recent travel experience, I will assert that there is a bit in common, or at least an analogy. What got me thinking about this was that recently I could not get a parking spot at the airport's primary parking ramp next to the terminal (either a reasonable walk or a short tram ride away), which offers quick access to the departure gate.

Granted, there is a premium for this ability to park or "store" my vehicle for a few days near the airport terminal; however, that premium is offset by the time savings and fewer disruptions, giving me a few extra minutes to get other things done while traveling.

Let me call the normal primary airport parking tier-1 (regardless of what level of the ramp you park on), with tier-0 being valet parking, where you pay a fee that might rival the cost of your airline ticket, yet your car stays in a climate controlled area, gets washed and cleaned, maybe gets an oil change, and hopefully sits in a more secure environment with even faster access to your departure gate; something for the rich and famous.

Now the primary airport parking has been full lately, not surprising given the cold weather and everyone looking to use up their carbon offset credits to fly somewhere warm, or to attend business meetings or whatever it is that they are doing.

Budgeting some extra time, a couple of weeks ago I tried one of those off-site airport parking facilities where the bus picks you up in the parking lot and then whisks you off to the airport. On return, you wait for the bus to pick you up at the airport, ride to the lot, and tour the lot looking at everyone's car as they get dropped off; 30-40 minutes later you are finally at your vehicle, faced with the challenge of how to get out of the parking lot late at night. As it is such a budget operation, they have gone to lights-out, automated check-out: put your credit card in the machine and the gate opens, that is, if the credit card reader is not frozen because it is about "zero" outside and the machine won't read your card, using up more time. However, heck, I saved a few dollars a day.

On another recent trip, again the main parking ramp was full; at least the airport has a parking or storage resource monitoring (aka airport SRM) tool that you can check ahead of time to see if the ramps are full or not. This time I went to another terminal, parked in the ramp there, walked a mile (it would have been a nice walk had it not been 1 above zero (F) with a 20 mile per hour wind) to the light rail train station, waited ten minutes for the 3 minute train ride to the main terminal, then walked to the tram for the 1-2 minute tram ride to the real terminal to get to my departure gate. On return, the process was reversed, adding what I estimate to be about an hour to the experience, which, if you have the time, is not a bad option, and certainly good exercise, even if it was freezing cold.

During this planes, trains and automobiles expedition, it dawned on me that airport parking is a lot like tiered storage, in that you have different types of parking with different cost points, locality of reference (that is, latency, or how much time it takes to get from your car to your plane), and different levels of protection and security, among other attributes.

I likened the off-airport parking experience to off-line tier-3 tape or MAID, or at best near-line tier-2 storage, in that I saved some money at the cost of lost time and productivity. The parking at the remote airport ramp, involving a train ride and a tram ride, I likened to tier-2 or near-line storage over a very slow network or I/O path: the ramp itself was pretty efficient, however the transit delays or latency were ugly. I did save some money, a couple of bucks, not as much as the off-site lot, however a few dollars less than the primary parking.

Hence I jump back to the primary ramp as the fastest, tier-1, unless you have someone footing your parking bills and can afford tier-0. It also dawned on me that, like primary or tier-1 storage, regardless of whether it is enterprise class like an EMC DMX, IBM DS8K, Fujitsu or HDS USP, mid-range like an EMC CLARiiON, HP EVA, IBM DS4K, HDS AMS, Dell, EqualLogic, 3PAR, Fujitsu or NetApp, or an entry-level product from many different vendors, people still pay for the premium storage, aka tier-1 storage, in a given price band even if there are cheaper alternatives. However, like the primary airport parking, there are limits on how much primary storage or parking can be supported due to floor space, power, cooling and budget constraints.

With tiered storage the notion is to align different types and classes of storage to various usage and application categories based on service requirements (performance, availability, capacity, energy consumption) balanced with cost or other concerns. For example, there is the high cost yet ultra high performance, ultra low energy consumption and relatively small capacity of tier-0 solid state devices (SSD), using either flash or dynamic random access memory (DRAM), deployed as part of a storage system, as a storage device or as a caching appliance to meet I/O or activity intensive scenarios. Tier-1 is high performance, however not as high performance as tier-0; although, given a large enough budget, enough power and cooling, and no constraints on floor space, you can make a collection of traditional disk drives outperform even solid state, with a lot more capacity, at the tradeoff of power, cooling, floor space and of course cost.

For most environments tier-1 storage will be the fastest storage with a reasonable amount of capacity, as tier-1 provides a good balance of performance and capacity per amount of energy consumed for active storage and data. On the other hand, lower cost, higher capacity and slower tier-2 storage, also known as near-line or secondary storage, is used in some environments as primary storage where performance is not a concern, yet is typically better suited to non performance intensive applications.

Again, given enough money, unlimited power, cooling and floor space, not to mention enough enclosures, controllers and management software, you can aggregate a large bunch of low-cost SATA drives to produce a high level of performance. However, the cost to achieve a high activity or performance level that way, either IOPS or bandwidth, particularly where the excess capacity is not needed, would make SSD technology look cheap on an overall cost basis.
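The point above can be made concrete with a quick spindle-count comparison. All of the per-device figures below are hypothetical round numbers chosen for illustration, not measurements of any particular product:

```python
import math

def drives_to_match(target_iops: float, iops_per_drive: float) -> int:
    """How many spinning drives it takes to aggregate a target IOPS level."""
    return math.ceil(target_iops / iops_per_drive)

# Hypothetical: one SSD doing 10,000 IOPS at 5 watts,
# vs SATA drives doing ~80 random IOPS at 8 watts each.
n = drives_to_match(10_000, 80)
print(f"{n} SATA drives (~{n * 8} watts) to match one ~5 watt SSD")
```

The unused capacity of all those spindles is the "excess capacity" caveat in the paragraph above; where that capacity is actually needed, the comparison shifts back toward the disks.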

Likewise, replacing all of your disks with SSD, particularly for capacity centric environments, is not really practical outside of extreme corner case applications, unless you have the disposable income of a small country for your data storage and IT budget.

Another aspect of tiered storage is the common confusion between a class of storage and the class of a storage vendor, or where a product is positioned, for example by price band or target environment such as enterprise, small medium environment, small medium business (SMB), small office or home office (SOHO), or prosumer/consumer.

I often hear discussions that go along the lines of tier-1 storage being products for the enterprise, tier-2 being for workgroups, and tier-3 being for SMB and SOHO. I also hear confusion around tier-1 being block based, tier-2 being NAS and tier-3 being tape. "What we have here is a failure to communicate" in that there is confusion around tiers, categories, classifications, price bands and product positioning and perception. Adding to the confusion, there are also different tiers of access, including Fibre Channel and FICON using 8GFC (coming soon to a device near you), 4GFC, 2GFC and even 1GFC, along with 1GbE and 10GbE for iSCSI and/or NAS (NFS and/or CIFS), as well as InfiniBand for block (iSCSI or SRP) and file (NAS), offering different costs, performance, latency and other attributes for aligning to various application service and cost requirements.

What this all means is that there is more to tiered storage: there is tiered access, tiered protection, tiered media, and different price bands and categories of vendors and solutions to be aligned to applicable usage and service requirements. On the other hand, similar to airport parking, I can choose to skip the airport parking and take a cab to the airport, which would be analogous to shifting your storage needs to a managed service provider. However, ultimately it will come down to balancing performance, availability, capacity and energy (PACE) efficiency against the level of service and specific environment or application needs.

Greg Schulz www.storageio.com and www.greendatastorage.com