Justifying Green IT and Home Hardware Upgrades with EnergyStar

Energy Star

Have you seen the TV commercials or print advertisements where an Energy Star washer is mentioned as so efficient that the savings from reduced power consumption are enough to pay for the dryer? If not, check out the EPA Energy Star website to learn more about its various programs, savings and efficiency options.

What does this have to do with servers, storage, networking, data centers or other IT equipment?

Simple: if you are not aware, Energy Star for Servers now exists and is being enhanced, while good progress is being made on the Energy Star for storage program.

Energy Star for household appliances has been around a bit longer and is more refined, something that I anticipate the server and storage programs will follow suit with over time.

What really caught my eye with the commercial is the focus on closing the green gap. That is, instead of the green environmental impact savings of an appliance that uses less power and the subsequent carbon footprint benefits, the message is aimed at the economic hot button: switch to more energy efficient technology that allows more work to be done at a lower overall cost, and the savings can help self fund the enhancements.

For example, a more energy efficient server can do more work or GHz per watt of energy when needed, or go into lower power modes (intelligent power management: IPM). Low power modes do not necessarily mean turning completely off, rather, drawing less energy and subsequently lowering cooling demands during slow periods, such as with new Intel Nehalem and other processors.

From a disk storage perspective, energy efficiency is often thought of as avoidance: turning disk drives off, boosting capacity and squeezing data footprints.

However, energy efficiency and savings can also be achieved by slowing a disk drive down or turning off some of the electronics to reduce energy consumption and heat generation.

Other forms of energy savings include thin provisioning and deduplication. However, another form of energy efficiency for storage is boosting performance. That is, doing more work per watt of energy for active or time sensitive applications or usage scenarios.

Thus there is another Green IT, one that provides both economic and environmental benefits!

Here are some related links:

Saving Money with Green IT: Time To Invest In Information Factories

EPA Energy Star for Data Center Storage Update

Green Storage is Alive and Well: ENERGY STAR Enterprise Storage Stakeholder Meeting Details

Shifting from energy avoidance to energy efficiency

U.S. EPA Energy Star for Server Update

U.S. EPA Looking for Industry Input on Energy Star for Storage

Update: EnergyStar for Server Workshop

US EPA EnergyStar for Servers Wants To Hear From YOU!

Optimize Data Storage for Performance and Capacity Efficiency

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

ILM = Has It Lost its Meaning

Disclaimer, warning, be advised, heads up, disclosure, this post is partially for fun so take it that way.

Remember ILM, that is, Information Lifecycle Management among other meanings?

It was a popular buzzword du jour a few years ago, similar to how cloud is being tossed around lately, or in the recent past, virtualization, clusters, grids and SOA among others.

One of the challenges with ILM, besides its overuse and thus confusion, was what it meant. After all, was or is it a product, process, paradigm or something else?

That depends of course on who you talk to and their view or definition.

For some, ILM was a new name for archiving, or storage and data tiering, or data management, or hierarchical storage management (HSM) or system managed storage (SMS) and software managed storage (SMS) among others.

So where is ILM today?

Better yet, what does ILM stand for?

Well here are a few thoughts; some are oldies but goodies, some new, some just for fun.

ILM = I Like Marketing or It's a Lot of Marketing or It's a Lot of Money
ILM = It Lost its Meaning or It's a Lot of Meetings
ILM = Information Loves Magnetic media or I Love Magnetic media
ILM = IBM Loves Mainframes or Intel Loves Memory
ILM = Infrastructure Lifecycle Management or iPods/iPhones Like Macintosh

Then there are many other variations of xLM where I is replaced with X (similar to XaaS) where X is any letter you want or need for a particular purpose or message theme. For example, how about replacing X with an A for Application Lifecycle Management (ALM), or a B for Buzzword or Backup Lifecycle Management (BLM), C for Content Lifecycle Management (CLM) and D for Document or Data Lifecycle Management (DLM). There are many others including Hardware Lifecycle Management (HLM), Product or Program Lifecycle Management (PLM) not to mention Server, Storage or Security Lifecycle Management (SLM).

While ILM or xLM specific product and marketing buzz has for the most part subsided, perhaps it is about time for it to reappear to give current buzzwords such as cloud a break or rest. After all, ILM and xLM as buzzwords should be well rested after their break at the Buzzword Rest Spa (BRS), perhaps located on someday isle. You know about someday isle, don't you? It's that place of dreams, a visionary place to be visited in the future.

There are already signs of the impending rested, rejuvenated and rebranded appearance of ILM in the form of automated tiering, intelligent storage and data management, file virtualization, policy managed server and storage among others.

Ok, nuff said.

Cheers gs



Acadia VCE: VMware + Cisco + EMC = Virtual Computing Environment

Was today the day the music died? (click here or here if you are not familiar with the expression)

Add another three letter acronym (TLA) to your IT vocabulary if you are involved with server, storage, networking, virtualization, security and related infrastructure resource management (IRM) topics.

That new TLA is Virtual Computing Environment (VCE), a coalition formed by Cisco, EMC and VMware that was announced today, along with Acadia, a joint venture backed by EMC and Cisco with Intel as a partner. Of course, EMC, which also happens to own VMware for virtualization and RSA for security software tools, brings those to the coalition (read press release here).

For some quick fun, Twitterville and the blogosphere have come up with other meanings such as:

VCE = Virtualization Communications Endpoint
VCE = VMware Cisco EMC
VCE = Very Cash Efficient
VCE = VMware Controls Everything
VCE = Virtualization Causes Enthusiasm
VCE = VMware Cisco Exclusive

Ok, so much for some fun, at least for now.

With Cisco, EMC and VMware announcing their new VCE coalition, has this signaled the end of servers, storage, networking, hardware and software for physical, virtual and cloud computing as we know it?

Does this mean all other vendors not in this announcement should pack it up, game over and go home?

The answer in my perspective is NO!

No, the music did not end today!

NO, servers, storage and networking for virtual or cloud environments have not ended.

Also, NO, other vendors do not have to go home today, the game is not over!

However, a new game is on, one that some have seen before, while for others it is something new, exciting, perhaps revolutionary or an industry first.

What was announced?
Figure 1 shows a general vision or positioning from the three major players involved along with four tenets or topic areas of focus. Here is a link to a press release where you can read more.

CiscoVirtualizationCoalition.png
Figure 1: Source: Cisco, EMC, VMware

General points include:

  • A new coalition (e.g. VCE) focused on virtual compute for cloud and non cloud environments
  • A new company Acadia owned by EMC and Cisco (1/3 each) along with Intel and VMware
  • A new go to market pre-sales, service and support cross technology domain skill set team
  • Solution bundles or vblocks with technology from Cisco, EMC, Intel and VMware

What are the vblocks and components?
Vblocks are pre-configured (see this link for a 3D model), tested and supported with a single throat to choke model for streamlined end to end management and acquisition. There are three vblocks, or virtual building blocks, that include server, storage, I/O networking, and virtualization hypervisor software along with associated IRM software tools.

Cisco is bringing to the game their Unified Computing System (UCS) servers along with Nexus 1000v and Multilayer Director (MDS) switches. EMC is bringing storage (Symmetrix VMax, CLARiiON and unified storage) along with their RSA security and Ionix IRM tools. VMware is providing their vSphere hypervisors running on Intel based servers (via Cisco).

The components include:

  • EMC Ionix management tools and framework – The IRM tools
  • EMC RSA security framework software – The security tools
  • EMC VMware vSphere hypervisor virtualization software – The virtualization layer
  • EMC VMax, CLARiiON and unified storage systems – The storage
  • Cisco Nexus 1000v and MDS switches – The Network and connectivity
  • Cisco Unified Computing System (UCS) – The physical servers
  • Services and support – Cross technology domain presales, delivery and professional services

CiscoEMCVMwarevblock.jpg
Figure 2: Source: Cisco vblock (Server, Storage, Networking and Virtualization Software) via Cisco

The three vblock models are:
Vblock0: entry level system due out in 2010 supporting 300 to 800 VMs for initial customer consolidation, private clouds or other diverse applications in small or medium sized business. You can think of this as a SAN in a CAN or Data Center in a box with Cisco UCS and Nexus 1000v, EMC unified storage secured by RSA and VMware vSphere.

Vblock1: mid sized building block supporting 800 to 3000 VMs for consolidation and other optimization initiatives using Cisco UCS, Nexus and MDS switches along with EMC CLARiiON storage secured with RSA software hosting VMware hypervisors.

Vblock2: high end building block supporting 3000 to 6000 VMs for large scale data center transformation or new virtualization efforts, combining Cisco Unified Computing System (UCS), Nexus 1000v and MDS switches and EMC Symmetrix VMax storage with RSA security software hosting VMware vSphere hypervisors.
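As a rough illustration of how the three size bands line up, here is a minimal Python sketch that maps a planned VM count to candidate vblock models. The VM ranges come from the descriptions above; the `VBLOCKS` table and the `candidate_vblocks` helper are hypothetical names made up for this example, not anything from Cisco, EMC or VMware.

```python
# VM ranges as described above; the table and function names are
# hypothetical and exist only for this illustration.
VBLOCKS = [
    ("Vblock0", 300, 800),    # entry level
    ("Vblock1", 800, 3000),   # mid sized
    ("Vblock2", 3000, 6000),  # high end
]


def candidate_vblocks(vm_count):
    """Return the vblock models whose stated VM range covers vm_count."""
    return [name for name, low, high in VBLOCKS if low <= vm_count <= high]


if __name__ == "__main__":
    for vms in (500, 800, 4500, 7000):
        print(vms, candidate_vblocks(vms))
```

Note that the published ranges touch at 800 and 3000 VMs, so a count on a boundary matches two models, and anything outside 300 to 6000 VMs matches none.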

What does this all mean?
With this move, for some it will add fuel to the campfire that Cisco is moving closer to EMC and/or VMware with a pre-nuptial via Acadia. For others, this will be seen as fragmentation for virtualization, particularly if other vendors such as Dell, Fujitsu, HP, IBM and Microsoft among others are kept out of the game, not to mention the barriers for their channels of VARs or IT customers.

Acadia is a new company or more precisely, a joint venture being created by major backers EMC and Cisco with minority backers being VMware and Intel.

This is like other joint ventures, for example those commonly seen in the airline industry (e.g. a transportation utility) where carriers pool resources. Take SkyTeam, whose members include Delta, which had a JV with Air France, owner of KLM, which in turn had an antitrust immunity JV with Northwest (now being digested by Delta).

These joint ventures can range from simple marketing alliances like you see with EMC programs such as their Select program to more formal OEM to ownership as is the case with VMware and RSA to this new model for Acadia.

An airline analogy may not be the most appropriate, yet there are some interesting similarities, not the least of which is that air carriers rely on information systems and technologies provided by members of this coalition among others. There is also a correlation in that joint ventures are about streamlining and creating a seamless end to end customer experience. That is, give them enough choice and options, keep them happy, take out the complexities and hopefully some cost, and with customer control comes revenue and margin or profits.

Certainly there are opportunities to streamline and not just simply cut corners. Perhaps that's another analogy with the airlines, where there is a current focus on cutting, and on nickel and diming for services. Hopefully Acadia and VCE are not just another example of vendors getting together around the campfire to sing Kumbaya in the name of increasing customer adoption, cost cutting or putting a marketing spin on how to sell more to customers for account control.

Now with all due respect to the individual companies and personnel, at least in this iteration, it is not as much about the technology or packaging. Likewise, it is also not just about bundling, integration and testing (they are important), as we have seen similar solutions before.

Rather, I think this has the potential for changing the way server, storage and networking hardware along with IRM and virtualization software are sold into organizations, for the better or worse.

What I'm watching is how Acadia and their principal backers can navigate the channel maze and ultimately the customer maze to sell a cross technology domain solution. For example, will a sales call require six to fourteen legs (e.g. one person is a two legged call, for those not up on sales or vendor lingo) with a storage, server, networking, VMware, RSA, Ionix and services representative?

Or, can a model that drives down the number of people or product specialists involved in a given sales call be achieved by leveraging people with cross technology domain skills (e.g. someone who can speak server and storage hardware and software along with networking)?

Assuming Acadia and VCE vblocks address product integration issues, I see the bigger issue as being streamlining the sales process (including compensation plans) along with how partners are dealt with, not to mention customers.

How will the sales pitch be made to the Cisco network people at VARs or customer sites, or to the storage, server or VMware teams, or all of the above?

What about the others?
Cisco has relationships with Dell, HP, IBM, Microsoft and Oracle/Sun among others, and they will be stepping on those partners' toes even more than when they launched the UCS earlier this year. EMC for its part is fairly diversified and is not as subservient to IBM, however it has a history of partnering with Dell, Oracle and Microsoft among others.

VMware has a smaller investment and is thus more in the wings, as is Intel, given that both have large partnerships with Dell, HP, IBM and Microsoft. Microsoft is of interest here because on one front the bulk of all servers virtualized into VMware VMs are Windows based.

On the other hand, Microsoft has their own virtualization hypervisor, Hyper-V, that depending upon how you look at it could be a competitor of VMware or simply a nuisance. I'm of the mindset that it's still too early, so don't judge this game on the first round, which VMware has won. Keep in mind the history, such as the desktop and browser wars that Microsoft lost in the first round only to come back strong later. This move could very well invigorate Microsoft, or perhaps Oracle or Citrix among others.

Now this is far from the first time that we have seen alliances, coalitions, marketing or sales promotion cross technology vendor clubs in the industry let alone from the specific vendors involved in this announcement.

One that comes to mind was 3Com's failed attempt in the late 90s to become the first traditional networking vendor to get into SANs. That was many years before Cisco could spell SAN, let alone incubate their Andiamo startup. The 3Com initiative, which was cancelled due to financial issues literally on the eve of rollout, was to include the likes of STK (pre-Sun), Qlogic, Anchor (people were still learning how to spell Brocade), Crossroads (FC to SCSI routers for tape), Legato (pre-EMC), DG CLARiiON (pre-EMC) and MTI (sold their patents to EMC, became a reseller, now defunct), along with some others slated to jump on the bandwagon.

Let's also not forget that among the traditional networking market vendors Cisco is the $32B giant, and all of the others including 3Com, Brocade, Broadcom, Ciena, Emulex, Juniper and Qlogic are the seven plus dwarfs. However, keep the $23B USD networking vendor Huawei, which is growing at a 45% annual rate, in mind.

I would keep an eye on AMD, Brocade, Citrix, Dell, Fujitsu, HP, Huawei, Juniper, Microsoft, NetApp, Oracle/Sun, Rackable and Symantec among many others for similar joint venture or marketing alliances.

Some of these have already surfaced with Brocade and Oracle sharing hugs and chugs (another sales term referring to alliance meetings over beers or shots).

Also keep in mind that VMware has a large software (customer business) footprint deployed on HP with Intel (and AMD) servers.

Oh, and those VMware based VMs running on HP servers also just happen to be hosting in the neighborhood of 80% or more Windows based guest operating systems. I would say it's game on time.

When I say it's game on time, I don't think VMware is brash enough to cut HP (or others) off, forcing them to move to Microsoft for virtualization. However the game is about control: control of technology stacks and partnerships, control of VARs, integrators and the channel, as well as control of customers.

If you cannot tell, I find this topic fun and interesting.

For those who only know me from servers they often ask when did I learn about networking to which I say check out one of my books (Resilient Storage Networks-Elsevier). Meanwhile for others who know me from storage I get asked when did I learn about or get into servers to which I respond about 28 years ago when I worked in IT as the customer.

Bottom line on Acadia, vblocks and VCE for now, I like the idea of a unified and bundled solution as long as they are open and flexible.

On the other hand, I have many questions and am even skeptical in some areas, including how this plays out for Cisco and EMC in terms of whether it can be a unifier or will be polarizing, causing market fragmentation.

For some this is or will be déjà vu, back to the future, while for others it is a new, exciting and revolutionary approach, and for still others it will be new fodder for smack talk!

More to follow soon.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Optimize Data Storage for Performance and Capacity Efficiency

This post builds on a recent article I did that can be read here.

Even with tough economic times, there is no such thing as a data recession! Thus it is important to optimize data storage efficiency, addressing both performance and capacity without impacting availability, in a cost effective way to do more with what you have.

What this means is that even though budgets are tight or have been cut resulting in reduced spending, overall net storage capacity is up year over year by double digits if not higher in some environments.

Consequently, there is continued focus on stretching available IT and storage related resources or footprints further while eliminating barriers or constraints. IT footprint constraints can be physical in a cabinet or rack as well as floorspace, power or cooling thresholds and budget among others.

Constraints can be due to lack of performance (bandwidth, IOPS or transactions), poor response time or lack of availability for some environments. Yet for other environments, constraints can be lack of capacity, limited primary or standby power or cooling constraints. Other constraints include budget, staffing or lack of infrastructure resource management (IRM) tools and time for routine tasks.

Look before you leap
Before jumping into an optimization effort, gain insight if you do not already have it as to where the bottlenecks exist, along with the cause and effect of moving or reconfiguring storage resources. For example, boosting capacity use to more fully use storage resources can result in a performance issue or data center bottlenecks for other environments.

An alternative scenario is that in the quest to boost performance, storage is seen as being under-utilized, yet when capacity use is increased, lo and behold, response time deteriorates. The result can be a vicious cycle, hence the need to address the issue, as opposed to moving problems, by using tools to gain insight on resource usage, both space and activity or performance.

Gaining insight means looking at capacity use along with performance and availability activity and how they use power, cooling and floor-space. Consequently an important tool is to gain insight and knowledge of how your resources are being used to deliver various levels of service.

Tools include storage or system resource management (SRM) tools that report on storage space capacity usage, performance and availability with some tools now adding energy usage metrics along with storage or system resource analysis (SRA) tools.

Cooling Off
Power and cooling are commonly talked about as constraints, either from a cost standpoint, or availability of primary or secondary (e.g. standby) energy and cooling capacity to support growth. Electricity is essential for powering IT equipment including storage enabling devices to do their specific tasks of storing data, moving data, processing data or a combination of these attributes.

Thus, power gets consumed, some work or effort to move and store data takes place, and the byproduct is heat that needs to be removed. In a typical IT data center, cooling on average can account for about 50% of energy used, with some sites using less.

With cooling being a large consumer of electricity, a small percentage change to how cooling consumes energy can yield large results. Addressing cooling energy consumption can be done to address budget or cost issues, or to enable cooling capacity to be freed up to support installation of extra storage or other IT equipment.

Keep in mind that effective cooling relies on removing heat from as close to the source as possible to avoid over cooling, which requires more energy. If you have not done so, have a facilities review or assessment performed, which can range from a quick walk around to a more in-depth review and thermal airflow analysis. Techniques for removing heat close to the source include intelligent, precision or smart cooling, also known by other marketing names.

Powering Up, or, Powering Down
Speaking of energy or power, in addition to addressing cooling, there are a couple of ways of addressing power consumption by storage equipment (Figure 1). The most commonly discussed approach towards efficiency is energy avoidance, which involves powering down storage when not in use, such as first generation MAID, at the cost of performance.

For off-line storage, tape and other removable media give low-cost capacity per watt with low to no energy needed when not in use. Second generation (e.g. MAID 2.0) solutions with intelligent power management (IPM) capabilities have become more prevalent enabling performance or energy savings on a more granular or selective basis often as a standard feature in common storage systems.

GreenOptionsBalance
Figure 1:  How various RAID levels and configuration impact or benefit footprint constraints

Another approach to energy efficiency is seen in figure 1, which is doing more work for active applications per watt of energy to boost productivity. This can be done by using the same amount of energy while doing more work, or by doing the same amount of work with less energy.

For example, instead of using larger capacity disks to improve capacity per watt metrics, active or performance sensitive storage should be looked at on an activity basis such as IOPS, transactions, videos, emails or throughput per watt. Hence, a fast disk drive doing work can be more energy-efficient in terms of productivity than a higher capacity slower disk drive for active workloads, whereas for idle or inactive storage the inverse should hold true.
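To make the two metrics concrete, here is a minimal Python sketch comparing activity per watt with capacity per watt. The drive figures (IOPS and watts in particular) are illustrative assumptions picked for the example, not measurements or vendor specifications, and the `Drive` class and `best_for` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    capacity_gb: float
    iops: float    # assumed sustained IOPS, illustrative only
    watts: float   # assumed average power draw, illustrative only

    def iops_per_watt(self):
        """Activity metric for active or performance sensitive storage."""
        return self.iops / self.watts

    def gb_per_watt(self):
        """Capacity metric for idle or inactive storage."""
        return self.capacity_gb / self.watts


def best_for(drives, active):
    """Pick the drive with the better metric for the workload type."""
    metric = Drive.iops_per_watt if active else Drive.gb_per_watt
    return max(drives, key=metric)


# Hypothetical drives; substitute figures for your own devices.
fast = Drive("146GB 15K RPM", capacity_gb=146, iops=180, watts=15)
big = Drive("1TB 7200 RPM", capacity_gb=1000, iops=80, watts=10)

if __name__ == "__main__":
    print("Active workload:", best_for([fast, big], active=True).name)
    print("Idle workload:  ", best_for([fast, big], active=False).name)
```

With these assumed numbers the faster drive wins on IOPS per watt while the larger drive wins on capacity per watt, which is the point: the right metric depends on whether the storage is active or idle.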

On a go forward basis the trend already being seen with some servers and storage systems is to do both more work, while using less energy. Thus a larger gap between useful work (for active or non idle storage) and amount of energy consumed yields a better efficiency rating, or, take the inverse if that is your preference for smaller numbers.

Reducing Data Footprint Impact
Data footprint impact reduction tools or techniques for both on-line as well as off-line storage include archiving, data management, compression, deduplication, space-saving snapshots, thin provisioning along with different RAID levels among other approaches. From a storage access standpoint, you can also include bandwidth optimization, data replication optimization, protocol optimizers along with other network technologies including WAFS/WAAS/WADM to help improve efficiency of data movement or access.

Thin provisioning for capacity centric environments can be used to achieve a higher effective storage use level by essentially over booking storage, similar to how airlines oversell seats on a flight. If you have good historical information and insight into how storage capacity is used and over allocated, thin provisioning enables improved effective storage use to occur for some applications.

However, with thin provisioning, avoid introducing performance bottlenecks by leveraging solutions that work closely with tools that provide historical trending information (capacity and performance).
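The over booking idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function name, the volume sizes and the 80% warning threshold are all hypothetical, and a real deployment would rely on the trending data from its own reporting tools.

```python
def thin_provisioning_summary(physical_tb, allocated_tb, used_tb, warn_at=0.8):
    """Summarize an over booked (thin provisioned) storage pool.

    physical_tb  -- real capacity installed in the pool
    allocated_tb -- capacity promised to each volume
    used_tb      -- capacity actually written by each volume
    warn_at      -- assumed utilization threshold for adding capacity
    """
    total_allocated = sum(allocated_tb)
    total_used = sum(used_tb)
    utilization = total_used / physical_tb
    return {
        "oversubscription_ratio": total_allocated / physical_tb,
        "physical_utilization": utilization,
        "headroom_tb": physical_tb - total_used,
        "needs_attention": utilization >= warn_at,
    }


if __name__ == "__main__":
    # Hypothetical pool: 100TB physical, 150TB promised, 45TB written.
    summary = thin_provisioning_summary(100, [40, 50, 60], [10, 20, 15])
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Like an airline watching bookings against seats, the number to watch over time is actual use against physical capacity, not what has been promised.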

For a technology that some have tried to declare as being dead to prop up other new or emerging solutions, RAID remains relevant given its widespread deployment and transparent reliance in organizations of all sizes. RAID also plays a role in storage performance, availability, capacity and energy constraints, as well as being a relief tool.

The trick is to align the applicable RAID configuration to the task at hand meeting specific performance, availability, capacity or energy along with economic requirements. For some environments a one size fits all approach may be used while others may configure storage using different RAID levels along with number of drives in RAID sets to meet specific requirements.


Figure 2:  How various RAID levels and configuration impact or benefit footprint constraints

Figure 2 shows a summary and tradeoffs of various RAID levels. In addition to the RAID level, the number of disks in a group can also have an impact on performance or capacity. For example, by creating a larger RAID 5 or RAID 6 group, the parity overhead can be spread out, however there is a tradeoff. Tradeoffs can be performance bottlenecks on writes or during drive rebuilds, along with potential exposure to drive failures.
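To show how parity overhead spreads out as a group grows, here is a minimal Python sketch of usable capacity by RAID level. It uses the standard textbook overheads (one drive of parity for RAID 5, two for RAID 6, half for mirroring) and ignores hot spares, formatting overhead and vendor specific layouts. The function name is hypothetical.

```python
def raid_usable_fraction(level, drives):
    """Fraction of raw capacity left for data in one RAID group."""
    if level == "RAID0":
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return 1.0                      # striping only, no protection
    if level == "RAID10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return 0.5                      # every block is mirrored
    if level == "RAID5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) / drives    # one drive worth of parity
    if level == "RAID6":
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drives - 2) / drives    # two drives worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    for level, drives in [("RAID5", 5), ("RAID5", 15), ("RAID6", 8), ("RAID10", 8)]:
        fraction = raid_usable_fraction(level, drives)
        print(f"{level} with {drives} drives: {fraction:.0%} usable")
```

Going from a 5 drive to a 15 drive RAID 5 group raises usable capacity from 80% to about 93%, but more drives share the same single parity protection, which is the rebuild time and failure exposure tradeoff described above.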

All of this comes back to a balancing act to align to your specific needs as some will go with a RAID 10 stripe and mirror to avoid risks, even going so far as to do triple mirroring along with replication. On the other hand, some will go with RAID 5 or RAID 6 to meet cost or availability requirements, or, some I have talked with even run RAID 0 for data and applications that need the raw speed, yet can be restored rapidly from some other medium.

Let's bring it all together with an example
Figure 3 shows a generic example of a before and after optimization for a mixed workload environment, granted you can increase or decrease the applicable capacity and performance to meet your specific needs. In figure 3, the storage configuration consists of one storage system setup for high performance (left) and another for high-capacity secondary (right), disk to disk backup and other near-line needs, again, you can scale the approach up or down to your specific need.

For the performance side (left), 192 x 146GB 15K RPM (28TB raw) disks provide good performance, however with low capacity use. This translates into a low capacity per watt value, however with reasonable IOPS per watt and some performance hot spots.

On the capacity centric side (right), there are 192 x 1TB disks (192TB raw) with good space utilization, however with some performance hot spots or bottlenecks and constrained growth, not to mention low IOPS per watt with reasonable capacity per watt. In the before scenario, the joint energy use (both arrays) is about 15 kW or 15,000 watts, which translates to about $16,000 annual energy costs (cooling excluded) assuming an energy cost of 12 cents per kWh.

Note, your specific performance, availability, capacity and energy mileage will vary based on particular vendor solution, configuration along with your application characteristics.
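The annual energy cost figure for the before scenario can be reproduced with simple arithmetic: a constant 15 kW load running 8,760 hours a year at 12 cents per kWh comes to roughly $15,768, or about $16,000. A minimal Python sketch follows. The function name is hypothetical, and a constant year round load is a simplifying assumption.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_cost(load_kw, price_per_kwh):
    """Annual cost of a constant electrical load, cooling excluded."""
    kwh_per_year = load_kw * HOURS_PER_YEAR
    return kwh_per_year * price_per_kwh


if __name__ == "__main__":
    # Before scenario: both arrays draw about 15 kW at $0.12 per kWh.
    cost = annual_energy_cost(load_kw=15, price_per_kwh=0.12)
    print(f"Annual energy cost: ${cost:,.0f}")
```

Swap in your own load and local energy rate to see how far your mileage varies.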


Figure 3: Baseline before and after storage optimization (raw hardware) example

Building on the example in figure 3, a combination of techniques along with technologies yields a net performance, capacity and perhaps feature functionality (depends on specific solution) increase. In addition, floor-space, power, cooling and associated footprints are also reduced. For example, the resulting solution shown (middle) comprises 4 x 250GB flash SSD devices, along with 32 x 450GB 15.5K RPM and 124 x 2TB 7200RPM disks, enabling a 53TB (raw) capacity increase along with a performance boost.

The previous examples are based on raw or baseline capacity metrics, meaning that further optimization techniques should yield improved benefits. These examples should also help to address the question or myth that it costs more to power storage than to buy it, to which the answer should be: it depends.

If you can buy the above solution for say under $50,000 (cost to power), let alone $100,000 (power and cool) for three years, which would also be a good acquisition, then the myth that powering is more expensive than buying holds true. However, if a solution as described above costs more, then the story changes, along with other variables including energy costs for your particular location, reinforcing the notion that your mileage will vary.
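The buy versus power comparison can be sketched as follows in Python. It assumes the roughly $16,000 annual energy figure from the before scenario and a three year period. The `cooling_multiplier` of 2.0 is an assumption based on the earlier rough figure of cooling accounting for about half of energy used. The function names and purchase prices are hypothetical.

```python
def power_cost_over_period(annual_power_cost, years=3, cooling_multiplier=1.0):
    """Energy cost over a period; use a multiplier above 1.0 to add cooling."""
    return annual_power_cost * years * cooling_multiplier


def powering_costs_more_than_buying(purchase_price, annual_power_cost,
                                    years=3, cooling_multiplier=1.0):
    """True when energy over the period exceeds the purchase price."""
    energy = power_cost_over_period(annual_power_cost, years, cooling_multiplier)
    return energy > purchase_price


if __name__ == "__main__":
    annual = 16_000  # approximate annual energy cost, cooling excluded
    print("3 years, power only:    ", power_cost_over_period(annual))
    print("3 years, power and cool:", power_cost_over_period(annual, cooling_multiplier=2.0))
    # Hypothetical purchase prices for comparison.
    for price in (45_000, 150_000):
        result = powering_costs_more_than_buying(price, annual, cooling_multiplier=2.0)
        print(f"Purchase price ${price:,}: powering costs more? {result}")
```

Three years at about $16,000 is $48,000 for power alone and about $96,000 with cooling, which is where the under $50,000 and $100,000 thresholds above come from.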

Another tip is that more is not always better.

That is, more disks, ports, processors, controllers or cache do not always equate to better performance. Performance is the sum of how those and other pieces work together in a demonstrable way, ideally with your specific application workload as opposed to what is on a product data sheet.

Additional general tips include:

  • Align the applicable tool, technique or technology to task at hand
  • Look to optimize for both performance and capacity, active and idle storage
  • Consolidated applications and servers need fast servers
  • Fast servers need fast I/O and storage devices to avoid bottlenecks
  • For active storage use an activity per watt metric such as IOPS or transactions per watt
  • For inactive or idle storage, a capacity per watt per footprint metric would apply
  • Gain insight and control of how storage resources are used to meet service requirements

It should go without saying, however sometimes what is understood needs to be restated.

In the quest to become more efficient and optimized, avoid introducing performance, quality of service or availability issues by moving problems.

Likewise, look beyond storage space capacity also considering performance as applicable to become efficient.

Finally, it is all relative in that what might be applicable to one environment or application need may not apply to another.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

EPA Energy Star for Data Center Storage Update

EPA Energy Star

Following up on a recent post about Green IT, energy efficiency and optimization for servers, storage and more, here are some additional thoughts and perspectives along with industry activity around the U.S. Environmental Protection Agency (EPA) Energy Star for Servers, Data Center Storage and Data Centers.

First a quick update: Energy Star for Servers is in place, with work now underway on expanding and extending beyond the first specification. Second, the Energy Star for Data Center Storage definition is well underway, including a recent workshop to refine the initial specification along with discussion of follow-on drafts.

Energy Star for Data Centers is also currently undergoing definition which is focused more on macro or facility energy (notice I did not say electricity) efficiency as opposed to productivity or effectiveness, items that the Server and Storage specifications are working towards.

Among all of the different industry trade or special interest groups, at least on the storage front the Storage Networking Industry Association (SNIA) Green Storage Initiative (GSI) and their Technical Work Groups (TWG) have been busily working for the past couple of years on taxonomies, metrics and other items in support of EPA Energy Star for Data Center Storage.

A challenge for SNIA along with others working on related material pertaining to storage and efficiency is the multi-role functionality of storage. That is, some storage simply stores data with little to no performance requirements while other storage is actively used for reading and writing. In addition, there are various categories, architectures not to mention hardware and software feature functionality or vendors with different product focus and interests.

Unlike servers that are either on and doing work, or off or in low power mode, storage is either doing active work (e.g. moving data), storing in-active or idle data, or a combination of both. Hence for some, energy efficiency is about how much data can be stored in a given footprint with the least amount of power, known as in-active or idle measurement.

On the other hand, storage efficiency is also about using the least amount of energy to produce the most amount of work or activity, for example IOPS or bandwidth per watt per footprint.

Thus the challenge and need for at least a two dimensional model that looks at and reflects different types or categories of storage aligned for active or in-active (e.g. storing) data, enabling apples to apples rather than apples to oranges comparisons.

This is not all that different from how EPA looks at motor vehicle categories of economy cars, sport utility, work or heavy utility among others when doing different types of work, or, in idle.

What does this have to do with servers and storage?

Simple, when a server powers down where does its data go? That’s right, to a storage system using disk, ssd (RAM or flash), tape or optical for persistency. Likewise, when there is work to be done, where does the data get read into computer memory from, or written to? That’s right, a storage system. Hence the need to look at storage in a multi-tenant manner.

The storage industry is diverse, with some vendors or products focused on performance or activity, while others focus on long term, low cost persistent storage for archive and backup, not to mention some doing a bit of both. Hence the analogy of herding cats towards a common goal when different parties have various interests that may conflict, yet support the needs of various customer storage usage requirements.

Figure 1 shows a simplified, streamlined storage taxonomy that has been put together by SNIA representing various types, categories and functions of data center storage. The green shaded areas are a good step in the right direction to simplify yet move towards realistic and achievable benefits for storage consumers.


Figure 1 Source: EPA Energy Star for Data Center Storage web site document

The importance of the streamlined SNIA taxonomy is to help differentiate or characterize various types and tiers of storage products (Figure 2), facilitating apples to apples comparison instead of apples to oranges. For example, on-line primary storage needs to be looked at in terms of how much work or activity per energy footprint determines efficiency.


Figure 2: Tiered Storage Example

On the other hand, storage for retaining large amounts of data that is in-active or idle for long periods of time should be looked at on a capacity per energy footprint basis. While final metrics are still being fleshed out, some examples could be active storage gauged by IOPS or work or bandwidth per watt of energy per footprint, while other storage for idle or inactive data could be looked at on a capacity per energy footprint basis.

Which benchmarks or workloads will be used for simulating or measuring work or activity is still being discussed, with proposals coming from various sources. For example, the SNIA GSI TWG are developing measurements and discussing metrics, as have the Storage Performance Council (SPC) and SPEC among others, including use of simulation tools such as IOmeter, VMware VMmark, TPC, Bonnie, or perhaps even Microsoft ESRP.

Tenets of Energy Star for Data Center Storage over time hopefully will include:

  • Reflective of different types, categories, price-bands and storage usage scenarios
  • Measure storage efficiency for active work along with in-active or idle usage
  • Provide insight for both storage performance efficiency and effective capacity
  • Baseline or raw storage capacity along with effective enhanced optimized capacity
  • Easy to use metrics with more in-depth background or disclosure information

Ultimately the specification should help IT storage buyers and decision makers to compare and contrast different storage systems that are best suited and applicable to their usage scenarios.

This means measuring work or activity per energy footprint at a given capacity and data protection level to meet service requirements, as well as during in-active or idle periods. This also means showing storage that is capacity focused in terms of how much data can be stored in a given energy footprint.

One thing that will be tricky however will be differentiating GBytes per watt in terms of capacity, or, in terms of performance and bandwidth.
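
To illustrate why "GBytes per watt" alone is ambiguous, here is a small Python sketch that carries the unit along with the number. The class, the function names and the sample system are hypothetical; the point is only that a capacity figure and a bandwidth figure should never share an unlabeled "GB per watt" column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerWatt:
    value: float
    unit: str   # e.g. "GB stored/W" or "GB/s moved/W"

    def __str__(self):
        return f"{self.value:g} {self.unit}"


def capacity_per_watt(gb_stored, watts):
    """Capacity view: how much data is held per watt."""
    return PerWatt(gb_stored / watts, "GB stored/W")


def bandwidth_per_watt(gb_per_second, watts):
    """Performance view: how much data is moved per second per watt."""
    return PerWatt(gb_per_second / watts, "GB/s moved/W")


# Hypothetical system: 100,000 GB stored, 2 GB/s sustained, 1,000 W.
print(capacity_per_watt(100_000, 1_000))   # 100 GB stored/W
print(bandwidth_per_watt(2, 1_000))        # 0.002 GB/s moved/W
```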

Stay tuned for more on Energy Star for Data Centers, Servers and Data Center Storage.

Ok, nuff said.

Cheers gs


Clouds are like Electricity: Don’t be Scared

Clouds

IT clouds (compute, applications, storage, and services) are like electricity in that they can be scary or confusing to some while being enabling or a necessity to others, not to mention being a polarizing force depending on where you sit or how you view them.

As a polarizing force, if you are a cloud crowd cheerleader or evangelist, you might view someone who does not subscribe or share your excitement, views or interpretations as a cynic.

On the other hand, if you are a skeptic, or perhaps scared or even a cynic, you might view anyone who talks about cloud in general or non-specific terms as a cheerleader.

I have seen and experienced this electrifying polarization first hand, having been told by cloud crowd cheerleaders or evangelists that I don’t like clouds, that I’m a cynic who does not know anything about clouds.

As a funny aside (at least I thought it was funny), I recently asked someone who gave me an earful while they were trying to convert me to be a cloud believer if they had read any of the chapters in my new book The Green and Virtual Data Center (CRC). The response was no, and I said something to the effect of too bad, as in the book I talk about how clouds can be complementary to existing IT resources as another tier of servers, storage, applications, facilities and IT services.

On the other hand, and this might be funny for some of the cloud crowd, when I bring up tiered IT resources including servers, storage, applications and facilities as well as where or how clouds can fit to complement IT, I have been told by cynics or naysayers that I’m a cloud cheerleader.

Wow, talk about polarized sides!

Now, what about all those that are somewhere in the middle, those that are skeptics who might see value for IT clouds for different scenarios and may in fact already be using clouds (depending upon someone’s definition)?

For those in the middle, whether they are vendors, VARs, media, press, analysts, consultants, IT professionals, investors or others, they can easily be misunderstood, misrepresented, and a missed opportunity, perhaps even lamented by those on either of the two extremes (e.g. cloud crowd cheerleaders or true skeptic naysayers).

Time for some education, don’t be scared, however be careful!

When I worked for an electric power generating and transmission utility, an important lesson was not to be scared of electricity but to be educated about what to do and what not to do in different situations, including in the actual power plant or substation. I was taught that when in the actual plant, or at a substation, which I visited in support of the applications and systems I was developing or maintaining, to do certain things. For example, number one, don’t touch certain things. Number two, if you fall, don’t grab anything: the fall may or may not hurt you, let alone the sudden stop wherever you land; however, if you grab something, that might kill you, and you may not be able to let go, further injuring yourself. This was a challenging thought as we are taught to grab onto something when falling.

What does this have to do with clouds?

Don’t grab and hang-on if you don’t know what you are grabbing on to if you don’t have to.

The cloud crowd can be polarizing and in some ways acts as a lightning rod, drawing scorn, cynicism, skepticism, lambasting or being poked fun at given some of the over the top hype around clouds today. Now granted, not all cloud evangelists, vendors or cheerleaders deserve to bear the brunt of some of this backlash within the industry; however, it comes with the territory.

I’m in the middle, as I pointed out above, when I talk with vendors, VARs, media, investors and IT customers. Some I talk with are using clouds (perhaps not compliant with some of the definitions). Some are looking at clouds to move problems or mask issues, others are curious yet skeptical to see where or how they could use clouds to complement their environments. Yet others are scared; however, maybe in the future they will be more open minded as they become educated and see technologies evolve or shift beyond a fashionable trend.

So it’s time for disclosure: I see IT clouds as being complementary and able to co-exist with other IT resources (servers, storage, software). In essence, my view is that clouds are just another tier of IT resources to be used when and where applicable as opposed to being a complete replacement, or, simply ignored.

My point is that cloud computing is another tier of traditional computing or servers providing different performance, availability, capacity, economic and management attributes compared to other traditional technology delivery vehicles. Same thing with storage, same thing with data centers or hosting sites in general. This also applies to application services, in that a cloud web, email, expense, sales, CRM, ERP, office or other application is a tier of those same implementations that may exist in a traditional environment. After all, legacy, physical, virtual, grid and cloud IT data centers all have something in common: they rely on physical servers, storage, networks, software, metrics and management involving people, processes and best practices.

Now back to disclosure: I like clouds; however, I’m not a cloud cheerleader. I’m a skeptic at times of some over the top hype, yet I also personally use some cloud services and technologies as well as advise others to leverage cloud services when or where applicable to complement, co-exist and help enable a green and virtual data center and information factory.

To the cloud crowd cheerleaders, too bad if I don’t line up with all of your belief systems or if you perceive me as raining on your parade by being a skeptic, or what you might think of as a cynic and non-believer, even though I use clouds myself.

Likewise, to the true cynics (not skeptics) or naysayers, ease up, I’m not drinking the Kool-Aid of the cheerleaders and evangelists, or at least not in large excessive binge doses. I agree that clouds are not the solution to every IT issue, regardless of what your definition of a cloud happens to be.

To everyone else, regardless of whether you are in the minority or majority out there that do not fall into one of the two above groups, I have this to say.

Don’t be afraid, don’t be scared of clouds. Learn to navigate your way around and through the various technologies, techniques, products and services and identify where they might complement and enable a flexible, scalable and resilient IT infrastructure.

Take some time to listen and learn, and become educated on the different types of clouds (public, private, services, products, architectures, or marketecture), their attributes (compute, storage, applications, services, cost, availability, performance, protocols, functionality) and value propositions.

Look into how cloud technologies and techniques might complement your existing environment to meet specific business objectives. You might find there are fits, you might find there are not; however, have a look and do some research so that you can at least hold your ground if storm clouds roll in.

After all, clouds are just another tier of IT resources to add to your tool box enabling more efficient and effective IT services delivery. Clouds do not have to be the all or nothing value proposition that often ends up in discussions due to polarized extreme views and definitions or past experiences.

Look at it this way, IT relies on electricity, however electricity needs to be understood and respected, not to mention used in effective ways. You can be scared of electricity, you can be cavalier around it, or it can be part of your environment and an enabler as long as you know when, where and how to use it, not to mention when not to use it.

So next time you see a cloud crowd cheerleader, give them a hug, give them a pat on the back, an atta boy or atta girl as they are just doing their jobs, perhaps even following their beliefs and in the line of duty taking a lot of heat from the industry in the pursuit of their work.

On the other hand, as to the cynics and naysayers, they may in fact be using clouds already, perhaps not under the strict definition of some of the chieftains of the cloud crowd.

To everyone else, don’t worry, don’t be scared of the clouds; instead, focus on your business and your IT issues, and look at various tiers of technologies that can serve as an enabler in a cost effective manner.

Ok, nuff said.

Cheers gs


Storage Efficiency and Optimization – The Other Green

For those of you in the New York City area, I will be presenting live in person at the Storage Decisions conference on September 23, 2009: The Other Green, Storage Efficiency and Optimization.

Throw out the "green" buzzword, and you’re still left with the task of saving or maximizing use of space, power, and cooling while stretching available IT dollars to support growth and business sustainability. For some environments the solution may be consolidation, while others need to maintain quality of service response time, performance and availability, necessitating faster, energy efficient technologies to achieve optimization objectives.

To address these and other related issues, you can turn to the cloud, virtualization, intelligent power management, data footprint reduction and data management, not to mention various types of tiered storage and performance optimization techniques. The session will look at various techniques and strategies to optimize on-line active or primary as well as near-line or secondary storage environments during tough economic times, as well as to position for future growth. After all, there is no such thing as a data recession!

Topics, technologies and techniques that will be discussed include among others:

  • Energy efficiency (strategic) vs. energy avoidance (tactical), what’s different between them
  • Optimization and the need for speed vs. the need for capacity, finding the right balance
  • Metrics & measurements for management insight, what the industry is doing (or not doing)
  • Tiered storage and tiered access including SSD, FC, SAS, tape, clouds and more
  • Data footprint reduction (archive, compress, dedupe) and thin provisioning among others
  • Best practices, financial incentives and what you can do today

This is a free event for IT professionals, however space I hear is limited, learn more and register here.

For those interested in broader IT data center and infrastructure optimization, check out the on-going seminar series The Infrastructure Optimization and Planning Best Practices (V2.009) – Doing more with less without sacrificing storage, system or network capabilities. The seminar series continues September 22, 2009 with a stop in Chicago. This is also a free seminar; register and learn more here or here.

Ok, nuff said.

Cheers gs


Back to School and Dedupe School

Summer is over here in the northern hemisphere and it’s back to school time.

This coming week I will be the substitute teacher filling in for my friend Mr. Backup in Minneapolis and Toronto for TechTarget’s Dedupe School. If you are in either city and have not yet signed up, check out the link here to learn more.

Hope to see you this week, or next week at Infrastructure Optimization in Chicago or Storage Decisions in NYC, where I will also be presenting (or teaching, if you prefer) as well as listening and learning from the attendees about what’s on their minds.

Stay current on other upcoming activities on our events page, as well as see what’s new or in the news here.

Ok, nuff said.

Cheers gs


Data Center I/O Bottlenecks Performance Issues and Impacts

This is an excerpt blog version of the popular Server and StorageIO Group white paper "IT Data Center and Data Storage Bottlenecks" originally published August of 2006 that is as much if not more relevant today than it was in the past.

Most Information Technology (IT) data centers have bottleneck areas that impact application performance and service delivery to IT customers and users. Possible bottleneck locations shown in Figure-1 include servers (application, web, file, email and database), networks, application software, and storage systems. For example, users of IT services can encounter delays and lost productivity due to seasonal workload surges or Internet and other network bottlenecks. Network congestion or dropped packets resulting in wasteful and delayed retransmission of data can be the result of network component failure, poor configuration or lack of available low latency bandwidth.

Server bottlenecks due to lack of CPU processing power, memory or undersized I/O interfaces can result in poor performance or, in worst case scenarios, application instability. Application bottlenecks, including those in database systems, due to excessive locking, poor query design, data contention and deadlock conditions result in poor user response time. Storage and I/O performance bottlenecks can occur at the host server due to lack of I/O interconnect bandwidth such as an overloaded PCI interconnect, storage device contention, and lack of available storage system I/O capacity.

These performance bottlenecks impact most applications and are not unique to large enterprise or scientific high performance compute (HPC) environments. The direct impacts of data center I/O performance issues include general slowing of systems and applications, causing lost productivity time for users of IT services. Indirect impacts of data center I/O performance bottlenecks include additional management by IT staff to troubleshoot, analyze, re-configure and react to application delays and service disruptions.


Figure-1: Data center performance bottleneck locations

Data center performance bottleneck impacts (see Figure-1) include:

  • Under utilization of disk storage capacity to compensate for lack of I/O performance capability
  • Poor Quality of Service (QoS) causing Service Level Agreements (SLA) objectives to be missed
  • Premature infrastructure upgrades combined with increased management and operating costs
  • Inability to meet peak and seasonal workload demands resulting in lost business opportunity

I/O bottleneck impacts
It should come as no surprise that businesses continue to consume and rely upon larger amounts of disk storage. Disk storage and I/O performance fuel the hungry needs of applications in order to meet SLAs and QoS objectives. The Server and StorageIO Group sees that, even with efforts to reduce storage capacity or improve capacity utilization with information lifecycle management (ILM) and Infrastructure Resource Management (IRM) enabled infrastructures, applications leveraging rich content will continue to consume more storage capacity and require additional I/O performance. Similarly, at least for the next few years, the current trend of making and keeping additional copies of data for regulatory compliance and business continuity is expected to continue. These demands all add up to a need for more I/O performance capabilities to keep up with server processor performance improvements.


Figure-2: Processing and I/O performance gap

Server and I/O performance gap
The continued need for accessing more storage capacity results in an alarming trend: the expanding gap between server processing power and available I/O performance of disk storage (Figure-2). This server to I/O performance gap has existed for several decades and continues to widen instead of improving. The net impact is that bottlenecks associated with the server to I/O performance gap result in lost productivity for IT personnel and customers who must wait for transactions, queries, and data access requests to be resolved.

Application symptoms of I/O bottlenecks
There are many applications across different industries that are sensitive to timely data access and impacted by common I/O performance bottlenecks. For example, as more users access a popular file, database table, or other stored data item, resource contention will increase. One way resource contention manifests itself is in the form of database “deadlock” which translates into slower response time and lost productivity. 

Given the rise and popularity of internet search engines, search engine optimization (SEO) and on-line price shopping, some businesses have been forced to create expensive read-only copies of databases. These read-only copies are used to support more queries and to prevent bottlenecks from impacting time sensitive transaction databases.

In addition to increased application workload, IT operational procedures to manage and protect data contribute to performance bottlenecks. Data center operational procedures result in additional file I/O scans for virus checking, database purge and maintenance, data backup, classification, replication, data migration for maintenance and upgrades as well as data archiving. The net result is that essential data center management procedures contribute to performance challenges and impact business productivity.

Poor response time and increased latency
Generally speaking, as additional activity or application workload including transactions or file accesses are performed, I/O bottlenecks result in increased response time or latency (shown in Figure-3). With most performance metrics more is better; however, in the case of response time or latency, less is better. Figure-3 shows the impact as more work is performed (dotted curve) and resulting I/O bottlenecks have a negative impact by increasing response time (solid curve) above acceptable levels. The specific acceptable response time threshold will vary by applications and SLA requirements. The acceptable threshold level based on performance plans, testing, SLAs and other factors including experience serves as a guideline between acceptable and poor application performance.

As more workload is added to a system with existing I/O issues, response time will correspondingly increase as was seen in Figure-3. The more severe the bottleneck, the faster response time will deteriorate (e.g. increase) from acceptable levels. The elimination of bottlenecks enables more work to be performed while maintaining response time below acceptable service level threshold limits.


Figure-3: I/O response time performance impact
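
The shape of the response time curve can be approximated with a few lines of Python. This is a sketch using the textbook single-server (M/M/1) queueing formula, which is an assumption brought in for illustration and not a model taken from the white paper; the 5 ms service time and 20 ms threshold are likewise hypothetical.

```python
def response_time_ms(arrivals_per_sec, service_time_ms):
    """Average response time for a single-server (M/M/1) queue.

    R = S / (1 - U), where utilization U = arrival rate * service time.
    Returns infinity once the device is saturated (U >= 1).
    """
    utilization = arrivals_per_sec * service_time_ms / 1000.0
    if utilization >= 1.0:
        return float("inf")
    return service_time_ms / (1.0 - utilization)


SERVICE_TIME_MS = 5.0   # hypothetical time to service one I/O
THRESHOLD_MS = 20.0     # hypothetical acceptable response time

for rate in (20, 100, 150, 180, 190):
    r = response_time_ms(rate, SERVICE_TIME_MS)
    status = "ok" if r <= THRESHOLD_MS else "above threshold"
    print(f"{rate:>4} IO/s -> {r:6.1f} ms ({status})")
```

Under these assumptions response time doubles from 5 ms to 10 ms by the time the device is half busy, and then climbs steeply: the last few steps of added workload push it well past the threshold, which is the behavior the solid curve in Figure-3 depicts.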

Seasonal and peak workload I/O bottlenecks
Another common challenge and cause of I/O bottlenecks is seasonal and/or unplanned workload increases that result in application delays and frustrated customers. In Figure-4 a workload representing an eCommerce transaction based system is shown with seasonal spikes in activity (dotted curve). The resulting impact to response time (solid curve) is shown in relation to a threshold line of acceptable response time performance. For example, peaks due to holiday shopping exchanges appear in January and then drop off, activity increases again near Mother’s Day in May, and back to school shopping in August results in increased activity, as does holiday shopping starting in late November.


Figure-4: I/O bottleneck impact from surge workload activity

Compensating for lack of performance
Besides impacting user productivity due to poor performance, I/O bottlenecks can result in system instability or unplanned application downtime. One only needs to recall recent electric power grid outages that were due to instability and insufficient capacity bottlenecks as a result of increased peak user demand.

I/O performance improvement approaches to address I/O bottlenecks have been to do nothing (incur and deal with the service disruptions) or over configure by throwing more hardware and software at the problem. To compensate for lack of I/O performance and counter the resulting negative impact to IT users, a common approach is to add more hardware to mask or move the problem.

However, this often leads to extra storage capacity being added to make up for a shortfall in I/O performance. By over configuring to support peak workloads and prevent loss of business revenue, excess storage capacity must be managed throughout the non-peak periods, adding to data center and management costs. The resulting ripple effect is that now more storage needs to be managed, including allocating storage network ports, configuring, tuning, and backing up of data. This can and does result in environments that have storage utilization well below 50% of their useful storage capacity. The solution is to address the problem rather than moving and hiding the bottleneck elsewhere (rather like sweeping dust under the rug).
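
A short Python sketch shows how buying drives for performance leaves capacity stranded. The sizing rule (take the larger of the drive count needed for capacity and the count needed for I/O) is a common simplification, and the workload and drive figures are hypothetical.

```python
import math


def drives_needed(capacity_tb, iops, drive_tb, drive_iops):
    """Drives required to satisfy both the capacity and the I/O target."""
    for_capacity = math.ceil(capacity_tb / drive_tb)
    for_iops = math.ceil(iops / drive_iops)
    return max(for_capacity, for_iops)


def capacity_utilization(capacity_tb, drives, drive_tb):
    """Fraction of the purchased capacity that actually holds data."""
    return capacity_tb / (drives * drive_tb)


# Hypothetical workload and drive: 10 TB of data, 20,000 IOPS at peak,
# 1 TB drives that each sustain about 200 IOPS.
n = drives_needed(capacity_tb=10, iops=20_000, drive_tb=1, drive_iops=200)
print(n, capacity_utilization(10, n, 1))
```

In this made-up case the I/O target, not the amount of data, sets the drive count: 100 drives are bought to hold 10 TB, so only about 10% of the purchased capacity is used while all of it still has to be powered, cooled and managed.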

Business value of improved performance
Putting a value on the performance of applications and their importance to your business is a necessary step in the process of deciding where and what to focus on for improvement. For example, what is the value of reducing application response time and the associated business benefit of allowing more transactions, reservations or sales to be made? Likewise, what is the value of improving the productivity of a designer or animator to meet tight deadlines and market schedules? What is the business benefit of enabling a customer to search faster for an item, place an order, access media rich content, or in general improve their productivity?

Server and I/O performance gap as a data center bottleneck
I/O performance bottlenecks are a widespread issue across most data centers, affecting many applications and industries. Applications impacted by data center I/O bottlenecks to be looked at in more depth are electronic design automation (EDA), entertainment and media, database online transaction processing (OLTP) and business intelligence. These application categories represent transactional processing, shared file access for collaborative work, and processing of shared, time sensitive data.

Electronic design
Computer aided design (CAD), computer assisted engineering (CAE), electronic design automation (EDA) and other design tools are used for a wide variety of engineering and design functions. These design tools require fast access to shared, secured and protected data. The objective of using EDA and other tools is to enable faster product development with better quality and improved worker productivity. Electronic components manufactured for the commercial, consumer and specialized markets rely on design tools to speed the time-to-market of new products as well as to improve engineer productivity.

EDA tools, including those from Cadence, Synopsys, Mentor Graphics and others, are used to develop expensive and time sensitive electronic chips, along with circuit boards and other components to meet market windows and supplier deadlines. An example of this is a chip vendor being able to simulate, develop, test, produce and deliver a new chip in time for manufacturers to release their new products based on those chips. Another example is aerospace and automotive engineering firms leveraging design tools, including CATIA and UGS, on a global basis relying on their supplier networks to do the same in a real-time, collaborative manner to improve productivity and time-to-market. This results in contention for shared file and data access and, as a work-around, more copies of data kept as local buffers.

I/O performance impacts and challenges for EDA, CAE and CAD systems include:

  • Delays in drawing and file access resulting in lost productivity and project delays
  • Complex configurations to support computer farms (server grids) for I/O and storage performance
  • Proliferation of dedicated storage on individual servers and workstations to improve performance

Entertainment and media
While some applications are characterized by high bandwidth or throughput, such as streaming video and digital intermediate (DI) processing of 2K (2048 pixels per line) and 4K (4096 pixels per line) video and film, there are many other applications that are also impacted by I/O performance time delays. Even bandwidth intensive applications for video production and other applications are time sensitive and vulnerable to I/O bottleneck delays. For example, cell phone ring tone, instant messaging, small MP3 audio, and voice- and e-mail are impacted by congestion and resource contention.

Prepress production and publishing requiring assimilation of many small documents, files and images while undergoing revisions can also suffer. News and information websites need to look up breaking stories, entertainment sites need to view and download popular music, along with still images and other rich content; all of this can be negatively impacted by even small bottlenecks.  Even with streaming video and audio, access to those objects requires accessing some form of a high speed index to locate where the data files are stored for retrieval. These indexes or databases can become bottlenecks preventing high performance storage and I/O systems from being fully leveraged.

Index files and databases must be searched to determine the location where images and objects, including streaming media, are stored. Consequently, these indices can become points of contention resulting in bottlenecks that delay processing of streaming media objects. When a cell phone picture is taken and sent to someone, chances are that the resulting image will be stored on network attached storage (NAS) as a file with a corresponding index entry in a database at some service provider location. Think about what happens to those servers and storage systems when several people all send photos at the same time.

I/O performance impacts and challenges for entertainment and media systems include:

  • Delays in image and file access resulting in lost productivity
  • Redundant files and storage on local servers to improve performance
  • Contention for resources causing further bottlenecks during peak workload surges

OLTP and business intelligence
Surges in peak workloads result in performance bottlenecks on database and file servers, impacting time sensitive OLTP systems unless they are over-configured for peak demand. For example, workload spikes due to holiday and back-to-school shopping, spring break and summer vacation travel reservations, Valentine's or Mother's Day gift shopping, and clearance and settlement on peak stock market trading days strain fragile systems. For database systems maintaining performance for key objects, including transaction logs and journals, it is important to eliminate performance issues as well as maintain transaction and data integrity.

An example tied to eCommerce is business intelligence systems (not to be confused with back office marketing and analytics systems for research). Online business intelligence systems are popular with online shopping and services vendors who track customer interests and previous purchases to tailor search results, views and make suggestions to influence shopping habits.

Business intelligence systems need to be fast and support rapid lookup of history and other information to provide purchase histories and offer timely suggestions. The relative performance improvements of processors shift the application bottlenecks from the server to the storage access network. These applications have, in some cases, resulted in an exponential increase in query or read operations beyond the capabilities of single database and storage instances, resulting in database deadlock and performance problems or the proliferation of multiple data copies and dedicated storage on application servers.

A more recent contribution to performance challenges, caused by the increased availability of on-line shopping and price shopping search tools, is low cost craze (LCC) or price shopping. LCC has created a dramatic increase in the number of read or search queries taking place, further impacting database and file systems performance. For example, an airline reservation system that supports price shopping while preventing impact to time sensitive transactional reservation systems would create multiple read-only copies of reservations databases for searches. The result is that more copies of data must be maintained across more servers and storage systems thus increasing costs and complexity. While expensive, the alternative of doing nothing results in lost business and market share.
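The read-only copy approach described above can be sketched in a few lines. This is a minimal illustration, not how any particular reservation system works: the `QueryRouter` class and the server names are hypothetical, and simple round-robin is assumed for spreading the price shopping reads.

```python
import itertools


class QueryRouter:
    """Send writes to the primary database and spread reads over read-only copies.

    Hypothetical sketch: names and the round-robin policy are assumptions.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through the read-only copies; fall back to the primary if none exist.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, is_write):
        """Return the name of the server that should handle this query."""
        if is_write or self._replicas is None:
            return self.primary
        return next(self._replicas)


router = QueryRouter("reservations-primary", ["search-copy-1", "search-copy-2"])
for is_write in (False, False, True, False):
    kind = "booking (write)" if is_write else "price search (read)"
    print(f"{kind} -> {router.route(is_write)}")
```

Each extra copy keeps searches away from the transactional system, however it is also one more set of data to store, refresh and manage, which is the cost and complexity trade-off noted above.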

I/O performance impacts and challenges for OLTP and business intelligence systems include:

  • Application and database contention, including deadlock conditions, due to slow transactions
  • Disruption to application servers to install special monitoring, load balance or I/O driver software
  • Increased management time required to support additional storage needed as an I/O workaround

Summary/Conclusion
It is vital to understand the value of performance, including response time or latency, and numbers of I/O operations for each environment and particular application. While the cost per raw TByte may seem relatively inexpensive, the cost for I/O response time performance also needs to be effectively addressed and put into the proper context as part of the data center QoS cost structure.

There are many approaches to addressing data center I/O performance bottlenecks, with most centered on adding more hardware or addressing bandwidth or throughput issues. Time sensitive applications depend on low response time as workloads, including throughput, increase, and thus latency cannot be ignored. The key to removing data center I/O bottlenecks is to find and address the problem instead of simply moving or hiding it with more hardware and/or software. Simply adding fast devices such as SSDs may provide relief; however, if the SSDs are attached to high latency storage controllers, the full benefit may not be realized. Thus, identify and gain insight into data center I/O bottleneck paths, eliminating issues and problems to boost productivity and efficiency.
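To put rough numbers on the SSD and controller point, here is a small back of the envelope sketch. It assumes a simplified model in which per-I/O response time is just device time plus controller time, and all of the microsecond figures are made up for illustration rather than measured from any product.

```python
def response_time_us(device_us: int, controller_us: int) -> int:
    """Per-I/O response time as the sum of device and controller latency (simplified)."""
    return device_us + controller_us


def iops_at_queue_depth_one(total_us: int) -> float:
    """With a single outstanding I/O, IOPS is bounded by 1 second / response time."""
    return 1_000_000 / total_us


# Hypothetical latencies in microseconds, chosen only to illustrate the effect.
hdd_path = response_time_us(device_us=5000, controller_us=2000)
ssd_slow_controller = response_time_us(device_us=100, controller_us=2000)
ssd_fast_controller = response_time_us(device_us=100, controller_us=200)

for name, total in [
    ("HDD, slow controller", hdd_path),
    ("SSD, slow controller", ssd_slow_controller),
    ("SSD, fast controller", ssd_fast_controller),
]:
    iops = iops_at_queue_depth_one(total)
    print(f"{name}: {total} us per I/O, about {iops:.0f} IOPS at queue depth 1")
```

With these assumed numbers, the SSD behind the slow controller is only about three times faster than the disk path, while the same SSD behind a low latency controller is more than twenty times faster. The bottleneck moved to the controller instead of going away.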

Where to Learn More
Additional information about IT data center, server, storage as well as I/O networking bottlenecks along with solutions can be found at the Server and StorageIO website in the tips, tools and white papers, as well as news, books, and activity on the events pages. If you are in the New York area on September 23, 2009, check out my presentation on The Other Green – Storage Optimization and Efficiency that will touch on the above and other related topics. Download your copy of "IT Data Center and Storage Bottlenecks" by clicking here.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Upcoming Out and About Events

Following up on previous Out and About updates (here and here) of where I have been, here’s where I’m going to be over the next couple of weeks.

On September 15th and 16th, 2009, I will be the keynote speaker along with doing a deep dive discussion around data deduplication in Minneapolis, MN and Toronto, ON. Free Seminar, register and learn more here.

The Infrastructure Optimization and Planning Best Practices (V2.009) – Doing more with less without sacrificing storage, system or network capabilities Seminar series continues September 22, 2009 with a stop in Chicago. Free Seminar, register and learn more here.

On September 23, 2009 I will be in New York City at Storage Decisions conference participating in the Ask the Experts during the expo session as well as presenting The Other Green — Storage Efficiency and Optimization.

Throw out the “green” buzzword, and you’re still left with the task of saving or maximizing use of space, power, and cooling while stretching available IT dollars to support growth and business sustainability. For some environments the solution may be consolidation, while others need to maintain quality of service response time, performance and availability, necessitating faster, energy efficient technologies to achieve optimization objectives. To address these and other related issues, you can turn to the cloud, virtualization, intelligent power management, data footprint reduction and data management, not to mention various types of tiered storage and performance optimization techniques. The session will look at various techniques and strategies to optimize either on-line active or primary as well as near-line or secondary storage environments during tough economic times, as well as to position for future growth. After all, there is no such thing as a data recession!

Topics, technologies and techniques that will be discussed include among others:

  • Energy efficiency (strategic) vs. energy avoidance (tactical)
  • Optimization and the need for speed vs. the need for capacity
  • Metrics and measurements for management insight
  • Tiered storage and tiered access including SSD, FC, SAS and clouds
  • Data footprint reduction (archive, compress, dedupe) and thin provision
  • Best practices, financial incentives and what you can do today

Free event, learn more and register here.

Check out the events page for other upcoming events and hope to see you this fall while I’m out and about.

Cheers – gs

Greg Schulz – StorageIOblog, twitter @storageio Author “The Green and Virtual Data Center” (CRC)

Summer Book Update and Back to School Reading

August and thus Summer 2009 in the northern hemisphere are swiftly passing by and the start of a new school year is just around the corner, which means it is also time for final vacations, time at the beach, pool, golf course, amusement park or favorite fishing hole among other pastimes. In order to help get you ready for fall (or late summer) book shopping for those with IT interests, here are some Amazon lists (here, here and here) for ideas. After all, the 2009 holiday season is not that far away!

Here’s a link to my Amazon.com Authors page that includes coverage of both my books, "The Green and Virtual Data Center" (CRC) and "Resilient Storage Networks – Designing Scalable Flexible Data Infrastructures" (Elsevier).


Click here to look inside "The Green and Virtual Data Center" (CRC) or inside "Resilient Storage Networks" (Elsevier).

It’s been six months since the launch announcement of my new book "The Green and Virtual Data Center" (CRC) and general availability at Amazon.com and other global venues here and here. In celebration of the six month anniversary of the book launch (thank you very much to all who have bought a copy!), here is some coverage including what is being said, related articles, interviews, book reviews and more.

Article: New Green Data Center: shifting from avoidance to becoming more efficient IT-World August 2009

wsradio.com interview discussing themes and topics covered in the book including closing the green gap and shifting towards an IT efficiency and productivity for business sustainability.

Closing the green gap: Discussion about expanding data centers with environmental benefits at SearchDataCenter.com

From Greg Brunton – EDS/An HP Company: “Greg Schulz has presented a concise and visionary perspective on the Green issues, He has cut through the hype and highlighted where to start and what the options are. A great place to start your green journey and a useful handbook to have as the journey continues.”

From Rick Bauer – Storage Networking Industry Association (SNIA) – Education and Technology Director: “Greg is one of the smartest ‘good guys’ in the storage industry. He has been a voice of calm amid all the ‘green IT hype’ over the past few years. So when he speaks of the possible improvements that Green Tech can bring, it’s a much more realistic approach…”

From CMG (Computer Measurement Group) MeasureIT
I must admit that I have been slightly skeptical at times, when it comes to what the true value is behind all of the discussions on “green” technologies in the data center. As someone who has seen both the end user and vendor side of things, I think my skepticism gets heightened more than it normally would be. This book really helped dispel my skepticism.

The book is extremely well organized and easy to follow. Each chapter has a very good introduction and comprehensive summary. This book could easily serve as a blueprint for organizations to follow when they look for ideas on how to design new data centers. It’s a great addition to an IT Bookshelf. – Reviewed by Stephen R. Guendert, PhD (Brocade and CMG MeasureIT). Click here to read the full review in CMG MeasureIT.

From Tom Becchetti – IT Architect: “This book is packed full of information. From ecological and energy efficiencies, to virtualization strategies and what the future may hold for many of the key enabling technologies. Greg’s writing style benefits both technologists and management levels.”

From MSP Business Journal: Greg Schulz named an Eco-Tech Warrior – April 2009

From David Marshall at VMblog.com: If you follow me on LinkedIn, you might have seen that I had been reading a new book that came out at the beginning of the year titled, “The Green and Virtual Data Center” by Greg Schulz. Rather than writing about a specific virtualization platform and how to get it up and running, Schulz takes an interesting approach at stepping back and looking at the big picture. After reading the book, I reached out to the author to ask him a few more questions and to share his thoughts with readers of VMBlog.com. I know I’m not Oprah’s Book Club, but I think everyone here will enjoy this book. Click here to read more of what David Marshall has to say.

From Zen Kishimoto of Altaterra Research: Book Review May 2009

From Kurt Marko of Processor.com Green and Virtual Book Review – April 2009

From Serial Storage Wire (STA): Green and SASy = Energy and Economic, Effective Storage – March 2009

From Computer Technology Review: Recent Comments on The Green and Virtual Data Center – March 2009

From Alan Radding in Big Fat Finance Blog: Green IT for Finance Operations – April 2009

From VMblog: Comments on The Green and Virtual Data Center – March 2009

From StorageIO Blog: Recent Comments and Tips – March 2009

From Data Center Links: John Rath comments on “The Green and Virtual Data Center”

From InfoStor: Dave Simpson comments on “The Green and Virtual Data Center”

From Sys-Con: Georgiana Comsa comments on “The Green and Virtual Data Center”

From Ziff Davis Heather Clancy comments on “The Green and Virtual Data Center”

From Byte & Switch: Green IT and the Green Gap – February 2009

From GreenerComputing: Enabling a Green and Virtual Data Center – February 2009

From Sys-con: Comments on The Green and Virtual Data Center – March 2009

From ServerWatch: Green IT: Myths vs. Realities – February 2009

From Byte & Switch: Going Green and the Economic Downturn – February 2009

From Business Wire: Comments on The Green and Virtual Data Center Book – January 2009

Additional content and news can be found here and here with upcoming events listed here.

Interested in Kindle? Here’s a link to get a Kindle copy of "Resilient Storage Networks" (Elsevier) or to send a message via Amazon to publisher CRC that you would like to see a Kindle version of "The Green and Virtual Data Center". While you are at it, I also invite you to become a fan of my books at Facebook.

Thanks again to everyone who has obtained their copy of either of my books, also thanks to all of those who have done reviews, interviews and helped in many other ways!

Enjoy the rest of your summer!

Cheers – gs

Greg Schulz – twitter @storageio

StorageIO in the news

StorageIO is regularly quoted and interviewed in various industry and vertical market venues and publications both on-line and in print on a global basis. The following is coverage, perspectives and commentary by StorageIO on IT industry trends including servers, storage, I/O networking, hardware, software, services, virtualization, cloud, cluster, grid, SSD, data protection, Green IT and more.

Realizing that some prefer blogs to webs to twitters to other venues, here are some recent links, among others, to media coverage and comments by me on different topics that can be found at www.storageio.com/news.html:

  • Virtualization Review: Comments on Clouds, Virtualization and Cisco move into servers – July 2009
  • SearchStorage: Comments on Storage Resource Management (SRM) and related tools – July 2009
  • SearchStorage: Comments on flash SSD – July 2009
  • SearchDataBackup: Comments on Data backup reporting tools’ trends – July 2009
  • SearchServerVirtualization: Comments on Hyper-V R2 matches VMware with 64-processor support – July 2009
  • SearchStorage: Comments on HP buying IBRIX for clustered and Cloud NAS – July 2009
  • Enterprise Storage Forum: Comments on HP buying IBRIX for clustered and Cloud NAS – July 2009
  • eWeek: Comments on NetApp’s next moves after DDUP and EMC – July 2009
  • Enterprise Storage Forum: Comments on NetApp’s next moves after DDUP and EMC – July 2009
  • SearchStorage: Comments on EMC buying DataDomain, NetApp’s next moves – July 2009
  • SearchVirtualization: Comments on Microsoft Hyper-V features and VMware – July 2009
  • SearchITchannel: Comments on social media for business – June 2009
  • SearchSMBstorage: Comments on Storage Resource Management (SRM) for SMBs – June 2009
  • Enterprise Storage Forum: Comments on IT Merger & Acquisition activity – June 2009
  • Evolving Solutions: Comments on Storage Consolidation, Networking & Green IT – June 2009
  • Enterprise Storage Forum: Comments on EMC letter to DDUP – June 2009
  • SearchStorage: Comments on best practices for effective thin provisioning – June 2009
  • Processor: Comments on Cloud computing, SaaS and SOAs – June 2009
  • Serverwatch: Comments in How EMC’s World Pulls the Data Center Together – June 2009
  • Processor: Comments on Virtual Security Is No Walk In The Park – May 2009
  • SearchStorage: Comments on EPA launching Green Storage specification – May 2009
  • SearchStorage: Comments on Storage Provisioning Tools – May 2009
  • Enterprise Systems Journal: Comments on Tape: The Zombie Technology – May 2009
  • Enterprise Storage Forum: Comments on Oracle Keeping Sun Storage Business – May 2009
  • IT Health Blogging: Discussion about iSCSI vs. Fibre Channel for Virtual Environments – May 2009
  • IT Business Edge: Discussion about IT Data Center Futures – May 2009
  • IT Business Edge: Comments on Tape being a Green Technology – April 2009
  • Big Fat Finance Blog: Quoted in story about Green IT for Finance Operations – April 2009
  • SearchStorage: Comments on FLASH and SSD Storage – April 2009
  • SearchStorage AU: Comments on Data Classification – April 2009
  • IT Knowledge Exchange: Comments on FCoE and Converged Networking Coming Together – April 2009
  • SearchSMBStorage: Comments on Data Deduplication for SMBs – April 2009
  • SearchSMBStorage: Comments on Blade Storage for SMBs – April 2009
  • SearchStorage: Comments on MAID technology remaining underutilized – April 2009
  • SearchDataCenter: Closing the green gap: Expanding data centers with environmental benefits – April 2009
  • ServerWatch: Comments on What’s Selling In the Data Storage Market? – April 2009
  • ServerWatch: Comments on Oracle Buys Sun: The Consequences – April 2009
  • SearchStorage: Comments on Tiered Storage – April 2009
  • SearchStorage: Comments on Data Classification for Storage Managers – April 2009
  • wsradio.com Interview closing the Green Gap
  • IT Knowledge Exchange: Comments on FCoE eco-system maturing – April 2009
  • Internet Revolution: Comments on the premature death of the disk drive – April 2009
  • Enterprise Storage Forum: Comments on EMC V-MAX announcement – April 2009
  • MSP Business Journal: Greg Schulz named an Eco-Tech Warrior – April 2009
  • Storage Magazine: Comments on Power-smart disk systems – April 2009
  • Storage Magazine: Comments on Replication Alternatives – April 2009
  • StorageIO Blog: Comments on Tape as a Green Storage Medium – April 2009
  • Inside HPC: Recent Comments on Tape and Green IT – April 2009
  • Processor.com: Recent Comments on Green and Virtual – April 2009
  • SearchDataCenter: Interview: Closing the green gap: Expanding data centers with environmental benefits – April 2009
  • Enterprise Systems Journal: Recent Comments and Tips – March 2009
  • Computer Technology Review: Recent Comments on The Green and Virtual Data Center – March 2009
  • VMblog: Comments on The Green and Virtual Data Center – March 2009
  • Sys-con: Comments on The Green and Virtual Data Center – March 2009
  • Server Watch: Comments on IBM possibly buying Sun – March 2009
  • Bnet: Comments on IBM possibly buying Sun – March 2009
  • SearchStorage: Comments on Tiered Storage 101 – March 2009
  • SearchStorage: Comments – Cisco pushes into Servers – March 2009
  • Enterprise Storage Forum: Comments – Cisco Entering Server Market – March 2009
  • Enterprise Storage Forum: Comments – State of Storage Job Market – March 2009
  • SearchSMBStorage: Comments on SMB Storage Options – March 2009
  • Enterprise Storage Forum: Comments on Sun Proposes New Solid State Storage Spec – March 2009
  • Enterprise Storage Forum: Comments on Despite Economy, Storage Bargains Hard to Find – March 2009
  • TechWorld: Comments on Where to Stash Your Data – February 2009
  • ServerWatch: Green IT: Myths vs. Realities – February 2009
  • Byte & Switch: Going Green and the Economic Downturn – February 2009
  • CTR: Comments on Tape Hardly Being On Way Out – February 2009
  • Processor: Comments on SSD (FLASH and RAM) – February 2009
  • Internet News: Comments on Steve Wozniak joining SSD startup – February 2009
  • SearchServerVirtualization: Comments on I/O and Virtualization – February 2009
  • Technology Inc.: Comments on Data De-dupe for DR – February 2009
  • SearchStorage: Comments on NetApp SMB NAS – February 2009

    Check out the Tips, Tools and White Papers, and News pages for additional commentary, coverage and related content or events.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    Recent tips, videos, articles and more

    It’s been a busy year so far and there is still plenty more to do. Taking advantage of a short summer break, I’m getting caught up on some items including putting up a link to some of the recent articles, tips, reports, webcasts, videos and more that I have alluded to in recent posts. Realizing that some prefer blogs to webs to tweets to other venues, here are some links to recent articles, tips, videos, podcasts, webcasts, white papers and more that can be found on the StorageIO Tips, Tools and White Papers pages.

    Recent articles, columns, tips, white papers and reports:

  • ITworld: The new green data center: From energy avoidance to energy efficiency August 2009
  • SearchSystemsChannel: Comparing I/O virtualization and virtual I/O benefits July 2009
  • SearchDisasterRecovery: Top server virtualization myths in DR and BC July 2009
  • Enterprise Storage Forum: Saving Money with Green Data Storage Technology July 2009
  • SearchSMB ATE Tips: SMB Tips and ATE by Greg Schulz
  • SearchSMB ATE Tip: Tape library storage July 2009
  • SearchSMB ATE Tip: Server-based operating systems vs. PC-based operating systems June 2009
  • SearchSMB ATE Tip: Pros/cons of block/variable block dedupe June 2009
  • FedTech: At the Ready: High-availability storage hinges on being ready for a system failure May 2009
  • Byte & Switch Part XI – Key Elements For A Green and Virtual Data Center May 2009
  • Byte & Switch Part X – Basic Steps For Building a Green and Virtual Data Center May 2009
  • InfoStor Technology Options for Green Storage: April 2009
  • Byte & Switch Part IX – I/O, I/O, Its off to Virtual Work We Go: Networks role in Virtual Data Centers April 2009
  • Byte & Switch Part VIII – Data Storage Can Become Green: There are many steps you can take April 2009
  • Byte & Switch Part VII – Server Virtualization Can Save Costs April 2009
  • Byte & Switch Part VI – Building a Habitat for Technology April 2009
  • Byte & Switch Part V – Data Center Measurement, Metrics & Capacity Planning April 2009
  • zJournal Storage & Data Management: Tips for Enabling Green and Virtual Efficient Data Management March 2009
  • Serial Storage Wire (STA): Green and SASy = Energy and Economic, Effective Storage March 2009
  • SearchSystemsChannel: FAQs: Green IT strategies for solutions providers March 2009
  • Computer Technology Review: Recent Comments on The Green and Virtual Data Center March 2009
  • Byte & Switch Part IV – Virtual Data Centers Can Promote Business Growth March 2009
  • Byte & Switch Part III – The Challenge of IT Infrastructure Resource Management March 2009
  • Byte & Switch Part II – Building an Efficient & Ecologically Friendly Data Center March 2009
  • Byte & Switch Part I – The Green Gap – Addressing Environmental & Economic Sustainability March 2009
  • Byte & Switch Green IT and the Green Gap February 2009
  • GreenerComputing: Enabling a Green and Virtual Data Center February 2009

    Some recent videos and podcasts include:

  • bmighty.com The dark side of SMB virtualization July 2009
  • bmighty.com SMBs Are Now Virtualization’s “Sweet Spot” July 2009
  • eWeek.com Green IT is not dead, its new focus is about efficiency July 2009
  • SearchSystemsChannel FAQ: Using cloud computing services opportunities to get more business July 2009
  • SearchStorage FAQ guide – How Fibre Channel over Ethernet can combine networks July 2009
  • SearchDataCenter Business Benefits of Boosting Web hosting Efficiency June 2009
  • SearchStorageChannel Disaster recovery services for solution providers June 2009
  • The Serverside The Changing Dynamic of the Data Center April 2009
  • TechTarget Virtualization and Consolidation for Agility: Intel’s Xeon Processor 5500 series May 2009
  • Intel Reduce Energy Usage while Increasing Business Productivity in the Data Center May 2009
  • WSRadio Closing the green gap and shifting towards an IT efficiency and productivity April 2009
  • bmighty.com July 2009

    Check out the Tips, Tools and White Papers, and News pages for more commentary, coverage and related content or events.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    Clarifying Clustered Storage Confusion

    Clustered storage can be iSCSI, Fibre Channel block based or NAS (NFS or CIFS or proprietary file system) file system based. Clustered storage can also be found in virtual tape library (VTL) including dedupe solutions along with other storage solutions such as those for archiving, cloud, medical or other specialized grids among others.

    Recently in the IT and data storage industry, there has been a flurry of merger and acquisition (M&A) (here and here), new product enhancement or announcement activity around clustered storage. For example, HP buying clustered file system vendor IBRIX, complementing their previous acquisition of another clustered file system vendor (PolyServe) a few years ago and of iSCSI block clustered storage software vendor LeftHand earlier this year. Another recent acquisition is that of LSI buying clustered NAS vendor ONstor, along with Dell buying iSCSI block clustered storage vendor EqualLogic about a year and a half ago, not to mention other vendor acquisitions or announcements involving storage and clustering.

    Where the confusion comes into play is that the term cluster means many things to different people, even more so when clustered storage is combined with NAS or file based storage. For example, clustered NAS may imply a clustered file system when in reality a solution may only be multiple NAS filers, NAS heads, controllers or storage processors configured for availability or failover.

    What this means is that an NFS or CIFS file system may only be active on one node at a time; however, in the event of a failover, the file system shifts from one NAS hardware device (e.g. NAS head or filer) to another. On the other hand, a clustered file system enables an NFS or CIFS or other file system to be active on multiple nodes (e.g. NAS heads, controllers, etc.) concurrently. The concurrent access may be for small random reads and writes, for example supporting a popular website or file serving application, or it may be for parallel reads or writes to a large sequential file.
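The difference between the two models can be shown with a toy sketch. The `FailoverNas` and `ClusteredFileSystem` classes below are hypothetical names for illustration only and do not represent any vendor's implementation; they simply track which nodes are serving a given file system at a point in time.

```python
class FailoverNas:
    """Active/passive model: the file system is served by one node at a time."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.active = self.nodes[0]

    def serving_nodes(self):
        return [self.active]

    def fail(self, node):
        """Remove a failed node; if it was active, shift the file system to a survivor."""
        self.nodes.remove(node)
        if node == self.active:
            if not self.nodes:
                raise RuntimeError("no surviving node to fail over to")
            self.active = self.nodes[0]


class ClusteredFileSystem:
    """Active/active model: the file system is served by all nodes concurrently."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def serving_nodes(self):
        return list(self.nodes)

    def fail(self, node):
        """Remove a failed node; the remaining nodes keep serving."""
        self.nodes.remove(node)


failover = FailoverNas(["head-a", "head-b"])
clustered = ClusteredFileSystem(["head-a", "head-b", "head-c"])
print("failover NAS serves from:", failover.serving_nodes())
print("clustered file system serves from:", clustered.serving_nodes())

failover.fail("head-a")
clustered.fail("head-a")
print("after head-a fails, failover NAS serves from:", failover.serving_nodes())
print("after head-a fails, clustered file system serves from:", clustered.serving_nodes())
```

Both models survive the loss of a node, but only the clustered file system adds serving capacity for the same file system as nodes are added.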

    Clustered storage is no longer exclusive to the confines of high-performance sequential and parallel scientific computing or ultra large environments. Small files and I/O (read or write), including meta-data information, are also being supported by a new generation of multipurpose, flexible, clustered storage solutions that can be tailored to support different applications workloads.

    There are many different types of clustered and bulk storage systems. Clustered storage solutions may be block (iSCSI or Fibre Channel), NAS or file serving, virtual tape library (VTL), or archiving and object- or content-addressable storage. Clustered storage in general is similar to using clustered servers, providing scale beyond the limits of a single traditional system: scale for performance, scale for availability, and scale for capacity, and enabling growth in a modular fashion, adding performance and intelligence capabilities along with capacity.

    For smaller environments, clustered storage enables modular pay-as-you-grow capabilities to address specific performance or capacity needs. For larger environments, clustered storage enables growth beyond the limits of a single storage system to meet performance, capacity, or availability needs.

    Applications that lend themselves to clustered and bulk storage solutions include:

    • Unstructured data files, including spreadsheets, PDFs, slide decks, and other documents
    • Email systems, including Microsoft Exchange Personal (.PST) files stored on file servers
    • Users’ home directories and online file storage for documents and multimedia
    • Web-based managed service providers for online data storage, backup, and restore
    • Rich media data delivery, hosting, and social networking Internet sites
    • Media and entertainment creation, including animation rendering and post processing
    • High-performance databases such as Oracle with NFS direct I/O
    • Financial services and telecommunications, transportation, logistics, and manufacturing
    • Project-oriented development, simulation, and energy exploration
    • Low-cost, high-performance caching for transient and look-up or reference data
    • Real-time performance including fraud detection and electronic surveillance
    • Life sciences, chemical research, and computer-aided design

    Clustered storage solutions go beyond meeting the basic requirements of supporting large sequential parallel or concurrent file access. Clustered storage systems can also support random access of small files for highly concurrent online and other applications. Scalable and flexible clustered file servers that leverage commonly deployed servers, networking, and storage technologies are well suited for new and emerging applications, including bulk storage of online unstructured data, cloud services, and multimedia, where extreme scaling of performance (IOPS or bandwidth), low latency, storage capacity, and flexibility at a low cost are needed.

    The bandwidth-intensive and parallel-access performance characteristics associated with clustered storage are generally known; what is not so commonly known is the breakthrough to support small and random IOPS associated with database, email, general-purpose file serving, home directories, and meta-data look-up (Figure 1). Note that a clustered storage system, and in particular a clustered NAS, may or may not include a clustered file system.

    Figure 1 – Generic clustered storage model (Courtesy “The Green and Virtual Data Center” (CRC))

    More nodes, ports, memory, and disks do not guarantee more performance for applications. Performance depends on how those resources are deployed and how the storage management software enables those resources to avoid bottlenecks. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion when processing diverse application workloads. Other things to consider include support for industry-standard interfaces, protocols, and technologies.
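    One rough way to check whether added nodes are actually paying off is to compare measured aggregate performance against ideal linear scaling from the smallest configuration. The sketch below is only an illustration: the function name, the tuple layout, and all of the numbers are hypothetical and do not come from any particular product or benchmark.

```python
def scaling_efficiency(measurements):
    """Compare measured aggregate IOPS against ideal linear scaling.

    measurements: list of (node_count, aggregate_iops) tuples, ordered from
    the smallest configuration upward. The first entry is the baseline.
    Returns a list of (node_count, efficiency), where 1.0 means perfectly
    linear scaling relative to the baseline.
    """
    base_nodes, base_iops = measurements[0]
    per_node_baseline = base_iops / base_nodes

    results = []
    for nodes, iops in measurements:
        ideal_iops = per_node_baseline * nodes  # what linear scaling would give
        results.append((nodes, iops / ideal_iops))
    return results


# Hypothetical measurements for illustration only (not real benchmark data).
example = [(2, 20_000), (4, 36_000), (8, 56_000)]

for nodes, efficiency in scaling_efficiency(example):
    print(f"{nodes} nodes: {efficiency:.0%} of linear scaling")
```

    An efficiency that drops as nodes are added suggests overhead or congestion somewhere in the cluster, which is when it is worth asking how the resources are deployed rather than simply adding more of them.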

    Scalable and flexible clustered file server and storage systems provide the potential to leverage the inherent processing capabilities of constantly improving underlying hardware platforms. For example, software-based clustered storage systems that do not rely on proprietary hardware can be deployed on industry-standard high-density servers and blade centers and utilize third-party internal or external storage.

    Clustered storage is no longer exclusive to niche applications or scientific and high-performance computing environments. Organizations of all sizes can benefit from ultra-scalable, flexible, clustered NAS storage that supports application performance needs from small random I/O to meta-data lookup and large-stream sequential I/O, and that scales with stability to grow with business and application needs.

    Additional considerations for clustered NAS storage solutions include the following.

    • Can memory, processors, and I/O devices be varied to meet application needs?
    • Is there support for large file systems supporting many small files as well as large files?
    • What is the performance for small random IOPS and bandwidth for large sequential I/O?
    • How is performance enabled across different applications in the same cluster instance?
    • Are I/O requests, including meta-data look-up, funneled through a single node?
    • How does a solution scale as the number of nodes and storage devices is increased?
    • How disruptive and time-consuming is adding new or replacing existing storage?
    • Is proprietary hardware needed, or can industry-standard servers and storage be used?
    • What data management features, including load balancing and data protection, exist?
    • Which storage interfaces can be used: SAS, SATA, iSCSI, or Fibre Channel?
    • What types of storage devices are supported: SSD, SAS, Fibre Channel, or SATA disks?

    As with most storage systems, it is not the total number of hard disk drives (HDDs), the quantity and speed of tiered-access I/O connectivity, the types and speeds of the processors, or even the amount of cache memory that determines performance. The performance differentiator is how a manufacturer combines the various components to create a solution that delivers a given level of performance with lower power consumption.
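    To make the "work per watt" comparison concrete, here is a minimal sketch that ranks candidate configurations by IOPS per watt and bandwidth per watt. The class, field names, configuration labels, and every figure are assumptions made up for illustration; real values would come from your own measurements or vendor-supplied data.

```python
from dataclasses import dataclass


@dataclass
class StorageConfig:
    """A hypothetical storage configuration with measured performance and power."""
    name: str
    iops: float            # small random I/O operations per second
    bandwidth_mbps: float  # large sequential throughput in MB/s
    watts: float           # average power draw under that workload

    def iops_per_watt(self) -> float:
        return self.iops / self.watts

    def mbps_per_watt(self) -> float:
        return self.bandwidth_mbps / self.watts


def most_efficient(configs, metric):
    """Return the configuration with the highest value of the given metric."""
    return max(configs, key=metric)


# Hypothetical numbers for illustration only (not real product data).
configs = [
    StorageConfig("many-drives", iops=30_000, bandwidth_mbps=1_200, watts=1_500),
    StorageConfig("fewer-faster-devices", iops=45_000, bandwidth_mbps=900, watts=900),
]

for cfg in configs:
    print(f"{cfg.name}: {cfg.iops_per_watt():.1f} IOPS/W, "
          f"{cfg.mbps_per_watt():.2f} MB/s per W")

print("Best for random I/O:", most_efficient(configs, StorageConfig.iops_per_watt).name)
```

    The point of the sketch is that the configuration with the most components is not automatically the one that delivers the most work per watt; the ranking can also differ between IOPS-oriented and bandwidth-oriented workloads.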

    To avoid performance surprises, be leery of performance claims based solely on the speed and quantity of HDDs or the speed and number of ports, processors, and memory. As noted earlier, how the resources are deployed and how the storage management software enables those resources to avoid bottlenecks matter more than raw component counts.

    Learn more about clustered storage (block, file, VTL/dedupe, archive), clustered NAS, clustered file systems, grids, and cloud storage, among other topics, at the following links:

    "The Many faces of NAS – Which is appropriate for you?"

    Article: Clarifying Storage Cluster Confusion
    Presentation: Clustered Storage: “From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0”
    Video Interview: How to Scale Data Storage Systems with Clustering
    Guidelines for controlling clustering
    The benefits of clustered storage

    You can find other material on the StorageIO Tips and Tools, portfolio archive, and events pages.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved