Back to school shopping: Dude, Dell Digests 3PAR Disk storage


No sooner has the dust settled from Dell's other recent acquisitions than it is back-to-school shopping time, and the latest bargain for the Round Rock, Texas folks is Bay Area (San Francisco) storage vendor 3PAR for $1.15B. As a refresher, some of Dell's more recent acquisitions include $1.4B for EqualLogic a few years ago and $3.9B for Perot Systems, not to mention Exanet, KACE and Ocarina earlier this year. For those interested, as of April 2010 reporting figures found here, Dell showed about $10B USD in cash, and here is financial information on publicly held 3PAR (PAR).

Who is 3PAR?
3PAR is a publicly traded company (PAR) that makes a scalable or clustered storage system with many built-in advanced features typically associated with high-end EMC DMX and VMAX as well as CLARiiON, in addition to Hitachi, HP or IBM enterprise class solutions. The InServ (3PAR's storage solution) combines hardware and software in a very scalable solution that can be configured for smaller environments or larger enterprises by varying the number of controllers or processing nodes, connectivity (server attachment) ports, cache and disk drives.

Unlike EqualLogic, which is more of a mid market iSCSI-only storage system, the 3PAR InServ is capable of going head to head with the EMC CLARiiON as well as DMX or VMAX systems, supporting a mix of iSCSI and Fibre Channel, or NAS via gateways or appliances. Thus, while there were occasional competitive situations between 3PAR and Dell EqualLogic, they were for the most part targeted at different market sectors or customer deployment scenarios.

What does Dell get with 3PAR?

  • A good deal, if not a bargain, on one of the last new storage startup pure plays
  • A public company that is actually generating revenue, with a large and growing installed base
  • A seasoned sales force who knows how to sell into the enterprise storage space against EMC, HP, IBM, Oracle/Sun, NetApp and others
  • A solution that can scale in terms of functionality, connectivity, performance, availability, capacity and energy efficiency (PACE)
  • A potential route to new markets where 3PAR has had success, or a bridge across gaps where both have played and competed in the past
  • Did I mention a company with an established footprint of installed 3PAR InServ storage systems and a good list of marquee customers?
  • The ability to sell a solution for which they own the intellectual property (IP), instead of relying on that of partner EMC
  • Plenty of IP that can be leveraged within other Dell solutions, not to mention combining 3PAR with other recently acquired technologies or companies.

On a lighter note, Dell once again picks up Marc Farley, who was with them briefly after the EqualLogic acquisition and then departed to 3PAR, where he became director of social media, including the launch of Infosmack on Storage Monkeys with co-host Greg Knieriemen (@Knieriemen). Of course the Twitter world and traditional coconut wires are now speculating on where Farley will go next, as that may be the company Dell ends up buying in the future.

What does this mean for Dell and their data storage portfolio?
While in no way all-inclusive or comprehensive, table 1 provides a rough framework of different price bands, categories, tiers and market or application segments requiring various types of storage solutions into which Dell can sell.

 

Servers
  • HP: Blade systems, rack mount, towers to desktop
  • Dell: Blade systems, rack mount, towers to desktop
  • EMC: Virtual servers with VMware; servers via vBlock (Cisco)
  • IBM: Blade systems, rack mount, towers to desktop
  • Oracle/Sun: Blade systems, rack mount, towers to desktop

Services
  • HP: Managed services, consulting and hosting supplemented by the EDS acquisition
  • Dell: Bought Perot Systems (an EDS spin off/out)
  • EMC: Partnered with various organizations and services
  • IBM: Has been doing smaller acquisitions adding tools and capabilities to IBM Global Services
  • Oracle/Sun: Large internal consulting and services as well as Software as a Service (SaaS) hosting; partnered with others

Enterprise storage
  • HP: XP (FC, iSCSI, FICON for mainframe and NAS with gateway), OEMed from Hitachi Japan, parent of HDS
  • Dell: 3PAR (FC and iSCSI, or NAS with gateway); replaces EMC CLARiiON or perhaps the rare DMX/VMAX at the high end?
  • EMC: DMX and VMAX
  • IBM: DS8000
  • Oracle/Sun: Sun resold the HDS version of the XP/USP, however Oracle has since dropped it from the lineup

Data footprint impact reduction
  • HP: Dedupe on VTL via Sepaton plus HP developed technology or OEMed products
  • Dell: Dedupe in OEM or partner software or hardware solutions; recently acquired Ocarina
  • EMC: Dedupe in Avamar, Data Domain, NetWorker, Celerra, Centera and Atmos; CLARiiON and Celerra compression
  • IBM: Dedupe in various hardware and software solutions, source and target; compression with Storwize
  • Oracle/Sun: Dedupe via OEM VTLs and other Sun solutions

Data preservation
  • HP: Database and other archive tools, archive storage
  • Dell: OEM solutions from EMC and others
  • EMC: Centera and other solutions
  • IBM: Various hardware and software solutions
  • Oracle/Sun: Various hardware and software solutions

General data protection (excluding logical or physical security and DLP)
  • HP: Internal Data Protector software plus OEM; partners with other software; various VTL, TL and target solutions as well as services
  • Dell: OEM and resell partner tools as well as Dell target devices and those of partners. Could this be a future acquisition target area?
  • EMC: NetWorker and Avamar software, Data Domain and other targets, DPA management tools and Mozy services
  • IBM: Tivoli suite of software and various hardware targets, management tools and cloud services
  • Oracle/Sun: Various software and partner tools, tape libraries, VTLs and online storage solutions

Scale out, bulk, or clustered NAS
  • HP: eXtreme scale out, bulk and clustered storage for unstructured data applications
  • Dell: Exanet on Dell servers with shared SAS, iSCSI or FC storage
  • EMC: Celerra and Atmos
  • IBM: SONAS or N series (OEMed from NetApp)
  • Oracle/Sun: ZFS based solutions including the 7000 series

General purpose NAS
  • HP: Various gateways for EVA, MSA or XP; HP IBRIX or PolyServe based as well as Microsoft WSS solutions
  • Dell: EMC Celerra, Dell Exanet or Microsoft WSS based. Acquisition or partner target area?
  • EMC: Celerra
  • IBM: N series OEMed from NetApp as well as growing awareness of SONAS
  • Oracle/Sun: ZFS based solutions. Whatever happened to Procom?

Mid market multi protocol block
  • HP: EVA (FC with iSCSI or NAS gateways); LeftHand (P Series iSCSI) for the lower end of this market
  • Dell: 3PAR (FC and iSCSI, NAS with gateway) for the mid to upper end of this market; EqualLogic (iSCSI) for the lower end; some residual EMC CX activity phases out over time?
  • EMC: CLARiiON (FC and iSCSI with NAS via gateway); some smaller DMX or VMAX configurations for the mid to upper end of this market
  • IBM: DS5000 and DS4000 (FC and iSCSI with NAS via a gateway), both OEMed from LSI; XIV and N series (NetApp)
  • Oracle/Sun: 7000 series (ZFS and Sun storage software running on a Sun server with internal storage, optional external storage); 6000 series

Scalable SMB iSCSI
  • HP: LeftHand (P Series)
  • Dell: EqualLogic
  • EMC: Celerra NX, CLARiiON AX/CX
  • IBM: XIV, DS3000, N series
  • Oracle/Sun: 2000 and 7000 series

Entry level shared block
  • HP: MSA2000 (iSCSI, FC, SAS)
  • Dell: MD3000 (iSCSI, FC, SAS)
  • EMC: AX (iSCSI, FC)
  • IBM: DS3000 (iSCSI, FC, SAS), N series (iSCSI, FC, NAS)
  • Oracle/Sun: 2000 and 7000 series

Entry level unified multi function
  • HP: X series (not to be confused with the eXtreme series), HP servers with Windows Storage Server software
  • Dell: Dell servers with Windows Storage Server software or EMC Celerra
  • EMC: Celerra NX, Iomega
  • IBM: xSeries servers with Microsoft or other software installed
  • Oracle/Sun: ZFS based solutions running on Sun servers

Low end SOHO
  • HP: X series (not to be confused with the eXtreme series), HP servers with Windows Storage Server software
  • Dell: Dell servers with storage and Windows Storage Server software. Future acquisition area perhaps?
  • EMC: Iomega

Table 1: Sampling of various tiers, architectures, functionality and storage solution options

Clarifying some of the above categories in table 1:

Servers: Application servers or computers running Windows, Linux, Hyper-V, VMware or other applications, operating systems and hypervisors.

Services: Professional and consulting services, installation, break/fix repair, call center, hosting, managed services or cloud solutions

Enterprise storage: Large scale (hundreds to thousands of drives), many front-end as well as back-end ports, multiple controllers or storage processing engines (nodes), large amounts of cache and equally strong performance, feature rich functionality, resilient and scalable.

Data footprint impact reduction: Archive, data management, compression, dedupe, thin provision among other techniques. Read more here and here.

Data preservation: Archiving for compliance and non regulatory applications or data including software, hardware, services.

General data protection: Excluding physical or logical data security (firewalls, DLP, etc.), this would be backup/restore with encryption, replication, snapshots, and hardware and software to support BC, DR and normal business operations. Read more about data protection options for virtual and physical storage here.

Scale out NAS: Clustered NAS, bulk unstructured storage, cloud storage system or file system. Read more about clustered storage here. HP has their eXtreme X series of scale out and bulk storage systems as well as gateways; these leverage IBRIX and PolyServe, which were bought by HP, available as software or as a solution (HP servers, storage and software). Dell now has Exanet, which they bought recently, as software or as a solution running on Dell servers with SAS, iSCSI or FC back end storage, plus optional data footprint reduction software such as Ocarina. IBM has GPFS as a software solution running on IBM or other vendors' servers with attached storage, or as a solution such as SONAS with IBM servers running the software with IBM DS mid range storage. IBM also OEMs NetApp as the N series.

General purpose NAS: NAS (NFS and CIFS or optional AFP and pNFS) for everyday enterprise (or SME/SMB) file serving and sharing

Mid market multi protocol block: For SMB to SME environments that need scalable shared (SAN) block storage using iSCSI, FC or FCoE

Scalable SMB iSCSI: For SMB to SME environments that need scalable iSCSI storage with feature rich functionality including built in virtualization

Entry level shared block: Block storage with the flexibility to support iSCSI, SAS or Fibre Channel, with optional NAS support built in or available via a gateway. One example is external SAS RAID storage shared between two or more servers configured in a Hyper-V or VMware cluster that do not need, or cannot afford, the higher cost of an iSCSI or Fibre Channel SAN. Another example would be shared SAS (or iSCSI or Fibre Channel) storage attached to a server running storage software such as a clustered file system (e.g. Exanet), or VTL, dedupe, backup, archiving or data footprint reduction tools, or perhaps database software, where the higher cost or complexity of an iSCSI or Fibre Channel SAN is not needed. Read more about external shared SAS here.

Entry level unified multifunction: Storage that can do block and file, yet is scaled down for ease of acquisition, ease of sale, channel friendliness, and simplified deployment and installation, while remaining affordable for SMBs as well as larger SOHOs and ROBOs.

Low end SOHO: Storage that can scale down to consumer, prosumer or the lower end of SMB (e.g. SOHO), providing a mix of block and file, yet priced and positioned below higher priced multifunction systems.

Wait a minute, are there too many different categories or types of storage?

Perhaps, however it also enables multiple tools (tiers of technologies) to be in a vendor's tool box, or in an IT professional's tool bin, to address different challenges. Let's come back to this in a few moments.

 

Some Industry trends and perspectives (ITP) thoughts:

How can Dell with 3PAR be an enterprise play without IBM mainframe FICON support?
Some would say forget about it, mainframes are dead, thus not a Dell objective, even though EMC, HDS and IBM sell a ton of storage into those environments. Fair enough, however it is an argument that 3PAR has faced for years while competing with EMC, HDS, HP, IBM and Fujitsu, thus they are well versed in how to handle that discussion. The 3PAR teams can help the Dell folks determine where to hunt and farm for business, something that many of the Dell folks already know how to do. After all, today they have to flip that business to EMC or worse.

If truly pressured and in need, Dell could continue reference sales with EMC for DMX and VMAX. Likewise they could go to Bustech and/or Luminex, who have open systems to mainframe gateways (including VTL support), under a custom or special solution sale. Ironically, EMC has in the past OEMed Bustech technology to transform their high end storage into mainframe VTLs (not to be confused with FalconStor or Quantum for open systems), and Data Domain has partnered with Luminex.

BTW, did you know that Dell has had for several years a group or team that handles specialized storage solutions addressing needs outside the usual product portfolio?

Thus, IMHO, Dell's enterprise class focus will be open systems at large scale, where they will compete with EMC DMX and VMAX, HDS USP (or its soon to be announced enhancements), HP with their Hitachi Japan OEMed XP, IBM with the DS8000, as well as the seldom heard about yet equally scalable Fujitsu Eternus systems.

 

Why only $1.15B when they paid $1.4B for EqualLogic?
IMHO, had this deal occurred a couple of years ago, when some valuations were still flying higher than today, and had 3PAR been at their current sales run rate and customer deployment levels, it is possible the amount would have been higher. Either way, this is still a great value for both Dell and for 3PAR investors, customers, employees and partners.

 

Does this mean Dell dumps EMC?
Near term, I do not think Dell dumps the EMC dudes (or dudettes), as there is still plenty of business in the mid market for the two companies. However, over time I would expect Dell to unleash the 3PAR folks into the space where a CLARiiON CX would normally have been positioned, such as deals just above where EqualLogic plays, or where Fibre Channel is preferred. Likewise, I would expect Dell to empower the 3PAR team to go after additional higher end deals where a DMX or VMAX would have been the previous option, not to mention where 3PAR has had success.

This would also mean extending into sales against HP EVA and XP, IBM DS5000 and DS8000 as well as XIV, and Oracle/Sun 6000 and 7000, to name a few. In other words there will be some spin around coopetition, however longer term you can read the writing on the wall. Oh, BTW, lest you forget, Dell is first and foremost a server company that is now getting into storage in a much bigger way, and EMC is first and foremost a storage company that is getting into servers via VMware as well as their Cisco partnerships.

Are shots being fired across each other's bows? I will leave that up to you to speculate.

 

Does this mean Dell MD1000/MD3000 iSCSI, SAS and FC disappears?
I do not think so, as they have had a specific role at the entry level, below where the iSCSI-only EqualLogic solution fits, providing mixed iSCSI, SAS and Fibre Channel capabilities to compete with the HP MSA2000 (OEMed from Dot Hill) and IBM DS3000 (OEMed from LSI). While 3PAR could be taken down into some of these markets, doing so would potentially dilute the brand and thus the premium margin of those solutions.

Likewise, there is a play for server vendors to attach shared external SAS storage to small two and four node clusters for VMware, Hyper-V, Exchange, SQL, SharePoint and other applications where iSCSI or Fibre Channel are too expensive or not needed, or where NAS is not a fit. Another play for shared external SAS is attaching low cost storage to scale out clustered NAS or bulk storage where software such as Exanet runs on a Dell server. Take a closer look at how HP is supporting their scale out offerings, as well as IBM and Oracle among others. Sure, you can find iSCSI, Fibre Channel or even NAS back ends to file servers; however, there is a growing trend toward using shared SAS.

 

Does Dell now have too many different storage systems and solutions in their portfolio?
Possibly, depending upon how you look at it, and certainly the potential is there for revenue prevention teams to get in each other's way instead of competing with external competitors. However, if you compare the Dell lineup with those of EMC, HP, IBM and Oracle/Sun among others, it is not all that different. Note that HP, IBM and Oracle also have something in common with Dell in that they are general IT resource providers (servers, storage, networks, services, hardware and software) as compared to more traditional storage-only vendors.

Consequently, if you look at these vendors in terms of their different markets, from consumer to prosumer to SOHO at the low end of the SMB space, on up to the SME space that sits between SMB and enterprise, they serve diverse customer needs. Likewise, if you look at these vendors' server offerings, they too are diverse, ranging from desktops to floor standing towers to racks, high density racks and blade servers, which also need various tiers, architectures, price bands and purposed storage functionality.

 

What will be key for Dell to make this all work?
The key for Dell will be similar to that for their competitors: clearly communicate the value proposition of the various products or solutions, along with where, who and what their target markets are, and then execute on those plans. There will be overlap and conflict despite the best spin, as is always the case with diverse vendor portfolios.

However, Dell should do well if they can keep their teams focused on expanding their customer footprint at the expense of their external competition instead of cannibalizing their own internal product lines, not to mention creating or extending into new markets or applications. Dell now has many tools in their tool box and thus needs to educate their solution teams on what to use or sell when, where, why and how, instead of having just one tool or a singular focus. In other words, while EqualLogic is a great solution, Dell no longer has to respond as though the answer to everything is iSCSI based EqualLogic.

Likewise, Dell can leverage the same emotion and momentum behind the EqualLogic teams to invigorate and unleash the best of the 3PAR teams and solution into the higher end of the SMB, SME and enterprise environments.

I'm still thinking that Exanet is a diamond in the rough for Dell, where they can install the clustered scalable NAS software onto their servers and use either lower end shared SAS RAID (e.g. MD3000), iSCSI (MD3000, EqualLogic or 3PAR), or higher end Fibre Channel (3PAR) for scale out, cloud and other bulk solutions competing with HP, Oracle and IBM. Dell still has the Windows based storage server for entry level multi protocol block and file capabilities, as well as what they OEM from EMC.

 

Is Dell done shopping?
IMHO I do not think so, as there are still areas where Dell can extend their portfolio, and not just in storage. Likewise there are still some opportunities, or perhaps bargains, out there for fall and beyond.

 

Does this mean that Dell is not happy with EqualLogic and iSCSI?
Simply put, from my perspective of talking with Dell customers, prospects and partners and seeing them all in action, nothing could be further from the truth. Look at this as a way to extend the Dell story and capabilities into new markets; granted, the EqualLogic folks now have a new sibling to compete with for internal marketing and management love and attention.

 

Isn't Dell just an iSCSI focused company?
A couple of years ago I was quoted in one of the financial analysis reports as saying that Dell needed to remain open to various forms of storage instead of becoming singularly focused on just iSCSI as a result of the EqualLogic deal. I stand by that statement: to be a strong enterprise contender, Dell needs a balanced portfolio across different price or market bands, from block to file, and from shared SAS to iSCSI to Fibre Channel and emerging FCoE.

This also means supporting traditional NAS across those different price bands or market sectors, as well as supporting the emerging and fast growing unstructured data markets where there is a need for scale out and bulk storage. Thus it is great to see Dell remaining open minded and not becoming singularly focused on just iSCSI, instead providing the right solution to meet their diverse customer as well as prospect needs and opportunities.

While EqualLogic was and is a very successful iSCSI focused storage solution, not to mention one that Dell continues to leverage, Dell is more than just iSCSI. Take a look at Dell's current storage lineup, as well as table 1 above, and there is a lot of existing diversity. Granted, some of that current diversity is via partners, which the 3PAR deal helps to address. What this means is that iSCSI continues to grow in popularity, however there are other needs where shared SAS, Fibre Channel or FCoE will be required, opening new markets to Dell.

 

Bottom line and wrap up (for now)
This is a great move for Dell (as well as 3PAR) to move up market in the storage space with less reliance on EMC. Assuming that Dell can communicate the what to use when, where, why and how to their internal teams, partners, the industry and customers, not to mention execute on it, they should have themselves a winner.

Will this deal end up being an even better bargain than when Dell paid $1.4B for EqualLogic?

Not sure yet; it certainly has the potential if Dell can execute on their plans without losing momentum in any of their other areas (products).

Whats your take?

Cheers gs

Greg Schulz – Author of The Green and Virtual Data Center (CRC) and Resilient Storage Networks (Elsevier)
twitter @storageio

Here are some related links to read more

Data footprint reduction (Part 1): Life beyond dedupe and changing data lifecycles

Over the past couple of weeks there has been a flurry of IT industry activity around data footprint impact reduction with Dell buying Ocarina and IBM acquiring Storwize. For those who want the quick (compacted, reduced) synopsis of what Dell buying Ocarina as well as IBM acquiring Storwize means read this post here along with some of my comments here and here.

Now, before any Drs or Divas of Dedupe get concerned and feel the need to debate dedupe's expanding role, success or applicability, relax, take a deep breath, then read on, and take another breath before responding if so inclined.

The reason I mention this is that some may mistake this as a piece against, or not in favor of, dedupe, since it talks about life beyond dedupe, which could be misread as indicating a diminished role for dedupe; that is not the case (read ahead and see figure 5 for the bigger picture).

Likewise, some might feel that since this piece talks about archiving for compliance and non regulatory situations, along with compression, data management and other forms of data footprint reduction, they may be compelled to defend dedupe's honor and future role.

Again, relax, take a deep breath and read on, this is not about the death of dedupe.

Now, others might wonder about the dedupe tongue-in-cheek humor mentioned above (which is what it is), and the answer is quite simple. The industry in general is drunk on dedupe, and in some cases has thus numbed its senses, not to mention blurred its vision of the even bigger opportunities for the business benefits of data footprint reduction beyond today's backup centric or VMware server virtualization dedupe discussions.

Likewise, it is time for the industry to wake (or sober) up and, instead of trying to stuff everything under or into the narrowly focused dedupe bottle, realize that there is a broader umbrella called data footprint impact reduction, which includes, among other techniques, dedupe, archiving, compression, data management, data deletion and thin provisioning across all types of data and applications. What this means is a broader opportunity or market than what exists or is being discussed today, leveraging different techniques, technologies and best practices.
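As a toy illustration of one of the techniques under that umbrella beyond dedupe, here is a minimal compression sketch in Python using the standard zlib module. The sample data and sizes are made up for illustration; real world ratios depend entirely on how redundant the data actually is:

```python
import zlib

# Highly redundant data (think log files or sparsely populated
# database exports) compresses extremely well; already-compressed
# media such as JPEG or MP4 barely shrinks at all.
text = b"storage " * 1000          # 8,000 bytes of repetitive data
compressed = zlib.compress(text)

ratio = len(text) / len(compressed)
print(f"{len(text)} bytes -> {len(compressed)} bytes "
      f"(about {ratio:.0f}:1 reduction)")

# Compression is lossless: the original is fully recoverable
assert zlib.decompress(compressed) == text
```

The same principle is why the piece argues for matching the technique to the data: compression shines on redundant active data, while dedupe shines on repeated copies across backups or virtual machine images.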

Consequently, this piece is about expanding the discussion to the larger opportunity: for vendors or VARs to extend their focus to the bigger world of overall data footprint impact reduction beyond where they are currently focused, and for IT customers to realize that there are more opportunities to address data and storage optimization across the entire organization using various techniques instead of just focusing on backup.

In other words, there is a very bright future for dedupe as well as for other techniques and technologies that fall under the data footprint reduction umbrella, including data stored online, offline, near line, primary, secondary, tertiary, virtual and in a public or private cloud.

Before going further, however, let's take a step back and look at some business along with IT issues, challenges and opportunities.

What is the business and IT issue or challenge?
Given that there is no such thing as a data or information recession (shown in figure 1), IT organizations of all sizes are faced with the constant demand to store more data, including multiple copies of the same or similar data, for longer periods of time.


Figure 1: IT resource demand growth continues

The result is an expanding data footprint and increased IT expenses, both capital and operational, due to the additional Infrastructure Resource Management (IRM) activities needed to sustain given levels of application Quality of Service (QoS) delivery, shown in figure 2.

Some common IT costs associated with supporting an increased data footprint include among others:

  • Data storage hardware and management software tools acquisition
  • Associated networking or IO connectivity hardware, software and services
  • Recurring maintenance and software renewal fees
  • Facilities fees for floor space, power and cooling along with IT staffing
  • Physical and logical security for data and IT resources
  • Data protection for HA, BC or DR including backup, replication and archiving


Figure 2: IT Resources and cost balancing conflicts and opportunities

Figure 2 shows that the result is IT organizations of all sizes being faced with having to do more with what they have, or with less, including maximizing available resources. In addition, IT organizations often have to overcome common footprint constraints (available power, cooling, floor space, server, storage and networking resources, management, budgets, and IT staffing) while supporting business growth.

Figure 2 also shows that to support demand, more resources are needed (real or virtual) in a denser footprint, while maintaining or enhancing QoS and lowering per unit resource cost. The trick is improving on available resources while maintaining QoS in a cost effective manner. By comparison, traditionally if costs are reduced, one of the other curves (amount of resources or QoS) is often negatively impacted, and vice versa. Meanwhile, in other situations the result can be moving problems around that later resurface elsewhere. Instead, find, identify, diagnose and prescribe the applicable treatment or form of data footprint reduction, or other IT IRM technology, technique or best practice, to cure the ailment.

What is driving the expanding data footprint?
Granted, more data can be stored in the same or a smaller physical footprint than in the past, thus requiring less power and cooling per GByte, TByte or PByte. However, the data growth rates necessary to sustain business activity, enhance IT service delivery and enable new applications are placing continued demands to move, protect, preserve, store and serve data for longer periods of time.

The popularity of rich media and Internet based applications has resulted in explosive growth of unstructured file data, requiring new and more scalable storage solutions. Unstructured data includes spreadsheets, PowerPoint slide decks, Adobe PDF and Word documents, web pages, and video and audio (JPEG, MP3 and MP4) files. This trend of increasing data storage requirements does not appear to be slowing anytime soon for organizations of all sizes.

After all, there is no such thing as a data or information recession!

Changing data access lifecycles
Many strategies or marketing stories are built around the premise that, shortly after data is created, it is seldom if ever accessed again. The traditional transactional model lends itself to what has become known as information lifecycle management (ILM), where data can and should be archived or moved to lower cost, lower performing, high density storage, or even deleted where possible.

Figure 3 shows, as an example on the left side of the diagram, the traditional transactional data lifecycle, with data being created and then going dormant. The amount of dormant data will vary by the type and size of an organization along with its application mix.


Figure 3: Changing access and data lifecycle patterns

However, unlike the transactional data lifecycle models where data can be removed after a period of time, Web 2.0 and related data needs to remain online and readily accessible. Unlike traditional data lifecycles where data goes dormant after a period of time, on the right side of figure 3, data is created and then accessed on an intermittent basis with variable frequency. The frequency between periods of inactivity could be hours, days, weeks or months and, in some cases, there may be sustained periods of activity.

A common example is a video or some other content that gets created and posted to a web site or social networking site such as Facebook, LinkedIn or YouTube among others. Once the content is posted, while it may not change, additional comment and collaborative data can be wrapped around it as additional viewers discover and comment on the content. Solution approaches for this new category and data lifecycle model include low cost, relatively good performing, high capacity storage such as clustered bulk storage, as well as leveraging different forms of data footprint reduction techniques.

Given that a large (and growing) percentage of new data is unstructured, NAS based storage solutions, including clustered, bulk, cloud and managed service offerings with file based access, are gaining in popularity. To reduce costs while supporting increased business demands (figure 2), a growing trend is to utilize clustered, scale out and bulk NAS file systems that support NFS and CIFS for concurrent large and small IOs, as well as optionally pNFS for large parallel access of files. These solutions are also increasingly being deployed with either built in or add on data footprint reduction techniques, including archiving, policy management, dedupe and compression among others.

What is your data footprint impact?
Your data footprint impact is the total data storage needed to support your various business application and information needs. Your data footprint may be larger than the amount of actual data storage you have, as seen in figure 4. Figure 4 shows an example of an organization that has 20TBytes of storage space allocated and being used for databases, email, home directories, shared documents, engineering documents, financial and other data in different formats (structured and unstructured), not to mention with varying access patterns.


Figure 4: Expanding data footprint due to data proliferation and copies being retained

Of the 20TBytes of data allocated and used, it is very likely that the consumed storage space is not 100 percent used. Database tables may be sparsely (empty or not fully) allocated and there is likely duplicate data in email and other shared documents or folders. Additionally, of the 20TBytes, 10TBytes are duplicated to three different areas on a regular basis for application testing, training and business analysis and reporting purposes.

The overall data footprint is the total amount of data including all copies plus the additional storage required for supporting that data such as extra disks for Redundant Array of Independent Disks (RAID) protection or remote mirroring.

In this overly simplified example, the data footprint and subsequent storage requirements are several times the 20TBytes of data. Consequently, the larger the data footprint, the more data storage capacity and performance bandwidth are needed, not to mention the more that has to be managed, protected and housed (powered, cooled, situated in a rack or cabinet on a floor somewhere).
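As a rough sketch, the footprint arithmetic in this example can be expressed in a few lines of code. The RAID and mirroring overhead factors are illustrative assumptions for demonstration, not figures from any specific environment:

```python
# Rough sketch of the data footprint arithmetic in the example above.
# The RAID and mirroring overhead factors are illustrative assumptions.
base_tb = 20.0         # storage allocated and used
copied_tb = 10.0       # subset duplicated for test, training and reporting
copy_count = 3         # number of additional copies of that subset
raid_overhead = 1.25   # e.g. a 4+1 RAID 5 layout adds ~25% for parity
mirror_factor = 2.0    # remote mirroring doubles the protected capacity

logical_tb = base_tb + copied_tb * copy_count            # 20 + 30 = 50 TB
footprint_tb = logical_tb * raid_overhead * mirror_factor

print(f"Logical data including copies: {logical_tb:.0f} TB")
print(f"Footprint with RAID and mirroring: {footprint_tb:.1f} TB")
```

With these assumed factors, the 20TBytes of data drives a footprint of roughly 125TBytes, illustrating how copies plus protection overhead multiply the base capacity.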

Data footprint reduction techniques
While data storage capacity has become less expensive on a relative basis, as data footprints continue to expand to support business requirements, more IT resources will need to be made available in a cost effective, yet QoS satisfying manner (again, refer back to figure 2). What this means is that more IT resources, including server, storage and networking capacity, management tools along with associated software licensing and IT staff time, will be required to protect, preserve and serve information.

By more effectively managing the data footprint across different applications and tiers of storage, it is possible to enhance application service delivery and responsiveness as well as facilitate more timely data protection to meet compliance and business objectives. To realize the full benefits of data footprint reduction, look beyond backup and offline data improvements to include online and active data using various techniques such as those in table 1 among others.

There are several methods (shown in table 1) that can be used to address data footprint proliferation without compromising data protection or negatively impacting application and business service levels. These approaches include archiving of structured (database), semi structured (email) and unstructured (general files and documents), data compression (real time and offline) and data deduplication.

 

Archiving
  When to use:    Structured (database), email and unstructured data
  Characteristic: Software to identify and remove unused data from active storage devices
  Examples:       Database, email and unstructured file solutions with archive storage
  Caveats:        Time and knowledge to know what and when to archive and delete; data and application awareness needed

Compression
  When to use:    Online (database, email, file sharing), backup or archive
  Characteristic: Reduce the amount of data to be moved (transmitted) or stored on disk or tape
  Examples:       Host software; disk, tape and network (router) hardware; compression appliances or software; some primary storage system solutions
  Caveats:        Software based solutions require host CPU cycles, impacting application performance

Deduplication
  When to use:    Backup or archiving of recurring and similar data
  Characteristic: Eliminate duplicate files or file content observed over a period of time to reduce data footprint
  Examples:       Backup and archive target devices, Virtual Tape Libraries (VTLs) and specialized appliances
  Caveats:        Works well in background mode for backup data to avoid performance impact during data ingestion
Table 1: Data footprint reduction approaches and techniques

Archiving for compliance and general data retention
Data archiving is often perceived as a solution for compliance; however, archiving can be used for many other non compliance purposes. These include general data footprint reduction, boosting performance and enhancing routine data maintenance and data protection. Archiving can be applied to structured database data, semi structured email data and attachments, and unstructured file data.

A key to deploying an archiving solution is having insight into what data exists along with applicable rules and policies to determine what can be archived, for how long, how many copies and how data ultimately may be finally retired or deleted. Archiving requires a combination of hardware, software and people to implement business rules.

A challenge with archiving is having the time and tools available to identify what data should be archived and what data can be securely destroyed when no longer needed. Further complicating archiving is that knowledge of the data value is also needed; this may well include legal issues as to who is responsible for making decisions on what data to keep or discard.

If a business can invest in the time and software tools, as well as identify which data to archive to support an effective archive strategy, the returns can be very positive towards reducing the data footprint without limiting the amount of information available for use.

Data compression (real time and offline)
Data compression is a commonly used technique for reducing the size of data being stored or transmitted to improve network performance or reduce the amount of storage capacity needed for storing data. If you have used a traditional or TCP/IP based telephone or cell phone, watched either a DVD or HDTV, listened to an MP3, transferred data over the internet or used email you have most likely relied on some form of compression technology that is transparent to you. Some forms of compression are time delayed, such as using PKZIP to zip files, while others are real time or on the fly based such as when using a network, cell phone or listening to an MP3.
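Lossless compression of the sort described above can be sketched with Python's standard `zlib` module. Highly repetitive data (like recurring records or log entries) compresses well, while already random data does not; the sample data below is purely illustrative:

```python
# Minimal sketch of lossless compression using Python's standard zlib module.
# The sample data is illustrative; real ratios depend entirely on content.
import os
import zlib

repetitive = b"customer_record;" * 10_000   # highly redundant content
random_ish = os.urandom(len(repetitive))    # incompressible by design

rep_packed = zlib.compress(repetitive, level=6)
rnd_packed = zlib.compress(random_ish, level=6)

print(f"repetitive: {len(repetitive)} -> {len(rep_packed)} bytes "
      f"({len(repetitive) / len(rep_packed):.0f}:1)")
print(f"random:     {len(random_ish)} -> {len(rnd_packed)} bytes "
      f"({len(random_ish) / len(rnd_packed):.2f}:1)")

# Lossless: decompressing restores the original bytes exactly
assert zlib.decompress(rep_packed) == repetitive
```

The same principle underlies tape drive compression, WAN optimization and primary storage compression; only the placement (host, network or storage target) and the real time vs. offline mode differ.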

Two different approaches to data compression, which vary in time delay or impact on application performance along with the amount of compression and loss of data, are lossless (no data loss) and lossy (some data loss for a higher compression ratio). In addition to these approaches, there are also different implementations, including real time, with no performance impact to applications, and time delayed, where there is a performance impact to applications.

In contrast to traditional ZIP or offline, time delayed compression approaches that require complete decompression of data prior to modification, online compression allows for reading from, or writing to, any location within a compressed file without full file decompression and resulting application or time delay. Real time appliance or target based compression capabilities are well suited for supporting online applications including databases, OLTP, email, home directories, web sites and video streaming among others without consuming host server CPU or memory resources or degrading storage system performance.

Note that with the increase of CPU server processing performance along with multiple cores, server based compression running in applications such as database, email, file systems or operating systems can be a viable option for some environments.

A scenario for using real time data compression is for time sensitive applications that require large amounts of data, such as online databases, video and audio media servers, and web and analytic tools. For example, databases such as Oracle support NFS3 Direct IO (DIO) and Concurrent IO (CIO) capabilities to enable random and direct addressing of data within an NFS based file. This differs from traditional NFS operations where a file would be sequentially read or written.

Another example of using real time compression is to combine a NAS file server configured with 300GB or 600GB high performance 15.5K Fibre Channel or SAS HDDs in addition to flash based SSDs to boost the effective storage capacity of active data without introducing a performance bottleneck associated with using larger capacity HDDs. Of course, compression would vary with the type of solution being deployed and type of data being stored just as dedupe ratios will differ depending on algorithm along with if text or video or object based among other factors.

Deduplication (Dedupe)
Data deduplication (also known as single instance storage, commonality factoring, data differencing or normalization) is a data footprint reduction technique that eliminates recurring copies of the same data. Deduplication works by normalizing the data being backed up or stored, eliminating recurring or duplicate copies of files or data blocks depending on the implementation.
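As an illustrative sketch (not how any particular product implements it), a fixed block dedupe pass can be modeled by hashing each block and storing only blocks not seen before; real solutions typically use variable length chunking and persistent indexes:

```python
# Simplified sketch of fixed-block deduplication: hash each block, store
# only unique blocks, and keep a "recipe" of hashes to rebuild the data.
# Illustrative only; real products use variable-length chunking and
# persistent on-disk indexes.
import hashlib

BLOCK = 4096  # assumed fixed block size

def dedupe(data: bytes, store: dict) -> list:
    """Store unique blocks keyed by hash; return the recipe of hashes."""
    recipe = []
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy seen
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from the recipe of block hashes."""
    return b"".join(store[d] for d in recipe)

store = {}
backup = b"A" * BLOCK * 8 + b"B" * BLOCK * 2   # 10 blocks, only 2 unique
recipe = dedupe(backup, store)
print(f"{len(recipe)} logical blocks, {len(store)} unique blocks stored")
assert restore(recipe, store) == backup
```

The ratio achieved (here 5 to 1) depends entirely on how much recurring data exists, which is why backup streams dedupe so well while unique primary data often does not.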

Some data deduplication solutions boast spectacular ratios for data reduction given specific scenarios, such as backup of repetitive and similar files, while providing little value over a broader range of applications.

This is in contrast with traditional data compression approaches that provide lower, yet more predictable and consistent data reduction ratios over more types of data and application, including online and primary storage scenarios. For example, in environments where there is little to no common or repetitive data files, data deduplication will have little to no impact while data compression generally will yield some amount of data footprint reduction across almost all types of data.

Some data deduplication solution providers have either already added, or have announced plans to add, compression techniques to complement and increase the data footprint effectiveness of their solutions across a broader range of applications and storage scenarios, attesting to the value and importance of data compression in reducing data footprint.

When looking at deduplication solutions, determine if the solution is designed to scale in terms of performance, capacity and availability over a large amount of data, along with how restoration of data will be impacted by scaling for growth. Other items to consider include how data is deduplicated, such as in real time using inline processing or some form of time delayed post processing, and the ability to select the mode of operation.

For example, a dedupe solution may be able to process data at a specific ingest rate inline until a certain threshold is hit and then processing reverts to post processing so as to not cause a performance degradation to the application writing data to the deduplication solution. The downside of post processing is that more storage is needed as a buffer. It can, however, also enable solutions to scale without becoming a bottleneck during data ingestion.

However, there is life beyond dedupe, which is in no way to diminish dedupe or its very strong and bright future. Having talked with hundreds of IT professionals (e.g. the customers), I am increasingly convinced that only the surface is being scratched with dedupe, not to mention the larger data footprint impact opportunity seen in figure 5.


Figure 5: Dedupe adoption and deployment waves over time

While dedupe is a popular technology from a discussion standpoint and has good deployment traction, it is far from reaching mass customer adoption or even broad coverage in environments where it is being used. StorageIO research shows broadest adoption of dedupe centered around backup in smaller or SMB environments (dedupe deployment wave one in figure 5) with some deployment in Remote Office Branch Office (ROBO) work groups as well as departmental environments.

StorageIO research also shows that complete adoption in many of those SMB, ROBO, work group or smaller environments has yet to reach 100 percent. This means that there remains a large population that has yet to deploy dedupe as well as further opportunities to increase the level of dedupe deployment by those already doing so.

There has also been some early adoption in larger core IT environments where dedupe coexists with, and complements, existing data protection and preservation practices. Another current deployment scenario for dedupe has been supporting core edge deployments in larger environments that provide backup and data protection for ROBO, work group and departmental systems.

Note that figure 5 simply shows the general types of environments in which dedupe is being adopted and not any sort of indicators as to the degree of deployment by a given customer or IT environment.

What to do about your expanding data footprint impact?
Develop an overall data footprint reduction strategy that leverages different techniques and technologies to address online primary, secondary and offline data. Assess and discover what data exists and how it is used in order to effectively manage storage needs.

Determine policies and rules for retention and deletion of data combining archiving, compression (online and offline) and dedupe in a comprehensive data footprint strategy. The benefit of a broader, more holistic, data footprint reduction strategy is the ability to address the overall environment, including all applications that generate and use data as well as IRM or overhead functions that compound and impact the data footprint.

Data footprint reduction: life beyond (and complementing) dedupe
The good news is that the Drs. and Divas of dedupe marketing (the ones who also are good at the disco dedupe dance debates) have targeted backup as an initial market sweet (and success) spot shown in figure 5 given the high degree of duplicate data.


Figure 6: Leverage multiple data footprint reduction techniques and technologies

However that same good news is bad news in that there is now a stigma that dedupe is only for backup, similar to how archive was hijacked by the compliance marketing folks in the post Y2K era. There are several techniques that can be used individually to address specific data footprint reduction issues or in combination as seen in figure 7 to implement a more cohesive and effective data footprint reduction strategy.


Figure 7: How various data footprint reduction techniques are complementary

What this means is that archive, dedupe and other forms of data footprint reduction can and should be used beyond where they have been target marketed, applying the applicable tool to the task at hand. For example, a common industry rule of thumb is that on average, ten percent of data changes per day (your mileage and rate of change will certainly vary given applications, environment and other factors).

Now assume that you have 100TB (feel free to subtract a zero or two, or add as many as needed) of data (note I did not say storage capacity or percent utilized); a ten percent change would be 10TB that needs to be backed up, replicated and so forth. Basic 2 to 1 streaming tape compression (2.5 to 1 in upcoming LTO enhancements) would reduce the daily backup footprint from 10TB to 5TB.

Using dedupe at 10 to 1 would get that 10TB down to 1TB, or about the size of a large capacity disk drive. At 20 to 1, the daily backup drops to 500GB, and so forth. The net effect is that more daily backups can be stored in the same footprint, which in turn helps expedite individual file recovery by providing more options to choose from in the disk based cache, buffer or storage pool.

On the other hand, if your objective is to reduce and eliminate storage capacity, then the same amount of backups can be stored on less disk freeing up resources. Now take the savings times the number of days in your backup retention and you should see the numbers start to add up.
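The rule of thumb arithmetic above can be sketched quickly; all inputs (data size, change rate, reduction ratios, retention period) are the illustrative assumptions from the example, not measurements:

```python
# Sketch of the backup reduction arithmetic above. All inputs are the
# illustrative assumptions from the example, not measured values.
data_tb = 100.0
change_rate = 0.10                   # ~10% daily change rule of thumb
daily_tb = data_tb * change_rate     # 10 TB to protect each day

reduced = {name: daily_tb / ratio
           for name, ratio in [("tape 2:1", 2),
                               ("dedupe 10:1", 10),
                               ("dedupe 20:1", 20)]}
for name, tb in reduced.items():
    print(f"{name}: {daily_tb:.0f} TB -> {tb:.2f} TB stored per day")

retention_days = 30  # assumed retention window
print(f"{retention_days}-day retention at 10:1: "
      f"{retention_days * reduced['dedupe 10:1']:.0f} TB vs "
      f"{retention_days * daily_tb:.0f} TB without reduction")
```

Multiplying the daily savings by the retention window is where the numbers add up, whether the goal is more recovery points in the same footprint or fewer disks for the same retention.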

Now what about the other 90 percent of the data that may not have changed, or, that did change and exists on higher performance storage?

Can its footprint impact be reduced?

The answer should be perhaps, or it depends, and it prompts the question of which tool would be best. There is a popular tendency, as is often the case with industry buzzwords or technologies, to use them everywhere. After all, goes the thinking, if it is a good thing, why not use and deploy more of it everywhere?

Keep in mind that dedupe trades time (the processing needed to apply intelligence and further reduce data) for space capacity. Trading time for space can have a negative impact on applications that need lower response times and higher performance, where the focus is on rates vs. ratios. For example, the other 90 to 100 percent of the data in the above example may have to reside on a mix of high and medium performance storage to meet QoS or service level agreement (SLA) objectives. While it would be fun, or perhaps cool, to try to achieve a high data reduction ratio on the entire 100TB of active data with dedupe (e.g. trying to achieve primary dedupe), the performance impact could be negative.

The option is to apply a mix of different data footprint reduction techniques across the entire 100TB. That is, use dedupe where applicable and higher reduction ratios can be achieved while balancing performance, compression used for streaming data to tape for retention or archive as well as in databases or other applications software not to mention in networks. Likewise, use real time compression or what some refer to as primary dedupe for online active changing data along with online static read only data.

Deploy a comprehensive data footprint reduction strategy combining various techniques and technologies to address point solution needs as well as the overall environment, including online, near line for backup, and offline for archive data.

Let's not forget about archiving, thin provisioning, space saving snapshots and commonsense data management, among other techniques, across the entire environment. In other words, if your focus is just on dedupe for backup to achieve an optimized and efficient storage environment, you are missing out on a larger opportunity. However, this also means having multiple tools or technologies in your IT IRM toolbox, as well as understanding what to use when, where and why.

Data transfer rates are a key metric for performance (time) optimization, such as meeting backup, restore or other data protection windows. Data reduction ratios are a key metric for capacity (space) optimization, where the focus is on storing as much data as possible in a given footprint.

Some additional take away points:

  • Develop a data footprint reduction strategy for online and offline data
  • Energy avoidance can be accomplished by powering down storage
  • Energy efficiency can be accomplished by using tiered storage to meet different needs
  • Measure and compare storage based on idle and active workload conditions
  • Storage efficiency metrics include IOPS or bandwidth per watt for active data
  • Storage capacity per watt per footprint and cost is a measure for inactive data
  • Small percentage reductions on a large scale have big benefits
  • Align the applicable form of virtualization for the given task at hand

Some links for additional reading on the above and related topics

Wrap up (for now, read part II here)

For some applications, data reduction ratios are the important focus, along with the tools or modes of operation that achieve those results.

Likewise, for other applications where the focus is on performance with some data reduction benefit, tools are optimized for performance first and reduction second.

Thus I expect messaging from some vendors to adjust (expand) to the capabilities that they have in their toolboxes (product portfolios).

Consequently, IMHO some of the backup centric dedupe solutions may find themselves in niche roles in the future unless they can diversify. Vendors with multiple data footprint reduction tools will also do better than those with only a single function or focused tool.

However for those who only have a single or perhaps a couple of tools, well, guess what the approach and messaging will be.

After all, if all you have is a hammer, everything looks like a nail; if all you have is a screwdriver, well, you get the picture.

On the other hand, if you are still not clear on what all this means, send me a note, give me a call, post a comment or a tweet and I will be happy to discuss it with you.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

It's US Census time, what about IT Data Centers?

It is time this year for that once a decade activity referred to as the US 2010 Census.

With the 2010 census underway, not to mention also time for completing and submitting your income tax returns, if you are in IT, what about measuring, assessing, taking inventory or analyzing your data and data center resources?

Figure 1: IT US 2010 Census forms

Have you recently taken a census of your data, data storage, servers, networks, hardware, software tools, services providers, media, maintenance agreements and licenses not to mention facilities?

Likewise have you figured out what if any taxes in terms of overhead or burden exists in your IT environment or where opportunities to become more optimized and efficient to get an IT resource refund of sorts are possible?

If not, now is a good time to take a census of your IT data center and associated resources, in what might also be called an assessment, review, inventory or survey of what you have; how it's being used, where, by whom and when; along with associated configuration, performance, availability, security and compliance coverage, plus costs and energy impact among other items.

IT Data Center Resources
Figure 2: IT Data Center Metrics for Planning and Forecasts

How much storage capacity do you have, how is it allocated along with being used?

What about storage performance, are you meeting response time and QoS objectives?

Lets not forget about availability, that is planned and unplanned downtime, how have your systems been behaving?

From an energy or power and cooling standpoint, what is the consumption, along with metrics aligned to productivity and effectiveness? These include IOPS per watt, transactions per watt, videos or emails along with web clicks or page views per watt, processor GHz per watt, data movement bandwidth per watt and capacity stored per watt in a given footprint.
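As a sketch of how such activity metrics are computed, with made up workload and power figures for illustration (not benchmark results from any product):

```python
# Sketch of per-watt productivity metrics. The IOPS, capacity and power
# figures below are made-up placeholders, not measured or vendor numbers.

def per_watt(value: float, watts: float) -> float:
    """Work or capacity delivered per watt of power consumed."""
    return value / watts

fast_array = {"iops": 50_000, "watts": 2_500}   # performance-oriented tier
dense_array = {"tb": 500, "watts": 1_200}       # capacity-oriented tier

print(f"Active tier: {per_watt(fast_array['iops'], fast_array['watts']):.1f} IOPS/watt")
print(f"Idle tier:   {per_watt(dense_array['tb'], dense_array['watts']):.2f} TB/watt")
```

The same division applies to transactions, page views or bandwidth per watt; the key is matching the metric (activity vs. capacity) to whether the tier serves active or inactive data.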

Other items to look into for data centers besides storage include servers, data and I/O networks, hardware, software, tools, services and other supplies, along with the physical facility, using metrics such as PUE. Speaking of optimization, how is your environment doing? Finding out is another advantage of doing a data center census.

For those who have completed and sent in your census material along with your 2009 tax returns, congratulations!

For others in the US who have not done so, now would be a good time to get going on those activities.

Likewise, regardless of what country or region you are in, it's always a good time to take a census or inventory of your IT resources instead of waiting ten years to do so.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Storage Optimization: Performance, Availability, Capacity, Effectiveness

Storage I/O trends

With the IT and storage industry shying away from green hype, green washing and other green noise, there is also a growing realization that the new green is about effectively boosting efficiency to improve productivity and profitability or to sustain business and IT growth during tough economic times.

This past week, while doing some presentations (I'll post a link soon to the downloads) at the 2008 San Francisco installment of the Storage Decisions event focused on storage professionals, as well as a keynote talk at the storage strategies event focused on value added reseller (VAR) channel professionals, a common theme was boosting productivity, improving efficiency, stretching budgets and enabling existing personnel and resources to do more with the same or less.

During these and other presentations, keynotes, sessions and seminars recently, both here in the U.S. and in Europe, the common themes have been boosting efficiency and closing the green gap. That gap lies between industry and marketing rhetoric (green hype, green noise and green washing) that either does not resonate with, or cannot be funded by, IT organizations, and the issues many IT organizations actually face: power, cooling, floor space or footprint, EH&S (environmental health and safety) and economics.

The green gap (here, and here, and here) is that many IT organizations around the world, due to green hype around carbon footprints and related themes, have not realized that boosting energy efficiency for active and online applications, data and workloads (e.g. doing more I/O operations per second (IOPS), transactions, files or messages processed per watt of energy) to address power, cooling and floor space is in fact a form of addressing green issues, both economic and environmental.

Likewise, for inactive or idle data, there is a bit more of a linkage in that green can mean powering things off. However, there is also a disconnect in that many perceive that green storage, for example, is only green if the storage can be powered off, which, while true for inactive or idle data and applications, is not true for all data and application types.

As mentioned already, for active workloads, green means doing more with the same or less power, cooling and floor space impact; that is, doing more work per unit of energy. In that theme, for an active workload, a slow, large capacity disk may in fact not be energy efficient if it impedes productivity and results in more energy being used to get the same amount of work done. For example, larger capacity SATA disk drives are often positioned as being the most green or energy efficient, which can be true for idle, inactive or non performance (time) sensitive applications where more data is stored in a denser footprint.

However for active workload, lower capacity 15.5K RPM 300GB and 400GB Fibre Channel (FC) and SAS disk drives that deliver more IOPS or bandwidth per watt of energy can get more work done in the same amount of time.

There is also a perception that FC and SAS disk drives use more power than SATA disk drives which in some cases can be true, however current generations of high performance 10K RPM and 15.5K RPM drives have very similar power draw on a raw spindle or device basis. What differs is the amount of capacity per watt for idle or inactive applications, or, the number of IOPS or amount of performance for active configurations.

On the other hand, while not normally perceived as being green compared to tape or IPM and MAID (1st generation and MAID 2.0) solutions, SSD (flash and RAM) along with fast SAS and FC disks or tiered storage systems that can do more IOPS or bandwidth per watt of energy are in fact green and energy efficient for getting work done. Thus, there are two sides to optimizing storage for energy efficiency: optimizing for when doing work (e.g. more miles per gallon per amount of work done) and using as little energy as possible when not doing work.

Thus, a new form of being green to sustain business growth while boosting productivity is Gaining Realistic Economic Efficiency Now, which as a by product helps both business bottom lines and the environment by doing more with less. These are themes addressed in my new book "The Green and Virtual Data Center" (Auerbach), which will be formally launched and released for general availability just after the 1st of the year (hopefully sooner); however, you can beat the rush and order your copy now at Amazon and other fine venues around the world.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Dutch StorageExpo Recap

Earlier this week I had the pleasure of presenting a keynote talk ("Storage Industry Trends and Perspectives: Beyond Hype and Green Washing") at the Dutch StorageExpo (produced by VNU Exhibitions Europe) event in Utrecht, the Netherlands, which was co-located in the ultra large Jaarbeurs congress center (e.g. convention center) along with concurrent shows for Linux, security and networking, making for a huge show and exhibition, almost a mini, scaled down version of a CeBIT, VMworld or EMCworld like event.

Dutch StorageExpo

Congratulations and many thanks to Marloes van den Berg of VNU Exhibitions and her team who put together a fantastic and well attended event, not to mention their warm and gracious Dutch hospitality.

European shows and events are different than those in the U.S. in that at European events, the focus is more on meeting, building and maintaining relationships and less on “Uui Gui” demos or marketing sales pitches involving complex demos and technology displays found at many U.S. events.

Granted, there are indeed product demos and technology to look at and talk about, and rest assured, the conversations and discussions, when involving technology, get right to the point and are often much more direct. There is also a more relaxed aspect, as seen in the many booths, or stands as they are called, many of which have bars that serve coffee in the morning as well as snacks and other beverages (the Heineken in Holland is much better than what is shipped to the U.S.) over which to have conversations about various topics, issues and technologies.

Many of the issues being faced by the Europeans are similar to those faced by IT organizations in North America and elsewhere in the world, including limits or issues around power, cooling, floor space footprints and economics, along with doing more with less to boost productivity and enhance efficiency while sustaining business growth without impacting service delivery or service levels. BC/DR, data protection, data security and virtualization were all topics of interest and points of discussion among others.

I had the opportunity to meet several new people from IT organizations, VARs or resellers, consultants, vendors and media, along with putting faces to the names of people I had previously met only virtually, not to mention re-connecting with others whom I have known from the past.

Thanks to all of those who attended both the keynote session on Wednesday afternoon and Monday's all day seminar organized by Gert Brouwer of Brouwer Consultancy in Nijkerk. I really enjoyed the conversations and perspectives of everyone I had a chance to meet with this past week and look forward to future conversations.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Thank you Gartner for generating green awareness for my new book: The Green and Virtual Data Center!

Storage I/O trends

The other day Gartner issued a press release about their new findings that Users Are Becoming Increasingly Confused About the Issues and Solutions Surrounding Green IT.

However, what is missing from the Gartner report and action steps is a suggestion to also read my new book "The Green and Virtual Data Center" (Auerbach).

However in all fairness, since Gartner has not yet seen it, I would seriously doubt that they would endorse anything other than one of their own publications.

Regardless, it's great to see Gartner, among others, joining in and helping to transition industry awareness away from green hype, helping to close the green gap (read more here and here) and beginning to address core issues that IT organizations can and are addressing to improve efficiency, address costs and enable sustainable business growth in an economically and environmentally friendly way.

Thank you Gartner, and let me know when and where you want a copy sent for a formal review and endorsement of my new book. Meanwhile, you can learn more at www.thegreenandvirtualdatacenter.com, including a variety of green and related power, cooling, floor-space, environmental health and safety-EHS (PCFE) topics, along with where to pre-order your advance copy from Amazon.com as well as other fine venues around the world.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio


Intelligent Power Management (IPM) and second generation MAID 2.0 on the rise

Storage I/O trends

In case you missed it today, Adaptec announced that they are the 1st vendor "This Week" to add support for Intelligent Power Management (IPM) to their storage systems. Adaptec joins a growing list of vendors who are deploying, or announcing plans for, some variation of IPM and second generation MAID 2.0 capability, including support for different types of tiered disk drives in various combinations of Fibre Channel, SAS and SATA.

As a quick refresh, Massive or Monolithic Arrays of Idle or Inactive Disks (MAID) was popularized by 1st generation MAID vendor Copan, whose systems spin down disk drives to avoid energy usage. One of the challenges with 1st generation MAID is poor performance, since at most 25% of the disk drives can be spinning at any time to transfer data when needed.

This is a balancing act between achieving energy avoidance and its associated benefits vs. maintaining the performance to move data when needed, particularly for large restorations in support of BC/DR or other purposes. Granted, 1st generation MAID systems like those from Copan were positioned on one hand as alternatives to high-performance disk storage systems to amplify potential energy savings, or on the other hand as an alternative to magnetic tape by providing random restore capability. The reality is that 1st generation MAID systems are finding their niche not as on-line primary or even on-line secondary storage, nor as a direct replacement for tape or disk based libraries supporting large-scale BC/DR, but rather in a sweet spot between secondary and near-line disk libraries and virtual tape libraries, with a target application of very infrequently accessed data.

Second generation MAID, aka MAID 2.0, is an evolution of the general technologies and capabilities, extending functionality and flexibility while addressing quality of service (QoS), performance, availability, capacity and energy consumption using IPM, also known as Adaptive Power Management (APM) or dynamic bandwidth switching or scaling (DBS) among other names. The basic premise is to add flexibility while building on 1st generation characteristics including data protection, resiliency and pro-active part or drive monitoring. Another basic premise of IPM and MAID 2.0 solutions is to allow performance and subsequent energy usage to vary: cut performance and energy usage during in-active times, yet, when data needs to be accessed, allow full performance without penalties for the energy savings.

Second generation MAID solutions can be characterized by multiple power saving modes as well as flexible performance to adjust to changing workload and application needs. Another characteristic is the ability to work across different types of disk drives including Fibre Channel, SAS and SATA, as opposed to only the SATA drives found in 1st generation solutions, and for the IPM or MAID 2.0 functionality to exist in a standard storage system or array instead of a purpose-built dedicated storage system. Other capabilities include support for more granular power settings down to a RAID group or LUN level instead of across an entire array or storage system, as well as support for different RAID levels among other features.
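The per-RAID-group, multi-state behavior described above can be sketched in a few lines of Python. The state names, idle thresholds and class names below are purely hypothetical for illustration, not any vendor's actual firmware logic: the point is simply that a group steps down through power states as idle time grows, and any I/O snaps it back to full performance.

```python
from enum import Enum

class PowerState(Enum):
    ACTIVE = "active"        # full RPM, full performance
    IDLE = "idle"            # heads parked, fast recovery
    STANDBY = "standby"      # reduced RPM, deeper savings
    SPUN_DOWN = "spun_down"  # platters stopped, maximum savings

# Hypothetical idle thresholds (seconds) before stepping down a level
THRESHOLDS = [(600, PowerState.IDLE),
              (1800, PowerState.STANDBY),
              (3600, PowerState.SPUN_DOWN)]

class DriveGroup:
    """Models IPM at the RAID-group level: the group steps down through
    power states as idle time accumulates, and returns to ACTIVE (full
    performance, no penalty) the moment an I/O arrives."""

    def __init__(self, name):
        self.name = name
        self.state = PowerState.ACTIVE
        self.idle_seconds = 0

    def tick(self, seconds):
        # Advance idle time; cross each threshold in order, stepping down.
        self.idle_seconds += seconds
        for limit, state in THRESHOLDS:
            if self.idle_seconds >= limit:
                self.state = state

    def io(self):
        # Any read or write wakes the group back to full performance.
        self.idle_seconds = 0
        self.state = PowerState.ACTIVE

group = DriveGroup("raid_group_1")
group.tick(700)    # 700 s idle: past the first threshold, now IDLE
group.tick(3000)   # 3700 s total idle: now SPUN_DOWN
group.io()         # access restores ACTIVE with no lingering penalty
```

Because the thresholds are per group rather than per array, one busy RAID group can stay at full performance while an inactive group on the same system saves energy, which is the granularity difference between MAID 2.0 and 1st generation solutions.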

Examples of vendors who have either announced product or made statements of direction with regard to MAID 2.0 and IPM enabled storage systems include:

Adaptec (Today), Datadirect, EMC, Fujitsu, HDS, HGST (Hitachi Disk Drives), NEC, Nexsan, and Xyratex among others on a growing list of solutions.

For applications and data storage needs that require good performance and QoS over a range of changing usage conditions, balancing good performance to efficiently get work done and boost productivity against saving or avoiding energy when little or no work needs to be done, take a look at current and emerging IPM and MAID 2.0 enabled storage systems as part of a tiered storage strategy to address power, cooling, floor-space and EHS (PCFE) related issues.

To learn more, check out the StorageIO Industry Trends and Perspective white paper Intelligent Power Management (IPM) and MAID 2.0 and visit www.thegreenandvirtualdatacenter.com as well as www.storageio.com.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio


Closing the Green Gap – Green washing may be endangered, however addressing real green issues is here to stay

Storage I/O trends

Here’s a new article I wrote that just appeared over at Enterprise Storage Forum called Closing the Green Storage Gap.

Not all ‘green’ IT solutions or messages are created equal. Regardless of political views, the reality is that for business and IT sustainability, a focus on ecological issues and more importantly, their economic aspects cannot be ignored.

There are business benefits to using the most energy-efficient IT solutions to meet different data and application requirements. However, vendors are busy promoting ‘green’ stories and solutions that often miss where IT organization challenges and mandates exist. This article examines the growing gap between green messaging, or ‘Green Wash,’ and how to close the gap and enable IT organization issues to be addressed today in a way that sustains business growth in an economic and ecologically friendly way.

Have a read and a good weekend.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio


Links to Upcoming and Recent Webcasts and Videocasts

Here are links to several recent and upcoming Webcasts and videocasts covering a wide range of topics. Some of these free Webcasts and videocasts may require registration.

Industry Trends & Perspectives – Data Protection for Virtual Server Environments

Next Generation Data Centers Today: What’s New with Storage and Networking

Hot Storage Trends for 2008

Expanding your Channel Business with Performance and Capacity Planning

Top Ten I/O Strategies for the Green and Virtual Data Center

Cheers
Greg Schulz – StorageIO

Green, Virtual, Servers, Storage and Networking 2008 Beijing Olympics

Storage I/O trends

How about those opening 2008 Beijing Olympic ceremonies on NBC last night?

If you were like me, you had your DVR capture the event while you were out enjoying the nice August evening with some friends, relaxing and fishing (we did catch and release the fish!) on the scenic St. Croix river.

John Nelson with a small mouth bass caught and released on the St. Croix River During 2008 Beijing Olympics
Fishing while DVR records 2008 Olympics

John Nelson with a northern pike (swamp shark) caught and released on the St. Croix River During 2008 Beijing Olympics
Fishing while DVR records 2008 Olympics

A young bald eagle seen during fishing on the St. Croix river during opening of 2008 Olympic games
A young bald eagle seen during fishing while DVR records 2008 Olympics

The reason I bring up the Olympics along with servers, storage, networking, virtualization and green topics is a couple of themes. One is all the news and content available for keeping track of what is happening with the games, all of which is being stored on servers and storage and relies on networks for access to the rich media and unstructured data via the web or traditional media. The 2008 summer games are also being described as the on-line and virtual olympics. The amount of storage being used to hold digital data from the 2008 Olympics for later playback, which then gets recorded on DVRs if not watched in real-time, is staggering, as are the number of servers and networking capabilities being used. In addition to the video, audio, still photos, text and blogs, there are the security cameras in Beijing generating massive amounts of digital data.

For those who track or keep an eye or ear open towards data and storage management, the amount of data that continues to grow and the number of copies that get created should be a familiar theme. Of course, you will also have heard that the magic elixir is to simply de-dupe everything, that is, reduce your data footprint by eliminating all of those extra copies. That is easier said than done, especially when a copy of the games is being transmitted and saved to millions of DVRs or other forms of data storage servers around the world.
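As a toy illustration of the dedupe idea above (not any particular product's implementation), here is a minimal content-addressed sketch in Python: identical chunks are stored once, keyed by their SHA-256 digest, and extra copies only add references. The 4-byte chunk size and class name are hypothetical, chosen only to keep the example tiny; real products use much larger, often variable-sized, chunks.

```python
import hashlib

class DedupeStore:
    """Toy content-addressed store: each unique chunk is stored once,
    keyed by its SHA-256 digest; duplicate chunks add no physical data."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}        # digest -> chunk bytes (stored once)
        self.logical_bytes = 0  # total bytes written by callers

    def write(self, data):
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            digests.append(digest)
        self.logical_bytes += len(data)
        return digests  # the "recipe" needed to reassemble the data later

    def physical_bytes(self):
        return sum(len(c) for c in self.chunks.values())

store = DedupeStore()
store.write(b"olympicsolympics")  # chunks "olym","pics","olym","pics"
store.write(b"olympicsolympics")  # a second full copy adds no new chunks
# 32 logical bytes written, but only 8 physical bytes stored
```

The catch the paragraph above points to is visible even here: dedupe only pays off where the copies land in the same store. Millions of independent DVRs each holding their own copy of the games get no benefit from each other's duplicates.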

For the time being, I prefer that my DVR support more usable storage capacity and real-time compression so that I can keep more copies of my favorite shows and of course the Olympics all in HDTV, which of course chews up storage space faster than a highly animated PowerPoint slide deck from your favorite vendors most recent, or, upcoming product announcements.

The other theme is that in addition to being Olympic time, as well as late summer here in the northern hemisphere or winter for our friends in the southern hemisphere, it's also pre-briefing and early product announcement time for the barrage of fall server, storage, networking, I/O, software, virtualization and green related solutions. So far, I'm not sure if it's the Olympics or what, however the bait line on the upcoming announcements and briefings includes the tags "Industry First", "Industry Unique", "Only Vendor", "Only Product", "Revolutionary", "First Vendor" or "First Product", "Fastest", "Largest" and "Greenest", among other interesting spins and twists that would make even an Olympic gymnast dizzy.

So enjoy the Olympics, keep those hard disk drives in your DVR cool while managing the usable capacity, and watch for more gold medal attempts both from Beijing as well as from your favorite IT vendors, coming to a podium near you soon with their upcoming announcements, some of which may be award winning. Also check out www.greendatastorage.com, which is now also pointed to by www.thegreenandvirtualdatacenter.com and has a new look and feel as well as some updated content with more on the way.

Cheers
gs

technorati tags: Green Gap, Green Hype, Green IT, PCFE, The Green and Virtual Data Center, Virtualization, StorageIO, Green Washing

SMB capacity planning; Focusing on energy conservation

Storage I/O trends

Here’s a link to a new tip I wrote that is posted over at SearchSMBStorage on Capacity Planning and energy conservation.

Here are some added links to other recent tips I wrote and posted at a SearchSMBStorage:

Improve your storage energy efficiency

Data protection for virtual server environments

Data footprint reduction for SMBs

Is clustered NAS for SMBs?

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio


Data Protection for Virtual Environments

Storage I/O trends

Server virtualization continues to be a popular industry focus, particularly to discuss IT data center power, cooling, floor space and environmental (PCFE) issues commonly called green computing along with supporting next generation virtualized data center environments. There are many challenges and options related to protecting data and applications in a virtual server environment.

Here’s a link to a new white paper I wrote that looks at various issues and challenges along with various approaches for enabling data protection in virtual environments. This in-depth report explains what your organization needs to know as it moves into a virtual realm. Topics include background and issues, a glossary of common virtual terms, re-architecting data protection, technologies and techniques, virtual machine movement, industry trends and much more.

The report is called Data Protection Options for Virtualized Servers: Demystifying Virtual Server Data Protection. Have a look for yourself.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio


Spring 2008 Storage Decisions Wrap-Up

Once again the Techtarget (TT) folks put on a great event at the spring 2008 edition of the Storage Decisions (SD) event in Chicago; tip of the hat to the whole TT crew. SD is known as an IT consumer/user event, as opposed to industry events like SNW that are known as vendor-to-vendor networking events. TT has added a new forum over the past year that occurs the day/night before SD, focused on the channel and var audiences, with a dinner networking seminar called StorageStrategies. While SD continues to be focused on the IT consumer aka user, the TT channel program is a means for vendors to get in front of prospective channel partners to tell their story and the value proposition of why they should be partnered with. It's a fun and growing event that I have been involved with for over a year now, talking with the channel folks about issues and opportunities to address the various needs of IT organizations. If you are a vendor looking to expand your channel presence, or a channel partner var looking for new solutions, technologies and partners, these events are a great way of networking.

The main focus last week however was the SD event, which had a great turnout of around 550 IT and storage professionals (not counting vendors, exhibitors, vars, media and analysts). To put the attendance in perspective compared to other events, I guess you could virtualize the attendance of IT folks at about 65,431; however, the real number quoted by TT and observed (during the sessions, lunch and so forth) was in the mid-500s (not including vendors, exhibitors, vars, media, analysts, hotel personnel, stumping politicians, high school marching bands, tour groups and the homeless). Talking with vendors and exhibitors, the consensus was that they were either getting a boat load of good leads, or getting actual appointments and meetings for near term opportunities that might help their sales reps win or buy a new boat, car, home, or cup of coffee.

Having been both a customer and a vendor before becoming an analyst years ago, it's fun to walk the exhibit area listening to and watching the different approaches and pitches by the booth personnel. Some are focused on just getting leads, some on showing you their demo, some on how well they memorized their buzzword-laden sales pitch; some can even give you their elevator value prop pitch in less than 30 seconds to get you to stay for another five minutes, prompting a rescheduling to give them another 20 minutes of time. I'm still waiting for some vendor to bring in the carnival midway skills game where participants use a water gun or other item to knock a particular vendor's competitor's logo down on a target, or to knock down various IT issues.

In between all my meetings, presentations, recording some new video techtalks (data footprint reduction, hot topics for the channel, clustered storage and NAS for SMBs) and other activity at the recent Storage Decisions event in Chicago this past week, I was able to meet up with some friends and former co-workers for a relaxing dinner at Buddy Guy's Legends across the street from the event hotel. Performing on stage was Vino Louden, who plays the guitar with the soul and feeling of Stevie Ray Vaughan and the creative flair of Jimmy Page, backed by his three-man band. If you have never been to Legends you still have time to go, as the joint is staying open until their new facility is ready.

As soon as TechTarget posts the links to the session presentations, including my talks on "Clustered Storage and NAS" (which included Web 2.0 and bulk storage) as well as my talk about "Green and Energy Efficient Storage", I will post them on this blog.

Cheers
GS

Hot Storage Topics Converge on Chicago Next Week

Storage I/O trends

Next week in Chicago (May 12th) at Storage Strategies, the event for channel professionals held the evening before StorageDecisions, I will be talking about Hot Storage Topics for 2008, including data protection for virtual environments; power, cooling, floor space and environmental (PCFE), aka green, items and the "Green Gap"; data footprint reduction using real-time data compression for on-line active and changing data, archiving for in-active or dormant data, and de-dupe for backup data. Also on the list of hot topics will be clustered NAS and clustered storage for Web 2.0, along with other timely and relevant items.

At the StorageDecisions event, I will be talking about "Green and Environmentally Friendly Storage" Tuesday morning May 13th in the presentation "Practical Ways to Achieve Energy Efficiency: Power, Cooling, Floor-Space and Environmental (PCFE) Issues and Trends", looking at different issues including the "Green Gap" or disconnect between messaging and common IT data center issues, along with various options to boost efficiency for both active and in-active data and storage resources.

Also while at StorageDecisions next week, on Wednesday the 14th I will be talking about clustered storage including clustered NAS in the session "Clustered Storage: From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0". Given some recent vendor technology announcements and statements of direction, Web 2.0 and unstructured data are gaining popularity, as are the confusing options or different types of clustered storage solutions, including "Cluster Wanna Be's". If you are in Chicago next week, stop in and check out the event, and if you can attend any of my sessions, stop by and say hello.

Cheers
GS