Has FCoE entered the trough of disillusionment?

This is part of an ongoing series of short industry trends and perspectives blog posts briefs based on what I am seeing and hearing in my conversations with IT professionals on a global basis.

These short posts compliment other longer posts along with traditional industry trends and perspective white papers, research reports, videos, podcasts, webcasts as well as solution brief content found a www.storageioblog.com/reports and www.storageio.com/articles.

Has FCoE (Fibre Channel over Ethernet) entered the trough of disillusionment?

IMHO Yes and that is not a bad thing if you like FCoE (which I do among other technologies).

The reason I think that it is good that FCoE is in or entering the trough is not that I do not believe in FCoE. Instead, the reason is that most if not all technologies that are more than a passing fad often go through a hype and early adopter phase before taking a breather prior to broader longer term adoption.

Sure there are FCoE solutions available including switches, CNAs and even storage systems from various vendors. However, FCoE is still very much in its infancy and maturing.

Based on conversations with IT customer professionals (e.g those that are not vendor, vars, consultants, media or analysts) and hearing their plans, I believe that FCoE has entered the proverbial trough of disillusionment which is a good thing in that FCoE is also ramping up for deployment.

Another common question that comes up regarding FCoE as well as other IO networking interfaces, transports and protocols is if they are temporal (temporary short life span) technologies.

Perhaps in the scope that all technologies are temporary however it is their temporal timeframe that should be of interest. Given that FCoE will probably have at least a ten to fifteen year temporal timeline, I would say in technology terms it has a relative long life for supporting coexistence on the continued road to convergence which appears to be around Ethernet.

That is where I feel FCoE is at currently, taking a break from the initial hype, maturing while IT organizations begin planning for its future deployment.

I see FCoE as having a bright future coexisting with other complimentary and enabling technologies such as IO Virtualization (IOV) including PCI SIG MRIOV, Converged Networking, iSCSI, SAS and NAS among others.

Keep in mind that FCoE does not have to be seen as competitive to iSCSI or NAS as they all can coexist on a common DCB/CEE/DCE environment enabling the best of all worlds not to mention choice. FCoE along with DCB/CEE/DCE provides IT professionals with choice options (e.g. tiered I/O and networking) to align the applicable technology to the task at hand for physical or

Again, the questions pertaining to FCoE for many organizations, particularly those not going to iSCSI or NAS for all or part of their needs should be when, where and how to deploy.

This means that for those with long lead time planning and deployment cycles, now is the time to putting your strategy into place for what you will be doing over the next couple of years if not sooner.

For those interested, here is a link (may require registration) to a good conversation taking place over on IT Toolbox regarding FCoE and other related themes that may be of interest.

Here are some links to additional related material:

FCoE Infrastructure Coming Together
2010 and 2011 Trends, Perspectives and Predictions: More of the same?
SNWSpotlight: 8G FC and FCoE, Solid State Storage
NetApp and Cisco roll out vSphere compatible FCoE solutions
Fibre Channel over Ethernet FAQs
Fast Fibre Channel and iSCSI switches deliver big pipes to virtualized SAN environments.
Poll: Networking Convergence, Ethernet, InfiniBand or both?
I/O Virtualization (IOV) Revisited
Will 6Gb SAS kill Fibre Channel?
Experts Corner: Q and A with Greg Schulz at StorageIO
Networking Convergence, Ethernet, Infiniband or both?
Vendors hail Fibre Channel over Ethernet spec
Cisco, NetApp and VMware combine for ‘end-to-end’ FCoE storage
FCoE: The great convergence, or not?
I/O virtualization and Fibre Channel over Ethernet (FCoE): How do they differ?
Chapter 9 – Networking with your servers and storage: The Green and Virtual Data Center (CRC)
Resilient Storage Networks: Designing Flexible Scalable Data Infrastructures (Elsevier)

That is all for now, hope you find these ongoing series of current or emerging Industry Trends and Perspectives posts of interest.

Of course let me know what your thoughts and perspectives are on this and other related topics.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

August 3, 2010October 17, 2024

Data footprint reduction (Part 1): Life beyond dedupe and changing data lifecycles

Over the past couple of weeks there has been a flurry of IT industry activity around data footprint impact reduction with Dell buying Ocarina and IBM acquiring Storwize. For those who want the quick (compacted, reduced) synopsis of what Dell buying Ocarina as well as IBM acquiring Storwize means read this post here along with some of my comments here and here.

Now, before any Drs or Divas of Dedupe get concerned and feel the need to debate dedupes expanding role, success or applicability, relax, take a deep breath, then read on and take another breath before responding if so inclined.

The reason I mention this is that some may mistake this as a piece against or not in favor of dedupe as it talks about life beyond dedupe which could be mistaken as indicating dedupes diminished role which is not the case (read ahead and see figure 5 to see the bigger picture).

Likewise some might feel that since this piece talks about archiving for compliance and non regulatory situations along with compression, data management and other forms of data footprint reduction they may be compelled to defend dedupes honor and future role.

Again, relax, take a deep breath and read on, this is not about the death of dedupe.

Now for others, you might wonder why the dedupe tongue in check humor mentioned above (which is what it is) and the answer is quite simple. The industry in general is drunk on dedupe and in some cases thus having numbed its senses not to mention having blurred its vision of the even bigger opportunities for the business benefits of data footprint reduction beyond todays backup centric or vmware server virtualization dedupe discussions.

Likewise, it is time for the industry to wake (or sober) up and instead of trying to stuff everything under or into the narrowly focused dedupe bottle. Instead, realize that there is a broader umbrella called data footprint impact reduction which includes among other techniques, dedupe, archive, compression, data management, data deletion and thin provisioning across all types of data and applications. What this means is a broader opportunity or market than what exists or being discussed today leveraging different techniques, technologies and best practices.

Consequently this piece is about expanding the discussion to the larger opportunity for vendors or vars to extend their focus to the bigger world of overall data footprint impact reduction beyond where currently focused. Likewise, this is about IT customers realizing that there are more opportunities to address data and storage optimization across your entire organization using various techniques instead of just focusing on backup.

In other words, there is a very bright future for dedupe as well as other techniques and technologies that fall under the data footprint reduction umbrella including data stored online, offline, near line, primary, secondary, tertiary, virtual and in a public or private cloud..

Before going further however lets take a step back and look at some business along with IT issues, challenges and opportunities.

What is the business and IT issue or challenge?
Given that there is no such thing as a data or information recession shown in figure 1, IT organizations of all size are faced with the constant demand to store more data, including multiple copies of the same or similar data, for longer periods of time.

Figure 1: IT resource demand growth continues

The result is an expanding data footprint, increased IT expenses, both capital and operational, due to additional Infrastructure Resource Management (IRM) activities to sustain given levels of application Quality of Service (QoS) delivery shown in figure 2.

Some common IT costs associated with supporting an increased data footprint include among others:

Data storage hardware and management software tools acquisition
Associated networking or IO connectivity hardware, software and services
Recurring maintenance and software renewal fees
Facilities fees for floor space, power and cooling along with IT staffing
Physical and logical security for data and IT resources
Data protection for HA, BC or DR including backup, replication and archiving

Figure 2: IT Resources and cost balancing conflicts and opportunities

Figure 2 shows the result is that IT organizations of all size are faced with having to do more with what they have or with less including maximizing available resources. In addition, IT organizations often have to overcome common footprint constraints (available power, cooling, floor space, server, storage and networking resources, management, budgets, and IT staffing) while supporting business growth.

Figure 2 also shows that to support demand, more resources are needed (real or virtual) in a denser footprint, while maintaining or enhancing QoS plus lowering per unit resource cost. The trick is improving on available resources while maintaining QoS in a cost effective manner. By comparison, traditionally if costs are reduced, one of the other curves (amount of resources or QoS) are often negatively impacted and vice versa. Meanwhile in other situations the result can be moving problems around that later resurface elsewhere. Instead, find, identify, diagnose and prescribe the applicable treatment or form of data footprint reduction or other IT IRM technology, technique or best practices to cure the ailment.

What is driving the expanding data footprint?
Granted more data can be stored in the same or smaller physical footprint than in the past, thus requiring less power and cooling per Gbyte, Tbyte or PByte. Data growth rates necessary to sustain business activity, enhanced IT service delivery and enable new applications are placing continued demands to move, protect, preserve, store and serve data for longer periods of time.

The popularity of rich media and Internet based applications has resulted in explosive growth of unstructured file data requiring new and more scalable storage solutions. Unstructured data includes spreadsheets, Power Point, slide decks, Adobe PDF and word documents, web pages, video and audio JPEG, MP3 and MP4 files. This trend towards increasing data storage requirements does not appear to be slowing anytime soon for organizations of all sizes.

After all, there is no such thing as a data or information recession!

Changing data access lifecycles
Many strategies or marketing stories are built around the premise that shortly after data is created data is seldom, if ever accessed again. The traditional transactional model lends itself to what has become known as information lifecycle management (ILM) where data can and should be archived or moved to lower cost, lower performing, and high density storage or even deleted where possible.

Figure 3 shows as an example on the left side of the diagram the traditional transactional data lifecycle with data being created and then going dormant. The amount of dormant data will vary by the type and size of an organization along with application mix.

Figure 3: Changing access and data lifecycle patterns

However, unlike the transactional data lifecycle models where data can be removed after a period of time, Web 2.0 and related data needs to remain online and readily accessible. Unlike traditional data lifecycles where data goes dormant after a period of time, on the right side of figure 3, data is created and then accessed on an intermittent basis with variable frequency. The frequency between periods of inactivity could be hours, days, weeks or months and, in some cases, there may be sustained periods of activity.

A common example is a video or some other content that gets created and posted to a web site or social networking site such as Face book, Linked in, or You Tube among others. Once the content is discussed, while it may not change, additional comment and collaborative data can be wrapped around the data as additional viewers discover and comment on the content. Solution approaches for the new category and data lifecycle model include low cost, relative good performing high capacity storage such as clustered bulk storage as well as leveraging different forms of data footprint reduction techniques.

Given that a large (and growing) percentage of new data is unstructured, NAS based storage solutions including clustered, bulk, cloud and managed service offerings with file based access are gaining in popularity. To reduce cost along with support increased business demands (figure 2), a growing trend is to utilize clustered, scale out and bulk NAS file systems that support NFS, CIFS for concurrent large and small IOs as well as optionally pNFS for large parallel access of files. These solutions are also increasingly being deployed with either built in or add on accessorized data footprint reduction techniques including archive, policy management, dedupe and compression among others.

What is your data footprint impact?
Your data footprint impact is the total data storage needed to support your various business application and information needs. Your data footprint may be larger than how much actual data storage you have as seen in figure 4. In Figure 4, an example is an organization that has 20TBytes of storage space allocated and being used for databases, email, home directories, shared documents, engineering documents, financial and other data in different formats (structured and unstructured) not to mention varying access patterns.

Figure 4: Expanding data footprint due to data proliferation and copies being retained

Of the 20TBytes of data allocated and used, it is very likely that the consumed storage space is not 100 percent used. Database tables may be sparsely (empty or not fully) allocated and there is likely duplicate data in email and other shared documents or folders. Additionally, of the 20TBytes, 10TBytes are duplicated to three different areas on a regular basis for application testing, training and business analysis and reporting purposes.

The overall data footprint is the total amount of data including all copies plus the additional storage required for supporting that data such as extra disks for Redundant Array of Independent Disks (RAID) protection or remote mirroring.

In this overly simplified example, the data footprint and subsequent storage requirement are several times that of the 20TBytes of data. Consequently, the larger the data footprint the more data storage capacity and performance bandwidth needed, not to mention being managed, protected and housed (powered, cooled, situated in a rack or cabinet on a floor somewhere).

Data footprint reduction techniques
While data storage capacity has become less expensive on a relative basis, as data footprint continue to expand in order to support business requirements, more IT resources will be needed to be made available in a cost effective, yet QoS satisfying manner (again, refer back to figure 2). What this means is that more IT resources including server, storage and networking capacity, management tools along with associated software licensing and IT staff time will be required to protect, preserve and serve information.

By more effectively managing the data footprint across different applications and tiers of storage, it is possible to enhance application service delivery and responsiveness as well as facilitate more timely data protection to meet compliance and business objectives. To realize the full benefits of data footprint reduction, look beyond backup and offline data improvements to include online and active data using various techniques such as those in table 1 among others.

There are several methods (shown in table 1) that can be used to address data footprint proliferation without compromising data protection or negatively impacting application and business service levels. These approaches include archiving of structured (database), semi structured (email) and unstructured (general files and documents), data compression (real time and offline) and data deduplication.

	Archiving	Compression	Deduplication
When to use	Structured (database), email and unstructured	Online (database, email, file sharing), backup or archive	Backup or archiving or recurring and similar data
Characteristic	Software to identify and remove unused data from active storage devices	Reduce amount of data to be moved (transmitted) or stored on disk or tape.	Eliminate duplicate files or file content observed over a period of time to reduce data footprint
Examples	Database, email, unstructured file solutions with archive storage	Host software, disk or tape, (network routers) and compression appliances or software as well as appearing in some primary storage system solutions	Backup and archive target devices and Virtual Tape Libraries (VTLs), specialized appliances
Caveats	Time and knowledge to know what and when to archive and delete, data and application aware	Software based solutions require host CPU cycles impacting application performance	Works well in background mode for backup data to avoid performance impact during data ingestion

Table 1: Data footprint reduction approaches and techniques

Archiving for compliance and general data retention
Data archiving is often perceived as a solution for compliance, however, archiving can be used for many other non compliance purposes. These include general data footprint reduction, to boost performance and enhance routine data maintenance and data protection. Archiving can be applied to structured databases data, semi structured email data and attachments and unstructured file data.

A key to deploying an archiving solution is having insight into what data exists along with applicable rules and policies to determine what can be archived, for how long, how many copies and how data ultimately may be finally retired or deleted. Archiving requires a combination of hardware, software and people to implement business rules.

A challenge with archiving is having the time and tools available to identify what data should be archived and what data can be securely destroyed when no longer needed. Further complicating archiving is that knowledge of the data value is also needed; this may well include legal issues as to who is responsible for making decisions on what data to keep or discard.

If a business can invest in the time and software tools, as well as identify which data to archive to support an effective archive strategy, the returns can be very positive towards reducing the data footprint without limiting the amount of information available for use.

Data compression (real time and offline)
Data compression is a commonly used technique for reducing the size of data being stored or transmitted to improve network performance or reduce the amount of storage capacity needed for storing data. If you have used a traditional or TCP/IP based telephone or cell phone, watched either a DVD or HDTV, listened to an MP3, transferred data over the internet or used email you have most likely relied on some form of compression technology that is transparent to you. Some forms of compression are time delayed, such as using PKZIP to zip files, while others are real time or on the fly based such as when using a network, cell phone or listening to an MP3.

Two different approaches to data compression that vary in time delay or impact on application performance along with the amount of compression and loss of data are loss less (no data loss) and lossy (some data loss for higher compression ratio). In addition to these approaches, there are also different implementations of including real time for no performance impact to applications and time delayed where there is a performance impact to applications.

In contrast to traditional ZIP or offline, time delayed compression approaches that require complete decompression of data prior to modification, online compression allows for reading from, or writing to, any location within a compressed file without full file decompression and resulting application or time delay. Real time appliance or target based compression capabilities are well suited for supporting online applications including databases, OLTP, email, home directories, web sites and video streaming among others without consuming host server CPU or memory resources or degrading storage system performance.

Note that with the increase of CPU server processing performance along with multiple cores, server based compression running in applications such as database, email, file systems or operating systems can be a viable option for some environments.

A scenario for using real time data compression is for time sensitive applications that require large amounts of data such as online databases, video and audio media servers, web and analytic tools. For example, databases such as Oracle support NFS3 Direct IO (DIO) and Concurrent IO (CIO) capabilities to enable random and direct addressing of data within an NFS based file. This differs from traditional NFS operations where a file would be sequential read or written.

Another example of using real time compression is to combine a NAS file server configured with 300GB or 600GB high performance 15.5K Fibre Channel or SAS HDDs in addition to flash based SSDs to boost the effective storage capacity of active data without introducing a performance bottleneck associated with using larger capacity HDDs. Of course, compression would vary with the type of solution being deployed and type of data being stored just as dedupe ratios will differ depending on algorithm along with if text or video or object based among other factors.

Deduplication (Dedupe)
Data deduplication (also known as single instance storage, commonalty factoring, data difference or normalization) is a data footprint reduction technique that eliminates the occurrence of the same data. Deduplication works by normalizing the data being backed up or stored by eliminating recurring or duplicate copies of files or data blocks depending on the implementation.

Some data deduplication solutions boast spectacular ratios for data reduction given specific scenarios, such as backup of repetitive and similar files, while providing little value over a broader range of applications.

This is in contrast with traditional data compression approaches that provide lower, yet more predictable and consistent data reduction ratios over more types of data and application, including online and primary storage scenarios. For example, in environments where there is little to no common or repetitive data files, data deduplication will have little to no impact while data compression generally will yield some amount of data footprint reduction across almost all types of data.

Some data deduplication solution providers have either already added, or have announced plans to add, compression techniques to compliment and increase the data footprint effectiveness of their solutions across a broader range of applications and storage scenarios, attesting to the value and importance of data compression to reduce data footprint.

When looking at deduplication solutions, determine if the solution is designed to scale in terms of performance, capacity and availability over a large amount of data along with how restoration of data will be impacted by scaling for growth. Other items to consider include how data is reduplicated, such as real time using inline or some form of time delayed post processing, and the ability to select the mode of operation.

For example, a dedupe solution may be able to process data at a specific ingest rate inline until a certain threshold is hit and then processing reverts to post processing so as to not cause a performance degradation to the application writing data to the deduplication solution. The downside of post processing is that more storage is needed as a buffer. It can, however, also enable solutions to scale without becoming a bottleneck during data ingestion.

However, there is life beyond dedupe which is to in no way diminish dedupe or its very strong and bright future, one that Im increasingly convinced of having talked with hundreds of IT professionals (e.g. the customers) is that only the surface is being scratched for dedupe, not to mention larger data footprint impact opportunity seen in figure 5.

Figure 5: Dedupe adoption and deployment waves over time

While dedupe is a popular technology from a discussion standpoint and has good deployment traction, it is far from reaching mass customer adoption or even broad coverage in environments where it is being used. StorageIO research shows broadest adoption of dedupe centered around backup in smaller or SMB environments (dedupe deployment wave one in figure 5) with some deployment in Remote Office Branch Office (ROBO) work groups as well as departmental environments.

StorageIO research also shows that complete adoption in many of those SMB, ROBO, work group or smaller environments has yet to reach 100 percent. This means that there remains a large population that has yet to deploy dedupe as well as further opportunities to increase the level of dedupe deployment by those already doing so.

There has also been some early adoption in larger core IT environments where dedupe coexists with complimenting existing data protection and preservation practices. Another current deployment scenario for dedupe has been for supporting core edge deployments in larger environments that provide support for backup and data protection of ROBO, work group and departmental systems.

Note that figure 5 simply shows the general types of environments in which dedupe is being adopted and not any sort of indicators as to the degree of deployment by a given customer or IT environment.

What to do about your expanding data footprint impact?
Develop an overall data foot reduction strategy that leverages different techniques and technologies addressing online primary, secondary and offline data. Assess and discover what data exists and how it is used in order to effectively manage storage needs.

Determine policies and rules for retention and deletion of data combining archiving, compression (online and offline) and dedupe in a comprehensive data footprint strategy. The benefit of a broader, more holistic, data footprint reduction strategy is the ability to address the overall environment, including all applications that generate and use data as well as IRM or overhead functions that compound and impact the data footprint.

Data footprint reduction: life beyond (and complimenting) dedupe
The good news is that the Drs. and Divas of dedupe marketing (the ones who also are good at the disco dedupe dance debates) have targeted backup as an initial market sweet (and success) spot shown in figure 5 given the high degree of duplicate data.

Figure 6: Leverage multiple data footprint reduction techniques and technologies

However that same good news is bad news in that there is now a stigma that dedupe is only for backup, similar to how archive was hijacked by the compliance marketing folks in the post Y2K era. There are several techniques that can be used individually to address specific data footprint reduction issues or in combination as seen in figure 7 to implement a more cohesive and effective data footprint reduction strategy.

Figure 7: How various data footprint reduction techniques are complimentary

What this means is that both archive, dedupe as well as other forms of data footprint reduction can and should be used beyond where they have been target marketed using the applicable tool for the task at hand. For example, a common industry rule of thumb is that on average, ten percent of data changes per day (your mileage and rate of change will certainly vary given applications, environment and other factors).

Now assuming that you have 100TB (feel free to subtract a zero or two, or add as many as needed) of data (note I did not say storage capacity or percent utilized), ten percent change would be 10TB that needs to be backed up, replicated and so forth. Now with basic 2 to 1 streaming tape compression (2.5 to 1 in upcoming LTO enhancements) would reduce the daily backup footprint from 10TB to 5TB.

Using dedupe with 10 to 1 would get that from 10TB down to 1TB or about the size of a large capacity disk drive. With 20 to 1 that cuts the daily backup down to 500GB and so forth. The net effect is that more daily backups can be stored in the same footprint which in turn helps expedite individual file recover by having more options to choose from off of the disk based cache, buffer or storage pool.

On the other hand, if your objective is to reduce and eliminate storage capacity, then the same amount of backups can be stored on less disk freeing up resources. Now take the savings times the number of days in your backup retention and you should see the numbers start to add up.

Now what about the other 90 percent of the data that may not have changed, or, that did change and exists on higher performance storage?

Can its footprint impact be reduced?

The answer should be perhaps or it depends as well as prompts the question of what tool would be best. There is a popular thinking as is often the case with industry buzzwords or technologies to use it everywhere. After all goes the thinking, if it is a good thing why not use and deploy more of it everywhere?

Keep in mind that dedupe trades time to perform thinking and apply intelligence to further reduce data in exchange for space capacity. Thus trading time for space capacity can have a negative impact on applications that need lower response time, higher performance where the focus is on rates vs ratios. For example, the other 90 to 100 percent of the data in the above example may have to be on a mix of high and medium performance storage to meet QoS or service level agreement (SLA) objectives. While it would fun or perhaps cool to try and achieve a high data reduction ratio on the entire 100TB of active data with dedupe (e.g. trying to achieve primary dedupe), the performance impacts could have a negative impact.

The option is to apply a mix of different data footprint reduction techniques across the entire 100TB. That is, use dedupe where applicable and higher reduction ratios can be achieved while balancing performance, compression used for streaming data to tape for retention or archive as well as in databases or other applications software not to mention in networks. Likewise, use real time compression or what some refer to as primary dedupe for online active changing data along with online static read only data.

Deploy a comprehensive data footprint reduction strategy combining various techniques and technologies to address point solution needs as well as the overall environment, including online, near line for backup, and offline for archive data.

Lets not forget about archiving, thin provisioning, space saving snapshots, commonsense data management among other techniques across the entire environment. In other words, if your focus is just on dedupe for backup to
achieve an optimized and efficient storage environment, you are also missing

out on a larger opportunity. However, this also means having multiple tools or

technologies in your IT IRM toolbox as well as understanding what to use when, where and why.

Data transfer rates is a key metric for performance (time) optimization such as meeting backup or restore or other data protection windows. Data reduction ratios is a key metric for capacity (space) optimization where the focus is on storing as much data in a given footprint

Some additional take away points:

Develop a data footprint reduction strategy for online and offline data
Energy avoidance can be accomplished by powering down storage
Energy efficiency can be accomplished by using tiered storage to meet different needs
Measure and compare storage based on idle and active workload conditions
Storage efficiency metrics include IOPS or bandwidth per watt for active data
Storage capacity per watt per footprint and cost is a measure for in active data
Small percentage reductions on a large scale have big benefits
Align the applicable form of virtualization for the given task at hand

Some links for additional reading on the above and related topics

Data footprint reduction (Part 2): Dell, IBM, Ocarina and Storwize
Chapter 8 and 10: The Green and Virtual Data Center (CRC)
Business Benefits of Data Footprint Impact Reduction
Long-Term Data Protection and Retention – Finding the Correct Balance
Application Transparency and Co-existence with Real-Time Data Compression
Enabling a Green and Energy Efficient Storage with Real-time Compression
Real-time Data Compression Integrity and Reliability
Real-Time Data Compression Performance Considerations
Real-time Data Compression for On-line Active Data
Application Agnostic Real-time Data Compression
Saving Money with Green IT: Time To Invest In Information Factories
Storage Efficiency and Optimization – The Other Green
Shifting from energy avoidance to energy efficiency
Industry Trends and Perspectives: Tape, Disk and Dedupe Coexistence
The Many Flavors of Deduplication
Experts Share De-Dupe Insights
Business Benefits of Policy Based Data De-Duplication
Comments on IBM buying Storwize primary compression
Comments on Dell buying Ocarina and primary

Wrap up (for now, read part II here)

For some applications reduction ratios are an important focus on the tools or modes of operations that achieve those results.

Likewise for other applications where the focus is on performance with some data reduction benefit, tools are optimized for performance first and reduction secondary.

Thus I expect messaging from some vendors to adjust (expand) to those capabilities that they have in their toolboxes (product portfolios) offerings

Consequently, IMHO some of the backup centric dedupe solutions may find themselves in niche roles in the future unless they can diversity. Vendors with multiple data footprint reduction tools will also do better than those with only a single function or focused tool.

However for those who only have a single or perhaps a couple of tools, well, guess what the approach and messaging will be.

After all, if all you have is a hammer everything looks like a nail, if all you have is a screw driver, well, you get the picture.

On the other hand, if you are still not clear on what all this means, send me a note, give a call, post a comment or a tweet and will be happy to discuss with you.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

June 10, 2010February 3, 2023

Initial Virtumania Appearance (Episode 14) with fellow vExperts

This past week I was invited to join some fellow vExperts as a first time guest on Rich Brambleys (@rbrambley and VMETC) podcast show called Virtumania.

Episode 14 (Virtualization and Networking Turf Wars) had as a theme as you can guest themes around physical, logical and virtual networking for virtual servers along with some of the politics and turf battles associated with managing those entities.

Also on the show were cohost Marc Farley (@3parfarley) of 3Par and StorageRap.com as well as regular guest Rick Vanover (@rickvanover) of RickVanover.com and other special guest David Davis (@davidmdavis) vmwarevideos.com in addition to myself.

For some fun, there is even some reference to rival gangs dancing for superiority in the Michael Jackson music video "Bad" which was produced by Greg Knieriemen (@knieriemen) of Chi Corporation for this Infosmack Production.

Check out the show here or here.

BTW: Is it just me or does Rich Brambley sound a little bit like Tom Petty without the accent?

Thanks guys, enjoyed being a guest on the show as well as talking with you all, hope to be able to do it again sometime soon.

Cheers gs

Greg Schulz – Author The Green and Virtual Data Center (CRC) and Resilient Storage Networks (Elsevier)
twitter @storageio

June 10, 2010May 17, 2021

VMware vExpert 2010: Thank You, Im Honored to be named a Member

This week while traveling I received an email note from John Troyer of VMware informing me that I have been nominated and selected as a VMware vExpert for 2010.

To say that I was surprised and honored would be an understatement.

Thus, I would like to thank all those involved in the nominations, evaluation and selection process for being named to this esteemed group.

I would also like to say congratulations, best wishes and hello to all of the other 2010 vExperts. Im Looking forward to being involved and participating in the VMware vExpert community.

Cheers gs

Greg Schulz – Author The Green and Virtual Data Center (CRC) and Resilient Storage Networks (Elsevier)
twitter @storageio

May 10, 2010January 1, 2022

EMC VPLEX: Virtual Storage Redefined or Respun?

In a flurry of announcements that coincide with EMCworld occurring in Boston this week of May 10 2010 EMC officially unveiled the Virtual Storage vision initiative (aka twitter hash tag of #emcvs) and initial VPLEX product. The Virtual Storage initiative was virtually previewed back in March (See my previous post here along with one from Stu Miniman (twitter @stu) of EMC here or here) and according to EMC the VPLEX product was made generally available (GA) back in April.

The Virtual Storage vision and associated announcements consisted of:

Virtual Storage vision – Big picture initiative view of what and how to enable private clouds
VPLEX architecture – Big picture view of federated data storage management and access
First VPLEX based product – Local and campus (Metro to about 100km) solutions
Glimpses of how the architecture will evolve with future products and enhancements

Figure 1: EMC Virtual Storage and Virtual Server Vision and Big Pictures

The Big Picture
The EMC Virtual Storage vision (Figure 1) is the foundation of a private IT cloud which should enable characteristics including transparency, agility, flexibility, efficient, always on, resiliency, security, on demand and scalable. Think of it this way, EMC wants to enable and facilitate for storage what is being done by server virtualization hypervisor vendors including VMware (which happens to be owned by EMC), Microsoft HyperV and Citrix/Xen among others. That is, break down the physical barriers or constraints around storage similar to how virtual servers release applications and their operating systems from being tied to a physical server.

While the current focus of desktop, server and storage virtualization has been focused on consolidation and cost avoidance, the next big wave or phase is life beyond consolidation where the emphasis expands to agility, flexibility, ease of use, transparency, and portability (Figure 2). In the next phase which puts an emphasis around enablement and doing more with what you have while enhancing business agility focus extends from how much can be consolidated or the number of virtual machines per physical machine to that of using virtualization for flexibility, transparency (read more here and here or watch here).

Figure 2: Virtual Storage Big Picture

That same trend will be happening with storage where the emphasis also expands from how much data can be squeezed or consolidated onto a given device to that of enabling flexibility and agility for load balancing, BC/DR, technology upgrades, maintenance and other routine Infrastructure Resource Management (IRM) tasks.

For EMC, achieving this vision (both directly for storage, and indirectly for servers via their VMware subsidiary) is via local and distributed (metro and wide area) federation management of physical resources to support virtual data center operations. EMC building blocks for delivering this vision including VPLEX, data and storage management federation across EMC and third party products, FAST (fully automated storage tiering), SSD, data protection and data footprint reduction and data protection management products among others.

Buzzword bingo aside (e.g. LAN, SAN, MAN, WAN, Pots and Pans) along with Automation, DWDM, Asynchronous, BC, BE or Back End, Cache coherency, Cache consistency, Chargeback, Cluster, db loss, DCB, Director, Distributed, DLM or Distributed Lock Management, DR, Foe or Fibre Channel over Ethernet, FE or Front End, Federated, FAST, Fibre Channel, Grid, HyperV, Hypervisor, IRM or Infrastructure Resource Management, I/O redirection, I/O shipping, Latency, Look aside, Metadata, Metrics, Public/Private Cloud, Read ahead, Replication, SAS, Shipping off to Boston, SRA, SRM, SSD, Stale Reads, Storage virtualization, Synchronization, Synchronous, Tiering, Virtual storage, VMware and Write through among many other possible candidates the big picture here is about enabling flexibility, agility, ease of deployment and management along with boosting resource usage effectiveness and presumably productivity on a local, metro and future global basis.

Figure 3: EMC Storage Federation and Enabling Technology Big Picture

The VPLEX Big Picture
Some of the tenants of the VPLEX architecture (Figure 3) include a scale out cluster or grid design for local and distributed (metro and wide area) access where you can start small and evolve as needed in a predictable and deterministic manner.

Figure 4: Generic Virtual Storage (Local SAN and MAN/WAN) and where VPLEX fits

The VPLEX architecture is targeted towards enabling next generation data centers including private clouds where ease and transparency of data movement, access and agility are essential. VPLEX sits atop existing EMC and third party storage as a virtualization layer between physical or virtual servers and in theory, other storage systems that rely on underlying block storage. For example in theory a NAS (NFS, CIFS, and AFS) gateway, CAS content archiving or Object based storage system or purpose specific database machine could sit between actual application servers and VPLEX enabling multiple layers of flexibility and agility for larger environments.

At the heart of the architecture is an engine running a highly distributed data caching algorithm that uses an approach where a minimal amount of data is sent to other nodes or members in the VPLEX environment to reduce overhead and latency (in theory boosting performance). For data consistency and integrity, a distributed cache coherency model is employed to protect against stale reads and writes along with load balancing, resource sharing and failover for high availability. A VPLEX environment consists of a federated management view across multiple VPLEX clusters including the ability to create a stretch volume that is accessible across multiple VPLEX clusters (Figure 5).

Figure 5: EMC VPLEX Big Picture

Figure 6: EMC VPLEX Local with 1 to 4 Engines

Each VPLEX local cluster (Figure 6) is made up of 1 to 4 engines (Figure 7) per rack with each engine consisting of two directors each having 64GByte of cache, localized compute Intel processors, 16 Front End (FE) and 16 Back End (BE) Fibre Channel ports configured in a high availability (HA). Communications between the directors and engines is Fibre Channel based. Meta data is moved between the directors and engines in 4K blocks to maintain consistency and coherency. Components are fully redundant and include phone home support.

Figure 7: EMC VPLEX Engine with redundant directors

VPLEX initially host servers supported include VMware, Cisco UCS, Windows, Solaris, IBM AIX, HPUX and Linux along with EMC PowerPath and Windows multipath management drivers. Local server clusters supported include Symantec VCS, Microsoft MSCS and Oracle RAC along with various volume mangers. SAN fabric connectivity supported includes Brocade and Cisco as well as Legacy McData based products.

VPLEX also supports cache (Figure 8 ) write thru to preserve underlying array based functionality and performance with 8,000 total virtualized LUNs per system. Note that underlying LUNs can be aggregated or simply passed through the VPLEX. Storage that attaches to the BE Fibre Channel ports include EMC Symmetrix VMAX and DMX along with CLARiiON CX and CX4. Third party storage supported includes HDS9000 and USPV/VM along with IBM DS8000 and others to be added as they are certified. In theory given that the VPLEX presents block based storage to hosts; one would also expect that NAS, CAS or other object based gateways and servers that rely on underlying block storage to also be supported in the future.

Figure 8: VPLEX Architecture and Distributed Cache Overview

Functionality that can be performed between the cluster nodes and engines with VPLEX include data migration and workload movement across different physical storage systems or sites along with shared access with read caching on a local and distributed basis. LUNS can also be pooled across different vendors underlying storage solutions that also retain their native feature functionality via VPLEX write thru caching.

Reads from various servers can be resolved by any node or engine that checks their cache tables (Figure 8 ) to determine where to resolve the actual I/O operation from. Data integrity checks are also maintained to prevent stale reads or write operations from occurring. Actual meta data communications between nodes is very small to enable state fullness while reducing overhead and maximizing performance. When a change to cache data occurs, meta information is sent to other nodes to maintain the distributed cache management index schema. Note that only pointers to where data and fresh cache entries reside are what is stored and communicated in the meta data via the distributed caching algorithm.

Figure 9: EMC VPLEX Metro Today

For metro deployments, two clusters (Figure 9) are utilized with distances supported up to about 100km or about 5ms of latency in a synchronous manner utilizing long distance Fibre Channel optics and transceivers including Dense Wave Division Multiplexing (DWDM) technologies (See Chapter 6: Metropolitan and Wide Area Storage Networking in Resilient Storage Networking (Elsevier) for additional details on LAN, MAN and WAN topics).

Initially EMC is supporting local or Metro including Campus based VPLEX deployments requiring synchronous communications however asynchronous (WAN) Geo and Global based solutions are planned for the future (Figure 10).

Figure 10: EMC VPLEX Future Wide Area and Global

Online Workload Migration across Systems and Sites
Online workload or data movement and migration across storage systems or sites is not new with solutions available from different vendors including Brocade, Cisco, Datacore, EMC, Fujitsu, HDS, HP, IBM, LSI and NetApp among others.

For synchronization and data mobility operations such as a VMware Vmotion or Microsoft HyperV Live migration over distance, information is written to separate LUNs in different locations across what are known as stretch volumes to enable non disruptive workload relocation across different storage systems (arrays) from various vendors. Once synchronization is completed, the original source can be disconnected or taken offline for maintenance or other common IRM tasks. Note that at least two LUNs are required, or put another way, for every stretch volume, two LUNs are subtracted from the total number of available LUNs similar to how RAID 1 mirroring requires at least two disk drives.

Unlike other approaches that for coherency and performance rely on either no cached data, or, extensive amounts of cached data along with subsequent overhead for maintaining state fullness (consistency and coherency) including avoiding stale reads or writes, VPLEX relies on a combination of distributed cache lookup tables along with pass thru access to underlying storage when or where needed. Consequently large amounts of data does not need to be cached as well as shipped between VPLEX devices to maintain data consistency, coherency or performance which should also help to keep costs affordable.

Approach is not unique, it is the implementation
Some storage virtualization solutions that have been software based running on an appliance or network switch as well as hardware system based have had a focus of emulating or providing competing capabilities with those of mid to high end storage systems. The premise has been to use lower cost, less feature enabled storage systems aggregated behind the appliance, switch or hardware based system to provide advanced data and storage management capabilities found in traditional higher end storage products.

VPLEX while like any tool or technology could be and probably will be made to do other things than what it is intended for is really focused on, flexibility, transparency and agility as opposed to being used as a means of replacing underlying storage system functionality. What this means is that while there is data movement and migration capabilities including ability to synchronize data across sites or locations, VPLEX by itself is not a replacement for the underlying functionality present in both EMC and third party (e.g. HDS, HP, IBM, NetApp, Oracle/Sun or others) storage systems.

This will make for some interesting discussions, debates and applies to oranges comparisons in particular with those vendors whose products are focused around replacing or providing functionality not found in underlying storage system products.

In a nut shell summary, VPLEX and the Virtual Storage story (vision) is about enabling agility, resiliency, flexibility, data and resource mobility to simply IT Infrastructure Resource Management (IRM). One of the key themes of global storage federation is anywhere access on a local, metro, wide area and global basis across both EMC and heterogeneous third party vendor hardware.

Lets Put it Together: When and Where to use a VPLEX
While many storage virtualization solutions are focused around consolidation or pooling, similar to first wave server and desktop virtualization, the next general broad wave of virtualization is life beyond consolidation. That means expanding the focus of virtualization from consolidation, pooling or LUN aggregation to that of enabling transparency for agility, flexibility, data or system movement, technology refresh and other common time consuming IRM tasks.

Some applications or usage scenarios in the future should include in addition to VMware Vmotion, Microsoft HypverV and Microsoft Clustering along with other host server closuring solutions.

Figure 11: EMC VPLEX Usage Scenarios

Thoughts and Industry Trends Perspectives:

The following are various thoughts, comments, perspectives and questions pertaining to this and storage, virtualization and IT in general.

Is this truly unique as is being claimed?

Interestingly, the message Im hearing out of EMC is not the claim that this is unique, revolutionary or the industries first as is so often the case by vendors, rather that it is their implementation and ability to deploy on a broad perspective basis that is unique. Now granted you will probably hear as is often the case with any vendor or fan boy/fan girl spins of it being unique and Im sure this will also serve up plenty of fodder for mudslinging in the blogsphere, YouTube galleries, twitter land and beyond.

What is the DejaVu factor here?

For some it will be nonexistent, yet for others there is certainly a DejaVu depending on your experience or what you have seen and heard in the past. In some ways this is the manifestation of many vision and initiatives from the late 90s and early 2000s when storage virtualization or virtual storage in an open context jumped into the limelight coinciding with SAN activity. There have been products rolled out along with proof of concept technology demonstrators, some of which are still in the market, others including companies have fallen by the way side for a variety of reasons.

Consequently if you were part of or read or listened to any of the discussions and initiatives from Brocade (Rhapsody), Cisco (SVC, VxVM and others), INRANGE (Tempest) or its successor CNT UMD not to mention IBM SVC, StorAge (now LSI), Incipient (now part of Texas Memory) or Troika among others you should have some DejaVu.

I guess that also begs the question of what is VPLEX, in band, out of band or hybrid fast path control path? From what I have seen it appears to be a fast path approach combined with distributed caching as opposed to a cache centric inband approaches such as IBM SVC (either on a server or as was tried on the Cisco special service blade) among others.

Likewise if you are familiar with IBM Mainframe GDPS or even EMC GDDR as well as OpenVMS Local and Metro clusters with distributed lock management you should also have DejaVu. Similarly if you had looked at or are familiar with any of the YottaYotta products or presentations, this should also be familiar as EMC acquired the assets of that now defunct company.

Is this a way for EMC to sell more hardware along with software products?

By removing barriers enabling IT staffs to support more data on more storage in a denser and more agile footprint the answer should be yes, something that we may see other vendors emulate, or, make noise about what they can or have been doing already.

How is this virtual storage spin different from the storage virtualization story?

That all depends on your view or definition as well as belief systems and preferences for what is or what is not virtual storage vs. storage virtualization. For some who believe that storage virtualization is only virtualization if and only if it involves software running on some hardware appliance or vendors storage system for aggregation and common functionality than you probably wont see this as virtual storage let alone storage virtualization. However for others, it will be confusing hence EMC introducing terms such as federation and avoiding terms including grid to minimize confusion yet play off of cloud crowd commotion.

Is VPLEX a replacement for storage system based tiering and replication?

I do not believe so and even though some vendors are making claims that tiered storage is dead, just like some vendors declared a couple of years ago that disk drives were going to be dead this year at the hands of SSD, neither has come to life so to speak pun intended. What this means for VPLEX is that it leverages underlying automated or manual tiering found in storage systems such as EMC FAST enabled or similar policy and manual functions in third party products.

What VPLEX brings to the table is the ability to transparently present a LUN or volume locally or over distance with shared access while maintaining cache and data coherency. This means that if a LUN or volume moves the applications or file system or volume managers expecting to access that storage will not be surprised, panic or encounter failover problems. Of course there will be plenty of details to be dug into and seen how it all actually works as is the case with any new technology.

Who is this for?

I see this as for environments that need flexibility and agility across multiple storage systems either from one or multiple vendors on a local or metro or wide area basis. This is for those environments that need ability to move workloads, applications and data between different storage systems and sites for maintenance, upgrades, technology refresh, BC/DR, load balancing or other IRM functions similar to how they would use virtual server migration such as VMotion or Live migration among others.

Do VPLEX and Virtual Storage eliminate need for Storage System functionality?

I see some storage virtualization solutions or appliances that have a focus of replacing underlying storage system functionality instead of coexisting or complementing. A way to test for this approach is to listen or read if the vendor or provider says anything along the lines of eliminating vendor lock in or control of the underlying storage system. That can be a sign of the golden rule of virtualization of whoever controls the virtualization functionality (at the server hypervisor or storage) controls the gold! This is why on the server side of things we are starting to see tiered hypervisors similar to tiered servers and storage where mixed hypervisors are being used for different purposes. Will we see tiered storage hypervisors or virtual storage solutions the answer could be perhaps or it depends.

Was Invista a failure not going into production and this a second attempt at virtualization?

There is a popular myth in the industry that Invista never saw the light of day outside of trade show expo or other demos however the reality is that there are actual customer deployments. Invista unlike other storage virtualization products had a different focus which was that around enabling agility and flexibility for common IRM tasks, similar the expanded focus of VPLEX. Consequently Invista has often been in apples to oranges comparison with other virtualization appliances that have as focus pooling along with other functions or in some cases serving as an appliance based storage system.

The focus around Invista and usage by those customers who have deployed it that I have talked with is around enabling agility for maintenance, facilitating upgrades, moves or reconfiguration and other common IRM tasks vs using it for pooling of storage for consolidation purposes. Thus I see VPLEX extending on the vision of Invista in a role of complimenting and leveraging underlying storage system functionality instead of trying to replace those capabilities with that of the storage virtualizer.

Is this a replacement for EMC Invista?

According to EMC the answer is no and that customers using Invista (Yes, there are customers that I have actually talked to) will continue to be supported. However I suspect that over time Invista will either become a low end entry for VPLEX, or, an entry level VPLEX solution will appear sometime in the future.

How does this stack up or compare with what others are doing?

If you are looking to compare to cache centric platforms such as IBMs SVC that adds extensive functionality and capabilities within the storage virtualization framework this is an apples to oranges comparison. VPLEX is providing cache pointers on a local and global basis functioning in a compliment to underlying storage system model where SVC caches at the specific cluster basis and enhancing functionality of underlying storage system. Rest assured there will be other apples to oranges comparisons made between these platforms.

How will this be priced?

When I asked EMC about pricing, they would not commit to a specific price prior to the announcement other than indicating that there will be options for on demand or consumption (e.g. cloud pricing) as well as pricing per engine capacity as well as subscription models (pay as you go).

What is the overhead of VPLEX?

While EMC runs various workload simulations (including benchmarks) internally as well as some publicly (e.g. Microsoft ESRP among others) they have been opposed to some storage simulation benchmarks such as SPC. The EMC opposition to simulations such as SPC have been varied however this could be a good and interesting opportunity for them to silence the industry (including myself) who continue ask them (along with a couple of other vendors including IBM and their XIV) when they will release public results.

What the interesting opportunity I think is for EMC is that they do not even have to benchmark one of their own storage systems such as a CLARiiON or VMAX, instead simply show the performance of some third party product that already is tested on the SPC website and then a submission with that product running attached to a VPLEX.

If the performance or low latency forecasts are as good as they have been described, EMC can accomplish a couple of things by:

Demonstrating the low latency and minimal to no overhead of VPLEX
Show VPLEX with a third party product comparing latency before and after
Provide a comparison to other virtualization platforms including IBM SVC

As for EMC submitting a VMAX or CLARiiON SPC test in general, Im not going to hold my breath for that, instead, will continue to look at the other public workload tests such as ESRP.

Additional related reading material and links:

Resilient Storage Networks: Designing Flexible Scalable Data Infrastructures (Elsevier)
Chapter 3: Networking Your Storage
Chapter 4: Storage and IO Networking
Chapter 6: Metropolitan and Wide Area Storage Networking
Chapter 11: Storage Management
Chapter 16: Metropolitan and Wide Area Examples

The Green and Virtual Data Center (CRC)
Chapter 3: (see also here) What Defines a Next-Generation and Virtual Data Center
Chapter 4: IT Infrastructure Resource Management (IRM)
Chapter 5: Measurement, Metrics, and Management of IT Resources
Chapter 7: Server: Physical, Virtual, and Software
Chapter 9: Networking with your Servers and Storage

Also see these:

Virtual Storage and Social Media: What did EMC not Announce?
Server and Storage Virtualization – Life beyond Consolidation
Should Everything Be Virtualized?
Was today the proverbial day that he!! Froze over?
Moving Beyond the Benchmark Brouhaha

Closing comments (For now):
As with any new vision, initiative, architecture and initial product there will be plenty of questions to ask, items to investigate, early adopter customers or users to talk with and determine what is real, what is future, what is usable and practical along with what is nice to have. Likewise there will be plenty of mud ball throwing and slinging between competitors, fans and foes which for those who enjoy watching or reading those you should be well entertained.

In general, the EMC vision and story builds on and presumably delivers on past industry hype, buzz and vision with solutions that can be put into environments as productivity tool that works for the customer, instead of the customer working for the tool.

Remember the golden rule of virtualization which is in play here is that whoever controls the virtualization or associated management controls the gold. Likewise keep in mind that aggregation can cause aggravation. So do not be scared, however look before you leap meaning do your homework and due diligence with appropriate levels of expectations, aligning applicable technology to the task at hand.

Also, if you have seen or experienced something in the past, you are more likely to have DejaVu as opposed to seeing things as revolutionary. However it is also important to leverage lessons learned for future success. YottaYotta was a lot of NaddaNadda, lets see if EMC can leverage their past experiences to make this a LottaLotta.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

March 24, 2010January 30, 2020

March Metric Madness: Fun with Simple Math

Its March and besides being spring in north America, it also means tournament season including the NCAA basket ball series among others known as March Madness.

Given the office pools and other forms of playing with numbers tied to the tournaments and real or virtual money, here is a quick timeout looking at some fun with math.

The fun is showing how simple math can be used to show relative growth for IT resources such as data storage. For example, say that you have 10Tbytes of storage or data and that it is growing at only 10 percent per year, in five years with simple math yields 14.6Tbytes.

Now lets assume growth rate is 50 percent per year and in the course of five years, instead of having 10Tbytes, that now jumps to 50.6Tbytes. If you have 100Tbytes today and at 50 percent growth rate, that would yield 506.3 Tbytes or about half of a petabyte in 5 years. If by chance you have say 1Pbyte or 1,000Tbytes today, at 25% year of year growth you would have 2.44Pbytes in 5 years.
Basic Storage Forecast
Figure 1 Fun with simple math and projected growth rates

Granted this is simple math showing basic examples however the point is that depending on your growth rate and amount of either current data or storage, you might be surprised at the forecast or projected needs in only five years.

In a nutshell, these are examples of very basic primitive capacity forecasts that would vary by other factors including if the data is 10Tbytes and your policies is for 25 percent free space, that would require even more storage than the base amount. Go with a different RAID level, some extra space for replication, snapshots, disk to disk backups and replication not to mention test development and those numbers go up even higher.

Sure those amounts can be offset with thin provisioning, dedupe, archiving, compression and other forms of data footprint reduction, however the point here is to realize how simple math can portray a very basic forecast and picture of growth.

Read more about performance and capacity in Chapter 10 – Performance and capacity planning for storage networks – Resilient Storage Networks (Elsevier) as well as at www.cmg.org (Computer Measurement Group)..

And that is all I have to say about this for now, enjoy March madness and fun with numbers.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

January 14, 2010December 24, 2020

2010 and 2011 Trends, Perspectives and Predictions: More of the same?

2011 is not a typo, I figured that since Im getting caught up on some things, why not get a jump as well.

Since 2009 went by so fast, and that Im finally getting around to doing an obligatory 2010 predictions post, lets take a look at both 2010 and 2011.

Actually Im getting around to doing a post here having already done interviews and articles for others soon to be released.

Based on prior trends and looking at forecasts, a simple predictions is that some of the items for 2010 will apply for 2011 as well given some of this years items may have been predicted by some in 2008, 2007, 2006, 2005 or, well ok, you get the picture. :)

Predictions are fun and funny in that for some, they are taken very seriously, while for others, at best they are taken with a grain of salt depending on where you sit. This applies both for the reader as well as who is making the predictions along with various motives or incentives.

Some are serious, some not so much…

For some, predictions are a great way of touting or promoting favorite wares (hard, soft or services) or getting yet another plug (YAP is a TLA BTW) in to meet coverage or exposure quota.

Meanwhile for others, predictions are a chance to brush up on new terms for the upcoming season of buzzword bingo games (did you pick up on YAP).

In honor of the Vancouver winter games, Im expecting some cool Olympic sized buzzword bingo games with a new slippery fast one being federation. Some buzzwords will take a break in 2010 as well as 2011 having been worked pretty hard the past few years, while others that have been on break, will reappear well rested, rejuvenated, and ready for duty.

Lets also clarify something regarding predictions and this is that they can be from at least two different perspectives. One view is that from a trend of what will be talked about or discussed in the industry. The other is in terms of what will actually be bought, deployed and used.

What can be confusing is sometimes the two perspectives are intermixed or assumed to be one and the same and for 2010 I see that trend continuing. In other words, there is adoption in terms of customers asking and investigating technologies vs. deployment where they are buying, installing and using those technologies in primary situations.

It is safe to say that there is still no such thing as an information, data or processing recession. Ok, surprise surprise; my dogs could have probably made that prediction during a nap. However what this means is more data will need to be moved, processed and stored for longer periods of time and at a lower cost without degrading performance or availability.

This means, denser technologies that enable a lower per unit cost of service without negatively impacting performance, availability, capacity or energy efficiency will be needed. In other words, watch for an expanded virtualization discussion around life beyond consolidation for servers, storage, desktops and networks with a theme around productivity and virtualization for agility and management enablement.

Certainly there will be continued merger and acquisitions on both a small as well as large scale ranging from liquidation sales or bargain hunting, to large and a mega block buster or two. Im thinking in terms of outside of the box, the type that will have people wondering perhaps confused as to why such a deal would be done until the whole picture is reveled and thought out.

In other words, outside of perhaps IBM, HP, Oracle, Intel or Microsoft among a few others, no vendor is too large not to be acquired, merged with, or even involved in a reverse merger. Im also thinking in terms of vendors filling in niche areas as well as building out their larger portfolio and IT stacks for integrated solutions.

Ok, lets take a look at some easy ones, lay ups or slam dunks:

More cluster, cloud conversations and confusion (public vs. private, service vs. product vs. architecture)
More server, desktop, IO and storage consolidation (excuse me, server virtualization)
Data footprint impact reduction ranging from deletion to archive to compress to dedupe among others
SSD and in particular flash continues to evolve with more conversations around PCM
Growing awareness of social media as yet another tool for customer relations management (CRM)
Security, data loss/leap prevention, digital forensics, PCI (payment card industry) and compliance
Focus expands from gaming/digital surveillance /security and energy to healthcare
Fibre Channel over Ethernet (FCoE) mainstream in discussions with some initial deployments
Continued confusion of Green IT and carbon reduction vs. economic and productivity (Green Gap)
No such thing as an information, data or processing recession, granted budgets are strained
Server, Storage or Systems Resource Analysis (SRA) with event correlation
SRA tools that provide and enable automation along with situational awareness

The green gap of confusion will continue with carbon or environment centric stories and messages continue to second back stage while people realize the other dimension of green being productivity.

As previously mentioned, virtualization of servers and storage continues to be popular with an expanding focus from just consolidation to one around agility, flexibility and enabling production, high performance or for other systems that do not lend themselves to consolidation to be virtualized.

6GB SAS interfaces as well as more SAS disk drives continue to gain popularity. I have said in the past there was a long shot that 8GFC disk drives might appear. We might very well see those in higher end systems while SAS drives continue to pick up the high performance spinning disk role in mid range systems.

Granted some types of disk drives will give way over time to others, for example high performance 3.5” 15.5K Fibre Channel disks will give way to 2.5” 15.5K SAS boosting densities, energy efficiency while maintaining performance. SSD will help to offload hot spots as they have in the past enabling disks to be more effectively used in their applicable roles or tiers with a net result of enhanced optimization, productivity and economics all of which have environmental benefits (e.g. the other Green IT closing the Green Gap).

What I dont see occurring, or at least in 2010

An information or data recession requiring less server, storage, I/O networking or software resources
OSD (object based disk storage without a gateway) at least in the context of T10
Mainframes, magnetic tape, disk drives, PCs, or Windows going away (at least physically)
Cisco cracking top 3, no wait, top 5, no make that top 10 server vendor ranking
More respect for growing and diverse SOHO market space
iSCSI taking over for all I/O connectivity, however I do see iSCSI expand its footprint
FCoE and flash based SSD reaching tipping point in terms of actual customer deployments
Large increases in IT Budgets and subsequent wild spending rivaling the dot com era
Backup, security, data loss prevention (DLP), data availability or protection issues going away
Brett Favre and the Minnesota Vikings winning the super bowl

What will be predicted at end of 2010 for 2011 (some of these will be DejaVU)

Many items that were predicted this year, last year, the year before that and so on…
Dedupe moving into primary and online active storage, rekindling of dedupe debates
Demise of cloud in terms of hype and confusion being replaced by federation
Clustered, grid, bulk and other forms of scale out storage grow in adoption
Disk, Tape, RAID, Mainframe, Fibre Channel, PCs, Windows being declared dead (again)
2011 will be the year of Holographic storage and T10 OSD (an annual prediction by some)
FCoE kicks into broad and mainstream deployment adoption reaching tipping point
16Gb (16GFC) Fibre Channel gets more attention stirring FCoE vs. FC vs. iSCSI debates
100GbE gets more attention along with 4G adoption in order to move more data
Demise of iSCSI at the hands of SAS at low end, FCoE at high end and NAS from all angles

Gaining ground in 2010 however not yet in full stride (at least from customer deployment)

On the connectivity front, iSCSI, 6Gb SAS, 8Gb Fibre Channel, FCoE and 100GbE
SSD/flash based storage everywhere, however continued expansion
Dedupe everywhere including primary storage – its still far from its full potential
Public and private clouds along with pNFS as well as scale out or clustered storage
Policy based automated storage tiering and transparent data movement or migration
Microsoft HyperV and Oracle based server virtualization technologies
Open source based technologies along with heterogeneous encryption
Virtualization life beyond consolidation addressing agility, flexibility and ease of management
Desktop virtualization using Citrix, Microsoft and VMware along with Microsoft Windows 7

Buzzword bingo hot topics and themes (in no particular order) include:

2009 and previous year carry over items including cloud, iSCSI, HyperV, Dedupe, open source
Federation takes over some of the work of cloud, virtualization, clusters and grids
E2E, End to End management preferably across different technologies
SAS, Serial Attached SCSI for server to storage systems and as disk to storage interface
SRA, E23, Event correlation and other situational awareness related IRM tools
Virtualization, Life beyond consolidation enabling agility, flexibility for desktop, server and storage
Green IT, Transitions from carbon focus to economic with efficiency enabling productivity
FCoE, Continues to evolve and mature with more deployments however still not at tipping point
SSD, Flash based mediums continue to evolve however tipping point is still over the horizon
IOV, I/O Virtualization for both virtual and non virtual servers
Other new or recycled buzzword bingo candidates include PCoIP, 4G,

RAID will again be pronounced as being dead no longer relevant yet being found in more diverse deployments from consumer to the enterprise. In other words, RAID may be boring and thus no longer relevant to talk about, yet it is being used everywhere and enhanced in evolutionary ways, perhaps for some even revolutionary.

Tape remains being declared dead (e.g. on the Zombie technology list) yet being enhanced, purchased and utilized at higher rates with more data stored than in past history. Instead of being killed off by the disk drive, tape is being kept around for both traditional uses as well as taking on new roles where it is best suited such as long term or bulk off-line storage of data in ultra dense and energy efficient not to mention economical manners.

What I am seeing and hearing is that customers using tape are able to reduce the number of drives or transports, yet due to leveraging disk buffers or caches including from VTL and dedupe devices, they are able to operate their devices at higher utilization, thus requiring fewer devices with more data stored on media than in the past.

Likewise, even though I have been a fan of SSD for about 20 years and am bullish on its continued adoption, I do not see SSD killing off the spinning disk drive anytime soon. Disk drives are helping tape take on this new role by being a buffer or cache in the form of VTLs, disk based backup and bulk storage enhanced with compression, dedupe, thin provision and replication among other functionality.

There you have it, my predictions, observations and perspectives for 2010 and 2011. It is a broad and diverse list however I also get asked about and see a lot of different technologies, techniques and trends tied to IT resources (servers, storage, I/O and networks, hardware, software and services).

Lets see how they play out.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

January 13, 2010March 7, 2022

Poll: Networking Convergence, Ethernet, InfiniBand or both?

I just received an email in my inbox from Voltaire along with a pile of other advertisements, advisories, alerts and announcements from other folks.

What caught my eye on the email was that it is announcing a new survey results that you can read here as well as below.

The question that this survey announcements prompts for me and hence why I am posting it here is how dominant will InfiniBand be on a go forward basis, the answer I think is it depends…

It depends on the target market or audience, what their applications and technology preferences are along with other service requirements.

I think that there is and will remain a place for Infiniband, the question is where and for what types of environments as well as why have both InfiniBand and Ethernet including Fibre Channel over Ethernet (FCoE) in support of unified or converged I/O and data networking.

So here is the note that I received from Voltaire:

Hello,

A new survey by Voltaire (NASDAQ: VOLT) reveals that IT executives plan to use InfiniBand and Ethernet technologies together as they refresh or build new data centers. They’re choosing a converged network strategy to improve fabric performance which in turn furthers their infrastructure consolidation and efficiency objectives.

The full press release is below. Please contact me if you would like to speak with a Voltaire executive for further commentary.

Regards,
Christy

____________________________________________________________
Christy Lynch| 978.439.5407(o) |617.794.1362(m)
Director, Corporate Communications
Voltaire – The Leader in Scale-Out Data Center Fabrics
christyl@voltaire.com | www.voltaire.com
Follow us on Twitter: www.twitter.com/voltaireltd

FOR IMMEDIATE RELEASE:

IT Survey Finds Executives Planning Converged Network Strategy:
Using Both InfiniBand and Ethernet

Fabric Performance Key to Making Data Centers Operate More Efficiently

CHELMSFORD, Mass. and ANANA, Israel January 12, 2010 – A new survey by Voltaire (NASDAQ: VOLT) reveals that IT executives plan to use InfiniBand and Ethernet technologies together as they refresh or build new data centers. They’re choosing a converged network strategy to improve fabric performance which in turn furthers their infrastructure consolidation and efficiency objectives.

Voltaire queried more than 120 members of the Global CIO & Executive IT Group, which includes CIOs, senior IT executives, and others in the field that attended the 2009 MIT Sloan CIO Symposium. The survey explored their data center networking needs, their choice of interconnect technologies (fabrics) for the enterprise, and criteria for making technology purchasing decisions.

“Increasingly, InfiniBand and Ethernet share the ability to address key networking requirements of virtualized, scale-out data centers, such as performance, efficiency, and scalability,” noted Asaf Somekh, vice president of marketing, Voltaire. “By adopting a converged network strategy, IT executives can build on their pre-existing investments, and leverage the best of both technologies.”

When asked about their fabric choices, 45 percent of the respondents said they planned to implement both InfiniBand with Ethernet as they made future data center enhancements. Another 54 percent intended to rely on Ethernet alone.

Among additional survey results:

When asked to rank the most important characteristics for their data center fabric, the largest number (31 percent) cited high bandwidth. Twenty-two percent cited low latency, and 17 percent said scalability.

When asked about their top data center networking priorities for the next two years, 34 percent again cited performance. Twenty-seven percent mentioned reducing costs, and 16 percent cited improving service levels.

A majority (nearly 60 percent) favored a fabric/network that is supported or backed by a global server manufacturer.

InfiniBand and Ethernet interconnect technologies are widely used in today’s data centers to speed up and make the most of computing applications, and to enable faster sharing of data among storage and server networks. Voltaire’s server and storage fabric switches leverage both technologies for optimum efficiency. The company provides InfiniBand products used in supercomputers, high-performance computing, and enterprise environments, as well as its Ethernet products to help a broad array of enterprise data centers meet their performance requirements and consolidation plans.

About Voltaire
Voltaire (NASDAQ: VOLT) is a leading provider of scale-out computing fabrics for data centers, high performance computing and cloud environments. Voltaire’s family of server and storage fabric switches and advanced management software improve performance of mission-critical applications, increase efficiency and reduce costs through infrastructure consolidation and lower power consumption. Used by more than 30 percent of the Fortune 100 and other premier organizations across many industries, including many of the TOP500 supercomputers, Voltaire products are included in server and blade offerings from Bull, HP, IBM, NEC and Sun. Founded in 1997, Voltaire is headquartered in Ra’anana, Israel and Chelmsford, Massachusetts. More information is available at www.voltaire.com or by calling 1-800-865-8247.

Forward Looking Statements
Information provided in this press release may contain statements relating to current expectations, estimates, forecasts and projections about future events that are "forward-looking statements" as defined in the Private Securities Litigation Reform Act of 1995. These forward-looking statements generally relate to Voltaire’s plans, objectives and expectations for future operations and are based upon management’s current estimates and projections of future results or trends. They also include third-party projections regarding expected industry growth rates. Actual future results may differ materially from those projected as a result of certain risks and uncertainties. These factors include, but are not limited to, those discussed under the heading "Risk Factors" in Voltaire’s annual report on Form 20-F for the year ended December 31, 2008. These forward-looking statements are made only as of the date hereof, and we undertake no obligation to update or revise the forward-looking statements, whether as a result of new information, future events or otherwise.

###

All product and company names mentioned herein may be the trademarks of their respective owners.

End of Voltaire transmission:

I/O, storage and networking interface wars come and go similar to other technology debates of what is the best or that will be supreme.

Some recent debates have been around Fibre Channel vs. iSCSI or iSCSI vs. Fibre Channel (depends on your perspective), SAN vs. NAS, NAS vs. SAS, SAS vs. iSCSI or Fibre Channel, Fibre Channel vs. Fibre Channel over Ethernet (FCoE) vs. iSCSI vs. InfiniBand, xWDM vs. SONET or MPLS, IP vs UDP or other IP based services, not to mention the whole LAN, SAN, MAN, WAN POTS and PAN speed games of 1G, 2G, 4G, 8G, 10G, 40G or 100G. Of course there are also the I/O virtualization (IOV) discussions including PCIe Single Root (SR) and Multi Root (MR) for attachment of SAS/SATA, Ethernet, Fibre Channel or other adapters vs. other approaches.

Thus when I routinely get asked about what is the best, my answer usually is a qualified it depends based on what you are doing, trying to accomplish, your environment, preferences among others. In other words, Im not hung up or tied to anyone particular networking transport, protocol, network or interface, rather, the ones that work and are most applicable to the task at hand

Now getting back to Voltaire and InfiniBand which I think has a future for some environments, however I dont see it being the be all end all it was once promoted to be. And outside of the InfiniBand faithful (there are also iSCSI, SAS, Fibre Channel, FCoE, CEE and DCE among other devotees), I suspect that the results would be mixed.

I suspect that the Voltaire survey reflects that as well as if I surveyed an Ethernet dominate environment I can take a pretty good guess at the results, likewise for a Fibre Channel, or FCoE influenced environment. Not to mention the composition of the environment, focus and business or applications being supported. One would also expect a slightly different survey results from the likes of Aprius, Broadcom, Brocade, Cisco, Emulex, Mellanox (they also are involved with InfiniBand), NextIO, Qlogic (they actually do some Infiniband activity as well), Virtensys or Xsigo (actually, they support convergence of Fibre Channel and Ethernet via Infiniband) among others.

Ok, so what is your take?

Whats your preffered network interface for convergence?

For additional reading, here are some related links:

I/O Virtualization (IOV) Revisited

I/O, I/O, Its off to Virtual Work and VMworld I Go (or went)

Buzzword Bingo 1.0 – Are you ready for fall product announcements?

StorageIO in the News Update V2010.1

The Green and Virtual Data Center (Chapter 9)

Also check out what others including Scott Lowe have to say about IOV here or, Stuart Miniman about FCoE here, or of Greg Ferro here.

Oh, and for what its worth for those concerned about FTC disclosure, Voltaire is not nor have they been a client of StorageIO, however, I did used to work for a Fibre Channel, iSCSI, IP storage, LAN, SAN, MAN, WAN vendor and wrote a book on the topics :).

Cheers
Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

January 8, 2010April 27, 2025

StorageIO in the News Update V2010.1

StorageIO is regularly quoted and interviewed in various industry and vertical market venues and publications both on-line and in print on a global basis.

The following are some coverage, perspectives and commentary by StorageIO on IT industry trends including servers, storage, I/O networking, hardware, software, services, virtualization, cloud, cluster, grid, SSD, data protection, Green IT and more since the last update.

Realizing that some prefer blogs to webs to twitter to other venues, here are some recent links among others to media coverage and comments by me on a different topics that are among others found at www.storageio.com/news.html:

SearchSMBStorage: Comments on EMC Iomega v.Clone for PC data syncronization – Jan 2010

Computerworld: Comments on leveraging cloud or online backup – Jan 2010

ChannelProSMB: Comments on NAS vs SAN Storage for SMBs – Dec 2009

ChannelProSMB: Comments on Affordable SMB Storage Solutions – Dec 2009

SearchStorage: Comments on What to buy a geek for the holidays, 2009 edition – Dec 2009

SearchStorage: Comments on EMC VMAX storage and 8GFC enhancements – Dec 2009

SearchStorage: Comments on Data Footprint Reduction – Dec 2009

SearchStorage: Comments on Building a private storage cloud – Dec 2009

SearchStorage: Comments on SSD in storage systems – Dec 2009

SearchStorage: Comments on slow adoption of file virtualization – Dec 2009

IT World: Comments on maximizing data security investments – Nov 2009

SearchCIO: Comments on storage virtualization for your organisation – Nov 2009

Processor: Comments on how to win approval for hardware upgrades – Nov 2009

Processor: Comments on the Future of Servers – Nov 2009

SearchITChannel: Comments on Energy-efficient technology sales depend on pitch – Nov 2009

SearchStorage: Comments on how to get from Fibre Channel to FCoE – Nov 2009

Minneapolis Star Tribune: Comments on Google Wave and Clouds – Nov 2009

SearchStorage: Comments on EMC and Cisco alliance – Nov 2009

SearchStorage: Comments on HP virtualizaiton enhancements – Nov 2009

SearchStorage: Comments on Apple canceling ZFS project – Oct 2009

Processor: Comments on EPA Energy Star for Server and Storage Ratings – Oct 2009

IT World Canada: Cloud computing, dot be scared, look before you leap – Oct 2009

IT World: Comments on stretching your data protection and security dollar – Oct 2009

Enterprise Storage Forum: Comments about Fragmentation and Performance? – Oct 2009

SearchStorage: Comments about data migration – Oct 2009

SearchStorage: Comments about What’s inside internal storage clouds? – Oct 2009

Enterprise Storage Forum: Comments about T-Mobile and Clouds? – Oct 2009

Storage Monkeys: Podcast comments about Sun and Oracle- Sep 2009

Enterprise Storage Forum: Comments on Maxiscale clustered, cloud NAS – Sep 2009

SearchStorage: Comments on Maxiscale clustered NAS for web hosting – Sep 2009

Enterprise Storage Forum: Comments on whos hot in data storage industry – Sep 2009

SearchSMBStorage: Comments on SMB Fibre Channel switch options – Sep 2009

SearchStorage: Comments on using storage more efficiently – Sep 2009

SearchStorage: Comments on Data and Storage Tiering including SSD – Sep 2009

Enterprise IT Planet: Comments on Data Deduplication – Sep 2009

SearchDataCenter: Comments on Tiered Storage – Sep 2009

Enterprise Storage Forum: Comments on Sun-Oracle Wedding – Aug 2009

Processor.com: Comments on Storage Network Snags – Aug 2009

SearchStorageChannel: Comments on I/O virtualizaiton (IOV) – Aug 2009

SearchStorage: Comments on Clustered NAS storage and virtualization – Aug 2009

SearchITChannel: Comments on Solid-state drive prices still hinder adoption – Aug 2009

Check out the Content, Tips, Tools, Videos, Podcasts plus White Papers, and News pages for additional commentary, coverage and related content or events.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

November 23, 2009October 18, 2024

What is the Future of Servers?

Recently I provided some comments and perspectives on the future of servers in an article over at Processor.com.

In general, blade servers will become more ubiquitous, that is they wont go away even with cloud, rather become more common place with even higher density processors with more cores and performance along with faster I/O and larger memory capacity per given footprint.

While the term blade server may fade giving way to some new term or phrase, rest assured their capabilities and functionality will not disappear, rather be further enhanced to support virtualization with VMware vsphere, Microsoft HyperV, Citrix/Zen along with public and private clouds, both for consolidation and in the next wave of virtualization called life beyond consolidation.

The other trend is that not only will servers be able to support more processing and memory per footprint; they will also do that drawing less energy requiring lower cooling demands, hence more Ghz per watt along with energy savings modes when less work needs to be performed.

Another trend is around convergence both in terms of packaging along with technology improvements from a server, I/O networking and storage perspective. For example, enhancements to shared PCIe with I/O virtualization, hypervisor optimization, and integration such as the recently announced EMC, Cisco, Intel and VMware VCE coalition and vblocks.

Read more including my comments in the article here.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

September 12, 2009May 8, 2022

Performance = Availability StorageIOblog featured ITKE guest blog

Recently IT Knowledge Exchange named me and StorageIOblog as their weekly featured IT blog too which Im flattered and honored. Consequently, I did a guest blog for them titled Performance = Availability, Availability = Performance that you can read about here.

For those not familiar with ITKE, take a few minutes and go over and check it out, there is a wealth of information there on a diversity of topics that you can read about, or, you can also get involved and participate in the questions and answers discussions.

Speaking of ITKE, interested in “The Green and Virtual Data Center” (CRC), check out this link where you can download a free chapter of my book, along with information on how to order your own copy along with a special discount code from CRC press.

Thank you very much to Sean Brooks of ITKE and his social media team of Michael Morisy and Jenny Mackintosh for being named featured IT blogger, as well as for being able to do a guest post for them. It has been fantastic working them and particularly Jenny who helped with all of the logistics in putting together the various pieces including getting the post up on the web as well as in their news letter.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

September 5, 2009April 27, 2025

I/O, I/O, Its off to Virtual Work and VMworld I Go (or went)

Ok, so I should have used that intro last week before heading off to VMworld in San Francisco instead of after the fact.

Think of it as a high latency title or intro, kind of like attaching a fast SSD to a slow, high latency storage controller, or a fast server attached to a slow network, or fast network with slow storage and servers, it is what it is.

I/O virtualization (IOV), Virtual I/O (VIO) along with I/O and networking convergence have been getting more and more attention lately, particularly on the convergence front. In fact one might conclude that it is trendy to all of a sudden to be on the IOV, VIO and convergence bandwagon given how clouds, soa and SaaS hype are being challenged, perhaps even turning to storm clouds?

Lets get back on track, or in the case of the past week, get back in the car, get back in the plane, get back into the virtual office and what it all has to do with Virtual I/O and VMworld.

The convergence game has at its center Brocade emanating from the data center and storage centric I/O corner challenging Cisco hailing from the MAN, WAN, LAN general networking corner.

Granted both vendors have dabbled with success in each others corners or areas of focus in the past. For example, Brocade as via acquisitions (McData+Nishan+CNT+INRANGE among others) a diverse and capable stable of local and long distance SAN connectivity and channel extension for mainframe and open systems supporting data replication, remote tape and wide area clustering. Not to mention deep bench experience with the technologies, protocols and partners solutions for LAN, MAN (xWDM), WAN (iFCP, FCIP, etc) and even FAN (file area networking aka NAS) along with iSCSI in addition to Fibre Channel and FICON solutions.

Disclosure: Here’s another plug ;) Learn more about SANs, LANs, MANs, WANs, POTs, PANs and related technologies and techniques in my book “Resilient Storage Networks – Designing Flexible Scalable Data Infrastructures" (Elsevier).

Cisco not to be outdone has a background in the LAN, MAN, WAN space directly, or similar to Brocade via partnerships with product and experience and depth. In fact while many of my former INRANGE and CNT associates ended up at Brocade via McData or in-directly, some ended up at Cisco. While Cisco is known for general networking, the past several years they have gone from zero to being successful in the Fibre Channel and yes, even the FICON mainframe space while like Brocade (HBAs) dabbling in other areas like servers and storage not to mention consumer products.

What does this have to do with IOV and VIO, let alone VMworld and my virtual office, hang on, hold that thought for a moment, lets get the convergence aspect out of the way first.

On the I/O and networking convergence (e.g. Fibre Channel over Ethernet – FCoE) scene both Brocade (Converged Enhanced Ethernet-CEE) and Cisco (Data Center Ethernet – DCE) along with their partners are rallying around each others camps. This is similar to how a pair of prize fighters maneuvers in advance of a match including plenty of trash talk, hype and all that goes with it. Brocade and Cisco throwing mud balls (or spam) at each other, or having someone else do it is nothing new, however in the past each has had their core areas of focus coming from different tenets in some cases selling to different people in an IT environment or those in VAR and partner organizations. Brocade and Cisco are not alone nor is the I/O networking convergence game the only one in play as it is being complimented by the IOV and VIO technologies addressing different value propositions in IT data centers.

Now on to the IOV and VIO aspect along with VMworld.

For those of you that attended VMworld and managed to get outside of session rooms, or media/analyst briefing or reeducation rooms, or out of partner and advisory board meetings walking the expo hall show floor, there was the usual sea of vendors and technology. There were the servers (physical and virtual), storage (physical and virtual), terminals, displays and other hardware, I/O and networking, data protection, security, cloud and managed services, development and visualization tools, infrastructure resource management (IRM) software tools, manufactures and VARs, consulting firms and even some analysts with booths selling their wares among others.

Likewise, in the onsite physical data center to support the virtual environment, there were servers, storage, networking, cabling and associated hardware along with applicable software and tucked away in all of that, there were also some converged I/O and networking, and, IOV technologies.

Yes, IOV, VIO and I/O networking convergence were at VMworld in force, just ask Jon Torr of Xsigo who was beaming like a proud papa wanting to tell anyone who would listen that his wares were part of the VMworld data center (Disclosure: Thanks for the T-Shirt).

Virtensys had their wares on display with Bob Nappa more than happy to show the technology beyond an UhiGui demo including how their solution includes disk drives and an LSI MegaRAID adapter to support VM boot while leveraging off-the shelf or existing PCIe adapters (SAS, FC, FCoE, Ethernet, SATA, etc.) while allowing adapter sharing across servers, not to mention, they won best new technology at VMworld award.

NextIO who is involved in the IOV / VIO game was there along with convergence vendors Brocade, Cisco, Qlogic and Emulex among others. Rest assured, there are many other vendors and VARs in the VIO and IOV game either still in stealth, semi-stealth or having recently launched.

IOV and VIO are complimentary to I/O and networking convergence in that solutions like those from Aprius, Virtensys, Xsigo, NextIO and others. While they sound similar, and in fact there is confusion as to if Fibre Channel N_Port Virtual ID (FC_NPVID) and VMware virtual adapters are IOV and VIO vs. solutions that are focused on PCIe device/resource extension and sharing.

Another point of confusion around I/O virtualization and virtual I/O are blade system or blade center connectivity solutions such as HP Virtual Connect or IBM Fabric Manger not to mention those form Engenera add confusion to the equation. Some of the buzzwords that you will be hearing and reading more about include PCIe Single Root IOV (SR-IOV) and Multi-Root IOV (MR-IOV). Think of it this way, within VMware you have virtual adapters, and Fibre Channel Virtualization N_Port IDs for LUN mapping/masking, zone management and other tasks.

IOV enables localized sharing of physical adapters across different physical servers (blades or chassis) with distances measured in a few meters; after all, it’s the PCIe bus that is being extended. Thus, it is not a replacement for longer distance in the data center solutions such as FCoE or even SAS for that matter, thus they are complimentary, or at least should be considered complimentary.

The following are some links to previous articles and related material including an excerpt (yes, another plug ;)) from chapter 9 “Networking with you servers and storage” of new book “The Green and Virtual Data Center” (CRC). Speaking of virtual and physical, “The Green and Virtual Data Center” (CRC) was on sale at the physical VMworld book store this week, as well as at the virtual book stores including Amazon.com

The Green and Virtual Data Center

The Green and Virtual Data Center (CRC) on book shelves at VMworld Book Store

Links to some IOV, VIO and I/O networking convergence pieces among others, as well as news coverage, comments and interviews can be found here and here with StorageIOblog posts that may be of interest found here and here.

SearchSystemChannel: Comparing I/O virtualization and virtual I/O benefits – August 2009

Enterprise Storage Forum: I/O, I/O, It’s Off to Virtual Work We Go – December 2007

Byte and Switch: I/O, I/O, It’s Off to Virtual Work We Go (Book Chapter Excerpt) – April 2009

Thus I went to VMworld in San Francisco this past week as much of the work I do is involved with convergence similar to my background, that is, servers, storage, I/O networking, hardware, software, virtualization, data protection, performance and capacity planning.

As to the virtual work, well, I spent some time on airplanes this week which as is often the case, my virtual office, granted it was real work that had to be done, however I also had a chance to meet up with some fellow tweeters at a tweet up Tuesday evening before getting back in a plane in my virtual office.

Now, I/O, I/O, its back to real work I go at Server and StorageIO , kind of rhymes doesnt it!

I/O, I/O, Its off to Virtual Work and VMworld I Go (or went)

August 7, 2009April 27, 2025

Recent tips, videos, articles and more

Its been a busy year so far and there is still plenty more to do. Taking advantage of a short summer break, I’m getting caught up on some items including putting up a link to some of the recent articles, tips, reports, webcasts, videos and more that I have eluded to in recent posts. Realizing that some prefer blogs to webs to tweets to other venues, here are some links to recent articles, tips, videos, podcasts, webcasts, white papers and more that can be found on the StorageIO Tips, tools and White Papers pages.

Recent articles, columns, tips, white papers and reports:

ITworld: The new green data center: From energy avoidance to energy efficiency August 2009

SearchSystemsChannel: Comparing I/O virtualization and virtual I/O benefits July 2009

SearchDisasterRecovery: Top server virtualization myths in DR and BC July 2009

Enterprise Storage Forum: Saving Money with Green Data Storage Technology July 2009

SearchSMB ATE Tips: SMB Tips and ATE by Greg Schulz

SearchSMB ATE Tip: Tape library storage July 2009

SearchSMB ATE Tip: Server-based operating systems vs. PC-based operating systems June 2009

SearchSMB ATE Tip: Pros/cons of block/variable block dedupe June 2009

FedTechAt the Ready: High-availability storage hinges on being ready for a system failure May 2009

Byte & Switch Part XI – Key Elements For A Green and Virtual Data Center May 2009

Byte & Switch Part X – Basic Steps For Building a Green and Virtual Data Center May 2009

InfoStor Technology Options for Green Storage: April 2009

Byte & Switch Part IX – I/O, I/O, Its off to Virtual Work We Go: Networks role in Virtual Data Centers April 2009

Byte & Switch Part VIII – Data Storage Can Become Green: There are many steps you can take April 2009

Byte & Switch Part VII – Server Virtualization Can Save Costs April 2009

Byte & Switch Part VI – Building a Habitat for Technology April 2009

Byte & Switch Part V – Data Center Measurement, Metrics & Capacity Planning April 2009

zJournal Storage & Data Management: Tips for Enabling Green and Virtual Efficient Data Management March 2009

Serial Storage Wire (STA): Green and SASy = Energy and Economic, Effective Storage March 2009

SearchSystemsChannel: FAQs: Green IT strategies for solutions providers March 2009

Computer Technology Review: Recent Comments on The Green and Virtual Data Center March 2009

Byte & Switch Part IV – Virtual Data Centers Can Promote Business Growth March 2009

Byte & Switch Part III – The Challenge of IT Infrastructure Resource Management March 2009

Byte & Switch Part II – Building an Efficient & Ecologically Friendly Data Center March 2009

Byte & Switch Part I – The Green Gap – Addressing Environmental & Economic Sustainability March 2009

Byte & Switch Green IT and the Green Gap February 2009

GreenerComputing: Enabling a Green and Virtual Data Center February 2009

Some recent videos and podcasts include:

bmighty.com The dark side of SMB virtualization July 2009

bmighty.com SMBs Are Now Virtualization’s “Sweet Spot” July 2009

eWeek.com Green IT is not dead, its new focus is about efficiency July 2009

SearchSystemsChannel FAQ: Using cloud computing services opportunities to get more business July 2009

SearchStorage FAQ guide – How Fibre Channel over Ethernet can combine networks July 2009

SearchDataCenter Business Benefits of Boosting Web hosting Efficiency June 2009

SearchStorageChannel Disaster recovery services for solution providers June 2009

The Serverside The Changing Dynamic of the Data Center April 2009

TechTarget Virtualization and Consolidation for Agility: Intels Xeon Processor 5500 series May 2009

Intel Reduce Energy Usage while Increasing Business Productivity in the Data Center May 2009

WSRadio Closing the green gap and shifting towards an IT efficiency and productivity April 2009

bmighty.com July 2009

Check out the Tips, Tools and White Papers, and News pages for more commentary, coverage and related content or events.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

July 19, 2009March 7, 2022

Worried about IT M&A, here come the new startups!

Storage I/O trends

Late last year , I did a post (see here) countering the notion that there is a lack of innovation in IT and specifically around data storage. Recently I did a post about a Funeral for Friend, not to mention yesterdays post about Summer marriages.

For those who are concerned about lack of innovation, or, that consolidation will result in just a few big vendors, here’s some food for thought. Those big vendors in addition to growing via internal organic growth, also grow by buying or merging with other vendors. Those other vendors emerge as startups, some grow, blossom and are bought, some make a decent business on their own, some are looking to be bought, some need to be bought, some will see fire sales, liquidation or simply closing their doors and perhaps re-launching as a new company.

With all the M&A activity currently that has taken place, and I’m sure (speculation only ;) ) that there will be plenty more, here’s a short and far from comprehensive list of some startups or companies you may not have heard of yet. There are additional ones who are still in deep stealth, some on the list are still in stealth, yet talking and letting information trickle out, thus only non-NDA information is being shown here. In other words, you can find out about these via publicly available information and sources.

Something that I have noticed and talked with others in the industry about is that this generation of startups, at least for now are taking a far more low-key approach to their launches than in the past. Gone at least for now are the Dot COM era over the top announcements in some cases before there was even a product or shipping for actual customer production deployment scenario. This crop or corps of startups are taking their time leveraging the current economic situation to further incubate their technologies and go to market strategies, not to mention minimizing the amount of over the top VC funding we have seen in the past. Some of these may not appear to be storage related and that would be correct. This list includes those associated with data infrastructure technolgies from servers, to storage to networking, hardware, software and services among othes as a common theme.

Disclosure Notice: None of these companies mentioned are nor have ever been clients of StorageIO. Why do I mention this, why not!

Balesio – File compression solutions
Box.net – Internet/web/cloud storage service with high availability and backup
Cirrustore – Backup data protection tools
Dataslide – Hard rectangular disk (HRD)
Enclarity – Healthcare CRM and analysis tools
Enstratus – Amazon cloud computing management tools
Exludas – Multi core optimize
Firescope – CMDB data solutions
Greenbytes – ZFS based storage management solutions
Likewise – Open backup software for macs/linux/windows
Liquidcomputing – High density servers
Maxiscale – Web infrastructure (Stealth)
Metalogix – Archiving solutions
Neptuny – Capacity Planning
Netronome – Network and I/O optimization technology
Newboundary – IT policy management and IRM tools
Nexenta ZFS – based storage management solutions
Pergamumsystems – Archive solutions (Stealth)
Pranah – SMB Storage vendor formerly known as Marner
Procedo – Archiving and migration solutions
Rebit – Backup and data protection solutions
Rightscale – Amazon cloud computing management tools
Rmsource – Cloud backup solutions
RNAnetworks – Virtual memory management solutions
Scale Computing – Clustered storage management software
ScaleMP – Multi-core virtualization for scale out
SiberSystems – Goodsync data protection solutions
Sparebackup – Backup data protection solutions
StorageFusion – Storage resource analysis
Storspeed – NAS/NFS optimization solutions (Stealth)
Sugarsync – Backup and data protection solutions
Surgient – Cloud computing solutions
Synology – SMB storage solutions
TwinStrata – BC/DR analysis and assessment tools
Vadium – Security and encryption tools
Vembu – Backup data protection tools
Versant – Object database management solutions
Vipre – Security, data loss, data leak prevention
VirtenSys – Virtual I/O and I/O virtualization (IOV)
Vizrt – Video management software tools
WhipTail – Flash SSD solutions
Xenos – Archive and data footprint reduction solutions

Links to the above along with many other companies including manufactures and vars can be found on the Interesting Links page at StorageIO.

Food for thought for your summer technology picnic fun.

Nuf said for now.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

FOR IMMEDIATE RELEASE:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this: