Clarifying Clustered Storage Confusion

Clustered storage can be block based (iSCSI or Fibre Channel) or file based NAS (NFS, CIFS or a proprietary file system). Clustered storage can also be found in virtual tape libraries (VTL), including dedupe solutions, along with other storage solutions such as those for archiving, cloud, medical or other specialized grids.

Recently in the IT and data storage industry, there has been a flurry of merger and acquisition (M&A) activity (here and here), along with new product enhancements and announcements around clustered storage. For example, HP bought clustered file system vendor IBRIX, complementing its acquisition of another clustered file system vendor (PolyServe) a few years ago and of iSCSI block clustered storage software vendor LeftHand earlier this year. Another recent acquisition is LSI buying clustered NAS vendor ONstor, not to mention Dell buying iSCSI block clustered storage vendor EqualLogic about a year and a half ago, along with other vendor acquisitions and announcements involving storage and clustering.

Where the confusion enters into play is the term cluster, which means different things to different people, even more so when clustered storage is combined with NAS or file based storage. For example, clustered NAS may imply a clustered file system when in reality a solution may only be multiple NAS filers, NAS heads, controllers or storage processors configured for availability or failover.

What this means is that an NFS or CIFS file system may only be active on one node at a time; in the event of a failover, the file system shifts from one NAS hardware device (e.g. NAS head or filer) to another. On the other hand, a clustered file system enables an NFS, CIFS or other file system to be active on multiple nodes (e.g. NAS heads, controllers, etc.) concurrently. The concurrent access may be for small random reads and writes, for example supporting a popular website or file serving application, or it may be for parallel reads or writes to a large sequential file.
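To make the distinction concrete, here is a minimal Python sketch of the two behaviors described above. The class names and node names (head-a, head-b, head-c) are made up for illustration and do not correspond to any vendor's product or API; the sketch only models which nodes serve a file system before and after a node failure.

```python
class FailoverNas:
    """Active/passive: the file system is served by one node at a time."""

    def __init__(self, nodes):
        self.healthy = list(nodes)  # surviving nodes, in takeover order

    def serving_nodes(self):
        # Only the first healthy node exports the file system.
        return self.healthy[:1]

    def fail(self, node):
        # Drop the node; if it was the active one, the next node takes over.
        self.healthy.remove(node)


class ClusteredFileSystem(FailoverNas):
    """Active/active: every healthy node serves the same file system."""

    def serving_nodes(self):
        return list(self.healthy)


if __name__ == "__main__":
    nas = FailoverNas(["head-a", "head-b"])
    cfs = ClusteredFileSystem(["head-a", "head-b", "head-c"])
    print(nas.serving_nodes(), cfs.serving_nodes())
    nas.fail("head-a")
    cfs.fail("head-a")
    print(nas.serving_nodes(), cfs.serving_nodes())
```

In the failover case the client sees the same file system move to a different head; in the clustered file system case the surviving heads simply keep serving it.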

Clustered storage is no longer exclusive to the confines of high-performance sequential and parallel scientific computing or ultra large environments. Small files and I/O (read or write), including meta-data information, are also being supported by a new generation of multipurpose, flexible, clustered storage solutions that can be tailored to support different applications workloads.

There are many different types of clustered and bulk storage systems. Clustered storage solutions may be block (iSCSI or Fibre Channel), NAS or file serving, virtual tape library (VTL), or archiving and object- or content-addressable storage. Clustered storage in general is similar to using clustered servers, providing scale beyond the limits of a single traditional system: scale for performance, scale for availability, and scale for capacity, enabling growth in a modular fashion that adds performance and intelligence capabilities along with capacity.

For smaller environments, clustered storage enables modular pay-as-you-grow capabilities to address specific performance or capacity needs. For larger environments, clustered storage enables growth beyond the limits of a single storage system to meet performance, capacity, or availability needs.
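As a rough back-of-the-envelope illustration of pay-as-you-grow sizing, the following Python sketch estimates how many nodes are needed to satisfy both a capacity and a performance requirement. The function name, parameters and example numbers are hypothetical, and the sketch assumes each node adds a fixed amount of capacity and IOPS, which real clusters rarely deliver perfectly.

```python
import math


def nodes_required(capacity_tb, iops, node_capacity_tb, node_iops):
    """Estimate node count, assuming (optimistically) linear scaling per node."""
    by_capacity = math.ceil(capacity_tb / node_capacity_tb)
    by_performance = math.ceil(iops / node_iops)
    # Whichever requirement needs more nodes drives the purchase.
    return max(by_capacity, by_performance, 1)


if __name__ == "__main__":
    # Capacity-bound example: 100 TB and 20,000 IOPS on 24 TB / 10,000 IOPS nodes.
    print(nodes_required(100, 20_000, 24, 10_000))
    # Performance-bound example: 40 TB and 90,000 IOPS on the same nodes.
    print(nodes_required(40, 90_000, 24, 10_000))
```

The point of the sketch is simply that the same building block can be added for different reasons: one environment grows for capacity, another for performance.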

Applications that lend themselves to clustered and bulk storage solutions include:

  • Unstructured data files, including spreadsheets, PDFs, slide decks, and other documents
  • Email systems, including Microsoft Exchange Personal (.PST) files stored on file servers
  • Users’ home directories and online file storage for documents and multimedia
  • Web-based managed service providers for online data storage, backup, and restore
  • Rich media data delivery, hosting, and social networking Internet sites
  • Media and entertainment creation, including animation rendering and post processing
  • High-performance databases such as Oracle with NFS direct I/O
  • Financial services and telecommunications, transportation, logistics, and manufacturing
  • Project-oriented development, simulation, and energy exploration
  • Low-cost, high-performance caching for transient and look-up or reference data
  • Real-time performance including fraud detection and electronic surveillance
  • Life sciences, chemical research, and computer-aided design

Clustered storage solutions go beyond meeting the basic requirements of supporting large sequential parallel or concurrent file access. Clustered storage systems can also support random access of small files for highly concurrent online and other applications. Scalable and flexible clustered file servers that leverage commonly deployed servers, networking, and storage technologies are well suited for new and emerging applications, including bulk storage of online unstructured data, cloud services, and multimedia, where extreme scaling of performance (IOPS or bandwidth), low latency, storage capacity, and flexibility at a low cost are needed.

The bandwidth-intensive and parallel-access performance characteristics associated with clustered storage are generally known; what is not so commonly known is the breakthrough to support small and random IOPS associated with database, email, general-purpose file serving, home directories, and meta-data look-up (Figure 1). Note that a clustered storage system, and in particular, a clustered NAS may or may not include a clustered file system.

Figure 1 – Generic clustered storage model (Courtesy of “The Green and Virtual Data Center” (CRC))

More nodes, ports, memory, and disks do not guarantee more performance for applications. Performance depends on how those resources are deployed and how the storage management software enables those resources to avoid bottlenecks. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion when processing diverse application workloads. Other things to consider include support for industry-standard interfaces, protocols, and technologies.
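One way to picture why more nodes do not guarantee more performance is a simple contention model, sketched below in Python. This is an illustrative assumption, not a formula from the book or from any vendor: each added node is assumed to cost a fixed fraction of coordination overhead, and the per-node throughput and overhead values are made up.

```python
def aggregate_throughput(nodes, per_node, overhead):
    """Cluster throughput when each added node costs some coordination overhead.

    overhead is a hypothetical fraction (0.0 means perfect linear scaling).
    """
    if nodes < 1:
        raise ValueError("need at least one node")
    return nodes * per_node / (1 + overhead * (nodes - 1))


def nodes_for_target(target, per_node, overhead, max_nodes=1024):
    """Smallest node count that reaches target, or None if max_nodes cannot."""
    for n in range(1, max_nodes + 1):
        if aggregate_throughput(n, per_node, overhead) >= target:
            return n
    return None


if __name__ == "__main__":
    for overhead in (0.0, 0.1):
        needed = nodes_for_target(390, per_node=100, overhead=overhead)
        print(f"overhead={overhead}: nodes needed for 390 units = {needed}")
```

Under this model a system with higher overhead needs extra nodes just to reach the same target, and some targets are never reached no matter how many nodes are added.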

Scalable and flexible clustered file server and storage systems provide the potential to leverage the inherent processing capabilities of constantly improving underlying hardware platforms. For example, software-based clustered storage systems that do not rely on proprietary hardware can be deployed on industry-standard high-density servers and blade centers and utilize third-party internal or external storage.

Clustered storage is no longer exclusive to niche applications or scientific and high-performance computing environments. Organizations of all sizes can benefit from ultra scalable, flexible, clustered NAS storage that supports application performance needs from small random I/O to meta-data lookup and large-stream sequential I/O that scales with stability to grow with business and application needs.

Additional considerations for clustered NAS storage solutions include the following.

  • Can memory, processors, and I/O devices be varied to meet application needs?
  • Is there support for large file systems supporting many small files as well as large files?
  • What is the performance for small random IOPS and bandwidth for large sequential I/O?
  • How is performance enabled across different applications in the same cluster instance?
  • Are I/O requests, including meta-data look-up, funneled through a single node?
  • How does a solution scale as the number of nodes and storage devices is increased?
  • How disruptive and time-consuming is adding new or replacing existing storage?
  • Is proprietary hardware needed, or can industry-standard servers and storage be used?
  • What data management features, including load balancing and data protection, exist?
  • What storage interface can be used: SAS, SATA, iSCSI, or Fibre Channel?
  • What types of storage devices are supported: SSD, SAS, Fibre Channel, or SATA disks?

As with most storage systems, it is not the total number of hard disk drives (HDDs), the quantity and speed of tiered-access I/O connectivity, the types and speeds of the processors, or even the amount of cache memory that determines performance. The performance differentiator is how a manufacturer combines the various components to create a solution that delivers a given level of performance with lower power consumption.

To avoid performance surprises, be leery of performance claims based solely on speed and quantity of HDDs or the speed and number of ports, processors and memory. How the resources are deployed and how the storage management software enables those resources to avoid bottlenecks are more important. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion.

Learn more about clustered storage (block, file, VTL/dedupe, archive), clustered NAS, clustered file system, grids and cloud storage among other topics in the following links:

"The Many faces of NAS – Which is appropriate for you?"

Article: Clarifying Storage Cluster Confusion
Presentation: Clustered Storage: “From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0”
Video Interview: How to Scale Data Storage Systems with Clustering
Guidelines for controlling clustering
The benefits of clustered storage

Along with other material on the StorageIO Tips and Tools or portfolio archive or events pages.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Worried about IT M&A, here come the new startups!

Storage I/O trends

Late last year, I did a post (see here) countering the notion that there is a lack of innovation in IT, specifically around data storage. Recently I did a post about a Funeral for Friend, not to mention yesterday's post about Summer marriages.

For those who are concerned about lack of innovation, or that consolidation will result in just a few big vendors, here’s some food for thought. Those big vendors, in addition to growing organically, also grow by buying or merging with other vendors. Those other vendors emerge as startups: some grow, blossom and are bought, some make a decent business on their own, some are looking to be bought, some need to be bought, and some will see fire sales, liquidation or simply close their doors and perhaps re-launch as a new company.

With all the M&A activity that has taken place recently, and I’m sure (speculation only ;) ) that there will be plenty more, here’s a short and far from comprehensive list of some startups or companies you may not have heard of yet. There are additional ones still in deep stealth, and some on the list are still in stealth yet talking and letting information trickle out, thus only non-NDA information is shown here. In other words, you can find out about these via publicly available information and sources.

Something that I have noticed and talked with others in the industry about is that this generation of startups, at least for now, is taking a far more low-key approach to their launches than in the past. Gone, at least for now, are the dot-com era over the top announcements, made in some cases before there was even a product, let alone one shipping for actual customer production deployment. This crop or corps of startups are taking their time, leveraging the current economic situation to further incubate their technologies and go to market strategies, not to mention minimizing the amount of over the top VC funding we have seen in the past. Some of these may not appear to be storage related, and that would be correct. The common theme is that this list includes those associated with data infrastructure technologies from servers to storage to networking, hardware, software and services, among others.

Disclosure Notice: None of the companies mentioned are, or have ever been, clients of StorageIO. Why do I mention this? Why not!

Balesio – File compression solutions
Box.net – Internet/web/cloud storage service with high availability and backup
Cirrustore – Backup data protection tools
Dataslide – Hard rectangular disk (HRD)
Enclarity – Healthcare CRM and analysis tools
Enstratus – Amazon cloud computing management tools
Exludas – Multi-core optimization
Firescope – CMDB data solutions
Greenbytes – ZFS based storage management solutions
Likewise – Open backup software for Macs/Linux/Windows
Liquidcomputing – High density servers
Maxiscale – Web infrastructure (Stealth)
Metalogix – Archiving solutions
Neptuny – Capacity Planning
Netronome – Network and I/O optimization technology
Newboundary – IT policy management and IRM tools
Nexenta – ZFS based storage management solutions
Pergamumsystems – Archive solutions (Stealth)
Pranah – SMB Storage vendor formerly known as Marner
Procedo – Archiving and migration solutions
Rebit – Backup and data protection solutions
Rightscale – Amazon cloud computing management tools
Rmsource – Cloud backup solutions
RNAnetworks – Virtual memory management solutions
Scale Computing – Clustered storage management software
ScaleMP – Multi-core virtualization for scale out
SiberSystems – Goodsync data protection solutions
Sparebackup – Backup data protection solutions
StorageFusion – Storage resource analysis
Storspeed – NAS/NFS optimization solutions (Stealth)
Sugarsync – Backup and data protection solutions
Surgient – Cloud computing solutions
Synology – SMB storage solutions
TwinStrata – BC/DR analysis and assessment tools
Vadium – Security and encryption tools
Vembu – Backup data protection tools
Versant – Object database management solutions
Vipre – Security, data loss, data leak prevention
VirtenSys – Virtual I/O and I/O virtualization (IOV)
Vizrt – Video management software tools
WhipTail – Flash SSD solutions
Xenos – Archive and data footprint reduction solutions

Links to the above along with many other companies including manufacturers and VARs can be found on the Interesting Links page at StorageIO.

Food for thought for your summer technology picnic fun.

Nuf said for now.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Summer Weddings: EMC+Datadomain and HP+IBRIX


Are you friend or family of the bride or groom?

Here comes the bride! (Audio)

That’s a question Mrs. Schulz and I were asked recently when we attended a wedding.

Summer months, particularly June and August, are known as wedding months (Hmmm, more merger & acquisition activity to come?). Summer is a nice time of the year for marriages, at least in the U.S., and how ironic that we have already seen two well publicized IT data storage industry unions in the past couple of weeks, not to mention other smaller, less publicized ones.

In one case, the California based bride (Datadomain-DDUP) had two suitors (Massachusetts based EMC and California based NetApp, plus rumors of others). Fortunately one of those (NetApp-NTAP) had a prenuptial that earned them a cool $57 million for their efforts when EMC won the bride. Read more, including some of my comments and perspectives among others about EMC, NTAP and DDUP, here and here.

Yesterday, on a mid-July Friday when things are normally quiet, in true wedding industry form, news was released (here and here) that California based HP had bought Massachusetts based data and storage management software vendor IBRIX.

That’s a lot of activity involving California and Massachusetts in the past couple of weeks, not to mention the tornado sightings in the vicinity of EMC’s Hopkinton, Massachusetts headquarters coincidentally around the same time the marriage to DDUP was formally announced! What’s next? Aerosmith is out on tour; perhaps the Del Fuegos or Boston will perform at one of these wedding parties?

Within the data storage industry, publicly traded Datadomain (DDUP) is fairly well known to many for their role in helping to popularize the data footprint impact reduction technique referred to as de-duplication (e.g. normalization, commonality factoring, intelligent compression, etc.). Adding to the awareness of DDUP was the recent highly public courtship, with EMC eventually out-bidding NTAP with a dowry of about $2.1B USD. That type of press coverage and monetary amount might normally be expected for the likes of a Madonna, Britney Spears, Michael Jackson-RIP, Paris Hilton, Elizabeth Taylor or other celebrity unions covered by paparazzi with a similar number of attorneys involved.

On the other hand, IBRIX, while known to some, is a lesser known entity compared to DDUP, having taken a lower profile than even some of their close competitors. However, for those who have been following and covering the clustered storage market (see here), IBRIX is a well known entity.

IBRIX has also had ties to EMC, having been involved in a pre-marriage affair via a reseller arrangement, along with being "rumored" ;) to have been involved with the ATMOS cloud or policy based storage solution formerly known as "Hulk". IBRIX has also quietly been involved with others like Dell as well as HP in reseller arrangements similar to the one with EMC. Where IBRIX has been positioned is to address high performance, scale out parallel or concurrent clustered file system needs, both big and small I/O, sequential and random data storage and access, for example in the media/entertainment and other industries, along with giving large Internet providers a bulk (low cost, high capacity) scale-out NAS (NFS & CIFS) option.

One of the reasons that IBRIX has been involved with the likes of EMC, Dell and HP among others is that, unlike solutions from vendors such as BlueArc, the once high-flying Isilon, NetApp, Onstor or Panasas, not to mention EMC Celerra NAS, which are all bundled with proprietary hardware, IBRIX is software based. Where IBRIX Fusion fits is in enabling NAS storage solutions using industry standard hardware (servers and storage) that can be configured both for high performance computing (HPC) and for low-cost general purpose bulk storage to support Web 2.0, social networking, home directories or on-line archives.

Consequently, an HP or Dell, who just happen to sell servers, has been able to address large scale-out and scale-up NAS file serving applications by re-selling IBRIX installed on their servers or blade servers, with either their own entry to mid-range lower cost, high performance and high capacity storage or that of 3rd party vendors.

Ironically one of IBRIX’s competitors in the software NAS solution market was and remains PolyServe, software that HP acquired a couple of years ago to create their own scale out NAS solution (e.g. EFS). Other software based solutions include among others Lustre (Sun), CXFS (SGI), EMC ATMOS (I’m sure some will argue this is not scale out or NAS, will leave it at that for now) ;) not to mention those from IBM, Microsoft, Quantum (also re-sold by HP) or Symantec.

What does HP get with IBRIX?

Simple, the ability to own the IP (intellectual property) that one of their competitors had been "rumored" to have been working with at one point, IP that their competitors had been reselling just as HP itself had.

Thus HP gets more software IP that can and has been sold along with their hardware such as the Proliant servers and blade servers giving their customers choice, similar to what HP and other vendors do with their open servers. For example, HP had the ExDS9000 extreme storage system built on a blade server with high density, low cost, high capacity HP storage (e.g. HP Modular Disk System 600, HP MSA or even EVA).

This makes for a nice solution for bulk on-line and near-line storage applications where the emphasis is not as much on performance as on massive scalability for storing on-line documents, archives, videos, images and other unstructured content, which is where there is a lot of growth activity. The challenge is that the ExDS9x00 has only been available with the HP PolyServe software, which works well for some environments, yet for others, the clustered file system scale out capabilities of IBRIX were deployed.

With the addition of IBRIX, HP should now be able to provide their customers and prospects the choice of software to meet specific needs while maintaining an HP footprint, that is, hardware, software and services. HP has several different storage software stacks that they now own (e.g. Lefthand for clustered iSCSI, PolyServe for NFS/CIFS NAS, IBRIX for clustered file system scale out NAS), not to mention those that it OEMs, including among others Bycast (Medical Archive System), which is also OEM’d by IBM as their Medical Grid combined with IBM SOFS, Quantum StorNext, Microsoft Windows Storage Server and Sepaton (VTL and Dedupe), to name a few.

Do I think this was a good move by HP?

Yes, as it gives them control over IP that they had been reselling, as had some of their competitors who left IBRIX for HP to grab up. HP now has the IP, which they can package with their hardware similar to how they have been doing, giving customers choices to align the right hardware and software technology to the task at hand.

Whether it be Bycast for medical archiving, PolyServe or IBRIX for scale out NAS, Lefthand for clustered iSCSI, Sepaton for VTL and dedupe, Microsoft, Quantum StorNext for shared block storage serving or any of the other software packages HP offers with their industry standard servers, the customer has options.

For IBRIX customers and prospects, this move will give them a boost in confidence that their decisions and investments are safe.

Ironically, vendors like Symantec, with their Scaleable File Serving (SFS) clustered NAS solution that is also software based and runs on anyone’s open servers including those from HP, get a potential shot in the arm with HP validating the model and approach for bulk-storage and clustered NAS (Oh Mr. Salem, Mr. Dell is holding on line 1, Mr. Chambers is on line 2 and Mr. Ellison on line 3 ;) ).

Who’s going to be at the altar next? IMHO, I would keep an eye on (and this is all just pure speculation) Bycast, Symantec, EMLX (Broadcom was a wake up call), Quantum, Sepaton, STEC, StorMagic, or ACS, maybe even 3PAR among other possibilities (think outside of the lines). I would not rule out a major game changer such as someone buying NetApp or the likes of an HP buying an EMC or Oracle buying a CSC, maybe even a CSCO buying someone like NTAP. How about Oracle buying NTAP and putting some attorneys out of work, not to mention, who will MSFT hook up with? Anything is possible as we have seen, and traditional M&A wisdom is out the window.

Have fun at the next wedding you attend, go easy on the cake and wedding punch, especially if you will be doing any dancing (please, no YouTube videos of the chicken dance), and be careful throwing rice or other items.

Ok, nuff said.

Cheers gs


Brocade to Buy Foundry Networks – Prelude to Upcoming Converged Ethernet and FCoE Battle


The emerging and maturing Fibre Channel over Ethernet (FCoE) and Converged Ethernet (aka Data Center Ethernet, Converged Enhanced Ethernet and Enterprise Ethernet, among other marketing names) activity is picking up. Today Brocade took a major step to shore up its already announced FCoE and converged Ethernet story, which includes new directors and converged host bus adapters, by announcing intentions of buying Ethernet high performance switching vendor Foundry Networks in a deal valued around $3B USD and some change. Not a bad deal for Foundry. Some would say it is an expensive deal for Brocade, perhaps paying too much; however, consider some of the recent storage and networking related deals. For example, IBM spent around $300M for a startup called XIV, which claims to have shipped a few storage systems to a few customers, and Dell spent about $1.3B to buy EqualLogic, which had a few thousand customers (could be the deal of the century for Dell compared to IBM and XIV, however time will tell). Then there is EMC with some of its recent purchases like RSA and Avamar or bargains like WysDM, Mozy and Iomega, not to mention Cisco, which has not been bashful about dropping some serious coin for standalone companies like NuSpeed (where are they now?) for iSCSI as well as Andiamo and more recently Nuova. Regardless of whether Mike Klayko (Brocade CEO) paid too much or not, he did what he had to do as part of his continuing activities to re-invent Brocade and leverage their core DNA and business focus of data infrastructures.

Brocade could probably have made a nice business for a few more years, as some of the companies they have recently acquired, including McData, CNT and INRANGE, tried to do. However the reality is that sooner or later, they too (Brocade) would probably have been acquired by someone. With the acquisition of Foundry Networks, along with previous announcements for FCoE technologies and their existing products for NAS or file based storage management and iSCSI solutions, Brocade is signaling that they want to fight for survival as opposed to circling the wagons and guarding their installed base and wheelhouse.

The up-coming Converged Ethernet and FCoE battle royal is shaping up to start in about 12 to 18 months, sooner for the early adopters who like to test and kick around technology early, for those who want to go right to 10GbE today instead of 8Gb Fibre Channel, or for those who like bleeding edge solutions. The reality, even with recent proof of life plug-fest demos and claims of being ready for primetime, is that core Brocade customers, particularly at the high-end of the market, tend to be rather risk averse and cautious with their data infrastructure, thus moving at a slower pace. For them, upgrading to 8Gb Fibre Channel may be the near term future while watching FCoE and converged Ethernet or converged enhanced Ethernet evolve and beginning to transition in a couple of years. For these risk averse customers, bleeding edge technology means having a blood bank nearby and on call, as downtime and disruption are not an option.

Rest assured, with Cisco pushing hard to stimulate the FCoE market and get people to skip 8Gb FC and switch over to 10GbE, there will be plenty more plug-fest and proof of life demos, and plenty of trash talking by both sides that will rival some of the best heavyweight match-ups.

Buyers beware, do your homework, and if being an early adopter of FCoE and converged networks is right for you, with due diligence do some testing to see how everything really works in your environment, from storage systems, to adapters, to switches, to protocol converters and gateways, to management and diagnostic software. How does the whole ecosystem that matches your environment work for your scenario? If you are not comfortable with where the FCoE and converged Ethernet technologies, and more importantly the supporting ecosystem, are at, take your time and monitor the situation as it unfolds over the next year or so leading up to the big battle royal between Brocade and Cisco.

Something that I think is interesting is that here we have Brocade and Cisco squaring off in a convergence battle between a general networking vendor (Cisco) and a storage centric networking vendor (Brocade), both of which have been built on organic growth as well as acquisitions. What’s even more interesting is that around 10 years ago, back when Brocade was just getting started and Cisco was still trying to figure out Fibre Channel and iSCSI, 3COM had the foresight to put together an alliance of storage related partners to get into the then emerging SAN market place. The alliance was to include various storage vendors, switch and HBA as well as router or gateway vendors, along with data and backup software vendors. Before the program could be officially launched, it was canceled, just as all of the promotional material was about to be distributed, due to the poor financial health of 3COM. With a few exceptions, most of the participants in that early program, which was probably a year or two ahead of its time, have either been bought or disappeared altogether. 3COM could have been a major force in a converged LAN and SAN market place instead of now watching Brocade and Cisco from the sidelines.

For now, congratulations to Mike Klayko and crew for demonstrating that they want to put up a fight and provide an alternative for their customers to Cisco and that they are serious about being a serious contender in the data infrastructure solution provider fight. For Cisco, looks like two of your competitors have now become one. Good luck and best wishes to both sides, Brocade and Cisco and I will be watching this battle from ring side as both parties line up and re-align their partner ecosystems.

Cheers
gs