HPs big December 3rd storage announcement

HP has been talking and promoting for several weeks (ok, months) their upcoming December 3rd storage announcements from the HP discovery event in Frankfurt Germany.

Well its now afternoon which means the early Monday morning December 3rd embargos have been lifted so I can now talk about what HP shared last Friday about todays announcements. Basically what I received was a series of press releases as well link to their updated web site providing information about todays announcements.

HP has enhanced the 3PAR aka P10000 with new models including for entry-level, as well as for higher performance enterprises needs. This also should beg the question for many longtime EVA (excuse me, P6000) customers, have they hit the end of the line? For scale out storage, HP has the StoreAll solutions (think about products formerly marketed as certain X9000 models based on Ibrix) with enhancements for analytics, bulk and various types of big data. In addition HP has enhanced its backup and recovery capabilities and Dedupe products including integration with Autonomy (here and here) along with capacity on demand services.

New 3PAR (P10000 models)

New StoreAll storage system

From the surface and what I have been able to see so far, looks like a good set of incremental enhancements from HP. Not much else to say until I can get some time to dig around deep to see what can be found on more details, however check out Calvin Zito (aka @hpstorageguy) the HP storage blogger who should have more information from HP.

Ok, nuff said (for now).

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Ceph Day Amsterdam 2012 (Object and cloud storage)

StorageIO industry trends cloud, virtualization and big data

Recently while I was in Europe presenting some sessions at conferences and doing some seminars, I was invited by Ed Saipetch (@edsai) of Inktank.com to attend the first Ceph Day in Amsterdam.

Ceph day image

As luck or fate would turn out, I was in Nijkerk which is about an hour train ride from Amsterdam central station plus a free day in my schedule. After a morning train ride and nice walk from Amsterdam Central I arrived at the Tobacco Theatre (a former tobacco trading venue) where Ceph Day was underway, and in time for lunch of Krokettens sandwich.

Attendees at Ceph Day

Lets take a quick step back and address for those not familiar what is Ceph (Cephalanthera) and why it was worth spending a day to attend this event. Ceph is an open source distributed object scale out (e.g. cluster or grid) software platform running on industry standard hardware.

Dell server supporting ceph demoSketch of ceph demo configuration

Ceph is used for deploying object storage, cloud storage and managed services, general purpose storage for research, commercial, scientific, high performance computing (HPC) or high productivity computing (commercial) along with backup or data protection and archiving destinations. Other software similar in functionality or capabilities to Ceph include OpenStack Swift, Basho Riak CS, Cleversafe, Scality and Caringo among others. There are also the tin wrapped software (e.g. appliances or pre-packaged) solutions such as Dell DX (Caringo), DataDirect Networks (DDN) WOS, EMC ATMOS and Centera, Amplidata and HDS HCP among others. From a service standpoint, these solutions can be used to build services similar Amazon S3 and Glacier, Rackspace Cloud files and Cloud Block, DreamHost DreamObject and HP Cloud storage among others.

Ceph cloud and object storage architecture image

At the heart of Ceph is RADOS a distributed object store that consists of peer nodes functioning as object storage devices (OSD). Data can be accessed via REST (Amazon S3 like) APIs, Libraries, CEPHFS and gateway with information being spread across nodes and OSDs using a CRUSH based algorithm (note Sage Weil is one of the authors of CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data). Ceph is scalable in terms of performance, availability and capacity by adding extra nodes with hard disk drives (HDD) or solid state devices (SSDs). One of the presentations pertained to DreamHost that was an early adopter of Ceph to make their DreamObjects (cloud storage) offering.

Ceph cloud and object storage deployment image

In addition to storage nodes, there are also an odd number of monitor nodes to coordinate and manage the Ceph cluster along with optional gateways for file access. In the above figure (via DreamHost), load balancers sit in front of gateways that interact with the storage nodes. The storage node in this example is a physical server with 12 x 3TB HDDs each configured as a OSD.

Ceph dreamhost dreamobject cloud and object storage configuration image

In the DreamHost example above, there are 90 storage nodes plus 3 management nodes, the total raw storage capacity (no RAID) is about 3PB (12 x 3TB = 36TB x 90 = 3.24PB). Instead of using RAID or mirroring, each objects data is replicated or copied to three (e.g. N=3) different OSDs (on separate nodes), where N is adjustable for a given level of data protection, for a usable storage capacity of about 1PB.

Note that for more usable capacity and lower availability, N could be set lower, or a larger value of N would give more durability or data protection at higher storage capacity overhead cost. In addition to using JBOD configurations with replication, Ceph can also be configured with a combination of RAID and replication providing more flexibility for larger environments to balance performance, availability, capacity and economics.

Ceph dreamhost and dreamobject cloud and object storage deployment image

One of the benefits of Ceph is the flexibility to configure it how you want or need for different applications. This can be in a cost-effective hardware light configuration using JBOD or internal HDDs in small form factor generally available servers, or high density servers and storage enclosures with optional RAID adapters along with SSD. This flexibility is different from some cloud and object storage systems or software tools which take a stance of not using or avoiding RAID vs. providing options and flexibility to configure and use the technology how you see fit.

Here are some links to presentations from Ceph Day:
Introduction and Welcome by Wido den Hollander
Ceph: A Unified Distributed Storage System by Sage Weil
Ceph in the Cloud by Wido den Hollander
DreamObjects: Cloud Object Storage with Ceph by Ross Turk
Cluster Design and Deployment by Greg Farnum
Notes on Librados by Sage Weil

Presentations during ceph day

While at Ceph day, I was able to spend a few minutes with Sage Weil Ceph creator and founder of inktank.com to record a pod cast (listen here) about what Ceph is, where and when to use it, along with other related topics. Also while at the event I had a chance to sit down with Curtis (aka Mr. Backup) Preston where we did a simulcast video and pod cast. The simulcast involved Curtis recording this video with me as a guest discussing Ceph, cloud and object storage, backup, data protection and related themes while I recorded this pod cast.

One of the interesting things I heard, or actually did not hear while at the Ceph Day event that I tend to hear at related conferences such as SNW is a focus on where and how to use, configure and deploy Ceph along with various configuration options, replication or copy modes as opposed to going off on erasure codes or other tangents. In other words, instead of focusing on the data protection protocol and algorithms, or what is wrong with the competition or other architectures, the Ceph Day focused was removing cloud and object storage objections and enablement.

Where do you get Ceph? You can get it here, as well as via 42on.com and inktank.com.

Thanks again to Sage Weil for taking time out of his busy schedule to record a pod cast talking about Ceph, as well 42on.com and inktank for hosting, and the invitation to attend the first Ceph Day in Amsterdam.

View of downtown Amsterdam on way to train station to return to Nijkerk
Returning to Amsterdam central station after Ceph Day

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Seven databases in seven weeks, a book review of NoSQL databases

StorageIO industry trends cloud, virtualization and big data

Seven Databases in Seven Weeks (A Guide to Modern Databases and the NoSQL Movement) is a book written Eric Redmond (@coderoshi) and Jim Wilson (@hexlib), part of The Pragmatic Programmers (@pragprog) series that takes a look at several non SQL based database systems.

Cover image of seven databases in seven weeks book image

Coverage includes PostgreSQL, Riak, Apache HBase, MongoDB, Apache CouchDB, Neo4J and Redis with plenty of code and architecture examples. Also covered include relational vs. key value, columnar and document based systems among others.

The details: Seven Databases in Seven Weeks
Paperback: 352 pages
Publisher: Pragmatic Bookshelf (May 18, 2012)
Language: English
ISBN-10: 1934356921
ISBN-13: 978-1934356920
Product Dimensions: 7.5 x 0.8 x 9 inches

Buzzwords (or keywords) include availability, consistency, performance and related themes. Others include MongoDB, Cassandra, Redis, Neo4J, JSON, CouchDB, Hadoop, HBase, Amazon Dynamo, Map Reduce, Riak (Basho) and Postgres along with data models including relational, key value, columnar, document and graph along with big data, little data, cloud and object storage.

While this book is not a how to tutorial or installation guide, it does give a deep dive into the different databases covered. The benefit is gaining an understanding of what the different databases are good for, strengths, weakness, where and when to use or choose them for various needs.

Look inside seven databases in seven weeks book image
A look inside my copy of Seven Databases in Seven Days

Who should this book includes applications developers, programmers, Cloud, big data and IT/ICT architects, planners and designers along with database, server, virtualization and storage professionals. What I like about the book is that it is a great intro and overview along with sufficient depth to understand what these different solutions can and cannot do, when, where and why to use these tools for different situations in a quick read format and plenty of detail.

Would I recommend buying it: Yes, I bought a copy myself on Amazon.com, get your copy by clicking here.

Ok, nuff said

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

SSD, flash and DRAM, DejaVu or something new?

StorageIO industry trends cloud, virtualization and big data

Recently I was in Europe for a couple of weeks including stops at Storage Networking World (SNW) Europe in Frankfurt, StorageExpo Holland, Ceph Day in Amsterdam (object and cloud storage), and Nijkerk where I delivered two separate 2 day, and a single 1 day seminar.

Image of Frankfurt transtationImage of inside front of ICE train going from Frankfurt to Utrecht

At the recent StorageExpo Holland event in Utrecht, I gave a couple of presentations, one on cloud, virtualization and storage networking trends, the other taking a deeper look at Solid State Devices (SSD’s). As in the past, StorageExpo Holland was great in a fantastic venue, with many large exhibits and great attendance which I heard was over 6,000 people over two days (excluding exhibitor vendors, vars, analysts, press and bloggers) which was several times larger than what was seen in Frankfurt at the SNW event.

Image of Ilja Coolen (twitter @@iCoolen) who was session host for SSD presentation in UtrechtImage of StorageExpo Holland exhibit show floor in Utrecht

Both presentations were very well attended and included lively interactive discussion during and after the sessions. The theme of my second talk was SSD, the question is not if, rather what to use where, how and when which brings us up to this post.

For those who have been around or using SSD for more than a decade outside of cell phones, camera, SD cards or USB thumb drives, that probably means DRAM based with some form of data persistency mechanisms. More recently mention SSD and that implies nand flash-based, either MLC or eMLC or SLC or perhaps emerging mram or PCM. Some might even think of NVRAM or other forms of SSD including emerging mram or mem-resistors among others, however lets stick to nand flash and dram for now.

image of ssd technology evolution

Often in technology what is old can be new, what is new can be seen as old, if you have seen, experienced or done something before you will have a sense of DejaVu and it might be evolutionary. On the other hand, if you have not seen, heard, experienced, or found a new audience, then it can be  revolutionary or maybe even an industry first ;).

Technology evolves, gets improved on, matures, and can often go in cycles of adoption, deployment, refinement, retirement, and so forth. SSD in general has been an on again, off again type cycle technology for the past several decades except for the past six to seven years. Normally there is an up cycle tied to different events, servers not being fast enough or affordable so use SSD to help address performance woes, or drives and storage systems not being fast enough and so forth.

Btw, for those of you who think that the current SSD focused technology (nand flash) is new, it is in fact 25 years old and still evolving and far from reaching its full potential in terms of customer deployment opportunities.

StorageIO industry trends cloud, virtualization and big data

Nand flash memory has helped keep SSD practical for the past several years riding the similar curve that is keeping hard disk drives (HDD’s) that they were supposed  to replace alive. That is improved reliability, endurance or duty cycle, better annual failure rate (AFR), larger space capacity, lower cost, and enhanced interfaces, packaging, power and functionality.

Where SSD can be used and options

DRAM historically at least for enterprise has been the main option for SSD based solutions using some form of data persistency. Data persistency options include battery backup combined with internal HDD’s to de stage information from the DRAM before power was lost. TMS (recently bought by IBM) was one of the early SSD vendors from the DRAM era that made the transition to flash including being one of the first many years ago to combine DRAM as a cache layer over nand flash as a persistency or de-stage layer. This would be an example of if you were not familiar with TMS back then and their capacities, you might think or believe that some more recent introductions are new and revolutionary, and perhaps they are in their own right or with enough caveats and qualifiers.

An emerging trend, which for some will be Dejavu, is that of using more DRAM in combination with nand flash SSD.

Oracle is one example of a vendor who IMHO rather quietly (intentionally or accidentally) has done this in the 7000 series storage systems as well as ExaData based database storage systems. Rest assured they are not alone and in fact many of the legacy large storage vendors have also piled up large amounts of DRAM based cache in their storage systems. For example EMC with 2TByte of DRAM cache in their VMAX 40K, or similar systems from Fujitsu HP, HDS, IBM and NetApp (including recent acquisition of DRAM based CacheIQ) among others. This has also prompted the question of if SSD has been successful in traditional storage arrays, systems or appliances as some would have you believe not, click here to learn more and cast your vote.

SSD, IO, memory and storage hirearchy

So is the future in the past? Some would say no, some will say yes, however IMHO there are lessons to learn and leverage from the past while looking and moving forward.

Early SSD’s were essentially RAM disks, that is a portion of main random access memory (RAM) or what we now call DRAM set aside as a non persistent (unless battery backed up) cache or device. Using a device driver, applications could use the RAM disk as though it were a normal storage system. Different vendors springing up with drivers for various platforms and disappeared as their need were reduced with faster storage systems, interfaces and ram disks drives supplied by vendors, not to mention SSD devices.

Oh, for you tech trivia types, there was also database machines from the late 80’s such as Briton Lee that would offload your database processing functions to a specialized appliance. Sound like Oracle ExaData  I, II or III to anybody?

Image of Oracle ExaData storage system

Ok, so we have seen this movie before, no worries, old movies or shows get remade, and unless you are nostalgic or cling to the past, sure some of the remakes are duds, however many can be quite good.

Same goes with the remake of some of what we are seeing now. Sure there is a generation that does not know nor care about the past, its full speed ahead and leverage what will get them there.

Thus we are seeing in memory databases again, some of you may remember the original series (pick your generation, platform, tool and technology) with each variation getting better. With 64 bit processor, 128 bit and beyond file system and addressing, not to mention ability for more DRAM to be accessed directly, or via memory address extension, combined with memory data footprint reduction or compression, there is more space to put things (e.g. no such thing as a data or information recession).

Lets also keep in mind that the best IO is the IO that you do not have to do, and that SSD which is an extension of the memory map plays by the same rules of real estate. That is location matters.

Thus, here we go again for some of you (DejaVu), while for others get ready for a new and exciting ride (new and revolutionary). We are back to the future with in memory database which while for a time will take some pressure from underlying IO systems until they once again out grow server memory addressing limits (or IT budgets).

However for those who do not fall into a false sense of security, no fear, as there is no such thing as a data or information recession. Sure as the sun rises in the east and sets in the west, sooner or later those IO’s that were or are being kept in memory will need to be de-staged to persistent storage, either nand flash SSD, HDD or somewhere down the road PCM, mram and more.

StorageIO industry trends cloud, virtualization and big data

There is another trend that with more IOs being cached, reads are moving to where they should resolve which is closer to the application or via higher up in the memory and IO pyramid or hierarchy (shown above).

Thus, we could see a shift over time to more writes and ugly IOs being sent down to the storage systems. Keep in mind that any cache historically provides temporal relieve, question is how long of a temporal relief or until the next new and revolutionary or DejaVu technology shows up.

Ok, go have fun now, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Is SSD only for performance?

StorageIO industry trends cloud, virtualization and big data

Normally solid state devices (SSD) including non-persistent DRAM, and persistent nand flash are thought of in the context of performance including bandwidth or throughput, response time or latency, and IOPS or transactions. However there is another role where SSD are commonly used where the primary focus is not performance. Besides consumer devise such as iPhones, iPads, iPods, Androids, MP3, cell phones and digital cameras, the other use is for harsh environments.

Harsh environments include those (both commercial and government) where use of SSDs are a solution to vibration or other rough handling. These include commercial and military aircraft, telemetry and mobile command, control and communications, energy exploration among others.

What’s also probably not commonly thought about is that the vendors or solution providers for the above specialized environments include mainstream vendors including IBM (via their TMS acquisition) and EMC among others. Yes, EMC is involved with deploying SSD in different environments including all nand flash-based VNX systems.

In a normal IT environment, vibration should not be an issue for storage devices assuming quality solutions with good enclosures are used. However some environments that are pushing the limits on density may become more susceptible to vibration. Not all of those use cases will be SSD opportunities, however some that can leverage IO density along with tolerance to vibration will be a good fit.

Does that mean HDDs can not or should not be used in high density environments where vibration can be an issue?

That depends.

If the right drive enclosures, type of drive are used  following manufactures recommendations, then all should be good. Keep in mind that there are many options to leverage SSD for various scenarios.

Which tool or technology to use when, where or how much will depend on the specific situation, or perhaps your preferences for a given product or approach.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Data Center Infrastructure Management (DCIM) and IRM

StorageIO industry trends cloud, virtualization and big data

There are many business drivers and technology reasons for adopting data center infrastructure management (DCIM) and infrastructure Resource Management (IRM) techniques, tools and best practices. Today’s agile data centers need updated management systems, tools, and best practices that allow organizations to plan, run at a low-cost, and analyze for workflow improvement. After all, there is no such thing as an information recession driving the need to move process and store more data. With budget and other constraints, organizations need to be able to stretch available resources further while reducing costs including for physical space and energy consumption.

The business value proposition of DCIM and IRM includes:

DCIM, Data Center, Cloud and storage management figure

Data Center Infrastructure Management or DCIM also known as IRM has as their names describe a focus around management resources in the data center or information factory. IT resources include physical floor and cabinet space, power and cooling, networks and cabling, physical (and virtual) servers and storage, other hardware and software management tools. For some organizations, DCIM will have a more facilities oriented view focusing on physical floor space, power and cooling. Other organizations will have a converged view crossing hardware, software, facilities along with how those are used to effectively deliver information services in a cost-effective way.

Common to all DCIM and IRM practices are metrics and measurements along with other related information of available resources for gaining situational awareness. Situational awareness enables visibility into what resources exist, how they are configured and being used, by what applications, their performance, availability, capacity and economic effectiveness (PACE) to deliver a given level of service. In other words, DCIM enabled with metrics and measurements that matter allow you to avoid flying blind to make prompt and effective decisions.

DCIM, Data Center and Cloud Metrics Figure

DCIM comprises the following:

  • Facilities, power (primary and standby, distribution), cooling, floor space
  • Resource planning, management, asset and resource tracking
  • Hardware (servers, storage, networking)
  • Software (virtualization, operating systems, applications, tools)
  • People, processes, policies and best practices for management operations
  • Metrics and measurements for analytics and insight (situational awareness)

The evolving DCIM model is around elasticity, multi-tenant, scalability, flexibility, and is metered and service-oriented. Service-oriented, means a combination of being able to rapidly give new services while keeping customer experience and satisfaction in mind. Also part of being focused on the customer is to enable organizations to be competitive with outside service offerings while focusing on being more productive and economic efficient.

DCIM, Data Center and Cloud E2E management figure

While specific technology domain areas or groups may be focused on their respective areas, interdependencies across IT resource areas are a matter of fact for efficient virtual data centers. For example, provisioning a virtual server relies on configuration and security of the virtual environment, physical servers, storage and networks along with associated software and facility related resources.

You can read more about DCIM, ITSM and IRM in this white paper that I did, as well as in my books Cloud and Virtual Data Storage Networking (CRC Press) and The Green and Virtual Data Center (CRC Press).

Ok, nuff said, for now.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

SSD past, present and future with Jim Handy

Now also available via

This is a new episode in the continuing StorageIO industry trends and perspectives pod cast series (you can view more episodes or shows along with other audio and video content here) as well as listening via iTunes or via your preferred means using this RSS feed (https://storageio.com/StorageIO_Podcast.xml)

StorageIO industry trends cloud, virtualization and big data

In this episode, I talk with SSD nand flash and DRAM chip analyst Jim Handy of Objective Analysis at the LSI AIS (Accelerating Innovation Summit) 2012 in San Jose. Our conversation includes SSD past, present and future, market and industry trends, who are doing what and things to keep an eye and ear, open for along with server, storage and memory convergence.

Click here (right-click to download MP3 file) or on the microphone image to listen to the conversation with Jim and myself.

StorageIO podcast

Also available via

Watch (and listen) for more StorageIO industry trends and perspectives audio blog posts pod casts and other upcoming events. Also be sure to heck out other related pod casts, videos, posts, tips and industry commentary at StorageIO.com and StorageIOblog.com.

Enjoy this episode SSD Past, Present and Future with Jim Handy.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Have SSDs been unsuccessful with storage arrays (with poll)?

Storage I/O Industry Trends and Perspectives

I hear people talking about how Solid State Devices (SSDs) have not been successful with or for vendors of storage arrays, particular legacy storage systems. Some people have also asserted that large storage arrays are dead at the hands of new purpose-built SSD appliances or storage systems (read more here).

As a reference, legacy storage systems include those from EMC (VMAX and VNX), IBM (DS8000, DCS3700, XIV, and V7000), and NetApp FAS along with those from Dell, Fujitsu, HDS, HP, NEC and Oracle among others.

Granted EMC have launched new SSD based solutions in addition to buying startup eXtremeIO (aka Project X), and IBM bought SSD industry veteran TMS. IMHO, neither of those actions by either vendor signals an early retirement for their legacy storage solutions, instead opening up new markets giving customers more options for addressing data center and IO performance challenges. Keep in mind that the best IO is the one that you do not have to do with the second best being the least impact to applications in a cost-effective way.

SSD, IO, memory and storage hirearchy

Sometimes I even hear people citing or using some other person or source to attribute or make their assertions sound authoritative. You know the game, according to XYZ or, ABC said blah blah blah blah. Of course if you say or repeat something often enough, or hear it again and again, it can become self-convincing (e.g. industry adoption vs. customer deployments). Likewise depending on how many degrees of separation exists between you and the information you get, the more that it can change from what it originally was.

So what about it, has SSD not been successful for legacy storage system vendors and is the only place that SSD has had success is with startups or non-array based solutions?

While there have been some storage systems (arrays and appliances) that may not perform up to their claimed capabilities due to various internal architecture or implementation bottlenecks. For the most part the large vendors including EMC, HP, HDS, IBM, NetApp and Oracle have done very well shipping SSD drives in their solutions. Likewise some of the clean sheet new design based startup systems, as well as some of the startups with hybrid solutions combing HDDs  and SSDs have done well while others are still emerging.

Where SSD can be used and options

This could also be an example where myth becomes reality based on industry adoption vs. customer deployment. What this means is that the myth is that it is the startups that are having success vs. the legacy vendors from an industry adoption conversation standpoint and thus believed by some.

On the other hand, the myth is that vendors such as EMC or NetApp have not had success with their arrays and SSD yet their customer deployments prove otherwise. There is also a myth that only PCIe based SSD can be of value and that drive based SSDs are not worth using which I have a good idea where that myth comes from.

IMHO it is a depends, however safe to say from what I have seen directly that there are some vendors of storage arrays, including so-called legacy systems that have had very good success with SSD. Likewise have seen where some startups have done ok with their new clean sheet designs, including EMC (Project X). Oh, at least for now I am not a believer that with the all SSD based project “X” over at EMC that the venerable VMAX  formerly known as DMX and its predecessors Symmetric have finally hit the end of the line. Rather they will be positioned and play to different markets for some time yet.

Over at IBM I don’t think the DS8000 or XIV or V7000 and SVC folks are winding things down now that they bought SSD vendor TMS who has SSD appliances and PCIe cards. Rest assured there have been success by PCIe flash card vendors both as targets (FusionIO) and cache or hybrid cache and target systems such as those from Intel, LSI, Micron, and TMS (now IBM) among others. Oh, and if you have not noticed, check out what Qlogic, Emulex and some of the other traditional HBA vendors have done with and around SSD caching.

So where does the FUD that storage systems have not had success with SSD come from?

I suspect from those who would rather not see or hear about those who have had success taking away attention from them or their markets. In other words, using Fear, Uncertainty and Doubt (FUD) or some community peer pressure, there is a belief by some that if you hear enough times that something is dead or not of a benefit; you will look at the alternatives.

Care to guess what the preferred alternative is for some? If you guessed a PCIe card or SSD based appliance from your favorite startup that would be a fair assumption.

On the other hand, my educated guess (ok, its much more informed than a guess ;) ) is that if you ask a vendor such as EMC or NetApp they would disagree, while at the same time articulate benefits of different approaches and tools. Likewise, my educated guess is that if you ask some others, they will say mixed things and of course if you talk with the pure plays, take a wild yet educated guess what they will say.

Here is my point.

SSD, DRAM, PCM and storage adoption timeline

The SSD market, including DRAM, nand flash (SLC or MLC or any other xLC), emerging PCM or future mram among other technologies and packaging options is still in its relative infancy. Yes, I know there have been significant industry adoption and many early customer deployments, however talking with IT organizations of all size as well as with vendors and vars, customer deployment of SSD is far from reaching its full potential meaning a bright future.

Simply putting an SSD, card or drive into a solution does not guarantee results.

Likewise having a new architecture does not guarantee things will be faster.

Fast storage systems need fast devices (HDD, HHDD and SSDs) along with fast interfaces to connect with fast servers. Put a fast HDD, HHDD or SSD into a storage system that has bottlenecks (hardware, software, architectural design) and you may not see the full potential of the technology. Likewise put fast ports or interfaces on a storage system that has fast devices however also a bottleneck in its controller has or system architecture and you will not realize the full potential of that solution.

This is not unique to legacy or traditional storage systems, arrays or appliances as it is also the case with new clean sheet designs.

There are many new solutions that are or should be as fast as their touted marketing stories present, however just because something looks impressive in a YouTube video or slide deck or WebEx does not mean it will be fast in your environment. Some of these new design SSD based solutions will displace some legacy storage systems or arrays while many others will find new opportunities. Similar to how previous generation SSD storage appliances found roles complementing traditional storage systems, so to will many of these new generation of products.

What this all means is to navigate your way through the various marketing and architecture debates, benchmarks battles, claims and counter claims to understand what fits your needs and requires.

StorageIO industry trends cloud, virtualization and big data

What say you?

Ok, nuff said

Cheers
Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Mr. Backup (Curtis Preston) goes back to Ceph School

Now also available via

This is a new episode in the continuing StorageIO industry trends and perspectives pod cast series (you can view more episodes or shows along with other audio and video content here) as well as listening via iTunes or via your preferred means using this RSS feed (https://storageio.com/StorageIO_Podcast.xml)

StorageIO industry trends cloud, virtualization and big data

In this episode, I am at the Ceph day in Amsterdam Holland event at the Tobacco Theatre hosted by on42.com and inktank.com.

Ceph Day Amsterdam 2012

My guest for this episode is Curtis (Mr. Backup) Preston (@wcpreston) of Backup School and Backup Central fame where we discuss what is Ceph and object storage, cloud storage, file systems, backup and data protection along with dinner we had at an Indonesian restaurant .

Dinner Restaurant Blauw Utrecht Netherlands
Mr Backup getting ready to compress and dedupe dinner

The dinner we are referring to was at Restaurant Blauw in Utrecht Holland (click here) where Curtis and me were joined by Hans De Leenher @hansdeleenher of Veeam (thanks again for the dinner, that was a disclosure btw ;) ).

Note that this is a special episode in that while I’m recording the pod cast, Curtis is recording a video of our discussion for his truebit.tv site that you can view here.

Click here (right-click to download MP3 file) or on the microphone image to listen to the conversation with Curtis and myself.

StorageIO podcast

Also available via

Watch (and listen) for more StorageIO industry trends and perspectives audio blog posts pod casts and other upcoming events. Also be sure to heck out other related pod casts, videos, posts, tips and industry commentary at StorageIO.com and StorageIOblog.com.

Also check out the companion to this pod cast where I meet up with Ceph Creator Sage Weil while at Ceph Day.

Enjoy this episode Mr. Backup (Curtis Preston) goes back to Ceph School.

 

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Ben Woo on Big Data Buzzword Bingo and Business Benefits

Now also available via

This is a new episode in the continuing StorageIO industry trends and perspectives pod cast series (you can view more episodes or shows along with other audio and video content here) as well as listening via iTunes or via your preferred means using this RSS feed (https://storageio.com/StorageIO_Podcast.xml)

StorageIO industry trends cloud, virtualization and big data

In this episode, In this episode, Im joined in Frankfurt Germany by Ben Woo (@benwoony) of Neuralytix.com. Our conversation includes cloud; big data and how buzzword bingo technology focused discussions can result in missed business benefits for both vendors and customers. We also reminisce about MTI where we worked together along with protecting home storage.

Click here (right-click to download MP3 file) or on the microphone image to listen to the conversation with Ben and myself.

StorageIO podcast

Watch (and listen) for more StorageIO industry trends and perspectives audio blog posts pod casts and other upcoming events. Also be sure to heck out other related pod casts, videos, posts, tips and industry commentary at StorageIO.com and StorageIOblog.com.

Enjoy this episode with Ben Woo talking big data and business benefits vs. buzzword bingo.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Ceph Day in Amsterdam and Sage Weil on Object Storage

Now also available via

This is a new episode in the continuing StorageIO industry trends and perspectives pod cast series (you can view more episodes or shows along with other audio and video content here) as well as listening via iTunes or via your preferred means using this RSS feed (https://storageio.com/StorageIO_Podcast.xml)

StorageIO industry trends cloud, virtualization and big data

In this episode, I am at the Ceph day in Amsterdam Holland event at the Tobacco Theatre. My guest for this episode is Ceph (Cephalanthera) creator Sage Weil who is also the founder of inktank.com that provides services and support for the open source based Ceph project.

For those not familiar with Ceph, it is an open source distributed object scale out software platform that can be used for deploying cloud and managed services, general purpose storage for research, commercial, scientific, high performance computing (HPC) or high productivity computing (commercial) along with backup or data protection and archiving destinations.

During our conversation Sage presents an overview of what Ceph is (e.g. Ceph for non Dummies), where and how it can be used, some history of the project and how it fits in with or provides an alternative to other solutions. Sage also talks about the business or commercial considerations for open source based projects, importance of community and having good business mentors and partners as well as staying busy with his young family.

If you are a Ceph fan, gain more insight into Sage along with Ceph day sponsors Inktank and 42on. On the other hand, if you new to object storage, open source storage software or cloud storage, listen in to gain perspectives of where technology such as Ceph fits for public, private, hybrid or traditional environments.

Click here (right-click to download MP3 file) or on the microphone image to listen to the conversation with Sage and myself.

StorageIO podcast

Also available via

Watch (and listen) for more StorageIO industry trends and perspectives audio blog posts pod casts and other upcoming events. Also be sure to heck out other related pod casts, videos, posts, tips and industry commentary at StorageIO.com and StorageIOblog.com.

Enjoy this episode Ceph Day in Amsterdam with Sage Weil.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Little data, big data and very big data (VBD) or big BS?

StorageIO industry trends cloud, virtualization and big data

This is an industry trends and perspective piece about big data and little data, industry adoption and customer deployment.

If you are in any way associated with information technology (IT), business, scientific, media and entertainment computing or related areas, you may have heard big data mentioned. Big data has been a popular buzzword bingo topic and term for a couple of years now. Big data is being used to describe new and emerging along with existing types of applications and information processing tools and techniques.

I routinely hear from different people or groups trying to define what is or is not big data and all too often those are based on a particular product, technology, service or application focus. Thus it should be no surprise that those trying to police what is or is not big data will often do so based on what their interest, sphere of influence, knowledge or experience and jobs depend on.

Traveling and big data images

Not long ago while out traveling I ran into a person who told me that big data is new data that did not exist just a few years ago. Turns out this person was involved in geology so I was surprised that somebody in that field was not aware of or working with geophysical, mapping, seismic and other legacy or traditional big data. Turns out this person was basing his statements on what he knew, heard, was told about or on sphere of influence around a particular technology, tool or approach.

Fwiw, if you have not figured out already, like cloud, virtualization and other technology enabling tools and techniques, I tend to take a pragmatic approach vs. becoming latched on to a particular bandwagon (for or against) per say.

Not surprisingly there is confusion and debate about what is or is not big data including if it only applies to new vs. existing and old data. As with any new technology, technique or buzzword bingo topic theme, various parties will try to place what is or is not under the definition to align with their needs, goals and preferences. This is the case with big data where you can routinely find proponents of Hadoop and Map reduce position big data as aligning with the capabilities and usage scenarios of those related technologies for business and other forms of analytics.

SAS software for big data

Not surprisingly the granddaddy of all business analytics, data science and statistic analysis number crunching is the Statistical Analysis Software (SAS) from the SAS Institute. If these types of technology solutions and their peers define what is big data then SAS (not to be confused with Serial Attached SCSI which can be found on the back-end of big data storage solutions) can be considered first generation big data analytics or Big Data 1.0 (BD1 ;) ). That means Hadoop Map Reduce is Big Data 2.0 (BD2 ;) ;) ) if you like, or dislike for that matter.

Funny thing about some fans and proponents or surrogates of BD2 is that they may have heard of BD1 like SAS with a limited understanding of what it is or how it is or can be used. When I worked in IT as a performance and capacity planning analyst focused on servers, storage, network hardware, software and applications I used SAS to crunch various data streams of event, activity and other data from diverse sources. This involved correlating data, running various analytic algorithms on the data to determine response times, availability, usage and other things in support of modeling, forecasting, tuning and trouble shooting. Hmm, sound like first generation big data analytics or Data Center Infrastructure Management (DCIM) and IT Service Management (ITSM) to anybody?

Now to be fair, comparing SAS, SPSS or any number of other BD1 generation tools to Hadoop and Map Reduce or BD2 second generation tools is like comparing apples to oranges, or apples to pears.

Lets move on as there is much more to what is big data than simply focus around SAS or Hadoop.

StorageIO industry trends cloud, virtualization and big data

Another type of big data are the information generated, processed, stored and used by applications that result in large files, data sets or objects. Large file, objects or data sets include low resolution and high-definition photos, videos, audio, security and surveillance, geophysical mapping and seismic exploration among others. Then there are data warehouses where transactional data from databases gets moved to for analysis in systems such as those from Oracle, Teradata, Vertica or FX among others. Some of those other tools even play (or work) in both traditional e.g. BD1 and new or emerging BD2 worlds.

This is where some interesting discussions, debates or disagreements can occur between those who latch onto or want to keep big data associated with being something new and usually focused around their preferred tool or technology. What results from these types of debates or disagreements is a missed opportunity for organizations to realize that they might already be doing or using a form of big data and thus have a familiarity and comfort zone with it.

By having a familiarity or comfort zone vs. seeing big data as something new, different, hype or full of FUD (or BS), an organization can be comfortable with the term big data. Often after taking a step back and looking at big data beyond the hype or fud, the reaction is along the lines of, oh yeah, now we get it, sure, we are already doing something like that so lets take a look at some of the new tools and techniques to see how we can extend what we are doing.

Likewise many organizations are doing big bandwidth already and may not realize it thinking that is only what media and entertainment, government, technical or scientific computing, high performance computing or high productivity computing (HPC) does. I’m assuming that some of the big data and big bandwidth pundits will disagree, however if in your environment you are doing many large backups, archives, content distribution, or copying large amounts of data for different purposes that consume big bandwidth and need big bandwidth solutions.

Yes I know, that’s apples to oranges and perhaps stretching the limits of what is or can be called big bandwidth based on somebody’s definition, taxonomy or preference. Hopefully you get the point that there is diversity across various environments as well as types of data and applications, technologies, tools and techniques.

StorageIO industry trends cloud, virtualization and big data

What about little data then?

I often say that if big data is getting all the marketing dollars to generate industry adoption, then little data is generating all the revenue (and profit or margin) dollars by customer deployment. While tools and technologies related to Hadoop (or Haydoop if you are from HDS) are getting industry adoption attention (e.g. marketing dollars being spent) revenues from customer deployment are growing.

Where big data revenues are strongest for most vendors today are centered around solutions for hosting, storing, managing and protecting big files, big objects. These include scale out NAS solutions for large unstructured data like those from Amplidata, Cray, Dell, Data Direct Networks (DDN), EMC (e.g. Isilon), HP X9000 (IBRIX), IBM SONAS, NetApp, Oracle and Xyratex among others. Then there flexible converged compute storage platforms optimized for analytics and running different software tools such as those from EMC (Greenplum), IBM (Netezza), NetApp (via partnerships) or Oracle among others that can be used for different purposes in addition to supporting Hadoop and Map reduce.

If little data is databases and things not generally lumped into the big data bucket, and if you think or perceive big data only to be Hadoop map reduce based data, then does that mean all the large unstructured non little data is then very big data or VBD?

StorageIO industry trends cloud, virtualization and big data

Of course the virtualization folks might want to if they have not already corner the V for Virtual Big Data. In that case, then instead of Very Big Data, how about very very Big Data (vvBD). How about Ultra-Large Big Data (ULBD), or High-Revenue Big Data (HRBD), granted the HR might cause some to think its unique for Health Records, or Human Resources, both btw leverage different forms of big data regardless of what you see or think big data is.

Does that then mean we should really be calling videos, audio, PACs, seismic, security surveillance video and related data to be VBD? Would this further confuse the market, or the industry or help elevate it to a grander status in terms of size (data file or object capacity, bandwidth, market size and application usage, market revenue and so forth)?

Do we need various industry consortiums, lobbyists or trade groups to go off and create models, taxonomies, standards and dictionaries based on their constituents needs and would they align with those of the customers, after all, there are big dollars flowing around big data industry adoption (marketing).

StorageIO industry trends cloud, virtualization and big data

What does this all mean?

Is Big Data BS?

First let me be clear, big data is not BS, however there is a lot of BS marketing BS by some along with hype and fud adding to the confusion and chaos, perhaps even missed opportunities. Keep in mind that in chaos and confusion there can be opportunity for some.

IMHO big data is real.

There are different variations, use cases and types of products, technologies and services that fall under the big data umbrella. That does not mean everything can or should fall under the big data umbrella as there is also little data.

What this all means is that there are different types of applications for various industries that have big and little data, virtual and very big data from videos, photos, images, audio, documents and more.

Big data is a big buzzword bingo term these days with vendor marketing big dollars being applied so no surprise the buzz, hype, fud and more.

Ok, nuff said, for now.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Networking with Bruce Ravid and Bruce Rave

Now also available via

This is the eighth (here is the first, second, third, fourth, fifth, sixth, and seventh) in a series of StorageIO industry trends and perspective audio blog and pod cast discussions from Storage Networking World (SNW) Fall 2012 in Santa Clara California.

StorageIO industry trends cloud, virtualization and big data

In this episode, my co-host Bruce Rave aka Bruce Ravid of Ravid and Associates (twitter @brucerave) recap the recent SNW conference and series of pod casts. Our conversation also covers importance of networking and career tips (Bruce is an executive recruiter aka career advisory consultant) for those of you that are new and upcoming, as well those of you who are seasoned veterans to standout in a crowd.

Bruce also talks about his internet music radio show called Go Deep on moheak.com along with up and coming bands to keep an eye and ear open for in 2013. Check out Bruces sites at ravid.com and godeepmusic.net as well as listen to his internet radio show that airs weekly Sunday evenings 7 to 9PM PT on moheak.com.

Click here (right-click to download MP3 file) or on the microphone image to listen to the conversation with Bruce and myself.

StorageIO podcast

Also available via

Watch (and listen) for more StorageIO industry trends and perspectives audio blog posts pod casts from SNW and other upcoming events. Also be sure to heck out other related pod casts, videos, posts, tips and industry commentary at StorageIO.com and StorageIOblog.com.

Enjoy listening to Networking with Bruce Ravid.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Industry trends and perspectives: SNW 2012 Rapping with Dave Raffo of SearchStorage

Now also available via

This is the seventh (here is the first, second, third, fourth, fifth and sixth) in a series of StorageIO industry trends and perspective audio blog and pod cast discussions from Storage Networking World (SNW) Fall 2012 in Santa Clara California.

StorageIO industry trends cloud, virtualization and big data

Given how at conference conversations tend to occur in the hallways, lobbies and bar areas of venues, what better place to have candid conversations with people from throughout the industry, some you know, some you will get to know better.

In this episode, my co-host Bruce Rave aka Bruce Ravid of Ravid and Associates (twitter @brucerave) meets up Sr. News Director Dave Raffo of TechTarget and Search Storage in the SNW trade show expo hall. Our conversation covers past and present SNWs along with other industry conferences, industry trends, software defined buzzwords, Green Bay Packers smack and more.

Click here (right-click to download MP3 file) or on the microphone image to listen to the conversation with Dave, Bruce and myself.

StorageIO podcast

Also available via

Watch (and listen) for more StorageIO industry trends and perspectives audio blog posts pod casts from SNW and other upcoming events. Also be sure to heck out other related pod casts, videos, posts, tips and industry commentary at StorageIO.com and StorageIOblog.com.

Enjoy listening to Rapping with Dave Raffo of Search Storage from the Fall SNW 2012 pod cast.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved