object storage Archives

November 26, 2012 / December 29, 2025

Seven databases in seven weeks, a book review of NoSQL databases

Seven Databases in Seven Weeks (A Guide to Modern Databases and the NoSQL Movement) is a book written by Eric Redmond (@coderoshi) and Jim Wilson (@hexlib), part of The Pragmatic Programmers (@pragprog) series, that takes a look at several non-SQL-based database systems.

Coverage includes PostgreSQL, Riak, Apache HBase, MongoDB, Apache CouchDB, Neo4J and Redis with plenty of code and architecture examples. Also covered are relational vs. key value, columnar and document-based systems, among others.

The details: Seven Databases in Seven Weeks
Paperback: 352 pages
Publisher: Pragmatic Bookshelf (May 18, 2012)
Language: English
ISBN-10: 1934356921
ISBN-13: 978-1934356920
Product Dimensions: 7.5 x 0.8 x 9 inches

Buzzwords (or keywords) include availability, consistency, performance and related themes. Others include MongoDB, Cassandra, Redis, Neo4J, JSON, CouchDB, Hadoop, HBase, Amazon Dynamo, Map Reduce, Riak (Basho) and Postgres along with data models including relational, key value, columnar, document and graph along with big data, little data, cloud and object storage.

While this book is not a how-to tutorial or installation guide, it does give a deep dive into the different databases covered. The benefit is gaining an understanding of what the different databases are good for, their strengths and weaknesses, and where and when to use or choose them for various needs.

A look inside my copy of Seven Databases in Seven Weeks

Who should read this book: application developers, programmers, cloud, big data and IT/ICT architects, planners and designers, along with database, server, virtualization and storage professionals. What I like about the book is that it is a great intro and overview with sufficient depth to understand what these different solutions can and cannot do, and when, where and why to use these tools for different situations, in a quick-read format with plenty of detail.

Would I recommend buying it? Yes, I bought a copy myself on Amazon.com; get your copy by clicking here.

Ok, nuff said

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

November 12, 2012 / December 29, 2025

Mr. Backup (Curtis Preston) goes back to Ceph School

This is a new episode in the continuing StorageIO industry trends and perspectives pod cast series (you can view more episodes or shows along with other audio and video content here). You can also listen via iTunes or via your preferred means using this RSS feed.

In this episode, I am at the Ceph Day event in Amsterdam, Holland, at the Tobacco Theatre, hosted by on42.com and inktank.com.

Ceph Day Amsterdam 2012

My guest for this episode is Curtis (Mr. Backup) Preston (@wcpreston) of Backup School and Backup Central fame. We discuss what Ceph and object storage are, cloud storage, file systems, backup and data protection, along with the dinner we had at an Indonesian restaurant.

Dinner Restaurant Blauw Utrecht Netherlands
Mr Backup getting ready to compress and dedupe dinner

The dinner we are referring to was at Restaurant Blauw in Utrecht, Holland (click here), where Curtis and I were joined by Hans De Leenher @hansdeleenher of Veeam (thanks again for the dinner, that was a disclosure btw ;) ).

Note that this is a special episode in that while I’m recording the pod cast, Curtis is recording a video of our discussion for his truebit.tv site that you can view here.

Click here (right-click to download MP3 file) or on the microphone image to listen to the conversation with Curtis and myself.

Watch (and listen) for more StorageIO industry trends and perspectives audio blog posts, pod casts and other upcoming events. Also be sure to check out other related pod casts, videos, posts, tips and industry commentary at StorageIO.com and StorageIOblog.com.

Also check out the companion to this pod cast where I meet up with Ceph Creator Sage Weil while at Ceph Day.

Enjoy this episode, Mr. Backup (Curtis Preston) goes back to Ceph School.

Ok, nuff said.

Cheers gs

October 28, 2012 / December 29, 2025

Little data, big data and very big data (VBD) or big BS?

This is an industry trends and perspective piece about big data and little data, industry adoption and customer deployment.

If you are in any way associated with information technology (IT), business, scientific, media and entertainment computing or related areas, you may have heard big data mentioned. Big data has been a popular buzzword bingo topic and term for a couple of years now. Big data is being used to describe new and emerging along with existing types of applications and information processing tools and techniques.

I routinely hear from different people or groups trying to define what is or is not big data and all too often those are based on a particular product, technology, service or application focus. Thus it should be no surprise that those trying to police what is or is not big data will often do so based on what their interest, sphere of influence, knowledge or experience and jobs depend on.

Not long ago while out traveling I ran into a person who told me that big data is new data that did not exist just a few years ago. Turns out this person was involved in geology, so I was surprised that somebody in that field was not aware of or working with geophysical, mapping, seismic and other legacy or traditional big data. Turns out this person was basing his statements on what he knew, had heard or had been told, or on his sphere of influence around a particular technology, tool or approach.

Fwiw, if you have not figured out already, like cloud, virtualization and other technology enabling tools and techniques, I tend to take a pragmatic approach vs. becoming latched on to a particular bandwagon (for or against) per se.

Not surprisingly there is confusion and debate about what is or is not big data, including whether it only applies to new vs. existing and old data. As with any new technology, technique or buzzword bingo topic theme, various parties will try to place what is or is not under the definition to align with their needs, goals and preferences. This is the case with big data, where you can routinely find proponents of Hadoop and Map Reduce positioning big data as aligning with the capabilities and usage scenarios of those related technologies for business and other forms of analytics.

Not surprisingly the granddaddy of all business analytics, data science and statistical analysis number crunching is the Statistical Analysis System (SAS) from the SAS Institute. If these types of technology solutions and their peers define what is big data then SAS (not to be confused with Serial Attached SCSI which can be found on the back-end of big data storage solutions) can be considered first generation big data analytics or Big Data 1.0 (BD1 ;) ). That means Hadoop Map Reduce is Big Data 2.0 (BD2 ;) ;) ) if you like, or dislike for that matter.

Funny thing about some fans and proponents or surrogates of BD2 is that they may have heard of BD1 like SAS with a limited understanding of what it is or how it is or can be used. When I worked in IT as a performance and capacity planning analyst focused on servers, storage, network hardware, software and applications I used SAS to crunch various data streams of event, activity and other data from diverse sources. This involved correlating data, running various analytic algorithms on the data to determine response times, availability, usage and other things in support of modeling, forecasting, tuning and trouble shooting. Hmm, sound like first generation big data analytics or Data Center Infrastructure Management (DCIM) and IT Service Management (ITSM) to anybody?

Now to be fair, comparing SAS, SPSS or any number of other BD1 generation tools to Hadoop and Map Reduce or BD2 second generation tools is like comparing apples to oranges, or apples to pears.

Let's move on, as there is much more to what is big data than simply a focus on SAS or Hadoop.

Another type of big data is the information generated, processed, stored and used by applications that result in large files, data sets or objects. Large files, objects or data sets include low resolution and high-definition photos, videos, audio, security and surveillance, geophysical mapping and seismic exploration among others. Then there are data warehouses where transactional data from databases gets moved to for analysis in systems such as those from Oracle, Teradata, Vertica or FX among others. Some of those other tools even play (or work) in both traditional e.g. BD1 and new or emerging BD2 worlds.

This is where some interesting discussions, debates or disagreements can occur between those who latch onto or want to keep big data associated with being something new and usually focused around their preferred tool or technology. What results from these types of debates or disagreements is a missed opportunity for organizations to realize that they might already be doing or using a form of big data and thus have a familiarity and comfort zone with it.

By having a familiarity or comfort zone vs. seeing big data as something new, different, hype or full of FUD (or BS), an organization can be comfortable with the term big data. Often after taking a step back and looking at big data beyond the hype or FUD, the reaction is along the lines of, oh yeah, now we get it, sure, we are already doing something like that so let's take a look at some of the new tools and techniques to see how we can extend what we are doing.

Likewise many organizations are doing big bandwidth already and may not realize it, thinking that is only what media and entertainment, government, technical or scientific computing, high performance computing or high productivity computing (HPC) does. I’m assuming that some of the big data and big bandwidth pundits will disagree, however if in your environment you are doing many large backups, archives, content distribution, or copying large amounts of data for different purposes, then you are consuming big bandwidth and need big bandwidth solutions.

Yes I know, that’s apples to oranges and perhaps stretching the limits of what is or can be called big bandwidth based on somebody’s definition, taxonomy or preference. Hopefully you get the point that there is diversity across various environments as well as types of data and applications, technologies, tools and techniques.

What about little data then?

I often say that if big data is getting all the marketing dollars to generate industry adoption, then little data is generating all the revenue (and profit or margin) dollars by customer deployment. While tools and technologies related to Hadoop (or Haydoop if you are from HDS) are getting industry adoption attention (e.g. marketing dollars being spent) revenues from customer deployment are growing.

Big data revenues for most vendors today are strongest around solutions for hosting, storing, managing and protecting big files and big objects. These include scale out NAS solutions for large unstructured data like those from Amplidata, Cray, Dell, Data Direct Networks (DDN), EMC (e.g. Isilon), HP X9000 (IBRIX), IBM SONAS, NetApp, Oracle and Xyratex among others. Then there are flexible converged compute storage platforms optimized for analytics and running different software tools such as those from EMC (Greenplum), IBM (Netezza), NetApp (via partnerships) or Oracle among others that can be used for different purposes in addition to supporting Hadoop and Map Reduce.

If little data is databases and things not generally lumped into the big data bucket, and if you think or perceive big data only to be Hadoop map reduce based data, then does that mean all the large unstructured non little data is then very big data or VBD?

Of course the virtualization folks might want to, if they have not already, corner the V for Virtual Big Data. In that case, instead of Very Big Data, how about very very Big Data (vvBD)? How about Ultra-Large Big Data (ULBD), or High-Revenue Big Data (HRBD)? Granted, the HR might cause some to think it's unique to Health Records or Human Resources, both of which btw leverage different forms of big data regardless of what you see or think big data is.

Does that then mean we should really be calling videos, audio, PACS, seismic, security surveillance video and related data VBD? Would this further confuse the market or the industry, or help elevate it to a grander status in terms of size (data file or object capacity, bandwidth, market size and application usage, market revenue and so forth)?

Do we need various industry consortiums, lobbyists or trade groups to go off and create models, taxonomies, standards and dictionaries based on their constituents' needs, and would they align with those of the customers? After all, there are big dollars flowing around big data industry adoption (marketing).

What does this all mean?

Is Big Data BS?

First let me be clear, big data is not BS, however there is a lot of marketing BS by some along with hype and FUD adding to the confusion and chaos, perhaps even missed opportunities. Keep in mind that in chaos and confusion there can be opportunity for some.

IMHO big data is real.

There are different variations, use cases and types of products, technologies and services that fall under the big data umbrella. That does not mean everything can or should fall under the big data umbrella as there is also little data.

What this all means is that there are different types of applications for various industries that have big and little data, virtual and very big data from videos, photos, images, audio, documents and more.

Big data is a big buzzword bingo term these days with vendor marketing big dollars being applied so no surprise the buzz, hype, fud and more.

Ok, nuff said, for now.

Cheers gs

October 27, 2012 / December 29, 2025

Industry trends and perspectives: SNW 2012 Rapping with Dave Raffo of SearchStorage

This is the seventh (here is the first, second, third, fourth, fifth and sixth) in a series of StorageIO industry trends and perspective audio blog and pod cast discussions from Storage Networking World (SNW) Fall 2012 in Santa Clara California.

Given how conversations at conferences tend to occur in the hallways, lobbies and bar areas of venues, what better place to have candid conversations with people from throughout the industry, some you know, some you will get to know better.

In this episode, my co-host Bruce Rave aka Bruce Ravid of Ravid and Associates (twitter @brucerave) meets up with Sr. News Director Dave Raffo of TechTarget and Search Storage in the SNW trade show expo hall. Our conversation covers past and present SNWs along with other industry conferences, industry trends, software defined buzzwords, Green Bay Packers smack and more.

Click here (right-click to download MP3 file) or on the microphone image to listen to the conversation with Dave, Bruce and myself.

Watch (and listen) for more StorageIO industry trends and perspectives audio blog posts and pod casts from SNW and other upcoming events. Also be sure to check out other related pod casts, videos, posts, tips and industry commentary at StorageIO.com and StorageIOblog.com.

Enjoy listening to Rapping with Dave Raffo of Search Storage from the Fall SNW 2012 pod cast.

Ok, nuff said.

Cheers gs

October 14, 2012 / April 27, 2025

Trick or treat and vendor fun games

In the spirit of Halloween and zombies season, a couple of thoughts come to mind about vendor tricks and treats. This is an industry trends and perspectives post, part of an ongoing series looking at various technology and fun topics.

The first trick or treat game pertains to the blame game; you know either when something breaks, or at the other extreme, before you have even made a decision to buy something. The trick or treat game for decision-making goes something like this.

Vendor “A” says products succeed with their solution while failure results with a solution from “B” when doing “X”. Otoh, vendor “B” claims that “X” will fail when using a solution from vendor “A”. In fact, you can pick what you want to substitute for “X”, perhaps VDI, PCIe, Big Data, Little Data, Backup, Archive, Analytics, Private Cloud, Public Cloud, Hybrid Cloud, eDiscovery you name it.

This is not a complicated math or big data problem requiring a high-performance computing (HPC) platform. An HPC Zetta-Flop processing ability using 512 bit addressing of 9.9 (e.g. 1 nine) PetaBytes of battery-backed DRAM and an IO capability of 9.99999 (e.g. 5 9’s) trillion 8 bit IOPS to do table pivots or Runge-Kutta numerical analysis, map reduce, SAS or other modeling with an optional iProduct or Android interface is not needed.

StorageIO images of touring Texas Advanced Computing (e.g. HPC) Center

Can you solve this equation? Hint: it does not need a PhD or any other advanced degree. Another hint: if you have ever been at any side of the technology product and services decision-making table, regardless of the costume you wore, you should know the answer.

Of course, there is the question of whether “X” would fail regardless of who or what “A” or “B” is, let alone a “C”, “D” or “F”. In other words, it is not the solution, technology, vendor or provider, rather the problem or perhaps even lack thereof that is the issue. Or is it a case where there is a solution from “A”, “B” or any others that is looking for a problem, and if it is the wrong problem, there can be a wrong solution and thus failure?

Another trick or treat game is when vendors' public relations (PR) or analyst relations (AR) people ask for one thing and deliver or ask for another. For example, some vendor, service provider, their marketing AR and PR people or surrogates make contact wanting to tell of various success and failure stories. Of course, this is usually their success and somebody else’s failure, or their victory over something or someone, which sometimes can be interesting. Of course, there are also the treats to get you to listen to the above, such as tempting you with a project if you meet with their subject, which may be a trick of a disappearing treat (e.g. magic, poof it is gone after the discussion).

There is another AR and PR trick and treat where they offer, on behalf of the organization or client they represent, a perspective or exclusive insight on their competitor. Of course, the treat from their perspective is that they will generously expose all that is wrong with what a competitor is saying about its own (e.g. the competitor's) product.

Let me get this straight, I am not supposed to believe what somebody says about his or her own product, however, I am supposed to believe what a competitor says is wrong with the competition’s product, and what is right with his or her own product.

Hmm, ok, so let me get this straight, a competitor say “A” wants to tell me what somebody say from “B” has told me is wrong and I should schedule a visit with a truth squad member from “A” to get the record set straight about “B”?

Does that mean then that I go to “B” for a rebuttal, as well as an update about “A” from “B”, assuming that what “A” has told me is also false about themselves, and perhaps about “B” or any other?

To be fair, depending on your level of trust and confidence in a vendor, their personnel or surrogates, you might tend to believe more from them vs. others, or at least until you have been tricked after being given treats. There may be some that have been tricked, or they tried applying too many treats to present a story that behind the costume might be a bit scary.

Having been through enough of these, I candidly believe that sometimes “A” or “B” or any other party actually do believe that they have more or better info about their competitor and that they can convince somebody about what their competitor is doing better than the competitor can. I also believe that there are people out there who will go to “A” or “B” and believe what they are told based on their preference, bias or interests.

When I hear from vendors, VARs, solution or service providers and others, it’s interesting hearing point, counterpoint and so forth. However, if time is limited, I am more interested in hearing from, say, “A” about themselves: what they are doing, where they are having success, where they face challenges, where they are going and, if applicable, more detail under NDA.

Customer success stories are good, however again, if interested in what works, what kind of works, or what does not work, chances are when looking for G2 vs. GQ, a non-scripted customer conversation or perspective of the good, the bad and the ugly is preferred, even if under NDA. Again, if time is limited which it usually is, focus on what is being done with your solution, where it is going and if compelled send follow-up material that can of course include MUD and FUD about others if that is your preference.

Then there is when, during a 30 minute briefing, the vendor or solution provider is still talking about trends, customer pain points and what competitors are doing at 21 minutes into the call with no sign of an announcement, update or news in sight.

Let's not forget about the trick where the vendor marketing or PR person reaches out and says that the CEO, CMO, CTO or some other CxO or Chief Jailable Officer (CJO) wants to talk with you. Part of the trick is when the CxO actually makes it to the briefing and is not ready, does not know why the call is occurring, or thinks that a request for an audience has been made with them for an interview or something else.

A treat is when 3 to 4 minutes into a briefing, the vendor or solution provider has already framed up what and why they are doing something. This means getting to what they are announcing or planning on doing and getting into a conversation to discuss what they are doing and making good follow-up content and resources available.

Sometimes a treat is when a briefer goes on autopilot nailing their script for 29 of a 30 minute session, then uses the last minute to ask if there are any questions. The reason autopilot briefings can be a treat is that they are going over what is in the slide deck, webex or press release, thus affording an opportunity to get caught up on other things while they talk at you. Hmm, perhaps need to consider playing some tricks in reward for those kinds of treats? ;)

Do not be scared, not everybody is out to trick you with treats, and not all treats have tricks attached to them. Be prepared, figure out who is playing tricks with treats, and who has treats without tricks.

Oh, and as a former IT customer, vendor and analyst, one of my favorites is supplying contact information of my dogs to vendors who require registration on their websites for basic things such as data sheets. Another is supplying contact information of competing vendors' sales reps to vendors who also require registration for basic data sheets or what should otherwise be generally available information as opposed to more premium treats. Of course there are many more fun tricks, however let's leave those alone for now.

Note: Zombie voting rules apply, which means vote early, vote often, and of course vote for those who cannot, including those that are dead (real or virtual).

Where To Learn More

View additional related material via the following links.

- Can we get a side of context with them IOPS and other storage metrics?
- Revisiting RAID storage remains relevant and resources
- NVMe overview and primer – Part I
- Part 1 of HDD for content servers series Trends and Content Application Servers
- As the platters spin, HDD’s for cloud, virtual and traditional storage environments
- Hard Disk Drives (HDD) for Virtual Environments
- Server and Storage I/O performance and benchmarking tools
- Server storage I/O performance benchmark workload scripts Part I and Part II
- How to test your HDD, SSD or all flash array (AFA) storage fundamentals
- What is the best server storage I/O workload benchmark? It depends
- I/O, I/O how well do you know about good or bad server and storage I/Os?
- Big Files Lots of Little File Processing Benchmarking with Vdbench
- Part II – NVMe overview and primer (Different Configurations)
- PCIe Server I/O Fundamentals
- If NVMe is the answer, what are the questions?
- NVMe Wont Replace Flash By Itself
- Via Computerweekly – NVMe discussion: PCIe card vs U.2 and M.2
- Intel and Micron unveil new 3D XPoint Non Volatile Memory (NVM) for servers and storage
- Data Infrastructure Overview, Its Whats Inside of Data Centers
- Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
- The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
- The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
- Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
- www.objectstoragecenter.com and Software Defined, Cloud, Bulk and Object Storage Fundamentals

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

Watch out for tricks and treats, have a safe and fun Zombie (aka Halloween) season. See you while out and about this fall and don’t forget to take part in the ongoing zombie technology poll. Oh, and be safe with trick or treat and vendor fun games.

Ok, nuff said, for now.

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2018. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2026 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

October 4, 2012 / December 29, 2025

StorageIO going Dutch and Deutsch fall 2012

StorageIO industry trends cloud, virtualization and big data

Following a busy spring and summer schedule, the fall 2012 StorageIO out and about activities are underway including events on both the European and North American continents.

In addition to in person events, there are also some virtual activities including live and recorded video and audio sessions, as well as webcast on the fall schedule with more in the works.

Some of the fall events include SNW (past SNW posts here, here, and here) in Santa Clara, as well as SNW Europe and Power the Cloud event (Frankfurt Deutschland aka Germany) October 30th and 31st where I will be doing some meetings and briefings, along with attending sessions and the expo activities.

On November 1st it's off to Storage Expo Holland in Utrecht (here and here) where I will be presenting two sessions. One is on SSD industry trends and tips on deployment with a theme of not if, rather when, where, why and with what to use SSD. In addition I will be doing a general industry trends and perspective session on gaining confidence with clouds, virtualization, data and storage networking including object storage and backup (e.g. data protection modernization).

European travel tools and technologies

In addition to the above activities, following successful past events in Nijkerk Holland including the most recent May 2012 sessions, a new seminar has been announced focused on backup, restore, BC, DR and archiving hosted by Brouwer Consultancy on November 5th and 6th 2012. These workshop format seminars are very interactive providing independent perspectives on technology, tools, trends and what to do to address various challenges including more informed and effective IT decision-making.

Greg in action Nijkerk Storage Seminar

In addition to the new seminar that you can learn more about here, two other sessions will also be offered in Holland. One is a backup, restore, BC, DR and archiving session. The other covers storage and networking industry trends including clouds, virtualization and other broad topics.

Examples of Dutch refreshments

Learn more about the Dutch seminars including how to register here.

Watch for more events, seminars, live video, webinars and virtual trade shows by visiting the StorageIO events page.

Drop me a note if you would like to schedule or arrange for a meeting, webinar, seminar or other activity at an event near you. If you are planning to be in or near Holland in early November and are interested in scheduling a meeting or session, send me a note or contact Brouwer Consultancy (here) to make arrangements.

Time to get ready for these and other events, ok, nuff said.

Cheers gs

September 26, 2012 / December 29, 2025

Cloud, virtualization, storage and networking in an election year

My how time flies, seems like just yesterday (back in 2008) that I did a piece titled Politics and Storage, or, storage in an election year V2.008 and if you are not aware, it is 2012 and thus an election year in the U.S. as well as in many other parts of the world. Being an election year it’s not just about politicians, their supporters, pundits, surrogates, donors and voters, it’s also a technology decision-making and acquisition year (as are most years) for many environments.

Similar to politics, some technology decisions will be major while others will be minor or renewals so to speak. Major decisions will revolve around strategies, architectures, visions, implementation plans and technology selections including products, protocols, processes, people, vendors or suppliers and services for traditional, virtual and cloud data infrastructure environments.

Vendors, suppliers, service providers and their associated industry forums or alliances and trade groups are in various sales and marketing awareness campaigns. These various campaigns will decide who will be chosen by their customers or prospects for technology acquisitions ranging from hardware, software and services including servers, storage, IO and networking, desktops, power, cooling, facilities, management tools, virtualization and cloud products and services along with related items.

The politics of data infrastructures including servers, storage, networking, hardware, software and services spanning physical, cloud and virtual environments have similarities to other political races. Within many organizations these take the form of interdepartmental rivalry over budgets or funding, service levels, decision-making, turf wars and technology ownership, not to mention the usual vendor vs. vendor, VAR vs. VAR, service provider vs. service provider or other match-ups.

On the other hand, data and storage are also being used to support political campaigns in many ways across physical, virtual and cloud deployment scenarios.

Let us not forget about the conventions or what are more commonly known as shows, conferences, user group events in the IT world. For example EMCworld earlier this year, Dell Storage Forum, or the recent VMworld (or click here to view video from past VMworld party with INXS), Oracle Open World along with many vendor analyst, partner, press and media or blogger days.

Here are some 2012 politics of data infrastructure and storage campaign match-ups:

Vendor lock in, is it a problem and who is responsible
Replication and snapshots vs. Backup vs. data protection modernization
Erasure codes vs. RAID
Public vs. Private and hybrid clouds
Cloud products vs. cloud APIs vs. cloud services
IBM and the Better Business Bureau vs. Oracle marketing claims
Cloud confidence vs. cloud data loss vs. loss of access
Taking shared responsibility for data protection vs. blaming others
Bring your own device (BYOD) vs. IT supplied
VDI vs. Physical and traditional desktops including windows performance
EMC vs NetApp in the race for unified or anything else storage related
Big iron vs. little iron vs. virtual iron or software defined
EMC vs. Oracle in the race for big data buzz
Environmental focused vs. economic and productivity enabling Green IT
Green IT myths and missed opportunities
Oracle vs. IBM in the race for big data and little data (databases)
Clusters, clouds and grids vs. traditional architectures
Seagate vs. Western Digital (WD) in the race for Hard Disk Drives (HDD)
Hard vs. soft products and services
SSD vs. HDD vs. HHDD and SSD startups vs. established vendors
EMC and Lenovo vs. Dell, HP, IBM, NetApp and others
Industry adoption vs. industry deployment
PCIe SSD vendors vs. storage array and appliance vendors
Nand flash vs. any new SSD entrants for persistent memory
Consolidate everything vs. virtualize many things
SAN, NAS or Unified vs. Cloud object vs. DAS vs. SAS vs. FCoE
Microsoft Hyper-V and Citrix Xen and KVM vs. VMware vSphere
Microsoft, HP and others vs. Amazon and Google for cloud supremacy
Edgy vs. civility, G2 vs. GQ, entertainment vs. education
Fear and FUD vs. credibility and confidence
Samsung vs. Apple lawsuit(s) part deux
IOV, SDN, and software defined anything vs. hardware defined anything

Speaking of networks vs. server and storage or software and convergence, how about Brocade vs. Cisco, Qlogic vs. Emulex, Broadcom vs. Mellanox, Juniper vs. HP and Dell (Force10) or Arista vs. others in the race for SAN LAN MAN WAN POTS and PANs.

Then there are the claims, counter claims, pundits, media, bloggers, trade groups or lobbyists, marketing alliances or PACs, paid for ads and posts, tweets and videos along with supporting metrics for traditional and social media.

Let’s also not forget about polls, and more polls.

Certainly, there are vendors vs. vendors relying on their campaign teams (sales, marketing, engineering, financing and external surrogates) similar to what you would find with a politician, of course scope, size and complexity would vary.

Surrogates include analysts, bloggers, consultants, business partners, community organizers, editors, VARs, influencers, press, public relations and publications among others. Some claim to be objective and free of vendor influence while leveraging simple to complex schemes for remuneration (e.g. getting paid) while others simply state what they are doing and with whom.

Likewise, some point fingers at others who are misbehaving while deflecting away from what they are actually doing. Hmm, sounds like the pundit or surrogate two-step (as opposed to the Potomac two step) and prompts the question of who is checking the fact checkers and making disclosures (disclosure: this piece is being sponsored by StorageIO ;) )?

What this all means?

Use your brain, use your eyes and ears, and use your nose all of which have dual paths to your senses.

In other words, if something sounds or looks too good to be true, it probably is.

Likewise, if something smells funny or does not feel right to your senses or common sense, it probably is not right, or at least requires a closer look or analysis.

Be an informed decision maker balancing needs vs. wants to make effective selections regardless of if for a major or minor item, technology, trend, product, process, protocol or service. Informed decisions also mean looking at both current and evolving or future trends, challenges and needs which for data infrastructures including servers, storage, networking, IO fabrics, cloud and virtualization means factoring in changing data and information life cycles and access or usage patterns. After all, while there are tough economic times on a global basis, there is no such thing as a data or information recession.

This also means gaining insight and awareness of issues and challenges, plus balancing awareness and knowledge (G2) vs. looks, appearances and campaign sales pitches (GQ) for your particular environment, priorities and preferences.

Keep in mind and in the spirit of legendary Chicago style voting, when it comes to storage and data infrastructure topics, technologies and decisions, spend early, spend often and spend for those who cannot to keep the vendors and their ecosystem of partners happy.

Note that this post is not supported, influenced, endorsed or paid for by any vendors, VARs, service providers, trade groups, political action committees (PACs) or Picture Archiving and Communication Systems (PACS), both of which deal with and in big data, along with industry consortiums, their partners, customers or surrogates, nor would they probably approve of it anyway.

With that being said, I am Greg Schulz of StorageIO and am not running for or from anything this year and I do endorse the above post ;).

Ok, nuff said for now

Cheers gs

July 11, 2012 | December 29, 2025

Announcing SAS SANs for Dummies book, LSI edition

There is a new (free) book that I’m a co-author of, along with Bruce Grieshaber and Larry Jacob (both of LSI), with a foreword by Harry Mason of LSI, President of the SCSI Trade Association. It is titled SAS SANs for Dummies and comes compliments of LSI.

SAS SANs for Dummies, LSI Edition

This new book (ebook and print hard copy) looks at Serial Attached SCSI (SAS) and how it can be used beyond traditional direct attached storage (DAS) configurations to support various types of storage mediums including SSD, HDD and tape. These configuration options include use as an entry-level SAN with SAS switches for small clusters or server virtualization, as shared DAS, or as a scale out back-end solution for NAS, object, cloud and big data storage solutions.

Here is the table of contents (TOC) of SAS SANs for Dummies

Chapter 1: Data storage challenges

Storage Growth Demand Drivers

Recognizing Challenges

Solutions and Opportunities

Chapter 2: Storage Area Networks

Introducing Storage Area Networks

Moving from Dedicated Internal to Shared Storage

Chapter 3: SAS Basics

Introducing the Basics of SAS

How SAS Functions

Components of SAS

SAS Target Devices

SAS for SANs

Chapter 4: SAS Usage Scenarios

Understanding SAS SANs Usage

Shared SAS SANs Scenarios including:

SAS in HPC environments
Big data and big bandwidth
Database, e-mail, back-office
NAS and object storage servers
Cloud, web and high-density
Server virtualization

Chapter 5: Advanced SAS Topics

The SAS Physical Layer

Choosing SAS Cabling

Using SAS Switch Zoning

SAS HBA Target Mode

Chapter 6: Nine Common Questions

Can You Interconnect Switches?

What Is SAS Cable Distance?

How Many Servers Can Be In a SAS SAN?

How Do You Manage SAS Zones?

How Do You Configure SAS for HA?

How Does SAS Zoning Compare to LUN Mapping?

Who Has SAS Solutions?

How Do SAS SANs Compare?

Where Can You Learn More?

Chapter 7: Next Steps

SAS Going Forward

Next Steps

Great Take Away’s

Regardless of if you are looking to use SAS as a primary SAN interface, or leverage it for DAS or implementing back-end storage for big-data, NAS, object, cloud or other types of scalable storage solutions, check out and get your free copy of SAS SANs for Dummies here compliments of LSI.

Click here to ask for your free copy of SAS SANs for Dummies compliments of LSI, tell them Greg from StorageIO sent you and enjoy the book.

Ok, nuff said.

Cheers Gs

May 3, 2012 | April 27, 2025

What is the best kind of IO? The one you do not have to do

Updated 2/10/2018

What is the best kind of IO? If no IO (input/output) operation is the best IO, then the second best IO is the one that can be done as close to the application and processor as possible, with the best locality of reference. The third best IO is the one that can be done in less time, or at the least cost or impact to the requesting application, which means moving further down the memory and storage stack (figure 1).

Figure 1 memory and storage hierarchy

The problem with IOs is that they are basic operations to get data into and out of a computer or processor, so they are required; however, they also have an impact on performance, response or wait time (latency). IOs require CPU or processor time and memory to set up and then process the results, as well as IO and networking resources to move data to its destination or retrieve it from where it is stored. While IOs cannot be eliminated, their impact can be greatly reduced by doing fewer of them via caching and grouped reads or writes (pre-fetch, write behind) among other techniques and technologies.
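As a rough illustration of how caching means fewer IOs reach the storage device, here is a minimal Python sketch. The class names, the tiny least-recently-used (LRU) read cache and the block request pattern are all hypothetical and made up for this example; real caches in operating systems, hypervisors and storage systems are far more involved.

```python
from collections import OrderedDict


class CountingStore:
    """Hypothetical stand-in for a slow device; counts reads that reach it."""

    def __init__(self):
        self.reads = 0

    def read(self, block):
        self.reads += 1
        return f"data-{block}"


class LRUReadCache:
    """Small least-recently-used read cache in front of a backing store."""

    def __init__(self, store, capacity):
        self.store = store
        self.capacity = capacity
        self.blocks = OrderedDict()

    def read(self, block):
        if block in self.blocks:
            # Cache hit: no IO is sent to the backing store.
            self.blocks.move_to_end(block)
            return self.blocks[block]
        # Cache miss: one IO goes to the backing store.
        data = self.store.read(block)
        self.blocks[block] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data


def run_workload(requests, capacity):
    """Return how many reads reached the backing store for this workload."""
    store = CountingStore()
    cache = LRUReadCache(store, capacity)
    for block in requests:
        cache.read(block)
    return store.reads


# Made-up request pattern that re-reads the same few blocks.
requests = [1, 2, 3, 1, 2, 3, 1, 2, 3, 4]
print("device reads with no cache:     ", run_workload(requests, capacity=0))
print("device reads with 4 block cache:", run_workload(requests, capacity=4))
```

With this pattern, all ten requests go to the device when there is no cache, while only the four first-time reads do with a cache big enough to hold the working set. Note that a cache that is too small for the access pattern (try capacity=2 here) may not help at all, which is why cache effectiveness matters more than cache size or utilization.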

Think of it this way, instead of going on multiple errands, sometimes you can group multiple destinations together making for a shorter, more efficient trip; however, that optimization may also take longer. Hence sometimes it makes sense to go on a couple of quick, short low latency trips vs. one single larger one that takes half a day but accomplishes many things. Of course, how far you have to go on those trips (e.g. locality) makes a difference in how many you can do in a given amount of time.

What is locality of reference?

Locality of reference refers to how close (e.g. location) data is to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, then level 1 (L1), level 2 (L2) or level 3 (L3) onboard cache, followed by dynamic random access memory (DRAM). Then would come memory also known as storage on PCIe cards such as nand flash solid state devices (SSD), or storage accessible via an adapter on a direct attached storage (DAS), SAN or NAS device. In the case of a PCIe nand flash SSD card, even though the nand flash SSD is physically closer to the processor, there is still the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as cache or buffers for data along with metadata or control information to further optimize and improve locality of reference. In other words, the goal is to help with cache hits, cache use and cache effectiveness vs. simply boosting cache utilization.
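To put some numbers on why locality matters, the following Python sketch estimates an average access time across a simple two tier hierarchy. The tier names, hit fractions and latencies are illustrative assumptions only, not measurements of any product, and the model is simplified (it ignores the time spent checking a tier before a miss falls through to the next one).

```python
def average_access_time_ns(tiers):
    """Estimate average access time for a simple memory/storage hierarchy.

    tiers is a list of (name, hit_fraction, latency_ns) ordered from closest
    to farthest. hit_fraction is the share of requests reaching that tier
    that it can satisfy; the last tier should use 1.0 so that every request
    is eventually served.
    """
    remaining = 1.0  # fraction of requests not yet satisfied
    total_ns = 0.0
    for _name, hit_fraction, latency_ns in tiers:
        served = remaining * hit_fraction
        total_ns += served * latency_ns
        remaining -= served
    return total_ns


# Assumed, round-number latencies for illustration only.
mostly_local = [("DRAM cache", 0.90, 100), ("PCIe flash SSD", 1.00, 100_000)]
less_local = [("DRAM cache", 0.50, 100), ("PCIe flash SSD", 1.00, 100_000)]

for label, tiers in (("90% DRAM hits", mostly_local), ("50% DRAM hits", less_local)):
    print(f"{label}: about {average_access_time_ns(tiers):,.0f} ns per access")
```

Under these assumed numbers, dropping the DRAM hit rate from 90% to 50% moves the average from roughly 10,090 ns to roughly 50,050 ns per access, since the slower tier dominates the result.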

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Can we get a side of context with them IOPS and other storage metrics?
WHEN AND WHERE TO USE NAND FLASH SSD FOR VIRTUAL SERVERS
Revisiting RAID storage remains relevant and resources
NVMe overview and primer – Part I
Part 1 of HDD for content servers series Trends and Content Application Servers
Part 2 of HDD for content servers series Content application server decisions and testing plans
Part 3 of HDD for content servers series Test hardware and software configuration
Part 4 of HDD for content servers series Large file I/O processing
Part 5 of HDD for content servers series Small file I/O processing
Part 6 of HDD for content servers series General I/O processing
Part 7 of HDD for content servers series How HDD continue to evolve over different generations and wrap up
As the platters spin, HDD’s for cloud, virtual and traditional storage environments
How many IOPS can a HDD, HHDD or SSD do?
Hard Disk Drives (HDD) for Virtual Environments
Server and Storage I/O performance and benchmarking tools
Server storage I/O performance benchmark workload scripts Part I and Part II
How to test your HDD, SSD or all flash array (AFA) storage fundamentals
What is the best server storage I/O workload benchmark? It depends
I/O, I/O how well do you know about good or bad server and storage I/Os?
Big Files Lots of Little File Processing Benchmarking with Vdbench
Part II – NVMe overview and primer (Different Configurations)
Part III – NVMe overview and primer (Need for Performance Speed)
Part IV – NVMe overview and primer (Where and How to use NVMe)
Part V – NVMe overview and primer (Where to learn more, what this all means)
PCIe Server I/O Fundamentals
If NVMe is the answer, what are the questions?
NVMe Won’t Replace Flash By Itself
Via Computerweekly – NVMe discussion: PCIe card vs U.2 and M.2
Intel and Micron unveil new 3D XPoint Non Volatile Memory (NVM) for servers and storage
Part II – Intel and Micron new 3D XPoint server and storage NVM
Part III – 3D XPoint new server storage memory from Intel and Micron
Server storage I/O benchmark tools, workload scripts and examples (Part I) and (Part II)
Data Infrastructure Overview, It’s What’s Inside of Data Centers
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
www.objectstoragecenter.com and Software Defined, Cloud, Bulk and Object Storage Fundamentals
Server Storage I/O Network PCIe Fundamentals

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

What can you do to cut the impact of IO?

Establish baseline performance and availability metrics for comparison
Realize that IOs are a fact of IT virtual, physical and cloud life
Understand what is a bad IO along with its impact
Identify why an IO is bad, expensive or causing an impact
Find and fix the problem, either with software, application or database changes
Throw more software caching tools, hypervisors or hardware at the problem
Hardware includes faster processors with more DRAM and fast internal busses
Leveraging local PCIe flash SSD cards for caching or as targets
Utilize storage systems or appliances that have intelligent caching and storage optimization capabilities (performance, availability, capacity).
Compare changes and improvements to baseline, quantify improvement
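
The first and last items above (establish a baseline, then quantify improvement against it) can be sketched in a few lines of Python. The metric names and values here are made-up placeholders, not results from any test; substitute whatever your own tools report.

```python
def percent_change(baseline, current):
    """Percent change from baseline to current (negative means a decrease)."""
    return (current - baseline) / baseline * 100.0


# Hypothetical before and after measurements, for illustration only.
baseline = {"avg_latency_ms": 8.0, "iops": 5000}
after_change = {"avg_latency_ms": 2.0, "iops": 12500}

for metric, before in baseline.items():
    after = after_change[metric]
    change = percent_change(before, after)
    print(f"{metric}: {before} -> {after} ({change:+.1f}%)")
```

Keep in mind that which direction is "better" depends on the metric, lower for latency and higher for IOPS, so compare like with like and under the same workload.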

Ok, nuff said, for now.

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

January 3, 2011 | April 27, 2025

As the Hard Disk Drive HDD continues to spin

Updated 2/10/2018

Despite having been repeatedly declared dead at the hands of some new emerging technology over the past several decades, the Hard Disk Drive (HDD) continues to spin and evolve as it moves towards its 60th birthday.

More recently HDDs have been declared dead due to flash SSD that, according to some predictions, should have caused the HDD to be extinct by now.

Meanwhile, having not yet died in addition to having qualified for its AARP membership a few years ago, the HDD continues to evolve in capacity, smaller form factor, performance, reliability, density along with cost improvements.

Back in 2006 I did an article titled Happy 50th, hard drive, but will you make it to 60?

IMHO it is safe to say that the HDD will be around for at least a few more years if not another decade (or more).

This is not to say that the HDD has outlived its usefulness or that there are not other tiered storage mediums to do specific jobs or tasks better (there are).

Instead, the HDD continues to evolve and is complemented by flash SSD in the same way that HDDs are complementing magnetic tape (another declared dead technology), each finding new roles to support more data being stored for longer periods of time.

After all, there is no such thing as a data or information recession!

The importance of this is about technology tiering and resource alignment, matching the applicable technology to the task at hand.

Technology tiering (Servers, storage, networking, snow removal) is about aligning the applicable resource that is best suited to a particular need in a cost-effective as well as productive manner. The HDD remains a viable tiered storage medium that continues to evolve while taking on new roles coexisting with SSD and tape along with cloud resources. These and other technologies have their place which ideally is finding or expanding into new markets instead of simply trying to cannibalize each other for market share.

Here is a link to a good story by Lucas Mearian on the history or evolution of the hard disk drive (HDD) including how a 1TB device that costs about $60 today would have cost about a trillion dollars back in the 1950s. FWIW, IMHO the 1 trillion dollars is low and should be more around 2 to 5 trillion for the one TByte if you apply common costs for management, people, care and feeding, power, cooling, backup, BC, DR and other functions.

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Hybrid Hard Disk Drives (HHDD) (combine flash + RAM along with an integrated HDD)
7.2K RPM 2.5 inch SAS (or SATA) 1TB HDD
Top 10 technologies that remain the backbone of storage
Hard disk drives density improvements with perpendicular recording (from 2006)
Seagate discusses HAMR (Heat Assisted Magnetic Recording) and BPM (Bit Patterned Media)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

IMHO, it is safe to say that the HDD is here to stay for at least a few more years (if not decades) or at least until someone decides to try a new creative marketing approach by declaring it dead (again).

Ok, nuff said, for now.

September 17, 2010 | March 7, 2022

What is DFR or Data Footprint Reduction?

Updated 10/9/2018

Data Footprint Reduction (DFR) is a collection of techniques, technologies, tools and best practices that are used to address data growth management challenges. Dedupe is currently the industry darling for DFR particularly in the scope or context of backup or other repetitive data.

However, DFR expands the scope beyond backup to address expanding data footprints and their impact across primary, secondary and offline data that ranges from high performance to inactive high capacity.

Consequently the focus of DFR is not just on reduction ratios, it’s also about meeting time or performance rates and data protection windows.

This means DFR is about using the right tool for the task at hand to effectively meet business needs, and cost objectives while meeting service requirements across all applications.

Examples of DFR technologies include Archiving, Compression, Dedupe, Data Management and Thin Provisioning among others.
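
To make the idea of a reduction ratio concrete, here is a small Python sketch that applies two of the techniques above, dedupe and compression, to some synthetic data. The fixed 4 KB chunk size, the SHA-256 fingerprints and the deliberately repetitive sample data are assumptions picked for illustration; actual products use their own chunking and hashing approaches, and real data will reduce far less neatly.

```python
import hashlib
import zlib


def dedupe_chunks(data, chunk_size=4096):
    """Split data into fixed-size chunks, keeping one copy of each unique chunk."""
    unique = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        unique.setdefault(fingerprint, chunk)
    return unique


def reduction_ratio(original_bytes, stored_bytes):
    """Reduction ratio expressed as original size divided by stored size."""
    return original_bytes / stored_bytes


# Synthetic, highly repetitive data (think of repeated backups of the same files).
data = (b"A" * 4096 + b"B" * 4096) * 50

unique = dedupe_chunks(data)
deduped_bytes = sum(len(chunk) for chunk in unique.values())
compressed_bytes = len(zlib.compress(b"".join(unique.values())))

print("original bytes:   ", len(data))
print("after dedupe:     ", deduped_bytes)
print("dedupe ratio:     ", reduction_ratio(len(data), deduped_bytes))
print("after compression:", compressed_bytes)
```

The ratio is only half of the story. Timing how long the reduction takes (the rate) for a given amount of data would show the other trade-off this post describes, namely meeting performance needs and data protection windows.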

Read more about DFR in Part I and Part II of a two part series found here and here.

Where to learn more

Learn more about data footprint reduction (DFR), data footprint overhead and related topics via the following links:

Next Generation Hybrid Software Defined Data Infrastructures Are In Your Future #blogtobertech
Data Footprint Reduction – Software Defined Data Infrastructure Essentials
PACE your Server Storage I/O decision making, its about application requirements
Announcing Software Defined Data Infrastructure Essentials Book by Greg Schulz
July 2018 Server StorageIO Data Infrastructure Update Newsletter
Pictures Over Stillwater Drone Pro Shop and Resource Links
2018 Hot Popular New Trending Data Infrastructure Vendors to Watch
Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Server Storage I/O Tradecraft Trends
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)
If NVMe is the answer, what are the questions?
NVMe Primer (or refresh), The NVMe Place, The SSD Place, and the Object Storage Center

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means

That is all for now, hope you find this ongoing series of current or emerging Industry Trends and Perspectives posts of interest.

Ok, nuff said, for now.

Cheers Gs

December 15, 2009 | April 27, 2025

What do NAS NASA NASCAR have in common?

Updated 2/10/2018

The other day it dawned on me to ask, what do NAS, NASA and NASCAR have in common?

Several things in addition to all starting with the letters NAS it turns out.

For example, they all deal with round objects. NAS or Network Attached Storage is involved with circular spinning disk drives, while NASA or the National Aeronautics and Space Administration is involved with aircraft that have tires that go round and round, or airplanes circling while waiting to land.

In the case of NASA they are also involved with sending craft or devices to circle other planets or moons and land or crash into them. Sometimes NAS along with other storage systems have disk drives that crash, similar to how NASCAR events see accidents.

NASCAR is also involved with vehicles that don’t or at least should not fly; however, they do go round and round on a track, often paved though sometimes mud or dirt. High tech also exists in NASCAR with computers and various data models, not to mention the NASCAR air force.

In addition to being involved with round objects and activities, all three are also involved in computing, generating, processing, storing and retrieving for analysis of data, not to mention high performance requirements.

NAS based storage can also be relied upon for serving the data and information needs of NASA and NASCAR.

And FWIW, just for fun, look at what you get when you spell NASCAR, NASA or NAS backwards:

RACSAN
ASAN
SAN

What This All Means

Not much actually other than to stimulate some thought, discussion as well as perhaps have some fun with technology during the holiday season.

I’m sure if I put some more thought to it, more similarities would or will come to mind.

However, for now, that’s it for a quick thought. What similarities do you see or know about with NAS, NASA and NASCAR?

Ok, nuf fun for now, time to work on some other posts, content and projects.

Ok, nuff said, for now.

July 23, 2009 | October 17, 2024

Clarifying Clustered Storage Confusion

Clustered storage can be iSCSI, Fibre Channel block based or NAS (NFS or CIFS or proprietary file system) file system based. Clustered storage can also be found in virtual tape library (VTL) including dedupe solutions along with other storage solutions such as those for archiving, cloud, medical or other specialized grids among others.

Recently in the IT and data storage specific industry, there has been a flurry of merger and acquisition (M&A) (Here and here), new product enhancement or announcement activity around clustered storage. For example, HP buying clustered file system vendor IBRIX complimenting their previous acquisition of another clustered file system vendor (PolyServe) a few years ago, or, of iSCSI block clustered storage software vendor LeftHand earlier this year. Another recent acquisition is that of LSI buying clustered NAS vendor ONstor, not to mention Dell buying iSCSI block clustered storage vendor EqualLogic about a year and half ago, not to mention other vendor acquisitions or announcements involving storage and clustering.

Where the confusion enters into play is with the term cluster, which means many things to different people, even more so when clustered storage is combined with NAS or file based storage. For example, clustered NAS may imply a clustered file system when in reality a solution may only be multiple NAS filers, NAS heads, controllers or storage processors configured for availability or failover.

What this means is that an NFS or CIFS file system may only be active on one node at a time; in the event of a failover, the file system shifts from one NAS hardware device (e.g. NAS head or filer) to another. On the other hand, a clustered file system enables an NFS, CIFS or other file system to be active on multiple nodes (e.g. NAS heads, controllers, etc.) concurrently. The concurrent access may be for small random reads and writes, for example supporting a popular website or file serving application, or it may be for parallel reads or writes to a large sequential file.
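To make the distinction concrete, here is a minimal Python sketch. The class and node names (`FailoverNas`, `ClusteredFileSystem`, `head-a`, `head-b`) are hypothetical and do not represent any vendor's product or API; the sketch only models which nodes can serve a file system at a given moment.

```python
from dataclasses import dataclass


@dataclass
class FailoverNas:
    """Hypothetical active/passive NAS: one node serves the file system at a time."""
    nodes: list
    active_index: int = 0

    def serving_nodes(self):
        # Only the currently active node serves I/O for the file system.
        return [self.nodes[self.active_index]]

    def fail_active(self):
        # On failure, the file system shifts to a surviving node.
        failed = self.nodes.pop(self.active_index)
        if not self.nodes:
            raise RuntimeError("no surviving node to fail over to")
        self.active_index = 0
        return failed


@dataclass
class ClusteredFileSystem:
    """Hypothetical clustered file system: all healthy nodes serve it concurrently."""
    nodes: list

    def serving_nodes(self):
        return list(self.nodes)

    def fail_node(self, name):
        # Losing a node reduces capacity, but the rest keep serving.
        self.nodes.remove(name)


pair = FailoverNas(["head-a", "head-b"])
cluster = ClusteredFileSystem(["head-a", "head-b"])
print(pair.serving_nodes())     # one node is active
print(cluster.serving_nodes())  # both nodes are active
pair.fail_active()
print(pair.serving_nodes())     # the file system has moved to the other head
```

In both cases there are two NAS heads, which is why both may be marketed as clustered NAS; only the second lets clients spread I/O for the same file system across the heads at once.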

Clustered storage is no longer exclusive to the confines of high-performance sequential and parallel scientific computing or ultra large environments. Small files and I/O (read or write), including meta-data information, are also being supported by a new generation of multipurpose, flexible, clustered storage solutions that can be tailored to support different application workloads.

There are many different types of clustered and bulk storage systems. Clustered storage solutions may be block (iSCSI or Fibre Channel), NAS or file serving, virtual tape library (VTL), or archiving and object- or content-addressable storage. Clustered storage in general is similar to using clustered servers, providing scale beyond the limits of a single traditional system: scale for performance, scale for availability, and scale for capacity, enabling growth in a modular fashion by adding performance and intelligence capabilities along with capacity.

For smaller environments, clustered storage enables modular pay-as-you-grow capabilities to address specific performance or capacity needs. For larger environments, clustered storage enables growth beyond the limits of a single storage system to meet performance, capacity, or availability needs.

Applications that lend themselves to clustered and bulk storage solutions include:

Unstructured data files, including spreadsheets, PDFs, slide decks, and other documents
Email systems, including Microsoft Exchange Personal (.PST) files stored on file servers
Users’ home directories and online file storage for documents and multimedia
Web-based managed service providers for online data storage, backup, and restore
Rich media data delivery, hosting, and social networking Internet sites
Media and entertainment creation, including animation rendering and post processing
High-performance databases such as Oracle with NFS direct I/O
Financial services and telecommunications, transportation, logistics, and manufacturing
Project-oriented development, simulation, and energy exploration
Low-cost, high-performance caching for transient and look-up or reference data
Real-time performance including fraud detection and electronic surveillance
Life sciences, chemical research, and computer-aided design

Clustered storage solutions go beyond meeting the basic requirements of supporting large sequential parallel or concurrent file access. Clustered storage systems can also support random access of small files for highly concurrent online and other applications. Scalable and flexible clustered file servers that leverage commonly deployed servers, networking, and storage technologies are well suited for new and emerging applications, including bulk storage of online unstructured data, cloud services, and multimedia, where extreme scaling of performance (IOPS or bandwidth), low latency, storage capacity, and flexibility at a low cost are needed.

The bandwidth-intensive and parallel-access performance characteristics associated with clustered storage are generally known; what is not so commonly known is the breakthrough to support small and random IOPS associated with database, email, general-purpose file serving, home directories, and meta-data look-up (Figure 1). Note that a clustered storage system, and in particular, a clustered NAS may or may not include a clustered file system.

Figure 1 – Generic clustered storage model (Courtesy of “The Green and Virtual Data Center” (CRC))

More nodes, ports, memory, and disks do not guarantee more performance for applications. Performance depends on how those resources are deployed and how the storage management software enables those resources to avoid bottlenecks. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion when processing diverse application workloads. Other things to consider include support for industry-standard interfaces, protocols, and technologies.
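As a rough illustration of why adding nodes does not automatically add performance, the following Python sketch uses a deliberately simple toy model, not a measurement of any real system. It assumes data I/O spreads evenly across nodes while some fraction of requests need a meta-data look-up that is funneled through a single meta-data server. The function name and every number here are made up for illustration.

```python
def max_request_rate(nodes, per_node_iops, metadata_iops, metadata_fraction):
    """Upper bound on requests per second for a toy cluster.

    Assumptions (hypothetical): data I/O is spread evenly over `nodes`,
    and `metadata_fraction` of requests also need one look-up on a single
    meta-data server that can handle `metadata_iops` look-ups per second.
    """
    data_limit = float(nodes * per_node_iops)
    if metadata_fraction == 0:
        return data_limit
    metadata_limit = metadata_iops / metadata_fraction
    # The slower of the two resources caps the whole cluster.
    return min(data_limit, metadata_limit)


for n in (2, 4, 8, 16):
    rate = max_request_rate(n, per_node_iops=10_000,
                            metadata_iops=20_000, metadata_fraction=0.5)
    print(f"{n:2d} nodes -> {rate:,.0f} requests/sec")
```

With these made-up inputs the request rate stops growing once the single meta-data server saturates, so the extra nodes add capacity but no performance. Real systems have many more interacting limits; the point is only that the bottleneck, not the node count, sets the ceiling.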

Scalable and flexible clustered file server and storage systems provide the potential to leverage the inherent processing capabilities of constantly improving underlying hardware platforms. For example, software-based clustered storage systems that do not rely on proprietary hardware can be deployed on industry-standard high-density servers and blade centers and utilize third-party internal or external storage.

Clustered storage is no longer exclusive to niche applications or scientific and high-performance computing environments. Organizations of all sizes can benefit from ultra scalable, flexible, clustered NAS storage that supports application performance needs from small random I/O to meta-data lookup and large-stream sequential I/O that scales with stability to grow with business and application needs.

Additional considerations for clustered NAS storage solutions include the following:

Can memory, processors, and I/O devices be varied to meet application needs?
Is there support for large file systems supporting many small files as well as large files?
What is the performance for small random IOPS and bandwidth for large sequential I/O?
How is performance enabled across different applications in the same cluster instance?
Are I/O requests, including meta-data look-up, funneled through a single node?
How does a solution scale as the number of nodes and storage devices is increased?
How disruptive and time-consuming is adding new or replacing existing storage?
Is proprietary hardware needed, or can industry-standard servers and storage be used?
What data management features, including load balancing and data protection, exist?
What storage interface can be used: SAS, SATA, iSCSI, or Fibre Channel?
What types of storage devices are supported: SSD, SAS, Fibre Channel, or SATA disks?

As with most storage systems, it is not the total number of hard disk drives (HDDs), the quantity and speed of tiered-access I/O connectivity, the types and speeds of the processors, or even the amount of cache memory that determines performance. The performance differentiator is how a manufacturer combines the various components to create a solution that delivers a given level of performance with lower power consumption.

To avoid performance surprises, be leery of performance claims based solely on speed and quantity of HDDs or the speed and number of ports, processors and memory. How the resources are deployed and how the storage management software enables those resources to avoid bottlenecks are more important. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion.

Learn more about clustered storage (block, file, VTL/dedupe, archive), clustered NAS, clustered file system, grids and cloud storage among other topics in the following links:

"The Many faces of NAS – Which is appropriate for you?"

Article: Clarifying Storage Cluster Confusion
Presentation: Clustered Storage: “From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0”
Video Interview: How to Scale Data Storage Systems with Clustering
Guidelines for controlling clustering
The benefits of clustered storage

Along with other material on the StorageIO Tips and Tools or portfolio archive or events pages.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

December 14, 2008 / April 27, 2025

Server Storage I/O Network Virtualization Whats Next?

Updated 9/28/18

There are many faces, and thus functionalities, of virtualization beyond the one most commonly discussed, which is consolidation or aggregation. Other common forms include emulation (which is part of enabling consolidation), for example a virtual tape library that bridges new disk technology to old software technology, processes, procedures and skill sets. Other forms of virtualization functionality for life beyond consolidation include abstraction for transparent movement of applications or operating systems on servers, or of data on storage, to support planned and unplanned maintenance, upgrades, BC/DR and other activities.

So the gist is that there are many forms of virtualization technologies and techniques for servers, storage and even I/O networks to address different issues including life beyond consolidation. However the next wave of consolidation could and should be that of reducing the number of logical images, or, the impact of the multiple operating systems and application images, along with their associated management costs.

This may be easier said than done. However, for those looking to cut costs even further than what can be realized by reducing physical footprints (e.g. going from 10 to 1 or from 250 to 25 physical servers), there could be upside, although it will come at a cost. The cost is like that of reducing data and storage footprint impacts with techniques such as data management and archiving.

Savings can be realized by archiving and deleting data via data management. However, that is easier said than done given the cost in people time and the difficulty of deciding what to archive, even for non-compliance data, along with the business rules and policies that must be defined (for automation) and the hardware, software and services (managed services, consulting and/or cloud and SaaS) involved.

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Can we get a side of context with them IOPS and other storage metrics?
WHEN AND WHERE TO USE NAND FLASH SSD FOR VIRTUAL SERVERS
Revisiting RAID storage remains relevant and resources
NVMe overview and primer – Part I
Part 1 of HDD for content servers series Trends and Content Application Servers
Part 2 of HDD for content servers series Content application server decisions and testing plans
Part 3 of HDD for content servers series Test hardware and software configuration
Part 4 of HDD for content servers series Large file I/O processing
Part 5 of HDD for content servers series Small file I/O processing
Part 6 of HDD for content servers series General I/O processing
Part 7 of HDD for content servers series How HDD continue to evolve over different generations and wrap up
As the platters spin, HDD’s for cloud, virtual and traditional storage environments
How many IOPS can a HDD, HHDD or SSD do?
Hard Disk Drives (HDD) for Virtual Environments
Server and Storage I/O performance and benchmarking tools
Server storage I/O performance benchmark workload scripts Part I and Part II
How to test your HDD, SSD or all flash array (AFA) storage fundamentals
What is the best server storage I/O workload benchmark? It depends
I/O, I/O how well do you know about good or bad server and storage I/Os?
Big Files Lots of Little File Processing Benchmarking with Vdbench
Part II – NVMe overview and primer (Different Configurations)
Part III – NVMe overview and primer (Need for Performance Speed)
Part IV – NVMe overview and primer (Where and How to use NVMe)
Part V – NVMe overview and primer (Where to learn more, what this all means)
PCIe Server I/O Fundamentals
If NVMe is the answer, what are the questions?
NVMe Wont Replace Flash By Itself
Via Computerweekly – NVMe discussion: PCIe card vs U.2 and M.2
Intel and Micron unveil new 3D XPoint Non Volatile Memory (NVM) for servers and storage
Part II – Intel and Micron new 3D XPoint server and storage NVM
Part III – 3D XPoint new server storage memory from Intel and Micron
Server storage I/O benchmark tools, workload scripts and examples (Part I) and (Part II)
Data Infrastructure Overview, Its Whats Inside of Data Centers
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
www.objectstoragecenter.com and Software Defined, Cloud, Bulk and Object Storage Fundamentals
Server Storage I/O Network PCIe Fundamentals
Catching Up With Summer 2018 IBM Cloudy Software Defined Storage Announcements
Server StorageIO 2018 VMworld Data Infrastructure Buzzword Bingo Puzzle

Additional learning experiences, along with common questions (and answers) and tips, can be found in the Software Defined Data Infrastructure Essentials book.

What This All Means

Ok, nuff said, for now.
