More storage and IO metrics that matter

It is great to see more conversations and coverage around storage metrics that matter beyond simply focusing on cost per GByte or TByte (e.g. space capacity). Likewise, it is good to see conversations expanding beyond data footprint reduction (DFR) from a space capacity savings or reduction ratio to also address data movement and transfer rates. Also good to see is an increase in discussion around input/output operations per second (IOPS), tying into conversations from virtualization, VDI and cloud to Solid State Devices (SSD).

Other storage and IO metrics that matter include latency or response time, which is how fast work is done, or time spent. Latency also ties to IOPS: as more work arrives to be done (IOPS) of various sizes, random or sequential, reads or writes, queue depths are an indicator of how well work is flowing. Another storage and IO metric that matters is availability because without it, performance or capacity can be affected. Likewise, without performance, availability can be affected.
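
To make the tie between IOPS, latency and queue depth a bit more concrete, here is a minimal sketch using Little's Law (average outstanding IOs = arrival rate x average response time). The function names and the sample workload numbers are assumptions made up for illustration, not measurements from any particular system.

```python
def avg_queue_depth(iops: float, latency_ms: float) -> float:
    """Little's Law: average outstanding IOs = IOPS x average latency in seconds."""
    return iops * (latency_ms / 1000.0)


def bandwidth_mb_per_sec(iops: float, io_size_kb: float) -> float:
    """Bandwidth implied by an IOPS rate at a given IO size (1 MB = 1024 KB here)."""
    return iops * io_size_kb / 1024.0


if __name__ == "__main__":
    # Hypothetical workload: 20,000 IOPS of 8 KB IOs at 0.5 ms average latency
    print("average queue depth:", avg_queue_depth(20_000, 0.5))
    print("bandwidth (MB/sec):", bandwidth_mb_per_sec(20_000, 8))
```

Read the other way around, if queue depth keeps growing while IOPS stay flat, latency must be going up, which is why the three metrics are best looked at together.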

Needless to say, I am just scratching the surface here with storage and IO metrics that matter for physical, virtual and cloud environments from servers to networks to storage.

Here is a link to a post I did called IO, IO, it is off to storage and IO metrics we go that ties in themes of performance measurements and solid-state disk (SSD) among others. Also check out this piece about why VASA (vSphere APIs for Storage Awareness) is important to have in your VMware CASA, along with Windows boot storage and IO performance for VDI and traditional planning purposes.

Check out this post about metrics and measurements that matter along with this conversation about IOPs, capacity, bandwidth and purchasing discussion topics.

Related links on storage IO metrics and SSD performance
What is the best kind of IO? The one you do not have to do
Is SSD dead? No, however some vendors might be
Storage and IO metrics that matter
IO IO it is off to Storage and IO metrics we go
SSD and Storage System Performance
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
IT and storage economics 101, supply and demand
Researchers and marketers dont agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?
SSD options for Virtual (and Physical) Environments Part IV: What type of SSD is best for your needs

Ok, nuff said for now

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

What is the best kind of IO? The one you do not have to do

Updated 2/10/2018

What is the best kind of IO? If no IO (input/output) operation is the best IO, then the second best IO is the one that can be done as close to the application and processor as possible, with the best locality of reference. The third best IO is the one that can be done in less time, or at the least cost or impact to the requesting application, which means moving further down the memory and storage stack (figure 1).

Figure 1: Memory and storage hierarchy (storage and I/O locality of reference)

The problem with IOs is that they are basic operations to get data into and out of a computer or processor, so they are required; however, they also have an impact on performance, response or wait time (latency). IOs require CPU or processor time and memory to set up and then process the results, as well as IO and networking resources to move data to its destination or retrieve it from where it is stored. While IOs cannot be eliminated, their impact can be greatly reduced or optimized by doing fewer of them via caching, grouped reads or writes (pre-fetch, write behind) among other techniques and technologies.
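
As a rough illustration of doing fewer IOs via caching, here is a minimal sketch of a small least recently used (LRU) read cache sitting in front of a slow backend. The `ReadCache` class, the `backend_read` callable and the block numbers are all hypothetical names invented for this example; real caches in operating systems, adapters and storage systems are far more involved.

```python
from collections import OrderedDict


class ReadCache:
    """Tiny LRU read cache in front of a (hypothetical) slow backend."""

    def __init__(self, backend_read, capacity: int):
        self.backend_read = backend_read  # callable: block number -> data
        self.capacity = capacity
        self.blocks = OrderedDict()       # block number -> cached data
        self.hits = 0
        self.misses = 0

    def read(self, block: int):
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            self.hits += 1
            return self.blocks[block]
        self.misses += 1
        data = self.backend_read(block)      # the IO that could not be avoided
        self.blocks[block] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data


def demo():
    backend_ios = []                         # records every IO that reaches the backend

    def backend_read(block):
        backend_ios.append(block)
        return f"data-{block}"

    cache = ReadCache(backend_read, capacity=2)
    for block in [1, 2, 1, 1, 3, 1, 2]:
        cache.read(block)
    return cache.hits, cache.misses, backend_ios


if __name__ == "__main__":
    hits, misses, backend_ios = demo()
    print(f"hits={hits} misses={misses} backend IOs={backend_ios}")
```

In this made-up access pattern seven reads turn into four backend IOs; how much a cache helps in practice depends on the locality of the workload and the size of the cache.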

Think of it this way: instead of going on multiple errands, sometimes you can group multiple destinations together, making for a shorter, more efficient trip; however, that optimization may also take longer. Hence sometimes it makes sense to go on a couple of quick, short, low latency trips vs. one single larger one that takes half a day yet accomplishes many things. Of course, how far you have to go on those trips (e.g. locality) makes a difference in how many you can do in a given amount of time.

What is locality of reference?

Locality of reference refers to how close (e.g. location) data is to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, then level 1 (L1), level 2 (L2) or level 3 (L3) onboard cache, followed by dynamic random access memory (DRAM). Then would come memory, also known as storage, on PCIe cards such as nand flash solid state devices (SSD), or storage accessible via an adapter on a direct attached storage (DAS), SAN or NAS device. In the case of a PCIe nand flash SSD card, even though physically the nand flash SSD is closer to the processor, there is still the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as cache or buffers for data along with metadata or control information to further optimize and improve locality of reference. In other words, they help with cache hits, cache use and cache effectiveness vs. simply boosting cache utilization.
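
To put some rough numbers on why locality of reference matters, here is a minimal sketch that estimates average access time across a stack of tiers. It is a simplified model that assumes each request pays only the latency of the tier that finally serves it. The tier names, hit ratios and latencies below are made-up, order-of-magnitude assumptions for illustration only, not measurements.

```python
def effective_latency_ns(tiers):
    """Estimate average access time across a memory and storage hierarchy.

    tiers: list of (name, hit_ratio, latency_ns), ordered closest first.
    hit_ratio is the fraction of requests reaching a tier that it satisfies;
    the last tier should use 1.0 since it serves whatever is left.
    """
    remaining = 1.0   # fraction of requests not yet served by a closer tier
    total = 0.0
    for _name, hit_ratio, latency_ns in tiers:
        served = remaining * hit_ratio
        total += served * latency_ns
        remaining -= served
    return total


if __name__ == "__main__":
    # Hypothetical hierarchy with illustrative numbers
    tiers = [
        ("DRAM cache", 0.90, 100),          # ~100 ns
        ("PCIe flash SSD", 0.90, 100_000),  # ~100 microseconds
        ("HDD", 1.00, 10_000_000),          # ~10 ms
    ]
    print("average access time (ns):", effective_latency_ns(tiers))
```

Even with a 90 percent hit ratio at each of the closer tiers, the small share of requests that fall through to the slowest tier dominates the average in this example, which is why cache effectiveness matters more than cache utilization.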

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

What can you do to cut the impact of IO?

  • Establish baseline performance and availability metrics for comparison
  • Realize that IOs are a fact of IT virtual, physical and cloud life
  • Understand what is a bad IO along with its impact
  • Identify why an IO is bad, expensive or causing an impact
  • Find and fix the problem, either with software, application or database changes
  • Throw more software caching tools, hypervisors or hardware at the problem
  • Hardware includes faster processors with more DRAM and fast internal busses
  • Leverage local PCIe flash SSD cards for caching or as targets
  • Utilize storage systems or appliances that have intelligent caching and storage optimization capabilities (performance, availability, capacity).
  • Compare changes and improvements to baseline, quantify improvement
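
As a small sketch of the first and last items above (establish a baseline, then quantify improvement against it), the following compares a set of current metrics to a baseline and reports the percent change for each. The metric names and values are hypothetical, and the sketch assumes every baseline value is non-zero.

```python
def percent_change(baseline: float, current: float) -> float:
    """Percent change from baseline to current (assumes baseline is non-zero)."""
    return (current - baseline) / baseline * 100.0


def compare_to_baseline(baseline: dict, current: dict) -> dict:
    """Percent change for every metric present in both measurement sets."""
    return {
        name: percent_change(baseline[name], current[name])
        for name in baseline
        if name in current
    }


if __name__ == "__main__":
    # Hypothetical before and after measurements
    baseline = {"iops": 5000, "latency_ms": 8.0}
    current = {"iops": 7500, "latency_ms": 4.0}
    for name, change in compare_to_baseline(baseline, current).items():
        print(f"{name}: {change:+.1f}%")
```

Keep in mind that the direction of a good change depends on the metric: more IOPS is an improvement, while for latency a negative change is the one you want.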

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

More Storage IO momentus HHDD and SSD moments part II

This follows the first of a two-part series on my latest experiences with Hybrid Hard Disk Drives (HHDD’s) and Solid State Devices (SSD’s). In my last momentus moment post I discussed what I have done with HHDD’s and set the stage for expanded SSD use. I have the newer HHDD’s, e.g. Seagate Momentus XT II 750GB (8GB SLC nand flash) installed and have since bought another from Amazon, as well as having some of the older 500GB (4GB SLC nand flash) in various systems. Those are all functioning great; however, I am still waiting for and looking forward to the rumored firmware enhancements to boost write capabilities.

This brings me up to the latest momentus moment which now includes SSD’s.

Well, it's two years later and I now have a 256GB (usable capacity is lower) Samsung SSD that I bought from Amazon.com and installed in one of my laptops. Just as when I made the first switch to HHDD’s, I also have a backup copy/clone to fall back to in case of emergency.

Was it worth the wait? Yes, particularly using the HHDD’s to bridge the gap and enable some productivity gain which more than paid for them based on some different projects. I’m already seeing productivity improvements that will make future upgrades easier to justify (to myself).

I deviated from my strategy a bit and installed the SSD about six months earlier than I was planning to because of a physical barrier. That barrier was that my new traveling laptop only accepts 7mm height 2.5 inch small form factor devices, and the 750GB HHDD that I had planned on installing was 2.5mm too thick, which pushed up the SSD installation.

What will become of the 750GB HHDD? It's being redeployed to help speed up file serving, backups and other functions.

Will I replace the HHDD’s in my other workstations and laptops now with SSD’s? Across the board no, not yet, however there is one other system that is a prime candidate to maybe upgrade in a month or two (maybe less).

Will I stick with the Samsung SSD’s or look at other options? I’m keeping my options open and using this as a gauge to test and compare other options in a real world working environment as opposed to a lab bench test simulation. In other words, taking the next step past the lab test and product reviews, gaining comfort and confidence and then trying out with real use activity.

What will happen in the future as I install more SSD’s and have surplus HHDD’s? Redeploy them, of course, into file or NAS servers and backup targets, where they will in turn replace HDD’s that will either get retired or be redeployed to replace older, smaller capacity, higher cost to handle HDD’s used for offsite protection.

I tried using the software that came with the SSD to do the cloning and should have known better; however, I wanted to see what the latest version of Ghost was like (it was a waste of time, to be polite). Instead I used Seagate DiscWizard (aka Acronis), which requires at least one Seagate product (source or target) for cloning.

Cloning from the Seagate HHDD, which had previously been cloned from the Hitachi HDD that came with the laptop, was a non-issue. However, I wanted to see what would happen if I attached the Samsung SSD to the Seagate GoFlex cable and cloned directly from the Hitachi HDD; it worked. Hence another reason to have some of the Seagate GoFlex cables (USB and eSATA), like the ones I bought at Amazon.com, around in your toolbox.

While I do not have concrete empirical numbers to share, cloning from a HDD to a SSD is, shall we say, fast; however, what’s really fun to watch is cloning from a HHDD to a SSD using an eSATA (GoFlex) connector adapter. The reason I say that it is fun is that you don’t have to sit and wait for hours. It’s not minutes to move 100s of GBs; however, you can very much see the progress bar move at a good pace.

Also, put the HHDD on an eSATA port and try that out as a backup or data dump target if you have the need for speed, capacity and cost effectiveness; yes, it's fast, has lots of capacity and so forth. Now if Seagate and Synology or EMC Iomega would get their acts together and add support for the HHDD’s in those different unified SMB and SOHO NAS solutions, that would be way cool.

Will I be racing to put SSD’s in my other laptops or workstations soon? Probably not as there are things in the works and working their way into and through the market place that I wanted to wait for, and thus will wait for now, that is unless a more interesting opportunity pops up.

Related links on SSD, HHDD and HDD
More Storage IO momentus HHDD and SSD moments part I
More Storage IO momentus HHDD and SSD moments part II
IO IO it is off to Storage and IO metrics we go
New Seagate Momentus XT Hybrid drive (SSD and HDD)
Other Momentus moments posts here here, here, here and here
SSD and Storage System Performance
Speaking of speeding up business with SSD storage
Are Hard Disk Drives (HDD’s) getting too big?
Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?
Why SSD based arrays and storage appliances can be a good idea (Part I)
Why SSD based arrays and storage appliances can be a good idea (Part II)
IT and storage economics 101, supply and demand
Researchers and marketers dont agree on future of nand flash SSD
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?
SSD options for Virtual (and Physical) Environments Part IV: What type of SSD is best for your needs

Ok, nuff said for now.

Cheers Gs

IT Optimization, efficiency, convergence and cloud conversations from SNW

Recently I did a presentation titled backup, restore, BC, DR and archiving (hmm, I think I know of a book with the same title) at the spring 2012 SNW in Dallas. My presentation was on the first morning of the session as I needed to be in Boston to record a video the following Tuesday morning, thus I missed out on the storm clouds and tornadoes that rolled in the next day.

While I was at SNW, I had the honor of being a guest on Calvin Zito's (aka @HPStorageguy) podcast, which can be found on his Around the Storage Block Blog or by clicking here.

Cloud and Virtual Data Storage Networking Conversation

Check out our conversations about clouds, related topics and more from a practical perspective, cutting through the hype and FUD.

Oh, if you are interested in Cloud and Virtual Data Storage Networking, click here to learn more about the book, or backup, restore, BC, DR and archiving to find various related content, and here to see some upcoming events, activities and venues both in the U.S. and in Europe.

Ok, nuff said for now.

Cheers
Gs

Part IV: PureSystems, something old, something new, something from big blue

This is the fourth in a five-part series around the recent IBM PureSystems announcements. You can view the earlier post here, and the next post here.

So what does this mean for IBM Business Partners (BPs) and ISVs?
What could very well differentiate IBM PureSystems from those of other competitors is to take what their partner NetApp has done with FlexPods, combining third-party applications from Microsoft and SAP among others, and take it to the next level. What helped make EMC Centera a success (or at least sell a lot of them) was the inclusion and leveraging of third-party ISVs and BPs to add value. Compared to other vendors with object based or content accessible storage (CAS) or online archive platforms that focused on the technology feature, function, speeds and feeds, EMC realized the key was getting ISVs to support the platform so that BPs and their own direct sales force could sell the solution.

With PureSystems, IBM is revisiting what they have done in the past, which is to offer bundled solutions, providing incentives for ISVs to support and BPs to sell the IBM brand solution. EMC took an early step by including VMware with their Vblock, combining server, storage, networking and software, with NetApp taking the next step by adding SAP, Microsoft and other applications. Dell, HP, Oracle and others are following suit, so it only makes sense that IBM returns to its roots, leveraging its DNA to reach out and get ISVs who are on board now, have been in the past, or are new opportunities.

IBM is throwing in its resources, including their innovation centers for training around the world, where business partners can get the knowledge and technical support they need. In other words, workshops or seminars on how to sell, deploy and set up these systems, application and customer testing or proof of concepts, and the things one would expect out of IBM for such an initiative. In addition to technology and sales training along with marketing support, IBM is making their financing capabilities available to help customers, as well as offering incentives to their business partners to simplify acquisitions.

So what buzzword bingo topics and themes did IBM address with this announcement?
IBM did a fantastic job of knocking the ball out of the park with this announcement when it comes to buzzword bingo and deserves an atta boy or atta girl!

So what about how this will affect sales of Bladecenters or other systems?
If all IBM and their BPs do is encroach on existing systems sales to circle the wagons and protect the installed base, that would be one thing. However, if IBM and their BPs can use the new packaging and model approach to reestablish customers and partnerships, or open and expand into new adjacent markets, then the net difference should be more Bladecenters (excuse me, PureFlex) being sold.

So what will this cost?
IBM is citing entry PureSystems Express models starting at around $100,000 USD for base systems with others starting at around $200,000 and $300,000 expandable into larger configurations and budgets. Note that like airlines that advertise a low airfare and then you get to pay extra for peanuts, drinks, extra bag space, changes to reservations and so forth, look at these and related systems not just for the first starting price, also for expansion costs over different time periods. Contact IBM, your BP or ISV to find out what one of these systems will do for and cost you.

So what about VARs and IBM business partners (BPs)?
This could be a boon for those BPs and ISVs that had previously sold their software solutions bundled with IBM hardware platforms and who were being challenged by other converged solution stacks or were being forced to unbundle. This will also allow those business partners to compete on par with other converged solutions, or continue selling the pieces they are familiar with, however under a new umbrella. Of course, pricing will be a focus and concern for some who will want to see what added value exists vs. acquiring the various components. This also means that IBM will have to make incentives available for their partners to make a living while also allowing their customers to afford solutions and maximize their return on innovation (the new ROI) and enablement.

Click here to view the next post in this series, ok nuff said for now.

Here are some links to learn more:
Various IBM Redbooks and related content
The blame game: Does cloud storage result in data loss?
What do you need when its time to buy a new server?
2012 industry trends perspectives and commentary (predictions)
Convergence: People, Processes, Policies and Products
Buzzword Bingo and Acronym Update V2.011
The function of XaaS(X) Pick a letter
Hard product vs. soft product
Part I: PureSystems, something old, something new, something from big blue
Part II: PureSystems, something old, something new, something from big blue
Part III: PureSystems, something old, something new, something from big blue
Part IV: PureSystems, something old, something new, something from big blue
Part V: PureSystems, something old, something new, something from big blue
Cloud and Virtual Data Storage Networking

Cheers
Gs

Part V: PureSystems, something old, something new, something from big blue

This is the fifth in a five-part series around the recent IBM PureSystems announcements. You can view the earlier post here.

So what about vendor or technology lock in?
So who is responsible for vendor or technology lock in? When I was working in IT organizations (e.g. what vendors call the customer), the thinking was that vendors are responsible for lock in. Later, when I worked for different vendors (manufacturers and VARs), the thinking was that lock in is what was caused by the competition. More recently I’m of the mindset that vendor lock in is a shared responsibility issue and topic. I’m sure some marketing wiz or sales type will be happy to explain the subtle differences of how their solution does not cause lock in.

Vendor lock in can be a shared responsibility. Generally speaking, lock in, stickiness and account control are essentially the same, or at least strive to get similar results. For example, vendor lock in to some has a negative stigma. However, vendor stickiness may be a new term, perhaps even sounding cool, thus it is not a concern. Remember the Mary Poppins song, a spoonful of sugar makes the medicine go down? In other words, sometimes changing and using a different term such as sticky vs. vendor lock in helps make the situation taste better.

So what should you do?
Take a closer look if you are considering converged infrastructures, cloud or data centers in a box, turnkey application or information services deployment platforms. Likewise, if you are looking at specific technologies such as those from Cisco UCS, Dell vStart, EMC Vblock (or via VCE), HP, NetApp FlexPod or Oracle (ExaLogic, ExaData, etc) among others, also check out the IBM PureSystems (Flex and PureApplication). Compare and contrast these converged solutions with your traditional procurement and deployment modes including cost of acquiring hardware, software, ongoing maintenance or service fees along with value or benefit of bundled tools. There may be a higher cost for converged systems in some scenarios, however compare on the value and benefit derived vs. doing the integration yourself.

Compare and contrast how converged solutions enable; however, also consider what constraints exist in terms of flexibility to reconfigure in the future or make other changes. For example, as part of integration, does a solution take a lowest common denominator approach to software and firmware revisions for compatibility that may lag behind what you can apply to standalone components? Also, compare and contrast various reference architectures with different solution bundles or packages.

Most importantly compare and evaluate the solutions on their ability to meet and exceed your base requirements while adding value and enabling return on innovation while also being cost-effective. Do not be scared of these bundled solutions; however do your homework to make informed decisions including overcoming any concerns of lock in or future costs and fees. While these types of solutions are cool or interesting from a technology perspective and can streamline acquisition and deployment, make sure that there is a business benefit that can be addressed as well as enablement of new capabilities.

So what does this all mean?
Congratulations to IBM with their PureSystems for leveraging their DNA and roots, bundling what had been unbundled before cloud and stacks were popular and trendy. IBM has done a good job of talking vision and strategy along the lines of converged and dynamic, elastic and smart, clouds and other themes for the past couple of years while selling the pieces as parts of solutions, a la carte, or packaged by their ISVs and business partners.

What will be interesting to see is if BladeCenter customers shift to buying PureFlex, which should be an immediate boost to give proof points of adoption, while essentially up selling what was previously available. However, more interesting will be to see if net overall new customers and footprints are sold as opposed to simply selling a newer and enhanced version of previous components.

In other words, will IBM be able to keep up their focus and execution where they have sold the previously available components, while also holding onto current ISV and BP footprint sales, and perhaps enabling those partners to recapture some hardware and solution sales that had been unbundled (e.g. ISV software sold separate of IBM platforms) and move into new adjacent markets?

Ok, so what is next? Let's see how this unfolds for IBM and their partners.

Nuff said for now.

Cheers
Gs

Part III: PureSystems, something old, something new, something from big blue

This is the third in a five-part series around the recent IBM PureSystems announcements. You can view the earlier post here, and the next post here.

So what about the IBM Virtual Appliance Factory?
Where PureFlex and PureApplication (PureSystems) are the platforms or vehicles for enabling your journey to efficient and effective information services delivery, and PureSystem centre (or center for those of you in the US) is the portal or information center, the IBM Virtual Appliance Factory (VAF) is a collection of tools, technologies, processes and methodologies. The VAF helps developers or ISVs to prepackage applications or solutions for deployment into Kernel-based Virtual Machine (KVM) on Intel and IBM PowerVM virtualized environments that are also supported by PureFlex and PureApplication systems.

VAF technologies include Distributed Management Task Force (DMTF) and Open Virtualization Alliance (OVA) Open Virtualization Format (OVF), along with other tools for combining operating systems (OS), middleware and solution software into a delivery package or a virtual appliance that can be deployed into cloud and virtualized environments. Benefits include reducing the complexity of working with logical partitions (LPAR) and VM configuration, plus abstraction and portability for deployment or movement from private to public environments. The net result should be less complexity, lowering costs while reducing mean time to install and deploy. Here is a link to learn more about VAF and its capabilities and how to get started.

So what does cloud ready mean?
IBM is touting cloud ready capability in the context of rapid out of the box, ease of deployment and use as well as easy to acquire. This is in line with what others are doing with converged server, storage, networking, hardware, software and hypervisor solutions. IBM is also touting that they are using the same publicly available products as what they use in their own public services SmartCloud offerings.

So what is scale in vs. scale up, scale out or scale within?
Traditional thinking is that scaling refers to increasing capacity. Scaling also means increasing performance, availability and functionality with stability. Scaling with stability means that as performance, availability, capacity or other features are increased, problems are not introduced and complexity is not increased. For example, scaling with stability for performance should not result in loss of availability or capacity, a capacity increase should not be at the cost of performance or availability, availability should not cost performance or capacity, and management tools should work for you, instead of you working for them.

Scaling up and scaling out have been used to describe scaling performance, availability, capacity and other attributes beyond the limits of a single system, box or cabinet. For example, clustered, cloud, grid and other approaches refer to scaling out or horizontally across different physical resources. Scaling up or scaling vertically means scaling within a system using faster, denser technologies, doing more in the same footprint. HDS announced a while back what they refer to as 3D scaling, which embraces the above notions of scaling up, out and within across different dimensions. IBM is building on that by emphasizing scaling that leverages faster, denser components such as Power7 and Intel processors to scale within the box or system or node, which can also be scaled out using enhanced networking from IBM and their partners.

So what about backup/restore, BC, DR and general data protection?
I would expect IBM to step up and talk about how they can leverage their data protection and associated management toolsets, technologies and products. IBM already has the components (hardware, software) for backup/restore, BC, DR, data protection and security along with associated service offerings. One would expect IBM to come out not only with a backup, restore, BC and DR solution or version, but also with ones for archiving or data preservation, compliance appliance variants and related themes. We know that IBM has the pieces, people, processes and practices, so let us see if IBM has learned from their competitors who may have missed data protection messaging opportunities. Sometimes what is assumed to be understood does not get discussed, however often what is assumed and is not understood should be discussed. Hence, let us see if IBM does more than say oh yes, we have those capabilities and products too.

So what do these have compared to others who are doing similar things?
Different vendors have taken various approaches for bringing converged products or solutions to the marketplace. Not surprisingly, storage centric vendors EMC and NetApp have partnered with Cisco for servers (compute). Where Cisco was known for networking and has more recently moved into compute servers, EMC and NetApp are known for storage and are moving into the converged space with servers. EMC and NetApp often compete with storage offerings from traditional server vendors Dell, HP, IBM and Oracle among others, and Cisco now also competes with those same server vendors it previously partnered with for networking. Thus it makes sense for Cisco, EMC and NetApp to partner.

While EMC owns a large share of VMware, they do also support Microsoft and other partners including Citrix. NetApp followed EMC into the converged space partnering with Cisco for compute and networking adding their own storage along with supporting hypervisors from Citrix, Microsoft and VMware along with third-party ISVs including Microsoft and SAP among others. Dell has evolved from reference architectures to products called vStart that leverage their own technologies along with those of partners.

A challenge for Dell, however, is that vStart sounds more like a service offering as opposed to a product that they or their VARs and business partners can sell and add value around. HP is also in the converged game, as is Oracle among others. With PureSystems, IBM is building on what their competitors and in some cases partners are doing by adding and messaging more around the many ISVs and applications that are part of the PureSystems initiative. Rest assured, there is more to PureSystems than simply some new marketing, press releases, videos and talking about partners and ISVs. The following table provides a basic high level comparison of what different vendors are doing or working towards and is not intended to be a comprehensive review.

| Who | What | Server | Storage | Network | Software | Other comments |
|---|---|---|---|---|---|---|
| Cisco | UCS | Cisco | Partner | Cisco | Cisco and partners | Various hypervisors and OS |
| Dell | vStart | Dell | Dell | Dell and partners | Dell and partners | Various hypervisors, OS and bundles |
| EMC, VCE | Vblock, VSPEX | Cisco | EMC | Cisco and partners | EMC, Cisco and partners | Various hypervisors, OS and bundles, VSPEX adds more partner solution bundles |
| HP | Converged | HP | HP | HP and partners | HP and partners | Various hypervisors, OS and bundles |
| IBM | PureFlex | IBM | IBM | IBM and partners | IBM and partners | Various hypervisors, OS and bundles adding more ISV partners |
| NetApp | FlexPod | Cisco | NetApp | Cisco and partners | NetApp, Cisco and partners | Various hypervisors, OS and bundles for SAP, Microsoft among others |
| Oracle | ExaLogic (Exadata database) | Oracle | Oracle | Partners | Oracle and partners | Various Oracle software tools and technologies |

So what took IBM so long compared to others?
Good question, what is the saying? Rome was not built in a day!

Click here to view the next post in this series, ok, nuff said for now.

Here are some links to learn more:
Various IBM Redbooks and related content
The blame game: Does cloud storage result in data loss?
What do you need when its time to buy a new server?
2012 industry trends perspectives and commentary (predictions)
Convergence: People, Processes, Policies and Products
Buzzword Bingo and Acronym Update V2.011
The function of XaaS(X) Pick a letter
Hard product vs. soft product
Part I: PureSystems, something old, something new, something from big blue
Part II: PureSystems, something old, something new, something from big blue
Part III: PureSystems, something old, something new, something from big blue
Part IV: PureSystems, something old, something new, something from big blue
Part V: PureSystems, something old, something new, something from big blue
Cloud and Virtual Data Storage Networking

Cheers
Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

Part II: PureSystems, something old, something new, something from big blue

This is the second in a five-part series around the recent IBM PureSystems announcements. You can view the earlier post here, and the next post here.

So what are the speeds and feeds of a PureFlex system?
The components that make up the PureFlex line include:

  • IBM management node (server with management software tools).
  • 10Gb Ethernet (LAN) switch, adapters and associated cabling.
  • IBM V7000 virtual storage (also see here and here).
  • Dual 8GFC (8Gb Fibre Channel) SAN switches and adapters.
  • Servers with either x86 xSeries using for example Intel Sandy Bridge EP 2.6 GHz 8 core processors, or IBM's Power7 based pSeries for AIX. Note that IBM with their blade center systems (now rebadged as part of PureSystems) supports various IO and networking interfaces including SAS, Ethernet, Fibre Channel (FC), Fibre Channel over Ethernet (FCoE), and InfiniBand using adapters and switches from various partners.
  • Virtual machine (VM) hypervisors such as Microsoft Hyper-V and VMware vSphere/ESX among others. In addition to x86 based hypervisors such as Kernel-based Virtual Machine (KVM), IBM also supports its own virtualization technology found in Power7 based systems. Check the IBM support matrix for specific configurations and current offerings.
  • Optional middleware such as IBM WebSphere.

Read more speeds and feeds at the various IBM sites including on Tony Pearson’s blog site.

So what is IBM PureApplication System?
This builds on PureFlex systems as a foundation for deploying various software stacks to deliver traditional IT applications or cloud Platform as a Service (PaaS), Software as a Service (SaaS) and Application as a Service (AaaS) models. Examples include cloud or web stacks, Java, database, analytics or other applications with buzzwords of elastic, scalable, repeatable, self-service, rapid provisioning, resilient, multi tenant and secure among others. Note that if you are playing or into Buzzword bingo, go ahead and say Bingo when you are ready, as IBM has a winner in this category.

So what is the difference between PureFlex and PureApplication systems?
PureApplication systems leverage PureFlex technologies adding extra tools and functionality for cloud like application functionality delivery.

So what is IBM PureSystems Centre?
It is a portal or central place where IBM and business partner solutions pertaining to PureApplication and PureFlex systems can be accessed, including information for first installation support along with maintenance and upgrades. At launch, IBM is touting more than 150 solutions or applications that are available or qualified for deployment on PureApplication and PureFlex systems. In addition, IBM Patterns (aka templates) can also be accessed via this venue. Examples of application or independent software vendor (ISV) developed solutions for banking, education, financial, government, healthcare and insurance can be found at the PureSystems Centre portal (here, here and here).

So what part of this is a service and what is a product?
Other than the PureSystems Centre, which is a web portal for accessing information and technologies, PureFlex and PureApplication along with Virtual Appliance Factory are products or solutions that can be bought from IBM or their business partners. In addition, IBM business partners or third parties can also use these solutions housed in their own, a customer's, or a third-party facility for delivering managed service provider (MSP) capabilities, along with other PaaS and SaaS or AaaS type functionalities. In other words, these solutions can be bought or leased by IT and other organizations for their own use in a traditional IT deployment model or in a private, hybrid or public cloud model.

Another option is for service providers to acquire these solutions for use in developing and delivering their own public and private or hybrid services. IBM is providing the hard product (hardware and software) that enables your return on innovation (the new ROI) to create and deliver your own soft product (services and experiences) consumed by those who use those capabilities. In addition to traditional financial quantitative return on investment (traditional ROI) and total cost of ownership (TCO), the new ROI complements those by adding a qualitative aspect. Your return on innovation will be dependent on what you are capable of doing that enables your customers or clients to be productive or creative. For example, enabling your customers or clients to boost productivity and remove complexity and cost while maintaining or enhancing Quality of Service (QoS), service level objectives (SLOs) and service level agreements (SLAs), in addition to supporting growth by using a given set of hard products. Thus, your soft product is a function of your return on innovation and vice versa.

Note that in this context, not to be confused with hardware and software, hard products are those technologies including hardware, software and services that are obtained and deployed to create a soft product. A soft product in this context does not refer to software, rather the combination of hard products plus your own developed or separately obtained software and tools along with best practices and usage models. Thus, two organizations can use the same hard products and deliver separate soft products with different attributes and characteristics including cost, flexibility and customer experience.

So what is a Pattern of Expertise?
A pattern of expertise combines operational know-how, experience and knowledge about common infrastructure resource management (IRM), data center infrastructure management (DCIM) and other commonly repeatable processes, practices and workflows including provisioning. Common patterns of activity and expertise for routine or other time-consuming tasks, which some might refer to as templates or workflows, enable policy-driven automation. For example, IBM cites recurring time-consuming tasks that lend themselves to being automated such as provisioning, configuration and upgrades, along with associated IRM, DCIM, data protection, storage and application management activities. Automation software tools are included as part of PureSystems, with patterns being downloadable as packages for common tasks and applications found at the IBM PureSystems Centre.

At announcement, there are three types or categories of patterns:

  • IBM patterns: Factory created and supplied with the systems based on experiences IBM has derived from various managers, engineers and technologists for automating common tasks including configuration, deployment and application upgrades and maintenance. The aim is to cut the amount of time and intervention for deployment of applications and other common functions, enabling IT staff to be more productive and address other needs.
  • ISV patterns: These leverage experience and knowledge from ISVs partnered with IBM, which at time of launch number over 125 vendors offering certified PureSystems Ready applications. The objective is to cut the time and complexity associated with procuring (e.g. purchasing), deploying and managing third-party ISV software. Downloadable pattern packages can be found at the IBM PureSystems Centre.
  • Customer patterns: Enable customers to collect and package their own knowledge, processes, rules, policies and best practices into patterns for automation. In addition to collecting knowledge for acquisition, configuration, day to day management and troubleshooting, these patterns can facilitate automation of tasks to ease onboarding of new staff or contractors. In addition, these patterns or templates capture workflows for automation, enabling shorter deployment times of systems and applications into locations where skill sets do not exist.

Here is a link to some additional information about patterns on the IBM developerWorks site.

Click here to view the next post in this series, ok, nuff said for now.

Cheers
Gs

Part I: PureSystems, something old, something new, something from big blue

This is the first in a five-part series around the recent IBM PureSystems announcements. You can view the next post here.

For a certain generation of IBM faithful or followers, the recently announced PureFlex and PureApplication systems might give a sense of déjà vu, perhaps even causing some to wonder if they just woke up from a long Rip Van Winkle type nap.

Yet for another generation who may not yet be IBM followers, fans, partners or customers, there could be a sense of something new and revolutionary with the PureFlex and PureApplication systems (twitter @ibmpuresystems).

In between those two groups, exist others who are either scratching their heads or reinvigorated with enthusiasm to get out and be able to discuss opportunities around little data (traditional and transactional) and big data, servers, virtualized, converged infrastructure, dynamic data centers, private clouds, ITaaS, SaaS and AaaS, PaaS, IaaS and other related themes or buzzword bingo topics.

Let us dig a little deeper and look at some So What types of questions and industry trends perspectives comments around what IBM has announced.

So what did IBM announce?
IBM announced PureSystems including:

  • PureFlex systems, products and technologies
  • PureApplication systems
  • PureSystems Centre

You can think of IBM PureSystems and Flex Systems Products and technology as a:

  • Private cloud or turnkey solution bundle
  • Platform for deploying public or hybrid clouds
  • Data center in a box or converged and dynamic system
  • ITaaS or SaaS/AaaS or PaaS or IaaS or Cloud in a box
  • Rack 'em, stack 'em and package 'em type solution

So what is an IBM PureFlex System and what is IBM using?
It is a factory integrated data and compute infrastructure in a cabinet combining cloud, virtualization, servers, data and storage networking capabilities. The IBM PureFlex system is comprised of various IBM products and technologies (hardware, software and services) optimized with management across physical and virtual resources (servers, storage (V7000), networking, operating systems, hypervisors and tools).

PureFlex includes automation and optimization technologies along with what IBM is referring to as patterns of expertise, or what you might relate to as templates. It supports various hypervisors and management integration along with applications and operating systems by leveraging IBM xSeries (x86 such as Intel) and pSeries (Power7) based processors for compute. Storage is the IBM V7000 (here and here) with networking and connectivity via IBM and their partners. The solution is capable of supporting traditional, virtual and cloud deployment models as well as serving as a platform for deploying Infrastructure as a Service (IaaS) on a public, managed service provider (MSP), hosting or private basis.

Click here to view the next post in this series, ok nuff said for now.

Cheers
Gs

Going dutch and other Spring 2012 StorageIO activities

Spring 2012 StorageIO out and about events are underway, with activities having already occurred in New York City along with several live and recorded online webcasts that you can find here, including ones on backup, restore, BC, DR and archiving. Other upcoming events and venues include Dallas (SNW), San Francisco, Washington DC, Nijkerk Netherlands and Las Vegas among others that you can see here. Themes and topics of these and other events include data center convergence, infrastructure optimization, data protection modernization, data protection for virtual and cloud environments, performance and capacity planning, metrics that matter and strategy among others.

Greg in action Nijkerk Storage Seminar

For those of you in the Netherlands, or elsewhere in Europe, I'm going to be doing a two-day seminar on May 7 and 8 for storage professionals as well as those involved in strategy, architecture and related data infrastructure topics. On May 9, I will be doing a deep dive companion seminar. You can learn more about these seminars, which are being organized by Brouwer Consultancy in Nijkerk Netherlands, by visiting their site here, which includes the agenda and related information.

Watch for more events, seminars, webinars and virtual trade shows by visiting the StorageIO events page.

Drop me a note if you would like to schedule or arrange for a seminar or event near you.

Ok, nuff said for now, see you out and about

Cheers
Gs

If March 31st is backup day, don't be fooled with restore on April 1st

With March 31st as world backup day, hopefully some will keep recovery and restoration in mind to not be fooled on April 1st.

Lost data

When it comes to protecting data, it may not be a headline news disaster such as an earthquake, fire, flood, hurricane or act of man, rather something as simple as accidentally overwriting a file, not to mention a virus or other more likely to occur problems. Depending upon who you ask, some will say backup or saving data is more important, while others will stand by that it is recovery or restoration that matters. Without one the other is not practical. They need each other, and both need to be done as well as tested to make sure they work.

Just the other day I needed to restore a file that I accidentally overwrote and, as luck would have it, my local bad copy had also just overwritten my local backup. However, I was able to go and pull an earlier version from my cloud provider, which gave me a good opportunity to test and try some different things. In the course of testing, I did find some things that have since been updated, as well as some things to optimize for the future.

Destroyed data

My opinion is that if not used properly, including ignoring best practices, any form of data storage medium or media as well as software could result in or be blamed for data loss. Some people have lost data as a result of using cloud storage services, just as other people have lost data or access to information on other storage mediums and solutions. For example, data has been lost on cloud, tape, Hard Disk Drives (HDDs), Solid State Devices (SSD), Hybrid HDDs (HHDD), RAID and non RAID, local and remote and even optical based storage systems large and small. In some cases, there have been errors or problems with the medium or media. In other cases storage systems have lost access to, or lost, data due to hardware, firmware, software or configuration issues, including those due to human error.

Now is the time to start thinking about modernizing data protection, and that means more than simply swapping out media. Data protection modernization the past several years has been focused on treating the symptoms of downstream problems at the target or destination. This has involved swapping out or moving media around, and applying data footprint reduction (DFR) techniques downstream to give near term tactical relief, as has been the case with backup, restore, BC and DR for many years. The focus is starting to expand to how to address the source of the problem, which is an expanding data footprint upstream or at the source, using different data footprint reduction tools and techniques. This also means using different metrics, including keeping performance and response time in perspective as part of reduction rates vs. ratios, while leveraging different techniques and tools from the data footprint reduction tool box. In other words, it's time to stop swapping out media like changing tires that keep going flat on a car. Find and fix the problem, and change the way data is protected (and when) to cut the impact downstream.
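To make the rates vs. ratios point concrete, here is a minimal sketch in Python. The function names and the two tools with their numbers are hypothetical, made up purely for illustration and not measurements of any product. The ratio describes how much space is saved, while the rate describes how quickly source data can be reduced, which is what decides whether the work fits in a protection window.

```python
GBYTE = 1000 ** 3  # decimal GByte, an assumption for this sketch


def reduction_ratio(original_bytes: float, reduced_bytes: float) -> float:
    """Space saving expressed as a ratio, e.g. 10.0 means 10:1."""
    return original_bytes / reduced_bytes


def reduction_rate(original_bytes: float, elapsed_seconds: float) -> float:
    """Source data processed per second while being reduced."""
    return original_bytes / elapsed_seconds


def fits_window(original_bytes: float, rate_bytes_per_sec: float,
                window_seconds: float) -> bool:
    """True if all of the source data can be reduced within the window."""
    return original_bytes / rate_bytes_per_sec <= window_seconds


# Hypothetical tool A: great ratio, slow to process 1,000 GByte.
a_ratio = reduction_ratio(1000 * GBYTE, 100 * GBYTE)   # 10:1
a_rate = reduction_rate(1000 * GBYTE, 40_000)          # bytes per second

# Hypothetical tool B: modest ratio, much faster on the same data.
b_ratio = reduction_ratio(1000 * GBYTE, 500 * GBYTE)   # 2:1
b_rate = reduction_rate(1000 * GBYTE, 5_000)

window = 8 * 3600  # an assumed eight hour protection window
print(a_ratio, fits_window(1000 * GBYTE, a_rate, window))
print(b_ratio, fits_window(1000 * GBYTE, b_rate, window))
```

With these made-up numbers the tool with the better ratio misses the window while the one with the lower ratio fits, which is why both metrics need to be kept in perspective.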

Here is a link to a free download of chapter 5 (Data Protection: Backup/Restore and Business Continuance / Disaster Recovery) from my new book Cloud and Virtual Data Storage Networking (CRC Press).

Cloud and Virtual Data Storage Networking (Intel Recommended Reading List)

Additional related links to read more and sources of information:

Choosing the Right Local/Cloud Hybrid Backup for SMBs
E2E Awareness and insight for IT environments
Poll: What Do You Think of IT Clouds?
Convergence: People, Processes, Policies and Products
What do VARs and Clouds as well as MSPs have in common?
Industry adoption vs. industry deployment, is there a difference?
Cloud conversations: Loss of data access vs. data loss
Clouds and Data Loss: Time for CDP (Commonsense Data Protection)?
Clouds are like Electricity: Dont be scared
Wit and wisdom for BC and DR
Criteria for choosing the right business continuity or disaster recovery consultant
Local and Cloud Hybrid Backup for SMBs
Is cloud disaster recovery appropriate for SMBs?
Laptop data protection: A major headache with many cures
Disaster recovery in the cloud explained
Backup in the cloud: Large enterprises wary, others climbing on board
Cloud and Virtual Data Storage Networking (CRC Press, 2011)
Enterprise Systems Backup and Recovery: A Corporate Insurance Policy

Take a few minutes out of your busy schedule to check that your backups and data protection are working, and make sure to test restoration and recovery to avoid an April Fools type surprise. One last thing, you might want to check out the data storage prayer while you are at it.
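As one simple way to test a restore, here is a minimal sketch in Python that compares a restored file against a known good copy using SHA-256 checksums. The function names are hypothetical and this is not tied to any particular backup product. It also assumes you still have a known good copy, or a checksum recorded at backup time, to compare against.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(known_good: Path, restored: Path) -> bool:
    """True if the restored file is byte-for-byte identical in content."""
    return sha256_of(known_good) == sha256_of(restored)
```

Calling restore_matches(Path("report.doc"), Path("restored/report.doc")), with paths of your own, returns True only when the contents match. A matching checksum shows the data came back intact, however it does not tell you how long the restore took, so time the restore as well.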

Ok, nuff said for now.

Cheers gs

Is 14.4TBytes of data storage for $52,503 a good deal? It depends!

A news story about the school board in Marshall Missouri approving data storage plans, in addition to getting good news on health insurance rates, just came into my inbox.

I do not live in or anywhere near Marshall Missouri as I live about 420 miles north in the Stillwater Minnesota area.

What caught my eye about the story is the dollar amount ($52,503) and capacity amount (14.4TByte) for the new Marshall school district data storage solution to replace their old, almost full 4.8TByte system.

That prompted me to wonder if the school district is getting a really good deal (if so, congratulations), paying too much, or paying about the right amount.

Industry Trends and Perspectives

Not knowing what type of storage system they are getting, it is difficult to know what type of value the Marshall school district is getting with their new solution. For example, what type of performance and availability does it provide in addition to capacity? What type of system is it, and does it include features such as snapshots, replication, data footprint reduction (DFR) capabilities (archive, compression, dedupe, thin provisioning), backup, cloud access, redundancy for availability, application agents or integration, virtualization support and tiering? Is the 14.4TByte total (raw) or usable storage capacity, and does it include two storage systems for replication? What type of drives does it use (SSD, fast SAS HDD or high-capacity SAS or SATA HDDs), is it block (iSCSI, SAS or FC), NAS (CIFS and NFS) or unified, and what management software and reporting tools are included, not to mention service and warranty?
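To show why the raw vs. usable question matters, here is a quick back of the envelope sketch in Python using the two numbers from the news story. The 75 percent usable fraction is purely a hypothetical assumption for illustration, since the story does not say how the system is configured.

```python
def cost_per_tbyte(total_cost: float, capacity_tbytes: float) -> float:
    """Simple cost per TByte, ignoring features, service and warranty."""
    return total_cost / capacity_tbytes


PRICE = 52_503.0      # dollars, from the news story
CAPACITY_TB = 14.4    # TByte, from the news story

# Case 1: the 14.4TByte is usable capacity.
if_usable = cost_per_tbyte(PRICE, CAPACITY_TB)

# Case 2: the 14.4TByte is raw and, hypothetically, only 75% is usable
# after RAID, spares and other overhead (an assumed figure).
ASSUMED_USABLE_FRACTION = 0.75
if_raw = cost_per_tbyte(PRICE, CAPACITY_TB * ASSUMED_USABLE_FRACTION)

print(f"If 14.4TByte is usable: ${if_usable:,.2f} per TByte")
print(f"If 14.4TByte is raw:    ${if_raw:,.2f} per usable TByte")
```

Under that assumption the cost works out to roughly $3,646 per TByte in the first case vs. roughly $4,861 per usable TByte in the second, and neither figure says anything about performance, availability or features.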

Sure there are less expensive solutions that might work, however since I do not know what their needs and wants are, saying they paid too much would not be responsible. Likewise, not knowing their needs vs. wants, requirements, growth and application concerns, given that there are solutions that cost a lot more with extensive capabilities, saying that they got the deal of the century would also not be fair. Maybe somewhere down the road we will hear some vendor and VAR make a press release announcement about their win in taking out a competitor from the Marshall school district, or perhaps that they upgraded a system they previously sold so we can all learn more.

With school districts across the country trying to stretch their budgets to go further while supporting growth, it would be interesting to hear more about what type of value the Marshall school district is getting from their new storage solution. Likewise, it would also be interesting to hear what alternatives they looked at that were more expensive, as well as cheaper however with less functionality. I’m guessing some of the cloud crowd cheerleaders will also want to know why the school district is going the route they are vs. going to the cloud.

IMHO value is not the same thing as lower cost or cheaper, instead it's the benefit derived vs. what you pay. This means that if I get more benefit from something that costs more than a cheaper alternative, then it has more value.

Industry Trends and Perspectives

If you are a school district of similar size, what criteria or requirements would you want as opposed to need, and then what would you do or have you done?

What if you are a commercial or SMB environment, again not knowing the feature functionality benefit being obtained, what requirements would you have including want to have (e.g. nice to have) vs. must or have to have (e.g. what you are willing to pay more for), what would you do or have done?

How about if you were a cloud or managed service provider (MSP) or a VAR representing one of the many services, what would your pitch and approach be beyond simply competing on a cost per TByte basis?

Or if you are a vendor or VAR facing a similar opportunity, again not knowing the requirements, what would you recommend a school district or SMB environment to do, why and how to cost justify it?

What this all means to me is the importance of looking beyond lowest cost, or cost per capacity (e.g. cost per GByte or TByte), to also factor in value and feature functionality benefit.

Ok, nuff said for now, I need to get my homework assignments done.

Cheers gs

Why SSD based arrays and storage appliances can be a good idea (Part II)

This is the second of a two-part post about why storage arrays and appliances with SSD drives can be a good idea, here is a link to the first post.

So again, why would putting drive form factor SSDs into existing storage systems, arrays and appliances be a bad idea?

Benefits of SSD drives in storage systems, arrays and appliances:

  • Familiarity with customers who buy and use these devices
  • Reduces time to market enabling customers to innovate via deployment
  • Establish comfort and confidence with SSD technology for customers
  • Investment protection of currently installed technology (hardware and software)
  • Interoperability with existing interfaces, infrastructure, tools and policies
  • Reliability, availability and serviceability (RAS) depending on vendor implementation
  • Features and functionality (replicate, snapshot, policy, tiering, application integration)
  • Known entity in terms of hardware, software, firmware and microcode (good or bad)
  • Share SSD technology across more servers or accessing applications
  • Good performance assuming no controller, hardware or software bottlenecks
  • Wear leveling and other SSD flash management if implemented
  • Can end performance bottlenecks if backend (drives) are a problem
  • Coexist or complemented with server-based SSD caching

Note, the mere presence of SSD drives in a storage system, array or appliance will not guarantee that the above items are enabled, nor that they reach their full potential. Different vendors and products will implement SSD drive support to various degrees of extensibility, so look beyond the check box of feature functionality. Dig in and understand how extensive and robust the SSD implementation is to meet your specific requirements.

Caveats of SSD drives in storage systems, arrays and appliances:

  • May not use full performance potential of nand flash SLC technology
  • Latency can be an issue for those who need extreme speed or performance
  • May not be the most innovative newest technology on the block
  • Fun for startup vendors, marketers and their fans to poke fun at
  • Not all vendors add value or optimization for endurance of drive form factor SSD
  • May be seen as legacy or mature rather than technologically advanced

Note that different vendors' products will have different performance characteristics: some are good for IOPs, others for bandwidth or throughput, and others for latency or capacity. Look at different products to see how they vary in meeting your particular needs.
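To make the relationship between those metrics concrete, here is a minimal Python sketch. It uses two standard relationships: bandwidth is IOPS multiplied by IO size, and average response time can be estimated from Little's law (outstanding IOs = IOPS x response time). The workload names and numbers are made up for illustration and do not describe any vendor's product.

```python
def bandwidth_mb_per_sec(iops, io_size_kb):
    """Bandwidth implied by an IOPS rate at a given IO size (1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024.0


def avg_latency_ms(iops, outstanding_ios):
    """Average response time via Little's law: outstanding IOs = IOPS x latency."""
    return outstanding_ios / iops * 1000.0


# Hypothetical workloads: (name, IOPS, IO size in KB, outstanding IOs or queue depth)
workloads = [
    ("small random (hypothetical)", 20000, 4, 32),
    ("large sequential (hypothetical)", 400, 1024, 4),
]

for name, iops, io_size_kb, queue_depth in workloads:
    mb_sec = bandwidth_mb_per_sec(iops, io_size_kb)
    latency = avg_latency_ms(iops, queue_depth)
    print(f"{name}: {iops} IOPS x {io_size_kb} KB = {mb_sec:.1f} MB/sec, "
          f"about {latency:.2f} ms average response time")
```

In this made-up example the small random workload does 50 times the IOPS yet moves roughly a fifth of the bandwidth of the large sequential one, which is why a product that looks good on one metric may not look good on another.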

Cost comparisons are tricky. SSD in HDD form factors certainly costs more than raw flash dies; however, PCIe cards and FTL (flash translation layer) controllers also cost more than flash chips by themselves. In other words, apples to apples comparisons are needed. In the future, ideally the baseboard or motherboard vendors will revise their layouts to support nand flash (or its replacement) in DRAM DIMM type modules, along with the associated FTL and BIOS to handle flash program/erase (P/E) cycles and wear leveling management, something that DRAM does not have to deal with. While that provides great locality of reference (figure 1), it is also a more complex approach that takes time and industry cooperation.
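Returning to the apples to apples point, one simple way to keep a cost comparison honest is to normalize the same purchase price by more than one metric, for example cost per GByte and cost per IOP. The Python sketch below does that. The device names, prices, capacities and IOPS figures are hypothetical placeholders chosen only to show the arithmetic; they are not actual product data.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """A storage device described by hypothetical price and performance figures."""
    name: str
    price_usd: float
    capacity_gb: float
    iops: float

    def cost_per_gb(self):
        return self.price_usd / self.capacity_gb

    def cost_per_iop(self):
        return self.price_usd / self.iops


# Made-up numbers for illustration only
hdd = Device("HDD (hypothetical)", price_usd=300.0, capacity_gb=600.0, iops=200.0)
ssd = Device("SSD (hypothetical)", price_usd=1500.0, capacity_gb=200.0, iops=20000.0)

for dev in (hdd, ssd):
    print(f"{dev.name}: ${dev.cost_per_gb():.2f} per GB, "
          f"${dev.cost_per_iop():.3f} per IOP")
```

With these placeholder figures the HDD wins on cost per GByte while the SSD wins on cost per IOP, so which one is "cheaper" depends on which metric matters for the workload.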

Locality of reference for memory and storage
Figure 1: Locality of reference for memory and storage

Certainly, for best performance, location matters just as it does in realty, and thus locality of reference comes into play. That is, put the data as close to the server as possible; however, when sharing is needed, a different approach or a companion technique is required.

Here are some general thoughts about SSD:

  • Some customers and organizations get the value and role of SSD
  • Some see where SSD can replace HDD, others see where it complements
  • Yet others see the potential, however they are moving cautiously
  • For many environments, better than current performance is good enough
  • Environments with the need for speed need every bit of performance they can get
  • Storage systems and arrays or appliances continue to evolve including the media they use
  • Simply looking at how some storage arrays, systems and appliances have evolved, you can get an idea of how they might look in the future, which could include not only SAS as a backend or target, but also PCIe. After all, it was not that long ago that backend drive connections went from proprietary to open parallel SCSI or SSA to Fibre Channel loop (or switched) to SAS.
  • Engineers and marketers tend to gravitate to newer products and technology, which is good, as we need continued innovation on that front.
  • Customers and business people tend to gravitate towards deriving greatest value out of what is there for as long as possible.
  • Of course, the latter two points are not always the case and can be flip flopped.
  • Ultrahigh end environments and corner case applications will continue to push the limits and are target markets for some of the newer products and vendors.
  • Likewise, enterprise, mid market and other mainstream environments (outside of their corner case scenarios) will continue to push known technology to its limits as long as they can derive some business benefit value.

While not perfect, SSD in an HDD form factor with a SAS or SATA interface, properly integrated by vendors into storage systems (or arrays or appliances), is a good fit for many environments today. Likewise, for some environments, new from the ground up SSD based solutions that leverage flash DIMM or daughter cards or PCIe flash cards are a fit. So too are PCIe flash cards either as a target, or as cache to complement storage systems (arrays and appliances). Certainly, SSD in arrays takes up drive slots, however so too does occupying PCIe space, particularly in high density servers that require every available socket and slot for compute and DRAM memory. Thus, there are pros and cons, features and benefits to the various approaches, and which is best will depend on your needs and perhaps preferences, which may or may not be binary.

I agree that for some applications and solutions, non drive form factor SSD makes sense, while in others, compatibility has its benefits. Yet in other situations, nand flash such as SLC tightly integrated with HDD and DRAM, such as in my Momentus XT HHDD, is good for laptops, however probably not a good fit for enterprise yet. Thus, SSD options and placements are not binary; of course, sometimes opinions and perspectives will be.

For some situations, PCIe based cards in servers or appliances make sense, either as a target or as cache. Likewise, for other scenarios, drive format SSD makes sense in servers and storage systems, appliances, arrays or other solutions. Thus, while all of those approaches are used for storing binary digital data, the answer of what to use when and where often will not be binary, that is, unless your approach is to use one tool or technique for everything.

Here are some related links to learn more about SSD, where and when to use what:
Why SSD based arrays and storage appliances can be a good idea (Part I)
IT and storage economics 101, supply and demand
Researchers and marketers don’t agree on future of nand flash SSD
Speaking of speeding up business with SSD storage
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments: Part I Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments, Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?

Ok, nuff said for now.

Cheers gs

Why SSD based arrays and storage appliances can be a good idea (Part I)

This is the first of a two-part series; you can read part II here.

In a recent blog post, Robin Harris (aka @storagemojo) asks a question and thinks that solid state devices (SSDs) using a SAS or SATA interface in traditional hard disk drive (HDD) form factors are a bad idea in storage arrays (e.g. storage systems or appliances). My opinion is that, as with many things about storing, processing or moving binary digital data (e.g. 1s and 0s), the answer is not always clear. That is, there may not be a right or wrong answer; instead, it depends on the situation, use or perhaps abuse scenario. For some applications or vendors, adding SSD packaged in HDD form factors to existing storage systems, arrays and appliances makes perfect sense; for others it does not, thus it depends (more on that in a bit). While we are talking about SSD, Ed Haletky (aka @texiwill) recently asked a related question of Fix the App or Add Hardware, which could easily be morphed into a discussion of Fix the SSD, or Add Hardware. Hmmm, maybe a future post idea exists there.

Let's take a step back for a moment and look at the bigger picture of what prompts the question of what type of SSD to use where and when, as well as why various vendors want you to look at things a particular way. There are many options for using SSD, which is packaged in various ways to meet diverse needs, including here and here (see figure 1).

Various SSD packaging options
Figure 1: Various packaging and deployment options for SSD

The growing number of startup and established vendors with SSD enabled storage solutions vying to win your hearts, minds and budget is looking like the annual NCAA basketball tournament (aka March Madness and march metrics here and here). Some vendors have added or are adding SSDs with SAS or SATA interfaces that plug into existing enclosures (drive slots). These SSDs have the same form factor as 2.5 inch small form factor (SFF) or 3.5 inch HDDs, with a SAS or SATA interface for physical and connectivity interoperability. Other vendors have added PCIe based SSD cards to their storage systems or appliances as a cache (read or read and write) or a target device, similar to how these cards are installed in servers.

Simply adding SSD, either in a drive form factor or as a PCIe card, to a storage system or appliance is only part of a solution. Sure, the hardware should be faster than a traditional spinning HDD based solution. However, what differentiates the various approaches and solutions is what is done with the storage system or appliance software (aka operating system, storage applications, management, firmware or micro code).

So are SSD based storage systems, arrays and appliances a bad idea?

If you are a startup or established vendor able to start from scratch with a clean sheet design not having to worry about interoperability and customer investment protection (technology, people skills, software tools, etc), then you would want to do something different. For example, leverage off the shelf components such as a PCIe flash SSD card in an industry standard server combined with your software for a solution. You could also use extra DRAM memory in those servers combined with PCIe flash SSD cards perhaps even with embedded HDDs for a backing or preservation medium.

Other approaches might use a mix of DRAM and PCIe flash cards, as either a cache or target, combined with some drive form factor SSDs. In other words, there is no right or wrong approach; sure, there are different technical merits that have advantages for various applications or environments. Likewise, people have preferences, particularly the technology focused, who tend to like one approach vs. another. Thus, we have many options to leverage, use or abuse.

In his post, Robin asks a good question: if nand flash SSD were being put into a new storage system, why not use the PCIe backplane vs. nand flash on DIMM vs. drive formats, all of which are different packaging options (Figure 1)? Some startups have gone the all backplane approach, some have gone with the drive form factor, some have gone with a mix and some even use HDDs in the background. Likewise, some traditional storage system and array vendors who support a mix of SSD and HDD drive form factor devices also leverage PCIe cards, either as a server-based cache (e.g. EMC VFCache) or installed as a performance accelerator module (e.g. NetApp PAM) in their appliances.

While most vendors who put drive form factor SSDs into their storage systems or appliances (or servers for that matter) use them as data targets for creating LUNs or file systems, others use them for internal functionality. By internal functionality I mean that instead of the SSDs appearing as another drive or target, they are used exclusively by the storage system or appliance for caching or similar purposes. On storage systems, this can be to increase the size of persistent cache, such as EMC does on the CLARiiON and VNX (e.g. FAST Cache). Another use is on backup or dedupe target appliances, where SSDs are used to store dictionary, index or meta data repositories as opposed to being a general data pool.
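As a rough way to reason about SSD used as a cache rather than as a target, the Python sketch below uses the common weighted-average model: the fraction of IOs that hit the cache sees the cache response time, and the rest goes to the backend. This is a simplified illustration, not how any particular vendor's cache (FAST Cache or otherwise) works, and the latency values are assumed placeholders.

```python
def effective_latency_ms(hit_ratio, cache_latency_ms, backend_latency_ms):
    """Average response time when a fraction of IOs is served from cache.

    Simple weighted-average model; ignores queuing, write handling and
    cache warm-up, all of which matter in real systems.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_latency_ms + (1.0 - hit_ratio) * backend_latency_ms


# Assumed placeholder latencies: SSD cache 0.2 ms, HDD backend 8 ms
SSD_CACHE_MS = 0.2
HDD_BACKEND_MS = 8.0

for hit_ratio in (0.0, 0.5, 0.9, 0.99):
    latency = effective_latency_ms(hit_ratio, SSD_CACHE_MS, HDD_BACKEND_MS)
    print(f"hit ratio {hit_ratio:.0%}: about {latency:.2f} ms average response time")
```

The point of the model is that the benefit depends heavily on the hit ratio, which in turn depends on locality of reference in the workload and on what the system software chooses to keep in the cache.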

Part two of this post looks at the benefits and caveats of SSD in storage arrays.

Here are some related links to learn more about SSD, where and when to use what:
Why SSD based arrays and storage appliances can be a good idea (Part II)
IT and storage economics 101, supply and demand
Researchers and marketers don’t agree on future of nand flash SSD
Speaking of speeding up business with SSD storage
EMC VFCache respinning SSD and intelligent caching (Part I)
EMC VFCache respinning SSD and intelligent caching (Part II)
SSD options for Virtual (and Physical) Environments: Part I Spinning up to speed on SSD
SSD options for Virtual (and Physical) Environments, Part II: The call to duty, SSD endurance
SSD options for Virtual (and Physical) Environments Part III: What type of SSD is best for you?

Ok, nuff said for now, check part II.

Cheers gs
