Chat with Cash Coleman talking ClearDB, cloud database and Johnny Cash


In this episode from the SNIA DSI 2014 event I am joined by Cashton Coleman (@Cash_Coleman).


Introducing Cashton (Cash) Coleman and ClearDB

Cashton (Cash) is a software architect, product mason, family bonder, life builder and idea founder, along with Founder & CEO of SuccessBricks, Inc., makers of ClearDB. ClearDB is a provider of MySQL database software tools for cloud and physical environments. In our conversation we talk about ClearDB, what they do and whom they do it with, including deployments in clouds as well as onsite. For example, if you are using some of the Microsoft Azure cloud services with MySQL, you may already be using this technology. However, there is more to the story and discussion, including how Cash got his name and how to speed up databases for little and big data, among other topics.

If you are a database person, you will want to listen to what Cash has to say about boosting performance and getting more value out of your physical hardware or cloud services. On the other hand, if you are a storage person, listen in to get some insight and ideas on how to address database performance and resiliency. For others who just like to hear about new trends, technology talk, or emerging companies to keep an eye on, you won’t want to miss the podcast conversation.

Topics and themes discussed:

  • Traditional and Cloud Database
  • MySQL and Database as a Service (DaaS)
  • Microsoft Azure and Amazon Web Services (AWS)
  • Little Data, Big Data and Big Data Databases
  • Boosting database performance with less hardware
  • Getting more value out of fast SSD hardware
  • Database performance and resiliency
  • What’s the Johnny Cash and Cloud Connection?

Check out ClearDB and listen in to the podcast conversation with Cash here.


    Ok, nuff said.

    Cheers
    Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved


    Book review: Rethinking Enterprise Storage – A Hybrid Cloud Model by Marc Farley

    The O’Reilly @oreillymedia media folks (oops, excuse me, Microsoft Press) sent me (that’s a disclosure btw) a real soft-cover print copy of Rethinking Enterprise Storage – A Hybrid Cloud Model by Marc Farley aka @MicroFarley of Microsoft/Storsimple, which features a foreword by Martin Glassborow aka @Storagebod.


    Topics and themes covered in the book

    • Understanding scale storage architectures (hmm, great way of saying hybrid ;)
    • Rethinking data protection including disaster recovery (DR) best practices
    • Enhancing data protection using cloud snapshots beyond traditional backups
    • Deterministic thin recovery capabilities while dynamically expanding capacity to the cloud
    • Implement data footprint reduction (DFR) including archiving digital documents to the cloud
    • Insight and awareness into key performance indicators along with various use cases

    Rethinking Enterprise Storage book Details

    Publisher: Microsoft Press
    Author: Marc Farley
    Paperback
    Features: Many diagrams, figures, index, glossary
    Pages: 101
    ISBN: 978-0-7356-7990-3
    Published: 2013
    MSRP: $9.99 USD

    Sample pages of Rethinking Enterprise Storage
    One of the book’s many figures on the right; on the left I needed to hold a page down ;)!

    What’s inside the book

    Make no mistake that this is a Microsoft and Storsimple themed book, however IMHO Marc (aka Farley) does a great job of making it more relevant than just another vendor product book (JAVPB). While it is a Microsoft focused book around enabling hybrid cloud storage for various applications, the premises presented could be adapted for other environments or implementations. The book at 101 pages including table of contents (TOC), index, appendix, glossary and other front matter is a very easy and fast read while providing more information or coverage than what might be found in a "Dummies" type themed book.

    Looking inside Rethinking Enterprise Storage by Marc Farley
    Start thinking outside the box (or cloud), imagine what you can do with a Hybrid cloud!

    Summary

    Overall I found the book to be good, and not just because I know Marc or because the O’Reilly folks sent me a free copy (I had actually already received the electronic ebook version). Rather, it is timely and does a nice job of conveying the topic theme and setting up the conversation: it is time to rethink storage for enterprise and other environments. IMHO the question is not if hybrid cloud storage is in your future, rather when, where, why, for what, how, with whom and related conversations. You can buy a copy of the book at various venues, so it shouldn’t take a lot of effort to get your own printed soft-cover copy, or an ebook version.

    Btw, here’s a podcast discussion with Marc Farley from spring 2013 at SNW, as well as a link to a hybrid cloud and object storage post he did over at Microsoft Technet.

    To summarize and quote Marc Farley "Hey now…."

    Ok, nuff said

    Cheers gs


    November 2013 Server and StorageIO Update Newsletter & AWS reinvent info



    Welcome to the November 2013 edition of the StorageIO Update (newsletter), containing trends and perspectives on cloud, virtualization and data infrastructure topics. Fall (here in North America) has been busy with in-person, on-line live and virtual events, along with various client projects, research, and time in the StorageIO cloud, virtual and physical labs test driving, validating and doing proof-of-concept research, among other tasks. Check out the industry trends perspectives articles, comments and blog posts below that cover some activity over the past month.

    Last week I had the chance to attend the second annual AWS re:Invent event in Las Vegas, see my comments, perspectives along with a summary of announcements from that conference below.

    Watch for future posts, commentary, perspectives and other information down the road (and in the not so distant future) pertaining to information and data infrastructure topics, themes and trends across cloud, virtual, legacy server, storage, networking, hardware and software. Also check out our backup, restore, BC, DR and archiving page (under the Resources section on StorageIO.com) for various presentations, book chapter downloads and other content.

    Enjoy this edition of the StorageIO Update newsletter.

    Ok, nuff said (for now)

    Cheers gs

    StorageIO Industry Trends and Perspectives

    Industry trends: Amazon Web Services (AWS) re:Invent

    Last week I attended the AWS re:Invent event in Las Vegas. This was the second annual AWS re:Invent conference, which, while having an AWS and cloud theme, is also what I would describe as a data infrastructure event.

    As a data infrastructure event AWS re:Invent spans traditional legacy IT and applications to newly invented, re-written, re-hosted or re-platformed ones from existing and new organizations. By this I mean a mix of traditional IT or enterprise people as well as cloud and virtual geek types (said with affection and all due respect of course) across server (operating system, software and tools), storage (primary, secondary, archive and tools), networking, security, development tools, applications and architecture.

    That also means management from application and data protection spanning High Availability (HA), Business Continuance (BC), Disaster Recovery (DR), backup/restore, archiving, security, performance and capacity planning, service management among other related themes across public, private, hybrid and community cloud environments or paradigms. Hmm, I think I know of a book that covers the above and other related topic themes, trends, technologies and best practices called Cloud and Virtual Data Storage Networking (CRC Press) available via Amazon.com in print and Kindle (among other) versions.

    During the event AWS announced enhanced and new services including:

    • WorkSpaces (Virtual Desktop Infrastructure – VDI) announced as a new service for cloud based desktops across various client devices including laptops, Kindle Fire, iPad and Android tablets using PCoIP.
    • Kinesis which is a managed service for real-time processing of streaming (e.g. Big) data at scale including ability to collect and process hundreds of GBytes of data per second across hundreds of thousands of data sources. On top of Kinesis you can build your big data applications or conduct analysis to give real-time key performance indicator dashboards, exception and alarm or event notification and other informed decision-making activity.
    • EC2 C3 instances provide Intel Xeon E5 processors and Solid State Device (SSD) based direct attached storage (DAS)-like functionality, as an alternative to EBS provisioned IOPS, for cost-effective storage I/O performance and compute capabilities.
    • Another EC2 enhancement is the G2 instance type, which leverages high-performance NVIDIA GRID GPUs with 1,536 parallel processing cores. This new instance is well suited for 3D graphics, rendering, streaming video and other related applications that need large-scale parallel or high performance compute (HPC), also known as high productivity compute.
    • Redshift (cloud data warehouse) now supports cross region snapshots for HA, BC and DR purposes.
    • CloudTrail records AWS API calls made via the management console for analytics and logging of API activity.
    • Beta of Trusted Advisor dashboard with cost optimization saving estimates including EBS and provisioned IOPs
    • Relational Database Service (RDS) support for PostgreSQL including multi-AZ deployment.
    • Ability to discover and launch various software from AWS Marketplace via the EC2 Console. The AWS Marketplace for those not familiar with it is a catalog of various software or application titles (over 800 products across 24 categories) including free and commercial licensed solutions that include SAP, Citrix, Lotus Notes/Domino among many others.
    • AppStream is a low latency (STX protocol based) service for streaming resource (e.g. compute, storage or memory) intensive applications and games from the AWS cloud to various clients, desktops or mobile devices. This means that the resource-intensive functionality can be shifted to the cloud, while providing a low latency (e.g. fast) user experience, off-loading the client from having to support increased compute, memory or storage capabilities. Key to AppStream is the ability to stream data in a low-latency manner, including over networks normally not suited for high quality or bandwidth-intensive applications. IMHO, while AppStream is focused initially on mobile apps and gaming, being a bit-streaming technology it has the potential to be used for other similar functions that can leverage download speed improvements.
    • When I asked an AWS person what role AppStream might have with or how it relates to WorkSpaces, the only response was a large smile and no comment. Does this mean WorkSpaces leverages AppStream? Candidly I don’t know; however, if you look deeper into AppStream and expand your horizons, see what you can think up in terms of innovation. Updated 11/21/13: AWS has provided clarification that WorkSpaces is based on PCoIP while AppStream uses the STX protocol.

      Check out AWS Sr. VP Andy Jassy keynote presentation here.

    Overall I found the AWS re:Invent event to be a good conference spanning many aspects and areas of focus which means I will be putting it on my must attend list for 2014.

    StorageIO Industry Trends and Perspectives
    Industry trends, tips, commentary, articles and blog posts
    What is being seen, heard and talked about while out and about

    The following is a synopsis of some StorageIOblog posts, articles and comments in different venues on various industry trends, perspectives and related themes about clouds, virtualization, data and storage infrastructure topics among related themes.

    Storage I/O posts

    Recent industry trends, perspectives and commentary by StorageIO Greg Schulz in various venues:

    NetworkComputing: Comments on Software-Defined Storage Startups Win Funding

    Digistor: Comments on SSD and flash storage
    InfoStor: Comments on data backup and virtualization software

    ITbusinessEdge: Comments on flash SSD and hybrid storage environments

    NetworkComputing: Comments on Hybrid Storage Startup Nimble Storage Files For IPO

    InfoStor: Comments on EMC’s Light to Speed: Flash, VNX, and Software-Defined

    InfoStor: Data Backup Virtualization Software: Four Solutions

    ODSI: Q&A With Greg Schulz – A Quick Roundup of Data Storage Industry

    Recent StorageIO Tips and Articles in various venues:

    FedTechMagazine: 3 Tips for Maximizing Tiered Hypervisors
    InfoStor: RAID Remains Relevant, Really!


    Recent StorageIO blog post:

    EMC announces XtremIO General Availability (Part I) – Announcement analysis of the all flash SSD storage system
    Part II: EMC announces XtremIO General Availability, speeds and feeds – Part two of two part series with analysis
    What does gaining industry traction or adoption mean to you? – There is a difference between buzz and deployment
    Fall 2013 (September and October) StorageIO Update Newsletter – In case you missed the fall edition, here it is

    StorageIO Industry Trends and Perspectives

    Check out our objectstoragecenter.com page where you will find a growing collection of information and links on cloud and object storage themes, technologies and trends.

    Server and StorageIO seminars, conferences, webcasts, events and activities (out and about)

    Seminars, symposium, conferences, webinars
    Live in person and recorded recent and upcoming events

    While 2013 is winding down, the StorageIO calendar continues to evolve, here are some recent and upcoming activities.

    December 11, 2013: Data Protection for Cloud 201 (Backup.U), Google+ hangout
    December 3, 2013: Data Protection for Cloud 101 (Backup.U), online webinar
    November 19, 2013: Data Protection for Virtualization 201 (Backup.U), Google+ hangout
    November 12-13, 2013: AWS re:Invent event, Las Vegas, NV
    November 5, 2013: Data Protection for Virtualization 101 (Backup.U), online webinar
    October 22, 2013: Data Protection for Applications 201 (Backup.U), Google+ hangout

    Click here to view other upcoming and earlier event activities. Watch for more 2013 events to be added soon to the StorageIO events calendar page. Topics include data protection modernization (backup/restore, HA, BC, DR, archive), data footprint reduction (archive, compression, dedupe), storage optimization, SSD, object storage, server and storage virtualization, big data, little data, cloud and object storage, performance and management trends among others.

    Vendors, VARs and event organizers, give us a call or send an email to discuss having us involved in your upcoming podcast, webcast, virtual seminar, conference or other events.

    If you missed the Fall (September and October) 2013 StorageIO Update newsletter, click here to view that and other previous editions as HTML or PDF versions. Click here to subscribe to this newsletter (and pass it along). View archives of past StorageIO Update newsletters, as well as download PDF versions, at: www.storageio.com/newsletter

    Ok, nuff said (for now).
    Cheers Gs


    Some fall 2013 AWS cloud storage and compute enhancements


    I just received via email the October Amazon Web Services (AWS) newsletter, in advance of the re:Invent event next week in Las Vegas (yes, I will be attending).

    AWS October newsletter and enhancement updates

    What this means

    AWS is arguably the largest of the public cloud services, with a diverse set of services and options across multiple geographic regions to meet different customer needs. As such it is not surprising to see AWS continue to expand their portfolio, both in terms of features and functionality, along with extending their presence in different geographies.

    Let’s see what else AWS announces next week in Las Vegas at their 2013 re:Invent event.

    Click here to view the current October 2013 AWS newsletter. You can view (and signup for) earlier AWS newsletters here, and while you are at it, view the current and recent StorageIO Update newsletters here.

    Ok, nuff said (for now)

    Cheers
    Gs


    Cloud conversations: Has Nirvanix shutdown caused cloud confidence concerns?


    Recently, seven-plus-year-old cloud storage startup Nirvanix announced that they were finally shutting down and that customers should move their data.


    Nirvanix has also posted an announcement that they have established an agreement with IBM Softlayer (read about that acquisition here) to help customers migrate to those services, as well as to those of Amazon Web Services (AWS) (read more about AWS in this primer here), Google and Microsoft Azure.

    Cloud customer concerns?

    With Nirvanix shutting down there have been plenty of articles, blog posts, twitter tweets and other conversations asking if clouds are safe.

    Btw, here is a link to my ongoing poll where you can cast your vote on what you think about clouds.

    IMHO clouds can be safe if used in safe ways which includes knowing and addressing your concerns, not to mention following best practices, some of which pre-date the cloud era, sometimes by a few decades.

    Nirvanix Storm Clouds

    More on this in a moment; however, let’s touch base on Nirvanix and why I said they were finally shutting down.

    The reason I say finally shutting down is that there were plenty of early warning signs and storm clouds circling Nirvanix for a few years now.

    What I mean by this is that in their seven plus years of being in business, there have been more than a few CEO changes, something that is not unheard of.

    Likewise there have been some changes to their business model, ranging from selling their software as a service, to a solution, to hosting, among others; again, smart startups and established organizations will adapt over time.

    Nirvanix also invested heavily in marketing, public relations (PR) and analyst relations (AR) to generate buzz along with gaining endorsements as do most startups to get recognition, followings and investors if not real customers on board.

    In the case of Nirvanix, the indicator signs mentioned above also included what seemed like a semi-annual if not annual changing of CEOs, marketing and others tying into business model adjustments.


    It was only a year or so ago that if you gauged a company’s health by the PR and AR news or activity and endorsements, you would have believed Nirvanix was about to crush Amazon, Rackspace or many others; perhaps some actually did believe that, followed shortly thereafter by the abrupt departure of their then CEO and marketing team. Thus, just as fast as Nirvanix seemed to be the phoenix rising to stardom, their aura started to dim again, which could or should have been a warning sign.

    This is not to single out Nirvanix; however, given their penchant for marketing and now what appears to some as a sudden collapse or shutdown, they have also become a lightning rod of sorts for clouds in general. Given all the hype and FUD around clouds, when something does happen the detractors will be quick to jump or pile on to say things like "See, I told you, clouds are bad".

    Meanwhile the cloud cheerleaders may go into denial saying there are no problems or issues with clouds, or they may go back into a committee meeting to create a new stack, standard, API set marketing consortium alliance. ;) On the other hand, there are valid concerns with any technology including clouds that in general there are good implementations that can be used the wrong way, or questionable implementations and selections used in what seem like good ways that can go bad.

    This is not to say that clouds in general, whether as a service, solution or product on a public, private or hybrid basis, are any riskier than traditional hardware, software and services. Instead, this should be a wake-up call for people and organizations to review clouds, citing their concerns along with revisiting what to do or can be done about them.

    Clouds: Being prepared

    Ben Woo of Neuralytix posted the question Collateral Considerations If You Were/Are A Nirvanix Customer to one of the LinkedIn groups, to which I posted some tips and recommendations including:

    1) If you have another copy of your data somewhere else (which you should btw), how will your data at Nirvanix be securely erased, and the storage it resides on be safely (and secure) decommissioned?

    2) If you do have another copy of your data elsewhere, how current is it, and can you bring it up to date from various sources (including updates from Nirvanix while they stay online)?

    3) Where will you move your data to, short or near term, as well as long-term?

    4) What changes will you make to your procurement process for cloud services in the future to protect against situations like this happening to you?

    5) As part of your plan for putting data into the cloud, refine your strategy for getting it out, moving it to another service or place as well as having an alternate copy somewhere.

    Fwiw, for any data I put into a cloud service there is also another copy somewhere else. Even though that carries a cost, there is a benefit: the ability to decide which copy to use if needed, as well as having a backup/spare copy.
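As a rough illustration of point 2 above (how current is that other copy?), here is a minimal sketch of comparing a primary and a secondary copy of a directory tree by content hash. The directory layout and function names are hypothetical, not from any particular cloud tool:

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def stale_or_missing(primary: Path, secondary: Path) -> list:
    """List relative paths whose secondary copy is missing or differs."""
    problems = []
    for src in primary.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(primary)
        dst = secondary / rel
        if not dst.exists() or file_digest(src) != file_digest(dst):
            problems.append(str(rel))
    return problems
```

Running such a check periodically (while the original service is still online) tells you exactly which files need to be refreshed before the service goes away, rather than discovering gaps after the fact.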


    Cloud Concerns and Confidence

    As part of cloud procurement, whether for services or products, the same proper due diligence should occur as if you were buying traditional hardware, software, networking or services. That includes checking out not only the technology, but also the company’s financials, business records, and customer references (both good and not so good or bad ones) to gain confidence. Part of gaining that confidence also involves addressing ahead of time how you will get your data out of or back from that service if needed.

    Keep in mind that if your data is very important, are you going to keep it in just one place? For example I have data backed-up as well as archived to cloud providers, however I also have local copies either on-site or off.

    Likewise there is data I keep locally as well as at alternate locations, including cloud. Sure that is costly; however, by not treating all of my data and applications the same, I’m able to balance those costs out, plus use the cost advantages of different services as well as on-site to be effective. I may be spending no less on data protection, in fact I’m actually spending a bit more; however, I also have more copies and versions of important data in multiple locations. Data that is not changing often does not get protected as often; however, there are multiple copies to meet different needs or threat risks.


    Don’t be scared of clouds, be prepared

    While some of the other smaller cloud storage vendors will see some new customers, I suspect that near to mid-term it will be the larger, more established and well-funded providers that gain the most from this current situation. Granted, some customers are looking for alternatives to the mega cloud providers such as Amazon, Google, HP, IBM, Microsoft and Rackspace among others; however, there is a long list of others, some of which are not so well-known but should be, such as Centurylink/Savvis, Verizon/Terremark, Sungard, Dimension Data, Peak, Bluehost, Carbonite, Mozy (owned by EMC), Xerox ACS, and Evault (owned by Seagate), not to mention many others.

    Something to be aware of as part of doing your due diligence is determining who or what actually powers a particular cloud service. The larger providers such as Rackspace, Amazon, Microsoft, and HP among others have their own infrastructure, while some of the smaller service providers may in fact use one of the larger (or even smaller) providers as their real back-end. Hence understanding who is behind a particular cloud service is important to help assess the viability and stability of whom you are subscribed to or working with.

    Something that I have said for the past couple of years and a theme of my book Cloud and Virtual Data Storage Networking (CRC Taylor & Francis) is do not be scared of clouds, however be ready, do your homework.

    This also means having cloud concerns is a good thing, again don’t be scared, however find what those concerns are along with if they are major or minor. From that list you can start to decide how or if they can be worked around, as well as be prepared ahead of time should you either need all of your cloud data back quickly, or should that service become un-available.

    Also when it comes to clouds, look beyond lowest cost or free; likewise, if something sounds too good to be true, perhaps it is. Instead look for value, that is, how much you get for what you spend, including confidence in the service, service level agreements (SLA), security, and other items.

    Keep in mind, only you can prevent data loss either on-site or in the cloud; granted, it is a shared responsibility.

    Additional related cloud conversation items:
    Cloud conversations: AWS EBS Optimized Instances
    Poll: What Do You Think of IT Clouds?
    Cloud conversations: Gaining cloud confidence from insights into AWS outages
    Cloud conversations: confidence, certainty and confidentiality
    Cloud conversation, Thanks Gartner for saying what has been said
    Cloud conversations: AWS EBS, Glacier and S3 overview (Part III)
    Cloud conversations: Gaining cloud confidence from insights into AWS outages (Part II)
    Don’t Let Clouds Scare You – Be Prepared
    Everything Is Not Equal in the Datacenter, Part 3
    Amazon cloud storage options enhanced with Glacier
    What do VARs and Clouds as well as MSPs have in common?
    How many degrees separate you and your information?

    Ok, nuff said.

    Cheers
    Gs


    Can we get a side of context with them IOPS server storage metrics?


    What’s the best server storage I/O network metric or benchmark? It depends, as there needs to be some context with them IOPS and other server storage I/O metrics that matter.

    There is an old saying that the best I/O (Input/Output) is the one that you do not have to do.

    In the meantime, let’s get a side of some context with them IOPS from vendors, marketers and their pundits who are tossing them around for server, storage and IO metrics that matter.

    Expanding the conversation, the need for more context

    The good news is that people are beginning to discuss storage beyond space capacity and cost per GByte, TByte or PByte for DRAM and NAND flash Solid State Devices (SSD), Hard Disk Drives (HDD), along with Hybrid HDD (HHDD) and Solid State Hybrid Drive (SSHD) based solutions. This applies to traditional enterprise or SMB IT data centers with physical, virtual or cloud based infrastructures.


    This is good because it expands the conversation beyond just cost for space capacity into other aspects, including performance (IOPS, latency, bandwidth) for various workload scenarios, along with availability, energy effectiveness and management.

    Adding a side of context

    The catch is that IOPS, while part of the equation, are just one aspect of performance, and by themselves, without context, may have little meaning, if not be misleading, in some situations.

    Granted, it can be entertaining, fun to talk about, or simply make good press copy to claim a million IOPS. IOPS vary in size depending on the type of work being done, not to mention reads or writes, random and sequential, which also have a bearing on data throughput or bandwidth (MBytes per second) along with response time. Not to mention block, file, object or blob, as well as table.

    However, are those million IOPS applicable to your environment or needs?

    Likewise, what do those million or more IOPS represent about type of work being done? For example, are they small 64 byte or large 64 Kbyte sized, random or sequential, cached reads or lazy writes (deferred or buffered) on a SSD or HDD?

    How about the response time or latency for achieving them IOPS?

    In other words, what is the context of those metrics and why do they matter?

    Click on image to view more metrics that matter including IOP’s for HDD and SSD’s

    Metrics that matter give context, for example IO sizes closer to what your real needs are, reads and writes, mixed workloads, random or sequential, sustained or bursty; in other words, reflective of the real world.

    As with any benchmark, take them with a grain (or more) of salt; the key is to use them as an indicator, then align them to your needs. The tool or technology should work for you, not the other way around.

    Here are some examples of context that can be added to help make IOPS and other metrics matter:

    • What is the IOP size, are they 512 byte (or smaller) vs. 4K bytes (or larger)?
    • Are they reads, writes, random, sequential or mixed and what percentage?
    • How was the storage configured including RAID, replication, erasure or dispersal codes?
    • Then there is the latency or response time and IO queue depths for the given number of IOPS.
    • Let us not forget if the storage systems (and servers) were busy with other work or not.
• If there is a cost per IOP, is that list price or discounted (hint: if discounted, start negotiations from there)?
    • What was the number of threads or workers, along with how many servers?
    • What tool was used, its configuration, as well as raw or cooked (aka file system) IO?
    • Was the IOP’s number with one worker or multiple workers on a single or multiple servers?
    • Did the IOP’s number come from a single storage system or total of multiple systems?
• Fast storage needs fast servers and networks; what was their configuration?
    • Was the performance a short burst, or long sustained period?
    • What was the size of the test data used; did it all fit into cache?
    • Were short stroking for IOPS or long stroking for bandwidth techniques used?
    • Data footprint reduction (DFR) techniques (thin provisioned, compression or dedupe) used?
• Was write data committed synchronously to storage, or deferred (aka lazy writes)?

The above are just a sampling and not all may be relevant to your particular needs; however, they help to put IOPS into more context. Another consideration around IOPS is how they were generated, along with the configuration of the environment: from an actual running application using some measurement tool, or from a workload generation tool such as Iometer, Iorate or Vdbench among others.
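To illustrate why the IO size question above matters so much, here is a minimal sketch (Python; the function name is my own illustration, not from the post) converting an IOPS figure plus IO size into the bandwidth it implies:

```python
def implied_bandwidth_mb_s(iops: float, io_size_bytes: int) -> float:
    """Bandwidth (MB/sec) implied by an IOPS figure at a given IO size."""
    return iops * io_size_bytes / 1_000_000

# A million 64 byte IOPS moves only 64 MB/sec of data...
print(implied_bandwidth_mb_s(1_000_000, 64))      # 64.0
# ...while a mere 10,000 IOPS at 64 KBytes moves ten times as much.
print(implied_bandwidth_mb_s(10_000, 64 * 1024))  # 655.36
```

In other words, the same headline IOPS number can describe wildly different amounts of real work, which is exactly why size, read/write mix and latency context matter.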

    Sure, there are more contexts and information that would be interesting as well, however learning to walk before running will help prevent falling down.

    Storage I/O trends

    Does size or age of vendors make a difference when it comes to context?

    Some vendors are doing a good job of going for out of this world record-setting marketing hero numbers.

Meanwhile other vendors are doing a good job of adding context to their IOPS, response time, bandwidth and other metrics that matter. There is a mix of startup and established vendors that give context with their IOPS and other metrics; likewise, size or age does not seem to matter among those who lack context.

    Some vendors may not offer metrics or information publicly, so fine, go under NDA to learn more and see if the results are applicable to your environments.

    Likewise, if they do not want to provide the context, then ask some tough yet fair questions to decide if their solution is applicable for your needs.


    Where To Learn More

    View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

    Software Defined Data Infrastructure Essentials Book SDDC

    What This All Means

What this means is let us start providing, and asking for, metrics that matter, such as IOPS with context.

If you have a great IOP metric and you want it to matter, then include some context such as what size (e.g. 4K, 8K, 16K, 32K, etc.), percentage of reads vs. writes, latency or response time, and random or sequential.

IMHO the most interesting or applicable metrics that matter are those relevant to your environment and application. For example, if your main application that needs SSD does about 75% reads (random) and 25% writes (sequential) with an average size of 32K, then while fun to hear about, how relevant is a million 64 byte read IOPS? Likewise when looking at IOPS, pay attention to the latency, particularly if SSD or performance is your main concern.

    Get in the habit of asking or telling vendors or their surrogates to provide some context with them metrics if you want them to matter.

    So how about some context around them IOP’s (or latency and bandwidth or availability for that matter)?

    Ok, nuff said, for now.

    Gs

    Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

    IBM buys Softlayer, for software defined infrastructures and clouds?


IBM today announced that it is acquiring privately held, Dallas, Texas-based Softlayer, an Infrastructure as a Service (IaaS) provider.

    IBM is referring to this as Cloud without Compromise (read more about clouds, conversations and confidence here).

It’s about the management, flexibility, scale up, out and down, agility and valueware.

    Is this IBM’s new software defined data center (SDDC) or software defined infrastructure (SDI) or software defined management (SDM), software defined cloud (SDC) or software defined storage (SDS) play?

    This is more than a software defined marketing or software defined buzzword announcement.
    buzzword bingo

If your view of software defined ties into the theme of leveraging and unleashing resources, enablement, flexibility, and agility of hardware, software or services, then you may see Softlayer as part of a software defined infrastructure.

On the other hand, if your views or opinions of what is or is not software defined align with a specific vendor, product, protocol, model or punditry, then you may not agree, particularly if it is in opposition to anything IBM.

    Cloud building blocks

During today’s announcement briefing call with analysts there was a noticeable absence of software defined buzz talk, which, given its hype and usage lately, was a refreshing and welcome relief. So with that, let's set the software defined conversation aside (for now).

    Cloud image

    Who is Softlayer, why is IBM interested in them?

Softlayer provides software and services to support SMB, SME and other environments with bare metal (think traditional hosted servers), along with multi-tenant (shared) virtual public and private cloud service offerings.

Softlayer supports various applications and environments, from little data processing to big data analytics, from social to mobile to legacy. This includes apps or environments that were born in the cloud, as well as legacy environments looking to leverage cloud in a complementary way.

    Some more information about Softlayer includes:

    • Privately held IaaS firm founded in 2005
    • Estimated revenue run rate of around $400 million with 21,000 customers
    • Mix of SMB, SME and Web-based or born in the cloud customers
    • Over 100,000 devices under management
    • Provides a common modularized management framework set of tools
    • Mix of customers from Web startups to global enterprise
    • Presence in 13 data centers across the US, Asia and Europe
    • Automation, interoperability, large number of API access and supported
    • Flexibility, control and agility for physical (bare metal) and cloud or virtual
    • Public, private and data center to data center
    • Designed for scale, durability and resiliency without complexity
    • Part of OpenStack ecosystem both leveraging and supporting it
    • Ability for customers to use OpenStack, Cloudstack, Citrix, VMware, Microsoft and others
    • Can be white or private labeled for use as a service by VARs


    What IBM is planning for Softlayer

Softlayer will report into IBM Global Technology Services (GTS), complementing existing capabilities which include ten cloud computing centers on five continents. IBM has created a new Cloud Services Division and expects cloud revenues could be $7 billion annually by the end of 2015. Amazon Web Services (AWS) is estimated to hit about $3.8 billion by the end of 2013. Note that in 2012 the AWS target available market was estimated to be about $11 billion, which should become larger moving forward. Rackspace by comparison announced earnings on May 8 2013 of $362 million for the quarter, with most of that being hosting vs. cloud services. That works out to an annualized estimated run rate of $1.448 billion (or better depending on growth).
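The annualized run-rate figure above is simple arithmetic (quarterly revenue times four); as a quick check:

```python
# Rackspace reported quarterly revenue, in millions of USD (May 8 2013)
quarterly_revenue_musd = 362

# Annualized run rate assumes four identical quarters (no growth)
annual_run_rate_musd = quarterly_revenue_musd * 4
print(annual_run_rate_musd)  # 1448, i.e. $1.448 billion
```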

    I mention AWS and Rackspace to illustrate the growth potential for IBM and Softlayer to discuss the needs of both cloud services customers such as those who use AWS (among other providers), as well as bare metal or hosting or dedicated servers such as with Rackspace among others.


What is not clear at this time is whether IBM is combining traditional hosting, managed services, and new offerings, products and services in that $7 billion number. In other words, if the $7 billion represents revenues of the new Cloud Services Division independent of other GTS or legacy offerings, as well as excluding hardware and software products from STG (Systems Technology Group) among others, that would be impressive and a challenge to the likes of AWS.

    IBM has indicated that it will leverage its existing Systems Technology Group (STG) portfolio of servers and storage extending the capabilities of Softlayer. While currently x86 based, one could expect IBM to leverage and add support for their Power systems line of processors and servers, Puresystems, as well as storage such as XIV or V7000 among others for tier 1 needs.

    Some more notes:

    • Ties into IBM Smart Cloud initiatives, model and paradigm
• This deal is expected to close 3Q 2013; terms and price were not disclosed.
    • Will enable Softlayer to be leveraged on a larger, broader basis by IBM
    • Gives IBM increased access to SMB, SME and web customers than in the past
    • Software and development to stay part of Softlayer
    • Provides IBM an extra jumpstart play for supporting and leveraging OpenStack
• Compatible with and supports CloudStack and Citrix, who are also IBM partners
• Also compatible with and supports VMware, who is also an IBM partner


    Some other thoughts and perspectives

This is a good and big move for IBM to add value and leverage their current portfolios of both services, as well as products and technologies. However it is more than just adding value or finding new routes to markets for those goods and services; it's also about enablement. IBM has long been in the services business, including managed services, out- or in-sourcing and hosting. This can be seen as another incremental evolution of those offerings to both existing IBM enterprise customers, as well as to reach new and emerging customers, along with SMBs or SMEs that tend to grow up and become larger consumers of information and data infrastructure services.

    Further this helps to add some product and meaning around the IBM Smart Cloud initiatives and programs (not that there was not before) giving customers, partners and resellers something tangible to see, feel, look at, touch and gain experience not to mention confidence with clouds.

On the other hand, IBM may be signaling that they want more of the growing business that AWS has been realizing, not to mention Microsoft Azure, Rackspace, Centurylink/Savvis, Verizon/Terremark, CSC, HP Cloud, Cloudsigma, Bluehost among many others (if I missed you or your favorite provider, feel free to add it to the comments section). This also gets IBM added DevOps exposure, something that Softlayer practices, as well as an OpenStack play, not to mention cloud, software defined, virtual, big data, little data, analytics and many other buzzword bingo terms.

Congratulations to both IBM and the Softlayer folks; now let's see some execution and watch how this unfolds.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Part II: How many IOPS can a HDD HHDD SSD do with VMware?

    How many IOPS can a HDD HHDD SSD do with VMware?

    server storage data infrastructure i/o iop hdd ssd trends

    Updated 2/10/2018

    This is the second post of a two-part series looking at storage performance, specifically in the context of drive or device (e.g. mediums) characteristics of How many IOPS can a HDD HHDD SSD do with VMware. In the first post the focus was around putting some context around drive or device performance with the second part looking at some workload characteristics (e.g. benchmarks).

    A common question is how many IOPS (IO Operations Per Second) can a storage device or system do?

    The answer is or should be it depends.

    Here are some examples to give you some more insight.

    For example, the following shows how IOPS vary by changing the percent of reads, writes, random and sequential for a 4K (4,096 bytes or 4 KBytes) IO size with each test step (4 minutes each).

| IO Size for test | Workload Pattern of test | Avg. Resp (R+W) ms | Avg. IOPS (R+W) | Bandwidth KB/sec (R+W) |
|------------------|--------------------------|--------------------|-----------------|------------------------|
| 4KB | 100% Seq 100% Read | 0.0 | 29,736 | 118,944 |
| 4KB | 60% Seq 100% Read | 4.2 | 236 | 947 |
| 4KB | 30% Seq 100% Read | 7.1 | 140 | 563 |
| 4KB | 0% Seq 100% Read | 10.0 | 100 | 400 |
| 4KB | 100% Seq 60% Read | 3.4 | 293 | 1,174 |
| 4KB | 60% Seq 60% Read | 7.2 | 138 | 554 |
| 4KB | 30% Seq 60% Read | 9.1 | 109 | 439 |
| 4KB | 0% Seq 60% Read | 10.9 | 91 | 366 |
| 4KB | 100% Seq 30% Read | 5.9 | 168 | 675 |
| 4KB | 60% Seq 30% Read | 9.1 | 109 | 439 |
| 4KB | 30% Seq 30% Read | 10.7 | 93 | 373 |
| 4KB | 0% Seq 30% Read | 11.5 | 86 | 346 |
| 4KB | 100% Seq 0% Read | 8.4 | 118 | 474 |
| 4KB | 60% Seq 0% Read | 13.0 | 76 | 307 |
| 4KB | 30% Seq 0% Read | 11.6 | 86 | 344 |
| 4KB | 0% Seq 0% Read | 12.1 | 82 | 330 |

    Dell/Western Digital (WD) 1TB 7200 RPM SATA HDD (Raw IO) thread count 1 4K IO size

In the above example the drive is a 1TB 7200 RPM 3.5 inch Dell (Western Digital) 3Gb SATA device doing raw (non file system) IO. Note the high IOP rate with 100 percent sequential reads and a small IO size, which might be a result of locality of reference due to drive level cache or buffering.
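As a sanity check on how the columns relate, the bandwidth column in the 4K table above is simply IOPS multiplied by the 4 KByte IO size, within rounding of the averages. A quick sketch (variable names are mine, values are from the table):

```python
# (avg IOPS, reported bandwidth KB/sec) pairs from selected 4K rows above
rows = [(29_736, 118_944), (236, 947), (100, 400), (82, 330)]

for iops, reported_kb_s in rows:
    computed_kb_s = iops * 4  # 4 KByte IOs
    # agreement within ~1%; small differences are rounding in the averages
    assert abs(computed_kb_s - reported_kb_s) / reported_kb_s < 0.01
```

This kind of cross-check is a cheap way to verify that a published IOPS, IO size and bandwidth trio is internally consistent.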

    Some drives have larger buffers than others from a couple to 16MB (or more) of DRAM that can be used for read ahead caching. Note that this level of cache is independent of a storage system, RAID adapter or controller or other forms and levels of buffering.

    Does this mean you can expect or plan on getting those levels of performance?

    I would not make that assumption, and thus this serves as an example of using metrics like these in the proper context.

    Building off of the previous example, the following is using the same drive however with a 16K IO size.

| IO Size for test | Workload Pattern of test | Avg. Resp (R+W) ms | Avg. IOPS (R+W) | Bandwidth KB/sec (R+W) |
|------------------|--------------------------|--------------------|-----------------|------------------------|
| 16KB | 100% Seq 100% Read | 0.1 | 7,658 | 122,537 |
| 16KB | 60% Seq 100% Read | 4.7 | 210 | 3,370 |
| 16KB | 30% Seq 100% Read | 7.7 | 130 | 2,080 |
| 16KB | 0% Seq 100% Read | 10.1 | 98 | 1,580 |
| 16KB | 100% Seq 60% Read | 3.5 | 282 | 4,522 |
| 16KB | 60% Seq 60% Read | 7.7 | 130 | 2,090 |
| 16KB | 30% Seq 60% Read | 9.3 | 107 | 1,715 |
| 16KB | 0% Seq 60% Read | 11.1 | 90 | 1,443 |
| 16KB | 100% Seq 30% Read | 6.0 | 165 | 2,644 |
| 16KB | 60% Seq 30% Read | 9.2 | 109 | 1,745 |
| 16KB | 30% Seq 30% Read | 11.0 | 90 | 1,450 |
| 16KB | 0% Seq 30% Read | 11.7 | 85 | 1,364 |
| 16KB | 100% Seq 0% Read | 8.5 | 117 | 1,874 |
| 16KB | 60% Seq 0% Read | 10.9 | 92 | 1,472 |
| 16KB | 30% Seq 0% Read | 11.8 | 84 | 1,353 |
| 16KB | 0% Seq 0% Read | 12.2 | 81 | 1,310 |

    Dell/Western Digital (WD) 1TB 7200 RPM SATA HDD (Raw IO) thread count 1 16K IO size

    The previous two examples are excerpts of a series of workload simulation tests (ok, you can call them benchmarks) that I have done to collect information, as well as try some different things out.

    The following is an example of the summary for each test output that includes the IO size, workload pattern (reads, writes, random, sequential), duration for each workload step, totals for reads and writes, along with averages including IOP’s, bandwidth and latency or response time.

    disk iops

Want to see more numbers, speeds and feeds? Check out the following table, which will be updated with extra results as they become available.

| Device | Vendor | Make | Model | Form Factor | Capacity | Interface | RPM Speed | Raw Test Result |
|--------|--------|------|-------|-------------|----------|-----------|-----------|-----------------|
| HDD | HGST | Desktop | HK250-160 | 2.5 | 160GB | SATA | 5.4K | |
| HDD | Seagate | Mobile | ST2000LM003 | 2.5 | 2TB | SATA | 5.4K | |
| HDD | Fujitsu | Desktop | MHWZ160BH | 2.5 | 160GB | SATA | 7.2K | |
| HDD | Seagate | Momentus | ST9160823AS | 2.5 | 160GB | SATA | 7.2K | |
| HDD | Seagate | MomentusXT | ST95005620AS | 2.5 | 500GB | SATA | 7.2K(1) | |
| HDD | Seagate | Barracuda | ST3500320AS | 3.5 | 500GB | SATA | 7.2K | |
| HDD | WD/Dell | Enterprise | WD1003FBYX | 3.5 | 1TB | SATA | 7.2K | |
| HDD | Seagate | Barracuda | ST3000DM01 | 3.5 | 3TB | SATA | 7.2K | |
| HDD | Seagate | Desktop | ST4000DM000 | 3.5 | 4TB | SATA | HDD | |
| HDD | Seagate | Capacity | ST6000NM00 | 3.5 | 6TB | SATA | HDD | |
| HDD | Seagate | Capacity | ST6000NM00 | 3.5 | 6TB | 12GSAS | HDD | |
| HDD | Seagate | Savio 10K.3 | ST9300603SS | 2.5 | 300GB | SAS | 10K | |
| HDD | Seagate | Cheetah | ST3146855SS | 3.5 | 146GB | SAS | 15K | |
| HDD | Seagate | Savio 15K.2 | ST9146852SS | 2.5 | 146GB | SAS | 15K | |
| HDD | Seagate | Ent. 15K | ST600MP0003 | 2.5 | 600GB | SAS | 15K | |
| SSHD | Seagate | Ent. Turbo | ST600MX0004 | 2.5 | 600GB | SAS | SSHD | |
| SSD | Samsung | 840 Pro | MZ-7PD256 | 2.5 | 256GB | SATA | SSD | |
| SSD | Seagate | 600 SSD | ST480HM000 | 2.5 | 480GB | SATA | SSD | |
| SSD | Seagate | 1200 SSD | ST400FM0073 | 2.5 | 400GB | 12GSAS | SSD | |

    Performance characteristics 1 worker (thread count) for RAW IO (non-file system)

Note: (1) The Seagate Momentus XT is a Hybrid Hard Disk Drive (HHDD) based on a 7.2K 2.5 inch HDD with SLC nand flash integrated as a read buffer in addition to the normal DRAM buffer. This model is an XT I (4GB SLC nand flash); an XT II (8GB SLC nand flash) may be added at some future time.

As a starting point, these results are raw IO, with file system based information to be added soon along with more devices. These results are for tests with one worker or thread count; other results, such as with 16 workers or thread counts, will be added to show how those differ.

The above results include all reads, all writes, a mix of reads and writes, along with all random, sequential and mixed for each IO size. IO sizes include 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K, 1024K and 2048K. As with any workload simulation, benchmark or comparison test, take these results with a grain of salt as your mileage can and will vary. For example, you will see some of what I consider very high IO rates with sequential reads even without file system buffering. These results might be due to locality of reference, with IO's being resolved out of the drive's DRAM cache (read ahead), which varies in size across different devices. Use the vendor model numbers in the table above to check the manufacturer's specs on drive DRAM and other attributes.
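Another useful cross-check for single-worker results like these: with one outstanding IO at a time, each operation must complete before the next starts, so IOPS is roughly 1000 divided by the average response time in milliseconds (Little's Law with a concurrency of one). A small sketch using values from the 4K table:

```python
def single_worker_iops(avg_response_ms: float) -> float:
    # One outstanding IO at a time: throughput = 1 / service time
    return 1000.0 / avg_response_ms

# 4K 0% Seq 100% Read row: 10.0 ms average response, 100 IOPS reported
print(round(single_worker_iops(10.0)))  # 100
# 4K 60% Seq 100% Read row: 4.2 ms, 236 IOPS reported (agrees within rounding)
print(round(single_worker_iops(4.2)))   # 238
```

If a published single-threaded result does not roughly satisfy this relationship, something about the test (caching, queue depth, averaging) needs explaining.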

If you are used to seeing 4K or 8K and wonder why anybody would be interested in some of the larger sizes, take a look at big fast data or cloud and object storage; for some of those applications 2048K may not seem all that big. Likewise, if you are used to the larger sizes, there are still applications doing smaller sizes. Sorry for those who like 512 byte or smaller IO's, as they are not included. Note that for all of these, unless indicated, a 512 byte standard sector or drive format is used as opposed to the emerging Advanced Format (AF) 4KB sector or block size. Watch for some more drive and device types to be added to the above, along with results for more workers or thread counts, plus file system and other scenarios.

    Using VMware as part of a Server, Storage and IO (aka StorageIO) test platform

    vmware vexpert

The above performance results were generated on Ubuntu 12.04 (since upgraded to 14.04), hosted on a purchased version of VMware vSphere 5.1 (upgraded to 5.5U2) with vCenter enabled (you can get the ESXi free version here). I also have VMware Workstation installed on some of my Windows-based laptops for doing preliminary testing of scripts and other activity prior to running them on the larger server-based VMware environment. Other VMware tools include vCenter Converter, vSphere Client and CLI. Note that other guest virtual machines (VMs) were idle during the tests (e.g. other guest VMs were quiet). You may experience different results if you run Ubuntu natively on a physical machine or with different adapters, processors and device configurations among many other variables (that was a disclaimer btw ;) ).


    All of the devices (HDD, HHDD, SSD’s including those not shown or published yet) were Raw Device Mapped (RDM) to the Ubuntu VM bypassing VMware file system.

    Example of creating an RDM for local SAS or SATA direct attached device.

    vmkfstools -z /vmfs/devices/disks/naa.600605b0005f125018e923064cc17e7c /vmfs/volumes/dat1/RDM_ST1500Z110S6M5.vmdk

The above uses the drive's address (found by doing an ls -l /dev/disks via the VMware shell command line) to create a vmdk container stored in a datastore. Note that the RDM being created does not actually store data in the .vmdk; it is there for VMware management operations.

If you are not familiar with how to create an RDM of a local SAS or SATA device, check out this post to learn how. This is important to note in that while VMware was used as a platform to support the guest operating systems (e.g. Ubuntu or Windows), the real devices are not being mapped through or via VMware virtual drives.

    vmware iops

The above shows examples of RDM SAS and SATA devices along with other VMware devices and datastores. The next figure shows an example of a workload being run in the test environment.

    vmware iops

One of the advantages of using VMware (or another hypervisor) with RDM's is that I can quickly define via software commands where a device gets attached to different operating systems (e.g. the other aspect of software defined storage). This means that after a test run, I can simply shut down Ubuntu, remove the RDM device from that guest's settings, move the device just tested to a Windows guest if needed and restart those VMs. All of that from wherever I happen to be working, without physically changing things or dealing with multi-boot or cabling issues.

    Where To Learn More

    View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

    Software Defined Data Infrastructure Essentials Book SDDC

    What This All Means

    So how many IOPs can a device do?

    That depends, however have a look at the above information and results.

    Check back from time to time here to see what is new or has been added including more drives, devices and other related themes.

    Ok, nuff said, for now.

    Gs


    How many I/O iops can flash SSD or HDD do?

    How many i/o iops can flash ssd or hdd do with vmware?

    sddc data infrastructure Storage I/O ssd trends

    Updated 2/10/2018

A common question I run across is how many I/O operations per second (IOPS) a flash SSD or HDD storage device or system can do or give.

    The answer is or should be it depends.

    This is the first of a two-part series looking at storage performance, and in context specifically around drive or device (e.g. mediums) characteristics across HDD, HHDD and SSD that can be found in cloud, virtual, and legacy environments. In this first part the focus is around putting some context around drive or device performance with the second part looking at some workload characteristics (e.g. benchmarks).

What about cloud, tape summit resources, storage systems or appliances?

Let's leave those for a different discussion at another time.

    Getting started

Part of my interest in tools, metrics that matter, measurements, analysis and forecasting ties back to having been a server, storage and IO performance and capacity planning analyst when I worked in IT. Another aspect ties back to also having been a sys admin as well as a business applications developer when on the IT customer side of things. This was followed by switching over to the vendor world, involved with among other things competitive positioning, customer design configuration, validation, simulation and benchmarking of HDD and SSD based solutions (e.g. life before becoming an analyst and advisory consultant).

Btw, if you happen to be interested in learning more about server, storage and IO performance and capacity planning, check out my first book Resilient Storage Networks (Elsevier), which has a bit of information on it. There is also coverage of metrics and planning in my two other books, The Green and Virtual Data Center (CRC Press) and Cloud and Virtual Data Storage Networking (CRC Press). I have some copies of Resilient Storage Networks available at a special reader or viewer rate (essentially shipping and handling). If interested, drop me a note and I can fill you in on the details.

There are many rules of thumb (RUT) when it comes to metrics that matter such as IOPS, some older than others, and some guessed at or measured in different ways. However the answer is that it depends on many things, ranging from whether it is a standalone hard disk drive (HDD), Hybrid HDD (HHDD) or Solid State Device (SSD), or whether it is attached to a storage system, appliance, or RAID adapter card among others.

    Taking a step back, the big picture

    hdd image
    Various HDD, HHDD and SSD’s

    Server, storage and I/O performance and benchmark fundamentals

Even just looking at a HDD, there are many variables, ranging from the rotational speed or Revolutions Per Minute (RPM) to the interface, including 1.5Gb, 3.0Gb, 6Gb or 12Gb SAS or SATA, or 4Gb Fibre Channel. Simply using a RUT or number based on RPM can cause issues, particularly with 2.5 vs. 3.5 inch or enterprise vs. desktop drives. For example, some current generation 10K 2.5 inch HDDs can deliver the same or better performance than an older generation 3.5 inch 15K. Other drive factors (see this link for HDD fundamentals) include physical size, such as 3.5 inch or 2.5 inch small form factor (SFF), enterprise or desktop or consumer class, and the amount of drive level cache (DRAM). Space capacity of a drive can also have an impact, such as whether all or just a portion of a large or small capacity device is used. Not to mention what the drive is attached to, ranging from an internal SAS or SATA drive bay, USB port, HBA or RAID adapter card, or a storage system.
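One such RUT can be written down directly: a random IO on a HDD costs roughly an average seek plus half a rotation (plus transfer time). Here is a hedged sketch of that rule of thumb; the seek times used are typical published figures, not measurements from this post:

```python
def hdd_random_iops_estimate(avg_seek_ms: float, rpm: int,
                             transfer_ms: float = 0.0) -> float:
    """Rule-of-thumb random IOPS for a single HDD (no caching, queue depth 1)."""
    rotational_latency_ms = 0.5 * (60_000.0 / rpm)  # half a revolution on average
    service_time_ms = avg_seek_ms + rotational_latency_ms + transfer_ms
    return 1000.0 / service_time_ms

# A ~3.4 ms seek 15K SAS drive: roughly 185 random IOPS
print(round(hdd_random_iops_estimate(3.4, 15_000)))  # 185
# A ~8.5 ms seek 7.2K SATA drive: roughly 79 random IOPS
print(round(hdd_random_iops_estimate(8.5, 7_200)))   # 79
```

As the surrounding text warns, this is only a starting point; caching, short stroking, IO size and queue depth can move real results well away from the estimate.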

    disk iops
    HDD fundamentals

How about benchmark and performance tricks for marketing or comparison, including delayed, deferred or asynchronous writes vs. synchronous or actually committed data to devices? Let's not forget about short stroking (only using a portion of a drive for better IOPS) or even long stroking (to get better bandwidth leveraging spiral transfers) among others.

Almost forgot: there are also thick, standard, thin and ultra thin drives in 2.5 and 3.5 inch form factors. What's the difference? The number of platters and read/write heads. Look at the following image showing various thickness 2.5 inch drives that use different numbers of platters to increase space capacity in a given density. Want to take a wild guess as to which one has the most space capacity in a given footprint? Also want to guess which type I use for removable disk based archives along with for onsite disk based backup targets (complementing my offsite cloud backups)?

    types of disks
    Thick, thin and ultra thin devices

Beyond physical and configuration items, there is the logical configuration, including the type of workload: large or small IOPS, random, sequential, reads, writes or mixed (various random, sequential, read, write, large and small IO). Other considerations include file system or raw device, number of workers or concurrent IO threads, and the size of the target storage space, which determines the impact of any locality of reference or buffering. Some other items include how long the test or workload simulation ran, and whether the device was new or worn in before use, among other items.

    Tools and the performance toolbox

    Then there are the various tools for generating IO’s or workloads along with recording metrics such as reads, writes, response time and other information. Some examples (mix of free or for fee) include Bonnie, Iometer, Iorate, IOzone, Vdbench, TPC, SPC, Microsoft ESRP, SPEC and netmist, Swifttest, Vmark, DVDstore and PCmark 7 among many others. Some are focused just on the storage system and IO path while others are application specific thus exercising servers, storage and IO paths.

    performance tools
    Server, storage and IO performance toolbox

Having used Iometer since the late 90s, it has its place and is popular given its ease of use. Iometer is also long in the tooth and has its limits, including not much if any new development; nevertheless, I have it in the toolbox. I also have Futuremark PCMark 7 (full version) which, it turns out, has some interesting abilities to do more than exercise an entire Windows PC. For example, PCMark can use a secondary drive for doing IO to.

PCMark can be handy for spinning up lots of virtual Windows systems with VMware (or other tools), pointing at a NAS or other shared storage device and doing real world type activity. Something that could be handy for testing or stressing virtual desktop infrastructures (VDI) along with other storage systems, servers and solutions. I also have Vdbench among other tools in the toolbox, including Iorate, which was used to drive the workloads shown below.

What I look for in a tool is how extensible the scripting capabilities are for defining various workloads, along with the capabilities of the test engine. A nice GUI is handy, which makes Iometer popular, and yes, there are scripting capabilities with Iometer. That is also where Iometer is long in the tooth compared to some of the newer generation of tools that put more emphasis on extensibility vs. ease of use interfaces. This also assumes knowing what workloads to generate vs. simply kicking off some IOPS using default settings to see what happens.

Another handy kind of tool records what is going on with a running system, including IOs, reads, writes, bandwidth or transfers, random and sequential among other things. This is where, when needed, I turn to something like hIOmon from HyperIO; if you have not tried it, get in touch with Tom West over at HyperIO and tell him StorageIO sent you to get a demo or trial. hIOmon is what I used for start, stop and boot testing among others, being able to see IOs at the Windows file system level (or below), including very early in the boot or shutdown phase.

Here is a link to some test profiling of Windows and VDI activity I did a while back with hIOmon.

    What’s the best tool or benchmark or workload generator?

    The one that meets your needs, usually your applications or something as close as possible to it.

    disk iops
    Various 2.5 and 3.5 inch HDD, HHDD, SSD with different performance

    Where To Learn More

    View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

    Software Defined Data Infrastructure Essentials Book SDDC

    What This All Means

    That depends, however continue reading part II of this series to see some results for various types of drives and workloads.

    Ok, nuff said, for now.

    Gs

    Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

    EMC ViPR software defined object storage part III

    Storage I/O trends

    This is part III in a series of posts pertaining to EMC ViPR software defined storage and object storage. You can read part I here and part II here.

    EMCworld

    More on the object opportunity

Other object access includes the OpenStack object storage component Swift, along with AWS S3 HTTP and REST API access. ViPR also supports EMC Atmos, VNX and Isilon arrays as southbound persistent storage.

    object storage
    Object (and cloud) storage access example

    EMC is claiming that over 250 VNX systems can be abstracted to support scaling with stability (performance, availability, capacity, economics) using ViPR. Third party storage will be supported along with software such as OpenStack Swift, Ceph and others running on commodity hardware. Note that EMC has some history with object storage and access including Centera and Atmos. Visit the micro site I have setup called www.objectstoragecenter.com and watch for more content to be updated and added there.

    More on the ViPR control plane and controller

    ViPR differs from some others in that it does not sit in the data path all the time (e.g. between application servers and storage systems or cloud services) to cut potential for bottlenecks.

    ViPR architecture

Organizations that can use ViPR include enterprises, SMBs, CSPs or MSPs and hosting sites. ViPR can be used in a control mode to leverage the intelligence and functionality of underlying storage systems, appliances and services. This means ViPR can complement southbound or target storage systems and services as opposed to treating them as dumb disks or JBOD.

On the other hand, ViPR will also have a suite of data services such as snapshot, replication, data migration, movement and tiering to add value where those do not exist. Customers will be free to choose how they want to use and deploy ViPR, for example leveraging underlying storage functionality (e.g. a lightweight model), or a more familiar heavy lifting storage virtualization model. In the heavy lifting model more work is done by the virtualization or abstraction software to create added value, however that can be a concern for bottlenecks depending on how it is deployed.

    Service categories

    Software defined, storage hypervisor, virtual storage or storage virtualization?

Most storage virtualization, storage hypervisor and virtual storage solutions, whether hardware or software based (e.g. software defined), implement what is referred to as in-band. With in-band, the storage virtualization software or hardware sits between the applications (northbound) and the storage systems or services (southbound).

While this approach can be easier to implement and lends itself to adding value-added services, it can also introduce scaling bottlenecks depending on the implementation. Examples of in-band storage virtualization include Actifio, DataCore, EMC VMAX with third-party storage, HDS with third-party storage, IBM SVC (and the V7000 Storwize storage system based on it) and NetApp V-Series among others. An advantage of in-band approaches is that there should be no host or server-side software requirements, providing SAN transparency.

There is another approach called out-of-band that has also been tried. However, pure out-of-band requires a management system along with agents, drivers, shims, plugins or other software resident on the host application servers.

    fast path control path
    Example of generic fast path control path model

ViPR takes a different approach, one seen a few years ago with EMC Invista, called fast path control path, which for the most part stays out of the data path. While this is similar to out-of-band, there should be no need for any host server-side (e.g. northbound) software. By being fast path control path, the virtualization or abstraction and management functions stay out of the way of data being moved or work being done.

    Hmm, kind of like how management should be, there to help when needed, out-of-the-way not causing overhead other times ;).

    Is EMC the first (even with Invista) to leverage fast path control path?

Actually, up until about a year or so ago, shortly after HP acquired 3PAR, HP had a solution called the SAN Virtualization Services Platform (SVSP) that was OEMed from LSI (e.g. StorAge). Unfortunately, HP decided to retire it as opposed to extending its capabilities for file and object access (northbound) as well as different southbound targets or destination services.

What's this northbound and southbound stuff?

    Simply put, think in terms of a vertical stack with host servers (PMs or VMs) on the top with applications (and hypervisors or other tools such as databases) on top of them (e.g. north).

    software defined storage
    Northbound servers, southbound storage systems and cloud services

    Think of storage systems, appliances, cloud services or other target destinations on the bottom (or south). ViPR sits in between providing storage services and management to the northbound servers leveraging the southbound storage.

    What host servers can VIPR support for serving storage?

    VIPR is being designed to be server agnostic (e.g. virtual or physical), along with operating system agnostic. In addition VIPR is being positioned as capable of serving northbound (e.g. up to application servers) block, file or object as well as accessing southbound (e.g. targets) block, file and object storage systems, file systems or services.

Note that a difference from earlier similar EMC solutions is that those have been either block based (e.g. Invista, VPLEX, VMAX with third-party storage) or file based. Also note that this means VIPR is not just for VMware or virtual server environments; it can exist in legacy, virtual or cloud environments.

    ViPR image

Likewise VIPR is intended to be application agnostic, supporting little data, big data and very big data (VBD) along with Hadoop or other specialized processing. Note that while VIPR will support HDFS in addition to NFS and CIFS file-based access, Hadoop will not be running on or in the VIPR controllers; that would live or run elsewhere.

    How will VIPR be deployed and licensed?

EMC has indicated that the VIPR controller will be delivered as software that installs into a virtual appliance (e.g. VMware) running as a virtual machine (VM) guest. It is not clear when support will exist for other hypervisors (e.g. Microsoft Hyper-V, Citrix/Xen, KVM), or whether VMware vSphere with vCenter is required or the free ESXi version will suffice. As of the announcement pre-briefing, EMC had not yet finalized pricing and licensing details. General availability is expected in the second half of calendar 2013.

    Keep in mind that the VIPR controller (software) runs as a VM that can be hosted on a clustered hypervisor for HA. In addition, multiple VIPR controllers can exist in a cluster to further enhance HA.

    Some questions to be addressed among others include:

    • How and where are IOs intercepted?
    • Who can have access to the APIs, what is the process, is there a developers program, SDK along with resources?
    • What network topologies are supported local and remote?
    • What happens when JBOD is used and no advanced data services exist?
    • What are the characteristics of the object access functionality?
    • What if any specific switches or data path devices and tools are needed?
    • How does a host server know to talk with its target and ViPR controller know when to intercept for handling?
    • Will SNIA CDMI be added and when as part of the object access and data services capabilities?
    • Are programmatic bindings available for the object access along with support for other APIs including IOS?
    • What are the performance characteristics including latency under load as well as during a failure or fault scenario?
• How will EMC position VPLEX and its caching model on a local and wide area basis vs. ViPR, or will we see the two work together, and if so, what will that be?

    Bottom line (for now):

    Good move for EMC, now let us see how they execute including driving adoption of their open APIs, something they have had success in the past with Centera and other solutions. Likewise, let us see what other storage vendors become supported or add support along with how pricing and licensing are rolled out. EMC will also have to articulate when and where to use ViPR vs. VPLEX along with other storage systems or management tools.

    Additional related material:
    Are you using or considering implementation of a storage hypervisor?
    Cloud and Virtual Data Storage Networking (CRC)
    Cloud conversations: Public, Private, Hybrid what about Community Clouds?
    Cloud, virtualization, storage and networking in an election year
    Does software cut or move place of vendor lock-in?
    Don’t Use New Technologies in Old Ways
    EMC VPLEX: Virtual Storage Redefined or Respun?
    How many degrees separate you and your information?
    Industry adoption vs. industry deployment, is there a difference?
    Many faces of storage hypervisor, virtual storage or storage virtualization
    People, Not Tech, Prevent IT Convergence
    Resilient Storage Networks (Elsevier)
    Server and Storage Virtualization Life beyond Consolidation
    Should Everything Be Virtualized?
    The Green and Virtual Data Center (CRC)
    Two companies on parallel tracks moving like trains offset by time: EMC and NetApp
    Unified storage systems showdown: NetApp FAS vs. EMC VNX
    backup, restore, BC, DR and archiving
    VMware buys virsto, what about storage hypervisor’s?
    Who is responsible for vendor lockin?

    Ok, nuff said (for now)

    Cheers gs


    EMC ViPR software defined object storage part II

    Storage I/O trends

    This is part II in a series of posts pertaining to EMC ViPR software defined storage and object storage. You can read part I here and part III here.

    EMCworld

    Some questions and discussion topics pertaining to ViPR:

    Whom is ViPR for?

Organizations that need to scale with stability across EMC, third-party or open storage software stacks and commodity hardware. This applies to large and small enterprises, cloud service providers, managed service providers, and virtual and cloud environments.

    What this means for EMC hardware/platform/systems?

    They can continue to be used as is, or work with ViPR or other deployment modes.

    Does this mean EMC storage systems are nearing their end of life?

    IMHO for the most part not yet, granted there will be some scenarios where new products will be used vs. others, or existing ones used in new ways for different things.

    As has been the case for years if not decades, some products will survive, continue to evolve and find new roles, kind of like different data storage mediums (e.g. ssd, disk, tape, etc).

    How does ViPR work?

ViPR functions as a control plane across the data and storage infrastructure supporting both northbound and southbound. Northbound refers to use from or up to application servers (physical machines (PMs) and virtual machines (VMs)). Southbound refers to target or destination storage systems. Storage systems can be traditional EMC or third-party (NetApp was mentioned as part of the first release), appliances, just a bunch of disks (JBOD) or cloud services.

    Some general features and functions:

    • Provisioning and allocation (with automation)
    • Data and storage migration or tiering
    • Leverage scripts, templates and workbooks
    • Support service categories and catalogs
    • Discovery, registration of storage systems
• Creation of storage resource pools for host systems
    • Metering, measuring, reporting, charge or show back
    • Alerts, alarms and notification
    • Self-service portal for access and provisioning
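To make the control-plane idea behind the feature list above concrete, here is a toy sketch (illustration only; the class, method and array names are hypothetical and are not ViPR APIs) of a controller that registers heterogeneous arrays and then provisions volumes from their pooled capacity:

```python
class ToyStorageController:
    """Toy control-plane model: register heterogeneous arrays,
    then provision volumes from pooled capacity (illustration only)."""

    def __init__(self):
        self.arrays = {}   # array name -> free capacity in GB
        self.volumes = {}  # volume name -> (array name, size in GB)

    def register_array(self, name, capacity_gb):
        # Discovery/registration step from the feature list.
        self.arrays[name] = capacity_gb

    def provision(self, volume, size_gb):
        # Allocate from whichever registered array has enough room.
        for name, free in self.arrays.items():
            if free >= size_gb:
                self.arrays[name] = free - size_gb
                self.volumes[volume] = (name, size_gb)
                return name
        raise RuntimeError("no array with sufficient free capacity")

ctrl = ToyStorageController()
ctrl.register_array("vnx-1", 500)
ctrl.register_array("netapp-1", 1000)
print(ctrl.provision("db-vol", 750))  # -> netapp-1
```

The point of the sketch is the separation of concerns: the controller only tracks and decides (control plane); actual IO would flow directly between hosts and arrays (data plane).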

    ViPR data plane (adding data services and value when needed)

Another part is the data plane for implementing data services and access. For block and file access, when those services are not needed, ViPR steps out of the way, leveraging the underlying storage systems or services.

    object storage
    Object storage access

When needed, the ViPR data plane can step in to provide additional services and functionality, along with supporting object-based access for little data and big data. For example, Hadoop Distributed File System (HDFS) services can support northbound analytics software running on servers accessing storage managed by ViPR.

    Continue reading in part III of this series here including how ViPR works, who it is for and more analysis.

    Ok, nuff said (for now)

    Cheers gs


    EMC ViPR virtual physical object and software defined storage (SDS)

    Storage I/O trends

    Introducing EMC ViPR

    This is the first in a three part series, read part II here, and part III here.

During the recent EMCworld event in Las Vegas, among other things, EMC announced ViPR (read the announcement here). Note that this ViPR is not the same as the EMC Viper project from a few years ago that was focused on data footprint reduction (DFR) including dedupe. ViPR has been in the works for a couple of years, taking a step back and rethinking how storage can be used going forward.

    EMCworld

ViPR is not a technology created in a vacuum; instead it incorporates customer feedback, wants and needs. Its core themes are being extensible, open and scalable.

    EMCworld

    On the other hand, ViPR addresses plenty of buzzword bingo themes including:

    • Agility, flexibility, multi-tenancy, orchestration
    • Virtual appliance and control plane
    • Data services and storage management
    • IT as a Service (ITaaS) and Infrastructure as a Service (IaaS)
    • Scaling with stability without compromise
    • Software defined storage
    • Public, private, hybrid cloud
    • Big data and little data
    • Block, file and object storage
    • Control plane and data plane
    • Storage hypervisor, virtualization and virtual storage
    • Heterogeneous (third-party) storage support
    • Open API and automation
    • Self-service portals, service catalogs

    Buzzword bingo

    Note that this is essentially announcing the ViPR product and program initiative with general availability slated for second half of 2013.

    What is ViPR addressing?

    IT and data infrastructure (server, storage, IO and networking hardware, software) challenges for traditional, virtual and cloud environments.

    • Data growth, after all, there is no such thing as an information recession with more data being generated, moved, processed, stored and retained for longer periods of time. Then again, people and data are both getting larger and living longer, for both little data and big data along with very big data.
• Overhead and complexities associated with managing and using an expanding, homogenous (same vendor, perhaps different products) or heterogeneous (different vendors and products) data infrastructure across cloud, virtual and physical, legacy and emerging. This includes adds, changes or moves, updates and upgrades, retirement and replacement along with disposition, not to mention protecting data in an expanding footprint.
    • road to cloud

    • Operations and service management, fault and alarm notification, resolution and remediation, rapid provisioning, removing complexity and cost of doing things vs. simply cutting cost and compromising service.

    EMC ViPR

    What is this software defined storage stuff?

    There is the buzzword aspect, and then there is the solution and business opportunity.

    First the buzzword aspect and bandwagon:

    • Software defined marketing (SDM) Leveraging software defined buzzwords.
    • Software defined data centers (SDDC) Leveraging software to derive more value from hardware while enabling agility, flexibility, and scalability and removing complexity. Think the Cloud and Virtual Data Center models including those from VMware among others.
    • Software defined networking (SDN) Rather than explain, simply look at Nicira that VMware bought in 2012.
    • Software defined storage (SDS) Storage software that is independent of any specific hardware, which might be a bit broad, however it is also narrower than saying anything involving software.
    • Software defined BS (SDBS) Something that usually happens as a result when marketers and others jump on a bandwagon, in this case software defined marketing.

    Note that not everything involved with software defined is BS, only some of the marketing spins and overuse. The downside to the software defined marketing and SDBS is the usual reaction of skepticism, cynicism and dismissal, so let us leave the software defined discussion here for now.

    software defined storage

An example of software defined storage can be storage virtualization, virtual storage and storage hypervisors that are hardware independent. Note that when I say hardware independent, that also means being able to support different vendors' systems. Now if you want to have some fun with the software defined storage diehards or purists, tell them that all hardware needs software and all software needs hardware, even if virtual. Further, hardware is defined by its software, however let's leave sleeping dogs lie where they rest (at least for now ;)).

Storage hypervisors were a popular 2012 buzzword bingo topic with plenty of industry adoption and some customer deployment. While 2012 saw plenty of SDM buzz including SDDC and SDN, 2013 is already seeing an increase including software defined servers and software defined storage.

Regardless of what your view of software defined storage, storage hypervisors, storage virtualization and virtual storage is, the primary focus and goal should be addressing business and application needs. Unfortunately, some of the discussions or debates about what is or is not software defined and related themes lose focus of what should be the core goal of enabling business and applications.

    Continue reading in part II of this series here including how ViPR works, who it is for and more analysis.

    Ok, nuff said (for now)

    Cheers gs


    March 2013 Server and StorageIO Update Newsletter

    StorageIO News Letter Image
    March 2013 News letter

    Welcome to the March 2013 edition of the StorageIO Update news letter including a new format and added content.

    You can get access to this news letter via various social media venues (some are shown below) in addition to StorageIO web sites and subscriptions.

    Click on the following links to view the March 2013 edition as (HTML sent via Email) version, or PDF versions.

    Visit the news letter page to view previous editions of the StorageIO Update.

    You can subscribe to the news letter by clicking here.

    Enjoy this edition of the StorageIO Update news letter, let me know your comments and feedback.

    Nuff said for now

    Cheers
    Gs


    Cloud conversations: AWS EBS, Glacier and S3 overview (Part III)

    Storage I/O industry trends image

    Amazon Web Services (AWS) recently added EBS Optimized support for enhanced bandwidth EC2 instances (read more here). This industry trends and perspective cloud conversation is the third (tying the posts together) in a three-part series companion to the AWS EBS optimized post found here. Part I is here (closer look at EBS) and part II is here (closer look at S3).

    AWS image via Amazon.com

    Cloud storage and object storage I/O figure
    Cloud and object storage access example via Cloud and Virtual Data Storage Networking

    AWS cloud storage gateway

In 2012 AWS released their Storage Gateway, which you can use and try for free here using either an EC2 Amazon Machine Image (AMI), or deployed locally on a hypervisor such as VMware vSphere/ESXi. About a year ago I did a storage gateway post (first, second and third impressions) when it was first released. I will do a new post soon following up with my later impressions and experiences from having used it recently. For now, my quick fourth impressions can be found here in this AWS Marketplace review. In general, the gateway is an AWS alternative to using third-party gateway products, appliances or software tools for accessing AWS storage.

    AWS Storage Gateway
    Image courtesy of www.amazon.com

When deployed locally on a VM, the storage gateway communicates using the AWS APIs back to the S3 and EBS storage services (depending on how it is configured). Locally, the storage gateway presents an iSCSI block access method for Windows or other servers to use.

There are two modes, Gateway-Stored and Gateway-Cached. Gateway-Stored keeps primary storage local, mapped to the storage gateway, with asynchronous (time-delayed, user-defined) snapshots going to S3 via EBS volumes. This is a handy way to have local storage for low-latency access, yet use AWS for HA, BC and DR, along with a means for migrating data into or out of AWS. Gateway-Cached mode places primary storage in AWS S3 with a local cached copy to reduce network overhead.

    Storage I/O industry trends image

When I tried the gateway a month or so ago, using both modes, I was not able to view any of my data using standard S3 tools. For example, when I looked in my S3 buckets the objects did not appear, something that AWS said has to do with where and how those buckets and objects are managed. On the other hand, I was able to see EBS snapshots for the Gateway-Stored mode, including using them as a means of moving data between local and AWS EC2 instances. Note that regardless of the AWS storage gateway mode, some local cache storage is needed, and likewise some EBS volumes will be needed depending on which mode is used.

When I used the gateway, a Windows Server mounted the iSCSI volume presented by the storage gateway and in turn served that to other systems as a shared folder. Thus while having block access such as iSCSI is nice, a NAS (NFS or CIFS) presentation and access mode would also be useful. However, more on the storage gateway in a future post. Also note that beyond the free trial period there are fees for using the gateway (you may have to pay for storage being used), as well as for the S3 and EBS storage volumes used.

    AWS image via Amazon.com

    What about Glacier?

Shortly after its release last year, I did this piece about Glacier and have since been doing some proof-of-concept testing with it.

I like Glacier and its prospects for various things, particularly for inactive data including deep archives that will seldom if ever be accessed, yet need to be retained. The business value proposition of Glacier is that it offers very high durability at low cost, assuming that you do not need to frequently access your data, and that when you do, you can wait 3 to 5 hours before retrieving it from your S3 buckets.

Access to Glacier is via API or the AWS console, so getting things into and out of it can be a challenge. For example, I wanted to see if I could use the AWS storage gateway to more easily bulk move things into Glacier via S3, however no luck, at least for today. Speaking of S3, by setting your policies you determine when objects get moved into Glacier as well as how long they will stay there; you can read more about Glacier here and via AWS here.
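As an illustration of the lifecycle-policy approach just described, the snippet below builds an S3 lifecycle rule that transitions objects under a prefix into Glacier after 30 days and expires them after a year. The bucket name, prefix and day counts are hypothetical examples; the actual apply step (shown commented out, using the boto3 library's `put_bucket_lifecycle_configuration` call) needs AWS credentials:

```python
# Lifecycle rule that moves objects into Glacier after 30 days and
# deletes them after a year; prefix and day counts are illustrative.
lifecycle_rule = {
    "ID": "archive-to-glacier",
    "Filter": {"Prefix": "archive/"},
    "Status": "Enabled",
    "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
    "Expiration": {"Days": 365},
}
lifecycle_config = {"Rules": [lifecycle_rule]}

# Applying it requires credentials and a real bucket, e.g.:
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="my-example-bucket",
#     LifecycleConfiguration=lifecycle_config)
print(lifecycle_config["Rules"][0]["Transitions"][0]["StorageClass"])  # -> GLACIER
```

Once such a rule is in place, S3 handles the movement into Glacier for you, which sidesteps the bulk-move challenge mentioned above for new data landing under that prefix.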

    Storage I/O industry trends image

    How much do these AWS services cost?

    Fees vary depending on which region is selected, amount of space capacity, level or durability and availability, performance along with type of service. S3 pricing can be found here including a free trial tier along with optional fees. Other AWS fees for EC2 can be found here, EBS pricing here, Glacier here, and storage gateway costs are located here.

    Note that there is a myth that cloud vendors have hidden fees which may be the case for some, however so far I have not seen that to be the case with AWS. However, as a consumer, designer or architect, doing your homework and looking at the above links among others you can be ready and understand the various fees and options. Hence like procuring traditional hardware, software or services, do your due diligence and be an informed shopper.

    Amazon Web Services (AWS) image

    Some more service cost notes include:

    Note that with S3 Standard and RRS objects there is not a charge for deletion of objects, however there is a pro-rated charge per GByte of Glacier objects removed prior to 90 days. Glacier also allows up to 5% of your average monthly storage usage (pro-rated daily) to be restored with no charge, other fees apply for restoring larger amounts in a given period. Thus if you are planning on accessing and using data, analyze what your activity and usage will be as part of calculating your costs with Glacier. Read more about Glacier here.
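To make the pro-rated early-deletion charge concrete, here is a small hedged calculator; the per-GB fee is an illustrative placeholder rather than a quoted AWS price, so plug in the current rate from the Glacier pricing page:

```python
def glacier_early_delete_fee(size_gb, days_stored, fee_per_gb=0.03,
                             min_days=90):
    """Pro-rated charge for deleting a Glacier object before min_days.
    fee_per_gb is an assumed illustrative rate, not an actual AWS price."""
    if days_stored >= min_days:
        return 0.0  # held past the minimum period, no deletion charge
    # Charge only for the unfulfilled remainder of the minimum period.
    remaining_days = min_days - days_stored
    return round(size_gb * fee_per_gb * (remaining_days / min_days), 2)

# 100 GB deleted after 30 days: charged for the remaining 60 of 90 days.
print(glacier_early_delete_fee(100, 30))  # -> 2.0
```

Running what-if numbers like this against your expected churn is exactly the "analyze what your activity and usage will be" step suggested above.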

Standard EBS volumes are charged by the amount of storage space capacity you provision in GB until released. For EBS snapshot copies there are fees for transferring data across regions; once moved, the rates of the new region apply to the snapshot.

    Amazon Web Services (AWS) image

    As with Standard volumes, volume storage for Provisioned IOPS volumes is charged by the amount you provision in GB per month. With Provisioned IOPS volumes, you are also charged by the amount you provision in IOPS pro-rated as a percentage of days you have it in use for the month.
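A rough sketch of the Provisioned IOPS billing just described (the prices are illustrative placeholders, not actual AWS rates; check the EBS pricing page for current figures):

```python
def piops_volume_monthly_cost(size_gb, provisioned_iops, days_in_use,
                              days_in_month=30,
                              price_per_gb=0.125, price_per_iops=0.10):
    """Estimated monthly charge for a Provisioned IOPS EBS volume.
    price_per_gb and price_per_iops are illustrative placeholders."""
    # Storage is charged on provisioned capacity for the month.
    storage_charge = size_gb * price_per_gb
    # The IOPS charge is pro-rated by the fraction of the month in use.
    iops_charge = (provisioned_iops * price_per_iops
                   * (days_in_use / days_in_month))
    return round(storage_charge + iops_charge, 2)

# 200 GB volume with 1000 provisioned IOPS, in use 15 of 30 days.
print(piops_volume_monthly_cost(200, 1000, 15))  # -> 75.0
```

Note how the IOPS component can dominate the storage component, which is why the planning advice that follows stresses knowing your IOPS needs, not just capacity.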

Thus it is important for cloud storage planning to know not only your space requirements, but also IOPS, bandwidth, and level of availability as well as durability. For Standard volumes, you will likely see a lower number of I/O requests on your bill than is seen by your application unless you sync all of your I/Os to disk. Thus pay attention to what your needs are in terms of availability (accessibility), durability (resiliency or survivability), space capacity, and performance.

Leverage AWS CloudWatch tools and APIs to monitor the metrics that matter for timely insight and situational awareness into how EBS, EC2, S3, Glacier, Storage Gateway and other services are being used (or what they are costing you). Also visit the AWS service health status dashboard to gain insight into how things are running, which helps build confidence with cloud services and solutions.

    Storage I/O industry trends image

    When it comes to Cloud, Virtualization, Data and Storage Networking along with AWS among other services, tools and technologies including object storage, we are just scratching the surface here.

    Hopefully this helps to fill in some gaps giving more information addressing questions, along with generating new ones to prepare for your journey with clouds. After all, don’t be scared of clouds. Be prepared, do your homework, identify your concerns and then address those to gain cloud confidence.

    Additional reading and related items:

  • Cloud conversations: AWS EBS optimized instances
  • Cloud conversations: AWS EBS, Glacier and S3 overview (Part I)
  • Cloud conversations: AWS EBS, Glacier and S3 overview (Part II)
  • Cloud conversations: AWS Government Cloud (GovCloud)
  • Cloud conversations: Gaining cloud confidence from insights into AWS outages
  • AWS (Amazon) storage gateway, first, second and third impressions
  • Cloud conversations: Public, Private, Hybrid what about Community Clouds?
  • Amazon cloud storage options enhanced with Glacier
  • Amazon Web Services (AWS) and the NetFlix Fix?
  • Cloud conversation, Thanks Gartner for saying what has been said
  • Cloud and Virtual Data Storage Networking via Amazon.com
  • Seven Databases in Seven Weeks
  • www.objectstoragecenter.com

Ok, nuff said (for now).

    Cheers
    Gs
