Big Files and Lots of Little Files Processing Benchmarking with Vdbench



server storage data infrastructure i/o File Processing Benchmarking with Vdbench

Updated 2/10/2018

Need to test a server, storage I/O networking, hardware, software, services, cloud, virtual, physical or other environment that is doing some form of file processing, or that you simply want to have some extra workload running in the background for whatever reason? One option is file processing benchmarking with Vdbench.

I/O performance

Getting Started


Here’s a quick and relatively easy way to do it with Vdbench (free from Oracle). Granted, there are other tools, both free and for fee, that can do similar things; however, we will leave those for another day and post. The con to this approach: there is no GUI like what you have available with some other tools. The pro: it is free, flexible, and limited only by your creativity, amount of storage space, server memory and I/O capacity.

If you need a background on Vdbench and benchmarking, check out the series of related posts here (e.g. www.storageio.com/performance).

Get and Install the Vdbench Bits and Bytes


If you do not already have Vdbench installed, get a copy from the Oracle or SourceForge site (the latter now points to Oracle here).

Vdbench is free; you simply sign up, accept the free license, and select the version to download (it is a single, common distribution for all operating systems) along with the documentation.

Installation, particularly on Windows, is really easy: basically follow the instructions in the documentation by copying the contents of the download folder to a specified directory, set up any environment variables, and make sure that you have Java installed.

Here is a hint and tip for Windows servers: if you get an error message about counters, open a command prompt with Administrator rights and type the command:

$ lodctr /r


The above command will reset your I/O (performance) counters. Note, however, that the command will also overwrite any customized or enabled counters, so only use it if you have to.

Likewise, the *nix install is also easy: copy the files, make sure to copy the applicable *nix shell script (they are in the download folder), and verify Java is installed and working.
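
As a rough sketch, here is what that might look like on a *nix system (the paths below are examples only, adjust for where you want Vdbench to live):

$ unzip vdbench*.zip -d /opt/vdbench    # extract the downloaded distribution (example path)
$ cd /opt/vdbench
$ chmod +x vdbench                      # make the vdbench shell script executable if needed
$ java -version                         # verify Java is installed and working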

You can do a vdbench -t (Windows) or ./vdbench -t (*nix) to verify that it is working.

Vdbench File Processing

There are many options with Vdbench as it has a very robust command and scripting language, including the ability to set up for loops among other things. We are only going to touch the surface here using its file processing capabilities. Likewise, Vdbench can run from a single server accessing multiple storage systems or file systems, as well as from multiple servers to a single file system. For simplicity, we will stick with the basics in the following examples to exercise a local file system. The number of files and the file sizes are limited only by server memory and storage space.

You can specify the number and depth of directories to put files into for processing. One of the parameters is the anchor point for the file processing; in the following examples S:\SIOTEMP\FS1 is used as the anchor point. Other parameters include the I/O size, percent reads, number of threads, run time and sample interval, as well as the output folder name for the result files. Note that unlike some tools, Vdbench does not create a single file of results, rather a folder with several files including summary, totals, parameters, histograms and CSV among others.


Simple Vdbench File Processing Commands

For flexibility and ease of use I put the following three Vdbench commands into a simple text file that is then called with parameters on the command line.
fsd=fsd1,anchor=!fanchor,depth=!dirdep,width=!dirwid,files=!numfiles,size=!filesize

fwd=fwd1,fsd=fsd1,rdpct=!filrdpct,xfersize=!fxfersize,fileselect=random,fileio=random,threads=!thrds

rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=!etime,interval=!itime

Simple Vdbench script

# SIO_vdbench_filesystest.txt
#
# Example Vdbench script for file processing
#
# fanchor = file system place where directories and files will be created
# dirwid = how wide should the directories be (e.g. how many directories wide)
# numfiles = how many files per directory
# filesize = size in k, m, g e.g. 16k = 16KBytes
# fxfersize = file I/O transfer size in kbytes
# thrds = how many threads or workers
# etime = how long to run in minutes (m) or hours (h)
# itime = interval sample time e.g. 30 seconds
# dirdep = how deep the directory tree
# filrdpct = percent of reads e.g. 90 = 90 percent reads
# -p processnumber = optionally specify a process number, only needed if running multiple Vdbench instances at the same time, number should be unique
# -o output folder name that describes what is being done and some config info
#
# Sample command line shown for Windows, for *nix add ./
#
# The real Vdbench script with command line substitution parameters indicated by a leading !
#

fsd=fsd1,anchor=!fanchor,depth=!dirdep,width=!dirwid,files=!numfiles,size=!filesize

fwd=fwd1,fsd=fsd1,rdpct=!filrdpct,xfersize=!fxfersize,fileselect=random,fileio=random,threads=!thrds

rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=!etime,interval=!itime

Big Files Processing Script


With the above script file defined, for Big Files I specify a command line such as the following.
$ vdbench -f SIO_vdbench_filesystest.txt fanchor=S:\SIOTemp\FS1 dirwid=1 numfiles=60 filesize=5G fxfersize=128k thrds=64 etime=10h itime=30 numdir=1 dirdep=1 filrdpct=90 -p 5576 -o SIOWS2012R220_NOFUZE_5Gx60_BigFiles_64TH_STX1200_020116
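
For reference, a comparable *nix command line might look like the following. This is a sketch only; the anchor path and output folder name are placeholder values to substitute for your environment (and the optional -p process number is omitted).

$ ./vdbench -f SIO_vdbench_filesystest.txt fanchor=/siotemp/fs1 dirwid=1 numfiles=60 filesize=5G fxfersize=128k thrds=64 etime=10h itime=30 dirdep=1 filrdpct=90 -o Linux_5Gx60_BigFiles_64TH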

Big Files Processing Example Results


The following is one of the result files from the folder of results created via the above command for Big File processing showing totals.


Run totals

21:09:36.001 Starting RD=format_for_rd1

Feb 01, 2016 .Interval. .ReqstdOps.. ...cpu%... read ....read.... ...write.... ..mb/sec... mb/sec .xfer.. ...mkdir... ...rmdir... ..create... ...open.... ...close... ..delete...
rate resp total sys pct rate resp rate resp read write total size rate resp rate resp rate resp rate resp rate resp rate resp
21:23:34.101 avg_2-28 2848.2 2.70 8.8 8.32 0.0 0.0 0.00 2848.2 2.70 0.00 356.0 356.02 131071 0.0 0.00 0.0 0.00 0.1 109176 0.1 0.55 0.1 2006 0.0 0.00

21:23:35.009 Starting RD=rd1; elapsed=36000; fwdrate=max. For loops: None

07:23:35.000 avg_2-1200 4939.5 1.62 18.5 17.3 90.0 4445.8 1.79 493.7 0.07 555.7 61.72 617.44 131071 0.0 0.00 0.0 0.00 0.0 0.00 0.1 0.03 0.1 2.95 0.0 0.00


Lots of Little Files Processing Script


For lots of little files, the following is used.


$ vdbench -f SIO_vdbench_filesystest.txt fanchor=S:\SIOTEMP\FS1 dirwid=64 numfiles=25600 filesize=16k fxfersize=1k thrds=64 etime=10h itime=30 dirdep=1 filrdpct=90 -p 5576 -o SIOWS2012R220_NOFUZE_SmallFiles_64TH_STX1200_020116

Lots of Little Files Processing Example Results


The following is one of the result files from the folder of results created via the above command for the lots of little files processing, showing totals.
Run totals

09:17:38.001 Starting RD=format_for_rd1

Feb 02, 2016 .Interval. .ReqstdOps.. ...cpu%... read ....read.... ...write.... ..mb/sec... mb/sec .xfer.. ...mkdir... ...rmdir... ..create... ...open.... ...close... ..delete...
rate resp total sys pct rate resp rate resp read write total size rate resp rate resp rate resp rate resp rate resp rate resp
09:19:48.016 avg_2-5 10138 0.14 75.7 64.6 0.0 0.0 0.00 10138 0.14 0.00 158.4 158.42 16384 0.0 0.00 0.0 0.00 10138 0.65 10138 0.43 10138 0.05 0.0 0.00

09:19:49.000 Starting RD=rd1; elapsed=36000; fwdrate=max. For loops: None

19:19:49.001 avg_2-1200 113049 0.41 67.0 55.0 90.0 101747 0.19 11302 2.42 99.36 11.04 110.40 1023 0.0 0.00 0.0 0.00 0.0 0.00 7065 0.85 7065 1.60 0.0 0.00


Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What This All Means

The above examples can easily be modified to do different things, particularly if you read the Vdbench documentation on how to set up multi-host, multi-storage system and multiple job streams to do different types of processing. This means you can benchmark a storage system, server, or converged and hyper-converged platform, or simply put a workload on it as part of other testing. There are even options for handling data footprint reduction such as compression and dedupe.
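
As a simple illustration, the following sketch extends the earlier script to exercise two anchors (for example two different file systems or storage systems) with a single workload. The anchor paths and values are placeholders only; see the Vdbench documentation for the exact multi-host (hd=) and multi-workload syntax.

# Hypothetical example, values are placeholders
fsd=fsd1,anchor=S:\SIOTEMP\FS1,depth=1,width=1,files=60,size=5G
fsd=fsd2,anchor=T:\SIOTEMP\FS2,depth=1,width=1,files=60,size=5G
fwd=fwd1,fsd=fsd*,rdpct=90,xfersize=128k,fileselect=random,fileio=random,threads=64
rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30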

Ok, nuff said, for now.

Gs

Greg Schulz - Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Some August 2015 Amazon Web Services (AWS) and Microsoft Azure Cloud Updates

Storage I/O trends


Cloud Services Providers continue to extend their feature, function and capabilities and the following are two examples. Being a customer of both Amazon Web Services (AWS) as well as Microsoft Azure (among others), I receive monthly news updates about service improvements along with new features. Here are a couple of examples involving recent updates from AWS and Azure.

Azure enhancements

Microsoft Azure customer update

Azure Premium Storage generally available in Japan East

Solid State Device (SSD) based Azure Premium Storage is now available in the Japan East region. Add up to 32 TB and more than 64,000 IOPS (read operations) per virtual machine with Azure Premium Storage. Learn more about Azure storage and pricing here.

Azure Data Factory generally available

Data Factory is a cloud-based data integration service for automated management as well as movement and transformation of data; learn more and view pricing options here.

AWS Partner Updates

A recent Amazon Web Services (AWS) customer update included the following pertaining to partner storage solutions.

AWS partner updates

AWS Partner Network APN

Learn more about AWS Partner Network (APN) here or click on the above image.

AWS APN competency programs include:

  • Storage
  • Healthcare
  • Life Sciences
  • SAP Solutions
  • Microsoft Solutions
  • Oracle Solutions
  • Marketing and Commerce
  • Big Data
  • Security
  • Digital Media

AWS Partner Network (APN) Solutions for Storage include:

Archiving to AWS Glacier

  • Commvault
  • NetApp (AltaVault)

Backup to AWS using S3

  • CloudBerry Lab
  • Commvault
  • Ctera
  • Druva
  • NetApp (AltaVault)

Primary Cloud File and NAS storage complementing on-premises (e.g. your local) storage

  • Avere
  • Ctera
  • NetApp (Cloud OnTap)
  • Panzura
  • SoftNAS
  • Zadara

Secure File Transfer

  • Aspera
  • Signiant

Note that the above are those listed on the AWS Storage Partner page as of this being published and are subject to change. Likewise, other solutions that are not part of the AWS partner program may not be listed.

    Where to read, watch and learn more

    Storage I/O trends

    What this all means and wrap up

    Cloud Service Providers (CSP) continue to enhance their capabilities as well as their footprints as part of growth. In addition to technology, tools and the number of regions, sites and data centers, the CSPs are also expanding their partner networks, both in how many partners they have and in the scope of those partnerships. Some of these partnerships treat the cloud as a destination, others enable hybrid scenarios where public clouds become an extension complementing traditional IT. Everything is not the same in most environments, and one type of cloud approach does not have to suit or fit all needs, hence the value of hybrid cloud deployment and usage.

    Ok, nuff said, for now…

    Cheers
    Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    May and June 2015 Server StorageIO Update Newsletter

    Volume 15, Issue V & VI

    Hello and welcome to this joint May and June 2015 Server StorageIO update newsletter. Here in the northern hemisphere it's summer, which means holiday vacations among other things.

    There has been a lot going on this spring and so far this summer with more in the wings. Summer can also be a time to get caught up on some things, preparing for others while hopefully being able to enjoy some time off as well.

    In terms of what I have been working on (or with)? Clouds (OpenStack, vCloud Air, AWS, Azure, GCS among others), virtual and containers, flash SSD devices (drives, cards), software defining, content servers, NVMe, databases, data protection items, servers, cache and micro-tiering among other things.

    Speaking of getting caught up, back in early May among many other conferences (Cisco, Docker, HP, IBM, OpenStack, Red Hat and many other events) was EMCworld. EMC covered my hotel and registration costs to attend the event in Las Vegas (thanks EMC, that’s a disclosure btw ;). View a summary StorageIOblog post covering EMCworld 2015 here along with recent EMC announcements including Acquisition of cloud services vendor Virtustream for $1.2B, and ECS 2.0.

    Server and Storage I/O Wrappings

    This month's newsletter has a focus on software and storage wrappings, that is, how your storage or software is packaged, delivered or deployed. For example, traditional physical storage systems, software defined storage as shrink-wrap or download, tin-wrapped software as an appliance, virtual wrapped such as a virtual storage appliance, or cloud wrapped among others.

    OpenStack software defined cloud

    OpenStack (the organization, community, event and software) continues to gain momentum. The latest release, known as Kilo (more Kilo info here), came out in early April followed by the OpenStack summit in May.

    Some of you might be more involved with OpenStack vs. others, perhaps having already deployed into your production environment. Perhaps you, like myself have OpenStack running in a lab for proof of concept, research, development or learning among other things.

    You might even be using the services of a public cloud or managed service provider that is powered by OpenStack. On the other hand, you might be familiar with OpenStack from reading up on it, watching videos, listening to podcasts or attending events to figure out what it is, where it fits, as well as what your organization can use it for.

    Drew Robb (@Robbdrew) has a good overview piece about OpenStack and storage over at Enterprise Storage Forum (here). OpenStack is a collection of tools or bundles for building private, hybrid and public clouds. These various open source projects within the OpenStack umbrella include compute (Nova) and virtual machine images (Glance). Other components include dashboard management (Horizon), security and identity control (Keystone), network (Neutron), object storage (Swift), block storage (Cinder) and file-based storage (Manila) among others.

    It’s up to the user to decide which pieces you will add. For example, you can use Swift without having virtual machines and vice versa. Read Drew’s complete article here.

    Btw, if you missed it, not only has OpenStack added file support (e.g. Manila), Amazon Web Services (AWS) also recently added the Elastic File System (EFS) complementing their Elastic Block Store (EBS).

    Focus on Storage Wrappings

    Software exists and gets deployed in various places as shown in the following examples.

    software wrapped storage

    • Cloud wrapped software – software that can be deployed in a cloud instance.
    • Container wrapped software – software deployed in a docker or other container
    • Firmware wrapped software – software that gets packaged and deployed as firmware in a server, storage, network device or adapter
    • Shrink wrapped software – software that can be downloaded and deployed where you want
    • Tin wrapped software – software that is packaged or bundled with hardware (e.g. tin) such as an appliance or storage system
    • Virtual wrapped software – software deployed as a virtual storage appliance (VSA) or other virtual machine

    server storage software wrapping

    StorageIOblog posts

    Data Protection Diaries

    Modernizing Data Protection
    Using new and old things in new ways

    This is part of an ongoing series of posts that are part of www.storageioblog.com/data-protection-diaries-main/ on data protection including archiving, backup/restore, business continuance (BC), business resiliency (BR), data footprint reduction (DFR), disaster recovery (DR), High Availability (HA) along with related themes, tools, technologies, techniques, trends and strategies.
    world backup day (and test your restore) image licensed from Shutterstock by StorageIO

    Data protection is a broad topic that spans from logical and physical security to HA, BC, BR, DR and archiving (including life beyond compliance), along with various tools, technologies and techniques. Key is aligning those to the needs of the business or organization for today's as well as tomorrow's requirements. Instead of doing things the way they have been done in the past, which may have been based on what was known or possible due to technology capabilities, why not start using new and old things in new ways.

    Let’s start using all the tools in the data protection toolbox regardless of if they are new or old, cloud, virtual, physical, software defined product or service in new ways while keeping the requirements of the business in focus. Read more from this post here.

    In case you missed it:

    View other recent as well as past blog posts here

    In This Issue


  • Industry Trends Perspectives News
  • Commentary in the news
  • Tips and Articles
  • StorageIOblog posts
  • Events and Webinars
  • Recommended Reading List
  • Server StorageIO Lab reports
  • Resources and Links
  • Industry News and Activity

    Recent Industry news and activity

    AWS adds new M4 virtual machine instances
    Cisco provides FCoE proof of life

    Google new cloud storage pricing
    HP announces new data center services
    HDS announces new products & services
    IBM enhances storage portfolio

    IBTA announces RoCE initiative
    InfiniteIO announces network/cloud cache
    Intel buying FPGA specialist Altera
    NetApp – Changes CEO

    View other recent and upcoming events here

    StorageIO Commentary in the news

    StorageIO news (image licensed for use from Shutterstock by StorageIO)
    Recent Server StorageIO commentary and industry trends perspectives about news, activities and announcements.

    BizTechMagazine: Comments on how to simplify your data center with virtualization
    EnterpriseStorageForum: Comments on Open Stack and Clouds
    EnterpriseStorageForum: Comments on Top Ten Software Defined Storage Tips, Gotchas and Cautions
    EdTech: Comments on Harness Power with New Processors

    Processor: Comments on Protecting Your Servers & Networking equipment
    EdTech: Comments on Harness Power with New Processors

    Processor: Comments on Improve Remote Server Management including KVM
    CyberTrend: Comments on Software Defined Data Center and virtualization
    BizTechMagazine: Businesses Prepare as End-of-Life for Windows Server 2003 Nears
    InformationWeek: Top 10 sessions from Interop Las Vegas 2015
    CyberTrend: Comments on Software Defined Data Center and Virtualization

    View more trends comments here

    Vendors you may not have heard of

    This is a new section starting in this issue where various new or existing vendors as well as service providers you may not have heard about will be listed.

    CloudHQ – Cloud management tools
    EMCcode Rex-Ray – Container management
    Enmotus FUZE – Flash leveraged micro tiering
    Rubrik – Data protection management
    Sureline – Data protection management
    Virtunet systems – VMware flash cache software
    InfiniteIO – Cloud and NAS cache appliance
    Servers Direct – Server and storage platforms

    Check out more vendors you may know, have heard of, or that are perhaps new to you on the Server StorageIO Industry Links page here. There are over 1,000 vendor entries (and growing) on the links page.

    StorageIO Tips and Articles

    So you have a new storage device or system. How will you test or find its performance? Check out this quick-read tip on storage benchmark and testing fundamentals over at BizTech.

    Check out these resources and links on server storage I/O performance and benchmarking tools. View more tips and articles here

    Webinars

    BrightTalk Webinar – June 23 2015 9AM PT
    Server Storage I/O Innovation v2.015: Protect Preserve & Serve Your Information

    Videos and Podcasts

    VMware vCloud Air Server StorageIO Lab Test Drive Ride along videos.

    Server StorageIO Lab vCloud test drive video part 1
    Server StorageIO Lab vCloud test drive video part 2
    VMware vCloud Air test drive videos Part I & II

    StorageIO podcasts are also available at StorageIO.tv

    Various Industry Events

     

    VMworld August 30-September 3 2015

    Flash Memory Summit August 11-13

    Interop – April 29 2015 Las Vegas (Voted one of top ten sessions at Interop, more here)
    Smart Shopping for Your Storage Strategy

    View other recent and upcoming events here

    Webinars

    BrightTalk Webinar – June 23 2015 9AM PT
    Server Storage I/O Innovation v2.015: Protect Preserve & Serve Your Information

    From StorageIO Labs

    Research, Reviews and Reports

    VMware vCloud Air Test Drive
    VMware vCloud Air
    Read more here.

    VMware vCloud Air

    VMware vCloud Air provides a platform similar to those just mentioned among others for your applications and their underlying resource needs (compute, memory, storage, networking) to be fulfilled. In addition, it should not be a surprise that VMware vCloud Air shares many common themes, philosophies and user experiences with the traditional on-premises based VMware solutions you might be familiar with.

    View other StorageIO lab review reports here

    Resources and Links

    Check out these useful links and pages:
    storageio.com/links
    objectstoragecenter.com
    storageioblog.com/data-protection-diaries-main/

    storageperformance.us
    thessdplace.com
    storageio.com/raid
    storageio.com/ssd

    Enjoy this edition of the Server StorageIO update newsletter and watch for new tips, articles, StorageIO lab report reviews, blog posts, videos and podcasts along with in the news commentary appearing soon.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    EMCworld 2015 How Do You Want Your Storage Wrapped?

    Server Storage I/O trends


    Back in early May I was invited by EMC to attend EMCworld 2015 which included both the public sessions, as well as several NDA based discussions. Keep in mind that there is the known, there is the unknown (or assumed or speculated) and in between there are NDA’s, nuff said on that. EMC covered my hotel and registration costs to attend the event in Las Vegas (thanks EMC, that’s a disclosure btw ;) and here is a synopsis of various EMCworld 2015 announcements.

    What EMC announced

    • VMAX3 enhancements to the EMC enterprise flagship storage platform to keep it relevant for traditional legacy workloads as well as for converged, scale-out, cloud, virtual and software defined environments.
    • VNX 3200 entry-level All Flash Array (AFA) flash SSD system starting at $25,000 USD for a 3TB unified platform with full data services found in other VNX products.
    • vVNX aka Virtual VNX aka "project liberty", which is a community (e.g. free) software version of the VNX. vVNX is a Virtual Storage Appliance (VSA) that you download and run on a VMware platform. Learn more and download here. Note the install will do a CPU type check, so forget about trying to run it on an Intel NUC or similar; I tried just because I could, and the install will protect you from doing such things.
    • Various data protection related items including new Datadomain platforms as well as software updates and integration with other EMC platforms (storage systems).
    • All Flash Array (AFA) XtremIO 4.0 enhancements including larger clusters, larger nodes to boost performance, capacity and availability, along with copy service updates among others improvements.
    • Preview of DSSD shared (inside a rack) external flash Solid State Device (SSD) including more details. While much of DSSD is still under NDA, EMC did provide more public details at EMCworld. Between what was displayed and announced publicly at EMCworld, as well as what can be found via Google (or other searches), you can piece together more of the DSSD story. What is known publicly today is that DSSD leverages the new Non-Volatile Memory express (NVMe) access protocol built upon underlying PCIe technology. More on DSSD in future discussions; if you have not done so, get an NDA deep dive briefing on it from EMC.
    • ScaleIO is now available via a free download here, including both Windows and Linux clients as well as instructions for those operating systems and VMware.
    • ViPR can also be downloaded for free here (it has been previously available), and it has also been placed into open source by EMC.

    What EMC announced since EMCworld 2015

    • Acquisition of cloud services (and software tools) vendor Virtustream for $1.2B adding to the federation cloud services portfolio (companion to VMware vCloud Air).
    • Release of ECS 2.0 including a free download here. This new version of ECS (Elastic Cloud Storage) can be used independent of the ViPR controller, or in conjunction with ViPR. In addition, ECS now has about 80% of the functionality of the Centera object storage platform. The remaining 20% of functionality (mainly regulatory compliance governance) of Centera will be added to ECS in the future, providing a migration path for Centera customers. In case you are wondering what EMC does with Centera, Atmos, ViPR and now ECS, the answer is that ECS can work with or without ViPR, and the functionality of Centera and Atmos is being rolled into ECS. ECS, as a refresher, is software that transforms general purpose industry standard servers with direct storage into a scale-out HDFS and object storage solution.
    • Check out EMCcode including S3motion, which I use and have reviewed here. Also check out EMCcode Rex-Ray, which, if you are into Docker containers, should be of interest; I know I'm interested in it.

    Server Storage I/O trends

    What this all means and wrap-up

    There were no single major explosive announcements; however, the sum of all the announcements together should not be overshadowed by the made-for-TV (or web) big tent productions and entertainment. What EMC announced was effectively: how would you like, how do you want and need your storage and associated data services, along with management, wrapped?

    tin wrapped software

    By being wrapped, do you want your software defined storage management and storage wrapped in a legacy turnkey solution such as VMAX3, VNX or Isilon? Do you want or need it to be hybrid or all flash, converged and unified, block, file or object?

    software wrapped storage

    Or do you need or want the software defined storage management and storage to be "shrink wrapped" as a download so you can deploy on your own hardware "tin wrapped" or as a VSA "virtual wrapped" or cloud wrapped? Do you need or want the software defined storage management and storage to leverage anybody’s hardware while being open source?

    server storage software wrapping

    How do you need or want your storage to be wrapped to fit your specific needs? That, IMHO, was the essence of what EMC announced at EMCworld 2015; granted, the motorcycles and other production entertainment were engaging as well as educational.

    Ok, nuff said for now

    Cheers
    Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Data Protection Gumbo = Protect Preserve and Serve Information

    Storage I/O trends


    Recently I was invited to be a guest on the podcast Data Protection Gumbo hosted by Demetrius Malbrough (@dmalbrough).

    Data Protection Gumbo Podcast Description
    Data Protection Gumbo is set up with the aim of expanding the awareness of anyone responsible for protecting mission critical data, by providing them with a mix of the latest news, data protection technologies, and interesting facts on topics in the Data Backup and Recovery industry.


    Protect Preserve and Serve Applications, Information and Data

    Keep in mind that a fundamental role of Information Technology (IT) is to protect, preserve and serve business or organization information assets including applications, configuration settings and data for use when or where needed.

    Our conversation covers various aspects of data protection with a focus on protecting, preserving and serving information, applications and data across different environments and customer segments. While we discuss enterprise and small medium business (SMB) data protection, we also talk about trends from mobile to the cloud among many other tools, technologies and techniques.

    Where to learn more

    Learn more about data protection and related trends, tools and technologies via the following links:


    What this all means and wrap-up

    Data protection is a broad topic that spans from logical and physical security to high availability (HA), disaster recovery (DR), business continuance (BC), business resiliency (BR) and archiving (including life beyond compliance), along with various tools, technologies and techniques. Keeping with the theme of protect, preserve and serve, data protection, to be modernized, needs to become and be seen as a business asset or enabler vs. an afterthought or cost overhead topic. Also, keep in mind that only you can prevent data loss; are your restores ready for when you need them?

    Check out Demetrius' Data Protection Gumbo podcast, and also check out his LinkedIn Backup & Recovery Professionals group. Speaking of data protection, check out the www.storageioblog.com/data-protection-diaries-main/ page for more coverage of backup/restore, HA, BC, DR, archiving and related themes.

    Ok, nuff said, for now..

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    March 2015 Server StorageIO Update Newsletter

     

     

    Volume 15, Issue III

    Hello and welcome to this March 2015 Server and StorageIO update newsletter. Here in the northern hemisphere, at least by the calendar, spring is here; weather-wise, winter continues to linger in some areas. March in the US also means college and university sports tournaments, with many focused on their NCAA men's basketball championship brackets.

    Besides various college championships, March also has a connection to backup and data protection. Thus this month's newsletter has a focus on data protection; after all, March 31 is World Backup Day, which means it should also be World Restore Test Day!

    Focus on Data Protection

    Data protection including backup/restore, business continuance (BC), disaster recovery (DR), business resiliency (BR) and archiving across physical, virtual and cloud environments.

    Data Protection Fundamentals

    A reminder on the importance of data protection including backup, BC, DR and related technologies is to make sure they are occurring as planned. Also test your copies and remember the 4 3 2 1 rule or guide.

    4 – Versions (different time intervals)
    3 – Copies of critical data (including versions)
    2 – Different media, devices or systems
    1 – Off-site (cloud or elsewhere)

    The above means having at least four (4) different versions from various points in time of your data. Having three (3) copies including various versions protects against one or more copies being corrupt or damaged. Placing those versions and copies on at least two (2) different storage systems, devices or media protects you if something happens to one of them.

    While it might be common sense, a bad April Fools recovery joke would be finding out all of your copies were on the same device which is damaged. That might seem obvious; however, sometimes the obvious needs to be stated. Also make sure that at least one (1) of your copies is off-site either on off-line media (tape, disk, ssd, optical) or cloud.

    Take a few moments to verify that your data protection strategy is being implemented and practiced as intended. Also test what is being copied, including not only restoring the data from cloud, disk, SSD or tape, but also making sure you can actually read or use the data being protected. This means making sure that your security credentials including access certificates and decryption work as expected.
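
    As a simple sketch of one way to spot check a restore on a *nix system, compare a checksum of the restored copy against the original (the paths below are hypothetical placeholders; on Windows, certutil -hashfile can do something similar):

    $ sha256sum /data/projects/report.docx              # original file (example path)
    $ sha256sum /restore-test/projects/report.docx      # same file restored to an alternate location
    # matching checksums confirm the restored copy is readable and identical to the original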

    Watch for more news, updates, industry trends perspectives, commentary, tips, articles and other information at Storageio.com, StorageIOblog.com, various partner venues as well as in future newsletters.

    StorageIOblog posts

    Data Protection Diaries
    Are restores ready for World Backup Day?
    In case you forgot or did not know, World Backup Day is March 31 2015 (@worldbackupday), so now is a good time to be ready. The only challenge that I have with World Backup Day (view their site here), which has gone on for a few years now, is that it is a good way to call out the importance of backing up or protecting data.
    world backup day test your restore

    However it’s also time to put more emphasis and focus on being able to make sure those backups or protection copies actually work.

    By this I mean doing more than making sure that your data can be read from tape, disk, SSD or cloud service; actually going a step further and verifying that restored data can actually be used (read, written, etc).

    The problem, issue and challenges are simple: are your applications, systems and data protected, and can you use those protection copies (e.g. backups, snapshots, replicas or archives) when as well as where needed? Read more here about World Backup Day and what I'm doing, as well as various tips to be ready for successful recovery and avoid being an April 1st fool ;).

    Cloud Conversations
    AWS S3 Cross Region Replication
    Amazon Web Services (AWS) announced several enhancements including a new Simple Storage Service (S3) cross-region replication of objects from a bucket (e.g. container) in one region to a bucket in another region.

    AWS also recently enhanced Elastic Block Store (EBS), increasing the maximum performance and size of Provisioned IOPS (SSD) and General Purpose (SSD) volumes. EBS enhancements included the ability to store up to 16 TBytes of data in a single volume and do 20,000 input/output operations per second (IOPS). Read more about EBS and other AWS server, storage I/O enhancements here.
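
    As an example, a Provisioned IOPS volume at those new limits could be created with the AWS CLI along the lines of the following sketch (the availability zone, size and IOPS values are placeholders; check the AWS documentation for current limits and syntax):

    $ aws ec2 create-volume --availability-zone us-east-1a --volume-type io1 --size 16384 --iops 20000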
    AWS regions and availability zones (AZ)
    Example of some AWS Regions and AZs

    AWS S3 buckets and objects are stored in a specific region designated by the customer or user (AWS S3, EBS, EC2, Glacier, Regions and Availability Zone primer can be found here). The challenge being addressed by AWS with S3 replication is being able to move data (e.g. objects) stored in AWS buckets in one region to another in a safe, secure, timely, automated, cost-effective way.
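
    As a rough sketch, enabling cross-region replication with the AWS CLI might look like the following (the bucket names, region and the IAM role referenced inside replication.json are hypothetical placeholders; versioning must be enabled on both buckets and the exact configuration format is described in the AWS documentation):

    $ aws s3api put-bucket-versioning --bucket source-bucket --versioning-configuration Status=Enabled
    $ aws s3api put-bucket-versioning --bucket dest-bucket --versioning-configuration Status=Enabled
    $ aws s3api put-bucket-replication --bucket source-bucket --replication-configuration file://replication.json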

    Continue reading more here about AWS S3 bucket and object replication feature along with related material.

    Additional March StorageIOblog posts include:

    Server Storage I/O performance (Image licensed from Shutterstock by StorageIO)

     

     

    View other recent as well as past blog posts here

    In This Issue

    • Industry Trends Perspectives News
    • Commentary in the news
    • Tips and Articles
    • StorageIOblog posts
    • Events and Webinars
    • Recommended Reading List
    • Server StorageIO Lab reports
    • Resources and Links

     

    Industry News and Activity

    Recent Industry news and activity

    EMC sets up cloudfoundry Dojo
    AWS S3, EBS IOPs and other updates
    New backup/data protection vendor Rubrik
    Google adds nearline Cloud Storage
    AWS and Microsoft Cloud Price battle

    View other recent and upcoming events here

    StorageIO Commentary in the news

    StorageIO news (image licensed for use from Shutterstock by StorageIO)
    Recent Server StorageIO commentary and industry trends perspectives about news, activities and announcements.

    Processor: Enterprise Backup Solution Tips
    Processor: Failed & Old Drives
    EnterpriseStorageForum: Disk Buying Guide
    ChannelProNetwork: 2015 Tech and SSD
    Processor: Detect & Avoid Drive Failures

    View more trends comments here

    StorageIO Tips and Articles

    So you have a new storage device or system. How will you test or find its performance? Check out this quick-read tip on storage benchmark and testing fundamentals over at BizTech.

    Keeping with this months theme of data protection including backup/restore, BC, DR, BR and archiving, here are some more tips. These tips span server storage I/O networking hardware, software, cloud, virtual, performance, data protection applications and related themes including:

    • Test your data restores: can you read and actually use the data? Is your data decrypted, and are the proper security certificates applied?
    • Remember to back up or protect your security encryption keys, certificates and application settings!
    • Revisit what format your data is being saved in, including how you will be able to use data saved to the cloud. Will you be able to do a restore to a cloud server, or do you need to make sure a copy of your backup tools is on your cloud server instances?

    Check out these resources and links on server storage I/O performance and benchmarking tools. View more tips and articles here

    Various Industry Events

    EMCworld – May 4-6 2015

    Interop – April 29 2015 (Las Vegas)

    Presenting Smart Shopping for Your Storage Strategy

    NAB – April 14-15 2015

    SNIA DSI Event – April 7-9

    View other recent and upcoming events here

    Webinars

    December 11, 2014 – BrightTalk
    Server & Storage I/O Performance

    December 10, 2014 – BrightTalk
    Server & Storage I/O Decision Making

    December 9, 2014 – BrightTalk
    Virtual Server and Storage Decision Making

    December 3, 2014 – BrightTalk
    Data Protection Modernization

    Videos and Podcasts

    StorageIO podcasts are also available at StorageIO.tv

    From StorageIO Labs

    Research, Reviews and Reports

    Datadynamics StorageX
    Datadynamics StorageX

    More than a data mover migration tool, StorageX is a tool for adding management and automation around unstructured local and distributed NAS (NFS, CIFS, DFS) file data. Read more here.

    View other StorageIO lab review reports here

    Recommended Reading List

    This is a new section being introduced in this edition of the Server StorageIO update mentioning various books, websites, blogs, articles, tips, tools, videos, podcasts along with other things I have found interesting and want to share with you.

    • Introducing s3motion (via EMCcode e.g. opensource) a tool for copying buckets and objects between public, private and hybrid clouds (e.g. AWS S3, GCS, Microsoft Azure and others) as well as object storage systems. This is a great tool which I have added to my server storage I/O cloud, virtual and physical toolbox. If you are not familiar with EMCcode check it out to learn more…
    • Running Hadoop on Ubuntu Linux (Series of tutorials) for those who want to get their hands dirty vs. using one of the All In One (AIO) appliances.
    • Yellow-bricks (Good blog focused on virtualization, VMware and other related themes) by Duncan Epping @duncanyb

    Resources and Links

    Check out these useful links and pages:
    storageio.com/links
    objectstoragecenter.com
    storageioblog.com/data-protection-diaries-main/

    storageperformance.us
    thessdplace.com
    storageio.com/raid
    storageio.com/ssd

    Enjoy this edition of the Server and StorageIO update newsletter and watch for new tips, articles, StorageIO lab report reviews, blog posts, videos and podcasts along with in the news commentary appearing soon.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Data Protection Diaries: Are your restores ready for World Backup Day 2015?


    This is part of an ongoing data protection diaries series of posts about, well, cloud and data protection, and what I'm doing pertaining to World Backup Day 2015 along with related topics.

    In case you forgot or did not know, World Backup Day is March 31 2015 (@worldbackupday), so now is a good time to be ready. The only challenge that I have with World Backup Day (view their site here), which has gone on for a few years now, is that it is a good way to call out the importance of backing up or protecting data. However, it's time to also put more emphasis and focus on being able to make sure those backups or protection copies actually work.

    By this I mean doing more than making sure that your data can be read from tape, disk, SSD or cloud service; actually going a step further and verifying that restored data can actually be used (read, written, etc).

    The Problem, Issue, Challenge, Opportunity and Need

    The problem, issue and challenges are simple: are your applications, systems and data protected, and can you use those protection copies (e.g. backups, snapshots, replicas or archives) when as well as where needed?

    storage I/O data protection

    The opportunity is simple, avoiding downtime or impact to your business or organization by being proactive.

    Understanding the challenge and designing a strategy

    The following is my preparation checklist for World Backup Day 2015 (e.g. March 31 2015), which includes what I need or want to protect, as well as some other things to be done including testing, verification and addressing (remediating or fixing) known issues while identifying other areas for future enhancements. Thus, perhaps like yours, data protection for my environment, which spans physical, virtual and cloud from servers to mobile devices, is constantly evolving.

    My data protection preparation, checklist and to do list

    Finding a solution

    While I already have a strategy, plan and solution that encompasses different tools, technologies and techniques, they are also evolving. Part of the evolving is to improve while also exploring options to use new and old things in new ways, as well as to eat my own dog food or walk the talk vs. talk the talk. The following figure provides a representation of my environment that spans physical, virtual and clouds (more than one) and how different applications along with systems are protected against various threats or risks. Key is that not all applications and data are the same, thus enabling them to be protected in different ways as well as over various intervals. Needless to say, there is more to how, when, where and with what different applications and systems are protected in my environment than shown; perhaps more on that in the future.

    server storageio and unlimitedio data protection
    Some of what my data protection involves for Server StorageIO

    Taking action

    What I’m doing is going through my checklist to verify and confirm the various items on the checklist as well as find areas for improvement which is actually an ongoing process.

    Do I find things that need to be corrected?

    Yup, in fact I found something that, while it was not a problem, identified a way to improve on a process that will, once fully implemented, enable more flexibility both if a restoration is needed as well as for general everyday use, not to mention remove some complexity and cost.

    Speaking of lessons learned, check this out that ties into why you want 4 3 2 1 based data protection strategies.

    Storage I/O trends

    Where to learn more

    Here are some extra links to have a look at:

    Data Protection Diaries
    Cloud conversations: If focused on cost you might miss other cloud storage benefits
    5 Tips for Factoring Software into Disaster Recovery Plans
    Remote office backup, archiving and disaster recovery for networking pros
    Cloud conversations: Gaining cloud confidence from insights into AWS outages (Part II)
    Given outages, are you concerned with the security of the cloud?
    Data Archiving: Life Beyond Compliance
    My copies were corrupted: The 3-2-1 rule
    Take a 4-3-2-1 approach to backing up data
    Cloud and Virtual Data Storage Networks – Chapter 8 (CRC/Taylor and Francis)

    What this all means and wrap-up

    Be prepared and be proactive when it comes to data protection and business resiliency vs. simply reacting and recovering, hoping that all will be ok (or works).

    Take a few minutes (or longer) and test your data protection including backup to make sure that you can:

    a) Verify that in fact they are working protecting applications and data in the way expected

    b) Restore data to an alternate place (verify functionality as well as prevent a problem)

    c) Actually use the data, meaning it is decrypted, inflated (un-compressed, un-deduped) and security certificates along with ownership properties are properly applied

    d) Look at different versions or generations of protection copies if you need to go back further in time

    e) Identify areas of improvement or find and isolate problem issues in advance vs. finding out after the fact

    Time to get back to work checking and verifying things as well as attending to some other items.

    Ok, nuff said, for now…

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    How to test your HDD SSD or all flash array (AFA) storage fundamentals

    How to test your HDD SSD AFA Hybrid or cloud storage

    server storage data infrastructure i/o hdd ssd all flash array afa fundamentals

    Updated 2/14/2018

    Over at BizTech Magazine I have a new article 4 Ways to Performance Test Your New HDD or SSD that provides a quick guide to verifying or learning what the speed characteristic of your new storage device are capable of.

    An out-take from the article used by BizTech as a "tease" is:

    These four steps will help you evaluate new storage drives. And … psst … we included the metrics that matter.

    Building off the basics, server storage I/O benchmark fundamentals

    The four basic steps in the article are:

    • Plan what and how you are going to test (what’s applicable for you)
    • Decide on a benchmarking tool (learn about various tools here)
    • Test the test (find bugs, errors before a long running test; see the quick example below)
    • Focus on metrics that matter (what’s important for your environment)
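
    As a quick example of testing the test, using Vdbench (one of the free tools mentioned above) you might first run its built-in self-test and then a short one-minute run before committing to a multi-hour benchmark; the parameter file name and values below are placeholders only:

    $ vdbench -t                                                                  # built-in quick self-test of the install
    $ vdbench -f my_benchmark_script.txt etime=1m itime=6 -o QuickSanityCheck     # short run to shake out errors first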

    Server Storage I/O performance

    Where To Learn More

    View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

    Software Defined Data Infrastructure Essentials Book SDDC

    What This All Means

    To some, the above (read the full article here) may seem like common sense tips and things everybody should know; on the other hand, there are many people who are new to server and storage I/O networking, hardware, software, cloud and virtual environments along with various applications, not to mention different tools.

    Thus the above is a refresher for some (e.g. déjà vu) while for others it might be new and revolutionary, or simply helpful. Interested in HDDs, SSDs as well as other server storage I/O performance along with benchmarking tools, techniques and trends? Check out the collection of links here (Server and Storage I/O Benchmarking and Performance Resources).

    Ok, nuff said, for now.

    Gs

    Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

    I/O, I/O how well do you know good bad ugly server storage I/O iops?

    How well do you know good bad ugly I/O iops?

    server storage i/o iops activity data infrastructure trends

    Updated 2/10/2018

    There are many different types of server storage I/O IOPS associated with various environments, applications and workloads. Some I/O activity is measured as IOPS, other activity as transactions per second (TPS), files or messages per unit of time (hour, minute, second), gets, puts or other operations. The best I/O is the one you do not have to do.

    What about all the cloud, virtual, software defined and legacy based application that still need to do I/O?

    If no IO operation is the best IO, then the second best IO is the one that can be done as close to the application and processor as possible with the best locality of reference.

    Also keep in mind that aggregation (e.g. consolidation) can cause aggravation (server storage I/O performance bottlenecks).

    aggregation causes aggravation
    Example of aggregation (consolidation) causing aggravation (server storage i/o blender bottlenecks)

    And the third best?

    It’s the one that can be done in less time, or with the least cost or effect on the requesting application, which means moving further down the memory and storage stack.

    solving server storage i/o blender and other bottlenecks
    Leveraging flash SSD and cache technologies to find and fix server storage I/O bottlenecks

    On the other hand, any IOP, regardless of whether it is for block, file or object storage, that involves some context is better than one without, particularly involving metrics that matter (here, here and here [webinar]).

    Server Storage I/O optimization and effectiveness

    The problem with I/Os is that they are basic operations for getting data into and out of a computer or processor, so there’s no way to avoid all of them, unless you have a very large budget. Even if you have a large budget that can afford an all flash SSD solution, you may still meet bottlenecks or other barriers.

    I/Os require CPU or processor time and memory to set up and then process the results, as well as I/O and networking resources to move data to their destination or retrieve them from where they are stored. While I/Os cannot be eliminated, their impact can be greatly improved or optimized by, among other techniques, doing fewer of them via caching and by grouping reads or writes (pre-fetch, write-behind).
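
    As a simple and hypothetical illustration of grouping, the following dd commands read the same 10 MB of data either as many small I/Os or as a few larger ones (the device name and sizes are placeholders; reading a raw device typically requires elevated privileges and results will vary by system):

    $ dd if=/dev/sda of=/dev/null bs=4k count=2560     # 2,560 x 4 KB reads (many small I/Os)
    $ dd if=/dev/sda of=/dev/null bs=1M count=10       # 10 x 1 MB reads (fewer, larger I/Os for the same data)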

    server storage I/O STI and SUT

    Think of it this way: Instead of going on multiple errands, sometimes you can group multiple destinations together making for a shorter, more efficient trip. However, that optimization may also mean your drive will take longer. So, sometimes it makes sense to go on a couple of quick, short, low-latency trips instead of one larger one that takes half a day even as it accomplishes many tasks. Of course, how far you have to go on those trips (i.e., their locality) makes a difference about how many you can do in a given amount of time.

    Locality of reference (or proximity)

    What is locality of reference?

    This refers to how close (i.e., its place) data exists to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, ready to be acted on immediately. This would be followed by levels 1, 2, and 3 (L1, L2, and L3) onboard caches, followed by main memory, or DRAM. After that comes solid-state memory typically NAND flash either on PCIe cards or accessible on a direct attached storage (DAS), SAN, or NAS device. 

    server storage I/O locality of reference

    Even though a PCIe NAND flash card is close to the processor, there still remains the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as cache or buffers for data along with meta or control information to further optimize and improve locality of reference. In other words, this information is used to help with cache hits, cache use, and cache effectiveness vs. simply boosting cache use.

    SSD to the rescue?

    What can you do to cut the impact of I/Os?

    There are many steps one can take, starting with establishing baseline performance and availability metrics.

    The metrics that matter include IOPS, latency, bandwidth and availability. Then, leverage those metrics to gain insight into your application's performance.

    Understand that IO’s are a fact of applications doing work (storing, retrieving, managing data) no matter whether systems are virtual, physical, or running up in the cloud. But it’s important to understand just what a bad IO is, along with its impact on performance. Try to identify those that are bad, and then find and fix the problem, either with software, application, or database changes. Perhaps you need to throw more software caching tools, hypervisors, or hardware at the problem. Hardware may include faster processors with more DRAM and faster internal busses.

    Leveraging local PCIe flash SSD cards for caching or as targets is another option.

    You may want to use storage systems or appliances that rely on intelligent caching and storage optimization capabilities to help with performance, availability, and capacity.

    Where to gain insight into your server storage I/O environment

    There are many tools that can be used to gain insight into your server storage I/O environment across cloud, virtual, software defined and legacy environments, as well as from different layers (e.g. applications, database, file systems, operating systems, hypervisors, server, storage, I/O networking). Many applications and databases have either built-in or optional tools from their provider, third parties, or other sources that can give information about the work activity being done. Likewise there are tools to dig deeper into the various data information infrastructure layers to see what is happening, as shown in the following figures.

    application storage I/O performance
    Gaining application and operating system level performance insight via different tools

    windows and linux storage I/O performance
    Insight and awareness via operating system tools on Windows and Linux

    In the above example, Spotlight on Windows (SoW), which you can download for free from Dell here, is shown along with Ubuntu utilities. You could also use other tools to look at server storage I/O performance, including Windows Perfmon among others.

    vmware server storage I/O
    Hypervisor performance using VMware ESXi / vsphere built-in tools

    vmware server storage I/O performance
    Using Visual ESXtop to dig deeper into virtual server storage I/O performance

    vmware server storage i/o cache
    Gaining insight into virtual server storage I/O cache performance

    Wrap up and summary

    There are many approaches to address (e.g. find and fix) vs. simply move or mask data center and server storage I/O bottlenecks. Having insight and awareness into how your environment and applications behave is important for knowing where to focus resources. Also keep in mind that a bit of flash SSD or DRAM cache in the applicable place can go a long way, while a lot of cache will also cost you cash. Even if you can't eliminate I/Os, look for ways to decrease their impact on your applications and systems.

    Where To Learn More

    View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

    Software Defined Data Infrastructure Essentials Book SDDC

    What This All Means

    Keep in mind: SSD, including flash and DRAM among others, is in your future; the question is where, when, with what, how much and whose technology or packaging.

    Ok, nuff said, for now.

    Gs

    Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

    Revisiting RAID data protection remains relevant resource links

    Revisiting RAID data protection remains relevant and resources

    Storage I/O trends

    Updated 2/10/2018

    RAID data protection remains relevant, including erasure codes (EC) and local reconstruction codes (LRC) among other technologies. If RAID were really not relevant anymore (e.g. actually dead), why do some people spend so much time trying to convince others that it is dead, or to use a different RAID level, enhanced RAID, or beyond-RAID advanced approaches?

    When you hear RAID, what comes to mind?

    A legacy monolithic storage system that supports narrow 4, 5 or 6 drive wide stripe sets, or a modern system supporting dozens of drives in a RAID group with different options?

    RAID means many things, likewise there are different implementations (hardware, software, systems, adapters, operating systems) with various functionality, some better than others.

    For example, which of the items in the following figure come to mind, or perhaps are new to your RAID vocabulary?

    RAID questions

    There are many variations of RAID storage, some for the enterprise, some for SMB, SOHO or consumer. Some have better performance than others; some have poor performance, for example causing extra writes that lead to the perception that all parity-based RAID does extra writes (some implementations actually do write gathering and optimization).

    Some hardware and software implementations use WBC (write-back cache), mirrored or battery-backed (BBU), along with the ability to group writes together in memory (cache) to do full-stripe writes. The result can be fewer back-end writes compared to other systems (see the sketch below). Hence, not all RAID implementations in either hardware or software are the same. Likewise, just because a RAID definition shows a particular theoretical implementation approach does not mean all vendors have implemented it in that way.
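    As a simplified, textbook-style sketch (the drive count and chunk size are assumptions, and real controllers vary), compare the classic RAID 5 read-modify-write penalty of four back-end I/Os per small write with a cached full-stripe write of the same data:

    # Simplified RAID 5 back-end write cost (classic textbook model, assumptions only)
    drives = 8                                  # 7 data chunks + 1 parity chunk per stripe
    chunk_kb = 64                               # per-drive chunk size
    stripe_data_kb = (drives - 1) * chunk_kb    # 448 KB of data per full stripe

    # Case 1: individual small writes, classic read-modify-write penalty of 4 IOs each
    small_writes = stripe_data_kb // chunk_kb   # 7 small writes worth of data
    backend_ios_small = small_writes * 4        # 28 back-end IOs

    # Case 2: controller gathers the same data in cache and writes one full stripe
    backend_ios_full_stripe = drives            # 7 data chunks + 1 parity chunk = 8 IOs

    print(backend_ios_small, backend_ios_full_stripe)   # 28 vs 8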

    RAID is not a replacement for backup; rather, it is part of an overall approach to providing data availability and accessibility.

    data protection and durability

    What’s the best RAID level? The one that meets YOUR needs

    There are different RAID levels and implementations (hardware, software, controller, storage system, operating system, adapter among others) for various environments (enterprise, SME, SMB, SOHO, consumer) supporting primary, secondary, tertiary (backup/data protection, archiving).

    RAID comparison
    General RAID comparisons

    Thus one size or approach does not fit all solutions; likewise RAID rules of thumb or guides need context. Context means that a RAID rule or guide for consumer, SOHO or SMB might be different for the enterprise and vice versa, not to mention depending on the type of storage system, number of drives, drive type and capacity among other factors.

    RAID comparison
    General basic RAID comparisons

    Thus the best RAID level is the one that meets your specific needs in your environment. What is best for one environment and application may be different from what is applicable to your needs.
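    To make the trade-offs concrete, here is a minimal sketch using generic usable-capacity formulas (ignoring hot spares, metadata overhead and vendor-specific implementations) for a hypothetical eight-drive group of 4 TB drives:

    # Generic usable-capacity formulas for common RAID levels (illustrative only)
    def usable_tb(level, drives, drive_tb):
        if level == "RAID 0":
            return drives * drive_tb
        if level in ("RAID 1", "RAID 10"):
            return drives * drive_tb / 2
        if level == "RAID 5":
            return (drives - 1) * drive_tb      # one drive's worth of parity
        if level == "RAID 6":
            return (drives - 2) * drive_tb      # two drives' worth of parity
        raise ValueError(level)

    for level in ("RAID 0", "RAID 10", "RAID 5", "RAID 6"):
        print(level, usable_tb(level, drives=8, drive_tb=4), "TB usable of 32 TB raw")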

    Key points and RAID considerations include:

    · Not all RAID implementations are the same; some are very much alive and evolving while others are in need of a rest or rewrite. So it is often not the technology or techniques that are the problem, rather how they are implemented and then deployed.

    · It may not be RAID that is dead, rather the solution that uses it; hence if you think a particular storage system, appliance, product or software is old and dead along with its RAID implementation, then just say that product or vendor's solution is dead.

    · RAID can be implemented in hardware controllers, adapters or storage systems and appliances as well as via software and those have different features, capabilities or constraints.

    · Long or slow drive rebuilds are a reality with larger disk drives and parity-based approaches; however, you have options on how to balance performance, availability, capacity, and economics (see the rebuild-time sketch after this list).

    · RAID can be single, dual or multiple parity or mirroring-based.

    · Erasure and other coding schemes leverage parity schemes and guess what umbrella parity schemes fall under.

    · RAID may not be cool, sexy or a fun topic and technology to talk about, however many trendy tools, solutions and services actually use some form or variation of RAID as part of their basic building blocks. This is an example of using new and old things in new ways to help each other do more without increasing complexity.

    · Even if you are not a fan of RAID and think it is old and dead, at least take a few minutes to learn more about what it is that you do not like, if only to update your RAID-is-dead FUD.
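    Here is the back-of-the-envelope rebuild-time sketch referenced in the list above. The rebuild rates are assumptions; real rebuild times vary widely with workload, drive type, RAID width and implementation:

    # Back-of-the-envelope single-drive rebuild time estimate (assumptions only)
    def rebuild_hours(drive_tb, rebuild_mb_per_sec):
        drive_mb = drive_tb * 1_000_000           # decimal TB to MB
        return drive_mb / rebuild_mb_per_sec / 3600

    for drive_tb in (2, 4, 10):
        for rate in (50, 100, 200):               # MB/sec sustained rebuild rate
            print(f"{drive_tb} TB drive at {rate} MB/s: ~{rebuild_hours(drive_tb, rate):.1f} hours")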

    Wait, Isn’t RAID dead?

    There is some marketing that paints a broad picture that RAID is dead in order to prop up something new, which in some cases may be a derivative variation of parity RAID.

    data dispersal
    Data dispersal and durability

    RAID rebuild improving
    RAID continues to evolve with rapid rebuilds for some systems

    On the other hand, there are some specific products, technologies and implementations that may be end of life or actually dead. Likewise, what might be dead, dying or simply not in vogue are specific RAID implementations or packaging. Certainly there is a lot of buzz around object storage, cloud storage, forward error correction (FEC) and erasure coding, including messages of how they replace RAID. The catch is that some object storage solutions are overlaid on top of lower-level file systems that do things such as RAID 6, granted out of sight, out of mind.

    RAID comparison
    General RAID parity and erasure code/FEC comparisons

    Then there are advanced parity protection schemes, including FEC and erasure codes, that while not your traditional RAID levels, share characteristics such as chunking or sharding data and spreading it out over multiple devices with multiple parity (or derivatives of parity) protection.
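    As a hedged illustration of why such schemes are attractive (generic k+m arithmetic only, not any specific product), compare capacity overhead and device-failure tolerance across replication, RAID 6 and a couple of wider erasure-code layouts:

    # Generic protection overhead comparison (illustrative arithmetic only)
    schemes = {
        "3-way replication":   {"data": 1,  "protection": 2},
        "RAID 6 (8+2)":        {"data": 8,  "protection": 2},
        "Erasure code (10+2)": {"data": 10, "protection": 2},
        "Erasure code (16+4)": {"data": 16, "protection": 4},
    }
    for name, s in schemes.items():
        overhead_pct = 100 * s["protection"] / s["data"]
        print(f"{name:20s} capacity overhead ~{overhead_pct:5.0f}%, tolerates {s['protection']} device failures")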

    Bottom line is that for some environments, different RAID levels may be more applicable and alive than for others.

    Via BizTech – How to Turn Storage Networks into Better Performers

    • Maintain Situational Awareness
    • Design for Performance and Availability
    • Determine Networked Server and Storage Patterns
    • Make Use of Applicable Technologies and Techniques

    If RAID is alive, what to do with it?

    If you are new to RAID, learn more about the past, present and future, keeping context in mind. Keeping context in mind means that there are different RAID levels and implementations for various environments. Not all RAID 0, 1, 1/0, 10, 2, 3, 4, 5, 6 or other variations (past, present and emerging) are the same for consumer vs. SOHO vs. SMB vs. SME vs. enterprise, nor are the use cases. Some need performance for reads, others for writes, some high capacity with lower performance, implemented in hardware or software. RAID rules of thumb are ok and useful, however keep them in context with what you are doing as well as using.

    What to do next?

    Take some time to learn, ask questions including what to use when, where, why and how, as well as whether an approach or recommendation is applicable to your needs. Check out the following links to read some extra perspectives about RAID, and keep in mind that what might apply to the enterprise may not be relevant for consumer or SMB and vice versa.

    Some advise needed on SSD’s and Raid (Via Spiceworks)
    RAID 5 URE Rebuild Means The Sky Is Falling (Via BenchmarkReview)
    Double drive failures in a RAID-10 configuration (Via SearchStorage)
    Industry Trends and Perspectives: RAID Rebuild Rates (Via StorageIOblog)
    RAID, IOPS and IO observations (Via StorageIOBlog)
    RAID Relevance Revisited (Via StorageIOBlog)
    HDDs Are Still Spinning (Rust Never Sleeps) (Via InfoStor)
    When and Where to Use NAND Flash SSD for Virtual Servers (Via TheVirtualizationPractice)
    What’s the best way to learn about RAID storage? (Via Spiceworks)
    Design considerations for the host local FVP architecture (Via Frank Denneman)
    Some basic RAID fundamentals and definitions (Via SearchStorage)
    Can RAID extend nand flash SSD life? (Via StorageIOBlog)
    I/O Performance Issues and Impacts on Time-Sensitive Applications (Via CMG)
    The original RAID white paper (PDF) by Patterson, Gibson and Katz, which, while over 20 years old, provides a basis, foundation and some history
    Storage Interview Series (Via Infortrend)
    Different RAID methods (Via RAID Recovery Guide)
    A good RAID tutorial (Via TheGeekStuff)
    Basics of RAID explained (Via ZDNet)
    RAID and IOPs (Via VMware Communities)

    Where To Learn More

    View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

    Software Defined Data Infrastructure Essentials Book SDDC

    What This All Means

    What is my favorite or preferred RAID level?

    That depends: for some things it's RAID 1, for others RAID 10, yet for others RAID 4, 5, 6 or DP, and yet other situations could be a fit for RAID 0 or erasure codes and FEC. Instead of being focused on just one or two RAID levels as the solution for different problems, I prefer to look at the environment (consumer, SOHO, small or large SMB, SME, enterprise), type of usage (primary, secondary or data protection), performance characteristics, reads, writes, type and number of drives among other factors. What might be a fit for one environment may not be a fit for others; thus my preferred RAID level, along with where it is implemented, is the one that meets the given situation. However, also keep in mind tying RAID into an overall data protection strategy; remember, RAID is not a replacement for backup.

    What this all means

    Like other technologies that have been declared dead for years or decades, aka the zombie technologies (e.g. dead yet still alive), RAID continues to be used while the technology evolves. There are specific products, implementations or even RAID levels that have faded away, or are declining in some environments, yet alive in others. RAID and its variations are still alive; however, how it is used or deployed in conjunction with other technologies is also evolving.

    Ok, nuff said, for now.

    Gs

    Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

    Data Storage Tape Update V2014, It’s Still Alive

    Data Storage Tape Update V2014, It’s Still Alive

    server storage I/O trends

    A year or so ago I did a piece on tape summit resources. Despite being declared dead for decades, and probably continuing to be declared dead for years to come, magnetic tape is in fact still alive and being used by some organizations, granted its role is changing while the technology continues to evolve.

    Here is the memo I received today from the PR folks of the Tape Storage Council (e.g. the tape vendors' marketing consortium), and for simplicity (mine), I'm posting it here for you to read in its entirety vs. possibly in pieces elsewhere. Note that this is basically a tape status update and collection of marketing and press release talking points; however, you can get an idea of the current messaging, who is using tape, and technology updates.

    Tape Data Storage in 2014 and looking towards 2015

    True to the nature of magnetic tape as a data storage medium, this is not a low-latency small post, rather a large, high-capacity bulk post, or perhaps all you need to know about tape for now, or until next year. On the other hand, if you are a tape fan, you can certainly take in the memo from the tape folks, as well as visit their site for more info.

    From the tape storage council industry trade group:

    Today the Tape Storage Council issued its annual memo to highlight the current trends, usages and technology innovations occurring within the tape storage industry. The Tape Storage Council includes representatives of BDT, Crossroads Systems, FUJIFILM, HP, IBM, Imation, Iron Mountain, Oracle, Overland Storage, Qualstar, Quantum, REB Storage Systems, Recall, Spectra Logic, Tandberg Data and XpresspaX.  

    Data Growth and Technology Innovations Fuel Tape’s Future
    Tape Addresses New Markets as Capacity, Performance, and Functionality Reach New Levels

    Abstract
    For the past decade, the tape industry has been re-architecting itself and the renaissance is well underway. Several new and important technologies for both LTO (Linear Tape Open) and enterprise tape products have yielded unprecedented cartridge capacity increases, much longer media life, improved bit error rates, and vastly superior economics compared to any previous tape or disk technology. This progress has enabled tape to effectively address many new data intensive market opportunities in addition to its traditional role as a backup device such as archive, Big Data, compliance, entertainment and surveillance. Clearly disk technology has been advancing, but the progress in tape has been even greater over the past 10 years. Today’s modern tape technology is nothing like the tape of the past.

    The Growth in Tape  
    Demand for tape is being fueled by unrelenting data growth, significant technological advancements, tape’s highly favorable economics, the growing requirements to maintain access to data “forever” emanating from regulatory, compliance or governance requirements, and the big data demand for large amounts of data to be analyzed and monetized in the future. The Digital Universe study suggests that the world’s information is doubling every two years and much of this data is most cost-effectively stored on tape.

    Enterprise tape has reached an unprecedented 10 TB native capacity with data rates reaching 360 MB/sec. Enterprise tape libraries can scale beyond one exabyte. Enterprise tape manufacturers IBM and Oracle StorageTek have signaled future cartridge capacities far beyond 10 TBs with no limitations in sight.  Open systems users can now store more than 300 Blu-ray quality movies with the LTO-6 2.5 TB cartridge. In the future, an LTO-10 cartridge will hold over 14,400 Blu-ray movies. Nearly 250 million LTO tape cartridges have been shipped since the format’s inception. This equals over 100,000 PB of data protected and retained using LTO Technology. The innovative active archive solution combining tape with low-cost NAS storage and LTFS is gaining momentum for open systems users.

    Recent Announcements and Milestones
    Tape storage is addressing many new applications in today’s modern data centers while offering welcome relief from constant IT budget pressures. Tape is also extending its reach to the cloud as a cost-effective deep archive service. In addition, numerous analyst studies confirm the TCO for tape is much lower than disk when it comes to backup and data archiving applications. See TCO Studies section below.

    • On Sept. 16, 2013 Oracle Corp announced the StorageTek T10000D enterprise tape drive. Features of the T10000D include an 8.5 TB native capacity and data rate of 252 MB/s native. The T10000D is backward read compatible with all three previous generations of T10000 tape drives.
    • On Jan. 16, 2014 Fujifilm Recording Media USA, Inc. reported it has manufactured over 100 million LTO Ultrium data cartridges since its release of the first generation of LTO in 2000. This equates to over 53 thousand petabytes (53 exabytes) of storage and more than 41 million miles of tape, enough to wrap around the globe 1,653 times.
    • April 30, 2014, Sony Corporation announced it had independently developed a soft magnetic underlayer with a smooth interface using sputter deposition, creating a nano-grained magnetic layer with fine magnetic particles and uniform crystalline orientation. This layer enabled Sony to successfully demonstrate the world’s highest areal recording density for tape storage media of 148 Gb/in2. This areal density would make it possible to record more than 185 TB of data per data cartridge.
    • On May 19, 2014 Fujifilm in conjunction with IBM successfully demonstrated a record areal data density of 85.9 Gb/in2 on linear magnetic particulate tape using Fujifilm’s proprietary NANOCUBIC™ and Barium Ferrite (BaFe) particle technologies. This breakthrough in recording density equates to a standard LTO cartridge capable of storing up to 154 terabytes of uncompressed data, making it 62 times greater than today’s current LTO-6 cartridge capacity and projects a long and promising future for tape growth.
    • On Sept. 9, 2014 IBM announced LTFS LE version 2.1.4, extending LTFS (Linear Tape File System) tape library support.
    • On Sept. 10, 2014 the LTO Program Technology Provider Companies (TPCs), HP, IBM and Quantum, announced an extended roadmap which now includes LTO generations 9 and 10. The new generation guidelines call for compressed capacities of 62.5 TB for LTO-9 and 120 TB for generation LTO-10 and include compressed transfer rates of up to 1,770 MB/second for LTO-9 and a 2,750 MB/second for LTO-10. Each new generation will include read-and-write backwards compatibility with the prior generation as well as read compatibility with cartridges from two generations prior to protect investments and ease tape conversion and implementation.
    • On Oct. 6, 2014 IBM announced the TS1150 enterprise drive. Features of the TS1150 include a native data rate of up to 360 MB/sec versus the 250 MB/sec native data rate of the predecessor TS1140 and a native cartridge capacity of 10 TB compared to 4 TB on the TS1140. LTFS support was included.
    • On Nov. 6, 2014, HP announced a new release of StoreOpen Automation that delivers a solution for using LTFS in automation environments with Windows OS, available as a free download. This version complements their already existing support for Mac and Linux versions to help simplify integration of tape libraries to archiving solutions.

    Significant Technology Innovations Fuel Tape’s Future
    Development and manufacturing investment in tape library, drive, media and management software has effectively addressed the constant demand for improved reliability, higher capacity, power efficiency, ease of use and the lowest cost per GB of any storage solution. Below is a summary of tape’s value proposition followed by key metrics for each:

    • Tape drive reliability has surpassed disk drive reliability
    • Tape cartridge capacity (native) growth is on an unprecedented trajectory
    • Tape has a faster device data rate than disk
    • Tape has a much longer media life than any other digital storage medium
    • Tape’s functionality and ease of use is now greatly enhanced with LTFS
    • Tape requires significantly less energy consumption than any other digital storage technology
    • Tape storage has a much lower acquisition cost and TCO than disk

    Reliability. Tape reliability levels have surpassed HDDs. Reliability levels for tape exceed those of the most reliable disk drives by one to three orders of magnitude. The BER (Bit Error Rate – bits read per hard error) for enterprise tape is rated at 1×10^19, and 1×10^17 for LTO tape. This compares to 1×10^16 for the most reliable enterprise Fibre Channel disk drive.
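    To put those bit error rate ratings in perspective, here is a quick, hedged calculation (straight arithmetic on the published ratings, ignoring real-world factors such as handling, media age and drive condition) of roughly how much data could be read per expected hard error:

    # Convert rated BER (bits read per hard error) into data read per expected error
    ratings = {
        "Enterprise tape (1x10^19)":   1e19,
        "LTO tape (1x10^17)":          1e17,
        "Enterprise FC HDD (1x10^16)": 1e16,
    }
    for name, bits_per_error in ratings.items():
        petabytes = bits_per_error / 8 / 1e15     # bits -> bytes -> PB (decimal)
        print(f"{name:28s} ~{petabytes:,.1f} PB read per expected hard error")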

    Capacity and Data Rate. LTO-6 cartridges provide 2.5 TB capacity and more than double the compressed capacity of the preceding LTO-5 drive with a 14% data rate performance boost to 160 MB/sec. Enterprise tape has reached 8.5 TB native capacity and 252 MB/sec on the Oracle StorageTek T10000D and 10 TB native capacity and 360 MB/sec on the IBM TS1150. Tape cartridge capacities are expected to grow at unprecedented rates for the foreseeable future.
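    For a rough sense of streaming time (simple division at native capacity and data rate, ignoring compression, load, seek and verify time), filling a cartridge at the rates cited above looks roughly like this:

    # Approximate time to stream a full cartridge at native capacity and data rate
    cartridges = {
        "LTO-6 (2.5 TB @ 160 MB/s)":              (2.5, 160),
        "StorageTek T10000D (8.5 TB @ 252 MB/s)": (8.5, 252),
        "IBM TS1150 (10 TB @ 360 MB/s)":          (10, 360),
    }
    for name, (tb, mb_per_sec) in cartridges.items():
        hours = tb * 1_000_000 / mb_per_sec / 3600
        print(f"{name:42s} ~{hours:.1f} hours to fill")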

    Media Life. Manufacturers' specifications indicate that enterprise and LTO tape media have a life span of 30 years or more, while the average tape drive will be deployed 7 to 10 years before replacement. By comparison, the average disk drive is operational 3 to 5 years before replacement.

    LTFS Changes Rules for Tape Access. Compared to previous proprietary solutions, LTFS is an open tape format that stores files in application-independent, self-describing fashion, enabling the simple interchange of content across multiple platforms and workflows. LTFS is also being deployed in several innovative “Tape as NAS” active archive solutions that combine the cost benefits of tape with the ease of use and fast access times of NAS. The SNIA LTFS Technical Working Group has been formed to broaden cross–industry collaboration and continued technical development of the LTFS specification.

    TCO Studies. Tape's widening cost advantage compared to other storage media makes it the most cost-effective technology for long-term data retention. The favorable economics (TCO, low energy consumption, reduced raised floor) and massive scalability have made tape the preferred medium for managing vast volumes of data. Several tape TCO studies are publicly available and the results consistently confirm a significant TCO advantage for tape compared to disk solutions.

    According to the Brad Johns Consulting Group, a TCO study for an LTFS-based ‘Tape as NAS’ solution totaled $1.1M compared with $7.0M for a disk-based unified storage solution. This equates to a savings of over $5.9M over a 10-year period, which is more than 84 percent less than the equivalent amount for a storage system built on a 4 TB hard disk drive unified storage system. From a slightly different perspective, this is a TCO savings of over $2,900/TB of data. Source: Johns, B., “A New Approach to Lowering the Cost of Storing File Archive Information.”

    Another comprehensive TCO study by ESG (Enterprise Strategy Group) comparing an LTO-5 tape library system with a low-cost SATA disk system for backup using de-duplication (best case for disk) shows that disk deduplication has a 2-4x higher TCO than the tape system for backup over a 5-year period. The study revealed that disk has a TCO 15x higher than tape for long-term data archiving.

    Select Case Studies Highlight Tape and Active Archive Solutions
    CyArk is a non-profit foundation focused on the digital preservation of cultural heritage sites, including places such as Mt. Rushmore and Pompeii. CyArk predicted that its data archive would grow by 30 percent each year for the foreseeable future, reaching one to two petabytes in five years. It needed a storage solution that was secure, scalable, and more cost-effective to provide the longevity required for these important historical assets. To meet this challenge CyArk implemented an active archive solution featuring LTO and LTFS technologies.

    DreamWorks Animation, a global computer graphics (CG) animation studio, has implemented a reliable, cost-effective and scalable active archive solution to safeguard a 2 PB portfolio of finished movies and graphics, supporting a long-term asset preservation strategy. The studio's comprehensive, tiered and converged active archive architecture, which spans software, disk and tape, saves the company time and money and reduces risk.

    LA Kings of the NHL rely extensively on digital video assets for marketing activities with team partners and for their broadcast affiliation with Fox Sports. Today, the Kings save about 200 GB of video per game for an 82-game regular season and are on pace to generate about 32-35 TB of new data per season. The Kings chose to implement Fujifilm's Dternity NAS active archive appliance, an open LTFS-based architecture. The Kings wanted an open archiving solution which could outlast its original hardware while maintaining data integrity. Today with Dternity and LTFS, the Kings don't have to decide what data to keep because they are able to cost-effectively save everything they might need in the future.

    McDonald's primary challenge was to create a digital video workflow that streamlines the management and distribution of their global video assets for their video production and post-production environment. McDonald's implemented the Spectra T200 tape library with LTO-6, providing 250 TB of McDonald's video production storage. Nightly, incremental backup jobs store their media assets into separate disk and LTO-6 storage pools for easy backup, tracking and fast retrieval. This system design allows McDonald's to effectively separate and manage their assets through the use of customized automation and data service policies.

    NCSA employs an Active Archive solution providing 100 percent of the nearline storage for the NCSA Blue Waters supercomputer, which is one of the world’s largest active file repositories stored on high capacity, highly reliable enterprise tape media. Using an active archive system along with enterprise tape and RAIT (Redundant Arrays of Inexpensive Tape) eliminates the need to duplicate tape data, which has led to dramatic cost savings.

    Queensland Brain Institute (QBI) is a leading center for neuroscience research. QBI's research focuses on the cellular and molecular mechanisms that regulate brain function to help develop new treatments for neurological and mental disorders. QBI's storage system has to scale extensively to store, protect, and access tens of terabytes of data daily to support cutting-edge research. QBI chose an Oracle solution consisting of Oracle's StorageTek SL3000 modular tape libraries with StorageTek T10000 enterprise tape drives. The Oracle solution improved QBI's ability to grow, attract world-leading scientists and meet stringent funding conditions.

    Looking Ahead to 2015 and Beyond
    The role tape serves in today’s modern data centers is expanding as IT executives and cloud service providers address new applications for tape that leverage its significant operational and cost advantages. This recognition is driving investment in new tape technologies and innovations with extended roadmaps, and it is expanding tape’s profile from its historical role in data backup to one that includes long-term archiving requiring cost-effective access to enormous quantities of stored data. Given the current and future trajectory of tape technology, data intensive markets such as big data, broadcast and entertainment, archive, scientific research, oil and gas exploration, surveillance, cloud, and HPC are expected to become significant beneficiaries of tape’s continued progress. Clearly the tremendous innovation, compelling value proposition and development activities demonstrate tape technology is not sitting still; expect this promising trend to continue in 2015 and beyond. 

    Visit the Tape Storage Council at tapestorage.org

    What this means and summary

    Like it or not, tape is still alive and being used, with the technology evolving and gaining new enhancements as outlined above.

    Good to see the tape folks doing some marketing to get their story told and heard for those who are still interested.

    Does that mean I still use tape?

    Nope, I stopped using tape for local backups and archives well over a decade ago, moving to disk-to-disk and disk-to-cloud instead.

    Does that mean I believe that tape is dead?

    Nope, I still believe that for some organizations and some usage scenarios it makes good sense; however, like with most data storage related technologies, it's not a one-size or one-type-of-technology-fits-everything value proposition.

    On a related note for cloud and object storage, visit www.objectstoragecenter.com

    Ok, nuff said, for now…

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    StorageIO Out and About Update – VMworld 2014

    StorageIO Out and About Update – VMworld 2014

    Here is a quick video montage, or mash-up if you prefer, that Cory Peden (aka the Server and StorageIO Intern @Studentof_IT) put together using some video recorded while at VMworld 2014 in San Francisco. In this YouTube video we take a quick tour around the expo hall to see who, as well as what, we run into while out and about.

    VMworld 2014 StorageIO Update
    Click on above image to view video

    For those of you who were at VMworld 2014, the video (click the above image) will give you a quick déjà vu of the sights and sounds, while for those who were not there, see what you missed and plan for next year. Watch for appearances from Gina Minks (@Gminks) aka Gina Rosenthal (of BackupU) and Michael (not Dell) of Dell Data Protection, and Luigi Danakos (@Nerdblurt) of HP Data Protection, who lost his voice (tweet Luigi if you can help him find his voice). With Luigi we were able to get in a quick game of buzzword bingo before catching up with Marc Farley (@Gofarley) and John Howarth of Quaddra Software. Marc and John talk about their new solution from Quaddra, which will enable searching and discovering data across different storage systems and technologies.

    Other visits include a quick look at an EVO:Rail from Dell, along with Docker for Smarties overview with Nathan LeClaire (@upthecyberpunks) of Docker (click here to watch the extended interview with Nathan).

    Docker for smarties

    Check out the conversation with Max Kolomyeytsev of StarWind Software (@starwindsan) before we get interrupted by a salesperson. During our walkabout, we also bump into Mark Peters (@englishmdp) of ESG facing off video camera to video camera.

    Watch for other things including rack cabinets that look like compute servers yet that have a large video screen so they can be software defined for different demo purposes.

    virtual software defined server

    Watch for more Server and StorageIO Industry Trend Perspective podcasts, videos as well as out and about updates soon, meanwhile check out others here.

    Ok, nuff said (for now)

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Data Protection Diaries: March 31 World Backup Day is Restore Data Test Time

    Storage I/O trends

    World Backup Day Generating Awareness About Data Protection

    This World Backup Day piece is part of my ongoing Data Protection Diaries series of posts (www.dataprotecitondiaries.com) about trends, strategies, tools and best practices spanning applications, archiving, backup/restore, business continuance (BC), business resiliency (BR), cloud, data footprint reduction (DFR), security, servers, storage and virtualization among other related topic themes.

    data protection threat risk scenarios
    Different threat risks and reasons to protect your digital assets (data)

    March 31 is World Backup Day, which means you should make sure that your data and digital assets (photos, videos, music or audio, scanned items) along with other digital documents are protected. Keep in mind that there are various reasons for protecting, preserving and serving your data, regardless of whether you are a consumer needing to protect your home and personal information, or a large business, institution or government agency.

    Why World Backup Day and Data Protection Focus

    Being protected means making sure that there are copies of your documents, data, files, software tools, settings, configurations and other digital assets. These copies can be in different locations (home, office, on-site, off-site, in the cloud) as well as for various points in time or recovery point objectives (RPOs) such as monthly, weekly, daily, hourly and so forth.

    Having different copies for various times (e.g. your protection interval) gives you the ability to go back to a specific time to recover or restore lost, stolen, damaged, infected, erased, or accidentally over-written data. Having multiple copies is also a safeguard in case either the data, files, objects or items being backed up or protected are bad, or the copy itself is damaged, lost or stolen.

    Restore Test Time

    While the focus of World Backup Day is to make sure that you are backing up or protecting your data and digital assets, it is also about making sure that what you think is being protected actually is. It is also a time to make sure what you think is occurring, or know is being done, can actually be used when needed (restore, recover, rebuild, reload, rollback among other things that start with R). This means testing that you can find the files, folders, volumes, objects or data items that were protected, and that you can use those copies or backups to restore to a different place (you don't want to create a disaster by over-writing your good data).

    In addition to making sure that the data can be restored to a different place, go one more step and verify that the data can actually be used: has it been decrypted or unlocked, and have the security or other rights and access settings along with metadata been applied? While that might seem obvious, it is often the obvious that will bite you and cause problems; hence take some time to test that all is working, not to mention get some practice doing restores.
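    As a small, hypothetical sketch of that kind of restore test (the file path and recorded digest below are placeholders, not part of any particular backup tool), you can restore to a different location and compare a checksum against one recorded when the backup was made:

    # Hypothetical restore verification: compare a restored file's checksum to the
    # checksum recorded at backup time, restoring to a different path than the original
    import hashlib

    def sha256_of(path, chunk_size=1024 * 1024):
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    recorded_at_backup = "<sha256 hex digest saved when the backup was made>"  # placeholder
    restored_copy = r"D:\restore-test\important.xml"   # restore target, not the original location

    if sha256_of(restored_copy) == recorded_at_backup:
        print("Restored copy matches the checksum recorded at backup time")
    else:
        print("Mismatch: the restored copy differs from what was protected")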

    Data Protection and Backup 3 2 1 Rule and Guide

    Recently I did a piece based on my own experiences with data protection, including backup as well as restore, over at Spiceworks called My copies were corrupted: The 3-2-1 rule. For those not familiar, or as a reminder, 3 2 1 means have at least three copies (or better yet, versions) stored on at least two different devices, systems, drives, media or mediums, with at least one in a different location from the primary or main copy.
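    As a small, hypothetical sketch (the copy inventory below is made up purely for illustration), you can sanity-check whether a set of copies satisfies the 3 2 1 guideline:

    # Hypothetical 3-2-1 check: 3+ copies, 2+ different media/devices, 1+ off-site
    copies = [
        {"name": "primary",      "media": "internal disk",        "location": "office"},
        {"name": "local backup", "media": "removable disk",       "location": "office"},
        {"name": "cloud backup", "media": "cloud object storage", "location": "off-site"},
    ]

    total_copies   = len(copies)
    distinct_media = len({c["media"] for c in copies})
    offsite_copies = sum(1 for c in copies if c["location"] == "off-site")

    ok = total_copies >= 3 and distinct_media >= 2 and offsite_copies >= 1
    print(f"copies={total_copies}, media types={distinct_media}, off-site={offsite_copies}, 3-2-1 met: {ok}")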

    Following is an excerpt from the My copies were corrupted: The 3-2-1 rule piece:

    Not long ago I had a situation where something happened to an XML file that I needed. I discovered it was corrupted, and I needed to do a quick restore.

    “No worries,” I thought, “I’ll simply copy the most recent version that I had saved to my file server.” No such luck. That file had been just copied and was damaged.

    “OK, no worries,” I thought. “That’s why I have a periodic backup copy.” It turns out that had worked flawlessly. Except there was a catch — it had backed up the damaged file. This meant that any and all other copies of the file were also damaged as far back as to when the problem occurred.

    Read the full piece here.

    Backup and Data Protection Walking the Talk

    Yes, I eat my own dog food, meaning that I practice what I talk about (e.g. walking the talk), leveraging not just a 3 2 1 approach but actually more of a 4 3 2 1 hybrid, which means different protection intervals, various retentions and frequencies; not all data gets treated the same, using local disk, removable disk to go off-site, as well as cloud. Candidly, I also test more often by accident, using the local, removable and cloud copies when I accidentally delete something or save the wrong version.

    Some of my data and applications are protected throughout the day, others on set schedules that vary from hours to days to weeks to months or more. Yes, some of my data such as large videos or other items that are static do not change, so why back them up or protect them every day, week or month? I also align the type of protection, frequency and retention to meet different threat risks, as well as encrypt data. Part of actually testing and using the restores or recoveries is also determining what certificates or settings are missing, as well as where opportunities exist or are needed to enhance data protection.

    Closing comments (for now)

    Take some time to learn more about data protection, including how you can improve or modernize while rethinking what to protect, when, where, why, how and with what.

    In addition to having copies from different points in time and extra copies in various locations, also make sure that they are secured or encrypted AND make sure to protect your encryption keys. After all, try to find a digital locksmith to unlock your data who is not working for a government agency when you need to get access to your data ;)…

    Learn more about data protection including backup/restore at www.storageioblog.com/data-protection-diaries-main/ where there is a collection of related posts and presentations.

    Also check out the collection of technology and vendor / product neutral data protection and backup/restore content at BackupU (disclosure: sponsored by Dell Data Protection Software) that includes various webinars and Google+ hangout sessions that I have been involved with.

    Watch for more data protection conversations about related trends, themes, technologies, techniques perspectives in my ongoing data protection diaries discussions as well as read more about Backup and other related items at www.storageioblog.com/data-protection-diaries-main/.

    Ok, nuff said

    Cheers
    Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Spring 2014 StorageIO Events and Activities Update

    Storage I/O trends

    Cloud Virtual Server Storage I/O and Networking events

    Speaking of Old School, New School, Current and Future School, here are some upcoming events including live in-person as well as virtual or online StorageIO activities. The following calendar also includes a series of one-day workshop sessions that are part of a week of seminars to be held in Nijkerk, Netherlands, organized by Brouwer Storage Consultancy (learn more here).

    The current calendar, which continues to be updated, includes a mix of webinars (playbacks are available) and live events covering data infrastructure topics from cloud, virtual, physical and software defined across servers, storage I/O networking, SSD, performance, object storage and data protection among other related themes.

    • June 19, 2014: Evolving from Disaster Recovery and Business Continuity (BC) to Business Resiliency (BR), Webinar via Server and StorageIO BrightTalk Channel, 9AM PT
    • June 12, 2014: The Many Facets of Virtual Storage and Software Defined Storage Virtualization, Webinar via Server and StorageIO BrightTalk Channel, 9AM PT
    • June 11, 2014: The Changing Face and Landscape of Enterprise Storage, Webinar via Server and StorageIO BrightTalk Channel, 9AM PT
    • May 16, 2014: What you need to know about virtualization (Demystifying Virtualization), Nijkerk, Netherlands
    • May 15, 2014: Data Infrastructure Industry Trends: What’s New and Trending, Nijkerk, Netherlands
    • May 14, 2014: To be announced, Nijkerk, Netherlands
    • May 13, 2014: Data Movement and Migration: Storage Decision Making Considerations, Nijkerk, Netherlands
    • May 12, 2014: Rethinking Business Resiliency: From Disaster Recovery to Business Continuance, Nijkerk, Netherlands
    • May 5-7, 2014: EMC World, Las Vegas NV
    • April 22-23, 2014: SNIA DSI Event (TBA), Santa Clara CA
    • April 16, 2014: Open Source and Cloud Storage – Enabling business, or a technology enabler?, Webinar via Server and StorageIO BrightTalk Channel, 9AM PT
    • April 9, 2014: Storage Decision Making for Fast, Big and Very Big Data Environments, Webinar via Server and StorageIO BrightTalk Channel, 9AM PT
    • April 8, 2014: NAB National Association of Broadcasters (e.g. Very Big Fast Data Event), Las Vegas NV
    • March 27, 2014: Keynote: The 2017 Datacenter – Preparing for the 2017 Datacenter Sessions, Edina MN, 8:00AM CT (Register Here)
    • March 19, 2014: Business Resiliency (BR), Business Continuity (BC) and Disaster Recovery (DR) Management, Webinar via Server and StorageIO BrightTalk Channel, 9AM PT
    • March 19, 2014: Data Center Monitoring – Metrics that Matter for Effective Management, Webinar via Server and StorageIO BrightTalk Channel, 7AM PT
    • March 12, 2014: Hybrid Clouds – Bridging the Gap between public and private environments, Webinar via Server and StorageIO BrightTalk Channel, 11AM PT

    View other recent and past activities along with new additions at the StorageIO.com/events page. Also check out recent commentary in the news here as well as tips and articles here.

    Ok, nuff said (for now)

    Cheers Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved