Driving ROI with Cloud Storage Consolidation Seminars

Join me in a series of in-person seminars driving ROI with cloud storage consolidation for unstructured file data.

Various Data Infrastructure options from on-prem to edge to cloud and beyond

These initial seminars are being held at Amazon Web Services (AWS) locations April 30 in New York City, May 1 in Chicago and May 2 in Houston. In each of these three cities, I will be joined by experts from NetApp, Talon and AWS as we look at issues, trends and what can be done today (including hands-on demos) driving ROI with cloud storage consolidation for unstructured file data.

What The Seminars Are About

These seminars look at how to remove cost and complexity while boosting productivity for distributed sites with unstructured data and NAS file servers. The seminars look at making informed decisions that balance technical considerations with a business return on investment (ROI) model, along with return on innovation (the other ROI) from boosting productivity. It's not about simply cutting costs, which can create chaos or compromise elsewhere; it's about removing complexity and cost while boosting productivity with smart cloud storage consolidation for unstructured file data.

Distributed File Server Cloud Storage Consolidation ROI Economic Comparison

During these seminars I will discuss various industry and customer trends, challenges as well as solutions, particularly for environments with distributed file servers for unstructured file data. As part of my discussion, we will look at both a technical and an ROI business model for distributed file server cloud storage consolidation based on the Server StorageIO white paper report titled Cloud File Data Storage Consolidation and Economic Comparison Model (Free PDF download here).
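
To make the business ROI side concrete, here is a minimal sketch of the kind of pro forma comparison such a model performs. Every number below is a hypothetical placeholder (none are from the white paper); plug in your own site counts and costs.

# Hypothetical pro forma: distributed file servers vs. consolidated
# cloud file storage. Every input below is a made-up placeholder.

sites = 20                       # remote offices, each with its own file server
cost_per_site = 12_000           # annual per-site cost: hardware, backup, admin
distributed_annual = sites * cost_per_site

capacity_tb = 100                # unstructured file data under management
cloud_per_tb_month = 25          # blended $/TB/month for cloud file storage
edge_software = 60_000           # annual edge caching / software licensing
consolidated_annual = capacity_tb * cloud_per_tb_month * 12 + edge_software

print(f"Distributed file servers: ${distributed_annual:,}/year")
print(f"Cloud consolidation:      ${consolidated_annual:,}/year")
print(f"Annual difference:        ${distributed_annual - consolidated_annual:,}")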

Where When and How to Register

New York City Tuesday April 30, 2019 9:00AM
Amazon Web Services
7 West 34th St.
6th Floor
Learn more and register here.

Chicago, Illinois Wednesday May 1, 2019 9:00AM
Amazon Web Services
222 West Adams Street
Suite 1400
Learn more and register here.

Houston, Texas Thursday May 2, 2019 9:00AM
Amazon Web Services
825 Town and Country Lane
Suite 1000
Learn more and register here.

Where To Learn More

Learn more about cloud storage consolidation, distributed file servers and related data infrastructure topics via the following links:

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What this all means

Making informed decisions about data infrastructure resources, including cloud storage consolidation and distributed file servers, involves technical, application workload and business economic analysis. Which of the three (technical, application workload, financial) is most important for enabling a business benefit will depend on your perspective as well as area of focus. However, all of the above need to be considered in balance as part of making an informed data infrastructure resource decision. That is where a discussion about a business financial ROI model (pro forma if you prefer) comes into play as part of cloud storage consolidation, including for distributed file servers with unstructured file data.

I look forward to meeting with attendees and hope to see you at the events April 30 in New York City, May 1 in Chicago, and May 2 in Houston as we discuss driving ROI with cloud storage consolidation at these seminars.

Ok, nuff said, for now.

Cheers GS

Greg Schulz – Multi-year Microsoft MVP Cloud and Data Center Management, ten-time VMware vExpert. Author of Data Infrastructure Insights (CRC Press), Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Also visit www.picturesoverstillwater.com to view various UAS/UAV e.g. drone based aerial content created by Greg Schulz. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. Visit our companion site https://picturesoverstillwater.com to view drone based aerial photography and video related topics. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

WekaIO Matrix Scale Out Software Defined Storage SDS

Updated 2/11/2018

WekaIO Matrix is a scale out software defined storage (SDS) solution.


This Server StorageIO Industry Trends Perspective report looks at common issues, trends, and how to address different application server storage I/O challenges. In this report, we look at WekaIO Matrix, an elastic, flexible, highly scalable, easy to use (and manage) software defined (e.g. software based) storage solution. WekaIO Matrix enables flexible elastic scaling with stability and without compromise.

Matrix is a new scale out software defined storage solution that:

  • Installs on bare metal, virtual or cloud servers
  • Provides POSIX, NFS, SMB, and HDFS storage access
  • Adapts performance for little and big data
  • Tiers between flash SSD and cloud object storage
  • Delivers distributed resilience without compromise
  • Removes the complexity of traditional storage

Where To Learn More

View additional SDS and related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What This All Means

Read more about WekaIO Matrix in this (free, no registration required) Server StorageIO Industry Trends Perspective (ITP) Report compliments of WekaIO.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Which Enterprise HDD for Content Applications: Different File Size Impact

Updated 1/23/2018

Which enterprise HDD should you use with a content server platform, and what is the impact of different file sizes?

Insight for effective server storage I/O decision making
Server StorageIO Lab Review


This is the fifth in a multi-part series (read part four here) based on a hands-on lab report white paper I did compliments of Servers Direct and Seagate that you can read in PDF form here. The focus is the Servers Direct (www.serversdirect.com) converged Content Solution platforms with Seagate Enterprise Hard Disk Drives (HDDs). This post focuses on large and small file I/O processing.

File Performance Activity

Tip: content solutions use files in various ways. Use the following results to gain perspective on how various HDDs handle workloads similar to your specific needs.

Two separate file processing workloads were run (see note 12), one with a relatively small number of large files, and another with a large number of small files. For the large file processing (table-3), 5 GByte sized files were created and then accessed via 128 Kbyte (128KB) sized I/O over a 10 hour period with 90% reads using 64 threads (workers). The large file workload simulates what might be seen with higher definition video, image or other content streaming.

(Note 12) File processing workloads were run using Vdbench 5.04 and file anchors with the sample script configurations shown below. Instead of Vdbench you could also use other tools such as sysbench or fio among others.

VdbenchFSBigTest.txt
# Sample script for big files testing
fsd=fsd1,anchor=H:,depth=1,width=5,files=20,size=5G
fwd=fwd1,fsd=fsd1,rdpct=90,xfersize=128k,fileselect=random,fileio=random,threads=64
rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30

vdbench -f VdbenchFSBigTest.txt -m 16 -o Results_FSbig_H_060615

VdbenchFSSmallTest.txt
# Sample script for small files testing
fsd=fsd1,anchor=H:,depth=1,width=64,files=25600,size=16k
fwd=fwd1,fsd=fsd1,rdpct=90,xfersize=1k,fileselect=random,fileio=random,threads=64
rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30

vdbench -f VdbenchFSSmallTest.txt -m 16 -o Results_FSsmall_H_060615

The 10% writes are intended to reflect some update activity for new content or other changes to content. Note that 128 MBps translates to roughly 1 Gbps of streaming content such as higher definition video (128 MB x 8 bits per byte = 1,024 Mb, or about 1 Gb, per second). However 4K video (not optimized) would require a higher speed as well as result in larger file sizes. Table-3 shows the performance during the large file access period, showing average read/write rates and response time, bandwidth (MBps), and average open and close rates with response time.

| Drive config | Avg. File Read Rate | Avg. Read Resp. Time (Sec.) | Avg. File Write Rate | Avg. Write Resp. Time (Sec.) | Avg. CPU % Total | Avg. CPU % System | Avg. MBps Read | Avg. MBps Write |
| ENT 15K R1   | 580.7 | 107.9 | 64.5 | 19.7 | 52.2 | 35.5 | 72.6 | 8.1 |
| ENT 10K R1   | 455.4 | 135.5 | 50.6 | 44.6 | 34.0 | 22.7 | 56.9 | 6.3 |
| ENT CAP R1   | 285.5 | 221.9 | 31.8 | 19.0 | 43.9 | 28.3 | 37.7 | 4.0 |
| ENT 10K R10  | 690.9 | 87.21 | 76.8 | 48.6 | 35.0 | 21.8 | 86.4 | 9.6 |

Table-3 Performance summary for large file access operations (90% read)

Table-3 shows that for two-drive RAID 1, the Enterprise 15K provides the fastest performance; however a RAID 10 with four 10K HDDs with enhanced cache features provides a good price, performance and space capacity option. Software RAID was used in this workload test.

Figure-4 shows the relative performance of various HDD options handling large files, keep in mind that for the response line lower is better, while for the activity rate higher is better.

Figure-4 Large file processing 90% read, 10% write rate and response time

In figure-4 you can see the performance in terms of response time (reads larger dashed line, writes smaller dotted line) along with the number of file read operations per second (reads solid blue column bar, writes green column bar). Reminder that lower response times and higher activity rates are better. Performance declines moving from left to right, from 15K to 10K Enterprise Performance with enhanced cache feature to Enterprise Capacity (7.2K), all of which were hardware RAID 1. Also shown is a hardware RAID 10 (four x 10K HDDs).

Results in figure-4 above and table-4 below show how various drives can be configured to balance their performance, capacity and costs to meet different needs. Table-4 below shows an analysis looking at average file reads per second (RPS) performance vs. HDD costs, usable capacity and protection level.

Table-4 is an example of looking at multiple metrics to make informed decisions as to which HDD would be best suited to your specific needs. For example RAID 10 using four 10K drives provides good performance and protection along with large usable space, however that also comes at a budget cost (e.g. price).
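
As a sketch of that kind of multi-metric math (the white paper has the exact derivations, and its usable-capacity accounting may differ), the following uses the measured RPS values from table-4 while the drive prices and capacities are assumptions for illustration:

# Sketch of cost-vs-performance metrics similar in spirit to table-4.
# Drive prices and capacities are assumptions; avg_rps values are the
# measured file reads per second from the table.

def cost_metrics(drives, price_per_drive, gb_per_drive, usable_fraction, avg_rps):
    total_cost = drives * price_per_drive
    usable_gb = drives * gb_per_drive * usable_fraction
    return {
        "cost_per_rps": round(total_cost / avg_rps, 2),
        "cost_per_usable_gb": round(total_cost / usable_gb, 2),
        "usable_gb": usable_gb,
    }

# Two-drive RAID 1 and four-drive RAID 10 both yield 50% usable capacity
ent15k_r1 = cost_metrics(2, 595, 600, 0.5, 580.7)    # assumed 600GB 15K drives
ent10k_r10 = cost_metrics(4, 875, 1800, 0.5, 690.9)  # assumed 1.8TB 10K drives
print("ENT 15K R1 :", ent15k_r1)
print("ENT 10K R10:", ent10k_r10)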

| Drive config | Avg. File Reads Per Sec. (RPS) | Single Drive Cost per RPS | Multi-Drive Cost per RPS | Single Drive Cost per GB | Cost per Usable (Protected) GB | Drive Cost (Multiple Drives) | Protection Overhead (RAID Space) | Cost per Usable GB per RPS | Avg. File Read Resp. (Sec.) |
| ENT 15K R1  | 580.7 | $1.02 | $2.05 | $0.99 | $0.99 | $1,190 | 100% | $2.1 | 107.9 |
| ENT 10K R1  | 455.5 | $1.92 | $3.84 | $0.49 | $0.49 | $1,750 | 100% | $3.8 | 135.5 |
| ENT CAP R1  | 285.5 | $1.40 | $2.80 | $0.20 | $0.20 | $798  | 100% | $2.8 | 271.9 |
| ENT 10K R10 | 690.9 | $1.27 | $5.07 | $0.49 | $0.97 | $3,500 | 100% | $5.1 | 87.2 |

Table-4 Performance, capacity and cost analysis for big file processing

Small File Size Processing

To simulate a general file sharing environment, or content streaming with many smaller objects, 1,638,400 16KB sized files were created on each device being tested (table-5). These files were spread across 64 directories (25,600 files each) and accessed via 64 threads (workers) doing 90% reads with a 1KB I/O size over a ten hour time frame. Like the large file test, and database activity, all workloads were run at the same time (e.g. test devices were concurrently busy).
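
A quick back-of-the-envelope check of that workload's footprint (straight arithmetic from the parameters above):

# Footprint of the small file workload described above
dirs = 64
files_per_dir = 25_600
file_kb = 16

total_files = dirs * files_per_dir              # 1,638,400 files
total_gb = total_files * file_kb / (1024 ** 2)  # KB -> GB
print(f"{total_files:,} files, about {total_gb:.0f} GB per device tested")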

| Drive config | Avg. File Read Rate | Avg. Read Resp. Time (Sec.) | Avg. File Write Rate | Avg. Write Resp. Time (Sec.) | Avg. CPU % Total | Avg. CPU % System | Avg. MBps Read | Avg. MBps Write |
| ENT 15K R1  | 3,415.7 | 1.5  | 379.4 | 132.2 | 24.9 | 19.5 | 3.3 | 0.4 |
| ENT 10K R1  | 2,203.4 | 2.9  | 244.7 | 172.8 | 24.7 | 19.3 | 2.2 | 0.2 |
| ENT CAP R1  | 1,063.1 | 12.7 | 118.1 | 303.3 | 24.6 | 19.2 | 1.1 | 0.1 |
| ENT 10K R10 | 4,590.5 | 0.7  | 509.9 | 101.7 | 27.7 | 22.1 | 4.5 | 0.5 |

Table-5 Performance summary for small sized (16KB) file access operations (90% read)

Figure-5 shows the relative performance of various HDD options handling small files; keep in mind that for the response line lower is better, while for the activity rate higher is better.

Figure-5 Small file processing 90% read, 10% write rate and response time

In figure-5 you can see the performance in terms of response time (reads larger dashed line, writes smaller dotted line) along with the number of file read operations per second (reads solid blue column bar, writes green column bar). Reminder that lower response times and higher activity rates are better. Performance declines moving from left to right, from 15K to 10K Enterprise Performance with enhanced cache feature to Enterprise Capacity (7.2K RPM), all of which were hardware RAID 1. Also shown is a hardware RAID 10 (four x 10K RPM HDDs) that has higher performance and capacity along with higher costs (table-5).

Results in figure-5 above and table-6 below show how various drives can be configured to balance their performance, capacity and costs to meet different needs. Table-6 below shows an analysis looking at average file reads per second (RPS) performance vs. HDD costs, usable capacity and protection level.

Table-6 is an example of looking at multiple metrics to make informed decisions as to which HDD would be best suited to your specific needs. For example RAID 10 using four 10K drives provides good performance and protection along with large usable space, however that also comes at a budget cost (e.g. price).

| Drive config | Avg. File Reads Per Sec. (RPS) | Single Drive Cost per RPS | Multi-Drive Cost per RPS | Single Drive Cost per GB | Cost per Usable (Protected) GB | Drive Cost (Multiple Drives) | Protection Overhead (RAID Space) | Cost per Usable GB per RPS | Avg. File Read Resp. (Sec.) |
| ENT 15K R1  | 3,415.7 | $0.17 | $0.35 | $0.99 | $0.99 | $1,190 | 100% | $0.35 | 1.51 |
| ENT 10K R1  | 2,203.4 | $0.40 | $0.79 | $0.49 | $0.49 | $1,750 | 100% | $0.79 | 2.90 |
| ENT CAP R1  | 1,063.1 | $0.38 | $0.75 | $0.20 | $0.20 | $798  | 100% | $0.75 | 12.70 |
| ENT 10K R10 | 4,590.5 | $0.19 | $0.76 | $0.49 | $0.97 | $3,500 | 100% | $0.76 | 0.70 |

Table-6 Performance, capacity and cost analysis for small file processing

Looking at the small file processing analysis in table-6 shows that the 15K HDDs, on an apples to apples basis (e.g. same RAID level and number of drives), provide the best performance. However when also factoring in space capacity, performance, different RAID levels or other protection schemes along with cost, there are other considerations. On the other hand, the Enterprise Capacity 2TB HDDs have a low cost per capacity, however they do not have the performance of the other options, should your applications need more performance.

Thus the right HDD for one application may not be the best one for a different scenario, and multiple metrics, as shown in table-6, need to be included in an informed storage decision making process.

Where To Learn More

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What This All Means

File processing is a common content application task, some of it involving small files, others large or mixed, as well as reads and writes. Even if your content environment is using object storage, chances are that unless it is a new application or a gateway exists, you may be using NAS or file based access. Thus, if your applications are doing file based processing, it is important to either run your own applications or use tools that can simulate as closely as possible what your environment is doing.

Continue reading part six in this multi-part series here where the focus is around general I/O including 8KB and 128KB sized IOPs along with associated metrics.

Ok, nuff said, for now.

Gs



Big Files Lots of Little File Processing Benchmarking with Vdbench

Updated 2/10/2018

Need to test a server, storage I/O networking, hardware, software, services, cloud, virtual, physical or other environment that is either doing some form of file processing, or that you simply want to have some extra workload running in the background for whatever reason? An option is file processing benchmarking with Vdbench.


Getting Started


Here’s a quick and relatively easy way to do it with Vdbench (free from Oracle). Granted there are other tools, both free and for fee, that can do similar things, however we will leave those for another day and post. Here’s the con to this approach: there is no GUI like what you have available with some other tools. Here’s the pro to this approach: it’s free, flexible and limited only by your creativity, amount of storage space, server memory and I/O capacity.

If you need a background on Vdbench and benchmarking, check out the series of related posts here (e.g. www.storageio.com/performance).

Get and Install the Vdbench Bits and Bytes


If you do not already have Vdbench installed, get a copy from the Oracle or SourceForge site (now points to Oracle here).

Vdbench is free; you simply sign up and accept the free license, then download the bits (it is a single, common distribution for all OSs) as well as the documentation.

Installation, particularly on Windows, is really easy: basically follow the instructions in the documentation by copying the contents of the download folder to a specified directory, set up any environment variables, and make sure that you have Java installed.

Here is a hint and tip for Windows Servers: if you get an error message about counters, open a command prompt with Administrator rights, and type the command:

$ lodctr /r


The above command will reset your I/O counters. Note however that the command will also overwrite counters if they are enabled, so only use it if you have to.

Likewise the *nix install is also easy: copy the files, make sure to copy the applicable *nix shell script (they are in the download folder), and verify Java is installed and working.

You can run vdbench -t (Windows) or ./vdbench -t (*nix) to verify that it is working.

Vdbench File Processing

There are many options with Vdbench as it has a very robust command and scripting language including ability to set up for loops among other things. We are only going to touch the surface here using its file processing capabilities. Likewise, Vdbench can run from a single server accessing multiple storage systems or file systems, as well as running from multiple servers to a single file system. For simplicity, we will stick with the basics in the following examples to exercise a local file system. The limits on the number of files and file size are limited by server memory and storage space.

You can specify the number and depth of directories to put files into for processing. One of the parameters is the anchor point for the file processing; in the following examples S:\SIOTEMP\FS1 is used as the anchor point. Other parameters include the I/O size, percent reads, number of threads, run time and sample interval as well as the output folder name for the result files. Note that unlike some tools, Vdbench does not create a single file of results, rather a folder with several files including summary, totals, parameters, histograms and CSV among others.
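
If you want to pull the interval averages out of those result files programmatically, here is a small sketch in Python. It assumes the results folder name used in the big file example later in this post; adjust to whatever you pass via -o.

# Sketch: scan a Vdbench output folder for "avg_" summary lines.
# Folder name is the example used later in this post; change as needed.
from pathlib import Path

results = Path("Results_FSbig_H_060615")
for report in sorted(results.glob("*.html")):   # Vdbench writes HTML report files
    for line in report.read_text(errors="ignore").splitlines():
        if "avg_" in line:                      # interval-average summary rows
            print(f"{report.name}: {line.strip()}")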


Simple Vdbench File Processing Commands

For flexibility and ease of use I put the following three Vdbench commands into a simple text file that is then called with parameters on the command line.
fsd=fsd1,anchor=!fanchor,depth=!dirdep,width=!dirwid,files=!numfiles,size=!filesize

fwd=fwd1,fsd=fsd1,rdpct=!filrdpct,xfersize=!fxfersize,fileselect=random,fileio=random,threads=!thrds

rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=!etime,interval=!itime

Simple Vdbench script

# SIO_vdbench_filesystest.txt
#
# Example Vdbench script for file processing
#
# fanchor = file system place where directories and files will be created
# dirwid = how wide should the directories be (e.g. how many directories wide)
# numfiles = how many files per directory
# filesize = size in k, m, g e.g. 16k = 16KBytes
# fxfersize = file I/O transfer size in kbytes
# thrds = how many threads or workers
# etime = how long to run in minutes (m) or hours (h)
# itime = interval sample time e.g. 30 seconds
# dirdep = how deep the directory tree
# filrdpct = percent of reads e.g. 90 = 90 percent reads
# -p processnumber = optionally specify a process number, only needed if running multiple vdbench instances at the same time, number should be unique
# -o = output folder that describes what is being done and some config info
#
# Sample command line shown for Windows, for *nix add ./
#
# The real Vdbench script with command line parameters indicated by !=
#

fsd=fsd1,anchor=!fanchor,depth=!dirdep,width=!dirwid,files=!numfiles,size=!filesize

fwd=fwd1,fsd=fsd1,rdpct=!filrdpct,xfersize=!fxfersize,fileselect=random,fileio=random,threads=!thrds

rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=!etime,interval=!itime

Big Files Processing Script


With the above script file defined, for Big Files I specify a command line such as the following.
$ vdbench -f SIO_vdbench_filesystest.txt fanchor=S:\SIOTemp\FS1 dirwid=1 numfiles=60 filesize=5G fxfersize=128k thrds=64 etime=10h itime=30 numdir=1 dirdep=1 filrdpct=90 -p 5576 -o SIOWS2012R220_NOFUZE_5Gx60_BigFiles_64TH_STX1200_020116

Big Files Processing Example Results


The following is one of the result files from the folder of results created via the above command for Big Files processing, showing totals.


Run totals

21:09:36.001 Starting RD=format_for_rd1

Feb 01, 2016 .Interval. .ReqstdOps.. ...cpu%... read ....read.... ...write.... ..mb/sec... mb/sec .xfer.. ...mkdir... ...rmdir... ..create... ...open.... ...close... ..delete...
rate resp total sys pct rate resp rate resp read write total size rate resp rate resp rate resp rate resp rate resp rate resp
21:23:34.101 avg_2-28 2848.2 2.70 8.8 8.32 0.0 0.0 0.00 2848.2 2.70 0.00 356.0 356.02 131071 0.0 0.00 0.0 0.00 0.1 109176 0.1 0.55 0.1 2006 0.0 0.00

21:23:35.009 Starting RD=rd1; elapsed=36000; fwdrate=max. For loops: None

07:23:35.000 avg_2-1200 4939.5 1.62 18.5 17.3 90.0 4445.8 1.79 493.7 0.07 555.7 61.72 617.44 131071 0.0 0.00 0.0 0.00 0.0 0.00 0.1 0.03 0.1 2.95 0.0 0.00


Lots of Little Files Processing Script


For lots of little files, the following is used.


$ vdbench -f SIO_vdbench_filesystest.txt fanchor=S:\SIOTEMP\FS1 dirwid=64 numfiles=25600 filesize=16k fxfersize=1k thrds=64 etime=10h itime=30 dirdep=1 filrdpct=90 -p 5576 -o SIOWS2012R220_NOFUZE_SmallFiles_64TH_STX1200_020116

Lots of Little Files Processing Example Results


The following is one of the result files from the folder of results created via the above command for Lots of Little Files processing, showing totals.
Run totals

09:17:38.001 Starting RD=format_for_rd1

Feb 02, 2016 .Interval. .ReqstdOps.. ...cpu%... read ....read.... ...write.... ..mb/sec... mb/sec .xfer.. ...mkdir... ...rmdir... ..create... ...open.... ...close... ..delete...
rate resp total sys pct rate resp rate resp read write total size rate resp rate resp rate resp rate resp rate resp rate resp
09:19:48.016 avg_2-5 10138 0.14 75.7 64.6 0.0 0.0 0.00 10138 0.14 0.00 158.4 158.42 16384 0.0 0.00 0.0 0.00 10138 0.65 10138 0.43 10138 0.05 0.0 0.00

09:19:49.000 Starting RD=rd1; elapsed=36000; fwdrate=max. For loops: None

19:19:49.001 avg_2-1200 113049 0.41 67.0 55.0 90.0 101747 0.19 11302 2.42 99.36 11.04 110.40 1023 0.0 0.00 0.0 0.00 0.0 0.00 7065 0.85 7065 1.60 0.0 0.00


Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What This All Means

The above examples can easily be modified to do different things, particularly if you read the Vdbench documentation on how to set up multi-host, multi-storage system, multiple job streams to do different types of processing. This means you can benchmark a storage system, server or converged and hyper-converged platform, or simply put a workload on it as part of other testing. There are even options for handling data footprint reduction such as compression and dedupe.
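
For example, a small wrapper script can sweep a parameter across runs. The following sketch assumes vdbench is on your PATH and reuses the parameter file and command line options shown earlier in this post, varying only the read percentage.

# Sketch: sweep the read percentage across several Vdbench runs,
# reusing SIO_vdbench_filesystest.txt from earlier in this post.
import subprocess

for rdpct in (50, 70, 90):
    cmd = [
        "vdbench", "-f", "SIO_vdbench_filesystest.txt",
        r"fanchor=S:\SIOTEMP\FS1", "dirwid=64", "numfiles=25600",
        "filesize=16k", "fxfersize=1k", "thrds=64",
        "etime=10h", "itime=30", "dirdep=1", f"filrdpct={rdpct}",
        "-o", f"Results_rdpct{rdpct}",
    ]
    subprocess.run(cmd, check=True)  # each run writes its own results folder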

Ok, nuff said, for now.

Gs



DataDynamics StorageX 7.0 file and data management migration software

Some of you may recall back in 2006 (here and here) when Brocade bought a file management storage startup called NuView whose product was StorageX, and then in 2009 issued end of life (EOL) notice letters that the solution was being discontinued.

Fast forward to 2013 and there is a new storage startup (DataDynamics) with an existing product that was just updated and re-released called StorageX 7.0.

Software Defined File Management – SDFM?

Granted, from an industry buzz focused adoption perspective you may not have heard of DataDynamics or perhaps even StorageX. However many customers around the world from different industry sectors have, and are using the solution.

The current industry buzz is around software defined data centers (SDDC), which has led to software defined networking (SDN), software defined storage (SDS), and other software defined marketing (SDM) terms, not to mention Valueware. So for those who like software defined marketing or software defined buzzwords, you can think of StorageX as software defined file management (SDFM), however don’t ask or blame them about using it as I just thought of it for them ;).

This is an example of industry adoption traction (what is being talked about) vs. industry deployment and customer adoption (what is actually in use on a revenue basis): DataDynamics is not a well-known company yet, however they have what many of the high-flying startups with industry adoption don’t have, which is an installed base of revenue customers that also now have a new version 7.0 product to deploy.

StorageX 7.0 enabling intelligent file and data migration management

Thus, a common theme is adding management, including automated data movement and migration, to bring structure around unstructured NAS file data. More than a data mover or storage migration tool, Data Dynamics StorageX is a software platform for adding storage management structure around unstructured local and distributed NAS file data. This includes heterogeneous vendor support across different storage systems, protocols and tools including Windows CIFS and Unix/Linux NFS.


A few months back prior to its release, I had an opportunity to test drive StorageX 7.0 and have included some of my comments in this industry trends perspective technology solution brief (PDF). This solution brief titled Data Dynamics StorageX 7.0 Intelligent Policy Based File Data Migration is a free download with no registration required (as are others found here), however per our disclosure policy to give transparency, DataDynamics has been a StorageIO client.

If you have a need for gaining insight and management control around your unstructured file data to support migrations for upgrades, technology refresh, archiving or tiering across different vendors including EMC and NetApp, check out DataDynamics StorageX 7.0, take it for a test drive like I did and tell them StorageIO sent you.

Ok, nuff said,

Cheers
Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

How can direct attached storage (DAS) make a comeback if it never left?


Have you seen or heard the theme that Direct Attached Storage (DAS), either dedicated or shared, internal or external is making a comeback?

Wait, if something did not go away, how can it make a comeback?

IMHO it is as simple as this: for the past decade or so, DAS has been overshadowed by shared networked storage including switched SAS, iSCSI, Fibre Channel (FC) and FC over Ethernet (FCoE) based block storage area networks (SAN) and file based (NFS and Windows SMB/CIFS) network attached storage (NAS) using IP and Ethernet networks. This has been particularly true of most of the independent storage vendors who have become focused on networked storage (SAN or NAS) solutions.

However some of the server vendors have also jumped into the deep end of the storage pool with their enthusiasm for networked storage, even though they still sell a lot of DAS including internal dedicated, along with external dedicated and shared storage.


The trend for DAS storage has evolved with the interfaces and storage mediums, including from parallel SCSI and IDE to SATA and more recently 3Gbs and 6Gbs SAS (with 12Gbs in first lab trials). Similarly the storage mediums include a mix of fast 10K and 15K hard disk drives (HDD) along with high-capacity HDDs and ultra-high performance solid state devices (SSD) moving from 3.5 to 2.5 inch form factors.

While there has been a lot of industry and vendor marketing effort around networked storage (e.g. SAN and NAS), DAS based storage was overshadowed, so it should not be a surprise that those focused on SAN and NAS are surprised to hear DAS is alive and well. Not only is DAS alive and well, it’s also becoming an important scaling and convergence topic for adding extra storage to appliances as well as servers, including those for scale out, big data, cloud and high density, not to mention high performance and high productivity computing.


Consequently it’s becoming ok to talk about DAS again. Granted you might get some peer pressure from your trend setting or trend following friends to get back on the networked storage bandwagon. Keep this in mind: take a look at some of the cool trend setting big data and little data (database) appliances, backup, dedupe and archive appliances, cloud and scale out NAS and object storage systems among others, and you will likely find DAS on the back-end. On a smaller scale, or in high-density rack deployments in large cloud or similar environments, you may also find DAS including switched shared SAS.

Does that mean SANs are dead?
No, not IMHO, despite what some vendor marketers and their followers will claim, which is ironic given how some of them were leading the DAS is dead campaign in favor of iSCSI or FC or NAS a few years ago. However simply comparing DAS to SAN or NAS in a competing way is like comparing apples to oranges; instead, look at how and where they can complement and enable each other. In other words, different tools for various tasks, various storage and interfaces for different needs.

Thus IMHO DAS never left or went anywhere per se, it just was not fashionable or cool to talk about until now, as it is cool and trendy to discuss it again.

Ok, nuff said for now.

Cheers Gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

SMB, SOHO and low end NAS gaining enterprise features

Here is a link to an interview that I did providing industry trends, perspectives and commentary on how Network Attached Storage (NAS), aka file and data sharing, for the Small Medium Business (SMB), Small Office Home Office (SOHO) and consumer or low end market is gaining features and functionality traditionally associated with larger enterprises, however without the large price. In addition, here is a link to some tips for small business NAS storage and to another perspective on how choosing an SMB NAS is getting easier (and here for comments on unified storage).

Click on the image below to listen to a pod cast that I did with comments and perspectives involving SMB, SOHO, ROBO and low end NAS.

Listen to comments by Greg Schulz of StorageIO on SMB, SOHO, ROBO and lowend NAS

If your favorite or preferred product or vendor was not mentioned in the above links, don’t worry; as with many media interviews there is a limited amount of time or narrow scope, so those mentioned were among others in the space.

Speaking of others, there are many others in the broad and diverse SMB, SOHO, ROBO and consumer NAS and unified storage space. For example there are QNAP, SMC, Huawei, Buffalo, Synology and StarWind among many others. There is a lot of diversity in this NAS space. You’ve got Buffalo Technology, Cisco, D-Link, Dell, Data Robotics Drobo, EMC Iomega, Hewlett-Packard (HP) Co. via Microsoft, Intel, Overland Storage Snap Server, Seagate BlackArmor, Western Digital Corp., and many others. Some of these vendors are household names that you would expect to see in the upper SMB, mid sized environments, and even into the enterprise.

For those who have other favorites or want to add another vendor to those already mentioned above, feel free to respond with a polite comment below. Oh and for disclosure, I bought my SMB or low end NAS from Amazon.com and it is an Iomega IX4.

Ok, nuff said for now.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

Dell Storage Forum 2011 revisited

About a month ago I was invited by Dell to make a quick trip down to Orlando to attend the Dell Storage Forum 2011 (e.g. twitter #dellsf11). Given that on Tuesday June 7th Minneapolis was having a heat wave with 100 degree (F) temperatures, it was actually cooler in Orlando.

Make no mistake however, there were plenty of technologies that were cool and being kept cool at the Hilton adjacent to Disney as Dell continues to expand their footprint into the hot data storage market. The event brought together three aspects of the Dell storage story, which were the merger of the recently acquired Compellent user group with the Dell EqualLogic user group along with the rest of the Dell storage and data management lineup. While the limelight was focused on Compellent and EqualLogic, the Dell disk Dudes (and Dudettes e.g. Gina Rosenthal aka twitter @gminks and Sheryl Koenigsberg aka twitter @storagediva) have been involved with storage for many years in addition to the recent acquisitions.

During the event I was invited to tag along with Roger Lund (twitter @rogerlund), an IT customer of Dell’s, and Ed Saipetch (twitter @edsai), a Dell partner, to go talk with the Dell NAS dudes (aka unified, clustered, grid, rain, big data, bulk, scale out NAS) team formerly known as Exanet. The team is a mix of Dell, former Exanet and new members who have been relatively quietly enhancing their technology in addition to creating packaged solution bundles with other Dell products such as the FS7500 (coupled with EqualLogic). For those not familiar with Exanet, have a read here or here, and for those not familiar with scale out NAS (aka bulk, grid, clustered, big data, etc.) have a read here.

There are lots of interesting things in the works or possible and the team that we spoke with are full of energy, ideas, support from management not to mention having some interesting technology tools to work with ranging from Ocarina (data footprint reduction aka DFR), Kace, Scalent, Powervault MD series, servers and micro servers, not to mention EqualLogic and Compellent among others including those from various partners.

NAS was not the only thing cool at the event; there was the Dell object storage solution (aka DX) based on Caringo CAS (Content Addressable Storage) OEM software technology that has been the Rx (prescription) for healthcare, medical and other archives. Keep in mind that Dell also earlier this year acquired InSite One, which just happens to be involved with healthcare and medical data or information management.

Speaking of archives and objects, there was also some activity this past week with Dell and Rainstor making an announcement of their joint solutions. Speaking of making sure that data on Dell storage remains available, accessible and protected, preserved and served, there were also backup/restore as well as many other pieces of technology, services and solutions. There was also a good presence by Dell partners at the event including Brocade, Commvault, Quantum and Symantec among others.

Here is a link to a video from when I was a guest with hosts Cali Lewis and John McArthur on the Wikibon/Silicon Angle The Cube show while at the Dell Event. During the discussion we had some fun as well as discussed not to be scared of clouds and virtualization, however look before you leap, doing your homework to be prepared along with other themes in my new book Cloud and Virtual Data Storage Networking (CRC Press).

Speaking of Dell, I had a nice conversation with Michael Dell during the storage beers tweet up. Did we talk about SMB or SOHO NAS, SSD, tape, HHDD, Brocade, block vs. file vs. object, data footprint reduction, big backup vs. big data, clouds, 3PAR, Equallogic vs. Compellent, HP vs. EMC?

Nope, we talked about the Dallas Mavericks (who went on to win the NBA title for 2011), social media and other items. If you have never met Michael Dell, he is one of the most relaxed, confident and approachable CEOs of any big or large company I have met.

In addition to visiting with Michael Dell, I also had the pleasure of meeting many other great people from Dell, their partners and others face to face including many twitter tweeps. All in all it was a great day and a half trip down to the Dell event, look forward to seeing and hearing more from Dell in the future.

Oh, and for disclosure purposes, Dell covered my RT coach class airfare while I picked up my own hotel, airport transfers, parking and incidentals.

Thanks again to Gina Rosenthal for making it all happen!

Ok, nuff said for now.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

twitter @storageio


What do NAS NASA NASCAR have in common?

Updated 2/10/2018

The other day it dawned on me: what do NAS, NASA and NASCAR have in common?

Several things, it turns out, in addition to all starting with the letters NAS.

For example, they all deal with round objects: NAS or Network Attached Storage is involved with circular spinning disk drives; NASA or the National Aeronautics and Space Administration, besides being involved with aircraft that have tires that go round and round, also deals with airplanes circling while waiting for landing.

In the case of NASA they are also involved with sending craft or devices to circle other planets or moons and land or crash into them. Sometimes NAS along with other storage systems have disk drives that crash, similar to how NASCAR events see accidents.

Cedar Lake dirt track 3M NASCAR night (Photo (C) 2008 Karen Schulz, all rights reserved)

NASCAR is also involved with vehicles that don’t, or at least should not, fly; however they do go round and round on a track, often paved however sometimes mud or dirt, plus high tech exists with computers and various data models, not to mention the NASCAR air force.

In addition to being involved with round objects and activities, all three are also involved in computing: generating, processing, storing and retrieving data for analysis, not to mention high performance requirements.

NAS based storage can also be relied upon for serving NASA and NASCAR data and informational needs.

And FWIW, just for fun, look at what you get when you spell NASCAR, NASA or NAS backwards:

RACSAN
ASAN
SAN

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What This All Means

Not much actually other than to stimulate some thought, discussion as well as perhaps have some fun with technology during the holiday season.

I’m sure if I put some more thought to it, more similarities would or will come to mind.

However, for now, that’s it for a quick thought. What similarities do you see or know about with NAS, NASA and NASCAR?

Ok, nuf fun for now, time to work on some other posts, content and projects.

Ok, nuff said, for now.

Gs



Clarifying Clustered Storage Confusion

Clustered storage can be iSCSI, Fibre Channel block based or NAS (NFS or CIFS or proprietary file system) file system based. Clustered storage can also be found in virtual tape library (VTL) including dedupe solutions along with other storage solutions such as those for archiving, cloud, medical or other specialized grids among others.

Recently in the IT and data storage specific industry, there has been a flurry of merger and acquisition (M&A) (here and here), new product enhancement or announcement activity around clustered storage. For example, HP bought clustered file system vendor IBRIX, complementing their previous acquisition of another clustered file system vendor (PolyServe) a few years ago, as well as of iSCSI block clustered storage software vendor LeftHand earlier this year. Another recent acquisition is that of LSI buying clustered NAS vendor ONStor, along with Dell buying iSCSI block clustered storage vendor EqualLogic about a year and a half ago, not to mention other vendor acquisitions or announcements involving storage and clustering.

Where the confusion enters into play is that the term cluster means many things to different people, and even more so when clustered storage is combined with NAS or file based storage. For example, clustered NAS may infer a clustered file system when in reality a solution may only be multiple NAS filers, NAS heads, controllers or storage processors configured for availability or failover.

What this means is that a NFS or CIFS file system may only be active on one node at a time, however in the event of a failover, the file system shifts from one NAS hardware device (e.g. NAS head or filer) to another. On the other hand, a clustered file system enables a NFS or CIFS or other file system to be active on multiple nodes (e.g. NAS heads, controllers, etc.) concurrently. The concurrent access may be for small random reads and writes for example supporting a popular website or file serving application, or, it may be for parallel reads or writes to a large sequential file.
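
As a toy illustration of that distinction (a sketch only, not any particular product's design):

# Toy model: failover NAS (one active node per file system) vs.
# clustered file system (all nodes active concurrently).

class FailoverNAS:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.active = self.nodes[0]        # file system lives on one node at a time
    def serve(self, request):
        return f"{self.active} handles {request}"
    def failover(self):
        self.active = self.nodes[1]        # file system shifts to the partner node

class ClusteredFS:
    def __init__(self, nodes):
        self.nodes = list(nodes)           # file system active on all nodes at once
    def serve(self, request, client):
        node = self.nodes[client % len(self.nodes)]  # any node can serve any client
        return f"{node} handles {request}"

nas = FailoverNAS(["head-a", "head-b"])
cfs = ClusteredFS(["node-1", "node-2", "node-3"])
print(nas.serve("read /share/file"))
print(cfs.serve("read /share/file", client=2))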

Clustered storage is no longer exclusive to the confines of high-performance sequential and parallel scientific computing or ultra large environments. Small files and I/O (read or write), including meta-data information, are also being supported by a new generation of multipurpose, flexible, clustered storage solutions that can be tailored to support different applications workloads.

There are many different types of clustered and bulk storage systems. Clustered storage solutions may be block (iSCSI or Fibre Channel), NAS or file serving, virtual tape library (VTL), or archiving and object- or content-addressable storage. Clustered storage in general is similar to using clustered servers, providing scale beyond the limits of a single traditional system: scale for performance, scale for availability, and scale for capacity, enabling growth in a modular fashion, adding performance and intelligence capabilities along with capacity.

For smaller environments, clustered storage enables modular pay-as-you-grow capabilities to address specific performance or capacity needs. For larger environments, clustered storage enables growth beyond the limits of a single storage system to meet performance, capacity, or availability needs.

Applications that lend themselves to clustered and bulk storage solutions include:

  • Unstructured data files, including spreadsheets, PDFs, slide decks, and other documents
  • Email systems, including Microsoft Exchange Personal (.PST) files stored on file servers
  • Users’ home directories and online file storage for documents and multimedia
  • Web-based managed service providers for online data storage, backup, and restore
  • Rich media data delivery, hosting, and social networking Internet sites
  • Media and entertainment creation, including animation rendering and post processing
  • High-performance databases such as Oracle with NFS direct I/O
  • Financial services and telecommunications, transportation, logistics, and manufacturing
  • Project-oriented development, simulation, and energy exploration
  • Low-cost, high-performance caching for transient and look-up or reference data
  • Real-time performance including fraud detection and electronic surveillance
  • Life sciences, chemical research, and computer-aided design

Clustered storage solutions go beyond meeting the basic requirements of supporting large sequential parallel or concurrent file access. Clustered storage systems can also support random access of small files for highly concurrent online and other applications. Scalable and flexible clustered file servers that leverage commonly deployed servers, networking, and storage technologies are well suited for new and emerging applications, including bulk storage of online unstructured data, cloud services, and multimedia, where extreme scaling of performance (IOPS or bandwidth), low latency, storage capacity, and flexibility at a low cost are needed.

The bandwidth-intensive and parallel-access performance characteristics associated with clustered storage are generally known; what is not so commonly known is the breakthrough to support small and random IOPS associated with database, email, general-purpose file serving, home directories, and meta-data look-up (Figure 1). Note that a clustered storage system, and in particular, a clustered NAS may or may not include a clustered file system.

Figure 1 – Generic clustered storage model (Courtesy “The Green and Virtual Data Center” (CRC))

More nodes, ports, memory, and disks do not guarantee more performance for applications. Performance depends on how those resources are deployed and how the storage management software enables those resources to avoid bottlenecks. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion when processing diverse application workloads. Other things to consider include support for industry-standard interfaces, protocols, and technologies.

Scalable and flexible clustered file server and storage systems provide the potential to leverage the inherent processing capabilities of constantly improving underlying hardware platforms. For example, software-based clustered storage systems that do not rely on proprietary hardware can be deployed on industry-standard high-density servers and blade centers and utilize third-party internal or external storage.

Clustered storage is no longer exclusive to niche applications or scientific and high-performance computing environments. Organizations of all sizes can benefit from ultra scalable, flexible, clustered NAS storage that supports application performance needs from small random I/O to meta-data lookup and large-stream sequential I/O that scales with stability to grow with business and application needs.

Additional considerations for clustered NAS storage solutions include the following.

  • Can memory, processors, and I/O devices be varied to meet application needs?
  • Is there support for large file systems supporting many small files as well as large files?
  • What is the performance for small random IOPS and bandwidth for large sequential I/O?
  • How is performance enabled across different applications in the same cluster instance?
  • Are I/O requests, including meta-data look-up, funneled through a single node?
  • How does a solution scale as the number of nodes and storage devices is increased?
  • How disruptive and time-consuming is adding new or replacing existing storage?
  • Is proprietary hardware needed, or can industry-standard servers and storage be used?
  • What data management features, including load balancing and data protection, exist?
  • What storage interface can be used: SAS, SATA, iSCSI, or Fibre Channel?
  • What types of storage devices are supported: SSD, SAS, Fibre Channel, or SATA disks?

As with most storage systems, it is not the total number of hard disk drives (HDDs), the quantity and speed of tiered-access I/O connectivity, the types and speeds of the processors, or even the amount of cache memory that determines performance. The performance differentiator is how a manufacturer combines the various components to create a solution that delivers a given level of performance with lower power consumption.

To avoid performance surprises, be leery of performance claims based solely on the speed and quantity of HDDs or the speed and number of ports, processors and memory. How the resources are deployed, and how the storage management software enables those resources to avoid bottlenecks, are more important.

Learn more about clustered storage (block, file, VTL/dedupe, archive), clustered NAS, clustered file system, grids and cloud storage among other topics in the following links:

"The Many faces of NAS – Which is appropriate for you?"

Article: Clarifying Storage Cluster Confusion
Presentation: Clustered Storage: “From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0”
Video Interview: How to Scale Data Storage Systems with Clustering
Guidelines for controlling clustering
The benefits of clustered storage

Along with other material on the StorageIO Tips and Tools or portfolio archive or events pages.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio


Summer Weddings: EMC+Datadomain and HP+IBRIX

Storage I/O trends

Are you friend or family of the bride or groom?

Here’s comes the bride! (Audio)

That’s a question me and Mrs. Schulz were asked recently when we attended a wedding.

Summer months particularly June and August are known as wedding months (Hmmm, more merger & acquisition activity to come?). Summer is a nice time of the year for marriages at least in the U.S. and how ironic that we have already seen two well publicized IT data storage industry unions in the past couple of weeks, not to mention other smaller less publicized ones.

In one case, the California based bride (Datadomain-DDUP) had two suitors (Massachusetts based EMC and California based NetApp, plus rumors of others). Fortunately one of those had a prenuptial that earned them a cool $57 million for their efforts (NetApp-NTAP) when EMC won the bride. Read more, including some of my comments and perspectives among others about EMC, NTAP and DDUP, here and here.

Yesterday, on a mid-July Friday, when things are normally quiet, in true wedding industry form, news was released (here and here) that California-based HP had bought Massachusetts-based data and storage management software vendor IBRIX.

That’s a lot of activity involving California and Massachusetts in the past couple of weeks, not to mention the tornado sightings in the vicinity of EMC’s Hopkinton, Massachusetts headquarters, coincidentally around the same time the marriage to DDUP was formally announced! What’s next? Aerosmith is out on tour; perhaps the Del Fuegos or Boston will perform at one of these wedding parties?

Within the data storage industry, publicly traded Datadomain (DDUP) is fairly well known to many for their role in helping to popularize the data footprint reduction technique referred to as de-duplication (e.g. normalization, commonality factoring, intelligent compression, etc.). Adding to the awareness of DDUP was the recent highly public courtship, with EMC eventually out-bidding NTAP with a dowry of about $2.1B USD. That type of press coverage and monetary amount might normally be expected for the likes of a Madonna, Britney Spears, Michael Jackson (RIP), Paris Hilton, Elizabeth Taylor or other celebrity unions covered by paparazzi, with a similar number of attorneys involved.

On the other hand, IBRIX, while known to some, is a lesser-known entity compared to DDUP, having taken a lower profile than even some of their close competitors. However, for those who have been following and covering the clustered storage market (see here, here, here, here, here, here, here, here and here), IBRIX is a well-known entity.

IBRIX also has had ties to EMC, having been involved in a pre-marriage affair via a reseller arrangement, along with being "rumored" ;) to have been involved with the ATMOS cloud, or policy-based, storage solution formerly known as "Hulk". IBRIX has also quietly been involved with others like Dell as well as HP in reseller arrangements similar to EMC’s. Where IBRIX has been positioned is to address high-performance, scale-out parallel or concurrent clustered file system needs, both big and small I/O, sequential and random data storage and access; for example, in media/entertainment and other industries, along with enabling large Internet providers with a bulk (low-cost, high-capacity) scale-out NAS (NFS and CIFS) option.

One of the reasons that IBRIX has been involved with the likes of EMC, Dell and HP among others is that, unlike solutions from vendors such as BlueArc, the once high-flying Isilon, NetApp, Onstor or Panasas, not to mention EMC Celerra NAS, which are all bundled with proprietary hardware, IBRIX is software based. Where IBRIX Fusion fits is in enabling NAS storage solutions using industry-standard hardware (servers and storage) that can be configured both for high performance computing (HPC) and for low-cost general purpose bulk storage to support Web 2.0, social networking, home directories or on-line archives.

Consequently, HP and Dell, who just happen to sell servers, have had the ability to meet large scale-out and scale-up NAS file serving needs by re-selling IBRIX installed on their servers or blade servers, with either their own entry to mid-range low-cost, high-performance, high-capacity storage or that of 3rd-party vendors.

Ironically, one of IBRIX’s competitors in the software NAS solution market was and remains PolyServe, software that HP acquired a couple of years ago to create their own scale-out NAS solution (e.g. EFS). Other software-based solutions include, among others, Lustre (Sun), CXFS (SGI), EMC ATMOS (I’m sure some will argue this is not scale out or NAS, will leave it at that for now) ;) not to mention those from IBM, Microsoft, Quantum (also re-sold by HP) or Symantec.

What does HP get with IBRIX?

Simple: the ability to own the IP (intellectual property) that one of their competitors had been "rumored" to have been working with at one point, IP that their competitors had been reselling just as HP had.

Thus HP gets more software IP that can be, and has been, sold along with their hardware such as ProLiant servers and blade servers, giving their customers choice, similar to what HP and other vendors do with their open servers. For example, HP had the ExDS9000 extreme storage system built on a blade server with high-density, low-cost, high-capacity HP storage (e.g. HP Modular Disk System 600, HP MSA or even EVA).

This makes for a nice solution for bulk on-line and near-line storage applications where the emphasis is not as much on performance as on massive scalability for storing on-line documents, archives, videos, images and other unstructured content, which is where there is a lot of growth activity. The challenge is that the ExDS9x00 has only been available with the HP PolyServe software, which works well for some environments, yet for others the clustered file system scale-out capabilities of IBRIX were deployed.

With the addition of IBRIX, HP now should be able to provide their customers and prospects the choice of software to meet specific needs while maintaining an HP footprint, that is, hardware, software and services. HP has several different storage software stacks that they now own (e.g. Lefthand for clustered iSCSI, PolyServe for NFS/CIFS NAS, IBRIX for clustered file system scale-out NAS), not to mention those that it OEMs, including among others Bycast (Medical Archive System), which is also OEM’d by IBM as their Medical Grid combined with IBM SOFS, Quantum StorNext, Microsoft Windows Storage Server and Sepaton (VTL and dedupe), to name a few.

Do I think this was a good move by HP?

Yes, as it gives them control over IP that they had been reselling, as had some of their competitors, who left IBRIX for HP to grab up. HP now has the IP, which they can package with their hardware similar to how they have been doing, giving customers choices to align the right hardware and software technology to the task at hand.

Whether it be Bycast for medical archiving, PolyServe or IBRIX for scale out NAS, Lefthand for clustered iSCSI, Sepaton for VTL and dedupe, Microsoft, Quantum StorNext for shared block storage serving or any of the other software packages HP offers with their industry standard servers, the customer has options.

For IBRIX customers and prospects, this move will give them a boost in confidence that their decisions and investments are safe.

Ironically, vendors like Symantec, with their Scalable File Serving (SFS) clustered NAS solution that is also software based and runs on anyone’s open servers including those from HP, get a potential shot in the arm with HP validating the model and approach for bulk storage and clustered NAS (Oh Mr. Salem, Mr. Dell is holding on line 1, Mr. Chambers is on line 2 and Mr. Ellison on line 3 ;) )

Who’s going to be at the altar next? IMHO, I would keep an eye on (and this is all just pure speculation) Bycast, Symantec, EMLX (Broadcom was a wake-up call), Quantum, Sepaton, STEC, StorMagic, or ACS, maybe even 3PAR, among other possibilities (think outside of the lines). I would not rule out a major game changer such as someone buying NetApp, or the likes of an HP buying an EMC, or Oracle buying a CSC, maybe even a CSCO buying someone like NTAP; how about Oracle buying NTAP and putting some attorneys out of work? Not to mention, who will MSFT hook up with? Anything is possible as we have seen, and traditional M&A wisdom is out the window.

Have fun at the next wedding you attend, go easy on the cake and wedding punch, especially if you will be doing any dancing (please, no YouTube videos of the chicken dance), and be careful throwing rice or other items.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

SPC and Storage Benchmarking Games

Storage I/O trends

There is a post over in one of the LinkedIn discussion forums about Storage Performance Council (SPC) benchmarks being misleading that I just did a short response post to. Here’s the full post, as LinkedIn has a short response length limit.

While the SPC is far from perfect, it is, at least for block storage, arguably better than doing nothing.

For the most part, SPC has become a de facto standard for at least block storage benchmarks, independent of using IOmeter or other tools or vendor-specific simulations, similar to how MSFT ESRP is for Exchange, TPC for database, SPEC for NFS and so forth. In fact, SPC even recently, rather quietly, rolled out a new set of what could be considered the basis for green storage benchmarks. I would argue that SPC results in themselves are not misleading, particularly if you take the time to look at both the executive and full disclosures and look beyond the summary.

Some vendors have taken advantage of the SPC results by playing games with discounting on prices (something that’s allowed under SPC rules) to make apples-to-oranges comparisons on cost per IOP, among other ploys. This practice is nothing new to the IT industry, or other industries for that matter; hence, benchmark games.

Where the misleading SPC issue can come into play is for those who simply look at what a vendor is claiming without looking at the rest of the story, or without taking the time to examine the results and make apples-to-apples comparisons instead of believing the apples-to-oranges comparison. After all, the results are there for a reason: so that those really interested can dig in and sift through the material, granted not everyone wants to do that.

For example, some vendors can show a highly discounted list price to get a better cost per IOP on an apples-to-oranges basis; however, when prices are normalized, the results can be quite different. Here’s the real gem for those who dig into the SPC results, including looking at the configurations: latency under workload is also reported.

The reason that latency is a gem is that generally speaking, latency does not lie.

What this means is that if vendor A doubles the amount of cache, doubles the number of controllers, doubles the number of disk drives, plays games with actual storage utilization (ASU), and utilizes fast interfaces from 10 GbE iSCSI to 8Gb FC or FCoE or SAS to get a better cost per IOP number with discounting, look at the latency numbers. There have been some recent examples of this where vendor A has a better cost per IOP while achieving a higher number of IOPS at a lower cost compared to vendor B, which is what is typically reported in a press release or news story. (See a blog entry that also points to a CMG presentation discussion around this topic here.)

Then go and look at the two results: vendor B may be at list price while vendor A is severely discounted, which is not a bad thing, as that then becomes the starting price from which customers should begin negotiations. However, to be fair, normalize the pricing for fun, look at how much more equipment vendor A may need while having to discount to offset the increased amount of hardware, and then look at latency.
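
Here is a small worked example of that normalization; all of the figures below are invented for illustration, not taken from any actual SPC submission.

# Hypothetical SPC-style figures, invented for illustration only; the real
# numbers live in each submission's executive summary and full disclosure.
vendors = {
    # vendor: (reported IOPS, list price in USD, discount applied)
    "Vendor A": (250000, 1500000, 0.45),  # heavy discount, more hardware
    "Vendor B": (200000, 1000000, 0.00),  # submitted at list price
}
for name, (iops, list_price, discount) in vendors.items():
    submitted = list_price * (1 - discount)
    print(f"{name}: ${submitted / iops:,.2f} per IOP as submitted, "
          f"${list_price / iops:,.2f} per IOP at list")
# As submitted, Vendor A looks cheaper per IOP ($3.30 vs. $5.00); normalized
# back to list price, the picture flips ($6.00 vs. $5.00).

Only after that normalization, plus a look at how much extra hardware the discount is offsetting, does the latency comparison complete the picture.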

In some of the recently reported record results, the latency results are actually better for vendor B than for vendor A. Why does latency matter? Beyond showing what a controller can actually do in terms of leveraging the number of disks, cache and interface ports and so forth, the big kicker is for those talking about SSD (RAM or FLASH), in that SSD generally is about latency. To fully and effectively utilize SSD, which is a low-latency device, you would want a controller that can do a decent job of handling IOPS; however, you also need a controller that can handle those IOPS with low latency under heavy workload conditions.
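
To see why latency under load is the number to watch, a bit of generic queuing arithmetic helps (Little’s Law, nothing specific to the SPC): the number of I/Os in flight equals IOPS multiplied by average latency.

# Little's Law: outstanding I/Os = IOPS x average latency (in seconds).
# Vendor names and latencies below are hypothetical, for illustration only.
def outstanding_ios(iops, latency_ms):
    return iops * (latency_ms / 1000.0)

for vendor, latency_ms in [("Vendor A", 8.0), ("Vendor B", 2.0)]:
    q = outstanding_ios(100000, latency_ms)
    print(f"{vendor}: {latency_ms} ms average latency -> ~{q:,.0f} I/Os in flight")
# Both sustain the same 100,000 IOPS headline, but Vendor A needs ~800 I/Os
# queued up to do it versus ~200 for Vendor B; a low-latency device such as
# SSD is wasted behind a controller that piles on queuing delay under load.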

Thus the SPC, again while far from perfect, is at least useful for a thumbnail sketch and comparison and is not necessarily misleading; more often than not it’s how the results are utilized that is misleading. Now, in the quest of the SPC administrators to gain more members and broader industry participation, and thus secure their own future, is the SPC organization or administration opening itself up to being used more and more as a marketing tool in ways that potentially compromise its credibility (I know, some will dispute the validity of SPC, however that’s reserved for a different discussion ;) )?

There is a bit of déjà vu here for those involved with RAID and storage who recall how the RAID Advisory Board (RAB), in its quest to gain broader industry adoption and support, succumbed to marketing pressures and use, or what some would describe as misuse, and is now a member of the "Where are they now?" club!

Don’t get me wrong here; I like the SPC tests/results/format, and there is a lot of good information in the SPC. The various vendor folks who work very hard behind the scenes to make the SPC actually work and continue to evolve it also all deserve a great big kudos, an "atta boy" or "atta girl", for the fine work they have been doing, work that I hope does not become lost in the quest to gain market adoption for the SPC.

Ok, so this should all beg the question of what is the best benchmark. Simple: the one that most closely resembles your actual applications, workload, conditions, configuration and environment.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Hot Storage Topics Converge on Chicago Next Week

Storage I/O trends

Next week in Chicago (May 12th) at Storage Strategies, the event for channel professionals held the evening before StorageDecisions, I will be talking about hot storage topics for 2008, including addressing data protection for virtual environments; power, cooling, floor space and environmental (PCFE), aka green, items and the "Green Gap"; data footprint reduction for both on-line active and changing data using real-time data compression; archiving for in-active or dormant data; and de-dupe for backup data. Also on the list of hot topics will be clustered NAS and clustered storage for Web 2.0, along with other timely and relevant items.

At the StorageDecisions event, I will be talking about "Green and Environmental Friendly Storage" Tuesday morning May 13th in the presentation "Practical Ways to Achieve Energy Efficiency – Power, Cooling, Floor-Space and Environmental (PCFE) Issues and Trends", looking at different issues including the "Green Gap", or disconnect between messaging and common IT data center issues, along with various options to boost efficiency for both active and in-active data and storage resources.

Also while at StorageDecisions next week, on Wednesday the 14th, I will be talking about clustered storage, including clustered NAS, in the session "Clustered Storage – From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0". Given some recent vendor technology announcements and statements of direction, Web 2.0 and unstructured data are gaining popularity, as are the confusing options or different types of clustered storage solutions, including "Cluster Wanna Bees". If you are in Chicago next week, stop in and check out the event, and if you can attend any of my sessions, stop by and say hello.

Cheers
GS

StorageIO Spring Keynote and Speaking tour V2.008

Several new keynote and speaking engagements involving myself have been added to the StorageIO events page including among others:

April 8th, 2008 – SNW Orlando FL
Beyond Green-Wash:
IT Data Center Power, Cooling, Floor Space and Environmental (PCFE) Topics and Trends V2.008

This talk will move past the issues and reasons for going green and get right to the point of what you can do today, leveraging various technologies, techniques and best practices to address PCFE and green environmental issues, including EHS, low power and economic sustainment in an environmentally friendly manner, as well as what to include in a long-term green strategy for your data center.

Chicago, May 13th-15th – StorageDecisions
Clustered Storage:
From SMB, to Scientific, to Social Networking and Web 2.0

The growth of structured and unstructured data continues at an explosive rate in most environments, resulting in a constantly expanding data footprint requiring data and storage management resources. Similarly, the relative ease of use of NFS and Windows CIFS file sharing based storage, also known as Network Attached Storage (NAS), has led to a proliferation of NAS and Windows file servers, not all that different from how the ease of use of personal computers (PCs) resulted in desktop and server sprawl. With the focus of many IT organizations today on doing more with less, or doing more with what you have, clustered storage and clustered file serving have become a popular option to support modular, scalable and flexible growth. Clustered storage, including clustered file serving, grid and Web 2.0 based storage solutions, is no longer confined to the specific high performance scientific applications with which it is commonly associated. Clustered storage serving is commonly being deployed to support a wide diversity of applications including commercial, entertainment or media, Web 2.0 and social networking, along with grid, cloud and traditional scientific needs.

This session takes a look at, among other topics:
• What different clustered storage vendors are claiming and how their solutions differ
• Fact vs. fiction, myths and realities of clustered storage:
  o Grid vs. clusters, clusters vs. grids: what’s the difference?
  o Clustered storage is only for ultra-large environments like Google
  o Clustered file serving is only for high performance (HPC) environments
  o SMBs and bulk storage applications cannot benefit from clustered storage
• What are the caveats to be aware of when deploying clustered storage?
• What are some emerging trends and solutions to keep an eye on for clustered storage?
• What are some questions that some vendors do not want you to ask about their solutions?

Green and Environmental Friendly Storage:
Practical Ways to Achieve Energy Efficiency

Green is in, and every storage vendor out there has a green story to tell. Despite the vendor and industry hyperbole about the environmental benefits of their products, there are still no standard metrics by which to measure and compare power consumption or energy efficiency claims. The challenge is sorting out and closing the gap between vendor green messaging and IT data center issues including power, cooling, floor space and other environmental topics such as RoHS and e-waste disposal. This session looks at several practical techniques and technologies that you can leverage today to achieve an energy-efficient data center and sustain business growth in an economically and ecologically friendly manner.

Topics that will be covered include among others:
• How truthful are vendor claims and what is "Green wash"?
• Facts and fiction, myths and realities:
  o Storage is cheaper to buy than to power
  o Power avoidance vs. energy efficiency
  o Are Solid State Devices (SSD) the silver bullet?
  o Dedupe vs. archive vs. compression vs. consolidation
• What’s real and achievable today, and what are your options?
• Measuring and determining energy efficiency with emerging metrics
• How to do more with what you have and avoid forklift upgrades
• Who is the "greenest of them all" and where to learn more

I will also be keynoting at several TechTarget seminar series events around the U.S.; details are on the StorageIO events page located here.

Cheers
GS