Cloud Archives

April 9, 2015 (updated March 7, 2022)

Cloud Conversations: AWS EFS Elastic File System (Cloud NAS) First Preview Look

Amazon Web Services (AWS) recently announced (in preview) the new Elastic File System (EFS), providing Network File System (NFS) NAS (Network Attached Storage) capabilities for AWS Elastic Compute Cloud (EC2) instances. EFS complements other AWS storage offerings including Simple Storage Service (S3) along with Elastic Block Store (EBS), Glacier and Relational Database Service (RDS) among others.

Ok, that’s a lot of buzzwords and acronyms, so let’s break this down a bit.

AWS EFS and Cloud Storage, Beyond Buzzword Bingo

EC2 – Compute instances that exist in various Availability Zones (AZ’s) in different AWS Regions. Instances run various operating systems including Windows and Ubuntu among others, and can also be pre-configured with applications such as SQL Server or web services. EC2 instances vary from low-cost to high-performance and can be optimized for compute, memory, GPU, storage or general-purpose use. For example, some EC2 instances rely solely on EBS, S3, RDS or other AWS storage offerings while others include on-board Solid State Disk (SSD), like the DAS SSD found on traditional servers. EC2 instances on EBS volumes can be snapshotted to S3 storage, which in turn can be replicated to another region.
EBS – Scalable block accessible storage for EC2 instances that can be configured for performance or bulk storage, as well as for persistent images for EC2 instances (if you choose to configure your instance to be persistent).
EFS – New file (aka NAS) storage service accessible from EC2 instances in various AZ’s in a given AWS region.
Glacier – Cloud based near-line (or by some comparisons off-line) cold-storage archives.
RDS – Relational Database Service for SQL and other data repositories.
S3 – Provides durable, scalable low-cost bulk (aka object) storage accessible from inside AWS as well as externally. S3 can be used by EC2 instances for bulk durable storage as well as being used as a target for EBS snapshots.
Learn more about EC2, EBS, S3, Glacier, Regions, AZ’s and other AWS topics in this primer here

What is EFS

EFS implements NFS V4 (SNIA NFS V4 primer), providing network attached storage (NAS), meaning data sharing. AWS is indicating initial pricing for EFS at $0.30 per GByte per month. EFS is designed for storage and data sharing from multiple EC2 instances in different AZ’s in the same AWS region, with scalability into the PBs.
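
To put the preview price in perspective, here is a minimal Python sketch that estimates a monthly bill from stored capacity. The function and constant names are made up for this illustration, the $0.30 figure is simply the preview pricing mentioned above (it may change), and any other charges are ignored.

```python
# Hypothetical helper for illustration only; not an AWS tool or API.
# The rate is the preview price quoted in the text and is subject to change.
EFS_PREVIEW_PRICE_PER_GBYTE_MONTH = 0.30  # USD


def efs_monthly_cost(stored_gbytes, price_per_gbyte=EFS_PREVIEW_PRICE_PER_GBYTE_MONTH):
    """Return the estimated monthly cost in USD for the stored capacity."""
    if stored_gbytes < 0:
        raise ValueError("stored_gbytes must be non-negative")
    return stored_gbytes * price_per_gbyte


if __name__ == "__main__":
    # Print estimates for a few example capacities.
    for gbytes in (100, 500, 1000):
        print(f"{gbytes} GBytes: ${efs_monthly_cost(gbytes):.2f} per month")
```

Swap in the current rate for your region once EFS pricing is finalized.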

What EFS is not

Currently it seems that EFS has an end-point inside AWS that, like EBS, is accessible only via EC2 instances, unlike S3 which can be accessed from the outside world as well as via EC2 instances.

Note however, that depending on how you configure your EC2 instance with different software, as well as configure a Virtual Private Cloud (VPC) and other settings, it is possible to have an application, software tool or operating system running on EC2 accessible from the outside world. For example, NAS software such as those from SoftNAS and NetApp among many others can be installed on an EC2 instance and with proper configuration, as well as being accessible to other EC2 instances, they can also be accessed from outside of AWS (with proper settings and security).

At this time AWS EFS is NFS version 4 based and does not support Windows SMB/CIFS, HDFS or other NAS access protocols. In addition, AWS EFS is accessible from multiple AZ’s within a region; to share NAS data across regions, some other software would be required.

As of this writing EFS is not yet released, and AWS is currently accepting requests to join the EFS preview here.

Where to learn more

Here are some links to learn more about AWS S3 and related topics

Cross-Region Replication for Amazon S3
Cloud conversations: If focused on cost you might miss other cloud storage benefits
Data Protection Diaries
Cloud Conversations: AWS overview and primer
Eight Ways to Avoid Cloud Storage Pricing Surprises
Cloud and Object Storage Center
Are more than five nines of availability really possible?
How do primary storage clouds and cloud for backup differ?
What’s most important to know about my cloud privacy policy?

What this all means and wrap-up

AWS continues to extend its cloud platform, including both compute and storage offerings. EFS complements EBS along with S3, Glacier and RDS. For many environments NFS support will be welcome, while for others CIFS/SMB would be appreciated, and others are starting to find value in HDFS accessible NAS.

Overall I like this announcement and look forward to moving beyond the preview.

Ok, nuff said, for now..

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

March 25, 2015 (updated March 7, 2022)

Cloud Conversations: AWS S3 Cross Region Replication storage enhancements

Amazon Web Services (AWS) recently announced, among other enhancements, new Simple Storage Service (S3) cross-region replication of objects from a bucket (e.g. container) in one region to a bucket in another region. AWS also recently enhanced Elastic Block Store (EBS), increasing the maximum performance and size of Provisioned IOPS (SSD) and General Purpose (SSD) volumes. EBS enhancements included the ability to store up to 16 TBytes of data in a single volume and do 20,000 input/output operations per second (IOPS). Read more about EBS and other recent AWS server, storage I/O and application enhancements here.

The Problem, Issue, Challenge, Opportunity and Need

The challenge is being able to move data (e.g. objects) stored in AWS buckets in one region to another in a safe, secure, timely, automated, cost-effective way.

Even though AWS has a global name-space, buckets and their objects (e.g. files, data, videos, images, bit and byte streams) are stored in a specific region designated by the customer or user (AWS S3, EBS, EC2, Glacier, Regions and Availability Zone primer can be found here).

Understanding the challenge and designing a strategy

The following diagram shows the challenge and how to copy or replicate objects in an S3 bucket in one region to a destination bucket in a different region. While objects can be copied or replicated without S3 cross-region replication, that essentially involves reading your objects, pulling that data out via the internet and then writing it to another place. The catch is that this can add extra costs, take time, consume network bandwidth and need extra tools (Cloudberry, Cyberduck, S3fuse, S3motion, S3browser, S3 tools (not AWS) and a long list of others).
(Figure: AWS cross-region replication)

What is AWS S3 Cross-region replication

Highlights of AWS S3 Cross-region replication include:

AWS S3 cross-region replication is, as its name implies, replication of S3 objects from a bucket in one region to a destination bucket in another region.
Replicates new objects added to an existing or new bucket (note that only new objects get replicated)
Policy based replication tied into S3 versioning and life-cycle rules
Quick and easy to set up for use in a matter of minutes via S3 dashboard or other interfaces
Keeps region to region data replication and movement within AWS networks (potential cost advantage)

To activate, you simply enable versioning on a bucket, enable cross-region replication, indicate the source bucket (or a prefix of objects in the bucket), specify the destination region and target bucket name (or create one), then create or select an IAM (Identity and Access Management) role, and objects should be replicated.
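
For those who prefer scripting to the dashboard, the same settings can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive setup: the helper name, bucket names, account number and role name are all placeholders invented for illustration. The dictionary follows the shape that the boto3 S3 client's put_bucket_replication call accepts, and the AWS calls themselves are left as comments so the sketch runs without credentials.

```python
# Sketch only: builds the replication settings as a plain dictionary.
# Bucket names, the account number and the role name are placeholders.


def build_replication_config(role_arn, destination_bucket, prefix=""):
    """Return a replication configuration for a single rule."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-new-objects",  # arbitrary rule name
                "Prefix": prefix,               # "" selects the whole bucket
                "Status": "Enabled",
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


config = build_replication_config(
    role_arn="arn:aws:iam::123456789012:role/example-replication-role",
    destination_bucket="example-destination-bucket",
    prefix="logs/",
)
print(config)

# With boto3 installed and credentials configured, applying it would look like:
#   import boto3
#   s3 = boto3.client("s3")
#   s3.put_bucket_versioning(Bucket="example-source-bucket",
#                            VersioningConfiguration={"Status": "Enabled"})
#   s3.put_bucket_replication(Bucket="example-source-bucket",
#                             ReplicationConfiguration=config)
```

As with the dashboard, test against a scratch bucket first and confirm what does and does not get replicated.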

Some AWS S3 cross-region replication things to keep in mind (e.g. considerations):
As with other forms of mirroring and replication, if you add something on one side it gets replicated to the other side
As with other forms of mirroring and replication, if you delete something from the other side it can be deleted on both (be careful and do some testing)
Keep costs in perspective as you still need to pay for your S3 storage at both locations as well as applicable internal data transfer and GET fees
Click here to see current AWS S3 fees for various regions

S3 Cross-region replication and alternative approaches

There are several regions around the world, and up until today AWS customers could copy, sync or replicate S3 bucket contents between AWS regions manually (or via automation) using various tools such as Cloudberry, Cyberduck, S3browser and S3motion to name just a few, as well as via various gateways and other technologies. Some of those tools and technologies are open-source or free, some are freemium and some are premium. They also vary by interface (some with a GUI, others with a CLI or APIs), and some include the ability to mount an S3 bucket as a local network drive and use tools to sync or copy.

However, a catch with the above-mentioned tools (among others) and approaches is that replicating your data (e.g. objects in a bucket) can involve other AWS S3 fees. For example, reading data (e.g. a GET, which has a fee) from one AWS region and then copying it out to the internet has fees. Likewise, when copying data into another AWS S3 region (e.g. a PUT, where inbound data transfer is free), there is also the cost of storage at the destination.
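
To make the fee comparison concrete, here is a small Python sketch of the arithmetic. Every rate in it is a made-up placeholder, not an actual AWS price, and the function names are hypothetical; plug in the current fees for your regions before drawing any conclusions.

```python
# All rates used below are made-up placeholders, NOT actual AWS prices.


def copy_via_internet_cost(gbytes, objects, request_fee_per_1000,
                           internet_out_per_gbyte, storage_per_gbyte):
    """Estimate the cost of reading objects out over the internet and
    writing them to a bucket in another region with an external tool."""
    requests = objects / 1000 * request_fee_per_1000  # GET requests on the source
    transfer = gbytes * internet_out_per_gbyte        # data transfer out to the internet
    storage = gbytes * storage_per_gbyte              # storage at the destination
    return requests + transfer + storage


def cross_region_replication_cost(gbytes, objects, request_fee_per_1000,
                                  inter_region_per_gbyte, storage_per_gbyte):
    """Estimate the cost when the data stays on AWS networks."""
    requests = objects / 1000 * request_fee_per_1000  # applicable request fees
    transfer = gbytes * inter_region_per_gbyte        # internal region-to-region transfer
    storage = gbytes * storage_per_gbyte              # storage at the destination
    return requests + transfer + storage


if __name__ == "__main__":
    external = copy_via_internet_cost(100, 10_000, 0.01, 0.10, 0.03)
    internal = cross_region_replication_cost(100, 10_000, 0.01, 0.02, 0.03)
    print(f"External tool: ${external:.2f}  Cross-region replication: ${internal:.2f}")
```

Either way you pay for storage at both locations; what differs is the transfer path and its fees.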

AWS S3 cross-region hands on experience (first look)

For my first hands on (first look) experience with AWS cross-region replication, today I enabled a bucket in the US Standard region (e.g. Northern Virginia) and created a new target destination bucket in EU Ireland. Setup and configuration were very quick, literally just a few minutes, with most of the time spent reading the text on the new AWS S3 dashboard properties configuration displays.

I selected an existing test bucket to replicate and noticed that nothing had replicated over to the other bucket, until I realized that only new objects would be replicated. Once some new objects were added to the source bucket, within a matter of moments (e.g. a few minutes) they appeared across the pond in my EU Ireland bucket. When I deleted those replicated objects from my EU Ireland bucket and switched back to my view of the source bucket in the US, those new objects were already deleted from the source. Yes, just like regular mirroring or replication, pay attention to how you have things configured (e.g. synchronized vs. contribute vs. echo of changes etc.).

While I was not able to do a solid, quantifiable performance test, based simply on some quick copies and my network speed, moving data via S3 cross-region replication was faster than using something like s3motion with my server in the middle.

It also appears from some initial testing today that a benefit of AWS S3 cross-region replication (besides being bundled and part of AWS) is that some fees to pull data out of AWS and transfer out via the internet can be avoided.

Where to learn more

Here are some links to learn more about AWS S3 and related topics

Cross-Region Replication for Amazon S3
Cloud conversations: If focused on cost you might miss other cloud storage benefits
Data Protection Diaries
Cloud Conversations: AWS overview and primer
Eight Ways to Avoid Cloud Storage Pricing Surprises
Cloud and Object Storage Center
Are more than five nines of availability really possible?
How do primary storage clouds and cloud for backup differ?
What’s most important to know about my cloud privacy policy?

What this all means and wrap-up

For those who are looking for a way to streamline replicating data (e.g. objects) from an AWS bucket in one region to a bucket in a different region, you now have a new option. There are potential cost savings, if that is your goal, along with performance benefits in addition to using whatever might be working in your environment. Replicating objects provides a way of expanding your business continuance (BC), business resiliency (BR) and disaster recovery (DR) involving S3 across regions as well as a means for content cache or distribution among other possible uses.

Overall, I like this ability for moving S3 objects within AWS; however, I will continue to use other tools such as S3motion and s3sfs for moving data in and out of AWS as well as among other public cloud services and local resources.

Ok, nuff said, for now..

Cheers gs

March 20, 2015 (updated October 18, 2024)

Data Protection Diaries: Are your restores ready for World Backup Day 2015?

This is part of an ongoing data protection diaries series of posts about, well, cloud and data protection and what I’m doing pertaining to World Backup Day 2015 along with related topics.

In case you forgot or did not know, World Backup Day is March 31 2015 (@worldbackupday), so now is a good time to be ready. World Backup Day (view their site here), which has gone on for a few years now, is a good way to call out the importance of backing up or protecting data. The only challenge that I have with it is that it’s time to also put more emphasis and focus on being able to make sure those backups or protection copies actually work.

By this I mean doing more than making sure that your data can be read from tape, disk, SSD or cloud service: going a step further and verifying that restored data can actually be used (read, written, etc.).

The Problem, Issue, Challenge, Opportunity and Need

The problem, issue and challenge are simple: are your applications, systems and data protected, and can you use those protection copies (e.g. backups, snapshots, replicas or archives) when as well as where needed?

The opportunity is simple: avoiding downtime or impact to your business or organization by being proactive.

Understanding the challenge and designing a strategy

The following is my preparation checklist for World Backup Day 2015 (e.g. March 31 2015), which includes what I need or want to protect, as well as some other things to be done including testing, verification and addressing (remediating or fixing) known issues while identifying other areas for future enhancements. Thus, perhaps like yours, data protection for my environment, which includes physical, virtual along with cloud, spanning servers to mobile devices, is constantly evolving.

(Figure: My data protection preparation, checklist and to do list)

Finding a solution

While I already have a strategy, plan and solution that encompasses different tools, technologies and techniques, they are also evolving. Part of the evolving is to improve while also exploring options to use new and old things in new ways, as well as to eat my own dog food, or walk the talk vs. talk the talk. The following figure provides a representation of my environment that spans physical, virtual and clouds (more than one) and how different applications along with systems are protected against various threats or risks. Key is that not all applications and data are the same, thus enabling them to be protected in different ways as well as over various intervals. Needless to say, there is more to how, when, where and with what different applications and systems are protected in my environment than shown; perhaps more on that in the future.

(Figure: Some of what my data protection involves for Server StorageIO)

Taking action

What I’m doing is going through my checklist to verify and confirm the various items on the checklist as well as find areas for improvement which is actually an ongoing process.

Do I find things that need to be corrected?

Yup, in fact I found something that, while it was not a problem, identified a way to improve a process. Once fully implemented, it will enable more flexibility both if a restoration is needed and for general everyday use, not to mention remove some complexity and cost.

Speaking of lessons learned, check this out that ties into why you want 4 3 2 1 based data protection strategies.

Where to learn more

Here are some extra links to have a look at:

Data Protection Diaries
Cloud conversations: If focused on cost you might miss other cloud storage benefits
5 Tips for Factoring Software into Disaster Recovery Plans
Remote office backup, archiving and disaster recovery for networking pros
Cloud conversations: Gaining cloud confidence from insights into AWS outages (Part II)
Given outages, are you concerned with the security of the cloud?
Data Archiving: Life Beyond Compliance
My copies were corrupted: The 3-2-1 rule
Take a 4-3-2-1 approach to backing up data
Cloud and Virtual Data Storage Networks – Chapter 8 (CRC/Taylor and Francis)

What this all means and wrap-up

Be prepared and be proactive when it comes to data protection and business resiliency vs. simply reacting and recovering, hoping that all will be ok (or works).

Take a few minutes (or longer) and test your data protection including backup to make sure that you can:

a) Verify that your protection copies are in fact working, protecting applications and data in the way expected

b) Restore data to an alternate place (to verify functionality as well as prevent a problem)

c) Actually use the data, meaning it is decrypted, inflated (un-compressed, un-deduped) and has security certificates along with ownership properties properly applied

d) Look at different versions or generations of protection copies if you need to go back further in time

e) Identify areas for improvement, or find and isolate problem issues in advance vs. finding out after the fact
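
As one way to act on items b) and c), the following Python sketch compares files restored to an alternate location against the originals using SHA-256 checksums. It is a minimal sketch under stated assumptions: the function names and directory layout are hypothetical, it only confirms that restored bytes match, and it does not check permissions, certificates or whether an application can actually open the data.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(source_dir, restored_dir):
    """Return (missing, mismatched) lists of relative file paths."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    missing, mismatched = [], []
    for source_file in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            missing.append(relative.as_posix())      # never made it into the restore
        elif file_digest(source_file) != file_digest(restored_file):
            mismatched.append(relative.as_posix())   # restored, but contents differ
    return missing, mismatched


if __name__ == "__main__":
    # Tiny self-contained demo using throwaway directories.
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        Path(src, "report.txt").write_text("quarterly numbers")
        Path(dst, "report.txt").write_text("quarterly numbers")
        Path(src, "notes.txt").write_text("not restored")
        print(compare_restore(src, dst))
```

Point it at a real restore target (never the production copy) and treat any missing or mismatched entry as something to investigate before you need the data.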

Time to get back to work checking and verifying things as well as attending to some other items.

Ok, nuff said, for now…

Cheers gs

March 7, 2015 (updated April 27, 2025)

How to test your HDD SSD or all flash array (AFA) storage fundamentals

How to test your HDD SSD AFA Hybrid or cloud storage

Updated 2/14/2018

Over at BizTech Magazine I have a new article, 4 Ways to Performance Test Your New HDD or SSD, that provides a quick guide to verifying or learning what speed characteristics your new storage device is capable of.

An out-take from the article used by BizTech as a "tease" is:

These four steps will help you evaluate new storage drives. And … psst … we included the metrics that matter.

Building off the basics, server storage I/O benchmark fundamentals

The four basic steps in the article are:

Plan what and how you are going to test (what’s applicable for you)
Decide on a benchmarking tool (learn about various tools here)
Test the test (find bugs, errors before a long running test)
Focus on metrics that matter (what’s important for your environment)

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

StorageIO Podcast: Kevin Closson discusses SLOB Server CPU I/O Database Performance benchmarks
@KevinClosson: SLOB Use Cases By Industry Vendors. Learn SLOB, Speak The Experts’ Language
Can we get a side of context with them IOPS and other storage metrics?
WHEN AND WHERE TO USE NAND FLASH SSD FOR VIRTUAL SERVERS
Revisiting RAID storage remains relevant and resources
NVMe overview and primer – Part I
Part 1 of HDD for content servers series Trends and Content Application Servers
Part 2 of HDD for content servers series Content application server decisions and testing plans
Part 3 of HDD for content servers series Test hardware and software configuration
Part 4 of HDD for content servers series Large file I/O processing
Part 5 of HDD for content servers series Small file I/O processing
Part 6 of HDD for content servers series General I/O processing
Part 7 of HDD for content servers series How HDD continue to evolve over different generations and wrap up
As the platters spin, HDD’s for cloud, virtual and traditional storage environments
How many IOPS can a HDD, HHDD or SSD do?
Hard Disk Drives (HDD) for Virtual Environments
Server and Storage I/O performance and benchmarking tools
Server storage I/O performance benchmark workload scripts Part I and Part II
How to test your HDD, SSD or all flash array (AFA) storage fundamentals
What is the best server storage I/O workload benchmark? It depends
I/O, I/O how well do you know about good or bad server and storage I/Os?
Big Files Lots of Little File Processing Benchmarking with Vdbench
Part II – NVMe overview and primer (Different Configurations)
Part III – NVMe overview and primer (Need for Performance Speed)
Part IV – NVMe overview and primer (Where and How to use NVMe)
Part V – NVMe overview and primer (Where to learn more, what this all means)
PCIe Server I/O Fundamentals
If NVMe is the answer, what are the questions?
NVMe Wont Replace Flash By Itself
Via Computerweekly – NVMe discussion: PCIe card vs U.2 and M.2
Intel and Micron unveil new 3D XPoint Non Volatie Memory (NVM) for servers and storage
Part II – Intel and Micron new 3D XPoint server and storage NVM
Part III – 3D XPoint new server storage memory from Intel and Micron
Server storage I/O benchmark tools, workload scripts and examples (Part I) and (Part II)
Data Infrastructure Overview, Its Whats Inside of Data Centers
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
www.objectstoragecenter.com and Software Defined, Cloud, Bulk and Object Storage Fundamentals
Server Storage I/O Network PCIe Fundamentals

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

To some, the above (read the full article here) may seem like common sense tips and things everybody should know. On the other hand, there are many people who are new to servers, storage, I/O networking, hardware, software, cloud and virtual along with various applications, not to mention different tools.

Thus the above is a refresher for some (e.g. deja vu), while for others it might be new and revolutionary or simply helpful. Interested in HDD’s, SSD’s as well as other server storage I/O performance along with benchmarking tools, techniques and trends? Check out the collection of links here (Server and Storage I/O Benchmarking and Performance Resources).

Ok, nuff said, for now.

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2026 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

February 1, 2015 (updated January 23, 2019)

Server Storage I/O Benchmark Tools: Microsoft Diskspd (Part I)

This is part one of a two-part post pertaining to Microsoft Diskspd that is also part of a broader series focused on server storage I/O benchmarking, performance, capacity planning, tools and related technologies. You can view part two of this post here, along with companion links here.

Background

Many people use Iometer for creating synthetic (artificial) workloads to support benchmarking for testing, validation and other activities. While Iometer with its GUI is relatively easy to use and available across many operating system (OS) environments, the tool also has its limits. One of the bigger limits for Iometer is that it has become dated, with little to no new development for a long time, while other tools including some new ones continue to evolve in functionality, along with extensibility. Some of these tools have an optional GUI for ease of use or configuration, while others simply have extensive scripting and command parameter capabilities. Many tools are supported across different OS including physical, virtual and cloud, while others such as Microsoft Diskspd are OS specific.

Instead of focusing on Iometer and other tools as well as benchmarking techniques (we cover those elsewhere), let’s focus on Microsoft Diskspd.

What is Microsoft Diskspd?

Microsoft Diskspd is a synthetic workload generation (e.g. benchmark) tool that runs on various Windows systems as an alternative to Iometer, vdbench, iozone, iorate, fio and sqlio among other tools. Diskspd is a command line tool, which means it can easily be scripted to do reads and writes of various I/O sizes including random as well as sequential activity. Server and storage I/O can be buffered file system as well as non-buffered across different types of storage and interfaces. Various performance and CPU usage information is provided to gauge the impact on a system when doing a given number of IOPS or amount of bandwidth, along with response time latency.

What can Diskspd do?

Microsoft Diskspd creates synthetic benchmark workload activity with the ability to define various options to simulate different application characteristics. This includes specifying reads and writes, random or sequential access, and I/O size along with the number of threads to simulate concurrent activity. Diskspd can be used for testing or validating server and storage I/O systems along with associated software, tools and components. In addition to being able to specify different workloads, Diskspd can also be told which processors to use (e.g. CPU affinity) and whether to use buffered or non-buffered I/O, among other things.
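
Because Diskspd is driven entirely by command line options, workloads are easy to script. The following Python sketch assembles a command line from a few workload characteristics. It is only a sketch: the helper name and the target path are hypothetical, the flags (-b, -d, -o, -t, -w, -r, -L, -c) are taken from Diskspd's own usage help, and the script only prints the command rather than running it.

```python
# Sketch only: assembles a Diskspd command line without executing it.
# The helper name and the example target path are placeholders.


def build_diskspd_args(target, block_size="8K", duration_s=30, outstanding=4,
                       threads=2, write_pct=0, random_io=True,
                       latency=True, file_size=None):
    """Return the Diskspd command as a list of arguments."""
    if not 0 <= write_pct <= 100:
        raise ValueError("write_pct must be between 0 and 100")
    args = [
        "diskspd",
        f"-b{block_size}",   # block (I/O) size
        f"-d{duration_s}",   # test duration in seconds
        f"-o{outstanding}",  # overlapped I/O requests per file per thread
        f"-t{threads}",      # threads per file
        f"-w{write_pct}",    # percentage of writes; 0 means 100% reads
    ]
    if random_io:
        args.append("-r")    # random I/O instead of the sequential default
    if latency:
        args.append("-L")    # measure latency statistics
    if file_size:
        args.append(f"-c{file_size}")  # create a test file of this size
    args.append(target)
    return args


if __name__ == "__main__":
    command = build_diskspd_args(r"C:\temp\diskspd-test.dat", file_size="1G")
    print(" ".join(command))
```

Keep write_pct at 0 until you are sure the target holds nothing you care about, since writes will destroy existing data without warning.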

What type of storage does Diskspd work with?

Physical and virtual storage including hard disk drive (HDD), solid state devices (SSD), solid state hybrid drives (SSHD) in various systems or solutions. Storage can be physical as well as partitions or file systems. As with any workload tool when doing writes, exercise caution to prevent accidental deletion or destruction of your data.

What information does Diskspd produce?

Diskspd provides output in text as well as XML formats. See an example of Diskspd output further down in this post.

Where to get Diskspd?

You can download your free copy of Diskspd from the Microsoft site here.

The download and installation are quick and easy, just remember to select the proper version for your Windows system and type of processor.

Another tip is to remember to set the path environment variable to point to where you put the Diskspd image.

Also, stating what should be obvious, if you are going to be doing any benchmark or workload generation activity on a system where there is the potential for data to be over-written or deleted, make sure you have a good backup and tested restore before you begin, in case something goes wrong.

New to server storage I/O benchmarking or tools?

If you are not familiar with server storage I/O performance benchmarking or using various workload generation tools (e.g. benchmark tools), Drew Robb (@robbdrew) has a Data Storage Benchmarking Guide article over at Enterprise Storage Forum that provides a good framework and summary quick guide to server storage I/O benchmarking.

Via Drew:

Data storage benchmarking can be quite esoteric in that vast complexity awaits anyone attempting to get to the heart of a particular benchmark.

Case in point: The Storage Networking Industry Association (SNIA) has developed the Emerald benchmark to measure power consumption. This invaluable benchmark has a vast amount of supporting literature. That so much could be written about one benchmark test tells you just how technical a subject this is. And in SNIA’s defense, it is creating a Quick Reference Guide for Emerald (coming soon).

But rather than getting into the nitty-gritty nuances of the tests, the purpose of this article is to provide a high-level overview of a few basic storage benchmarks, what value they might have and where you can find out more.

Read more here including some of my comments, tips and recommendations.

In addition to Drew’s benchmarking quick reference guide, see the server storage I/O benchmarking tools, technologies and techniques resource page (Server and Storage I/O Benchmarking 101 for Smarties).

How do you use Diskspd?

Tip: When you run Microsoft Diskspd it will create a file or data set on the device or volume being tested that it will do its I/O to, so make sure that you have enough disk space for what will be tested (e.g. if you are going to test 1TB you need to have more than 1TB of disk space free for use). Another tip: to speed up the initializing (e.g. when Diskspd creates the file that I/Os will be done to), run as administrator.
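
The free space check is easy to automate before kicking off a long run. Here is a minimal Python sketch using the standard library's shutil.disk_usage; the function name and the 10% headroom are arbitrary choices for this illustration, not Diskspd requirements.

```python
import shutil


def has_room_for_test(volume_path, test_bytes, headroom=0.10):
    """Return True if free space exceeds the test size plus a safety margin."""
    free_bytes = shutil.disk_usage(volume_path).free
    return free_bytes > test_bytes * (1 + headroom)


if __name__ == "__main__":
    one_tbyte = 1024 ** 4
    # "." checks the volume holding the current directory; use the test volume instead.
    if has_room_for_test(".", one_tbyte):
        print("Enough free space for a 1TB test file")
    else:
        print("Not enough free space for a 1TB test file")
```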

Tip: In case you forgot, a couple of other useful Microsoft tools (besides Perfmon) for working with and displaying server storage I/O devices including disks (HDD and SSDs) are the commands "wmic diskdrive list [brief]" and "diskpart". With diskpart exercise caution as it can get you in trouble just as fast as it can get you out of trouble.

After installing the tool, you can view the Diskspd commands by typing the following at a Windows command prompt:

C:\Users\Username> Diskspd

The above command will display Diskspd help and information about the commands as follows.

Usage: diskspd [options] target1 [ target2 [ target3 …] ]

version 2.0.12 (2014/09/17)
Available targets:

       file_path

       #
       :
Available options:

-?	display usage information
-a#[,#[…]]	advanced CPU affinity – affinitize threads to CPUs provided after -a in a round-robin manner within current KGroup (CPU count starts with 0); the same CPU can be listed more than once and the number of CPUs can be different than the number of files or threads (cannot be used with -n)
-ag	group affinity – affinitize threads in a round-robin manner across KGroups
-b<size>[K|M|G]	block size in bytes/KB/MB/GB [default=64K]
-B<offs>[K|M|G|b]	base file offset in bytes/KB/MB/GB/blocks [default=0] (offset from the beginning of the file)
-c<size>[K|M|G|b]	create files of the given size. Size can be stated in bytes/KB/MB/GB/blocks
-C<seconds>	cool down time – duration of the test after measurements finished [default=0s].
-D<milliseconds>	Print IOPS standard deviations. The deviations are calculated for samples of duration <milliseconds>. <milliseconds> is given in milliseconds and the default value is 1000.
-d<seconds>	duration (in seconds) to run test [default=10s]
-f<size>[K|M|G|b]	file size – this parameter can be used to use only the part of the file/disk/partition for example to test only the first sectors of disk
-fr	open file with the FILE_FLAG_RANDOM_ACCESS hint
-fs	open file with the FILE_FLAG_SEQUENTIAL_SCAN hint
-F<count>	total number of threads (cannot be used with -t)
-g<bytes per ms>	throughput per thread is throttled to given bytes per millisecond note that this can not be specified when using completion routines
-h	disable both software and hardware caching
-i<count>	number of IOs (burst size) before thinking. must be specified with -j
-j<milliseconds>	time to think in ms before issuing a burst of IOs (burst size). must be specified with -i
-I<priority>	Set IO priority to <priority>. Available values are: 1-very low, 2-low, 3-normal (default)
-l	Use large pages for IO buffers
-L	measure latency statistics
-n	disable affinity (cannot be used with -a)
-o<count>	number of overlapped I/O requests per file per thread (1=synchronous I/O, unless more than 1 thread is specified with -F) [default=2]
-p	start async (overlapped) I/O operations with the same offset (makes sense only with -o2 or grater)
-P<count>	enable printing a progress dot after each <count> completed I/O operations (counted separately by each thread) [default count=65536]
-r<align>[K|M|G|b]	random I/O aligned to <align> bytes (doesn’t make sense with -s). <align> can be stated in bytes/KB/MB/GB/blocks [default access=sequential, default alignment=block size]
-R<text|xml>	output format. Default is text.
-s<size>[K|M|G|b]	stride size (offset between starting positions of subsequent I/O operations)
-S	disable OS caching
-t<count>	number of threads per file (cannot be used with -F)
-T<offs>[K|M|G|b]	stride between I/O operations performed on the same file by different threads [default=0] (starting offset = base file offset + (thread number * <offs>) it makes sense only with -t or -F
-v	verbose mode
-w<percentage>	percentage of write requests (-w and -w0 are equivalent). absence of this switch indicates 100% reads IMPORTANT: Your data will be destroyed without a warning
-W<seconds>	warm up time – duration of the test before measurements start [default=5s].
-x	use completion routines instead of I/O Completion Ports
-X<path>	use an XML file for configuring the workload. Cannot be used with other parameters.
-z	set random seed [default=0 if parameter not provided, GetTickCount() if value not provided]

Write buffers command options
-Z	zero buffers used for write tests
-Z<size>[K|M|G|b]	use a global <size> buffer filled with random data as a source for write operations.
-Z<size>[K|M|G|b],<file>	use a global <size> buffer filled with data from <file> as a source for write operations. If <file> is smaller than <size>, its content will be repeated multiple times in the buffer.
By default, the write buffers are filled with a repeating pattern (0, 1, 2, …, 255, 0, 1, …)

Synchronization command options
-ys<eventname>	signals event <eventname> before starting the actual run (no warmup) (creates a notification event if <eventname> does not exist)
-yf<eventname>	signals event <eventname> after the actual run finishes (no cooldown) (creates a notification event if <eventname> does not exist)
-yr<eventname>	waits on event <eventname> before starting the run (including warmup) (creates a notification event if <eventname> does not exist)
-yp<eventname>	allows to stop the run when event <eventname> is set; it also binds CTRL+C to this event (creates a notification event if <eventname> does not exist)
-ye<eventname>	sets event <eventname> and quits

Event Tracing command options
-ep	use paged memory for NT Kernel Logger (by default it uses non-paged memory)
-eq	use perf timer
-es	use system timer (default)
-ec	use cycle count
-ePROCESS	process start & end
-eTHREAD	thread start & end
-eIMAGE_LOAD	image load
-eDISK_IO	physical disk IO
-eMEMORY_PAGE_FAULTS	all page faults
-eMEMORY_HARD_FAULTS	hard faults only
-eNETWORK	TCP/IP, UDP/IP send & receive
-eREGISTRY	registry calls

Examples:

Create 8192KB file and run read test on it for 1 second:

diskspd -c8192K -d1 testfile.dat

Set block size to 4KB, create 2 threads per file, 32 overlapped (outstanding)
I/O operations per thread, disable all caching mechanisms and run block-aligned random
access read test lasting 10 seconds:

diskspd -b4K -t2 -r -o32 -d10 -h testfile.dat

Create two 1GB files, set block size to 4KB, create 2 threads per file, affinitize threads
to CPUs 0 and 1 (each file will have threads affinitized to both CPUs) and run read test
lasting 10 seconds:

diskspd -c1G -b4K -t2 -d10 -a0,1 testfile1.dat testfile2.dat
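The size arguments in the help text above (-b, -c, -f and so on) take a number with an optional K, M, G or b suffix. As a rough sketch of how such a value maps to bytes, the following Python helper may be useful when scripting workloads. Note that parse_size is a hypothetical name and not part of Diskspd, and the sketch assumes binary multiples (1K = 1024 bytes) and that a lowercase b means a count of blocks.

```python
def parse_size(text, block_size=64 * 1024):
    """Convert a Diskspd-style size such as '8192K' or '1G' to bytes.

    Assumptions (not taken from the Diskspd source): K/M/G are binary
    multiples and are accepted in either case; a trailing lowercase 'b'
    means "blocks" of block_size bytes (64K is the documented -b default).
    """
    multipliers = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = text[-1]
    if suffix == "b":
        return int(text[:-1]) * block_size
    if suffix.upper() in multipliers:
        return int(text[:-1]) * multipliers[suffix.upper()]
    return int(text)  # no suffix: the value is already in bytes


if __name__ == "__main__":
    for value in ("8192K", "4K", "1G", "45g"):
        print(value, "=", parse_size(value), "bytes")
```

For example, the 8192K used in the first example above works out to 8,388,608 bytes under these assumptions.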

Where to learn more

The following are related links to read more about server (cloud, virtual and physical) storage I/O benchmarking tools, technologies and techniques.
resource page

Server and Storage I/O Benchmarking 101 for Smarties.

Microsoft Diskspd download and Microsoft Diskspd overview (via Technet)

I/O, I/O how well do you know about good or bad server and storage I/Os?

Server and Storage I/O Benchmark Tools: Microsoft Diskspd (Part I and Part II)

Wrap up and summary, for now…

This wraps up part-one of this two-part post taking a look at Microsoft Diskspd benchmark and workload generation tool. In part-two (here) of this post series we take a closer look including a test drive using Microsoft Diskspd.

Ok, nuff said (for now)

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)

twitter @storageio

February 1, 2015 | November 3, 2024

Microsoft Diskspd (Part II): Server Storage I/O Benchmark Tools

server storage I/O trends

This is part-two of a two-part post about Microsoft Diskspd, which is also part of a broader series focused on server storage I/O benchmarking, performance, capacity planning, tools and related technologies. You can view part-one of this post here, along with companion links here.

Microsoft Diskspd StorageIO lab test drive

Server and StorageIO lab

Talking about tools and technologies is one thing; installing and trying them is the next step for gaining experience. So how about some quick hands-on time with Microsoft Diskspd (download your copy here)?

The following commands all specify an I/O size of 8Kbytes doing I/O to a 45GByte file called diskspd.dat located on the F: drive. Note that a 45GByte file is on the small side for general performance testing, however it was used for simplicity in this example. Ideally a larger target storage area (file, partition, device) would be used. On the other hand, if your application uses a small storage device or volume, then tune accordingly.

In this test, the F: drive is an iSCSI RAID protected volume, however you could use other storage interfaces supported by Windows including other block DAS or SAN (e.g. SATA, SAS, USB, iSCSI, FC, FCoE, etc) as well as NAS. Also common to the following commands is using 16 threads and 32 outstanding I/Os to simulate concurrent activity of many users, or application processing threads.

Other parameters common to the following commands are -r for random I/O, a test duration of 7200 seconds (i.e. two hours, -d7200), latency reporting (-L), disabling hardware and software caching (-h) and forcing CPU affinity (-a0,1,2,3). Since the test ran on a server with four cores I wanted to see if I could use those for helping to keep the threads and storage busy. What varies in the commands below is the percentage of reads vs. writes, as well as the results output file. Some of the workloads below also had the -S option specified to disable OS I/O buffering (to view how buffering helps when enabled or disabled). Depending on the goal, or type of test, validation, or workload being run, I would choose to set some of these parameters differently.

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -w0 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noh_write000.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -w50 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noh_write050.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -w100 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noh_write100.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -S -w0 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noSh_test_write000.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -S -w50 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noSh_write050.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -S -w100 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noSh_write100.txt

The following is the output from the above workload command.
Microsoft Diskspd sample output

Note that as with any benchmark, workload test or simulation your results will vary. In the above the server, storage and I/O system were not tuned as the focus was on working with the tool and determining its capabilities. Thus do not focus on the performance results per se, rather on what you can do with Diskspd as a tool to try different things. Btw, fwiw, in the above example in addition to using an iSCSI target, the Windows 2012 R2 server was a guest on a VMware ESXi 5.5 system.
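As a quick sanity check on what the test drive parameters imply, here is a small Python sketch of the arithmetic. The function name workload_summary is made up for illustration, and it assumes that -c45g means 45 x 1024^3 bytes and that -o32 applies per thread (as the help text describes).

```python
def workload_summary(threads=16, outstanding=32, block_bytes=8 * 1024,
                     file_bytes=45 * 1024 ** 3, duration_seconds=7200):
    """Derive a few numbers from the -t, -o, -b, -c and -d values used above."""
    return {
        # -t16 threads, each with -o32 overlapped I/Os against the one file
        "outstanding_ios": threads * outstanding,
        # how many distinct 8K blocks the random I/O can land on
        "blocks_in_file": file_bytes // block_bytes,
        # -d7200 expressed in hours
        "duration_hours": duration_seconds / 3600,
    }


if __name__ == "__main__":
    print(workload_summary())
    # {'outstanding_ios': 512, 'blocks_in_file': 5898240, 'duration_hours': 2.0}
```

In other words, each command keeps up to 512 I/Os in flight against roughly 5.9 million 8K blocks for two hours.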

Where to learn more

The following are related links to read more about server (cloud, virtual and physical) storage I/O benchmarking tools, technologies and techniques.

Drew Robb’s benchmarking quick reference guide
Server storage I/O benchmarking tools, technologies and techniques resource page
Server and Storage I/O Benchmarking 101 for Smarties.
Microsoft Diskspd download and Microsoft Diskspd overview (via Technet)
I/O, I/O how well do you know about good or bad server and storage I/Os?
Server and Storage I/O Benchmark Tools: Microsoft Diskspd (Part I and Part II)

Comments and wrap-up

What I like about Diskspd (Pros)

Reporting including CPU usage (you can’t do server and storage I/O without CPU) along with IOP’s (activity), bandwidth (throughput or amount of data being moved), per thread and total results along with optional reporting. While a GUI would be nice, particularly for beginners, I’m used to setting up scripts for different workloads so having extensive options for setting up different workloads is welcome. Being associated with a specific OS (e.g. Windows) the CPU affinity and buffer management controls will be handy for some projects.

The flexibility to use different storage interfaces and types of storage, including files or partitions, should be taken for granted; however, with some tools you cannot take such things for granted. I like the flexibility to easily specify various IO sizes including large 1MByte, 10MByte, 20MByte, 100MByte and 500MByte to simulate application workloads that do large sequential (or random) activity. I tried some IO sizes larger than 500MB (specified by the -b parameter); however, I received various errors including "Could not allocate a buffer bytes for target", which means that Diskspd can do IO sizes up to, but not beyond, that. While not able to do IO sizes larger than 500MB, this is actually impressive. Several other tools I have used or worked with have IO size limits down around 10MByte, which makes it difficult to create workloads that do large IOP’s (note this is the IOP size, not the number of IOP’s).

Oh, something else that should be obvious, however I will state it: Diskspd is free, unlike some industry de-facto standard tools or workload generators that need a fee to get and use.

Where Diskspd could be improved (Cons)

For some users a GUI or configuration wizard would make the tool easier to get started with; on the other hand, I tend to use the command-line capabilities of tools. It would also be nice to specify ranges as part of a single command, such as stepping through an IO size range (e.g. 4K, 8K, 16K, 1MB, 10MB) as well as read/write percentages along with varying random/sequential mixes. Granted this can easily be done by having a series of commands, however I have become spoiled by using other tools such as vdbench.
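The "series of commands" workaround can itself be scripted. The following is a minimal Python sketch, not something from the Diskspd documentation, that prints one command line per combination of IO size and write percentage. It reuses the flags from the test drive commands earlier in this post; the function name build_commands and the log file naming scheme are assumptions made up for this example.

```python
from itertools import product


def build_commands(block_sizes, write_percentages, target=r"F:\diskspd.dat"):
    """Return one diskspd command line per (block size, write %) combination."""
    commands = []
    for block, write_pct in product(block_sizes, write_percentages):
        # Hypothetical log name; pick whatever convention suits your lab.
        log_name = f"diskspd_b{block}_w{write_pct:03d}.txt"
        commands.append(
            f"diskspd -c45g -b{block} -t16 -o32 -r -d7200 -h "
            f"-w{write_pct} -L -a0,1,2,3 {target} >> {log_name}"
        )
    return commands


if __name__ == "__main__":
    # IO size range from the paragraph above, written with Diskspd's K/M suffixes.
    for line in build_commands(["4K", "8K", "16K", "1M", "10M"], [0, 50, 100]):
        print(line)
```

The output could be saved to a batch file and run unattended. Keep in mind that with -d7200 each of the 15 generated commands runs for two hours, so adjust the duration to fit your test window.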

Summary

Overall I like Diskspd and have added it to my Server Storage I/O workload and benchmark tool-box.

Keep in mind that the best benchmark or workload generation technology tool will be your own application(s) configured to run as close as possible to production activity levels.

However, when that is not possible, an alternative is to use tools that have the flexibility to be configured as close as possible to your application(s) workload characteristics. This means that the focus should be less on the tool itself and more on how flexibly the tool can work for you, granted the tool needs to be robust.

Having said that, Microsoft Diskspd is a good and extensible tool for benchmarking, simulation, validation and comparisons, however it will only be as good as the parameters and configuration you set it up to use.

Check out Microsoft Diskspd and add it to your benchmark and server storage I/O tool-box like I have done.

Ok, nuff said (for now)

Cheers gs


February 1, 2015 | April 27, 2025

I/O, I/O how well do you know good bad ugly server storage I/O iops?

How well do you know good bad ugly I/O iops?

Updated 2/10/2018

There are many different types of server storage I/O iops associated with various environments, applications and workloads. Some I/O activity is measured in IOPS, others in transactions per second (TPS), files or messages per time (hour, minute, second), gets, puts or other operations. The best IO is one you do not have to do.

What about all the cloud, virtual, software defined and legacy based applications that still need to do I/O?

If no IO operation is the best IO, then the second best IO is the one that can be done as close to the application and processor as possible with the best locality of reference.

Also keep in mind that aggregation (e.g. consolidation) can cause aggravation (server storage I/O performance bottlenecks).

Example of aggregation (consolidation) causing aggravation (server storage i/o blender bottlenecks)

And the third best?

It’s the one that can be done in less time, or at the least cost or effect to the requesting application, which means moving further down the memory and storage stack.

Leveraging flash SSD and cache technologies to find and fix server storage I/O bottlenecks

On the other hand, any IOP, regardless of whether it is for block, file or object storage, that involves some context is better than one without, particularly when it involves metrics that matter (here, here and here [webinar]).

Server Storage I/O optimization and effectiveness

The problem with IO’s is that they are basic operations to get data into and out of a computer or processor, so there’s no way to avoid all of them, unless you have a very large budget. Even if you have a large budget that can afford an all flash SSD solution, you may still meet bottlenecks or other barriers.

IO’s require CPU or processor time and memory to set up and then process the results, as well as IO and networking resources to move data to their destination or retrieve them from where they are stored. While IO’s cannot be eliminated, their impact can be greatly improved or optimized by, among other techniques, doing fewer of them via caching and by grouping reads or writes (pre-fetch, write-behind).
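To make "doing fewer of them via caching" concrete, here is a small Python sketch that counts how many reads actually reach the backing storage when a least-recently-used (LRU) read cache sits in front of it. This is an illustrative toy under assumed conditions (a made-up request pattern and whole-block caching), not a model of any particular product.

```python
from collections import OrderedDict


def backend_reads(requests, cache_blocks):
    """Count reads that miss an LRU cache holding up to cache_blocks blocks."""
    cache = OrderedDict()  # block number -> True, oldest entry first
    misses = 0
    for block in requests:
        if block in cache:
            cache.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1                    # miss: this read goes to storage
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


if __name__ == "__main__":
    pattern = [1, 2, 1, 2, 3, 1, 2, 3, 1, 2]  # hypothetical block requests
    print("no cache:     ", backend_reads(pattern, 0))  # 10
    print("3-block cache:", backend_reads(pattern, 3))  # 3
```

With good locality of reference, ten application reads turn into three backend reads; with a poor match between cache size and access pattern, the savings can shrink to little or nothing.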

Think of it this way: Instead of going on multiple errands, sometimes you can group multiple destinations together making for a shorter, more efficient trip. However, that optimization may also mean your drive will take longer. So, sometimes it makes sense to go on a couple of quick, short, low-latency trips instead of one larger one that takes half a day even as it accomplishes many tasks. Of course, how far you have to go on those trips (i.e., their locality) makes a difference about how many you can do in a given amount of time.

Locality of reference (or proximity)

What is locality of reference?

This refers to how close (i.e., its place) data exists to where it is needed (being referenced) for use. For example, the best locality of reference in a computer would be registers in the processor core, ready to be acted on immediately. This would be followed by levels 1, 2, and 3 (L1, L2, and L3) onboard caches, followed by main memory, or DRAM. After that comes solid-state memory typically NAND flash either on PCIe cards or accessible on a direct attached storage (DAS), SAN, or NAS device.

Even though a PCIe NAND flash card is close to the processor, there still remains the overhead of traversing the PCIe bus and associated drivers. To help offset that impact, PCIe cards use DRAM as cache or buffers for data along with meta or control information to further optimize and improve locality of reference. In other words, this information is used to help with cache hits, cache use, and cache effectiveness vs. simply boosting cache use.

SSD to the rescue?

What can you do to cut the impact of IO’s?

There are many steps one can take, starting with establishing baseline performance and availability metrics.

The metrics that matter include IOP’s, latency, bandwidth, and availability. Then, leverage metrics to gain insight into your application’s performance.
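These metrics are related to one another, which is part of why context matters. As a hedged sketch using generic relationships rather than anything specific to a given tool: bandwidth is activity multiplied by IO size, and by Little's Law the average number of IOs in flight equals the IO rate multiplied by the average latency. The function names and the sample numbers below are assumptions for illustration only.

```python
def bandwidth_bytes_per_sec(iops, io_size_bytes):
    """Bandwidth (bytes per second) = IOPS x IO size."""
    return iops * io_size_bytes


def ios_in_flight(iops, latency_seconds):
    """Little's Law: average outstanding IOs = IOPS x average latency."""
    return iops * latency_seconds


if __name__ == "__main__":
    iops = 10_000        # hypothetical activity rate
    io_size = 8 * 1024   # 8 KByte IOs
    latency = 0.002      # 2 ms average response time

    mib_per_sec = bandwidth_bytes_per_sec(iops, io_size) / (1024 ** 2)
    print(f"bandwidth: {mib_per_sec:.1f} MiB/s")
    print(f"average IOs in flight: {ios_in_flight(iops, latency):.0f}")
```

The same IOPS figure means very different bandwidth at 8 KByte versus 1 MByte IO sizes, so a bare IOPS number without IO size and latency says little.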

Understand that IO’s are a fact of applications doing work (storing, retrieving, managing data) no matter whether systems are virtual, physical, or running up in the cloud. But it’s important to understand just what a bad IO is, along with its impact on performance. Try to identify those that are bad, and then find and fix the problem, either with software, application, or database changes. Perhaps you need to throw more software caching tools, hypervisors, or hardware at the problem. Hardware may include faster processors with more DRAM and faster internal busses.

Leveraging local PCIe flash SSD cards for caching or as targets is another option.

You may want to use storage systems or appliances that rely on intelligent caching and storage optimization capabilities to help with performance, availability, and capacity.

Where to gain insight into your server storage I/O environment

There are many tools that you can use to gain insight into your server storage I/O environment across cloud, virtual, software defined and legacy as well as from different layers (e.g. applications, database, file systems, operating systems, hypervisors, server, storage, I/O networking). Many applications along with databases have either built-in or optional tools from their provider, third-party, or via other sources that can give information about work activity being done. Likewise there are tools to dig down deeper into the data infrastructure to see what is happening at the various layers as shown in the following figures.

Gaining application and operating system level performance insight via different tools

Insight and awareness via operating system tools on Windows and Linux

In the above example, Spotlight on Windows (SoW), which you can download for free from Dell here, along with Ubuntu utilities are shown. You could also use other tools to look at server storage I/O performance including Windows Perfmon among others.

Hypervisor performance using VMware ESXi / vsphere built-in tools

Using Visual ESXtop to dig deeper into virtual server storage I/O performance

Gaining insight into virtual server storage I/O cache performance

Wrap up and summary

There are many approaches to address (e.g. find and fix) vs. simply move or mask data center and server storage I/O bottlenecks. Having insight into and awareness of how your environment and applications behave is important for knowing where to focus resources. Also keep in mind that a bit of flash SSD or DRAM cache in the applicable place can go a long way, while a lot of cache will also cost you cash. Even if you can’t eliminate I/Os, look for ways to decrease their impact on your applications and systems.

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Can we get a side of context with them IOPS and other storage metrics?
WHEN AND WHERE TO USE NAND FLASH SSD FOR VIRTUAL SERVERS
Revisiting RAID storage remains relevant and resources
NVMe overview and primer – Part I
Part 1 of HDD for content servers series Trends and Content Application Servers
Part 2 of HDD for content servers series Content application server decisions and testing plans
Part 3 of HDD for content servers series Test hardware and software configuration
Part 4 of HDD for content servers series Large file I/O processing
Part 5 of HDD for content servers series Small file I/O processing
Part 6 of HDD for content servers series General I/O processing
Part 7 of HDD for content servers series How HDD continue to evolve over different generations and wrap up
As the platters spin, HDD’s for cloud, virtual and traditional storage environments
How many IOPS can a HDD, HHDD or SSD do?
Hard Disk Drives (HDD) for Virtual Environments
Server and Storage I/O performance and benchmarking tools
Server storage I/O performance benchmark workload scripts Part I and Part II
How to test your HDD, SSD or all flash array (AFA) storage fundamentals
What is the best server storage I/O workload benchmark? It depends
I/O, I/O how well do you know about good or bad server and storage I/Os?
Big Files Lots of Little File Processing Benchmarking with Vdbench
Part II – NVMe overview and primer (Different Configurations)
Part III – NVMe overview and primer (Need for Performance Speed)
Part IV – NVMe overview and primer (Where and How to use NVMe)
Part V – NVMe overview and primer (Where to learn more, what this all means)
PCIe Server I/O Fundamentals
If NVMe is the answer, what are the questions?
NVMe Wont Replace Flash By Itself
Via Computerweekly – NVMe discussion: PCIe card vs U.2 and M.2
Intel and Micron unveil new 3D XPoint Non Volatie Memory (NVM) for servers and storage
Part II – Intel and Micron new 3D XPoint server and storage NVM
Part III – 3D XPoint new server storage memory from Intel and Micron
Server storage I/O benchmark tools, workload scripts and examples (Part I) and (Part II)
Data Infrastructure Overview, Its Whats Inside of Data Centers
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
www.objectstoragecenter.com and Software Defined, Cloud, Bulk and Object Storage Fundamentals
Server Storage I/O Network PCIe Fundamentals

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

Keep in mind: SSD including flash and DRAM among others are in your future, the question is where, when, with what, how much and whose technology or packaging.

Ok, nuff said, for now.

January 31, 2015 | March 7, 2022

Green and Virtual IT Data Center Primer

Green and Virtual Data Center Primer

Moving beyond Green Hype and Green washing

Green IT is about enabling efficient, effective and productive information services delivery. There is a growing green gap between green hype messaging or green washing and IT pain point issues including limits on availability or rising costs of power, cooling, floor-space as well as e-waste and environmental health and safety (PCFE). Closing the gap will involve moving green messaging and rhetoric closer to where IT organizations’ pain points are and where budget dollars exist that can address PCFE and other green related issues as a by-product. The green gap will also be narrowed as awareness of broader green related topics coincides with IT data center pain points, in other words, alignment of messaging with IT issues that have or will have budget dollars allocated towards them to sustain business and economic growth via IT resource usage efficiency. Read more here.

There are many aspects to "Green" Information Technology including servers, storage, networks and associated management tools and techniques. The reasons for and focus of "Green IT", including "Green Data Storage", "Green Computing" and related focus areas, vary in order to address diverse needs, issues and requirements including among others:

Power, Cooling, Floor-space, Environmental (PCFE) related issues or constraints
Reduction of carbon dioxide (CO2) emissions and other green house gases (GHGs)
Business growth and economic sustainability in an environmentally friendly manner
Proper disposal or recycling of environmentally harmful retired technology components
Reduction or better efficiency of electrical power consumption used for IT equipment
Cost avoidance or savings from lower energy fees and cooling costs
Support data center and application consolidation to cut cost and management
Enable growth and enhancements to application service level objectives
Maximize the usage of available power and cooling resources available in your region
Compliance with local or federal government mandates and regulations
Economic sustainability and ability to support business growth and service improvements
General environmental awareness and stewardship to save and protect the earth

While much of the IT industry focuses on CO2 emissions footprints, data management software and electrical power consumption, cooling and ventilation of IT data centers is another area of focus associated with "Green IT", as well as a means to address more effective use of electrical energy that can yield rapid results for many environments. Large tier-1 vendors including HP and IBM, among others, who have an IT and data center wide focus have services designed to do quick assessments as well as detailed analysis and re-organization of IT data center physical facilities to improve air flow and power consumption for more effective cooling of IT technologies including servers, storage, networks and other equipment.

Similar to your own residence, basic steps to improve your cooling effectiveness can lead to use of less energy to cut your budget impact, or enable you to do more with the cooling capacity you already have to support growth, acquisitions and/or consolidation initiatives. Vendors are also looking at means and alternatives for cooling IT equipment, ranging from computer assisted computational fluid dynamics (CFD) software analysis of data center cooling and ventilation to refrigerated cooling racks, some leveraging water or inert liquid cooling.

Various metrics exist and others are evolving for measuring, estimating, reporting, analyzing and discussing IT Data Center infrastructure resource topics including servers, storage, networks, facilities and associated software management tools from a power, cooling and green environmental standpoint. The importance of metrics is to focus on the larger impact of a piece of IT equipment that includes its cost and energy consumption that factors in cooling and other hosting or site environmental costs. Naturally energy costs and CO2 (carbon offsets) will vary by geography and region along with type of electrical power being used (Coal, Natural Gas, Nuclear, Wind, Thermo, Solar, etc) and other factors that should be kept in perspective as part of the big picture.

Consequently your view and needs or interests around "Green" IT may be from an electrical power conservation perspective, to maximize the use of your available power or to adapt to a given power footprint or ceiling. Your focus around "Green" Data Centers and Green Storage may be from a carbon savings standpoint or proper disposition of old and retired IT equipment or from a data center cooling standpoint. Another area of focus may be that you are looking to cut your data footprint to align with your power, cooling and green footprint while enhancing application and data service delivery to your customers.

Where to learn more

The following are useful links to related efficient, effective, productive, flexible, scalable and resilient IT data center along with server storage I/O networking hardware and software that supports cloud and virtual green data centers.

Various IT industry vendor and service provider links
Green and Virtual Data Center: Productive Economical Efficient Effective Flexible
Green and Virtual Data Center links
Are large storage arrays dead at the hands of SSD?
Closing the Green Gap
Energy efficient technology sales depend on the pitch

What this all means

The result of a green and virtual data center is that of a flexible, agile, resilient, scalable information factory that is also economical, productive, efficient as well as sustainable.

Ok, nuff said (for now)

Cheers gs


January 31, 2015 | December 29, 2025

Green and Virtual Data Center Links

Updated 10/25/2017

Green and Virtual IT Data Center Links

Moving beyond Green Hype and Green washing

Green hype and green washing may be on the endangered species list and going away, however, green IT for servers, storage, networks, facilities as well as related software and management techniques that address energy efficiency including power and cooling along with e-waste, environmental health and safety related issues are topics that won’t be going away anytime soon.

There is a growing green gap between green hype messaging or green washing and IT pain point issues including limits on availability or rising costs of power, cooling, floor-space as well as e-waste and environmental health and safety (PCFE).

Closing the gap will involve moving green messaging and rhetoric closer to where IT organizations’ pain points are and where budget dollars exist that can address PCFE and other green related issues as a by-product. The green gap will also be narrowed as awareness of broader green related topics coincides with IT data center pain points, in other words, alignment of messaging with IT issues that have or will have budget dollars allocated towards them to sustain business and economic growth via IT resource usage efficiency. Read more here.

Enabling Effective Productive Efficient Economical Flexible Scalable Resilient Information Infrastructures

Various IT industry vendors and other links

Via StorageIOblog – Happy Earth Day 2016 Eliminating Digital and Data e-Waste

Green and Virtual Data Center Primer
Green and Virtual Data Center: Productive Economical Efficient Effective Flexible
Are large storage arrays dead at the hands of SSD?
Closing the Green Gap
Energy efficient technology sales depend on the pitch
EPA Energy Star for Data Center Storage Update
EPA Energy Star for data center storage draft 3 specification
Green IT Confusion Continues, Opportunities Missed!
Green IT deferral blamed on economic recession might be result of green gap
How much SSD do you need vs. want?
How to reduce your Data Footprint impact (Podcast)
Industry trend: People plus data are aging and living longer
In the data center or information factory, not everything is the same
More storage and IO metrics that matter
Optimizing storage capacity and performance to reduce your data footprint
Performance metrics: Evaluating your data storage efficiency
PUE, Are you Managing Power, Energy or Productivity?
Saving Money with Green Data Storage Technology
Saving Money with Green IT: Time To Invest In Information Factories
Shifting from energy avoidance to energy efficiency
SNIA Green Storage Knowledge Center
Speaking of speeding up business with SSD storage
SSD and Green IT moving beyond green washing
Storage Efficiency and Optimization: The Other Green
Supporting IT growth demand during economic uncertain times
The Green and Virtual Data Center Book (CRC Press, Intel Recommended Reading)
The new Green IT: Efficient, Effective, Smart and Productive
The other Green Storage: Efficiency and Optimization
What is the best kind of IO? The one you do not have to do

Click here to learn about "The Green and Virtual Data Center" book (CRC Press) for enabling efficient, productive IT data centers. This book covers cloud, virtualization, servers, storage, networks, software, facilities and associated management topics, technologies and techniques including metrics that matter. This book by industry veteran IT advisor and author Greg Schulz is the definitive guide for enabling economic efficiency and productive next generation data center strategies. Read more here and order your copy here. Also check out Cloud and Virtual Data Storage Networking (CRC Press), a new book by Greg Schulz.

White papers, analyst reports and perspectives

Business benefits of data footprint reduction (archiving, compression, de-dupe)
Data center I/O and performance issues – Server I/O and storage capacity gap
Analysis of EPA Report to Congress (Law 109-431)
The Many Faces of MAID Storage Technology
Achieving Energy Efficiency with FLASH based SSD
MAID 2.0: Energy Savings without Performance Compromises

Articles, Tips, Blogs, Webcasts and Podcasts

AP – SNIA Green Emerald Program and measurements
AP – Southern California heat wave strains electrical system
Ars Technica – EPA: Power usage in data centers could double by 2011
Ars Technica – Meet the climate savers: Major tech firms launch war on energy-inefficient PCs – Article
Askageek.com – Buying an environmental friendly laptop – November 2008
Baseline – Examining Energy Consumption in the Data Center
Baseline – Burts Bees: What IT Means When You Go Green
Bizcovering – Green architecture for the masses
Broadstuff – Are Green 2.0 and Enterprise 2.0 Incompatible?
Business Week – CEO Guide to Technology
Business Week – Computers’ elusive eco factor
Business Week – Clean Energy – Its Getting Affordable
Byte & Switch – Keeping it Green This Summer – Don’t be "Green washed"
Byte & Switch – IBM Sees Green in Energy Certificates
Byte & Switch – Users Search for power solutions
Byte & Switch – DoE issues Green Storage Warning
CBR – The Green Light for Green IT
CBR – Big boxes make greener data centers
CFO – Power Scourge
Channel Insider – A 12 Step Program to Dispose of IT Equipment
China.org.cn – China publishes Energy paper
CIO – Green Storage Means Money Saved on Power
CIO – Data center designers share secrets for going green
CIO – Best Place to Build a Data Center in North America
CIO Insight – Clever Marketing or the Real Thing?
Cleantechnica – Cooling Data Centers Could Prevent Massive Electrical Waste – June 2008
Climatebiz – Carbon Calculators Yield Spectrum of Results: Study
CNET News – Linux coders tackle power efficiency
CNET News – Research: Old data centers can be nearly as ‘green’ as new ones
CNET News – Congress, Greenpeace move on e-waste
CNN Money – A Green Collar Recession
CNN Money – IBM creates alliance with industry leaders supporting new data center standards
Communication News – Utility bills key to greener IT
Computerweekly – Business case for green storage
Computerweekly – Optimising data centre operations
Computerweekly – Green still good for IT, if it saves money
Computerweekly – Meeting the Demands for storage
Computerworld – Wells Fargo Free Data Center Cooling System
Computerworld – Seven ways to get green and save money
Computerworld – Build your data center here: The most energy-efficient locations
Computerworld – EPA: U.S. needs more power plants to support data centers
Computerworld – GreenIT: A marketing ploy or new technology?
Computerworld – Gartner Criticizes Green Grid
Computerworld – IT Skills no longer sufficient for data center execs.
Computerworld – Meet MAID 2.0 and Intelligent Power Management
Computerworld – Feds to offer energy ratings on servers and storage
Computerworld – Greenpeace still hunting for truly green electronics
Computerworld – How to benchmark data center energy costs
ComputerworldUK – Datacenters at risk from poor governance
ComputerworldUK – Top IT Leaders Back Green Survey
ComputerworldMH – Lean and Green
CTR – Strategies for enhancing energy efficiency
CTR – Economies of Scale – Green Data Warehouse Appliances
Datacenterknowledge – Microsoft to build Illinois datacenter
Data Center Strategies – Storage The Next Hot Topic
Earthtimes – Fujitsu installs hydrogen fuel cell power
eChannelline – IBM Goes Green(er)
Ecoearth.info – California Moves To Speed Solar, Wind Power Grid Connections
Ecogeek – Solar power company figures they can power 90% of America
Economist – Cool IT
Electronic Design – How many watts in that Gigabyte
eMazzanti – Desktop virtualization movement creeping into customer sites
ens-Newswire – Western Governors Ask Obama for National Green Energy Plan
Environmental Leader – Best Place to Build an Energy Efficient Data Center
Environmental Leader – New Guide Helps Advertisers Avoid Greenwash Complaints
Enterprise Storage Forum – Power Struggles Take Center Stage at SNW
Enterprise Storage Forum – Pace Yourself for Storage Power & Cooling Needs
Enterprise Storage Forum – Storage Power and Cooling Issues Heat Up – StorageIO Article
Enterprise Storage Forum – Score Savings With A Storage Power Play
Enterprise Storage Forum – I/O, I/O, Its off to Virtual Work I Go
Enterprise Storage Forum – Not Just a Flash in the Pan – Various SSD options
Enterprise Storage Forum – Closing the Green Gap – Article August 2008
EPA Report to Congress and Public Law 109-431 – Reports & links
eWeek – Saving Green by being Green
eWeek – ‘No Cooling Necessary’ Data Centers Coming?
eWeek – How the ‘Down’ Macroeconomy Will Impact the Data Storage Sector
ExpressComputer – In defense of Green IT
ExpressComputer – What data center crisis
Forbes – How to Build a Quick Charging Battery
GCN – Sun launches eco data center
GreenerComputing – New Code of Conduct to Establish Best Practices in Green Data Centers
GreenerComputing – Silicon valley’s green detente
GreenerComputing – Majority of companies plan to green their data centers
GreenerComputing – Citigroup to spend $232M on Green Data Center
GreenerComputing – Chicago and Quincy, WA Top Green Data Center Locations
GreenerComputing – Using airside economizers to chill data center cooling bills
GreenerComputing – Making the most of asset disposal
GreenerComputing – Greenpeace vendor rankings
GreenerComputing – Four Steps to Improving Data Center Efficiency without Capital Expenditures
GreenerComputing – Enabling a Green and Virtual Data Center
Green-PC – Strategic Steps Down the Green Path
Greeniewatch – BBC news chiefs attack plans for climate change campaign
Greeniewatch – Warmest year predictions and data that has not yet been measured
GovernmentExecutive – Public Private Sectors Differ on "Green" Efforts
HPC Wire – How hot is your code
Industry Standard – Why green data centers mean partner opportunities
InformationWeek – It could be 15 years before we know what is really green
InformationWeek – Beyond Server Consolidation
InformationWeek – Green IT Beyond Virtualization: The Case For Consolidation
InfoWorld – Sun celebrates green datacenter innovations
InfoWorld – Tech’s own datacenters are their green showrooms
InfoWorld – 2007: The Year in Green
InfoWorld – Green Grid Announces Tech Forum in Feb 2008
InfoWorld – SPEC seeds future green-server benchmarks
InfoWorld – Climate Savers green catalog proves un-ripe
InfoWorld – Forrester: Eco-minded activity up among IT pros
InfoWorld – Green ventures in Silicon Valley, Mass reaped most VC cash in ’07
InfoWorld – Congress misses chance to see green-energy growth
InfoWorld – Unisys pushes green envelope with datacenter expansion
InfoWorld – No easy green strategy for storage
Internet News – Storage Technologies for a Slowing Economy
Internet News – Economy will Force IT to Transform
ITManagement – Green Computing, Green Revenue
itnews – Data centre chiefs dismiss green hype
itnews – Australian Green IT regulations could arrive this year
IT Pro – SNIA Green storage metrics released
ITtoolbox – MAID discussion
Linux Power – Saving power with Linux on Intel platforms
MSNBC – Microsoft to build data center in Ireland
National Post – Green technology at the L.A. Auto Show
Network World – Turning the datacenter green
Network World – Color Interop Green
Network World – Green not helpful word for setting environmental policies
NewScientistEnvironment – Computer servers as bad for climate as SUVs
Newser – Texas commission approves nation’s largest wind power project
New Yorker – Big Foot: In measuring carbon emissions, it’s easy to confuse morality and science
NY Times – What the Green Bubble Will Leave Behind
PRNewswire – Al Gore and Cisco CEO John Chambers to debate climate change
Processor – More than just monitoring
Processor – The new data center: What’s hot in Data Center physical infrastructure:
Processor – Liquid Cooling in the Data Center
Processor – Curbing IT Power Usage
Processor – Services To The Rescue – Services Available For Today’s Data Centers
Processor – Green Initiatives: Hire A Consultant?
Processor – Energy-Saving Initiatives
Processor – The EPA’s Low Carbon Campaign
Processor – Data Center Power Planning
SAN Jose Mercury – Making Data Centers Green
SDA-Asia – Green IT still a priority despite Credit Crunch
SearchCIO – EPA report gives data centers little guidance
SearchCIO – Green IT Strategies Could Lead to hefty ROIs
SearchCIO – Green IT In the Data Center: Plenty of Talk, not much Walk
SearchCIO – Green IT Overpitched by Vendors, CIOs beware
SearchDataCenter – Study ranks cheapest places to build a data center
SearchDataCenter – Green technology still ranks low for data center planners
SearchDataCenter – Green Data Center: Energy Efficiency Computing in the 21st Century
SearchDataCenter – Green Data Center Advice: Is LEED Feasible
SearchDataCenter – Green Data Centers Tackle LEED Certification
SearchDataCenter – PG&E invests in data center efficiency
SearchDataCenter – A solar powered datacenter
SearchSMBStorage – Improve your storage energy efficiency
SearchSMBStorage – SMB capacity planning: Focusing on energy conservation
SearchSMBStorage – Data footprint reduction for SMBs
SearchSMBStorage – MAID & other energy-saving storage technologies for SMBs
SearchStorage – How to increase your storage energy efficiency
SearchStorage – Is storage now top energy hog in the data center
SearchStorage – Storage eZine: Turning Storage Green
SearchStorage – The Green Storage Gap
SearchStorageChannel – Green Data Storage Projects
Silicon.com – The greening of IT: Cooling costs
SNIA – SNIA Green Storage Overview
SNIA – Green Storage
SNW – Beyond Green-wash
SNW Spring 2008 Beyond Green-wash
State.org – Why Texas Has Its Own Power Grid
StorageDecisions – Different Shades of Green
Storage Magazine – Storage still lacks energy metrics
StorageIOblog – Posts pertaining to Green, power, cooling, floor-space, EHS (PCFE)
Storage Search – Various postings, news and topics pertaining to Green IT
Technology Times – Revealed: the environmental impact of Google searches
TechTarget – Data center power efficiency
TechTarget – Tip for determining power consumption
Techworld – Inside a green data center
Techworld – Box reduction – Low hanging green datacenter fruit
Techworld – Datacenter used to heat swimming pool
Theinquirer – Spansion and Virident flash server farms
Theinquirer – Storage firms worry about energy efficiency How green is the valley
TheRegister – Data Centre Efficiency, the good, the bad and the way to hot
TheRegister – Server makers snub whalesong for serious windmill abuse
TheRegister – Green data center threat level: Not Green
The Standard – Growing cynicism around going Green
ThoughtPut – Energy Central
Thoughtput – Power, Cooling, Green Storage and related industry trends
Wallstreet Journal – Utilities Amp Up Push To Slash Energy Use
Wallstreet Journal – The IT in Green Investing
Wallstreet Journal – Tech’s Energy Consumption on the Rise
Washingtonpost – Texas approves major new wind power project
WhatPC – Green IT: It doesn’t have to cost the earth
WHIRnews – SingTel building green data center
Wind-watch.org – Loss of wind causes Texas power grid emergency
WyomingNews – Overcoming Greens Stereotype
Yahoo – Washington Senate Unveil Green Job Plan
ZDnet – Will supercomputer speeds hit a plateau?
Are data centers causing climate change

News and Press Releases

Business Wire – The Green and Virtual Data Center
Enterprise Storage Forum – Intel and HGST (Hitachi) partner on FLASH SSD
PCworld – Intel and HP describe Green Strategy
DoE – To Invest Approximately $1.3 Billion to Commercialize CCS Technology
Yahoo – Shell Opens Los Angeles’ First Combined Hydrogen and Gasoline Station
DuPont – DuPont Projects Save Enough Energy to Power 25,000 Homes
Gartner – Users Are Becoming Increasingly Confused About the Issues and Solutions Surrounding Green IT

Websites and Tools

Various power, cooling, emissions and device configuration tools and calculators
Solar Action Alliance web site
SNIA Emerald program
Carbon Disclosure Project
The Chicago Climate Exchange
Climate Savers
Data Center Decisions
Electronic Industries Alliance (EIA)
EMC – Digital Life Calculator
Energy Star
Energy Star Data Center Initiatives
Greenpeace – Technology ranking website also here
GlobalActionPlan
KyotoPlanet
LBNL High Tech Data centers
Millicomputing
RoHS & WEEE News
Storage Performance Council (SPC)
SNIA Green Technical Working Group
SPEC
Transaction Processing Council (TPC)
The Green Grid
The Raised Floor
Terra Pass Carbon Offset Credits – Website with CO2 calculators
Energy Information Administration – EIA (US and International Electrical Information)
U.S. Department of Energy and related information
U.S. DOE Energy Efficient Industrial Programs
U.S. EPA server and storage energy topics
Zerofootprint – Various "Green" and environmental related links and calculators

Vendor Centric and Marketing Website Links and tools

Vendors and organizations have different types of calculators, some with a focus on power, cooling, floor space, carbon offsets or emissions, ROI, TCO and other IT data center infrastructure resource management. Following is an evolving list that is by no means definitive, even for a particular vendor, as different manufacturers may have multiple calculators for different product lines or areas of focus.

Brocade – Green website
Cisco – Green and Environmental websites here, here and here
Dell – Green website
EMC – EMC Energy, Power and Cooling Related Website
HDS – How to be green – HDS Positioning White Paper
HP – HP Green Website
IBM – Green Data Center – IBM Positioning White Paper
IBM – Green Data Center for Education – IBM Positioning White Paper
Intel – What is an Efficient Data Center and how do I measure it?
LSI – Green site and white paper
NetApp – Press Release and related information
Sun – Various articles and links
Symantec – Global 2000 Struggle to Adopt "Green" Data Centers – Announcement of Survey results
ACTON
Adinfa
APC
Australian Conservation Foundation
Avocent
BBC
Brocade
Carbon Credit Calculator UK
Carbon Footprint Site
Carbon Planet
Carbonify
CarbonZero
Cassatt
CO2 Stats Site
Copan
Dell
DirectGov UK Acton
Diesel Service & Supply Power Calculator & Converter
Eaton Powerware
Ecobusinesslinks
Ecoscale
EMC Power Calculator
EMC Web Power Calculator
EMC Digital Life Calculator
EPA Power Profiler
EPA Related Tools
EPEAT
Google UK Green Footprint
Green Grid Calculator
HP and more here
HVAC Calculator
IBM
Logicalis
Kohler Power (Business and Residential)
Micron
MSN Carbon Footprint Calculator
National Wildlife Foundation
NEF UK
NetApp
Rackwise
Platespin
Safecom
Sterling Planet
Sun and more here and here and here
Tandberg
TechRepublic
TerraPass Carbon Offset Credits
Thomas Kreen AG
Toronto Hydro Calculator
80 Plus Calculator
VMware
42u Green Grid PUE DCiE calculator
42u energy calculator

Green and Virtual Tools

What’s your power, cooling, floor space, energy, environmental or green story?

Do you have questions or want to learn more about energy issues pertaining to IT data center and data infrastructure topics? Do you have a solution or technology or a success story that you would like to share with us pertaining to data storage and server I/O energy optimization strategies? Do you need assistance in developing, validating or reviewing your strategy or story? Contact us at: info@storageio.com or 612-810-9890 to learn more about green data storage and server I/O or to schedule a briefing to tell us about your energy efficiency and effectiveness story pertaining to IT data centers and data infrastructures.

Disclaimer and note: URLs submitted for inclusion on this site will be reviewed for consideration and to be in generally accepted good taste in regards to the theme of this site. Best effort has been made to validate and verify the URLs that appear on this page and web site, however they are subject to change. The author and/or maintainer(s) of this page and web site make no endorsement to and assume no responsibility for the URLs and their content that are listed on this page.

Green and Virtual Metrics

Chapter 5 "Measurement, Metrics, and Management of IT Resources" in the book "The Green and Virtual Data Center" (CRC Press) takes a look at the importance of being able to measure and monitor to enable effective management and utilization of IT resources across servers, storage, I/O networks, software, hardware and facilities.

There are many different points of interest for collecting metrics in an IT data center for servers, storage, networking and facilities, along with various perspectives. Data center personnel have varied interests, from a facilities to a resource (server, storage, networking) usage and effectiveness perspective, for normal use as well as for planning purposes or comparison when evaluating new technology. Vendors have different uses for metrics during R&D, Q/A testing and marketing or sales campaigns as well as on-going service and support. Industry trade groups including 80 Plus, SNIA and The Green Grid, along with government groups including the EPA Energy Star, are working to define and establish applicable metrics pertinent for Green and Virtual data centers.

Acronym	Description	Comment
DCiE	Data Center infrastructure Efficiency = (IT equipment power / Total facility power) * 100	Shows a ratio of how well a data center is consuming power
DCPE	Data center Performance Efficiency = Effective IT workload / total facility power	Shows how effective data center is consuming power to produce a given level of service or work such as energy per transaction or energy per business function performed
PUE	Power usage effectiveness = Total facility power / IT equipment power	Inverse of DCiE
Kilowatts (kW)	Watts / 1,000	One thousand watts
Annual kWh	kW x 24 x 365	kWh used in one year
Megawatts (MW)	kW / 1,000	One thousand kW
BTU/hour	watts x 3.413	Heat generated in an hour from using energy in British Thermal Units. 12,000 BTU/hour can equate to 1 Ton of cooling.
kWh	1,000 watt hours	Energy used by drawing 1,000 watts for one hour
Watts	Amps x Volts (e.g. 12 amps * 12 volts = 144 watts)	Unit of electrical power
Watts	BTU/hour x 0.293	Convert BTU/hr to watts
Volts	Watts / Amps (e.g. 144 watts / 12 amps = 12 volts)	The amount of force on electrons
Amps	Watts / Volts (e.g. 144 watts / 12 volts = 12 amps)	The flow rate of electricity
Volt-Amperes (VA)	Volts x Amps	Power is sometimes expressed in volt-amperes
kVA	Volts x Amps / 1,000	Number of kilovolt-amperes
kW	kVA x power-factor	Power factor is the efficiency of a piece of equipment’s use of power
kVA	kW / power-factor	Kilovolt-amperes
U	1U = 1.75”	EIA metric describing height of equipment in racks.
Activity / Watt	Amount of work accomplished per unit of energy consumed. This could be IOPS, Transactions or Bandwidth per watt.	Indicator of how much work is being done and how efficiently energy is being used to accomplish useful work. This metric applies to active workloads or actively used and frequently accessed storage and data. Examples would be IOPS per watt, Bandwidth per watt, Transactions per watt, Users or streams per watt. Activity per watt should also be used in conjunction with another metric such as how much capacity is supported per watt and total watts consumed for a representative picture.
IOPS / Watt	Number of I/O operations (or transactions) / energy (watts)	Indicator of how effectively energy is being used to perform a given amount of work. The work could be I/Os, transactions, throughput or other indicator of application activity. For example SPC-1 / Watt, SPEC / Watt, TPC / Watt, transaction / watt, IOP / Watt.
Bandwidth / Watt	GBps or TBps or PBps / Watt. Amount of data transferred or moved per second per unit of energy used. Often confused with Capacity per watt.	This indicates how much data is moved or accessed per second or time interval per unit of energy consumed. This is often confused with capacity per watt given that both bandwidth and capacity reference GByte, TByte, PByte.
Capacity / Watt	GB or TB or PB (storage capacity space) / watt	Indicator of how much capacity (space) or bandwidth is supported in a given configuration or footprint per watt of energy. For inactive data or off-line and archive data, capacity per watt can be an effective measurement gauge, however for active workloads and applications activity per watt also needs to be looked at to get a representative indicator of how energy is being used.
MHz / Watt	Processor performance / energy (watts)	Indicator of how effectively energy is being used by a CPU or processor.
Carbon Credit	Carbon offset credit	Offset credits that can be bought and sold to offset your CO2 emissions
CO2 Emission	Average 1.341 lbs per kWh of electricity generated	The amount of average carbon dioxide (CO2) emissions from generating an average kWh of electricity

Various power, cooling, floor space and green storage or IT related metrics
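
To make the conversions in the table above a bit more concrete, here is a minimal Python sketch. The conversion factors (3.413 BTU/hour per watt, 1.341 lbs of CO2 per kWh) come from the table; the function names and the sample 208 volt, 30 amp, 0.9 power factor circuit are assumptions made up for illustration, not figures from any particular product.

```python
# Conversion factors taken from the metrics table above.
BTU_PER_HOUR_PER_WATT = 3.413   # BTU/hour generated per watt
LBS_CO2_PER_KWH = 1.341         # average lbs of CO2 per kWh generated
HOURS_PER_YEAR = 24 * 365

def watts(amps, volts):
    """Watts = Amps x Volts."""
    return amps * volts

def btu_per_hour(w):
    """Heat generated per hour for a given wattage."""
    return w * BTU_PER_HOUR_PER_WATT

def kva(volts, amps):
    """Kilovolt-amperes = Volts x Amps / 1,000."""
    return volts * amps / 1000

def kw_from_kva(kva_value, power_factor):
    """kW = kVA x power factor."""
    return kva_value * power_factor

def annual_kwh(kw):
    """Energy used in a year by a load drawing `kw` continuously."""
    return kw * HOURS_PER_YEAR

def co2_lbs(kwh):
    """Average CO2 emitted when generating `kwh` of electricity."""
    return kwh * LBS_CO2_PER_KWH

# Table example: 12 amps x 12 volts = 144 watts.
w = watts(12, 12)
print(w, "watts =", round(btu_per_hour(w), 1), "BTU/hour")

# Hypothetical circuit (assumed values): 208 volts, 30 amps, 0.9 power factor.
load_kw = kw_from_kva(kva(208, 30), 0.9)
yearly = annual_kwh(load_kw)
print(round(load_kw, 3), "kW ->", round(yearly), "kWh/year ->",
      round(co2_lbs(yearly)), "lbs CO2/year")
```

Keep in mind these are averages and nameplate style calculations; actual draw, power factor and emissions vary by equipment, workload and region.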

Metrics include Data Center infrastructure Efficiency (DCiE) via The Green Grid, which is the indicator ratio of an IT data center’s energy efficiency, defined as IT equipment (servers, disk and tape storage, networking switches, routers, printers, etc.) power / total facility power x 100 (for a percentage). For example, if the sum of all IT equipment energy usage was 1,500 kilowatt hours (kWh) per month, yet the total facility energy, including UPS, energy switching, power conversion and filtering, cooling and associated infrastructure as well as IT equipment, was 3,500 kWh, the DCiE would be (1,500 / 3,500) x 100 = 43%. DCiE can be used as a ratio, for example to show in the above scenario that IT equipment accounts for about 43% of energy consumed by the data center, with the remaining 57% of electrical energy being consumed by cooling, conversion and conditioning or lighting.

Power usage effectiveness (PUE) is the indicator ratio of total energy being consumed by the data center to energy being used to operate IT equipment. PUE is defined as total facility power / IT equipment energy consumption. Using the above scenario, PUE = 2.333 (3,500 / 1,500), which means that a server requiring 100 watts of power would actually require (2.333 * 100) 233.3 watts once both direct power and cooling overhead are included. Similarly, a storage system that required 1,500 kWh of energy to power would require (1,500 * 2.333) 3,499.5 kWh of electrical energy including cooling.
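
The DCiE and PUE arithmetic from the scenario above can be checked with a few lines of Python. This is only a sketch using the 1,500 kWh and 3,500 kWh figures from the example; the function names are made up for illustration.

```python
def dcie(it_energy, facility_energy):
    """DCiE as a percentage: IT equipment energy / total facility energy x 100."""
    return it_energy / facility_energy * 100

def pue(it_energy, facility_energy):
    """PUE: total facility energy / IT equipment energy (inverse of the DCiE ratio)."""
    return facility_energy / it_energy

it_kwh, facility_kwh = 1500, 3500          # figures from the scenario above

print(round(dcie(it_kwh, facility_kwh)))   # 43 (percent)
print(round(pue(it_kwh, facility_kwh), 3)) # 2.333

# A 100 watt server "costs" about 233.3 watts once facility overhead is included.
print(round(100 * pue(it_kwh, facility_kwh), 1))  # 233.3
```

The same two inputs drive both metrics, so tracking IT equipment energy and total facility energy over the same time interval is the part that takes the real work.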

Another metric that has the potential to have meaning is Data center Performance Efficiency (DCPE), which takes into consideration how much useful and effective work is performed by the IT equipment and data center per energy consumed. DCPE is defined as useful work / total facility power, with an example being some number of transactions processed using servers, networks and storage divided by the energy for the data center to power and cool the equipment. A relatively easy and straightforward implementation of DCPE is an IOPs per watt measurement that looks at how many IOPs can be performed (regardless of size or type such as reads or writes) per unit of energy, in this case watts.

DCPE = Useful work / Total facility power, for example IOPS per watt of energy used

DCiE = IT equipment energy / Total facility power = 1 / PUE

PUE = Total facility energy / IT equipment energy

IOPS per Watt = Number of IOPs (or bandwidth) / energy used by the storage system
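
As a simple illustration of why activity per watt and capacity per watt should be looked at together, the following Python sketch compares two storage configurations. Both configurations and all of their numbers are hypothetical, made up purely to show the arithmetic, and the function names are assumptions.

```python
def activity_per_watt(activity, watts):
    """Work done (IOPS, transactions, bandwidth) per watt consumed."""
    return activity / watts

def capacity_per_watt(capacity_gb, watts):
    """Storage space (GB) supported per watt consumed."""
    return capacity_gb / watts

# Hypothetical configurations (assumed numbers, not measurements).
configs = {
    "fast tier": {"iops": 50_000, "capacity_gb": 2_000, "watts": 250},
    "bulk tier": {"iops": 4_000, "capacity_gb": 48_000, "watts": 400},
}

for name, c in configs.items():
    print(name,
          activity_per_watt(c["iops"], c["watts"]), "IOPS/watt,",
          capacity_per_watt(c["capacity_gb"], c["watts"]), "GB/watt")
# fast tier 200.0 IOPS/watt, 8.0 GB/watt
# bulk tier 10.0 IOPS/watt, 120.0 GB/watt
```

In this made-up case the fast tier looks 20 times better on IOPS per watt while the bulk tier looks 15 times better on GB per watt, which is why the metric needs to match whether the data is active or inactive.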

The importance of these numbers and metrics is to focus on the larger impact of a piece of IT equipment that includes its cost and energy consumption that factors in cooling and other hosting or site environmental costs. Naturally energy costs and CO2 (carbon offsets) will vary by geography and region along with type of electrical power being used (Coal, Natural Gas, Nuclear, Wind, Thermo, Solar, etc) and other factors that should be kept in perspective as part of the big picture. Learn more in Chapter 5 "Measurement, Metrics, and Management of IT Resources" in the book "The Green and Virtual Data Center" (CRC) and in the book Cloud and Virtual Data Storage Networking (CRC).

What this all means

The result of a green and virtual data center is a flexible, agile, resilient, scalable information factory that is also economical, efficient, productive and sustainable.

Ok, nuff said (for now)

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

December 23, 2014 | April 27, 2025

Revisiting RAID data protection remains relevant resource links

Revisiting RAID data protection remains relevant and resources

Updated 2/10/2018

RAID data protection remains relevant, including erasure codes (EC) and local reconstruction codes (LRC) among other technologies. If RAID were really not relevant anymore (e.g. actually dead), why do some people spend so much time trying to convince others that it is dead, or to use a different RAID level, enhanced RAID, or beyond RAID with related advanced approaches?

When you hear RAID, what comes to mind?

A legacy monolithic storage system that supports narrow 4, 5 or 6 drive wide stripe sets, or a modern system supporting dozens of drives in a RAID group with different options?

RAID means many things, likewise there are different implementations (hardware, software, systems, adapters, operating systems) with various functionality, some better than others.

For example, which of the items in the following figure come to mind, or perhaps are new to your RAID vocabulary?

There are many variations of RAID storage, some for the enterprise, some for SMB, SOHO or consumer. Some have better performance than others, while some have poor performance, for example causing extra writes, which leads to the perception that all parity-based RAID does extra writes (some actually do write gathering and optimization).

Some hardware and software implementations use write-back cache (WBC) that is mirrored or battery backed (BBU), along with being able to group writes together in memory (cache) to do full-stripe writes. The result can be fewer back-end writes compared to other systems. Hence, not all RAID implementations in either hardware or software are the same. Likewise, just because a RAID definition shows a particular theoretical implementation approach does not mean all vendors have implemented it in that way.
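
To put rough numbers on the write gathering point, here is a small Python sketch that counts back-end I/Os for parity RAID. It assumes the textbook model only: a small write does a read-modify-write (read old data, read old parity, write new data, write new parity), while a full-stripe write needs no reads. Real controllers and software stacks vary, and the function names are made up for illustration.

```python
def read_modify_write_ios(chunks, parity_drives=1):
    """Back-end I/Os when each chunk is written on its own (textbook model).

    Per chunk: read old data + read each old parity,
    then write new data + write each new parity.
    """
    return chunks * (2 + 2 * parity_drives)

def full_stripe_write_ios(chunks, data_drives, parity_drives=1):
    """Back-end I/Os when cache groups chunks into whole stripes (no reads)."""
    if chunks % data_drives:
        raise ValueError("chunks must fill whole stripes in this simple model")
    stripes = chunks // data_drives
    return stripes * (data_drives + parity_drives)

# Eight chunks written to a 4+1 single parity group.
print(read_modify_write_ios(8))       # 32 back-end I/Os
print(full_stripe_write_ios(8, 4))    # 10 back-end I/Os
```

Under these assumptions the same eight chunks cost 32 back-end I/Os when written individually versus 10 when gathered into two full stripes, which is the kind of difference that separates one implementation from another.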

RAID is not a replacement for backup rather part of an overall approach to providing data availability and accessibility.

What’s the best RAID level? The one that meets YOUR needs

There are different RAID levels and implementations (hardware, software, controller, storage system, operating system, adapter among others) for various environments (enterprise, SME, SMB, SOHO, consumer) supporting primary, secondary and tertiary (backup/data protection, archiving) storage.

General RAID comparisons

Thus one size or approach does not fit all solutions, likewise RAID rules of thumb or guides need context. Context means that a RAID rule or guide for consumer or SOHO or SMB might be different for enterprise and vice versa, not to mention depending on the type of storage system, number of drives, drive type and capacity among other factors.

General basic RAID comparisons

Thus the best RAID level is the one that meets your specific needs in your environment. What is best for one environment and application may be different from what is applicable to your needs.

Key points and RAID considerations include:

· Not all RAID implementations are the same, some are very much alive and evolving while others are in need of a rest or rewrite. So it is not the technology or techniques that are often the problem, rather how it is implemented and then deployed.

· It may not be RAID that is dead, rather the solution that uses it. Hence if you think a particular storage system, appliance, product or software is old and dead along with its RAID implementation, then just say that product or vendor’s solution is dead.

· RAID can be implemented in hardware controllers, adapters or storage systems and appliances as well as via software and those have different features, capabilities or constraints.

· Long or slow drive rebuilds are a reality with larger disk drives and parity-based approaches; however, you have options on how to balance performance, availability, capacity, and economics.

· RAID can be single, dual or multiple parity or mirroring-based.

· Erasure and other coding schemes leverage parity schemes and guess what umbrella parity schemes fall under.

· RAID may not be cool, sexy or a fun topic and technology to talk about, however many trendy tools, solutions and services actually use some form or variation of RAID as part of their basic building blocks. This is an example of using new and old things in new ways to help each other do more without increasing complexity.

· Even if you are not a fan of RAID and think it is old and dead, at least take a few minutes to learn more about what it is that you do not like to update your dead FUD.
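
For those newer to RAID, the single parity idea mentioned in the points above can be sketched in a few lines of Python. This is a minimal illustration of XOR parity across equal-sized chunks, not how any particular product implements it; the function names and sample data are made up.

```python
from functools import reduce

def xor_parity(chunks):
    """XOR equal-length byte chunks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def rebuild_missing(surviving_chunks, parity):
    """Recover one lost chunk: XOR of the survivors and the parity."""
    return xor_parity(list(surviving_chunks) + [parity])

# Three data chunks, as if striped across three drives, plus one parity chunk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Pretend the drive holding the second chunk failed.
recovered = rebuild_missing([data[0], data[2]], parity)
print(recovered == data[1])   # True
```

Single parity can recover from one missing chunk per stripe; dual and multiple parity schemes, as well as erasure codes, use more elaborate math to survive more than one failure.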

Wait, Isn’t RAID dead?

There is some dead marketing that paints a broad picture that RAID is dead to prop up something new, which in some cases may be a derivative variation of parity RAID.

Data dispersal and durability

RAID continues to evolve with rapid rebuilds for some systems

On the other hand, there are some specific products, technologies and implementations that may be end of life or actually dead. Likewise, what might be dead, dying or simply not in vogue are specific RAID implementations or packaging. Certainly there is a lot of buzz around object storage, cloud storage, forward error correction (FEC) and erasure coding, including messages of how they cut RAID. The catch is that some object storage solutions are overlaid on top of lower level file systems that do things such as RAID 6, granted they are out of sight, out of mind.

General RAID parity and erasure code/FEC comparisons

Then there are advanced parity protection schemes, which include FEC and erasure codes. While they are not your traditional RAID levels, they have characteristics including chunking or sharding data and spreading it out over multiple devices with multiple parity (or derivatives of parity) protection.
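
One way to compare mirroring, parity RAID and erasure code style layouts is by how much raw capacity ends up usable and how many device failures a group can tolerate. The Python sketch below does that arithmetic; the specific layouts (4+1, 4+2, 10+4) are example widths chosen for illustration, not recommendations, and the function name is an assumption.

```python
def usable_fraction(data_chunks, protection_chunks):
    """Fraction of raw capacity left for data in a data + protection layout."""
    return data_chunks / (data_chunks + protection_chunks)

# (name, data chunks, protection chunks) - example layouts only.
layouts = [
    ("Mirror (1+1)",        1, 1),
    ("Single parity (4+1)", 4, 1),
    ("Dual parity (4+2)",   4, 2),
    ("Erasure code (10+4)", 10, 4),
]

for name, k, m in layouts:
    print(f"{name}: {usable_fraction(k, m):.1%} usable, "
          f"tolerates {m} failure(s) per group")
# Mirror (1+1): 50.0% usable, tolerates 1 failure(s) per group
# Single parity (4+1): 80.0% usable, tolerates 1 failure(s) per group
# Dual parity (4+2): 66.7% usable, tolerates 2 failure(s) per group
# Erasure code (10+4): 71.4% usable, tolerates 4 failure(s) per group
```

Capacity is only one side of the picture; wider groups and more protection chunks also change rebuild traffic, write overhead and compute cost, which is where context comes in.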

Bottom line is that for some environments, different RAID levels may be more applicable and alive than for others.

Via BizTech – How to Turn Storage Networks into Better Performers

Maintain Situational Awareness
Design for Performance and Availability
Determine Networked Server and Storage Patterns
Make Use of Applicable Technologies and Techniques

If RAID is alive, what to do with it?

If you are new to RAID, learn more about the past, present and future, keeping context in mind. Keeping context in mind means that there are different RAID levels and implementations for various environments. Not all RAID 0, 1, 1/0, 10, 2, 3, 4, 5, 6 or other variations (past, present and emerging) are the same for consumer vs. SOHO vs. SMB vs. SME vs. Enterprise, nor are the usage cases. Some need performance for reads, others for writes, some for high-capacity with low performance using hardware or software. RAID rules of thumb are ok and useful, however keep them in context to what you are doing as well as using.

What to do next?

Take some time to learn, ask questions including what to use when, where, why and how as well as if an approach or recommendation is applicable to your needs. Check out the following links to read some extra perspectives about RAID and keep in mind, what might apply to enterprise may not be relevant for consumer or SMB and vice versa.

Some advise needed on SSD’s and Raid (Via Spiceworks)
RAID 5 URE Rebuild Means The Sky Is Falling (Via BenchmarkReview)
Double drive failures in a RAID-10 configuration (Via SearchStorage)
Industry Trends and Perspectives: RAID Rebuild Rates (Via StorageIOblog)
RAID, IOPS and IO observations (Via StorageIOBlog)
RAID Relevance Revisited (Via StorageIOBlog)
HDDs Are Still Spinning (Rust Never Sleeps) (Via InfoStor)
When and Where to Use NAND Flash SSD for Virtual Servers (Via TheVirtualizationPractice)
What’s the best way to learn about RAID storage? (Via Spiceworks)
Design considerations for the host local FVP architecture (Via Frank Denneman)
Some basic RAID fundamentals and definitions (Via SearchStorage)
Can RAID extend nand flash SSD life? (Via StorageIOBlog)
I/O Performance Issues and Impacts on Time-Sensitive Applications (Via CMG)
The original RAID white paper (PDF) by Katz, Gibson, Patterson et al., which while over 20 years old provides a basis, foundation and some history
Storage Interview Series (Via Infortrend)
Different RAID methods (Via RAID Recovery Guide)
A good RAID tutorial (Via TheGeekStuff)
Basics of RAID explained (Via ZDNet)
RAID and IOPs (Via VMware Communities)

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Data Protection Diaries Fundamental Topics Tools Techniques Technologies Tips
Can we get a side of context with them IOPS and other storage metrics?
WHEN AND WHERE TO USE NAND FLASH SSD FOR VIRTUAL SERVERS
Revisiting RAID storage remains relevant and resources
NVMe overview and primer – Part I
Part 1 of HDD for content servers series Trends and Content Application Servers
Part 2 of HDD for content servers series Content application server decisions and testing plans
Part 3 of HDD for content servers series Test hardware and software configuration
Part 4 of HDD for content servers series Large file I/O processing
Part 5 of HDD for content servers series Small file I/O processing
Part 6 of HDD for content servers series General I/O processing
Part 7 of HDD for content servers series How HDD continue to evolve over different generations and wrap up
As the platters spin, HDD’s for cloud, virtual and traditional storage environments
How many IOPS can a HDD, HHDD or SSD do?
Hard Disk Drives (HDD) for Virtual Environments
Server and Storage I/O performance and benchmarking tools
Server storage I/O performance benchmark workload scripts Part I and Part II
How to test your HDD, SSD or all flash array (AFA) storage fundamentals
What is the best server storage I/O workload benchmark? It depends
I/O, I/O how well do you know about good or bad server and storage I/Os?
Big Files Lots of Little File Processing Benchmarking with Vdbench
Part II – NVMe overview and primer (Different Configurations)
Part III – NVMe overview and primer (Need for Performance Speed)
Part IV – NVMe overview and primer (Where and How to use NVMe)
Part V – NVMe overview and primer (Where to learn more, what this all means)
PCIe Server I/O Fundamentals
If NVMe is the answer, what are the questions?
NVMe Wont Replace Flash By Itself
Via Computerweekly – NVMe discussion: PCIe card vs U.2 and M.2
Intel and Micron unveil new 3D XPoint Non Volatie Memory (NVM) for servers and storage
Part II – Intel and Micron new 3D XPoint server and storage NVM
Part III – 3D XPoint new server storage memory from Intel and Micron
Server storage I/O benchmark tools, workload scripts and examples (Part I) and (Part II)
Data Infrastructure Overview, Its Whats Inside of Data Centers
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
www.objectstoragecenter.com and Software Defined, Cloud, Bulk and Object Storage Fundamentals
Server Storage I/O Network PCIe Fundamentals

Additional learning experiences along with common questions (and answers), as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What This All Means

What is my favorite or preferred RAID level?

That depends; for some things it's RAID 1, for others RAID 10, yet for others RAID 4, 5, 6 or DP, and yet other situations could be a fit for RAID 0 or erasure codes and FEC. Instead of being focused on just one or two RAID levels as the solution for different problems, I prefer to look at the environment (consumer, SOHO, small or large SMB, SME, enterprise), type of usage (primary or secondary or data protection), performance characteristics, reads, writes, type and number of drives among other factors. What might be a fit for one environment would not be a fit for others, thus my preferred RAID level, along with where it is implemented, is the one that meets the given situation. However, also keep in mind tying RAID into an overall data protection strategy; remember, RAID is not a replacement for backup.

What this all means

Like other technologies that have been declared dead for years or decades, aka the Zombie technologies (e.g. dead yet still alive), RAID continues to be used while the technology evolves. There are specific products, implementations or even RAID levels that have faded away, or are declining in some environments, yet alive in others. RAID and its variations are still alive, however how they are used or deployed in conjunction with other technologies is also evolving.

Ok, nuff said, for now.

November 30, 2014 (updated November 26, 2023)

Cloud Conversations: Revisiting re:Invent 2014 and other AWS updates

server storage I/O trends

This is part one of a two-part series about Amazon Web Services (AWS) re:Invent 2014 and other recent cloud updates, read part two here.

Revisiting re:Invent 2014 and other AWS updates

A few weeks ago I attended Amazon Web Services (AWS) re:Invent 2014 in Las Vegas for a few days. For those of you who have not yet attended this event, I recommend adding it to your agenda. If you have interest in compute servers, networking, storage, development tools or management of cloud (public, private, hybrid), virtualization and related topic themes, you should check out AWS re:Invent.

AWS made several announcements at re:Invent including many around development tools, compute and data storage services. One of those to keep an eye on is the cloud based Aurora relational database service that complements existing RDS tools. Aurora is positioned as an alternative to traditional SQL based transactional databases commonly found in enterprise environments (e.g. SQL Server among others).

Some recent AWS announcements prior to re:Invent include:

AWS Adds EU (Frankfurt) Region
Amazon Linux AMI Updates
AWS Systems Manager for Microsoft System Center Virtual Machine Manager
T2, the New Low-Cost, General Purpose Instance Type for Amazon EC2
Windows Server 2012 R2 AMI Updates
Zocalo Enterprise File Sync & Share updates (read more about Zocalo here)
AWS Management Portal for vCenter Setup Enhancements

AWS vCenter Portal

The AWS Management Portal for vCenter adds a plug-in within your VMware vCenter to manage your AWS infrastructure. The vCenter for AWS plug-in includes support for AWS EC2 and Virtual Machine (VM) import to migrate your VMware VMs to AWS EC2, as well as creating VPCs (Virtual Private Clouds) along with subnets. There is no cost for the plug-in, you simply pay for the underlying AWS resources consumed (e.g. EC2, EBS, S3). Learn more about AWS Management Portal for vCenter here, and download the OVA plug-in for vCenter here.

AWS re:Invent content

AWS Andy Jassy (Image via AWS)

November 12, 2014 (Day 1) Keynote (highlight video, full keynote). This is the session where AWS SVP Andy Jassy made several announcements including the Aurora relational database that complements the existing RDS (Relational Database Service). In addition to Andy, the keynote sessions also included various special guests ranging from AWS customers and partners to internal people in support of the various initiatives and announcements.

Amazon.com CTO Werner Vogels (Image via AWS)

November 13, 2014 (Day 2) Keynote (highlight video, full keynote). In this session, Amazon.com CTO Werner Vogels makes announcements about the new Container and Lambda services.

AWS re:Invent announcements

Announcements and enhancements made by AWS during re:Invent include:

Key Management Service (KMS)
Amazon RDS for Aurora
Amazon EC2 Container Service
AWS Lambda
Amazon EBS Enhancements
Application development, deployed and life-cycle management tools
AWS Service Catalog
AWS CodeDeploy
AWS CodeCommit
AWS CodePipeline

Key Management Service (KMS)

Hardware security module (HSM) based key management service for creating and controlling encryption keys to protect the security of digital assets and their keys. Integration with AWS EBS and other services including S3 and Redshift, along with CloudTrail logs for regulatory, compliance and management purposes. Learn more about AWS KMS here.

AWS Database

For those who are not familiar, AWS has a suite of database related services including SQL and NoSQL based, simple to transactional to Petabyte (PB) scale data warehouses for big data and analytics. AWS offers the Relational Database Service (RDS) which is a suite of different database types, instances and services. RDS instances and types include MySQL, PostgreSQL, Oracle, SQL Server and the new AWS Aurora offering (read more below). Other little data database and big data repository related offerings include SimpleDB and DynamoDB (NoSQL databases), ElastiCache (in-memory cache repository) and Redshift (large-scale data warehouse and big data repository).

In addition to database services offered by AWS, you can also combine various AWS resources including EC2 compute, EBS and other storage offerings to create your own solution. For example, there are various Amazon Machine Images (AMIs) or pre-built operating systems and database tools available with EC2 as well as via the AWS Marketplace, such as MongoDB and Couchbase among others. For those not familiar with MongoDB, Couchbase, Cassandra, Riak along with other NoSQL or alternative databases and key value repositories, check out Seven Databases in Seven Weeks via my book review of it here.

Seven Databases in Seven Weeks and NoSQL movement available from Amazon.com

Amazon RDS for Aurora

Aurora is a new relational database offering that is part of the AWS RDS suite of services. Positioned as an alternative to commercial high-end databases, Aurora is a cost-effective database engine compatible with MySQL. AWS is claiming 5x better performance than standard MySQL with Aurora while being resilient and durable. Learn more about Aurora, which will be available in early 2015, and its current preview here.

Amazon EC2 C4 instances

AWS will be adding a new C4 instance as the next generation of EC2 compute instance based on Intel Xeon E5-2666 v3 (Haswell) processors. The Intel Xeon E5-2666 v3 processors run at a clock speed of 2.9 GHz, providing the highest level of EC2 performance. AWS is targeting traditional High Performance Computing (HPC) along with other compute intensive workloads including analytics, gaming, and transcoding among others. Learn more about AWS EC2 instances here, and view this Server and StorageIO EC2, EBS and associated AWS primer here.

Amazon EC2 Container Service

Containers such as those via Docker have become popular for helping developers rapidly build as well as deploy scalable applications. AWS has added a new feature called EC2 Container Service that supports Docker using simple APIs. In addition to supporting Docker, EC2 Container Service is a high performance scalable container management service for distributed applications deployed on a cluster of EC2 instances. Similar to other EC2 services, EC2 Container Service leverages security groups, EBS volumes and Identity and Access Management (IAM) roles along with scheduling placement of containers to meet your needs. Note that AWS is not alone in adding container and Docker support, with Microsoft Azure also having recently made some announcements; learn more about Azure and Docker here. Learn more about EC2 Container Service here and more about Docker here.

Continue reading about re:Invent 2014 and other recent AWS enhancements here in part two of this two-part series.

Ok, nuff said (for now)

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

November 30, 2014 (updated April 27, 2025)

Part II: Revisiting re:Invent 2014, Lambda and other AWS updates

server storage I/O trends

Part II: Revisiting re:Invent 2014 and other AWS updates

This is part two of a two-part series about Amazon Web Services (AWS) re:Invent 2014 and other recent cloud updates, read part one here.

AWS re:Invent announcements

Announcements and enhancements made by AWS during re:Invent include:

Key Management Service (KMS)
Amazon RDS for Aurora
Amazon EC2 Container Service
AWS Lambda
Amazon EBS Enhancements
Application development, deployed and life-cycle management tools
AWS Service Catalog
AWS CodeDeploy
AWS CodeCommit
AWS CodePipeline

AWS Lambda

In addition to announcing new higher performance Elastic Compute Cloud (EC2) instances along with a container service, another new service is AWS Lambda. Lambda is a service that automatically and quickly runs your application code in response to events, activities, or other triggers. Unlike standard EC2 per hour billing, the Lambda service is billed in 100 millisecond increments along with corresponding memory use. What this means is that instead of paying for an hour of time for your code to run, you can choose to use the Lambda service with more fine-grained consumption billing.

Lambda service can be used to have your code functions staged and ready to execute. AWS Lambda can run your code in response to S3 bucket content (e.g. objects) changes, messages arriving via Kinesis streams or table updates in databases. Some examples include responding to events such as a web-site click, responding to a data upload (photo, image, audio, file or other object), indexing, streaming or analyzing data, receiving output from a connected device (think Internet of Things IoT or Internet of Devices IoD), or a trigger from an in-app event among others. The basic idea with Lambda is to be able to pay for only the amount of time needed to do a particular function without having to have an AWS EC2 instance dedicated to your application. Initially Lambda supports Node.js (JavaScript) based code that runs in its own isolated environment.

AWS cloud example
Various application code deployment models

Lambda is a pay for what you consume service; charges are based on the number of requests for your code function (e.g. application), amount of memory and execution time. There is a free tier for Lambda that includes 1 million requests and 400,000 GByte seconds of time per month. A GByte second is one GByte of memory (e.g. DRAM vs. storage) consumed for one second. For example, if your application is run 100,000 times and each run lasts 1 second while consuming 128MB of memory, that is 100,000 seconds x 128MB = 12,800,000 MB seconds, or 12,500 GB seconds (at 1,024MB per GB). View various pricing models here on the AWS Lambda site that show examples for different memory sizes, number of times a function runs and run time.
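Working the GByte second arithmetic through in code can help avoid slipped decimal points. The sketch below is only an illustration: the function name is made up, and it assumes 1GB = 1,024MB when converting memory size. For 100,000 runs of 1 second each at 128MB it gives 12,800,000 MB seconds, which is 12,500 GB seconds.

```python
FREE_TIER_GB_SECONDS = 400_000  # monthly free tier figure quoted in this post


def gb_seconds(invocations, seconds_per_run, memory_mb, mb_per_gb=1024):
    """Memory-time consumed: runs x seconds per run x memory, expressed in GB seconds."""
    mb_seconds = invocations * seconds_per_run * memory_mb
    return mb_seconds / mb_per_gb


used = gb_seconds(invocations=100_000, seconds_per_run=1, memory_mb=128)
print(f"{used:,.0f} GB seconds used")
print("within free tier" if used <= FREE_TIER_GB_SECONDS else "beyond free tier")
```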

How much memory you select for your application code determines how long it can run in the AWS free tier, which is available to both existing and new customers. Lambda fees are based on the total across all of your functions, starting from when the code runs. Note that you could have from one to thousands or more different functions running in Lambda service. As of this time, AWS is showing Lambda pricing as free for the first 1 million requests, and beyond that, $0.20 per 1 million requests ($0.0000002 per request) plus a charge for duration. Duration is measured from when your code starts running until it ends or otherwise terminates, rounded up to the nearest 100ms. The Lambda price also depends on the amount of memory you allocated for your code. Once past the 400,000 GByte second per month free tier, the fee is $0.00001667 for every GB second used.
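Putting the request and duration fees together, a rough monthly estimate might look like the following sketch. It uses only the prices and free tier amounts quoted above (which may have changed since), treats every run as having the same duration and memory size, and assumes 1GB = 1,024MB; the function and constant names are hypothetical.

```python
import math

REQUEST_PRICE = 0.20 / 1_000_000  # dollars per request beyond the free tier
GB_SECOND_PRICE = 0.00001667      # dollars per GB second beyond the free tier
FREE_REQUESTS = 1_000_000         # free requests per month
FREE_GB_SECONDS = 400_000         # free GB seconds per month


def billed_seconds(duration_ms):
    """Round a run time up to the next 100ms increment, returned in seconds."""
    return math.ceil(duration_ms / 100) * 100 / 1000


def monthly_cost(requests, duration_ms, memory_mb):
    """Estimate a monthly bill in dollars for identical runs of one function."""
    gb_seconds = requests * billed_seconds(duration_ms) * memory_mb / 1024
    request_fee = max(0, requests - FREE_REQUESTS) * REQUEST_PRICE
    duration_fee = max(0.0, gb_seconds - FREE_GB_SECONDS) * GB_SECOND_PRICE
    return round(request_fee + duration_fee, 2)


# 100,000 one-second runs at 128MB stay inside the free tier.
print(monthly_cost(100_000, 1000, 128))
# 30 million 200ms runs at 128MB go beyond it.
print(monthly_cost(30_000_000, 200, 128))
```

The second case works out to 750,000 GB seconds, of which 350,000 are billable, plus 29 million billable requests. Always check the current AWS pricing page before relying on figures like these.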

Why use AWS Lambda vs. an EC2 instance

Why would you use AWS Lambda vs. provisioning a container or EC2 instance, or running your application code function on a traditional or virtual machine?

If you need control and can leverage an entire physical server with its operating system (O.S.), application and support tools for your piece of code (e.g. JavaScript), that could be an option. If you simply need to have an isolated image instance (O.S., applications and tools) for your code on a shared virtual on-premises environment, then that can be an option. Likewise, if you have the need to move your application to an isolated cloud machine (CM) that hosts an O.S. along with your application, paying for those resources such as on an hourly basis, that could be your option. If you simply need a lighter-weight container to drop your application into, that's where Docker and containers come into play to off-load some of the traditional application dependencies overhead.

However, if all you want to do is add some code logic to support processing activity, for example when an object, file or image is uploaded to AWS S3, without having to stand up an EC2 instance along with the associated server, O.S. and complete application stack, that's where AWS Lambda comes into play. Simply create your code (initially JavaScript), specify how much memory it needs, define what events or activities will trigger or invoke it, and you have a solution.
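As a rough idea of how little code such a function needs, here is a minimal sketch of a handler that reacts to an S3 upload event. Note the assumptions: at the time of this post Lambda only ran Node.js (JavaScript), so Python is used here purely for illustration (Python support was added to Lambda later). The handler name, sample bucket and key are made up; the Records / s3 / bucket / object layout follows the S3 event notification format.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Return (bucket, key) pairs for each object named in an S3 event notification."""
    uploaded = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded (e.g. spaces as '+'), so decode them.
        key = unquote_plus(s3_info["object"]["key"])
        uploaded.append((bucket, key))
        # Real processing (thumbnailing, indexing and so on) would go here.
    return uploaded


# Local test with a trimmed-down, hypothetical event; no AWS account needed.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
print(handler(sample_event, None))
```

Because the handler is just a function that takes an event, it can be exercised locally with a sample event like the one above before being wired up to a real bucket.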

View AWS Lambda pricing along with free tier information here.

Amazon EBS Enhancements

AWS is increasing the performance and size of General Purpose SSD and Provisioned IOPS SSD volumes. This means that you can create General Purpose SSD volumes of up to 16TB and 10,000 IOPS. For EBS Provisioned IOPS SSD volumes you can create up to 16TB with up to 20,000 IOPS. General Purpose SSD volumes deliver a maximum throughput (bandwidth) of 160 MBps, and Provisioned IOPS SSD volumes have been specified by AWS at 320 MBps when attached to EBS optimized instances. Learn more about EBS capabilities here. Verify your IO size and verify AWS sizing information to avoid surprises, as all IO sizes are not considered to be the same. Learn more about Provisioned IOPS, optimized instances, EBS and EC2 fundamentals in this StorageIO AWS primer here.
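Why IO size matters can be shown with some simple arithmetic, since bandwidth is roughly IOPS multiplied by IO size. The sketch below is a back-of-envelope illustration, not AWS sizing guidance: the function names are made up, and it assumes 1MB = 1,024KB and that a volume is limited by whichever of its IOPS or throughput ceilings is reached first. The 10,000 IOPS and 160 MBps figures are the General Purpose SSD limits mentioned above.

```python
def throughput_mbps(iops, io_size_kb, limit_mbps):
    """Bandwidth implied by an IOPS rate and IO size, capped at the volume limit."""
    demand = iops * io_size_kb / 1024
    return min(demand, limit_mbps)


def iops_at_limit(io_size_kb, limit_mbps):
    """Highest IOPS rate of a given IO size that fits under a throughput limit."""
    return limit_mbps * 1024 / io_size_kb


# 10,000 IOPS of 16KB each fits just under a 160 MBps ceiling...
print(throughput_mbps(10_000, 16, 160))
# ...but at 64KB the same ceiling is reached long before 10,000 IOPS.
print(iops_at_limit(64, 160))
```

In other words, with small IOs the IOPS limit tends to be reached first, while with larger IOs the throughput limit is, which is why the IO size behind any quoted IOPS number needs checking.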

Application development, deployed and life-cycle management tools

In addition to compute and storage resource enhancements, AWS has also announced several tools to support application development and configuration along with deployment (life-cycle management). These include tools that AWS itself uses as part of building and maintaining the AWS platform services.

AWS Config (Preview e.g. early access prior to full release)

Management, reporting and monitoring capabilities including Data center infrastructure management (DCIM) for monitoring your AWS resources, configuration (including history), governance, change management and notifications. AWS Config enables similar capabilities to support DCIM, Change Management Database (CMDB), troubleshooting and diagnostics, auditing, resource and configuration analysis among other activities. Learn more about AWS Config here.

AWS Service Catalog

AWS announced a new service catalog that will be available in early 2015. This new service capability will enable administrators to create and manage catalogs of approved resources for users to use via their personalized portal. Learn more about AWS service catalog here.

AWS CodeDeploy

To support rapid code deployment automation for EC2 instances, AWS has released CodeDeploy. CodeDeploy masks the complexity associated with deployment when adding new features to your applications while reducing error-prone manual operations. As part of the announcement, AWS mentioned that they are using CodeDeploy as part of their own application development, maintenance, change-management and deployment operations. While suited for at scale deployments across many instances, CodeDeploy also works with as little as a single EC2 instance. Learn more about AWS CodeDeploy here.

AWS CodeCommit

For application code management, AWS will be making available in early 2015 a new service called CodeCommit. CodeCommit is a highly scalable secure source control service that hosts private Git repositories. Supporting standard Git functionality, including collaboration, you can store things from source code to binaries while working with your existing tools. Learn more about AWS CodeCommit here.

AWS CodePipeline

To support application delivery and release automation along with associated management tools, AWS is making available CodePipeline. CodePipeline is a tool (service) that supports build, checking workflows, code staging, testing and release to production, including support for 3rd party tool integration. CodePipeline will be available in early 2015, learn more here.

Additional reading and related items

Learn more about the above and other AWS services by actually trying them hands on using their free tier (AWS Free Tier). View AWS re:Invent produced breakout session videos here, audio podcasts here, and session slides here (all sessions may not yet be uploaded by AWS re:Invent).

What this all means

AWS continues to invest as well as re-invest in its environment, both adding new feature functionality and expanding the extensibility of those features. This means that AWS, like other vendors or service providers, adds new check-box features, however like some they also increase the depth and extensibility of those capabilities. Besides adding new features and increasing the extensibility of existing capabilities, AWS is addressing both the data and information infrastructure including compute (server), storage and database, and networking along with associated management tools, while also adding extra developer tools. Developer tools include life-cycle management supporting code creation, testing, tracking and change management among other management activities.

Another observation is that while AWS continues to promote the public cloud such as those services they offer as the present and future, they are also talking hybrid cloud. Granted you have to listen carefully as you may not simply hear hybrid cloud used like some toss it around, however listen for and look into AWS Virtual Private Cloud (VPC), along with what you can do using various technologies via the AWS marketplace. AWS is also speaking the language of enterprise and traditional IT from an applications and development to data and information infrastructure perspective while also walking the cloud talk. What this means is that AWS realizes that they need to help existing environments evolve and make the transition to the cloud which means speaking their language vs. converting them to cloud conversations to then be able to migrate them to the cloud. These steps should make AWS practical for many enterprise environments looking to make the transition to public and hybrid cloud at their pace, some faster than others. More on these and some related themes in future posts.

The AWS re:Invent event continues to grow year over year. I heard a figure of over 12,000 people, however it was not clear if that included exhibiting vendors, AWS people, attendees, analysts, bloggers and media among others. However, a simple validation is that the keynotes were in the larger rooms used by events such as EMCworld and VMworld when they are hosted in Las Vegas, as was the expo space, vs. what I saw last year while at re:Invent. Unlike some large events such as VMworld where at best there is a waiting queue or line to get into sessions or hands on labs (HOL), while becoming more crowded, AWS re:Invent is still easy to get into and spend some time using the HOL, which is of course powered by AWS, meaning you can resume later what you started while at re:Invent. Overall a good event and nice series of enhancements by AWS, looking forward to next year's AWS re:Invent.

Ok, nuff said (for now)

Cheers gs

November 24, 2014 (updated December 29, 2025)

November 2014 Server StorageIO Update Newsletter

November 2014

Hello and welcome to this November Server and StorageIO update newsletter. Enjoy this edition of the Server and StorageIO update newsletter and watch for new tips, articles, StorageIO lab report reviews, blog posts, videos and podcasts along with in the news commentary appearing soon.

Cheers gs

Industry Trends and Perspectives

Storage trends

A few weeks ago I attended AWS re:Invent 2014 in Las Vegas for a few days. For those of you who have not yet attended this event, I recommend adding it to your agenda. If you have interest in compute servers, networking, storage, development tools or management of cloud (public, private, hybrid), virtualization and related topic themes, you should check out AWS re:Invent. For those who need an AWS primer or refresher visit here.

Commentary In The News

Following are some StorageIO industry trends perspectives comments that have appeared in various venues. Cloud conversations continue to be popular including concerns about privacy, security and availability.

Over at Processor: Comments on Datacenters, Decide Whether To Build Or Not To Build, and controlling storage costs via insight and action. EdTechMagazine has some comments on IaaS and Is Lean IT Here to Stay, while at CyberTrend there are perspectives on Better Servers for Better Business.

Across the pond over at the UK based Computerweekly, comments on AWS launching Aurora cloud-based relational database engine, and hybrid cloud storage. Some comments on Overland Storage RAINcloud can be found at SearchStorage, while SearchDatabackup has some comments on Symantec break-up making sense for storage.

For those of you who speak Dutch, here is an interview (via it-infra.nl) I did when in Holland earlier this year about storage and your business.

View other industry trends comments here

Tips and Articles

View recent as well as past tips and articles here

StorageIOblog posts

Recent StorageIOblog posts include:

View other recent as well as past blog posts here

In This Issue

Industry Trends Perspectives

Commentary in the news

Tips and Articles

StorageIOblog posts

Events & Activities

November 11-13, 2014
AWS re:Invent Las Vegas

View other recent and upcoming events here

Webinars

December 11, 2014 – BrightTalk
Server & Storage I/O Performance

December 10, 2014 – BrightTalk
Server & Storage I/O Decision Making

December 9, 2014 – BrightTalk
Virtual Server and Storage Decision Making

December 3, 2014 – BrightTalk
Data Protection Modernization

November 13 9AM PT – BrightTalk
Software Defined Storage

November 11 10AM PT
Google+ Hangout Dell BackupU

November 11 9AM PT – BrightTalk
Software Defined Data Centers

Videos and Podcasts

Video: Click to view VMworld 2014 update

StorageIO podcasts are also available via and at StorageIO.tv

From StorageIO Labs

Research, Reviews and Reports

Lenovo ThinkServer TD340
Earlier this year I did a review of the Lenovo ThinkServer TS140 in the StorageIO Labs (see the review here), in fact I ended up buying a TS140 after the review, and a few months back picked up yet another one. This StorageIOlab review looks at the Lenovo ThinkServer TD340 Tower Server which, besides having a larger model number than the TS140, also has a lot more capabilities (server compute, memory, I/O slots and internal hot-swap storage bays). Read more about the TD340 here.

Resources and Links

Check out these useful links and pages:
storageio.com/links
objectstoragecenter.com
storageioblog.com/data-protection-diaries-main/
storageio.com/ssd

Ok, nuff said (for now)

Cheers
Gs

October 28, 2014 (updated March 7, 2022)

What does server storage I/O scaling mean to you?

Scaling means different things to various people depending on the context or what it is referring to.

For example, scaling can mean having or doing more of something, or less, as well as referring to how more or less of something is implemented.

Scaling occurs in a couple of different dimensions and ways:

Application workload attributes – Performance, Availability, Capacity, Economics (PACE)
Stability without compromise or increased complexity
Dimension and direction – Scaling-up (vertical), scaling-out (horizontal), scaling-down

Scaling PACE – Performance Availability Capacity Economics

Often I hear people talk about scaling only in the context of space capacity. However there are other aspects including performance and availability, as well as scaling-up or scaling-out. Scaling from an application workload perspective includes four main group themes which are performance, availability, capacity and economics (as well as energy).

Performance – Transactions, IOP’s, bandwidth, response time, errors, quality of service
Availability – Accessibility, durability, reliability, HA, BC, DR, Backup/Restore, BR, data protection, security
Capacity – Space to store information or place for workload to run on a server, connectivity ports for networks
Economics – Capital and operating expenses, buy, rent, lease, subscription

Scaling with Stability

The latter of the above items should be thought of more in terms of a by-product, result or goal for implementing scaling. Scaling should not result in a compromise of some other attribute, such as increasing performance at the loss of capacity or with increased complexity. Scaling with stability also means that as you scale in some direction, or across some attribute (e.g. PACE), there should not be a corresponding increase in complexity of management, or loss of performance and availability. To use a popular buzz-term, scaling with stability means performance, availability, capacity and economics should scale linearly with their capabilities, or perhaps cost less.

Scaling directions: Scaling-up, scaling-down, scaling-out

server and storage i/o scale options

Some examples of scaling in different directions include:

Scaling-up (vertical scaling with bigger or faster)
Scaling-down (vertical scaling with less)
Scaling-out (horizontal scaling with more of what is being scaled)
Scaling-up and out (combines vertical and horizontal)

Of course you can combine the above in various combinations such as the example of scaling up and out, as well as apply different names and nomenclature to suit your needs or preferences. The following is a closer look at the above with some simple examples.

server and storage i/o scale up
Example of scaling up (vertically)

server and storage i/o scale down
Example of scaling-down (e.g. for smaller scenarios)

server and storage i/o scale out
Example of scaling-out (horizontally)

server and storage i/o scale out
Example of scaling-out and up (horizontally and vertically)
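The scaling directions illustrated above can also be sketched with some simple arithmetic. The following Python sketch models a storage tier as a number of nodes with a given capacity per node; the `Tier` class, its field names and the sample sizes are assumptions made for illustration, not something tied to a particular product.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Tier:
    """Hypothetical storage tier: some nodes, each with some capacity."""
    nodes: int
    tb_per_node: int

    @property
    def total_tb(self):
        return self.nodes * self.tb_per_node


def scale_vertical(tier, tb_per_node):
    """Scale up (bigger nodes) or down (smaller nodes), same node count."""
    return replace(tier, tb_per_node=tb_per_node)


def scale_out(tier, extra_nodes):
    """Scale out by adding more nodes of the same size."""
    return replace(tier, nodes=tier.nodes + extra_nodes)


base = Tier(nodes=4, tb_per_node=10)                    # 40 TB starting point
scaled_up = scale_vertical(base, tb_per_node=20)        # same nodes, bigger
scaled_down = scale_vertical(base, tb_per_node=5)       # same nodes, smaller
scaled_out = scale_out(base, extra_nodes=4)             # more nodes, same size
scaled_up_and_out = scale_out(scaled_up, extra_nodes=4) # bigger and more

for name, tier in [("base", base), ("up", scaled_up), ("down", scaled_down),
                   ("out", scaled_out), ("up and out", scaled_up_and_out)]:
    print(f"{name}: {tier.nodes} nodes x {tier.tb_per_node} TB = {tier.total_tb} TB")
```

Note that scaling up and scaling out can arrive at the same total capacity by different routes; which one fits depends on the performance, availability and economics of each approach, not just the space.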

Summary and what this means

There are many aspects to scaling, as well as side-effects or impacts as a result of scaling.

Scaling can refer to different workload attributes as well as how to support those applications.

Regardless of what you take scaling to mean, keep in mind the context of where and when the term is used, and that others might have a different view of scale.

Ok, nuff said (for now)…

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

Cloud Conversations: AWS EFS Elastic File System (Cloud NAS) First Preview Look

AWS EFS and Cloud Storage, Beyond Buzzword Bingo

What is EFS

What EFS is not

Where to learn more

What this all means and wrap-up

Cloud Conversations: AWS S3 Cross Region Replication storage enhancements

The Problem, Issue, Challenge, Opportunity and Need

Understanding the challenge and designing a strategy

What is AWS S3 Cross-region replication

S3 Cross-region replication and alternative approaches

AWS S3 cross-region hands on experience (first look)

Where to learn more

What this all means and wrap-up

Data Protection Diaries: Are your restores ready for World Backup Day 2015?

The Problem, Issue, Challenge, Opportunity and Need

Understanding the challenge and designing a strategy

Finding a solution

Taking action

Where to learn more

What this all means and wrap-up

How to test your HDD SSD AFA Hybrid or cloud storage

Building off the basics, server storage I/O benchmark fundamentals

Where To Learn More

What This All Means

Server Storage I/O Benchmark Tools: Microsoft Diskspd (Part I)

Background

What is Microsoft Diskspd?

What can Diskspd do?

What type of storage does Diskspd work with?

What information does Diskspd produce?

Where to get Diskspd?

New to server storage I/O benchmarking or tools?

How do you use Diskspd?

Where to learn more

Wrap up and summary, for now…

Microsoft Diskspd (Part II): Server Storage I/O Benchmark Tools

Microsoft Diskspd StorageIO lab test drive

Where to learn more

Comments and wrap-up

What I like about Diskspd (Pros)

Where Diskspd could be improved (Cons)

Summary

How well do you know good bad ugly I/O iops?

Server Storage I/O optimization and effectiveness

Locality of reference (or proximity)

SSD to the rescue?

Where to gain insight into your server storage I/O environment

Wrap up and summary

Where To Learn More

What This All Means

Green and Virtual Data Center Primer

Moving beyond Green Hype and Green washing

Where to learn more

What this all means

Green and Virtual IT Data Center Links

Moving beyond Green Hype and Green washing

Enabling Effective Productive Efficient Economical Flexible Scalable Resilient Information Infrastructures

White papers, analyst reports and perspectives

Vendor Centric and Marketing Website Links and tools

Green and Virtual Tools

What’s your power, cooling, floor space, energy, environmental or green story?

Green and Virtual Metrics

Disclaimer and notes

What this all means

Revisiting RAID data protection remains relevant and resources

What’s the best RAID level? The one that meets YOUR needs

Wait, Isn’t RAID dead?

If RAID is alive, what to do with it?

What to do next?

Where To Learn More