Microsoft Azure Elastic SAN from Cloud to On-Prem

What is Azure Elastic SAN

Azure Elastic SAN (AES) is a new (now generally available) Azure cloud-native storage service that provides scalable, resilient, cost-effective storage with easy management, rapid provisioning, and high performance. AES (figure 1) supports many workloads and compute resources. Workloads that benefit from AES include tier 1 and tier 2 workloads such as mission critical, database, and VDI, among others that traditionally rely on consolidated, shared Storage Area Network (SAN) storage.

Compute resources that can use AES include bare metal (BM) physical machines (PMs), virtual machines (VMs), and containers, among others, all using iSCSI for access. AES is accessible by compute resources and services within the Azure cloud in various regions (check the Azure website for specific region availability) and from on-prem core and edge locations using iSCSI. The AES management experience and value proposition are similar to traditional hardware or software-defined shared SAN storage combined with Azure cloud-based management capabilities.

Figure 1 General Concept and Use of Azure Elastic SAN (AES)

While Microsoft Azure describes AES as a cloud-native storage solution, that does not mean AES is only for containers, other cloud-native apps, or DevOps. Rather, AES has been built for and is native to the cloud (e.g., software-defined), and it can be accessed by various compute and other resources (e.g., VMs, containers, AKS, etc.) using iSCSI.

How Azure Elastic SAN differs from other Azure Storage

AES differs from traditional Azure block storage (e.g., Azure Disks) in that the storage is independent of the host compute server (e.g., BM, PM, VM, containers). With AES, similar to a conventional software-defined or hardware-based shared SAN solution, storage is disaggregated from host servers for sharing and management, using iSCSI for connectivity. By comparison, traditional Azure VM-based storage is typically associated with a given virtual machine in a DAS (Direct Attached Storage) type configuration. Likewise, as in conventional on-prem environments, there can be a mix of DAS and SAN, including some host servers that leverage both.

AES supports Azure VM, Azure Kubernetes Service (AKS), cloud-native, edge, and on-prem computing (BM, VM, etc.) via iSCSI. Support for Azure VMware Solution (AVS) is in preview; check the Microsoft Azure website for updates and new feature functionality enhancements.

Does this mean everything is moving to AES? Similar to traditional SANs, there are roles and needs for various storage options, including DAS, shared block, file, and object, among other storage offerings. Likewise, Microsoft Azure has expanded its storage offerings to include AES, DAS (Azure Disks, including Ultra, Premium, and Standard, among other options), append, block, and page blobs (objects), and files, including Azure File Sync, tables, and Data Box, among other storage services.

Azure Elastic SAN Feature Highlights

AES feature highlights include, among others:

    • Management via Azure Portal and associated tools
    • Azure cloud-based shared scalable block storage
    • Scalable capacity, low latency, and high performance (IOPS and throughput)
    • Space capacity optimized without the need for data reduction
    • Accessible from within the Azure cloud and from on-prem using iSCSI
    • Supports Azure compute (VMs, containers/AKS, Azure VMware Solution)
    • On-prem access via iSCSI from PM/BM, VM, and containers
    • Variable number of volumes and volume size per volume group
    • Flexible, easy-to-use Azure cloud-based management
    • Encryption and network private endpoint security
    • Local (LRS) and zone (ZRS) redundancy for replication resiliency
    • Volume snapshots and cluster support

Who is Azure Elastic SAN for

AES is for those who need cost-effective, shared, resilient, high-capacity, high-performance (IOPS, bandwidth), and low-latency block storage within Azure and with on-prem access. Others who can benefit from AES include those who need shared block storage for clustered app workloads, server and storage consolidation, and hybrid cloud and migration scenarios. It is also a consideration for those familiar with traditional hardware and software-defined SANs looking to facilitate hybrid and migration strategies.

How Azure Elastic SAN works

Azure Elastic SAN is a software-defined (cloud-native if you prefer) block storage offering that presents a virtual SAN accessible within the Azure cloud and to on-prem core and edge locations, currently via iSCSI. Using iSCSI, Azure VMs, clusters, containers, and Azure VMware Solution, among other compute and services, as well as on-prem BM/PM, VMs, and containers, can access AES storage volumes.

From the Azure Portal or associated tools (Azure CLI or PowerShell), create an AES SAN, giving it a 3 to 24-character name, and specify storage capacity (base units with performance plus any additional space capacity). Next, create a volume group, assigning it to a specific subscription and resource group (new or existing), then specify which Azure region to use, the type of redundancy (LRS or ZRS), and the zone to use. LRS provides local redundancy, while ZRS provides enhanced zone resiliency with high-speed synchronous replication, without setting up multiple SAN systems and their associated replication configurations and networking considerations (e.g., Azure takes care of that for you within the service).

The next step is to create volumes by specifying the volume name, the volume group to use, the volume size in GB, maximum IOPS, and bandwidth. Once you have made your AES volume group and volumes, you can create private endpoints, change security and access controls, and access the volumes from Azure or on-prem resources using iSCSI. Note that AES currently needs to be LRS (not ZRS) for clustered shared storage, and that key management includes using your own keys with Azure Key Vault.

Using Azure Elastic SAN

Using AES is straightforward, and there are good, easy-to-follow guides from Microsoft Azure (see the additional resources listed later in this post).

The following images show what AES looks like from the Azure Portal, as well as from an Azure Windows Server VM and an on-prem physical machine (e.g., a Windows 10 laptop).

Figure 2 AES Azure Portal Big Picture

Figure 3 AES Volume Groups Portal View

Figure 4 AES Volumes Portal View

Figure 5 AES Volume Snapshot Views

Figure 6 AES Connected Volume Portal View

Figure 7 AES Volume iSCSI view from on-prem Windows Laptop

Figure 8 AES iSCSI Volume attached to Azure VM

Azure Elastic SAN Cost Pricing

The cost of AES is elastic, depending on whether you scale capacity with performance (e.g., base unit) or add more space capacity. If you need more performance, add base unit capacity, increasing IOPS, bandwidth, and space. In other words, base capacity includes storage space and performance, which you can grow in various increments. Remember that AES storage resources get shared across volumes within a volume group.

Azure Elastic SAN is billed hourly based on a monthly per-capacity base unit rate, with a minimum of 1TB provisioned capacity and minimum performance (e.g., 5,000 IOPS, 200MBps bandwidth). The base unit rate varies by region and type of redundancy, aka resiliency. For example, at the time of this writing, looking at US East, the locally redundant storage (LRS) base unit rate is 1TB with 5,000 IOPS and 200MBps bandwidth, costing $81.92 per unit per month.

The above example breaks down to a rate of $0.08 per GB per month, or $0.000110 per GB per hour (assuming 730 hours per month). Simply adding storage capacity without increasing the base unit (e.g., performance) for US East costs $61.44 per TB per month. That works out to $0.06 per GB per month (no additional provisioned IOPS or bandwidth) or $0.000083 per GB per hour.
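
To sanity check that arithmetic, here is a small Python sketch using the example US East rates quoted above (rates vary by region and change over time, so treat the numbers as illustrative only):

```python
# Illustrative only: uses the example US East LRS rates quoted above; actual
# Azure Elastic SAN pricing varies by region and over time, so check the Azure
# pricing page and calculator.
GB_PER_TB = 1024
HOURS_PER_MONTH = 730  # customary billing assumption

def per_gb_rates(monthly_rate_per_tb):
    """Return ($ per GB per month, $ per GB per hour) for a monthly per-TB rate."""
    per_gb_month = monthly_rate_per_tb / GB_PER_TB
    per_gb_hour = per_gb_month / HOURS_PER_MONTH
    return round(per_gb_month, 4), round(per_gb_hour, 6)

# Base unit: 1TB capacity + 5,000 IOPS + 200MBps at $81.92 per month
print(per_gb_rates(81.92))   # -> (0.08, 0.00011)

# Additional capacity only (no extra provisioned IOPS/bandwidth): $61.44 per TB per month
print(per_gb_rates(61.44))   # -> (0.06, 8.2e-05)
```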

Note that there are extra fees for Zone Redundant Storage (ZRS). Learn more about Azure Elastic SAN pricing here, as well as via a cost calculator here.

Azure Elastic SAN Performance

Performance for Azure Elastic SAN includes IOPS, bandwidth, and latency. AES IOPS increase in increments of 5,000 per base TB. Thus, an AES with a base of 10TB would have 50,000 IOPS distributed (shared) across all of its volumes (e.g., individual volumes are not restricted to a fixed slice). For example, if the base capacity is increased from 10TB to 20TB, then the IOPS would increase from 50,000 to 100,000.

On the other hand, if the base capacity (10TB) is not increased and only the additional storage capacity grows from 10TB to 20TB, the AES would have more capacity but still only the 50,000 IOPS. AES bandwidth (throughput) increases by 200MBps per base TB. For example, a 5TB AES would have 5 x 200MBps (1,000 MBps) of throughput bandwidth shared across the volume group's volumes.

Note that while the performance gets shared across volumes, individual volume performance is determined by its capacity, with a maximum of 80,000 IOPS and up to 1,024 MBps per volume. Thus, to reach 80,000 IOPS and 1,024 MBps, an AES volume would have to be at least 107GB of space capacity. Also, note that the aggregate performance of all volumes cannot exceed the total provisioned for the AES. If you need more performance, then create another AES.
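
Based on the scaling described above (5,000 IOPS and 200MBps per base TB, with per-volume caps of 80,000 IOPS and 1,024 MBps), a rough planning sketch in Python might look like the following. The per-GB volume IOPS rate is inferred from the numbers in this post (80,000 IOPS at roughly 107GB works out to about 750 IOPS per GB), so treat it as an assumption and verify against current Azure documentation:

```python
# Rough Azure Elastic SAN performance planning sketch based on the numbers above.
# The per-GB volume IOPS rate is an inference from this post (80,000 IOPS at ~107GB),
# so verify against current Microsoft documentation before relying on it.
IOPS_PER_BASE_TB = 5_000
MBPS_PER_BASE_TB = 200
VOLUME_MAX_IOPS = 80_000
VOLUME_MAX_MBPS = 1_024
ASSUMED_IOPS_PER_GB = 750          # assumption: 80,000 IOPS / ~107GB

def elastic_san_totals(base_tb):
    """Total shared IOPS and MBps for a given provisioned base capacity (TB)."""
    return base_tb * IOPS_PER_BASE_TB, base_tb * MBPS_PER_BASE_TB

def volume_max_iops(size_gb):
    """Approximate per-volume IOPS limit, capped at the per-volume maximum."""
    return min(size_gb * ASSUMED_IOPS_PER_GB, VOLUME_MAX_IOPS)

print(elastic_san_totals(10))   # -> (50000, 2000): 10TB base
print(elastic_san_totals(20))   # -> (100000, 4000): 20TB base
print(volume_max_iops(107))     # -> 80000 (hits the per-volume cap)
```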

Will all VMs or compute resources see performance improvements with AES? Traditional Azure Disks associated with VMs have per-disk performance resource limits, including IOPs and Bandwidth. Likewise, VMs have storage limits based on their instance type and size, including the number of disks (HDD or SSD), performance (IOPS and bandwidth), and the number of CPUs and memory.

What this means is that an AES volume could have more performance than what a given VM is limited to. Refer to your VM instance sizing and configuration to determine its IOP and bandwidth limits; if needed, explore changing the size of your VM instance to leverage the performance of Azure Elastic SAN storage.
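
In other words, the effective performance a VM sees is bounded by the lower of the volume's limits and the VM instance's own storage limits. A tiny illustration follows; the VM limit numbers are hypothetical, so look up the actual limits for your instance size in the Azure VM documentation:

```python
# Effective performance is capped by whichever is lower: the AES volume's limits or
# the VM instance's storage limits. The VM numbers below are hypothetical examples;
# check the documented limits for your actual VM size.
def effective_limits(volume_iops, volume_mbps, vm_iops_limit, vm_mbps_limit):
    return min(volume_iops, vm_iops_limit), min(volume_mbps, vm_mbps_limit)

# A volume capable of 80,000 IOPS / 1,024 MBps attached to a VM size whose
# (hypothetical) limits are 25,600 IOPS / 384 MBps:
print(effective_limits(80_000, 1_024, 25_600, 384))   # -> (25600, 384), the VM is the bottleneck
```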

Additional Resources and Where to Learn More

The following links are additional resources to learn about Microsoft Azure Elastic SAN and related data infrastructures and tradecraft topics.

Azure AKS Storage Concepts 
Azure Elastic SAN (AES) Documentation and Deployment Guides
Azure Elastic SAN Microsoft Blog
Azure Elastic SAN Overview
Azure Elastic SAN Performance topics
Azure Elastic SAN Pricing calculator
Azure Products by Region (see where AES is currently available)
Azure Storage Offerings 
Azure Virtual Machine (VM) sizes
Azure Virtual Machine (VM) types
Azure Elastic SAN General Pricing
Azure Storage redundancy 
Azure Service Level Agreements (SLA) 
StorageIOBlog.com Data Box Family 
StorageIOBlog.com Data Box Review
StorageIOBlog.com Data Box Test Drive 
StorageIOblog.com Microsoft Hyper-V Alive Enhanced with Win Server 2025
StorageIOblog.com If NVMe is the answer, what are the questions?
StorageIOblog.com NVMe Primer (or refresh)

Additional learning experiences, along with common questions (and answers), are found in my Software Defined Data Infrastructure Essentials book.


What this all means

Azure Elastic SAN (AES) is a new and now generally available shared block storage offering that is accessible using iSCSI from within the Azure cloud and from on-prem environments. Even with iSCSI, AES is relatively easy to set up and use for shared storage, especially if you are used to or currently working with hardware or software-defined SAN storage solutions.

With NVMe over TCP fabrics gaining industry and customer traction, I'm hoping Microsoft adds that in the future. Currently, AES supports LRS and ZRS for redundancy, and an excellent future enhancement would be to add Geo Redundant Storage (GRS) capabilities for those who need it.

I like the option of elastic shared storage regarding performance, availability, capacity, and economic costs (PACE). If you understand the value proposition of evolving from dedicated DAS to shared SAN (independent of the underlying fabric network), or are currently using some form of on-prem shared block storage, you will find AES familiar and easy to use. Granted, AES is not a solution for everything, as there are roles for other block storage, including DAS such as Azure Disks with VMs within Azure, along with on-prem DAS, as well as file, object and blob, and table storage, among others.

Wrap up

The notion that all cloud storage must be objects or blobs is tied to those who only need, provide, or prefer those solutions. The reality is that everything is not the same; thus, there is a need for various storage mediums, devices, tiers, access methods, and types of services. Microsoft Azure has done an excellent job of providing that breadth, and I like what Microsoft Azure is doing with Azure Elastic SAN.

Ok, nuff said, for now.

Cheers Gs

Greg Schulz – Nine time Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2018. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of UnlimitedIO LLC.

March 31st is world backup day; when is world recovery day


If March 31st is world backup day, when is world recovery day?

For several years now, March 31st has been World Backup Day, a reminder to protect and back up your apps and data. Data protection, including backup, recovery, business continuance (BC), disaster recovery (DR), and business resilience (BR), should be a 365-day-a-year focus. If you have regular data protection, including backup, that is great; when was the last time you tested a restore?

Some related content

Upcoming and past events including webinars, tips and commentary
World Backup Day Reminder Don’t Be an April Fool Test Your Data Recovery
Data Infrastructure Overview, Its What’s Inside of a Data Center
Application Data Value Characteristics Everything Is Not The Same
Data Protection Diaries Topics Tools Techniques Technologies Tips

Reminder to Protect your data and apps and settings

Thus, this is also a reminder to protect your data, apps, and their settings regularly. What's even better is evolving from no protection, or once-a-year protection, to more frequent data protection, including backup of your critical and noncritical apps and data. Notice I keep mentioning apps and not just the usual focus on data. Application programs are, broadly speaking, also data; after all, apps along with your settings and metadata are just data when stored and protected.

There is also often a focus on just the data, which can lead to problems when it comes time to recover an app program, settings, or metadata. Also, a reminder that data protection, including backup, is not just for large enterprises; it applies to organizations and entities of all sizes, including small and medium businesses (SMBs), non-profits, and homes (e.g., your photos, worksheets, and other documents).

What About Recovery

If March 31st is world backup day, when is world recovery day? So far, I have been talking about backup as part of data protection or ensuring your apps, data, and settings are protected; what about recovery?

Sometimes with data protection, discussions can drift into what’s more critical, backup or recovery, which is a bit like a chicken and egg situation. In other words, what’s more important, the chicken or the egg? Similar to data protection, what’s more critical, backup or recovery?

Recovery is only as good as your backup (or snapshot, point-in-time copy, checkpoint, or consistency point), and your backup or protection copy is only as good as its recoverability. Recoverability means not only that there is something to restore from a point in time (e.g., recovery point objective, or RPO) within a given amount of time (recovery time objective, or RTO).

Recoverability also means that you can pull the data (e.g., bits, bytes, blocks, blobs, objects, files, tables) from the protection medium, media, or service and use it. Recovery means that the data is valid and consistent, has integrity, or is otherwise not bad, missing, damaged, or corrupted (e.g., usable).
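
As a simple illustration of how the protection interval bounds your exposure, the following minimal Python sketch models worst-case RPO (it is a simplified model for illustration, not a sizing tool):

```python
from datetime import timedelta

# Simplified model: if protection copies are taken every N hours, a failure just
# before the next copy can lose up to roughly N hours of changes (worst-case RPO).
def worst_case_rpo(protection_interval_hours):
    return timedelta(hours=protection_interval_hours)

print(worst_case_rpo(24))  # daily protection -> up to ~1 day of lost changes
print(worst_case_rpo(1))   # hourly snapshots -> up to ~1 hour
```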

What About Recovery Day?

For several years I have mentioned and will continue to do so that if March 31st is world backup day, then April 1st should be a world recovery day. So why April 1st for world recovery day? Simple, you don’t want to look like a fool the day after world backup day if you can’t restore and use data backed up the day before.

If you are not comfortable with April 1st for world recovery day, then make your world recovery day (or test) a day or so later. The important message is to ensure your apps, data, and settings are protected (e.g., copied, backed up, snapshotted, checkpointed, etc.), then trust yet verify, and test your restorations.

Why do I mention apps, data, and settings?

The important message here is that it is good if you are already protecting your data, your spreadsheets, worksheets, databases, files, photos, and the application programs that use them. However, also ensure that you are protecting application settings, configurations, metadata, encryption keys, the backup or protection mechanisms, and their data.

For example, when I accidentally delete a data file or configuration settings, I can restore those without recovering everything. Suppose, for instance, I accidentally or intentionally uninstall an application program. In that case, I can reinstall it (assuming I have a copy of the program), then restore my settings and pick up where I left off.

Who does this apply to?

This applies to organizations of all sizes and types as well as to individuals. If you have, generate, or save data, and it is worth having (or you have to keep it), then it should be protected. How often to protect data (the time interval) is based on your recovery point objective (RPO). Likewise, how fast you need to recover is defined by your recovery time objective (RTO).

Remember that it is not if you will need to restore, recover, reload, refresh, or repair your apps, data, and settings, but when. It might be because of accidental or planned deletion, hardware, software, or cloud service issues, ransomware, or malware, among other things that can and do happen.

What to do?

If March 31st is world backup day, when is world recovery day? Ensure you have regular copies of your apps, data, and configuration settings, including encryption keys. Implement a variation of the old school three two one (3 2 1) data protection (e.g., backup) scheme: three or more copies, stored on two or more devices, systems, or media, with at least one of them offsite and preferably offline, including in the cloud.

A variation of the new school 4 3 2 1 data protection scheme has:

  • Four or more versions of your protected data.
  • Three or more copies (feel free to swap the number of copies and versions).
  • Stored on two or more different systems (devices, media, or locations).
  • At least one copy offsite (preferably with one offline), including cloud.

The big difference between the old school 3 2 1 and the new school 4 3 2 1 is the emphasis on and distinction between having multiple copies and having various versions (e.g., points in time). For example, storing three copies on two systems with one offsite is good, unless all of those copies are of the same damaged or corrupted version. Having different versions (e.g., points in time) and multiple copies of those versions stored in different places, including at least one offline (e.g., air-gapped), is essential.
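
As an illustration, here is a hypothetical Python sketch that checks a set of protection copies against the 4 3 2 1 guideline described above; the copy records and field names are made up for the example:

```python
# Hypothetical example: check a set of protection copies against a 4 3 2 1 guideline
# (4+ versions, 3+ copies, 2+ different systems/media, 1+ offsite, ideally 1 offline).
# The copy records and field names below are made up for illustration.
copies = [
    {"version": "2024-03-28", "system": "nas-01",   "offsite": False, "offline": False},
    {"version": "2024-03-29", "system": "nas-01",   "offsite": False, "offline": False},
    {"version": "2024-03-30", "system": "usb-disk", "offsite": True,  "offline": True},
    {"version": "2024-03-31", "system": "cloud",    "offsite": True,  "offline": False},
]

def meets_4321(copies):
    return {
        "4+ versions": len({c["version"] for c in copies}) >= 4,
        "3+ copies": len(copies) >= 3,
        "2+ systems/media": len({c["system"] for c in copies}) >= 2,
        "1+ offsite": any(c["offsite"] for c in copies),
        "1+ offline (air-gapped)": any(c["offline"] for c in copies),
    }

for rule, ok in meets_4321(copies).items():
    print(f"{rule}: {'OK' if ok else 'MISSING'}")
```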

Trust yet verify, test your backups and recovery

Test to verify your data protection is working and that data (apps, data, settings) can be restored. When testing restores, be careful not to overwrite your good data and cause a disaster. Also, ensure your data is encrypted in multiple locations and layers and that you protect your encryption keys. Finally, make sure your backup, protection software, catalog, and settings are encrypted, secured, and protected.

If you have questions or are not sure, you can learn more in my books Software Defined Data Infrastructure Essentials (CRC Press) and Data Infrastructure Management Insight and Strategies (CRC Press), check out the resources listed below, or reach out to me or others. If you are an individual consumer just looking to protect some photos, valuable documents, and heirlooms, get in touch with professionals who specialize in these types of things.

What do I do?

Implement 4 3 2 1 type data protection with different granularities and frequencies. For example, my data protection includes regular point-in-time copies, including backups and snapshots, checkpoints, consistency points of systems, volumes, shares, apps, files, data, and settings at different intervals. Having different types of apps and data, some of which are more static vs. others that are changing, protection is also varied to avoid treating everything the same, reduce cost, and increase coverage.

I protect my Apps, data, and settings with multiple versions and copies locally on different systems, devices, mediums, and offsite, including offline and at cloud services. So why do I store data offsite vs. having it all in the cloud? Simple, speed of recovery, and flexibility.

If it's a few files, perhaps a few GBs of data, and I don't have a good copy locally, it is usually faster for me to get it from Microsoft Azure. On the other hand, if I need to restore TBs of data (something terrible happens), then it can be faster to bring an offline, offsite copy back, restore from that, and then only pull the more recent data I need from the cloud.
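
To illustrate the speed-of-recovery point, the following sketch compares rough restore times for pulling data down from a cloud service versus restoring from a locally attached copy; the link and device throughput numbers are assumptions for the example, not measurements:

```python
# Rough restore-time comparison; throughput numbers are assumptions for illustration
# (e.g., a 100 Mbit/s internet link vs. a locally attached disk copy).
def hours_to_restore(data_gb, throughput_mb_per_s):
    return (data_gb * 1024) / throughput_mb_per_s / 3600

cloud_link_mb_per_s = 100 / 8     # 100 Mbit/s internet link is roughly 12.5 MB/s
local_copy_mb_per_s = 150         # locally attached HDD/SSD copy, assumed 150 MB/s

for size_gb in (5, 500, 5000):    # a few files vs. TBs of data
    print(f"{size_gb}GB: "
          f"{hours_to_restore(size_gb, cloud_link_mb_per_s):.1f}h from cloud vs "
          f"{hours_to_restore(size_gb, local_copy_mb_per_s):.1f}h from local copy")
```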

What are some of the tools and technologies that I use?

Locally I have multiple Microsoft Windows Servers (Server 2022) with various storage (HDDs and SSDs), including removable devices. In addition to on-prem, I have data stored offsite on removable media and cloud copies. For my cloud copies, I have a mix of files and blobs stored at Microsoft Azure.

A challenge moving from AWS to Azure was that Retrospect did not support objects (Azure blobs). Then I realized, no worries: Retrospect supports storing data on local storage (SSD or HDD) on regular filesystems as files. The solution was to set up an Azure file share for Retrospect, and everything has worked great.

Are there things I need and want to improve? Yes, it’s an ongoing process and journey.

What should you do next?

Make sure you have a data backup; if not, March 31st is a good reminder. Trust yet verify that your backups are working and that you can recover, so you are not an April 1st fool.

Where to learn more

Learn more about world backup day, recovery and data protection along with other related topics via the following links:

Upcoming and past events including webinars, tips and commentary
Next Generation Hybrid Data Infrastructures Are In Your Future
Cloud File Data Storage Consolidation and Economic Comparison Model
New Book Data Infrastructure Management Insight Strategies
World Backup Day Reminder Don’t Be an April Fool Test Your Data Recovery
Virtual, Cloud and IT Availability, it’s a shared responsibility
Don’t Stop Learning Expand Your Skills Experiences Everyday
Data Infrastructure Overview, Its What’s Inside of a Data Center
Application Data Value Characteristics Everything Is Not The Same
Data Protection Diaries Topics Tools Techniques Technologies Tips
Data Infrastructure Server Storage I/O related Tradecraft Overview

Additional learning experiences can be found in Software Defined Data Infrastructure Essentials book. Also check out Data Infrastructure Management Insight and Strategies.


What this all means

If March 31st is world backup day, when is world recovery day? Every day should be a backup day (e.g., some protection, backup, copy, snapshot, checkpoint, or consistency point). Likewise, every day should be able to be a recovery day. World backup day and recovery apply to organizations of all sizes as well as individuals. So remember: if March 31st is world backup day, when is world recovery day?

Ok, nuff said.

Cheers gs

Greg Schulz – Multi-year Microsoft MVP Cloud and Data Center Management, ten-time VMware vExpert. Author of Data Infrastructure Insights (CRC Press), Software Defined Data Infrastructure Essentials (CRC). Cloud and Virtual Data Storage Networking (CRC), The Green and Virtual Data Center (CRC), Resilient Storage Networks (Elsevier). Visit twitter @storageio as well as www.picturesoverstillwater.com to view various UAS/UAV e.g. drone based aerial content created by Greg Schulz. Courteous comments are welcome for consideration. First published on https://storageioblog.com. Any reproduction without attribution or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. Visit our companion site https://picturesoverstillwater.com to view drone based aerial photography and video related topics. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO and UnlimitedIO LLC.


WekaIO Matrix Scale Out Software Defined Storage SDS


Updated 2/11/2018

WekaIO Matrix is a scale-out software-defined storage (SDS) solution.


This Server StorageIO Industry Trends Perspective report looks at common issues and trends, and how to address different application server storage I/O challenges. In this report, we look at WekaIO Matrix, an elastic, flexible, highly scalable, easy-to-use (and manage) software-defined (e.g., software-based) storage solution. WekaIO Matrix enables flexible, elastic scaling with stability and without compromise.

Matrix is a new scale out software defined storage solution that:

  • Installs on bare metal, virtual or cloud servers
  • Has POSIX, NFS, SMB, and HDFS storage access
  • Adaptable performance for little and big data
  • Tiering of flash SSD and cloud object storage
  • Distributed resilience without compromise
  • Removes complexity of traditional storage
  • Deploys on bare metal, virtual and cloud environments

Where To Learn More

View additional SDS and related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What This All Means

Read more about WekaIO Matrix in this (free, no registration required) Server StorageIO Industry Trends Perspective (ITP) Report compliments of WekaIO.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Who Will Be At Top Of Storage World Next Decade?


server storage I/O data infrastructure trends

Data storage, regardless of whether it is hardware or software, legacy, new or emerging, a cloud service, or one of the various software defined storage (SDS) approaches, is a fundamental resource component of data infrastructures, along with compute servers and I/O networking, as well as management tools, techniques, processes, and procedures.

fundamental Data Infrastructure resource components
Fundamental Data Infrastructure resources

Data infrastructures include legacy as well as software defined data infrastructures (SDDI), software defined data centers (SDDC), cloud, and other environments that support expanding workloads more efficiently as well as effectively (e.g., boosting productivity).

Data Infrastructures and workloads
Data Infrastructure and other IT Layers (stacks and altitude levels)

Various data infrastructures resource components spanning server, storage, I/O networks, tools along with hardware, software, services get defined as well as composed into solutions or services which may in turn be further aggregated into more extensive higher altitude offerings (e.g. further up the stack).

IT and Data Infrastructure Stack Layers
Various IT and Data Infrastructure Stack Layers (Altitude Levels)

Focus on Data Storage Present and Future Predictions

Drew Robb (@Robbdrew) has a good piece over at Enterprise Storage Forum looking at the past, present, and future of who will rule the data storage world, which includes perspective and prediction comments from myself as well as others. Some of the perspectives and predictions by others are more generic, technology trend and buzzword bingo focused, which should not be a surprise; for example, the usual performance, Cloud and Object Storage, DPDK, RDMA/RoCE, Software-Defined, NVM/Flash/SSD, CI/HCI, and NVMe, among others.

Here are some excerpts from Drews piece along with my perspective and prediction comments of who may rule the data storage roost in a decade:

Amazon Web Services (AWS) – AWS includes cloud and object storage in the form of S3. However, there is more to storage than object and S3, with AWS also having Elastic File Services (EFS), Elastic Block Storage (EBS), database, message queue, and on-instance storage, among others, for traditional, emerging, and Internet of Things (IoT) workloads.

It is difficult to think of AWS not being a major player in a decade unless they totally screw up their execution in the future. Granted, some of their competitors might be working overtime putting pins and needles into Voodoo Dolls (perhaps bought via Amazon.com) while wishing for the demise of Amazon Web Services, just saying.

Voodoo Dolls via Amazon.com
Voodoo Dolls and image via Amazon.com

Of course, Amazon and AWS could follow the likes of Sears (e.g., some may remember their catalog) and ignore the future, ending up on the where-are-they-now list. While talking about Amazon and AWS, one has to wonder where Walmart will end up in a decade, with or without a cloud of their own.

Microsoft – With Windows, Hyper-V and Azure (including Azure Stack), if there is any company in the industry outside of AWS or VMware that has quietly expanded its reach and positioning into storage, it is Microsoft, said Schulz.

Microsoft IMHO has many offerings and capabilities across different dimensions as well as playing fields. There is the installed base of Windows Servers (and desktops) that have the ability to leverage Software Defined Storage including Storage Spaces Direct (S2D), ReFS, cache and tiering among other features. In some ways I’m surprised by the number of people in the industry who are not aware of Microsoft’s capabilities from S2D and the ability to configure CI as well as HCI (Hyper Converged Infrastructure) deployments, or of Hyper-V abilities, Azure Stack to Azure among others. On the other hand, I run into Microsoft people who are not aware of the full portfolio offerings or are just focused on Azure. Needless to say, there is a lot in the Microsoft storage related portfolio as well as bigger broader data infrastructure offerings.

NetApp – Schulz thinks NetApp has the staying power to stay among the leading lights of data storage. Assuming it remains as a freestanding company and does not get acquired, he said, NetApp has the potential of expanding its portfolio with some new acquisitions. “NetApp can continue their transformation from a company with a strong focus on selling one or two products to learning how to sell the complete portfolio with diversity,” said Schulz.

NetApp has been around and survived up to now including via various acquisitions, some of which have had mixed results vs. others. However assuming NetApp can continue to reinvent themselves, focusing on selling the entire solution portfolio vs. focus on specific products, along with good execution and some more acquisitions, they have the potential for being a top player through the next decade.

Dell EMC – Dell EMC is another stalwart Schulz thinks will manage to stay on top. “Given their size and focus, Dell EMC should continue to grow, assuming execution goes well,” he said.

There are some who I hear are or have predicted the demise of Dell EMC, granted some of those predicted the demise of Dell and or EMC years ago as well. Top companies can and have faded away over time, and while it is possible Dell EMC could be added to the where are they now list in the future, my bet is that at least while Michael Dell is still involved, they will be a top player through the next decade, unless they mess up on execution.

Cloud and software defined storage data infrastructure
Various Data Infrastructures and Resources involving Data Storage

Huawei – Huawei is one of the emerging giants from China that are steadily gobbling up market share. It is now a top provider in many categories of storage, and its rapid ascendancy is unlikely to stop anytime soon. “Keep an eye on Huawei, particularly outside of the U.S. where they are starting to hit their stride,” said Schulz.

In the US, you have to look or pay attention to see or hear what Huawei is doing involving data storage, however that is different in other parts of the world. For example, I see and hear more about them in Europe than in the US. Will Huawei do more in the US in the future? Good question, keep an eye on them.

VMware – A decade ago, Storage Networking World (SNW) was by far the biggest event in data storage. Everyone who was anyone attended this twice yearly event. And then suddenly, it lost its luster. A new forum known as VMworld had emerged and took precedence. That was just one of the indicators of the disruption caused by VMware. And Schulz expects the company to continue to be a major force in storage. “VMware will remain a dominant player, expanding its role with software-defined storage,” said Schulz.

VMware has a dominant role in data storage not just because of the relationship with Dell EMC, or because of VSAN, which continues to gain in popularity, or the soon to be released VMware on AWS solution options, among others. Sure, all of those matter; however, keep in mind that VMware solutions also tie into and work with other legacy as well as software-defined storage solutions, services, and tools spanning block, file, and object for virtual machines as well as containers.

"Someday soon, people are going to wake up like they did with VMware and AWS," said Schulz. "That’s when they will be asking ‘When did Microsoft get into storage like this in such a big way.’"

What the above means is that some environments may not be paying attention to what AWS, Microsoft, and VMware, among others, are doing, perhaps discounting them as the old or existing while focusing on the new, emerging, whatever is trendy in the news this week. On the other hand, some environments may see the solution offerings from those mentioned as not relevant to their specific needs, or not capable of scaling to their requirements.

Keep in mind that it was not that long ago, just a few years that VMware entered the market with what by today’s standard (e.g. VSAN and others) was a relatively small virtual storage appliance offering, not to mention many people discounted and ignored VMware as a practical storage solution provider. Things and technology change, not to mention there are different needs and solution requirements for various environments. While a solution may not be applicable today, give it some time, keep an eye on them to avoid being surprised asking the question, how and when did a particular vendor get into storage in such a big way.

Is Future Data Storage World All Cloud?

Perhaps someday everything involving data storage will be in or part of the cloud.

Does this mean everything is going to the cloud, or at least in the next ten years? IMHO the simple answer is no, even though I see more workloads, applications, and data residing in the cloud, there will also be an increase in hybrid deployments.

Note that those hybrids will span local and on-premises or on-site if you prefer, as well as across different clouds or service providers. Granted some environments are or will become all in on clouds, while others are or will become a hybrid or some variation. Also when it comes to clouds, do not be scared, be prepared. Also keep an eye on what is going on with containers, orchestration, management among other related areas involving persistent storage, a good example is Dell EMCcode RexRay among others.

Server Storage I/O resources
Various data storage focus areas along with data infrastructures.

What About Other Vendors, Solutions or Services?

In addition to those mentioned above, there are plenty of other existing, new and emerging vendors, solutions, and services to keep an eye on, look into, test and conduct a proof of concept (PoC) trial as part of being an informed data infrastructure and data storage shopper (or seller).

Keep in mind the component suppliers, some of whom, like Cisco, also provide turnkey solutions that are part of other vendors' offerings (e.g., Dell EMC VxBlock and NetApp FlexPod, among others), as well as Broadcom (which includes Avago/LSI and Brocade Fibre Channel, among others), Intel (servers, I/O adapters, memory, and SSDs), Mellanox, Micron, Samsung, Seagate, and many others.

Others to keep an eye on include E8, Excelero, Elastifile (software defined storage), Enmotus (micro-tiering, read the Server StorageIOlab report here), Everspin (persistent and storage class memories including NVDIMM), Hedvig (software defined storage), NooBaa, Nutanix, Pivot3, Rozo (software defined storage), and WekaIO (scale out elastic software defined storage, read the Server StorageIO report here).

Some other software defined management tools, services, solutions, and components I'm keeping an eye on, exploring, and digging deeper into (or plan to) include Blue Medora, Datadog, Dell EMCcode and RexRay docker container storage volume management, Google, HPE, IBM Bluemix Cloud (aka IBM Softlayer), Kubernetes, Mangstor, OpenStack, Oracle, Retrospect, Rubrik, Quest, StarWind, SolarWinds, StorPool, Turbonomic, and Virtuozzo (software defined storage), among many others.

What about those not mentioned? Good question, some of those I have mentioned in earlier Server StorageIO Update newsletters, as well as many others mentioned in my new book "Software Defined Data Infrastructure Essentials" (CRC Press). Then there are those that once I hear something interesting from on a regular basis will get more frequent mentions as well. Of course, there is also a list to be done someday that is basically where are they now, e.g. those that have disappeared, or never lived up to their full hype and marketing (or technology) promises, let’s leave that for another day.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Where To Learn More

Learn more about related technology, trends, tools, techniques, and tips with the following links.

Data Infrastructures and workloads
Data Infrastructures Resources (Servers, Storage, I/O Networks) enabling various services

Software Defined Data Infrastructure Essentials Book SDDC

What This All Means

It is safe to say that each new year will bring new trends, techniques, technologies, tools, features, functionality, and solutions involving data storage as well as data infrastructures. This means a usual safe bet is to say that the current year is the most exciting and has more new things than in the past when it comes to data infrastructures along with resources such as data storage. Keep in mind that there are many aspects to data infrastructures as well as storage, all of which are evolving. Who will be at the top of the storage world next decade? What say you?

Ok, nuff said (for now…).

Cheers
Gs

Greg Schulz – Multi-year Microsoft MVP Cloud and Data Center Management, VMware vExpert (and vSAN). Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Watch for the spring 2017 release of his new book "Software-Defined Data Infrastructure Essentials" (CRC Press).

Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2023 Server StorageIO(R) and UnlimitedIO. All Rights Reserved.

AWS S3 Storage Gateway Revisited (Part I)


AWS S3 Storage Gateway Revisited (Part I)

This Amazon Web Services (AWS) Storage Gateway Revisited post is a follow-up to the AWS Storage Gateway test drive and review I did a few years ago (thus why it's called revisited). As part of a two-part series, this first post looks at what AWS Storage Gateway is, how it has improved since my last review of AWS Storage Gateway, along with deployment options. The second post in the series looks at a sample test drive deployment and use.

If you need an AWS primer and overview of various services such as Elastic Cloud Compute (EC2), Elastic Block Storage (EBS), Elastic File Service (EFS), Simple Storage Service (S3), Availability Zones (AZ), Regions and other items check this multi-part series (Cloud conversations: AWS EBS, Glacier and S3 overview (Part I) ).


As a quick refresher, S3 is the AWS bulk, high-capacity unstructured data and object storage service, along with its companion deep cold (e.g., inactive) Glacier. There are various S3 storage service classes, including standard, reduced redundancy storage (RRS), and infrequent access (IA), that have different availability, durability, performance, service level, and cost attributes.

Note that S3 IA is not Glacier, as your data always remains online accessible, while Glacier data can be offline. AWS S3 can be accessed via its API, HTTP REST calls, and AWS tools, along with those from third parties. Third-party tools include NAS file access such as S3FS for Linux, which I use on my Ubuntu systems to mount S3 buckets and use them like other mount points. Other tools include Cloudberry, S3 Motion, and S3 Browser, as well as plug-ins available in most data protection (backup, snapshot, archive) software tools and storage systems today.
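
For example, a minimal sketch using boto3 (the AWS SDK for Python) to write an object into a specific S3 storage class might look like the following; the bucket and key names are placeholders, and AWS credentials would need to be configured:

```python
import boto3

# Minimal illustration: write an object into S3 using the Infrequent Access (IA)
# storage class. Bucket and key names are placeholders; AWS credentials must be
# configured (e.g., via environment variables or ~/.aws/credentials).
s3 = boto3.client("s3")
with open("report.pdf", "rb") as body:
    s3.put_object(
        Bucket="example-bucket",     # placeholder bucket name
        Key="archive/report.pdf",    # placeholder object key
        Body=body,
        StorageClass="STANDARD_IA",  # or "STANDARD", "REDUCED_REDUNDANCY", etc.
    )
```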

AWS S3 Storage Gateway and What’s New

The Storage Gateway is the AWS tool that you can use for accessing S3 buckets and objects via your block volume, NAS file, or tape based applications. The Storage Gateway is intended to give S3 bucket and object access to on-premises applications and data infrastructure functions, including data protection (backup/restore, business continuance (BC), business resiliency (BR), disaster recovery (DR), and archiving), along with storage tiering to cloud.

Some of the things that have evolved with the S3 Storage Gateway include:

  • Easier, streamlined download, installation, deployment
  • Enhanced Virtual Tape Library (VTL) and Virtual Tape support
  • File serving and sharing (not to be confused with Elastic File Services (EFS))
  • Ability to define your own bucket and associated parameters
  • Bucket options including Infrequent Access (IA) or standard
  • Options for AWS EC2 hosted, or on-premises VMware as well as Hyper-V gateways (file only supports VMware and EC2)

AWS Storage Gateway Three Functions

AWS Storage Gateway can be deployed for three basic functions:

    AWS Storage Gateway File Architecture via AWS.com

  • File Gateway (NFS NAS) – Files, folders, objects and other items are stored in AWS S3 with a local cache for low latency access to most recently used data. With this option, you can create folders and subdirectories similar to a regular file system or NAS device, as well as configure various security, permissions, and access control policies. Data is stored in S3 buckets for which you specify policies such as standard or Infrequent Access (IA), among other options. AWS hosted via EC2 as well as a VMware Virtual Machine (VM) for an on-premises file gateway.

    Also, note that AWS cautions on multiple concurrent writers to S3 buckets with Storage Gateway so check the AWS FAQs which may have changed by the time you read this. Current file share limits (subject to change) include 1 file gateway share per S3 bucket (e.g. a one to one mapping between file share and a bucket). There can be 10 file shares per gateway (e.g. multiple shares each with its own bucket per gateway) and a maximum file size of 5TB (same as maximum S3 object size). Note that you might hear about object storage systems supporting unlimited size objects which some may do, however generally there are some constraints either on their API front-end, or what is currently tested. View current AWS Storage Gateway resource and specification limits here.

  • AWS Storage Gateway Non-Cached Volume Architecture via AWS.com

    AWS Storage Gateway Cached Volume Architecture via AWS.com

  • Volume Gateway (Block iSCSI) – Leverages S3 with a point in time backup as an AWS EBS snapshot. Two options exist including Cached volumes with low-latency access to most recently used data (e.g. data is stored in AWS, with a local cache copy on disk or SSD). The other option is Stored Volumes (e.g. non-cached) where primary copy is local and periodic snapshot backups are sent to AWS. AWS provides EC2 hosted, as well as VMs for VMware and various Hyper-V Windows Server based VMs.

    Current Storage Gateway volume limits (subject to change) include maximum size of a cached volume 32TB, maximum size of a stored volume 16TB. Note that snapshots of cached volumes larger than 16TB can only be restored to a storage gateway volume, they can not be restored as an EBS volume (via EC2). There are a maximum of 32 volumes for a gateway with total size of all volumes for a gateway (cached) of 1,024TB (e.g. 1PB). The total size of all volumes for a gateway (stored volume) is 512TB. View current AWS Storage Gateway resource and specification limits here.

  • AWS Storage Gateway VTL Architecture via AWS.com

  • Virtual Tape Library Gateway (VTL) – Supports saving your data for backup/BC/DR/archiving into S3 and Glacier storage tiers. Being a Virtual Tape Library (e.g. VTL) you can specify emulation of tapes for compatibility with your existing backup, archiving and data protection software, management tools and processes.

    Storage Gateway limits for tape include a minimum virtual tape size of 100GB, a maximum virtual tape size of 2.5TB, a maximum of 1,500 virtual tapes per VTL, and a total size of all tapes in a VTL of 1PB. Note that the maximum number of virtual tapes in an archive is unlimited, and the total size of all tapes in an archive is also unlimited. View current AWS Storage Gateway resource and specification limits here (a simple way to sanity check a planned configuration against the limits quoted in this post is sketched after this list).

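As referenced above, here is a small Python sketch that sanity checks a planned Storage Gateway configuration against the limits quoted in this post; these limits are subject to change, so confirm them against the current AWS documentation:

```python
# Sanity check a planned AWS Storage Gateway configuration against the limits quoted
# in this post. Limits are subject to change; confirm against current AWS documentation.
LIMITS = {
    "file shares per gateway": 10,
    "max file/object size (TB)": 5,
    "max cached volume size (TB)": 32,
    "max stored volume size (TB)": 16,
    "volumes per gateway": 32,
    "max virtual tape size (TB)": 2.5,
    "virtual tapes per VTL": 1500,
}

def check(name, planned):
    limit = LIMITS[name]
    status = "OK" if planned <= limit else "EXCEEDS LIMIT"
    print(f"{name}: planned {planned}, limit {limit} -> {status}")

# Example planned configuration (made-up numbers for illustration)
check("file shares per gateway", 4)
check("max cached volume size (TB)", 20)
check("volumes per gateway", 12)
check("max virtual tape size (TB)", 2.5)
```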

Where To Learn More

What This All Means

Which gateway function and mode (cached or non-cached for volumes) to use depends on what it is you are trying to do. Likewise, choosing between EC2 (cloud hosted) or on-premises Hyper-V and VMware VMs depends on what your data infrastructure support requirements are. Overall, I like the progress that AWS has put into evolving the Storage Gateway, granted it might not be applicable for all use cases. Continue reading more and view images from the AWS Storage Gateway Revisited test drive in part two, located here.

Ok, nuff said (for now…).

Cheers
Gs

Greg Schulz – Multi-year Microsoft MVP Cloud and Data Center Management, VMware vExpert (and vSAN). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio.

Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2023 Server StorageIO(R) and UnlimitedIO. All Rights Reserved.

Cloud and Object storage are in your future, what are some questions?

Cloud and Object storage are in your future, what are some questions?


IMHO there is no doubt that cloud and object storage are in your future, what are some questions?

Granted, what type of cloud and object storage or service along with for work or entertainment are some questions.

Likewise, what are your cloud and object storage concerns (assuming you already have heard the benefits)?

Some other questions include when, where for different applications workload needs, as well as how and with what among others.

Keep in mind that there are many aspects to cloud storage and they are not all object, likewise, there are many facets to object storage.

Recently I did a piece over at InfoStor titled Cloud Storage Concerns, Considerations and Trends that looks at the above among other items including:

  • Is cloud storage cheaper than traditional storage?
  • How do you access cloud object storage from legacy block and file applications?
  • How do you implement on-site cloud storage?
  • Is enterprise file sync and share (EFSS) safe and secure?
  • Does cloud storage need to be backed up and protected?
  • What geographic location requirements or regulations apply to you?

When it comes to cloud computing and, in particular, cloud storage, context matters. Conversations are necessary to discuss concerns, as well as discuss various considerations, options and alternatives. People often ask me questions about the best cloud storage to use, concerns about privacy, security, performance and cost.

Some of the most common cloud conversation topics involve context:

  • Public, private or hybrid cloud; turnkey subscription service or do it yourself (DIY)?
  • Storage, compute server, networking, applications or development tools?
  • Storage application such as file sync and share like Dropbox?
  • Storage resources such as table, queues, objects, file or block?
  • Storage for applications in the cloud, on-site or hybrid?

Continue reading Cloud Storage Concerns, Considerations and Trends over at InfoStor.

Where To Learn More

Additional related content can be found at:

What This All Means

As I mentioned above, cloud and object storage are in your future, granted your future may not rely on just cloud or object storage. Take a few minutes to check out some of the conversation topics, tips and trends in my piece over at InfoStor Cloud Storage Concerns, Considerations and Trends along with more material at www.objectstoragecenter.com.

Btw, what are your questions, comments, concerns, claims or caveats as part of cloud and object storage conversations?

Ok, nuff said, for now…

Cheers
Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, vSAN and VMware vExpert. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2023 Server StorageIO(R) and UnlimitedIO All Rights Reserved

Server Storage I/O Benchmark Performance Resource Tools

Server Storage I/O Benchmarking Performance Resource Tools


Updated 1/23/2018

This post collects server storage I/O benchmark performance resource tools, along with various articles and tips. These include tools for legacy, virtual, cloud, and software defined environments.

benchmark performance resource tools server storage I/O performance

The best server and storage I/O (input/output operation) is the one that you do not have to do; the second best is the one with the least impact.

server storage I/O locality of reference

This is where the idea of locality of reference (e.g., how close the data is to where your application is running) comes into play, which is implemented via tiered memory, storage, and caching, shown in the figure above.

Cloud virtual software defined storage I/O

Server storage I/O performance applies to cloud, virtual, software defined and legacy environments

What this has to do with server storage I/O (and networking) performance benchmarking is keeping the ideas of locality of reference, context, and the application workload in perspective, regardless of whether the environment is cloud, virtual, software defined, or legacy physical.
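
One piece of context that helps when reading any of the benchmark results and tools linked below is the basic relationship between IOPS, I/O size, and throughput, and (via Little's law) between IOPS, latency, and outstanding I/Os. A quick back-of-the-envelope sketch:

```python
# Back-of-the-envelope relationships that help when interpreting benchmark results.
def throughput_mb_per_s(iops, io_size_kb):
    """Throughput = IOPS x I/O size."""
    return iops * io_size_kb / 1024

def outstanding_ios(iops, latency_ms):
    """Little's law: average outstanding (concurrent) I/Os = IOPS x latency."""
    return iops * (latency_ms / 1000.0)

print(throughput_mb_per_s(5_000, 8))     # 5,000 IOPS at 8KB   -> ~39 MB/s
print(throughput_mb_per_s(5_000, 256))   # 5,000 IOPS at 256KB -> 1,250 MB/s
print(outstanding_ios(10_000, 2))        # 10,000 IOPS at 2ms  -> ~20 outstanding I/Os
```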

StorageIOblog: I/O, I/O how well do you know about good or bad server and storage I/Os?
StorageIOblog: Server and Storage I/O benchmarking 101 for smarties
StorageIOblog: Which Enterprise HDDs to use for a Content Server Platform (7 part series with using benchmark tools)
StorageIO.com: Enmotus FuzeDrive MicroTiering lab test using various tools
StorageIOblog: Some server storage I/O benchmark tools, workload scripts and examples (Part I) and (Part II)
StorageIOblog: Get in the NVMe SSD game (if you are not already)
Doridmen.com: Transcend SSD360S Review with tips on using ATTO and Crystal benchmark tools
ComputerWeekly: Storage performance metrics: How suppliers spin performance specifications

Via StorageIO Podcast: Kevin Closson discusses SLOB Server CPU I/O Database Performance benchmarks
Via @KevinClosson: SLOB Use Cases By Industry Vendors. Learn SLOB, Speak The Experts’ Language
Via BeyondTheBlocks (Reduxio): 8 Useful Tools for Storage I/O Benchmarking
Via CCSIObench: Cold-cache Sequential I/O Benchmark
Doridmen.com: Transcend SSD360S Review with tips on using ATTO and Crystal benchmark tools
CISJournal: Benchmarking the Performance of Microsoft Hyper-V server, VMware ESXi and Xen Hypervisors (PDF)
Microsoft TechNet:Windows Server 2016 Hyper-V large-scale VM performance for in-memory transaction processing
InfoStor: What’s The Best Storage Benchmark?
StorageIOblog: How to test your HDD, SSD or all flash array (AFA) storage fundamentals
Via ATTO: Atto V3.05 free storage test tool available
Via StorageIOblog: Big Files and Lots of Little File Processing and Benchmarking with Vdbench

Via StorageIO.com: Which Enterprise Hard Disk Drives (HDDs) to use with a Content Server Platform (White Paper)
Via VMware Blogs: A Free Storage Performance Testing Tool For Hyperconverged
Microsoft Technet: Test Storage Spaces Performance Using Synthetic Workloads in Windows Server
Microsoft Technet: Microsoft Windows Server Storage Spaces – Designing for Performance
BizTech: 4 Ways to Performance-Test Your New HDD or SSD
EnterpriseStorageForum: Data Storage Benchmarking Guide
StorageSearch.com: How fast can your SSD run backwards?
OpenStack: How to calculate IOPS for Cinder Storage ?
StorageAcceleration: Tips for Measuring Your Storage Acceleration

server storage I/O STI and SUT

Spiceworks: Determining HDD SSD SSHD IOP Performance
Spiceworks: Calculating IOPS from Perfmon data
Spiceworks: profiling IOPs

Figure: Vdbench example via StorageIOblog.com

StorageIOblog: What does server storage I/O scaling mean to you?
StorageIOblog: What is the best kind of IO? The one you do not have to do
Testmyworkload.com: Collect and report various OS workloads
Whoishostingthis: Various SQL resources
StorageAcceleration: What, When, Why & How to Accelerate Storage
Filesystems.org: Various tools and links
StorageIOblog: Can we get a side of context with them IOPS and other storage metrics?


BrightTalk Webinar: Data Center Monitoring – Metrics that Matter for Effective Management
StorageIOblog: Enterprise SSHD and Flash SSD Part of an Enterprise Tiered Storage Strategy
StorageIOblog: Has SSD put Hard Disk Drives (HDD’s) On Endangered Species List?

server storage I/O bottlenecks and I/O blender

Microsoft TechNet: Measuring Disk Latency with Windows Performance Monitor (Perfmon)
Via Scalegrid.io: How to benchmark MongoDB with YCSB? (Perfmon)
Microsoft MSDN: List of Perfmon counters for sql server
Microsoft TechNet: Taking Your Server’s Pulse
StorageIOblog: Part II: How many IOPS can a HDD, HHDD or SSD do with VMware?
CMG: I/O Performance Issues and Impacts on Time-Sensitive Applications


Virtualization Practice: IO IO it is off to Storage and IO metrics we go
InfoStor: Is HP Short Stroking for Performance and Capacity Gains?
StorageIOblog: Is Computer Data Storage Complex? It Depends
StorageIOblog: More storage and IO metrics that matter
StorageIOblog: Moving Beyond the Benchmark Brouhaha
Yellow-Bricks: VSAN VDI Benchmarking and Beta refresh!

server storage I/O benchmark example

YellowBricks: VSAN performance: many SAS low capacity VS some SATA high capacity?
StorageIOblog: Seagate 1200 12Gbs Enterprise SAS SSD StorageIO lab review
StorageIOblog: Part II: Seagate 1200 12Gbs Enterprise SAS SSD StorageIO lab review
StorageIOblog: Server Storage I/O Network Benchmark Winter Olympic Games


VMware VDImark aka View Planner (also here, here and here) as well as VMmark here
StorageIOblog: SPC and Storage Benchmarking Games
StorageIOblog: Speaking of speeding up business with SSD storage
StorageIOblog: SSD and Storage System Performance

Hadoop server storage I/O performance
Various Server Storage I/O tools in a hadoop environment

Michael-noll.com: Benchmarking and Stress Testing an Hadoop Cluster With TeraSort, TestDFSIO
Virtualization Practice: SSD options for Virtual (and Physical) Environments Part I: Spinning up to speed on SSD
StorageIOblog: Storage and IO metrics that matter
InfoStor: Storage Metrics and Measurements That Matter: Getting Started
SilvertonConsulting: Storage throughput vs. IO response time and why it matters
Splunk: The percentage of Read / Write utilization to get to 800 IOPS?

Various server storage I/O benchmarking tools

Spiceworks: What is the best IO IOPs testing tool out there
StorageIOblog: How many IOPS can a HDD, HHDD or SSD do?
StorageIOblog: Some Windows Server Storage I/O related commands
Openmaniak: Iperf overview and Iperf.fr: Iperf overview
StorageIOblog: Server and Storage I/O Benchmark Tools: Microsoft Diskspd (Part I and Part II)
Quest: SQL Server Perfmon Poster (PDF)
Server and Storage I/O Networking Performance Management (webinar)
Data Center Monitoring – Metrics that Matter for Effective Management (webinar)
Flash back to reality – Flash SSD Myths and Realities (Industry trends & benchmarking tips), (MSP CMG presentation)
DBAstackexchange: How can I determine how many IOPs I need for my AWS RDS database?
ITToolbox: Benchmarking the Performance of SANs

server storage IO labs

StorageIOblog: Dell Inspiron 660 i660, Virtual Server Diamond in the rough (Server review)
StorageIOblog: Part II: Lenovo TS140 Server and Storage I/O Review (Server review)
StorageIOblog: DIY converged server software defined storage on a budget using Lenovo TS140
StorageIOblog: Server storage I/O Intel NUC nick knack notes First impressions (Server review)
StorageIOblog & ITKE: Storage performance needs availability, availability needs performance
StorageIOblog: Why SSD based arrays and storage appliances can be a good idea (Part I)
StorageIOblog: Revisiting RAID storage remains relevant and resources

Interested in cloud and object storage? Visit our objectstoragecenter.com page; for flash SSD check out the storageio.com/ssd page, along with data protection, RAID, various industry links and more here.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What This All Means

Watch for additional links to be added above in addition to those that appear via comments.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

VMware VVOL's storage I/O fundamentals (Part I)

Note that this is a three part series with the first piece here (e.g. Are VMware VVOL's in your virtual server and storage I/O future?), the second piece here (e.g. VMware VVOL's and storage I/O fundamentals Part 1) and the third piece here (e.g. VMware VVOL's and storage I/O fundamentals Part 2).

Some of you may already be participating in the VMware beta of VVOL involving one of the initial storage vendors also in the beta program.

Ok, now let’s go a bit deeper, however if you want some good music to listen to while reading this, check out @BruceRave GoDeepMusic.Net and shows here.

Taking a step back, digging deeper into Storage I/O and VVOL’s fundamentals

Instead of a VM host accessing its virtual disk (aka VMDK), which is stored in a VMFS formatted data store (part of the ESXi hypervisor) built on top of a SCSI LUN (e.g. SAS, SATA, iSCSI, Fibre Channel aka FC, FCoE aka FC over Ethernet, IBA/SRP, etc.) or an NFS file system presented by a storage system (or appliance), VVOLs push more functionality and visibility down into the storage system. VVOLs shift more intelligence and work from the hypervisor down into the storage system. Instead of a storage system simply presenting a SCSI LUN or NFS mount point and having limited (coarse) to no visibility into how the underlying storage bits, bytes and blocks are being used, storage systems gain more awareness.

Keep in mind that even files and objects still get ultimately mapped to pages and blocks (aka sectors), even on nand flash-based SSDs. However, also keep an eye on some new technology such as the Seagate Kinetic drive which, instead of responding to SCSI block-based commands, leverages object APIs and associated software on servers. Read more about these emerging trends here and here at objectstoragecenter.com.

With a normal SCSI LUN the underlying storage system has no knowledge of how the upper level operating system, hypervisor, file system or application such as a database (doing raw I/O) is allocating the pages or blocks of memory aka storage. It is up to the upper level storage and data management tools to map from objects and files to the corresponding extents, pages and logical block addresses (LBAs) understood by the storage system. In the case of a NAS solution, there is a layer of abstraction placed over the underlying block storage handling file management and the associated file-to-LBA mapping activity.
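
As a simple worked example of that mapping, assuming 512-byte sectors and a hypothetical file allocated as a single contiguous extent, a byte offset within a file resolves to a logical block address roughly as shown in this minimal Python sketch.

# Minimal sketch: map a file byte offset to a logical block address (LBA),
# assuming 512-byte sectors and a single contiguous extent (hypothetical layout).
SECTOR = 512

def offset_to_lba(extent_start_lba, byte_offset):
    """Return (lba, offset_within_sector) for a byte offset into the extent."""
    return extent_start_lba + byte_offset // SECTOR, byte_offset % SECTOR

# Example: a file whose extent starts at LBA 1,000,000; access at byte offset 1 MiB.
lba, within = offset_to_lba(1_000_000, 1_048_576)
print(lba, within)   # -> 1002048 0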

Figure: Storage I/O and IOP basics and addressing: LBAs and LBNs

Getting back to VVOLs, instead of simply presenting a LUN, which is essentially a linear range of LBAs (think of a big table or array) where the hypervisor then manages data placement and access, the storage system now gains insight into which LBAs correspond to various entities such as a VMDK or VMX, log, clone, swap or other VMware objects. With this added insight, storage systems can now do native and more granular functions such as clone, replication and snapshot, among others, as opposed to simply working on a coarse LUN basis. Similar concepts extend over to NAS NFS based access. Granted, there is more to VVOLs, including the ability to get the underlying storage system more closely integrated with the virtual machine, hypervisor and associated management, including supported service management and classes or categories of service across performance, availability, capacity and economics.

What about VVOL, VAAI and VASA?

VVOL’s are building from earlier VMware initiatives including VAAI and VASA. With VAAI, VMware hypervisor’s can off-load common functions to storage systems that support features such as copy, clone, zero copy among others like how a computer can off-load graphics processing to a graphics card if present.

VASA, however, provides a means for visibility, insight and awareness between the hypervisor and its associated management (e.g. vCenter etc.) and the storage system. This includes storage systems being able to communicate and publish to VMware their capabilities for storage space capacity, availability, performance and configuration among other things.

With VVOL’s VASA gets leveraged for unidirectional (e.g. two-way) communication where VMware hypervisor and management tools can tell the storage system of things, configuration, activities to do among others. Hence why VASA is important to have in your VMware CASA.

What’s this object storage stuff?

VVOL’s are a form of object storage access in that they differ from traditional block (LUN’s) and files (NAS volumes/mount points). However, keep in mind that not all object storage are the same as there are object storage access and architectures.

Figure: Object Storage basics, generalities and block file relationships

Avoid the mistake of assuming that when you hear object storage it means ANSI T10 (the folks who manage the SCSI command specifications) Object Storage Device (OSD) or some other specific implementation. There are many different types of underlying object storage architectures, some with block and file as well as object access front ends. Likewise, there are many different types of object access that sit on top of object architectures as well as traditional storage systems.

Figure: An example of how some object storage gets accessed (not VMware specific)

Also keep in mind that there are many different types of object access mechanisms, including HTTP REST based, S3 (e.g. a common industry de facto standard based on the Amazon Simple Storage Service), SNIA CDMI, SOAP, Torrent, XAM, JSON, XML, DICOM and HL7, just to name a few, not to mention various programmatic bindings or application specific implementations and APIs. Read more about object storage architectures, access and related topics, themes and trends at www.objectstoragecenter.com.
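
As one concrete example of object access, here is a minimal Python sketch using the boto3 library against an S3 API endpoint; the bucket and key names are hypothetical, and credentials, region and endpoint configuration are assumed to already be in place.

# Minimal sketch of S3 API object access via boto3 (bucket and key names are hypothetical).
import boto3

s3 = boto3.client("s3")  # assumes credentials and region are already configured

# Write (PUT) an object, then read (GET) it back by bucket + key rather than by LUN or file path.
s3.put_object(Bucket="example-bucket", Key="demo/object.txt", Body=b"hello object storage")
resp = s3.get_object(Bucket="example-bucket", Key="demo/object.txt")
print(resp["Body"].read())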

Let's take a break here, and when you are ready, click here to read the third piece in this series, VMware VVOL's and storage I/O fundamentals Part 2.

Ok, nuff said (for now)

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Welcome to the Cloud Bulk Object Storage Resources Center

Updated 8/31/19

Cloud Bulk Big Data Software Defined Object Storage Resources


Welcome to the Cloud, Big Data, Software Defined, Bulk and Object Storage Resources Center Page objectstoragecenter.com.

This object storage resources page, along with software defined, cloud, bulk, and scale-out storage content, is part of the server StorageIOblog microsite collection of resources. Software-defined, Bulk, Cloud and Object Storage exist to support expanding and diverse application data demands.

Other related resources include:

  • Software Defined, Cloud, Bulk and Object Storage Fundamentals
  • Software Defined Data Infrastructure Essentials book (CRC Press)
  • Cloud, Software Defined, Scale-Out, Object Storage News Trends
    Figure: Object storage, via Software Defined Data Infrastructure Essentials (CRC Press 2017)

    Bulk, Cloud, Object Storage Solutions and Services

    There are various types of cloud, bulk, and object storage including public services such as Amazon Web Services (AWS) Simple Storage Service (S3), Backblaze, Google, Microsoft Azure, IBM SoftLayer and Rackspace, among many others. There are also solutions for hybrid and private deployment from Cisco, Cloudian, CTERA, Cray, DDN, Dell EMC, Elastifile, Fujitsu, Vantara/HDS, HPE, Hedvig, Huawei, IBM, NetApp, Noobaa, OpenIO, OpenStack, Quantum, Rackspace, Rozo, Scality, Spectra, StorPool, StorageCraft, Suse, Swift, Virtuozzo, WekaIO, WD, among many others.

    Figure: Bulk, Cloud and Object storage, via Software Defined Data Infrastructure Essentials (CRC Press 2017)

    Cloud products and services among others, along with associated data infrastructures including object storage, file systems, repositories and access methods are at the center of bulk, big data, big bandwidth and little data initiatives on a public, private, hybrid and community basis. After all, not everything is the same in cloud, virtual and traditional data centers or information factories from active data to in-active deep digital archiving.

    Object Context Matters

    Before discussing Object Storage let's take a step back and look at some context that can clarify some of the confusion around the term object. The word object has many different meanings and contexts, both inside of the IT world as well as outside. Context matters with the term object: as a noun it can be a thing that can be seen or touched, or a person or thing toward which action or feeling is directed.

    Besides a person, place or physical thing, an object can be a software-defined data structure that describes something. For example, a database record describing somebody’s contact or banking information, or a file descriptor with name, index ID, date and time stamps, permissions and access control lists along with other attributes or metadata. Another example is an object or blob stored in a cloud or object storage system repository, as well as an item in a hypervisor, operating system, container image or other application.

    Besides being a noun, object can also be a verb, as in to object, expressing disapproval or disagreement with something or someone. From an IT context perspective, an object can also refer to a programming method (e.g. object-oriented programming [OOP], or Java [among other environments] objects and classes) and systems development, in addition to describing entities with data structures.

    In other words, a data structure describes an object that can be a simple variable, constant, complex descriptor of something being processed by a program, as well as a function or unit of work. There are also objects unique or with context to specific environments besides Java or databases, operating systems, hypervisors, file systems, cloud and other things.
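
    To make the data structure sense of the word object concrete, here is a minimal Python sketch of such a descriptor; the field names are illustrative only and not tied to any particular file system or object store schema.

# Illustrative only: an "object" as a data structure describing an item plus its metadata.
# Field names are hypothetical, not tied to any specific file system or object store.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class StoredObject:
    name: str
    object_id: str
    size_bytes: int
    created: datetime
    modified: datetime
    permissions: str                                   # e.g. "rw-r--r--"
    metadata: dict = field(default_factory=dict)       # arbitrary key/value attributes

obj = StoredObject(
    name="statement-2017-01.pdf",
    object_id="c0ffee01",
    size_bytes=204_800,
    created=datetime.now(timezone.utc),
    modified=datetime.now(timezone.utc),
    permissions="rw-r--r--",
    metadata={"content-type": "application/pdf", "owner": "accounting"},
)
print(obj.name, obj.object_id, obj.metadata["content-type"])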

    The Need For Bulk, Cloud and Object Storage

    There is no such thing as an information recession with more data being generated, moved, processed, stored, preserved and served, granted there are economic realities. Likewise as a society our dependence on information being available for work or entertainment, from medical healthcare to social media and all points in between continues to increase (check out the Human Face of Big Data).

    In addition, people and data are living longer, as well as getting larger (hence little data, big data and very big data), which adds to the need for bulk, cloud and object storage.

    Click here to view (and hear) more content including cloud and object storage fundamentals

    Click here to view software defined, bulk, cloud and object storage trend news


    Where to learn more

    The following resources provide additional information about big data, bulk, software defined, cloud and object storage.



    Via InfoStor: Object Storage Is In Your Future
    Via FujiFilm IT Summit: Software Defined Data Infrastructures (SDDI) and Hybrid Clouds
    Via MultiChannel: After ditching cloud business, Verizon inks Virtual Network Services deal with Amazon
    Via MultiChannel: Verizon Digital Media Services now offers integrated Microsoft Azure Storage
    Via StorageIOblog: AWS EFS Elastic File System (Cloud NAS) First Preview Look
    Via InfoStor: Cloud Storage Concerns, Considerations and Trends
    Via Server StorageIO: April 2015 Newsletter Focus on Cloud and Object storage
    Via StorageIOblog: AWS S3 Cross Region Replication storage enhancements
    Cloud conversations: AWS EBS, Glacier and S3 overview
    AWS (Amazon) storage gateway, first, second and third impressions
    Cloud and Virtual Data Storage Networking (CRC Book)

    View more news, trends and related cloud object storage activity here.

    Videos and podcasts at storageio.tv are also available via Apple iTunes.

    Human Face of Big Data (Book review)

    Seven Databases in Seven Weeks (Book review)

    Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


    What This All Means

    Object and cloud storage are in your future; the questions are when, where, with what and how, among others.

    Watch for more content and links to be added here soon to this object storage center page, including posts, presentations, podcasts, polls and perspectives, along with services and product solution profiles.

    Ok, nuff said, for now.

    Gs

    Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

    Many faces of storage hypervisor, virtual storage or storage virtualization


    Storage hypervisors were a popular buzzword bingo topic in 2012, with plenty of industry adoption and some customer deployment. Separating the hype around storage hypervisors reveals conversations around backup, restore, BC, DR and archiving.

    Figure: Cloud and virtualization components

    Storage virtualization, along with virtual storage and storage hypervisors, shares a theme of abstracting underlying physical hardware resources, similar to server virtualization. The abstraction can be for consolidation and aggregation, or for enabling agility, flexibility, emulation and other functionality.


    Storage virtualization can be implemented in different locations, in many ways, with various functionality and focus. For example, the abstraction can occur on a server, in a virtual or physical appliance (e.g. tin-wrapped software), in a network switch or router, as well as in a storage system. The focus can be for aggregation, or data protection (HA, BC, DR, backup, replication, snapshot), on a homogeneous (all one vendor) or mixed vendor (heterogeneous) basis.
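
    As a conceptual sketch of the aggregation form of abstraction (the device names, extent size and layout below are hypothetical), a virtualization layer might map a virtual volume onto extents spread across two backend arrays roughly as follows.

# Minimal sketch: a virtual volume aggregated from extents on two backend devices.
# Device names, extent size and layout are hypothetical examples.
EXTENT_BLOCKS = 1_000_000   # blocks per extent (assumed)

# Ordered list of (backend_device, starting_block_on_that_device) for each extent.
extent_map = [
    ("array-A:LUN7", 0),
    ("array-B:LUN3", 0),
    ("array-A:LUN7", EXTENT_BLOCKS),
]

def resolve(virtual_block):
    """Translate a virtual block address into (backend device, physical block)."""
    extent = virtual_block // EXTENT_BLOCKS
    device, base = extent_map[extent]
    return device, base + (virtual_block % EXTENT_BLOCKS)

print(resolve(1_500_000))   # -> ('array-B:LUN3', 500000)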


    Here is a link to a guest post that I recently did over at The Virtualization Practice looking at storage hypervisors, virtual storage and storage virtualization. As is the case with virtual storage, storage virtualization and storage for virtual environments, what you call a storage hypervisor will probably vary depending on your views, spheres of influence, preferences and other factors.

    Additional related material:

  • Are you using or considering implementation of a storage hypervisor?
  • Cloud, virtualization, storage and networking in an election year
  • EMC VPLEX: Virtual Storage Redefined or Respun?
  • Server and Storage Virtualization – Life beyond Consolidation
  • Should Everything Be Virtualized?
  • How many degrees separate you and your information?
  • Cloud and Virtual Data Storage Networking (CRC)
  • The Green and Virtual Data Center (CRC)
  • Resilient Storage Networks (Elsevier)
  • BTW, as a special offer for viewers, I have some copies of Resilient Storage Networks: Designing Flexible Scalable Data Infrastructures (Elsevier) available for $19.95, shipping and handling included. Send me an email or tweet (@storageio) to learn more and get your copy (major credit cards and PayPal accepted).

    Ok, nuff said (for now)

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    SMB, SOHO and low end NAS gaining enterprise features

    Here is a link to an interview that I did providing industry trends, perspectives and commentary on how Network Attached Storage (NAS), aka file and data sharing, for the Small Medium Business (SMB), Small Office Home Office (SOHO) and consumer or low-end market is gaining features and functionality traditionally associated with larger enterprises, however without the large price. In addition, here is a link to some tips for small business NAS storage and to another perspective on how choosing an SMB NAS is getting easier (and here for comments on unified storage).

    Click on the image below to listen to a pod cast that I did with comments and perspectives involving SMB, SOHO, ROBO and low end NAS.

    Listen to comments by Greg Schulz of StorageIO on SMB, SOHO, ROBO and low end NAS

    If your favorite or preferred product or vendor was not mentioned in the above links, don't worry; as with many media interviews there is a limited amount of time or a narrow scope, so those mentioned were among others in the space.

    Speaking of others, there are many others in the broad and diverse SMB, SOHO, ROBO and consumer NAS and unified storage space. For example, there are QNAP, SMC, Huawei, Buffalo, Synology and StarWind among many others. There is a lot of diversity in this NAS space. You've got Buffalo Technology, Cisco, D-Link, Dell, Data Robotics (Drobo), EMC Iomega, Hewlett-Packard (HP) Co. via Microsoft, Intel, Overland Storage Snap Server, Seagate BlackArmor, Western Digital Corp., and many others. Some of these vendors are household names that you would expect to see in the upper SMB and mid-sized environments, and even into the enterprise.

    For those who have other favorites or want to add another vendor to those already mentioned above, feel free to respond with a polite comment below. Oh and for disclosure, I bought my SMB or low end NAS from Amazon.com and it is an Iomega IX4.

    Ok, nuff said for now.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

    Unified storage systems showdown: NetApp FAS vs. EMC VNX

    Unified storage systems that support concurrent block, file and in some cases object based access have become popular in terms of industry adoption as well as customer deployments, with solutions from many vendors across different price bands or market (customer) sectors. Two companies that are leaders in this space are also squared off against each other (here and here) to compete for existing, each other's, as well as new customers in adjacent or different markets. Those companies are EMC and NetApp, which I have described as two similar companies on parallel tracks offset by time.

    Two companies on parallel tracks offset by time

    Recently I was asked to provide some commentary about unified storage systems in general, as well as EMC and NetApp, that you can read here, or view additional commentary on related themes here, here and here. EMC has a historical block based storage DNA that has evolved to file and object based access, while NetApp originated in the file space, having moved into block based storage along with object based access. EMC converged various product technologies, including those developed organically (e.g. internally) as well as via acquisition, as part of their unified approach. NetApp, which has had a unified product, has more recently added a new line of block products with their acquisition of Engenio from LSI. Obviously there are many other vendors with unified storage solutions that are either native (e.g. the functionality is built into the actual technology) or created by partnering with others to combine their block or file based solutions as a unified offering.

    What is unified storage, what does it enable, and why is it popular now?
    Over the past couple of years, multifunction systems that can do both block- and file-based storage have become more popular. These systems simplify the acquisition process by removing the need to choose while enabling flexibility to use something else later. NAS solutions have evolved to support NFS, CIFS and other TCP-based protocols, including HTTP and FTP, concurrently. NAS or file sharing–based storage continues to gain popularity because of its ease of use and built-in data management capabilities. However, some applications, including Microsoft Exchange or databases, either require block-based storage using SAS, iSCSI, or Fibre Channel, or have manufacturer configuration guidelines for block-based storage.

    Multi protocol storage products enable the following:

  • Acquisition and installation without need for a specialist
  • Use by professionals with varied skills
  • Reprovisioning for different applications requirements
  • Expansion and upgrades to boost future capacity needs

    Figure 1 shows variations of how storage systems, gateways, or appliances can provide multiple functionality support with various interfaces and protocols. The exact protocols, interfaces, and functionality supported by a given system, software stack, gateway, or appliance will vary by specific vendor implementation. Most solutions provide some combination of block and file storage, with increasing support for various object-based access as well. Some solutions provide multiple block protocols concurrently, while others support block, file, and object over Ethernet interfaces. In addition to various front-end or server and application-facing support, solutions also commonly utilize multiple back-end interfaces, protocols, and tiered storage media.

    Figure 1: Multi protocol and function unified storage examples (via Cloud and Virtual Data Storage Networking, CRC Press, 2011)

    For low-end SMB, ROBO, workgroup, SOHO, and consumers, the benefit of multi protocol and unified storage solutions is similar to that of a multifunction printer, copier, fax, and scanner—that is, many features and functionality in a common footprint that is easy to acquire, install, and use in an affordable manner.

    For larger environments, the value proposition of multi protocol and multi functionality is the flexibility and ability to adapt to different usage scenarios that enable a storage system to take on more personalities. What this means is that by being able to support multiple interfaces and protocols along with different types of media and functionality, a storage system becomes multifunctional. A multifunction storage system may be configured for on-line primary storage with good availability and performance and for lower-cost, high-capacity storage in addition to being used as backup target. In other scenarios, a multifunction device may be configured to perform a single function with the idea of later redeploying it to use a different personality or mode of functionality.

    An easy way to determine whether you need multi protocol storage is to look at your environment and requirements. If all you need is FC, FCoE, SAS, iSCSI, or NAS, and a multi protocol device is going to cost you more, it may not be a good fit.

    If you think you may ever need multi protocol capability, and there's no extra charge for it, go ahead. If you're not being penalized in performance, extra management software fees, functionality or availability, and you have the capability, why wouldn't you implement a unified storage system?

    Look for products that have the ability to scale to meet your current and future storage capacity, performance, and availability needs or that can coexist under common management with additional storage systems.

    Vendors of unified storage in addition to EMC and NetApp include BlueArc, Fujitsu, Dell, Drobo, HDS (with BlueArc), HP, IBM, Huawei, Oracle, Overland, Quantum, Symantec and Synology among others.

    So what does this all mean? Simple: if you are not already using unified storage in some shape or form, either at work or perhaps even at home, most likely it will be in your future. Thus the question is not if, rather when, where, with what and how.

    Ok, nuff said for now.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2011 StorageIO and UnlimitedIO All Rights Reserved

    EMC Storage and Management Software Getting FAST

    EMC has announced the availability of the first phase of FAST (Fully Automated Storage Tiering) functionality for their Symmetrix VMAX, CLARiiON and Celerra storage systems.

    FAST was first previewed earlier this year (see here and here).

    Key themes of FAST are to leverage policies for enabling automation to support large scale environments, doing more with what you have along with enabling virtual data centers for traditional, private and public clouds as well as enhancing IT economics.

    This means enabling performance and capacity planning analysis along with facilitating load balancing or other infrastructure optimization activities to boost productivity, efficiency and resource usage effectiveness not to mention enabling Green IT.

    Is FAST revolutionary? That will depend on who you talk or listen to.

    Some vendors will jump up and down, similar to Donkey in Shrek wanting to be picked or noticed, claiming to have been the first to implement LUN or file movement inside of storage systems, or as operating system, file system or volume manager built-in functionality. Others will claim to have done it via third party information lifecycle management (ILM) software, including hierarchical storage management (HSM) tools, among others. Ok, fair enough, then let their games begin (or continue) and I will leave it up to the various vendors and their followings to debate who's got what or not.

    BTW, anyone remember system managed storage on IBM mainframes, or array based movement in HP AutoRAID among others?

    Vendors have also in the past provided built-in or third party add-on tools for providing insight and awareness, ranging from capacity or space usage and allocation storage resource management (SRM) tools to performance advisory activity monitors or chargeback, among others. For example, hot file analysis and reporting tools have been popular in the past, often operating system specific, for identifying candidate files for placement on SSD or other fast storage. Granted, the tools provided insight and awareness, but there was still the time-consuming and error-prone task of decision making and subsequent data movement, not to mention associated downtime.

    What is new here with FAST is the integrated approach: tools that are operating system independent, functionality in the array, availability across different product families and price bands, as well as optimization for improving user and IT productivity in medium to high-end enterprise scale environments.

    One of the knocks on previous technology is either the performance impact to an application when data was moved, or the impact to other applications when data is being moved in the background. Another issue has been avoiding excessive thrashing due to data being moved at the expense of taking performance cycles from production applications. This would also be similar to having too many snapshots or non-optimized RAID rebuilds running in the background on a storage system lacking sufficient performance capability. Another knock has been that historically, either 3rd party host or appliance based software was needed, or solutions were designed and targeted for workgroup, departmental or small environments.

    What is FAST and how is it implemented
    FAST is technology for moving data within storage systems (and externally for Celerra) for load balancing, capacity and performance optimization to meet quality of service (QoS) performance, availability and capacity along with energy and economic initiatives (figure 1) across different tiers or types of storage devices. For example, moving data from slower SATA disks where a performance bottleneck exists to faster Fibre Channel or SSD devices. Similarly, cold or infrequently accessed data on faster, more expensive storage devices can be marked as a candidate for migration to lower cost SATA devices based on customer policies.

    Figure 1: FAST big picture (Source: EMC)

    The premise is that policies are defined based on activity along with capacity to determine when data becomes a candidate for movement. All movement is performed in the background, concurrently, while applications are accessing data without disruption. This means that there are no stub files, application pauses or timeouts, or erratic I/O activity while data is being migrated. Another aspect of FAST data movement, which is performed in the actual storage systems by their respective controllers, is the ability for EMC management tools to identify hot or active LUNs or volumes (files in the case of Celerra) as candidates for moving (figure 2).

    Figure 2: FAST what it does (Source: EMC)

    However, users specify whether they want data moved automatically or under supervision, enabling a deterministic environment where the storage system and associated management tools make recommendations and suggestions for administrators to approve before migration occurs. This capability can be a safeguard as well as a learning mode, enabling organizations to become comfortable with the technology along with its recommendations while applying knowledge of current business dynamics (figure 3).
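
    As a conceptual illustration only, and not EMC's actual FAST policy engine or algorithms, the promote/demote recommendation idea might be sketched in Python as follows; the thresholds and the per-LUN activity numbers are assumed example values.

# Conceptual sketch of an activity-based tiering policy (not EMC FAST's actual logic).
# Thresholds and the per-LUN stats are assumed example values.

PROMOTE_IOPS = 500    # busier than this -> candidate for a faster tier
DEMOTE_IOPS  = 50     # colder than this -> candidate for a cheaper tier

def recommend(lun_stats):
    """Return a list of (lun, action) recommendations for an administrator to approve."""
    actions = []
    for lun, iops in lun_stats.items():
        if iops >= PROMOTE_IOPS:
            actions.append((lun, "promote to SSD/Fibre Channel tier"))
        elif iops <= DEMOTE_IOPS:
            actions.append((lun, "demote to SATA tier"))
    return actions

stats = {"LUN01": 1200, "LUN02": 300, "LUN03": 12}   # hypothetical observed IOPS per LUN
for lun, action in recommend(stats):
    print(lun, "->", action)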

    Figure 3: The value proposition of FAST (Source: EMC)

    FAST is implemented as technology resident or embedded in the EMC VMAX (aka Symmetrix), CLARiiON and Celerra, along with external management software tools. In the case of the block (figure 4) storage systems that support FAST, including the DMX/VMAX and CLARiiON family of products, data movement is on a LUN or volume basis and within a single storage system. For NAS or file based Celerra storage systems, FAST is implemented using FMA technology, enabling movement either within the box or externally to other storage systems on a file basis.

    Figure 4: Example of FAST activity (Source: EMC)

    What this means is that data at the LUN or volume level can be moved across different tiers of storage or disk drives within a CLARiiON instance, or within a VMAX instance (e.g. amongst the nodes). For example, Virtual LUNs are a building block that is leveraged for data movement and migration, combined with external management tools including Navisphere for the CLARiiON and the Symmetrix management console along with Ionix, all of which have been enhanced.

    Note however that initially data is not moved externally between different CLARiiONs or VMAX systems. For external data movement, other existing EMC tools would be deployed. In the case of Celerra, files can be moved within a specific CLARiiON as well as externally across other storage systems. External storage systems that files can be moved across using EMC FMA technology include other Celerras, Centera and Atmos solutions, based upon defined policies.

    What do I like most and why?

    Integration of management tools providing insight, with the ability for users to set up policies as well as approve or intercede with data movement and placement as their specific philosophies dictate. This is key: for those who want to, let the system manage itself, with your supervision of course. For those who prefer to take their time, start with simple steps by using the solution initially to provide insight into hot or cold spots and then to help make decisions on what changes to make. Use the solution and adapt it to your specific environment and philosophy or approach. What a concept, a tool that works for you, vs. you working for it.

    What don't I like and why?

    There is and will remain some confusion about intra and inter box or system data movement and migration, operations that can be done by other EMC technology today for those who need it. For example, I have had questions asking if FAST is nothing more than EMC Invista or some other data mover appliance sitting in front of Symmetrix or CLARiiONs, and the answer is NO. Thus EMC will need to articulate that FAST is both an umbrella term as well as a product feature set combining the storage system along with associated management tools unique to each of the different storage systems. In addition, there will be confusion, at least at GA, about the lack of support for Symmetrix DMX vs. the supported VMAX. Of course with EMC pricing is always a question, so let's see how this plays out in the market with customer acceptance.

    What about the others?

    Certainly some will jump up and down claiming ratification of their visions welcoming EMC to the game while forgetting that there were others before them. However, it can also be said that EMC like others who have had LUN and volume movement or cloning capabilities for large scale solutions are taking the next step. Thus I would expect other vendors to continue movement in the same direction with their own unique spin and approach. For others who have in the past made automated tiering their marketing differentiation, I would suggest they come up with some new spins and stories as those functions are about to become table stakes or common feature functionality on a go forward basis.

    When and where to use?

    In theory, anyone with a Symmetrix/VMAX, CLARiiON or Celerra that supports the new functionality should be a candidate for the capabilities, that is, at least the insight, analysis, monitoring and situational awareness capabilities. Note that this does not mean actually enabling the automated movement initially.

    While the concept is to enable automated system managed storage (Hmmm, Mainframe DejaVu anyone), for those who want to walk before they run, enabling the insight and awareness capabilities can provide valuable information about how resources are being used. The next step would then be to look at the recommendations of the tools, and if you concur with the recommendations, take remedial action by telling the system when the movement can occur at your desired time.

    For those ready to run, then let it rip and take off as FAST as you want. In either situation, look at FAST for providing insight and situational awareness of hot and cold storage, where opportunities exist for optimizing and gaining efficiency in how resources are used, all important aspects for enabling a Green and Virtual Data Center not to mention as well as supporting public and private clouds.

    FYI, FTC Disclosure and FWIW

    I have done content related projects for EMC in the past (see here). They are not currently a client, nor have they sponsored, underwritten, influenced, remunerated, utilized third-party offshore Swiss, Cayman or South American unnumbered bank accounts, or provided any other reimbursement for this post. However, I did personally sign and hand to Joe Tucci a copy of my book The Green and Virtual Data Center (CRC) ;).

    Bottom line

    Do I like what EMC is doing with FAST and this approach? Yes.

    Do I think there is room for improvement and additional enhancements? Absolutely!

    What's my recommendation? Have a look, do your homework and due diligence, and see if it's applicable to your environment while asking other vendors what they will be doing (under NDA if needed).

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

    Clarifying Clustered Storage Confusion

    Clustered storage can be iSCSI, Fibre Channel block based or NAS (NFS or CIFS or proprietary file system) file system based. Clustered storage can also be found in virtual tape library (VTL) including dedupe solutions along with other storage solutions such as those for archiving, cloud, medical or other specialized grids among others.

    Recently in the IT and data storage specific industry there has been a flurry of merger and acquisition (M&A) activity (here and here), along with new product enhancements or announcements around clustered storage. For example, HP bought clustered file system vendor IBRIX, complementing their previous acquisition of another clustered file system vendor (PolyServe) a few years ago as well as of iSCSI block clustered storage software vendor LeftHand earlier this year. Another recent acquisition is that of LSI buying clustered NAS vendor ONstor, not to mention Dell buying iSCSI block clustered storage vendor EqualLogic about a year and a half ago, along with other vendor acquisitions or announcements involving storage and clustering.

    Where the confusion enters into play is the term cluster, which means many things to different people, and even more so when clustered storage is combined with NAS or file based storage. For example, clustered NAS may imply a clustered file system when in reality a solution may only be multiple NAS filers, NAS heads, controllers or storage processors configured for availability or failover.

    What this means is that a NFS or CIFS file system may only be active on one node at a time, however in the event of a failover, the file system shifts from one NAS hardware device (e.g. NAS head or filer) to another. On the other hand, a clustered file system enables a NFS or CIFS or other file system to be active on multiple nodes (e.g. NAS heads, controllers, etc.) concurrently. The concurrent access may be for small random reads and writes for example supporting a popular website or file serving application, or, it may be for parallel reads or writes to a large sequential file.

    Clustered storage is no longer exclusive to the confines of high-performance sequential and parallel scientific computing or ultra large environments. Small files and I/O (read or write), including meta-data information, are also being supported by a new generation of multipurpose, flexible, clustered storage solutions that can be tailored to support different applications workloads.

    There are many different types of clustered and bulk storage systems. Clustered storage solutions may be block (iSCSI or Fibre Channel), NAS or file serving, virtual tape library (VTL), or archiving and object- or content-addressable storage. Clustered storage in general is similar to using clustered servers, providing scale beyond the limits of a single traditional system: scale for performance, scale for availability, and scale for capacity and to enable growth in a modular fashion, adding performance and intelligence capabilities along with capacity.

    For smaller environments, clustered storage enables modular pay-as-you-grow capabilities to address specific performance or capacity needs. For larger environments, clustered storage enables growth beyond the limits of a single storage system to meet performance, capacity, or availability needs.

    Applications that lend themselves to clustered and bulk storage solutions include:

    • Unstructured data files, including spreadsheets, PDFs, slide decks, and other documents
    • Email systems, including Microsoft Exchange Personal (.PST) files stored on file servers
    • Users’ home directories and online file storage for documents and multimedia
    • Web-based managed service providers for online data storage, backup, and restore
    • Rich media data delivery, hosting, and social networking Internet sites
    • Media and entertainment creation, including animation rendering and post processing
    • High-performance databases such as Oracle with NFS direct I/O
    • Financial services and telecommunications, transportation, logistics, and manufacturing
    • Project-oriented development, simulation, and energy exploration
    • Low-cost, high-performance caching for transient and look-up or reference data
    • Real-time performance including fraud detection and electronic surveillance
    • Life sciences, chemical research, and computer-aided design

    Clustered storage solutions go beyond meeting the basic requirements of supporting large sequential parallel or concurrent file access. Clustered storage systems can also support random access of small files for highly concurrent online and other applications. Scalable and flexible clustered file servers that leverage commonly deployed servers, networking, and storage technologies are well suited for new and emerging applications, including bulk storage of online unstructured data, cloud services, and multimedia, where extreme scaling of performance (IOPS or bandwidth), low latency, storage capacity, and flexibility at a low cost are needed.

    The bandwidth-intensive and parallel-access performance characteristics associated with clustered storage are generally known; what is not so commonly known is the breakthrough to support small and random IOPS associated with database, email, general-purpose file serving, home directories, and meta-data look-up (Figure 1). Note that a clustered storage system, and in particular, a clustered NAS may or may not include a clustered file system.

    Figure 1: Generic clustered storage model (courtesy of The Green and Virtual Data Center, CRC)

    More nodes, ports, memory, and disks do not guarantee more performance for applications. Performance depends on how those resources are deployed and how the storage management software enables those resources to avoid bottlenecks. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion when processing diverse application workloads. Other things to consider include support for industry-standard interfaces, protocols, and technologies.

    Scalable and flexible clustered file server and storage systems provide the potential to leverage the inherent processing capabilities of constantly improving underlying hardware platforms. For example, software-based clustered storage systems that do not rely on proprietary hardware can be deployed on industry-standard high-density servers and blade centers and can utilize third-party internal or external storage.

    Clustered storage is no longer exclusive to niche applications or scientific and high-performance computing environments. Organizations of all sizes can benefit from ultra scalable, flexible, clustered NAS storage that supports application performance needs from small random I/O to meta-data lookup and large-stream sequential I/O that scales with stability to grow with business and application needs.

    Additional considerations for clustered NAS storage solutions include the following.

    • Can memory, processors, and I/O devices be varied to meet application needs?
    • Is there support for large file systems supporting many small files as well as large files?
    • What is the performance for small random IOPS and bandwidth for large sequential I/O?
    • How is performance enabled across different applications in the same cluster instance?
    • Are I/O requests, including meta-data look-up, funneled through a single node?
    • How does a solution scale as the number of nodes and storage devices is increased?
    • How disruptive and time-consuming is adding new or replacing existing storage?
    • Is proprietary hardware needed, or can industry-standard servers and storage be used?
    • What data management features, including load balancing and data protection, exist?
    • What storage interface can be used: SAS, SATA, iSCSI, or Fibre Channel?
    • What types of storage devices are supported: SSD, SAS, Fibre Channel, or SATA disks?

    As with most storage systems, it is not the total number of hard disk drives (HDDs), the quantity and speed of tiered-access I/O connectivity, the types and speeds of the processors, or even the amount of cache memory that determines performance. The performance differentiator is how a manufacturer combines the various components to create a solution that delivers a given level of performance with lower power consumption.

    To avoid performance surprises, be leery of performance claims based solely on speed and quantity of HDDs or the speed and number of ports, processors and memory. How the resources are deployed and how the storage management software enables those resources to avoid bottlenecks are more important. For some clustered NAS and storage systems, more nodes are required to compensate for overhead or performance congestion.
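
    As a back-of-the-envelope illustration of why drive count alone can mislead, here is a minimal Python sketch; all of the figures are assumed example values rather than any vendor's specifications, and the point is simply that the deliverable result is capped by the most constrained part of the solution.

# Back-of-the-envelope sketch: raw drive IOPS vs what the controller or software stack can deliver.
# All figures are assumed example values, not vendor specifications.

drives           = 48
iops_per_drive   = 150     # e.g. a 10K RPM HDD doing small random I/O (assumed)
controller_limit = 4000    # controller/software ceiling for this workload (assumed)

raw_aggregate = drives * iops_per_drive
deliverable   = min(raw_aggregate, controller_limit)

print(f"raw drive aggregate:        {raw_aggregate} IOPS")
print(f"deliverable (bottlenecked): {deliverable} IOPS")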

    Learn more about clustered storage (block, file, VTL/dedupe, archive), clustered NAS, clustered file system, grids and cloud storage among other topics in the following links:

    "The Many faces of NAS – Which is appropriate for you?"

    Article: Clarifying Storage Cluster Confusion
    Presentation: Clustered Storage: “From SMB, to Scientific, to File Serving, to Commercial, Social Networking and Web 2.0”
    Video Interview: How to Scale Data Storage Systems with Clustering
    Guidelines for controlling clustering
    The benefits of clustered storage

    Along with other material on the StorageIO Tips and Tools or portfolio archive or events pages.

    Ok, nuff said.

    Cheers gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved