
July 2, 2018 (updated November 26, 2023)

Dell Technologies Announces Class V VMware Tracking Stock exchange for stock or cash


Image via Dell Technologies

Summary of Dell transaction announcement includes:

VMware declares an $11 Billion USD cash dividend pro rata to all VMware stock holders.
Given its ownership percentage of VMware, Dell Technologies will receive approximately $9 Billion USD of that cash dividend.
Dell plans to list its Class C common stock shares on the New York Stock Exchange (NYSE).
Dell plans to use the VMware dividend proceeds to fund cash consideration to be paid to Class V (tracking stock) shareholders.
For each Class V share (i.e. VMware tracking stock), shareholders can choose to receive:
1.3665 shares of Dell Technologies Class C common stock, or
$109 in cash per DVMT (Class V) share, a 29% premium per share
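
As a rough sanity check on the figures above, the following Python sketch works through the arithmetic they imply. It uses only the numbers quoted in this summary; the function names are made up for illustration, and the premium is assumed to be measured against the prior DVMT share price (the announcement summary does not spell out the baseline).

```python
# Figures quoted in the announcement summary above.
CASH_PER_CLASS_V_SHARE = 109.00      # USD cash option per DVMT share
CLASS_C_PER_CLASS_V_SHARE = 1.3665   # Class C shares offered per DVMT share
STATED_PREMIUM = 0.29                # 29% premium per share
VMWARE_DIVIDEND_BILLIONS = 11.0      # total VMware cash dividend (USD)
DELL_DIVIDEND_BILLIONS = 9.0         # approximate portion received by Dell (USD)


def implied_class_c_price(cash: float, exchange_ratio: float) -> float:
    """Class C share price at which the stock and cash options are worth the same."""
    return cash / exchange_ratio


def implied_prior_price(cash: float, premium: float) -> float:
    """Prior DVMT price implied by the premium (assumes premium = cash / price - 1)."""
    return cash / (1.0 + premium)


def dividend_share(received: float, total: float) -> float:
    """Fraction of the total dividend that goes to one holder."""
    return received / total


if __name__ == "__main__":
    class_c = implied_class_c_price(CASH_PER_CLASS_V_SHARE, CLASS_C_PER_CLASS_V_SHARE)
    prior = implied_prior_price(CASH_PER_CLASS_V_SHARE, STATED_PREMIUM)
    share = dividend_share(DELL_DIVIDEND_BILLIONS, VMWARE_DIVIDEND_BILLIONS)
    print(f"Implied Class C value per share: ${class_c:.2f}")
    print(f"Implied prior DVMT price:        ${prior:.2f}")
    print(f"Dell portion of VMware dividend: {share:.1%}")
```

Under those assumptions the two options are equivalent at roughly $79.77 per Class C share, the premium implies a prior DVMT price of about $84.50, and Dell's portion of the dividend works out to about 82% of the total.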

Image via Dell Technologies

Additional interest points of this transaction include:

Transaction expected to close Q4 CY2018, subject to Class V shareholder approval.
VMware maintains its independence as a separate publicly traded company.
Dell Technologies maintains its 81% ownership of VMware common stock.
Dell Technologies Class V (DVMT) shareholders will own 20.8% to 31.0% of Dell Class C (depending on cash election amounts).
Streamlines Dell capital and ownership structure.
Establishes a public security (stock) in a global end to end data infrastructure provider (e.g. Dell Technologies stock on NYSE).
Enables financial flexibility for future strategic initiatives.

Image via Dell Technologies

Michael Dell and Silver Lake Continued Ownership

As part of this transaction, both Michael Dell and Silver Lake Partners announced their continued commitment to Dell Technologies. Michael Dell will continue to serve as Chairman and CEO and remain a committed stockholder, beneficially owning between about 47% and 54% of Dell Technologies on a fully diluted basis. Silver Lake, an equity investor in Dell, will continue its long-term partnership with Michael Dell, beneficially owning between about 16% and 18% of Dell Technologies on a fully diluted basis.

Where to learn more

Learn more about Dell Technologies, VMware, Data Infrastructures and related topics via the following links:

Dell Investors Events Page
Announcement of VMware Dividend, Dell Class V, and Class C stock news
Dell announcement conference call presentation (PDF)
VMware VMworld
Via SearchStorage: Dell EMC storage IPO, VMware merger plans still unclear
Via SearchStorage: Dell EMC storage strategy talk buzzes Dell Tech World
Via SearchStorage: How a Dell and VMware merger could affect EMC storage
May 2018 Server StorageIO Data Infrastructure Update Newsletter
Dell Technology World 2018 Announcement Summary
VMware vSphere vSAN vCenter version 6.7 SDDC Update Summary
Application Data Value Characteristics Everything Is Not The Same (Part I)
Data Infrastructure Primer Overview (Its Whats Inside The Data Center)
If NVMe is the answer, what are the questions?
NVMe Primer (or refresh), The NVMe Place, and The SSD Place
Server Storage I/O Benchmark Performance Resource Tools
Data Infrastructure server storage I/O network Recommended Reading

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means

This announcement enables Dell to streamline its financial structure, while providing VMware shareholders with dividend value. In addition, this Dell Technologies announcement puts to rest industry discussions of what Michael Dell, along with Dell Technologies and VMware, will do in the future. Speaking of the future, this transaction could also pave the way for future investment or acquisitions by Dell and/or VMware. Now the question is, if you are a DVMT tracking stock shareholder, do you take the $109 USD cash, or the new Class C Dell Technologies stock? Now let's see how the Dell Technologies Class V VMware tracking stock exchange for stock or cash plays out during the rest of summer and into the fall.

Ok, nuff said, for now.

Cheers Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2018. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2026 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

June 30, 2018 (updated April 27, 2025)

June 2018 Server StorageIO Data Infrastructure Update Newsletter

Volume 18, Issue 6 (June 2018)

Hello and welcome to the June 2018 Server StorageIO Data Infrastructure Update Newsletter.

In case you missed it, the May 2018 Server StorageIO Data Infrastructure Update Newsletter can be viewed here (HTML and PDF).

In this issue, buzzword topics include AI, All Flash, HPC, Lustre, Multi Cloud, NVMe, NVMeoF, SAS, and SSD among others:

Data Infrastructure Industry Activity
News Commentary and Tips
Server StorageIOblog posts
Recommended Reading
Various Events and Webinars
Industry Resources and Links

Enjoy this edition of the Server StorageIO Data Infrastructure update newsletter.

Cheers GS

Data Infrastructure and IT Industry Activity Trends

June data infrastructure, server, storage, I/O network, hardware, software, cloud, converged, and container as well as data protection industry activity includes among others:

Check out what’s new at Amazon Web Services (AWS) here, as well as Microsoft Azure here, Google Cloud Compute here, IBM Softlayer here, and OVH here.

CTERA announced new cloud storage gateways (HC Series) for enterprise environments that include all flash SSD options, capacity up to 96TB (raw), Petabyte scale tiering to public and private cloud, 10 GbE Ethernet connectivity, virtual machine deployment, along with high availability configuration.

Cray announced enhancements to its Lustre (parallel file system) based ClusterStor storage system for high performance compute (HPC), which it previously acquired from Seagate (who had acquired it as part of the Xyratex acquisition). New enhancements for ClusterStor include an all flash SSD solution that will integrate and work with its existing hard disk drive (HDD) based systems.

In related Lustre based activity, DataDirect Networks (DDN) has acquired Intel's Lustre file system capability. Intel acquired its Lustre capabilities via its purchase of Whamcloud back in 2012, and in 2017 announced that it was getting out of the Lustre business (here and here). DDN also announced new storage solutions for enabling HPC environment workloads along with Artificial Intelligence (AI) centric applications.

HPE, which held its Discover event, announced a $4 Billion USD investment over four years in the development of edge technologies and services including software defined WAN (SD-WAN) and security among others.

Microsoft held its first virtual Windows Server Summit in June that outlined current and future plans for the operating system along with its hybrid cloud future.

Penguin Computing has announced the Accelion solution for accessing geographically dispersed data, enabling faster file transfer or other data movement functions.

SwiftStack has added multi cloud features (enhanced search, universal access, policy management, automation, data migration) and is making them available via the 1space open source project. 1space enables a single object namespace across different object storage locations including integration with OpenStack Swift.

Vexata announced a new version of its Vexata operating system (VX-OS) for its storage solution including NVMe over Fabric (NVMe-oF) support.

Speaking of NVMe and fabrics, the Fibre Channel Industry Association (FCIA) announced that the International Committee on Information Technology Standards (INCITS) has published the NVMe over Fibre Channel (FC-NVMe) standard developed by the T11 technical committee.

Various NVMe front-end including NVMeoF along with NVMe back-end devices (U.2, M.2, AiC)

Keep in mind that there are many different facets to NVMe including direct attached (M.2, U.2/8639, PCIe AiC) along with fabrics. Likewise, there are various fabric options for the NVMe protocol including over Fibre Channel (FC-NVMe), along with other NVMe over Fabrics including RDMA over Converged Ethernet (RoCE) as well as IP based among others. NVMe can be used as a front-end on storage systems supporting server attachment (e.g. competes with Fibre Channel, iSCSI, SAS among others).

Another variation of NVMe is as a back-end for attachment of drives or other NVMe based devices in storage systems, as well as servers. There are also end to end NVMe (e.g. both front-end and back-end) options. Keep context in mind when you hear or talk about NVMe and, in particular, NVMe over fabrics. Learn more about NVMe at https://storageioblog.com/nvme-place-volatile-memory-express/.

Toshiba announced the new RM5 series of high capacity SAS SSDs for replacement of SATA devices in servers. The RM5 series being added to the Toshiba portfolio combines the capacity and economics traditionally associated with SATA SSDs with the performance and connectivity of SAS.

Check out other industry news, comments, trends perspectives here.

Server StorageIO Commentary in the news, tips and articles

Recent Server StorageIO industry trends perspectives commentary in the news.

Via SearchStorage: Comments The storage administrator skills you need to keep up today
Via SearchStorage: Comments Managing storage for IoT data at the enterprise edge
Via SearchCloudComputing: Comments Hybrid cloud deployment demands a change in security mindset

View more Server, Storage and I/O trends and perspectives comments here.

Server StorageIOblog Data Infrastructure Posts

Recent and popular Server StorageIOblog posts include:

Announcing Windows Server Summit Virtual Online Event
May 2018 Server StorageIO Data Infrastructure Update Newsletter
Solving Application Server Storage I/O Performance Bottlenecks Webinar
Have you heard about the new CLOUD Act data regulation?
Data Protection Recovery Life Post World Backup Day Pre GDPR
Microsoft Windows Server 2019 Insiders Preview
Which Enterprise HDD for Content Server Platform
Server Storage I/O Benchmark Performance Resource Tools
Introducing Windows Subsystem for Linux WSL Overview
Data Infrastructure Primer Overview (Its Whats Inside The Data Center)
If NVMe is the answer, what are the questions?

View other recent as well as past StorageIOblog posts here

Server StorageIO Recommended Reading (Watching and Listening) List

In addition to my own books including Software Defined Data Infrastructure Essentials (CRC Press 2017) available at Amazon.com (check out special sale price), the following are Server StorageIO data infrastructure recommended reading, watching and listening list items. The Server StorageIO data infrastructure recommended reading list includes various IT, Data Infrastructure and related topics. The Intel Recommended Reading List (IRRL) for developers is also a good resource to check out. Speaking of my books, Didier Van Hoye (@WorkingHardInIt) has a good review over on his site that you can view here; also check out the rest of his great content while there.

Watch for more items to be added to the recommended reading list book shelf soon.

Events and Activities

Recent and upcoming event activities.

July 25, 2018 – Webinar – Data Protect & Storage

June 27, 2018 – Webinar – App Server Performance

June 26, 2018 – Webinar – Cloud App Optimize

May 29, 2018 – Webinar – Microsoft Windows as a Service

April 24, 2018 – Webinar – AWS and on-site, on-premises hybrid data protection

See more webinars and activities on the Server StorageIO Events page here.

Data Infrastructure Server StorageIO Industry Resources and Links

Various useful links and resources:

Data Infrastructure Recommended Reading and watching list
Microsoft TechNet – Various Microsoft related from Azure to Docker to Windows
storageio.com/links – Various industry links (over 1,000 with more to be added soon)
objectstoragecenter.com – Cloud and object storage topics, tips and news items
OpenStack.org – Various OpenStack related items
storageio.com/downloads – Various presentations and other download material
storageio.com/protect – Various data protection items and topics
thenvmeplace.com – Focus on NVMe trends and technologies
thessdplace.com – NVM and Solid State Disk topics, tips and techniques
storageio.com/converge – Various CI, HCI and related SDS topics
storageio.com/performance – Various server, storage and I/O benchmark and tools
VMware Technical Network – Various VMware related items

What this all means and wrap-up

Data Infrastructures are what exist inside physical data centers as well as spanning cloud, converged, hyper-converged, virtual, serverless and other software defined as well as legacy environments. NVMe continues to gain in industry adoption as well as customer deployment. Cloud adoption also continues along with multi-cloud deployments. Enjoy this edition of the Server StorageIO Data Infrastructure update newsletter and watch for more NVMe, cloud and data protection among other topics in future posts, articles, events, and newsletters.

Ok, nuff said, for now.

June 9, 2018 (updated April 27, 2025)

Solving Application Server Storage I/O Performance Bottlenecks Webinar

The best I/O is the one you do not have to do, the second best is the one with least server I/O and storage overhead along with application performance bottleneck impact.

Fast applications need fast servers, storage, I/O networking hardware, and software. Merely throwing more hardware as a cache at application performance bottlenecks can help. However, throwing more hardware at performance problems can also cost much cash. On the other hand, a little fast memory and storage in the right place, with robust software performance acceleration can have significant application productivity benefits. Fast hardware also needs fast software to help boost application and user productivity.

As application workload activity increases, implementing server software performance acceleration along with additional fast memory and storage, including flash, Storage Class Memories (SCM) and other SSDs along with NVMe accessed devices, enables even more work to be done, boosting productivity while reducing cost.
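
To illustrate why a little fast memory or storage in the right place can matter, here is a minimal Python sketch of average I/O latency when some fraction of requests is served from a fast tier (cache, flash, SCM) instead of slower back-end storage. The latency values are hypothetical placeholders chosen only for illustration, not measurements of any product.

```python
def effective_latency_us(hit_ratio: float, fast_us: float, slow_us: float) -> float:
    """Average latency per I/O when hit_ratio of requests are served by the fast tier."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * fast_us + (1.0 - hit_ratio) * slow_us


# Hypothetical latencies in microseconds (assumptions, not measured values).
FAST_TIER_US = 100.0    # e.g. a cache, flash or SCM tier
SLOW_TIER_US = 5000.0   # e.g. a busy HDD based back-end

if __name__ == "__main__":
    for hit_ratio in (0.0, 0.5, 0.9, 0.99):
        avg = effective_latency_us(hit_ratio, FAST_TIER_US, SLOW_TIER_US)
        print(f"hit ratio {hit_ratio:.0%}: average {avg:.0f} us per I/O")
```

With these assumed numbers, the average drops quickly as more I/Os are satisfied by the fast tier, which is the point behind avoiding or accelerating I/Os rather than simply adding more hardware everywhere.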

Join me on June 27 at 1 PM Pacific Time (PT) when I host a free webinar (registration required) sponsored by DataCore and produced by Redmond Magazine/1105 Media as we discuss Solving Application Server Storage I/O Performance Bottlenecks including what you can do today.

I will be joined by guest presenters Augie Gonzalez, Director Technical Product Marketing and Tim Warden, Director Engineering Product Management both from DataCore. During the interactive webinar discussions, we invite you to participate with your questions, as we look at issues, challenges, various approaches, and what you can do today to boost different application performance and productivity.

This webinar is for those whose applications have the need for speed including database, VDI, SharePoint, Exchange, AI, ML and other I/O intensive workloads. Topics that we will be discussing in addition to your questions include:

Boosting application performance without breaking the bank
Improving application productivity and reducing user wait time
Gaining insight and awareness into bottlenecks and what to do
Unlocking value in your existing hardware and software licenses
What you can do today, literally right after or even during this webinar

Where to learn more

Learn more about Windows Server Summit and related topics via the following links:

Microsoft Windows Summit save the date, Windows Server Summit updates and event here
Application Data Value Characteristics Everything Is Not The Same (Part I)
Data Infrastructure Primer Overview (Its Whats Inside The Data Center)
If NVMe is the answer, what are the questions?
NVMe Primer (or refresh), The NVMe Place, and The SSD Place
Server Storage I/O Benchmark Performance Resource Tools
Data Infrastructure server storage I/O network Recommended Reading

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means

The best I/O is the one you do not have to do; the second best is the one that has the least impact on your applications while boosting user productivity. There are many different approaches to addressing various server storage I/O performance bottlenecks across applications. Join me on June 27, 2018 at 1 PM PT for the free Solving Application Server Storage I/O Performance Bottlenecks webinar and learn what you can do today to boost your users' productivity.

Ok, nuff said, for now.

Cheers Gs

June 7, 2018 (updated April 27, 2025)

Announcing Windows Server Summit Virtual Online Event

Microsoft will be hosting a free (no registration required) half day virtual (e.g. online) Windows Server Summit Virtual Online Event June 26, 2018 starting at 9AM PT. As part of its continued focus on supporting hybrid strategy spanning on-premises Windows Server to Azure (among others including AWS) cloud based, Microsoft is preparing for the launch later this year of Windows Server 2019.

There is no registration required, you can just show up without concern of getting email or other spam, however you can also click here to save the date, as well as here to get updates on the event.

Windows Server 2019 is now in insider preview (get it here) and is the next Long-Term Servicing Channel (LTSC) release following Windows Server 2016. In the past, Microsoft would have called Windows Server 2019 something such as Windows Server 2016 R2, however that has changed with the new Semi-Annual Channel (SAC) and LTSC release cycles.

Keynote kick off presentations will be from Erin Chapple, Director of Program Management, Cloud + AI (which includes Windows Kernel, Hypervisors, Containers and Storage), Arpan Shah, General Manager of Azure Infrastructure marketing (Windows Server, Azure IaaS, Azure Stack, Azure Management and Security), and Jeff Woolsey, Principal PM, Windows Server. In addition to the kick off presentations covering the current state and status of Windows Server for on-premises bare metal, virtual, container as well as cloud deployments, there will be demos, Q&A, roadmaps and much more. Topics will include new and recent functionality such as Windows Server 2019, Windows Admin Center (formerly known as Honolulu) and IoT.

Images Via Microsoft Windows Server Summit Page

Windows Server Summit Break Out Tracks

During the Windows Server Summit, there will be four technology focused tracks including:

Hybrid – From on-premises to Azure, how Windows Server supports different workloads in various configurations, along with associated management tools (including Windows Admin Center aka Honolulu)
Security – New and recent security enhancements for Windows Server along with Hyper-V and other related topics.
Application Platform – Containers and Linux support along with associated management tools for on-premises and Azure.
Hyper-converged infrastructure (HCI) – Leveraging software defined storage (SDS) with Storage Spaces Direct (S2D) in Windows Server 2016, along with Hyper-V and other technologies, learn how Microsoft supports HCI and beyond.

Where to learn more

Learn more about Windows Server Summit and related topics via the following links:

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means

Windows Server remains relevant today for traditional on-site, on-premises deployments as well as cloud and container deployments, among others. Remember to click here to save the date, click here to sign up for Windows Server Summit updates and learn more about the Windows Server Summit Virtual Online event here. See you there, or at least virtually.

Ok, nuff said, for now.

Cheers Gs

May 24, 2018 (updated April 27, 2025)

May 2018 Server StorageIO Data Infrastructure Update Newsletter

Volume 18, Issue 5 (May 2018)

Hello and welcome to the May 2018 Server StorageIO Data Infrastructure Update Newsletter.

In case you missed it, the April 2018 Server StorageIO Data Infrastructure Update Newsletter can be viewed here (HTML and PDF).

May has been a busy month with a lot of data infrastructure related activity from software-defined virtual, cloud, container, converged, serverless to legacy, hardware, software, services, server, storage, I/O and networking along with data protection topics among others.

In this issue, buzzword topics include GDPR, NVMe, NVMeoF, Composable, Serverless, Data Protection, SCM, Gen-Z, MaaS:

Data Infrastructure Industry Activity
News Commentary and Tips
Server StorageIOblog posts
Recommended Reading
Various Events and Webinars
Industry Resources and Links

Enjoy this edition of the Server StorageIO Data Infrastructure update newsletter.

Cheers GS

Data Infrastructure and IT Industry Activity Trends

May has been a busy month, some data infrastructure, server, storage, I/O network, hardware, software, cloud, converged, and container as well as data protection activity includes among others:

Depending on when you read this, the new General Data Protection Regulation (GDPR) is either days away, or already in effect. For those who are not aware of GDPR other than seeing many inbox items in your email pertaining to it, here are some resources as a refresher or primer:

Webinar with Danny Allan of Veeam (free with registration): GDPR experiences walking the talk
Blog post – Data Protection life after world backup day and pre GDPR
Various GDPR related links, posts and resources

May Buzzword, Buzz Topic and Trends

Besides data protection and GDPR, other recent data infrastructure related news, trends, technologies and topics to keep an eye on (besides AI, ML, DL, AR/VR, IoT, Blockchain, Serverless) include Metal as a Service (MaaS), which might be familiar to some and, for others, something new. Canonical has been busy for some time now with MaaS including in Ubuntu, and they are not alone, with variations appearing from various managed service providers, hosting and cloud providers as well. NVMe has become a more common topic, technology and trend, including for use in servers as well as over fabrics (e.g. NVMe over Fabrics) as a language for server, storage, I/O communication.

A new emerging companion to NVMe is Gen-Z, which initially is a companion to PCIe. Longer term, Gen-Z could possibly be a replacement, as well as be used for accessing dynamic random access memory (DRAM) among other uses. Storage Class Memory (SCM) has been an industry conversation topic for several years now, with new persistent memories (PMEM) that combine the best of traditional DRAM (speed and write endurance) with the persistence, higher capacity and lower cost of traditional NAND flash SSDs.

Another trend topic is that for some, ASIC, FPGA and GPU are new companions to standard commodity compute processors along with servers, yet for others it may be déjà vu as they have been used for years (ok, decades) in some solutions. For now, a few other buzzwords or buzz terms to add to or refresh in your data infrastructure vocabulary include distributed ledgers (aka blockchains), composable resources and ephemeral instance storage (storage on a cloud instance).

May NVMe Momentum Movement Activity

May saw a lot of NVMe related activity, from chips and components (adapters, devices) to systems spanning direct attached to NVMe over Fabric (NVMeoF). Here is a primer (or refresh) for NVMe along with various deployment options. NVMeoF includes RDMA over Converged Ethernet (RoCE) based, along with NVMe over Fibre Channel (FC-NVMe), as well as emerging NVMe over IP.

NVMe being used for front-end accessed via shared PCIe along with back-end devices

There are many different facets of NVMe including for use as a front-end on storage systems supporting server attachment (e.g. competes with Fibre Channel, iSCSI, SAS among others). Another variation of NVMe is as a back-end for attachment of drives or other NVMe based devices in storage systems, as well as servers.

Front-end using traditional block SAN access with back-end NVMe, SAS and SATA devices

Read more about the many different options and variations of NVMe including key questions to ask or understand, deployment topology along with other related topics at thenvmeplace.com.

Various NVMe front-end including NVMeoF along with NVMe back-end devices (U.2, M.2, AiC)

Software Defined Data Infrastructure Activity

Amazon Web Services (AWS) continues to add new features and functionality, as well as extending those along with existing capabilities into various regions. Some recent updates include new Elastic Compute Cloud (EC2) Microsoft Windows Server versions 1709 and 1803 Amazon Machine Images (AMIs). Other AWS updates include spot instances support for Red Hat BYOL (Bring Your Own License), VPN enhancements, X1e instances available in Frankfurt, H1 instance price reduction, as well as LightSail now in Canada, Paris, and Seoul regions.

For those who are not familiar with LightSail, they are virtual private servers (VPS) which are different from traditional EC2 instances. LightSail can be a cost-effective way for those who need to move out of general population shared hosting, yet cannot justify a full EC2 instance while requiring more than a container.

The LightSail instance also is available with various software pre-installed such as for WordPress websites among others. For example, I have used LightSail as a backup and standby WordPress site for StorageIOblog using Updraft Plus Pro for data protection.

In other news, AWS C5d EC2 instances are available in various regions. C5d instances are available with 2, 4, 8, 16, 36 and 72 vCPUs along with up to 1800GB of NVMe based ephemeral storage for on-demand, reserved or spot instances.

Note that instance-based storage is temporary, meaning that it persists only for the life of the instance. What this means is that if you stop and restart the instance, the data is not persistent. Instance-based storage is useful for data that can be protected or persisted to other storage including EBS (Elastic Block Store). Usage includes batch, log and analytics processing, burst buffers, cache or workspace.

AWS also announced a new Simple Storage Service (S3) storage class a month or so ago called One Zone Availability Infrequent Access. This new storage class primarily provides a lower cost of storage with lower durability (e.g., data spread across one zone vs. multiple). Over the past couple of months, I have been migrating from S3 Infrequent Access (IA) as well as standard into One Zone Availability. Some of my active data remains in S3 Standard storage class, while cold archives are in Glacier.

A tip about migrating to One Zone Availability, as well as between other S3 storage classes, is to pay attention to your API calls and monthly budget. You might see an increase in S3 costs during the migration due to API calls (gets, puts, lists, dir), which then settles into the lower prices once data has been moved. In other words, pay attention to how many API calls you are allowed per storage class per month, along with other fees, rather than focusing only on cost per TByte. Read about other recent AWS news updates here.
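
The tip above can be sketched as a simple break-even estimate. The Python below is only an illustration: every price in it is a made-up placeholder rather than actual AWS pricing, the class and function names are hypothetical, and it assumes one billable request per object moved. Check the current price list for your region before relying on any numbers.

```python
from dataclasses import dataclass


@dataclass
class StorageClass:
    name: str
    usd_per_gb_month: float       # hypothetical storage price
    usd_per_1000_requests: float  # hypothetical request (API call) price


def monthly_storage_cost(storage_class: StorageClass, gb: float) -> float:
    """Recurring monthly storage charge for the given capacity."""
    return gb * storage_class.usd_per_gb_month


def migration_request_cost(target: StorageClass, object_count: int) -> float:
    """One-time request charges, assuming one request per object moved."""
    return object_count / 1000 * target.usd_per_1000_requests


def breakeven_months(source: StorageClass, target: StorageClass,
                     gb: float, object_count: int) -> float:
    """Months of storage savings needed to pay back the one-time request charges."""
    saving = monthly_storage_cost(source, gb) - monthly_storage_cost(target, gb)
    if saving <= 0:
        return float("inf")  # the target class never pays back the migration
    return migration_request_cost(target, object_count) / saving


if __name__ == "__main__":
    # Placeholder prices for illustration only (NOT real AWS prices).
    current = StorageClass("current-class", 0.0125, 0.01)
    cheaper = StorageClass("lower-cost-class", 0.0100, 0.01)
    months = breakeven_months(current, cheaper, gb=1000, object_count=5_000_000)
    print(f"Break-even after about {months:.0f} months")
```

With these placeholder values, moving 1,000 GB made up of five million small objects takes roughly 20 months of storage savings to cover the request charges, which is why object count and API calls matter as much as cost per TByte.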

Software-defined storage startup Cloudian announced their technology available for test drive on Google Cloud Platform as part of a continued industry trend. That trend is for storage vendors to make their storage software technology available on different cloud platforms such as AWS, Azure, Google, Softlayer among others.

Dell Technologies made several announcements as part of Dell Technologies World that are covered in a series of posts here. Announcements included PowerMax the successor to VMAX, XtremIO X2 updates, new servers, workstations among many other items, read more here.

Besides the data infrastructure, cloud service providers and systems vendors, component suppliers including Cavium announced NVMe over Fibre Channel updates (here and here), along with Marvell NVMe updates here. HPE announced new thin clients and software (t430 Thin Client, HP mt44 Mobile Thin Client, HP ThinPro software), as well as updates to 3PAR and other storage solutions.

IBM announced various storage enhancements (and here) as well as a Happy 30th anniversary to the IBM Power9 based i systems. In other news, Kaseya bought backup data protection vendor Unitrends.

Micron announced that the first quad-level cell (QLC) NAND flash solid state device (SSD), named the 5210, has begun shipping to select customers (and vendors). QLC stores 4 bits per cell. The 5210 is optimized for read-intensive workloads with up to 33% higher density compared to previous generation TLC (triple-level cell) NAND flash. Broader market availability is expected later in fall 2018. The 5210 form factor is 2.5” (the same as a standard SSD or HDD), with capacities from 1.92TB to 7.68TB.

In other news, Micron also announced a $10 Billion (USD) stock repurchase plan, along with an extension of its 3D NAND flash memory partnership with Intel, as well as 96 layer 3D NAND. Meanwhile, various vendors are increasingly talking about how their systems are or will be storage class memory (SCM) ready, including for use with Micron 3D XPoint, also known as Intel Optane, among others.
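
The QLC vs. TLC density figure above follows from bits per cell, as this small Python sketch shows. It is a simplified illustration with made-up function names: it considers only bits per cell and ignores die size, layer count and other factors that affect real device capacity.

```python
# Bits stored per NAND flash cell for common cell types.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def voltage_states(bits_per_cell: int) -> int:
    """Distinct charge levels a cell must distinguish to store that many bits."""
    return 2 ** bits_per_cell


def density_gain(new_bits: int, old_bits: int) -> float:
    """Relative increase in bits stored per cell (0.33 means about 33% more)."""
    return new_bits / old_bits - 1.0


if __name__ == "__main__":
    for name, bits in BITS_PER_CELL.items():
        print(f"{name}: {bits} bit(s) per cell, {voltage_states(bits)} charge levels")
    gain = density_gain(BITS_PER_CELL["QLC"], BITS_PER_CELL["TLC"])
    print(f"QLC vs. TLC bits per cell: {gain:.0%} more")
```

Going from 3 to 4 bits per cell is a one third increase, which lines up with the "up to 33%" figure, while the number of charge levels each cell must distinguish doubles from 8 to 16, one reason QLC is positioned for read-intensive workloads.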

Microsoft has placed into public preview Azure Active Directory (AAD) Storage authentication for Azure Blobs and Queues. Azure Storage Explorer is now released as version 1.0. AAD storage authentication enables organizations to implement role-based access control of Azure storage resources. Speaking of Azure, Microsoft has published several architectures, reference and other content at the Azure Virtual Datacenter portal here.

If you have not done so, check out Azure File Sync, which is currently in public preview. Having been involved with and using it for over a year, including during private preview, I find Azure File Sync to be an exciting, useful technology for creating hybrid distributed file sharing with cloud tiering solutions. Learn more about Azure File Sync here and here. In other news, Microsoft has announced a preview, as part of the April 2018 Windows 10 build, of Hyper-V support for the Google Android emulator.

NetApp has had Azure based NAS storage in preview for a while now, and also announced Cloud Volumes on Google Cloud Platform (GCP). In addition to Cloud Volumes on AWS, Azure, and GCP, NetApp also announced enhanced NVMe based storage systems among other updates.

Two companies that have similar names are Opendrives (video workflow acceleration) and Opendrive (cloud storage, backup, and data protection). Meanwhile, data infrastructure startup Pavilion has received new funding as well as begun talking about their NVMe, including NVMe over Fabric (NVMeoF), hardware storage system. Long-time data infrastructure converged server storage startup Pivot3 announced additional cloud workload mobility.

Pure Storage made a couple of announcements including the FlashArray//X NVMe based shared accelerated storage system as well as the NVIDIA (GPU powered) based AIRI Mini for AI/DL/ML.

Have you heard about Snowflake Computing, aka the cloud data warehouse solution? If not, check them out here. Another cloud-related data infrastructure vendor to look into is Upbound.io, who have received additional funding for their multi-cloud management solutions.

Building off of recent VMware vSphere updates (here), and Dell Technology World here, the following is an excellent post about Instant Clone in vSphere 6.7, and VMware vSAN HCI assessment tool here.

Check out other industry news, comments, trends perspectives here.

Server StorageIO Commentary in the news, tips and articles

Recent Server StorageIO industry trends perspectives commentary in the news.

Via SearchStorage: Comments Managing storage for IoT data at the enterprise edge
Via SearchCloudComputing: Comments Hybrid cloud deployment demands a change in security mindset
Via SearchStorage: Comments Dell EMC storage IPO, VMware merger plans still unclear
Via SearchStorage: Comments Dell EMC midrange storage keeps its overlapping arrays
Via SearchStorage: Comments Dell EMC all-flash PowerMax replaces VMAX, injects NVMe
Via IronMountain InfoGoto: The growing Trend of Secondary Data Storage

View more Server, Storage and I/O trends and perspectives comments here.

Server StorageIOblog Data Infrastructure Posts

Recent and popular Server StorageIOblog posts include:

Dell Technology World 2018 Announcement Summary
Part II Dell Technology World 2018 Modern Data Center Announcement Details
Part III Dell Technology World 2018 Storage Announcement Details
Part IV Dell Technology World 2018 PowerEdge MX Gen-Z Composable Infrastructure
Part V Dell Technology World 2018 Server Converged Announcement Details
April 2018 Server StorageIO Data Infrastructure Update Newsletter
VMware vSphere vSAN vCenter version 6.7 SDDC Update Summary
PCIe Fundamentals Server Storage I/O Network Essentials
Have you heard about the new CLOUD Act data regulation?
Data Protection Recovery Life Post World Backup Day Pre GDPR
Microsoft Windows Server 2019 Insiders Preview
Application Data Value Characteristics Everything Is Not The Same
Data Infrastructure Resource Links cloud data protection tradecraft trends
IT transformation Serverless Life Beyond DevOps Podcast
Data Protection Diaries Fundamental Topics Tools Techniques Technologies Tips
Introducing Windows Subsystem for Linux WSL Overview
Data Infrastructure Primer Overview (Its Whats Inside The Data Center)
If NVMe is the answer, what are the questions?

View other recent as well as past StorageIOblog posts here

Server StorageIO Recommended Reading (Watching and Listening) List

Containers, serverless and Kubernetes continue to gain in industry adoption as well as customer deployments. Here is some information about Microsoft Azure Kubernetes Service (AKS). Note that AWS has Elastic Kubernetes Service (EKS), along with offerings from Google as well as VMware and Pivotal with Pivotal Container Service (PKS) among others.

Here is an interesting perspective by Ben Kepes about serverless (e.g. life beyond Kubernetes and containers (e.g. life beyond virtualization which to some is or was life (e.g. life beyond bare metal))) as well as the all too often punditry and evangelism of something new causing something else to be dead.

SNIA has updated their Emerald aka Green energy effectiveness (focus on productivity) measurement specification (V3.01) including NAS NFS file activity (besides block). Learn more at snia.org/forums/green.

Watch for more items to be added to the recommended reading list book shelf soon.

Events and Activities

Recent and upcoming event activities.

June 27, 2018 – Webinar – TBA

May 29, 2018 – Webinar – Microsoft Windows as a Service

April 24, 2018 – Webinar – AWS and on-site, on-premises hybrid data protection

See more webinars and activities on the Server StorageIO Events page here.

Data Infrastructure Server StorageIO Industry Resources and Links

Various useful links and resources:

Connect and Converse With Us

Subscribe to Newsletter – Newsletter Archives – StorageIO.com – StorageIOblog.com

What this all means and wrap-up

Data Infrastructures are what exist inside physical data centers spanning cloud, converged, hyper-converged, virtual, serverless and other software defined as well as legacy environments. So far this spring there has been a lot of data infrastructure related activity, from new technology announcements to events and trends among others. Enjoy this edition of the Server StorageIO Data Infrastructure update newsletter and watch for more NVMe, Gen-Z, cloud, data protection among other topics in future posts, articles, events, and newsletters.

Ok, nuff said, for now.

April 6, 2018April 27, 2025

Have you heard about the new CLOUD Act data regulation?

The new CLOUD Act data regulation became law as part of the recent $1.3 Trillion (USD) omnibus U.S. government budget spending bill passed by Congress on March 23, 2018 and signed by President of the U.S. (POTUS) Donald Trump in March.

CLOUD Act is the acronym for Clarifying Lawful Overseas Use of Data, not to be confused with initiatives such as the U.S. federal government's Cloud First among others, which are focused on using cloud, securing and complying (e.g. FedRAMP among others). In other words, the new CLOUD Act data regulation pertains to how data stored by cloud or other service providers can be accessed by law enforcement officials (LEO).

Supreme Court of the U.S. (SCOTUS) Image via https://www.supremecourt.gov/

CLOUD Act background and Stored Communications Act

After the signing into law of the CLOUD Act, the US Department of Justice (DOJ) asked the Supreme Court of the U.S. (SCOTUS) to dismiss the pending case against Microsoft (e.g., Azure Cloud). The case or question in front of SCOTUS pertained to whether LEO can search as well as seize information or data that is stored overseas or in foreign countries.

As a refresher, or if you had not heard, SCOTUS was asked to resolve whether a service provider responding to a warrant based on probable cause under the 1986 era Stored Communications Act is required to provide data in its custody, control or possession, regardless of whether it is stored inside or outside the US.

Microsoft Azure Regions via Microsoft.com

This particular case in front of SCOTUS centered on whether Microsoft (a U.S. Technology firm) had to comply with a court order to produce emails (as part of an LEO drug investigation) even if those were stored outside of the US. In this particular situation, the emails were alleged to have been stored in a Microsoft Azure Cloud Dublin Ireland data center.

For its part, Microsoft senior attorney Hasan Ali said via FCW, “This bill is a significant step forward in the larger global debate on what our privacy laws should look like, even if it does not go to the highest threshold.” Here are some additional perspectives via Microsoft's Brad Smith on his blog along with a video.

What is CLOUD Act

Clarifying Lawful Overseas Use of Data is the new CLOUD Act data regulation approved by Congress (House and Senate); details can be read here and here respectively, with additional perspectives here.

The new CLOUD Act law allows for POTUS to enter into executive agreements with foreign governments about data on criminal suspects. Granted what is or is not a crime in a given country will likely open Pandora’s box of issues. For example, in the case of Microsoft, if an agreement between the U.S. and Ireland were in place, and, Ireland agreed to release the data, it could then be accessed.

Now, for some who might be hyperventilating after reading the last sentence, keep in mind that if you are overseas, it is up to your government to protect your privacy. The foreign government must have an agreement in place with the U.S., and a crime must have been committed, one that both parties agree is a crime.

Also, keep in mind that there are appeal processes for providers, including when the customer is not a U.S. person and does not reside in the U.S. and the disclosure would put the provider at risk of violating foreign law. In addition, various provisions must be met before a cloud or service provider has to hand over your data, regardless of what country you reside in or where the data resides.

Where to learn more

Learn more about CLOUD Act, cloud, data protection, world backup day, recovery, restoration, GDPR along with related data infrastructure topics for cloud, legacy and other software defined environments via the following links:

AWS Cloud Application Data Protection Webinar
U.S. House and Senate versions of CLOUD Act data regulations
CLOUD (Clarifying Lawful Overseas Use of Data) Act data regulation became law
$1.3 Trillion (USD) omnibus U.S. government budget spending bill passed by Congress
US DOJ has asked SCOTUS to dismiss pending case against Microsoft
1986 era regulations and Stored Communications Act
Microsoft Azure Cloud regions
Data Protection Recovery Life Post World Backup Day Pre GDPR
Additional perspectives via Microsoft Brad Smith on his blog along with a video.
March 2018 Server StorageIO Data Infrastructure Update Newsletter
Application Data Value Characteristics Everything Is Not the Same (five-part mini-series)
Application Data Availability 4 3 2 1 Data Protection (part of the mini-series)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)
Veeam GDPR preparedness experiences Webinar walking the talk
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Data Infrastructure server storage I/O network Recommended Reading
Object Storage Center resources (www.objectstoragecenter.com)
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Is the new CLOUD Act data regulation unique to Microsoft Azure Cloud?

No, it also applies to Amazon Web Services (AWS), Google, IBM Softlayer Cloud, Facebook, LinkedIn, Twitter and the long list of other service providers.

What about GDPR?

Keep in mind that the new General Data Protection Regulation (GDPR) goes into effect May 25, 2018 and, while based out of the European Union (EU), has global applicability across organizations of all sizes, scopes, and types. Learn more about GDPR, Data Protection and its global impact here.

Thus, if you have not heard about the new CLOUD Act data regulation, now is the time to become aware of it.

Ok, nuff said, for now.

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

April 2, 2018November 26, 2023

Data Protection Recovery Life Post World Backup Day Pre GDPR

Data Protection Recovery Life Post World Backup Day Pre GDPR trends

It’s time for Data Protection Recovery Life Post World Backup Day Pre GDPR Start Date.

The annual March 31 world backup day focus has come and gone once again.

However, that does not mean data protection including backup as well as recovery along with security gets a 364-day vacation until March 31, 2019 (or the days leading up to it).

Granted, some environments, public relations people, editors, influencers and other industry folks will take some time off from backup day, while others jump on the ramp up to GDPR, which goes into effect May 25, 2018.

Expanding Focus Data Protection and GDPR

As I mentioned in this post here, world backup day should be expanded to include increased focus not just on backup, but also on recovery as well as other forms of data protection. Likewise, May 25, 2018 is not the deadline, finish line or destination for GDPR (e.g. General Data Protection Regulation); rather, it is the starting point for an evolving journey, one that has global impact as well as applicability. Recently I participated in a fireside chat discussion with Danny Allan of Veeam, who shared his GDPR expertise as well as the experiences, lessons learned and tips of Veeam as they started their journey. Check it out here.

Expanding Focus Data Protection Recovery and other Things that start with R

As part of expanding the focus on Data Protection Recovery Life Post World Backup Day Pre GDPR, that also means looking at, discussing things that start with R (like Recovery). Some examples besides recovery include restoration, reassess, review, rethink protection, recovery point, RPO, RTO, reconstruction, resiliency, ransomware, RAID, repair, remediation, restart, resume, rollback, and regulations among others.

Data Protection Tips, Reminders and Recommendations

There are no blue participation ribbons for failed recovery. However, there can be pink slips.
Only you can prevent on-premises or cloud data loss. However, it is also a shared responsibility with vendors and service providers
You can’t go forward in the future when there is a disaster or loss of data if you can’t go back in time for recovery
GDPR applies to organizations around the world of all sizes and across all sectors, including nonprofits
Keep new school 4 3 2 1 data protection in mind while evolving from old school 3 2 1 backup rules

4 3 2 1 backup data protection rule
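As a rough illustration of the 4 3 2 1 idea, here is a minimal Python sketch. It assumes one common reading of the rule (at least four copies, three versions, two different systems or media, one off-site copy); those thresholds, along with the `ProtectionCopy` and `check_4321` names, are hypothetical and only for illustration, so adjust them to your own policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionCopy:
    """One protected copy of a data set (hypothetical model for illustration)."""
    version: str   # point-in-time label, e.g. a date
    system: str    # system, device or medium holding the copy
    offsite: bool  # True if the copy lives off-site or in a cloud


def check_4321(copies, min_copies=4, min_versions=3, min_systems=2, min_offsite=1):
    """Report which parts of the assumed 4 3 2 1 rule a set of copies satisfies."""
    return {
        "copies": len(copies) >= min_copies,
        "versions": len({c.version for c in copies}) >= min_versions,
        "systems": len({c.system for c in copies}) >= min_systems,
        "offsite": sum(1 for c in copies if c.offsite) >= min_offsite,
    }


copies = [
    ProtectionCopy("2018-03-29", "primary-array", False),
    ProtectionCopy("2018-03-30", "primary-array", False),
    ProtectionCopy("2018-03-31", "backup-server", False),
    ProtectionCopy("2018-03-31", "cloud-bucket", True),
]
report = check_4321(copies)
```

Returning a per-rule report rather than a single pass or fail makes it easier to see which part of the protection scheme needs attention.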

A fundamental premise of data infrastructures is to enable applications and their data: protect, preserve, secure and serve
Remember to protect your applications, as well as data including metadata, settings configurations
Test your restores including can you use the data along with security settings
Don’t cause a disaster in the course of testing your data protection, backups or recovery
Expand (or refresh) your data protection and data infrastructure education tradecraft skills experiences
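The tip about testing restores can be sketched in a few lines of Python. This is a minimal example, not a complete validation tool: it assumes the restore was written to a separate scratch directory (so the test does not overwrite production data), and the `file_digest` and `verify_restore` names are hypothetical. It only compares file contents; security settings such as permissions and ownership would need their own checks.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list:
    """Compare a restored tree against its source and return a list of problems.

    Both directories are only read, never modified.
    """
    problems = []
    for source_file in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative}")
        elif file_digest(source_file) != file_digest(restored_file):
            problems.append(f"content differs: {relative}")
    return problems
```

An empty list means every source file was found in the restored copy with identical contents; anything else is a reason to look closer before you need that copy for a real recovery.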

Where to learn more

Learn more about data protection, world backup day, recovery, restoration, GDPR along with related data infrastructure topics for cloud, legacy and other software defined environments via the following links:

AWS Cloud Application Data Protection Webinar
March 2018 Server StorageIO Data Infrastructure Update Newsletter
Application Data Value Characteristics Everything Is Not The Same (five-part mini-series)
Application Data Availability 4 3 2 1 Data Protection (part of the mini-series)
World Backup Day: Best Practices for a Hybrid Approach
Data Protection Diaries Fundamental Topics Tools Techniques Technologies Tips
World Backup Day 2018 Data Protection Readiness Reminder
Data Protection Fundamental Topics Tools Techniques Technologies Tips
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)
Veeam GDPR preparedness experiences Webinar walking the talk
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Have you heard about the new CLOUD Act data regulation?
Data Infrastructure server storage I/O network Recommended Reading
Object Storage Center resources (www.objectstoragecenter.com)
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data protection including business continuance (BC), business resiliency (BR), disaster recovery (DR), availability, accessibility, backup, snapshots, encryption, security, privacy among others is a 7 x 24 x 365 day a year focus. The focus of data protection also needs to evolve from an after the fact cost overhead to a proactive business enabler. Meanwhile, welcome to Data Protection Recovery Post World Backup Day Pre GDPR Start Date.

Ok, nuff said, for now.

March 20, 2018April 27, 2025

March 2018 Server StorageIO Data Infrastructure Update Newsletter

Volume 18, Issue 3 (March 2018)

Hello and welcome to the March 2018 Server StorageIO Data Infrastructure Update Newsletter.

If you are wondering where the January and February 2018 update newsletters are, they are rolled into this combined edition. In addition to the short email version (free signup here), you can access full versions (html here and PDF here) along with previous editions here.

In this issue:

Data Infrastructure Industry Activity
News Commentary and Tips
Server StorageIOblog posts
Recommended Reading
Various Events and Webinars
Industry Resources and Links

Enjoy this edition of the Server StorageIO Data Infrastructure update newsletter.

Cheers GS

Data Infrastructure and IT Industry Activity Trends

Data Infrastructure Data Protection and Backup BC BR DR HA Security

World Backup day is coming up on March 31 which is a good time to remember to verify and validate that your data protection is working as intended. On one hand I think it is a good idea to call out the importance of making sure your data is protected including backed up.

On the other hand, data protection is not a once a year focus; rather, it is a year-round, 7 x 24 x 365 day focus. Also, the focus needs to be on more than just backup; rather, it should cover all aspects of data protection from archiving to business continuance (BC), business resiliency (BR), disaster recovery (DR), always on, always accessible, along with security and recovery.

Data Infrastructure Data Protection Backup 4 3 2 1 rule
Data Infrastructure 4 3 2 1 Data Protection and Backup

Some data spring thoughts, perspectives and reminders. Data lakes may swell beyond their banks causing rivers of data to flood as they flow into larger reservoirs, great data lakes, gulfs of data, seas and oceans of data. Granted, some of that data will be inactive cold parked like glaciers while others semi-active floating around like icebergs. Hopefully your data is stored on durable storage solutions or services and does not melt.

Data Infrastructure Server Storage I/O flash SSD NVMe
Various NAND Flash SSD devices and SAS, SATA, NVMe, M.2 interfaces

Non-Volatile Memory (NVM) includes various solid state device (SSD) mediums (e.g. NAND flash, 3D XPoint, MRAM among others) and packaging (drives, PCIe Add in Cards [AiC] along with entire systems, appliances or arrays). Also part of the continued evolution of NVM, SSD and other persistent memories (PM) including storage class memories (SCM) are different access protocol interfaces.

Keep in mind that there is a difference between NVM (medium) and NVMe (access). NVM is the generic category of mediums or media and devices such as NAND flash, NVRAM, 3D XPoint among other SCM (and PMs). In other words, NVM is what devices use for storing data, while NVMe is how devices and systems are accessed. NVMe and its variations are how NVM, SSD, PM, SCM media and devices get accessed locally, as well as over network fabrics (e.g. NVMe-oF and FC-NVMe).

NVMe continues to evolve including with networked fabric variations such as RDMA based NVMe over Fabric (NVMe-oF), along with Fibre Channel based (FC-NVMe). The Fibre Channel Industry Association trade group recently held its second multi-vendor plugfest in support of NVMe over Fibre Channel.

Read more about NVM, NVMe, SSD, SCM, flash and related technologies, tools, trends, tips via the following resources:

Has Object Storage failed to live up to its industry hype lacking traction? Or, is object storage (also known as blobs) progressing with customer adoption and deployment on normal realistic timelines? Recently I have seen some industry comments about object storage not catching on with customers or failing to live up to its hyped expectation. IMHO object storage is very much alive along with block, file, table (e.g. database SQL and NoSQL repositories), message/queue among others, as well as emerging blockchain aka data exchanges.

Various Industry and Customer Adoption Deployment Timeline (Via: StorageIOblog.com)

An issue with object storage is that it is still new and still evolving, and many IT environments' applications do not yet speak or access objects and blobs natively. Likewise, as is often the case, industry adoption and deployment is usually early and short term around the hype, vs. the longer cycle of customer adoption and deployment. The downside for those who only focus on object storage (or blobs) is that they may be under pressure to do things short term instead of adjusting to customer cycles, which take longer; however, real adoption and deployment also last longer.

While the hype and industry buzz around object storage (and blobs) may have faded, customer adoption continues and is here to stay, along with block, file among others, learn more at www.objectstoragecenter.com. Also keep in mind that there is a difference between industry and customer adoption along with deployment.

Some recent Industry Activities, Trends, News and Announcements include:

In case you missed it, Amazon Web Services (AWS) announced EKS (Elastic Kubernetes Service) which, as its name implies, is an easy to use and manage Kubernetes (containers, serverless data infrastructure) service running on AWS. AWS joins others including Microsoft Azure Kubernetes Service (AKS), Google Kubernetes Engine, EasyStack (ESContainer for OpenStack and Kubernetes), VMware Pivotal Container Service (PKS) among others. What this means is that in the container serverless data infrastructure ecosystem, Kubernetes container management (orchestration platform) is gaining in both industry as well as customer adoption along with deployment.

Check out other industry news, comments, trends perspectives here.

Data Infrastructure Server StorageIO Comments Content

Server StorageIO Commentary in the news, tips and articles

Recent Server StorageIO industry trends perspectives commentary in the news.

Via BizTech: Why Hybrid (SSD and HDD) Storage Might Be Fit for SMB environments
Via Excelero: Server StorageIO white paper enabling database DBaaS productivity
Via Cloudian: YouTube video interview file services on object storage with HyperFile
Via CDW Solutions: Comments on Software Defined Access
Via SearchStorage: Comments on Cloudian HyperStore on demand cloud like pricing
Via EnterpriseStorageForum: Comments and tips on Software Defined Storage Best Practices
Via PRNewsWire: Comments on Excelero NVMe NVMesh Database and DBaaS solutions
Via SearchStorage: Comments on NooBaa multi-cloud storage management
Via CDW: Comments on New IT Strategies Improve Your Bottom Line
Via EnterpriseStorageForum: Comments on Software Defined Storage: Pros and Cons
Via DataCenterKnowledge: Comments on The Great Data Center Headache IoT
Via SearchStorage: Comments on Dell and VMware merger scenario options
Via PRNewswire: Comments on Chelsio Microsoft Validation of iWARP/RDMA
Via SearchStorage: Comments on Server Storage Industry trends and Dell EMC
Via ChannelProSMB: Comments on Hybrid HDD and SSD storage solutions
Via ChannelProNetwork: Comments on What the Future Holds for HDDs
Via HealthcareITnews: Comments on MOUNTAINS OF MOBILE DATA
Via SearchStorage: Comments on Cloudian HyperStore 7 targets multi-cloud complexities
Via GlobeNewsWire: Comments on Cloudian HyperStore 7
Via GizModo: Comments on Intel Optane 800P NVMe M.2 SSD
Via DataCenterKnowledge: Comments on getting data centers ready for IoT
Via DataCenterKnowledge: Comments on Beyond the Hype: AI in the Data Center
Via DataCenterKnowledge: Comments on Data Center and Cloud Disaster Recovery
Via SearchStorage: Comments on Cloudian HyperFile marries NAS and object storage
Via SearchStorage: Comments on Top 10 Tips on Solid State Storage Adoption Strategy
Via SearchStorage: Comments on 8 Top Tips for Beating the Big Data Deluge

View more Server, Storage and I/O trends and perspectives comments here.

Data Infrastructure Server StorageIOblog posts

Server StorageIOblog Data Infrastructure Posts

Recent and popular Server StorageIOblog posts include:

Application Data Value Characteristics Everything Is Not The Same
Application Data Availability 4 3 2 1 Data Protection
AWS Cloud Application Data Protection Webinar
Microsoft Windows Server 2019 Insiders Preview
Application Data Characteristics Types Everything Is Not The Same
Application Data Volume Velocity Variety Everything Is Not The Same
Application Data Access Lifecycle Patterns Everything Is Not The Same
Veeam GDPR preparedness experiences Webinar walking the talk
VMware continues cloud construction with March announcements
Benefits of Moving Hyper-V Disaster Recovery to the Cloud Webinar
World Backup Day 2018 Data Protection Readiness Reminder
Use Intel Optane NVMe U.2 SFF 8639 SSD drive in PCIe slot
Data Infrastructure Resource Links cloud data protection tradecraft trends
How to Achieve Flexible Data Protection Availability with All Flash Storage Solutions
November 2017 Server StorageIO Data Infrastructure Update Newsletter
IT transformation Serverless Life Beyond DevOps Podcast
Data Protection Diaries Fundamental Topics Tools Techniques Technologies Tips
HPE Announces AMD Powered Gen 10 ProLiant DL385 For Software Defined Workloads
AWS Announces New S3 Cloud Storage Security Encryption Features
Introducing Windows Subsystem for Linux WSL Overview #blogtober
Hot Popular New Trending Data Infrastructure Vendors To Watch

View other recent as well as past StorageIOblog posts here

Server StorageIO Recommended Reading (Watching and Listening) List

In case you may have missed it, here is a good presentation from AWS re:Invent 2017 by Brendan Gregg (@brendangregg) about how Netflix does EC2 and other AWS tuning along with plenty of great resource links. Keith Tenzer (@keithtenzer) provides a good perspective piece about containers in a large IT enterprise environment here including various options.

Speaking of IT data centers and data infrastructure environments, check out the list of some of the world's most extreme habitats for technology here. Mark Betz (@markbetz) has a series of Docker and Kubernetes networking fundamentals posts on his site here, as well as over at Medium including mention of Google Cloud (@googlecloud). The posts in Mark's series are good refreshers or intros to how Docker and Kubernetes handle basic networking between containers, pods, nodes and hosts in clusters. Check out part I here and part II here.

Blockchain elements
Image via https://stevetodd.typepad.com

Steve Todd (@Stevetodd) has some good perspectives about Trusted Data Exchanges e.g. life beyond blockchain and bitcoin here along with core element considerations (beyond the product pitch) here, along with associated data infrastructure and storage evolution vs. revolution here.

Watch for more items to be added to the recommended reading list book shelf soon.

Data Infrastructure Server StorageIO event activities

Events and Activities

Recent and upcoming event activities.

March 27, 2018 – Webinar – Veeams Road to GDPR Compliancy The 5 Lessons Learned

Feb 28, 2018 – Webinar – Benefits of Moving Hyper-V Disaster Recovery to the Cloud

Jan 30, 2018 – Webinar – Achieve Flexible Data Protection and Availability with All Flash Storage

Nov. 9, 2017 – Webinar – All You Need To Know about ROBO Data Protection Backup

See more webinars and activities on the Server StorageIO Events page here.

Data Infrastructure Server StorageIO Industry Resources and Links

Various useful links and resources:

Connect and Converse With Us

Subscribe to Newsletter – Newsletter Archives – StorageIO.com – StorageIOblog.com

What this all means and wrap-up

Data Infrastructures are what exist inside physical data centers spanning cloud, converged, hyper-converged, virtual, serverless and other software defined as well as legacy environments. The fundamental role of data infrastructures comprising server (compute), storage, I/O networking hardware, software, services defined by management tools, best practices and policies is to provide a platform for applications along with their data to deliver information services. With March 31 being world backup day, also focus on making sure that on April 1st you are not a fool trying to recover from a bad data protection copy. With the continued movement to flash SSD along with other forms of storage class memory (SCM) and persistent memories (PM), data moves at a faster rate, meaning data protection is even more important to get you out of trouble as fast as you get into issues.

Ok, nuff said, for now.

March 13, 2018November 26, 2023

Application Data Value Characteristics Everything Is Not The Same (Part I)

Application Data Value Characteristics Everything Is Not The Same

This is part one of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we start things off by looking at general application server storage I/O characteristics that have an impact on data value as well as access.

Everything is not the same across different organizations including Information Technology (IT) data centers, data infrastructures along with the applications as well as data they support. For example, there is so-called big data that can be many small files, objects, blobs or data and bit streams representing telemetry, click stream analytics, logs among other information.

Keep in mind that applications impact how data is accessed, used, processed, moved and stored. What this means is that a focus on data value, access patterns, along with other related topics needs to also consider application performance, availability, capacity, economic (PACE) attributes.

If everything is not the same, why is so much data along with many applications treated the same from a PACE perspective?

Data Infrastructure resources including servers, storage, networks might be cheap or inexpensive, however, there is a cost to managing them along with data.

Managing includes data protection (backup, restore, BC, DR, HA, security) along with other activities. Likewise, there is a cost to the software along with cloud services among others. By understanding how applications use and interact with data, smarter, more informed data management decisions can be made.

IT Applications and Data Infrastructure Layers

Keep in mind that everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them. Also keep in mind that programs (e.g. applications) = algorithms (code) + data structures (how data is defined and organized, structured or unstructured).

There are traditional applications, along with those tied to Internet of Things (IoT), Artificial Intelligence (AI) and Machine Learning (ML), Big Data and other analytics including real-time click stream, media and entertainment, security and surveillance, log and telemetry processing among many others.

What this means is that there are many different applications with various characteristics and attributes, along with resource (server compute, I/O network and memory, storage) as well as service requirements.

Common Applications Characteristics

Different applications have various attributes, both in general and in how they are used; for example, database transaction activity vs. reporting or analytics, logs and journals vs. redo logs, indices, tables, import/export, scratch and temp space. Performance, availability, capacity, and economics (PACE) describe the application and data characteristics and needs shown in the following figure.

Application PACE attributes (via Software Defined Data Infrastructure Essentials)

All applications have PACE attributes, however:

PACE attributes vary by application and usage
Some applications and their data are more active than others
PACE characteristics may vary within different parts of an application

Think of the PACE of an application and its associated data as its personality: how it behaves, what it does, how it does it, and when, along with value, benefit, or cost as well as quality-of-service (QoS) attributes.

Understanding applications in different environments, including data values and associated PACE attributes, is essential for making informed server, storage, I/O, and data infrastructure decisions. Data infrastructure decisions range from configuration to acquisitions or upgrades; when, where, why, and how to protect; and how to optimize performance, including capacity planning, reporting, and troubleshooting, not to mention addressing budget concerns.

Primary PACE attributes for active and inactive applications and data are:

P – Performance and activity (how things get used)
A – Availability and durability (resiliency and data protection)
C – Capacity and space (what things use or occupy)
E – Economics and Energy (people, budgets, and other barriers)

Some applications need more performance (server compute, or storage and network I/O), while others need space capacity (storage, memory, network, or I/O connectivity). Likewise, some applications have different availability needs (data protection, durability, security, resiliency, backup, business continuity, disaster recovery) that determine the tools, technologies, and techniques to use.

Budgets are also nearly always a concern, which for some applications means enabling more performance per cost while others are focused on maximizing space capacity and protection level per cost. PACE attributes also define or influence policies for QoS (performance, availability, capacity), as well as thresholds, limits, quotas, retention, and disposition, among others.
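As a rough illustration of how PACE attributes can be written down and compared, here is a minimal Python sketch. The PaceProfile class, its field names, and the sample numbers are hypothetical and are not from the book; they simply show two applications with the same budget but very different priorities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaceProfile:
    """Hypothetical PACE description of one application (illustrative fields only)."""
    name: str
    iops_needed: int       # P: performance and activity
    copies_required: int   # A: availability and durability
    capacity_gb: float     # C: capacity and space
    monthly_budget: float  # E: economics


def budget_per_iops(profile: PaceProfile) -> float:
    """Monthly budget available per I/O operation per second of required performance."""
    return profile.monthly_budget / profile.iops_needed


def budget_per_gb(profile: PaceProfile) -> float:
    """Monthly budget available per GB of required capacity."""
    return profile.monthly_budget / profile.capacity_gb


# Made-up sample values: same budget, different PACE personalities.
oltp = PaceProfile("order-database", iops_needed=50_000, copies_required=4,
                   capacity_gb=500, monthly_budget=2_000)
archive = PaceProfile("video-archive", iops_needed=200, copies_required=4,
                      capacity_gb=200_000, monthly_budget=2_000)

for app in (oltp, archive):
    print(app.name, budget_per_iops(app), budget_per_gb(app))
```

With these sample values the database has far more budget per GB but far less per unit of performance than the archive, which is one way to see why the same storage choice rarely fits both.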

Performance and Activity (How Resources Get Used)

Some applications or components that comprise a larger solution will have more performance demands than others. Likewise, the performance characteristics of applications along with their associated data will also vary. Performance applies to the server, storage, and I/O networking hardware along with associated software and applications.

For servers, performance is focused on how much CPU or processor time is used, along with memory and I/O operations. I/O operations to create, read, update, or delete (CRUD) data include activity rate (frequency or data velocity) of I/O operations (IOPS). Other considerations include the volume or amount of data being moved (bandwidth, throughput, transfer), response time or latency, along with queue depths.

Activity is the amount of work to do or being done in a given amount of time (seconds, minutes, hours, days, weeks), which can be transactions, rates, or IOPS. Additional performance considerations include latency, bandwidth, throughput, response time, queues, reads or writes, gets or puts, updates, lists, directories, searches, page views, files opened, videos viewed, or downloads.

Server, storage, and I/O network performance include:

Processor CPU usage time and queues (user and system overhead)
Memory usage effectiveness including page and swap
I/O activity including between servers and storage
Errors, retransmission, retries, and rebuilds

The following figure shows a generic performance example of data being accessed (mixed reads, writes, random, sequential, big, small, low- and high-latency) on a local and a remote basis. The example shows how, for a given time interval (see lower right), applications are accessing and working with data via different data streams in the larger image left center. Also shown are queues and I/O handling along with end-to-end (E2E) response time.

Server I/O performance fundamentals (via Software Defined Data Infrastructure Essentials)

Also shown on the left in the above figure is an example of E2E response time from the application through the various data infrastructure layers, as well as, lower center, the response time from the server to the memory or storage devices.

Various queues are shown in the middle of the above figure which are indicators of how much work is occurring, if the processing is keeping up with the work or causing backlogs. Context is needed for queues, as they exist in the server, I/O networking devices, and software drivers, as well as in storage among other locations.

Some basic server, storage, I/O metrics that matter include:

Queue depth of I/Os waiting to be processed and concurrency
CPU and memory usage to process I/Os
I/O size, or how much data can be moved in a given operation
I/O activity rate or IOPS = (amount of data moved / I/O size) per unit of time
Bandwidth = data moved per unit of time = I/O size × I/O rate
Latency usually increases with larger I/O sizes, decreases with smaller requests
I/O rates usually increase with smaller I/O sizes and vice versa
Bandwidth increases with larger I/O sizes and vice versa
Sequential stream access data may have better performance than some random access data
Not all data is conducive to being sequential stream, or random
Lower response time is better, higher activity rates and bandwidth are better
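The relationship between I/O size, I/O rate, and bandwidth in the list above can be worked through in a few lines of Python. This is a minimal sketch; the function names and the sample workloads are illustrative assumptions, not measurements.

```python
KIB = 1024
MIB = 1024 * 1024


def bandwidth_bytes_per_sec(io_size_bytes: int, iops: float) -> float:
    """Bandwidth = I/O size x I/O rate."""
    return io_size_bytes * iops


def iops_from_bandwidth(bandwidth: float, io_size_bytes: int) -> float:
    """I/O rate = data moved per unit of time / I/O size."""
    return bandwidth / io_size_bytes


# Hypothetical workloads: many small I/Os vs. fewer large I/Os.
small = bandwidth_bytes_per_sec(4 * KIB, 100_000)  # 100,000 IOPS of 4 KiB
large = bandwidth_bytes_per_sec(1 * MIB, 1_000)    # 1,000 IOPS of 1 MiB

print(f"small I/O: {small / MIB:.1f} MiB/s")
print(f"large I/O: {large / MIB:.1f} MiB/s")
```

In this example the large-I/O workload moves more data per second with one hundredth of the I/O rate, which is why IOPS and bandwidth need to be read together.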

Queues with high latency and small I/O size or small I/O rates could indicate a performance bottleneck. Queues with low latency and high I/O rates with good bandwidth or data being moved could be a good thing. It is important to look at several metrics together, not just IOPS or activity, or bandwidth, queues, or response time. Also, keep in mind that metrics that matter for your environment may be different from those for somebody else.
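One way to tie queue depth, I/O rate, and latency together is Little's Law, which is not named in the original text but is a standard queueing result: in a steady state, the average number of outstanding I/Os equals the I/O rate multiplied by the average response time. The sketch below assumes steady-state averages, and the sample numbers are made up.

```python
def average_outstanding_ios(iops: float, latency_seconds: float) -> float:
    """Little's Law: average I/Os in flight = I/O rate x average response time."""
    return iops * latency_seconds


def implied_latency_seconds(outstanding_ios: float, iops: float) -> float:
    """Rearranged: average response time = I/Os in flight / I/O rate."""
    return outstanding_ios / iops


# Hypothetical observations of two devices with the same queue depth.
busy_and_fast = implied_latency_seconds(outstanding_ios=32, iops=64_000)
busy_and_slow = implied_latency_seconds(outstanding_ios=32, iops=400)

print(f"fast device: {busy_and_fast * 1000:.2f} ms per I/O")
print(f"slow device: {busy_and_slow * 1000:.2f} ms per I/O")
```

The same queue depth of 32 means very different things depending on the I/O rate behind it, which is the reason for reading queue depth alongside rate and response time rather than alone.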

Something to keep in perspective is that there can be a large amount of data with low performance, or a small amount of data with high performance, not to mention many other variations. The important concept is that scaling space capacity does not mean performance also improves, or vice versa. After all, everything is not the same.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Ok, nuff said, for now.

March 13, 2018 | November 26, 2023

Application Data Availability 4 3 2 1 Data Protection

4 3 2 1 data protection Application Data Availability Everything Is Not The Same

This is part two of a five-part mini-series looking at Application Data Value Characteristics (everything is not the same), a companion excerpt from Chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application performance, availability, capacity, economic (PACE) attributes that have an impact on data value as well as availability.

Availability (Accessibility, Durability, Consistency)

Just as there are many different aspects and focus areas for performance, there are also several facets to availability. Note that application performance requires availability, and availability relies on some level of performance.

Availability is a broad and encompassing area that includes data protection to protect, preserve, and serve (backup/restore, archive, BC, BR, DR, HA) data and applications. There are logical and physical aspects of availability including data protection as well as security including key management (manage your keys or authentication and certificates) and permissions, among other things.

Availability = accessibility (can you get to your application and data) + durability (is the data intact and consistent). This includes basic Reliability, Availability, Serviceability (RAS), as well as high availability, accessibility, and durability. “Durable” has multiple meanings, so context is important. In one context, durable describes how data infrastructure resources hold up to, survive, and tolerate wear and tear from use (i.e., endurance), for example, Flash SSD or mechanical devices such as Hard Disk Drives (HDDs). In another context, durable refers to data, meaning how many copies exist in various places.

Server, storage, and I/O network availability topics include:

Resiliency and self-healing to tolerate failure or disruption
Hardware, software, and services configured for resiliency
Accessibility to reach or be reached for handling work
Durability and consistency of data to be available for access
Protection of data, applications, and assets including security

Additional server I/O and data infrastructure along with storage topics include:

Backup/restore, replication, snapshots, sync, and copies
Basic Reliability, Availability, Serviceability, HA, fail over, BC, BR, and DR
Alternative paths, redundant components, and associated software
Applications that are fault-tolerant, resilient, and self-healing
Non disruptive upgrades, code (application or software) loads, and activation
Immediate data consistency and integrity vs. eventual consistency
Virus, malware, and other data corruption or loss prevention

From a data protection standpoint, the fundamental rule or guideline is 4 3 2 1, which means having at least four copies consisting of at least three versions (different points in time), at least two of which are on different systems or storage devices, and at least one of which is off-site (on-line, off-line, cloud, or other). There are many variations of the 4 3 2 1 rule shown in the following figure, along with approaches on how to manage it and which technology to use. We will go deeper into this subject in later chapters. For now, remember the following.

4 3 2 1 data protection (via Software Defined Data Infrastructure Essentials)

4    At least four copies of data (or more). Enables durability in case a copy goes bad or is deleted or corrupted, or a device or site fails.
3    At least three versions of the data to retain. Enables various recovery points in time to restore, resume, or restart from.
2    Data located on two or more systems (devices or media). Enables protection against device, system, server, file system, or other fault or failure.
1    At least one of those copies off-premises and not live (isolated from the active primary copy). Enables resiliency across sites, as well as a space, time, and distance gap for protection.
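The 4 3 2 1 guideline can be expressed as a simple check over an inventory of copies. The following Python sketch is illustrative only: the Copy record and its fields are hypothetical, and the single offsite flag simplifies the "off-premises and not live" condition.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Copy:
    """Hypothetical record describing one protection copy."""
    version: str   # point-in-time label, e.g. a backup date
    system: str    # system, device, or medium holding the copy
    offsite: bool  # True if off-premises and isolated from the primary


def check_4321(copies: List[Copy]) -> Dict[str, bool]:
    """Evaluate each part of the 4 3 2 1 guideline separately."""
    return {
        "4 copies": len(copies) >= 4,
        "3 versions": len({c.version for c in copies}) >= 3,
        "2 systems": len({c.system for c in copies}) >= 2,
        "1 offsite": any(c.offsite for c in copies),
    }


def meets_4321(copies: List[Copy]) -> bool:
    """True only when every part of the guideline is satisfied."""
    return all(check_4321(copies).values())


# Made-up inventory for illustration.
inventory = [
    Copy("monday", "nas-1", False),
    Copy("tuesday", "nas-1", False),
    Copy("wednesday", "backup-server", False),
    Copy("wednesday", "cloud-bucket", True),
]
print(check_4321(inventory))
```

Reporting each part separately makes it easier to see which leg of the guideline is missing, for example plenty of copies but none off-site.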

Capacity and Space (What Gets Consumed and Occupied)

In addition to being available and accessible in a timely manner (performance), data (and applications) occupy space. That space includes memory in servers; data and applications also use consumable processor (CPU) time along with I/O (performance), including over networks.

Data and applications also consume storage space where they are stored. In addition to basic data space, there is also space consumed for metadata as well as protection copies (and overhead), application settings, logs, and other items. Another aspect of capacity includes network IP ports and addresses, software licenses, server, storage, and network bandwidth or service time.

Server, storage, and I/O network capacity topics include:

Consumable time-expiring resources (processor time, I/O, network bandwidth)
Network IP and other addresses
Physical resources of servers, storage, and I/O networking devices
Software licenses based on consumption or number of users
Primary and protection copies of data and applications
Active and standby data infrastructure resources and sites
Data footprint reduction (DFR) tools and techniques for space optimization
Policies, quotas, thresholds, limits, and capacity QoS
Application and database optimization

DFR includes various techniques, technologies, and tools to reduce the impact or overhead of protecting, preserving, and serving more data for longer periods of time. There are many different approaches to implementing a DFR strategy, since there are various applications and data.

Common DFR techniques and technologies include archiving, backup modernization, copy data management (CDM), cleanup, compression, consolidation, data management, deletion and dedupe, storage tiering, RAID (including parity-based protection, erasure codes, local reconstruction codes [LRC], Reed-Solomon, and Ceph Shingled Erasure Code [SHEC], among others), and other protection configurations, along with thin provisioning.

DFR can be implemented in various complementary locations from row-level compression in database or email to normalized databases, to file systems, operating systems, appliances, and storage systems using various techniques.

Also, keep in mind that not all data is the same; some is sparse, some is dense, and some can be compressed or deduped while other data cannot. However, even when data is not compressible or dedupable, identical copies can be identified, with links created to a common copy.
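To make the sparse vs. dense point concrete, the sketch below uses Python's standard zlib and hashlib modules. It compares how well a run of zero bytes and a run of random bytes compress, and it shows one simple way (assumed here, not taken from the book) to link identical copies to a single stored instance by content hash.

```python
import hashlib
import os
import zlib
from typing import Dict, Tuple


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher means more compressible)."""
    return len(data) / len(zlib.compress(data))


def dedupe(items: Dict[str, bytes]) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """Store each unique content once; map every name to its content digest."""
    store: Dict[str, bytes] = {}
    links: Dict[str, str] = {}
    for name, data in items.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # keep only the first identical copy
        links[name] = digest            # every name links to the common copy
    return store, links


sparse = bytes(1_000_000)      # one million zero bytes
dense = os.urandom(1_000_000)  # random bytes, effectively incompressible

print(f"sparse ratio: {compression_ratio(sparse):.0f}:1")
print(f"dense ratio:  {compression_ratio(dense):.2f}:1")

store, links = dedupe({"a.bin": dense, "copy-of-a.bin": dense, "b.bin": sparse})
print(len(links), "names,", len(store), "stored copies")
```

The random data does not shrink at all, yet its duplicate still costs nothing extra once identical copies are linked to one stored instance.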

Economics (People, Budgets, Energy and other Constraints)

If one constant in life and technology is change, then the other constant is concern about economics or costs. There is a cost to enable and maintain a data infrastructure on premise or in the cloud, which exists to protect, preserve, and serve data and information applications.

However, there should also be a benefit to having the data infrastructure to house data and support applications that provide information to users of the services. A common economic focus is what something costs, either as an up-front capital expenditure (CapEx) or as an operating expenditure (OpEx), along with recurring fees.

In general, economic considerations include:

Budgets (CapEx and OpEx), both up front and in recurring fees
Whether you buy, lease, rent, subscribe, or use free and open sources
People time needed to integrate and support even free open-source software
Costs including hardware, software, services, power, cooling, facilities, tools
People time includes base salary, benefits, training and education
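The considerations above can be combined into a simple cost comparison. The following Python sketch is only a starting point under stated assumptions: the formula ignores financing, depreciation, and discounting, and every number is made up for illustration.

```python
def total_cost(capex: float, monthly_opex: float, months: int,
               people_hours_per_month: float = 0.0,
               hourly_rate: float = 0.0) -> float:
    """Up-front cost plus recurring fees and people time over a period."""
    monthly_people_cost = people_hours_per_month * hourly_rate
    return capex + months * (monthly_opex + monthly_people_cost)


# Hypothetical three-year comparison (all values are illustrative).
buy = total_cost(capex=30_000, monthly_opex=200, months=36,
                 people_hours_per_month=10, hourly_rate=50)
subscribe = total_cost(capex=0, monthly_opex=1_200, months=36,
                       people_hours_per_month=4, hourly_rate=50)
open_source = total_cost(capex=0, monthly_opex=300, months=36,
                         people_hours_per_month=20, hourly_rate=50)

print(buy, subscribe, open_source)
```

Even with no license fee, the open-source option in this example still carries a sizable cost because of the people time needed to integrate and support it.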

Where to learn more

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Keep in mind that with Application Data Value Characteristics Everything Is Not The Same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. All applications have some element of performance, availability, capacity, economic (PACE) needs as well as resource demands. With data storage, there is often a focus on storage efficiency and utilization, which is where data footprint reduction (DFR) techniques, tools, trends, and technologies address capacity requirements. However, with data storage there is also an expanding focus on storage effectiveness, also known as productivity, tied to performance, along with availability including 4 3 2 1 data protection. Continue reading the next post (Part III Application Data Characteristics Types Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018 | November 26, 2023

Application Data Characteristics Types Everything Is Not The Same

This is part three of a five-part mini-series looking at Application Data Value Characteristics (everything is not the same), a companion excerpt from Chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on different types of data. There is more to data than simply being big data, fast data, big fast or unstructured, structured or semistructured, some of which has been touched on in this series, with more to follow. Note that there is also data in terms of the programs, applications, code, rules, and policies, as well as configuration settings and metadata, along with other items stored.

Various Types of Data

Data types along with characteristics include big data, little data, fast data, and old as well as new data with a different value, life-cycle, volume and velocity. There is big data in files and objects representing images, figures, text, and binary content, structured or unstructured, that is software defined by the applications that create, modify and use it.

There are many different types of data and applications to meet various business, organization, or functional needs. Keep in mind that applications are based on programs which consist of algorithms and data structures that define the data, how to use it, as well as how and when to store it. Those data structures define data that will get transformed into information by programs while also being held in memory and stored in various formats.

Just as various applications have different algorithms, they also have different types of data. Even though everything is not the same in all environments, or even in how the same applications get used across various organizations, there are some similarities and general characteristics. Keep in mind that information is the result of programs (applications and their algorithms) that process data into something useful or of value.

Data typically has a basic life cycle of:

Creation and some activity, including being protected
Dormant, followed by either continued activity or going inactive
Disposition (delete or remove)

In general, data can be:

Temporary, ephemeral or transient
Dynamic or changing (“hot data”)
Active static on-line, near-line, or off-line (“warm data”)
Inactive static on-line or off-line (“cold data”)
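As a rough sketch of how the hot, warm, and cold labels above might be assigned in practice, the Python function below classifies data by how recently it was changed or accessed. The function name and the 7-day and 90-day windows are hypothetical thresholds chosen for illustration; real environments would pick their own, and temporary data is not handled here.

```python
from datetime import datetime, timedelta


def classify_temperature(last_modified: datetime,
                         last_accessed: datetime,
                         now: datetime,
                         hot_window: timedelta = timedelta(days=7),
                         warm_window: timedelta = timedelta(days=90)) -> str:
    """Label data as hot (changing), warm (static but read), or cold (inactive)."""
    if now - last_modified <= hot_window:
        return "hot"    # dynamic or changing data
    if now - last_accessed <= warm_window:
        return "warm"   # static data that is still being read
    return "cold"       # static data that is no longer active


now = datetime(2024, 1, 31)
print(classify_temperature(datetime(2024, 1, 30), datetime(2024, 1, 30), now))
print(classify_temperature(datetime(2023, 11, 1), datetime(2024, 1, 1), now))
print(classify_temperature(datetime(2020, 5, 1), datetime(2021, 5, 1), now))
```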

Data is organized as:

Structured
Semi-structured
Unstructured

General data characteristics include:

Value = From no value to unknown to some or high value
Volume = Amount of data, files, objects of a given size
Variety = Various types of data (small, big, fast, structured, unstructured)
Velocity = Data streams, flows, rates, load, process, access, active or static

The following figure shows how different data has various values over time. Data that has no value today or in the future can be deleted, while data with unknown value can be retained.

Different data with various values over time

Data Value Known, Unknown and No Value

General characteristics include the value of the data, which in turn determines its performance, availability, capacity, and economic considerations. Also, data can be ephemeral (temporary) or kept for longer periods of time on persistent, non-volatile storage (you do not lose the data when power is turned off). Examples of temporary data include work and scratch areas such as where data gets imported into, or exported out of, an application or database.

Data can also be little, big, or big and fast, terms which describe in part the size as well as volume along with the speed or velocity of being created, accessed, and processed. The importance of understanding characteristics of data and how their associated applications use them is to enable effective decision-making about performance, availability, capacity, and economics of data infrastructure resources.

Data Value

There is more to data storage than how much space capacity per cost.

All data has one of three basic values:

No value = ephemeral/temp/scratch = Why keep it?
Some value = current or emerging future value, which can be low or high = Keep
Unknown value = protect until value is unlocked, or no remaining value

In addition to the above basic three, data with some value can also be further subdivided into little value, some value, or high value. Of course, you can keep subdividing into as many more or different categories as needed, after all, everything is not always the same across environments.

Besides data having some value, that value can also change by increasing or decreasing over time, or even by going from unknown to a known value, from known to unknown, or to no value. Data with no value can be discarded. If in doubt, make and keep a copy of that data somewhere safe until its value (or lack of value) is fully known and understood.
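The three basic values and the "if in doubt, keep a copy" guidance can be sketched as a small decision function. The names below (DataValue, disposition) and the returned action strings are hypothetical and only illustrate the reasoning.

```python
from enum import Enum


class DataValue(Enum):
    """The three basic data values described above."""
    NONE = "no value"
    SOME = "some value"
    UNKNOWN = "unknown value"


def disposition(value: DataValue, in_doubt: bool = False) -> str:
    """Suggest what to do with data based on its current value."""
    if value is DataValue.NONE and not in_doubt:
        return "discard"
    if value is DataValue.SOME:
        return "keep"
    # Unknown value, or supposedly no value but with doubts remaining.
    return "protect until value is known"


print(disposition(DataValue.NONE))
print(disposition(DataValue.NONE, in_doubt=True))
print(disposition(DataValue.UNKNOWN))
```

Because value changes over time, a function like this would be re-evaluated periodically rather than applied once.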

The importance of understanding the value of data is to enable effective decision-making on where and how to protect, preserve, and cost-effectively store the data. Note that cost-effective does not necessarily mean the cheapest or lowest-cost approach, rather it means the way that aligns with the value and importance of the data at a given point in time.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data has different value at various times, and that value is also evolving. Everything Is Not The Same across various organizations, data centers, data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part IV Application Data Volume Velocity Variety Everything Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018 | October 18, 2024

Application Data Volume Velocity Variety Everything Is Not The Same

This is part four of a five-part mini-series looking at Application Data Value Characteristics (everything is not the same), a companion excerpt from Chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on data volume, velocity, and variety. After all, everything is not the same, not to mention the many different aspects of big data as well as little data.

Volume of Data

Data is growing at a faster rate every day, and that data is being retained for longer periods. Some data being retained has known value, while a growing amount of data has an unknown value. Data is generated or created from many sources, including mobile devices, social networks, web-connected systems or machines, and sensors including IoT and IoD. Besides where data is created, there are also many consumers of data (applications) that range from legacy to mobile, cloud, and IoT, among others.

Unknown-value data may eventually have value in the future when somebody realizes that they can do something with it, or a technology tool or application becomes available to transform the data with unknown value into valuable information.

Some data gets retained in its native or raw form, while other data gets processed by application program algorithms into summary data, or is curated and aggregated with other data to be transformed into new useful data. The figure below shows, from left to right and front to back, more data being created, and that data also getting larger over time. For example, on the left are two data items, objects, files, or blocks representing some information.

In the center of the following figure are more columns and rows of data, with each of those data items also becoming larger. Moving farther to the right, there are yet more data items stacked up higher, as well as across and farther back, with those items also being larger. The following figure can represent blocks of storage, files in a file system, rows, and columns in a database or key-value repository, or objects in a cloud or object storage system.

Increasing data velocity and volume, more data and data getting larger

In addition to more data being created, some of that data is relatively small in terms of the records or data structure entities being stored. However, there can be a large quantity of those smaller data items. In addition to the amount of data, as well as the size of the data, protection or overhead copies of data are also kept.
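The combined effect of item count, item size, and protection copies is simple arithmetic, sketched below in Python. The function name and the sample numbers are illustrative, and the estimate deliberately ignores metadata, other overhead, and any data footprint reduction.

```python
def footprint_bytes(item_count: int, item_size_bytes: int,
                    protection_copies: int = 0) -> int:
    """Primary data plus protection copies, before any overhead or reduction."""
    return item_count * item_size_bytes * (1 + protection_copies)


GIB = 1024 ** 3

# Hypothetical example: ten million small 2 KiB records.
primary = footprint_bytes(10_000_000, 2 * 1024)
protected = footprint_bytes(10_000_000, 2 * 1024, protection_copies=3)

print(f"primary only:  {primary / GIB:.1f} GiB")
print(f"with 3 copies: {protected / GIB:.1f} GiB")
```

Small items add up: in this example, records of only 2 KiB each amount to roughly 19 GiB of primary data and about four times that once three protection copies are kept.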

Another dimension is that data is also getting larger, where the data structures describing a piece of data for an application have increased in size. For example, a still photograph taken with a digital camera, cell phone, or another mobile handheld device, drone, or other IoT device increases in size with each new generation of cameras, as there are more megapixels.

Variety of Data

In addition to having value and volume, there are also different varieties of data, including ephemeral (temporary), persistent, primary, metadata, structured, semi-structured, unstructured, little, and big data. Keep in mind that programs, applications, tools, and utilities get stored as data, while they also use, create, access, and manage data.

There is also primary data and metadata, or data about data, as well as system data that is also sometimes referred to as metadata. Here is where context comes into play as part of tradecraft, as there can be metadata describing data being used by programs, as well as metadata about systems, applications, file systems, databases, and storage systems, among other things, including little and big data.

Context also matters regarding big data, as there are applications such as statistical analysis software and Hadoop, among others, for processing (analyzing) large amounts of data. The data being processed may not be big regarding the records or data entity items, but there may be a large volume. In addition to big data analytics, data, and applications, there is also data that is very big (as well as large volumes or collections of data sets).

For example, video and audio, among others, may also be referred to as big fast data, or large data. A challenge with larger data items is the complexity of moving them over distance promptly, as well as processing that requires new approaches, algorithms, data structures, and storage management techniques.

Likewise, the challenges with large volumes of smaller data are similar in that data needs to be moved, protected, preserved, and served cost-effectively for long periods of time. Both large and small data are stored (in memory or storage) in various types of data repositories.

In general, data in repositories is accessed locally, remotely, or via a cloud using:

Object and blob access via stream, queue, and Application Programming Interface (API)
File-based using local or networked file systems
Block-based access of disk partitions, LUNs (logical unit numbers), or volumes
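The difference between these access methods can be sketched with a toy Python example. Everything here is illustrative: an ordinary temporary file stands in for a volume, the 512-byte block size is an assumption, and the ObjectStore class is an in-memory stand-in rather than any real object storage API.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for this sketch


def read_file(path: str) -> bytes:
    """File-based access: address data by name and read it as a byte stream."""
    with open(path, "rb") as f:
        return f.read()


def read_block(path: str, block_number: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Block-based access: address data by block number (offset = number x size)."""
    with open(path, "rb") as f:
        f.seek(block_number * block_size)
        return f.read(block_size)


class ObjectStore:
    """Toy in-memory object store: put and get whole objects by key."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]


def demo():
    """Write two blocks, then read them back three different ways."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "volume.img")
        with open(path, "wb") as f:
            f.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
        whole = read_file(path)          # file: everything, by name
        second = read_block(path, 1)     # block: one block, by number
    store = ObjectStore()
    store.put("photos/example.jpg", whole)  # object: whole item, by key
    return len(whole), second[:1], store.get("photos/example.jpg") == whole


print(demo())
```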

The following figure shows varieties of application data value including (left) photos or images, audio, videos, and various log, event, and telemetry data, as well as (right) sparse and dense data.

Varieties of data (bits, bytes, blocks, blobs, and bitstreams)

Velocity of Data

Data, in addition to having value (known, unknown, or none), volume (size and quantity), and variety (structured, unstructured, semi-structured, primary, metadata, small, big), also has velocity. Velocity refers to how fast (or slowly) data is accessed, including being stored, retrieved, updated, or scanned, and whether it is active (being updated, or fixed and static) or dormant and inactive. In addition to data access and life cycle, velocity also refers to how data is used, such as random or sequential or some combination. Think of data velocity as how data, or streams of data, flow in various ways.

Velocity also describes how data is used and accessed, including:

Active (hot), static (warm and WORM), or dormant (cold)
Random or sequential, read or write-accessed
Real-time (online, synchronous) or time-delayed
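One simple way to tell random from sequential access is to look at a trace of I/O requests and count how many start exactly where the previous one ended. The Python sketch below assumes a hypothetical trace format of (offset, length) pairs; real tracing tools report this in their own formats.

```python
from typing import List, Tuple


def sequential_fraction(requests: List[Tuple[int, int]]) -> float:
    """Fraction of requests that begin where the previous request ended.

    Each request is an (offset, length) pair in bytes. Returns 0.0 when
    there are fewer than two requests to compare.
    """
    if len(requests) < 2:
        return 0.0
    sequential = sum(
        1
        for (prev_offset, prev_length), (offset, _) in zip(requests, requests[1:])
        if offset == prev_offset + prev_length
    )
    return sequential / (len(requests) - 1)


# Made-up traces for illustration.
streaming = [(0, 4096), (4096, 4096), (8192, 4096)]
scattered = [(0, 4096), (81920, 4096), (4096, 4096)]

print(sequential_fraction(streaming))
print(sequential_fraction(scattered))
```

A value near 1.0 suggests a sequential stream, while a value near 0.0 suggests random access; many real workloads fall somewhere in between.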

This matters because by understanding how applications use data, or how data is accessed via applications, you can make informed decisions. That insight also helps you design, configure, and manage servers, storage, and I/O resources (hardware, software, services) to meet various needs. Understanding Application Data Value, including the velocity of the data both when it is created and when it is used, is important for aligning the applicable performance techniques and technologies.

Where to learn more

Learn more about Application Data Value, application characteristics, performance, availability, capacity, economic (PACE) along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

- Part 1 – Application Data Value Characteristics Everything Is Not The Same
- Part 2 – 4 3 2 1 Data Protection Application Data Availability
- Part 3 – Application Data Characteristics Types Everything Is Not The Same
- Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
- Part 5 – Application Data Access life cycle Patterns Everything Not The Same
- Software Defined, Cloud, Object and Blob Storage
- Data Infrastructure server storage I/O network Recommended Reading
- World Backup Day 2018 Data Protection Readiness Reminder
- Data Infrastructure Server Storage I/O related Tradecraft Overview
- Data Infrastructure Overview, Its What’s Inside of Data Centers
- 4 3 2 1 and 3 2 1 data protection best practices
- Garbage data in, garbage information out, big data or big garbage?
- GDPR (General Data Protection Regulation) Resources Are You Ready?
- Which Enterprise HDD to use for a Content Server Platform
- The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
- The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
- Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)

Additional learning experiences, along with common questions (and answers) as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data has different value, size, and velocity as part of its characteristics, including how it is used by various applications. Keep in mind that with Application Data Value Characteristics Everything Is Not The Same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part V Application Data Access life cycle Patterns Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018 | November 26, 2023

Application Data Access Lifecycle Patterns Everything Is Not The Same

Application Data Access Life cycle Patterns Everything Is Not The Same (Part V)

This is part five of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we look at various application and data lifecycle patterns as well as wrap up this series.

Active (Hot), Static (Warm and WORM), or Dormant (Cold) Data and Lifecycles

When it comes to Application Data Value, a common question I hear is why not keep all data?

If the data has value, and you have a large enough budget, why not? On the other hand, most organizations have a budget and other constraints that determine how much and what data to retain.

Another common question I get asked (or told) is, isn’t the objective to keep less data to cut costs?

If the data has no value, then get rid of it. On the other hand, if data has value or unknown value, then find ways to remove the cost of keeping more data for longer periods of time so its value can be realized.

In general, the data life cycle (called by some cradle to grave, or birth or creation to disposition) is that data is created, saved and stored, and perhaps updated and read, with access patterns and value changing over time. During that time, the data (which includes applications and their settings) will be protected with copies or some other technique, and eventually disposed of.

Between the time when data is created and when it is disposed of, there are many variations of what gets done and needs to be done. Considering static data for a moment, some applications and their data, or data and their applications, create data which is active for a short period, then goes dormant, then is active again briefly before going cold (see the left side of the following figure). This is a classic application, data, and information life-cycle model (ILM), along with tiering or data movement and migration, that still applies for some scenarios.

Changing data access patterns for different applications

However, a newer scenario over the past several years that continues to increase is shown on the right side of the above figure. In this scenario, data is initially active for updates, then goes cold or WORM (Write Once/Read Many); however, it warms back up as a static reference, on the web, as big data, and for other uses where it is used to create new data and information.

Data, in addition to its other attributes already mentioned, can be active (hot), residing in a memory cache, buffers inside a server, or on a fast storage appliance or caching appliance. Hot data means that it is actively being used for reads or writes; this is what the term heat map pertains to in the context of servers, storage, data, and applications. The heat map shows where the hot or active data is along with its other characteristics.

Context is important here, as there are also IT facilities heat maps, which refer to physical facilities including what servers are consuming power and generating heat. Note that some current and emerging data center infrastructure management (DCIM) tools can correlate the physical facilities power, cooling, and heat to actual work being done from an applications perspective. This correlated or converged management view enables more granular analysis and effective decision-making on how to best utilize data infrastructure resources.

In addition to being hot or active, data can be warm (not as heavily accessed) or cold (rarely if ever accessed), as well as online, near-line, or off-line. As its name implies, warm data may occasionally be used, either updated and written, or static and just being read. Some data also gets protected as WORM data using hardware or software technologies. WORM data, not to be confused with warm data, is fixed or immutable (cannot be changed).

When looking at data (or storage), it is important to see when the data was created as well as when it was modified. However, you should avoid the mistake of looking only at when it was created or modified. Instead, also look to see when it was last read, as well as how often it is read. You might find that some data has not been updated for several years, but it is still accessed several times an hour or minute. Also, keep in mind that the metadata about the actual data may be being updated, even while the data itself is static.
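
To make the point about last read versus last modified concrete, here is a minimal Python sketch, offered as one possible approach rather than how any particular tool does it. It uses the standard os.stat call to compare the two times for a file, then classifies the file by how recently it was read. The function names and the 7 day and 365 day thresholds are assumptions for illustration. Creation time is left out because it is not reported consistently across platforms, and many systems mount file systems with options such as relatime or noatime that limit or disable access-time updates, so treat st_atime as a hint.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400


def describe_activity(path, now=None):
    """Return days since a file was last modified and last read."""
    now = time.time() if now is None else now
    info = os.stat(path)
    return {
        "days_since_modified": (now - info.st_mtime) / SECONDS_PER_DAY,
        "days_since_read": (now - info.st_atime) / SECONDS_PER_DAY,
    }


def temperature(days_since_read, hot_days=7, cold_days=365):
    """Classify by last read time; the thresholds are illustrative assumptions."""
    if days_since_read <= hot_days:
        return "hot"
    if days_since_read <= cold_days:
        return "warm"
    return "cold"


# Usage: a file last modified about three years ago but read an hour ago
now = time.time()
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"static reference data")
    path = handle.name
os.utime(path, (now - 3600, now - 3 * 365 * SECONDS_PER_DAY))  # (atime, mtime)

activity = describe_activity(path, now=now)
print(round(activity["days_since_modified"]), "days since modified")
print("classified as", temperature(activity["days_since_read"]))
os.remove(path)
```

Looking only at the modified time would make this file seem cold, yet by last read it is still hot.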

Also, look at your applications’ characteristics as well as how data gets used, to see if it is conducive to caching or automated tiering based on activity, events, or time. For example, there may be a large amount of data for an energy or oil exploration project that normally sits on slower, lower-cost storage, but that now and then needs some analysis run on it.

Using data and storage management tools, given notice or based on activity, that large or big data could be promoted to faster storage, or applications migrated to be closer to the data to speed up processing. Another example is weekly, monthly, quarterly, or year-end processing of financial, accounting, payroll, inventory, or enterprise resource planning (ERP) schedules. By knowing how and when the applications use the data, which also means understanding the data, automated tools and policies can be used to tier or cache data to speed up processing and thereby boost productivity.
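
The idea of tiering on activity or on a known schedule can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a real storage management API: the DataSet fields, the tier names "fast" and "capacity", the read threshold and the one day lead time are all hypothetical, and month rollover is ignored to keep it short.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataSet:
    name: str
    tier: str                 # "fast" or "capacity" (hypothetical tier names)
    reads_last_7_days: int
    needed_on: tuple = ()     # days of the month with scheduled processing


def plan_tier(ds, today, hot_reads=1000, lead_days=1):
    """Return the tier a data set should be on today.

    Promote when recent activity is high, or when scheduled processing
    falls within lead_days (given notice); otherwise use capacity storage.
    """
    if ds.reads_last_7_days >= hot_reads:
        return "fast"
    for day in ds.needed_on:
        if 0 <= day - today.day <= lead_days:
            return "fast"
    return "capacity"


# Usage: exploration data that is idle, and payroll data due tomorrow
seismic = DataSet("seismic-survey", "capacity", reads_last_7_days=12)
payroll = DataSet("payroll", "capacity", reads_last_7_days=40, needed_on=(15, 28))

today = date(2018, 3, 14)
for ds in (seismic, payroll):
    target = plan_tier(ds, today)
    action = "stay on" if target == ds.tier else "move to"
    print(ds.name, action, target)
```

Here the payroll data is promoted a day ahead of its scheduled run even though recent activity is low, which is the "given notice" case, while the exploration data stays on lower-cost storage until activity picks up.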

All applications have performance, availability, capacity, economic (PACE) attributes, however:

PACE attributes vary by Application Data Value and usage
Some applications and their data are more active than others
PACE characteristics may vary within different parts of an application
PACE application and data characteristics along with value change over time

Read more about Application Data Value, PACE and application characteristics in Software Defined Data Infrastructure Essentials (CRC Press 2017).

Where to learn more

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Lifecycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)

Additional learning experiences, along with common questions (and answers) as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Keep in mind that with Application Data Value, everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them.

Also keep in mind that more data is being created, the size of those data items (files, objects, entities, records) is also increasing, as is the speed at which they get created and accessed. The challenge is not just that there is more data, or that data is bigger or accessed faster; it’s all of those, along with changing value as well as diverse applications, to keep in perspective. With the new General Data Protection Regulation (GDPR) going into effect May 25, 2018, now is a good time to assess and gain insight into what data you have, its value, retention as well as disposition policies.

Remember, there are different data types, value, life-cycle, volume and velocity that change over time, and with Application Data Value Everything Is Not The Same, so why treat and manage everything the same?

Ok, nuff said, for now.

November 26, 2017 | October 18, 2024

Data Protection Diaries Fundamental Topics Tools Techniques Technologies Tips

Data Protection Fundamental Topics Tools Techniques Technologies Tips

Data Infrastructure and Data protection fundamental companion to Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017)

By Greg Schulz – www.storageioblog.com November 26, 2017

This is Part I of a multi-part series on Data Protection fundamental tools topics techniques terms technologies trends tradecraft tips as a follow-up to my Data Protection Diaries series, as well as a companion to my new book Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Server Storage I/O Fundamental tradecraft (CRC Press 2017).

The focus of this series is around data protection fundamental topics including Data Infrastructure Services: Availability, RAS, RAID and Erasure Codes (including LRC) (Chapter 9), and Data Infrastructure Services: Availability, Recovery Point (Chapter 10). Additional Data Protection related chapters include Storage Mediums and Component Devices (Chapter 7), Management, Access, Tenancy, and Performance (Chapter 8), as well as Capacity, Data Footprint Reduction (Chapter 11), Storage Systems and Solutions Products and Cloud (Chapter 12), Data Infrastructure and Software-Defined Management (Chapter 13) among others.

Posts in the series include excerpts from Software Defined Data Infrastructure (SDDI) pertaining to data protection for legacy along with software defined data centers (SDDC), data infrastructures in general along with related topics. In addition to excerpts, the posts also contain links to articles, tips, posts, videos, webinars, events and other companion material. Note that figure numbers in this series are those from the SDDI book and not in the order that they appear in the posts.

Posts in this data protection fundamental series include:

Part 1 – Data Infrastructure Data Protection Fundamentals
Part 2 – Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals
Part 3 – Data Protection Access Availability RAID Erasure Codes (EC) including LRC
Part 4 – Data Protection Recovery Points (Archive, Backup, Snapshots, Versions)
Part 5 – Point In Time Data Protection Granularity Points of Interest
Part 6 – Data Protection Security Logical Physical Software Defined
Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends
Part 8 – Data Protection Diaries Walking Data Protection Talk
Part 9 – Who’s Doing What (Toolbox Technology Tools)
Part 10 – Data Protection Resources Where to Learn More

Figure 1.5 Data Infrastructures and other IT Infrastructure Layers

Data Infrastructures

Data Infrastructures exist to support business, cloud and information technology (IT) among other applications that transform data into information or services. The fundamental role of data infrastructures is to provide a platform environment for applications and data that is resilient, flexible, scalable, agile, efficient as well as cost-effective.

Put another way, data infrastructures exist to protect, preserve, process, move, secure and serve data as well as their applications for information services delivery. Technologies that make up data infrastructures include hardware, software, or managed services, servers, storage, I/O and networking, along with people, processes, and policies, as well as various tools spanning legacy, software-defined virtual, containers and cloud. Read more about data infrastructures (it’s what’s inside data centers) here.

Various Needs Demand Drivers For Data Protection Fundamentals

Why The Need For Data Protection

Data Protection encompasses many different things, from accessibility, durability, resiliency, reliability, and serviceability (RAS) to security and data protection along with consistency. Availability includes basic, high availability (HA), business continuance (BC), business resiliency (BR), disaster recovery (DR), archiving, backup, logical and physical security, fault tolerance, isolation and containment spanning systems, applications, data, metadata, settings, and configurations.

From a data infrastructure perspective, availability of data services spans from local to remote, physical to logical and software-defined, virtual, container, and cloud, as well as mobile devices. Figure 9.2 shows various data infrastructure availability, accessibility, protection, and security points of interest. On the left side of Figure 9.2 are various data protection and security threat risks and scenarios that can impact availability, or result in a data loss event (DLE), data loss access (DLA), or disaster. The right side of Figure 9.2 shows various techniques, tools, technologies, and best practices to protect data infrastructures, applications, and data from threat risks.

Figure 9.2 Various threat vectors, issues, problems, and challenges that drive the need for data protection

A fundamental role of data infrastructures (and data centers) is to protect, preserve, secure and serve information when needed with consistency. This also means that the data infrastructure resources (servers, storage, I/O networks, hardware, software, external services), along with the applications (and data) they combine with and are defined to protect, are also accessible, durable and secure.

Data Protection topics include:

Maintaining availability, accessibility to information services, applications and data
Data include software, actual data, metadata, settings, certificates and telemetry
Ensuring data is durable, consistent, secure and recoverable to past points in time
Everything is not the same across different environments, applications and data
Aligning techniques and technologies to meet various service level objectives (SLO)
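
As a small worked example of "recoverable to past points in time" and aligning with a service level objective, the following Python sketch picks the newest recovery point at or before a given moment and checks whether the resulting gap is within a tolerated limit. It is a simplified illustration: the function names, the sample timestamps and the idea of expressing the SLO as a single maximum gap are assumptions for this example.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def latest_recovery_point(points, target):
    """Return the newest recovery point at or before target, or None."""
    ordered = sorted(points)
    idx = bisect_right(ordered, target)
    return ordered[idx - 1] if idx else None


def meets_slo(points, target, max_gap):
    """True if restoring to target would lose no more than max_gap of changes."""
    point = latest_recovery_point(points, target)
    return point is not None and target - point <= max_gap


# Usage: two nightly copies plus a midday snapshot (hypothetical times)
points = [
    datetime(2017, 11, 24, 23, 0),
    datetime(2017, 11, 25, 23, 0),
    datetime(2017, 11, 26, 12, 0),
]
incident = datetime(2017, 11, 26, 15, 30)

print("restore from:", latest_recovery_point(points, incident))
print("within 4 hours:", meets_slo(points, incident, timedelta(hours=4)))
print("within 1 hour:", meets_slo(points, incident, timedelta(hours=1)))
```

The same set of recovery points can satisfy one objective and miss another, which is why different applications and data may need different protection frequency.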

Data Protection Fundamental Tradecraft Skills Experience Knowledge

Tools, technologies, and trends are part of Data Protection; so too are the techniques of knowing (e.g. tradecraft) what to use when, where, why and how to protect against various threat risks (challenges, issues, problems).

Part of what is covered in this series of posts as well as in the Software Defined Data Infrastructure (SDDI) Essentials book is tradecraft skills, tips, experiences, insight into what to use, as well as how to use old and new things in new ways.

This means looking outside the technology box towards what it is that you need to protect and why, then knowing how to use the different skills, experiences, and techniques that are part of your tradecraft, combined with data protection toolbox tools. Read more about tradecraft here.

Where To Learn More

Continue reading additional posts in this series of Data Infrastructure Data Protection fundamentals and companion to Software Defined Data Infrastructure Essentials (CRC Press 2017) book, as well as the following links covering technology, trends, tools, techniques, tradecraft and tips.

Part 1 – Data Infrastructure Data Protection Fundamentals
Part 2 – Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals
Part 3 – Data Protection Fundamental Access Availability RAID Erasure Codes (EC) including LRC
Part 4 – Data Protection Recovery Points (Archive, Backup, Snapshots, Versions)
Part 5 – Point In Time Data Protection Granularity Points of Interest
Part 6 – Data Protection Security Logical Physical Software Defined
Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends
Part 8 – Data Protection Diaries Walking Data Protection Talk
Part 9 – Who’s Doing What (Toolbox Technology Tools)
Part 10 – Data Protection Resources Where to Learn More
Data Protection Diaries series
Data Infrastructure server storage I/O network Recommended Reading List Book Shelf
Software Defined Data Infrastructure Essentials (CRC 2017) Book
PCIe Server Storage I/O Network Fundamentals
If NVMe is the answer, what are the questions?
Fixing the Microsoft Windows 10 1709 post upgrade restart loop
Data Infrastructure server storage I/O network Recommended Reading
Introducing Windows Subsystem for Linux WSL Overview
IT transformation Serverless Life Beyond DevOps with New York Times CTO Nick Rockwell Podcast
HPE Announces AMD Powered Gen 10 ProLiant DL385 For Software Defined Workloads
NVM Non Volatile Memory Express NVMe Place
AWS Announces New S3 Cloud Storage Security Encryption Features

Additional learning experiences, along with common questions (and answers) as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What This All Means

Everything is not the same across environments, data centers, data infrastructures and applications.

Likewise, everything is not, and does not have to be, the same when it comes to Data Protection. Data protection fundamentals encompass many different hardware, software, and services (including cloud) technologies, tools, techniques, best practices, policies and tradecraft experience skills (e.g. knowing what to use when, where, why and how).

Since everything is not the same, various data protection approaches are needed to address various application performance availability capacity economic (PACE) needs, as well as SLOs and SLAs.

Get your copy of Software Defined Data Infrastructure Essentials here at Amazon.com, at CRC Press among other locations and learn more here. Meanwhile, continue reading with the next post in this series, Part 2 Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals.

Ok, nuff said, for now.

Dell Technologies Announces Class V VMware Tracking Stock exchange for stock or cash

Michael Dell and Silver Lake Continued Ownership

Where to learn more

What this all means

June 2018 Server StorageIO Data Infrastructure Update Newsletter

Volume 18, Issue 6 (June 2018)

What this all means and wrap-up

Solving Application Server Storage I/O Performance Bottlenecks Webinar

Where to learn more

What this all means

Announcing Windows Server Summit Virtual Online Event

Windows Server Summit Break Out Tracks

Where to learn more

What this all means

May 2018 Server StorageIO Data Infrastructure Update Newsletter

Volume 18, Issue 5 (May 2018)

May Buzzword, Buzz Topic and Trends

May NVMe Momentum Movement Activity

Software Defined Data Infrastructure Activity

What this all means and wrap-up

Have you heard about the new CLOUD Act data regulation?

CLOUD Act background and Stored Communications Act

What is CLOUD Act

Where to learn more

What this all means and wrap-up

Data Protection Recovery Life Post World Backup Day Pre GDPR

Expanding Focus Data Protection and GDPR

Expanding Focus Data Protection Recovery and other Things that start with R

Data Protection Tips, Reminders and Recommendations

Where to learn more

What this all means and wrap-up

March 2018 Server StorageIO Data Infrastructure Update Newsletter

Volume 18, Issue 3 (March 2018)

What this all means and wrap-up

Application Data Value Characteristics Everything Is Not The Same

Common Applications Characteristics

Performance and Activity (How Resources Get Used)

Where to learn more

What this all means and wrap-up

Application Data Availability 4 3 2 1 Data Protection

Availability (Accessibility, Durability, Consistency)

Capacity and Space (What Gets Consumed and Occupied)

Economics (People, Budgets, Energy and other Constraints)

Where to learn more

What this all means and wrap-up

Application Data Characteristics Types Everything Is Not The Same

Various Types of Data

Different data with various values over time

Data Value

Where to learn more

What this all means and wrap-up

Application Data Volume Velocity Variety Everything Not The Same

Volume of Data

Variety of Data

Velocity of Data

Where to learn more

What this all means and wrap-up

Application Data Access Life cycle Patterns Everything Is Not The Same (Part V)

Active (Hot), Static (Warm and WORM), or Dormant (Cold) Data and Lifecycles

Where to learn more

What this all means and wrap-up

Data Protection Fundamental Topics Tools Techniques Technologies Tips

Data Infrastructures

Why The Need For Data Protection

Data Protection Fundamental Tradecraft Skills Experience Knowledge

Where To Learn More

What This All Means