Security Archives

October 17, 2024March 12, 2025

RTO Context Matters

With RTO context matters similar to many things in and around Information Technology (IT) among other industries. Various three (or more) letter acronyms (TLAs) have different meanings based on their context. An example of a TLA is RTO which has different meanings. For instance, RTO can mean:

- Return To Office
- Recovery Time Objective
- Ready To Operate
- Return To Operations
- Among others…

From the data protection and cyber resilience context, RTO has traditionally been thought of as a Recovery Time Objective or the amount of time that something should be able to be restored, recovered, rebuilt, reset, or returned to service, aka being usable. Another way of looking at Recovery Time Objective is the goal or requirement that something is ready to operate, enabling an organization and its IT services apps, data, and information to return to operations.

Figure 1 Data Infrastructures and Recovery Time Objectives (RTO)

RTO Recovery Time Object Context

Where context is needed is not just simply what RTO is being discussed, e.g., recovery time objective; also, what is the scope of the recovery time objective? Is it all-inclusive for a specific component, layer, or focus point? A holistic RTO is when everything in the stack, vertical up and down all layers of hardware, software, services, and, if applicable, also horizontal across different systems, platforms, and locations, is usable. For example, when a user can access an app from various places, and everything is functioning, perhaps not at full or regular speed, it is functioning.

Figure 2 Various Threats and Data Infrastructure Layers

Recovery Time Objective Focus

On the other hand, component RTO refers to a specific focus area, point, or location in the stack (figure 2). For example, a lower-level server, network, storage device, physical or virtual machine, container, file system, database repository, or application is restored or returned to readiness and operation. The individual components may be restored to operating; however, what about the sum of all the parts that make up the holistic solution or service the user sees and expects to be in working condition?

Additional Resources Where to learn more

The following links are additional resources to learn more about Recovery Time Objectives (RTO) and related data infrastructures, tradecraft, and metrics that matter topics.

Various excerpts from Chapter 9 Software Defined Data Infrastructure book
Modernizing Data Protection (Blog Post)
Data Protection Diaries (Blog Post)
Azure Cloud Storage Options (Blog Post)
Availability and Accessibility (Article)

Additional learning experiences along with common questions (and answers), are found in my Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials (CRC Press) by Greg Schulz

What this all means

RTO context matters, not only for which RTO but also if it refers to a holistic compound aggregate scope or that of a component. While component RTOs are essential, so is the holistic focus of when things are usable.

Ok, nuff said.

Cheers Gs

Greg Schulz – Nine time Microsoft MVP Cloud and Data Center Management and Azure Storage, along with previous ten-time VMware vExpert. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2026 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of UnlimitedIO LLC.

April 6, 2018April 27, 2025

Have you heard about the new CLOUD Act data regulation?

The new CLOUD Act data regulation became law as part of the recent $1.3 Trillion (USD) omnibus U.S. government budget spending bill passed by Congress on March 23, 2018 and signed by President of the U.S. (POTUS) Donald Trump in March.

CLOUD Act is the acronym for Clarifying Lawful Overseas Use of Data, not to be confused with initiatives such as U.S. federal governments CLOUD First among others which are focused on using cloud, securing and complying (e.g. FedRAMP among others). In other words, the new CLOUD Act data regulation pertains to how data stored by cloud or other service providers can be accessed by law environment officials (LEO).

Supreme Court of the U.S. (SCOTUS) Image via https://www.supremecourt.gov/

CLOUD Act background and Stored Communications Act

After the signing into law of CLOUD Act, the US Department of Justice (DOJ) has asked the Supreme Court of the U.S. (SCOTUS) to dismiss the pending case against Microsoft (e.g., Azure Cloud). The case or question in front of SCOTUS pertained to whether LEO can search as well as seize information or data that is stored overseas or in foreign counties.

As a refresher, or if you had not heard, SCOTUS was asked to resolve if a service provider who is responding to a warrant based on probable cause under the 1986 era Stored Communications Act, is required to provide data in its custody, control or possession, regardless of if stored inside, or, outside the US.

Microsoft Azure Regions via Microsoft.com

This particular case in front of SCOTUS centered on whether Microsoft (a U.S. Technology firm) had to comply with a court order to produce emails (as part of an LEO drug investigation) even if those were stored outside of the US. In this particular situation, the emails were alleged to have been stored in a Microsoft Azure Cloud Dublin Ireland data center.

For its part, Microsoft senior attorney Hasan Ali said via FCW “This bill is a significant step forward in the larger global debate on what our privacy laws should look like, even if it does not go to the highest threshold". Here are some additional perspectives via Microsoft Brad Smith on his blog along with a video.

What is CLOUD Act

Clarifying Lawful Overseas Use of Data is the new CLOUD Act data regulation approved by Congress (House and Senate) details can be read here and here respectively with additional perspectives here.

The new CLOUD Act law allows for POTUS to enter into executive agreements with foreign governments about data on criminal suspects. Granted what is or is not a crime in a given country will likely open Pandora’s box of issues. For example, in the case of Microsoft, if an agreement between the U.S. and Ireland were in place, and, Ireland agreed to release the data, it could then be accessed.

Now, for some who might be hyperventilating after reading the last sentence, keep this in mind that if you are overseas, it is up to your government to protect your privacy. The foreign government must have an agreement in place with the U.S. and that a crime has or had been committed, a crime that both parties concur with.

Also, keep in mind that is also appeal processes for providers including that the customer is not a U.S. person and does not reside in the U.S. and the disclosure would put the provider at risk of violating foreign law. Also, keep in mind that various provisions must be met before a cloud or service provider has to hand over your data regardless of what country you reside, or where the data resides.

Where to learn more

Learn more about CLOUD Act, cloud, data protection, world backup day, recovery, restoration, GDPR along with related data infrastructure topics for cloud, legacy and other software defined environments via the following links:

AWS Cloud Application Data Protection Webinar
U.S. House and Senate versions of CLOUD Act data regulations
CLOUD (Clarifying Lawful Overseas Use of Data) Act data regulation became law
$1.3 Trillion (USD) omnibus U.S. government budget spending bill passed by Congress
US DOJ has asked SCOTUS to dismiss pending case against Microsoft
1986 era regulations and Stored Communications Act
Microsoft Azure Cloud regions
Data Protection Recovery Life Post World Backup Day Pre GDPR
Additional perspectives via Microsoft Brad Smith on his blog along with a video.
March 2018 Server StorageIO Data Infrastructure Update Newsletter
Application Data Value Characteristics Everything Is Not the Same (five-part mini-series)
Application Data Availability 4 3 2 1 Data Protection (part of the mini-series)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)
Veeam GDPR preparedness experiences Webinar walking the talk
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Data Infrastructure server storage I/O network Recommended Reading
Object Storage Center resources (www.objectstoragecenter.com)
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Is the new CLOUD Act data regulation unique to Microsoft Azure Cloud?

No, it also applies to Amazon Web Services (AWS), Google, IBM Softlayer Cloud, Facebook, LinkedIn, Twitter and the long list of other service providers.

What about GDPR?

Keep in mind that the new Global Data Protection Regulations (GDPR) go into effect May 25, 2018, that while based out of the European Union (EU), have global applicability across organizations of all size, scope, and type. Learn more about GDPR, Data Protection and its global impact here.

Thus, if you have not heard about the new CLOUD Act data regulation, now is the time to become aware of it.

Ok, nuff said, for now.

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

April 2, 2018November 26, 2023

Data Protection Recovery Life Post World Backup Day Pre GDPR

Data Protection Recovery Life Post World Backup Day Pre GDPR trends

It’s time for Data Protection Recovery Life Post World Backup Day Pre GDPR Start Date.

The annual March 31 world backup day focus has come and gone once again.

However, that does not mean data protection including backup as well as recovery along with security gets a 364-day vacation until March 31, 2019 (or the days leading up to it).

Granted, for some environments, public relations, editors, influencers and other industry folks backup day will take some time off while others jump on the ramp up to GDPR which goes into effect May 25, 2018.

Expanding Focus Data Protection and GDPR

As I mentioned in this post here, world backup day should be expanded to include increased focus not just on backup, also recovery as well as other forms of data protection. Likewise, May 25 2018 is not the deadline or finish line or the destination for GDPR (e.g. Global Data Protection Regulations), rather, it is the starting point for an evolving journey, one that has global impact as well as applicability. Recently I participated in a fireside chat discussion with Danny Allan of Veeam who shared his GDPR expertise as well as experiences, lessons learned, tips of Veeam as they started their journey, check it out here.

Expanding Focus Data Protection Recovery and other Things that start with R

As part of expanding the focus on Data Protection Recovery Life Post World Backup Day Pre GDPR, that also means looking at, discussing things that start with R (like Recovery). Some examples besides recovery include restoration, reassess, review, rethink protection, recovery point, RPO, RTO, reconstruction, resiliency, ransomware, RAID, repair, remediation, restart, resume, rollback, and regulations among others.

Data Protection Tips, Reminders and Recommendations

There are no blue participation ribbons for failed recovery. However, there can be pink slips.
Only you can prevent on-premises or cloud data loss. However, it is also a shared responsibility with vendors and service providers
You can’t go forward in the future when there is a disaster or loss of data if you can’t go back in time for recovery
GDPR appliances to organizations around the world of all size and across all sectors including nonprofit
Keep new school 4 3 2 1 data protection in mind while evolving from old school 3 2 1 backup rules

4 3 2 1 backup data protection rule

A Fundamental premise of data infrastructures is to enable applications and their data, protect, preserve, secure and serve
Remember to protect your applications, as well as data including metadata, settings configurations
Test your restores including can you use the data along with security settings
Don’t cause a disaster in the course of testing your data protection, backups or recovery
Expand (or refresh) your data protection and data infrastructure education tradecraft skills experiences

Where to learn more

Learn more about data protection, world backup day, recovery, restoration, GDPR along with related data infrastructure topics for cloud, legacy and other software defined environments via the following links:

AWS Cloud Application Data Protection Webinar
March 2018 Server StorageIO Data Infrastructure Update Newsletter
Application Data Value Characteristics Everything Is Not The Same (five-part mini-series)
Application Data Availability 4 3 2 1 Data Protection (part of the mini-series)
World Backup Day: Best Practices for a Hybrid Approach
Data Protection Diaries Fundamental Topics Tools Techniques Technologies Tips
World Backup Day 2018 Data Protection Readiness Reminder
Data Protection Fundamental Topics Tools Techniques Technologies Tips
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)
Veeam GDPR preparedness experiences Webinar walking the talk
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Have you heard about the new CLOUD Act data regulation?
Data Infrastructure server storage I/O network Recommended Reading
Object Storage Center resources (www.objectstoragecenter.com)
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data protection including business continuance (BC), business resiliency (BR), disaster recovery (DR), availability, accessibility, backup, snapshots, encryption, security, privacy among others is a 7 x 24 x 365 day a year focus. The focus of data protection also needs to evolve from an after the fact cost overhead to proactive, business enabler Meanwhile, welcome to Data Protection Recovery Post World Backup Day Pre GDPR Start Date.

Ok, nuff said, for now.

March 29, 2018November 26, 2023

AWS Cloud Application Data Protection Webinar

AWS Cloud Application Data Protection Webinar trends

AWS Cloud Application Data Protection Webinar
Date: Tuesday, April 24, 2018 at 11:00am PT / 2:00pm ET

Only YOU can prevent data loss for on-premises, Amazon Web Service (AWS) based cloud, and hybrid applications.

Join me in this free AWS Cloud Application Data Protection Webinar (registration required) sponsored by Veeam produced by Redmond Magazine as we explore issues, trends, tools, best practices and techniques for enabling data protection with AWS technologies.

Attend and learn about:

Application-aware point in time snapshot data protection
Protecting AWS EC2 and on-premises applications (and data)
Leveraging AWS for data protection and recovery
And much more

Where to learn more

Learn more about data protection, software defined data center (SDDC), software defined data infrastructures (SDDI), AWS, cloud and related topics via the following links:

World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Only you can prevent cloud data loss
Cloud conversations: confidence, certainty and confidentiality
Cloud conversations: AWS EBS, Glacier and S3 overview (Part I)
Cloud Conversations AWS Azure Service Maps via Microsoft
AWS S3 Storage Gateway Revisited (Part I)
Cloud Conversations: AWS S3 Cross Region Replication storage enhancements
Cloud conversations: AWS EBS, Glacier and S3 overview (Part II S3)
Microsoft Windows Server 2019 Insiders Preview
March 2018 Server StorageIO Data Infrastructure Update Newsletter
AWS Announces New S3 Cloud Storage Security Encryption Features
All You Need To Know about ROBO Data Protection Backup (free webinar with registration)
Data Protection Recovery Life Post World Backup Day Pre GDPR
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Various Data Infrastructure related events, webinars and other activities
Server StorageIO.tv (various videos and podcasts, fun and for work)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

You can not go forward if you can not go back to a particular point in time (e.g. recovery point objective or RPO). Likewise, if you can not go back to a given RPO, how can you go forward with your business as well as meet your recovery time objective (RTO)? Join us for the live conversation or replay by registering (free) here to learn how to enable AWS Cloud Application Data Protection Webinar, as well as using AWS S3 for on-site, on-premises data protection.

Ok, nuff said, for now.

March 28, 2018November 26, 2023

Microsoft Windows Server 2019 Insiders Preview

Application Data Value Characteristics Everything Is Not The Same

Microsoft Windows Server 2019 Insiders Preview has been announced. Windows Server 2019 in the past might have been named 2016 R2 also known as a Long-Term Servicing Channel (LTSC) release. Microsoft recommends LTSC Windows Server for workloads such as Microsoft SQL Server, Share Point and SDDC. The focus of Microsoft Windows Server 2019 Insiders Preview is around hybrid cloud, security, application development as well as deployment including containers, software defined data center (SDDC) and software defined data infrastructure, as well as converged along with hyper-converged infrasture (HCI) management.

Windows Server 2019 Preview Features

Features and enhancements in the Microsoft Windows Server 2019 Insiders Preview span HCI management, security, hybrid cloud among others.

Hybrid cloud – Extending active directory, file server synchronize, cloud backup, applications spanning on-premises and cloud, management).
Security – Protect, detect and respond including shielded VMs, attested guarded fabric of host guarded machines, Windows and Linux VM (shielded), VMConnect for Windows and Linux troubleshooting of Shielded VM and encrypted networks, Windows Defender Advanced Threat Protection (ATP) among other enhancements.
Application platform – Developer and deployment tools for Windows Server containers and Windows Subsystem on Linux (WSL). Note that Microsoft has also been reducing the size of the Server image while extending feature functionality. The smaller images take up less storage space, plus load faster. As part of continued serverless and container support (Windows and Linux along with Docker), there are options for deployment orchestration including Kubernetes (in beta). Other enhancements include extending previous support for Windows Subsystem for Linux (WSL).

Other enhancements part of Microsoft Windows Server 2019 Insiders Preview include cluster sets in support of software defined data center (SDDC). Cluster sets expand SDDC clusters of loosely coupled grouping of multiple failover clusters including compute, storage as well as hyper-converged configurations. Virtual machines have fluidity across member clusters within a cluster set and unified storage namespace. Existing failover cluster management experiences is preserved for member clusters, along with a new cluster set instance of the aggregate resources.

Management enhancements include S2D software defined storage performance history, project Honolulu support for storage updates, along with powershell cmdlet updates, as well as system center 2019. Learn more about project Honolulu hybrid management here and here.

Microsoft and Windows LTSC and SAC

As a refresher, Microsoft Windows (along with other software) is now being released on two paths including more frequent semi-annual channel (SAC), and less frequent LTSC releases. Some other things to keep in mind that SAC are focused around server core and nano server as container image while LTSC includes server with desktop experience as well as server core. For example, Windows Server 2016 released fall of 2016 is an LTSC, while the 1709 release was a SAC which had specific enhancements for container related environments.

There was some confusion fall of 2017 when 1709 was released as it was optimized for container and serverless environments and thus lacked storage spaces direct (S2D) leading some to speculate S2D was dead. S2D among other items that were not in the 1709 SAC are very much alive and enhanced in the LTSC preview for Windows Server 2019. Learn more about Microsoft LTSC and SAC here.

Test Driving Installing The Bits

One of the enhancements with LTSC preview candidate server 2019 is improved upgrades of existing environments. Granted not everybody will choose the upgrade in place keeping existing files however some may find the capability useful. I chose to give the upgrade keeping current files in place as an option to see how it worked. To do the upgrade I used a clean and up to date Windows Server 2016 data center edition with desktop. This test system is a VMware ESXi 6.5 guest running on flash SSD storage. Before the upgrade to Windows Server 2019, I made a VMware vSphere snapshot so I could quickly and easily restore the system to a good state should something not work.

To get the bits, go to Windows Insiders Preview Downloads (you will need to register)

Windows Server 2019 LTSC build 17623 is available in 18 languages in an ISO format and require a key.

The keys for the pre-release unlimited activations are:
Datacenter Edition 6XBNX-4JQGW-QX6QG-74P76-72V67
Standard Edition MFY9F-XBN2F-TYFMP-CCV49-RMYVH

First step is downloading the bits from the Windows insiders preview page including select language for the image to use.

Getting the windows server 2019 preview bits
Select the language for the image to download

windows server 2019 select language

Starting the download

Once you have the image download, apply it to your bare metal server or hypervisors guest. In this example, I copied the windows server 2019 image to a VMware ESXi server for a Windows Server 2016 guest machine to access via its virtual CD/DVD.

pre upgrade check windows server version
Verify the Windows Server version before upgrade

After download, access the image, in this case, I attached the image to the virtual machine CD, then accessed it and ran the setup application.

Microsoft Windows Server 2019 Insiders Preview download

Download updates now or later

Entering license key for pre-release windows server 2019

Microsoft Windows Server 2019 Insiders Preview datacenter desktop version

Selecting Windows Server Datacenter with Desktop

Microsoft Windows Server 2019 Insiders Preview license

Accepting Software License for pre-release version.

Next up is determining to do a new install (keep nothing), or an in-place upgrade. I wanted to see how smooth the in-place upgrade was so selected that option.

Microsoft Windows Server 2019 Insiders Preview inplace upgrade

What to keep, nothing, or existing files and data

Confirming your selections

Microsoft Windows Server 2019 Insiders Preview install start

Ready to start the installation process

Microsoft Windows Server 2019 Insiders Preview upgrade in progress
Installation underway of Windows Server 2019 preview

Once the installation is complete, verify that Windows Server 2019 is now installed.

Completed upgrade from Windows Server 2016 to Microsoft Windows Server 2019 Insiders Preview

The above shows verifying the system build using Powershell, as well as the message in the lower right corner of the display. Granted the above does not show the new functionality, however you should get an idea of how quickly a Windows Server 2019 preview can be deployed to explore and try out the new features.

Where to learn more

Learn more Microsoft Windows Server 2019 Insiders Preview, Windows Server Storage Spaces Direct (S2D), Azure and related software defined data center (SDDC), software defined data infrastructures (SDDI) topics via the following links:

Fixing the Microsoft Windows 10 1709 post upgrade restart loop
Introducing Windows Subsystem for Linux WSL Overview
Microsoft Azure September 2017 Software Defined Data Infrastructure Updates
November 2017 Server StorageIO Data Infrastructure Update Newsletter
October 2017 Server StorageIO Data Infrastructure Update Newsletter
Via Microsoft: Announcing Windows Server 2019 Insider Preview Build 17623
Via Microsoft: Getting Started Windows Server Insiders
Via Microsoft: Introducing Windows Server 2019 – now available in preview
Via Microsoft: Pricing and licensing for Windows Server 2016
Via Microsoft: Windows Server Community
Via Microsoft: Windows Server Insider Preview Downloads
Part 1 – Application Data Value Characteristics Everything Is Not The Same
Data Infrastructure server storage I/O network Recommended Reading
AWS Cloud Application Data Protection Webinar
March 2018 Server StorageIO Data Infrastructure Update Newsletter
Data Infrastructure Server Storage I/O related Tradecraft Overview
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Microsoft Windows Server 2019 Insiders Preview gives a glimpse of some of the new features that are part of the next evolution of Windows Server as part of supporting hybrid IT environments. In addition to the new features and functionality that convey not only support for hybrid cloud, also hybrid applications development, deployment, devops and workloads, Microsoft is showing flexibility in management, ease of use, scalability, along with security as well as scale out stability. If you have not looked at Windows Server for a while, or involved with serverless, containers, Kubernetes among other initiatives, now is a good time to check out Microsoft Windows Server 2019 Insiders Preview.

Ok, nuff said, for now.

March 13, 2018November 26, 2023

Application Data Value Characteristics Everything Is Not The Same (Part I)

Application Data Value Characteristics Everything Is Not The Same

This is part one of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017). available at Amazon.com and other global venues. In this post, we start things off by looking at general application server storage I/O characteristics that have an impact on data value as well as access.

Everything is not the same across different organizations including Information Technology (IT) data centers, data infrastructures along with the applications as well as data they support. For example, there is so-called big data that can be many small files, objects, blobs or data and bit streams representing telemetry, click stream analytics, logs among other information.

Keep in mind that applications impact how data is accessed, used, processed, moved and stored. What this means is that a focus on data value, access patterns, along with other related topics need to also consider application performance, availability, capacity, economic (PACE) attributes.

If everything is not the same, why is so much data along with many applications treated the same from a PACE perspective?

Data Infrastructure resources including servers, storage, networks might be cheap or inexpensive, however, there is a cost to managing them along with data.

Managing includes data protection (backup, restore, BC, DR, HA, security) along with other activities. Likewise, there is a cost to the software along with cloud services among others. By understanding how applications use and interact with data, smarter, more informed data management decisions can be made.

IT Applications and Data Infrastructure Layers

Keep in mind that everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them. Also keep in mind that programs (e.g. applications) = algorithms (code) + data structures (how data defined and organized, structured or unstructured).

There are traditional applications, along with those tied to Internet of Things (IoT), Artificial Intelligence (AI) and Machine Learning (ML), Big Data and other analytics including real-time click stream, media and entertainment, security and surveillance, log and telemetry processing among many others.

What this means is that there are many different application with various character attributes along with resource (server compute, I/O network and memory, storage requirements) along with service requirements.

Common Applications Characteristics

Different applications will have various attributes, in general, as well as how they are used, for example, database transaction activity vs. reporting or analytics, logs and journals vs. redo logs, indices, tables, indices, import/export, scratch and temp space. Performance, availability, capacity, and economics (PACE) describes the applications and data characters and needs shown in the following figure.

Application PACE attributes (via Software Defined Data Infrastructure Essentials)

All applications have PACE attributes, however:

PACE attributes vary by application and usage
Some applications and their data are more active than others
PACE characteristics may vary within different parts of an application

Think of applications along with associated data PACE as its personality or how it behaves, what it does, how it does it, and when, along with value, benefit, or cost as well as quality-of-service (QoS) attributes.

Understanding applications in different environments, including data values and associated PACE attributes, is essential for making informed server, storage, I/O decisions and data infrastructure decisions. Data infrastructures decisions range from configuration to acquisitions or upgrades, when, where, why, and how to protect, and how to optimize performance including capacity planning, reporting, and troubleshooting, not to mention addressing budget concerns.

Primary PACE attributes for active and inactive applications and data are:

P – Performance and activity (how things get used)
A – Availability and durability (resiliency and data protection)
C – Capacity and space (what things use or occupy)
E – Economics and Energy (people, budgets, and other barriers)

Some applications need more performance (server computer, or storage and network I/O), while others need space capacity (storage, memory, network, or I/O connectivity). Likewise, some applications have different availability needs (data protection, durability, security, resiliency, backup, business continuity, disaster recovery) that determine the tools, technologies, and techniques to use.

Budgets are also nearly always a concern, which for some applications means enabling more performance per cost while others are focused on maximizing space capacity and protection level per cost. PACE attributes also define or influence policies for QoS (performance, availability, capacity), as well as thresholds, limits, quotas, retention, and disposition, among others.

Performance and Activity (How Resources Get Used)

Some applications or components that comprise a larger solution will have more performance demands than others. Likewise, the performance characteristics of applications along with their associated data will also vary. Performance applies to the server, storage, and I/O networking hardware along with associated software and applications.

For servers, performance is focused on how much CPU or processor time is used, along with memory and I/O operations. I/O operations to create, read, update, or delete (CRUD) data include activity rate (frequency or data velocity) of I/O operations (IOPS). Other considerations include the volume or amount of data being moved (bandwidth, throughput, transfer), response time or latency, along with queue depths.

Activity is the amount of work to do or being done in a given amount of time (seconds, minutes, hours, days, weeks), which can be transactions, rates, IOPs. Additional performance considerations include latency, bandwidth, throughput, response time, queues, reads or writes, gets or puts, updates, lists, directories, searches, pages views, files opened, videos viewed, or downloads.

Server, storage, and I/O network performance include:

Processor CPU usage time and queues (user and system overhead)
Memory usage effectiveness including page and swap
I/O activity including between servers and storage
Errors, retransmission, retries, and rebuilds

the following figure shows a generic performance example of data being accessed (mixed reads, writes, random, sequential, big, small, low and high-latency) on a local and a remote basis. The example shows how for a given time interval (see lower right), applications are accessing and working with data via different data streams in the larger image left center. Also shown are queues and I/O handling along with end-to-end (E2E) response time.

Server I/O performance fundamentals (via Software Defined Data Infrastructure Essentials)

Click here to view a larger version of the above figure.

Also shown on the left in the above figure is an example of E2E response time from the application through the various data infrastructure layers, as well as, lower center, the response time from the server to the memory or storage devices.

Various queues are shown in the middle of the above figure which are indicators of how much work is occurring, if the processing is keeping up with the work or causing backlogs. Context is needed for queues, as they exist in the server, I/O networking devices, and software drivers, as well as in storage among other locations.

Some basic server, storage, I/O metrics that matter include:

Queue depth of I/Os waiting to be processed and concurrency
CPU and memory usage to process I/Os
I/O size, or how much data can be moved in a given operation
I/O activity rate or IOPs = amount of data moved/I/O size per unit of time
Bandwidth = data moved per unit of time = I/O size × I/O rate
Latency usually increases with larger I/O sizes, decreases with smaller requests
I/O rates usually increase with smaller I/O sizes and vice versa
Bandwidth increases with larger I/O sizes and vice versa
Sequential stream access data may have better performance than some random access data
Not all data is conducive to being sequential stream, or random
Lower response time is better, higher activity rates and bandwidth are better

Queues with high latency and small I/O size or small I/O rates could indicate a performance bottleneck. Queues with low latency and high I/O rates with good bandwidth or data being moved could be a good thing. An important note is to look at several metrics, not just IOPs or activity, or bandwidth, queues, or response time. Also, keep in mind that metrics that matter for your environment may be different from those for somebody else.

Something to keep in perspective is that there can be a large amount of data with low performance, or a small amount of data with high-performance, not to mention many other variations. The important concept is that as space capacity scales, that does not mean performance also improves or vice versa, after all, everything is not the same.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Ok, nuff said, for now.

March 13, 2018November 26, 2023

Application Data Availability 4 3 2 1 Data Protection

4 3 2 1 data protection Application Data Availability Everything Is Not The Same

Application Data Availability 4 3 2 1 Data Protection

This is part two of a five-part mini-series looking at Application Data Value Characteristics everything is not the same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017). available at Amazon.com and other global venues. In this post, we continue looking at application performance, availability, capacity, economic (PACE) attributes that have an impact on data value as well as availability.

Availability (Accessibility, Durability, Consistency)

Just as there are many different aspects and focus areas for performance, there are also several facets to availability. Note that applications performance requires availability and availability relies on some level of performance.

Availability is a broad and encompassing area that includes data protection to protect, preserve, and serve (backup/restore, archive, BC, BR, DR, HA) data and applications. There are logical and physical aspects of availability including data protection as well as security including key management (manage your keys or authentication and certificates) and permissions, among other things.

Availability = accessibility (can you get to your application and data) + durability (is the data intact and consistent). This includes basic Reliability, Availability, Serviceability (RAS), as well as high availability, accessibility, and durability. “Durable” has multiple meanings, so context is important. Durable means how data infrastructure resources hold up to, survive, and tolerate wear and tear from use (i.e., endurance), for example, Flash SSD or mechanical devices such as Hard Disk Drives (HDDs). Another context for durable refers to data, meaning how many copies in various places.

Server, storage, and I/O network availability topics include:

Resiliency and self-healing to tolerate failure or disruption
Hardware, software, and services configured for resiliency
Accessibility to reach or be reached for handling work
Durability and consistency of data to be available for access
Protection of data, applications, and assets including security

Additional server I/O and data infrastructure along with storage topics include:

Backup/restore, replication, snapshots, sync, and copies
Basic Reliability, Availability, Serviceability, HA, fail over, BC, BR, and DR
Alternative paths, redundant components, and associated software
Applications that are fault-tolerant, resilient, and self-healing
Non disruptive upgrades, code (application or software) loads, and activation
Immediate data consistency and integrity vs. eventual consistency
Virus, malware, and other data corruption or loss prevention

From a data protection standpoint, the fundamental rule or guideline is 4 3 2 1, which means having at least four copies consisting of at least three versions (different points in time), at least two of which are on different systems or storage devices and at least one of those is off-site (on-line, off-line, cloud, or other). There are many variations of the 4 3 2 1 rule shown in the following figure along with approaches on how to manage technology to use. We will go into deeper this subject in later chapters. For now, remember the following.

4 3 2 1 data protection (via Software Defined Data Infrastructure Essentials)

4    At least four copies of data (or more), Enables durability in case a copy goes bad, deleted, corrupted, failed device, or site.
3    The number (or more) versions of the data to retain, Enables various recovery points in time to restore, resume, restart from.
2    Data located on two or more systems (devices or media/mediums), Enables protection against device, system, server, file system, or other fault/failure.

1 With at least one of those copies being off-premise and not live (isolated from active primary copy), Enables resiliency across sites, as well as space, time, distance gap for protection.

Capacity and Space (What Gets Consumed and Occupied)

In addition to being available and accessible in a timely manner (performance), data (and applications) occupy space. That space is memory in servers, as well as using available consumable processor CPU time along with I/O (performance) including over networks.

Data and applications also consume storage space where they are stored. In addition to basic data space, there is also space consumed for metadata as well as protection copies (and overhead), application settings, logs, and other items. Another aspect of capacity includes network IP ports and addresses, software licenses, server, storage, and network bandwidth or service time.

Server, storage, and I/O network capacity topics include:

Consumable time-expiring resources (processor time, I/O, network bandwidth)
Network IP and other addresses
Physical resources of servers, storage, and I/O networking devices
Software licenses based on consumption or number of users
Primary and protection copies of data and applications
Active and standby data infrastructure resources and sites
Data footprint reduction (DFR) tools and techniques for space optimization
Policies, quotas, thresholds, limits, and capacity QoS
Application and database optimization

DFR includes various techniques, technologies, and tools to reduce the impact or overhead of protecting, preserving, and serving more data for longer periods of time. There are many different approaches to implementing a DFR strategy, since there are various applications and data.

Common DFR techniques and technologies include archiving, backup modernization, copy data management (CDM), clean up, compress, and consolidate, data management, deletion and dedupe, storage tiering, RAID (including parity-based, erasure codes , local reconstruction codes [LRC] , and Reed-Solomon , Ceph Shingled Erasure Code (SHEC ), among others), along with protection configurations along with thin-provisioning, among others.

DFR can be implemented in various complementary locations from row-level compression in database or email to normalized databases, to file systems, operating systems, appliances, and storage systems using various techniques.

Also, keep in mind that not all data is the same; some is sparse, some is dense, some can be compressed or deduped while others cannot. Likewise, some data may not be compressible or dedupable. However, identical copies can be identified with links created to a common copy.

Economics (People, Budgets, Energy and other Constraints)

If one thing in life and technology that is constant is change, then the other constant is concern about economics or costs. There is a cost to enable and maintain a data infrastructure on premise or in the cloud, which exists to protect, preserve, and serve data and information applications.

However, there should also be a benefit to having the data infrastructure to house data and support applications that provide information to users of the services. A common economic focus is what something costs, either as up-front capital expenditure (CapEx) or as an operating expenditure (OpEx) expense, along with recurring fees.

In general, economic considerations include:

Budgets (CapEx and OpEx), both up front and in recurring fees
Whether you buy, lease, rent, subscribe, or use free and open sources
People time needed to integrate and support even free open-source software
Costs including hardware, software, services, power, cooling, facilities, tools
People time includes base salary, benefits, training and education

Where to learn more

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Keep in mind that with Application Data Value Characteristics Everything Is Not The Same across various organizations, data centers, data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. All applications have some element of performance, availability, capacity, economic (PACE) needs as well as resource demands. There is often a focus around data storage about storage efficiency and utilization which is where data footprint reduction (DFR) techniques, tools, trends and as well as technologies address capacity requirements. However with data storage there is also an expanding focus around storage effectiveness also known as productivity tied to performance, along with availability including 4 3 2 1 data protection. Continue reading the next post (Part III Application Data Characteristics Types Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018November 26, 2023

Application Data Characteristics Types Everything Is Not The Same

This is part three of a five-part mini-series looking at Application Data Value Characteristics everything is not the same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017). available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on different types of data. There is more to data than simply being big data, fast data, big fast or unstructured, structured or semistructured, some of which has been touched on in this series, with more to follow. Note that there is also data in terms of the programs, applications, code, rules, policies as well as configuration settings, metadata along with other items stored.

Various Types of Data

Data types along with characteristics include big data, little data, fast data, and old as well as new data with a different value, life-cycle, volume and velocity. There are data in files and objects that are big representing images, figures, text, binary, structured or unstructured that are software defined by the applications that create, modify and use them.

There are many different types of data and applications to meet various business, organization, or functional needs. Keep in mind that applications are based on programs which consist of algorithms and data structures that define the data, how to use it, as well as how and when to store it. Those data structures define data that will get transformed into information by programs while also being stored in memory and on data stored in various formats.

Just as various applications have different algorithms, they also have different types of data. Even though everything is not the same in all environments, or even how the same applications get used across various organizations, there are some similarities. Even though there are different types of applications and data, there are also some similarities and general characteristics. Keep in mind that information is the result of programs (applications and their algorithms) that process data into something useful or of value.

Data typically has a basic life cycle of:

Creation and some activity, including being protected
Dormant, followed by either continued activity or going inactive
Disposition (delete or remove)

In general, data can be

Temporary, ephemeral or transient
Dynamic or changing (“hot data”)
Active static on-line, near-line, or off-line (“warm-data”)
In-active static on-line or off-line (“cold data”)

Data is organized

Structured
Semi-structured
Unstructured

General data characteristics include:

Value = From no value to unknown to some or high value
Volume = Amount of data, files, objects of a given size
Variety = Various types of data (small, big, fast, structured, unstructured)
Velocity = Data streams, flows, rates, load, process, access, active or static

The following figure shows how different data has various values over time. Data that has no value today or in the future can be deleted, while data with unknown value can be retained.

Different data with various values over time

Application Data Value across sddc
Data Value Known, Unknown and No Value

General characteristics include the value of the data which in turn determines its performance, availability, capacity, and economic considerations. Also, data can be ephemeral (temporary) or kept for longer periods of time on persistent, non-volatile storage (you do not lose the data when power is turned off). Examples of temporary scratch include work and scratch areas such as where data gets imported into, or exported out of, an application or database.

Data can also be little, big, or big and fast, terms which describe in part the size as well as volume along with the speed or velocity of being created, accessed, and processed. The importance of understanding characteristics of data and how their associated applications use them is to enable effective decision-making about performance, availability, capacity, and economics of data infrastructure resources.

Data Value

There is more to data storage than how much space capacity per cost.

All data has one of three basic values:

No value = ephemeral/temp/scratch = Why keep it?
Some value = current or emerging future value, which can be low or high = Keep
Unknown value = protect until value is unlocked, or no remaining value

In addition to the above basic three, data with some value can also be further subdivided into little value, some value, or high value. Of course, you can keep subdividing into as many more or different categories as needed, after all, everything is not always the same across environments.

Besides data having some value, that value can also change by increasing or decreasing in value over time or even going from unknown to a known value, known to unknown, or to no value. Data with no value can be discarded, if in doubt, make and keep a copy of that data somewhere safe until its value (or lack of value) is fully known and understood.

The importance of understanding the value of data is to enable effective decision-making on where and how to protect, preserve, and cost-effectively store the data. Note that cost-effective does not necessarily mean the cheapest or lowest-cost approach, rather it means the way that aligns with the value and importance of the data at a given point in time.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data has different value at various times, and that value is also evolving. Everything Is Not The Same across various organizations, data centers, data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part IV Application Data Volume Velocity Variety Everything Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018October 18, 2024

Application Data Volume Velocity Variety Everything Is Not The Same

Application Data Volume Velocity Variety Everything Not The Same

This is part four of a five-part mini-series looking at Application Data Value Characteristics everything is not the same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017). available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on data volume velocity and variety, after all, everything is not the same, not to mention many different aspects of big data as well as little data.

Volume of Data

More data is growing at a faster rate every day, and that data is being retained for longer periods. Some data being retained has known value, while a growing amount of data has an unknown value. Data is generated or created from many sources, including mobile devices, social networks, web-connected systems or machines, and sensors including IoT and IoD. Besides where data is created from, there are also many consumers of data (applications) that range from legacy to mobile, cloud, IoT among others.

Unknown-value data may eventually have value in the future when somebody realizes that he can do something with it, or a technology tool or application becomes available to transform the data with unknown value into valuable information.

Some data gets retained in its native or raw form, while other data get processed by application program algorithms into summary data, or is curated and aggregated with other data to be transformed into new useful data. The figure below shows, from left to right and front to back, more data being created, and that data also getting larger over time. For example, on the left are two data items, objects, files, or blocks representing some information.

In the center of the following figure are more columns and rows of data, with each of those data items also becoming larger. Moving farther to the right, there are yet more data items stacked up higher, as well as across and farther back, with those items also being larger. The following figure can represent blocks of storage, files in a file system, rows, and columns in a database or key-value repository, or objects in a cloud or object storage system.

Application Data Value sddc
Increasing data velocity and volume, more data and data getting larger

In addition to more data being created, some of that data is relatively small in terms of the records or data structure entities being stored. However, there can be a large quantity of those smaller data items. In addition to the amount of data, as well as the size of the data, protection or overhead copies of data are also kept.

Another dimension is that data is also getting larger where the data structures describing a piece of data for an application have increased in size. For example, a still photograph was taken with a digital camera, cell phone, or another mobile handheld device, drone, or other IoT device, increases in size with each new generation of cameras as there are more megapixels.

Variety of Data

In addition to having value and volume, there are also different varieties of data, including ephemeral (temporary), persistent, primary, metadata, structured, semi-structured, unstructured, little, and big data. Keep in mind that programs, applications, tools, and utilities get stored as data, while they also use, create, access, and manage data.

There is also primary data and metadata, or data about data, as well as system data that is also sometimes referred to as metadata. Here is where context comes into play as part of tradecraft, as there can be metadata describing data being used by programs, as well as metadata about systems, applications, file systems, databases, and storage systems, among other things, including little and big data.

Context also matters regarding big data, as there are applications such as statistical analysis software and Hadoop, among others, for processing (analyzing) large amounts of data. The data being processed may not be big regarding the records or data entity items, but there may be a large volume. In addition to big data analytics, data, and applications, there is also data that is very big (as well as large volumes or collections of data sets).

For example, video and audio, among others, may also be referred to as big fast data, or large data. A challenge with larger data items is the complexity of moving over the distance promptly, as well as processing requiring new approaches, algorithms, data structures, and storage management techniques.

Likewise, the challenges with large volumes of smaller data are similar in that data needs to be moved, protected, preserved, and served cost-effectively for long periods of time. Both large and small data are stored (in memory or storage) in various types of data repositories.

In general, data in repositories is accessed locally, remotely, or via a cloud using:

Object and blobs stream, queue, and Application Programming Interface (API)
File-based using local or networked file systems
Block-based access of disk partitions, LUNs (logical unit numbers), or volumes

The following figure shows varieties of application data value including (left) photos or images, audio, videos, and various log, event, and telemetry data, as well as (right) sparse and dense data.

Application Data Value bits bytes blocks blobs bitstreams sddc
Varieties of data (bits, bytes, blocks, blobs, and bitstreams)

Velocity of Data

Data, in addition to having value (known, unknown, or none), volume (size and quantity), and variety (structured, unstructured, semi structured, primary, metadata, small, big), also has velocity. Velocity refers to how fast (or slowly) data is accessed, including being stored, retrieved, updated, scanned, or if it is active (updated, or fixed static) or dormant and inactive. In addition to data access and life cycle, velocity also refers to how data is used, such as random or sequential or some combination. Think of data velocity as how data, or streams of data, flow in various ways.

Velocity also describes how data is used and accessed, including:

Active (hot), static (warm and WORM), or dormant (cold)
Random or sequential, read or write-accessed
Real-time (online, synchronous) or time-delayed

Why this matters is that by understanding and knowing how applications use data, or how data is accessed via applications, you can make informed decisions. Also, having insight enables how to design, configure, and manage servers, storage, and I/O resources (hardware, software, services) to meet various needs. Understanding Application Data Value including the velocity of the data both for when it is created as well as when used is important for aligning the applicable performance techniques and technologies.

Where to learn more

Learn more about Application Data Value, application characteristics, performance, availability, capacity, economic (PACE) along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

- Part 1 – Application Data Value Characteristics Everything Is Not The Same
- Part 2 – 4 3 2 1 Data Protection Application Data Availability
- Part 3 – Application Data Characteristics Types Everything Is Not The Same
- Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
- Part 5 – Application Data Access life cycle Patterns Everything Not The Same
- Software Defined, Cloud, Object and Blob Storage
- Data Infrastructure server storage I/O network Recommended Reading
- World Backup Day 2018 Data Protection Readiness Reminder
- Data Infrastructure Server Storage I/O related Tradecraft Overview
- Data Infrastructure Overview, Its What’s Inside of Data Centers
- 4 3 2 1 and 3 2 1 data protection best practices
- Garbage data in, garbage information out, big data or big garbage?
- GDPR (General Data Protection Regulation) Resources Are You Ready?
- Which Enterprise HDD to use for a Content Server Platform
- The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
- The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
- Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data has different value, size, as well as velocity as part of its characteristic including how used by various applications. Keep in mind that with Application Data Value Characteristics Everything Is Not The Same across various organizations, data centers, data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part V Application Data Access life cycle Patterns Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018November 26, 2023

Application Data Access Lifecycle Patterns Everything Is Not The Same

Application Data Access Life cycle Patterns Everything Is Not The Same(Part V)

Application Data Access Life cycle Patterns Everything Is Not The Same

This is part five of a five-part mini-series looking at Application Data Value Characteristics everything is not the same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017). available at Amazon.com and other global venues. In this post, we look at various application and data lifecycle patterns as well as wrap up this series.

Active (Hot), Static (Warm and WORM), or Dormant (Cold) Data and Lifecycles

When it comes to Application Data Value, a common question I hear is why not keep all data?

If the data has value, and you have a large enough budget, why not? On the other hand, most organizations have a budget and other constraints that determine how much and what data to retain.

Another common question I get asked (or told) it isn’t the objective to keep less data to cut costs?

If the data has no value, then get rid of it. On the other hand, if data has value or unknown value, then find ways to remove the cost of keeping more data for longer periods of time so its value can be realized.

In general, the data life cycle (called by some cradle to grave, birth or creation to disposition) is created, save and store, perhaps update and read with changing access patterns over time, along with value. During that time, the data (which includes applications and their settings) will be protected with copies or some other technique, and eventually disposed of.

Between the time when data is created and when it is disposed of, there are many variations of what gets done and needs to be done. Considering static data for a moment, some applications and their data, or data and their applications, create data which is for a short period, then goes dormant, then is active again briefly before going cold (see the left side of the following figure). This is a classic application, data, and information life-cycle model (ILM), and tiering or data movement and migration that still applies for some scenarios.

Application Data Value
Changing data access patterns for different applications

However, a newer scenario over the past several years that continues to increase is shown on the right side of the above figure. In this scenario, data is initially active for updates, then goes cold or WORM (Write Once/Read Many); however, it warms back up as a static reference, on the web, as big data, and for other uses where it is used to create new data and information.

Data, in addition to its other attributes already mentioned, can be active (hot), residing in a memory cache, buffers inside a server, or on a fast storage appliance or caching appliance. Hot data means that it is actively being used for reads or writes (this is what the term Heat map pertains to in the context of the server, storage data, and applications. The heat map shows where the hot or active data is along with its other characteristics.

Context is important here, as there are also IT facilities heat maps, which refer to physical facilities including what servers are consuming power and generating heat. Note that some current and emerging data center infrastructure management (DCIM) tools can correlate the physical facilities power, cooling, and heat to actual work being done from an applications perspective. This correlated or converged management view enables more granular analysis and effective decision-making on how to best utilize data infrastructure resources.

In addition to being hot or active, data can be warm (not as heavily accessed) or cold (rarely if ever accessed), as well as online, near-line, or off-line. As their names imply, warm data may occasionally be used, either updated and written, or static and just being read. Some data also gets protected as WORM data using hardware or software technologies. WORM (immutable) data, not to be confused with warm data, is fixed or immutable (cannot be changed).

When looking at data (or storage), it is important to see when the data was created as well as when it was modified. However, you should avoid the mistake of looking only at when it was created or modified: Instead, also look to see when it was the last read, as well as how often it is read. You might find that some data has not been updated for several years, but it is still accessed several times an hour or minute. Also, keep in mind that the metadata about the actual data may be being updated, even while the data itself is static.

Also, look at your applications characteristics as well as how data gets used, to see if it is conducive to caching or automated tiering based on activity, events, or time. For example, there is a large amount of data for an energy or oil exploration project that normally sits on slower lower-cost storage, but that now and then some analysis needs to run on.

Using data and storage management tools, given notice or based on activity, which large or big data could be promoted to faster storage, or applications migrated to be closer to the data to speed up processing. Another example is weekly, monthly, quarterly, or year-end processing of financial, accounting, payroll, inventory, or enterprise resource planning (ERP) schedules. Knowing how and when the applications use the data, which is also understanding the data, automated tools, and policies, can be used to tier or cache data to speed up processing and thereby boost productivity.

All applications have performance, availability, capacity, economic (PACE) attributes, however:

PACE attributes vary by Application Data Value and usage
Some applications and their data are more active than others
PACE characteristics may vary within different parts of an application
PACE application and data characteristics along with value change over time

Read more about Application Data Value, PACE and application characteristics in Software Defined Data Infrastructure Essentials (CRC Press 2017).

Where to learn more

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Lifecycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Keep in mind that Application Data Value everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them.

Also keep in mind that there is more data being created, the size of those data items, files, objects, entities, records are also increasing, as well as the speed at which they get created and accessed. The challenge is not just that there is more data, or data is bigger, or accessed faster, it’s all of those along with changing value as well as diverse applications to keep in perspective. With new Global Data Protection Regulations (GDPR) going into effect May 25, 2018, now is a good time to assess and gain insight into what data you have, its value, retention as well as disposition policies.

Remember, there are different data types, value, life-cycle, volume and velocity that change over time, and with Application Data Value Everything Is Not The Same, so why treat and manage everything the same?

Ok, nuff said, for now.

January 10, 2018November 26, 2023

How to Achieve Flexible Data Protection Availability with All Flash Storage Solutions

Achieve Flexible Data Protection Availability with All Flash Solutions

server storage I/O data infrastructure trends

Updated 1/21/2018

How to Achieve Flexible flash data protection and Availability with All-Flash Storage Solutions

Interactive webinar discussion (not death by power point or Ui Gui product demo ;) pertaining flash data protection )
Tuesday January 30 2018 11AM PT / 2PM ET
Via Redmond Magazine (Free with registration)

Everything is not the same across different organizations, environments, application workloads and the data infrastructures that support them. Fast application and workloads need fast protection, restoration, and resumption as well as fast flash storage. This applies across legacy, software-defined, virtual, container, cloud, hybrid, converged and HCI among other environments.

Join me along with representatives from Pure Storage along with Veeam for this interactive discussion as we explore how to boost the performance, availability, capacity, and economics (PACE) of your applications along with the data infrastructures that support them.

How all-flash storage enables faster protection and restoration of fast applications
Why data protection and availability should not be an afterthought
Ways to leverage your data protection storage to drive business change
How to simplify and reduce complexity to boost productivity while lowering costs
Why workload aggregation consolidation should not cause aggravation

Where to learn more

Learn more about data protection, SSD, flash, data infrastructure and related topics via the following links:

Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its Whats Inside of Data Cetners
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
Server StorageIO.tv (various videos and podcasts, fun and for work)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Fast applications need fast and resilient data infrastructures that include server, storage, I/O networking along with data protection. Likewise performance depends on availability along with durability, likewise, availability and accessibility depend on performance, they go hand in hand. Join me and others from Pure Storage as well as Veeam for this conversational discussion about How to Achieve Flexible Data Protection and Availability with All-Flash Storage Solutions.

Ok, nuff said, for now.

November 26, 2017November 26, 2023

Data Protection Diaries Reliability, Availability, Serviceability RAS Fundamentals

Reliability, Availability, Serviceability RAS Fundamentals

Companion to Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Fundamental Server Storage I/O Tradecraft ( CRC Press 2017)

server storage I/O data infrastructure trends

By Greg Schulz – www.storageioblog.com November 26, 2017

This is Part 2 of a multi-part series on Data Protection fundamental tools topics techniques terms technologies trends tradecraft tips as a follow-up to my Data Protection Diaries series, as well as a companion to my new book Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Server Storage I/O Fundamental tradecraft (CRC Press 2017).

Click here to view the previous post Part 1 Data Infrastructure Data Protection Fundamentals, and click here to view the next post Part 3 Data Protection Access Availability RAID Erasure Codes (EC) including LRC.

Post in the series includes excerpts from Software Defined Data Infrastructure (SDDI) pertaining to data protection for legacy along with software defined data centers ( SDDC), data infrastructures in general along with related topics. In addition to excerpts, the posts also contain links to articles, tips, posts, videos, webinars, events and other companion material. Note that figure numbers in this series are those from the SDDI book and not in the order that they appear in the posts.

In this post the focus is around Data Protection availability from Chapter 9 which includes access, durability, RAS, RAID and Erasure Codes (including LRC), mirroring and replication along with related topics.

Figure 1.5 Data Infrastructures and other IT Infrastructure Layers

Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals

Reliability, Availability Serviceability (RAS) and other access availability along with Data Protection topics are covered in chapter 9. A resilient data infrastructure (software-defined, SDDC and legacy) protects, preserves, secures and serves information involving various layers of technology. These technologies enable various layers ( altitudes) of functionality, from devices up to and through the various applications themselves.

SDDI SDDC Data Protection Big Picture
Figure 9.2 Various threat issues and challenges that drive the need for data protection

Some applications need a faster rebuild, while others need sustained performance (bandwidth, latency, IOPs, or transactions) with the slower rebuild; some need lower cost at the expense of performance; others are ok with more space if other objectives are meet. The result is that since everything is different yet there are similarities, there is also the need to tune how data Infrastructure protects, preserves, secures, and serves applications and data.

General reliability, availability, serviceability, and data protection functionality includes:

Manually or automatically via policies, start, stop, pause, resume protection
Adjust priorities of protection tasks, including speed, for faster or slower protection
Fast-reacting to changes, disruptions or failures, or slower cautious approaches
Workload and application load balancing (performance, availability, and capacity)

RAS can be optimized for:

Reduced redundancy for lower overall costs vs. resiliency
Basic or standard availability (leverage component plus)
High availability (use better components, multiple systems, multiple sites)
Fault-tolerant with no single points of failure (SPOF)
Faster restart, restore, rebuild, or repair with higher overhead costs
Lower overhead costs (space and performance) with lower resiliency
Lower impact to applications during rebuild vs. faster repair
Maintenance and planned outages or for continues operations

Common availability Data Protection related terms, technologies, techniques, trends and topics pertaining to data protection from availability and access to durability and consistency to point in time protection and security are shown below.

Data Protection Gaps and Air Gap

There are Good Data Protection Gaps that provide recovery points to a past time enabling recoverability in the future to move forward. Another good data protection gap is an Air Gap that isolates protection copies off-site or off-line so that they can not be tampered with enabling recovery from ransomware and other software defined threats. There are Bad data protection gaps including gaps in coverage where data is not protected or items are missing. Then there are Ugly data protecting gaps which include Bad gaps that result in what you think is protected are not and finding that your copies are bad when it is too late.

Data Protection Gaps Good Bad and Ugly

The following figure shows good data protection gaps including recovery points (point in time protection) along with air gaps.

Good Data Protection Gaps
Figure 9.9 Air Gaps and Data Protection

Fault / Failures To Tolerate (FTT)

FTT is how many faults or failures to tolerate for a given solution or service which in turn determines what mode of protection, or fault tolerant mode ( FTM) to use.

Fault Tolerant Mode (FTM)

FTM is the mode or technique used to enable resiliency and protect against some number of faults.

Fault / Failure Domains

Fault or Failure domains are places and things that can fail from regions, data centers or availability zones, clusters, stamps, pods, servers, networks, storage, hardware (systems, components including SSD and HDDs, power supplies, adapters). Other fault domain topics and focus areas include facility power, cooling, software including applications, databases, operating systems and hypervisors among others.

SDDI SDDC Fault Domains Zones Regions
Figure 9.5 Various Fault and Failure Domains, Regions, Locations

Clustering

Clustering is a technique and technology for enabling resiliency, as well as scaling performance, availability, and capacity. Clusters can be local, remote, or wide-area to support different data infrastructure objectives, combined with replication and other techniques.

SDDI SDDC Clustering
Figure 9.12 Clustering and Replication Examples

Another characteristic of clustering and resiliency techniques is the ability to detect and react quickly to failures to isolate and contain faults, as well as invoking automatic repair if needed. Different clustering technologies enable various approaches, from proprietary hardware and software tightly coupled to loosely coupled general-purpose hardware or software.

Clustering characteristics include:

Application, database, file system, operating system (Windows Storage Replica)
Storage systems, appliances, adapters and network devices
Hypervisors ( Hyper-V, VMware vSphere ESXi and vSAN among others)
Share everything, share some things, share nothing
Tightly or loosely coupled with common or individual system metadata
Local in a data center, campus, metro, or stretch cluster
Wide-area in different regions and availability zones
Active/active for fast fail over or restart, active/passive (standby) mode

Additional clustering considerations include:

How does performance scale as nodes are added, or what overhead exists?
How is cluster resource locking in shared environments handled?
How many (or few) nodes are needed for quorum to exist?
Network and I/O interface (and management) requirements
Cluster partition or split-brain (i.e., cluster splits into two)?
Fast-reacting fail over and resiliency vs. overhead of failing back
Locality of where applications are located vs. storage access and clustering

Where To Learn More

Continue reading additional posts in this series of Data Infrastructure Data Protection fundamentals and companion to Software Defined Data Infrastructure Essentials (CRC Press 2017) book, as well as the following links covering technology, trends, tools, techniques, tradecraft and tips.

Part 1 – Data Infrastructure Data Protection Fundamentals
Part 2 – Reliability, Availability, Serviceability ( RAS) Data Protection Fundamentals
Part 3 – Data Protection Access Availability RAID Erasure Codes ( EC) including LRC
Part 4 – Data Protection Recovery Points (Archive, Backup, Snapshots, Versions)
Part 5 – Point In Time Data Protection Granularity Points of Interest
Part 6 – Data Protection Security Logical Physical Software Defined
Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends
Part 8 – Data Protection Diaries Walking Data Protection Talk
Part 9 – who’s Doing What ( Toolbox Technology Tools)
Part 10 – Data Protection Resources Where to Learn More
Data Protection Diaries series
Data Infrastructure server storage I/O network Recommended Reading List Book Shelf
Software Defined Data Infrastructure Essentials (CRC 2017) Book

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

Everything is not the same across different environments, data centers, data infrastructures and applications. There are various performance, availability, capacity economic (PACE) considerations along with service level objectives (SLO). Availability means being able to access information resources (applications, data and underlying data infrastructure resources), as well as data being consistent along with durable. Being durable means enabling data to be accessible in the event of a device, component or other fault domain item failures (hardware, software, data center).

Just as everything is not the same across different environments, there are various techniques, technologies and tools that can be used in different ways to enable availability and accessibility. These include high availability (HA), RAS, mirroring, replication, parity along with derivative erasure code (EC), LRC, RS and other RAID implementations, along with clustering. Also keep in mind that pertaining to data protection, there are good gaps (e.g. time intervals for recovery points, air gaps), bad gaps (missed coverage or lack of protection), and ugly gaps (not being able to recover from a gap in time).

Note that mirroring, replication, EC, LRC, RS or other Parity and RAID approaches are not replacements for backup, rather they are companions to time interval based recovery point protection such as snapshots, backup, checkpoints, consistency points and versioning among others (discussed in follow-up posts in this series).

Which data protection tool, technology to trend is the best depends on what you are trying to accomplish and your application workload PACE requirements along with SLOs. Get your copy of Software Defined Data Infrastructure Essentials here at Amazon.com, at CRC Press among other locations and learn more here. Meanwhile, continue reading with the next post in this series, Part 3 Data Protection Access Availability RAID Erasure Codes (EC) including LRC.

Ok, nuff said, for now.

November 14, 2017November 26, 2023

AWS Announces New S3 Cloud Storage Security Encryption Features

server storage I/O data infrastructure trends

Updated 1/17/2018

Amazon Web Services (AWS) recently announced new Simple Storage Service (S3) e.g. AWS S3 encryption and security enhancements including Default Encryption, Permission Checks, Cross-Region Replication ACL Overwrite, Cross-Region Replication with KMS and Detailed Inventory Report. Another recent announcement by AWS is for PrivateLinks endpoints within a Virtual Private Cloud (VPC).

AWS Dashboard
AWS Service Dashboard

Default Encryption

Extending previous security features, now you can mandate all objects stored in a given S3 bucket be encrypted without specifying a bucket policy that rejects non-encrypted objects. There are three server-side encryption (SSE) options for S3 objects including keys managed by S3, AWS KMS and SSE Customer ( SSE-C) managed keys. These options provide more flexibility as well as control for different environments along with increased granularity. Note that encryption can be forced on all objects in a bucket by specifying a bucket encryption configuration. When an unencrypted object is stored in an encrypted bucket, it will inherit the same encryption as the bucket, or, alternately specified by a PUT required.

AWS S3 Bucket Encryption
AWS S3 Buckets

Permission Checks

There is now an indicator on the S3 console dashboard prominently indicating which S3 buckets are publicly accessible. In the above image, some of my AWS S3 buckets are shown including one that is public facing. Note in the image above how there is a notion next to buckets that are open to public.

Cross-Region Replication ACL Overwrite and KMS

AWS Key Management Service (KMS) keys can be used for encrypting objects. Building on previous cross-region replication capabilities, now when you replicate objects across AWS accounts, a new ACL providing full access to the destination account can be specified.

Detailed Inventory Report

The S3 Inventory report ( which can also be encrypted) now includes the encryption status of each object.

PrivateLink for AWS Services

PrivateLinks enable AWS customers to access services from a VPC without using a public IP as well as traffic not having to go across the internet (e.g. keeps traffic within the AWS network. PrivateLink endpoints appear in Elastic Network Interface (ENI) with private IPs in your VPC and are highly available, resiliency and scalable. Besides scaling and resiliency, PrivateLink eliminates the need for white listing of public IPs as well as managing internet gateway, NAT and firewall proxies to connect to AWS services (Elastic Cloud Compute (EC2), Elastic Load Balancer (ELB), Kinesis Streams, Service Catalog, EC2 Systems Manager). Learn more about AWS PrivateLink for services here including VPC Endpoint Pricing here.

Where To Learn More

Learn more about related technology, trends, tools, techniques, and tips with the following links.

What This All Means

Common cloud concern considerations include privacy and security. AWS S3 among other industry cloud service and storage providers have had their share of not so pleasant news coverage involving security.

Keep in mind that data protection including security is a shared responsibility (and only you can prevent data loss). This means that the vendor or service provider has to take care of their responsibility making sure their solutions have proper data protection and security features by default, as well as extensions, and making those capabilities known to consumers.

The other part of shared responsibility is that consumers and users of cloud services need to know what the capabilities are, defaults and options as well as when to use various approaches. Ultimately it is up to the user of a cloud service to implement best practices to leverage cloud as well as their own on-premises technologies so that they can support data infrastructure that in turn protect, preserve, secure and serve information (along with their applications and data).

These are good enhancements by AWS to make their S3 cloud storage security encryption features available as well as provide options and awareness for users on how to use those capabilities.

Ok, nuff said, for now.

October 30, 2017April 27, 2025

Data Infrastructure server storage I/O network Recommended Reading #blogtober

server storage I/O data infrastructure trends recommended reading list

Updated 7/30/2018

The following is an evolving recommended reading list of data infrastructure topics including, server, storage I/O, networking, cloud, virtual, container, data protection and related topics that includes books, blogs, podcast’s, events and industry links among other resources.

Various Data Infrastructure including hardware, software, services related links:

Links A-E
Links F-J
Links K-O
Links P-T
Links U-Z
Other Links

In addition to my own books including Software Defined Data Infrastructure Essentials (CRC Press 2017), the following are Server StorageIO recommended reading list items . The recommended reading list includes various IT, Data Infrastructure and related topics.

Intel Recommended Reading List (IRRL) for developers is a good resource to check out.

Duncan Epping (@DuncanYB), Frank Denneman (@FrankDenneman) and Neils Hagoort (@NHagoort) have released their VMware vSphere 6.7 Clustering Deep Dive book available at venues including Amazon.com. This is the latest in a series of Cluster and deep dive books from Frank and Duncan which if you are involved with VMware, SDDC and related software defined data infrastructures these should be on your bookshelf.

Check out the Blogtober list of check out some of the blogs and posts occurring during October 2017 here.

Preston De Guise aka @backupbear is Author of several books has an interesting new site Foolsrushin.info that looks at topics including Ethics in IT among others. Check out his new book Data Protection: Ensuring Data Availability (CRC Press 2017) and available via Amazon.com here.

Brendan Gregg has a great site for Linux performance related topics here.

Greg Knieriemen has a must read weekly blog, post, column collection of whats going on in and around the IT and data infrastructure related industries, Check it out here.

Interested in file systems, CIFS, SMB, SAMBA and related topics then check out Chris Hertels book on implementing CIFS here at Amazon.com

For those involved with VMware, check out Frank Denneman VMware vSphere 6.5 host resource guide-book here at Amazon.com.

Docker: Up & Running: Shipping Reliable Containers in Production by Karl Matthias & Sean P. Kane via Amazon.com here.

Essential Virtual SAN (VSAN): Administrator’s Guide to VMware Virtual SAN,2nd ed. by Cormac Hogan & Duncan Epping via Amazon.com here.

Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale by Tom White via Amazon.com here.

Systems Performance: Enterprise and the Cloud by Brendan Gregg Via Amazon.com here.

Implementing Cloud Storage with OpenStack Swift by Amar Kapadia, Sreedhar Varma, & Kris Rajana Via Amazon.com here.

The Human Face of Big Data by Rick Smolan & Jennifer Erwitt Via Amazon.com here.

VMware vSphere 5.1 Clustering Deepdive (Vol. 1) by Duncan Epping & Frank Denneman Via Amazon.com here. Note: This is an older title, but there are still good fundamentals in it.

Linux Administration: A Beginners Guide by Wale Soyinka Via Amazon.com here.

TCP/IP Network Administration by Craig Hunt Via Amazon.com here.

Cisco IOS Cookbook: Field tested solutions to Cisco Router Problems by Kevin Dooley and Ian Brown Via Amazon.com here.

I often mention in presentations a must have for anybody involved with software defined anything, or programming for that matter which is the Niklaus Wirth classic Algorithms + Data Structures = Programs that you can get on Amazon.com here.

Seven Databases in Seven Weeks including NoSQL

Another great book to have is Seven Databases in Seven Weeks (here is a book review) which not only provides an overview of popular NoSQL databases such as Cassandra, Mongo, HBASE among others, lots of good examples and hands on guides. Get your copy here at Amazon.com.

Additional Data Infrastructure and related topic sites

In addition to those mentioned above, other sites, venues and data infrastructure related resources include:

aiim.com – Archiving and records management trade group

apache.org – Various open-source software

blog.scottlowe.org – Scott Lowe VMware Networking and topics

blogs.msdn.microsoft.com/virtual_pc_guy – Ben Armstrong Hyper-V blog

brendangregg.com – Linux performance-related topics

cablemap.info – Global network maps

CMG.org – Computer Measurement Group (CMG)

communities.vmware.com – VMware technical community and resources

comptia.org – Various IT, cloud, and data infrastructure certifications

cormachogan.com – Cormac Hogan VMware and vSAN related topics

csrc.nist.gov – U.S. government cloud specifications

dmtf.org – Distributed Management Task Force (DMTF)

ethernetalliance.org – Ethernet industry trade group

fibrechannel.org – Fibre Channel trade group

github.com – Various open-source solutions and projects

Intel Reading List – recommended reading list for developers

ieee.org – Institute of Electrical and Electronics Engineers

ietf.org – Internet Engineering Task Force

iso.org – International Standards Organizations

it.toolbox.com – Various IT and data infrastructure topics forums

labs.vmware.com/flings – VMware Fling additional tools and software

nist.gov – National Institute of Standards and Technology

nvmexpress.org – NVM Express (NVMe) industry trade group

objectstoragecenter.com – Various object and cloud storage items

opencompute.org – Open Compute Project (OCP) servers and related topics

opendatacenteralliance.org – Open Data Center Alliance (ODCA)

openfabrics.org – Open-fabric software industry group

opennetworking.org – Open Networking Foundation (ONF)

openstack.org – OpenStack resources

pcisig.com – Peripheral Component Interconnect (PCI) trade group

reddit.com – Various IT, cloud, and data infrastructure topics

scsita.org – SCSI trade association (SAS and others)

SNIA.org – Storage Network Industry Association (SNIA)

Speakingintech.com – Popular industry and data infrastructure podcast

Storage Bibliography – Collection of Dr. J. Metz storage related content

technet.microsoft.com – Microsoft TechNet data infrastructure–related topics

thenvmeplace.com – various NVMe and related tools, topics and links

thevpad.com – Collection of various virtualization and related sites

thessdplace.com – various NVM, SSD, flash, 3D XPoint related topics, tools, links

tpc.org – Transaction Performance Council benchmark site

vmug.org – VMware User Groups (VMUG)

wahlnetwork.com – Chris Whal Networking and related topics

yellow-bricks.com – Duncan Epping VMware and related topics

Additional Data Infrastructure Venues

Additional useful data infrastructure related information can be found at BizTechMagazine, BrightTalk, ChannelProNetwork, ChannelproSMB, ComputerWeekly, Computerworld, CRN, CruxialCIO, Data Center Journal (DCJ), Datacenterknowledge, and DZone. Other good sourses include Edtechmagazine, Enterprise Storage Forum, EnterpriseTech, Eweek.com, FedTech, Google+, HPCwire, InfoStor, ITKE, LinkedIn, NAB, Network Computing, Networkworld, and nextplatform. Also check out Reddit, Redmond Magazine and Webinars, Spiceworks Forums, StateTech, techcrunch.com, TechPageOne, TechTarget Venues (various Search sites, e.g., SearchStorage, SearchSSD, SearchAWS, and others), theregister.co.uk, TheVarGuy, Tom’s Hardware, and zdnet.com, among many others.

Where To Learn More

Learn more about related technology, trends, tools, techniques, and tips with the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

The above is an evolving collection of recommended reading including what I have on my physical and virtual bookshelves, as well as list of web sites, blogs and podcasts worth listening, reading or watching. Watch for more items to be added to the book shelf soon, and if you have a suggested recommendation, add it to the comments below.

By the way, if you have not heard, its #Blogtober, check out some of the other blogs and posts occurring during October here as part of your recommended reading list.

Ok, nuff said, for now.

RTO Context Matters

RTO Recovery Time Object Context

Recovery Time Objective Focus

Additional Resources Where to learn more

What this all means

Share this:

Have you heard about the new CLOUD Act data regulation?

CLOUD Act background and Stored Communications Act

What is CLOUD Act

Where to learn more

What this all means and wrap-up

Share this:

Data Protection Recovery Life Post World Backup Day Pre GDPR

Expanding Focus Data Protection and GDPR

Expanding Focus Data Protection Recovery and other Things that start with R

Data Protection Tips, Reminders and Recommendations

Where to learn more

What this all means and wrap-up

Share this:

AWS Cloud Application Data Protection Webinar

Where to learn more

What this all means and wrap-up

Share this:

Microsoft Windows Server 2019 Insiders Preview

Windows Server 2019 Preview Features

Microsoft and Windows LTSC and SAC

Test Driving Installing The Bits

Where to learn more

What this all means and wrap-up

Share this:

Application Data Value Characteristics Everything Is Not The Same

Common Applications Characteristics

Performance and Activity (How Resources Get Used)

Where to learn more

What this all means and wrap-up

Share this:

Application Data Availability 4 3 2 1 Data Protection

Availability (Accessibility, Durability, Consistency)

Capacity and Space (What Gets Consumed and Occupied)

Economics (People, Budgets, Energy and other Constraints)

Where to learn more

What this all means and wrap-up

Share this:

Application Data Characteristics Types Everything Is Not The Same

Various Types of Data

Different data with various values over time

Data Value

Where to learn more

What this all means and wrap-up

Share this:

Application Data Volume Velocity Variety Everything Not The Same

Volume of Data

Variety of Data

Velocity of Data

Where to learn more

What this all means and wrap-up

Share this:

Application Data Access Life cycle Patterns Everything Is Not The Same(Part V)

Active (Hot), Static (Warm and WORM), or Dormant (Cold) Data and Lifecycles

Where to learn more

What this all means and wrap-up

Share this:

Achieve Flexible Data Protection Availability with All Flash Solutions

Where to learn more

What this all means and wrap-up

Share this:

Reliability, Availability, Serviceability RAS Fundamentals

Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals

Data Protection Gaps and Air Gap

Fault / Failures To Tolerate (FTT)

Fault Tolerant Mode (FTM)

Fault / Failure Domains

Clustering

Where To Learn More

What This All Means

Share this:

AWS Announces New S3 Cloud Storage Security Encryption Features

Default Encryption

Permission Checks

Cross-Region Replication ACL Overwrite and KMS