Comment Archives

March 20, 2018 | April 27, 2025

March 2018 Server StorageIO Data Infrastructure Update Newsletter

Volume 18, Issue 3 (March 2018)

Hello and welcome to the March 2018 Server StorageIO Data Infrastructure Update Newsletter.

If you are wondering where the January and February 2018 update newsletters are, they are rolled into this combined edition. In addition to the short email version (free signup here), you can access full versions (html here and PDF here) along with previous editions here.

In this issue:

Data Infrastructure Industry Activity
News Commentary and Tips
Server StorageIOblog posts
Recommended Reading
Various Events and Webinars
Industry Resources and Links

Enjoy this edition of the Server StorageIO Data Infrastructure update newsletter.

Cheers GS

Data Infrastructure and IT Industry Activity Trends

Data Infrastructure Data Protection and Backup BC BR DR HA Security

World Backup Day is coming up on March 31, which is a good time to verify and validate that your data protection is working as intended. On one hand, I think it is a good idea to call out the importance of making sure your data is protected, including backed up.

On the other hand, data protection is not a once-a-year activity; it is a year-round, 7 x 24 x 365 focus. The focus also needs to be on more than just backup: it should cover all aspects of data protection, from archiving to business continuance (BC), business resiliency (BR), disaster recovery (DR), always on, always accessible, along with security and recovery.

Data Infrastructure Data Protection Backup 4 3 2 1 rule
Data Infrastructure 4 3 2 1 Data Protection and Backup

Some spring data thoughts, perspectives and reminders. Data lakes may swell beyond their banks, causing rivers of data to flood as they flow into larger reservoirs, great data lakes, gulfs of data, seas and oceans of data. Granted, some of that data will be inactive and cold, parked like glaciers, while other data will be semi-active, floating around like icebergs. Hopefully your data is stored on durable storage solutions or services and does not melt.

Data Infrastructure Server Storage I/O flash SSD NVMe
Various NAND Flash SSD devices and SAS, SATA, NVMe, M.2 interfaces

Non-Volatile Memory (NVM) includes various solid state device (SSD) mediums (e.g. nand flash, 3D XPoint, MRAM among others) and packaging (drives, PCIe Add in cards [AiC], along with entire systems, appliances or arrays). Also part of the continued evolution of NVM, SSD and other persistent memories (PM), including storage class memories (SCM), are different access protocol interfaces.

Keep in mind that there is a difference between NVM (medium) and NVMe (access). NVM is the generic category of mediums or media and devices such as nand flash, nvram, 3D XPoint among other SCM (and PMs). In other words, NVM is what devices use for storing data; NVMe is how devices and systems are accessed. NVMe and its variations are how NVM, SSD, PM, SCM media and devices get accessed locally, as well as over network fabrics (e.g. NVMe-oF and FC-NVMe).

NVMe continues to evolve, including networked fabric variations such as RDMA based NVMe over Fabrics (NVMe-oF), along with Fibre Channel based (FC-NVMe). The Fibre Channel Industry Association trade group recently held its second multi-vendor plugfest in support of NVMe over Fibre Channel.

Read more about NVM, NVMe, SSD, SCM, flash and related technologies, tools, trends, tips via the following resources:

Has object storage failed to live up to its industry hype and lacked traction? Or is object storage (also known as blobs) progressing with customer adoption and deployment on normal, realistic timelines? Recently I have seen some industry comments about object storage not catching on with customers or failing to live up to its hyped expectations. IMHO object storage is very much alive along with block, file, table (e.g. database SQL and NoSQL repositories), message/queue among others, as well as emerging blockchain aka data exchanges.

Various Industry and Customer Adoption Deployment Timeline (Via: StorageIOblog.com)

An issue with object storage is that it is still new and still evolving, and applications in many IT environments do not yet speak or access objects and blobs natively. Likewise, as is often the case, industry adoption and deployment is usually early and short term around the hype, vs. the longer cycle of customer adoption and deployment. The downside for those who only focus on object storage (or blobs) is that they may be under pressure to do things short term instead of adjusting to customer cycles, which take longer; however, real adoption and deployment also last longer.

While the hype and industry buzz around object storage (and blobs) may have faded, customer adoption continues and is here to stay, along with block, file among others. Learn more at www.objectstoragecenter.com. Also keep in mind that there is a difference between industry and customer adoption along with deployment.

Some recent Industry Activities, Trends, News and Announcements include:

In case you missed it, Amazon Web Services (AWS) announced EKS (Elastic Kubernetes Service) which, as its name implies, is an easy to use and manage Kubernetes (containers, serverless data infrastructure) running on AWS. AWS joins others including Microsoft Azure Kubernetes Service (AKS), Google Kubernetes Engine, EasyStack (ESContainer for OpenStack and Kubernetes), and VMware Pivotal Container Service (PKS) among others. What this means is that in the container serverless data infrastructure ecosystem, Kubernetes container management (orchestration platform) is gaining in both industry and customer adoption along with deployment.

Check out other industry news, comments, trends perspectives here.

Data Infrastructure Server StorageIO Comments Content

Server StorageIO Commentary in the news, tips and articles

Recent Server StorageIO industry trends perspectives commentary in the news.

Via BizTech: Why Hybrid (SSD and HDD) Storage Might Be Fit for SMB environments
Via Excelero: Server StorageIO white paper enabling database DBaaS productivity
Via Cloudian: YouTube video interview file services on object storage with HyperFile
Via CDW Solutions: Comments on Software Defined Access
Via SearchStorage: Comments on Cloudian HyperStore on demand cloud like pricing
Via EnterpriseStorageForum: Comments and tips on Software Defined Storage Best Practices
Via PRNewsWire: Comments on Excelero NVMe NVMesh Database and DBaaS solutions
Via SearchStorage: Comments on NooBaa multi-cloud storage management
Via CDW: Comments on New IT Strategies Improve Your Bottom Line
Via EnterpriseStorageForum: Comments on Software Defined Storage: Pros and Cons
Via DataCenterKnowledge: Comments on The Great Data Center Headache IoT
Via SearchStorage: Comments on Dell and VMware merger scenario options
Via PRNewswire: Comments on Chelsio Microsoft Validation of iWARP/RDMA
Via SearchStorage: Comments on Server Storage Industry trends and Dell EMC
Via ChannelProSMB: Comments on Hybrid HDD and SSD storage solutions
Via ChannelProNetwork: Comments on What the Future Holds for HDDs
Via HealthcareITnews: Comments on MOUNTAINS OF MOBILE DATA
Via SearchStorage: Comments on Cloudian HyperStore 7 targets multi-cloud complexities
Via GlobeNewsWire: Comments on Cloudian HyperStore 7
Via GizModo: Comments on Intel Optane 800P NVMe M.2 SSD
Via DataCenterKnowledge: Comments on getting data centers ready for IoT
Via DataCenterKnowledge: Comments on Beyond the Hype: AI in the Data Center
Via DataCenterKnowledge: Comments on Data Center and Cloud Disaster Recovery
Via SearchStorage: Comments on Cloudian HyperFile marries NAS and object storage
Via SearchStorage: Comments on Top 10 Tips on Solid State Storage Adoption Strategy
Via SearchStorage: Comments on 8 Top Tips for Beating the Big Data Deluge

View more Server, Storage and I/O trends and perspectives comments here.

Data Infrastructure Server StorageIOblog posts

Server StorageIOblog Data Infrastructure Posts

Recent and popular Server StorageIOblog posts include:

Application Data Value Characteristics Everything Is Not The Same
Application Data Availability 4 3 2 1 Data Protection
AWS Cloud Application Data Protection Webinar
Microsoft Windows Server 2019 Insiders Preview
Application Data Characteristics Types Everything Is Not The Same
Application Data Volume Velocity Variety Everything Is Not The Same
Application Data Access Lifecycle Patterns Everything Is Not The Same
Veeam GDPR preparedness experiences Webinar walking the talk
VMware continues cloud construction with March announcements
Benefits of Moving Hyper-V Disaster Recovery to the Cloud Webinar
World Backup Day 2018 Data Protection Readiness Reminder
Use Intel Optane NVMe U.2 SFF 8639 SSD drive in PCIe slot
Data Infrastructure Resource Links cloud data protection tradecraft trends
How to Achieve Flexible Data Protection Availability with All Flash Storage Solutions
November 2017 Server StorageIO Data Infrastructure Update Newsletter
IT transformation Serverless Life Beyond DevOps Podcast
Data Protection Diaries Fundamental Topics Tools Techniques Technologies Tips
HPE Announces AMD Powered Gen 10 ProLiant DL385 For Software Defined Workloads
AWS Announces New S3 Cloud Storage Security Encryption Features
Introducing Windows Subsystem for Linux WSL Overview #blogtober
Hot Popular New Trending Data Infrastructure Vendors To Watch

View other recent as well as past StorageIOblog posts here

Server StorageIO Recommended Reading (Watching and Listening) List

In addition to my own books including Software Defined Data Infrastructure Essentials (CRC Press 2017) available at Amazon.com (check out special sale price), the following are Server StorageIO data infrastructure recommended reading, watching and listening list items. The Server StorageIO data infrastructure recommended reading list includes various IT, Data Infrastructure and related topics. The Intel Recommended Reading List (IRRL) for developers is also a good resource to check out. Speaking of my books, Didier Van Hoye (@WorkingHardInIt) has a good review over on his site you can view here; also check out the rest of his great content while there.

In case you may have missed it, here is a good presentation from AWS re:invent 2017 by Brendan Gregg (@brendangregg) about how Netflix does EC2 and other AWS tuning along with plenty of great resource links. Keith Tenzer (@keithtenzer) provides a good perspective piece about containers in a large IT enterprise environment here including various options.

Speaking of IT data centers and data infrastructure environments, check out the list of some of the world's most extreme habitats for technology here. Mark Betz (@markbetz) has a series of Docker and Kubernetes networking fundamentals posts on his site here, as well as over at Medium including mention of Google Cloud (@googlecloud). The posts in Mark's series are good refreshers or intros to how Docker and Kubernetes handle basic networking between containers, pods, nodes, hosts in clusters. Check out part I here and part II here.

Blockchain elements
Image via https://stevetodd.typepad.com

Steve Todd (@Stevetodd) has some good perspectives about Trusted Data Exchanges e.g. life beyond blockchain and bitcoin here along with core element considerations (beyond the product pitch) here, along with associated data infrastructure and storage evolution vs. revolution here.

Watch for more items to be added to the recommended reading list book shelf soon.

Data Infrastructure Server StorageIO event activities

Events and Activities

Recent and upcoming event activities.

March 27, 2018 – Webinar – Veeams Road to GDPR Compliancy The 5 Lessons Learned

Feb 28, 2018 – Webinar – Benefits of Moving Hyper-V Disaster Recovery to the Cloud

Jan 30, 2018 – Webinar – Achieve Flexible Data Protection and Availability with All Flash Storage

Nov. 9, 2017 – Webinar – All You Need To Know about ROBO Data Protection Backup

See more webinars and activities on the Server StorageIO Events page here.

Data Infrastructure Server StorageIO Industry Resources and Links

Various useful links and resources:

Data Infrastructure Recommended Reading and watching list
Microsoft TechNet – Various Microsoft related from Azure to Docker to Windows
storageio.com/links – Various industry links (over 1,000 with more to be added soon)
objectstoragecenter.com – Cloud and object storage topics, tips and news items
OpenStack.org – Various OpenStack related items
storageio.com/downloads – Various presentations and other download material
storageio.com/protect – Various data protection items and topics
thenvmeplace.com – Focus on NVMe trends and technologies
thessdplace.com – NVM and Solid State Disk topics, tips and techniques
storageio.com/converge – Various CI, HCI and related SDS topics
storageio.com/performance – Various server, storage and I/O benchmark and tools
VMware Technical Network – Various VMware related items

Connect and Converse With Us

Subscribe to Newsletter – Newsletter Archives – StorageIO.com – StorageIOblog.com

What this all means and wrap-up

Data Infrastructures are what exist inside physical data centers spanning cloud, converged, hyper-converged, virtual, serverless and other software defined as well as legacy environments. The fundamental role of data infrastructures comprising server (compute), storage, I/O networking hardware, software, services defined by management tools, best practices and policies is to provide a platform for applications along with their data to deliver information services. With March 31 being World Backup Day, also focus on making sure that on April 1st you are not a fool trying to recover from a bad data protection copy. With the continued movement to flash SSD along with other forms of storage class memory (SCM) and persistent memories (PM), data moves at a faster rate, meaning data protection is even more important to get you out of trouble as fast as you get into issues.

Ok, nuff said, for now.

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2026 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

March 13, 2018 | November 26, 2023

Application Data Value Characteristics Everything Is Not The Same (Part I)

This is part one of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we start things off by looking at general application server storage I/O characteristics that have an impact on data value as well as access.

Everything is not the same across different organizations including Information Technology (IT) data centers, data infrastructures along with the applications as well as data they support. For example, there is so-called big data that can be many small files, objects, blobs or data and bit streams representing telemetry, click stream analytics, logs among other information.

Keep in mind that applications impact how data is accessed, used, processed, moved and stored. What this means is that a focus on data value, access patterns, along with other related topics needs to also consider application performance, availability, capacity, economic (PACE) attributes.

If everything is not the same, why is so much data along with many applications treated the same from a PACE perspective?

Data Infrastructure resources including servers, storage, networks might be cheap or inexpensive, however, there is a cost to managing them along with data.

Managing includes data protection (backup, restore, BC, DR, HA, security) along with other activities. Likewise, there is a cost to the software along with cloud services among others. By understanding how applications use and interact with data, smarter, more informed data management decisions can be made.

IT Applications and Data Infrastructure Layers

Keep in mind that everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them. Also keep in mind that programs (e.g. applications) = algorithms (code) + data structures (how data is defined and organized, structured or unstructured).

There are traditional applications, along with those tied to Internet of Things (IoT), Artificial Intelligence (AI) and Machine Learning (ML), Big Data and other analytics including real-time click stream, media and entertainment, security and surveillance, log and telemetry processing among many others.

What this means is that there are many different applications with various characteristics and attributes, along with resource requirements (server compute, I/O network and memory, storage) as well as service requirements.

Common Applications Characteristics

Different applications will have various attributes, in general, as well as in how they are used, for example, database transaction activity vs. reporting or analytics, logs and journals vs. redo logs, indices, tables, import/export, scratch and temp space. Performance, availability, capacity, and economics (PACE) describes the application and data characteristics and needs shown in the following figure.

Application PACE attributes (via Software Defined Data Infrastructure Essentials)

All applications have PACE attributes, however:

PACE attributes vary by application and usage
Some applications and their data are more active than others
PACE characteristics may vary within different parts of an application

Think of applications along with associated data PACE as its personality or how it behaves, what it does, how it does it, and when, along with value, benefit, or cost as well as quality-of-service (QoS) attributes.

Understanding applications in different environments, including data values and associated PACE attributes, is essential for making informed server, storage, I/O and data infrastructure decisions. Data infrastructure decisions range from configuration to acquisitions or upgrades, when, where, why, and how to protect, and how to optimize performance including capacity planning, reporting, and troubleshooting, not to mention addressing budget concerns.

Primary PACE attributes for active and inactive applications and data are:

P – Performance and activity (how things get used)
A – Availability and durability (resiliency and data protection)
C – Capacity and space (what things use or occupy)
E – Economics and Energy (people, budgets, and other barriers)

Some applications need more performance (server compute, or storage and network I/O), while others need space capacity (storage, memory, network, or I/O connectivity). Likewise, some applications have different availability needs (data protection, durability, security, resiliency, backup, business continuity, disaster recovery) that determine the tools, technologies, and techniques to use.

Budgets are also nearly always a concern, which for some applications means enabling more performance per cost while others are focused on maximizing space capacity and protection level per cost. PACE attributes also define or influence policies for QoS (performance, availability, capacity), as well as thresholds, limits, quotas, retention, and disposition, among others.
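One way to make PACE concrete is to write it down per application. The following is only a minimal Python sketch: the PaceProfile class, its field names, the example applications and the thresholds are all hypothetical and made up for illustration, not something taken from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaceProfile:
    """Hypothetical PACE summary for one application (illustrative fields only)."""
    name: str
    iops_needed: int             # P: activity rate the application expects
    availability_target: float   # A: e.g. 0.999 means "three nines"
    capacity_gb: int             # C: space the data occupies
    budget_per_month: float      # E: what can be spent on it


def main_concern(profile: PaceProfile) -> str:
    """Rough classification; both thresholds are assumptions for this sketch."""
    if profile.iops_needed >= 50_000:
        return "performance"
    if profile.capacity_gb >= 100_000:
        return "capacity"
    return "balanced"


database = PaceProfile("orders-db", 80_000, 0.9999, 2_000, 5_000.0)
archive = PaceProfile("video-archive", 500, 0.99, 500_000, 3_000.0)

print(main_concern(database))  # performance
print(main_concern(archive))   # capacity
```

The point is not the thresholds, which will differ for every environment, rather that two applications with different PACE profiles should not automatically get the same resources or policies.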

Performance and Activity (How Resources Get Used)

Some applications or components that comprise a larger solution will have more performance demands than others. Likewise, the performance characteristics of applications along with their associated data will also vary. Performance applies to the server, storage, and I/O networking hardware along with associated software and applications.

For servers, performance is focused on how much CPU or processor time is used, along with memory and I/O operations. I/O operations to create, read, update, or delete (CRUD) data include activity rate (frequency or data velocity) of I/O operations (IOPS). Other considerations include the volume or amount of data being moved (bandwidth, throughput, transfer), response time or latency, along with queue depths.

Activity is the amount of work to do or being done in a given amount of time (seconds, minutes, hours, days, weeks), which can be transactions, rates, IOPs. Additional performance considerations include latency, bandwidth, throughput, response time, queues, reads or writes, gets or puts, updates, lists, directories, searches, page views, files opened, videos viewed, or downloads.

Server, storage, and I/O network performance include:

Processor CPU usage time and queues (user and system overhead)
Memory usage effectiveness including page and swap
I/O activity including between servers and storage
Errors, retransmission, retries, and rebuilds

The following figure shows a generic performance example of data being accessed (mixed reads, writes, random, sequential, big, small, low and high-latency) on a local and a remote basis. The example shows how, for a given time interval (see lower right), applications are accessing and working with data via different data streams in the larger image left center. Also shown are queues and I/O handling along with end-to-end (E2E) response time.

Server I/O performance fundamentals (via Software Defined Data Infrastructure Essentials)

Also shown on the left in the above figure is an example of E2E response time from the application through the various data infrastructure layers, as well as, lower center, the response time from the server to the memory or storage devices.

Various queues are shown in the middle of the above figure. They are indicators of how much work is occurring, and of whether the processing is keeping up with the work or causing backlogs. Context is needed for queues, as they exist in the server, I/O networking devices, and software drivers, as well as in storage among other locations.

Some basic server, storage, I/O metrics that matter include:

Queue depth of I/Os waiting to be processed and concurrency
CPU and memory usage to process I/Os
I/O size, or how much data can be moved in a given operation
I/O activity rate or IOPs = data moved per unit of time / I/O size
Bandwidth = data moved per unit of time = I/O size × I/O rate
Latency usually increases with larger I/O sizes, decreases with smaller requests
I/O rates usually increase with smaller I/O sizes and vice versa
Bandwidth increases with larger I/O sizes and vice versa
Sequential stream access data may have better performance than some random access data
Not all data is conducive to being sequential stream, or random
Lower response time is better, higher activity rates and bandwidth are better
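The relationship between I/O size, I/O rate and bandwidth in the list above can be checked with a few lines of arithmetic. This is a small Python sketch; the function names and the two workloads are made-up examples, not measurements.

```python
KIB = 1024  # bytes in one kibibyte


def bandwidth_bytes_per_sec(io_size_bytes: int, iops: float) -> float:
    """Bandwidth = I/O size x I/O rate."""
    return io_size_bytes * iops


def iops_from_bandwidth(bandwidth: float, io_size_bytes: int) -> float:
    """I/O rate = data moved per unit of time / I/O size."""
    return bandwidth / io_size_bytes


# Hypothetical workloads: many small I/Os vs. fewer large I/Os.
small_io = bandwidth_bytes_per_sec(4 * KIB, 100_000)    # 4 KiB at 100,000 IOPS
large_io = bandwidth_bytes_per_sec(1024 * KIB, 1_000)   # 1 MiB at 1,000 IOPS

print(small_io)  # 409600000
print(large_io)  # 1048576000
print(iops_from_bandwidth(small_io, 4 * KIB))  # 100000.0
```

In this example the workload with 100 times fewer I/Os per second still moves more than twice the data, which is why looking at IOPs alone (or bandwidth alone) can be misleading.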

Queues with high latency and small I/O size or small I/O rates could indicate a performance bottleneck. Queues with low latency and high I/O rates with good bandwidth or data being moved could be a good thing. An important note is to look at several metrics, not just IOPs or activity, or bandwidth, queues, or response time. Also, keep in mind that metrics that matter for your environment may be different from those for somebody else.
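To see why a queue depth number needs context, here is a short Python sketch using Little's Law (average work in flight = arrival rate x time in the system). Little's Law is brought in here as a general queueing result, not something from the book excerpt, and the two workloads are hypothetical.

```python
def outstanding_ios(iops: float, latency_ms: float) -> float:
    """Little's Law: average I/Os in flight = I/O rate x average response time."""
    return iops * latency_ms / 1000


# Two hypothetical devices that report the same average queue depth.
busy_and_healthy = outstanding_ios(iops=10_000, latency_ms=2)
slow_and_backed_up = outstanding_ios(iops=200, latency_ms=100)

print(busy_and_healthy)    # 20.0
print(slow_and_backed_up)  # 20.0
```

Both cases average 20 outstanding I/Os, yet the first is doing 50 times more work at a fraction of the response time. The queue depth by itself does not say which situation you are in, so look at it together with activity rate and latency.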

Something to keep in perspective is that there can be a large amount of data with low performance, or a small amount of data with high-performance, not to mention many other variations. The important concept is that as space capacity scales, that does not mean performance also improves or vice versa, after all, everything is not the same.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Ok, nuff said, for now.

March 13, 2018 | November 26, 2023

Application Data Availability 4 3 2 1 Data Protection

4 3 2 1 data protection Application Data Availability Everything Is Not The Same

This is part two of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application performance, availability, capacity, economic (PACE) attributes that have an impact on data value as well as availability.

Availability (Accessibility, Durability, Consistency)

Just as there are many different aspects and focus areas for performance, there are also several facets to availability. Note that application performance requires availability, and availability relies on some level of performance.

Availability is a broad and encompassing area that includes data protection to protect, preserve, and serve (backup/restore, archive, BC, BR, DR, HA) data and applications. There are logical and physical aspects of availability including data protection as well as security including key management (manage your keys or authentication and certificates) and permissions, among other things.

Availability = accessibility (can you get to your application and data) + durability (is the data intact and consistent). This includes basic Reliability, Availability, Serviceability (RAS), as well as high availability, accessibility, and durability. “Durable” has multiple meanings, so context is important. Durable means how data infrastructure resources hold up to, survive, and tolerate wear and tear from use (i.e., endurance), for example, Flash SSD or mechanical devices such as Hard Disk Drives (HDDs). Another context for durable refers to data, meaning how many copies in various places.

Server, storage, and I/O network availability topics include:

Resiliency and self-healing to tolerate failure or disruption
Hardware, software, and services configured for resiliency
Accessibility to reach or be reached for handling work
Durability and consistency of data to be available for access
Protection of data, applications, and assets including security

Additional server I/O and data infrastructure along with storage topics include:

Backup/restore, replication, snapshots, sync, and copies
Basic Reliability, Availability, Serviceability, HA, fail over, BC, BR, and DR
Alternative paths, redundant components, and associated software
Applications that are fault-tolerant, resilient, and self-healing
Non disruptive upgrades, code (application or software) loads, and activation
Immediate data consistency and integrity vs. eventual consistency
Virus, malware, and other data corruption or loss prevention

From a data protection standpoint, the fundamental rule or guideline is 4 3 2 1, which means having at least four copies consisting of at least three versions (different points in time), at least two of which are on different systems or storage devices, and at least one of those is off-site (on-line, off-line, cloud, or other). There are many variations of the 4 3 2 1 rule shown in the following figure, along with approaches on how to manage it and what technology to use. We will go deeper into this subject in later chapters. For now, remember the following.

4 3 2 1 data protection (via Software Defined Data Infrastructure Essentials)

4 – At least four copies of data (or more). Enables durability in case a copy goes bad, is deleted or corrupted, or a device or site fails.
3 – At least three versions of the data to retain. Enables various recovery points in time to restore, resume, restart from.
2 – Data located on two or more systems (devices or media/mediums). Enables protection against device, system, server, file system, or other fault/failure.
1 – At least one of those copies being off-premise and not live (isolated from active primary copy). Enables resiliency across sites, as well as space, time, distance gap for protection.
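The four checks above are simple enough to express in code. The following Python sketch is one possible way to test an inventory of protection copies against 4 3 2 1; the ProtectionCopy class, its fields and the sample inventory are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionCopy:
    """One copy of the data (hypothetical record layout)."""
    version: str    # point-in-time label, e.g. a date
    system: str     # system, device or service holding the copy
    offsite: bool   # True if the copy lives away from the primary site


def check_4321(copies: list) -> dict:
    """Return a pass/fail flag for each part of the 4 3 2 1 guideline."""
    return {
        "4 copies": len(copies) >= 4,
        "3 versions": len({c.version for c in copies}) >= 3,
        "2 systems": len({c.system for c in copies}) >= 2,
        "1 off-site": any(c.offsite for c in copies),
    }


inventory = [
    ProtectionCopy("monday", "primary-array", False),
    ProtectionCopy("tuesday", "primary-array", False),
    ProtectionCopy("tuesday", "backup-server", False),
    ProtectionCopy("wednesday", "cloud-bucket", True),
]

print(check_4321(inventory))      # every check passes
print(check_4321(inventory[:3]))  # only the "2 systems" check passes
```

Note that this only counts copies; it does not verify that any of them can actually be restored, which is the part worth testing before you need it.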

Capacity and Space (What Gets Consumed and Occupied)

In addition to being available and accessible in a timely manner (performance), data (and applications) occupy space. That space is memory in servers, as well as using available consumable processor CPU time along with I/O (performance) including over networks.

Data and applications also consume storage space where they are stored. In addition to basic data space, there is also space consumed for metadata as well as protection copies (and overhead), application settings, logs, and other items. Another aspect of capacity includes network IP ports and addresses, software licenses, server, storage, and network bandwidth or service time.

Server, storage, and I/O network capacity topics include:

Consumable time-expiring resources (processor time, I/O, network bandwidth)
Network IP and other addresses
Physical resources of servers, storage, and I/O networking devices
Software licenses based on consumption or number of users
Primary and protection copies of data and applications
Active and standby data infrastructure resources and sites
Data footprint reduction (DFR) tools and techniques for space optimization
Policies, quotas, thresholds, limits, and capacity QoS
Application and database optimization
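As a rough illustration of how metadata and protection copies add to the space that primary data occupies, here is a small Python sketch. The function name, the 5 percent metadata overhead and the copy count are assumptions for the example, and it deliberately ignores any data footprint reduction.

```python
def total_footprint_gb(primary_gb: int, metadata_pct: int, total_copies: int) -> float:
    """Space for all copies, each carrying the same metadata overhead.

    Assumes full, unreduced copies (no compression, dedupe or incrementals).
    """
    per_copy_gb = primary_gb * (100 + metadata_pct) / 100
    return per_copy_gb * total_copies


# Hypothetical: 1,000 GB of data, 5% metadata overhead, four full copies.
print(total_footprint_gb(primary_gb=1_000, metadata_pct=5, total_copies=4))  # 4200.0
```

In this made-up case 1,000 GB of application data turns into 4,200 GB of consumed capacity, which is the kind of gap that DFR techniques aim to shrink.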

DFR includes various techniques, technologies, and tools to reduce the impact or overhead of protecting, preserving, and serving more data for longer periods of time. There are many different approaches to implementing a DFR strategy, since there are various applications and data.

Common DFR techniques and technologies include archiving, backup modernization, copy data management (CDM), clean up, compress, and consolidate, data management, deletion and dedupe, storage tiering, RAID (including parity-based, erasure codes, local reconstruction codes [LRC], Reed-Solomon, and Ceph Shingled Erasure Code [SHEC], among others), along with protection configurations and thin-provisioning, among others.

DFR can be implemented in various complementary locations from row-level compression in database or email to normalized databases, to file systems, operating systems, appliances, and storage systems using various techniques.

Also, keep in mind that not all data is the same; some is sparse, some is dense, and some can be compressed or deduped while other data cannot. However, even when data is not compressible or dedupable, identical copies can be identified, with links created to a common copy.
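As a rough illustration of the point above, here is a minimal Python sketch using only the standard library. It estimates how compressible a buffer is and finds identical copies by content hash so they can be linked to one common copy. The helper names, the sample data, and the choice of zlib and SHA-256 are assumptions for illustration, not a description of how any particular DFR product works.

```python
import hashlib
import os
import zlib


def compression_ratio(data):
    """Return original size divided by compressed size (higher = more compressible)."""
    if not data:
        return 1.0
    return len(data) / len(zlib.compress(data))


def dedupe(items):
    """Keep one common copy per unique content and link each name to it.

    items maps a name to its bytes. Returns (store, links) where store maps
    a content digest to the single stored copy and links maps each name to
    the digest of the copy it points at.
    """
    store = {}
    links = {}
    for name, data in items.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # first copy seen becomes the common copy
        links[name] = digest
    return store, links


# Hypothetical sample data: sparse (all zeros) versus dense (random bytes).
sparse = bytes(64 * 1024)
dense = os.urandom(64 * 1024)

items = {"report_v1": sparse, "report_copy": sparse, "video_chunk": dense}
store, links = dedupe(items)

print(f"sparse compression ratio: {compression_ratio(sparse):.1f}")
print(f"dense compression ratio:  {compression_ratio(dense):.2f}")
print(f"{len(items)} items stored as {len(store)} unique copies")
```

The sparse buffer compresses well while the random buffer does not, yet the duplicate of either one can still be reduced to a link.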

Economics (People, Budgets, Energy and other Constraints)

If the one constant in life and technology is change, then the other constant is concern about economics or costs. There is a cost to enable and maintain a data infrastructure on premises or in the cloud, which exists to protect, preserve, and serve data and information applications.

However, there should also be a benefit to having the data infrastructure to house data and support applications that provide information to users of the services. A common economic focus is what something costs, either as an up-front capital expenditure (CapEx) or as an operating expenditure (OpEx), along with recurring fees.

In general, economic considerations include:

Budgets (CapEx and OpEx), both up front and in recurring fees
Whether you buy, lease, rent, subscribe, or use free and open sources
People time needed to integrate and support even free open-source software
Costs including hardware, software, services, power, cooling, facilities, tools
People time includes base salary, benefits, training and education

Where to learn more

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Keep in mind that with Application Data Value Characteristics, Everything Is Not The Same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. All applications have some element of performance, availability, capacity, economic (PACE) needs as well as resource demands. With data storage, there is often a focus on storage efficiency and utilization, which is where data footprint reduction (DFR) techniques, tools, trends, and technologies address capacity requirements. However, there is also an expanding focus on storage effectiveness, also known as productivity, tied to performance, along with availability including 4 3 2 1 data protection. Continue reading the next post (Part III Application Data Characteristics Types Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018November 26, 2023

Application Data Characteristics Types Everything Is Not The Same

This is part three of a five-part mini-series looking at Application Data Value Characteristics (everything is not the same) as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on different types of data. There is more to data than simply being big data, fast data, big fast or unstructured, structured or semi-structured, some of which has been touched on in this series, with more to follow. Note that there is also data in terms of the programs, applications, code, rules, policies as well as configuration settings, metadata along with other items stored.

Various Types of Data

Data types along with characteristics include big data, little data, fast data, and old as well as new data with a different value, life-cycle, volume and velocity. There is data in files and objects, some of them big, representing images, figures, text, binary, structured or unstructured content, that is software defined by the applications that create, modify and use it.

There are many different types of data and applications to meet various business, organization, or functional needs. Keep in mind that applications are based on programs which consist of algorithms and data structures that define the data, how to use it, as well as how and when to store it. Those data structures define data that will get transformed into information by programs while also being stored in memory and on data storage in various formats.

Just as various applications have different algorithms, they also have different types of data. Even though everything is not the same in all environments, or even in how the same applications get used across various organizations, there are some similarities and general characteristics. Keep in mind that information is the result of programs (applications and their algorithms) that process data into something useful or of value.

Data typically has a basic life cycle of:

Creation and some activity, including being protected
Dormant, followed by either continued activity or going inactive
Disposition (delete or remove)

In general, data can be:

Temporary, ephemeral or transient
Dynamic or changing (“hot data”)
Active static on-line, near-line, or off-line (“warm data”)
Inactive static on-line or off-line (“cold data”)

Data is organized as:

Structured
Semi-structured
Unstructured

General data characteristics include:

Value = From no value to unknown to some or high value
Volume = Amount of data, files, objects of a given size
Variety = Various types of data (small, big, fast, structured, unstructured)
Velocity = Data streams, flows, rates, load, process, access, active or static

The following figure shows how different data has various values over time. Data that has no value today or in the future can be deleted, while data with unknown value can be retained.

Different data with various values over time

Data Value Known, Unknown and No Value

General characteristics include the value of the data which in turn determines its performance, availability, capacity, and economic considerations. Also, data can be ephemeral (temporary) or kept for longer periods of time on persistent, non-volatile storage (you do not lose the data when power is turned off). Examples of temporary data include work and scratch areas, such as where data gets imported into, or exported out of, an application or database.

Data can also be little, big, or big and fast, terms which describe in part the size as well as volume along with the speed or velocity of being created, accessed, and processed. The importance of understanding characteristics of data and how their associated applications use them is to enable effective decision-making about performance, availability, capacity, and economics of data infrastructure resources.

Data Value

There is more to data storage than how much space capacity you get per cost.

All data has one of three basic values:

No value = ephemeral/temp/scratch = Why keep it?
Some value = current or emerging future value, which can be low or high = Keep
Unknown value = protect until value is unlocked, or no remaining value

In addition to the above basic three, data with some value can also be further subdivided into little value, some value, or high value. Of course, you can keep subdividing into as many more or different categories as needed. After all, everything is not always the same across environments.

Besides data having some value, that value can also change by increasing or decreasing over time, or even by going from unknown to a known value, known to unknown, or to no value. Data with no value can be discarded; if in doubt, make and keep a copy of that data somewhere safe until its value (or lack of value) is fully known and understood.
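To make the three basic values concrete, here is a minimal Python sketch that maps a value category to a retention action. The enum, the function name, and the action strings are hypothetical, chosen only to mirror the wording above; a real policy would have more categories and inputs.

```python
from enum import Enum


class DataValue(Enum):
    NONE = "no value"
    SOME = "some value"
    UNKNOWN = "unknown value"


def disposition(value, in_doubt=False):
    """Map a data value category to a retention action (illustrative only)."""
    if value is DataValue.NONE:
        # If in doubt, keep a copy somewhere safe until the lack of value is confirmed.
        return "keep a safe copy" if in_doubt else "discard"
    if value is DataValue.SOME:
        return "keep"
    # Unknown value: protect until the value is unlocked or known to be gone.
    return "protect until value is known"


for category in DataValue:
    print(f"{category.value}: {disposition(category)}")
```

Because value changes over time, a function like this would be re-run as part of periodic reviews rather than applied once at creation.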

The importance of understanding the value of data is to enable effective decision-making on where and how to protect, preserve, and cost-effectively store the data. Note that cost-effective does not necessarily mean the cheapest or lowest-cost approach, rather it means the way that aligns with the value and importance of the data at a given point in time.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data has different value at various times, and that value is also evolving. Everything Is Not The Same across various organizations, data centers, data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part IV Application Data Volume Velocity Variety Everything Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018October 18, 2024

Application Data Volume Velocity Variety Everything Is Not The Same

This is part four of a five-part mini-series looking at Application Data Value Characteristics (everything is not the same) as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on data volume, velocity, and variety. After all, everything is not the same, not to mention the many different aspects of big data as well as little data.

Volume of Data

More data is growing at a faster rate every day, and that data is being retained for longer periods. Some data being retained has known value, while a growing amount of data has an unknown value. Data is generated or created from many sources, including mobile devices, social networks, web-connected systems or machines, and sensors including IoT and IoD. Besides where data is created from, there are also many consumers of data (applications) that range from legacy to mobile, cloud, IoT among others.

Unknown-value data may eventually have value in the future when somebody realizes that they can do something with it, or when a technology tool or application becomes available to transform the data with unknown value into valuable information.

Some data gets retained in its native or raw form, while other data gets processed by application program algorithms into summary data, or is curated and aggregated with other data to be transformed into new useful data. The figure below shows, from left to right and front to back, more data being created, and that data also getting larger over time. For example, on the left are two data items, objects, files, or blocks representing some information.

In the center of the following figure are more columns and rows of data, with each of those data items also becoming larger. Moving farther to the right, there are yet more data items stacked up higher, as well as across and farther back, with those items also being larger. The following figure can represent blocks of storage, files in a file system, rows, and columns in a database or key-value repository, or objects in a cloud or object storage system.

Increasing data velocity and volume, more data and data getting larger

In addition to more data being created, some of that data is relatively small in terms of the records or data structure entities being stored. However, there can be a large quantity of those smaller data items. In addition to the amount of data, as well as the size of the data, protection or overhead copies of data are also kept.

Another dimension is that data is also getting larger, where the data structures describing a piece of data for an application have increased in size. For example, a still photograph taken with a digital camera, cell phone, or another mobile handheld device, drone, or other IoT device increases in size with each new generation of cameras as there are more megapixels.
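As a back-of-the-envelope illustration of the photograph example, the following Python sketch estimates the uncompressed size of an image from its megapixel count. It assumes 3 bytes per pixel (24-bit color) and one megapixel equal to 1,000,000 pixels; actual files are usually smaller because of compression, and the camera resolutions listed are hypothetical.

```python
def raw_image_bytes(megapixels, bytes_per_pixel=3):
    """Estimate uncompressed image size: pixels multiplied by bytes per pixel."""
    return int(megapixels * 1_000_000 * bytes_per_pixel)


# Hypothetical camera generations with increasing resolution.
for mp in (2, 8, 12, 48):
    size_mb = raw_image_bytes(mp) / 1_000_000
    print(f"{mp:>2} MP -> about {size_mb:.0f} MB uncompressed")
```

The same number of photos therefore consumes several times more capacity as each new generation of devices adds megapixels.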

Variety of Data

In addition to having value and volume, there are also different varieties of data, including ephemeral (temporary), persistent, primary, metadata, structured, semi-structured, unstructured, little, and big data. Keep in mind that programs, applications, tools, and utilities get stored as data, while they also use, create, access, and manage data.

There is also primary data and metadata, or data about data, as well as system data that is also sometimes referred to as metadata. Here is where context comes into play as part of tradecraft, as there can be metadata describing data being used by programs, as well as metadata about systems, applications, file systems, databases, and storage systems, among other things, including little and big data.

Context also matters regarding big data, as there are applications such as statistical analysis software and Hadoop, among others, for processing (analyzing) large amounts of data. The data being processed may not be big regarding the records or data entity items, but there may be a large volume. In addition to big data analytics, data, and applications, there is also data that is very big (as well as large volumes or collections of data sets).

For example, video and audio, among others, may also be referred to as big fast data, or large data. A challenge with larger data items is the complexity of moving over the distance promptly, as well as processing requiring new approaches, algorithms, data structures, and storage management techniques.

Likewise, the challenges with large volumes of smaller data are similar in that data needs to be moved, protected, preserved, and served cost-effectively for long periods of time. Both large and small data are stored (in memory or storage) in various types of data repositories.

In general, data in repositories is accessed locally, remotely, or via a cloud using:

Object and blob, stream, queue, and application programming interface (API) access
File-based using local or networked file systems
Block-based access of disk partitions, LUNs (logical unit numbers), or volumes

The following figure shows varieties of application data value including (left) photos or images, audio, videos, and various log, event, and telemetry data, as well as (right) sparse and dense data.

Varieties of data (bits, bytes, blocks, blobs, and bitstreams)

Velocity of Data

Data, in addition to having value (known, unknown, or none), volume (size and quantity), and variety (structured, unstructured, semi-structured, primary, metadata, small, big), also has velocity. Velocity refers to how fast (or slowly) data is accessed, including being stored, retrieved, updated, or scanned, and whether it is active (being updated, or fixed and static) or dormant and inactive. In addition to data access and life cycle, velocity also refers to how data is used, such as random or sequential or some combination. Think of data velocity as how data, or streams of data, flow in various ways.

Velocity also describes how data is used and accessed, including:

Active (hot), static (warm and WORM), or dormant (cold)
Random or sequential, read or write-accessed
Real-time (online, synchronous) or time-delayed
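One simple way to tell random from sequential access is to look at an I/O trace and count how often a request starts where the previous one ended. The Python sketch below does that; the trace format of (offset, length) tuples and the function name are assumptions for illustration, and real tools use richer traces and heuristics.

```python
def sequential_fraction(trace):
    """Return the fraction of requests that start where the previous one ended.

    trace is a list of (offset, length) tuples in arrival order.
    A result near 1.0 suggests sequential access, near 0.0 random access.
    """
    if len(trace) < 2:
        return 1.0
    sequential = 0
    for (prev_offset, prev_length), (offset, _) in zip(trace, trace[1:]):
        if offset == prev_offset + prev_length:
            sequential += 1
    return sequential / (len(trace) - 1)


# Hypothetical traces using 4 KB requests.
streaming = [(0, 4096), (4096, 4096), (8192, 4096)]
scattered = [(0, 4096), (81920, 4096), (4096, 4096)]

print(sequential_fraction(streaming))
print(sequential_fraction(scattered))
```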

Why this matters is that by understanding and knowing how applications use data, or how data is accessed via applications, you can make informed decisions. Having that insight also helps you design, configure, and manage servers, storage, and I/O resources (hardware, software, services) to meet various needs. Understanding Application Data Value, including the velocity of the data both when it is created and when it is used, is important for aligning the applicable performance techniques and technologies.

Where to learn more

Learn more about Application Data Value, application characteristics, performance, availability, capacity, economic (PACE) along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

- Part 1 – Application Data Value Characteristics Everything Is Not The Same
- Part 2 – 4 3 2 1 Data Protection Application Data Availability
- Part 3 – Application Data Characteristics Types Everything Is Not The Same
- Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
- Part 5 – Application Data Access life cycle Patterns Everything Not The Same
- Software Defined, Cloud, Object and Blob Storage
- Data Infrastructure server storage I/O network Recommended Reading
- World Backup Day 2018 Data Protection Readiness Reminder
- Data Infrastructure Server Storage I/O related Tradecraft Overview
- Data Infrastructure Overview, Its What’s Inside of Data Centers
- 4 3 2 1 and 3 2 1 data protection best practices
- Garbage data in, garbage information out, big data or big garbage?
- GDPR (General Data Protection Regulation) Resources Are You Ready?
- Which Enterprise HDD to use for a Content Server Platform
- The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
- The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
- Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data has different value, size, and velocity as part of its characteristics, including how it is used by various applications. Keep in mind that with Application Data Value Characteristics, Everything Is Not The Same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part V Application Data Access life cycle Patterns Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018November 26, 2023

Application Data Access Lifecycle Patterns Everything Is Not The Same

This is part five of a five-part mini-series looking at Application Data Value Characteristics (everything is not the same) as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we look at various application and data lifecycle patterns as well as wrap up this series.

Active (Hot), Static (Warm and WORM), or Dormant (Cold) Data and Lifecycles

When it comes to Application Data Value, a common question I hear is why not keep all data?

If the data has value, and you have a large enough budget, why not? On the other hand, most organizations have a budget and other constraints that determine how much and what data to retain.

Another common question I get asked (or told) is: isn’t the objective to keep less data to cut costs?

If the data has no value, then get rid of it. On the other hand, if data has value or unknown value, then find ways to remove the cost of keeping more data for longer periods of time so its value can be realized.

In general, the data life cycle (called by some cradle to grave, or birth or creation to disposition) is create, save and store, then perhaps update and read, with access patterns as well as value changing over time. During that time, the data (which includes applications and their settings) will be protected with copies or some other technique, and eventually disposed of.

Between the time when data is created and when it is disposed of, there are many variations of what gets done and needs to be done. Considering static data for a moment, some applications and their data, or data and their applications, create data which is active for a short period, then goes dormant, then is active again briefly before going cold (see the left side of the following figure). This is a classic application, data, and information life-cycle management (ILM) model, along with tiering or data movement and migration, that still applies for some scenarios.

Changing data access patterns for different applications

However, a newer scenario over the past several years that continues to increase is shown on the right side of the above figure. In this scenario, data is initially active for updates, then goes cold or WORM (Write Once/Read Many); however, it warms back up as a static reference, on the web, as big data, and for other uses where it is used to create new data and information.

Data, in addition to its other attributes already mentioned, can be active (hot), residing in a memory cache, buffers inside a server, or on a fast storage appliance or caching appliance. Hot data means that it is actively being used for reads or writes (this is what the term heat map pertains to in the context of the server, storage, data, and applications). The heat map shows where the hot or active data is along with its other characteristics.

Context is important here, as there are also IT facilities heat maps, which refer to physical facilities including what servers are consuming power and generating heat. Note that some current and emerging data center infrastructure management (DCIM) tools can correlate the physical facilities power, cooling, and heat to actual work being done from an applications perspective. This correlated or converged management view enables more granular analysis and effective decision-making on how to best utilize data infrastructure resources.

In addition to being hot or active, data can be warm (not as heavily accessed) or cold (rarely if ever accessed), as well as online, near-line, or off-line. As the name implies, warm data may occasionally be used, either updated and written, or static and just being read. Some data also gets protected as WORM data using hardware or software technologies. WORM data, not to be confused with warm data, is fixed or immutable (cannot be changed).

When looking at data (or storage), it is important to see when the data was created as well as when it was modified. However, you should avoid the mistake of looking only at when it was created or modified. Instead, also look to see when it was last read, as well as how often it is read. You might find that some data has not been updated for several years, but it is still accessed several times an hour or minute. Also, keep in mind that the metadata about the actual data may be being updated, even while the data itself is static.
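As a rough sketch of looking at last read time in addition to last modified time, the following Python example classifies a file as hot, warm, or cold from its access timestamp while reporting separately how long ago it was written. The function name and the one-day and thirty-day thresholds are hypothetical. Also note the assumption that access times are being recorded: some file systems are mounted with options such as noatime or relatime that skip or reduce access-time updates, in which case other tools or logs are needed.

```python
import os
import time

SECONDS_PER_DAY = 86400


def access_profile(path, now=None, hot_days=1.0, warm_days=30.0):
    """Classify a file by when it was last read, not only when it was changed.

    Thresholds are illustrative. Returns a dict with the temperature plus
    the age in days of the last read and the last modification.
    """
    now = time.time() if now is None else now
    st = os.stat(path)
    days_since_read = (now - st.st_atime) / SECONDS_PER_DAY
    days_since_write = (now - st.st_mtime) / SECONDS_PER_DAY
    if days_since_read <= hot_days:
        temperature = "hot"
    elif days_since_read <= warm_days:
        temperature = "warm"
    else:
        temperature = "cold"
    return {
        "temperature": temperature,
        "days_since_read": days_since_read,
        "days_since_write": days_since_write,
    }


if __name__ == "__main__":
    # Report on regular files in the current directory.
    for entry in os.scandir("."):
        if entry.is_file():
            profile = access_profile(entry.path)
            print(f"{entry.name}: {profile['temperature']}, "
                  f"read {profile['days_since_read']:.1f} days ago, "
                  f"written {profile['days_since_write']:.1f} days ago")
```

A file written years ago but read within the last day shows up as hot here, which is exactly the case that looking only at modification time would miss.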

Also, look at your application's characteristics as well as how data gets used, to see if it is conducive to caching or automated tiering based on activity, events, or time. For example, there may be a large amount of data for an energy or oil exploration project that normally sits on slower, lower-cost storage, but that now and then some analysis needs to run on.

Using data and storage management tools, given notice or based on activity, that large or big data could be promoted to faster storage, or applications migrated to be closer to the data to speed up processing. Another example is weekly, monthly, quarterly, or year-end processing of financial, accounting, payroll, inventory, or enterprise resource planning (ERP) schedules. By knowing how and when the applications use the data, which also means understanding the data, automated tools and policies can be used to tier or cache data to speed up processing and thereby boost productivity.

All applications have performance, availability, capacity, economic (PACE) attributes, however:

PACE attributes vary by Application Data Value and usage
Some applications and their data are more active than others
PACE characteristics may vary within different parts of an application
PACE application and data characteristics along with value change over time

Read more about Application Data Value, PACE and application characteristics in Software Defined Data Infrastructure Essentials (CRC Press 2017).

Where to learn more

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Lifecycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Keep in mind that with Application Data Value, everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them.

Also keep in mind that there is more data being created, that the size of those data items, files, objects, entities, and records is also increasing, and that the speed at which they get created and accessed is increasing as well. The challenge is not just that there is more data, or that data is bigger, or accessed faster; it is all of those, along with changing value as well as diverse applications to keep in perspective. With the new General Data Protection Regulation (GDPR) going into effect May 25, 2018, now is a good time to assess and gain insight into what data you have, its value, and its retention as well as disposition policies.

Remember, there are different data types, value, life-cycle, volume and velocity that change over time, and with Application Data Value Everything Is Not The Same, so why treat and manage everything the same?

Ok, nuff said, for now.

March 7, 2018November 26, 2023

Veeam GDPR preparedness experiences Webinar walking the talk

Veeam GDPR preparedness experiences Fireside chat Webinar

March 27, 9AM PT
This free (register here) fireside chat webinar sponsored by Veeam looks at Veeam GDPR preparedness experiences based on what Veeam did to be ready for the General Data Protection Regulation (GDPR) taking effect May 25, 2018. The format of this webinar will be a fireside chat between myself and Danny Allan (@DannyAllan5) of Veeam as we discuss the experiences and lessons learned by Veeam during their journey to prepare for GDPR.

Danny has put together a five-part blog series here covering some of Veeam's findings and lessons learned that you can leverage to prepare for GDPR, as well as what we will discuss among other related topics during the fireside chat webinar. Keep in mind that GDPR is commonly mistaken as just a European regulation when in fact it is global in reach. In addition to being global, it is also inclusive of big as well as small organizations, cloud and non-cloud entities, as well as spanning industries, along with different parts of an organization from human resources (HR) to accounting and finance to sales and marketing, among others.

Join me and Danny Allan as we discuss GDPR along with five key lessons learned during Veeam's road to GDPR compliance, as well as how their software solutions played a critical role in managing their own environment. In other words, Veeam is not just talking the talk, they are also walking the talk, eating their own dog food among other clichés.

Where to learn more

Learn more about data protection, GDPR, software defined data center (SDDC), software defined data infrastructures (SDDI), cloud and related topics via the following links:

GDPR (General Data Protection Regulation) Resources Are You Ready?
GDPR: An overview of Veeam’s 5 lessons learned on our way to compliancy
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
Server StorageIO.tv (various videos and podcasts, fun and for work)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Now is the time to be prepared for upcoming GDPR implementation. Join me and Danny Allan to learn what you need to be doing now, as well as compare what you have done or are doing to be prepared for GDPR.

Ok, nuff said, for now.

March 7, 2018December 29, 2025

VMware continues cloud construction with March announcements

VMware continues cloud construction with March announcements of new features and other enhancements.

VMware Cloud Provides Consistent Operations and Infrastructure Via: VMware.com

With its recent announcements, VMware continues cloud construction adding new features, enhancements, partnerships along with services.

Like other vendors and service providers who tried and tested the waters of having their own public cloud, VMware has moved beyond its vCloud Air initiative, selling it to OVH. VMware, while a publicly traded company (VMW), is by way of majority ownership part of the Dell Technologies family of companies via the 2016 acquisition of EMC by Dell. What this means is that like Dell Technologies, VMware is focused on providing solutions and services to its cloud provider partners instead of building, deploying and running its own cloud in competition with partners.

VMware Cloud Data Infrastructure and SDDC layers Via: VMware.com

The VMware Cloud message and strategy is focused around providing software solutions to cloud and other data infrastructure partners (and customers) instead of competing with them (e.g. divesting of vCloud Air, partnering with AWS, IBM Softlayer). Part of the VMware cloud message and strategy is to provide consistent operations and management across clouds, containers, virtual machines (VM) as well as other software defined data center (SDDC) and software defined data infrastructures.

In other words, what this means is VMware providing consistent management to leverage common experiences of data infrastructure staff along with resources in a hybrid, cross cloud and software defined environment in support of existing as well as cloud native applications.

VMware Cloud on AWS Image via: AWS.com

Note that VMware Cloud services run on top of AWS EC2 bare metal (BM) server instances, as well as on BM instances at IBM Softlayer and OVH. Learn more about AWS EC2 BM compute instances aka Metal as a Service (MaaS) here. In addition to AWS, IBM and OVH, VMware claims over 4,000 regional cloud and managed service providers who have built their data infrastructures out using VMware based technologies.

VMware continues cloud construction updates

Building off of previous announcements, VMware continues cloud construction with enhancements to their Amazon Web Services (AWS) partnership along with services for IBM Softlayer cloud as well as OVH. As a refresher, OVH is who bought what was formerly known as VMware vCloud Air when it was sold off.

Besides expanding on existing cloud partner solution offerings, VMware also announced additional cloud, software defined data center (SDDC) and other software defined data infrastructure environment management capabilities. SDDC and data infrastructure management tools include leveraging VMware's acquisition of Wavefront among others.

VMware Cloud Updates and New Features

VMware Cloud on AWS European regions (now in London, adding Frankfurt, Germany)
Stretch Clusters with synchronous replication for cross geography location resiliency
Support for data intensive workloads including data footprint reduction (DFR) with vSAN based compression and data de duplication
Fujitsu services offering relationships
Expanded VMware Cloud Services enhancements

VMware Cloud Services enhancements include:

Hybrid Cloud Extension
Log intelligence
Cost insight
Wavefront

VMware Cloud in additional AWS Regions

As part of service expansion, VMware Cloud on AWS has been extended into a European region (London) with plans to expand into Frankfurt and an Asia Pacific location. Previously VMware Cloud on AWS was available in the US West (Oregon) and US East (Northern Virginia) regions. Learn more about AWS Regions and availability zones (AZ) here.

VMware Cloud on AWS Stretch Clusters Source: VMware.com

VMware Cloud on AWS Stretch Clusters

In addition to expanding into additional regions, VMware Cloud on AWS is also being extended with stretch clusters for geographically dispersed protection. Stretched clusters provide protection against an AZ failure (e.g. data center site) for mission critical applications. Built on vSphere HA and DRS automated host failure technology, stretched clusters provide recovery point objective zero (RPO 0) for continuous protection and high availability across AZs at the data infrastructure layer.

The benefit of data infrastructure layer based HA and resiliency is not having to re-architect or modify upper level, higher up layered applications or software. Synchronous replication between AZs enables RPO 0. If one AZ goes down, it is treated as a vSphere HA event with VMs restarted in another AZ.
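To illustrate why synchronous replication leads to RPO 0, here is a minimal Python sketch. It is a simplified toy model, not how vSAN or vSphere HA are implemented; the Site class, the synchronous_write function and the AZ names are hypothetical and exist only for this example. The idea shown is that a write is acknowledged only after both AZs have committed it, so no acknowledged write is missing at the surviving AZ.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for the storage in one availability zone (AZ)."""
    name: str
    committed: list = field(default_factory=list)
    online: bool = True

    def commit(self, record: str) -> None:
        self.committed.append(record)


def synchronous_write(record: str, primary: Site, secondary: Site) -> bool:
    """Acknowledge a write only after both sites have committed it."""
    if not (primary.online and secondary.online):
        return False  # toy-model assumption: no acknowledgement, caller must retry
    primary.commit(record)
    secondary.commit(record)
    return True


az_a = Site("az-a")
az_b = Site("az-b")

# Keep only the writes that were acknowledged back to the application.
acknowledged = [r for r in ("w1", "w2", "w3") if synchronous_write(r, az_a, az_b)]

az_a.online = False  # simulate losing one AZ

# Every acknowledged write is already at the surviving AZ, hence RPO 0.
lost_writes = [r for r in acknowledged if r not in az_b.committed]
print(lost_writes)
```

With asynchronous replication the acknowledgement would come back before the second commit, so the surviving AZ could be missing the most recent writes.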

vSAN based Data Footprint Reduction (DFR) aka Compression and De duplication

To support applications that leverage large amounts of data, aka data intensive applications in marketing speak, VMware is leveraging vSAN based data footprint reduction (DFR) techniques including compression as well as de-duplication (dedupe). Leveraging DFR technologies like compression and dedupe integrated into vSAN, VMware Clouds have the ability to store more data in a given cubic density. Storing more data in a given cubic density improves storage efficiency (e.g. space saving utilization), and together with performance acceleration, also facilitates storage effectiveness along with productivity.

With VMware vSAN technology as one of the core underlying technologies for enabling VMware Cloud on AWS (among other deployments), applications with large data needs can store more data at a lower cost point. Note that VMware Cloud can support 10 clusters per SDDC deployment, with each cluster having 32 nodes, with cluster wide and aware dedupe. Also note that for performance, VMware Cloud on AWS leverages NVMe attached Solid State Devices (SSD) to boost effectiveness and productivity.
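As a rough back of the envelope sketch of the numbers above, the following Python computes the maximum node count per SDDC deployment along with an estimate of effective capacity. The 10 clusters and 32 nodes come from the text. The raw capacity per node (10 TB) and the 2:1 data footprint reduction ratio are purely hypothetical assumptions for illustration, your ratios will vary by data type, and the sketch ignores protection copy and other overhead.

```python
MAX_CLUSTERS_PER_SDDC = 10  # from the text above
NODES_PER_CLUSTER = 32      # from the text above


def max_nodes() -> int:
    """Maximum nodes in one SDDC deployment."""
    return MAX_CLUSTERS_PER_SDDC * NODES_PER_CLUSTER


def effective_capacity_tb(raw_tb_per_node: float, nodes: int, reduction_ratio: float) -> float:
    """Estimated logical capacity when data reduces by reduction_ratio (2.0 means 2:1)."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_tb_per_node * nodes * reduction_ratio


nodes = max_nodes()
# Hypothetical inputs: 10 TB raw per node and a 2:1 compression plus dedupe ratio.
estimate_tb = effective_capacity_tb(raw_tb_per_node=10.0, nodes=nodes, reduction_ratio=2.0)
print(nodes, estimate_tb)
```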

Extending VMware vSphere any to any migration across clouds Source: VMware.com

VMware Hybrid Cloud Extension

VMware Hybrid Cloud Extension enables common management of common underlying data infrastructure as well as software defined environments including across public, private as well as hybrid clouds. Some of the capabilities include enabling warm VM migration across various software defined environments from local on-premises and private cloud to public clouds.

New enhancements leverage previously available technology, now offered as a service for enterprises besides service providers, to support data center to data center, or cloud centric AZ to AZ, as well as region to region migrations. Some of the use cases include small to large bulk migrations of hundreds to thousands of VMs, covering both scheduling as well as the actual move. Moves and migrations can span hybrid deployments with a mix of on-premises as well as various cloud services.

VMware Cloud Cost Insight

VMware Cost Insight enables analysis and comparison of cloud costs (across public AWS, Azure and private VMware clouds) to avoid flying blind in and among clouds. VMware Cloud Cost Insight enables awareness of how resources are used, their cost and benefit to applications as well as IT budget impacts. It integrates the vSAN sizer tool along with AWS metrics for improved situational awareness, cost modeling, analysis and what if comparisons.

With integration to Network Insight, VMware Cloud Cost Insight also provides awareness into networking costs in support of migrations. What this means is that using VMware Cloud Cost Insight you can take the guess-work out of what your expenses will be for public, private on-premises or hybrid cloud by having deeper insight and awareness into your SDDC environment. Learn more about VMware Cost Insight here.

VMware Log Intelligence

Log Intelligence is a new VMware cloud service that provides real-time data infrastructure insight along with application visibility from private, on-premises, to public along with hybrid clouds. As its name implies, Log Intelligence provides syslog and other log insight, analysis and intelligence with real-time visibility into VMware as well as AWS among other resources for faster troubleshooting, diagnostics, event correlation and other data infrastructure management tasks.

Log and telemetry input sources for VMware Log Intelligence include data infrastructure resources such as operating systems, servers, system statistics, security, applications among other syslog events. For those familiar with VMware Log Insight, this capability is an extension of that known experience expanding it to be a cloud based service.

Wavefront by VMware Source: VMware.com

VMware Wavefront

VMware Wavefront enables monitoring of cloud native high scale environments with custom metrics and analytics. As a reminder Wavefront was acquired by VMware to enable deep metrics and analytics for developers, DevOps, data infrastructure operations as well as SaaS application developers among others. Wavefront integrates with VMware vRealize along with enabling monitoring of AWS data infrastructure resources and services. With the ability to ingest, process, analyze various data feeds, the Wavefront engine enables the predictive understanding of mixed application, cloud native data and data infrastructure platforms including big data based.

Where to learn more

Learn more about VMware, vSphere, vRealize, VMware Cloud, AWS (and other clouds), along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

VMware Cloud Briefing site
VMware vRealize Cloud Management Platform (Application delivery and operations automation across clouds)
VMware Cost Insight (Analyze and compare cloud costs across public AWS, Azure and private VMware clouds)
VMware Network Insight (Accelerate application security and networking across public, private, hybrid clouds)
VMware Wavefront (Monitor cloud native high scale environments with custom metrics and analytics)
VMware Cloud Community resources
VMware Cloud Partner resources
VMware vSAN V6.6 Part V (vSAN evolution and summary)
Dell EMC World 2017 Day One news announcement summary
September 2017 Server StorageIO Data Infrastructure Update Newsletter
Getting Caught Up What Happened In September 2017
Travel Fun Crossword Puzzle For VMworld 2017 Las Vegas
Hot Popular New Trending Data Infrastructure Vendors To Watch
Dell EMC VMware September 2017 Software Defined Data Infrastructure Updates
Amazon Web Service AWS September 2017 Software Defined Data Infrastructure Updates
Data Infrastructure server storage I/O network Recommended Reading
EMC is now Dell EMC, part of Dell Technologies and other server storage Updates
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

VMware continues cloud construction. For now, it appears that VMware like Dell Technologies is content being a technology provider partner to large as well as small public, private and hybrid cloud environments instead of building their own and competing. With this series of announcements, VMware continues cloud construction enabling its partners and customers on their various software defined data center (SDDC) and related data infrastructure journeys. Overall, this is a good set of enhancements, updates, new and evolving features for their partners as well as customers who leverage VMware based technologies. Meanwhile VMware continues cloud construction.

Ok, nuff said, for now.

February 11, 2018April 27, 2025

World Backup Day 2018 Data Protection Readiness Reminder

server storage I/O trends

It’s that time of year again, World Backup Day 2018 Data Protection Readiness Reminder.

In case you have forgotten, or were not aware, this coming Saturday March 31 is World Backup (and Recovery) Day. The annual day is a day to remember to make sure you are protecting your applications, data, information, configuration settings as well as data infrastructures. While the emphasis is on backup, that also means recovery as well as testing to make sure everything is working properly.

It's time that the focus of World Backup Day should expand from just a focus on backup to also broader data protection and things that start with R. Some data protection (and backup) related things, tools, tradecraft techniques, technologies and trends that start with R include readiness, recovery, reconstruct, restore, restart, resume, replication, rollback, roll forward, RAID and erasure codes, resiliency, recovery time objective (RTO), recovery point objective (RPO) among others.

Keep in mind that Data Protection is a broader focus than just backup and recovery. Data protection includes disaster recovery DR, business continuance BC, business resiliency BR, security (logical and physical), standard and high availability HA, as well as durability, archiving, data footprint reduction, copy data management CDM along with various technologies, tradecraft techniques, tools.

Quick Data Protection, Backup and Recovery Checklist

Keep the 4 3 2 1 or shorter older 3 2 1 data protection rules in mind
Do you know what data, applications, configuration settings, meta data, keys, certificates are being protected?
Do you know how many versions, copies, where stored and what is on or off-site, on or off-line?
Implement data protection at different intervals and coverage of various layers (application, transaction, database, file system, operating system, hypervisors, device or volume among others)

Have you protected your data protection environment including software, configuration, catalogs, indexes, databases along with management tools?
Verify that data protection point in time copies (backups, snapshots, consistency points, checkpoints, version, replicas) are working as intended
Make sure that not only are the point in time protection copies running when scheduled, but also that they are protecting what's intended

Test to see if the protection copies can actually be used, this means restoring as well as accessing the data via applications
Watch out to prevent a disaster in the course of testing, plan, prepare, practice, learn, refine, improve
In addition to verifying your data protection (backup, bc, dr) for work, also take time to see how your home or personal data is protected
View additional tips, techniques, checklist items in this Data Protection fundamentals series of posts here.
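As one simple way to act on the verify and test items in the checklist above, here is a minimal Python sketch that compares a checksum of an original file against a restored copy. This is only an illustration under stated assumptions: the function names and file names are hypothetical, and a matching checksum only shows the bytes came back intact, you still need to confirm that your applications can actually open and use the restored data.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True when the restored copy is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)


# Small self-contained demo using temporary files (hypothetical names).
with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "source.dat"
    good_restore = Path(workdir) / "restored_good.dat"
    bad_restore = Path(workdir) / "restored_bad.dat"
    source.write_bytes(b"important data")
    good_restore.write_bytes(b"important data")
    bad_restore.write_bytes(b"important dat")  # simulates a truncated restore
    good_ok = restore_matches(source, good_restore)
    bad_ok = restore_matches(source, bad_restore)

print(good_ok, bad_ok)
```

Run something like this against a test restore location rather than over production data, so that the test itself does not become the disaster.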

storageio data protection toolbox

Where To Learn More

View additional Data Infrastructure Data Protection and related tools, trends, technology and tradecraft skills topics via the following links.

Data Protection Diaries series
Part 1 – Data Infrastructure Data Protection Fundamentals
Part 2 – Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals
Part 3 – Data Protection Fundamental Access Availability RAID Erasure Codes (EC) including LRC
Part 4 – Data Protection Recovery Points (Archive, Backup, Snapshots, Versions)
Part 5 – Point In Time Data Protection Granularity Points of Interest
Part 6 – Data Protection Security Logical Physical Software Defined
Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends
Part 8 – Data Protection Diaries Walking Data Protection Talk
Part 9 – Who’s Doing What (Toolbox Technology Tools)
Part 10 – Data Protection Resources Where to Learn More
Revisiting RAID storage remains relevant and resources
Time to restore from backup: Do you know where your data is?

February 2017 Server StorageIO Update Newsletter
AWS Announces New S3 Cloud Storage Security Encryption Features
Data Infrastructure Server Storage I/O Tradecraft Trends
Data Infrastructure Server Storage I/O related Tradecraft Overview
What’s a data infrastructure?
Data Infrastructure Overview, Its Whats Inside of Data Centers
Ensure your data infrastructure remains available and resilient
GDPR (General Data Protection Regulation) Resources Are You Ready?
GDPR goes into effect May 25 2018 Are You Ready?
Until the focus expands to data protection – Taking action
Backup, Big data, Big Data Protection, CMG & More with Tom Becchetti Podcast
Six plus data center software defined management dashboards
Cloud Storage Concerns, Considerations and Trends
Zombie Technology Life after Death Tape Is Still Alive
Data Infrastructure server storage I/O network Recommended Reading List Book Shelf
Software Defined Data Infrastructure Essentials (CRC 2017) Book
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Various Data Infrastructure related events, webinars and other activities
www.objectstoragecenter.com and Software Defined, Cloud, Bulk and Object Storage Fundamentals

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

You can not go forward if you can not go back to a particular point in time (e.g. recovery point objective or RPO). Likewise, if you can not go back to a given RPO, how can you go forward with your business as well as meet your recovery time objective (RTO)?
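To make the RPO and RTO relationship concrete, here is a small Python sketch. The function names, dates and objectives are hypothetical assumptions for illustration. RPO is treated as the maximum tolerable age of the last good recovery point at the time of failure, while RTO is treated as the maximum tolerable time to restore and resume.

```python
from datetime import datetime, timedelta


def meets_rpo(last_recovery_point: datetime, failure_time: datetime, rpo: timedelta) -> bool:
    """True when the data written since the last recovery point is within the RPO."""
    return failure_time - last_recovery_point <= rpo


def meets_rto(restore_duration: timedelta, rto: timedelta) -> bool:
    """True when the time needed to restore and resume is within the RTO."""
    return restore_duration <= rto


# Hypothetical scenario: failure at noon, last good backup at 6 AM.
failure = datetime(2018, 3, 31, 12, 0)
last_backup = datetime(2018, 3, 31, 6, 0)

rpo_ok = meets_rpo(last_backup, failure, rpo=timedelta(hours=4))  # 6 hours exposed vs 4 hour RPO
rto_ok = meets_rto(timedelta(hours=2), rto=timedelta(hours=3))    # 2 hour restore vs 3 hour RTO
print(rpo_ok, rto_ok)
```

In this scenario the restore is fast enough, yet six hours of data are exposed against a four hour objective, which is why both need to be checked together.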

Backup is as important as restore. Without a good backup or data protection point in time copy, how can you restore? Some will say backup is more important than recovery, however it's the enablement that matters, in other words being able to provide data protection and recover, restart, resume or other things that start with R. World Backup Day should be a reminder to think about broader data protection which also means recovery, restore and realizing if your copies and versions are good. Keep the above in mind and this is your World Backup Day 2018 Data Protection Readiness Reminder.

Ok, nuff said, for now.

January 10, 2018November 26, 2023

How to Achieve Flexible Data Protection Availability with All Flash Storage Solutions

Achieve Flexible Data Protection Availability with All Flash Solutions

server storage I/O data infrastructure trends

Updated 1/21/2018

How to Achieve Flexible flash data protection and Availability with All-Flash Storage Solutions

Interactive webinar discussion pertaining to flash data protection (not death by PowerPoint or UI GUI product demo ;)
Tuesday January 30 2018 11AM PT / 2PM ET
Via Redmond Magazine (Free with registration)

Everything is not the same across different organizations, environments, application workloads and the data infrastructures that support them. Fast applications and workloads need fast protection, restoration, and resumption as well as fast flash storage. This applies across legacy, software-defined, virtual, container, cloud, hybrid, converged and HCI among other environments.

Join me along with representatives from Pure Storage and Veeam for this interactive discussion as we explore how to boost the performance, availability, capacity, and economics (PACE) of your applications as well as the data infrastructures that support them.

How all-flash storage enables faster protection and restoration of fast applications
Why data protection and availability should not be an afterthought
Ways to leverage your data protection storage to drive business change
How to simplify and reduce complexity to boost productivity while lowering costs
Why workload aggregation consolidation should not cause aggravation

Where to learn more

Learn more about data protection, SSD, flash, data infrastructure and related topics via the following links:

Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its Whats Inside of Data Centers
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
Server StorageIO.tv (various videos and podcasts, fun and for work)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Fast applications need fast and resilient data infrastructures that include server, storage, I/O networking along with data protection. Performance depends on availability along with durability. Likewise, availability and accessibility depend on performance, they go hand in hand. Join me and others from Pure Storage as well as Veeam for this conversational discussion about How to Achieve Flexible Data Protection and Availability with All-Flash Storage Solutions.

Ok, nuff said, for now.

December 7, 2017April 27, 2025

November 2017 Server StorageIO Data Infrastructure Update Newsletter

Volume 17, Issue 11 (November 2017)

Hello and welcome to the November 2017 issue of the Server StorageIO update newsletter.

2017 has a few more weeks left which look to be busy with end of year, holidays and other activities. Like the rest of 2017, November saw a lot of activity in and around the industry, setting up 2018 as yet another sequel to the busiest and most exciting year ever.

This is also the time of year when predictions for the following year (e.g. 2018) start to roll out, some of which are variations from those of the past or perennial favorites (e.g. the year of flash, the year of cloud, the year of software defined, the year of <insert_your_favorite_item_here>). Look for predictions and perspectives in future posts and newsletters.

Having been a busy month, let’s get to the content…

Recommended Reading
Various Events and Webinars
Industry Resources and Links

Enjoy this edition of the Server StorageIO data infrastructure update newsletter.

Cheers GS

Data Infrastructure and IT Industry Activity Trends

Some recent Industry Activities, Trends, News and Announcements include:

On the heels of completing its acquisition of Brocade, Broadcom announced relocating its headquarters from Singapore to the US, along with an over $100 Billion USD acquisition offer for Qualcomm (here is an interesting perspective on the role Apple might play). Note that previously Avago (who bought LSI) also bought Broadcom and then changed its name to the more well-known entity. Broadcom has been focused more on server, storage, I/O and general networking technology, while Qualcomm focuses on mobile including phones and related items. Note that Qualcomm has previously made a $38.5 Billion USD offer for NXP Semiconductors that is awaiting regulatory approval. View recent Broadcom financial results here.

Also in November server storage I/O controller chip maker Marvell (not to be confused with entertainment provider Marvel) announced a merger with Cavium who had previously acquired Qlogic among others. The resulting combined entity to be called Marvell will have an estimated $16 Billion USD revenue stream focused on server, storage, I/O and networking technologies among others.

In other merger and acquisition activity, VMware announced acquisition of VeloCloud for software defined wide area networking (SD-WAN).

With Super Compute 2017 (SC17) in November there were several announcements including from ATTO, DDN, Enmotus and Micron, Everspin, along with many others. By the way, in case you missed it, at the end of October Microsoft and Cray announced a partnership to bring Super Compute capabilities to Azure clouds. Speaking of Microsoft, there was also an announcement of adding VMware running on top of Azure (granted without VMware support), similar in concept to VMware on AWS (read here).

Also at the end of November was AWS re:Invent with many announcements (more on those in a follow-up newsletter and posts). Prior to re:Invent AWS announced several server, storage and other data infrastructure security enhancements including for S3. Highlights from AWS re:Invent include Fargate (serverless aka containers at scale without managing infrastructure), Elastic Container Service for Kubernetes (EKS), Greengrass (machine learning [ML] data infrastructure), along with many others.

Fargate is for those who want to leverage serverless microservices containers without having to devote DevOps and related activity to the care and feeding of its data infrastructure. In other words, Fargate is for those who want to focus maximum effort on the business applications, vs. the business of setting up and maintaining the data infrastructure for serverless. On the other hand, AWS also announced EKS for those who want or need to customize their serverless data infrastructure including around Kubernetes among others.

In other industry activity, Taiwan based Foxconn, who manufactures technology for the who’s who of the industry, announced progress towards their future Wisconsin based factory complex.

Over at HPE, the big news announcement is that CEO Meg Whitman is stepping down. HPE also announced new AMD powered Gen 10 ProLiant servers, as well as multi-cloud management solutions. HPE also announced new partnerships with DDN for HPC and SC, with Rackspace for selling private cloud services, along with a Cloudian EMEA partnership among others.

OwnBackup announced a new version of their data protection software, while low-cost budget bulk storage service Backblaze (B2) announced their most recent quarterly drive failure (or success) reliability reports. Meanwhile over at Quantum they released former CEO Jon Gacek and rotated in new management.

Red Hat announced Ceph Storage 3 including CephFS (POSIX compatible file system), iSCSI gateway including support for VMware and Windows that lack native Ceph drivers, daemon deployment in Linux containers for smaller hardware footprint. Also included are enhanced monitoring, troubleshooting and diagnostics to streamline deployment and ongoing management. Red Hat also announced OpenShift version 3.7 for containers.

SANblaze announced NVMf and dual port NVMe capabilities for NVMe fabrics, while Linbit won a European grant to build out a software defined storage cloud scale out solution.

I often get asked who are the hot, new, trendy or other vendors and services to keep an eye on, some of which I have mentioned in previous newsletters, as well as posts such as here and here. Moving into 2018 some to keep an eye on (not all are new or trendy, yet they can enable you to be productive, or differentiate) include the following.

AWS, Bluemedora, Chelsio, Cloudian, CloudPassage, Compuverde, Databricks, Datadog, Datos, Enmotus, Everspin, Excelero, Fluree (Blockchain database), Google, Mellanox, Microsemi, Microsoft, Marvell and Cavium, MyWorkDrive, Red Hat, Rook, Rozo, Rubrik, Strongbox, Storone, Turbonomic, Ubuntu, Veeam, Velostrata, Virtuozzo, VMware, WekaIO and others.

What the above means is that it has been a busy month as well as year, and the year is not over yet. There are still plenty of shopping days left both for Christmas and the holidays, as well as for IT year-end spending, vendors looking to do acquisitions, or other last-minute projects. Speaking of which, drop me a note if you have any end of year, or new year projects Server StorageIO can assist you with.

Check out other industry news, comments, trends perspectives here.

Server StorageIO Commentary in the news, tips and articles

Recent Server StorageIO industry trends perspectives commentary in the news.

Via HPE Insights: Comments on Public cloud versus on-prem storage
Via DataCenterKnowledge: Data Center Standards: Where’s the Value?
Via arsTechnica: Comments on cloud backup disaster recovery

View more Server, Storage and I/O trends and perspectives comments here

Server StorageIOblog Data Infrastructure Posts

Recent and popular Server StorageIOblog posts include:

IT transformation Serverless Life Beyond DevOps Podcast
In this Server StorageIO podcast episode New York Times CTO / CIO Nick Rockwell (@nicksrockwell) joins me for a conversation discussing Digital, Business and IT transformation, Serverless Life Beyond DevOps and related topics. Read more here.

Data Protection Diaries Fundamental Topics Tools Techniques Technologies Tips
This is a multi-part series on Data Protection fundamental tools topics techniques terms technologies trends tradecraft tips as a follow-up to my Data Protection Diaries series, as well as a companion to my new book Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Server Storage I/O Fundamental tradecraft (CRC Press 2017).
Posts in this series include:
- Part 1 – Data Infrastructure Data Protection Fundamentals
- Part 2 – Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals
- Part 3 – Access Availability RAID Erasure Codes (EC) including LRC
- Part 4 – Data Protection Recovery Points (Archive, Backup, Snapshots, Versions)
- Part 5 – Point In Time Data Protection Granularity Points of Interest
- Part 6 – Data Protection Security Logical Physical Software Defined
- Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends
- Part 8 – Data Protection Diaries Walking Data Protection Talk
- Part 9 – Who’s Doing What (Toolbox Technology Tools)
- Part 10 – Data Protection Resources Where to Learn More
Read more here.

HPE Announces AMD Powered Gen 10 ProLiant DL385 For Software Defined Workloads
HPE announced a new AMD EPYC 7000 Powered Gen 10 ProLiant DL385 for Software Defined Workloads including server virtualization, software-defined data center (SDDC), software-defined data infrastructure (SDDI), software-defined storage among others. These new servers are part of a broader Gen10 HPE portfolio of ProLiant DL systems. Read more here.

AWS Announces New S3 Cloud Storage Security Encryption Features
Amazon Web Services (AWS) recently announced new Simple Storage Service (S3) encryption and security enhancements including Default Encryption, Permission Checks, Cross-Region Replication ACL Overwrite, Cross-Region Replication with KMS and Detailed Inventory Report. Another recent announcement by AWS is for PrivateLink endpoints within a Virtual Private Cloud (VPC). Read more here.

Server StorageIO Recommended Reading (Watching and Listening) List

In addition to my own books including Software Defined Data Infrastructure Essentials (CRC Press 2017), the following are Server StorageIO data infrastructure recommended reading, watching and listening list items. The list includes various IT, Data Infrastructure and related topics. Speaking of my books, Didier Van Hoye (@WorkingHardInIt) has a good review over on his site you can view here, also check out the rest of his great content while there.

Intel Recommended Reading List (IRRL) for developers is a good resource to check out.

For those who are into Linux, container and hypervisor performance along with internals, including cloud based, check out Brendan Gregg's site. He has a lot of great material, including some recent interesting posts ranging from dealing with workplace jerks to what's inside the new AWS EC2 KVM based hypervisor (a switch from Xen), among others.

Here is a post by New York Times CIO/CTO Nick Rockwell, The (Futile) Resistance to Serverless. Also check out my podcast discussion with Nick here.

Over at Next Platform they have some interesting perspectives on Intel’s next Exascale architecture worth spending a few minutes to read.

Watch for more items to be added to the recommended reading list book shelf soon.

Events and Activities

Recent and upcoming event activities.

Nov. 9, 2017 – Webinar – All You Need To Know about ROBO Data Protection Backup
Nov. 2, 2017 – Webinar – Modern Data Protection for Hyper-Convergence

See more webinars and activities on the Server StorageIO Events page here.

Server StorageIO Industry Resources and Links

Useful links and pages:
Data Infrastructure Recommended Reading and watching list
Microsoft TechNet – Various Microsoft related from Azure to Docker to Windows
storageio.com/links – Various industry links (over 1,000 with more to be added soon)
objectstoragecenter.com – Cloud and object storage topics, tips and news items
OpenStack.org – Various OpenStack related items
storageio.com/downloads – Various presentations and other download material
storageio.com/protect – Various data protection items and topics
thenvmeplace.com – Focus on NVMe trends and technologies
thessdplace.com – NVM and Solid State Disk topics, tips and techniques
storageio.com/converge – Various CI, HCI and related SDS topics
storageio.com/performance – Various server, storage and I/O benchmark and tools
VMware Technical Network – Various VMware related items

Connect and Converse With Us

Ok, nuff said, for now.

November 30, 2017 | December 29, 2025

IT transformation Serverless Life Beyond DevOps with New York Times CTO Nick Rockwell Podcast

server storage I/O data infrastructure trends

By Greg Schulz – www.storageioblog.com November 30, 2017

In this Server StorageIO podcast episode New York Times CTO / CIO Nick Rockwell (@nicksrockwell) joins me for a conversation discussing Digital, Business and IT transformation, Serverless Life Beyond DevOps and related topics.

In our conversation we discuss challenges with metrics, understanding value vs. cost (particularly for software), Nick's perspective as both CIO and CTO of the New York Times, and the importance of IT being involved in and understanding the business vs. just being technology focused. We also discuss the bigger, broader opportunity of serverless (aka micro services, containers) life beyond DevOps, and how higher level business logic developers can benefit from the technology instead of just a DevOps for infrastructure focus. Buzzwords, buzz terms and themes include datacenter technologies, NY Times, data infrastructure, management, trends, metrics, digital transformation, tradecraft skills, DevOps and serverless, among others.

Check out Nick's post The Futile Resistance to Serverless here. Listen to the podcast discussion here (MP3 16 minutes and 50 seconds) as well as on iTunes here.

Where to learn more

Learn more about serverless, DevOps and related data infrastructure topics via the following links:

The Futile Resistance to Serverless (via Nick Rockwell blog)
Nick Rockwell New York Times profile, Nick on Twitter (@NickRocksWell)
Data Infrastructure Server Storage I/O related Tradecraft Overview
Did you want a side of SLBS (server less BS) with your software or hardware FUD?
Software Defined Data Infrastructure Essentials (CRC Press 2017)
Server StorageIO.tv (various videos and podcasts, fun and for work)

What this all means and wrap-up

Check out my discussion here (MP3) with Nick Rockwell as we discuss IT and business transition, metrics, software development, and serverless life beyond DevOps. Also available on

Ok, nuff said, for now…

Cheers
GS

November 26, 2017 | November 26, 2023

Data Protection Diaries Fundamental Point In Time Granularity Points of Interest

Data Protection Diaries Fundamental Point In Time Granularity

Companion to Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017)

By Greg Schulz – www.storageioblog.com November 26, 2017

This is Part 5 of a multi-part series on Data Protection fundamental tools topics techniques terms technologies trends tradecraft tips as a follow-up to my Data Protection Diaries series, as well as a companion to my new book Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Server Storage I/O Fundamental tradecraft (CRC Press 2017).

Click here to view the previous post Part 4 Data Protection Recovery Points (Archive, Backup, Snapshots, Versions), and click here to view the next post Part 6 Data Protection Security Logical Physical Software Defined.

Posts in the series include excerpts from Software Defined Data Infrastructure (SDDI) pertaining to data protection for legacy along with software defined data centers (SDDC), data infrastructures in general along with related topics. In addition to excerpts, the posts also contain links to articles, tips, posts, videos, webinars, events and other companion material. Note that figure numbers in this series are those from the SDDI book and not in the order that they appear in the posts.

In this post the focus is around Data Protection points of granularity, addressing different layers and stack altitude (higher application and lower system level), drawing on Chapter 10 among others.

Point-in-Time Protection Granularity Points of Interest

Figure 10.1 Recovery and consistency points

Figure 10.1 above is a refresh from previous posts about the role and importance of having various recovery points at different time intervals to enable data protection (and restoration). Building upon figure 10.1, figure 10.5 looks at different granularity of where and how data should be protected. Keep in mind that everything is not the same, so why treat everything the same with the same type of protection?
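To make the recovery point idea a bit more concrete, here is a minimal Python sketch (my own illustration, not from the book) that picks the most recent recovery point at or before a given time, and reports how much time separates that point from a failure. The function names and the sample timestamps in the usage note are assumptions for illustration only.

```python
from bisect import bisect_right
from datetime import datetime


def latest_recovery_point(points, target):
    """Return the newest recovery point taken at or before target, else None."""
    ordered = sorted(points)
    idx = bisect_right(ordered, target)
    return ordered[idx - 1] if idx else None


def data_loss_window(points, failure_time):
    """Time between the failure and the recovery point you would restore from."""
    point = latest_recovery_point(points, failure_time)
    return None if point is None else failure_time - point
```

For example, with recovery points every six hours, calling data_loss_window(points, datetime(2017, 11, 26, 9, 30)) shows how much work done after the last point would need to be redone or replayed from logs. More frequent points shrink that window, at the cost of more copies to manage.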

Figure 10.5 shows backup and Data Protection focus, granularity, and coverage. For example, at the top left is less frequent protection of the operating system, hypervisors, and BIOS, UEFI settings. At the middle left is volume, or device level protection (full, incremental, differential), along with various views on the right ranging from protecting everything, to different granularity such as file system, database, database logs and journals, and operating system (OS) and application software, along with settings.

Figure 10.5 Backup and data protection focus, granularity, and coverage

In Figure 10.5, note that the different recovery point focus and granularity also take into consideration application and data consistency (as well as checkpoints), along with different frequencies and coverage (e.g. full, partial, incremental, incremental forever, differential) as well as retention.
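The difference between full, incremental and differential coverage can be sketched in a few lines of Python. This is a simplified illustration that assumes selection is based only on modification times. Real tools may also use change block tracking, archive bits or journals, and the function and parameter names here are hypothetical.

```python
def select_for_backup(files, mode, last_full, last_backup):
    """Pick which files a backup run would copy.

    files       -- dict mapping path to modification time (any comparable value)
    mode        -- "full", "differential" or "incremental"
    last_full   -- time of the last full backup
    last_backup -- time of the last backup of any kind
    """
    if mode == "full":
        return sorted(files)  # everything, regardless of change time
    if mode == "differential":
        cutoff = last_full  # all changes since the last full
    elif mode == "incremental":
        cutoff = last_backup  # only changes since the most recent backup
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return sorted(path for path, mtime in files.items() if mtime > cutoff)
```

The trade-off shows up at restore time: a differential needs the last full plus one differential, while incrementals need the last full plus every incremental since.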

Tip – Some context is needed about object backup and backing up objects, which can mean different things. As mentioned elsewhere, objects refer to many different things, including cloud and object storage buckets, containers, blobs, and objects accessed via S3 or Swift, among other APIs. There are also database objects and entities, which are different from cloud or object storage objects.

Another context factor is that an object backup can refer to protecting different systems, servers, storage devices, volumes, and entities that collectively comprise an application such as accounting, payroll, or engineering, vs. focusing on the individual components. An object backup may, in fact, be a collection of individual backups, PIT copies, and snapshots that combined represent what’s needed to restore an application or system.

On the other hand, the content of a cloud or object storage repository (buckets, containers, blobs, objects, and metadata) can be backed up, as well as serve as a destination target for protection.
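The notion of an object backup as a collection of individual backups, PIT copies and snapshots can be pictured as a simple manifest. The following Python sketch is only an illustration with hypothetical class and field names. It checks whether the copies on hand cover every component needed to restore an application.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectionCopy:
    component: str  # e.g. "database", "app-server-volume" (hypothetical names)
    kind: str       # "backup", "snapshot" or "pit-copy"
    taken_at: str   # ISO 8601 timestamp of the copy


@dataclass
class ApplicationBackup:
    application: str
    copies: list = field(default_factory=list)

    def add(self, copy):
        self.copies.append(copy)

    def missing(self, required_components):
        """Components with no protection copy in this collection."""
        covered = {copy.component for copy in self.copies}
        return sorted(set(required_components) - covered)

    def restorable(self, required_components):
        """True when every required component has at least one copy."""
        return not self.missing(required_components)
```

A manifest like this makes gaps visible, for example a payroll application whose database is backed up but whose configuration settings are not.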

Backups can be cold and off-line like archives, as well as on-line and accessible. However, the difference between the two, besides intended use and scope, is granularity. Archives are intended to be coarser and less frequently accessed, while backups can be accessed more frequently and at finer granularity. Can you use a backup for an archive and vice versa? A qualified yes, as an archive could be a master gold copy such as an annual protection copy, in addition to functioning in its role as a compliance and retention copy. Likewise, a full backup set to long-term retention can provide and enable some archive functions.

Where To Learn More

Continue reading additional posts in this series of Data Infrastructure Data Protection fundamentals and companion to Software Defined Data Infrastructure Essentials (CRC Press 2017) book, as well as the following links covering technology, trends, tools, techniques, tradecraft and tips.

Part 1 – Data Infrastructure Data Protection Fundamentals
Part 2 – Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals
Part 3 – Data Protection Access Availability RAID Erasure Codes (EC) including LRC
Part 4 – Data Protection Recovery Points (Archive, Backup, Snapshots, Versions)
Part 5 – Point In Time Data Protection Granularity Points of Interest
Part 6 – Data Protection Security Logical Physical Software Defined
Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends
Part 8 – Data Protection Diaries Walking Data Protection Talk
Part 9 – Who’s Doing What (Toolbox Technology Tools)
Part 10 – Data Protection Resources Where to Learn More
Data Protection Diaries series
Data Infrastructure server storage I/O network Recommended Reading List Book Shelf
Software Defined Data Infrastructure Essentials (CRC 2017) Book

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

A common theme in this series as well as in my books, webinars, seminars and general approach to data infrastructures, data centers and IT in general is that everything is not the same, so why treat it all the same? What this means is that there are differences across various environments, data centers, data infrastructures, applications, workloads and data. There are also different threat risk scenarios (e.g. threat vectors and attack surface if you like vendor industry talk) to protect against.

Rethinking and modernizing data protection means using new (and old) tools in new ways, stepping back and rethinking what to protect, when, where, why, how, with what. This also means protecting in different ways at various granularity, time intervals, as well as multiple layers or altitude (higher up the application stack, or lower level).

Get your copy of Software Defined Data Infrastructure Essentials here at Amazon.com, at CRC Press among other locations and learn more here. Meanwhile, continue reading with the next post in this series, Part 6 Data Protection Security Logical Physical Software Defined.

Ok, nuff said, for now.

November 26, 2017 | November 3, 2024

Data Infrastructure Data Protection Diaries Fundamental Security Logical Physical

Data Infrastructure Data Protection Security Logical Physical

Companion to Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017)

By Greg Schulz – www.storageioblog.com November 26, 2017

This is Part 6 of a multi-part series on Data Protection fundamental tools topics techniques terms technologies trends tradecraft tips as a follow-up to my Data Protection Diaries series, as well as a companion to my new book Software Defined Data Infrastructure Essentials – Cloud, Converged, Virtual Server Storage I/O Fundamental tradecraft (CRC Press 2017).

Click here to view the previous post Part 5 – Point In Time Data Protection Granularity Points of Interest, and click here to view the next post Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends.

In this post the focus is around Data Infrastructure and Data Protection security, including logical as well as physical, from Chapters 10, 13 and 14 among others.

Figure 1.5 Data Infrastructures and other IT Infrastructure Layers

There are many different aspects of security pertaining to data infrastructures that span various technology domains or focus areas: from higher level application software to lower level hardware, from legacy to cloud and software-defined, from servers to storage and I/O networking, logical and physical, and from access control to intrusion detection, monitoring, analytics, audit, telemetry logs, encryption and digital forensics, among many others. Security should not be an afterthought or something done independently of other data infrastructure, data center and IT functions; rather, it should be integrated.

Security Logical Physical Software Defined

Physical security includes locked doors of facilities, rooms, cabinets or devices to prevent unauthorized access. In addition to locked doors, physical security also includes safeguards to prevent accidental or intentional acts that would compromise the contents of a data center, including data infrastructure resources (servers, storage, I/O networks, hardware, software, services) along with the applications that they support.

Logical security includes access controls, passwords, event and access logs, and encryption, among other technologies, tools and techniques. Figure 10.11 shows various data infrastructure security–related items from cloud to virtual, hardware and software, as well as network services. Also shown are mobile and edge devices as well as network connectivity between on-premises and remote cloud services. Cloud services include public, private, as well as hybrid and virtual private clouds (VPC) along with virtual private networks (VPN). Access logs for telemetry are also used to track who has accessed what and when, including both successful and failed attempts.
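As a small illustration of using access logs to track who accessed what and when, here is a Python sketch that flags users with repeated failed attempts. The event tuple layout and the threshold value are assumptions for this example; real log formats vary by platform.

```python
from collections import Counter


def failed_attempts(events, threshold=3):
    """Return users whose failed access attempts meet or exceed threshold.

    events -- iterable of (user, resource, timestamp, success) tuples,
              an assumed layout for illustration only
    """
    failures = Counter(user for user, _resource, _when, ok in events if not ok)
    return {user: count for user, count in failures.items() if count >= threshold}
```

In practice the same log would also be used to review successful access, since who did get in matters as much as who did not.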

Other logical security items include certificates (public or private), encryption, and access keys, including .pem and RSA files obtained via a service provider or self-generated with a tool such as PuTTY or ssh-keygen, among many others. Some additional terms include Two Factor Authentication (2FA); subordinated, role based and delegated management; Single Sign On (SSO); Shared Access Signature (SAS), which is used by Microsoft Azure for access control; and Server Side Encryption (SSE) with various Key Management System (KMS) attributes, including keys that are customer managed or managed via a third-party.
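One common way the second factor in 2FA is implemented is a time-based one-time password (TOTP, RFC 6238), which builds on the HMAC-based one-time password algorithm (HOTP, RFC 4226). The following is a minimal Python sketch of that scheme using only the standard library. It is an illustration of the published algorithm, not the method of any particular vendor or service mentioned here, and it assumes HMAC-SHA1 and a 30 second time step.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset from the last nibble
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time step counter."""
    return hotp(secret, int(unix_time) // step, digits)
```

Both sides share the secret ahead of time; the server computes the same code for the current time step and compares it with what the user entered, usually allowing a step or two of clock drift.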

Figure 10.11 Various physical and logical security and access controls

Also shown in figure 10.11 is encryption enabled at various layers, levels or altitudes, which can range from simple to complex. Also shown are iSCSI IPsec and CHAP along with firewalls, Active Directory (AD) along with Azure AD (AAD), Domain Controllers (DC), Group Policy Objects (GPO) and Roles. Note that firewalls can exist in various locations, both in hardware appliances in the network and in software defined network (SDN) and network function virtualization (NFV) layers, as well as higher up.

For example, there are firewalls in network routers and appliances, as well as within operating systems, hypervisors, and further up in web and blog platforms such as WordPress, among many others. Likewise, further up the stack (or higher in altitude), access to applications and databases, among other resources, is also controlled via their own authentication, rights and access controls, or in conjunction with others including AD.

A term that might be new for some is attestation, which basically means to authenticate and be validated by a server or service. For example, a host guarded server (such as Microsoft Windows Server) attests with a known attestation server, which compares the Windows server to known good fingerprints and profiles, making sure it is safe to run as a guarded resource.
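The idea of comparing a host against known good fingerprints can be sketched as follows. This is a deliberately simplified Python illustration of the concept, not how Microsoft's attestation service actually works. The measurement names, the fingerprint format and the function names are all assumptions made for the example.

```python
import hashlib
import hmac


def fingerprint(measurements):
    """Derive a SHA-256 fingerprint from a dict of host measurements.

    measurements -- dict of name -> value strings, e.g. boot settings or
                    component versions (hypothetical inputs)
    """
    digest = hashlib.sha256()
    for name in sorted(measurements):  # sort so key order does not matter
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(measurements[name].encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def attest(host_measurements, known_good):
    """True if the host fingerprint matches any known good fingerprint."""
    candidate = fingerprint(host_measurements)
    return any(hmac.compare_digest(candidate, good) for good in known_good)
```

Any change to a measured value, such as disabling secure boot, produces a different fingerprint, so the host would no longer attest until the known good list is deliberately updated.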

Other security concerns for legacy and software defined environments include secure boot, shielded VMs, host guarded servers and fabrics (networks or clusters of servers) for on-premises, as well as cloud. The following image via Microsoft shows an example of shielded VMs in a Windows Server 2016 environment along with Host Guardian Service (HGS) components (see how to deploy here).

Via Microsoft.com Guarded Hosts, Shielded VMs and Key Protection Services

Encryption can be applied to data in flight or in transit over networks (local and remote), as well as to data at rest while stored. Strength of encryption is determined by the hash and cipher algorithms used, including SHA among others, ranging from simple to more complex. The encryption can be done by networks, servers, storage systems, hypervisors, operating systems, databases, email, word and many other tools, at granularity from device, file system, folder, file, database, table, object or blob.

Virtual machines and their virtual disks (VHDX and VMDK) can be encrypted, as well as migrations or movements such as vMotions, among other activities. Here are some VMware vSphere encryption topics, along with deep dive previews from VMworld 2016 among other resources here, VMware hardening guides here (NSX, vSphere), and a VMware security white paper (PDF) here.

Other security-related items shown in Figure 10.11 include Lightweight Directory Access Protocol (LDAP), Remote Authentication Dial-In User Service (RADIUS), and Kerberos network authentication. Also shown are VPN along with Secure Sockets Layer (SSL) network security, along with security and authentication keys, and credentials for SSH remote access including SSO. The cloud shown in figure 10.11 could be your own private cloud using Azure Stack, VMware (on-site, or public cloud such as IBM or AWS) or OpenStack among others, or a public cloud such as AWS, Azure or Google (among others).

Where To Learn More

Part 1 – Data Infrastructure Data Protection Fundamentals
Part 2 – Reliability, Availability, Serviceability (RAS) Data Protection Fundamentals
Part 3 – Data Protection Access Availability RAID Erasure Codes (EC) including LRC
Part 4 – Data Protection Recovery Points (Archive, Backup, Snapshots, Versions)
Part 5 – Point In Time Data Protection Granularity Points of Interest
Part 6 – Data Protection Security Logical Physical Software Defined
Part 7 – Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends
Part 8 – Data Protection Diaries Walking Data Protection Talk
Part 9 – Who’s Doing What (Toolbox Technology Tools)
Part 10 – Data Protection Resources Where to Learn More
Data Protection Diaries series
Data Infrastructure server storage I/O network Recommended Reading List Book Shelf
Software Defined Data Infrastructure Essentials (CRC 2017) Book

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

There are many different aspects, as well as layers, of security from logical to physical pertaining to data centers, applications and associated data infrastructure resources, both on-premises and cloud. Security for legacy and software defined environments needs to be integrated as part of various technology domain focus areas, as well as across them, including data protection. The above is a small sampling of security related topics, with more covered in various chapters of SDDI Essentials as well as in my other books, webinars, presentations and content.

From a data protection focus, security needs to be addressed physically (who has access to primary and protection copies, what is being protected against and where) as well as logically (who can access protection copies, along with the configuration, settings and certificates involved in data protection). In other words, how are you protecting your data protection environment, configuration and deployment? Data protection copies need to be encrypted to meet regulations, compliance and other requirements to guard against loss or theft, accidental or intentional. Likewise, access control needs to be managed, including granting of roles, security, authentication, monitoring of access, along with revocation.

Get your copy of Software Defined Data Infrastructure Essentials here at Amazon.com, at CRC Press among other locations and learn more here. Meanwhile, continue reading with the next post in this series, Part 7 Data Protection Tools, Technologies, Toolbox, Buzzword Bingo Trends.

Ok, nuff said, for now.
