ILM Archives

March 13, 2018 / November 26, 2023

Application Data Value Characteristics Everything Is Not The Same (Part I)

This is part one of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we start things off by looking at general application server storage I/O characteristics that have an impact on data value as well as access.

Everything is not the same across different organizations, including Information Technology (IT) data centers and data infrastructures, along with the applications and data they support. For example, there is so-called big data that can be many small files, objects, blobs, or data and bit streams representing telemetry, click stream analytics, and logs, among other information.

Keep in mind that applications impact how data is accessed, used, processed, moved and stored. What this means is that a focus on data value, access patterns, and other related topics also needs to consider application performance, availability, capacity, economic (PACE) attributes.

If everything is not the same, why is so much data along with many applications treated the same from a PACE perspective?

Data Infrastructure resources, including servers, storage, and networks, might be inexpensive; however, there is a cost to managing them along with the data.

Managing includes data protection (backup, restore, BC, DR, HA, security) along with other activities. Likewise, there is a cost to the software along with cloud services among others. By understanding how applications use and interact with data, smarter, more informed data management decisions can be made.

IT Applications and Data Infrastructure Layers

Keep in mind that everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them. Also keep in mind that programs (e.g. applications) = algorithms (code) + data structures (how data is defined and organized, structured or unstructured).

There are traditional applications, along with those tied to Internet of Things (IoT), Artificial Intelligence (AI) and Machine Learning (ML), Big Data and other analytics including real-time click stream, media and entertainment, security and surveillance, log and telemetry processing among many others.

What this means is that there are many different applications with various characteristics and attributes, along with resource requirements (server compute, I/O network, memory and storage) as well as service requirements.

Common Applications Characteristics

Different applications have various attributes, both in general and in how they are used, for example, database transaction activity vs. reporting or analytics, logs and journals vs. redo logs, indices, tables, import/export, scratch and temp space. Performance, availability, capacity, and economics (PACE) describe the application and data characteristics and needs shown in the following figure.

Application PACE attributes (via Software Defined Data Infrastructure Essentials)

All applications have PACE attributes, however:

PACE attributes vary by application and usage
Some applications and their data are more active than others
PACE characteristics may vary within different parts of an application

Think of an application's PACE, along with that of its associated data, as its personality: how it behaves, what it does, how it does it, and when, along with value, benefit, or cost as well as quality-of-service (QoS) attributes.

Understanding applications in different environments, including data values and associated PACE attributes, is essential for making informed server, storage, I/O, and data infrastructure decisions. Data infrastructure decisions range from configuration to acquisitions or upgrades; when, where, why, and how to protect; and how to optimize performance, including capacity planning, reporting, and troubleshooting, not to mention addressing budget concerns.

Primary PACE attributes for active and inactive applications and data are:

P – Performance and activity (how things get used)
A – Availability and durability (resiliency and data protection)
C – Capacity and space (what things use or occupy)
E – Economics and Energy (people, budgets, and other barriers)

Some applications need more performance (server compute, or storage and network I/O), while others need space capacity (storage, memory, network, or I/O connectivity). Likewise, some applications have different availability needs (data protection, durability, security, resiliency, backup, business continuity, disaster recovery) that determine the tools, technologies, and techniques to use.

Budgets are also nearly always a concern, which for some applications means enabling more performance per cost while others are focused on maximizing space capacity and protection level per cost. PACE attributes also define or influence policies for QoS (performance, availability, capacity), as well as thresholds, limits, quotas, retention, and disposition, among others.
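
One way to make PACE attributes concrete is to record them per application and compare the results. The following is a minimal sketch in Python; the `PaceProfile` class, its field names, units, and the two sample applications are hypothetical illustrations, not something defined in the book.

```python
from dataclasses import dataclass


@dataclass
class PaceProfile:
    """Hypothetical record of one application's PACE attributes."""
    name: str
    iops: float               # P: activity rate needed (I/Os per second)
    latency_ms: float         # P: target response time in milliseconds
    copies: int               # A: number of copies kept for protection
    capacity_gb: float        # C: primary space consumed
    budget_per_month: float   # E: what can be spent (any currency)

    def cost_per_gb(self) -> float:
        """Economics viewed per unit of space capacity."""
        return self.budget_per_month / self.capacity_gb

    def cost_per_iop(self) -> float:
        """Economics viewed per unit of performance."""
        return self.budget_per_month / self.iops


# Two made-up applications: same budget, very different personalities.
database = PaceProfile("orders-db", iops=20000, latency_ms=1.0,
                       copies=4, capacity_gb=500, budget_per_month=1000)
archive = PaceProfile("video-archive", iops=50, latency_ms=100.0,
                      copies=4, capacity_gb=100000, budget_per_month=1000)

for app in (database, archive):
    print(f"{app.name}: {app.cost_per_gb():.4f} per GB, "
          f"{app.cost_per_iop():.4f} per IOP")
```

With these assumed numbers the database looks expensive per GB but cheap per I/O, while the archive is the reverse, which is the "performance per cost" vs. "space capacity per cost" distinction described above.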

Performance and Activity (How Resources Get Used)

Some applications or components that comprise a larger solution will have more performance demands than others. Likewise, the performance characteristics of applications along with their associated data will also vary. Performance applies to the server, storage, and I/O networking hardware along with associated software and applications.

For servers, performance is focused on how much CPU or processor time is used, along with memory and I/O operations. I/O operations to create, read, update, or delete (CRUD) data include activity rate (frequency or data velocity) of I/O operations (IOPS). Other considerations include the volume or amount of data being moved (bandwidth, throughput, transfer), response time or latency, along with queue depths.

Activity is the amount of work to do or being done in a given amount of time (seconds, minutes, hours, days, weeks), which can be transactions, rates, or IOPS. Additional performance considerations include latency, bandwidth, throughput, response time, queues, reads or writes, gets or puts, updates, lists, directories, searches, page views, files opened, videos viewed, or downloads.

Server, storage, and I/O network performance include:

Processor CPU usage time and queues (user and system overhead)
Memory usage effectiveness including page and swap
I/O activity including between servers and storage
Errors, retransmission, retries, and rebuilds

The following figure shows a generic performance example of data being accessed (mixed reads, writes, random, sequential, big, small, low and high-latency) on a local and a remote basis. The example shows how for a given time interval (see lower right), applications are accessing and working with data via different data streams in the larger image left center. Also shown are queues and I/O handling along with end-to-end (E2E) response time.

Server I/O performance fundamentals (via Software Defined Data Infrastructure Essentials)

Also shown on the left in the above figure is an example of E2E response time from the application through the various data infrastructure layers, as well as, lower center, the response time from the server to the memory or storage devices.

Various queues are shown in the middle of the above figure. They are indicators of how much work is occurring and whether the processing is keeping up with the work or causing backlogs. Context is needed for queues, as they exist in the server, I/O networking devices, and software drivers, as well as in storage among other locations.

Some basic server, storage, I/O metrics that matter include:

Queue depth of I/Os waiting to be processed and concurrency
CPU and memory usage to process I/Os
I/O size, or how much data can be moved in a given operation
I/O activity rate or IOPS = (amount of data moved / I/O size) per unit of time
Bandwidth = data moved per unit of time = I/O size × I/O rate
Latency usually increases with larger I/O sizes, decreases with smaller requests
I/O rates usually increase with smaller I/O sizes and vice versa
Bandwidth increases with larger I/O sizes and vice versa
Sequential stream access data may have better performance than some random access data
Not all data is conducive to being sequential stream, or random
Lower response time is better, higher activity rates and bandwidth are better
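
The relationship between I/O size, I/O rate, and bandwidth in the list above can be worked through with a few lines of Python. This is a minimal sketch; the function names and the 100 MiB/s example workload are assumptions made for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def bandwidth_bytes_per_sec(io_size_bytes: int, iops: float) -> float:
    """Bandwidth = I/O size x I/O rate."""
    return io_size_bytes * iops


def iops_for_bandwidth(bytes_per_sec: float, io_size_bytes: int) -> float:
    """I/O rate = data moved per unit of time / I/O size."""
    return bytes_per_sec / io_size_bytes


# The same 100 MiB/s of data moved, expressed with small and large I/Os.
target = 100 * MIB
small_io_rate = iops_for_bandwidth(target, 4 * KIB)   # 25600.0 IOPS
large_io_rate = iops_for_bandwidth(target, 1 * MIB)   # 100.0 IOPS

print(f"4 KiB I/Os: {small_io_rate:.0f} IOPS")
print(f"1 MiB I/Os: {large_io_rate:.0f} IOPS")
```

The same bandwidth can therefore correspond to a very high or a very low activity rate, which is one reason to look at more than a single metric.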

Queues with high latency and small I/O sizes or low I/O rates could indicate a performance bottleneck. Queues with low latency and high I/O rates with good bandwidth or data being moved could be a good thing. An important note is to look at several metrics, not just IOPS or activity, or bandwidth, queues, or response time. Also, keep in mind that metrics that matter for your environment may be different from those for somebody else.
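
Queue depth, activity rate, and response time can be tied together with Little's Law from queueing theory. The law is not named in the text above and is brought in here as an assumption that the system is in a steady state: average outstanding I/Os = I/O rate × average response time. The function names and sample numbers below are hypothetical.

```python
def avg_outstanding_ios(iops: float, latency_ms: float) -> float:
    """Little's Law: average I/Os in flight = rate x response time."""
    return iops * latency_ms / 1000.0


def avg_latency_ms(queue_depth: float, iops: float) -> float:
    """Little's Law rearranged: response time = I/Os in flight / rate."""
    return queue_depth * 1000.0 / iops


# A busy but healthy device: high rate, low latency, modest queue.
healthy_queue = avg_outstanding_ios(iops=10000, latency_ms=0.5)

# A possible bottleneck: the same queue depth of 32 at two different rates.
fast = avg_latency_ms(queue_depth=32, iops=64000)
slow = avg_latency_ms(queue_depth=32, iops=1000)

print(f"healthy queue depth: {healthy_queue}")
print(f"latency at 64000 IOPS: {fast} ms, at 1000 IOPS: {slow} ms")
```

The same queue depth means very different things depending on the rate at which work completes, which is why queues need context.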

Something to keep in perspective is that there can be a large amount of data with low performance, or a small amount of data with high performance, not to mention many other variations. The important concept is that as space capacity scales, performance does not necessarily improve, or vice versa; after all, everything is not the same.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Ok, nuff said, for now.

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

March 13, 2018 / November 26, 2023

Application Data Availability 4 3 2 1 Data Protection

This is part two of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application performance, availability, capacity, economic (PACE) attributes that have an impact on data value as well as availability.

Availability (Accessibility, Durability, Consistency)

Just as there are many different aspects and focus areas for performance, there are also several facets to availability. Note that application performance requires availability, and availability relies on some level of performance.

Availability is a broad and encompassing area that includes data protection to protect, preserve, and serve (backup/restore, archive, BC, BR, DR, HA) data and applications. There are logical and physical aspects of availability including data protection as well as security including key management (manage your keys or authentication and certificates) and permissions, among other things.

Availability = accessibility (can you get to your application and data) + durability (is the data intact and consistent). This includes basic Reliability, Availability, Serviceability (RAS), as well as high availability, accessibility, and durability. “Durable” has multiple meanings, so context is important. One meaning is how data infrastructure resources hold up to, survive, and tolerate wear and tear from use (i.e., endurance), for example, Flash SSD or mechanical devices such as Hard Disk Drives (HDDs). Another context for durable refers to data, meaning how many copies exist in various places.

Server, storage, and I/O network availability topics include:

Resiliency and self-healing to tolerate failure or disruption
Hardware, software, and services configured for resiliency
Accessibility to reach or be reached for handling work
Durability and consistency of data to be available for access
Protection of data, applications, and assets including security

Additional server I/O and data infrastructure along with storage topics include:

Backup/restore, replication, snapshots, sync, and copies
Basic Reliability, Availability, Serviceability, HA, fail over, BC, BR, and DR
Alternative paths, redundant components, and associated software
Applications that are fault-tolerant, resilient, and self-healing
Nondisruptive upgrades, code (application or software) loads, and activation
Immediate data consistency and integrity vs. eventual consistency
Virus, malware, and other data corruption or loss prevention

From a data protection standpoint, the fundamental rule or guideline is 4 3 2 1, which means having at least four copies consisting of at least three versions (different points in time), at least two of which are on different systems or storage devices, and at least one of which is off-site (on-line, off-line, cloud, or other). There are many variations of the 4 3 2 1 rule shown in the following figure, along with approaches to managing it and technology to use. We will go deeper into this subject in later chapters. For now, remember the following.

4 3 2 1 data protection (via Software Defined Data Infrastructure Essentials)

4    At least four copies of data (or more). Enables durability in case a copy goes bad or is deleted or corrupted, or a device or site fails.
3    At least three versions of the data to retain (or more). Enables various recovery points in time to restore, resume, or restart from.
2    Data located on two or more systems (devices or media). Enables protection against device, system, server, file system, or other fault or failure.
1    At least one of those copies off-premises and not live (isolated from the active primary copy). Enables resiliency across sites, as well as a space, time, and distance gap for protection.
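
The four conditions above can be checked mechanically against an inventory of copies. The following is a minimal sketch; the `Copy` record, its fields, and the sample inventory are hypothetical, and the sketch models only "off-site", not whether the copy is also off-line or otherwise isolated.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a data set (all fields are illustrative)."""
    version: str      # point in time this copy represents
    system: str       # system, device, or medium holding the copy
    off_site: bool    # True if kept away from the primary site


def meets_4321(copies: list[Copy]) -> bool:
    """Return True if the copies satisfy the 4 3 2 1 guideline."""
    enough_copies = len(copies) >= 4
    enough_versions = len({c.version for c in copies}) >= 3
    enough_systems = len({c.system for c in copies}) >= 2
    has_off_site = any(c.off_site for c in copies)
    return (enough_copies and enough_versions
            and enough_systems and has_off_site)


inventory = [
    Copy("monday", "primary-array", off_site=False),
    Copy("monday", "backup-server", off_site=False),
    Copy("tuesday", "backup-server", off_site=False),
    Copy("wednesday", "cloud-bucket", off_site=True),
]

print(meets_4321(inventory))        # all four conditions hold
print(meets_4321(inventory[:3]))    # too few copies, nothing off-site
```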

Capacity and Space (What Gets Consumed and Occupied)

In addition to being available and accessible in a timely manner (performance), data (and applications) occupy space. That space includes memory in servers; data and applications also use consumable processor CPU time along with I/O (performance), including over networks.

Data and applications also consume storage space where they are stored. In addition to basic data space, there is also space consumed for metadata as well as protection copies (and overhead), application settings, logs, and other items. Another aspect of capacity includes network IP ports and addresses, software licenses, server, storage, and network bandwidth or service time.

Server, storage, and I/O network capacity topics include:

Consumable time-expiring resources (processor time, I/O, network bandwidth)
Network IP and other addresses
Physical resources of servers, storage, and I/O networking devices
Software licenses based on consumption or number of users
Primary and protection copies of data and applications
Active and standby data infrastructure resources and sites
Data footprint reduction (DFR) tools and techniques for space optimization
Policies, quotas, thresholds, limits, and capacity QoS
Application and database optimization

DFR includes various techniques, technologies, and tools to reduce the impact or overhead of protecting, preserving, and serving more data for longer periods of time. There are many different approaches to implementing a DFR strategy, since there are various applications and data.

Common DFR techniques and technologies include archiving, backup modernization, copy data management (CDM), clean up, compression, consolidation, data management, deletion and dedupe, storage tiering, RAID (including parity-based, erasure codes, local reconstruction codes [LRC], Reed-Solomon, and Ceph Shingled Erasure Code [SHEC], among others), along with protection configurations and thin provisioning.

DFR can be implemented in various complementary locations from row-level compression in database or email to normalized databases, to file systems, operating systems, appliances, and storage systems using various techniques.

Also, keep in mind that not all data is the same; some is sparse, some is dense, and some can be compressed or deduped while other data cannot. However, even when data is not compressible or dedupable, identical copies can be identified, with links created to a common copy.
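
Both points, that some data compresses well while other data does not, and that identical copies can be linked to a common copy, can be seen in a short Python sketch. It uses the standard `zlib` and `hashlib` modules; the helper names, file names, and sample data are assumptions for illustration, and real DFR tools work at a much finer grain than whole items.

```python
from __future__ import annotations

import hashlib
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


def dedupe(items: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Keep one stored copy per unique content; link each name to it."""
    store: dict[str, bytes] = {}   # digest -> the single common copy
    links: dict[str, str] = {}     # item name -> digest of its content
    for name, data in items.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)
        links[name] = digest
    return store, links


sparse = bytes(64 * 1024)        # 64 KiB of zeros: highly compressible
dense = os.urandom(64 * 1024)    # 64 KiB of random bytes: not compressible

print(f"sparse ratio: {compression_ratio(sparse):.1f}")
print(f"dense ratio:  {compression_ratio(dense):.3f}")

store, links = dedupe({"a.bin": sparse, "b.bin": sparse, "c.bin": dense})
print(f"{len(links)} items stored as {len(store)} unique copies")
```

The random data gains nothing from compression, yet the two identical items still collapse to one stored copy.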

Economics (People, Budgets, Energy and other Constraints)

If one constant in life and technology is change, then the other constant is concern about economics or costs. There is a cost to enable and maintain a data infrastructure on premises or in the cloud, which exists to protect, preserve, and serve data and information applications.

However, there should also be a benefit to having the data infrastructure to house data and support applications that provide information to users of the services. A common economic focus is what something costs, either as an up-front capital expenditure (CapEx) or as an operating expenditure (OpEx), along with recurring fees.

In general, economic considerations include:

Budgets (CapEx and OpEx), both up front and in recurring fees
Whether you buy, lease, rent, subscribe, or use free and open sources
People time needed to integrate and support even free open-source software
Costs including hardware, software, services, power, cooling, facilities, tools
People time includes base salary, benefits, training and education
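
The considerations above can be combined into a simple total-cost comparison. This is a rough sketch under stated assumptions: the `total_cost` function, the 36-month period, the hourly rate, and both sets of figures are made up for illustration and do not come from the book.

```python
def total_cost(capex: float, opex_per_month: float, months: int,
               people_hours_per_month: float, hourly_rate: float) -> float:
    """Up-front cost + recurring fees + people time over a period."""
    recurring = opex_per_month * months
    people = people_hours_per_month * hourly_rate * months
    return capex + recurring + people


# "Free" open-source software: nothing to buy, more people time to support.
open_source = total_cost(capex=0, opex_per_month=0, months=36,
                         people_hours_per_month=40, hourly_rate=50)

# Purchased product: up-front and recurring fees, less people time.
purchased = total_cost(capex=30000, opex_per_month=500, months=36,
                       people_hours_per_month=10, hourly_rate=50)

print(f"open source: {open_source}, purchased: {purchased}")
```

With these assumed numbers the free option is not the cheapest; different numbers would of course give a different answer.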

What this all means and wrap-up

Keep in mind that with Application Data Value Characteristics, Everything Is Not The Same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. All applications have some element of performance, availability, capacity, economic (PACE) needs as well as resource demands. With data storage there is often a focus on storage efficiency and utilization, which is where data footprint reduction (DFR) techniques, tools, trends and technologies address capacity requirements. However, with data storage there is also an expanding focus on storage effectiveness, also known as productivity, tied to performance, along with availability including 4 3 2 1 data protection. Continue reading the next post (Part III Application Data Characteristics Types Everything Is Not The Same) in this series.

Ok, nuff said, for now.

March 13, 2018 / November 26, 2023

Application Data Characteristics Types Everything Is Not The Same

This is part three of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on different types of data. There is more to data than simply being big data, fast data, big fast or unstructured, structured or semistructured, some of which has been touched on in this series, with more to follow. Note that there is also data in terms of the programs, applications, code, rules, and policies, as well as configuration settings and metadata, along with other items stored.

Various Types of Data

Data types along with characteristics include big data, little data, fast data, and old as well as new data with different value, life cycle, volume and velocity. There are data in files and objects, some of them big, representing images, figures, text, binary, structured or unstructured data that are software defined by the applications that create, modify and use them.

There are many different types of data and applications to meet various business, organization, or functional needs. Keep in mind that applications are based on programs which consist of algorithms and data structures that define the data, how to use it, as well as how and when to store it. Those data structures define data that will get transformed into information by programs while also being stored in memory and on data storage in various formats.

Just as various applications have different algorithms, they also have different types of data. Even though everything is not the same in all environments, or even in how the same applications get used across various organizations, there are some similarities and general characteristics. Keep in mind that information is the result of programs (applications and their algorithms) that process data into something useful or of value.

Data typically has a basic life cycle of:

Creation and some activity, including being protected
Dormant, followed by either continued activity or going inactive
Disposition (delete or remove)

In general, data can be:

Temporary, ephemeral or transient
Dynamic or changing (“hot data”)
Active static on-line, near-line, or off-line (“warm data”)
Inactive static on-line or off-line (“cold data”)
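
A simple way to approximate the hot, warm, and cold labels above is to look at how recently data was last accessed. The following is a minimal sketch; the `temperature` function and the 7-day and 90-day thresholds are assumptions for illustration, and real policies would also consider whether the data is changing and where it is stored.

```python
from datetime import datetime, timedelta


def temperature(last_access: datetime, now: datetime,
                hot_days: int = 7, warm_days: int = 90) -> str:
    """Label data by the time since it was last accessed."""
    age = now - last_access
    if age <= timedelta(days=hot_days):
        return "hot"
    if age <= timedelta(days=warm_days):
        return "warm"
    return "cold"


now = datetime(2018, 3, 13)
print(temperature(datetime(2018, 3, 12), now))   # accessed 1 day ago
print(temperature(datetime(2018, 1, 15), now))   # accessed 57 days ago
print(temperature(datetime(2016, 6, 1), now))    # accessed long ago
```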

Data is organized as:

Structured
Semi-structured
Unstructured

General data characteristics include:

Value = From no value to unknown to some or high value
Volume = Amount of data, files, objects of a given size
Variety = Various types of data (small, big, fast, structured, unstructured)
Velocity = Data streams, flows, rates, load, process, access, active or static

The following figure shows how different data has various values over time. Data that has no value today or in the future can be deleted, while data with unknown value can be retained.

Data value known, unknown, and no value: different data with various values over time

General characteristics include the value of the data, which in turn determines its performance, availability, capacity, and economic considerations. Also, data can be ephemeral (temporary) or kept for longer periods of time on persistent, non-volatile storage (you do not lose the data when power is turned off). Examples of temporary data include work and scratch areas, such as where data gets imported into, or exported out of, an application or database.

Data can also be little, big, or big and fast, terms which describe in part the size as well as volume along with the speed or velocity of being created, accessed, and processed. The importance of understanding characteristics of data and how their associated applications use them is to enable effective decision-making about performance, availability, capacity, and economics of data infrastructure resources.

Data Value

There is more to data storage than how much space capacity per cost.

All data has one of three basic values:

No value = ephemeral/temp/scratch = Why keep it?
Some value = current or emerging future value, which can be low or high = Keep
Unknown value = protect until value is unlocked, or no remaining value

In addition to the above basic three, data with some value can also be further subdivided into little value, some value, or high value. Of course, you can keep subdividing into as many more or different categories as needed, after all, everything is not always the same across environments.

Besides data having some value, that value can also change by increasing or decreasing over time, or even by going from unknown to a known value, from known to unknown, or to no value. Data with no value can be discarded. If in doubt, make and keep a copy of that data somewhere safe until its value (or lack of value) is fully known and understood.
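
The three basic values and the "if in doubt, keep a copy" guidance can be written down as a small decision function. This is a sketch only; the `DataValue` enum, the `disposition` function, and the action strings are hypothetical names chosen for illustration.

```python
from enum import Enum


class DataValue(Enum):
    NONE = "no value"
    SOME = "some value"
    UNKNOWN = "unknown value"


def disposition(value: DataValue, in_doubt: bool = False) -> str:
    """Map a data value to a basic retention action."""
    if value is DataValue.NONE and not in_doubt:
        return "discard"
    if value is DataValue.SOME:
        return "keep"
    # Unknown value, or supposedly no value but still in doubt.
    return "protect until value is known"


print(disposition(DataValue.NONE))
print(disposition(DataValue.SOME))
print(disposition(DataValue.UNKNOWN))
print(disposition(DataValue.NONE, in_doubt=True))
```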

The importance of understanding the value of data is to enable effective decision-making on where and how to protect, preserve, and cost-effectively store the data. Note that cost-effective does not necessarily mean the cheapest or lowest-cost approach, rather it means the way that aligns with the value and importance of the data at a given point in time.

What this all means and wrap-up

Data has different value at various times, and that value is also evolving. Everything Is Not The Same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part IV Application Data Volume Velocity Variety Everything Not The Same) in this series.

Ok, nuff said, for now.

March 13, 2018 / November 26, 2023

Application Data Volume Velocity Variety Everything Is Not The Same

This is part four of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on data volume, velocity and variety; after all, everything is not the same, not to mention the many different aspects of big data as well as little data.

Volume of Data

More data is growing at a faster rate every day, and that data is being retained for longer periods. Some data being retained has known value, while a growing amount of data has an unknown value. Data is generated or created from many sources, including mobile devices, social networks, web-connected systems or machines, and sensors including IoT and IoD. Besides where data is created from, there are also many consumers of data (applications) that range from legacy to mobile, cloud, IoT among others.

Unknown-value data may eventually have value in the future when somebody realizes that they can do something with it, or when a technology tool or application becomes available to transform the data with unknown value into valuable information.

Some data gets retained in its native or raw form, while other data gets processed by application program algorithms into summary data, or is curated and aggregated with other data to be transformed into new useful data. The figure below shows, from left to right and front to back, more data being created, and that data also getting larger over time. For example, on the left are two data items, objects, files, or blocks representing some information.

In the center of the following figure are more columns and rows of data, with each of those data items also becoming larger. Moving farther to the right, there are yet more data items stacked up higher, as well as across and farther back, with those items also being larger. The following figure can represent blocks of storage, files in a file system, rows, and columns in a database or key-value repository, or objects in a cloud or object storage system.

Increasing data velocity and volume, more data and data getting larger

In addition to more data being created, some of that data is relatively small in terms of the records or data structure entities being stored. However, there can be a large quantity of those smaller data items. In addition to the amount of data, as well as the size of the data, protection or overhead copies of data are also kept.
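
As a rough illustration of how quantity, item size, and protection copies combine, here is a minimal sketch. The function name, the workload numbers, and the simplification that every protection copy is a full copy (no compression, deduplication, or erasure coding) are all assumptions made for this example, not figures from the book.

```python
def capacity_footprint(item_count: int, avg_item_bytes: int, protection_copies: int) -> int:
    """Return total bytes stored: primary data plus full protection copies.

    Assumption: each protection copy is a complete copy of the primary data.
    """
    if item_count < 0 or avg_item_bytes < 0 or protection_copies < 0:
        raise ValueError("inputs must be non-negative")
    primary_bytes = item_count * avg_item_bytes
    return primary_bytes * (1 + protection_copies)


# Hypothetical workloads: many small records versus fewer, larger items.
small_records = capacity_footprint(
    item_count=1_000_000_000, avg_item_bytes=512, protection_copies=3
)
large_videos = capacity_footprint(
    item_count=10_000, avg_item_bytes=5 * 10**9, protection_copies=3
)

print(f"small records: {small_records / 10**12:.3f} TB")
print(f"large videos:  {large_videos / 10**12:.3f} TB")
```

Either workload can end up consuming a lot of capacity: the first because of the sheer number of items, the second because of the size of each item, and both because of the copies kept for protection.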

Another dimension is that data is also getting larger, in that the data structures describing a piece of data for an application have increased in size. For example, a still photograph taken with a digital camera, cell phone, or another mobile handheld device, drone, or other IoT device increases in size with each new generation of cameras, as there are more megapixels.

Variety of Data

In addition to having value and volume, there are also different varieties of data, including ephemeral (temporary), persistent, primary, metadata, structured, semi-structured, unstructured, little, and big data. Keep in mind that programs, applications, tools, and utilities get stored as data, while they also use, create, access, and manage data.

There is also primary data and metadata, or data about data, as well as system data that is also sometimes referred to as metadata. Here is where context comes into play as part of tradecraft, as there can be metadata describing data being used by programs, as well as metadata about systems, applications, file systems, databases, and storage systems, among other things, including little and big data.

Context also matters regarding big data, as there are applications such as statistical analysis software and Hadoop, among others, for processing (analyzing) large amounts of data. The data being processed may not be big in terms of the individual records or data entity items, but there may be a large volume of them. In addition to big data analytics, data, and applications, there is also data that is very big (as well as large volumes or collections of data sets).

For example, video and audio, among others, may also be referred to as big fast data, or large data. A challenge with larger data items is the complexity of moving them over distance in a timely manner, as well as processing that requires new approaches, algorithms, data structures, and storage management techniques.

Likewise, the challenges with large volumes of smaller data are similar in that data needs to be moved, protected, preserved, and served cost-effectively for long periods of time. Both large and small data are stored (in memory or storage) in various types of data repositories.

In general, data in repositories is accessed locally, remotely, or via a cloud using:

Object and blob, stream, queue, and Application Programming Interface (API) access
File-based using local or networked file systems
Block-based access of disk partitions, LUNs (logical unit numbers), or volumes
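
To make the difference between these access methods more concrete, here is a small sketch using only the Python standard library. It is an analogy rather than a real storage stack: the block size, the `ObjectStore` class, the key name, and the use of an ordinary temporary file to stand in for a block device are all assumptions for illustration.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # hypothetical block size in bytes


def read_block(path: str, block_number: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Block-style access: address data by block number, not by name or meaning."""
    with open(path, "rb") as f:
        f.seek(block_number * block_size)
        return f.read(block_size)


def read_file(path: str) -> bytes:
    """File-style access: address data by path and read the byte stream."""
    with open(path, "rb") as f:
        return f.read()


class ObjectStore:
    """Object-style access: put and get whole objects by key (in-memory stand-in)."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def demo():
    """Store the same payload and read it back three different ways."""
    payload = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as f:
            f.write(payload)
        whole_file = read_file(path)       # file access by name
        second_block = read_block(path, 1) # block access by number
    store = ObjectStore()
    store.put("photos/img-0001", payload)  # object access by key
    return whole_file, second_block, store.get("photos/img-0001")
```

The same bytes are behind all three calls; what changes is how the application addresses them, which in turn influences how the underlying data infrastructure resources get used.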

The following figure shows varieties of application data value including (left) photos or images, audio, videos, and various log, event, and telemetry data, as well as (right) sparse and dense data.

Varieties of data (bits, bytes, blocks, blobs, and bitstreams)

Velocity of Data

Data, in addition to having value (known, unknown, or none), volume (size and quantity), and variety (structured, unstructured, semi-structured, primary, metadata, small, big), also has velocity. Velocity refers to how fast (or slowly) data is accessed, including being stored, retrieved, updated, or scanned, as well as whether it is active (being updated, or fixed and static) or dormant and inactive. In addition to data access and life cycle, velocity also refers to how data is used, such as randomly, sequentially, or some combination. Think of data velocity as how data, or streams of data, flow in various ways.

Velocity also describes how data is used and accessed, including:

Active (hot), static (warm and WORM), or dormant (cold)
Random or sequential, read or write-accessed
Real-time (online, synchronous) or time-delayed
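
One simple way to get a feel for the random versus sequential distinction is to look at a trace of I/O requests and count how many start exactly where the previous one ended. The sketch below assumes a hypothetical trace format of (offset, length) pairs in the order issued; real tools use richer heuristics, so treat this as an illustration only.

```python
def sequential_fraction(accesses):
    """Return the fraction of accesses that begin where the previous one ended.

    `accesses` is a list of (offset, length) tuples in the order issued.
    A result near 1.0 suggests sequential (streaming) access, while a result
    near 0.0 suggests random access.
    """
    if len(accesses) < 2:
        return 0.0
    sequential = 0
    for (prev_offset, prev_length), (offset, _length) in zip(accesses, accesses[1:]):
        if offset == prev_offset + prev_length:
            sequential += 1
    return sequential / (len(accesses) - 1)


# Hypothetical traces for illustration.
streaming_trace = [(0, 4096), (4096, 4096), (8192, 4096)]
random_trace = [(0, 4096), (40960, 4096), (8192, 4096)]

print(sequential_fraction(streaming_trace))
print(sequential_fraction(random_trace))
```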

Why this matters is that by understanding and knowing how applications use data, or how data is accessed via applications, you can make informed decisions. That insight also helps you design, configure, and manage servers, storage, and I/O resources (hardware, software, services) to meet various needs. Understanding Application Data Value, including the velocity of the data both when it is created and when it is used, is important for aligning the applicable performance techniques and technologies.

Where to learn more

Learn more about Application Data Value, application characteristics, performance, availability, capacity, economic (PACE) along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data has different value, size, and velocity as part of its characteristics, including how it is used by various applications. Keep in mind that with Application Data Value Characteristics Everything Is Not The Same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part V Application Data Access life cycle Patterns Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

March 13, 2018 | November 26, 2023

Application Data Access Lifecycle Patterns Everything Is Not The Same

Application Data Access Life cycle Patterns Everything Is Not The Same (Part V)


This is part five of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we look at various application and data lifecycle patterns as well as wrap up this series.

Active (Hot), Static (Warm and WORM), or Dormant (Cold) Data and Lifecycles

When it comes to Application Data Value, a common question I hear is why not keep all data?

If the data has value, and you have a large enough budget, why not? On the other hand, most organizations have a budget and other constraints that determine how much and what data to retain.

Another common question I get asked (or told) is, isn’t the objective to keep less data to cut costs?

If the data has no value, then get rid of it. On the other hand, if data has value or unknown value, then find ways to remove the cost of keeping more data for longer periods of time so its value can be realized.

In general, the data life cycle (called by some cradle to grave, or birth or creation to disposition) is create, save and store, then perhaps update and read, with access patterns and value changing over time. During that time, the data (which includes applications and their settings) will be protected with copies or some other technique, and eventually disposed of.

Between the time when data is created and when it is disposed of, there are many variations of what gets done and needs to be done. Considering static data for a moment, some applications and their data, or data and their applications, create data which is active for a short period, then goes dormant, then is active again briefly before going cold (see the left side of the following figure). This is a classic application, data, and information life-cycle model (ILM), with tiering or data movement and migration, that still applies for some scenarios.

Changing data access patterns for different applications

However, a newer scenario over the past several years that continues to increase is shown on the right side of the above figure. In this scenario, data is initially active for updates, then goes cold or WORM (Write Once/Read Many); however, it warms back up as a static reference, on the web, as big data, and for other uses where it is used to create new data and information.

Data, in addition to its other attributes already mentioned, can be active (hot), residing in a memory cache, buffers inside a server, or on a fast storage appliance or caching appliance. Hot data means that it is actively being used for reads or writes (this is what the term heat map pertains to in the context of the server, storage, data, and applications). The heat map shows where the hot or active data is, along with its other characteristics.

Context is important here, as there are also IT facilities heat maps, which refer to physical facilities including what servers are consuming power and generating heat. Note that some current and emerging data center infrastructure management (DCIM) tools can correlate the physical facilities power, cooling, and heat to actual work being done from an applications perspective. This correlated or converged management view enables more granular analysis and effective decision-making on how to best utilize data infrastructure resources.

In addition to being hot or active, data can be warm (not as heavily accessed) or cold (rarely if ever accessed), as well as online, near-line, or off-line. As their names imply, warm data may occasionally be used, either updated and written, or static and just being read. Some data also gets protected as WORM data using hardware or software technologies. WORM (immutable) data, not to be confused with warm data, is fixed or immutable (cannot be changed).
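
As a toy illustration of the write once, read many idea (and not of how any particular hardware or software WORM product works), the following sketch shows a store that accepts the first write for a key and rejects any later attempt to change it. The class name, key naming, and choice of exception are assumptions for this example.

```python
class WormStore:
    """Minimal in-memory write-once/read-many store (illustrative only)."""

    def __init__(self):
        self._items = {}

    def write(self, key, data):
        # The first write succeeds; any later write to the same key is refused.
        if key in self._items:
            raise PermissionError(f"{key!r} is write-once and already exists")
        self._items[key] = bytes(data)  # keep an immutable copy

    def read(self, key):
        # Reads are allowed any number of times.
        return self._items[key]


store = WormStore()
store.write("invoice-2018-001", b"original contents")
first_read = store.read("invoice-2018-001")
second_read = store.read("invoice-2018-001")

try:
    store.write("invoice-2018-001", b"tampered contents")
    overwrite_rejected = False
except PermissionError:
    overwrite_rejected = True
```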

When looking at data (or storage), it is important to see when the data was created as well as when it was modified. However, you should avoid the mistake of looking only at when it was created or modified. Instead, also look to see when it was last read, as well as how often it is read. You might find that some data has not been updated for several years, but it is still accessed several times an hour or minute. Also, keep in mind that the metadata about the actual data may be updated even while the data itself is static.
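
A minimal sketch of that check, using the modification and access times reported by the file system, might look like the following. The thresholds and helper names are assumptions for illustration. Also note that access times can be coarse or not updated at all depending on how a file system is mounted (for example with noatime or relatime on Linux), so treat them as a hint rather than as proof.

```python
import os
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class FileTimes:
    modified: float  # seconds since the epoch, as in os.stat().st_mtime
    accessed: float  # seconds since the epoch, as in os.stat().st_atime


def file_times(path):
    """Collect last-modified and last-accessed times for a file."""
    st = os.stat(path)
    return FileTimes(modified=st.st_mtime, accessed=st.st_atime)


def is_static_but_active(times, now, static_after_days=365, active_within_days=1):
    """True when data has not changed for a long time yet was read recently.

    The default thresholds are hypothetical; tune them for your environment.
    """
    not_modified_recently = now - times.modified >= static_after_days * SECONDS_PER_DAY
    read_recently = now - times.accessed <= active_within_days * SECONDS_PER_DAY
    return not_modified_recently and read_recently
```

For example, `is_static_but_active(file_times(path), time.time())` (with `import time`) would flag a file that looks old by its modification date but is still being read, which is the kind of data you would not want to move to slow or off-line storage.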

Also, look at your applications' characteristics as well as how data gets used, to see if it is conducive to caching or automated tiering based on activity, events, or time. For example, there may be a large amount of data for an energy or oil exploration project that normally sits on slower, lower-cost storage, but that now and then needs to have some analysis run on it.

Using data and storage management tools, given notice or based on activity, that large or big data could be promoted to faster storage, or applications migrated to be closer to the data to speed up processing. Another example is weekly, monthly, quarterly, or year-end processing of financial, accounting, payroll, inventory, or enterprise resource planning (ERP) schedules. By knowing how and when the applications use the data, which also means understanding the data, automated tools and policies can be used to tier or cache data to speed up processing and thereby boost productivity.
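
To show what a policy based on notice (time or events) and on activity could look like, here is a deliberately simple sketch. The tier names, thresholds, and parameters are hypothetical and are not taken from any specific tiering product.

```python
from typing import Optional


def choose_tier(
    reads_last_30_days: int,
    days_until_scheduled_run: Optional[int] = None,
    hot_threshold: int = 100,
    warm_threshold: int = 1,
    prestage_days: int = 2,
) -> str:
    """Pick a storage tier from recent activity and any known upcoming processing.

    Returns "fast", "capacity", or "archive" (hypothetical tier names).
    """
    # Given notice: promote ahead of a known job such as month-end processing.
    if days_until_scheduled_run is not None and days_until_scheduled_run <= prestage_days:
        return "fast"
    # Based on activity: heavily read data stays on (or moves to) fast storage.
    if reads_last_30_days >= hot_threshold:
        return "fast"
    # Occasionally read data sits on lower-cost capacity storage.
    if reads_last_30_days >= warm_threshold:
        return "capacity"
    # Data with no recent reads and no upcoming use is a candidate for archive.
    return "archive"


# Exploration data that is rarely read, but has an analysis run starting tomorrow.
print(choose_tier(reads_last_30_days=0, days_until_scheduled_run=1))
# The same data with nothing scheduled.
print(choose_tier(reads_last_30_days=0))
```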

All applications have performance, availability, capacity, economic (PACE) attributes, however:

PACE attributes vary by Application Data Value and usage
Some applications and their data are more active than others
PACE characteristics may vary within different parts of an application
PACE application and data characteristics along with value change over time

Read more about Application Data Value, PACE and application characteristics in Software Defined Data Infrastructure Essentials (CRC Press 2017).

Where to learn more

Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Lifecycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips, can be found in the Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Keep in mind that with Application Data Value, everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them.

Also keep in mind that there is more data being created, that the size of those data items, files, objects, entities, and records is also increasing, and that so is the speed at which they get created and accessed. The challenge is not just that there is more data, or that data is bigger, or accessed faster; it’s all of those, along with changing value as well as diverse applications, to keep in perspective. With the new General Data Protection Regulation (GDPR) going into effect May 25, 2018, now is a good time to assess and gain insight into what data you have, its value, retention as well as disposition policies.

Remember, there are different data types, value, life-cycle, volume and velocity that change over time, and with Application Data Value Everything Is Not The Same, so why treat and manage everything the same?

Ok, nuff said, for now.

February 22, 2017 | November 26, 2023

Data Infrastructure IT Industry Related Resource Links P to T

IT Data Center and Data Infrastructure Industry Resources

Updated 6/13/2018

Following are some useful Data Infrastructure IT Industry Resource Links P to T for cloud, virtual and traditional IT data infrastructure related web sites. The data infrastructure environment (servers, storage, IO and networking, hardware, software, services, virtual, container and cloud) is rapidly changing. You may encounter a missing URL, or a URL that has changed. This list is updated on a regular basis to reflect changes (additions, changes, and retirement).

Disclaimer and note: URLs submitted for inclusion on this site will be reviewed for consideration and to be in generally accepted good taste in regards to the theme of this site.

Best effort has been made to validate and verify the data infrastructure URLs that appear on this page and web site; however, they are subject to change. The author and/or maintainer(s) of this page and web site make no endorsement of and assume no responsibility for the URLs and their content that are listed on this page.

Send an email note to info at storageio dot com that includes company name, URL, contact name, title and phone number along with a brief 40 character description to be considered for addition to the above data infrastructure list, or to be removed. Note that Server StorageIO and UnlimitedIO LLC (e.g. StorageIO) do not sell, trade, barter, borrow or share your contact information per our Privacy and Disclosure policy. View related data infrastructure Server StorageIO content here, and sign up for our free newsletter here.

Links A-E
Links F-J
Links K-O
Links P-T
Links U-Z
Other Links

Packeteer.com WAFS and networking solutions (Bought Tacit)
packetlight.com CWDM and DWDM networking solutions
Panasas.com Clustered storage solution
pancetera.com Virtual machine backup software (Bought by Quantum)
Panduit.com Networking and cable management
panzura.com Cloud storage access software
paraccel.com Business and data analytics
paragon-software.com Storage management and backup tools
parallels.com VDI and desktop virtualization and cloud tools
parascale.com Clustered and cloud storage software
pcisig.com PCI trade group (PCI, PCI-X, PCI-Express/PCIe)
penguincomputing.com HPC servers, storage and hosting
pergamumsystems.com Archive solutions (Stealth)
Permabit.com Data archiving solutions
Pernixdata Server and storage I/O cache optimization for virtual servers
perotsystems.com Hosting and managed service provider (Bought by Dell)
pgp.com Security tools (Bought by Symantec)
PHDvirtual Data protection tools
Pillardata.com Data storage solutions – (Bought by Oracle)
pineapp.com Email, archive solutions, web and data protection
Piviot3.com IP Storage
Pivotal Labs Big Data, PaaS development tools, EMC/VMware spinout
plasmon.com (Now called Alliance Storage Technologies) Optical Storage Solutions
plextoramericas.com SSD and other storage solutions
plianttechnology.com Solid state storage devices (SSD) – (Bought by SANdisk)
Pluribus Networks Converged and software defined network management
pmc-serria.com Storage networking component supplier
pny.com Memory components and technology
Pogoplug Cloud storage
PolyServe.com Clustered storage solutions (Sold to HP)
Polargy Data Center facilities, HVAC and DCIM solutions
power.org Power Processor trade group
Mushkin SSD Solutions
Peak Cloud Cloud and storage services
PowerFile.com Data archiving solutions
powerware.com UPS and power conditioning systems
procedo.com Archiving and migration solutions
proceedtechnologies.com SAP consulting
profusionbackups.com Cloud and managed backup service solution
progeny.net VAR and specialized IT systems
prolexic.com Distributed denial of service tools
promise.com RAID storage systems
Prostorsystems.com Removable disk storage (See RDX Alliance)
Proxim.com Wireless networking
proximaldata.com SSD caching and tiering software
pt.com Communications hardware and software
puresi.com aka Puresilicon SSD storage solutions
purestorage.com SSD based storage
Puppet Labs IT Automation and DCIM tools for physical, Cloud and Virtual

qlogic.com Host bus adapters and switches
qsantechnology.com iSCSI IP storage
Qstart Technologies Data protection storage including LTFS based systems
Quadric Software Data protection software
qualstar.com Tape backup and archive solutions (Aka Qstar)
quantum.com Tape drives and libraries
quest.com IT and data management solution tools (Bought by Dell)
Qumulo Stealth storage startup
qwest.com (Century Link) Telephone and data networking, managed services provider
racemi.com Repurposing management tools
Rackable.com Now SGI
Rackspace.com Managed services and hosting
www.rackwise.com Data center management tools
raidinc.com Storage systems
raidundant.com Storage systems
Rainfinity.com File virtualization (Bought by EMC)
rainstor.com Big data management tools
rapidio.org RapidIO Trade Group
Raritan Data center and DCIM tools
rasilient.com Storage subsystem vendor
Ravello VMware optimization and management tools
Raxco Data, storage and systems management tools
rebit.com Backup and data protection solutions
RecordNation Digital Data Storage and Records Management
redbend.com Mobile device and application management
redbooks.ibm.com IBM Red books and Red pieces technical articles
Redhat.com Linux provider (Bought Gluster)
Reduxio Hybrid storage with data services
reflexphotonics.com Optical connectivity solutions
Reldata.com Storage systems (Renamed Starboard)
remote-backup.com Remote backup software
renewdata.com Data management and compliance tools
repliweb.com Web and content distribution
Retrospect Data Protection Software Tools
revivio.com Data Protection Software (Assets Bought by Symantec)
rightscale.com Amazon cloud computing management tools
rimage.com CD/DVD production technologies
risingtidesystems.com VAR
Ritek.com Storage solutions
rittal.com Enclosures and cabinets
riverbed.com Wide area file access acceleration solution
rjssoftware.com Document capture and management
rmsource.com Cloud backup solutions
rnanetworks.com Virtual memory management solutions (Bought by Dell)
rocketdivision.com iSCSI technologies
rorke.com VAR
rpath.com Data center automation
rsa.com Security division of EMC
safemediacorp.com Internet security and intrusion detection tools
safenet-inc.com Data protection focused VAR
Sagecloud Cloud storage, deep cold archive
samsung.com Various technologies including SSD memory
sanblaze.com Embedded storage and emulation solutions
SANbolic.com Storage, server and cloud management tools
sand-chip.com Chip design
SANDforce.com SSD storage solutions – (Bought by LSI)
sandial.com Defunct SAN startup
SANdisk.com SSD memory components
sandpiperdata.com Data migration services
sanmina-sci.com Contract manufacturer (Virtual Factory) for various OEM/VARs
sanovi.com Disaster recovery management tools
sanpulse.com SRA and automation tools
sanrad.com Storage networking routers (Bought by OCZ)
sans.org Security related web site
sansdigital.com VAR
sap.com Information management tools and applications
sas.com Statistical analysis software
sata-io.org Serial ATA trade organization
SavageIO High performance storage solutions
savvis.com Cloud, managed service provider and hosting (Bought by Centurylink)
sbbwg.org Storage Bridge Bay Working Group
scalable-systems.com Data warehouse consulting and tools
scalecomputing.com Clustered storage management software
scalemp.com Virtualization technology for scale out computing
scalent.com Virtual IT data center management tools
scality.com Email and sharepoint cloud storage
schoonerinfotech.com SSD based database management solutions
scsita.org SCSI and SAS trade group
seagate.com Disk drives
Sealpath Data and information protection tools
seanodes.com Distributed storage
sec.gov Site about compliance items including CFR 17a-4
securedatainnovations.com Data protection and security tools
sentilla.com Data center performance management tools
sepaton.com Disk based backup solutions
serialata.org Serial ATA trade association
servicemesh.com Cloud, datacenter transformation and devops tools
servicenow.com ITIL data center management tools
1servosity.com Cloud data protection
servoy.com Cloud development tools
ServPath.com Hosting services
seven10storage.com Disaster recovery and archiving software
sgi.com Storage, server and data management hardware, software, tools
sherpasoftware.com Email archiving
shop.bellmicro.com Distributor (Bought by Avnet)
siber.com Data protection and security tools
sidusdata.com Managed service and cloud provider
siemon.com Storage networking infrastructure items
sigmasol.com Value added reseller (VAR)
Signiant.com Data management tools
silexamerica.com Mobile device and server connectivity
SiliconImage.com Digital Video components
SiliconStor.com Storage networking silicon
siliconvalleypr.com IT technologies press/media and analyst relations firm
silveradotech.com VAR
silver-peak.com Wide area data and file services (WAFS, WADM, WADS)
SilverSky Cloud security
simpletech.com Storage solutions including USB portable devices
simplivity.com Convergence and virtualization solutions
simplycontinuous.net Data protection and cloud backup
siriuscom.com VAR
site-vault.com On-line backup server provider (BSP) managed service provider (MSP)
skyera.com SSD storage solutions
skytap.com Public and private cloud application development tools
Smart421 AWS connect partner, Hosting/cloud/access services
smartm.com PC card and other memory module components
smc.com Storage and networking components
smithmicro.com Mobile data management tools
smmdirect.com Memory devices
snapappliances.com NAS Storage solutions (Now Adaptec)
snia.org Storage Networking Industry Association
snseurope.com U.K. & European Storage Networking News
snwusa.com SNIA and Computerworld conference
softek.com Storage management solutions (formerly Fujitsu Softek, Sold to IBM)
softlayer.com Cloud infrastructure services (IaaS) (Bought by IBM)
softnas.com ZFS based opensource NAS solutions
softricity.com Virtualization management tools (Bought by Microsoft)
Sogeti.com Data management tools
solarflare.com 10Gb Ethernet networking
solarwinds.com IT management tools (Bought TekTools, Hyper9 and others)
solidaccess.com Solid state storage (SSD) solutions
soliddata.com Solid State Disk solutions
solidfire.com iSCSI SSD optimized for hosting and cloud providers
Solix.com Database archiving software
solutiontechnology.co.uk Storage networking training
sonasoft.com Email archiving, backup and data protection
sonnettech.com External storage solutions
sony.com Storage devices
sophos.com Data protection and security tools
sorrento.com Optical networking
sparebackup.com Backup data protection solutions
sparkweave.com Private cloud archive and file sharing
spec.org SPEC benchmarks
spectralogic.com Tape library and disk based backup solutions
spiceworks.com Online community and management software tools
spirent.com Storage networking test equipment
Spiron.com Data discovery, classification, lifecycle management (formerly Identity Finder)
Splice Communications AWS connect partner, Hosting/cloud/access services
splunk.com DCIM and log management tools
spotcloud.com Cloud services clearing house
spraycool.com IT Data center and component cooling
springsoft.com Bought by Synopsys
spsoftglobal.com Software development
spyrus.com Security tools
ssswg.org IEEE Storage Systems Standards Work Group
starboardstorage.com Unified storage solutions (Formerly Reldata, now ceased operations)
startech.com IT/AV technology equipment from enclosures to KVM and more
starwindsoftware.com iSCSI storage management solutions
stcroixsolutions.com VAR
stec-inc.com SSD storage (Bought by WD)
Steeleye.com HA software
Stellar Data Protection tools
storagetek.com Disk, tape, data management software (Bought by Sun)
stonebranch.com File transfer tools
stonefly.com Storage networking routers (Aka DNF)
storability.com Storage management software (Bought by STK)
storactive.com Data protection solutions
storagecraft.com Data protection tools
storagefusion.com Storage resource analysis (SRA) tools
storageio.net Alternate URL for the StorageIO Group
storageiogroup.com Alternate URL for the StorageIO Group
storagemadeeasy.com Hybrid and personal cloud management tools and dashboards
Storagemonkeys.com Storage community site
storagenetworking.org Storage Networking Users Groups also known as SNUGs
storageperformance.org Storage Performance Council information
www.storagesearch.com Venue for information about various storage and related topics
storcase.com Data Archive solutions (Bought by Crudata)
store-age.com Storage management software (Bought by LSI)
storediq.com eDiscovery, search, indexing, classification (Bought by IBM)
Storewize.com Real time data compression (Bought by IBM)
Storix.com Data backup solutions
storlife.com CAS object archive storage
stormagic.com Storage virtualization and data movement software
storserver.com Backup and data protection solutions
storsimple.com Cloud storage access solutions (Bought by Microsoft)
storspeed.com NAS/NFS optimization solutions (Missing in action)
stratascale.com Cloud, hosting and management solutions
stratus.com High availability storage and servers
sugarsync.com Backup and data protection solutions
sun.com Storage networking hardware and software (Bought by Oracle)
sunbeltsoftware.com End point data protection security tools
sungard.com Data protection and cloud services
superlumin.com Application caching tools
supermicro.com Server and storage solutions
surdoc.com Cloud storage and backup
surgient.com Cloud computing solutions
svlg.net Silicon Valley Leadership Group
Swiftstack Private cloud solutions
swifttest.com NFS and CIFS storage testing solutions
sybase.com Database solutions
sycamorenetworks.com Networking solutions
Symantec.com Data and storage management software
symbolicio.com stealth startup
symform.com Cloud storage and backup
syncsort.com Information Management tools
synnex.com Distributor
Synnex IT Solutions
synology.com SMB storage solutions
synopsys.com Computer technology development and manufacturing
SysAid Data center, DCIM and ITSM tools
t10.org/scsi-3.htm ANSI T10 (SCSI information) site
t11.org ANSI T11 page for Fibre Channel information
t3media.com Cloud storage and video platform tools
tableausoftware.com Data analytics software tools
tacit.com WAN file system accelerator (Bought by Packeteer)
tacitnetworks.com Wide area file access acceleration solution (Bought by Packeteer)
tandberg.com Data management solutions (Bought by Cisco)
tapeandmedia.com Information about magnetic tape media
tapepower.com Site for tape topics
tarmin.com Archiving solutions
teamdrive.com Cloud storage
teamquest.com IRM management and capacity management tools
TeamViewer.com Remote support and Online meeting software
techdata.com Distributor
tegile.com Storage system solutions
tehutinetworks.net High speed iSCSI adapters
tek-tools.com SRM storage management software (Bought by Solarwinds)
TelecityGroup AWS connect partner, Hosting/cloud/access services
tellabs.com Networking components
Telx AWS connect partner, Hosting/cloud/access services
teneros.com Email archiving and management solutions
teracloud.com Capacity planning and resource management software
teradata.com Large scale database and data warehouse systems
teradici.com PC over IP technologies
teranetics.com Ethernet chips
Terascala Data analytics and management solutions
ter.de Optical storage libraries
terracloudinc.com Cloud services
TerraScale.com Scalable storage and server solutions
Verizon/Terremark Cloud, hosting and managed services
Tevron Application Response Time Monitoring
texmemsys.com Solid State Disk storage
thebci.org Business Continuity Institute
thecus.com Multi-protocol storage
thegreengrid.org Industry Trade Group
The Padcaster Apple iPad tools
thepluggllc.com Data center energy efficient floor tiles
theq3.com Data storage security solutions
thinkaheadit.com (aka Ahead) Value added reseller (VAR)
thirdbrigade.com Intrusion detection security tools (Bought by Trend Micro)
thirdio.com SSD solutions
tiaonline.org Telecommunications Industry Association
tidalsoftware.com IT Management software tools (Bought by Cisco)
timespring.com Continuous data protection solutions
tintri.com NFS and NAS storage optimized for VMware
tivoli.com Data management software
Softbank Telecom Corp. AWS connect partner, Hosting/cloud/access services
Primary Data and Tonian Stealth data virtualization startup

topgun-tech.com Data Infrastructure Resource (Server, Storage, SANs)

top500.org Top 500 super compute sites
topio.com Data protection software (Bought by NetApp)
topspin.com InfiniBand Technology (Bought by Cisco)
Toshiba.com Server and storage solutions
tpc.org Transaction processing performance council
translattice.com Distributed and elastic database and automation tools
Tredent.com WAN optimization solutions
TrendMicro.com Security and anti virus tools
trianz.com VAR
tributary.com Data protection solution tools including virtual, disk and tape
trilogytechnologies.ie Managed services provider
tritondata.com IT services and VAR
trunkbow.com Cloud, mobile and networking services
trustedcomputinggroup.org Trusted computing industry trade group
trusteddatasolutions.com VAR
trustedid.com ID theft protection
trustware.com Internet and data protection security tools
turnkeylinux.org Turnkey Linux appliance –
tusc.com VAR
twinstrata.com BC/DR analysis and cloud access software
tw telecom AWS connect partner, Hosting/cloud/access services
TSO logic DCIM and data center power energy management tools
tzolkin.com DNS and High Availability solutions

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Can we get a side of context with them IOPS and other storage metrics?
WHEN AND WHERE TO USE NAND FLASH SSD FOR VIRTUAL SERVERS
Revisiting RAID storage remains relevant and resources
NVMe overview and primer – Part I
Part 1 of HDD for content servers series Trends and Content Application Servers
Part 2 of HDD for content servers series Content application server decisions and testing plans
Part 3 of HDD for content servers series Test hardware and software configuration
Part 4 of HDD for content servers series Large file I/O processing
Part 5 of HDD for content servers series Small file I/O processing
Part 6 of HDD for content servers series General I/O processing
Part 7 of HDD for content servers series How HDD continue to evolve over different generations and wrap up
As the platters spin, HDD’s for cloud, virtual and traditional storage environments
How many IOPS can a HDD, HHDD or SSD do?
Hard Disk Drives (HDD) for Virtual Environments
Server and Storage I/O performance and benchmarking tools
Server storage I/O performance benchmark workload scripts Part I and Part II
How to test your HDD, SSD or all flash array (AFA) storage fundamentals
What is the best server storage I/O workload benchmark? It depends
I/O, I/O how well do you know about good or bad server and storage I/Os?
Big Files Lots of Little File Processing Benchmarking with Vdbench
Part II – NVMe overview and primer (Different Configurations)
Part III – NVMe overview and primer (Need for Performance Speed)
Part IV – NVMe overview and primer (Where and How to use NVMe)
Part V – NVMe overview and primer (Where to learn more, what this all means)
PCIe Server I/O Fundamentals
If NVMe is the answer, what are the questions?
NVMe Wont Replace Flash By Itself
Via Computerweekly – NVMe discussion: PCIe card vs U.2 and M.2
Intel and Micron unveil new 3D XPoint Non Volatile Memory (NVM) for servers and storage
Part II – Intel and Micron new 3D XPoint server and storage NVM
Part III – 3D XPoint new server storage memory from Intel and Micron
Server storage I/O benchmark tools, workload scripts and examples (Part I) and (Part II)
Data Infrastructure Overview, Its Whats Inside of Data Centers
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities
www.objectstoragecenter.com and Software Defined, Cloud, Bulk and Object Storage Fundamentals
Server Storage I/O Network PCIe Fundamentals

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

Visit the following additional data infrastructure and IT data center related links.

Links A-E
Links F-J
Links K-O
Links P-T
Links U-Z
Other Links

Ok, nuff said, for now.

October 20, 2010 / November 26, 2023

Have VTLs or VxLs become Zombies, Declared dead yet still alive?

Have you heard or read the reports and speculation that VTLs (Virtual Tape Libraries) are dead?

It seems that in IT the all too popular trend is to declare something dead so that your new product or technology can have a chance of making it into the market or perhaps be seen in a better light.

Sometimes this approach works to temporarily freeze the market until common sense and clarity return or until something else fun to talk about comes along. In other cases, the message can fall on deaf ears.

The approach of declaring something dead tends to play well for those who like shiny new toys (SNT) or new shiny toys (NST) and being on the popular, cool trendy bandwagon.

Not surprisingly, while some actual IT customers can fall into the SNT or NST syndrome, it's often the broader industry including media, bloggers, analysts, consultants and other self proclaimed or anointed pundits as well as vendors who latch on to the declare it dead movement. After all, who wants to talk about something that is old, boring and already being sold to paying customers who are using it? Now this is not a bad thing as we need a balance of up and coming challengers to keep the status quo challenged, likewise we need a balance of the new to avoid death grips on the old and what is working.

Likewise, many IT customers, particularly larger ones, tend to be very risk averse and conservative with their budgets, protecting their investments, thus they may only go leading bleeding edge if there is a dual redundant blood bank with a backup on hot standby (that's some HA humor BTW).

Another reason for declaring items dead in support of SNT and NST is that while many of the commonly declared dead items are on the proverbial plateau of productivity for IT customers, that can also mean that they are on the plateau of profitability for the vendors.

However, not all good things last and at some time, there is the need to transition from the old to the new and this is where things like virtualization including virtual tape libraries or virtual disk libraries or virtual storage libraries or whatever you want to call a VxL (more on what a VxL is in a moment) can come into play.

I realize that for some, particularly those who like to grasp on to SNT, NST and ride the dead pool bandwagons this will probably appear as snarky or cynical which is fine, after all, for some, you should be laughing to the bank and if not, you may in fact be missing out on an opportunity for playing in the dead pool marketing game.

Now back to VxL.

In the case of VTLs, for some it is the T word that bothers them, you know T as in Tape which is not a SNT or NST in an age where SSD has supposedly killed the disk drive which allegedly terminated tape (yeah right). Sure tape is not being used as much for backup as it has in the past with its role shifting to that of longer term retention, something that it is well suited for.

For tape fans (or cynics) you can read more here, here and here. However there is still a large amount of backup/restore along with other data protection or preservation (e.g. archiving) processing (software tools, processes, procedures, skill sets, management tools) that still expects to see tape.

Hence this is where VTLs or VxLs come into play leveraging virtualization in a Life Beyond Consolidation (and here) scenario providing abstraction, transparency, agility and emulation and IMHO are still very much alive and evolving.

Ok, for those who do not like tape or believe in its continued existence and evolving role, substitute the T (tape) with X and you get a VxL. That is, plug in whatever X word makes you happy or marketable or a Shiny New TLA. For example Virtual Disk Library, Virtual Storage Library, Virtual Backup Library, Virtual Compression Library, Virtual Dedupe Library, Virtual ILM Library, Virtual Archive Library, Virtual Cloud Library and so forth. Granted some VxLs only emulate tape and hence are VTLs while others support NAS and other protocols (or personalities) not to mention functionality ranging from replication and DFR to automated policy management.

However, keep in mind that whether your preference is VTL, VxL or whatever other buzzword bingo name you want to use or come up with, look at how virtualization in the form of abstraction, transparency and emulation can bridge the gap between the new (disk based data protection) combined with DFR (Data Footprint Reduction) and the old (existing backup/restore, archive or other management tools and processes).

Here are some additional links pertaining to VTLs (excuse me, VxLs):

Virtual tape libraries: Old backup technology holdover or gateway to the future?
Not to mention here, here, here, here or here.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

September 17, 2010 / March 7, 2022

What is DFR or Data Footprint Reduction?

Updated 10/9/2018

What is DFR or Data Footprint Reduction?

Data Footprint Reduction (DFR) is a collection of techniques, technologies, tools and best practices that are used to address data growth management challenges. Dedupe is currently the industry darling for DFR particularly in the scope or context of backup or other repetitive data.

However, DFR expands the scope of addressing growing data footprints and their impact to cover primary, secondary along with offline data that ranges from high performance to inactive high capacity.

Consequently the focus of DFR is not just on reduction ratios, it's also about meeting time or performance rates and data protection windows.

This means DFR is about using the right tool for the task at hand to effectively meet business needs, and cost objectives while meeting service requirements across all applications.

Examples of DFR technologies include Archiving, Compression, Dedupe, Data Management and Thin Provisioning among others.
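
To make the ratio versus rate point concrete, here is a minimal sketch in Python. It is an illustration only: the function names, the 20 TB data set, the two made up tools with their throughput figures and the eight hour protection window are all hypothetical numbers picked for the example, not measurements of any product.

```python
TB = 10**12  # decimal terabyte in bytes
MB = 10**6   # decimal megabyte in bytes


def reduction_ratio(original_bytes: int, reduced_bytes: int) -> float:
    """Return the space reduction ratio, e.g. 10.0 means 10:1."""
    if reduced_bytes <= 0:
        raise ValueError("reduced_bytes must be positive")
    return original_bytes / reduced_bytes


def hours_to_process(original_bytes: int, bytes_per_sec: float) -> float:
    """Return the hours needed to ingest the data at a sustained rate."""
    if bytes_per_sec <= 0:
        raise ValueError("bytes_per_sec must be positive")
    return original_bytes / bytes_per_sec / 3600


def fits_window(original_bytes: int, bytes_per_sec: float, window_hours: float) -> bool:
    """True if the data can be processed inside the protection window."""
    return hours_to_process(original_bytes, bytes_per_sec) <= window_hours


# Hypothetical scenario: 20 TB to protect inside an 8 hour window.
data = 20 * TB
window = 8

# Hypothetical tool A: high ratio (10:1) but a slower 500 MB/s ingest rate.
ratio_a = reduction_ratio(data, 2 * TB)
fits_a = fits_window(data, 500 * MB, window)

# Hypothetical tool B: lower ratio (4:1) but a faster 1000 MB/s ingest rate.
ratio_b = reduction_ratio(data, 5 * TB)
fits_b = fits_window(data, 1000 * MB, window)

print(f"Tool A: {ratio_a:.0f}:1, fits window: {fits_a}")
print(f"Tool B: {ratio_b:.0f}:1, fits window: {fits_b}")
```

In this made up case the tool with the better reduction ratio misses the window while the one with the lower ratio and higher rate fits, which is why both metrics need to be looked at together.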

Read more about DFR in Part I and Part II of a two part series found here and here.

Where to learn more

Learn more about data footprint reduction (DFR), data footprint overhead and related topics via the following links:

Next Generation Hybrid Software Defined Data Infrastructures Are In Your Future #blogtobertech
Data Footprint Reduction – Software Defined Data Infrastructure Essentials
PACE your Server Storage I/O decision making, its about application requirements
Announcing Software Defined Data Infrastructure Essentials Book by Greg Schulz
July 2018 Server StorageIO Data Infrastructure Update Newsletter
Pictures Over Stillwater Drone Pro Shop and Resource Links
2018 Hot Popular New Trending Data Infrastructure Vendors to Watch
Part 1 – Application Data Value Characteristics Everything Is Not The Same
Part 2 – 4 3 2 1 Data Protection Application Data Availability
Part 3 – Application Data Characteristics Types Everything Is Not The Same
Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
Part 5 – Application Data Access Life cycle Patterns Everything Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Server Storage I/O Tradecraft Trends
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)
If NVMe is the answer, what are the questions?
NVMe Primer (or refresh), The NVMe Place, The SSD Place, and the Object Storage Center

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means

That is all for now, hope you find this ongoing series of current or emerging Industry Trends and Perspectives posts of interest.

Ok, nuff said, for now.

Cheers Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2018. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

August 3, 2010 / May 17, 2021

Data footprint reduction (Part 2): Dell, IBM, Ocarina and Storwize

Over the past couple of weeks there has been a flurry of IT industry activity around data footprint impact reduction with Dell buying Ocarina and IBM acquiring Storwize. For those who want the quick (compacted, reduced) synopsis of what Dell buying Ocarina as well as IBM acquiring Storwize means, read the first post in this two part series as well as some of my comments here and here.

This piece and its companion in part I of this two part series are about expanding the discussion to the much larger opportunity for vendors or VARs of overall data footprint impact reduction beyond where they are currently focused. Likewise, this is about IT customers realizing that there are more opportunities to address data and storage optimization across your entire organization using various techniques instead of just focusing on backup or VMware virtual servers.

Who is Ocarina and Storwize?
Ocarina is a data and storage management software startup focused on data footprint reduction using a variety of approaches, techniques and algorithms. They differ from the traditional data dedupers (e.g. Asigra, Bakbone, Commvault, EMC Avamar, Datadomain and Networker, Exagrid, Falconstor, HP, IBM Protectier and TSM, Quantum, Sepaton and Symantec among others) by looking at data footprint reduction beyond just backup.

This means looking at how to reduce data footprint across different types of data including video, image as well as text based documents among others. As a result, the market sweet spot for Ocarina is general data footprint reduction including static along with active data including entertainment, video surveillance or gaming, reference data, web 2.0 and other bulk storage application data needs (this should complement Dell's recent Exanet acquisition).

What this means is that Ocarina is very well suited to address the rapidly growing amount of unstructured data that may not otherwise be handled as efficiently by dedupe alone.

Storwize is a data and storage management startup focused on data footprint reduction using inline compression with an emphasis on maintaining performance for reads as well as writes of unstructured as well as structured database data. Consequently the market sweet spot for Storwize is around boosting the capacity of existing NAS storage systems from different vendors without negatively impacting performance. The trade off of the Storwize approach is that you do not get the spectacular data reduction ratios associated with backup centric or focused dedupe, however, you maintain performance associated with online storage that some dedupers dream of.
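
The compression trade off can be illustrated with a small Python sketch. To be clear about the assumptions: this uses the standard library zlib module as a stand in for a generic lossless compressor, it is not the algorithm Storwize uses, and the sample data and the helper name `compress_and_measure` are made up for the example. The point is simply that with compression both the reduction ratio and the data rate can be measured, and the original data comes back intact.

```python
import time
import zlib


def compress_and_measure(data: bytes, level: int = 1):
    """Compress data with zlib and return (compressed, ratio, MB per second).

    level=1 favors speed over ratio, which is in the spirit of
    (but not the same as) real-time compression of active data.
    """
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(compressed)
    # Guard against a zero reading on very coarse timers.
    mb_per_sec = (len(data) / 1_000_000) / elapsed if elapsed > 0 else float("inf")
    return compressed, ratio, mb_per_sec


# Hypothetical, highly repetitive sample standing in for text based data.
sample = b"customer_id,order_id,status,amount\n" * 50_000

compressed, ratio, rate = compress_and_measure(sample, level=1)

# Lossless: reading the data back returns exactly what was written.
restored = zlib.decompress(compressed)

print(f"reduction ratio {ratio:.1f}:1 at roughly {rate:.0f} MB/s")
print("round trip intact:", restored == sample)
```

Real data will compress far less than this repetitive sample, and the rate depends on the machine, so treat the numbers printed as an illustration of what to measure rather than what to expect.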

Both Dell and IBM have existing dedupe solutions for general purpose as well as backup along with other data footprint impact reduction tools (either owned or via partners). Now they are both expanding their focus and reach similar to what others such as EMC, HP, NetApp, Oracle and Symantec among others are doing. What this means is that someone at Dell and IBM sees that there is much more to data footprint impact reduction than just a focus on dedupe for backup.

Wait, what does all of this discussion (or read here for background issues, challenges and opportunities) about unstructured data and changing access lifecycles have to do with dedupe, Ocarina and Storwize?

Continue reading on as this is about the expanding opportunity for data footprint reduction across entire organizations. That is, more data is being kept online and expanding data footprint impact needs to be addressed to meet business objectives using various techniques balancing performance, availability, capacity and energy or economics (PACE).

What does all of this have to do with IBM buying Storwize and Dell acquiring Ocarina?
If you have not pieced this together yet, let me net it out.

This is about the opportunity to address the organization wide expanding data footprint impact across all applications, types of data as well as tiers of storage to support business growth (more data to store) while maintaining QoS yet reducing per unit costs including management.

This is about expanding the story to the broader data footprint impact reduction from the more narrowly focused backup and dedupe discussion which are still in their infancy on a relative basis to their full market potential (read more here).

Now are you seeing where this is going and fits?

Does this mean IBM and Dell defocus on their existing Dedupe product lines or partners?
I do not believe so, at least as long as their respective revenue prevention departments are kept on the sidelines and off of the field of play. What I mean by this is that the challenge for IBM and Dell is similar to what others such as EMC face in having diverse portfolios or technology toolboxes. The challenge is messaging to the bigger issues, then aligning the right tool to the task at hand to address given issues and opportunities instead of being singularly focused on a specific product causing revenue prevention elsewhere.

As an example, for backup, I would expect Dell to continue to work with its existing dedupe backup centric partners and technologies however find new opportunities to leverage their Ocarina solution. Likewise, I would expect IBM to continue to show customers where Tivoli software based dedupe or Protectier (aka the deduper formerly known as Diligent) or other target based dedupe fits and expand into other data footprint impact areas with Storwize.

Does this change the playing field?
IMHO these moves as well as some previous moves by the likes of EMC and NetApp among others are examples of expanding the scope and dimension of the playing field. That is, the focus is much more than just dedupe for backup or of virtual machines (e.g. VMware vSphere or Microsoft Hyper-V).

This signals a growing awareness of the much larger and broader opportunity around organization wide data footprint impact reduction. In the broader context, some applications or data get compressed either in application software such as databases, file systems, operating systems or even hypervisors, or in networks using protocol or bandwidth optimizers, as well as via inline compression or post processing techniques as has been the case with streaming tape devices for some time.

This also means that while the primary focus or marketing angle for dedupe has until recently been reduction ratios, data transfer rates also become important for meeting the needs of time or performance sensitive applications.

Hence the role of policy based data footprint reduction where the right tool or technique to meet specific service requirements is applied. For those vendors with a diverse data footprint impact reduction tool kit including archive, compression, dedupe and thin provisioning among other techniques, I would expect to hear expanded messaging around the theme of applying the right tool to the task at hand.
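
As a rough sketch of what policy based selection could look like, here is a small Python example. Everything in it is an assumption made for illustration: the `DataProfile` fields, the 365 day inactivity threshold and the order of the rules are hypothetical and do not describe how any vendor product actually decides.

```python
from dataclasses import dataclass


@dataclass
class DataProfile:
    """Hypothetical description of a data set used to pick a DFR technique."""
    name: str
    days_since_last_access: int
    latency_sensitive: bool
    highly_redundant: bool


def choose_technique(profile: DataProfile) -> str:
    """Apply simple, made up policy rules in priority order."""
    # Inactive data: move it off primary storage first.
    if profile.days_since_last_access > 365:
        return "archive"
    # Active, performance sensitive data: favor rate over ratio.
    if profile.latency_sensitive:
        return "inline compression"
    # Repetitive data such as backups: favor ratio over rate.
    if profile.highly_redundant:
        return "dedupe"
    # Everything else: reduce when there is spare time to do so.
    return "post-process compression"


profiles = [
    DataProfile("old project files", 900, False, False),
    DataProfile("online database", 0, True, False),
    DataProfile("nightly backups", 1, False, True),
    DataProfile("home directories", 30, False, False),
]

for p in profiles:
    print(f"{p.name}: {choose_technique(p)}")
```

A real policy would weigh more service requirements (PACE attributes, retention, protection windows), however the idea is the same: the data and its service needs pick the tool, not the other way around.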

Does this mean Dell bought Ocarina to accessorize EqualLogic?
Perhaps, however that would then beg the question of why EqualLogic needs accessorizing. Granted there are many EqualLogic along with other Dell sold storage systems attached to Dell and other vendors' servers operating as NFS or Windows CIFS file servers that are candidates for Ocarina. However there are also many environments that do not yet include Dell EqualLogic solutions where Ocarina is a means for Dell to extend their reach enabling those organizations to do more with what they have while supporting growth.

In other words, Ocarina can be used to accessorize, or, it can be used to generate and create pull through for various Dell products. I also see a very strong affinity and opportunity for Dell to combine their recent Exanet NAS storage clustering software with Dell servers and storage to create bulk or scale out solutions similar to what HP and other vendors have done. Of course what Dell does with the Ocarina software over time, where they integrate it into their own products as well as OEM to others should be interesting to watch or speculate upon.

Does this mean IBM bought Storwize to accessorize XIV?
Well, I guess if you put a gateway (or software on a server which is the same thing) in front of XIV to transform it into a NAS system, sure, then Storwize could be used to increase the net usable capacity of the XIV installed base. However that is a lot of work and cost for what is on a relative basis a small footprint, yet it is a viable option nevertheless.

IMHO IBM has much more of a play, perhaps a home run by walking before they run by placing Storwize in front of their existing large installed base of NetApp N series (not to mention targeting NetApp's own install base) as well as complementing their SONAS solutions. From there as IBM gets their legs and mojo, they could go on the attack by going after other vendors' NAS solutions with an efficiency story similar to how IBM server groups target other vendors' server business for takeout opportunities except in a complementary manner.

Longer term I would not be surprised to see IBM continue development of the block based IP (as well as file) in the Storwize product for deployment in solutions ranging from SVC to their own or OEM based products along with articulating their comprehensive data footprint reduction solution portfolio. What will be important for IBM to do is to articulate what solution to use when, where, why and how without confusing their customers, partners and rest of the industry (something that Dell will also have to do).

Some links for additional reading on the above and related topics

Data footprint reduction (Part 1): Life beyond dedupe
Chapter 8 and 10: The Green and Virtual Data Center (CRC)
Business Benefits of Data Footprint Impact Reduction
Long-Term Data Protection and Retention – Finding the Correct Balance
Application Transparency and Co-existence with Real-Time Data Compression
Enabling a Green and Energy Efficient Storage with Real-time Compression
Real-time Data Compression Integrity and Reliability
Real-Time Data Compression Performance Considerations
Real-time Data Compression for On-line Active Data
Application Agnostic Real-time Data Compression
Saving Money with Green IT: Time To Invest In Information Factories
Storage Efficiency and Optimization – The Other Green
Shifting from energy avoidance to energy efficiency
Industry Trends and Perspectives: Tape, Disk and Dedupe Coexistence
The Many Flavors of Deduplication
Experts Share De-Dupe Insights
Business Benefits of Policy Based Data De-Duplication
Comments on IBM buying Storwize primary compression
Comments on Dell buying Ocarina and primary

Wrap up (for now)

Organizations of all shapes and sizes are encountering some form of growing data footprint impact that currently, or soon will, need to be addressed. Given that different applications and types of data along with associated storage mediums or tiers have various performance, availability, capacity, energy as well as economic characteristics, multiple data footprint impact reduction tools or techniques are needed. What this all means is that the focus of data footprint reduction is expanding beyond that of just dedupe for backup or other early deployment scenarios.

Note what this means is that dedupe has an even brighter future than where it currently is focused which is still only scratching the surface of potential market adoption as was discussed in part 1 of this series.

However this also means that dedupe is not the only solution to all data footprint reduction scenarios. Other techniques including archiving, compression, data management, thin provisioning, data deletion, tiered storage and consolidation will start to gain respect, coverage discussions and debates.

Bottom line, use the most applicable technologies or combinations along with best practice for the task and activity at hand.

For some applications, reduction ratios are the important focus, so the emphasis is on the tools or modes of operation that achieve those results.

Likewise for other applications where the focus is on performance with some data reduction benefit, tools are optimized for performance first and reduction secondary.

Thus I expect messaging from some vendors to adjust (expand) to those capabilities that they have in their toolbox (product portfolio) offerings.

Consequently, IMHO some of the backup centric dedupe solutions may find themselves in niche roles in the future unless they can diversify. Vendors with multiple data footprint reduction tools will also do better than those with only a single function or focused tool.

However for those who only have a single or perhaps a couple of tools, well, guess what the approach and messaging will be. After all, if all you have is a hammer everything looks like a nail, if all you have is a screwdriver, well, you get the picture.

On the other hand, if you are still not clear on what all this means, send me a note, give a call, post a comment or a tweet and I will be happy to discuss with you.

Oh, FWIW, if interested, disclosure: Storwize was a client a couple of years ago.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

May 26, 2010 / November 26, 2023

Industry Trends and Perspectives: Tiered Storage, Systems and Mediums

This is part of an ongoing series of short industry trends and perspectives blog posts briefs.

These short posts complement other longer posts along with traditional industry trends and perspective white papers, research reports, solution brief content found at www.storageio.com/reports.

Two years ago we read about how the magnetic disk drive would be dead in a couple of years at the hand of flash SSD. Guess what, it is a couple of years later and the magnetic disk drive is far from being dead. Granted high performance Fibre Channel disks will continue to be replaced by high performance, small form factor 2.5" SAS drives along with continued adoption of high capacity SAS and SATA devices.

Likewise, SSD or flash drives continue to be deployed, however outside of iPhone, iPod and other consumer or low end devices, nowhere near the projected or perhaps hoped for level. Rest assured the trend I'm seeing and hearing from IT customers is that while some will continue to look for places to strategically deploy SSD where possible, practical and affordable, there will continue to be a role for disk and even tape devices on a go forward basis.

Also watch for more coverage and discussion around the emergence of the Hybrid Hard Disk Drive (HHDD) that was discussed about four to five years ago. The HHDD made an appearance and then quietly went away for some time, perhaps more R and D time in the labs while flash SSD garnered the spotlight.

There could be a good opportunity for HHDD technology leveraging the best of both worlds, that is, continued pricing decreases for disk with larger capacity combined with smaller yet more affordable amounts of flash, in a solution that is transparent to the server or storage controller, making for easier integration.

That is all for now, hope you find this ongoing series of current and emerging Industry Trends and Perspectives interesting.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

January 14, 2010 / December 24, 2020

2010 and 2011 Trends, Perspectives and Predictions: More of the same?

2011 is not a typo, I figured that since I'm getting caught up on some things, why not get a jump as well.

Since 2009 went by so fast, and I'm finally getting around to doing an obligatory 2010 predictions post, let's take a look at both 2010 and 2011.

Actually I'm getting around to doing a post here having already done interviews and articles for others soon to be released.

Based on prior trends and looking at forecasts, a simple prediction is that some of the items for 2010 will apply for 2011 as well, given some of this year's items may have been predicted by some in 2008, 2007, 2006, 2005 or, well ok, you get the picture. :)

Predictions are fun and funny in that for some, they are taken very seriously, while for others, at best they are taken with a grain of salt depending on where you sit. This applies both for the reader as well as who is making the predictions along with various motives or incentives.

Some are serious, some not so much…

For some, predictions are a great way of touting or promoting favorite wares (hard, soft or services) or getting yet another plug (YAP is a TLA BTW) in to meet coverage or exposure quota.

Meanwhile for others, predictions are a chance to brush up on new terms for the upcoming season of buzzword bingo games (did you pick up on YAP).

In honor of the Vancouver winter games, I'm expecting some cool Olympic sized buzzword bingo games with a new slippery fast one being federation. Some buzzwords will take a break in 2010 as well as 2011 having been worked pretty hard the past few years, while others that have been on break, will reappear well rested, rejuvenated, and ready for duty.

Let's also clarify something regarding predictions, which is that they can be from at least two different perspectives. One view is that from a trend of what will be talked about or discussed in the industry. The other is in terms of what will actually be bought, deployed and used.

What can be confusing is sometimes the two perspectives are intermixed or assumed to be one and the same and for 2010 I see that trend continuing. In other words, there is adoption in terms of customers asking and investigating technologies vs. deployment where they are buying, installing and using those technologies in primary situations.

It is safe to say that there is still no such thing as an information, data or processing recession. Ok, surprise surprise; my dogs could have probably made that prediction during a nap. However what this means is more data will need to be moved, processed and stored for longer periods of time and at a lower cost without degrading performance or availability.

This means, denser technologies that enable a lower per unit cost of service without negatively impacting performance, availability, capacity or energy efficiency will be needed. In other words, watch for an expanded virtualization discussion around life beyond consolidation for servers, storage, desktops and networks with a theme around productivity and virtualization for agility and management enablement.

Certainly there will be continued mergers and acquisitions on both a small and a large scale, ranging from liquidation sales or bargain hunting to a large or mega blockbuster deal or two. I'm thinking in terms of outside the box, the type of deal that will have people wondering, perhaps confused, as to why it would be done until the whole picture is revealed and thought out.

In other words, outside of perhaps IBM, HP, Oracle, Intel or Microsoft among a few others, no vendor is too large to be acquired, merged with, or even involved in a reverse merger. I'm also thinking in terms of vendors filling in niche areas as well as building out their larger portfolio and IT stacks for integrated solutions.

Ok, let's take a look at some easy ones, layups or slam dunks:

More cluster, cloud conversations and confusion (public vs. private, service vs. product vs. architecture)
More server, desktop, IO and storage consolidation (excuse me, server virtualization)
Data footprint impact reduction ranging from deletion to archive to compress to dedupe among others
SSD and in particular flash continues to evolve with more conversations around PCM
Growing awareness of social media as yet another tool for customer relations management (CRM)
Security, data loss/leak prevention, digital forensics, PCI (payment card industry) and compliance
Focus expands from gaming/digital surveillance/security and energy to healthcare
Fibre Channel over Ethernet (FCoE) mainstream in discussions with some initial deployments
Continued confusion of Green IT and carbon reduction vs. economic and productivity (Green Gap)
No such thing as an information, data or processing recession, granted budgets are strained
Server, Storage or Systems Resource Analysis (SRA) with event correlation
SRA tools that provide and enable automation along with situational awareness

The green gap of confusion will continue, with carbon or environment centric stories and messages continuing to take a back seat while people realize that the other dimension of green is productivity.

As previously mentioned, virtualization of servers and storage continues to be popular, with the focus expanding from just consolidation to agility and flexibility, and to virtualizing production, high performance and other systems that do not lend themselves to consolidation.

6Gb SAS interfaces as well as more SAS disk drives continue to gain popularity. I have said in the past that there was a long shot chance that 8GFC disk drives might appear. We might very well see those in higher end systems while SAS drives continue to pick up the high performance spinning disk role in mid range systems.

Granted, some types of disk drives will give way over time to others. For example, high performance 3.5” 15.5K Fibre Channel disks will give way to 2.5” 15.5K SAS, boosting density and energy efficiency while maintaining performance. SSD will help to offload hot spots as it has in the past, enabling disks to be used more effectively in their applicable roles or tiers, with a net result of enhanced optimization, productivity and economics, all of which have environmental benefits (e.g. the other Green IT, closing the Green Gap).

What I don't see occurring, at least in 2010

An information or data recession requiring less server, storage, I/O networking or software resources
OSD (object based disk storage without a gateway) at least in the context of T10
Mainframes, magnetic tape, disk drives, PCs, or Windows going away (at least physically)
Cisco cracking top 3, no wait, top 5, no make that top 10 server vendor ranking
More respect for growing and diverse SOHO market space
iSCSI taking over for all I/O connectivity, however I do see iSCSI expand its footprint
FCoE and flash based SSD reaching tipping point in terms of actual customer deployments
Large increases in IT Budgets and subsequent wild spending rivaling the dot com era
Backup, security, data loss prevention (DLP), data availability or protection issues going away
Brett Favre and the Minnesota Vikings winning the Super Bowl

What will be predicted at the end of 2010 for 2011 (some of these will be deja vu)

Many items that were predicted this year, last year, the year before that and so on…
Dedupe moving into primary and online active storage, rekindling of dedupe debates
Demise of cloud in terms of hype and confusion being replaced by federation
Clustered, grid, bulk and other forms of scale out storage grow in adoption
Disk, Tape, RAID, Mainframe, Fibre Channel, PCs, Windows being declared dead (again)
2011 will be the year of Holographic storage and T10 OSD (an annual prediction by some)
FCoE kicks into broad and mainstream deployment adoption reaching tipping point
16Gb (16GFC) Fibre Channel gets more attention stirring FCoE vs. FC vs. iSCSI debates
100GbE gets more attention along with 4G adoption in order to move more data
Demise of iSCSI at the hands of SAS at low end, FCoE at high end and NAS from all angles

Gaining ground in 2010 however not yet in full stride (at least from customer deployment)

On the connectivity front, iSCSI, 6Gb SAS, 8Gb Fibre Channel, FCoE and 100GbE
SSD/flash based storage everywhere, however continued expansion
Dedupe everywhere including primary storage – it's still far from its full potential
Public and private clouds along with pNFS as well as scale out or clustered storage
Policy based automated storage tiering and transparent data movement or migration
Microsoft HyperV and Oracle based server virtualization technologies
Open source based technologies along with heterogeneous encryption
Virtualization life beyond consolidation addressing agility, flexibility and ease of management
Desktop virtualization using Citrix, Microsoft and VMware along with Microsoft Windows 7

Buzzword bingo hot topics and themes (in no particular order) include:

2009 and previous year carry over items including cloud, iSCSI, HyperV, Dedupe, open source
Federation takes over some of the work of cloud, virtualization, clusters and grids
E2E, End to End management preferably across different technologies
SAS, Serial Attached SCSI for server to storage systems and as disk to storage interface
SRA, E2E, Event correlation and other situational awareness related IRM tools
Virtualization, Life beyond consolidation enabling agility, flexibility for desktop, server and storage
Green IT, Transitions from carbon focus to economic with efficiency enabling productivity
FCoE, Continues to evolve and mature with more deployments however still not at tipping point
SSD, Flash based mediums continue to evolve however tipping point is still over the horizon
IOV, I/O Virtualization for both virtual and non virtual servers
Other new or recycled buzzword bingo candidates include PCoIP, 4G,

RAID will again be pronounced dead and no longer relevant, yet it is being found in more diverse deployments from consumer to the enterprise. In other words, RAID may be boring and thus no longer relevant to talk about, yet it is being used everywhere and enhanced in evolutionary ways, perhaps for some even revolutionary.

Tape continues to be declared dead (e.g. on the Zombie technology list), yet it is being enhanced, purchased and utilized at higher rates, with more data stored than in the past. Instead of being killed off by the disk drive, tape is being kept around both for traditional uses and for new roles where it is best suited, such as long term or bulk off-line storage of data in an ultra dense, energy efficient and economical manner.

What I am seeing and hearing is that customers using tape are able to reduce the number of drives or transports because, by leveraging disk buffers or caches including those in VTL and dedupe devices, they can operate their drives at higher utilization, with more data stored on media than in the past.

Likewise, even though I have been a fan of SSD for about 20 years and am bullish on its continued adoption, I do not see SSD killing off the spinning disk drive anytime soon. Disk drives are helping tape take on this new role by being a buffer or cache in the form of VTLs, disk based backup and bulk storage enhanced with compression, dedupe, thin provisioning and replication among other functionality.

There you have it, my predictions, observations and perspectives for 2010 and 2011. It is a broad and diverse list; however, I also get asked about and see a lot of different technologies, techniques and trends tied to IT resources (servers, storage, I/O and networks, hardware, software and services).

Let's see how they play out.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

November 23, 2009 / March 7, 2022

ILM = Has It Lost Its Meaning

Disclaimer, warning, be advised, heads up, disclosure, this post is partially for fun so take it that way.

Remember ILM, that is, Information Lifecycle Management among other meanings?

It was a popular buzzword du jour a few years ago, similar to how cloud is being tossed around lately, or in the recent past, virtualization, clusters, grids and SOA among others.

One of the challenges with ILM, besides its overuse and the resulting confusion, was what it meant. After all, was it, or is it, a product, process, paradigm or something else?

That depends of course on who you talk to and their view or definition.

For some, ILM was a new name for archiving, or storage and data tiering, or data management, or hierarchical storage management (HSM) or system managed storage (SMS) and software managed storage (SMS) among others.

So where is ILM today?

Better yet, what does ILM stand for?

Well here are a few thoughts; some are oldies but goodies, some new, some just for fun.

ILM = I Like Marketing or It's a Lot of Marketing or It's a Lot of Money
ILM = It Lost Its Meaning or It's a Lot of Meetings
ILM = Information Loves Magnetic media or I Love Magnetic media
ILM = IBM Loves Mainframes or Intel Loves Memory
ILM = Infrastructure Lifecycle Management or iPods/iPhones Like Macintosh

Then there are many other variations of xLM where I is replaced with X (similar to XaaS) where X is any letter you want or need for a particular purpose or message theme. For example, how about replacing X with an A for Application Lifecycle Management (ALM), or a B for Buzzword or Backup Lifecycle Management (BLM), C for Content Lifecycle Management (CLM) and D for Document or Data Lifecycle Management (DLM). There are many others including Hardware Lifecycle Management (HLM), Product or Program Lifecycle Management (PLM) not to mention Server, Storage or Security Lifecycle Management (SLM).

While ILM or xLM specific product and marketing buzz has for the most part subsided, perhaps it is about time for it to reappear to give current buzzwords such as cloud a break or rest. After all, ILM and xLM as buzzwords should be well rested after their break at the Buzzword Rest Spa (BRS), perhaps located on someday isle. You know about someday isle, don't you? It's that place of dreams, a visionary place to be visited in the future.

There are already signs of the impending rested, rejuvenated and rebranded appearance of ILM in the form of automated tiering, intelligent storage and data management, file virtualization, policy managed server and storage among others.

Ok, nuff said.

Cheers gs


October 1, 2009 / January 23, 2019

The function of XaaS(X) – Pick a letter

Remember the xSP era where X was I for ISP (Internet Service Provider) or M for Managed Service Provider (MSP) or S for Storage Service Provider, part of buzzword bingo?

That was similar to the xLM craze where X could have been I for Information Lifecycle Management (ILM), D for Data Lifecycle Management (DLM) and so forth. Someone even tried to register the term ILM and failed, instead of grabbing something like XLM, but I digress.

Fast forward to today. Given the widespread use of anything SaaS among other XaaS terms, let's have a quick and perhaps fun look at some of the different usages of the new function XaaS(X) in the IT industry today.

By no means is this an exhaustive list, so feel free to comment with others, the more the merrier. Using the Basic English alphabet without numbers or extended character sets, here are some possibilities among others (some are and continue to be used in the industry):

A	Analyst, Application, Archive, Audit or Authentication
B	Backup or Blogger
C	Cloud, Compiler, Compute or Connectivity
D	Data management, Data warehouse, DBA, Dedupe, Development, Disk or Document management
E	Email, Encryption or Evangelist
F	Files or Freeware
G	Grid or Google
H	Help, Hotline or Hype
I	ILM, Information, Infrastructure, IO or IT
J	Jobs
K	Kbytes
L	Library or LinkedIn
M	Mainframe, Marketing, Manufacturing, Media, Memory or Middleware
N	NAS, Networking or Notification
O	Office, Oracle, Optical or Optimization
P	Performance, Petabytes, Platform, Policy, Police, Print or PR
Q	Quality
R	RAID, Replication, Reporter, Research or Rights management
S	SAN, Search, Security, Server, Software, Storage or Support
T	Tape, Technology, Testing, Trade group, Trends or Twittering
U	Unfollow
V	VAR, Virtualization or Vendor
W	Web
X	Xray
Y	YouTube
Z	zSeries or zilla
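
For those who want to take the function idea literally, here is a minimal and just for fun sketch in Python. The names XAAS_TERMS and xaas are made up for illustration, and the table holds only a few rows copied from the list above, so treat it as a sketch rather than anything official.

```python
# Hypothetical lookup table: only a handful of rows from the list above.
XAAS_TERMS = {
    "B": ["Backup", "Blogger"],
    "G": ["Grid", "Google"],
    "W": ["Web"],
}


def xaas(letter):
    """Expand a letter into '<term> as a Service' strings.

    Returns an empty list for letters that are not in the table.
    """
    key = letter.upper()
    return [f"{term} as a Service ({key}aaS)" for term in XAAS_TERMS.get(key, [])]


if __name__ == "__main__":
    for entry in xaas("b"):
        print(entry)
```

Add more rows to the table and the function covers more of the alphabet, which is pretty much how the buzzword itself works.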

Feel free to comment with others for the list, and likewise, feel free to share the list.

Cheers gs


October 21, 2008 / March 18, 2019

From ILM to IIM, Is this a solution sell looking for a problem?

Enterprise Storage Forum has a new piece about what could be the successor to ILM from a marketing rallying cry perspective in the form of Intelligent Information Management (IIM).

Information management is an important topic. However, given tough economic times, can IIM be joined into other discussions about efficiency and boosting productivity to help justify its cost, whatever that cost may be in terms of more hardware, software and people to carry it out? With EMC and Gartner banging the drum, it will be interesting to see who else jumps on the IIM bandwagon.

On the other hand, let's see what other variations surface, perhaps a VIIM (Virtualized IIM), or an IIMaaS (IIM as a Service), or how about Cloud IIM or GIIM (Green IIM) among others like xIIM, where you plug whatever letter you want in front of IIM (something that someone missed out on a few years ago by not grabbing xLM).

While I see the importance of data management, the bottom line is going to be how to budget and build a business case when sustaining business growth in tough economic times is a common theme. Hopefully we can see some business cases and justifications that are self-funded, that is, where the cost of adopting and deploying IIM is covered by the savings in associated hardware and software management and maintenance fees, as well as being a means of boosting overall IT and data management productivity.

Ok, nuff said.

Cheers gs


Application Data Value Characteristics Everything Is Not The Same

Common Applications Characteristics

Performance and Activity (How Resources Get Used)

Where to learn more

What this all means and wrap-up

Application Data Availability 4 3 2 1 Data Protection

Availability (Accessibility, Durability, Consistency)

Capacity and Space (What Gets Consumed and Occupied)

Economics (People, Budgets, Energy and other Constraints)

Where to learn more

What this all means and wrap-up

Application Data Characteristics Types Everything Is Not The Same

Various Types of Data

Different data with various values over time

Data Value

Where to learn more

What this all means and wrap-up

Application Data Volume Velocity Variety Everything Not The Same

Volume of Data

Variety of Data

Velocity of Data

Where to learn more

What this all means and wrap-up

Application Data Access Life cycle Patterns Everything Is Not The Same(Part V)

Active (Hot), Static (Warm and WORM), or Dormant (Cold) Data and Lifecycles

Where to learn more

What this all means and wrap-up

Data Infrastructure IT Industry Related Resource Links P to T

IT Data Center and Data Infrastructure Industry Resources

Where To Learn More

What This All Means

What is DFR or Data Footprint Reduction?

Where to learn more

What this all means
