Application Data Value Characteristics Everything Is Not The Same (Part I)

This is part one of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we start things off by looking at general application server storage I/O characteristics that have an impact on data value as well as access.


Everything is not the same across different organizations, including Information Technology (IT) data centers and data infrastructures, along with the applications as well as the data they support. For example, there is so-called big data that can be many small files, objects, blobs, or data and bit streams representing telemetry, click-stream analytics, and logs, among other information.

Keep in mind that applications impact how data is accessed, used, processed, moved, and stored. What this means is that a focus on data value, access patterns, and other related topics needs to also consider application performance, availability, capacity, and economic (PACE) attributes.

If everything is not the same, why is so much data along with many applications treated the same from a PACE perspective?

Data infrastructure resources, including servers, storage, and networks, might be cheap or inexpensive; however, there is a cost to managing them along with the data.

Managing includes data protection (backup, restore, BC, DR, HA, security) along with other activities. Likewise, there is a cost to the software along with cloud services among others. By understanding how applications use and interact with data, smarter, more informed data management decisions can be made.

IT Applications and Data Infrastructure Layers

Keep in mind that everything is not the same across various organizations, data centers, data infrastructures, data, and the applications that use them. Also keep in mind that programs (e.g., applications) = algorithms (code) + data structures (how data is defined and organized, structured or unstructured).

There are traditional applications, along with those tied to Internet of Things (IoT), Artificial Intelligence (AI) and Machine Learning (ML), Big Data and other analytics including real-time click stream, media and entertainment, security and surveillance, log and telemetry processing among many others.

What this means is that there are many different applications with various characteristics and attributes, along with resource (server compute, I/O, network, memory, and storage) as well as service requirements.

Common Applications Characteristics

Different applications will have various attributes, in general, as well as in how they are used; for example, database transaction activity vs. reporting or analytics, logs and journals vs. redo logs, indices, tables, import/export, and scratch and temp space. Performance, availability, capacity, and economics (PACE) describe the application and data characteristics and needs shown in the following figure.

Application PACE attributes (via Software Defined Data Infrastructure Essentials)

All applications have PACE attributes, however:

  • PACE attributes vary by application and usage
  • Some applications and their data are more active than others
  • PACE characteristics may vary within different parts of an application

Think of an application's PACE attributes, along with those of its associated data, as its personality: how it behaves, what it does, how it does it, and when, along with its value, benefit, or cost, as well as its quality-of-service (QoS) attributes.

Understanding applications in different environments, including data values and associated PACE attributes, is essential for making informed server, storage, I/O, and data infrastructure decisions. Those decisions range from configuration to acquisitions or upgrades; when, where, why, and how to protect; and how to optimize performance, including capacity planning, reporting, and troubleshooting, not to mention addressing budget concerns.

Primary PACE attributes for active and inactive applications and data are:

P – Performance and activity (how things get used)
A – Availability and durability (resiliency and data protection)
C – Capacity and space (what things use or occupy)
E – Economics and Energy (people, budgets, and other barriers)

Some applications need more performance (server compute, or storage and network I/O), while others need space capacity (storage, memory, network, or I/O connectivity). Likewise, some applications have different availability needs (data protection, durability, security, resiliency, backup, business continuity, disaster recovery) that determine the tools, technologies, and techniques to use.

Budgets are also nearly always a concern, which for some applications means enabling more performance per cost while others are focused on maximizing space capacity and protection level per cost. PACE attributes also define or influence policies for QoS (performance, availability, capacity), as well as thresholds, limits, quotas, retention, and disposition, among others.
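As a rough illustration of PACE as an application "personality," the following Python sketch models hypothetical PACE profiles and a simple placement decision. All names and threshold values here are illustrative assumptions, not from the book or any specific product.

```python
from dataclasses import dataclass

@dataclass
class PaceProfile:
    # Hypothetical PACE attributes for an application
    performance_iops: int      # P: activity needed (IOPS)
    availability_pct: float    # A: e.g., 99.99 for "four nines"
    capacity_gb: int           # C: space needed
    monthly_budget_usd: float  # E: economic constraint

# Two applications with different PACE personalities (example values)
oltp_db = PaceProfile(performance_iops=50_000, availability_pct=99.99,
                      capacity_gb=500, monthly_budget_usd=2_000.0)
archive = PaceProfile(performance_iops=50, availability_pct=99.9,
                      capacity_gb=100_000, monthly_budget_usd=500.0)

def needs_fast_tier(p: PaceProfile) -> bool:
    """Crude placement policy: high activity justifies a faster tier."""
    return p.performance_iops > 10_000

print(needs_fast_tier(oltp_db))  # True
print(needs_fast_tier(archive))  # False
```

The point is not the specific thresholds but that each application's PACE profile, rather than a one-size-fits-all policy, drives the decision.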

Performance and Activity (How Resources Get Used)

Some applications or components that comprise a larger solution will have more performance demands than others. Likewise, the performance characteristics of applications along with their associated data will also vary. Performance applies to the server, storage, and I/O networking hardware along with associated software and applications.

For servers, performance is focused on how much CPU or processor time is used, along with memory and I/O operations. I/O operations to create, read, update, or delete (CRUD) data include activity rate (frequency or data velocity) of I/O operations (IOPS). Other considerations include the volume or amount of data being moved (bandwidth, throughput, transfer), response time or latency, along with queue depths.

Activity is the amount of work to do or being done in a given amount of time (seconds, minutes, hours, days, weeks), which can be transactions, rates, or IOPS. Additional performance considerations include latency, bandwidth, throughput, response time, queues, reads or writes, gets or puts, updates, lists, directories, searches, page views, files opened, videos viewed, or downloads.
 
Server, storage, and I/O network performance include:

  • Processor CPU usage time and queues (user and system overhead)
  • Memory usage effectiveness including page and swap
  • I/O activity including between servers and storage
  • Errors, retransmission, retries, and rebuilds

The following figure shows a generic performance example of data being accessed (mixed reads, writes, random, sequential, big, small, low- and high-latency) on a local and a remote basis. The example shows how, for a given time interval (see lower right), applications are accessing and working with data via different data streams (in the larger image, left center). Also shown are queues and I/O handling along with end-to-end (E2E) response time.

Server I/O performance fundamentals (via Software Defined Data Infrastructure Essentials)


Also shown on the left in the above figure is an example of E2E response time from the application through the various data infrastructure layers, as well as, lower center, the response time from the server to the memory or storage devices.

Various queues are shown in the middle of the above figure; these are indicators of how much work is occurring and whether the processing is keeping up with the work or causing backlogs. Context is needed for queues, as they exist in the server, I/O networking devices, and software drivers, as well as in storage, among other locations.

Some basic server, storage, I/O metrics that matter include:

  • Queue depth of I/Os waiting to be processed and concurrency
  • CPU and memory usage to process I/Os
  • I/O size, or how much data can be moved in a given operation
  • I/O activity rate or IOPS = amount of data moved ÷ I/O size, per unit of time
  • Bandwidth = data moved per unit of time = I/O size × I/O rate
  • Latency usually increases with larger I/O sizes, decreases with smaller requests
  • I/O rates usually increase with smaller I/O sizes and vice versa
  • Bandwidth increases with larger I/O sizes and vice versa
  • Sequential stream access data may have better performance than some random access data
  • Not all data access is conducive to being sequential streams; some is inherently random
  • Lower response time is better, higher activity rates and bandwidth are better

Queues with high latency and small I/O sizes or low I/O rates could indicate a performance bottleneck. Queues with low latency and high I/O rates with good bandwidth or data being moved could be a good thing. An important note is to look at several metrics, not just IOPS or activity, bandwidth, queues, or response time. Also, keep in mind that the metrics that matter for your environment may be different from those for somebody else.

Something to keep in perspective is that there can be a large amount of data with low performance, or a small amount of data with high performance, not to mention many other variations. The important concept is that as space capacity scales, performance does not necessarily improve, or vice versa; after all, everything is not the same.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

SDDC Data Infrastructure

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What this all means and wrap-up

Keep in mind that with application data value characteristics, everything is not the same across various organizations, data centers, and data infrastructures spanning legacy, cloud, and other software defined data center (SDDC) environments. However, all applications have some element (high or low) of performance, availability, capacity, and economic (PACE) attributes, along with various similarities. Likewise, data has different value at various times. Continue reading the next post (Part II Application Data Availability Everything Is Not The Same) in this five-part mini-series here.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Application Data Availability 4 3 2 1 Data Protection


This is part two of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application performance, availability, capacity, and economic (PACE) attributes that have an impact on data value as well as availability.


Availability (Accessibility, Durability, Consistency)

Just as there are many different aspects and focus areas for performance, there are also several facets to availability. Note that application performance requires availability, and availability relies on some level of performance.

Availability is a broad and encompassing area that includes data protection to protect, preserve, and serve (backup/restore, archive, BC, BR, DR, HA) data and applications. There are logical and physical aspects of availability including data protection as well as security including key management (manage your keys or authentication and certificates) and permissions, among other things.

Availability = accessibility (can you get to your application and data) + durability (is the data intact and consistent). This includes basic Reliability, Availability, and Serviceability (RAS), as well as high availability, accessibility, and durability. “Durable” has multiple meanings, so context is important. One context for durable is how data infrastructure resources hold up to, survive, and tolerate wear and tear from use (i.e., endurance), for example, flash SSDs or mechanical devices such as Hard Disk Drives (HDDs). Another context for durable refers to data, meaning how many copies exist in various places.

Server, storage, and I/O network availability topics include:

  • Resiliency and self-healing to tolerate failure or disruption
  • Hardware, software, and services configured for resiliency
  • Accessibility to reach or be reached for handling work
  • Durability and consistency of data to be available for access
  • Protection of data, applications, and assets including security

Additional server I/O and data infrastructure along with storage topics include:

  • Backup/restore, replication, snapshots, sync, and copies
  • Basic Reliability, Availability, Serviceability, HA, fail over, BC, BR, and DR
  • Alternative paths, redundant components, and associated software
  • Applications that are fault-tolerant, resilient, and self-healing
  • Non-disruptive upgrades, code (application or software) loads, and activation
  • Immediate data consistency and integrity vs. eventual consistency
  • Virus, malware, and other data corruption or loss prevention

From a data protection standpoint, the fundamental rule or guideline is 4 3 2 1, which means having at least four copies consisting of at least three versions (different points in time), at least two of which are on different systems or storage devices, and at least one of those off-site (on-line, off-line, cloud, or other). There are many variations of the 4 3 2 1 rule, shown in the following figure, along with approaches for how to manage which technology to use. We will go deeper into this subject in later chapters. For now, remember the following.

4 3 2 1 data protection (via Software Defined Data Infrastructure Essentials)

4 – At least four copies of data (or more). Enables durability in case a copy goes bad, is deleted or corrupted, or a device or site fails.

3 – At least three versions of the data (different points in time) to retain. Enables various recovery points in time to restore, resume, or restart from.

2 – Data located on two or more systems (devices or media). Enables protection against device, system, server, file system, or other fault/failure.

1 – At least one of those copies off-premises and not live (isolated from the active primary copy). Enables resiliency across sites, as well as a space, time, and distance gap for protection.
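As a sketch of the guideline above, the following Python checks a hypothetical set of copies against the 4 3 2 1 rule; the copy, version, and system names are illustrative assumptions, not from the book.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    version: str   # point in time, e.g., a backup date
    system: str    # device, system, or media holding the copy
    offsite: bool  # isolated from the active primary copy?

def meets_4_3_2_1(copies) -> bool:
    """Check the 4 3 2 1 guideline: at least 4 copies, 3 versions,
    on 2 or more systems, with at least 1 copy off-site."""
    return (len(copies) >= 4
            and len({c.version for c in copies}) >= 3
            and len({c.system for c in copies}) >= 2
            and any(c.offsite for c in copies))

copies = [
    Copy("mon", "primary-array", False),
    Copy("mon", "backup-server", False),
    Copy("tue", "backup-server", False),
    Copy("wed", "cloud-vault", True),
]
print(meets_4_3_2_1(copies))  # True
```

Dropping the off-site cloud copy (or keeping fewer than four copies) makes the check fail, which is exactly the kind of gap the guideline is meant to expose.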

Capacity and Space (What Gets Consumed and Occupied)

In addition to being available and accessible in a timely manner (performance), data (and applications) occupy space. That space includes memory in servers, as well as consumable processor (CPU) time and I/O (performance), including over networks.

Data and applications also consume storage space where they are stored. In addition to basic data space, there is also space consumed for metadata as well as protection copies (and overhead), application settings, logs, and other items. Another aspect of capacity includes network IP ports and addresses, software licenses, server, storage, and network bandwidth or service time.

Server, storage, and I/O network capacity topics include:

  • Consumable time-expiring resources (processor time, I/O, network bandwidth)
  • Network IP and other addresses
  • Physical resources of servers, storage, and I/O networking devices
  • Software licenses based on consumption or number of users
  • Primary and protection copies of data and applications
  • Active and standby data infrastructure resources and sites
  • Data footprint reduction (DFR) tools and techniques for space optimization
  • Policies, quotas, thresholds, limits, and capacity QoS
  • Application and database optimization

DFR includes various techniques, technologies, and tools to reduce the impact or overhead of protecting, preserving, and serving more data for longer periods of time. There are many different approaches to implementing a DFR strategy, since there are various applications and data.

Common DFR techniques and technologies include archiving, backup modernization, copy data management (CDM), clean-up, compression, consolidation, data management, deletion, and deduplication (dedupe), along with storage tiering, RAID (including parity-based, erasure codes, local reconstruction codes [LRC], Reed-Solomon, and Ceph Shingled Erasure Code [SHEC], among others), protection configurations, and thin provisioning.

DFR can be implemented in various complementary locations from row-level compression in database or email to normalized databases, to file systems, operating systems, appliances, and storage systems using various techniques.

Also, keep in mind that not all data is the same; some is sparse, some is dense, and some can be compressed or deduplicated while other data cannot. However, identical copies can be identified, with links created to a common copy.
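One way identical copies can be identified, as noted above, is by content hashing. The following Python is a minimal single-instance storage sketch, not any particular product's implementation; the sample data is illustrative.

```python
import hashlib

def dedupe(blobs):
    """Identify identical copies by content hash; keep one copy
    and record a link (digest) for each input (single-instance sketch)."""
    store = {}    # digest -> data (the single kept copy)
    links = []    # per-input reference into the store
    for data in blobs:
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # store only the first copy seen
        links.append(digest)
    return store, links

blobs = [b"report-q1", b"report-q1", b"report-q2"]
store, links = dedupe(blobs)
print(len(blobs), len(store))  # 3 inputs, 2 unique copies stored
```

Real dedupe systems typically work at the block or chunk level rather than whole objects, but the principle of hashing content and linking duplicates to a common copy is the same.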

Economics (People, Budgets, Energy and other Constraints)

If one constant in life and technology is change, then the other is concern about economics or costs. There is a cost to enable and maintain a data infrastructure, on-premises or in the cloud, which exists to protect, preserve, and serve data and information applications.

However, there should also be a benefit to having the data infrastructure to house data and support applications that provide information to users of the services. A common economic focus is what something costs, either as up-front capital expenditure (CapEx) or as an operating expenditure (OpEx) expense, along with recurring fees.

In general, economic considerations include:

  • Budgets (CapEx and OpEx), both up front and in recurring fees
  • Whether you buy, lease, rent, subscribe, or use free and open sources
  • People time needed to integrate and support even free open-source software
  • Costs including hardware, software, services, power, cooling, facilities, tools
  • People time includes base salary, benefits, training and education

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

SDDC Data Infrastructure

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.


What this all means and wrap-up

Keep in mind that with application data value characteristics, everything is not the same across various organizations, data centers, and data infrastructures spanning legacy, cloud, and other software defined data center (SDDC) environments. All applications have some element of performance, availability, capacity, and economic (PACE) needs as well as resource demands. With data storage there is often a focus on storage efficiency and utilization, which is where data footprint reduction (DFR) techniques, tools, trends, and technologies address capacity requirements. However, with data storage there is also an expanding focus on storage effectiveness, also known as productivity, tied to performance, along with availability including 4 3 2 1 data protection. Continue reading the next post (Part III Application Data Characteristics Types Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

Gs


Application Data Characteristics Types Everything Is Not The Same


This is part three of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on different types of data. There is more to data than simply being big data, fast data, big and fast, or unstructured, structured, or semi-structured, some of which has been touched on in this series, with more to follow. Note that there is also data in terms of the programs, applications, code, rules, policies, and configuration settings, along with metadata and other items stored.


Various Types of Data

Data types along with characteristics include big data, little data, fast data, and old as well as new data with different values, life cycles, volumes, and velocities. There is data in files and objects, large and small, representing images, figures, text, and binary content, structured or unstructured, that is software defined by the applications that create, modify, and use it.

There are many different types of data and applications to meet various business, organization, or functional needs. Keep in mind that applications are based on programs, which consist of algorithms and data structures that define the data, how to use it, as well as how and when to store it. Those data structures define data that will get transformed into information by programs, while also being stored in memory and on data storage in various formats.

Just as various applications have different algorithms, they also have different types of data. Even though everything is not the same in all environments, or even in how the same applications get used across various organizations, there are some similarities and general characteristics. Keep in mind that information is the result of programs (applications and their algorithms) that process data into something useful or of value.

Data typically has a basic life cycle of:

  • Creation and some activity, including being protected
  • Dormant, followed by either continued activity or going inactive
  • Disposition (delete or remove)

In general, data can be:

  • Temporary, ephemeral or transient
  • Dynamic or changing (“hot data”)
  • Active static on-line, near-line, or off-line (“warm-data”)
  • In-active static on-line or off-line (“cold data”)
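The activity states above (hot, warm, cold) often drive tiering decisions. The following Python is a minimal sketch of such a policy; the tier names and day thresholds are illustrative assumptions, not from the book or any specific product.

```python
# Hypothetical tier names for each data temperature
ACTIVITY_TIERS = {
    "hot": "fast on-line storage (e.g., SSD)",
    "warm": "near-line capacity storage",
    "cold": "off-line or cloud archive",
}

def tier_for(last_access_days: int) -> str:
    """Classify data temperature by how recently it was accessed."""
    if last_access_days <= 7:
        return ACTIVITY_TIERS["hot"]
    if last_access_days <= 90:
        return ACTIVITY_TIERS["warm"]
    return ACTIVITY_TIERS["cold"]

print(tier_for(1))    # fast on-line storage (e.g., SSD)
print(tier_for(365))  # off-line or cloud archive
```

In practice, recency of access is only one signal; access frequency, data value, and retention policies also factor into placement.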

Data is organized as:

  • Structured
  • Semi-structured
  • Unstructured

General data characteristics include:

  • Value = From no value to unknown to some or high value
  • Volume = Amount of data, files, objects of a given size
  • Variety = Various types of data (small, big, fast, structured, unstructured)
  • Velocity = Data streams, flows, rates, load, process, access, active or static

The following figure shows how different data has various values over time. Data that has no value today or in the future can be deleted, while data with unknown value can be retained.

Data Value Known, Unknown and No Value

General characteristics include the value of the data, which in turn determines its performance, availability, capacity, and economic considerations. Also, data can be ephemeral (temporary) or kept for longer periods of time on persistent, non-volatile storage (you do not lose the data when power is turned off). Examples of ephemeral data include work and scratch areas such as where data gets imported into, or exported out of, an application or database.

Data can also be little, big, or big and fast, terms which describe in part the size as well as volume along with the speed or velocity of being created, accessed, and processed. The importance of understanding characteristics of data and how their associated applications use them is to enable effective decision-making about performance, availability, capacity, and economics of data infrastructure resources.

Data Value

There is more to data storage than space capacity per cost.

All data has one of three basic values:

  • No value = ephemeral/temp/scratch = Why keep it?
  • Some value = current or emerging future value, which can be low or high = Keep
  • Unknown value = protect until value is unlocked, or no remaining value

In addition to the above basic three, data with some value can also be further subdivided into little value, some value, or high value. Of course, you can keep subdividing into as many more or different categories as needed, after all, everything is not always the same across environments.

Besides data having some value, that value can also change, increasing or decreasing over time, or even going from unknown to a known value, known to unknown, or to no value. Data with no value can be discarded; if in doubt, make and keep a copy of that data somewhere safe until its value (or lack of value) is fully known and understood.

The importance of understanding the value of data is to enable effective decision-making on where and how to protect, preserve, and cost-effectively store the data. Note that cost-effective does not necessarily mean the cheapest or lowest-cost approach, rather it means the way that aligns with the value and importance of the data at a given point in time.
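The value categories above can drive protection and placement decisions. Here is a minimal hypothetical sketch; the decision strings and rules are illustrative assumptions, not from the book.

```python
def placement(value: str) -> str:
    """Map a data value category to a protection/storage decision."""
    if value == "no":
        return "dispose (after confirming no remaining value)"
    if value == "unknown":
        return "retain and protect until value is understood"
    if value == "high":
        return "fast tier, strong protection (4 3 2 1)"
    return "capacity tier, standard protection"  # some/low value

for v in ("no", "unknown", "some", "high"):
    print(v, "->", placement(v))
```

Note that "cost-effective" here means aligning protection and placement with the data's value at a point in time, not simply choosing the cheapest option.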

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

SDDC Data Infrastructure

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What this all means and wrap-up

Data has different value at various times, and that value is also evolving. Everything is not the same across various organizations, data centers, and data infrastructures spanning legacy, cloud, and other software defined data center (SDDC) environments. Continue reading the next post (Part IV Application Data Volume Velocity Variety Everything Not The Same) in this series here.

Ok, nuff said, for now.

Gs


Application Data Volume Velocity Variety Everything Is Not The Same


This is part four of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same, as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on data volume, velocity, and variety; after all, everything is not the same, not to mention the many different aspects of big data as well as little data.


Volume of Data

More data is being created at a faster rate every day, and that data is being retained for longer periods. Some data being retained has known value, while a growing amount of data has unknown value. Data is generated or created from many sources, including mobile devices, social networks, web-connected systems or machines, and sensors, including IoT and IoD. Besides where data is created, there are also many consumers of data (applications) that range from legacy to mobile, cloud, and IoT, among others.

Unknown-value data may eventually have value in the future, when somebody realizes they can do something with it, or when a technology tool or application becomes available to transform the data with unknown value into valuable information.

Some data gets retained in its native or raw form, while other data gets processed by application program algorithms into summary data, or is curated and aggregated with other data to be transformed into new useful data. The figure below shows, from left to right and front to back, more data being created, and that data also getting larger over time. For example, on the left are two data items, objects, files, or blocks representing some information.

In the center of the following figure are more columns and rows of data, with each of those data items also becoming larger. Moving farther to the right, there are yet more data items stacked up higher, as well as across and farther back, with those items also being larger. The following figure can represent blocks of storage, files in a file system, rows, and columns in a database or key-value repository, or objects in a cloud or object storage system.

Application Data Value sddc
Increasing data velocity and volume, more data and data getting larger

In addition to more data being created, some of that data is relatively small in terms of the records or data structure entities being stored. However, there can be a large quantity of those smaller data items. In addition to the amount and size of the data, protection or overhead copies of that data are also kept.

Another dimension is that data is also getting larger, where the data structures describing a piece of data for an application have increased in size. For example, a still photograph taken with a digital camera, cell phone, drone, or other mobile or IoT device increases in size with each new generation of cameras as there are more megapixels.

Variety of Data

In addition to having value and volume, there are also different varieties of data, including ephemeral (temporary), persistent, primary, metadata, structured, semi-structured, unstructured, little, and big data. Keep in mind that programs, applications, tools, and utilities get stored as data, while they also use, create, access, and manage data.

There is also primary data and metadata, or data about data, as well as system data that is also sometimes referred to as metadata. Here is where context comes into play as part of tradecraft, as there can be metadata describing data being used by programs, as well as metadata about systems, applications, file systems, databases, and storage systems, among other things, including little and big data.

Context also matters regarding big data, as there are applications such as statistical analysis software and Hadoop, among others, for processing (analyzing) large amounts of data. The data being processed may not be big in terms of the individual records or data items, but there may be a large volume of them. In addition to big data analytics, data, and applications, there is also data that is itself very big (as well as large volumes or collections of data sets).

For example, video and audio, among others, may also be referred to as big fast data, or large data. A challenge with larger data items is the complexity of moving them over distance in a timely manner, as well as processing that requires new approaches, algorithms, data structures, and storage management techniques.
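To put the distance challenge in perspective, here is a hedged back-of-the-envelope sketch of how long moving a large data set over a network link can take. The 10 TB data set size and 1 Gb/s link speed are illustrative assumptions, not figures from the book, and the math ignores protocol overhead, compression, and contention.

```python
# Back-of-envelope sketch: time to move a large data set over a network link.
# The 10 TB data set and dedicated 1 Gb/s link are illustrative assumptions;
# real transfers also see protocol overhead, compression, and contention.

data_bytes = 10 * 10**12        # 10 TB (decimal terabytes)
link_bps = 1 * 10**9            # 1 Gb/s link speed, in bits per second

seconds = data_bytes * 8 / link_bps   # bytes -> bits, divided by line rate
hours = seconds / 3600
print(f"{hours:.1f} hours")           # roughly a day at full line rate
```

Even under these generous assumptions, the move takes most of a day, which is why large data often gets processed where it lives rather than shipped around.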

Likewise, the challenges with large volumes of smaller data are similar in that data needs to be moved, protected, preserved, and served cost-effectively for long periods of time. Both large and small data are stored (in memory or storage) in various types of data repositories.

In general, data in repositories is accessed locally, remotely, or via a cloud using:

  • Objects and blobs via streams, queues, and Application Programming Interfaces (APIs)
  • File-based using local or networked file systems
  • Block-based access of disk partitions, LUNs (logical unit numbers), or volumes
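As a conceptual sketch (not tied to any particular product or API), the three access methods above can be contrasted in a few lines of Python, using a dict as a stand-in for an object store, a path name for file access, and fixed-size offsets into a regular file as a stand-in for block addressing on a LUN or volume. The key names, file name, and block size are assumptions for illustration.

```python
import os

payload = b"hello data infrastructure"

# 1. Object/blob access: data addressed by key via an API
#    (a dict stands in for an S3-style object store PUT/GET).
object_store = {}
object_store["bucket/key-001"] = payload          # PUT
assert object_store["bucket/key-001"] == payload  # GET

# 2. File-based access: data addressed by path name in a file system.
with open("demo.dat", "wb") as f:
    f.write(payload)
with open("demo.dat", "rb") as f:
    assert f.read() == payload

# 3. Block-based access: data addressed by block number and fixed size
#    (a regular file stands in for a raw disk partition or LUN).
BLOCK_SIZE = 512
with open("demo.dat", "rb") as f:
    f.seek(0 * BLOCK_SIZE)            # position at block 0
    block0 = f.read(BLOCK_SIZE)

assert block0.startswith(b"hello")
os.remove("demo.dat")
```

The same bytes are reachable through all three abstractions; what differs is how they are named and addressed, which in turn shapes performance and management.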

The following figure shows varieties of application data value including (left) photos or images, audio, videos, and various log, event, and telemetry data, as well as (right) sparse and dense data.

Application Data Value bits bytes blocks blobs bitstreams sddc
Varieties of data (bits, bytes, blocks, blobs, and bitstreams)

Velocity of Data

Data, in addition to having value (known, unknown, or none), volume (size and quantity), and variety (structured, unstructured, semi-structured, primary, metadata, small, big), also has velocity. Velocity refers to how fast (or slowly) data is accessed, including being stored, retrieved, updated, or scanned, and whether it is active (being updated), static (fixed), or dormant (inactive). In addition to data access and life cycle, velocity also refers to how data is used, such as random or sequential access, or some combination. Think of data velocity as how data, or streams of data, flow in various ways.

Velocity also describes how data is used and accessed, including:

  • Active (hot), static (warm and WORM), or dormant (cold)
  • Random or sequential, read or write-accessed
  • Real-time (online, synchronous) or time-delayed
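The random-versus-sequential distinction above can be sketched in a few lines of Python: the same file is read front to back, then again by seeking to arbitrary record offsets. The 64-byte record size, record count, and file name are assumptions for illustration; real workloads mix both patterns.

```python
import os
import random

random.seed(42)
RECORD = 64                                   # fixed record size (assumed)
records = [bytes([i]) * RECORD for i in range(100)]

with open("velocity.dat", "wb") as f:
    for r in records:
        f.write(r)

# Sequential access: read records in order, front to back.
sequential = []
with open("velocity.dat", "rb") as f:
    for _ in range(len(records)):
        sequential.append(f.read(RECORD))

# Random access: seek to an arbitrary record offset before each read.
order = random.sample(range(len(records)), k=10)
randomly_read = []
with open("velocity.dat", "rb") as f:
    for i in order:
        f.seek(i * RECORD)
        randomly_read.append(f.read(RECORD))

assert sequential == records
os.remove("velocity.dat")
```

On spinning disk the seeks in the random loop dominate the cost; on SSD or in cache the gap narrows, which is one reason knowing the access pattern matters when placing data.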

Why this matters is that by understanding and knowing how applications use data, or how data is accessed via applications, you can make informed decisions. That insight enables you to design, configure, and manage servers, storage, and I/O resources (hardware, software, services) to meet various needs. Understanding Application Data Value, including the velocity of the data both when it is created and when it is used, is important for aligning the applicable performance techniques and technologies.

Where to learn more

Learn more about Application Data Value, application characteristics, performance, availability, capacity, economic (PACE) along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

SDDC Data Infrastructure

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What this all means and wrap-up

Data has different value, size, and velocity as part of its characteristics, including how it is used by various applications. Keep in mind that with Application Data Value Characteristics, Everything Is Not The Same across various organizations, data centers, and data infrastructures spanning legacy, cloud, and other software-defined data center (SDDC) environments. Continue reading the next post (Part V: Application Data Access Life Cycle Patterns Everything Is Not The Same) in this series here.

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Application Data Access Life Cycle Patterns Everything Is Not The Same (Part V)

This is part five of a five-part mini-series looking at Application Data Value Characteristics Everything Is Not The Same as a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we look at various application and data life cycle patterns as well as wrap up this series.

Application Data Value Software Defined Data Infrastructure Essentials Book SDDC

Active (Hot), Static (Warm and WORM), or Dormant (Cold) Data and Lifecycles

When it comes to Application Data Value, a common question I hear is why not keep all data?

If the data has value, and you have a large enough budget, why not? On the other hand, most organizations have a budget and other constraints that determine how much and what data to retain.

Another common question I get asked (or told): isn’t the objective to keep less data to cut costs?

If the data has no value, then get rid of it. On the other hand, if data has value or unknown value, then find ways to remove the cost of keeping more data for longer periods of time so its value can be realized.

In general, in the data life cycle (called by some cradle to grave, or birth/creation to disposition), data is created, saved and stored, and perhaps updated and read, with access patterns, along with value, changing over time. During that time, the data (which includes applications and their settings) will be protected with copies or some other technique, and eventually disposed of.

Between the time when data is created and when it is disposed of, there are many variations of what gets done and needs to be done. Considering static data for a moment, some applications and their data (or data and their applications) create data that is active for a short period, then goes dormant, then is active again briefly before going cold (see the left side of the following figure). This is the classic application, data, and information life-cycle management (ILM) model, with tiering or data movement and migration that still applies for some scenarios.

Application Data Value
Changing data access patterns for different applications

However, a newer scenario over the past several years that continues to increase is shown on the right side of the above figure. In this scenario, data is initially active for updates, then goes cold or WORM (Write Once/Read Many); however, it warms back up as a static reference, on the web, as big data, and for other uses where it is used to create new data and information.

Data, in addition to its other attributes already mentioned, can be active (hot), residing in a memory cache or buffers inside a server, or on a fast storage appliance or caching appliance. Hot data means that it is actively being used for reads or writes (this is what the term heat map pertains to in the context of servers, storage, data, and applications). The heat map shows where the hot or active data is, along with its other characteristics.

Context is important here, as there are also IT facilities heat maps, which refer to physical facilities, including which servers are consuming power and generating heat. Note that some current and emerging data center infrastructure management (DCIM) tools can correlate the physical facilities' power, cooling, and heat to the actual work being done from an application's perspective. This correlated or converged management view enables more granular analysis and more effective decision-making on how to best utilize data infrastructure resources.

In addition to being hot or active, data can be warm (not as heavily accessed) or cold (rarely if ever accessed), as well as online, near-line, or off-line. As their names imply, warm data may occasionally be used, either updated and written, or static and just being read. Some data also gets protected as WORM data using hardware or software technologies. WORM (immutable) data, not to be confused with warm data, is fixed or immutable (cannot be changed).

When looking at data (or storage), it is important to see when the data was created as well as when it was modified. However, avoid the mistake of looking only at when it was created or modified: also look at when it was last read, as well as how often it is read. You might find that some data has not been updated for several years, but is still accessed several times an hour or minute. Also, keep in mind that the metadata about the actual data may be being updated even while the data itself is static.
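On a POSIX-style system you can inspect these timestamps directly. A hedged Python sketch follows; the file name and contents are illustrative, and note that access-time (atime) tracking depends on mount options such as relatime or noatime, so treat the accessed value as advisory.

```python
import os
import time

# Illustrative file name and contents (assumptions, not from the original).
path = "report.csv"
with open(path, "w") as f:
    f.write("year,value\n2017,42\n")

st = os.stat(path)
print("modified:", time.ctime(st.st_mtime))   # last write
print("accessed:", time.ctime(st.st_atime))   # last read (if tracked)
print("changed :", time.ctime(st.st_ctime))   # inode/metadata change

# Static data can still be "hot": mtime stays fixed while atime advances.
stale_writes = (time.time() - st.st_mtime) > 365 * 24 * 3600
print("not written in over a year:", stale_writes)
os.remove(path)
```

A file whose mtime is years old but whose atime keeps advancing is exactly the "static but still read" case described above.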

Also, look at your applications' characteristics as well as how data gets used, to see if it is conducive to caching or automated tiering based on activity, events, or time. For example, a large amount of data for an energy or oil exploration project may normally sit on slower, lower-cost storage, with analysis needing to run on it only now and then.

Using data and storage management tools, given notice or based on activity, that large or big data could be promoted to faster storage, or applications migrated to be closer to the data, to speed up processing. Another example is weekly, monthly, quarterly, or year-end processing of financial, accounting, payroll, inventory, or enterprise resource planning (ERP) schedules. Knowing how and when the applications use the data (which also means understanding the data), automated tools and policies can be used to tier or cache data to speed up processing and thereby boost productivity.
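A minimal sketch of such an activity-based tiering policy appears below. The hot/warm/cold thresholds, tier names, and example items are hypothetical assumptions for illustration, not from any specific product; real tools also weigh cost, SLAs, and scheduled events like the month-end runs mentioned above.

```python
import time

# Hypothetical thresholds (assumptions): accessed within a day is hot,
# within a month is warm, anything older is cold.
HOT_SECS = 24 * 3600
WARM_SECS = 30 * 24 * 3600

def classify(last_access: float, now: float) -> str:
    """Classify an item by the age of its last access."""
    age = now - last_access
    if age <= HOT_SECS:
        return "hot"
    if age <= WARM_SECS:
        return "warm"
    return "cold"

# Hypothetical placement map: where each class of data should live.
PLACEMENT = {"hot": "ssd-tier", "warm": "hdd-tier", "cold": "cloud-archive"}

now = time.time()
items = {
    "erp-ledger.db": now - 600,                      # read 10 minutes ago
    "q3-survey.dat": now - 14 * 24 * 3600,           # read two weeks ago
    "2009-seismic.raw": now - 5 * 365 * 24 * 3600,   # dormant for years
}

for name, last_access in items.items():
    tier = classify(last_access, now)
    print(f"{name}: {tier} -> {PLACEMENT[tier]}")
```

Given notice of an upcoming analysis run, a policy like this could simply be overridden to promote the cold seismic data ahead of time.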

All applications have performance, availability, capacity, economic (PACE) attributes, however:

  • PACE attributes vary by Application Data Value and usage
  • Some applications and their data are more active than others
  • PACE characteristics may vary within different parts of an application
  • PACE application and data characteristics along with value change over time

Read more about Application Data Value, PACE and application characteristics in Software Defined Data Infrastructure Essentials (CRC Press 2017).

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI) and related topics via the following links:

SDDC Data Infrastructure

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What this all means and wrap-up

Keep in mind that Application Data Value everything is not the same across various organizations, data centers, data infrastructures, data and the applications that use them.

Also keep in mind that more data is being created, the size of those data items, files, objects, entities, and records is increasing, and so is the speed at which they get created and accessed. The challenge is not just that there is more data, or that data is bigger, or accessed faster; it is all of those together, along with changing value as well as diverse applications to keep in perspective. With the new General Data Protection Regulation (GDPR) going into effect May 25, 2018, now is a good time to assess and gain insight into what data you have, along with its value, retention, and disposition policies.

Remember, there are different data types, value, life-cycle, volume and velocity that change over time, and with Application Data Value Everything Is Not The Same, so why treat and manage everything the same?

Ok, nuff said, for now.

Gs


Data Infrastructure IT Industry Related Resource Links P to T


IT Data Center and Data Infrastructure Industry Resources

Updated 6/13/2018

Following are some useful Data Infrastructure IT Industry Resource Links P to T to cloud, virtual and traditional IT data infrastructure related web sites. The data infrastructure environment (servers, storage, IO and networking, hardware, software, services, virtual, container and cloud) is rapidly changing. You may encounter a missing URL, or a URL that has changed. This list is updated on a regular basis to reflect changes (additions, changes, and retirement).

Disclaimer and note: URL’s submitted for inclusion on this site will be reviewed for consideration and to be in generally accepted good taste in regards to the theme of this site.

Best effort has been made to validate and verify the data infrastructure URLs that appear on this page and web site however they are subject to change. The author and/or maintainer(s) of this page and web site make no endorsement to and assume no responsibility for the URLs and their content that are listed on this page.

Software Defined Data Infrastructure Essentials Book SDDC

Send an email note to info at storageio dot com that includes company name, URL, contact name, title and phone number along with a brief 40 character description to be considered for addition to the above data infrastructure list, or, to be removed. Note that Server StorageIO and UnlimitedIO LLC (e.g. StorageIO) does not sell, trade, barter, borrow or share your contact information per our Privacy and Disclosure policy. View related data infrastructure Server StorageIO content here, and signup for our free newsletter here.

Links A-E
Links F-J
Links K-O
Links P-T
Links U-Z
Other Links

  • Packeteer.com    WAFS and networking solutions (Bought Tacit)
  • packetlight.com    CWDM and DWDM networking solutions
  • Panasas.com    Clustered storage solution
  • pancetera.com    Virtual machine backup software (Bought by Quantum)
  • Panduit.com    Networking and cable management
  • panzura.com    Cloud storage access software
  • paraccel.com     Business and data analytics
  • paragon-software.com    Storage management and backup tools
  • parallels.com    VDI and desktop virtualization and cloud tools
  • parascale.com     Clustered and cloud storage software
  • pcisig.com    PCI trade group (PCI, PCI-X, PCI-Express/PCIe)
  • penguincomputing.com    HPC servers, storage and hosting
  • pergamumsystems.com    Archive solutions (Stealth)
  • Permabit.com    Data archiving solutions
  • Pernixdata    Server and storage I/O cache optimization for virtual servers
  • perotsystems.com    Hosting and managed service provider (Bought by Dell)
  • pgp.com    Security tools (Bought by Symantec)
  • PHDvirtual    Data protection tools
  • Pillardata.com    Data storage solutions – (Bought by Oracle)
  • pineapp.com    Email, archive solutions, web and data protection
  • Piviot3.com    IP Storage
  • Pivotal Labs    Big Data, PaaS development tools, EMC/VMware spinout
  • plasmon.com    (Now called Alliance Storage Technologies) Optical Storage Solutions
  • plextoramericas.com    SSD and other storage solutions
  • plianttechnology.com    Solid state storage devices (SSD) – (Bought by SANdisk)
  • Pluribus Networks    Converged and software defined network management
  • pmc-serria.com    Storage networking component supplier
  • pny.com    Memory components and technology
  • Pogoplug    Cloud storage
  • PolyServe.com    Clustered storage solutions (Sold to HP)
  • Polargy    Data Center facilities, HVAC and DCIM solutions
  • power.org    Power Processor trade group
  • Mushkin   SSD Solutions
  • Peak Cloud    Cloud and storage services
  • PowerFile.com    Data archiving solutions
  • powerware.com    UPS and power conditioning systems
  • procedo.com    Archiving and migration solutions
  • proceedtechnologies.com    SAP consulting
  • profusionbackups.com    Cloud and managed backup service solution
  • progeny.net    VAR and specialized IT systems
  • prolexic.com    Distributed denial of service tools
  • promise.com    RAID storage systems
  • Prostorsystems.com    Removable disk storage (See RDX Alliance)
  • Proxim.com    Wireless networking
  • proximaldata.com    SSD caching and tiering software
  • pt.com    Communications hardware and software
  • puresi.com aka Puresilicon    SSD storage solutions
  • purestorage.com    SSD based storage
  • Puppet Labs    IT Automation and DCIM tools for physical, Cloud and Virtual

  • qlogic.com    Host bus adapters and switches
  • qsantechnology.com    iSCSI IP storage
  • Qstart Technologies    Data protection storage including LTFS based systems
  • Quadric Software    Data protection software
  • qualstar.com    Tape backup and archive solutions (Aka Qstar)
  • quantum.com    Tape drives and libraries
  • quest.com    IT and data management solution tools (Bought by Dell)
  • Qumulo    Stealth storage startup
  • qwest.com    (Century Link) Telephone and data networking, managed services provider
  • racemi.com    Repurposing management tools
  • Rackable.com    Now SGI
  • Rackspace.com    Managed services and hosting
  • www.rackwise.com    Data center management tools
  • raidinc.com    Storage systems
  • raidundant.com    Storage systems
  • Rainfinity.com    File virtualization (Bought by EMC)
  • rainstor.com    Big data management tools
  • rapidio.org    RapidIO Trade Group
  • Raritan    Data center and DCIM tools
  • rasilient.com    Storage subsystem vendor
  • Ravello    VMware optimization and management tools
  • Raxco    Data, storage and systems management tools
  • rebit.com    Backup and data protection solutions
  • RecordNation    Digital Data Storage and Records Management
  • redbend.com    Mobile device and application management
  • redbooks.ibm.com    IBM Red books and Red pieces technical articles
  • Redhat.com    Linux provider (Bought Gluster)
  • Reduxio    Hybrid storage with data services
  • reflexphotonics.com    Optical connectivity solutions
  • Reldata.com    Storage systems (Renamed Starboard)
  • remote-backup.com    Remote backup software
  • renewdata.com    Data management and compliance tools
  • repliweb.com    Web and content distribution
  • Retrospect    Data Protection Software Tools
  • revivio.com    Data Protection Software (Assets Bought by Symantec)
  • rightscale.com    Amazon cloud computing management tools
  • rimage.com    CD/DVD production technologies
  • risingtidesystems.com    VAR
  • Ritek.com    Storage solutions
  • rittal.com    Enclosures and cabinets
  • riverbed.com    Wide area file access acceleration solution
  • rjssoftware.com    Document capture and management
  • rmsource.com    Cloud backup solutions
  • rnanetworks.com    Virtual memory management solutions (Bought by Dell)
  • rocketdivision.com    iSCSI technologies
  • rorke.com    VAR
  • rpath.com    Data center automation
  • rsa.com    Security division of EMC
  • safemediacorp.com    Internet security and intrusion detection tools
  • safenet-inc.com    Data protection focused VAR
  • Sagecloud   Cloud storage, deep cold archive
  • samsung.com    Various technologies including SSD memory
  • sanblaze.com    Embedded storage and emulation solutions
  • SANbolic.com    Storage, server and cloud management tools
  • sand-chip.com    Chip design
  • SANDforce.com    SSD storage solutions – (Bought by LSI)
  • sandial.com    Defunct SAN startup
  • SANdisk.com    SSD memory components
  • sandpiperdata.com    Data migration services
  • sanmina-sci.com    Contract manufacturer (Virtual Factory) for various OEM/VARs
  • sanovi.com    Disaster recovery management tools
  • sanpulse.com    SRA and automation tools
  • sanrad.com    Storage networking routers (Bought by OCZ)
  • sans.org    Security related web site
  • sansdigital.com    VAR
  • sap.com    Information management tools and applications
  • sas.com    Statistical analysis software
  • sata-io.org    Serial ATA trade organization
  • SavageIO   High performance storage solutions
  • savvis.com    Cloud, managed service provider and hosting (Bought by Centurylink)
  • sbbwg.org    Storage Bridge Bay Working Group
  • scalable-systems.com    Data warehouse consulting and tools
  • scalecomputing.com    Clustered storage management software
  • scalemp.com    Virtualization technology for scale out computing
  • scalent.com    Virtual IT data center management tools
  • scality.com    Email and sharepoint cloud storage
  • schoonerinfotech.com    SSD based database management solutions
  • scsita.org    SCSI and SAS trade group
  • seagate.com    Disk drives
  • Sealpath   Data and information protection tools
  • seanodes.com    Distributed storage
  • sec.gov    Site about compliance items including CFR 17a-4
  • securedatainnovations.com    Data protection and security tools
  • sentilla.com    Data center performance management tools
  • sepaton.com    Disk based backup solutions
  • serialata.org    Serial ATA trade association
  • servicemesh.com    Cloud, datacenter transformation and devops tools
  • servicenow.com    ITIL data center management tools
  • 1servosity.com    Cloud data protection
  • servoy.com    Cloud development tools
  • ServPath.com    Hosting services
  • seven10storage.com    Disaster recovery and archiving software
  • sgi.com    Storage, server and data management hardware, software, tools
  • sherpasoftware.com    Email archiving
  • shop.bellmicro.com    Distributor (Bought by Avnet)
  • siber.com    Data protection and security tools
  • sidusdata.com    Managed service and cloud provider
  • siemon.com    Storage networking infrastructure items
  • sigmasol.com    Value added reseller (VAR)
  • Signiant.com    Data management tools
  • silexamerica.com    Mobile device and server connectivity
  • SiliconImage.com    Digital Video components
  • SiliconStor.com    Storage networking silicon
  • siliconvalleypr.com    IT technologies press/media and analyst relations firm
  • silveradotech.com    VAR
  • silver-peak.com    Wide area data and file services (WAFS, WADM, WADS)
  • SilverSky    Cloud security
  • simpletech.com    Storage solutions including USB portable devices
  • simplivity.com    Convergence and virtualization solutions
  • simplycontinuous.net    Data protection and cloud backup
  • siriuscom.com    VAR
  • site-vault.com    On-line backup server provider (BSP) managed service provider (MSP)
  • skyera.com    SSD storage solutions
  • skytap.com    Public and private cloud application development tools
  • Smart421    AWS connect partner, Hosting/cloud/access services
  • smartm.com    PC card and other memory module components
  • smc.com    Storage and networking components
  • smithmicro.com    Mobile data management tools
  • smmdirect.com    Memory devices
  • snapappliances.com    NAS Storage solutions (Now Adaptec)
  • snia.org    Storage Networking Industry Association
  • snseurope.com    U.K. & European Storage Networking News
  • snwusa.com    SNIA and Computerworld conference
  • softek.com    Storage management solutions (formerly Fujitsu Softek, Sold to IBM)
  • softlayer.com    Cloud infrastructure services (IaaS) (Bought by IBM)
  • softnas.com    ZFS based opensource NAS solutions
  • softricity.com    Virtualization management tools (Bought by Microsoft)
  • Sogeti.com    Data management tools
  • solarflare.com    10Gb Ethernet networking
  • solarwinds.com    IT management tools (Bought TekTools, Hyper9 and others)
  • solidaccess.com    Solid state storage (SSD) solutions
  • soliddata.com    Solid State Disk solutions
  • solidfire.com    iSCSI SSD optimized for hosting and cloud providers
  • Solix.com    Database archiving software
  • solutiontechnology.co.uk    Storage networking training
  • sonasoft.com    Email archiving, backup and data protection
  • sonnettech.com    External storage solutions
  • sony.com    Storage devices
  • sophos.com    Data protection and security tools
  • sorrento.com    Optical networking
  • sparebackup.com    Backup data protection solutions
  • sparkweave.com    Private cloud archive and file sharing
  • spec.org    SPEC benchmarks
  • spectralogic.com    Tape library and disk based backup solutions
  • spiceworks.com    Online community and management software tools
  • spirent.com    Storage networking test equipment
  • Spiron.com  Data discovery, classification, lifecycle management (formerly Identity Finder)
  • Splice Communications    AWS connect partner, Hosting/cloud/access services
  • splunk.com    DCIM and log management tools
  • spotcloud.com    Cloud services clearing house
  • spraycool.com    IT Data center and component cooling
  • springsoft.com    Bought by Synopsys
  • spsoftglobal.com    Software development
  • spyrus.com    Security tools
  • ssswg.org    IEEE Storage Systems Standards Work Group
  • starboardstorage.com    Unified storage solutions (Formerly Reldata, now ceased operations)
  • startech.com    IT/AV technology equipment from enclosures to KVM and more
  • starwindsoftware.com    iSCSI storage management solutions
  • stcroixsolutions.com    VAR
  • stec-inc.com    SSD storage (Bought by WD)
  • Steeleye.com    HA software
  • Stellar    Data Protection tools
  • storagetek.com    Disk, tape, data management software (Bought by Sun)
  • stonebranch.com    File transfer tools
  • stonefly.com    Storage networking routers (Aka DNF)
  • storability.com    Storage management software (Bought by STK)
  • storactive.com    Data protection solutions
  • storagecraft.com    Data protection tools
  • storagefusion.com    Storage resource analysis (SRA) tools
  • storageio.net    Alternate URL for the StorageIO Group
  • storageiogroup.com    Alternate URL for the StorageIO Group
  • storagemadeeasy.com    Hybrid and personal cloud management tools and dashboards
  • Storagemonkeys.com    Storage community site
  • storagenetworking.org    Storage Networking Users Groups also known as SNUGs
  • storageperformance.org    Storage Performance Council information
  • www.storagesearch.com    Venue for information about various storage and related topics
  • storcase.com    Data Archive solutions (Bought by Crudata)
  • store-age.com    Storage management software (Bought by LSI)
  • storediq.com    eDiscovery, search, indexing, classification (Bought by IBM)
  • Storewize.com    Real time data compression (Bought by IBM)
  • Storix.com    Data backup solutions
  • storlife.com    CAS object archive storage
  • stormagic.com    Storage virtualization and data movement software
  • storserver.com    Backup and data protection solutions
  • storsimple.com    Cloud storage access solutions (Bought by Microsoft)
  • storspeed.com    NAS/NFS optimization solutions (Missing in action)
  • stratascale.com    Cloud, hosting and management solutions
  • stratus.com    High availability storage and servers
  • sugarsync.com    Backup and data protection solutions
  • sun.com    Storage networking hardware and software (Bought by Oracle)
  • sunbeltsoftware.com    End point data protection security tools
  • sungard.com    Data protection and cloud services
  • superlumin.com    Application caching tools
  • supermicro.com    Server and storage solutions
  • surdoc.com    Cloud storage and backup
  • surgient.com    Cloud computing solutions
  • svlg.net    Silicon Valley Leadership Group
  • Swiftstack    Private cloud solutions
  • swifttest.com    NFS and CIFS storage testing solutions
  • sybase.com    Database solutions
  • sycamorenetworks.com    Networking solutions
  • Symantec.com    Data and storage management software
  • symbolicio.com    stealth startup
  • symform.com    Cloud storage and backup
  • syncsort.com    Information Management tools
  • synnex.com    Distributor
  • Synnex   IT Solutions
  • synology.com    SMB storage solutions
  • synopsys.com    Computer technology development and manufacturing
  • SysAid    Data center, DCIM and ITSM tools
  • t10.org    ANSI T10 (SCSI information) site
  • t11.org    ANSI T11 page for Fibre Channel information
  • t3media.com    Cloud storage and video platform tools
  • tableausoftware.com    Data analytics software tools
  • tacit.com    WAN file system accelerator (Bought by Packeteer)
  • tacitnetworks.com    Wide area file access acceleration solution (Bought by Packeteer)
  • tandberg.com    Data management solutions (Bought by Cisco)
  • tapeandmedia.com    Information about magnetic tape media
  • tapepower.com    Site for tape topics
  • tarmin.com        Archiving solutions
  • teamdrive.com    Cloud storage
  • teamquest.com    IRM management and capacity management tools
  • TeamViewer.com    Remote support and Online meeting software
  • techdata.com    Distributor
  • tegile.com    Storage system solutions
  • tehutinetworks.net    High speed iSCSI adapters
  • tek-tools.com    SRM storage management software (Bought by Solarwinds)
  • TelecityGroup    AWS connect partner, Hosting/cloud/access services
  • tellabs.com    Networking components
  • Telx    AWS connect partner, Hosting/cloud/access services
  • teneros.com    Email archiving and management solutions
  • teracloud.com    Capacity planning and resource management software
  • teradata.com    Large scale database and data warehouse systems
  • teradici.com    PC over IP technologies
  • teranetics.com    Ethernet chips
  • Terascala    Data analytics and management solutions
  • ter.de    Optical storage libraries
  • terracloudinc.com    Cloud services
  • TerraScale.com    Scalable storage and server solutions
  • Verizon/Terremark   Cloud, hosting and managed services
  • Tevron   Application Response Time Monitoring
  • texmemsys.com    Solid State Disk storage
  • thebci.org    Business Continuity Institute
  • thecus.com    Multi-protocol storage
  • thegreengrid.org    Industry Trade Group
  • The Padcaster    Apple iPad tools
  • thepluggllc.com    Data center energy efficient floor tiles
  • theq3.com    Data storage security solutions
  • thinkaheadit.com aka Ahead    Value added reseller
  • thirdbrigade.com    Intrusion detection security tools (Bought by Trend Micro)
  • thirdio.com    SSD solutions
  • tiaonline.org    Telecommunications Industry Association
  • tidalsoftware.com    IT Management software tools (Bought by Cisco)
  • timespring.com    Continuous data protection solutions
  • tintri.com    NFS and NAS storage optimized for VMware
  • tivoli.com    Data management software
  • Softbank Telecom Corp.    AWS connect partner, Hosting/cloud/access services
  • Primary Data and Tonian    Stealth data virtualization startup
  • topgun-tech.com    Data Infrastructure Resource (Server, Storage, SANs)
  • top500.org    Top 500 super compute sites
  • topio.com    Data protection software (Bought by NetApp)
  • topspin.com    InfiniBand Technology (Bought by Cisco)
  • Toshiba.com    Server and storage solutions
  • tpc.org    Transaction processing performance council
  • translattice.com    Distributed and elastic database and automation tools
  • Tredent.com    WAN optimization solutions
  • TrendMicro.com    Security and anti virus tools
  • trianz.com    VAR
  • tributary.com   Data protection solution tools including virtual, disk and tape
  • trilogytechnologies.ie    Managed services provider
  • tritondata.com    IT services and VAR
  • trunkbow.com    Cloud, mobile and networking services
  • trustedcomputinggroup.org    Trusted computing industry trade group
  • trusteddatasolutions.com    VAR
  • trustedid.com    ID theft protection
  • trustware.com    Internet and data protection security tools
  • turnkeylinux.org   Turnkey Linux appliances
  • tusc.com    VAR
  • twinstrata.com    BC/DR analysis and cloud access software
  • tw telecom    AWS connect partner, Hosting/cloud/access services
  • TSO logic    DCIM and data center power energy management tools
  • tzolkin.com    DNS and High Availability solutions

Where To Learn More

View additional NAS, NVMe, SSD, NVM, SCM, Data Infrastructure and HDD related topics via the following links.

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What This All Means

Visit the following additional data infrastructure and IT data center related links.

Links A-E
Links F-J
Links K-O
Links P-T
Links U-Z
Other Links

Ok, nuff said, for now.

Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (vSAN and vCloud). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Have VTLs or VxLs become Zombies, Declared dead yet still alive?

Have you heard or read the reports and speculation that VTLs (Virtual Tape Libraries) are dead?

It seems that in IT the all too popular trend is to declare something dead so that your new product or technology can have a chance of making it into the market, or perhaps be seen in a better light.

Sometimes this approach works to temporarily freeze the market until common sense and clarity return, or until something else fun to talk about comes along; in other cases, the messages fall on deaf ears.

The approach of declaring something dead tends to play well for those who like shiny new toys (SNT) or new shiny toys (NST) and being on the popular, cool trendy bandwagon.

Not surprisingly, while some actual IT customers can fall into the SNT or NST syndrome, it's often the broader industry, including media, bloggers, analysts, consultants and other self-proclaimed or anointed pundits as well as vendors, who latch on to the declare-it-dead movement. After all, who wants to talk about something that is old, boring and already being sold to paying customers who are using it? Now this is not a bad thing, as we need a balance of up-and-coming challengers to keep the status quo challenged; likewise we need a balance of the new to avoid death grips on the old and what is working.

Likewise, many IT customers, particularly larger ones, tend to be very risk averse and conservative with their budgets, protecting their investments; thus they may only go leading (bleeding) edge if there is a dual redundant blood bank with a backup on hot standby (that's some HA humor BTW).

Another reason for declaring items dead in support of SNT and NST is that while many of the commonly declared dead items are on the proverbial plateau of productivity for IT customers, that can also mean they are on the plateau of profitability for the vendors.

However, not all good things last, and at some time there is the need to transition from the old to the new. This is where things like virtualization, including virtual tape libraries, virtual disk libraries or virtual storage libraries, or whatever you want to call a VxL (more on what a VxL is in a moment), can come into play.

I realize that for some, particularly those who like to grasp on to SNT and NST and ride the dead pool bandwagons, this will probably appear snarky or cynical, which is fine; after all, some of you should be laughing to the bank, and if not, you may in fact be missing out on an opportunity to play in the dead pool marketing game.

Now back to VxL.

In the case of VTLs, for some it is the T word that bothers them, you know, T as in Tape, which is not an SNT or NST in an age where SSD has supposedly killed the disk drive, which allegedly terminated tape (yeah right). Sure, tape is not being used as much for backup as it has been in the past, with its role shifting to longer-term retention, something it is well suited for.

For tape fans (or cynics) you can read more here, here and here. However there is still a large amount of backup/restore along with other data protection or preservation (e.g. archiving) processing (software tools, processes, procedures, skill sets, management tools) that still expects to see tape.

Hence this is where VTLs or VxLs come into play, leveraging virtualization in a Life Beyond Consolidation (and here) scenario, providing abstraction, transparency, agility and emulation, and IMHO they are still very much alive and evolving.

Ok, for those who do not like or believe in tape or its continued existence and evolving role, substitute the T (tape) with X and you get a VxL. That is, plug in whatever X word makes you happy, marketable or a Shiny New TLA. For example: Virtual Disk Library, Virtual Storage Library, Virtual Backup Library, Virtual Compression Library, Virtual Dedupe Library, Virtual ILM Library, Virtual Archive Library, Virtual Cloud Library and so forth. Granted, some VxLs only emulate tape and hence are VTLs, while others support NAS and other protocols (or personalities), not to mention functionality ranging from replication and DFR to automated policy management.

However, keep in mind that whatever your preference, VTL, VxL or whatever other buzzword bingo name you want to use or come up with, look at how virtualization in the form of abstraction, transparency and emulation can bridge the gap between the new (disk-based data protection combined with DFR, or Data Footprint Reduction) and the old (existing backup/restore, archive or other management tools and processes).

Here are some additional links pertaining to VTLs (excuse me, VxLs):

  • Virtual tape libraries: Old backup technology holdover or gateway to the future?
  • Not to mention here, here, here, here or here.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

What is DFR or Data Footprint Reduction?

Updated 10/9/2018


Data Footprint Reduction (DFR) is a collection of techniques, technologies, tools and best practices that are used to address data growth management challenges. Dedupe is currently the industry darling for DFR particularly in the scope or context of backup or other repetitive data.

However, DFR expands beyond backup to address growing data footprints and their impact across primary, secondary and offline data, ranging from high-performance to inactive high-capacity.

Consequently, the focus of DFR is not just on reduction ratios; it is also about meeting time or performance rates and data protection windows.

This means DFR is about using the right tool for the task at hand to effectively meet business needs, and cost objectives while meeting service requirements across all applications.

Examples of DFR technologies include Archiving, Compression, Dedupe, Data Management and Thin Provisioning among others.
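As a toy illustration (not from the book or the original post), the following Python sketch contrasts two of those DFR techniques, compression and fixed-block dedupe, on a deliberately repetitive, hypothetical workload. The 32-byte sample record, the 4 KiB block size and the SHA-256 block fingerprinting are all assumptions chosen for the sketch, not a description of any particular product:

```python
import hashlib
import zlib

# Hypothetical backup-like workload: one 32-byte record repeated 16,384 times.
record = b"rec-00000001;status=active;x=1;\n"   # exactly 32 bytes
data = record * 16384                           # 524,288 bytes total

# Compression: shrink the byte stream itself.
compressed = zlib.compress(data, level=6)

# Fixed-block dedupe sketch: split into 4 KiB blocks, keep only unique blocks,
# identified here by a SHA-256 fingerprint of each block's contents.
block_size = 4096
blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
unique = {hashlib.sha256(b).hexdigest(): b for b in blocks}
deduped_bytes = sum(len(b) for b in unique.values())

print(f"original:   {len(data)} bytes")
print(f"compressed: {len(compressed)} bytes "
      f"({len(data) / len(compressed):.1f}:1)")
print(f"deduped:    {deduped_bytes} bytes "
      f"({len(data) / deduped_bytes:.1f}:1)")   # 128 identical blocks -> 128.0:1
```

Real data is far less uniform than this, which is exactly the point of the surrounding discussion: the ratio you get depends on the data and the technique, so the right tool depends on the task at hand.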

Read more about DFR in Part I and Part II of a two part series found here and here.

Where to learn more

Learn more about data footprint reduction (DFR), data footprint overhead and related topics via the following links:

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What this all means

That is all for now, hope you find these ongoing series of current or emerging Industry Trends and Perspectives posts of interest.

Ok, nuff said, for now.

Cheers Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2018. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

Data footprint reduction (Part 2): Dell, IBM, Ocarina and Storwize


Over the past couple of weeks there has been a flurry of IT industry activity around data footprint impact reduction, with Dell buying Ocarina and IBM acquiring Storwize. For those who want the quick (compacted, reduced) synopsis of what Dell buying Ocarina as well as IBM acquiring Storwize means, read the first post in this two-part series as well as some of my comments here and here.

This piece and its companion in part I of this two-part series are about expanding the discussion to the much larger opportunity for vendors or VARs of overall data footprint impact reduction beyond where they are currently focused. Likewise, this is about IT customers realizing that there are more opportunities to address data and storage optimization across the entire organization using various techniques instead of just focusing on backup or VMware virtual servers.

Who are Ocarina and Storwize?
Ocarina is a data and storage management software startup focused on data footprint reduction using a variety of approaches, techniques and algorithms. They differ from the traditional data dedupers (e.g. Asigra, Bakbone, Commvault, EMC Avamar, Datadomain and Networker, Exagrid, Falconstor, HP, IBM Protectier and TSM, Quantum, Sepaton and Symantec among others) by looking at data footprint reduction beyond just backup.

This means looking at how to reduce data footprint across different types of data including videos, images as well as text-based documents among others. As a result, the market sweet spot for Ocarina is general data footprint reduction including static along with active data covering entertainment, video surveillance or gaming, reference data, web 2.0 and other bulk storage application data needs (this should complement Dell's recent Exanet acquisition).

What this means is that Ocarina is very well suited to address the rapidly growing amount of unstructured data that may not otherwise be handled as efficiently by dedupe alone.

Storwize is a data and storage management startup focused on data footprint reduction using inline compression with an emphasis on maintaining performance for reads as well as writes of unstructured as well as structured database data. Consequently the market sweet spot for Storwize is around boosting the capacity of existing NAS storage systems from different vendors without negatively impacting performance. The trade off of the Storwize approach is that you do not get the spectacular data reduction ratios associated with backup centric or focused dedupe, however, you maintain performance associated with online storage that some dedupers dream of.

Both Dell and IBM have existing dedupe solutions for general purpose as well as backup along with other data footprint impact reduction tools (either owned or via partners). Now they are both expanding their focus and reach similar to what others such as EMC, HP, NetApp, Oracle and Symantec among others are doing. What this means is that someone at Dell and IBM see that there is much more to data footprint impact reduction than just a focus on dedupe for backup.

Wait, what does all of this discussion (or read here for background issues, challenges and opportunities) about unstructured data and changing access lifecycles have to do with dedupe, Ocarina and Storwize?

Continue reading on as this is about the expanding opportunity for data footprint reduction across entire organizations. That is, more data is being kept online and expanding data footprint impact needs to be addressed to meet business objectives using various techniques balancing performance, availability, capacity and energy or economics (PACE).


What does all of this have to do with IBM buying Storwize and Dell acquiring Ocarina?
If you have not pieced this together yet, let me net it out.

This is about the opportunity to address the organization wide expanding data footprint impact across all applications, types of data as well as tiers of storage to support business growth (more data to store) while maintaining QoS yet reduce per unit costs including management.

This is about expanding the story to the broader data footprint impact reduction from the more narrowly focused backup and dedupe discussion which are still in their infancy on a relative basis to their full market potential (read more here).

Now are you seeing where this is going and fits?

Does this mean IBM and Dell defocus on their existing Dedupe product lines or partners?
I do not believe so, at least as long as their respective revenue prevention departments are kept on the sidelines and off the field of play. What I mean by this is that the challenge for IBM and Dell is similar to that of others such as EMC who are faced with having diverse portfolios or technology toolboxes. The challenge is messaging to the bigger issues, then aligning the right tool to the task at hand to address given issues and opportunities, instead of focusing singularly on a specific product and causing revenue prevention elsewhere.

As an example, for backup, I would expect Dell to continue to work with its existing dedupe backup centric partners and technologies while finding new opportunities to leverage their Ocarina solution. Likewise, I would expect IBM to continue to show customers where Tivoli software based dedupe or Protectier (aka the deduper formerly known as Diligent) or other target based dedupe fits, and to expand into other data footprint impact areas with Storwize.

Does this change the playing field?
IMHO these moves as well as some previous moves by the likes of EMC and NetApp among others are examples of expanding the scope and dimension of the playing field. That is, the focus is much more than just dedupe for backup or of virtual machines (e.g. VMware vSphere or Microsoft HyperV).

This signals a growing awareness around the much larger and broader opportunity around organization wide data footprint impact reduction. In the broader context some applications or data gets compressed either in application software such as databases, file systems, operating systems or even hypervisors as well as in networks using protocol or bandwidth optimizers as well as inline compression or post processing techniques as has been the case with streaming tape devices for some time.

This also means that while the primary focus or marketing angle for dedupe has until recently been reduction ratios, data transfer rates also become important to meet the needs of time- or performance-sensitive applications.

Hence the role of policy based data footprint reduction where the right tool or technique to meet specific service requirements is applied. For those vendors with a diverse data footprint impact reduction tool kit including archive, compression, dedupe, thin provision among other techniques, I would expect to hear expanded messaging around the theme of applying the right tool to the task at hand.
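To make the ratio-versus-rate trade-off concrete, here is a small hedged sketch (my illustration, not anything from the vendors discussed) using Python's built-in zlib. The workload, sizes and compression levels are assumptions; a fast level trades reduction ratio for throughput, while a thorough level does the opposite, which is the essence of matching the tool to the service requirement:

```python
import time
import zlib

# Hypothetical workload: roughly 8.5 MB of semi-repetitive log-style data.
payload = (b"log-entry level=info msg=ok seq=%d\n" % 7) * 250000

for level in (1, 9):  # 1 = fast/lower ratio, 9 = slow/higher ratio
    start = time.perf_counter()
    out = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    ratio = len(payload) / len(out)          # reduction ratio
    rate = len(payload) / elapsed / 1e6      # ingest rate in MB/s
    print(f"level {level}: ratio {ratio:.1f}:1, rate {rate:.1f} MB/s")
```

A policy engine applying "the right tool for the task" would, in effect, pick the level (or technique: dedupe, compression, archive) per workload based on whether the service requirement is capacity savings or meeting a protection window.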

Does this mean Dell bought Ocarina to accessorize EqualLogic?
Perhaps, however that would then beg the question of why EqualLogic needs accessorizing. Granted there are many EqualLogic along with other Dell sold storage systems attached to Dell and other vendors servers operating as NFS or Windows CIFS file servers that are candidates for Ocarina. However there are also many environments that do not yet include Dell EqualLogic solutions where Ocarina is a means for Dell to extend their reach enabling those organizations to do more with what they have while supporting growth.

In other words, Ocarina can be used to accessorize, or, it can be used to generate and create pull through for various Dell products. I also see a very strong affinity and opportunity for Dell to combine their recent Exanet NAS storage clustering software with Dell servers, storage to create bulk or scale out solutions similar to what HP and other vendors have done. Of course what Dell does with the Ocarina software over time, where they integrate it into their own products as well as OEM to others should be interesting to watch or speculate upon.

Does this mean IBM bought Storwize to accessorize XIV?
Well, I guess if you put a gateway (or software on a server, which is the same thing) in front of XIV to transform it into a NAS system, sure, then Storwize could be used to increase the net usable capacity of the XIV installed base. However, that is a lot of work and cost for what is, on a relative basis, a small footprint, yet it is a viable option nevertheless.

IMHO IBM has much more of a play, perhaps a home run, by walking before they run: placing Storwize in front of their existing large installed base of NetApp N series (not to mention targeting NetApp's own install base) as well as complementing their SONAS solutions. From there, as IBM gets their legs and mojo, they could go on the attack against other vendors' NAS solutions with an efficiency story, similar to how IBM server groups target other vendors' server business for takeout opportunities, except in a complementary manner.

Longer term, I would not be surprised to see IBM continue development of the block-based IP (as well as file) in the Storwize product for deployment in solutions ranging from SVC to their own or OEM-based products, along with articulating their comprehensive data footprint reduction solution portfolio. What will be important for IBM is articulating what solution to use when, where, why and how without confusing their customers, partners and the rest of the industry (something that Dell will also have to do).

Some links for additional reading on the above and related topics

Wrap up (for now)

Organizations of all shapes and sizes are encountering some form of growing data footprint impact that currently, or soon will, need to be addressed. Given that different applications and types of data, along with associated storage mediums or tiers, have various performance, availability, capacity, energy as well as economic characteristics, multiple data footprint impact reduction tools or techniques are needed. What this all means is that the focus of data footprint reduction is expanding beyond just dedupe for backup or other early deployment scenarios.

Note what this means is that dedupe has an even brighter future than where it currently is focused which is still only scratching the surface of potential market adoption as was discussed in part 1 of this series.

However this also means that dedupe is not the only solution to all data footprint reduction scenarios. Other techniques including archiving, compression, data management, thin provisioning, data deletion, tiered storage and consolidation will start to gain respect, coverage discussions and debates.

Bottom line, use the most applicable technologies or combinations along with best practice for the task and activity at hand.

For some applications, reduction ratios are the important focus, favoring tools or modes of operation that achieve those results.

Likewise for other applications where the focus is on performance with some data reduction benefit, tools are optimized for performance first and reduction secondary.

Thus I expect messaging from some vendors to adjust (expand) to the capabilities they have in their toolbox (product portfolio) offerings.

Consequently, IMHO some of the backup centric dedupe solutions may find themselves in niche roles in the future unless they can diversify. Vendors with multiple data footprint reduction tools will also do better than those with only a single function or focused tool.

However, for those who only have a single or perhaps a couple of tools, well, guess what the approach and messaging will be. After all, if all you have is a hammer everything looks like a nail; if all you have is a screwdriver, well, you get the picture.

On the other hand, if you are still not clear on what all this means, send me a note, give a call, post a comment or a tweet, and I will be happy to discuss it with you.

Oh, FWIW, if interested, disclosure: Storwize was a client a couple of years ago.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Industry Trends and Perspectives: Tiered Storage, Systems and Mediums

This is part of an ongoing series of short industry trends and perspectives blog posts briefs.

These short posts complement other longer posts along with traditional industry trends and perspective white papers, research reports and solution brief content found at www.storageioblog.com/reports.

Two years ago we read about how the magnetic disk drive would be dead in a couple of years at the hands of flash SSD. Guess what: it is a couple of years later and the magnetic disk drive is far from dead. Granted, high performance Fibre Channel disks will continue to be replaced by high performance, small form factor 2.5" SAS drives along with continued adoption of high capacity SAS and SATA devices.

Likewise, SSD or flash drives continue to be deployed, however, outside of the iPhone, iPod and other consumer or low-end devices, nowhere near the projected or perhaps hoped-for level. Rest assured, the trend I'm seeing and hearing from IT customers is that while some will continue to look for places to strategically deploy SSD where possible, practical and affordable, there will continue to be a role for disk and even tape devices on a go-forward basis.

Also watch for more coverage and discussion around the emergence of the Hybrid Hard Disk Drive (HHDD) that was discussed about four to five years ago. The HHDD made an appearance and then quietly went away for some time, perhaps more R and D time in the labs while flash SSD garnered the spotlight.

There could be a good opportunity for HHDD technology leveraging the best of both worlds: continued price decreases for larger-capacity disk combined with smaller yet more affordable amounts of flash, in a solution that is transparent to the server or storage controller, making for easier integration.

Related and companion material:
Blog: ILM = Has It Lost its Meaning
Blog: SSD and Storage System Performance
Blog: Has SSD put Hard Disk Drives (HDDs) On Endangered Species List
Blog: Optimize Data Storage for Performance and Capacity Efficiency

That is all for now, hope you find this ongoing series of current and emerging Industry Trends and Perspectives interesting.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

2010 and 2011 Trends, Perspectives and Predictions: More of the same?

2011 is not a typo; I figured that since I'm getting caught up on some things, why not get a jump as well.

Since 2009 went by so fast, and since I'm finally getting around to doing an obligatory 2010 predictions post, let's take a look at both 2010 and 2011.

Actually, I'm getting around to doing a post here having already done interviews and articles for others soon to be released.

Based on prior trends and looking at forecasts, a simple prediction is that some of the items for 2010 will apply for 2011 as well, given that some of this year's items may have been predicted by some in 2008, 2007, 2006, 2005 or, well ok, you get the picture. :)

Predictions are fun and funny in that for some, they are taken very seriously, while for others, at best they are taken with a grain of salt depending on where you sit. This applies both for the reader as well as who is making the predictions along with various motives or incentives.

Some are serious, some not so much…

For some, predictions are a great way of touting or promoting favorite wares (hard, soft or services) or getting yet another plug (YAP is a TLA BTW) in to meet coverage or exposure quota.

Meanwhile for others, predictions are a chance to brush up on new terms for the upcoming season of buzzword bingo games (did you pick up on YAP).

In honor of the Vancouver winter games, I'm expecting some cool Olympic-sized buzzword bingo games, with a new slippery fast one being federation. Some buzzwords will take a break in 2010 as well as 2011, having been worked pretty hard the past few years, while others that have been on break will reappear well rested, rejuvenated and ready for duty.

Let's also clarify something regarding predictions: they can come from at least two different perspectives. One view is that of a trend, what will be talked about or discussed in the industry. The other is in terms of what will actually be bought, deployed and used.

What can be confusing is sometimes the two perspectives are intermixed or assumed to be one and the same and for 2010 I see that trend continuing. In other words, there is adoption in terms of customers asking and investigating technologies vs. deployment where they are buying, installing and using those technologies in primary situations.

It is safe to say that there is still no such thing as an information, data or processing recession. Ok, surprise, surprise; my dogs could probably have made that prediction during a nap. However, what this means is more data will need to be moved, processed and stored for longer periods of time and at a lower cost without degrading performance or availability.

This means, denser technologies that enable a lower per unit cost of service without negatively impacting performance, availability, capacity or energy efficiency will be needed. In other words, watch for an expanded virtualization discussion around life beyond consolidation for servers, storage, desktops and networks with a theme around productivity and virtualization for agility and management enablement.

Certainly there will be continued mergers and acquisitions on both a small as well as large scale, ranging from liquidation sales or bargain hunting to a mega blockbuster or two. I'm thinking in terms of outside-the-box deals, the type that will have people wondering, perhaps confused, as to why such a deal would be done until the whole picture is revealed and thought out.

In other words, outside of perhaps IBM, HP, Oracle, Intel or Microsoft among a few others, no vendor is too large not to be acquired, merged with, or even involved in a reverse merger. I'm also thinking in terms of vendors filling in niche areas as well as building out their larger portfolio and IT stacks for integrated solutions.

Ok, let's take a look at some easy ones, lay-ups or slam dunks:

  • More cluster, cloud conversations and confusion (public vs. private, service vs. product vs. architecture)
  • More server, desktop, IO and storage consolidation (excuse me, server virtualization)
  • Data footprint impact reduction ranging from deletion to archive to compress to dedupe among others
  • SSD and in particular flash continues to evolve with more conversations around PCM
  • Growing awareness of social media as yet another tool for customer relations management (CRM)
  • Security, data loss/leak prevention, digital forensics, PCI (payment card industry) and compliance
  • Focus expands from gaming/digital surveillance/security and energy to healthcare
  • Fibre Channel over Ethernet (FCoE) mainstream in discussions with some initial deployments
  • Continued confusion of Green IT and carbon reduction vs. economic and productivity (Green Gap)
  • No such thing as an information, data or processing recession, granted budgets are strained
  • Server, Storage or Systems Resource Analysis (SRA) with event correlation
  • SRA tools that provide and enable automation along with situational awareness

The green gap of confusion will continue, with carbon or environment-centric stories and messages taking a back seat as people realize the other dimension of green: productivity.

As previously mentioned, virtualization of servers and storage continues to be popular, with the focus expanding from just consolidation to one around agility and flexibility, enabling production, high-performance or other systems that do not lend themselves to consolidation to be virtualized.

6Gb SAS interfaces as well as more SAS disk drives continue to gain popularity. I have said in the past it was a long shot that 8GFC disk drives might appear. We might very well see those in higher-end systems while SAS drives continue to pick up the high-performance spinning disk role in midrange systems.

Granted, some types of disk drives will give way over time to others; for example, high-performance 3.5” 15.5K Fibre Channel disks will give way to 2.5” 15.5K SAS, boosting densities and energy efficiency while maintaining performance. SSD will help to offload hot spots as it has in the past, enabling disks to be more effectively used in their applicable roles or tiers, with a net result of enhanced optimization, productivity and economics, all of which have environmental benefits (e.g. the other Green IT, closing the Green Gap).

What I don't see occurring, at least in 2010:

  • An information or data recession requiring less server, storage, I/O networking or software resources
  • OSD (object based disk storage without a gateway) at least in the context of T10
  • Mainframes, magnetic tape, disk drives, PCs, or Windows going away (at least physically)
  • Cisco cracking top 3, no wait, top 5, no make that top 10 server vendor ranking
  • More respect for growing and diverse SOHO market space
  • iSCSI taking over for all I/O connectivity, however I do see iSCSI expand its footprint
  • FCoE and flash based SSD reaching tipping point in terms of actual customer deployments
  • Large increases in IT Budgets and subsequent wild spending rivaling the dot com era
  • Backup, security, data loss prevention (DLP), data availability or protection issues going away
  • Brett Favre and the Minnesota Vikings winning the Super Bowl

What will be predicted at the end of 2010 for 2011 (some of these will be déjà vu):

  • Many items that were predicted this year, last year, the year before that and so on…
  • Dedupe moving into primary and online active storage, rekindling of dedupe debates
  • Demise of cloud in terms of hype and confusion being replaced by federation
  • Clustered, grid, bulk and other forms of scale out storage grow in adoption
  • Disk, Tape, RAID, Mainframe, Fibre Channel, PCs, Windows being declared dead (again)
  • 2011 will be the year of Holographic storage and T10 OSD (an annual prediction by some)
  • FCoE kicks into broad and mainstream deployment adoption reaching tipping point
  • 16Gb (16GFC) Fibre Channel gets more attention stirring FCoE vs. FC vs. iSCSI debates
  • 100GbE gets more attention along with 4G adoption in order to move more data
  • Demise of iSCSI at the hands of SAS at low end, FCoE at high end and NAS from all angles

Gaining ground in 2010, however not yet in full stride (at least in terms of customer deployment):

  • On the connectivity front, iSCSI, 6Gb SAS, 8Gb Fibre Channel, FCoE and 100GbE
  • SSD/flash based storage, continuing to expand however not yet everywhere
  • Dedupe everywhere including primary storage – it's still far from its full potential
  • Public and private clouds along with pNFS as well as scale out or clustered storage
  • Policy based automated storage tiering and transparent data movement or migration
  • Microsoft Hyper-V and Oracle based server virtualization technologies
  • Open source based technologies along with heterogeneous encryption
  • Virtualization life beyond consolidation addressing agility, flexibility and ease of management
  • Desktop virtualization using Citrix, Microsoft and VMware along with Microsoft Windows 7

Buzzword bingo hot topics and themes (in no particular order) include:

  • 2009 and previous year carry-over items including cloud, iSCSI, Hyper-V, dedupe, open source
  • Federation takes over some of the work of cloud, virtualization, clusters and grids
  • E2E, End to End management preferably across different technologies
  • SAS, Serial Attached SCSI for server to storage systems and as disk to storage interface
  • SRA, E2E, event correlation and other situational awareness related IRM tools
  • Virtualization, Life beyond consolidation enabling agility, flexibility for desktop, server and storage
  • Green IT, Transitions from carbon focus to economic with efficiency enabling productivity
  • FCoE, Continues to evolve and mature with more deployments however still not at tipping point
  • SSD, Flash based mediums continue to evolve however tipping point is still over the horizon
  • IOV, I/O Virtualization for both virtual and non virtual servers
  • Other new or recycled buzzword bingo candidates include PCoIP, 4G,

RAID will again be pronounced dead and no longer relevant, yet it is being found in more diverse deployments from the consumer to the enterprise. In other words, RAID may be boring and thus no longer relevant to talk about, yet it is being used everywhere and enhanced in evolutionary, perhaps for some even revolutionary, ways.

Tape also continues to be declared dead (e.g. placed on the Zombie technology list), yet it is being enhanced, purchased and utilized at higher rates, with more data stored on it than at any point in the past. Instead of being killed off by the disk drive, tape is being kept around for both traditional uses as well as taking on new roles where it is best suited, such as long-term or bulk off-line storage of data in an ultra-dense, energy-efficient, not to mention economical, manner.

What I am seeing and hearing is that customers using tape are able to reduce the number of drives or transports; by leveraging disk buffers or caches, including those from VTL and dedupe devices, they are able to operate their devices at higher utilization, thus requiring fewer devices with more data stored on media than in the past.

Likewise, even though I have been a fan of SSD for about 20 years and am bullish on its continued adoption, I do not see SSD killing off the spinning disk drive anytime soon. Disk drives are helping tape take on this new role by being a buffer or cache in the form of VTLs, disk based backup and bulk storage enhanced with compression, dedupe, thin provision and replication among other functionality.

There you have it: my predictions, observations and perspectives for 2010 and 2011. It is a broad and diverse list; however, I also get asked about and see a lot of different technologies, techniques and trends tied to IT resources (servers, storage, I/O and networks, hardware, software and services).

Let's see how they play out.

Ok, nuff said.

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

ILM = Has It Lost its Meaning

Disclaimer, warning, be advised, heads up, disclosure, this post is partially for fun so take it that way.

Remember ILM, that is, Information Lifecycle Management among other meanings.

It was a popular buzzword du jour a few years ago, similar to how cloud is being tossed around lately, or in the recent past, virtualization, clusters, grids and SOA among others.

One of the challenges with ILM, besides its overuse and the resulting confusion, was what it meant; after all, was or is it a product, process, paradigm or something else?

That depends of course on who you talk to and their view or definition.

For some, ILM was a new name for archiving, or storage and data tiering, or data management, or hierarchical storage management (HSM) or system managed storage (SMS) and software managed storage (SMS) among others.

So where is ILM today?

Better yet, what does ILM stand for?

Well, here are a few thoughts; some are oldies but goodies, some new, some just for fun.

ILM = I Like Marketing or It's a Lot of Marketing or It's a Lot of Money
ILM = It Lost its Meaning or It's a Lot of Meetings
ILM = Information Loves Magnetic media or I Love Magnetic media
ILM = IBM Loves Mainframes or Intel Loves Memory
ILM = Infrastructure Lifecycle Management or iPods/iPhones Like Macintosh

Then there are many other variations of xLM where I is replaced with X (similar to XaaS) where X is any letter you want or need for a particular purpose or message theme. For example, how about replacing X with an A for Application Lifecycle Management (ALM), or a B for Buzzword or Backup Lifecycle Management (BLM), C for Content Lifecycle Management (CLM) and D for Document or Data Lifecycle Management (DLM). There are many others including Hardware Lifecycle Management (HLM), Product or Program Lifecycle Management (PLM) not to mention Server, Storage or Security Lifecycle Management (SLM).

While ILM or xLM specific product and marketing buzz has for the most part subsided, perhaps it is about time for them to reappear to give current buzzwords such as cloud a break or rest. After all, ILM and xLM as buzzwords should be well rested after their stay at the Buzzword Rest Spa (BRS), perhaps located on someday isle. You know about someday isle, don't you? It's that place of dreams, a visionary place to be visited in the future.

There are already signs of the impending rested, rejuvenated and rebranded reappearance of ILM in the form of automated tiering, intelligent storage and data management, file virtualization, and policy managed servers and storage among others.

Ok, nuff said.

Cheers gs




The function of XaaS(X) – Pick a letter

Remember the xSP era, part of buzzword bingo, where X was I for ISP (Internet Service Provider), M for Managed Service Provider (MSP) or S for Storage Service Provider (SSP)?

That was similar to the xLM craze, where X could have been I for Information Lifecycle Management (ILM), D for Data Lifecycle Management (DLM) and so forth; someone even tried to register the term ILM and failed, instead of grabbing something like xLM, but I digress.

Fast forward to today: given the widespread use of anything SaaS among other XaaS terms, let's have a quick and perhaps fun look at some of the different usages of the function XaaS(X) in the IT industry today.

By no means is this an exhaustive list; feel free to comment with others, the more the merrier. Using the basic English alphabet without numbers or extended character sets, here are some possibilities among others (some are and continue to be used in the industry):

A = Analyst, Application, Archive, Audit or Authentication
B = Backup or Blogger
C = Cloud, Compiler, Compute or Connectivity
D = Data management, Data warehouse, DBA, Dedupe, Development, Disk or Doc management
E = Email, Encryption or Evangelist
F = Files or Freeware
G = Grid or Google
H = Help, Hotline or Hype
I = ILM, Information, Infrastructure, IO or IT
J = Jobs
K = Kbytes
L = Library or LinkedIn
M = Mainframe, Marketing, Manufacturing, Media, Memory or Middleware
N = NAS, Networking or Notification
O = Office, Oracle, Optical or Optimization
P = Performance, Petabytes, Platform, Policy, Police, Print or PR
Q = Quality
R = RAID, Replication, Reporter, Research or Rights management
S = SAN, Search, Security, Server, Software, Storage, Support
T = Tape, Technology, Testing, Trade group, Trends or Twittering
U = Unfollow
V = VAR, Virtualization or Vendor
W = Web
X = Xray
Y = YouTube
Z = zSeries or zilla

Feel free to comment with others for the list, and likewise, feel free to share the list.

Cheers gs
Greg Schulz – StorageIO, Author “The Green and Virtual Data Center” (CRC)

From ILM to IIM: Is this a solution sell looking for a problem?


Enterprise Storage Forum has a new piece about what could be the successor to ILM from a marketing rallying cry perspective in the form of Intelligent Information Management (IIM).

Information management is an important topic; however, given tough economic times, can IIM be joined into other discussions about efficiency and boosting productivity to help justify its cost, whatever that cost may be in terms of more hardware, software and people to carry it out? With EMC and Gartner banging the drum, it will be interesting to see who else jumps on the IIM bandwagon.

On the other hand, let's see what other variations surface, perhaps a VIIM (Virtualized IIM), an IIMaaS (IIM as a Service), or how about Cloud IIM or GIIM (Green IIM), among others like xIIM where you plug whatever letter you want in front of IIM (something that someone missed out on a few years ago by not grabbing xLM).

While I see the importance of data management, the bottom line is going to be how to budget and build a business case when sustaining business growth in tough economic times is a common theme. Hopefully we can see some business cases and justifications that are at least partially self-funding, that is, where the cost of adopting and deploying IIM is covered by savings in associated hardware and software management and maintenance fees, as well as by boosting overall IT and data management productivity.

Ok, nuff said.

Cheers gs
