contributed content Archives

October 9, 2018 (updated March 7, 2022)

How I saved money storing more data on AWS S3 Simple Storage Service #blogtobertech

How I saved money storing more data on AWS S3 (Simple Storage Service) is an example of reducing cloud costs as opposed to merely cutting them. Instead of focusing only on how much I could save on cloud storage, I wanted to remove some costs while also storing more data without compromise. For example, since making the changes, storage capacity usage has almost doubled, yet my costs remain 37% lower than they were two years ago, before the changes were made.
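To put those two figures together, here is a small worked calculation. It assumes capacity exactly doubled (the post says "almost doubled", so the true per-unit reduction is somewhat smaller) and that the 37% refers to the total storage bill. The function name is just for illustration.

```python
def unit_cost_change(bill_ratio, capacity_ratio):
    """Relative change in cost per unit stored.

    bill_ratio:     new bill / old bill (0.63 means 37% lower)
    capacity_ratio: new capacity / old capacity (2.0 means doubled, an assumption)
    """
    return bill_ratio / capacity_ratio - 1.0


# Assumed inputs: bill down 37%, capacity doubled.
change = unit_cost_change(0.63, 2.0)
print(f"Cost per unit stored changed by {change:.1%}")
```

Under those assumptions the cost per unit of data stored drops by roughly two thirds, which is a larger improvement than the 37% headline number suggests.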

How I saved money storing more data on AWS S3

Without adding any context, the typical reaction might be that I saved money storing more data on (or in) AWS S3 as opposed to locally on-site (on-prem). Another typical response would be that I moved all of my data from a different, more expensive cloud service to AWS S3. Yet another common reaction would be that I moved my AWS S3 data into AWS Glacier cold storage, or deleted a large amount of data.

Some might even comment that I must have used some type of dedupe, compression or other data footprint reduction (DFR) technology. On the other hand, some might determine that I probably did all or some of the above, or, leveraged AWS tiered storage, aligning different storage classes to the type of data activity.

How I saved money storing more data in AWS S3 actually involved spending some money up front in order to eventually save money by leveraging different S3 storage classes. As part of rebalancing, or moving different data to its new storage class, some one-time charges were incurred, which were recouped after several months of savings. Some of those costs pertained to EC2 compute instances and associated storage used for some of the data tiering; other fees were access charges along with charges for excessive API calls. For example, some of the data was in storage classes that had fees for early retrieval or deletion, or fees for access, among others.
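The "spend some money to save money" trade-off above is a simple break-even calculation. The sketch below is only illustrative: the dollar amounts are made-up placeholders, not figures from my invoices, and the function name is hypothetical.

```python
import math


def months_to_break_even(one_time_cost, old_monthly_bill, new_monthly_bill):
    """Whole months of savings needed to recoup a one-time migration cost."""
    monthly_saving = old_monthly_bill - new_monthly_bill
    if monthly_saving <= 0:
        raise ValueError("no monthly saving, so the migration never pays back")
    return math.ceil(one_time_cost / monthly_saving)


# Hypothetical numbers: $120 of one-time EC2, retrieval and API charges,
# and a monthly bill that drops from $100 to $63.
print(months_to_break_even(120.0, 100.0, 63.0))
```

With these placeholder values the one-time charges are covered in the fourth month, which matches the "several months" pattern described above.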

How I use different AWS S3 storage classes (tiers)

Standard – Frequently changing data, or data with frequent access
Infrequent Access (IA) – Data that does not change frequently or that is not routinely accessed. In the past, before OZA, I had placed data in this storage class that did not need to be in Standard, yet was too warm for Glacier. After the migrations, I have less data stored in IA, with more in OZA as well as some in Standard.
One Zone Availability (OZA) – Data that is frequently read but static, and not yet cold enough to move to Glacier or deep archive. A mix of backups, online and active archives. Note that I use OZA as an additional copy or location and not as a single, lowest-cost place to store data. In other words, anything that I put into OZA has at least one additional copy somewhere else, which may not be in the cloud.
Glacier – Very cold, seldom accessed, archives
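One way to express this kind of tiering is an S3 lifecycle configuration. The following is a minimal sketch and not my actual setup: the rule IDs, key prefixes and day counts are hypothetical. The storage class strings are the S3 API names, where what I call OZA corresponds to `ONEZONE_IA` (S3 One Zone-Infrequent Access).

```python
import json


def transition_rule(rule_id, prefix, days, storage_class):
    """Build one lifecycle rule that moves objects under a prefix after N days."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


# Hypothetical prefixes and ages; adjust to how your own data is used.
lifecycle = {
    "Rules": [
        transition_rule("reference-to-ia", "reference/", 30, "STANDARD_IA"),
        transition_rule("second-copies-to-one-zone", "copies/", 30, "ONEZONE_IA"),
        transition_rule("archives-to-glacier", "archive/", 90, "GLACIER"),
    ]
}

print(json.dumps(lifecycle, indent=2))
```

Nothing in the sketch talks to AWS. With the boto3 SDK, a dictionary of this shape is what `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument. Keep in mind that transitions and early deletions can carry their own fees, as noted earlier.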

Where to learn more

Learn more about Clouds and Data Infrastructure related trends, tools, technologies and topics via the following links:

Ten tips to reduce your cloud compute storage costs
Application Data Value Characteristics Everything Is Not the Same (Part I)
PACE your Server Storage I/O decision making, it’s about application requirements
What is DFR or Data Footprint Reduction?
Cloud conversations: confidence, certainty, and confidentiality
Cloud conversations: AWS EBS, Glacier and S3 overview (Part I)
Cloud Conversations AWS Azure Service Maps via Microsoft
AWS S3 Storage Gateway Revisited (Part I)
Cloud Conversations: AWS S3 Cross Region Replication storage enhancements
Cloud conversations: AWS EBS, Glacier and S3 overview (Part II S3)
Pictures Over Stillwater Drone Pro Shop and Resource Links
2018 Hot Popular New Trending Data Infrastructure Vendors to Watch
Part 1 – Application Data Value Characteristics Everything Is Not The Same
Data Infrastructure server storage I/O network Recommended Reading
World Backup Day 2018 Data Protection Readiness Reminder
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Server Storage I/O Tradecraft Trends
Data Infrastructure Overview, Its What’s Inside of Data Centers
4 3 2 1 and 3 2 1 data protection best practices
Garbage data in, garbage information out, big data or big garbage?
GDPR (General Data Protection Regulation) Resources Are You Ready?
Which Enterprise HDD to use for a Content Server Platform
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)
If NVMe is the answer, what are the questions?
NVMe Primer (or refresh), The NVMe Place, The SSD Place, and the Object Storage Center

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means

I decreased my monthly AWS bill by rebalancing where data is stored. There was a one-month period where my costs increased during the changes, followed by a reduction. While my monthly AWS storage invoices have decreased, I’m also storing more data per month. How I saved money storing more data on AWS S3 involved using different storage classes.

Ok, nuff said, for now.

Cheers Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2018. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

March 13, 2018 (updated October 18, 2024)

Application Data Volume Velocity Variety Everything Is Not The Same

This is part four of a five-part mini-series looking at Application Data Value Characteristics (everything is not the same), a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on data volume, velocity and variety. After all, everything is not the same, not to mention the many different aspects of big data as well as little data.

Volume of Data

More data is growing at a faster rate every day, and that data is being retained for longer periods. Some data being retained has known value, while a growing amount of data has an unknown value. Data is generated or created from many sources, including mobile devices, social networks, web-connected systems or machines, and sensors including IoT and IoD. Besides where data is created from, there are also many consumers of data (applications) that range from legacy to mobile, cloud, IoT among others.

Unknown-value data may eventually have value in the future when somebody realizes that they can do something with it, or when a technology tool or application becomes available to transform the data with unknown value into valuable information.

Some data gets retained in its native or raw form, while other data gets processed by application program algorithms into summary data, or is curated and aggregated with other data to be transformed into new useful data. The figure below shows, from left to right and front to back, more data being created, and that data also getting larger over time. For example, on the left are two data items, objects, files, or blocks representing some information.

In the center of the following figure are more columns and rows of data, with each of those data items also becoming larger. Moving farther to the right, there are yet more data items stacked up higher, as well as across and farther back, with those items also being larger. The following figure can represent blocks of storage, files in a file system, rows, and columns in a database or key-value repository, or objects in a cloud or object storage system.

Figure: Increasing data velocity and volume, more data and data getting larger

In addition to more data being created, some of that data is relatively small in terms of the records or data structure entities being stored. However, there can be a large quantity of those smaller data items. In addition to the amount of data, as well as the size of the data, protection or overhead copies of data are also kept.

Another dimension is that data is also getting larger, where the data structures describing a piece of data for an application have increased in size. For example, a still photograph taken with a digital camera, cell phone, or another mobile handheld device, drone, or other IoT device increases in size with each new generation of cameras, as there are more megapixels.
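The volume dimensions described in this section (how many items, how large each item is, and how many protection or overhead copies are kept) multiply together. The sketch below uses made-up numbers purely to show that many small items and fewer large items can add up to a similar footprint. The function name is hypothetical.

```python
def total_footprint_bytes(item_count, avg_item_bytes, protection_copies):
    """Primary data plus its protection copies, in bytes."""
    return item_count * avg_item_bytes * (1 + protection_copies)


# Hypothetical workloads, each keeping two protection copies.
small_records = total_footprint_bytes(1_000_000, 4_096, 2)   # many 4 KiB records
large_photos = total_footprint_bytes(1_000, 5_000_000, 2)    # fewer 5 MB photos

print(small_records, large_photos)
```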

Variety of Data

In addition to having value and volume, there are also different varieties of data, including ephemeral (temporary), persistent, primary, metadata, structured, semi-structured, unstructured, little, and big data. Keep in mind that programs, applications, tools, and utilities get stored as data, while they also use, create, access, and manage data.

There is also primary data and metadata, or data about data, as well as system data that is also sometimes referred to as metadata. Here is where context comes into play as part of tradecraft, as there can be metadata describing data being used by programs, as well as metadata about systems, applications, file systems, databases, and storage systems, among other things, including little and big data.

Context also matters regarding big data, as there are applications such as statistical analysis software and Hadoop, among others, for processing (analyzing) large amounts of data. The data being processed may not be big regarding the records or data entity items, but there may be a large volume. In addition to big data analytics, data, and applications, there is also data that is very big (as well as large volumes or collections of data sets).

For example, video and audio, among others, may also be referred to as big fast data, or large data. A challenge with larger data items is the complexity of moving them over distance in a timely manner, as well as processing that requires new approaches, algorithms, data structures, and storage management techniques.

Likewise, the challenges with large volumes of smaller data are similar in that data needs to be moved, protected, preserved, and served cost-effectively for long periods of time. Both large and small data are stored (in memory or storage) in various types of data repositories.

In general, data in repositories is accessed locally, remotely, or via a cloud using:

Object and blob, stream, queue, and Application Programming Interface (API) access
File-based using local or networked file systems
Block-based access of disk partitions, LUNs (logical unit numbers), or volumes
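To make the three access styles in the list above concrete, here is a toy sketch. It is an illustration only: a temporary file stands in for a block device, a small in-memory class stands in for an object or blob service, and the block size, key name and class are all hypothetical.

```python
import os
import tempfile

BLOCK_SIZE = 512  # hypothetical block size in bytes


def read_block(path, block_number, block_size=BLOCK_SIZE):
    """Block-style access: address data by block number, not by name."""
    with open(path, "rb") as device:
        device.seek(block_number * block_size)
        return device.read(block_size)


def read_file(path):
    """File-style access: address data by its path in a file system."""
    with open(path, "rb") as handle:
        return handle.read()


class ObjectStore:
    """Object-style access: address data by key through put/get calls."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


# The same payload reached three different ways.
payload = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

store = ObjectStore()
store.put("photos/img-0001", payload)

second_block = read_block(demo_path, 1)       # just the second 512 bytes
whole_file = read_file(demo_path)             # everything, by path
whole_object = store.get("photos/img-0001")   # everything, by key
os.remove(demo_path)
```

The point of the sketch is the addressing model: blocks are found by number, files by path, and objects by key via an API.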

The following figure shows varieties of application data value including (left) photos or images, audio, videos, and various log, event, and telemetry data, as well as (right) sparse and dense data.

Figure: Varieties of data (bits, bytes, blocks, blobs, and bitstreams)

Velocity of Data

Data, in addition to having value (known, unknown, or none), volume (size and quantity), and variety (structured, unstructured, semi-structured, primary, metadata, small, big), also has velocity. Velocity refers to how fast (or slowly) data is accessed, including being stored, retrieved, updated, or scanned, and whether it is active (updated, or fixed static) or dormant and inactive. In addition to data access and life cycle, velocity also refers to how data is used, such as random or sequential or some combination. Think of data velocity as how data, or streams of data, flow in various ways.

Velocity also describes how data is used and accessed, including:

Active (hot), static (warm and WORM), or dormant (cold)
Random or sequential, read or write-accessed
Real-time (online, synchronous) or time-delayed

Why this matters is that by understanding and knowing how applications use data, or how data is accessed via applications, you can make informed decisions. Also, having that insight helps you design, configure, and manage servers, storage, and I/O resources (hardware, software, services) to meet various needs. Understanding Application Data Value, including the velocity of the data both when it is created and when it is used, is important for aligning the applicable performance techniques and technologies.
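As a rough illustration of the hot, warm and cold labels used above, the sketch below classifies data by how long ago it was last accessed. The 30-day and 180-day thresholds are arbitrary assumptions for the example, not recommendations, and real decisions would also weigh read versus write and random versus sequential patterns.

```python
from datetime import datetime, timedelta


def classify_activity(last_access, now, hot_days=30, warm_days=180):
    """Label data as hot, warm or cold from its last access time."""
    age = now - last_access
    if age <= timedelta(days=hot_days):
        return "hot"
    if age <= timedelta(days=warm_days):
        return "warm"
    return "cold"


# Fixed "now" so the example is repeatable.
now = datetime(2018, 3, 13)
for name, last_access in [
    ("orders.db", datetime(2018, 3, 1)),
    ("q4-report.pdf", datetime(2017, 12, 1)),
    ("2015-archive.tar", datetime(2016, 1, 1)),
]:
    print(name, classify_activity(last_access, now))
```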

Where to learn more

Learn more about Application Data Value, application characteristics, performance, availability, capacity, economic (PACE) along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI) and related topics via the following links:

- Part 1 – Application Data Value Characteristics Everything Is Not The Same
- Part 2 – 4 3 2 1 Data Protection Application Data Availability
- Part 3 – Application Data Characteristics Types Everything Is Not The Same
- Part 4 – Application Data Volume Velocity Variety Everything Is Not The Same
- Part 5 – Application Data Access life cycle Patterns Everything Not The Same
- Software Defined, Cloud, Object and Blob Storage
- Data Infrastructure server storage I/O network Recommended Reading
- World Backup Day 2018 Data Protection Readiness Reminder
- Data Infrastructure Server Storage I/O related Tradecraft Overview
- Data Infrastructure Overview, Its What’s Inside of Data Centers
- 4 3 2 1 and 3 2 1 data protection best practices
- Garbage data in, garbage information out, big data or big garbage?
- GDPR (General Data Protection Regulation) Resources Are You Ready?
- Which Enterprise HDD to use for a Content Server Platform
- The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
- The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
- Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA,Replication, Security)

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What this all means and wrap-up

Data has different value, size, and velocity as part of its characteristics, including how it is used by various applications. Keep in mind that with Application Data Value Characteristics, everything is not the same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post in this series (Part V Application Data Access life cycle Patterns Everything Is Not The Same).

Ok, nuff said, for now.

April 12, 2017 (updated September 26, 2022)

A Story About Stadiums Along With Seismic Activity

I find in my inbox several pitches a day for briefings, products, vendors, services, authors, books, blogs and other things to cover, write about, or simply take their pitch and post it as is, or with some editing. Of course, there are also the pitches, or should I say offers, for somebody to so kindly craft content to appear on my sites, along with offers of $50 to $75 USD (or more) for a do-follow link, which I decline. Note, if any of you are looking for or interested in those types of offers, let me know and I will gladly forward them to you.

Most of the pitch story ideas are already written as, well, stories vs. what they are looking for, presenting or providing, and not surprisingly they miss the mark. However, every now and then I come across something that is worth a read, like this one below. It really does not have much to do with IT or data infrastructures, although you might find a remote connection to data center vibration dampening, sports or other things.

I will leave it up to you to determine if this is worth reading, informative, perhaps even an entertaining distraction from the United Airlines PR debacle among other things (disclosure: I have no affiliation with those mentioned in the following or their agencies).

PR Contact: Miguel Casellas-Gil: 727-443-7115 ext 214
MiguelCG@news-experts.com

Seismic Celebrations Present Concern Over Safety In Stadiums

As Sports Stadiums Age, Questions Surround Safety Of Structures

With his team leading 34-30 in the final minutes of a Wild Card Playoff game against the New Orleans Saints in 2011, Seattle Seahawks running back Marshawn Lynch took a handoff and exploded through the hole, beginning what would turn out to be a 67-yard touchdown run.

Odds are as Lynch was sealing the victory for the Seahawks against the defending Super Bowl Champions, nobody in the stands was worried about the structural soundness of CenturyLink Field.

Sitting a few thousand miles away on a tiny island at North Tonawanda, NY, just outside of Niagara Falls, Douglas P. Taylor, CEO of Taylor Devices (www.taylordevices.com/), no doubt looked at Lynch’s run through a different lens than most Seahawks fans that day.

Taylor’s daily job involves controlling and stopping the movement of masses. No, he’s not a linebacker, he’s an engineer, and his company manufactures seismic dampers that protect structures during such events as earthquakes and high winds.

As Lynch rumbled down the sideline for the game-winning touchdown, something else was rumbling in Seattle that day. Lynch’s run led to such a frenzy in the stands that jumping fans caused a 1.0 earthquake to register at the Pacific Northwest Seismic Network.

Lynch helped set off seismic alarms again in 2014 on a touchdown run, and football fans of another sort on the other side of the pond got into the act earlier this year, causing what amounted to a 1.0 earthquake in Spain in celebration of an FC Barcelona win.

Taylor’s firm wasn’t involved in the construction of either facility in Seattle or Barcelona, but it was heavily involved with BC Place, a new stadium in Vancouver, and Safeco Field, the retractable roof stadium that serves as the home of the Seattle Mariners.

“Those who are going to sporting events should be made aware that technology already exists to protect a structure and its occupants during wind and seismic events,” Taylor says. “My hope is that a fan’s biggest worry is the score of the game and not whether the stands around him are going to collapse.”

Of Major League Baseball’s 30 stadiums, 18 were built in 1995 or later, with five of those opening in the past decade. When play opens on the 2017 NFL season, 10 of its stadiums will have opened their doors in 1994 or earlier, with the remaining 21 opening in 1995 or later.

“Any stadium in a high seismic zone that was designed before 1995 probably does not meet the updated seismic codes,” Taylor says. “For stadiums subject only to high winds, older designs may well meet the current codes. However, these codes usually only provide a structure that won’t totally collapse.”

While certain weather and nature-related phenomena such as hurricanes and snow storms can be identified by meteorologists well in advance to postpone games, there is no early-warning system for an earthquake.

Several professional stadiums – not to mention a large number of college football stadiums – are near fault lines in California and in the Midwest.

According to the U.S. Geological Survey, there is a seven to 10 percent chance an earthquake magnitude 6.0 or higher will strike in the next 50 years along the New Madrid Fault Line in the Midwest. California on the other hand – the current home of five MLB teams and four NFL teams – is staring down the barrel of a gun ready to fire off a 7.0 magnitude earthquake or higher at any time.

“The northern part of the state is long overdue for a powerful earthquake (7.0 or higher) along the San Andreas fault,” Taylor says. “San Diego and Los Angeles aren’t safe either. A new fault line was discovered in the Southern part of the state earlier this year that could cause an earthquake as powerful as 7.4 on the Richter Scale.”

About Douglas P. Taylor

Douglas P. Taylor is the CEO of Taylor Devices (www.taylordevices.com), which manufactures seismic dampers that protect structures during such events as earthquakes and high winds. He is inventor or co-inventor of 34 patents in the fields of energy management, hydraulics and shock isolation. In 2015, he was inducted into the Space Technology Hall of Fame by NASA and the Space Foundation.

If you would like to run the above article, please feel free to do so. I can also provide images to accompany it. If you’re interested in interviewing Douglas Taylor, having him provide comments, or having him write an exclusive article for you let me know and I’ll gladly work out the details.

Miguel Casellas-Gil
Print Campaign Manager
News and Experts
3748 Turman Loop #101
Wesley Chapel, FL 33544
Tel: 727-443-7115, Extension 214
www.newsandexperts.com

Where to Learn More

Want to learn more? See the contact information above.

What this all means

Infrastructure items, from roads, bridges, airports, sewer, water and electrical power to data centers (along with the data infrastructure contents inside of them), are of all ages. Likewise, they are at risk from acts of man as well as acts of nature, and need to be resilient. Ask yourself how resilient your data infrastructure is, whether it is legacy, cloud or hybrid.

Ok, nuff said (for now…).

Cheers
Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert (and vSAN). Author of Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Watch for the Spring 2017 release of his new book “Software-Defined Data Infrastructure Essentials” (CRC Press).

March 27, 2017 (updated October 18, 2024)

Preparing For World Backup Day 2017 Are You Prepared

In case you have forgotten, or were not aware, this coming Friday, March 31, is World Backup Day 2017 (and recovery day). The annual day is a reminder to make sure you are protecting your applications, data, information, configuration settings as well as data infrastructures. While the emphasis is on backup, that also means recovery as well as testing to make sure everything is working properly as part of on-prem and cloud data protection.

What the Vendors Have To Say

Today I received the following from Kylle over at TOUCHDOWNPR on behalf of their clients providing their perspectives on what World Backup Day means, or how to be prepared. Keep in mind these are not Server StorageIO clients (granted some have been in the past, or I know them, that is a disclosure btw), and this is in no way an endorsement of what they are saying, or advocating. Instead, this is simply passing along to you what was given to me.

Not included in this list? No worries, add your perspectives (politely) to the comments, or, drop me a note, and perhaps I will do a follow-up or addition to this.

Kylle O’Sullivan
TOUCHDOWNPR
Email: Kosullivan@touchdownpr.com
Mobile: 508-826-4482
Skype: Kylle.OSullivan

“Data loss and disruption happens far too often in the enterprise. Research by Ponemon in 2016 estimates the average cost of an unplanned outage has spiralled to nearly $9,000 a minute, causing crippling downtime as well as financial and reputational damage. Legacy backups simply aren’t equipped to provide seamless operations, with zero Recovery Point Objectives (RPO) should a disaster strike. In order to guarantee the availability of applications, synchronous replication with real-time analytics needs to be simple to setup, monitor and manage for application owners and economical to the organization. That way, making zero data loss attainable suddenly becomes a reality.” – Chuck Dubuque, VP Product Marketing, Tintri

“With today’s “always-on” business environment, data loss can destroy a company’s brand and customer trust. A multiple software-based strategy with software-defined and hyperconverged storage infrastructure is the most effective route for a flexible backup plan. With this tactic, snapshots, replication and stretched clusters can help protect data, whether in a local data center cluster, across data centers or across the cloud. IT teams rely on these software-based policies as the backbone of their disaster recovery implementations as the human element is removed. This is possible as the software-based strategy dictates that all virtual machines are accurately, automatically and consistently replicated to the DR sites. Through this automatic and transparent approach, no administrator action is required, saving employees time, money and providing peace of mind that business can carry on despite any outage.” – Patrick Brennan, Senior Product Marketing Manager, Atlantis Computing

“It’s only a matter of time before your datacenter experiences a significant outage, if it hasn’t already, due to a wide range of causes, from something as simple as human error or power failure to criminal activity like ransomware and cyberattacks, or even more catastrophic events like hurricanes. Shifting thinking to ‘when’ as opposed to ‘if’ something like this happens is crucial; crucial to building a more flexible and resilient IT infrastructure that can withstand any kind of disruption resulting in negative impact on business performance. World Backup Day reminds us of the importance of both having a backup plan in place as well as conducting regular reviews of current and new technology to do everything possible to keep business running without interruption. Organizations today are highly aware that they are heavily dependent on data and critical applications, and that losing even just an hour of data can greatly harm revenues and brand reputation, sometimes beyond repair. Savvy businesses are taking an all-inclusive approach to this problem that incorporates cloud-based technologies into their disaster recovery plans. And with consistent testing and automation, they are ensuring that those plans are extremely simple to execute against in even the most challenging of situations, a key element of successfully avoiding damaging downtime.” – Rob Strechay, VP Product, Zerto

“Data is one of the most valuable business assets and when it comes to data protection chief among its IT challenges is the ever-growing rate of data and the associated vulnerability. Backup needs to be reliable, fast and cost efficient. Organizations are on the defensive after a disaster and being able to recover critical data within minutes is crucial. Breakthroughs in disk technologies and pricing have led to very dense arrays that are power, cost and performance efficient. Backup has been revolutionized and organizations need to ensure they are safeguarding their most valuable commodity – not just now but for the long term. Secure archive platforms are complementary and create a complete recovery strategy.” – Geoff Barrall, COO, Nexsan

Consider the DR Options that Object Storage Adds
“Data backup and disaster recovery used to be treated as separate processes, which added complexity. But with object storage as a backup target you now have multiple options to bring backup and DR together in a single flow. You can configure a hybrid cloud and tier a portion of your data to the public cloud, or you can locate object storage nodes at different locations and use replication to provide geographic separation. So, this World Backup Day, consider how object storage has increased your options for meeting this critical need.” – Jon Toor, Cloudian CMO

What’s In Your Data Protection Toolbox

What tools and technologies do you have in your data protection toolbox? Do you only have a hammer, so that every situation looks like a nail? Or do you have multiple tools and technologies, combined with your various tradecraft experiences, to apply different techniques?

Figure: StorageIO data protection toolbox

Where To Learn More

Follow these links to additional related material about backup, restore, availability, data protection, BC, BR, DR along with associated topics, trends, tools, technologies as well as techniques.

Time to restore from backup: Do you know where your data is?
February 2017 Server StorageIO Update Newsletter
Data Infrastructure Server Storage I/O Tradecraft Trends
Data Infrastructure Server Storage I/O related Tradecraft Overview
Data Infrastructure Primer and Overview (Its Whats Inside The Data Center)
What’s a data infrastructure?
Ensure your data infrastructure remains available and resilient
Part III Until the focus expands to data protection – Taking action
Welcome to the Data Protection Diaries
Backup, Big data, Big Data Protection, CMG & More with Tom Becchetti Podcast
Six plus data center software defined management dashboards
Cloud Storage Concerns, Considerations and Trends
Software Defined, Cloud, Bulk and Object Storage Fundamentals (www.objectstoragecenter.com)

Data Infrastructure Overview, Its Whats Inside of Data Centers
All You Need To Know about Remote Office/Branch Office Data Protection Backup (free webinar with registration)
Software Defined, Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI) resources
The SSD Place (SSD, NVM, PM, SCM, Flash, NVMe, 3D XPoint, MRAM and related topics)
The NVMe Place (NVMe related topics, trends, tools, technologies, tip resources)
Data Protection Diaries (Archive, Backup/Restore, BC, BR, DR, HA, RAID/EC/LRC, Replication, Security)
Software Defined Data Infrastructure Essentials (CRC Press 2017) including SDDC, Cloud, Container and more
Various Data Infrastructure related events, webinars and other activities

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

What This All Means

Backup of data is important, and so too is recovery, which also means testing. Testing means more than just checking whether you can read the tape, disk, SSD, USB, cloud or other medium (or location). Go a step further and verify not only that you can read the data from the medium, but also that your applications or software are able to use it. Have you protected your applications (i.e. not just the data), security keys, encryption, access, dedupe and other certificates along with metadata as well as other settings? Do you have a backup or protection copy of your protection, including recovery tools? What granularity of protection and recovery do you have in place, and when did you last test or try it? In other words, what this all means is be prepared, find and fix issues, and in the course of testing, don’t cause a disaster.
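As a starting point for the "can you read it back" part of testing, here is a minimal sketch that compares a restored file against the original using SHA-256 checksums from the Python standard library. The function names and the idea of restoring to a scratch location are assumptions for the example.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Checksum a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original_path, restored_path):
    """True when the restored copy is byte-for-byte identical to the original."""
    return sha256_of(original_path) == sha256_of(restored_path)
```

A matching checksum only shows that the bits came back intact. It does not show that your applications can open and use the restored data, or that keys, certificates and settings were protected, so treat it as the first step of a restore test rather than the whole test. Restore to a scratch location, not over production data.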

Ok, nuff said, for now.

December 10, 2014November 26, 2023

Data Storage Tape Update V2014, It’s Still Alive

A year or so ago I did a piece on tape summit resources. Despite being declared dead for decades, and probably continuing to be declared dead for years to come, magnetic tape is in fact still alive and being used by some organizations, granted its role is changing while the technology still evolves.

Here is the memo I received today from the PR folks of the Tape Storage Council (i.e. the tape vendors’ marketing consortium) and for simplicity (mine), I’m posting it here for you to read in its entirety vs. possibly in pieces elsewhere. Note that this is basically a tape status update and a collection of marketing and press release talking points; however, you can get an idea of the current messaging, who is using tape, and technology updates.

Tape Data Storage in 2014 and looking towards 2015

True to the nature of magnetic tape as a data storage medium, this is not a low latency small post, rather a large high-capacity bulk post or perhaps all you need to know about tape for now, or until next year. On the other hand, if you are a tape fan, you can certainly take the memo from the tape folks, as well as visit their site for more info.

From the tape storage council industry trade group:

Today the Tape Storage Council issued its annual memo to highlight the current trends, usages and technology innovations occurring within the tape storage industry. The Tape Storage Council includes representatives of BDT, Crossroads Systems, FUJIFILM, HP, IBM, Imation, Iron Mountain, Oracle, Overland Storage, Qualstar, Quantum, REB Storage Systems, Recall, Spectra Logic, Tandberg Data and XpresspaX.

Data Growth and Technology Innovations Fuel Tape’s Future
Tape Addresses New Markets as Capacity, Performance, and Functionality Reach New Levels

Abstract
For the past decade, the tape industry has been re-architecting itself and the renaissance is well underway. Several new and important technologies for both LTO (Linear Tape Open) and enterprise tape products have yielded unprecedented cartridge capacity increases, much longer media life, improved bit error rates, and vastly superior economics compared to any previous tape or disk technology. This progress has enabled tape to effectively address many new data intensive market opportunities, such as archive, Big Data, compliance, entertainment and surveillance, in addition to its traditional role as a backup device. Clearly disk technology has been advancing, but the progress in tape has been even greater over the past 10 years. Today’s modern tape technology is nothing like the tape of the past.

The Growth in Tape
Demand for tape is being fueled by unrelenting data growth, significant technological advancements, tape’s highly favorable economics, the growing requirements to maintain access to data “forever” emanating from regulatory, compliance or governance requirements, and the big data demand for large amounts of data to be analyzed and monetized in the future. The Digital Universe study suggests that the world’s information is doubling every two years and much of this data is most cost-effectively stored on tape.

Enterprise tape has reached an unprecedented 10 TB native capacity with data rates reaching 360 MB/sec. Enterprise tape libraries can scale beyond one exabyte. Enterprise tape manufacturers IBM and Oracle StorageTek have signaled future cartridge capacities far beyond 10 TBs with no limitations in sight. Open systems users can now store more than 300 Blu-ray quality movies with the LTO-6 2.5 TB cartridge. In the future, an LTO-10 cartridge will hold over 14,400 Blu-ray movies. Nearly 250 million LTO tape cartridges have been shipped since the format’s inception. This equals over 100,000 PB of data protected and retained using LTO Technology. The innovative active archive solution combining tape with low-cost NAS storage and LTFS is gaining momentum for open systems users.

Recent Announcements and Milestones
Tape storage is addressing many new applications in today’s modern data centers while offering welcome relief from constant IT budget pressures. Tape is also extending its reach to the cloud as a cost-effective deep archive service. In addition, numerous analyst studies confirm the TCO for tape is much lower than disk when it comes to backup and data archiving applications. See TCO Studies section below.

On Sept. 16, 2013 Oracle Corp announced the StorageTek T10000D enterprise tape drive. Features of the T10000D include an 8.5 TB native capacity and data rate of 252 MB/s native. The T10000D is backward read compatible with all three previous generations of T10000 tape drives.
On Jan. 16, 2014 Fujifilm Recording Media USA, Inc. reported it has manufactured over 100 million LTO Ultrium data cartridges since its release of the first generation of LTO in 2000. This equates to over 53 thousand petabytes (53 exabytes) of storage and more than 41 million miles of tape, enough to wrap around the globe 1,653 times.
On April 30, 2014 Sony Corporation announced that it had independently developed a soft magnetic underlayer with a smooth interface using sputter deposition and created a nano-grained magnetic layer with fine magnetic particles and uniform crystalline orientation. This layer enabled Sony to successfully demonstrate the world’s highest areal recording density for tape storage media of 148 Gb/in². This areal density would make it possible to record more than 185 TB of data per data cartridge.
On May 19, 2014 Fujifilm in conjunction with IBM successfully demonstrated a record areal data density of 85.9 Gb/in² on linear magnetic particulate tape using Fujifilm’s proprietary NANOCUBIC™ and Barium Ferrite (BaFe) particle technologies. This breakthrough in recording density equates to a standard LTO cartridge capable of storing up to 154 terabytes of uncompressed data, making it 62 times greater than today’s current LTO-6 cartridge capacity and projects a long and promising future for tape growth.
On Sept. 9, 2014 IBM announced LTFS LE version 2.1.4 extending LTFS (Linear Tape File System) tape library support.
On Sept. 10, 2014 the LTO Program Technology Provider Companies (TPCs), HP, IBM and Quantum, announced an extended roadmap which now includes LTO generations 9 and 10. The new generation guidelines call for compressed capacities of 62.5 TB for LTO-9 and 120 TB for LTO-10 and include compressed transfer rates of up to 1,770 MB/second for LTO-9 and 2,750 MB/second for LTO-10. Each new generation will include read-and-write backwards compatibility with the prior generation as well as read compatibility with cartridges from two generations prior to protect investments and ease tape conversion and implementation.
On Oct. 6, 2014 IBM announced the TS1150 enterprise drive. Features of the TS1150 include a native data rate of up to 360 MB/sec versus the 250 MB/sec native data rate of the predecessor TS1140 and a native cartridge capacity of 10 TB compared to 4 TB on the TS1140. LTFS support was included.
On Nov. 6, 2014, HP announced a new release of StoreOpen Automation that delivers a solution for using LTFS in automation environments with Windows OS, available as a free download. This version complements their already existing support for Mac and Linux versions to help simplify integration of tape libraries to archiving solutions.
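As an aside to the memo, the compatibility guideline in the Sept. 10, 2014 roadmap item above (read and write one generation back, read-only two generations back) can be sketched as a small lookup. This is only an illustration of the rule as the memo words it; the function name `lto_access` and the returned labels are made up for this sketch, the "older drive, newer cartridge" case is my assumption, and actual drive and media compatibility should be checked against vendor documentation.

```python
def lto_access(drive_gen: int, cartridge_gen: int) -> str:
    """Access a drive has to a cartridge under the guideline quoted in the memo.

    Assumption: a drive reads and writes its own generation and the prior one,
    and can only read cartridges from two generations back.
    """
    gap = drive_gen - cartridge_gen
    if gap < 0:
        return "none"        # assumption: older drives cannot use newer cartridges
    if gap <= 1:
        return "read/write"  # same generation or one generation back
    if gap == 2:
        return "read-only"   # two generations back
    return "none"            # anything older is outside the guideline


# Example: what a hypothetical LTO-10 drive could do with older cartridges.
for cartridge in range(6, 11):
    print(f"LTO-10 drive, LTO-{cartridge} cartridge: {lto_access(10, cartridge)}")
```

Under this rule a future LTO-10 drive would write LTO-9 and LTO-10 media, read LTO-8, and need an older drive for anything before that, which is worth keeping in mind when planning tape conversions.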

Significant Technology Innovations Fuel Tape’s Future
Development and manufacturing investment in tape library, drive, media and management software has effectively addressed the constant demand for improved reliability, higher capacity, power efficiency, ease of use and the lowest cost per GB of any storage solution. Below is a summary of tape’s value proposition followed by key metrics for each:

Tape drive reliability has surpassed disk drive reliability
Tape cartridge capacity (native) growth is on an unprecedented trajectory
Tape has a faster device data rate than disk
Tape has a much longer media life than any other digital storage medium
Tape’s functionality and ease of use is now greatly enhanced with LTFS
Tape requires significantly less energy consumption than any other digital storage technology
Tape storage has a much lower acquisition cost and TCO than disk

Reliability. Tape reliability levels have surpassed HDDs. Reliability levels for tape exceed those of the most reliable disk drives by one to three orders of magnitude. The BER (Bit Error Rate – bits read per hard error) is rated at 1×10¹⁹ for enterprise tape and 1×10¹⁷ for LTO tape. This compares to 1×10¹⁶ for the most reliable enterprise Fibre Channel disk drive.
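To put the quoted bit error ratings (10 to the 19th, 17th and 16th power bits read per hard error) into more familiar units, here is a rough sketch that converts them to petabytes read per error and to orders of magnitude relative to disk. The dictionary keys and function names are made up for this illustration, and decimal units (1 PB = 10^15 bytes) are assumed.

```python
import math

# Bits read per hard error, as quoted in the memo.
BITS_PER_HARD_ERROR = {
    "enterprise tape": 1e19,
    "LTO tape": 1e17,
    "enterprise FC disk": 1e16,
}


def petabytes_per_error(bits: float) -> float:
    """Convert bits read per hard error to petabytes (assumes 1 PB = 1e15 bytes)."""
    return bits / 8 / 1e15


def orders_of_magnitude(a: float, b: float) -> int:
    """How many powers of ten separate rating a from rating b."""
    return round(math.log10(a / b))


disk = BITS_PER_HARD_ERROR["enterprise FC disk"]
for name, bits in BITS_PER_HARD_ERROR.items():
    print(f"{name}: about {petabytes_per_error(bits):,.2f} PB read per hard error, "
          f"{orders_of_magnitude(bits, disk)} order(s) of magnitude vs. disk")
```

By these ratings, enterprise tape works out to about 1,250 PB read per hard error, LTO to 12.5 PB and disk to 1.25 PB, which lines up with the "one to three orders of magnitude" claim.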

Capacity and Data Rate. LTO-6 cartridges provide 2.5 TB capacity and more than double the compressed capacity of the preceding LTO-5 drive with a 14% data rate performance boost to 160 MB/sec. Enterprise tape has reached 8.5 TB native capacity and 252 MB/sec on the Oracle StorageTek T10000D and 10 TB native capacity and 360 MB/sec on the IBM TS1150. Tape cartridge capacities are expected to grow at unprecedented rates for the foreseeable future.
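One practical way to read the capacity and data rate figures above is to ask how long it takes to stream a full cartridge. The sketch below is a back-of-the-envelope calculation only: it assumes decimal units (1 TB = 1,000,000 MB), a sustained native rate with no compression, and no start/stop or positioning overhead. The names `DRIVES` and `hours_to_fill` are made up for this illustration.

```python
# (native capacity in TB, native data rate in MB/sec) as quoted in the memo.
DRIVES = {
    "LTO-6": (2.5, 160),
    "Oracle StorageTek T10000D": (8.5, 252),
    "IBM TS1150": (10.0, 360),
}


def hours_to_fill(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours to stream one full cartridge at the sustained native rate."""
    seconds = capacity_tb * 1_000_000 / rate_mb_s  # assumes 1 TB = 1,000,000 MB
    return seconds / 3600


for name, (capacity_tb, rate_mb_s) in DRIVES.items():
    print(f"{name}: about {hours_to_fill(capacity_tb, rate_mb_s):.1f} hours per cartridge")
```

Under these assumptions the results come out to roughly 4.3 hours for LTO-6, 9.4 hours for the T10000D and 7.7 hours for the TS1150, a reminder that capacity has been growing faster than data rate.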

Media Life. Manufacturers’ specifications indicate that enterprise and LTO tape media have a life span of 30 years or more while the average tape drive will be deployed 7 to 10 years before replacement. By comparison, the average disk drive is operational 3 to 5 years before replacement.

LTFS Changes Rules for Tape Access. Compared to previous proprietary solutions, LTFS is an open tape format that stores files in application-independent, self-describing fashion, enabling the simple interchange of content across multiple platforms and workflows. LTFS is also being deployed in several innovative “Tape as NAS” active archive solutions that combine the cost benefits of tape with the ease of use and fast access times of NAS. The SNIA LTFS Technical Working Group has been formed to broaden cross-industry collaboration and continued technical development of the LTFS specification.

TCO Studies. Tape’s widening cost advantage compared to other storage mediums makes it the most cost-effective technology for long-term data retention. The favorable economics (TCO, low energy consumption, reduced raised floor) and massive scalability have made tape the preferred medium for managing vast volumes of data. Several tape TCO studies are publicly available and the results consistently confirm a significant TCO advantage for tape compared to disk solutions.

According to the Brad Johns Consulting Group, a TCO study for an LTFS-based ‘Tape as NAS’ solution totaled $1.1M compared with $7.0M for a disk-based unified storage solution. This equates to a savings of over $5.9M over a 10-year period, which is more than 84 percent less than the equivalent amount for a storage system built on a 4 TB hard disk drive unified storage system. From a slightly different perspective, this is a TCO savings of over $2,900/TB of data. Source: Johns, B. “A New Approach to Lowering the Cost of Storing File Archive Information,”.
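The arithmetic behind the Brad Johns figures above can be checked with a few lines. This is a sanity check of the quoted numbers only, not a TCO model; the variable names are made up, and the implied archive size at the end is my own rough derivation from the "$2,900/TB" figure, not something the memo states.

```python
# 10-year TCO figures as quoted in the memo (US dollars).
tape_tco = 1_100_000
disk_tco = 7_000_000

savings = disk_tco - tape_tco             # absolute savings over 10 years
savings_pct = savings / disk_tco * 100    # savings as a share of the disk TCO

# Rough, derived value (not stated in the memo): the archive size implied
# if the savings work out to about $2,900 per TB.
implied_capacity_tb = savings / 2_900

print(f"Savings: ${savings:,.0f} ({savings_pct:.1f}% less than disk)")
print(f"Implied archive size: roughly {implied_capacity_tb:,.0f} TB")
```

The first two values match the memo's claims of $5.9M and a bit over 84 percent; the implied capacity works out to roughly 2,000 TB, which at least gives a sense of the scale of archive the study is talking about.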

Another comprehensive TCO study by ESG (Enterprise Strategies Group) comparing an LTO-5 tape library system with a low-cost SATA disk system for backup using de-duplication (best case for disk) shows that disk deduplication has a 2-4x higher TCO than the tape system for backup over a 5 year period. The study revealed that disk has a TCO of 15x higher than tape for long-term data archiving.

Select Case Studies Highlight Tape and Active Archive Solutions
CyArk is a non-profit foundation focused on the digital preservation of cultural heritage sites including places such as Mt. Rushmore and Pompeii. CyArk predicted that their data archive would grow by 30 percent each year for the foreseeable future reaching one to two petabytes in five years. They needed a storage solution that was secure, scalable, and more cost-effective to provide the longevity required for these important historical assets. To meet this challenge CyArk implemented an active archive solution featuring LTO and LTFS technologies.

DreamWorks Animation, a global computer graphics (CG) animation studio, has implemented a reliable, cost-effective and scalable active archive solution to safeguard a 2 PB portfolio of finished movies and graphics, supporting a long-term asset preservation strategy. The studio’s comprehensive, tiered and converged active archive architecture, which spans software, disk and tape, saves the company time and money and reduces risk.

LA Kings of the NHL rely extensively on digital video assets for marketing activities with team partners and for its broadcast affiliation with Fox Sports. Today, the Kings save about 200 GB of video per game for an 82-game regular season and are on pace to generate about 32-35 TB of new data per season. The Kings chose to implement Fujifilm’s Dternity NAS active archive appliance, an open LTFS-based architecture. The Kings wanted an open source archiving solution which could outlast its original hardware while maintaining data integrity. Today with Dternity and LTFS, the Kings don’t have to decide what data to keep because they are able to cost-effectively save everything they might need in the future.

McDonald’s primary challenge was to create a digital video workflow that streamlines the management and distribution of their global video assets for their video production and post-production environment. McDonald’s implemented the Spectra T200 tape library with LTO-6 providing 250 TB of McDonald’s video production storage. Nightly, incremental backup jobs store their media assets into separate disk and LTO-6 storage pools for easy backup, tracking and fast retrieval. This system design allows McDonald’s to effectively separate and manage their assets through the use of customized automation and data service policies.

NCSA employs an Active Archive solution providing 100 percent of the nearline storage for the NCSA Blue Waters supercomputer, which is one of the world’s largest active file repositories stored on high capacity, highly reliable enterprise tape media. Using an active archive system along with enterprise tape and RAIT (Redundant Arrays of Inexpensive Tape) eliminates the need to duplicate tape data, which has led to dramatic cost savings.

Queensland Brain Institute (QBI) is a leading center for neuroscience research. QBI’s research focuses on the cellular and molecular mechanisms that regulate brain function to help develop new treatments for neurological and mental disorders. QBI’s storage system has to scale extensively to store, protect, and access tens of terabytes of data daily to support cutting-edge research. QBI chose an Oracle solution consisting of Oracle’s StorageTek SL3000 modular tape libraries with StorageTek T10000 enterprise tape drives. The Oracle solution improved QBI’s ability to grow, attract world-leading scientists and meet stringent funding conditions.

Looking Ahead to 2015 and Beyond
The role tape serves in today’s modern data centers is expanding as IT executives and cloud service providers address new applications for tape that leverage its significant operational and cost advantages. This recognition is driving investment in new tape technologies and innovations with extended roadmaps, and it is expanding tape’s profile from its historical role in data backup to one that includes long-term archiving requiring cost-effective access to enormous quantities of stored data. Given the current and future trajectory of tape technology, data intensive markets such as big data, broadcast and entertainment, archive, scientific research, oil and gas exploration, surveillance, cloud, and HPC are expected to become significant beneficiaries of tape’s continued progress. Clearly the tremendous innovation, compelling value proposition and development activities demonstrate tape technology is not sitting still; expect this promising trend to continue in 2015 and beyond.

Visit the Tape Storage Council at tapestorage.org

What this means and summary

Like it or not, tape is still alive and in use, and the technology continues to evolve with new enhancements as outlined above.

Good to see the tape folks doing some marketing to get their story told and heard for those who are still interested.

Does that mean I still use tape?

Nope, I stopped using tape for local backups and archives well over a decade ago, using disk to disk and disk to cloud instead.

Does that mean I believe that tape is dead?

Nope, I still believe that for some organizations and some usage scenarios it makes good sense; however, as with most data storage related technologies, it’s not a one size (or one type of technology) fits everything value proposition.

On a related note for cloud and object storage, visit www.objectstoragecenter.com

Ok, nuff said, for now…

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio
