Welcome to the April 2013 edition of the StorageIO Update. This edition includes more on NAND flash SSD; after all, it's not if, rather when, where, why, with what, along with how much SSD is in your future. Also more on object storage, clouds, big data and little data, HDDs, SNW, backup/restore, HA, BC, DR and data protection along with data center topics and trends.
You can get access to this newsletter via various social media venues (some are shown below) in addition to StorageIO web sites and subscriptions.
Click on the following links to view the April 2013 edition as an HTML version (as sent via email) or as a PDF version.
Visit the newsletter page to view previous editions of the StorageIO Update.
You can subscribe to the newsletter by clicking here.
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved
A couple of weeks ago I attended the spring 2013 Storage Networking World (SNW) in Orlando, Florida. According to SNIA Chairman Wayne Adams and SNIA Director Leo Legar, this was the 28th edition of the US SNW (two shows a year), plus the international ones. While I have not been to all 28 of the US SNWs, I have been to a couple of dozen SNWs in the US, Europe and Brazil going back to around 2001 as an attendee, as a main stage and breakout speaker, and as a tutorial presenter (see here and here).
For the spring 2013 SNW I was there for a mix of meetings, analyst briefings, attending the expo, doing some podcasts (see below), and meeting with IT professionals (e.g. customers), VARs and vendors, along with presenting three sessions (you can download them and others covering backup, restore, BC, DR and archiving).
Some of the buzz and themes heard: big data was only a little topic at the event, cloud was in the conversations, and dedupe and data footprint reduction (DFR) do matter for some people and applications. However, a common theme with customers, including those in Media and Entertainment (M&E), is that not everything can be deduped, thus other DFR approaches are needed.
There was some hype in and around hybrid storage along with storage hypervisors, the latter also the subject of an entertaining panel discussion with HDS (Claus Mikkelsen aka @YoClaus), DataCore, IBM and Virsto.
The theme of that discussion seemed for the most part to gravitate towards realities of storage virtualization and less about the hypervisor hype. Some software defined marketing hype I heard is that it is impossible to spend more than a million dollars on a server today. I guess with the applicable caveats, qualifiers and context that could be true, however I also know some vendors and customers that would say otherwise.
Lunchtime at SNW Spring 2013
Not surprisingly, there was an increase in vendors wanting to jump on the software defined and object storage bandwagons; however, customers tended to be curious at best, confused or concerned otherwise. Speaking of object storage, check out this podcast discussion with Cleversafe customer Justin Stottlemyer of Shutterfly and his 80PB environment.
In addition to Cleversafe, I heard from Astute (if you need fast iSCSI storage check them out), Avere, which has a new NAS for Dummies book out, Exablox, a storage system startup with emphasis on scalability, ease of use and NAS access, and hybrid storage vendor Tegile. Also, check out SwifTest for generating application workloads and measurement; their customer Go Daddy presented at the event. A couple of others to keep an eye on include Raxco with their thin provisioned storage reclamation tool, and Infinio with their NAS acceleration software tools for VMware, among others.
Here are the three presentations that I did while at the event:
Analyst Perspective: Increase Your Return on Innovation (The New ROI) With Data Management and Dedupe
There is no such thing as an information recession with more data to move, process and store, however there are economic challenges. Likewise, people and data are living longer and getting larger, which requires leveraging data footprint reduction (DFR) techniques with a broader focus. It is time to move upstream, finding and fixing things at the source to reduce the downstream impact of expanding data footprints, enabling more to be done with what you have.
Analyst Perspective: Metrics that Matter – Meritage of Data Management and Data Protection
Not everything in the data center or information factory is the same. This session recaps and builds off the morning session on increasing your ROI with data footprint reduction and data management, while setting the stage for rethinking data protection (backup, BC and DR). Are you maximizing the return on innovation by using new tools and technology in new ways, vs. using new tools in old ways? Also discussed are performance and capacity planning along with forecasting and analysis in cloud, virtual and physical environments. Without metrics that matter, you are flying blind, or perhaps missing opportunities to further drive your return on innovation and return on investment.
Analyst Perspective: Time to Rethink Data Protection Including BC and DR
When it comes to today’s data centers and information factories including physical, virtual and cloud, everything is not the same, so why treat business continuance (BC), disaster recovery (DR) and data protection in general the same? Simply using new tools, technologies and techniques in the same old ways is no longer a viable option. Since there is no such thing as a data or information recession, yet there are economic and budget challenges along with new or changing threat risks, now is the time to review data protection, including BC and DR, and to use new technologies in new ways.
You can view the complete SNW USA spring 2013 agenda here.
While busy, I liked this edition of SNW USA in that it had a great agenda with diversity and balance of speaker sessions (some tutorials, some vendors, some IT customers, and some analysts) vs. too many of one specific area.
In addition to the agenda and session length, the venue was good: big enough, however not spread out so much as to cause loss of the buzz and energy of the event.
This SNW had some of the same buzz and energy as early versions, granted without the hype and fanfare of a startup industry or focus area (that would be some of the other events today).
Should SNW go to a once a year event?
While it would be nice to have a twice a year venue for convenience, practicality and budgets say once would be enough given all the other conferences and venues on the agenda (or that could be).
The next SNW USA will be October 15 to 17, 2013 in Long Beach, California, and SNW Europe will be in Frankfurt, Germany, October 29 to 30, 2013.
Thanks again to all the attendees, participants, vendor exhibitors, event organizers and SNIA, SNW/Computerworld staffs for another great event.
Riding the current software defined data center (SDC) wave being led by the likes of VMware, and software defined networking (SDN), also championed by VMware via their acquisition of Nicira last year, Software Defined Marketing (SDM) is in full force. HP, being a player in providing the core building blocks for traditional little data and big data along with physical, virtual, converged, cloud and software defined environments, has announced a new compute, processor or server platform called the Moonshot 1500.
Software defined marketing aside, there are some real and interesting things from a technology standpoint that HP is doing with the Moonshot 1500 along with other vendors who are offering micro server based solutions.
First, for those who see server (processor and compute) improvements as being more and faster cores (and threads) per socket, along with extra memory, not to mention 10GbE or 40GbE networking and PCIe expansion or IO connectivity, hang on to your hats.
Moonshot is in the model of the micro servers or micro blades such as what HP has offered in the past along with the likes of Dell and SeaMicro (now part of AMD). The micro servers are almost the opposite of the configuration found on regular servers or blades, where the focus is on putting more capability on a motherboard or blade.
With micro servers the approach is to support those applications and environments that do not need lots of CPU processing capability or large amounts of storage, IO or memory. These include some web hosting or cloud application environments that can leverage a larger number of smaller, lower power, less performance or resource intensive platforms. For example, big data (or little data) applications whose software or tools work with many low-cost, low power and lower performance nodes in distributed, clustered, grid, RAIN or ring based architectures can benefit from this type of solution.
What is the Moonshot 1500 system?
4.3U high rack mount chassis that holds up to 45 micro servers
Each hot-swap micro server is its own self-contained module similar to a blade server
Server modules install vertically from the top into the chassis similar to some high-density storage enclosures
Compute or processors are Intel Atom S1260 2.0GHz based processors with 1 MB of cache memory
Single SO-DIMM slot (unbuffered ECC at 1333 MHz) supports 8GB (1 x 8GB DIMM) DRAM
Each server module has a single 2.5″ SATA 200GB SSD, 500GB or 1TB HDD onboard
A dual port Broadcom 5720 1 Gb Ethernet LAN per server module that connects to chassis switches
Marvell 9125 storage controller integrated onboard each server module
Chassis and enclosure management along with ACPI 2.0b, SMBIOS 2.6.1 and PXE support
A pair of Ethernet switches each give up to six x 10GbE uplinks for the Moonshot chassis
Dual RJ-45 connectors for iLO chassis management are also included
Status LEDs on the front of each chassis provide status of the servers and network switches
Support for Canonical Ubuntu 12.04, RHEL 6.4, SUSE Linux Enterprise Server (SLES) 11 SP2
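To put the per-module figures in the list above into chassis-level terms, here is a minimal Python sketch that simply multiplies them out for a fully populated enclosure. It uses only the numbers listed above; the function and constant names are made up for illustration, and the results are raw aggregates, not usable or benchmarked figures.

```python
# Aggregate resources of one fully populated Moonshot 1500 chassis,
# computed only from the per-module figures listed above.
SERVERS_PER_CHASSIS = 45      # micro servers per chassis
DRAM_GB_PER_SERVER = 8        # single 8GB SO-DIMM
MAX_HDD_TB_PER_SERVER = 1     # largest listed drive option (1TB HDD)
NIC_PORTS_PER_SERVER = 2      # dual port 1 Gb Ethernet
NIC_GBPS_PER_PORT = 1
SWITCHES_PER_CHASSIS = 2      # a pair of Ethernet switches
UPLINKS_PER_SWITCH = 6        # up to six 10GbE uplinks each
UPLINK_GBPS = 10


def chassis_totals(servers: int = SERVERS_PER_CHASSIS) -> dict:
    """Return raw aggregate DRAM, disk and network figures for `servers` modules."""
    return {
        "dram_gb": servers * DRAM_GB_PER_SERVER,
        "max_hdd_tb": servers * MAX_HDD_TB_PER_SERVER,
        "server_side_gbps": servers * NIC_PORTS_PER_SERVER * NIC_GBPS_PER_PORT,
        "max_uplink_gbps": SWITCHES_PER_CHASSIS * UPLINKS_PER_SWITCH * UPLINK_GBPS,
    }


if __name__ == "__main__":
    for name, value in chassis_totals().items():
        print(f"{name}: {value}")
```

Seen this way, a chassis is many small nodes whose combined memory and disk are modest by traditional server standards, which fits the scale-out use cases discussed here.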
Notice a common theme with Moonshot along with other micro server-based systems and architectures?
If not, it is simple. I mean that literally: simple and flexible is the value proposition.
Simple is the theme (with software defined for marketing) along with low cost, lower energy and power demand, lower performance, and less of what is not needed in order to remove cost.
Granted not all applications will be a good fit for micro servers (excuse me, software defined servers) as some will need the more robust resources of traditional servers. With solutions such as HP Moonshot, system architects and designers have more options available to them as to what resources or solution options to use. For example, a cloud or object storage system based solution that does not need a lot of processing performance or memory per node, and needs only a low amount of storage per node, might find this an interesting option for mid to entry-level needs.
Will HP release a version of their LeftHand or IBRIX (both since renamed) based storage management software on these systems for some market or application needs?
How about deploying NoSQL type tools including Cassandra or Mongo, how about CloudStack, OpenStack Swift, Basho Riak (or Riak CS) or other software including object storage, on these types of solutions, or web servers and other applications that do not need the fastest processors or most memory per node?
Thus micro server-based solutions such as Moonshot enable return on innovation (the new ROI) by enabling customers to leverage the right tool (e.g. hard product) to create their soft product allowing their users or customers to in turn innovate in a cost-effective way.
Will the Moonshot servers be the software defined turnaround for HP? Click here to see what Bloomberg has to say, or see Forbes here.
Learn more about Moonshot servers at HP here, here or data sheets found here.
Btw, HP claims that this is the industry's first software defined server, hmm.
In this episode from SNW Spring 2013 in Orlando, Florida, Bruce Ravid (@BruceRave) and I visit with Justin Stottlemyer (@JHStott), who is a Fellow and Storage Architect at Shutterfly.
Our conversation centers on how Justin and Shutterfly maximize their return on innovation (the new ROI) by using object storage along with other technology and techniques to create a resilient, scalable and flexible data infrastructure.
Justin was at SNW presenting on overcoming object integration at Shutterfly where their data infrastructure consists of 80PB of storage to house over 30PB of user content data that continues to grow.
For those not familiar, Shutterfly provides customers with free unlimited storage of their photos, which can then be printed in coffee table type books. My wife has used Shutterfly a few times to create photo books such as the one shown in the above image.
As you will hear Justin explain in the podcast, photos get uploaded and ingested into their environment and are then available for printing.
In addition to talking about object storage, private clouds, business continuance (BC) and disaster recovery, other topics include performance and capacity planning, and maximizing return on innovation in addition to return on investment, among other items.
Listen in to hear how Justin and Shutterfly are currently managing 80PB of storage with over 30PB of user data that continues to grow.
In 2012 AWS released their Storage Gateway that you can use and try for free here, either as an EC2 Amazon Machine Image (AMI) or deployed locally on a hypervisor such as VMware vSphere/ESXi. About a year ago I did a storage gateway post (first, second and third impressions) when it was first released. I will do a new post soon following up with my later impressions and experiences of having used it recently. For now, my quick fourth impressions can be found here in this AWS Marketplace review. In general, the gateway is an AWS alternative to using third-party gateways, appliances or software tools for accessing AWS storage.
When deployed locally on a VM, the storage gateway communicates using the AWS APIs back to the S3 and EBS (depending on how configured) storage services. Locally, the storage gateway presents an iSCSI block access method for Windows or other servers to use.
There are two modes, one being Gateway-Stored and the other Gateway-Cached. Gateway-Stored uses your local storage mapped to the storage gateway as primary storage, with asynchronous (time delayed), user-defined snapshots sent to S3 via EBS volumes. This is a handy way to have local storage for low latency access, yet use AWS for HA, BC and DR, along with a means for doing migration into or out of AWS. Gateway-Cached mode places primary storage in AWS S3 with a local cached copy to reduce network overhead.
When I tried the gateway a month or so ago, using both modes, I was not able to view any of my data using standard S3 tools. For example, when I looked in my S3 buckets the objects did not appear, something that AWS said had to do with where and how those buckets and objects are managed. On the other hand, I was able to see EBS snapshots for the Gateway-Stored mode, including using that as a means of moving data between local and AWS EC2 instances. Note that regardless of the AWS storage gateway mode, some local cache storage is needed, and likewise some EBS volumes will be needed depending on what mode is used.
When I used the gateway, a Windows Server mounted the iSCSI volume presented by the storage gateway and in turn served that to other systems as a shared folder. Thus while having block access such as iSCSI is nice, a NAS (NFS or CIFS) presentation and access mode would also be useful. However, more on the storage gateway in a future post. Also note that beyond the free trial period (during which you may still have to pay for storage being used) for using the gateway, there are also fees for S3 and EBS storage volume use.
What about Glacier?
Shortly after its release last year, I did this piece about Glacier and have since been doing some testing proof of concepts with it.
I like Glacier and its prospects for doing various things, particularly for inactive data including deep archives that will seldom if ever be accessed, yet need to be retained. The business value proposition of Glacier is that it has very high durability and low cost, assuming that you do not need to frequently access your data, and when you do, that you can wait 3 to 5 hours before retrieving it from your S3 buckets.
Access to Glacier is via API or AWS console, so getting things into and out of it can be a challenge. For example, I wanted to see if I could use AWS storage gateway to more easily bulk move things into Glacier via S3, however no luck, or at least not today. Speaking of S3, by setting your policies you determine when objects get moved into Glacier as well as how long they will stay there. You can read more about Glacier here and via AWS here.
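As an illustration of the kind of S3 policy mentioned above, here is a minimal Python sketch that builds a lifecycle rule moving objects to Glacier after a number of days and expiring them later. The dictionary follows the shape accepted by the current boto3 SDK, which postdates this post; the prefix, day counts, rule ID and bucket name are all hypothetical placeholders, and nothing here contacts AWS.

```python
import json


def glacier_lifecycle_rule(prefix: str, archive_after_days: int,
                           expire_after_days: int) -> dict:
    """Build an S3 lifecycle configuration that transitions objects under
    `prefix` to Glacier after `archive_after_days` and deletes them after
    `expire_after_days` (both counted from object creation)."""
    if expire_after_days <= archive_after_days:
        raise ValueError("expiration must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


if __name__ == "__main__":
    config = glacier_lifecycle_rule("archive/", 30, 3650)
    print(json.dumps(config, indent=2))
    # Assumption: with boto3 installed and credentials configured, the
    # dictionary could be applied roughly like this (placeholder bucket):
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="my-example-bucket", LifecycleConfiguration=config)
```

Keeping the rule as plain data makes it easy to review the transition and retention periods before anything is applied to a bucket.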
Note that there is a myth that cloud vendors have hidden fees which may be the case for some, however so far I have not seen that to be the case with AWS. However, as a consumer, designer or architect, doing your homework and looking at the above links among others you can be ready and understand the various fees and options. Hence like procuring traditional hardware, software or services, do your due diligence and be an informed shopper.
Some more service cost notes include:
Note that with S3 Standard and RRS objects there is not a charge for deletion of objects, however there is a pro-rated charge per GByte of Glacier objects removed prior to 90 days. Glacier also allows up to 5% of your average monthly storage usage (pro-rated daily) to be restored with no charge, other fees apply for restoring larger amounts in a given period. Thus if you are planning on accessing and using data, analyze what your activity and usage will be as part of calculating your costs with Glacier. Read more about Glacier here.
Standard EBS volumes are charged by the amount of storage space capacity you provision in GB until released. For EBS snapshot copies there are fees for transferring data across regions; once moved, the rates of the new region apply for the snapshot.
As with Standard volumes, volume storage for Provisioned IOPS volumes is charged by the amount you provision in GB per month. With Provisioned IOPS volumes, you are also charged by the amount you provision in IOPS, pro-rated as a percentage of days you have it in use for the month.
Thus it is important for cloud storage planning to know not only your space requirements, but also your IOPS, bandwidth, and level of availability as well as durability. Note that applications and operating systems often cache I/O, so for Standard volumes you will likely see a lower number of I/O requests on your bill than is seen by your application unless you sync all of your I/Os to disk. Thus pay attention to what your needs are in terms of availability (accessibility), durability (resiliency or survivability), space capacity, and performance.
Leverage AWS CloudWatch tools and APIs to monitor the metrics that matter for timely insight and situational awareness into how EBS, EC2, S3, Glacier, Storage Gateway and other services are being used (or costing you). Also visit the AWS service health status dashboard to gain insight into how things are running to help gain confidence with cloud services and solutions.
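The cost notes above lend themselves to a quick back-of-the-envelope calculation. The following Python sketch covers the Glacier free retrieval allowance (5% of average monthly storage, pro-rated daily), the 90-day early deletion window, and an EBS charge made up of provisioned space plus pro-rated provisioned IOPS. Function names are made up for illustration and every price is a placeholder to be supplied by the caller, not an actual AWS rate.

```python
GLACIER_FREE_RETRIEVAL_FRACTION = 0.05  # 5% of average monthly storage
GLACIER_EARLY_DELETE_DAYS = 90          # deletions before this age incur a fee


def glacier_daily_free_retrieval_gb(avg_stored_gb: float,
                                    days_in_month: int = 30) -> float:
    """Free Glacier retrieval allowance per day, pro-rated from the monthly 5%."""
    return avg_stored_gb * GLACIER_FREE_RETRIEVAL_FRACTION / days_in_month


def glacier_early_delete_fee_applies(age_days: int) -> bool:
    """True if an object of this age would incur the pro-rated deletion charge."""
    return age_days < GLACIER_EARLY_DELETE_DAYS


def ebs_monthly_charge(provisioned_gb: float, price_per_gb_month: float,
                       provisioned_iops: int = 0,
                       price_per_iops_month: float = 0.0,
                       days_in_use: int = 30, days_in_month: int = 30) -> float:
    """Space is charged on the GB provisioned; provisioned IOPS are charged
    pro-rated by the share of the month they were in use."""
    storage = provisioned_gb * price_per_gb_month
    iops = provisioned_iops * price_per_iops_month * days_in_use / days_in_month
    return storage + iops


if __name__ == "__main__":
    # Placeholder prices (0.10 per GB-month and per IOPS-month), not AWS rates.
    print(glacier_daily_free_retrieval_gb(12_000))  # GB per day for 12 TB stored
    print(glacier_early_delete_fee_applies(45))
    print(ebs_monthly_charge(100, 0.10, 1000, 0.10, days_in_use=15))
```

Plugging in your own storage amounts, activity and the current published rates gives a first estimate before you commit to a design.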
Hopefully this helps to fill in some gaps giving more information addressing questions, along with generating new ones to prepare for your journey with clouds. After all, don’t be scared of clouds. Be prepared, do your homework, identify your concerns and then address those to gain cloud confidence.
For those not familiar, Simple Storage Service (S3), Glacier and Elastic Block Store (EBS) are part of the AWS cloud storage portfolio of services. With S3, you specify a region where a bucket is created that will contain objects that can be written, read, listed and deleted. You can create multiple buckets in a region, each with an unlimited number of objects ranging from 1 byte to 5 Tbytes in size per object. Each object has a unique, user or developer assigned key. In addition to indicating which AWS region, S3 buckets and objects are provisioned using different levels of availability, durability, SLAs and costs (view S3 SLAs here).
Cost will vary depending on the AWS region being used, along with whether Standard or Reduced Redundancy Storage (RRS) is selected. Standard S3 storage is designed for 99.999999999% durability (a function of how many copies exist) and 99.99% availability (how often it can be accessed) on an annual basis, and is capable of surviving two data centers becoming unavailable.
As its name implies, S3 RRS trades a lower fee for a lower level of durability: it has an annual durability of 99.99% and availability of 99.99%, and is capable of surviving a single data center loss. In the following figure durability is how many copies of data exist spread across different servers and storage systems in various data centers and availability zones.
What would you put in RRS vs. Standard S3 storage?
Items that need some level of persistence that can be refreshed, recreated or restored from some other place or pool of storage, such as thumbnails, static content or read caches. Other items would be those where you could tolerate some downtime while waiting for data to be restored, recovered or rebuilt from elsewhere in exchange for a lower cost.
Different AWS regions can be chosen for regulatory compliance requirements, performance, SLAs, cost and redundancy, with authentication mechanisms including encryption (SSL and HTTPS) to make sure data is kept secure. Various rights and access can be assigned to objects including making them public or private. In addition to logical data protection (security, identity and access management (IAM), encryption, access control), policies also apply to determine the level of durability and availability or accessibility of buckets and objects. Other attributes of buckets and objects include life-cycle management policies and logging of activity to the items. Also part of the objects is metadata containing information about the data being stored, shown in a generic example below.
Access to objects is via standard REST and SOAP interfaces with an Application Programming Interface (API). For example, default access is via HTTP along with a BitTorrent interface, with optional support via various gateways, appliances and software tools.
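To make the durability and availability percentages more tangible, here is a small Python sketch that turns them into expected objects lost per year and expected hours of unavailability per year. It treats the published design targets as simple annual probabilities, which is an interpretation for illustration and not an SLA calculation; the function names are made up.

```python
from decimal import Decimal

HOURS_PER_YEAR = 365 * 24


def expected_annual_object_losses(object_count: int, durability: str) -> Decimal:
    """Expected number of objects lost per year at a given annual durability.

    `durability` is passed as a string (e.g. "0.99999999999") so that the
    long run of nines is not distorted by binary floating point."""
    return object_count * (Decimal(1) - Decimal(durability))


def expected_annual_downtime_hours(availability: str) -> Decimal:
    """Expected hours per year that data cannot be accessed."""
    return HOURS_PER_YEAR * (Decimal(1) - Decimal(availability))


if __name__ == "__main__":
    # Standard S3 durability of 99.999999999% with 10,000 stored objects.
    losses = expected_annual_object_losses(10_000, "0.99999999999")
    print(f"expected objects lost per year: {losses}")
    print(f"years per expected single loss: {1 / losses}")
    # 99.99% availability expressed as time per year.
    hours = expected_annual_downtime_hours("0.9999")
    print(f"expected unavailable time: {hours} hours ({hours * 60} minutes)")
```

The same functions can be called with the RRS durability figure to see how many more objects you would expect to have to recreate or restore from elsewhere.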
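As a small illustration of the HTTP based access described above, this Python sketch builds the virtual-hosted-style URL for an S3 object from a bucket name and object key. The bucket and key are hypothetical, and a plain URL like this only works for objects that have been made public; private objects need signed requests, which are typically handled by an SDK or tool and not shown here.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str, secure: bool = True) -> str:
    """Return the virtual-hosted-style URL for an object in an S3 bucket.

    The key is percent-encoded, keeping "/" so that folder-like key
    prefixes remain readable in the URL."""
    scheme = "https" if secure else "http"
    return f"{scheme}://{bucket}.s3.amazonaws.com/{quote(key, safe='/')}"


if __name__ == "__main__":
    # Hypothetical bucket and key, for illustration only.
    print(s3_object_url("example-bucket", "photos/2013/snw spring.jpg"))
```

A REST GET on such a URL reads the object, while PUT and DELETE on the same URL (with proper authentication) write and remove it.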
Example cloud and object storage access
The above figure via Cloud and Virtual Data Storage Networking (CRC Press) shows a generic example applicable to AWS services including S3 being accessed in different ways. For example, I access my S3 buckets and objects via Jungle Disk (one of the tools I use for data protection), which can also access my Rackspace Cloud Files data. In the following figure there are examples of some of my S3 buckets and objects used by different applications and tools that I have in various AWS regions.
AWS S3 buckets and objects in different regions
Note that I sometimes use other AWS regions outside the US for testing purposes; for compliance purposes my production, business or personal data is only in the US regions.
The following figure is a generic example of how cloud and object storage are accessed using different tools, hardware, software and APIs along with gateways. AWS is an example of what is shown in the following figure as a Cloud Service, and S3, EBS or Glacier as cloud storage. Common example API commands are also shown, which will vary by different vendors, products or solution definitions or implementations. While the Amazon S3 API, which is REST HTTP based, has become an industry de facto standard, there are other APIs including CDMI (Cloud Data Management Interface) developed by SNIA, which has gained ISO accreditation.
In addition to using Jungle Disk which manages my AWS keys and objects that it creates, I can also access my S3 objects via the AWS management console and web tools, also via third-party tools including Cyberduck.
For those not familiar, Simple Storage Service (S3), Glacier and Elastic Block Store (EBS) are part of the AWS cloud storage portfolio of services. There are several other storage and data related services for little data databases (SQL and NoSQL based); other offerings include compute, data management, application and networking services for different needs, shown in the following image.
S3 is well suited for both big and little data repositories of objects ranging from backup to archive to active video images and much more. In fact if you are using some of the different AaaS or SaaS services including backup or file and video sharing, those may be using S3 as their back-end storage repository. For example, Netflix leverages various AWS capabilities as part of its data and applications infrastructure (read more here).
AWS basics
AWS consists of multiple regions that contain multiple availability zones where data and applications are supported from.
Note that objects stored in a region never leave that region; for example, data stored in EU West never leaves Ireland, and data in US East never leaves Virginia.
AWS does support the ability for user controlled movement of data between regions for business continuance (BC), high availability (HA) and disaster recovery (DR). Read more here at the AWS Security and Compliance site and in this AWS white paper.
What about EBS?
That brings us to Elastic Block Store (EBS), which is used by EC2 (read more about EC2 and instances here) as storage for cloud and virtual machines or compute instances. In addition to using S3 as a persistent backing store or target for holding snapshots, EBS can be thought of as primary storage. You can provision and allocate EBS volumes in the different data centers of the various AWS availability zones. As part of allocating your EBS volume you indicate the type: standard, provisioned IOPS or the new EBS Optimized volumes. EBS Optimized volumes enable instances that support the feature to have better IO performance to storage.
The following image shows an EC2 instance with EBS volumes (standard and provisioned IOPS) along with S3 volumes and snapshots. In the following example the instance and volumes are being served via the AWS US East region (Northern Virginia) using availability zone US East 1a. In addition, EBS optimized volumes are shown being used in the example to increase bandwidth or throughput performance between storage and the compute instance.
Using the above as a basis, you can build on that to leverage multiple availability zones or regions for HA, BC and DR combined with application, network load balancing and other capabilities. Note that EBS volumes are protected for durability by being spread across different servers and storage in an availability zone. Additional protection is provided by using snapshots combined with S3. Additional BC and DR or HA protection can be accomplished by replicating data across availability zones.
The above is an example of tying various components and services together. For example using different AWS availability zones, instances, EBS, S3 and other tools including those from third parties. Here is a link to a free chapter download from Cloud and Virtual Data Storage Networking (CRC Press) pertaining to data protection, BC and DR (available at Amazon here and Kindle here). In addition here is an AWS white paper on using their services for BC, HA and DR.
EBS volumes are created ranging in size from 1 GByte to 1 TByte in space capacity, with multiple volumes being mapped or attached to an EC2 instance. EBS volumes appear as a virtual disk drive for block storage. From the EC2 instance and guest operating system you can mount, format and use the EBS volumes as any other block disk drive with your favorite tools and file systems. In addition to space capacity, EBS volumes are also provisioned with standard IO (e.g. disk based) performance or high performance Provisioned IOPS (e.g. SSD) for thousands of IOPS per instance. AWS states that a standard EBS volume should support about 100 IOPS on average, with about 2,000 IOPS for a provisioned IOPS volume. If you need more than 2,000 IOPS, the AWS recommendation is to use multiple provisioned IOPS volumes with data spread across those. Following is an example of AWS EBS volumes seen via the EC2 management interface.
AWS EC2 and EBS configuration status
Note that there is a 10 to 1 ratio of provisioned IOPS to space capacity in GBytes. If you try to play a game of provisioning 1,000 IOPS on a 10 GByte EBS volume to keep your costs down, you are out of luck. Thus to get 1,000 IOPS you would need to allocate at least a 100 GByte EBS volume, for which you will be billed for the space provisioned on a monthly pro-rated basis. The following is an example of provisioning an AWS EBS volume using provisioned IOPS in the US East region in the 1a availability zone.
Provisioning IOPS with EBS volume
Standard and Provisioned IOPS EBS volumes
Standard EBS volumes are good for boot images or other application usage that is not IO performance intensive. For database or other active applications where more performance is needed, EBS Provisioned IOPS volumes are your option. Note that the provisioned IOPS rate is persistent for the specific volume during its life. Thus if you set it and forget it, including leaving it provisioned while not using it, you will be billed for provisioning it.
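The sizing rules described above (1,000 provisioned IOPS needs at least a 100 GByte volume, and roughly 2,000 IOPS per provisioned volume before spreading data across several) can be sketched in a few lines of Python. The limits are the figures quoted in this post and may well have changed since; the function names are made up for illustration.

```python
import math
from typing import List, Tuple

IOPS_PER_GB = 10            # 1,000 IOPS requires at least 100 GBytes
MAX_IOPS_PER_VOLUME = 2000  # per provisioned IOPS volume, as quoted above
MIN_VOLUME_GB = 1           # smallest EBS volume size


def min_volume_gb_for_iops(iops: int) -> int:
    """Smallest volume size in GB that may be provisioned with `iops`."""
    if not 0 < iops <= MAX_IOPS_PER_VOLUME:
        raise ValueError(f"iops must be between 1 and {MAX_IOPS_PER_VOLUME}")
    return max(MIN_VOLUME_GB, math.ceil(iops / IOPS_PER_GB))


def plan_volumes(total_iops: int) -> List[Tuple[int, int]]:
    """Spread `total_iops` evenly over as few volumes as possible.

    Returns one (iops, minimum_gb) pair per volume; data would then be
    striped across the volumes from within the instance."""
    if total_iops <= 0:
        raise ValueError("total_iops must be positive")
    count = math.ceil(total_iops / MAX_IOPS_PER_VOLUME)
    per_volume = math.ceil(total_iops / count)
    return [(per_volume, min_volume_gb_for_iops(per_volume))] * count


if __name__ == "__main__":
    print(min_volume_gb_for_iops(1000))
    for iops, gb in plan_volumes(5000):
        print(f"volume: {iops} IOPS, at least {gb} GB")
```

Since provisioned IOPS are billed whether or not they are used, a plan like this is also a reminder of what you are committing to pay for.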
The four EBS optimized instance types are m3.xlarge, m3.2xlarge, m2.2xlarge and c1.xlarge, providing dedicated bandwidth or throughput between the EC2 instances and EBS volumes. The performance or bandwidth ranges from 500 Mbits (500 / 8 = 62.5 MBytes) per second to 1,000 Mbits (1,000 / 8 = 125 MBytes) per second depending on the type of instance. As a refresher, EC2 instances (which by the time you read this could change) vary in size and functionality with different amounts of EC2 Compute Units (ECU), number of virtual cores, amount of storage space included, 32 or 64 bit, storage and networking IO performance, and EBS Optimized or not. In addition to instances, different operating system images can be installed, using those licensed from AWS such as various Windows and Unix versions, or you can supply your own.
There are also different generations of instances such as M1 (first generation where one ECU = 1.0 to 1.2 GHz of a 2007 era Opteron or Xeon processor), M3 (second generation with faster processors) along with Micro low-cost options. There are also other optimized instances including high or large amounts of memory, high CPU or compute processing, clustered compute, high memory clustered, clustered GPU (e.g. using NVIDIA Tesla GPUs), high IO and high storage space capacity needs.
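The bits to bytes conversion above is worth keeping handy when sizing instances. Here is a minimal Python sketch that converts a dedicated EBS link speed from Mbits to MBytes per second and estimates the best-case time to move a given amount of data. It assumes decimal units (1 GByte = 1,000 MBytes) and ignores protocol overhead and IOPS limits, so real transfers will take longer; the function names are made up for illustration.

```python
BITS_PER_BYTE = 8


def mbits_to_mbytes_per_sec(mbits_per_sec: float) -> float:
    """Convert link speed from megabits to megabytes per second."""
    return mbits_per_sec / BITS_PER_BYTE


def best_case_transfer_seconds(size_gbytes: float, link_mbits_per_sec: float,
                               mbytes_per_gbyte: int = 1000) -> float:
    """Lower bound on the time to move `size_gbytes` over the link."""
    size_mbytes = size_gbytes * mbytes_per_gbyte
    return size_mbytes / mbits_to_mbytes_per_sec(link_mbits_per_sec)


if __name__ == "__main__":
    for link in (500, 1000):
        rate = mbits_to_mbytes_per_sec(link)
        secs = best_case_transfer_seconds(100, link)
        print(f"{link} Mbit/s = {rate} MByte/s, 100 GB in at least {secs:.0f} s")
```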
Here is the announcement from AWS:
Dear Amazon Web Services Customer,
We are delighted to announce the global availability of EBS-optimized support for four additional instance types: m3.xlarge, m3.2xlarge, m2.2xlarge, and c1.xlarge. EBS-optimized instances deliver dedicated throughput between Amazon EC2 and Amazon EBS, with options between 500 Megabits per second and 1,000 Megabits per second depending on the instance type used. The dedicated throughput minimizes contention between EBS I/O and other traffic from your Amazon EC2 instance, providing the best performance for your EBS volumes.
EBS-optimized instances are designed for use with both Standard and Provisioned IOPS EBS volumes. Standard volumes deliver 100 IOPS on average with a best effort ability to burst to hundreds of IOPS, making them well-suited for workloads with moderate and bursty I/O needs. When attached to an EBS-optimized instance, Provisioned IOPS volumes are designed to consistently deliver up to 2000 IOPS from a single volume, making them ideal for I/O intensive workloads such as databases. You can attach multiple Amazon EBS volumes to a single instance and stripe your data across them for increased I/O and throughput performance.
Amazon EBS-optimized support is now available for m3.xlarge, m3.2xlarge, m2.2xlarge, m2.4xlarge, m1.large, m1.xlarge, and c1.xlarge instance types, and is currently supported in the US-East (N. Virginia), US-West (N. California), US-West (Oregon), EU-West (Ireland), Asia Pacific (Singapore), Asia Pacific (Japan), Asia Pacific (Sydney), and South America (São Paulo) Regions.
What this means is that AWS is enabling customers to size their compute instances and storage volumes with more flexibility to meet different needs. For example, EC2 instances can be chosen with various compute processing capabilities, amounts of memory, and network and storage I/O performance to volumes. In addition, storage volumes can be chosen based on space capacity, standard or provisioned IOPS, bandwidth or throughput performance between the instance and volume, along with data protection such as snapshots.
This means that the cost per space capacity of an EBS volume varies based on which AWS region it is in, standard (lower IOPS performance) or provisioned IOPS (faster), along with instance type. In other words, cloud storage is not just about the cost per GByte, it's also about the cost for IOPS, bandwidth to use it, where it is located (e.g. with AWS which region and Availability Zone), type of service, level of availability and durability among other attributes.
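To make the sizing trade-off concrete, here is a rough sketch of how striping several Provisioned IOPS volumes interacts with the dedicated EBS-optimized bandwidth. The 2,000 IOPS per volume and the 500 and 1,000 Mbit per second figures come from the AWS announcement above; the 16 KByte I/O size and the function names are assumptions for illustration only, and the model ignores protocol overhead, so treat the results as a back-of-the-envelope estimate rather than what AWS will deliver.

```python
def bandwidth_limited_iops(link_mbits_per_sec: float, io_size_kbytes: float) -> float:
    """IOPS the instance-to-EBS link could carry at a given I/O size (no overhead)."""
    bytes_per_sec = link_mbits_per_sec * 1_000_000 / 8
    return bytes_per_sec / (io_size_kbytes * 1024)


def striped_iops(volumes: int, iops_per_volume: float,
                 link_mbits_per_sec: float, io_size_kbytes: float) -> float:
    """Estimated IOPS for a stripe set: sum of the volumes, capped by the link."""
    from_volumes = volumes * iops_per_volume
    from_link = bandwidth_limited_iops(link_mbits_per_sec, io_size_kbytes)
    return min(from_volumes, from_link)


# Assumed 16 KByte I/Os (hypothetical workload), 2,000 IOPS per volume.
for link in (500, 1000):
    for volumes in (1, 2, 4):
        estimate = striped_iops(volumes, 2000, link, 16)
        print(f"{link} Mbit/s link, {volumes} volume(s): ~{estimate:.0f} IOPS")
```

With these assumptions the link, not the volumes, becomes the limit once enough volumes are striped together, which is why the instance type matters as much as the volume type.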
Additional reading and related items:
Cloud conversations: AWS EBS, Glacier and S3 overview (Part I)
Cloud conversations: AWS EBS, Glacier and S3 overview (Part II)
Cloud conversations: AWS EBS, Glacier and S3 overview (Part III)
Cloud Bulk Big Data Software Defined Object Storage Resources
Welcome to the Cloud, Big Data, Software Defined, Bulk and Object Storage Resources Center Page objectstoragecenter.com.
This object storage resources page, which also covers software defined, cloud, bulk, and scale-out storage, is part of the Server StorageIOblog microsite collection of resources. Software-defined, Bulk, Cloud and Object Storage exist to support expanding and diverse application data demands.
Bulk, Cloud, Object Storage Solutions and Services
There are various types of cloud, bulk, and object storage including public services such as Amazon Web Services (AWS) Simple Storage Service (S3), Backblaze, Google, Microsoft Azure, IBM Softlayer, Rackspace among many others. There are also solutions for hybrid and private deployment from Cisco, Cloudian, CTERA, Cray, DDN, Dell EMC, Elastifile, Fujitsu, Vantara/HDS, HPE, Hedvig, Huawei, IBM, NetApp, Noobaa, OpenIO, OpenStack, Quantum, Rackspace, Rozo, Scality, Spectra, Storpool, StorageCraft, Suse, Swift, Virtuozzo, WekaIO, WD, among many others.
Cloud products and services among others, along with associated data infrastructures including object storage, file systems, repositories and access methods are at the center of bulk, big data, big bandwidth and little data initiatives on a public, private, hybrid and community basis. After all, not everything is the same in cloud, virtual and traditional data centers or information factories from active data to inactive deep digital archiving.
Object Context Matters
Before discussing Object Storage, let's take a step back and look at some context that can clarify some confusion around the term object. The word object has many different meanings and contexts, both inside of the IT world as well as outside. Context matters with the term object: as a noun it can be a thing that can be seen or touched, as well as a person or thing that action or feeling is directed towards.
Besides a person, place or physical thing, an object can be a software-defined data structure that describes something. For example, a database record describing somebody's contact or banking information, or a file descriptor with name, index ID, date and time stamps, permissions and access control lists along with other attributes or metadata. Another example is an object or blob stored in a cloud or object storage system repository, as well as an item in a hypervisor, operating system, container image or other application.
Besides being a noun, object can also be a verb, meaning to express disapproval or disagreement with something or someone. From an IT context perspective, an object can also refer to a programming method (e.g. object-oriented programming [OOP], or Java [among other environments] objects and classes) and systems development in addition to describing entities with data structures.
In other words, a data structure describes an object that can be a simple variable, constant, complex descriptor of something being processed by a program, as well as a function or unit of work. Beyond Java or databases, there are also objects that are unique to, or take their context from, specific environments such as operating systems, hypervisors, file systems, cloud and other things.
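As an illustration of an object as a software-defined data structure, here is a minimal sketch that models the kind of descriptor mentioned above: a name, index ID, time stamps, permissions, access control list and other metadata wrapped around some data. The class and field names are hypothetical and do not correspond to any particular object storage product or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredObject:
    """Hypothetical descriptor for an object or blob plus its metadata."""
    name: str                      # object name or key
    index_id: int                  # index ID within the repository
    created: datetime              # creation time stamp
    modified: datetime             # last-modified time stamp
    permissions: str               # simple permission string
    acl: list[str] = field(default_factory=list)            # access control entries
    metadata: dict[str, str] = field(default_factory=dict)  # other attributes
    data: bytes = b""              # the object's contents

    def size(self) -> int:
        """Size of the object's data in bytes."""
        return len(self.data)


now = datetime.now(timezone.utc)
obj = StoredObject(
    name="reports/april-2013.pdf",
    index_id=42,
    created=now,
    modified=now,
    permissions="rw-r--r--",
    acl=["alice:read"],
    metadata={"content-type": "application/pdf"},
    data=b"%PDF-",
)
print(obj.name, obj.size(), obj.metadata)
```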
The Need For Bulk, Cloud and Object Storage
There is no such thing as an information recession: more data is being generated, moved, processed, stored, preserved and served, granted there are economic realities. Likewise as a society our dependence on information being available for work or entertainment, from medical healthcare to social media and all points in between, continues to increase (check out the Human Face of Big Data).
Object and cloud storage are in your future; the questions are when, where, with what and how among others.
Watch for more content and links to be added here soon to this object storage center page including posts, presentations, podcasts, polls, perspectives along with services and product solutions profiles.
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.
NetApp announced the other day a new all nand flash solid-state device (SSD) storage system called the EF540 that is available now. The EF540 has some things new and cool, along with some things familiar, tried, true and proven.
What is new is that the EF540 is an all nand flash multi-level cell (MLC) SSD storage system. What is old is that the EF540 is based on the NetApp E-Series (read more here and here) and SANtricity software with hundreds of thousands of installed systems. As a refresher, the E-Series are the storage system technologies and solutions obtained via the Engenio acquisition from LSI in 2011.
Image via www.ntapgeek.com
The EF540 expands the NetApp SSD flash portfolio which includes products such as FlashCache (read cache aka PAM) for controllers in ONTAP based storage systems. Other items in the NetApp flash portfolio include FlashPool SSD drives for persistent read and write storage in ONTAP based systems. Complementing FlashCache and FlashPool is the server-side PCIe caching card and software FlashAccel. NetApp is claiming to have revenue shipped 36PB of flash complementing over 3 Exabytes (EB) of storage while continuing to ship a large amount of SAS and SATA HDDs.
NetApp also previewed its future FlashRay storage system that should appear in beta later in 2013 and general availability in 2014.
EMC and NetApp (along with other vendors) continue to sell large numbers of HDDs as well as large amounts of SSD. Both EMC and NetApp are taking similar approaches of leveraging PCIe flash cards as cache, adding software functionality to complement underlying storage systems. The benefit is that the cache approach is less disruptive for many environments while allowing improved return on investment (ROI) of existing assets.
What does this all mean? The NetApp EF540, based on the E-Series storage system architecture, is like one of its primary competitors (e.g. EMC VNX, also available as an all-flash model). The similarity is that both have been competitors and have been around for over a decade with hundreds of thousands of installed systems. Both also continue to evolve their code base, leveraging new hardware and software functionality. These improvements have resulted in improved performance, availability, capacity, energy effectiveness and cost reduction.
From a performance perspective, there are plenty of public workloads and benchmarks including Microsoft ESRP and SPC among others to confirm its performance. Watch for NetApp to release EF540 SPC results given their history of doing so with other E-Series based systems. With those or other results, compare and contrast to other solutions, looking not just at IOPS or MB/sec (bandwidth), but also at latency, functionality and cost.
What does the EF540 compete with? The EF540 competes with all flash-based SSD solutions (Violin, SolidFire, Pure Storage, Whiptail, Kaminario, IBM/TMS, and the up-coming EMC Project "X", aka XtremIO) among others. Some of those systems use general-purpose servers combined with SSD drives, PCIe cards along with management software where others leverage customized platforms with software. To a lesser extent, competition will also be mixed mode SSD and HDD solutions along with some PCIe target SSD cards for some situations.
What to watch and look for: It will be interesting to view and contrast public price performance results using SPC or Microsoft ESRP among others to see how the EF540 compares. In addition, it will be interesting to compare other storage based, as well as SSD systems beyond the number of IOPS. What will be interesting is to keep an eye on latency, as well as bandwidth, feature functionality and associated costs.
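To show why looking beyond IOPS matters, here is a small sketch that ranks systems by different metrics. All of the names and numbers are made-up placeholders (not SPC, ESRP or vendor results) and exist only to show that the "best" system changes depending on which metric you sort by.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One benchmark result; every value used below is a hypothetical placeholder."""
    name: str
    iops: float
    latency_ms: float
    bandwidth_mb_s: float
    price_usd: float

    def usd_per_iops(self) -> float:
        return self.price_usd / self.iops

    def usd_per_mb_s(self) -> float:
        return self.price_usd / self.bandwidth_mb_s


results = [
    Result("System A (hypothetical)", 100_000, 1.0, 1_000, 200_000),
    Result("System B (hypothetical)", 150_000, 2.5, 900, 240_000),
]

# Rank the same systems three different ways.
by_cost_per_iops = sorted(results, key=lambda r: r.usd_per_iops())
by_latency = sorted(results, key=lambda r: r.latency_ms)
by_cost_per_bandwidth = sorted(results, key=lambda r: r.usd_per_mb_s())

print("Lowest $/IOPS:   ", by_cost_per_iops[0].name)
print("Lowest latency:  ", by_latency[0].name)
print("Lowest $/MB/sec: ", by_cost_per_bandwidth[0].name)
```

In this made-up data set one system wins on cost per IOPS while the other wins on latency and cost per MB/sec, which is the reason to compare more than a single headline number.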
Given that the NetApp E-Series are OEMed or sold by third parties, let's see if something looking similar or identical to the EF540 appears at any of those or new partners. This includes traditional general purpose and little-data environments, along with cloud, managed service provider, high performance compute and high productivity compute (HPC), super computer (SC), big data and big bandwidth among others.
The EF540 could also appear as a storage or IO accelerator for large scale-out, clustered, grid and object storage systems for metadata, indices, key value stores among other uses, either direct attached to servers, or via shared iSCSI, SAS, FC and InfiniBand (IBA) SCSI RDMA Protocol (SRP).
Keep an eye on how the startups that have been primarily Just a Bunch Of SSD (JBOS) in a box start talking about adding new features and functionality such as snapshots, replication or price reductions. Also, keep an eye and ear open to what EMC does with project “X” along with NetApp FlashRay among other improvements.
For NetApp customers, prospects, partners, E-Series OEMs and their customers with the need for IO consolidation, or performance optimization for big-data, little-data and related applications, the EF540 opens up new opportunities and should be good news. For EMC competitors, they now have new competition which also signals an expanding market with new opportunities in adjacent areas for growth. This also further signals the need for diverse SSD portfolios and product options to meet different customer application needs, along with increased functionality vs. lowest cost for high capacity fast nand SSD storage.
Cloud and object storage will continue to gain in awareness, functionality, and options from various providers in terms of products, solutions, and services. There will be a mix of large-scale solutions and smaller ones, with a mix of open-source and proprietary pieces. Some of these will be for archiving, some for backup or data protection. Others will be for big-data, high-performance computing, or cloud on a local or wide area basis, while others for general file sharing.
Along with cloud and object storage, watch for more options about how those products or services can be accessed using traditional NAS (NFS, CIFS, HDFS and others) along with block, such as iSCSI, and object APIs, including Amazon S3, REST, HTTP, JSON, XML, iOS and CDMI along with programmatic bindings.
Data protection modernization, including backup/restore, high-availability, business continuity, disaster recovery, archiving, and related technologies for cloud, virtual, and traditional environments will remain popular themes.
Expect more Fibre Channel over Ethernet for networking with your servers and storage, PCIe Gen 3 to move data in and out of servers, and Serial-attached SCSI (SAS) as a means of attaching storage to servers or as the back-end storage for larger storage systems and appliances. For those who like to look out over the horizon, keep an eye and ear open for more discussion around PCIe Gen 3 deployment and Gen 4 definitions, not to mention DDR4 and nand flash moving closer to the processors.
VMware buying Virsto should keep software defined marketing (SDM) and storage hypervisors, storage virtualization, virtual storage, and virtual storage arrays (VSAs) active topic themes. Let's also keep in mind, for storage space capacity optimization, data footprint reduction (DFR) including archiving, backup and data protection modernization, compression, consolidation, dedupe and data management.
Depending on whom you talk to or ask, you will get different views and opinions, some of them stronger than others, on whether magnetic tape is dead or alive as a data storage medium. However, an aspect of tape that is alive is the discussion by those for it, against it, or who simply see it as one of many data storage mediums and technologies whose role is changing.
Here is a link to an ongoing discussion over in one of the LinkedIn group forums (Backup & Recovery Professionals) titled About Tape and disk drives. Rest assured, there is plenty of FUD and hype on both sides of the tape is dead (or alive) arguments, not very different from the disk is dead vs. SSD or cloud arguments. After all, not everything is the same in data centers, clouds and information factories.
FWIW, I removed tape from my environment about 8 years ago, or I should say directly, as some of my cloud providers may in fact be using tape in various ways that I do not see, nor do I care one way or the other as long as my data is safe, secure, protected and SLAs are met. Likewise, I consult and advise for organizations where tape still exists yet its role is changing, same with those using disk and cloud.
I am not ready to adopt the singular view that tape is dead yet, as I know too many environments that are still using it. However, I agree that its role is changing, thus I am not part of the tape cheerleading camp.
On the other hand, I am a fan of using disk based data protection along with cloud in new and creative ways (including for my own use) as part of modernizing data protection. Although I see disk as having a very bright and important future beyond what it is being used for today, I am not ready to join the chants of tape is dead either.
Does that mean I can’t decide or don’t want to pick a side? NO
It means that I do not have to, nor should anyone have to, choose a side. Instead, look at your options: what are you trying to do, and how can you leverage different things, techniques and tools to maximize your return on innovation? If that means that tape is being phased out of your organization, good for you. If that means there is a new or different role for tape in your organization co-existing with disk, then good for you.
If somebody tells you that tape sucks and that you are dumb and stupid for using it without giving any informed basis for those comments, then call them dumb and stupid and request that they come back when they can learn more about your environment, needs, and requirements, ready to have an informed discussion on how to move forward.
Likewise, if you can make an informed value proposition on why and how to migrate to new ways of modernizing data protection without having to stoop to the tape is dead argument, or cite some research or whatever, good for you and start telling others about it.
OTOH, if you need to use FUD and hype on why tape is dead, why it sucks or is bad, at least come up with some new and relevant facts, third-party research, arguments or value propositions.