Amazon Web Services (AWS) July 2018 Updates

The Amazon Web Services (AWS) July 2018 updates continue to expand the features, functionality and service capabilities of the public cloud provider across various geographies.

Recent AWS updates include Snowball Edge (SBE), which adds local, on-site, on-premises (aka on-prem) EC2 compute capabilities to the Snowball appliance. Previously Snowball was a data and storage migration only appliance; with the new capabilities, compute is also enabled as part of a turnkey converged platform. Read more about SBE here.

In other updates, AWS has extended its Elastic Compute Cloud (EC2) capabilities (besides Snowball Edge) with new instance types that leverage its next generation hypervisor as part of Nitro enabled systems. New EC2 instances span from on-prem Snowball Edge (SBE) to AWS Dedicated (aka bare metal) instances, along with traditional cloud instances (e.g., virtual machines).

These new instances, including R5, R5D, and Z1D among others, leverage faster Intel Xeon Platinum 8000 series processors along with more memory. For example, Z1D is a compute-intensive instance with an all-core turbo frequency of up to 4.0 GHz, while R5 is memory optimized with 3.1 GHz cores (up to 96 vCPUs) and up to 768GB of RAM. The R5D is a memory-optimized instance that also supports up to 3.6TB of on-instance NVMe based storage. View additional AWS instance types here.

AWS has enhanced its SageMaker machine learning (ML) service to support higher throughput, enabling faster batch data transformation jobs for non-real-time inference. To enable higher data and API call rates, AWS has also increased Simple Storage Service (S3) request rates. Another enhancement is a preview of bring your own IP address for Virtual Private Cloud (VPC), as part of enabling hybrid clouds.

View additional new, recent and past AWS updates here, and here.

Where to learn more

Learn more about AWS, Cloud and data infrastructures related topics via the following links:

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What this all means

The Amazon Web Services (AWS) July 2018 updates continue to expand the number, type and extensiveness of public cloud services, as well as enabling hybrid capabilities. The updates also address different data infrastructure layers, from lower level Infrastructure as a Service (IaaS) including EC2 compute, to higher level artificial intelligence (AI), machine learning (ML) and deep learning (DL), among other cognitive as well as analytic offerings.

Ok, nuff said, for now.

Cheers Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2018. Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO. All Rights Reserved. StorageIO is a registered Trade Mark (TM) of Server StorageIO.

AWS Snowball Edge SBE Converged Cloud Storage Appliance

As part of extending its cloud platform reach, recent Amazon Web Services (AWS) announcements include the AWS Snowball Edge (SBE) converged cloud storage appliance. SBE has evolved from its previous focus as a data transfer and migration appliance to now include support for on-prem compute. SBE has previously been available as an appliance that ships from AWS to your location as a service to enable bulk data movement to the public cloud (e.g. an AWS Simple Storage Service (S3) bucket). With this new capability, AWS is enabling SBE to support on-prem compute similar to Elastic Compute Cloud (EC2) cloud instances.

AWS Snowball Data Migration at PB scale
AWS Snowball Appliance Image via AWS.com

What is AWS Snowball

Snowball is a bulk physical data migration appliance that AWS ships to your location. You use Snowball by setting up a copy job with AWS. When the device arrives at your site, you set it up and start the copy jobs that move data from your source to the Snowball destination. Once data is copied, you ship the Snowball back to an AWS region and availability zone (AZ) where its contents are copied into a Simple Storage Service (S3) bucket of your choice. Once the copy job into your AWS S3 bucket is complete, AWS performs a secure erase of the Snowball.

Basic Snowball includes 10 GbE network connections (RJ45 and SFP+ [fiber or copper]). Security and Encryption includes 256-bit keys that can be managed via AWS Key Management Service (KMS). Note that keys are not sent to or stored on the device for security during transit. For additional protection, tamper-resistant seals are included along with the Trusted Platform Module (TPM) to detect unauthorized hardware, firmware or software changes.

End-to-end tracking is enabled using E Ink shipping labels, with monitoring via AWS Simple Notification Service (SNS). Once your data transfer job completes and is verified, AWS performs a software erasure of the SBE following NIST media handling guidelines.

For management, SBE has an API for customer integration, as well as the ability to create and manage transfer jobs via the AWS management console. SBE Adapter also gives customers direct access to Snowball where it appears as an S3 endpoint (how you access the storage and data).

Backside view of AWS Snowball
Backside view of Snowball Image via Amazon.com

Additional Snowball Speeds and Specification Feature Feeds include:

  • Storage space capacity of 50TB (42TB usable) or 80TB (72TB usable)
  • Network connectivity 10 GbE RJ45 (Cat6), SFP+ (Copper and Optical). Cables include RJ45 and Copper SFP+. For Fiber attached Ethernet, the customer supplies their own SFP+ optical cables.
  • SBE is designed for office environments as well as data centers (e.g., about 68 dB) and weighs about 47 pounds.
  • Power requirements include NEMA 5-15p (standard wall outlet) 100-200 volts with power cable included.

Note that for traditional Snowball deployments, an on-prem workstation or server is needed to copy data from source locations to the Snowball device.

How AWS Snowball and Snowball Edge work

How AWS Snowball Works

Referring to the image above, the first step to using AWS Snowball (or Snowball Edge) is to place an order via the AWS management console (A). Part of the ordering process involves setting up the data transfer job and, in the case of AWS Snowball Edge, defining the EC2 instance and image (read more about that here via AWS). After you place the order and complete setup, the AWS Snowball arrives at your location (B), on-site setup is done and the data transfer is performed (C). Once data is transferred, the AWS Snowball is returned to the designated AWS location via two-day shipping (D) and the data is copied into your specified S3 or Glacier bucket (E). After your data is transferred into the S3 or Glacier bucket you specify as part of the transfer job, you are able to do what you want with your files, folders, images, videos, VHDXs, VMDKs, ISOs, little data and big data.

What is AWS Snowball Edge

AWS has enhanced its Snowball Edge (SBE) data mobility, migration, and transport appliance to now also include compute. For those not familiar, Snowball is an appliance that comes in various sizes that you order from AWS. It shows up at your site, and then you copy your data to it for migration into AWS. Once data is copied, you return the appliance to AWS, where the data then appears in your designated S3 bucket. From your S3 bucket, you can then move the data, files, volumes and images to other locations, or use them for standing up EC2 compute, populating databases or other items.

With the new compute feature, AWS is enabling compute on the Snowball Edge appliance that functions similar to EC2 instances, except that it runs on your site. This means you can use the compute to run your own custom AMIs (Amazon Machine Images) on-site or on-prem in support of data migration, conversion or another process. You can also keep the appliance on-site for as long as you want (granted, your credit card gets charged) to support development, test, extended migration, or to have a converged or hyper-converged platform.

Note that with SBE having compute capability, you can now run an EC2 image that functions as your copy server, eliminating the need to have a workstation or server on-prem for the copy operation.

Additional AWS Snowball Edge Speeds and feature function feeds include:

  • 100TB (82TB usable) storage space capacity
  • 10 GbE network, along with 10/25 GbE SFP28 and 40 GbE QSFP+ with device-based encryption (customer provided network cables)
  • Local computing with EC2 and Lambda functions for remote deployment along with scale-out clustering of multiple SBEs
  • S3 compatible endpoint along with NFS endpoint (mount point) using both NFS v3 and v4.1.
  • Weighs about 50 pounds, with tamper-evident seals and TPM (similar to traditional Snowball) for detection of hardware, firmware or software changes.
  • Can exist in an office environment, or data center.
  • Power cables are included, NEMA 5-15p, 100-220 volts, 400 watts.
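The usable capacities quoted in this post make it easy to estimate how many appliances a migration might need. The following is a minimal sketch, not an AWS sizing tool: the function name and the 250TB data set are hypothetical, and it assumes data is simply split across devices with no compression or overhead.

```python
import math

# Usable capacities in TB as quoted in this post (raw sizes are 50, 80 and 100TB).
USABLE_TB = {
    "Snowball 50TB": 42,
    "Snowball 80TB": 72,
    "Snowball Edge 100TB": 82,
}


def devices_needed(data_tb, usable_tb):
    """Return how many appliances are needed to hold data_tb of data."""
    return math.ceil(data_tb / usable_tb)


if __name__ == "__main__":
    data_tb = 250  # hypothetical amount of data to migrate
    for name, usable in USABLE_TB.items():
        print(f"{name}: {devices_needed(data_tb, usable)} device(s)")
```

For the hypothetical 250TB data set, this works out to six of the 50TB Snowballs, or four of either the 80TB Snowball or the 100TB Snowball Edge.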

What is AWS Snowmobile

Need something with more capacity than an SBE? AWS has a more extensive version called Snowmobile that supports up to 100PB and is brought to your site via a 45-foot-long tractor-trailer truck. Both SBE and Snowmobile physically move data from your location to an AWS region availability zone (AZ), aka data center, where it is placed into the Simple Storage Service (S3) or Glacier bucket of your choice. Once in the S3 or Glacier bucket, you can move the data to wherever you need it.

Why Snowball Edge and Snowmobile vs. Fast Networks

Some people ask why there is a need for services such as SBE and Snowmobile, or for physically shipping your SSDs, HDDs, tape or other storage media to a cloud provider, in the Internet era of fast networks. The reason can be quite simple: most environments do not have internet connection speeds of 10 GbE or higher that can be dedicated outside of regular use for data movement at scale.

Likewise, some public cloud service providers have limitations on the network speed of their front-end general-purpose Internet access.

Note that some providers such as AWS have high-speed, low latency direct connect services from partner staging locations. However, those too may be limited in speed for large bulk transfers. AWS also has other performance-enhanced services for general Internet access including S3 Transfer Acceleration. Note that Microsoft Azure has special connectivity options such as ExpressRoute, while Google Cloud Platform (GCP) has Cloud Interconnect.
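To put the network argument in numbers, here is a rough back-of-the-envelope sketch. The function name and the 100TB example are hypothetical, and it assumes decimal units (1TB = 10^12 bytes) and ignores protocol overhead, so real transfers will take longer.

```python
def transfer_days(data_tb, link_gbps, utilization=1.0):
    """Estimate days needed to move data_tb terabytes over a network link.

    utilization is the fraction of the link that can actually be
    dedicated to the transfer (1.0 means the whole link).
    """
    bits = data_tb * 1e12 * 8                       # terabytes to bits
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 86400                          # seconds to days


if __name__ == "__main__":
    for gbps in (0.1, 1, 10):
        print(f"100TB over {gbps} Gbps: {transfer_days(100, gbps):.1f} days")
```

Under these assumptions, 100TB takes a little under a day on a fully dedicated 10 Gbps link, about nine days at 1 Gbps, and roughly three months at 100 Mbps, which is where shipping an appliance starts to look attractive.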

Is AWS SBE a CI, HCI, CiB or Appliance?

The answer to the question of whether SBE is a Converged Infrastructure (CI), Hyper-Converged Infrastructure (HCI), Cloud in a Box (CiB) or Cloud Appliance depends on your view and definition of those deployment models. Some will argue that SBE is a CI or HCI as well as CiB based on what Cisco, Dell Technologies, HPE, Microsoft (Azure Stack and Windows S2D), NetApp, Nutanix, Pivot3 and VMware vSAN among others offer.

On the other hand, some will argue that SBE is not the same as the above and others, given it does not meet their definition of CI, HCI, CiB or cloud appliance. What is important is not whether it is CI, HCI, CiB or an appliance, but rather what it can do, how it can adapt to your environment and work for you vs. you working for it. In other words, what is important is the enablement a solution provides vs. if it is CI, HCI, CiB or something else. Meanwhile, watch to see who ignores SBE, who welcomes it to their market space, and who throws mud balls and FUD balls at Snowball.

When to use Snowball vs. Snowball Edge

If all you need is a bulk data migration appliance using one of your servers or workstations for smaller amounts of data, traditional Snowball is a good fit. On the other hand, if you need to move more data, or leverage SBE enabled on-prem compute with EC2 and Lambda functionality for a short or long-term duration, as well as scale out to create a cluster, then SBE is for you. SBE is also a good fit for environments that need short-term as well as longer-term deployment of compute, storage and network (e.g., converged). Examples include factory environments, rugged implementations on ships, energy exploration and processing, traveling venues and sporting events, and distributed environments being consolidated, among others.

AWS Regions, AZ locations
AWS Regions and AZ’s image Via AWS.com

What About AWS Snowball Edge Pricing

Pricing varies based on the AWS region you use for your transfer and manage it from. Another variable is whether you select data transfer only, or also enable an EC2 compute instance on-prem. Yet another pricing variable is how long you will keep the Snowball Edge on-prem. You are given ten (10) free days as part of your data transfer job, along with days for shipping and return.

Beyond the ten free days, you will pay a daily rate that varies. The longer you keep the SBE on-prem, for example by committing to a one or three-year pre-pay, the larger the discounts you will receive. Also note that there are no data transfer fees for moving data into AWS. However, standard AWS storage charges (e.g. S3 and Glacier, along with API calls) apply once data is stored in AWS, or moved.

As an example, for data transfer only, the service fee for a data transfer job is USD 300 for the US and other non-Asia-Pacific (Singapore) regions. Additional days are $30 each.

Another example is selecting data transfer plus an EC2 compute instance, which varies by region. For example, the fee is $500 for a transfer job (US East Northern Virginia or Ohio), with a $50 a day extra fee. However, if you are willing to pay up front for one year, the day fee drops to $42 (varies by region), and to $35 a day for a three-year commitment.
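The on-demand examples above can be turned into a simple estimate. This is a minimal sketch using only the fees quoted in this post; the function name is hypothetical, the one and three-year commitment rates are not modeled, and AWS storage and API charges are excluded. Check current AWS pricing before relying on any of these numbers.

```python
FREE_DAYS = 10  # free on-site days included with a transfer job, per this post


def job_cost(days_on_site, job_fee, daily_rate, free_days=FREE_DAYS):
    """Estimate the appliance cost: job fee plus a daily fee beyond the free days."""
    extra_days = max(0, days_on_site - free_days)
    return job_fee + extra_days * daily_rate


if __name__ == "__main__":
    days = 25  # hypothetical number of days the appliance stays on-site
    print("Data transfer only:", job_cost(days, job_fee=300, daily_rate=30))
    print("Transfer plus EC2 compute:", job_cost(days, job_fee=500, daily_rate=50))
```

Keeping the appliance for 25 days means 15 billable extra days, so the data transfer only job comes to $750 and the transfer plus compute job to $1,250 under these assumptions.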

For some environments, it may cost less to buy a server with storage, set it up and manage, while for others, the simplicity of a turnkey converged platform may be more cost-effective along with better value. Learn more about AWS Snowball Edge pricing here.

Where to learn more

Learn more about AWS, Snowball Edge, Cloud and data infrastructures related topics via the following links:

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What this all means

Has AWS embraced hybrid public cloud and on-prem computing? IMHO, while AWS is making it easier for environments to use, access as well as move to public cloud, they are still focused on the public cloud as the destination. In other words, AWS is making it easy to move your data and applications to their services, as well as access them, with the AWS Snowball Edge (SBE) converged cloud storage appliance.

Ok, nuff said, for now.

Cheers Gs

AWS Cloud Application Data Protection Webinar

AWS Cloud Application Data Protection Webinar
Date: Tuesday, April 24, 2018 at 11:00am PT / 2:00pm ET

Only YOU can prevent data loss for on-premises, Amazon Web Service (AWS) based cloud, and hybrid applications.

Join me in this free AWS Cloud Application Data Protection Webinar (registration required), sponsored by Veeam and produced by Redmond Magazine, as we explore issues, trends, tools, best practices and techniques for enabling data protection with AWS technologies.

Hyper-V Disaster Recovery SDDC Data Infrastructure Data Protection

Attend and learn about:

  • Application-aware point in time snapshot data protection
  • Protecting AWS EC2 and on-premises applications (and data)
  • Leveraging AWS for data protection and recovery
  • And much more

Register for the live event or catch the replay here.

Where to learn more

Learn more about data protection, software defined data center (SDDC), software defined data infrastructures (SDDI), AWS, cloud and related topics via the following links:

SDDC Data Infrastructure

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

Software Defined Data Infrastructure Essentials Book SDDC

What this all means and wrap-up

You cannot go forward if you cannot go back to a particular point in time (e.g. recovery point objective or RPO). Likewise, if you cannot go back to a given RPO, how can you go forward with your business as well as meet your recovery time objective (RTO)? Join us for the live conversation or replay by registering (free) here to learn how to enable AWS cloud application data protection, as well as how to use AWS S3 for on-site, on-premises data protection.

Ok, nuff said, for now.

Gs

Amazon Web Service AWS September 2017 Software Defined Data Infrastructure Updates

server storage I/O data infrastructure trends

September was a busy month for software defined data infrastructure, including cloud and related AWS announcements. One of the announcements was VMware partnering with AWS to deliver vSphere, vSAN and NSX data infrastructure components for creating software defined data centers (SDDC), including multi-cloud and hybrid cloud, leveraging AWS elastic bare metal servers (read more here in a companion post). Unlike traditional partner software defined solutions that relied on AWS Elastic Compute Cloud (EC2) instances, VMware is being deployed using private bare metal AWS elastic servers.

What this means is that the VMware vSphere (e.g. ESXi) hypervisor, vCenter, software defined storage (vSAN), software defined network (NSX) and associated vRealize tools are deployed on AWS data infrastructure that can be used for deploying hybrid software defined data centers (e.g. connecting to your existing VMware environment). Learn more about VMware on AWS here or click on the following image.

VMware on AWS via Amazon.com

Additional AWS Updates

Coinciding with VMworld, AWS announced the initial availability of VMware on AWS (using virtual private servers, e.g. think along the lines of Lightsail, not EC2 instances). AWS continues its expansion into database and table services with Relational Database Service (RDS), including various engines (Amazon Aurora, MariaDB, MySQL, Oracle, PostgreSQL, and SQL Server), along with Database Migration Service (DMS). Note that these RDS offerings are in addition to what you can install and run yourself on Elastic Compute Cloud (EC2) virtual machine instances, Lambda serverless containers, or Lightsail Virtual Private Servers (VPS).

AWS has published a guide to database testing on Amazon RDS for Oracle, plotting latency and IOPS for OLTP workloads using SLOB, here. If you are not familiar with SLOB (Silly Little Oracle Benchmark), here is a podcast with its creator Kevin Closson discussing database performance and related topics. Learn more about SLOB and step by step installation for AWS RDS Oracle here, and for those who are concerned or think that you cannot run workloads to evaluate Oracle platforms, have a look at this here.

EC2 enhancements include charging by the second (previously by the hour) for some EC2 instances (see details here including what is or is not currently available), which is a growing trend by private cloud vendors aligning with how serverless containers have been billed. New large memory EC2 instances that, for example, support up to 3,904GB of DDR4 RAM have been added by AWS. Other EC2 enhancements include updated network performance for some instances, an OpenCL development environment to leverage AWS F1 FPGA enabled instances, along with new Elastic GPU enabled instances. Other server and network enhancements include the newly announced Network Load Balancer for Elastic Load Balancing, as well as Application Load Balancer support for load balancing to IP addresses as targets for AWS and on-premises (e.g. hybrid) resources.
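To see why the move from hourly to per-second billing matters for short-lived workloads, consider the following sketch. The function names and the $0.10 hourly rate are hypothetical, and the 60-second minimum charge is an assumption to verify against current AWS billing terms.

```python
import math


def hourly_billed_cost(runtime_seconds, hourly_rate):
    """Cost when every started hour is charged as a full hour."""
    hours = math.ceil(runtime_seconds / 3600)
    return hours * hourly_rate


def per_second_billed_cost(runtime_seconds, hourly_rate, minimum_seconds=60):
    """Cost when usage is charged per second, with an assumed minimum charge."""
    billable_seconds = max(runtime_seconds, minimum_seconds)
    return billable_seconds * hourly_rate / 3600


if __name__ == "__main__":
    rate = 0.10    # hypothetical on-demand price in USD per hour
    runtime = 300  # a five-minute job
    print(f"Hourly billing:     ${hourly_billed_cost(runtime, rate):.4f}")
    print(f"Per-second billing: ${per_second_billed_cost(runtime, rate):.4f}")
```

With these assumptions, a five-minute job is charged a full hour under hourly billing but only one twelfth of that under per-second billing, which adds up quickly for bursty or batch workloads that start and stop many instances.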

Other updates and announcements include data protection backups to AWS via Commvault and the AWS Storage Gateway VTL. IBM has announced a quick start on AWS for its Spectrum Scale (e.g. formerly known as SONAS aka GPFS) scale-out storage solution for high performance compute (HPC). Additional AWS enhancements include a new edge location in Boston and a third Seattle site, while Direct Connect sites have been added in Boston and Houston along with Canberra, Australia. View more AWS announcements and enhancements here.

Where To Learn More

Learn more about related technology, trends, tools, techniques, and tips with the following links.

What This All Means

AWS continues to grow and expand, both in terms of the number of services and their extensiveness. Likewise, AWS continues to add more regions and data center availability zones, enhanced connectivity, along with the earlier mentioned service features. The partnership with VMware should enable enterprise organizations to move towards hybrid cloud data infrastructures, while giving AWS an additional reach into those data centers. Overall, this is a good set of enhancements by AWS, which continues to evolve its cloud and software defined data infrastructure portfolio of solution offerings.

By the way, if you have not heard, it's Blogtober. Check out some of the other blogs and posts occurring during October here.

Ok, nuff said, for now.
Gs

Dell EMC Announce Azure Stack Hybrid Cloud Solution

server storage I/O trends

Dell EMC Azure Stack Hybrid Cloud Solution

Dell EMC have announced their Microsoft Azure Stack hybrid cloud platform solutions. This announcement builds upon earlier statements of support and intention by Dell EMC to be part of the Microsoft Azure Stack community. For those of you who are not familiar, Azure Stack is an on-premises extension of the Microsoft Azure public cloud.

What this means is that essentially you can have the Microsoft Azure experience (or a subset of it) in your own data center or data infrastructure, enabling cloud experiences and abilities at your own pace, your own way, with control. Learn more about Microsoft Azure Stack, including my experiences with and installing Technical Preview 3 (TP3), here.

software defined data infrastructures SDDI and SDDC

What Is Azure Stack

Microsoft Azure Stack is an on-premises (e.g. in your own data center) private (or hybrid when connected to Azure) cloud platform. Currently Azure Stack is in Technical Preview 3 (e.g. TP3) and available as a proof of concept (POC) download from Microsoft. You can use Azure Stack TP3 as a POC for learning, demonstrating and trying features among other activities. Here is a link to a Microsoft video providing an overview of Azure Stack, and here is a good summary of roadmap, licensing and related items.

In summary, Microsoft Azure Stack and this announcement is about:

  • An on-site, on-premises, in your data center extension of Microsoft Azure public cloud
  • Enabling private and hybrid cloud with good integration along with shared experiences with Azure
  • Adopt, deploy, leverage cloud on your terms and timeline choosing what works best for you
  • Common processes, tools, interfaces, management and user experiences
  • Leverage speed of deployment and configuration with a purpose-built integrated solution
  • Support existing and cloud-native Windows, Linux, Container and other services
  • Available as a public preview via software download, as well as vendors offering solutions

What Did Dell EMC Announce

Dell EMC announced their initial products, platform solutions, and services for Azure Stack. This includes a Proof of Concept (PoC) starter kit (PE R630) for doing evaluations, prototypes, training, development, test, DevOps and other initial activities with Azure Stack. Dell EMC also announced a larger turnkey solution for production deployment, or large-scale development, test and DevOps activity. The initial production solution scales from 4 to 12 nodes, or from 80 to 336 cores, and includes hardware (server compute, memory, I/O and networking, top of rack (TOR) switches), management and Azure Stack software, along with services. Other aspects of the announcement include initial services in support of Microsoft Azure Stack and Azure cloud offerings.
server storage I/O trends
Image via Dell EMC

The announcement builds on joint Dell EMC Microsoft experience, partnerships, technologies and services spanning hardware, software, on site data center and public cloud.
server storage I/O trends
Image via Dell EMC

Dell EMC along with Microsoft have engineered a hybrid cloud platform for organizations to modernize their data infrastructures, enabling faster innovation and accelerated deployment of resources. It includes hardware (server compute, memory, I/O networking, storage devices), software, services, and support.
server storage I/O trends
Image via Dell EMC

The value proposition of Dell EMC hybrid cloud for Microsoft Azure Stack includes a consistent experience for developers and IT data infrastructure professionals, with a common experience across Azure public cloud and Azure Stack on-premises in your data center for private or hybrid use. This includes a common portal, PowerShell, DevOps tools, Azure Resource Manager (ARM), Azure Infrastructure as a Service (IaaS) and Platform as a Service (PaaS), Cloud Infrastructure and associated experiences (management, provisioning, services).
server storage I/O trends
Image via Dell EMC

Secure, protect, preserve and serve applications and VMs hosted on Azure Stack with Dell EMC services along with Microsoft technologies. These include Dell EMC data protection with backup and restore, Encryption as a Service, host guard and protected VMs, and AD integration among other features.
server storage I/O trends
Image via Dell EMC

Dell EMC services for Microsoft Azure Stack include single contact support for prepare, assessment, planning; deploy with rack integration, delivery, configuration; extend the platform with applicable migration, integration with Office 365 and other applications, build new services.
server storage I/O trends
Image via Dell EMC

Dell EMC hyper-converged scale-out solutions range from a minimum of 4 x PowerEdge R730XD (total raw specs include 80 cores (4 x 20), 1TB RAM (4 x 256GB), 12.8TB SSD cache and 192TB storage), plus two top of rack (TOR) network switches (Dell EMC) and a 1U management server node. The initial maximum configuration raw specification includes 12 x R730XD (total 336 cores), 6TB memory, 86TB SSD cache and 900TB storage, along with TOR network switches and a management server.

The above configurations initially enable HCI nodes of small (low) 20 cores, 256GB memory, 5.7TB SSD cache, 40TB storage; mid size 24 cores, 384GB memory, 11.5TB cache and 60TB storage; high-capacity with 28 cores, 512GB memory, 11.5TB cache and 80TB storage per node.
server storage I/O trends
Image via Dell EMC
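The core and memory totals quoted above follow directly from the per-node profiles. Here is a small sketch of that arithmetic; the profile names and function are hypothetical, only cores and memory are scaled, and it assumes every node in a cluster uses the same profile.

```python
# Per-node profiles as described in this post (cores and memory only).
NODE_PROFILES = {
    "small": {"cores": 20, "memory_gb": 256},
    "mid": {"cores": 24, "memory_gb": 384},
    "high": {"cores": 28, "memory_gb": 512},
}


def cluster_totals(profile, nodes):
    """Return total cores and memory for a cluster of identical nodes."""
    if not 4 <= nodes <= 12:
        raise ValueError("initial solution scales from 4 to 12 nodes")
    spec = NODE_PROFILES[profile]
    return {
        "cores": spec["cores"] * nodes,
        "memory_gb": spec["memory_gb"] * nodes,
    }


if __name__ == "__main__":
    print("Minimum:", cluster_totals("small", 4))
    print("Maximum:", cluster_totals("high", 12))
```

Four small nodes give 80 cores and 1,024GB (1TB) of memory, while twelve high-capacity nodes give 336 cores and 6,144GB (6TB), matching the minimum and maximum figures in the announcement.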

The Dell EMC Evaluator program for Microsoft Azure Stack includes the PE R630 for PoCs, development, test and training environments. The solution combines Microsoft Azure Stack software with a Dell EMC server using Intel E5-2630 (10 cores, 20 threads / logical processors or LPs) or Intel E5-2650 (12 cores, 24 threads / LPs) processors. Memory is 128GB or 256GB, and storage includes flash SSD (2 x 480GB SAS) and HDD (6 x 1TB SAS), along with networking.
server storage I/O trends
Image via Dell EMC

Collaborative support single contact between Microsoft and Dell EMC

Who Is This For

This announcement is for any organization that is looking for an on-premises, in your data center private or hybrid cloud turnkey solution stack. This initial set of announcements can be for those looking to do a proof of concept (PoC) or advanced prototype, support development, test and DevOps, or gain cloud-like elasticity, ease of use, rapid procurement and other experiences of public cloud, on your terms and timeline. Naturally, there is a strong affinity and seamless experience for those already using, or planning to use, Azure Public Cloud for Windows, Linux, Containers and other workloads, applications, and services.

What Does This Cost

Check with your Dell EMC representative or partner for exact pricing, which varies by size and configuration. There are also various licensing models to take into consideration if you have Microsoft Enterprise License Agreements (ELAs), which your Dell EMC representative or business partner can address for you. Likewise, being cloud based, there are also time and usage-based options to explore.

Where to learn more

What this all means

The dust is starting to settle on last fall's Dell EMC integration. Both companies have long histories working and partnering with Microsoft on legacy as well as virtual software-defined data centers (SDDC), software-defined data infrastructures (SDDI), native, and hybrid clouds. Some may view the Dell EMC VMware relationship as a primary focus; however, keep in mind that both Dell and EMC had worked with Microsoft long before VMware came into being. Likewise, Microsoft Windows remains one of the most commonly deployed operating systems on VMware-based environments. Granted, Dell EMC have a significant focus on VMware, yet they also sell, service and support many Microsoft-based solutions.

What about Cisco, HPE, Lenovo among others who have announced or discussed their Microsoft Azure Stack intentions? Good question. Until we hear more about what those and others are doing or planning, there is not much more to do or discuss beyond speculating for now. Another common question is if there is demand for private and hybrid cloud. In fact, some industry expert pundits have even said private or hybrid are dead, which is interesting: how can something be dead if it is just getting started? Likewise, it is early to tell if Azure Stack will gain traction with various organizations, some of whom may have tried or struggled with OpenStack among others.

Given a large number of Microsoft Windows-based servers on VMware, OpenStack, Public cloud services as well as other platforms, along with continued growing popularity of Azure, having a solution such as Azure Stack provides an attractive option for many environments. That leads to the question of if Azure Stack is essentially a replacement for Windows Servers or Hyper-V and if only for Windows guest operating systems. At this point indeed, Windows would be an attractive and comfortable option, however, given a large number of Linux-based guests running on Hyper-V as well as Azure Public, those are also primary candidates as are containers and other services.

Overall, this is an excellent and exciting move for Microsoft, extending their public cloud software stack to be deployed within data centers in a hybrid way, something that those customers are familiar with doing. This is a good example of hybrid spanning public and private clouds, remote and on-premises, as well as combining the familiarity and control of traditional procurement with the flexibility and elasticity experience of clouds.


Some will ask, if OpenStack is struggling in many organizations despite being free open source, how can Microsoft have success with Azure Stack? The answer could be that some organizations have struggled with OpenStack due to a lack of commercial services and turnkey support, while others have not. Having installed both OpenStack and Azure Stack (as well as VMware among others), Azure Stack, at least the TP3 PoC, is easy to install, granted it is limited to one node, unlike the production versions. Likewise, there are easy to use appliance versions of OpenStack that are limited in scale, as well as more involved installs that unlock full functionality.

OpenStack, Azure Stack, VMware and others have their places, alongside or supporting containers along with other tools. In some cases, those technologies may exist in the same environment supporting different workloads, as well as accessing various public clouds. After all, hybrid is the home run for many if not most legacy IT environments.

Overall this is a good announcement from Dell EMC for those who are interested in, or should become more aware of, Microsoft Azure Stack and cloud along with hybrid clouds. Likewise, I look forward to hearing more about the solutions from others who will be supporting Azure Stack as well as other hybrid (and Virtual Private) clouds.

Ok, nuff said (for now…).

Cheers
Gs

Greg Schulz – Microsoft MVP Cloud and Data Center Management, VMware vExpert (and vSAN). Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Watch for the spring 2017 release of his new book "Software-Defined Data Infrastructure Essentials" (CRC Press).

Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2023 Server StorageIO(R) and UnlimitedIO. All Rights Reserved.

March 2017 Server StorageIO Data Infrastructure Update Newsletter

Volume 17, Issue III

Hello and welcome to the March 2017 issue of the Server StorageIO update newsletter.

First, a reminder that World Backup (and Recovery) Day is on March 31. Following up from the February Server StorageIO update newsletter that had a focus on data protection, this edition includes some additional posts, articles, tips and commentary below.

Other data infrastructure (and tradecraft) topics in this edition include cloud, virtual, server, storage and I/O including NVMe as well as networks. Industry trends include new technology and services announcements, cloud services, HPE buying Nimble among other activity. Check out the Converged Infrastructure (CI), Hyper-Converged (HCI) and Cluster in Box (or Cloud in Box) coverage including a recent SNIA webinar I was invited to be the guest presenter for, along with companion post below.

In This Issue

Enjoy this edition of the Server StorageIO update newsletter.

Cheers GS

Data Infrastructure and IT Industry Activity Trends

Some recent Industry Activities, Trends, News and Announcements include:

Dell EMC has discontinued the NVMe direct attached shared DSSD D5 all flash array. At about the same time that Dell EMC is shutting down the DSSD D5 product, it has also signaled that it will leverage the various technologies including NVMe across its broad server storage portfolio in different ways moving forward. While Dell EMC is shutting down DSSD D5, they are also bringing additional NVMe solutions to the market, including those they have been shipping for years (e.g. on the server-side). Learn more about DSSD D5 here and here including perspectives of how it could have been used (plays for playbooks).

Meanwhile NVMe industry activity continues to expand with different solutions from startups as well as established vendors such as E8, Excelero, Everspin, Intel, Mellanox, Micron, Samsung and WD SanDisk among others. Also keep in mind, if the answer is NVMe, then what were and are the questions to ask, as well as what are some easy to use benchmark scripts (using fio, diskspd, vdbench, iometer).

Speaking of NVMe, flash and SSDs, Amazon Web Services (AWS) has added new Elastic Compute Cloud (EC2) storage and I/O optimized i3 instances. These new instances are available in various configurations with different amounts of vCPU (cores or logical processors), memory and NVMe SSD capacities (and quantity) along with price.

Note that the price per i3 instance varies not only by its configuration, but also by the image and region it is deployed in. The configurations range from an entry-level i3.large with 2 vCPU (logical processors), 15.25GB of RAM and a single 475GB NVMe SSD, which for example in the US East Region was recently priced at $0.156 per hour. At the high-end there is the i3.16xlarge with 64 vCPU (logical processors), 488GB RAM and 8 x 1900GB NVMe SSDs with a recent US East Region price of $4.992 per hour. Note that vCPU refers to the number of logical processors available and not necessarily cores or sockets.
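To compare the two ends of the i3 range, here is a small sketch that breaks the hourly prices quoted above down into per-vCPU, per-GB of RAM and per-GB of NVMe SSD figures. The dictionary layout and function name are illustrative only, and the prices are simply the recent US East figures mentioned above, which will change over time.

```python
# Figures quoted in this newsletter (recent US East prices); treat them as a snapshot.
INSTANCES = {
    "i3.large":    {"vcpu": 2,  "ram_gb": 15.25, "ssd_count": 1, "ssd_gb": 475,  "usd_per_hour": 0.156},
    "i3.16xlarge": {"vcpu": 64, "ram_gb": 488.0, "ssd_count": 8, "ssd_gb": 1900, "usd_per_hour": 4.992},
}


def unit_costs(spec):
    """Return hourly cost per vCPU, per GB of RAM and per GB of NVMe SSD."""
    total_ssd_gb = spec["ssd_count"] * spec["ssd_gb"]
    return {
        "usd_per_vcpu_hour": spec["usd_per_hour"] / spec["vcpu"],
        "usd_per_ram_gb_hour": spec["usd_per_hour"] / spec["ram_gb"],
        "usd_per_ssd_gb_hour": spec["usd_per_hour"] / total_ssd_gb,
    }


for name, spec in INSTANCES.items():
    costs = unit_costs(spec)
    print(name, {key: round(value, 6) for key, value in costs.items()})
```

With these particular figures the unit costs come out the same for both sizes, since the i3.16xlarge has 32 times the vCPU, RAM and SSD capacity of the i3.large at 32 times the hourly price.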

Also note that your performance will vary, and while NVMe protocol tends to use less CPU per I/O, if generating a large number of I/Os you will need some CPU. What this means is that if you find your performance limited compared to expectations with the lower end i3 instances, move up to a larger instance and see what happens. If you have a Windows-based environment, you can use a tool such as Diskspd to see what happens with I/O performance as you decrease the number of CPUs used.

Chelsio has announced they are now Microsoft Azure Stack Certified with their iWARP RDMA host adapter solutions, as well as for converged infrastructure (CI), hyper-converged (HCI) and legacy server storage deployments. As part of the announcement, Chelsio is also offering a 30 day no cost trial of their adapters for Microsoft Azure Stack, Windows Server 2016 and Windows 10 client environments. Learn more about the Chelsio trial offer here.

Everspin (the MRAM Spintorque, persistent RAM folks) have announced a new Storage Class Memory (SCM) NVMe accessible family (nvNITRO) of storage accelerator devices (PCIe AiC, U.2). What's interesting about Everspin is that they are using NVMe for accessing their persistent RAM (e.g. MRAM), making it easily plug compatible with existing operating systems or hypervisors. This means using standard out of the box NVMe drivers where the Everspin SCM appears as a block device (for compatibility) functioning as a low latency, high performance persistent write cache.

Something else interesting, besides making the new memory compatible with existing server CPU complexes via PCIe, is how Everspin is demonstrating that NVMe as a general access protocol is not exclusive to NAND flash-based SSDs. What this means is that instead of using non-persistent DRAM, or slower NAND flash (or 3D XPoint SCM), Everspin nvNITRO enables a high endurance write cache with persistence to complement existing NAND flash as well as emerging 3D XPoint based storage. Keep an eye on Everspin as they are doing some interesting things for future discussions.

Google Cloud Services has added additional regions (cloud locations) and other enhancements.

HPE continued buying into server storage I/O data infrastructure technologies, announcing an all cash (e.g. no stock) acquisition of Nimble Storage (NMBL). The cash acquisition for a little over $1B USD amounts to $12.50 USD per Nimble share, double what it had traded at. As a refresher or overview, Nimble provides all flash shared storage systems leveraging NAND flash solid state device (SSD) performance. Note that Nimble also partners with Cisco and Lenovo platforms that compete with HPE servers for converged systems. View additional perspectives here.

Riverbed has announced the release of SteelFusion 5 which, while its name implies physical hardware metal, is available as tin wrapped (e.g. hardware appliance) software. The solution is also available for deployment as a VMware virtual appliance for remote office branch office (ROBO) among others. Enhancements include converged functionality such as NAS support along with network latency and bandwidth improvements among other features.

Check out other industry news, comments, trends perspectives here.

Server StorageIOblog Posts

Recent and popular Server StorageIOblog posts include:

View other recent as well as past StorageIOblog posts here

Server StorageIO Commentary in the news

Recent Server StorageIO industry trends perspectives commentary in the news.

Via InfoStor: 8 Big Enterprise SSD Trends to Expect in 2017
Watch for increased capacities at lower cost, differentiation awareness of high-capacity, low-cost and lower performing SSDs versus improved durability and performance along with cost capacity enhancements for active SSD (read and write optimized). You can also expect increased support for NVMe both as a back-end storage device with different form factors (e.g., M.2 gum sticks, U.2 8639 drives, PCIe cards) as well as front-end (e.g., storage systems that are NVMe-attached) including local direct-attached and fiber-attached. This means more awareness around NVMe both as front-end and back-end deployment options.

Via SearchITOperations: Storage performance bottlenecks
Sometimes it takes more than an aspirin to cure a headache. There may be a bottleneck somewhere else, in hardware, software, storage system architecture or something else.

Via SearchDNS: Parsing through the software-defined storage hype
Beyond scalability, SDS technology aims for freedom from the limits of proprietary hardware.

Via InfoStor: Data Storage Industry Braces for AI and Machine Learning
AI could also lead to untapped hidden or unknown value in existing data that has no or little perceived value

Via SearchDataCenter: New options to evolve data backup recovery

View more Server, Storage and I/O trends and perspectives comments here

Various Tips, Tools, Technology and Tradecraft Topics

Recent Data Infrastructure Tradecraft Articles, Tips, Tools, Tricks and related topics.

Via ComputerWeekly: Time to restore from backup: Do you know where your data is?
Via IDG/NetworkWorld: Ensure your data infrastructure remains available and resilient
Via IDG/NetworkWorld: Whats a data infrastructure?

Check out Scott Lowe @Scott_Lowe of VMware fame who, while having a virtual networking focus, has a nice roundup of related data infrastructure topics including cloud and open source among others.

Want to take a break from reading or listening to tech talk? Check out some of the fun videos including aerial drone (and some technology topics) at www.storageio.tv.

View more tips and articles here

Events and Activities

Recent and upcoming event activities.

May 8-10, 2017 – Dell EMCworld – Las Vegas

April 3-7, 2017 – Seminars – Dutch workshop seminar series – Nijkerk Netherlands

March 15, 2017 – Webinar – SNIA/BrightTalk HyperConverged and Storage – 10AM PT

January 26 2017 – Seminar – Presenting at Wipro SDx Summit London UK

See more webinars and activities on the Server StorageIO Events page here.


Cheers
Gs


Microsoft Diskspd (Part II): Server Storage I/O Benchmark Tools


This is part two of a two-part post about Microsoft Diskspd that is also part of a broader series focused on server storage I/O benchmarking, performance, capacity planning, tools and related technologies. You can view part one of this post here, along with companion links here.

Microsoft Diskspd StorageIO lab test drive

Server and StorageIO lab

Talking about tools and technologies is one thing, installing as well as trying them is the next step for gaining experience so how about some quick hands-on time with Microsoft Diskspd (download your copy here).

The following commands all specify an I/O size of 8KBytes doing I/O to a 45GByte file called diskspd.dat located on the F: drive. Note that a 45GByte file is on the small side for general performance testing, however it was used for simplicity in this example. Ideally a larger target storage area (file, partition, device) would be used; on the other hand, if your application uses a small storage device or volume, then tune accordingly.

In this test, the F: drive is an iSCSI RAID protected volume, however you could use other storage interfaces supported by Windows including other block DAS or SAN (e.g. SATA, SAS, USB, iSCSI, FC, FCoE, etc.) as well as NAS. Also common to the following commands is using 16 threads (-t16) and 32 outstanding I/Os per thread (-o32) to simulate concurrent activity of many users, or application processing threads.

Other parameters common to the following commands are random I/O (-r), a test duration of 7200 seconds, i.e. two hours (-d7200), displaying latency (-L), disabling hardware and software caching (-h) and forcing CPU affinity (-a0,1,2,3). Since the test ran on a server with four cores, I wanted to see if I could use those to help keep the threads and storage busy. What varies in the commands below is the percentage of writes vs. reads (-w), as well as the results output file. Some of the workloads below also had the -S option specified to disable OS I/O buffering (to view how buffering helps when enabled or disabled). Depending on the goal, or type of test, validation, or workload being run, I would choose to set some of these parameters differently.

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -w0 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noh_write000.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -w50 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noh_write050.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -w100 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noh_write100.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -S -w0 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noSh_test_write000.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -S -w50 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noSh_write050.txt

diskspd -c45g -b8K -t16 -o32 -r -d7200 -h -S -w100 -L -a0,1,2,3 F:\diskspd.dat >> SIOWS2012R203_Eiscsi_145_noSh_write100.txt
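As a rough sanity check on how much concurrency the above commands request, the following sketch multiplies out the thread and queue depth settings. It assumes that -o is counted per thread for a single target file and that 8K means 8 KiB, which is my reading of the Diskspd documentation; the function name is made up for illustration.

```python
def in_flight(threads, outstanding_per_thread, block_bytes):
    """Return (outstanding I/O count, bytes in flight) for one target file."""
    ios = threads * outstanding_per_thread
    return ios, ios * block_bytes


# Values taken from the commands above: -t16 -o32 -b8K
ios, nbytes = in_flight(threads=16, outstanding_per_thread=32, block_bytes=8 * 1024)
print(f"{ios} outstanding I/Os, {nbytes / (1024 * 1024):.0f} MiB in flight")
```

Under those assumptions the workload keeps up to 512 I/Os (about 4 MiB of data) queued against the volume at any time, which is worth keeping in mind when reading the latency results.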

The following is the output from the above workload command.
Microsoft Diskspd sample output (parts 1 to 3)

Note that as with any benchmark, workload test or simulation your results will vary. In the above, the server, storage and I/O system were not tuned as the focus was on working with the tool and determining its capabilities. Thus do not focus on the performance results per se, rather on what you can do with Diskspd as a tool to try different things. Btw, fwiw, in the above example in addition to using an iSCSI target, the Windows Server 2012 R2 system was a guest on a VMware ESXi 5.5 system.

Where to learn more

The following are related links to read more about server (cloud, virtual and physical) storage I/O benchmarking tools, technologies and techniques.

Drew Robb’s benchmarking quick reference guide
Server storage I/O benchmarking tools, technologies and techniques resource page
Server and Storage I/O Benchmarking 101 for Smarties.
Microsoft Diskspd download and Microsoft Diskspd overview (via Technet)
I/O, I/O how well do you know about good or bad server and storage I/Os?
Server and Storage I/O Benchmark Tools: Microsoft Diskspd (Part I and Part II)

Comments and wrap-up

What I like about Diskspd (Pros)

Reporting includes CPU usage (you can't do server and storage I/O without CPU) along with IOPS (activity), bandwidth (throughput or amount of data being moved), per thread and total results along with optional reporting. While a GUI would be nice, particularly for beginners, I'm used to setting up scripts for different workloads, so having extensive options for setting up different workloads is welcome. Being associated with a specific OS (e.g. Windows), the CPU affinity and buffer management controls will be handy for some projects.

That Diskspd has the flexibility to use different storage interfaces and types of storage including files or partitions should be taken for granted, however with some tools you can't take such things for granted. I like the flexibility to easily specify various I/O sizes including large 1MByte, 10MByte, 20MByte, 100MByte and 500MByte to simulate application workloads that do large sequential (or random) activity. I tried some I/O sizes larger than 500MB (specified by the -b parameter), however I received various errors including "Could not allocate a buffer bytes for target", which means that Diskspd could only do I/O sizes smaller than that in my environment. While not able to do I/O sizes larger than 500MB, this is actually impressive. Several other tools I have used or worked with have I/O size limits down around 10MByte, which makes it difficult to create workloads that do large I/Os (note this is the I/O size, not the number of I/Os).

Oh, something else that should be obvious however will state it, Diskspd is free unlike some industry de-facto standard tools or workload generators that need a fee to get and use.

Where Diskspd could be improved (Cons)

For some users a GUI or configuration wizard would make the tool easier to get started with; on the other hand (oth), I tend to use the command capabilities of tools. It would also be nice to specify ranges as part of a single command, such as stepping through an I/O size range (e.g. 4K, 8K, 16K, 1MB, 10MB) as well as read write percentages along with varying random sequential mixes. Granted this can easily be done by having a series of commands, however I have become spoiled by using other tools such as vdbench.
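One way to get the "series of commands" is to generate them with a small script. The following is a minimal sketch, not part of Diskspd itself, that steps through I/O sizes, write percentages and random vs. sequential access using only flags shown earlier in this post. The function names and the output file naming scheme are made up for illustration, and leaving out -r is assumed to give sequential I/O (the Diskspd default as I understand it).

```python
from itertools import product


def diskspd_command(target, block_size, write_pct, random_io=True,
                    file_size="45g", threads=16, outstanding=32, duration_s=7200):
    """Build one diskspd command line using flags that appear in this post."""
    args = ["diskspd", f"-c{file_size}", f"-b{block_size}", f"-t{threads}",
            f"-o{outstanding}"]
    if random_io:
        args.append("-r")  # leave -r out for sequential I/O
    args += [f"-d{duration_s}", "-h", f"-w{write_pct}", "-L", target]
    return " ".join(args)


def sweep(target, block_sizes, write_pcts, patterns=(True, False)):
    """Yield (command, output file name) for every combination of settings."""
    for block, wpct, rnd in product(block_sizes, write_pcts, patterns):
        mode = "rand" if rnd else "seq"
        out = f"diskspd_{block}_{mode}_write{wpct:03d}.txt"  # hypothetical naming
        yield diskspd_command(target, block, wpct, rnd), out


for cmd, out in sweep(r"F:\diskspd.dat", ["4K", "8K", "16K", "1M", "10M"], [0, 50, 100]):
    print(f"{cmd} >> {out}")
```

The printed lines can be pasted into a batch file and run in sequence. For a sweep with this many combinations you would probably want a much shorter duration than the two hours used above.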

Summary

Server and storage I/O performance toolbox

Overall I like Diskspd and have added it to my Server Storage I/O workload and benchmark tool-box.

Keep in mind that the best benchmark or workload generation technology tool will be your own application(s) configured to run as close as possible to production activity levels.

However when that is not possible, an alternative is to use tools that have the flexibility to be configured as close as possible to your application(s) workload characteristics. This means that the focus should not be so much on the tool itself as on how flexibly the tool can work for you, granted the tool needs to be robust.

Having said that, Microsoft Diskspd is a good and extensible tool for benchmarking, simulation, validation and comparisons, however it will only be as good as the parameters and configuration you set it up to use.

Check out Microsoft Diskspd and add it to your benchmark and server storage I/O tool-box like I have done.

Ok, nuff said (for now)

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Part II: Revisiting re:Invent 2014, Lambda and other AWS updates


This is part two of a two-part series about Amazon Web Services (AWS) re:Invent 2014 and other recent cloud updates, read part one here.

AWS re:Invent 2014

AWS re:Invent announcements

Announcements and enhancements made by AWS during re:Invent include:

  • Key Management Service (KMS)
  • Amazon RDS for Aurora
  • Amazon EC2 Container Service
  • AWS Lambda
  • Amazon EBS Enhancements
  • Application development, deployment and life-cycle management tools
  • AWS Service Catalog
  • AWS CodeDeploy
  • AWS CodeCommit
  • AWS CodePipeline

AWS Lambda

In addition to announcing new higher performance Elastic Compute Cloud (EC2) instances along with a container service, another new service is AWS Lambda. Lambda is a service that automatically and quickly runs your application code in response to events, activities, or other triggers. Unlike standard EC2 per hour billing, the Lambda service is billed in 100 millisecond increments along with corresponding memory use. What this means is that instead of paying for an hour of time for your code to run, you can choose to use the Lambda service with more fine-grained consumption billing.

The Lambda service can be used to have your code functions staged and ready to execute. AWS Lambda can run your code in response to S3 bucket content (e.g. object) changes, messages arriving via Kinesis streams or table updates in databases. Some examples include responding to events such as a web-site click, responding to a data upload (photo, image, audio, file or other object), indexing, streaming or analyzing data, receiving output from a connected device (think Internet of Things IoT or Internet of Devices IoD), or a trigger from an in-app event among others. The basic idea with Lambda is to be able to pay for only the amount of time needed to do a particular function without having to have an AWS EC2 instance dedicated to your application. Initially Lambda supports Node.js (JavaScript) based code that runs in its own isolated environment.

AWS cloud example
Various application code deployment models

The Lambda service is pay for what you consume; charges are based on the number of requests for your code function (e.g. application), the amount of memory and the execution time. There is a free tier for Lambda that includes 1 million requests and 400,000 GByte seconds of time per month. A GByte second is one GByte of memory (e.g. DRAM vs. storage) consumed for one second. For example, if your application is run 100,000 times, with each run lasting 1 second and consuming 128MB of memory, that is 12,800,000 MB seconds, or 12,500 GB seconds (at 1,024MB per GB). View various pricing models here on the AWS Lambda site that show examples for different memory sizes, number of times a function runs and run time.

How much memory you select for your application code determines how long it can run in the AWS free tier, which is available to both existing and new customers. Lambda fees are based on the total across all of your functions, starting from when your code begins to run. Note that you could have from one to thousands or more different functions running in the Lambda service. As of this time, AWS is showing Lambda pricing as free for the first 1 million requests, and beyond that, $0.20 per 1 million requests ($0.0000002 per request), plus a charge for duration. Duration is from when your code starts running until it ends or otherwise terminates, rounded up to the nearest 100ms. The Lambda price also depends on the amount of memory you allocated for your code. Once past the 400,000 GByte second per month free tier, the fee is $0.00001667 for every GB second used.
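To make the billing model more concrete, here is a minimal sketch of a monthly estimate built only from the figures quoted above. The function and constant names are mine, it assumes every invocation runs for the same duration, and it converts memory at 1,024MB per GB; check the AWS Lambda pricing page for current rates before relying on any of it.

```python
import math

# Figures quoted in this post at the time of writing; current AWS pricing may differ.
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000
USD_PER_REQUEST = 0.20 / 1_000_000
USD_PER_GB_SECOND = 0.00001667


def lambda_monthly_cost(requests, avg_duration_ms, memory_mb):
    """Return (GB seconds consumed, estimated USD fee) for one month."""
    billed_ms = math.ceil(avg_duration_ms / 100) * 100  # round up to next 100ms
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    request_fee = max(0, requests - FREE_REQUESTS) * USD_PER_REQUEST
    duration_fee = max(0.0, gb_seconds - FREE_GB_SECONDS) * USD_PER_GB_SECOND
    return gb_seconds, request_fee + duration_fee


# 100,000 one-second runs at 128MB: 12,500 GB seconds, inside the free tier
print(lambda_monthly_cost(100_000, 1000, 128))

# A busier, hypothetical function: 5 million runs of 250ms (billed as 300ms) at 512MB
gb_s, usd = lambda_monthly_cost(5_000_000, 250, 512)
print(round(gb_s), round(usd, 2))
```

Note how the second example is billed for 300ms per run rather than 250ms, which is why trimming a function to just under a 100ms boundary can matter more than shaving a few milliseconds elsewhere.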

Why use AWS Lambda vs. an EC2 instance

Why would you use AWS Lambda vs. provisioning a container, EC2 instance or running your application code function on a traditional or virtual machine?

If you need control and can leverage an entire physical server with its operating system (O.S.), application and support tools for your piece of code (e.g. JavaScript), that could be an option. If you simply need to have an isolated image instance (O.S., applications and tools) for your code on a shared virtual on-premises environment, then that can be an option. Likewise if you have the need to move your application to an isolated cloud machine (CM) that hosts an O.S. along with your application, paying for those resources such as on an hourly basis, that could be your option. If you simply need a lighter-weight container to drop your application into, that's where Docker and containers come into play to off-load some of the traditional application dependencies overhead.

However, if all you want to do is to add some code logic to support processing activity, for example when an object, file or image is uploaded to AWS S3, without having to stand up an EC2 instance along with associated server, O.S. and complete application activity, that's where AWS Lambda comes into play. Simply create your code (initially JavaScript), specify how much memory it needs, define what events or activities will trigger or invoke it, and you have a solution.
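As a rough illustration of what such a piece of code looks like, here is a minimal handler sketch for an S3 upload event. Note that this post describes Lambda as initially supporting Node.js (JavaScript); the sketch is in Python, which Lambda added support for later, purely to stay consistent with the other examples here. The record layout (Records, s3, bucket, object) follows the S3 event notification format as I understand it, while the handler name, bucket and key are hypothetical.

```python
def handler(event, context):
    """Entry point Lambda would call for each S3 notification (sketch only)."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        size = record["s3"]["object"].get("size", 0)
        # Placeholder for the real work: index, resize, analyze and so on.
        uploaded.append({"bucket": bucket, "key": key, "size": size})
    return uploaded


if __name__ == "__main__":
    # Hand-built sample event for local testing; the names are made up.
    sample = {"Records": [{"s3": {"bucket": {"name": "example-bucket"},
                                  "object": {"key": "photos/cat.jpg", "size": 1024}}}]}
    print(handler(sample, None))
```

Being able to call the handler locally with a hand-built event like this is a handy way to test the logic before wiring it up to a real bucket.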

View AWS Lambda pricing along with free tier information here.

Amazon EBS Enhancements

AWS is increasing the performance and size of General Purpose SSD and Provisioned IOPS SSD volumes. This means that you can create volumes of up to 16TB and 10,000 IOPS for AWS EBS General Purpose SSD volumes. For EBS Provisioned IOPS SSD volumes you can create volumes of up to 16TB and 20,000 IOPS. General Purpose SSD volumes deliver a maximum throughput (bandwidth) of 160MBps and Provisioned IOPS SSD volumes have been specified by AWS at 320MBps when attached to EBS optimized instances. Learn more about EBS capabilities here. Verify your I/O size and verify AWS sizing information to avoid surprises, as not all I/O sizes are considered to be the same. Learn more about Provisioned IOPS, optimized instances, EBS and EC2 fundamentals in this StorageIO AWS primer here.
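To see why I/O size matters, the following sketch works out which limit (IOPS or throughput) is hit first for a given I/O size, using the General Purpose SSD figures quoted above. It is a simplification that treats MBps as MiB per second and ignores how AWS actually counts, merges or splits I/Os of different sizes, so treat the numbers as illustrative rather than as AWS sizing guidance.

```python
def effective_limits(iops_limit, throughput_limit_mib_s, io_size_kib):
    """Return (achievable IOPS, achievable MiB/s) for a given I/O size."""
    # IOPS that would saturate the throughput limit at this I/O size
    iops_by_throughput = throughput_limit_mib_s * 1024 / io_size_kib
    iops = min(iops_limit, iops_by_throughput)
    return iops, iops * io_size_kib / 1024


# General Purpose SSD figures from above: 10,000 IOPS and 160MBps
for size_kib in (4, 16, 64, 256):
    iops, mib_s = effective_limits(10_000, 160, size_kib)
    print(f"{size_kib:>3} KiB: {iops:>8.0f} IOPS, {mib_s:6.1f} MiB/s")
```

Under these assumptions small I/Os are capped by the IOPS limit well below the bandwidth ceiling, while at 256 KiB the bandwidth ceiling allows only 640 I/Os per second, far fewer than the headline IOPS number.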

Application development, deployment and life-cycle management tools

In addition to compute and storage resource enhancements, AWS has also announced several tools to support application development, configuration along with deployment (life-cycle management). These include tools that AWS uses themselves as part of building and maintaining the AWS platform services.

AWS Config (Preview e.g. early access prior to full release)

Management, reporting and monitoring capabilities including Data center infrastructure management (DCIM) for monitoring your AWS resources, configuration (including history), governance, change management and notifications. AWS Config enables similar capabilities to support DCIM, Change Management Database (CMDB), trouble shooting and diagnostics, auditing, resource and configuration analysis among other activities. Learn more about AWS Config here.

AWS Service Catalog

AWS announced a new service catalog that will be available in early 2015. This new service capability will enable administrators to create and manage catalogs of approved resources for users to use via their personalized portal. Learn more about AWS service catalog here.

AWS CodeDeploy

To support rapid code deployment automation for EC2 instances, AWS has released CodeDeploy. CodeDeploy masks the complexity associated with deployment when adding new features to your applications while reducing error-prone human operations. As part of the announcement, AWS mentioned that they are using CodeDeploy as part of their own application development, maintenance, change-management and deployment operations. While suited for at scale deployments across many instances, CodeDeploy also works with as few as a single EC2 instance. Learn more about AWS CodeDeploy here.

AWS CodeCommit

For application code management, AWS will be making available in early 2015 a new service called CodeCommit. CodeCommit is a highly scalable secure source control service that hosts private Git repositories. Supporting standard functionalities of Git, including collaboration, you can store things from source code to binaries while working with your existing tools. Learn more about AWS CodeCommit here.

AWS CodePipeline

To support application delivery and release automation along with associated management tools, AWS is making available CodePipeline. CodePipeline is a tool (service) that supports build, checking workflows, code staging, testing and release to production including support for 3rd party tool integration. CodePipeline will be available in early 2015, learn more here.

Additional reading and related items

Learn more about the above and other AWS services by actually trying them hands on using their free tier (AWS Free Tier). View AWS re:Invent produced breakout session videos here, audio podcasts here, and session slides here (all sessions may not yet be uploaded by AWS re:Invent).

What this all means


AWS continues to invest as well as re-invest into its environment, both adding new feature functionality and expanding the extensibility of those features. This means that AWS, like other vendors or service providers, adds new check-box features, however like some they also increase the depth and extensibility of those capabilities. Besides adding new features and increasing the extensibility of existing capabilities, AWS is addressing both the data and information infrastructure including compute (server), storage and database, networking along with associated management tools, while also adding extra developer tools. Developer tools include life-cycle management supporting code creation, testing, tracking and change management among other management activities.

Another observation is that while AWS continues to promote the public cloud such as those services they offer as the present and future, they are also talking hybrid cloud. Granted you have to listen carefully as you may not simply hear hybrid cloud used like some toss it around, however listen for and look into AWS Virtual Private Cloud (VPC), along with what you can do using various technologies via the AWS marketplace. AWS is also speaking the language of enterprise and traditional IT from an applications and development to data and information infrastructure perspective while also walking the cloud talk. What this means is that AWS realizes that they need to help existing environments evolve and make the transition to the cloud which means speaking their language vs. converting them to cloud conversations to then be able to migrate them to the cloud. These steps should make AWS practical for many enterprise environments looking to make the transition to public and hybrid cloud at their pace, some faster than others. More on these and some related themes in future posts.

The AWS re:Invent event continues to grow year over year. I heard a figure of over 12,000 people; however, it was not clear if that included exhibiting vendors, AWS people, attendees, analysts, bloggers and media among others. A simple validation is that the keynotes and the expo space were in the larger rooms used by events such as EMCworld and VMworld when they were hosted in Las Vegas, vs. what I saw last year at re:Invent. Unlike some large events such as VMworld, where at best there is a waiting queue or line to get into sessions or hands-on labs (HOL), AWS re:Invent, while becoming more crowded, is still easy to get into and spend some time using the HOL, which is of course powered by AWS, meaning you can resume later what you started while at re:Invent. Overall a good event and a nice series of enhancements by AWS; looking forward to next year's AWS re:Invent.

Ok, nuff said (for now)

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved

Some fall 2013 AWS cloud storage and compute enhancements

Storage I/O trends

I just received via Email the October Amazon Web Services (AWS) Newsletter in advance of the re:Invent event next week in Las Vegas (yes I will be attending).

AWS October newsletter and enhancement updates

What this means

AWS is arguably the largest of the public cloud services, with a diverse set of services and options across multiple geographic regions to meet different customer needs. As such, it is not surprising to see AWS continue to expand its portfolio, both in features and functionality and by extending its presence in different geographies.

Let's see what else AWS announces next week in Las Vegas at their 2013 re:Invent event.

Click here to view the current October 2013 AWS newsletter. You can view (and sign up for) earlier AWS newsletters here, and while you are at it, view the current and recent StorageIO Update newsletters here.

Ok, nuff said (for now)

Cheers
Gs


Cloud conversations: AWS EBS, Glacier and S3 overview (Part I)

Storage I/O industry trends image

Amazon Web Services (AWS) recently added EBS Optimized support for enhanced bandwidth EC2 instances (read more here). This industry trends and perspective cloud conversation is the first (looking at EBS) in a three-part series companion to the AWS EBS optimized post found here. Part II is here (closer look at S3) and part III is here (tying it all together).

AWS image via Amazon.com

For those not familiar, Simple Storage Service (S3), Glacier and Elastic Block Store (EBS) are part of the AWS cloud storage portfolio of services. There are several other storage and data related services, including little data databases (SQL and NoSQL based). Other offerings include compute, data management, application and networking services for different needs, as shown in the following image.

AWS services console image
AWS Services Console via www.amazon.com

Simple Storage Service (S3) is commonly used in the context of cloud storage and object storage accessed via its S3 API. S3 can be used externally from outside AWS as well as within or via other AWS services, for example with Elastic Compute Cloud (EC2), including via the Amazon Storage Gateway (read more here and about EC2 here). Glacier is the AWS cold or deep storage service for inactive data and is a companion to S3 that you can read more about here.

S3 is well suited for both big and little data repositories of objects ranging from backup to archive to active video images and much more. In fact, if you are using some of the different AaaS or SaaS services including backup or file and video sharing, those may be using S3 as their back-end storage repository. For example, NetFlix leverages various AWS capabilities as part of its data and applications infrastructure (read more here).

AWS basics

AWS consists of multiple regions that contain multiple availability zones where data and applications are supported from.

Note that objects stored in a region never leave that region unless you move them; for example, data stored in EU West never leaves Ireland, and data in US East never leaves Virginia.

AWS does support the ability for user controlled movement of data between regions for business continuance (BC), high availability (HA) and disaster recovery (DR). Read more here at the AWS Security and Compliance site and in this AWS white paper.

What about EBS?

That brings us to Elastic Block Store (EBS), which is used by EC2 (read more about EC2 and instances here) as storage for cloud and virtual machines or compute instances. EBS can be thought of as primary storage, with S3 used as a persistent backing store or target for holding snapshots. You can provision and allocate EBS volumes in the different data centers of the various AWS availability zones. As part of allocating your EBS volume, you indicate the type, either standard or provisioned IOPS, and you can pair it with the new EBS Optimized instance option. EBS Optimized enables instances that support the feature to have better IO performance to storage.

The following image shows an EC2 instance with EBS volumes (standard and provisioned IOPS’s) along with S3 volumes and snapshots. In the following example the instance and volumes are being served via the AWS US East region (Northern Virginia) using availability zone US East 1a. In addition, EBS optimized volumes are shown being used in the example to increase bandwidth or throughput performance between storage and the compute instance.

Using the above as a basis, you can build on that to leverage multiple availability zones or regions for HA, BC and DR combined with application, network load balancing and other capabilities. Note that EBS volumes are protected for durability by being spread across different servers and storage in an availability zone. Additional protection is provided by using snapshots combined with S3. Additional BC and DR or HA protection can be accomplished by replicating data across availability zones.

SQL applications using cloud and object storage services

The above is an example of tying various components and services together, for example using different AWS availability zones, instances, EBS, S3 and other tools, including those from third parties. Here is a link to a free chapter download from Cloud and Virtual Data Storage Networking (CRC Press) pertaining to data protection, BC and DR (available at Amazon here and Kindle here). In addition, here is an AWS white paper on using their services for BC, HA and DR.

EBS volumes are created ranging in size from 1 GByte to 1 TByte in space capacity, with multiple volumes being mapped or attached to an EC2 instance. EBS volumes appear as a virtual disk drive for block storage. From the EC2 instance and guest operating system you can mount, format and use the EBS volumes as you would any other block disk drive with your favorite tools and file systems. In addition to space capacity, EBS volumes are also provisioned with standard IO (e.g. disk based) performance or high performance Provisioned IOPS (e.g. SSD) for thousands of IOPS per instance. AWS states that a standard EBS volume should support about 100 IOPS on average, with about 2,000 IOPS for a provisioned IOPS volume. If you need more than 2,000 IOPS, the AWS recommendation is to use multiple provisioned IOPS volumes with data spread across them. Following is an example of AWS EBS volumes seen via the EC2 management interface.

Image of mapping AWS EBS to EC2 instance
AWS EC2 and EBS configuration status
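
The advice above about spreading data across several provisioned IOPS volumes comes down to simple arithmetic. The following is a minimal sketch, assuming the roughly 2,000 IOPS per volume figure quoted above (which may change over time); the function name `volumes_needed` is hypothetical and not part of any AWS tool.

```python
import math

# Per-volume figure quoted in the text above; treat it as an assumption.
MAX_IOPS_PER_VOLUME = 2000


def volumes_needed(target_iops, max_iops_per_volume=MAX_IOPS_PER_VOLUME):
    """Return how many provisioned IOPS volumes to spread data across."""
    if target_iops <= 0:
        raise ValueError("target_iops must be positive")
    # Round up: a partial volume's worth of IOPS still needs a whole volume.
    return math.ceil(target_iops / max_iops_per_volume)


if __name__ == "__main__":
    for target in (1500, 2000, 5000):
        print(target, "IOPS ->", volumes_needed(target), "volume(s)")
```

This only counts volumes; how the data is actually striped (for example with a software RAID or volume manager in the guest operating system) is a separate decision.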

Note that there is a 10 to 1 ratio of provisioned IOPS to space capacity (in GBytes). If you try to play a game of provisioning 1,000 IOPS on a 10 GByte EBS volume to keep your costs down, you are out of luck. Thus to get 1,000 IOPS you would need to allocate at least a 100 GByte EBS volume, for which you will be billed for the space you provision on a monthly pro-rated basis. The following is an example of provisioning an AWS EBS volume using provisioned IOPS in the US East region in the 1a availability zone.

Image of AWS EBS provisioned IOPs
Provisioning IOPS with EBS volume
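
The 10 to 1 rule described above can be checked before provisioning a volume. This is a minimal sketch, assuming the ratio of 10 provisioned IOPS per GByte stated in the text; the helper names `min_volume_gbytes` and `max_iops` are hypothetical and only illustrate the arithmetic.

```python
import math

# Ratio described in the text: at most 10 provisioned IOPS per GByte.
IOPS_PER_GBYTE = 10


def min_volume_gbytes(target_iops, iops_per_gbyte=IOPS_PER_GBYTE):
    """Smallest volume size (GBytes) that allows the requested IOPS."""
    if target_iops <= 0:
        raise ValueError("target_iops must be positive")
    return math.ceil(target_iops / iops_per_gbyte)


def max_iops(volume_gbytes, iops_per_gbyte=IOPS_PER_GBYTE):
    """Largest IOPS rate that can be provisioned on a volume of this size."""
    return volume_gbytes * iops_per_gbyte


if __name__ == "__main__":
    print(min_volume_gbytes(1000))  # size needed for 1,000 IOPS
    print(max_iops(10))             # ceiling for a 10 GByte volume
```

With these assumptions a 10 GByte volume tops out at 100 provisioned IOPS, which is why the 1,000 IOPS example needs at least 100 GBytes.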

Standard and Provisioned IOPS EBS volumes

Standard EBS volumes are good for boot images or other applications that are not IO performance intensive. For databases or other active applications where more performance is needed, EBS Provisioned IOPS volumes are your option. Note that the provisioned IOPS rate is persistent for the specific volume during its life. Thus if you set it and forget it, you will be billed for the provisioned rate even when you are not using the volume, until you remove it.

Additional reading and related items:

  • Cloud conversations: AWS EBS optimized instances
  • Cloud conversations: AWS EBS, Glacier and S3 overview (Part II S3)
  • Cloud conversations: AWS EBS, Glacier and S3 overview (Part III)
  • Cloud conversations: AWS Government Cloud (GovCloud)
  • Cloud conversations: Gaining cloud confidence from insights into AWS outages
  • AWS (Amazon) storage gateway, first, second and third impressions
  • Cloud conversations: Public, Private, Hybrid what about Community Clouds?
  • Amazon cloud storage options enhanced with Glacier
  • Amazon Web Services (AWS) and the NetFlix Fix?
  • Cloud conversation, Thanks Gartner for saying what has been said
  • Cloud and Virtual Data Storage Networking via Amazon.com
  • Seven Databases in Seven Weeks
  • www.objectstoragecenter.com

Continue reading part II (closer look at S3) here and part III (tying it all together) here.

    Ok, nuff said (for now)

    Cheers
    Gs


    Cloud conversations: AWS EBS Optimized Instances

    Storage I/O industry trends image

    Amazon Web Services (AWS) recently announced global availability of Elastic Block Store (EBS) optimized support for four extra Elastic Compute Cloud (EC2) instance types. The support enables optimized performance between standard and provisioned IOPS EBS volumes and EC2 instances to meet different bandwidth or throughput needs (learn more about AWS EBS, EC2, S3 and Glacier here).

    AWS image via Amazon.com

    The four EBS optimized instance types are m3.xlarge, m3.2xlarge, m2.2xlarge and c1.xlarge, which provide dedicated bandwidth or throughput between the EC2 instances and EBS volumes. The bandwidth ranges from 500 Mbits per second (500 / 8 = 62.5 MBytes per second) to 1,000 Mbits per second (1,000 / 8 = 125 MBytes per second) depending on the type of instance. As a refresher, EC2 instances (which by the time you read this could change) vary in size and functionality with different amounts of EC2 Compute Units (ECU), number of virtual cores, amount of storage space included, 32 or 64 bit, storage and networking IO performance, and EBS Optimized or not. In addition to instances, different operating system images can be installed, using those licensed from AWS such as various Windows and Unix images, or you can supply your own.

    Image of EC2 instance
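
The bandwidth figures above are a bits to bytes conversion. The following is a small sketch of that arithmetic, assuming 8 bits per byte and ignoring any protocol overhead; the function name `mbits_to_mbytes` is hypothetical.

```python
BITS_PER_BYTE = 8


def mbits_to_mbytes(mbits_per_second):
    """Convert a rate in Mbits per second to MBytes per second."""
    return mbits_per_second / BITS_PER_BYTE


if __name__ == "__main__":
    # The two EBS optimized bandwidth options mentioned in the text.
    for rate in (500, 1000):
        print(rate, "Mbits/sec =", mbits_to_mbytes(rate), "MBytes/sec")
```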

    There are also different generations of instances such as M1 (first generation where one ECU = 1.0 to 1.2 GHz of a 2007 era Opteron or Xeon processor) and M3 (second generation with faster processors), along with Micro low-cost options. There are also other optimized instances including high or large amounts of memory, high CPU or compute processing, clustered compute, high memory clustered, clustered GPU (e.g. using NVIDIA Tesla GPUs), high IO and high storage space capacity needs.

    Here is the announcement from AWS:

    Dear Amazon Web Services Customer,

    We are delighted to announce the global availability of EBS-optimized support for four additional instance types: m3.xlarge, m3.2xlarge, m2.2xlarge, and c1.xlarge. EBS-optimized instances deliver dedicated throughput between Amazon EC2 and Amazon EBS, with options between 500 Megabits per second and 1,000 Megabits per second depending on the instance type used. The dedicated throughput minimizes contention between EBS I/O and other traffic from your Amazon EC2 instance, providing the best performance for your EBS volumes.

    EBS-optimized instances are designed for use with both Standard and Provisioned IOPS EBS volumes. Standard volumes deliver 100 IOPS on average with a best effort ability to burst to hundreds of IOPS, making them well-suited for workloads with moderate and bursty I/O needs. When attached to an EBS-optimized instance, Provisioned IOPS volumes are designed to consistently deliver up to 2000 IOPS from a single volume, making them ideal for I/O intensive workloads such as databases. You can attach multiple Amazon EBS volumes to a single instance and stripe your data across them for increased I/O and throughput performance.

    Amazon EBS-optimized support is now available for m3.xlarge, m3.2xlarge, m2.2xlarge, m2.4xlarge, m1.large, m1.xlarge, and c1.xlarge instance types, and is currently supported in the US-East (N. Virginia), US-West (N. California), US-West (Oregon), EU-West (Ireland), Asia Pacific (Singapore), Asia Pacific (Japan), Asia Pacific (Sydney), and South America (São Paulo) Regions.

    You can learn more by visiting the Amazon EC2 detail page.

    Sincerely,

    The Amazon EC2 Team

    What this means is that AWS is enabling customers to size their compute instances and storage volumes with more flexibility to meet different needs. For example, EC2 instances can be chosen with various compute processing capabilities, amounts of memory, and network and storage I/O performance to volumes. In addition, storage volumes can be chosen based on space capacity size, standard or provisioned IOPS, bandwidth or throughput performance between the instance and volume, along with data protection such as snapshots.

    This means that the cost per space capacity of an EBS volume varies based on which AWS availability zone it is in, standard (lower IOP performance) or provisioned IOP’s (faster), along with instance type. In other words, cloud storage is not just about the cost per GByte, it’s also about the cost for IOPS, bandwidth to use it, where it is located (e.g. with AWS which Availability Zone), type of service, level of availability and durability among other attributes.

    Additional reading and related items:

    Continue reading part I (closer look at EBS) here, part II (closer look at S3) here and part III (tying it all together) here.

    Ok, nuff said (for now)

    Cheers
    Gs


    Amazon cloud storage options enhanced with Glacier

    StorageIO industry trend for storage IO

    In case you missed it, Amazon Web Services (AWS) has enhanced their cloud services (Elastic Compute Cloud or EC2) along with storage offerings. These include Relational Database Service (RDS), DynamoDB, Elastic Block Store (EBS), and Simple Storage Service (S3). Enhancements include new functionality along with availability or reliability in the wake of recent events (outages or service disruptions). Earlier this year AWS announced their Cloud Storage Gateway solution, which you can read an analysis of here. More recently AWS announced provisioned IOPS among other enhancements (see the AWS what's new page here).

    Amazon Web Services logo

    Before announcing Glacier, options for Amazon storage services relied on general purpose S3, or EBS with other Amazon services. S3 has provided users the ability to select different regions (the geographic locations where data is stored) along with the level of reliability at different price points for their applications or services being offered.

    Note that AWS S3 flexibility lends itself to individuals or organizations using it for various purposes. This ranges from storing backup or file sharing data to being used as a target for other cloud services. S3 pricing options vary depending on which region you select as well as whether you choose standard or reduced redundancy. As its name implies, reduced redundancy trades a lower level of durability for a lower cost per given amount of space capacity.

    AWS has now announced a new class or tier of storage service called Glacier, which as its name implies moves very slowly and is capable of supporting large amounts of data. In other words, it targets inactive or seldom accessed data where the emphasis is on ultra-low cost in exchange for a longer RTO. In exchange for an RTO that AWS states can be measured in hours, your monthly storage cost can be as low as 1 cent per GByte, or about 12 cents per year per GByte, plus any extra fees (see here).
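
The penny per GByte per month figure above scales linearly with capacity and time. The following is a rough sketch, assuming the $0.01 per GByte per month price quoted in the text and covering storage only (retrieval, request and transfer fees are not modeled); the function name `glacier_storage_cost` is hypothetical.

```python
from decimal import Decimal

# Price quoted in the text; actual pricing varies and may have changed.
PRICE_PER_GBYTE_MONTH = Decimal("0.01")


def glacier_storage_cost(gbytes, months=1, price=PRICE_PER_GBYTE_MONTH):
    """Storage-only cost in USD; pass gbytes and months as integers."""
    return Decimal(gbytes) * months * price


if __name__ == "__main__":
    print(glacier_storage_cost(1, months=12))     # one GByte for a year
    print(glacier_storage_cost(1000, months=12))  # about one TByte for a year
```

Decimal is used rather than floating point so that cents add up exactly.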

    Here is a note that I received from the Amazon Web Services (AWS) team:

    Dear Amazon Web Services Customer,
    We are excited to announce the immediate availability of Amazon Glacier – a secure, reliable and extremely low cost storage service designed for data archiving and backup. Amazon Glacier is designed for data that is infrequently accessed, yet still important to keep for future reference. Examples include digital media archives, financial and healthcare records, raw genomic sequence data, long-term database backups, and data that must be retained for regulatory compliance. With Amazon Glacier, customers can reliably and durably store large or small amounts of data for as little as $0.01/GB/month. As with all Amazon Web Services, you pay only for what you use, and there are no up-front expenses or long-term commitments.

    Amazon Glacier is:

    • Low cost – Amazon Glacier is an extremely low-cost, pay-as-you-go storage service that can cost as little as $0.01 per gigabyte per month, irrespective of how much data you store.
    • Secure – Amazon Glacier supports secure transfer of your data over Secure Sockets Layer (SSL) and automatically stores data encrypted at rest using Advanced Encryption Standard (AES) 256, a secure symmetric-key encryption standard using 256-bit encryption keys.
    • Durable – Amazon Glacier is designed to give average annual durability of 99.999999999% for each item stored.
    • Flexible – Amazon Glacier scales to meet your growing and often unpredictable storage requirements. There is no limit to the amount of data you can store in the service.
    • Simple – Amazon Glacier allows you to offload the administrative burdens of operating and scaling archival storage to AWS, and makes long term data archiving especially simple. You no longer need to worry about capacity planning, hardware provisioning, data replication, hardware failure detection and repair, or time-consuming hardware migrations.
    • Designed for use with other Amazon Web Services – You can use AWS Import/Export to accelerate moving large amounts of data into Amazon Glacier using portable storage devices for transport. In the coming months, Amazon Simple Storage Service (Amazon S3) plans to introduce an option that will allow you to seamlessly move data between Amazon S3 and Amazon Glacier using data lifecycle policies.

    Amazon Glacier is currently available in the US-East (N. Virginia), US-West (N. California), US-West (Oregon), EU-West (Ireland), and Asia Pacific (Japan) Regions.

    A few clicks in the AWS Management Console are all it takes to set up Amazon Glacier. You can learn more by visiting the Amazon Glacier detail page, reading Jeff Barr's blog post, or joining our September 19th webinar.
    Sincerely,
    The Amazon Web Services Team

    What is AWS Glacier?

    Glacier is low-cost, lower performance (e.g. access time) storage suited to data applications including archiving and inactive or idle data that you are not in a hurry to retrieve. Pay-as-you-go pricing can be as low as $0.01 USD per GByte per month (other optional fees may apply, see here) depending on region. Regions include US West coast (Oregon or Northern California), US East Coast (Northern Virginia), Europe (Ireland) and Asia (Tokyo).

    Now, what is understood should not have to be discussed; however, just to be safe, pity the fool who complains about signing up for AWS Glacier due to its penny per month per GByte cost and then finds it too slow for their iTunes or videos, as you know it's going to happen. Likewise, you know that some creative vendor or their surrogate is going to try to show a mismatch of AWS Glacier vs. their faster service that caters to a different usage model; it is just a matter of time.

    Let's be clear, Glacier is designed for low-cost, high-capacity, slow access of infrequently accessed data such as an archive or other items. This means that you will be more than disappointed if you try to stream a video, or access a document or photo from Glacier as you would from S3 or EBS or any other cloud service. The reason is that Glacier is designed with the premise of low-cost, high-capacity, high availability at the cost of slow access time or performance. How slow? AWS states that you may have to wait several hours to reach your data when needed; that is the tradeoff. If you need faster access, pay more or find a different class and tier of storage service to meet that need, perhaps, for those with the real need for speed, AWS SSD capabilities ;).

    Here is a link to a good post over at Planforcloud.com comparing Glacier vs. S3, which is like comparing apples and oranges; however, it helps to put things into context.

    In terms of functionality, Glacier security includes secure socket layer (SSL), advanced encryption standard (AES) 256 (256-bit encryption keys) data at rest encryption along with AWS identity and access management (IAM) policies.

    Persistent storage designed for 99.999999999% durability with data automatically placed in different facilities on multiple devices for redundancy when data is ingested or uploaded. Self-healing is accomplished with automatic background data integrity checks and repair.
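
To put the eleven nines of durability above into perspective, the figure can be turned into an expected number of items lost per year. This is a minimal sketch, assuming (as a simplification, not an AWS guarantee) that the design target behaves like an independent annual loss probability for each item; the function name `expected_annual_losses` is hypothetical.

```python
from fractions import Fraction

# Design target quoted in the text, expressed as an exact fraction.
DURABILITY = Fraction("99.999999999") / 100


def expected_annual_losses(items, durability=DURABILITY):
    """Expected number of items lost per year under the stated assumption."""
    return items * (1 - durability)


if __name__ == "__main__":
    losses = expected_annual_losses(1_000_000)
    print("Expected losses per year:", float(losses))
    print("Roughly one item lost every", 1 / losses, "years")
```

Fractions are used so that the tiny loss probability is not distorted by floating point rounding.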

    Scale and flexibility are bound by the size of your budget or credit card spending limit along with which regions and other options you choose. Glacier integrates with other AWS services including Import/Export, where you can ship large amounts of data to Amazon using different media and mediums. Note that AWS has also made a statement of direction (SOD) that S3 will be enhanced to seamlessly move data in and out of Glacier using data policies.

    Part of stretching budgets for organizations of all sizes is to avoid treating all data and applications the same (key theme of data protection modernization). This means classifying and addressing how and where different applications and data are placed on various types of servers and storage, along with revisiting and modernizing data protection.

    While the low-cost of Amazon Glacier is an attention getter, I am looking for more than just the lowest cost, which means I am also looking for reliability, security among other things to gain and keep confidence in my cloud storage services providers. As an example, a few years ago I switched from one cloud backup provider to another not based on cost, rather functionality and ability to leverage the service more extensively. In fact, I could switch back to the other provider and save money on the monthly bills; however I would end up paying more in lost time, productivity and other costs.

    What do I see as the barrier to AWS Glacier adoption?

    Simple: getting vendors and other service providers to enhance their products or services to leverage the new AWS Glacier storage category. This means backup/restore, BC and DR vendors ranging from Amazon (e.g. releasing S3 to Glacier automated policy based migration), Commvault, Dell (via their acquisitions of AppAssure and Quest), EMC (Avamar, Networker and other tools), HP, IBM/Tivoli, Jungledisk/Rackspace, NetApp, Symantec and others, not to mention cloud gateway providers, will need to add support for this new capability, along with those from other providers.

    As an Amazon EC2 and S3 customer, it is great to see Amazon continue to expand their cloud compute, storage, networking and application service offerings. I look forward to actually trying out Amazon Glacier for storing encrypted archive or inactive data to complement what I am doing. Since I am not using the Amazon Cloud Storage Gateway, I am looking into how I can use Rackspace Jungledisk to manage an Amazon Glacier repository similar to how it manages my S3 stores.

    Some more related reading:
    Only you can prevent cloud data loss
    Data protection modernization, more than swapping out media
    Amazon Web Services (AWS) and the NetFlix Fix?
    AWS (Amazon) storage gateway, first, second and third impressions

    As of now, it looks like I will have to wait until either Jungledisk adds native support, as they do today for managing my S3 storage pool, or the automated policy based movement between S3 and Glacier is transparently enabled.

    Ok, nuff said for now

    Cheers Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

    Amazon Web Services (AWS) and the NetFlix Fix?

    Amazon Web Services (AWS)

    I received the following note from Amazon Web Services (AWS) about an enhancement to their Elastic Compute Cloud (EC2) service that can be seen by some as an enhancement to service, or perhaps by others, after last week's outages, as a fix addressing a gap in their services. Note for those not aware, you can view the current AWS service status portal here.

    The following is the note I received from AWS.

    Announcing Multiple IP Addresses for Amazon EC2 Instances in Amazon VPC

    Dear Amazon EC2 Customer,

    We are excited to introduce multiple IP addresses for Amazon EC2 instances in Amazon VPC. Instances in a VPC can be assigned one or more private IP addresses, each of which can be associated with its own Elastic IP address. With this feature you can host multiple websites, including SSL websites and certificates, on a single instance where each site has its own IP address. Private IP addresses and their associated Elastic IP addresses can be moved to other network interfaces or instances, assisting with application portability across instances.

    The number of IP addresses that you can assign varies by instance type. Small instances can accommodate up to 8 IP addresses (across 2 elastic network interfaces) whereas High-Memory Quadruple Extra Large and Cluster Compute Eight Extra Large instances can be assigned up to 240 IP addresses (across 8 elastic network interfaces). For more information about IP address and elastic network interface limits, go to Instance Families and Types in the Amazon EC2 User Guide.

    You can have one Elastic IP (EIP) address associated with a running instance at no charge. If you associate additional EIPs with that instance, you will be charged $0.005/hour for each additional EIP associated with that instance per hour on a pro rata basis.

    With this release we are also lowering the charge for EIP addresses not associated with running instances, from $0.01 per hour to $0.005 per hour on a pro rata basis. This price reduction is applicable to EIP addresses in both Amazon EC2 and Amazon VPC and will be applied to EIP charges incurred since July 1, 2012.
    To learn more about multiple IP addresses, visit the Amazon VPC User Guide. For more information about pricing for additional Elastic IP addresses on an instance, please see Amazon EC2 Pricing.
    Sincerely,

    The Amazon EC2 Team

    End of AWS message
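
The Elastic IP (EIP) pricing described in the AWS note can be worked through as a small calculation. This is a minimal sketch, assuming the rates quoted in the note (first EIP on a running instance at no charge, $0.005 per hour for each additional EIP, and $0.005 per hour for EIPs not associated with a running instance); the function name `eip_charge` is hypothetical and the rates may have changed since.

```python
from decimal import Decimal

# Hourly rates quoted in the AWS note above; treat them as assumptions.
ADDITIONAL_EIP_HOURLY = Decimal("0.005")
UNATTACHED_EIP_HOURLY = Decimal("0.005")


def eip_charge(eips_on_instance, hours, unattached_eips=0):
    """EIP charge in USD; pass whole numbers for counts and hours."""
    # The first EIP associated with a running instance is free.
    billable = max(eips_on_instance - 1, 0)
    hourly = (billable * ADDITIONAL_EIP_HOURLY
              + unattached_eips * UNATTACHED_EIP_HOURLY)
    return hourly * hours


if __name__ == "__main__":
    print(eip_charge(1, 720))  # single EIP on a running instance
    print(eip_charge(4, 720))  # four EIPs on one instance for 720 hours
```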

     

    Server and StorageIO industry trends and perspective DAS

    Either way you look at it, AWS (disclosure: I’m a paying EC2 and S3 customer) is taking responsibility on their part to do what is needed to enable a resilient, flexible, scalable data infrastructure. What I mean by that is that protecting data and access to it in cloud environments is a shared responsibility, including discussing what went wrong, how to fix and prevent it, as well as communicating best practices. That is, both the provider of the service and those who are using those capabilities have to take some ownership and responsibility for how they get used.

    For example, last week major thunderstorms rolled across the U.S., causing large-scale power outages along the eastern seaboard and in particular in the Virginia area, where one of Amazon's availability zones (in the US East-1 region) has data centers located. Keep in mind that Amazon availability zones are made up of a collection of different physical data centers to cut or decrease chances of a single point of failure. However, on June 30, 2012, during the major storms on the East coast of the U.S., something did go wrong, and as is usually the case, a chain of events resulted in a disaster or near disaster (you can read the AWS post-mortem here).

    The result is that AWS services based out of the Virginia availability zone were knocked offline for a period, which impacted EC2, Elastic Block Store (EBS), Relational Database Service (RDS) and Elastic Load Balancer (ELB) capabilities for that zone. This is not the first time that the Virginia availability zone has been affected, having met a disruption about a year ago. What was different about this most recent outage is that a year ago one of the marquee AWS customers, NetFlix, was not affected during that outage due to how they use multiple availability zones for HA. In last week's AWS outage, NetFlix customers or services were affected, however not due to loss of data or systems, rather, loss of access (which to a user or consumer is the same thing). The loss of access was due to a failure of elastic load balancing, which was not able to direct users to other availability zones.

    Consequently, if you choose to read between the lines on the above email note I received from AWS, you can either look at the new service capabilities as an enhancement, or as AWS learning and improving their capabilities. Also reading between the lines, you can see how some environments such as NetFlix take responsibility in how they use cloud services, designing for availability, resiliency and scale with stability as opposed to simply using them as a cost cutting tool.

    Thus when both the provider and consumer take some responsibility for ensuring data protection and accessibility to services, there is less of a chance of service disruptions. Likewise, when both parties learn from incidents or mistakes, or leverage experiences, it makes for a more robust solution on a go-forward basis. For those who have been around the block (or file) a few times and think that clouds are not reliable or are still immature, you may have a point. However, think back to when your favorite or preferred platform (e.g. Mainframe, Mini, PC, client-server, iProduct, Web or other) initially appeared, along with its teething problems and associated headaches.

    IMHO AWS, along with other vendors or service providers who take responsibility to publish post-mortems of incidents, find and fix issues, and address and enhance capabilities, are part of the solution for laying the groundwork for the future vs. simply playing to a near-term trend theme. Likewise, vendors and service providers who are reaching out and helping to educate their customers, and getting them to take some responsibility in how they use services to remove complexity (and cost) and enhance services, as opposed to simply cutting cost and introducing risk, will do better over the long run.

    As I discuss in my book Cloud and Virtual Data Storage Networking (CRC Press), do not be scared of clouds, however be ready, do your homework, and learn and understand what needs to be done or done differently. This means taking a shared responsibility, one that the service provider should also be taking with you, not to mention identifying new best practices and tools to be used, along with conducting proof of concepts (POCs) to learn what to do and what not to do.

    Some related information:
    Only you can prevent cloud data loss
    The blame game: Does cloud storage result in data loss?
    Cloud conversations: Loss of data access vs. data loss
    Clouds are like Electricity: Dont be Scared
    AWS (Amazon) storage gateway, first, second and third impressions
    Poll: What Do You Think of IT Clouds? (Cast your vote and see results)

    Ok, nuff said for now.

    Cheers Gs

    Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)

    twitter @storageio

    All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO and UnlimitedIO All Rights Reserved

    AWS (Amazon) storage gateway, first, second and third impressions

    Amazon Web Services (AWS) today announced the beta of their new storage gateway functionality that enables access to Amazon S3 (Simple Storage Service) from your different applications using an appliance installed in your data center site. With this beta launch, Amazon joins other startup vendors who are providing standalone gateway appliance products (e.g. Nasuni) along with those who have disappeared from the market (e.g. Cirtas). In addition to gateway vendors, there are also those with cloud access added to their software tools (e.g. Jungle Disk, which accesses both Rackspace and Amazon S3, along with the Commvault Simpana cloud connector among others). There are also vendors that have added cloud access gateways as part of their storage systems, such as TwinStrata among others. Even EMC has gotten into the game, adding qualified cloud access support to some of their products.

    What is a cloud storage gateway?

    Before going further, let's take a step back and address what for some may be a fundamental question: what is a cloud storage gateway?

    Cloud services such as storage are accessed via some type of network, either the public Internet or a private connection. The type of cloud service being accessed (Figure 1) will decide what is needed. For example, some services can be accessed using a standard Web browser, while others require plug-in or add-on modules. Some cloud services may require downloading an application, agent, or other tool for accessing the cloud service or resources, while others provide an on-site or on-premises appliance or gateway.

    Figure 1: Accessing and using clouds (From Cloud and Virtual Data Storage Networking (CRC Press))

    Cloud access software and gateways or appliances are used for making cloud storage accessible to local applications. The gateways, as well as enabling cloud access, provide replication, snapshots, and other storage services functionality. Cloud access gateways or server-based software include tools from BAE, Citrix, Gladinet, Mezeo, Nasuni, OpenStack, and TwinStrata among others. In addition to cloud gateway appliances or cloud points of presence (cpops), access to public services is also supported via various software tools. Many data protection tools including backup/restore, archiving, replication, and other applications have added (or are planning to add) support for access to various public services such as Amazon, Google, Iron Mountain, Microsoft, Nirvanix, or Rackspace among several others.

    Some of the tools have added native support for one or more of the cloud services, leveraging various application programming interfaces (APIs), while other tools or applications rely on third-party access gateway appliances or a combination of native support and appliances. Another option for accessing cloud resources is to use tools (Figure 2) supplied by the service provider, which may be their own, from a third-party partner, or open source, as well as using their APIs to customize your own tools.

    Figure 2: Cloud access tools (From Cloud and Virtual Data Storage Networking (CRC Press))

    For example, I can use my Amazon S3 or Rackspace storage accounts using their web and other provided tools for basic functionality. However, for doing backups and restores, I use the tools provided by the service provider, which then deal with two different cloud storage services. The tool presents an interface for defining what to back up, protect, and restore, as well as enabling shared (public or private) storage devices and network drives. In addition to providing an interface (Figure 2), the tool also speaks the specific APIs and protocols of the different services, including PUT (create or update a container), POST (update header or metadata), LIST (retrieve information), HEAD (metadata information access), GET (retrieve data from a container), and DELETE (remove container) functions. Note that the real behavior and API functionality will vary by service provider. The importance of mentioning the above example is that when you look at some cloud storage service providers, you will see mention of PUT, POST, LIST, HEAD, GET, and DELETE operations as well as service attributes such as capacity and availability. Some services will include an unlimited number of operations, while others will have fees for doing updates, listing, or retrieving your data in addition to basic storage fees. By being aware of cloud primitive functions such as PUT or POST and GET or LIST, you can have a better idea of what they are used for as well as how they play into evaluating different services, pricing, and service plans.
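    To make those primitives a bit more concrete, here is a minimal sketch in Python of an in-memory stand-in for a storage container that counts each type of operation. The ToyContainer class, the request_fees helper and any prices passed to it are hypothetical and are not any provider's real API or price list; real services differ in naming, semantics and billing.

```python
from collections import Counter


class ToyContainer:
    """In-memory stand-in for a cloud storage container (not a real provider API)."""

    def __init__(self):
        self._objects = {}          # key -> (data bytes, metadata dict)
        self.op_counts = Counter()  # how many times each primitive was called

    def put(self, key, data, metadata=None):
        """PUT: create or overwrite an object."""
        self.op_counts["PUT"] += 1
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def post(self, key, metadata):
        """POST: update the metadata of an existing object."""
        self.op_counts["POST"] += 1
        _, meta = self._objects[key]
        meta.update(metadata)

    def list_keys(self, prefix=""):
        """LIST: return the keys that start with the given prefix."""
        self.op_counts["LIST"] += 1
        return sorted(k for k in self._objects if k.startswith(prefix))

    def head(self, key):
        """HEAD: return size and metadata without returning the data."""
        self.op_counts["HEAD"] += 1
        data, meta = self._objects[key]
        return {"size": len(data), **meta}

    def get(self, key):
        """GET: return the object data."""
        self.op_counts["GET"] += 1
        return self._objects[key][0]

    def delete(self, key):
        """DELETE: remove the object."""
        self.op_counts["DELETE"] += 1
        del self._objects[key]


def request_fees(op_counts, price_per_1000):
    """Estimate request fees from operation counts and hypothetical per-1,000 prices."""
    return sum(count * price_per_1000.get(op, 0.0) / 1000
               for op, count in op_counts.items())
```

    Counting operations this way is one simple approach for seeing how a backup or restore job might translate into per-request fees on services that charge for them, in addition to basic storage fees.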

    Depending on the type of cloud service, various protocols or interfaces may be used, including iSCSI, NAS (NFS), HTTP or HTTPS, FTP, REST, SOAP, and BitTorrent, along with APIs and PaaS mechanisms including .NET or SQL database commands, in addition to XML, JSON, or other formatted data. VMs can be moved to a cloud service using file transfer tools or upload capabilities of the provider. For example, a VM such as a VMDK or VHD is prepared locally in your environment and then uploaded to a cloud provider for execution. Cloud services may provide an access program or utility that allows you to configure when, where, and how data will be protected, similar to other backup or archive tools.

    Some traditional backup or archive tools have added support, either directly or via third parties, for accessing IaaS cloud storage services such as Amazon, Rackspace, and others. Third-party access appliances or gateways enable existing tools to read and write data to a cloud environment by presenting a standard interface such as NAS (NFS and/or CIFS) or iSCSI (block) that gets mapped to the back-end cloud service format. For example, if you subscribe to Amazon S3, storage is allocated as objects and various tools are used to access it. The cloud access software or appliance understands how to communicate with the IaaS storage APIs and abstracts those from how they are used. Access software tools or gateways, in addition to translating or mapping between cloud APIs and the formats your applications use, include functionality such as security with encryption, bandwidth optimization, and data footprint reduction such as compression and de-duplication. Other functionality includes reporting and management tools that support various interfaces, protocols and standards including SNMP, the SNIA Storage Management Initiative Specification (SMI-S), and the Cloud Data Management Interface (CDMI).
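    As a rough illustration of that mapping, the following Python sketch shows one way a gateway could translate block writes into objects while applying compression and de-duplication. It is a toy under stated assumptions: ToyBlockGateway is a hypothetical name, a dictionary stands in for the cloud object store, the 4KB block size is arbitrary, and encryption and bandwidth optimization are left out. It is not how the AWS storage gateway or any particular product is implemented.

```python
import hashlib
import zlib


class ToyBlockGateway:
    """Maps fixed-size block writes onto compressed, de-duplicated objects."""

    BLOCK_SIZE = 4096  # bytes per block (arbitrary choice for this sketch)

    def __init__(self):
        self.object_store = {}  # content hash -> compressed block (stand-in for cloud objects)
        self.block_map = {}     # block number -> content hash

    def write_block(self, block_number, data):
        """Store one block; identical content is uploaded only once."""
        if len(data) != self.BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.object_store:  # de-duplication by content hash
            self.object_store[digest] = zlib.compress(data)  # compression
        self.block_map[block_number] = digest

    def read_block(self, block_number):
        """Return the block contents; unwritten blocks read back as zeros."""
        digest = self.block_map.get(block_number)
        if digest is None:
            return bytes(self.BLOCK_SIZE)
        return zlib.decompress(self.object_store[digest])
```

    The point of the sketch is the separation of concerns: the application only sees block numbers, while the gateway decides how data is named, reduced and stored on the back end.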

    First impression: Interesting, good move Amazon, I was ready to install and start testing it today

    The good news here is that Amazon is taking steps to make it easier for your existing applications and IT environments to use and leverage clouds for private and hybrid adoption models with an Amazon-branded and managed service, technology and associated tools.

    This means leveraging your existing Amazon accounts to simplify procurement, management, and ongoing billing, as well as leveraging their infrastructure. As a standalone gateway appliance (e.g. it does not have to be bundled as part of a specific backup, archive, replication or other data management tool), the idea is that you can insert the technology into your existing data center between your servers and storage to begin sending a copy of data off to Amazon S3. In addition to sending data to S3, the integrated functionality with other AWS services should make it easier to integrate with Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) capabilities, including snapshots for data protection.

    Thus my first impression of the AWS storage gateway at a high level is that it is good and interesting, which led me to look a bit deeper, resulting in a second impression.

    Second impression: Hmm, what does it really do and require, time to slow down and do more homework

    Digging deeper and going through the various publicly available material (note that I can only comment on or discuss what is announced or publicly available) results in a second impression of wanting and needing to dig deeper based on some of the caveats. Now granted, and in fairness to Amazon, this is of course a beta release. On first impression it can be easy to miss the notice that it is in fact a beta, so keep in mind things can and hopefully will change.

    Beyond basic pricing (as with any cloud or managed storage service, you will want to do a cost analysis model just as you would for procuring physical storage), look into the cost of the monthly gateway fee along with the associated physical server running VMware ESXi that you will need to supply. Chances are that if you are an average-sized SMB, you have a physical machine (PM) lying around that you can throw a copy of ESXi onto if you don't already have room for some more VMs on an existing one.

    You will also need to assess the costs for using the S3 storage including space capacity charges, access and other fees as well as charges for doing snapshots or using other functionality. Again these are not unique to Amazon or their cloud gateway and should be best practices for any service or solution that you are considering. Amazon makes it easy by the way to see their base pricing for different tiers of availability, geographic locations and optional fees.
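    A cost analysis model does not have to be elaborate. The Python sketch below adds up the categories mentioned above for a single month. The function name, the rate keys and every number you feed it are placeholders for illustration, not Amazon's actual pricing; plug in the current published rates for whichever service and tier you are evaluating.

```python
def monthly_cost(stored_gb, requests, transfer_out_gb, snapshot_gb, rates, gateway_fee):
    """Return a per-category breakdown of one month's estimated charges.

    `rates` is a dict of hypothetical unit prices:
      storage_per_gb, per_1000_requests, transfer_out_per_gb, snapshot_per_gb
    """
    breakdown = {
        "gateway_fee": gateway_fee,
        "storage": stored_gb * rates["storage_per_gb"],
        "requests": requests / 1000 * rates["per_1000_requests"],
        "transfer_out": transfer_out_gb * rates["transfer_out_per_gb"],
        "snapshots": snapshot_gb * rates["snapshot_per_gb"],
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown
```

    Keeping the breakdown by category, rather than a single total, makes it easier to see whether capacity, access or snapshot charges dominate, and to compare against the cost of local storage.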

    Speaking of accessing the cloud, and cloud conversations, you will also want to keep in mind what your networking bandwidth service requirements will be for moving data to Amazon, particularly if you are not already doing so.

    Another thing to consider with the AWS storage gateway is that it does not replace your local storage (that is, unless you move your applications to Amazon EC2 and EBS); rather, it makes a copy of whatever you save locally to a remote Amazon S3 storage pool. This can be good for high availability (HA), business continuance (BC), disaster recovery (DR) and compliance among other data management needs. However, in your cost model you also need to keep in mind that you are not replacing your local storage; you are adding to it via the cloud, which should be seen as complementing and enhancing your private, now to be hybrid, environment.

     

    Walking the cloud data protection talk

    FWIW, I leverage a similar model where I use a service (Jungle Disk) to which critical copies of my data get sent, and which in turn places copies at Rackspace (Jungle Disk's parent) and Amazon S3. What data goes where depends on different policies that I have established. I also have local backup copies as well as a master gold disaster copy stored in a secure offsite location. The idea is that when needed, I can get a good copy restored from my cloud providers quickly, regardless of where I am, if the local copy is not good. On the other hand, experience has already demonstrated that without sufficient network bandwidth services, if I need to bring back 100s of GBytes or TBytes of data quickly, I'm going to be better off bringing my master gold copy back onsite, then applying fewer, smaller updates from the cloud service. In other words, the technologies complement each other.

    By the way, a lesson learned here is that once my first copy is made with data footprint reduction (DFR) techniques applied (e.g. compression, de-dupe, optimization, etc.), later copies occur very fast. However, subsequent restores of those large files or volumes take longer to retrieve from the cloud vs. sending up changed versions. Thus be aware of backup vs. restore times, something that will apply to any cloud provider and can be mitigated by appliances that do local caching. However, also keep in mind that if a disaster occurs, your local appliance may be affected and its cache rendered useless.
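    To put rough numbers on the bandwidth point, here is a small Python sketch that estimates how long a transfer takes over a given link. The function name and the 80% link efficiency default are assumptions for illustration; real throughput depends on latency, protocol overhead, provider throttling and what else is sharing the connection.

```python
def transfer_hours(size_gb, link_mbps, efficiency=0.8):
    """Estimate hours needed to move `size_gb` (decimal GB) over a `link_mbps` link.

    `efficiency` is the assumed fraction of the nominal link speed actually achieved.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    size_megabits = size_gb * 1000 * 8            # GB -> megabytes -> megabits
    seconds = size_megabits / (link_mbps * efficiency)
    return seconds / 3600
```

    For example, under these assumptions restoring 500 GB over a 10 Mbps link works out to roughly 139 hours, or close to six days, which is why shipping or retrieving a master gold copy and then applying smaller updates from the cloud can be the faster path.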

    Getting back to the AWS storage gateway, my second impression is that at first it sounded great.

    However, I then realized it only supports iSCSI. FWIW, there is nothing wrong with iSCSI; I like it and recommend using it where applicable, even though I'm not using it. I would like to have seen NAS (NFS and/or CIFS) support for the gateway, making it easier in my scenario for different applications, servers and systems to use and leverage the AWS services, something that I can do with my other gateways provided via different software tools. Granted, for environments whose servers already use iSCSI and will be using the AWS storage gateway, this is a non-issue, while for others it is a consideration, including the cost (time) to factor in to prepare your environment for using the capability.

    Depending on the amount of storage you have in your environment, the next item that caught my eye may or may not be an issue: the iSCSI gateway supports volumes of up to 1TB and up to 12 of them, hence a maximum capacity of 12TB under management. This can be worked around by using multiple gateways, however the increased complexity balanced against the benefit of the functionality is something to consider.
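    The arithmetic is simple, but a short Python sketch makes the scaling obvious. The limits below are the beta figures quoted above (1TB per volume, 12 volumes per gateway); the constant and function names are made up for this example, and the limits may well change after the beta.

```python
import math

MAX_VOLUME_TB = 1              # beta limit quoted above: up to 1TB per volume
MAX_VOLUMES_PER_GATEWAY = 12   # beta limit quoted above: up to 12 volumes


def gateways_needed(total_tb):
    """Return how many gateways are needed to manage `total_tb` of capacity."""
    if total_tb < 0:
        raise ValueError("total_tb must not be negative")
    per_gateway_tb = MAX_VOLUME_TB * MAX_VOLUMES_PER_GATEWAY
    return math.ceil(total_tb / per_gateway_tb)
```

    An environment with 30TB to protect would need three gateways under these limits, each with its own VM, monthly fee and volumes to manage.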

    Third impression: Dig deeper, learn more, address various questions

    This leads up to my third impression: the need to dig deeper into what the AWS storage gateway can and cannot do for various environments. I can see where it can be a fit for some environments, while for others, at least in its beta version, it will be a non-starter. In the meantime, do your homework and look around at other options. Ironically, Amazon launching a gateway service may reinvigorate the marketplace for some of the standalone or embedded cloud gateway solution providers.

    What is needed for using AWS storage gateway

    In addition to having an S3 account, you will need to acquire, for a monthly fee, the storage gateway appliance, which is software installed into a VMware ESXi hypervisor virtual machine (VM). The requirements are VMware ESXi hypervisor (v4.1) on a physical machine (PM) with at least 7.5GB of RAM and four (4) virtual processors assigned to the appliance VM, along with 75GB of disk space for the Open Virtual Appliance (OVA) image installation and data. You will also need a properly sized network connection to Amazon, as well as iSCSI initiators on either Windows Server 2008, Windows 7 or Red Hat Enterprise Linux.
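    If you want to sanity check a candidate host against those numbers before downloading anything, a few lines of Python will do. This is only a sketch: the dictionary keys and function name are hypothetical, the values are the beta requirements listed above, and it does not query ESXi or any real inventory system.

```python
# Minimums from the beta requirements listed above.
REQUIREMENTS = {"ram_gb": 7.5, "vcpus": 4, "disk_gb": 75}


def unmet_requirements(host):
    """Return a list describing each requirement the host does not meet."""
    return [
        f"{name}: need {minimum}, have {host.get(name, 0)}"
        for name, minimum in REQUIREMENTS.items()
        if host.get(name, 0) < minimum
    ]
```

    An empty list means the host clears the stated minimums; anything else tells you what to add before assigning resources to the appliance VM.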

    Note that the AWS storage gateway beta is optimized for block write sizes greater than 4Kbytes and warns that smaller IO sizes can cause overhead resulting in lost storage space. This is a consideration for environments that have not yet changed their file systems and volumes to use larger allocation sizes.

    Some closing thoughts, tips and comments:

    • Congratulations to Amazon for introducing and launching an AWS branded storage gateway.
    • Amazon brings the value of trust to a cloud relationship.
    • Initially I was excited about the idea of a gateway that would let any of my systems use my S3 storage pools, vs. using gateway access functions that are part of different tools such as my backup software or Amazon web tools. Likewise I was excited by the idea of having an easy to install and use gateway that would allow me to grow in a cost-effective way.
    • Keep in mind that this solution, at least in its beta version, DOES NOT replace your existing iSCSI based storage needs; instead it complements what you already have.
    • I hope Amazon listens carefully to what their customers and prospects want vs. need in order to evolve the functionality.
    • This announcement should reinvigorate some of the cloud appliance vendors as well as those who have embedded functionality to Amazon and other providers.
    • Keep bandwidth services and optimization in mind both for sending data as well as for when retrieving during a disaster or small file restore.
    • In concept, the AWS storage gateway is not all that different than appliances that do snapshots and other local and remote data protection, such as those from Actifio, EMC (RecoverPoint), FalconStor or dedicated gateways such as those from Nasuni among others.
    • Here is a link to additional AWS storage gateway frequently asked questions (FAQs).
    • If the AWS storage gateway were available with a NAS interface, I would probably be activating it this afternoon, even with some of its other requirements, and cost aside.
    • I'm still formulating my fourth impression, which is going to take some time. Perhaps if I can get Amazon to help sell more of my books so that I can get some money to afford to test the entire solution leveraging my existing S3, EC2 and EBS accounts, I might do so in the future; otherwise, for now, I will continue to research.
    • Learn more about the AWS storage gateway beta, check out this free Amazon web cast on February 23, 2012.

    To learn more about cloud based data protection, data footprint reduction, cloud gateways, access and management, check out my book Cloud and Virtual Data Storage Networking (CRC Press), which is of course available on Amazon Kindle as well as in hardcover print, also available at Amazon.com.

    Ok, nuff said for now, I need to get back to some other things while thinking about this all some more.

    Cheers gs
