AWS S3 Storage Gateway Revisited (Part I)


This Amazon Web Services (AWS) Storage Gateway Revisited post is a follow-up to the AWS Storage Gateway test drive and review I did a few years ago (hence "revisited"). It is the first of a two-part series and looks at what AWS Storage Gateway is, how it has improved since my last review, and its deployment options. The second post looks at a sample test drive deployment and use.

If you need an AWS primer and overview of various services such as Elastic Compute Cloud (EC2), Elastic Block Store (EBS), Elastic File System (EFS), Simple Storage Service (S3), Availability Zones (AZ), Regions and other items, check this multi-part series (Cloud conversations: AWS EBS, Glacier and S3 overview (Part I)).


As a quick refresher, S3 is the AWS bulk, high-capacity unstructured and object storage service, along with its companion deep cold (e.g. inactive) storage service, Glacier. There are various S3 storage service classes including standard, reduced redundancy storage (RRS) and infrequent access (IA), which have different availability, durability, performance, service level and cost attributes.

Note that S3 IA is not Glacier: with IA your data always remains online and accessible, while Glacier data can be offline. AWS S3 can be accessed via its API, HTTP REST calls and AWS tools, along with tools from third parties. Third-party tools include NAS file access such as S3FS for Linux, which I use on my Ubuntu systems to mount S3 buckets and use them like other mount points. Other tools include Cloudberry, S3 Motion and S3 Browser, as well as plug-ins available in most data protection (backup, snapshot, archive) software tools and storage systems today.

AWS S3 Storage Gateway and What’s New

The Storage Gateway is the AWS tool for accessing S3 buckets and objects from your block volume, NAS file or tape based applications. It is intended to give S3 bucket and object access to on-premises applications and data infrastructure functions including data protection (backup/restore, business continuance (BC), business resiliency (BR), disaster recovery (DR) and archiving), along with storage tiering to cloud.

Some of the things that have evolved with the S3 Storage Gateway include:

  • Easier, streamlined download, installation, deployment
  • Enhanced Virtual Tape Library (VTL) and Virtual Tape support
  • File serving and sharing (not to be confused with Elastic File System (EFS))
  • Ability to define your own bucket and associated parameters
  • Bucket options including Infrequent Access (IA) or standard
  • Options for AWS EC2 hosted, or on-premises VMware as well as Hyper-V gateways (File Gateway only supports VMware and EC2)

AWS Storage Gateway Three Functions

AWS Storage Gateway can be deployed for three basic functions:

    AWS Storage Gateway File Architecture via AWS.com

  • File Gateway (NFS NAS) – Files, folders, objects and other items are stored in AWS S3 with a local cache for low latency access to most recently used data. With this option, you can create folders and subdirectories similar to a regular file system or NAS device, as well as configure various security, permissions and access control policies. Data is stored in S3 buckets for which you specify the storage class, such as standard or Infrequent Access (IA), among other options. The file gateway is available AWS hosted via EC2, as well as a VMware Virtual Machine (VM) for on-premises deployment.

    Also, note that AWS cautions on multiple concurrent writers to S3 buckets with Storage Gateway so check the AWS FAQs which may have changed by the time you read this. Current file share limits (subject to change) include 1 file gateway share per S3 bucket (e.g. a one to one mapping between file share and a bucket). There can be 10 file shares per gateway (e.g. multiple shares each with its own bucket per gateway) and a maximum file size of 5TB (same as maximum S3 object size). Note that you might hear about object storage systems supporting unlimited size objects which some may do, however generally there are some constraints either on their API front-end, or what is currently tested. View current AWS Storage Gateway resource and specification limits here.

  • AWS Storage Gateway Non-Cached Volume Architecture via AWS.com

    AWS Storage Gateway Cached Volume Architecture via AWS.com

  • Volume Gateway (Block iSCSI) – Leverages S3 with a point in time backup as an AWS EBS snapshot. Two options exist including Cached volumes with low-latency access to most recently used data (e.g. data is stored in AWS, with a local cache copy on disk or SSD). The other option is Stored Volumes (e.g. non-cached) where primary copy is local and periodic snapshot backups are sent to AWS. AWS provides EC2 hosted, as well as VMs for VMware and various Hyper-V Windows Server based VMs.

    Current Storage Gateway volume limits (subject to change) include a maximum cached volume size of 32TB and a maximum stored volume size of 16TB. Note that snapshots of cached volumes larger than 16TB can only be restored to a storage gateway volume; they cannot be restored as an EBS volume (via EC2). There is a maximum of 32 volumes per gateway, with a total size of all volumes for a cached gateway of 1,024TB (e.g. 1PB). The total size of all volumes for a stored volume gateway is 512TB. View current AWS Storage Gateway resource and specification limits here.

  • AWS Storage Gateway VTL Architecture via AWS.com

  • Virtual Tape Library Gateway (VTL) – Supports saving your data for backup/BC/DR/archiving into S3 and Glacier storage tiers. Being a Virtual Tape Library (e.g. VTL) you can specify emulation of tapes for compatibility with your existing backup, archiving and data protection software, management tools and processes.

    Storage Gateway limits for tape include minimum size of a virtual tape 100GB, maximum size of a virtual tape 2.5TB, maximum number of virtual tapes for a VTL is 1,500 and total size of all tapes in a VTL is 1PB. Note that the maximum number of virtual tapes in an archive is unlimited and total size of all tapes in an archive is also unlimited. View current AWS Storage Gateway resource and specification limits here.
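The file, volume and tape limits quoted above lend themselves to a small pre-flight check before you lay out a gateway. The following is a minimal Python sketch, not an AWS API: every constant and function name is hypothetical, TB and GB are assumed to be binary units, and the numbers are the ones quoted in this post, which AWS may have changed since.

```python
# Limits as quoted in this post (subject to change by AWS).
# All names are illustrative and not part of any AWS SDK.
GB = 1024 ** 3  # assumption: binary units
TB = 1024 ** 4

FILE_MAX_SHARES_PER_GATEWAY = 10
FILE_MAX_FILE_SIZE = 5 * TB
CACHED_MAX_VOLUME = 32 * TB
STORED_MAX_VOLUME = 16 * TB
MAX_VOLUMES_PER_GATEWAY = 32
CACHED_MAX_TOTAL = 1024 * TB
STORED_MAX_TOTAL = 512 * TB
TAPE_MIN = 100 * GB
TAPE_MAX = 5 * TB // 2  # 2.5TB
VTL_MAX_TAPES = 1500
VTL_MAX_TOTAL = 1024 * TB  # 1PB


def check_file_shares(share_to_bucket):
    """share_to_bucket maps file share name -> S3 bucket name."""
    problems = []
    if len(share_to_bucket) > FILE_MAX_SHARES_PER_GATEWAY:
        problems.append("too many file shares for one gateway")
    buckets = list(share_to_bucket.values())
    if len(set(buckets)) != len(buckets):
        problems.append("a bucket is mapped to more than one file share")
    return problems


def check_volumes(sizes, cached=True):
    """sizes is a list of proposed volume sizes in bytes."""
    per_volume = CACHED_MAX_VOLUME if cached else STORED_MAX_VOLUME
    total_max = CACHED_MAX_TOTAL if cached else STORED_MAX_TOTAL
    problems = []
    if len(sizes) > MAX_VOLUMES_PER_GATEWAY:
        problems.append("too many volumes for one gateway")
    if any(size > per_volume for size in sizes):
        problems.append("a volume exceeds the per-volume maximum")
    if sum(sizes) > total_max:
        problems.append("total volume capacity exceeds the gateway maximum")
    return problems


def check_tapes(sizes):
    """sizes is a list of proposed virtual tape sizes in bytes."""
    problems = []
    if len(sizes) > VTL_MAX_TAPES:
        problems.append("too many virtual tapes for one VTL")
    if any(not TAPE_MIN <= size <= TAPE_MAX for size in sizes):
        problems.append("a tape is outside the 100GB to 2.5TB range")
    if sum(sizes) > VTL_MAX_TOTAL:
        problems.append("total tape capacity exceeds the VTL maximum")
    return problems
```

An empty list means the proposed layout fits within the quoted limits. Note that the per-gateway totals are consistent with the per-volume limits (32 volumes of 32TB is 1,024TB cached, 32 volumes of 16TB is 512TB stored).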

What This All Means

Which gateway function and mode (cached or non-cached for Volumes) to use depends on what you are trying to do. Likewise, choosing between EC2 (cloud hosted) or on-premises Hyper-V and VMware VMs depends on your data infrastructure support requirements. Overall I like the progress that AWS has made in evolving the Storage Gateway, granted it might not be applicable for all use cases. Continue reading and view images from the AWS Storage Gateway Revisited test drive in part two located here.

Ok, nuff said (for now…).

Cheers
Gs

Greg Schulz – Multi-year Microsoft MVP Cloud and Data Center Management, VMware vExpert (and vSAN). Author of Software Defined Data Infrastructure Essentials (CRC Press), as well as Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio.

Courteous comments are welcome for consideration. First published on https://storageioblog.com any reproduction in whole, in part, with changes to content, without source attribution under title or without permission is forbidden.

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2023 Server StorageIO(R) and UnlimitedIO. All Rights Reserved.

Part II Revisiting AWS S3 Storage Gateway (Test Drive Deployment)


This Amazon Web Services (AWS) Storage Gateway Revisited post is a follow-up to the AWS Storage Gateway test drive and review I did a few years ago (hence "revisited"). In this two-part series, the first post looks at what AWS Storage Gateway is, how it has improved since my last review, and its deployment options. This second post looks at a sample test drive deployment and use.

What About Storage Gateway Costs?

Costs vary by region, type of storage being used (files stored in S3, Volume Storage, EBS Snapshots, Virtual Tape storage, Virtual Tape storage archive), type of gateway host, and how the gateway is accessed and used. Request pricing varies and includes data written to AWS storage by the gateway (up to a maximum of $125.00 per month), snapshot/volume delete, virtual tape delete (prorated fee for deletes within 90 days of being archived), virtual tape archival and virtual tape retrieval. Note that there are also various data transfer fees that vary by region and gateway host. Learn more about pricing here.
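To see how the monthly cap on the data-written charge behaves, here is a minimal Python sketch. Only the $125.00 cap comes from the pricing described above; the per-GB rate is a placeholder assumption (check the current AWS pricing page for your region), and the function name is hypothetical.

```python
from decimal import Decimal

MONTHLY_CAP = Decimal("125.00")        # cap quoted in this post
ASSUMED_RATE_PER_GB = Decimal("0.01")  # hypothetical rate, not an AWS quote


def data_written_charge(gb_written, rate_per_gb=ASSUMED_RATE_PER_GB, cap=MONTHLY_CAP):
    """Charge for data written by the gateway in one month, never above the cap."""
    charge = Decimal(str(gb_written)) * rate_per_gb
    return min(charge, cap)
```

With the assumed rate, the cap is reached at 12,500 GB written in a month; anything beyond that adds nothing to this particular line item, although storage and data transfer fees still apply.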

What Are Some Storage Gateway Alternatives

AWS and S3 storage gateway access alternatives include those from various third parties (including those in the AWS marketplace), as well as data protection tools (e.g. backup/restore, archive, snapshot, replication) and, more commonly, storage systems. Some tools include Cloudberry, S3FS, S3 Motion and S3 Browser among many others.

Tip: when a vendor says they support S3, ask whether that is for their back-end (e.g. they can access and store data in S3) or front-end (e.g. they can be accessed by applications that speak the S3 API). Also explore what format the application, tool or storage system uses to store data in AWS storage. For example, are files mapped one to one to S3 objects along with the corresponding directory hierarchy, or are they stored in a save set or other entity?

AWS Storage Gateway Deployment and Management Tips

Once you have created your AWS account (if you did not already have one) and logged into the AWS console (note the link defaults to US East 1 Region), go to the AWS Services Dashboard and select Storage Gateway (or click here which goes to US East 1). You will be presented with three gateway modes (File, Volume or VTL).

What Does Storage Gateway and Install Look Like

The following is what installing an AWS Storage Gateway for file and then volume looks like. First, access the AWS Storage Gateway main landing page (it might change by the time you read this) to get started. Scroll down and click on the Get Started with AWS Storage Gateway button or click here.

AWS Storage Gateway Landing Page

Select type of gateway to create, in the following example File is chosen.

Select type of AWS storage gateway

Next select the type of file gateway host (EC2 cloud hosted, or on-premises VMware). If you choose VMware, an OVA will be downloaded (follow the onscreen instructions) that you deploy on your ESXi system or with vCenter. Note that there is one VMware VM gateway OVA for File Gateway and a different one for Volume Gateway. In the following example VMware ESXi OVA is selected and downloaded, then accessed via VMware tools such as vSphere Web Client for deployment.

AWS Storage Gateway select download

Once your VMware OVA file is downloaded from AWS, install using your preferred VMware tool, in this case I used the vSphere Web Client.

AWS Storage Gateway VM deploy

Once you have deployed the VMware VM for File Storage Gateway, it is time to connect to the gateway using the IP address assigned (static or DHCP) for the VM. Note that you may need to allocate some extra VMware storage to the VM if prompted (this mainly applies to Volume Gateway). Also follow directions about setting NTP time, using paravirtual adapters, thick vs. thin provisioning along with IP settings. Also double-check to make sure your VM and host are set for high-performance power setting. Note that the default username is sguser and password is sgpassword for the gateway.

AWS Storage Gateway Connect

Once you successfully connect to the gateway, next step will be to configure file share settings.

AWS Storage Gateway Configure File Share

Configure the file share by selecting which gateway to use (in case you have more than one), the name of the S3 bucket to create, the type of storage (S3 Standard or IA), along with Access Management security controls.

AWS Storage Gateway Create Share

Next step is to complete file share creation. Note the commands provided for Linux and Windows for accessing the file share.
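The console shows the exact commands for your share, so use those. As a rough idea of their general shape, the Python helper below builds mount commands using standard NFS client syntax. This is an assumption about the form, not a copy of the console output; the IP address, mount point and drive letter are hypothetical, and the bucket name is the demo bucket used in this post.

```python
def linux_mount_command(gateway_ip, bucket, mount_point):
    """Standard Linux NFS mount syntax; verify options against the console."""
    return f"sudo mount -t nfs -o nolock {gateway_ip}:/{bucket} {mount_point}"


def windows_mount_command(gateway_ip, bucket, drive_letter):
    """Windows NFS client syntax; requires the Client for NFS feature."""
    return f"mount -o nolock {gateway_ip}:/{bucket} {drive_letter}:"


# Hypothetical gateway address (documentation range) and mount targets.
print(linux_mount_command("192.0.2.10", "awsgwydemo", "/mnt/awsgwydemo"))
print(windows_mount_command("192.0.2.10", "awsgwydemo", "Z"))
```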

AWS Storage Gateway Review Share Settings

Review file share settings

AWS Storage Gateway access from Windows

Now let's use the file share by mounting it on a Windows system, then copying some files to the file share.

AWS Storage Gateway verify Bucket Items

Now let’s go to the AWS console (or in our example use S3 Browser or your favorite tool) and look at the S3 bucket for the file share and see what is there. Note that each file is an object, and the objects simply appear as files. If there were subdirectories, those would also exist. Note that there are other buckets that I have masked out, as we are only interested in the one named awsgwydemo that is configured using S3 Standard storage.
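The one-to-one relationship between files on the share and objects in the bucket can be sketched in a few lines of Python. This assumes the object key is simply the file's path relative to the share mount point, with forward slashes, which matches what the bucket listing shows here; it is an illustration, not the gateway's actual implementation, and the function name and paths are hypothetical.

```python
from pathlib import PurePosixPath


def object_key(share_root, file_path):
    """Map a file under the share mount point to its assumed S3 object key."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(share_root))
    return relative.as_posix()
```

For example, a file at /mnt/awsgwydemo/docs/report.txt on a share mounted at /mnt/awsgwydemo would map to the key docs/report.txt. A path outside the share raises ValueError.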

AWS Storage Gateway Volume

Now let's look at using the S3 Storage Gateway for Volumes. Similar to deploying for File Gateway, start out at the AWS Storage Gateway page and select Volume Gateway, then select the type of host (EC2 cloud, or VMware or Hyper-V (2008 R2 or 2012) for on-premises deployment). Let's use the VMware Gateway; however, as mentioned above, this is a different OVA/OVF than the File Gateway.

AWS Storage Gateway Configure Volume

Download the VMware OVA/OVF from AWS, and then install using your preferred VMware tools, making sure to configure the gateway per instructions. Note that the Volume Gateway needs a couple of storage devices allocated to it. This means you will need to make sure that a SCSI adapter exists on the VM (or add one), along with the disks (HDD or SSD) for local storage. Refer to AWS documentation about how to size them. For my deployment I added a couple of small 80GB drives (you can choose to put them on HDD or SSD including NVMe). Note that when connecting to the gateway, if you get an error similar to below, make sure that you are in fact using the Volume Gateway and not mistakenly using the File Gateway OVA (VM). Note that the default username is sguser and password is sgpassword for the gateway.

AWS Storage Gateway Connect To Volume

Now connect to the local Volume Storage Gateway and notice the two local disks allocated to it.

AWS Storage Gateway Cached Volume Deploy

Next it's time to create the gateway, which in this example is deployed as a cached Volume Gateway.

AWS Storage Gateway Volume Create

Next up is creating a volume, along with its security and access information.

AWS Storage Gateway Volume Settings

Volume configuration continued.

AWS Storage Gateway Volume CHAP

And now some additional configuration of the volume including iSCSI CHAP security.
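CHAP needs a shared secret for the initiator (and optionally the target). A minimal Python sketch for generating one is shown here, assuming the gateway expects a secret of 12 to 16 characters; that range is an assumption to verify against the current console prompts, and the function name is hypothetical.

```python
import secrets
import string


def make_chap_secret(length=16):
    """Generate a random alphanumeric CHAP secret of the requested length."""
    if not 12 <= length <= 16:
        raise ValueError("length must be between 12 and 16 (assumed range)")
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

The secrets module is used rather than random because the value is a credential. Enter the same secret on the gateway volume and in the iSCSI initiator configuration.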

AWS Storage Gateway Windows Access

Which leads us up to some Windows related volume access and configuration.

AWS Storage Gateway Using iSCSI Volume

Now let's use the new iSCSI based AWS Storage Gateway Volume. On the left you can see various Windows command line activity, along with corresponding configuration information on the right.

AWS Storage Gateway Being Used by Windows

And there you have it, a quick tour of AWS Storage Gateway, granted there are more options that you can try yourself.

What This All Means

Overall I like the improvements that AWS has made to the Storage Gateway along with the different options it provides. Something to keep in mind if you are planning to use the AWS Storage Gateway file serving and sharing mode is that there are caveats to multiple concurrent writers to the same bucket. I would not be surprised if some other gateway or software based tool vendors tried to throw some FUD towards the Storage Gateway; if so, ask them how they coordinate multiple concurrent updates to a bucket while preserving data integrity.

Which Storage Gateway variant from AWS to use (e.g. File, Volume, VTL) depends on what your needs are, same with where the gateway is placed (Cloud hosted or on-premises with VMware or Hyper-V). Keep an eye on your costs, and more than just the storage space capacity. This means pay attention to your access and requests fees, as well as different service levels, along with data transfer fees.

You might wonder about EFS and why you would want to use AWS Storage Gateway instead. Good question. At the time of this post, EFS has evolved from being internal (e.g. within AWS and across regions) to having an external facing end-point; however, there is a catch. That catch, which might have changed by the time you read this, is that the end-point can only be accessed from AWS Direct Connect locations.

This means that if your servers are not in an AWS Direct Connect location, without some creative configuration, EFS is not an option. Thus Storage Gateway File mode might be an option in place of EFS, as might AWS storage access tools from others. For example, I have some of my S3 buckets mounted on Linux systems using S3FS for doing rsync or other operations from local to cloud. In addition to S3FS, I also have various backup tools that place data into S3 buckets for backup, BC and DR as well as archiving.

Check out AWS Storage Gateway yourself and see what it can do or if it is a fit for your environment.

Ok, nuff said (for now…).

Cheers
Gs

Greg Schulz – Multi-year Microsoft MVP Cloud and Data Center Management, VMware vExpert (and vSAN). Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Watch for the spring 2017 release of his new book "Software-Defined Data Infrastructure Essentials" (CRC Press).


Cloud Conversations: AWS EFS Elastic File System (Cloud NAS) First Preview Look


Amazon Web Services (AWS) recently announced (in preview) the new Elastic File System (EFS), providing Network File System (NFS) NAS (Network Attached Storage) capabilities for AWS Elastic Compute Cloud (EC2) instances. AWS EFS complements other AWS storage offerings including Simple Storage Service (S3) along with Elastic Block Store (EBS), Glacier and Relational Database Service (RDS) among others.

Ok, that’s a lot of buzzwords and acronyms so let’s break this down a bit.


AWS EFS and Cloud Storage, Beyond Buzzword Bingo

  • EC2 – Compute instances that exist in various Availability Zones (AZ’s) in different AWS Regions. Instances run various operating systems including Windows and Ubuntu among others, and can also be pre-configured with applications such as SQL Server or web services. EC2 instances vary from low-cost to high-performance and can be compute, memory, GPU, storage or general purpose optimized. For example, some EC2 instances rely solely on EBS, S3, RDS or other AWS storage offerings, while others include on-board Solid State Disk (SSD) like the DAS SSD found on traditional servers. EC2 instances on EBS volumes can be snapshot to S3 storage, which in turn can be replicated to another region.
  • EBS – Scalable block accessible storage for EC2 instances that can be configured for performance or bulk storage, as well as for persistent images for EC2 instances (if you choose to configure your instance to be persistent)
  • EFS – New file (aka NAS) accessible storage service accessible from EC2 instances in various AZ’s in a given AWS region
  • Glacier – Cloud based near-line (or by some comparisons off-line) cold-storage archives.
  • RDS – Relational Database Service for SQL and other data repositories
  • S3 – Provides durable, scalable low-cost bulk (aka object) storage accessible from inside AWS as well as externally. S3 can be used by EC2 instances for bulk durable storage as well as being used as a target for EBS snapshots.
  • Learn more about EC2, EBS, S3, Glacier, Regions, AZ’s and other AWS topics in this primer here

aws regions architecture

What is EFS

EFS implements NFS V4 (SNIA NFS V4 primer), providing network attached storage (NAS) for data sharing. AWS is indicating initial pricing for EFS of $0.30 per GByte per month. EFS is designed for storage and data sharing from multiple EC2 instances in different AZ’s in the same AWS region, with scalability into the PBs.
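To put the indicated preview price in perspective, here is a minimal Python sketch of the monthly storage cost. It uses only the $0.30 per GByte per month figure above and ignores any other charges; the function name is hypothetical, and actual pricing may differ once EFS is released.

```python
from decimal import Decimal

EFS_PRICE_PER_GB_MONTH = Decimal("0.30")  # preview price quoted in this post


def efs_monthly_cost(gb_stored):
    """Monthly storage cost in dollars for the given number of GBytes."""
    return Decimal(str(gb_stored)) * EFS_PRICE_PER_GB_MONTH
```

For example, 1,024 GBytes works out to $307.20 per month at that rate.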

What EFS is not

Currently it seems that EFS has an end-point inside AWS accessible via an EC2 instance. This appears to be like EBS, where the storage service is accessible only to AWS EC2 instances, unlike S3 which can be accessed from the outside world as well as via EC2 instances.

Note however, that depending on how you configure your EC2 instance with different software, as well as configure a Virtual Private Cloud (VPC) and other settings, it is possible to have an application, software tool or operating system running on EC2 accessible from the outside world. For example, NAS software such as those from SoftNAS and NetApp among many others can be installed on an EC2 instance and with proper configuration, as well as being accessible to other EC2 instances, they can also be accessed from outside of AWS (with proper settings and security).

AWS EFS at this time is NFS version 4 based; however, it does not support Windows SMB/CIFS, HDFS or other NAS access protocols. In addition, AWS EFS is accessible from multiple AZ’s within a region. To share NAS data across regions, some other software would be required.

As of this writing EFS is not yet released, and AWS is currently accepting requests to join the EFS preview here.

What this all means and wrap-up

AWS continues to extend its cloud platform to include both compute and storage offerings. EFS complements EBS along with S3, Glacier and RDS. For many environments NFS support will be welcome, while for others CIFS/SMB would be appreciated, and others are starting to find value in HDFS accessible NAS.

Overall I like this announcement and look forward to moving beyond the preview.

Ok, nuff said, for now..

Cheers gs

Greg Schulz – Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier)
twitter @storageio

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2024 Server StorageIO and UnlimitedIO LLC All Rights Reserved