Aurora Serverless MySQL Generally Available

You may have heard of Amazon Aurora, a custom-built, MySQL- and PostgreSQL-compatible database born and built in the cloud. You may have also heard of serverless, which allows you to build and run applications and services without thinking about instances. These are two pieces of the growing AWS technology story that we’re really excited to be working on. Last year at AWS re:Invent, we announced a preview of a new capability for Aurora called Aurora Serverless. Today, I’m pleased to announce that Aurora Serverless for Aurora MySQL is generally available. Aurora Serverless is on-demand, auto-scaling, serverless Aurora. You don’t have to think about instances or scaling, and you pay only for what you use.

This paradigm is great for applications with unpredictable load or infrequent demand. I’m excited to show you how this all works, so let’s start by launching a serverless cluster.

Creating an Aurora Serverless Cluster

First, I’ll navigate to the Amazon Relational Database Service (RDS) console and select the Clusters sub-console. From there, I’ll click the Create database button in the top right corner to get to this screen.

From the screen above, I select my engine type and click Next. For now, only Aurora MySQL 5.6 is supported.

Now comes the fun part. I specify my capacity type as Serverless, and all of the instance selection and configuration options go away. I only have to give my cluster a name and a master username/password combo, then click Next.

From here I can select a number of options. I can specify the minimum and maximum number of Aurora Capacity Units (ACUs) to be consumed. These are billed per second, with a 5-minute minimum, and my cluster will autoscale between the specified minimum and maximum number of ACUs. The rules and metrics for autoscaling are created automatically by Aurora Serverless and include CPU utilization and number of connections. When Aurora Serverless detects that my cluster needs additional capacity, it grabs capacity from a warm pool of resources to meet the need. This new capacity can start serving traffic in seconds because of the separation of the compute layer and storage layer intrinsic to the design of Aurora.

The cluster can even automatically scale down to zero if it isn’t seeing any activity. This is perfect for development databases that might go long periods of time with little or no use. When the cluster is paused, I’m only charged for the underlying storage. If I want to manually scale up or down, preempting a large spike in traffic, I can easily do that with a single API call, as shown below.
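For example, here’s a minimal boto3 sketch of that single API call (ModifyCurrentDBClusterCapacity); the cluster identifier and capacity values are placeholders:

import boto3

rds = boto3.client("rds")

# Manually set the current capacity of an Aurora Serverless cluster ahead of
# an expected traffic spike. "my-serverless-cluster" is a placeholder name and
# 16 ACUs is just an example target.
rds.modify_current_db_cluster_capacity(
    DBClusterIdentifier="my-serverless-cluster",
    Capacity=16,                        # target ACUs
    SecondsBeforeTimeout=300,           # how long to wait for a safe scaling point
    TimeoutAction="ForceApplyCapacityChange",
)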

Finally, I click Create database in the bottom right and wait for my cluster to become available – which happens quite quickly. For now we only support a limited number of cluster parameters with plans to enable more customized options as we iterate on customer feedback.

Now, the console provides a wealth of data, similar to any other RDS database.

From here, I can connect to my cluster like any other MySQL database. I could run a tool like sysbench or mysqlslap to generate some load and trigger a scaling event or I could just wait for the service to scale down and pause.
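A sysbench or mysqlslap run is the usual way to do that, but if you’d rather stay in Python, a quick-and-dirty load generator along these lines works too (the endpoint and credentials are placeholders):

import pymysql

# Connect to the cluster endpoint just like any other MySQL database and run a
# crude CPU-bound loop to nudge the autoscaler.
conn = pymysql.connect(
    host="my-serverless-cluster.cluster-xxxxxxxxxxxx.us-east-1.rds.amazonaws.com",
    user="admin",
    password="my-password",
    db="mysql",
)

with conn.cursor() as cur:
    for _ in range(10000):
        cur.execute("SELECT SQRT(RAND() * 1000000)")
        cur.fetchall()

conn.close()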

If I scroll down or select the events sub-console, I can see a few different autoscaling events happening, including pausing the instance at one point.

The best part about this? When I’m done writing the blog post I don’t need to remember to shut this server down! When I’m ready to use it again I just make a connection request and my cluster starts responding in seconds.

How Aurora Serverless Works

I want to dive a bit deeper into what exactly is happening behind the scenes to enable this functionality. When you provision an Aurora Serverless database the service does a few things:

  • It creates an Aurora storage volume replicated across multiple AZs.
  • It creates an endpoint in your VPC for the application to connect to.
  • It configures a network load balancer (invisible to the customer) behind that endpoint.
  • It configures multi-tenant request routers to route database traffic to the underlying instances.
  • It provisions the initial minimum instance capacity.


When the cluster needs to autoscale up or down or resume after a pause, Aurora grabs capacity from a pool of already available nodes and adds them to the request routers. This process takes almost no time, and since the storage is shared between nodes, Aurora can scale up or down in seconds for most workloads. The service currently has autoscaling cooldown periods of 1.5 minutes for scaling up and 5 minutes for scaling down. Scaling operations are transparent to connected clients and applications, since existing connections and session state are transferred to the new nodes. The only difference with pausing and resuming is a higher latency for the first connection, typically around 25 seconds.

Available Now

Aurora Serverless for Aurora MySQL is available now in the US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) Regions. If you’re interested in learning more about the Aurora engine, there’s a great design paper available. If you’re interested in diving a bit deeper on exactly how Aurora Serverless works, look forward to more detail in future posts!

I personally believe this is one of the really exciting points in the evolution of the database story and I can’t wait to see what customers build with it!

Randall

AWS Online Tech Talks – August 2018

AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. Join us this month to learn about AWS services and solutions. We’ll have experts online to help answer any questions you may have. We’ve also launched our first-ever office hours-style tech talk, where you have the opportunity to ask our experts questions! This month we’ll be covering Amazon Aurora and Backup to AWS. Register today and join us! Please note – all sessions are free and in Pacific Time.

Tech talks featured this month:

Compute

August 28, 2018 | 11:00 AM – 11:45 AM PT – High Performance Computing on AWS – Learn how AWS scale and performance can deliver faster time to insights for your HPC environments.

Containers

August 22, 2018 | 11:00 AM – 11:45 AM PT – Distributed Tracing for Kubernetes Applications on AWS – Learn how to use AWS X-Ray to debug and monitor Kubernetes applications.

Data Lakes & Analytics

August 22, 2018 | 01:00 PM – 02:00 PM PT – Deep Dive on Amazon Redshift – Learn how to analyze all your data – across your data warehouse and data lake – with Amazon Redshift.

August 23, 2018 | 09:00 AM – 09:45 AM PT – Deep Dive on Amazon Athena – Dive deep on Amazon Athena and learn how to query S3 without servers to manage.

Databases

August 21, 2018 | 11:00 AM – 11:45 AM PT – Accelerate Database Development and Testing on AWS – Learn how to build database applications faster with Amazon Aurora.

Office Hours: August 30, 2018 | 11:00 AM – 12:00 PM PT – AWS Office Hours: Amazon Aurora – Opening up the Hood on AWS’ Fastest Growing Service – Ask AWS experts about anything on Amazon Aurora – From what makes Amazon Aurora different from other cloud databases to the unique ways our customers are leveraging it.

DevOps

August 22, 2018 | 09:00 AM – 10:00 AM PT – Amazon CI/CD Practices for Software Development Teams – Learn about Amazon’s CI/CD practices and how to leverage the AWS Developer Tools for similar workflows.

Enterprise & Hybrid

August 28, 2018 | 09:00 AM – 09:45 AM PT – Empower Your Organization with Alexa for Business – Discover how Amazon Alexa can act as an intelligent assistant and help you be more productive at work.

August 29, 2018 | 11:00 AM – 11:45 AM PT – Migrating Microsoft Workloads Like an Expert – Learn best practices on how to migrate Microsoft workloads to AWS like an expert.

IoT

August 27, 2018 | 01:00 PM – 02:00 PM PT – Using Predictive Analytics in Industrial IoT Applications – Learn how AWS IoT is used in industrial applications to predict equipment performance.

Machine Learning

August 20, 2018 | 09:00 AM – 10:00 AM PT – Machine Learning Models with TensorFlow Using Amazon SageMaker – Accelerate your ML solutions to production using TensorFlow on Amazon SageMaker.

August 21, 2018 | 09:00 AM – 10:00 AM PT – Automate for Efficiency with AI Language Services – Learn how organizations can benefit from intelligent automation through AI Language Services.

Mobile

August 29, 2018 | 01:00 PM – 01:45 PM PT – Building Serverless Web Applications with AWS Amplify – Learn how to build full stack serverless web applications with JavaScript & AWS.

re:Invent

August 23, 2018 | 11:00 AM – 11:30 AM PT – Episode 4: Inclusion & Diversity at re:Invent – Join Jill and Annie to learn about this year’s inclusion and diversity activities at re:Invent.

Security, Identity, & Compliance

August 27, 2018 | 11:00 AM – 12:00 PM PT – Automate Threat Mitigation Using AWS WAF and Amazon GuardDuty – Learn best practices for using AWS WAF to automatically mitigate threats found by Amazon GuardDuty.

Serverless

August 21, 2018 | 01:00 PM – 02:00 PM PT – Serverless Streams, Topics, Queues, & APIs! How to Pick the Right Serverless Application Pattern – Learn how to pick the right design pattern for your serverless application with AWS Lambda.

Storage

Office Hours: August 23, 2018 | 01:00 PM – 02:00 PM PT – AWS Office Hours: Backing Up to AWS – Increasing Storage Scalability to Meet the Challenges of Today’s Data Landscape – Ask AWS experts anything from how to choose and deploy backup solutions in the cloud, to how to work with the AWS partner ecosystem, to best practices to maximize your resources.

August 27, 2018 | 09:00 AM – 09:45 AM PT – Data Protection Best Practices with EBS Snapshots – Learn best practices on how to easily make a simple point-in-time backup for your Amazon EC2 instances using Amazon EBS snapshots.

August 29, 2018 | 09:00 AM – 09:45 AM PT – Hybrid Cloud Storage with AWS Storage Gateway & Amazon S3 – Learn how to use Amazon S3 for your on-premises applications with AWS Storage Gateway.

August 30, 2018 | 01:00 PM – 01:45 PM PT – A Briefing on AWS Data Transfer Services – Learn about your options for moving data into AWS, processing data at the edge, and building hybrid cloud architectures with AWS.

AWS IoT Device Defender Now Available – Keep Your Connected Devices Safe

I was cleaning up my home office over the weekend and happened upon a network map that I created in 1997. Back then my fully wired network connected 5 PCs and two printers. Today, with all of my children grown up and out of the house, we are down to 2 PCs. However, our home mesh network is also host to 2 Raspberry Pis, some phones, a pair of tablets, another pair of TVs, a Nintendo 3DS (thanks, Eric and Ana), 4 or 5 Echo devices, several brands of security cameras, and random gadgets that I buy. I also have a guest network, temporary home to random phones and tablets, and to some of the devices that I don’t fully trust.

This is, of course, a fairly meager collection compared to the typical office or factory, but I want to use it to point out some of the challenges that we all face as IoT devices become increasingly commonplace. I’m not a full-time system administrator. I set strong passwords and apply updates as I become aware of them, but security is always a concern.

New AWS IoT Device Defender
Today I would like to tell you about AWS IoT Device Defender. This new, fully-managed service (first announced at re:Invent) will help to keep your connected devices safe. It audits your device fleet, detects anomalous behavior, and recommends mitigations for any issues that it finds. It allows you to work at scale and in an environment that contains multiple types of devices.

Device Defender audits the configuration of your IoT devices against recommended security best practices. The audits can be run on a schedule or on demand, and perform the following checks:

Imperfect Configurations – The audit looks for expiring and revoked certificates, certificates that are shared by multiple devices, and duplicate client identifiers.

AWS Issues – The audit looks for overly permissive IoT policies and Cognito IDs with overly permissive access, and ensures that logging is enabled.

When issues are detected in the course of an audit, notifications can be delivered to the AWS IoT Console, as CloudWatch metrics, or as SNS notifications.
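Everything in the console walkthrough below is also available from the API. Here’s a minimal boto3 sketch of an on-demand audit; the check identifiers are illustrative and must already be enabled in your audit settings:

import boto3

iot = boto3.client("iot")

# Kick off an on-demand audit for a subset of checks. The names below are
# examples; the full list of identifiers appears under Defend > Settings.
task = iot.start_on_demand_audit_task(
    targetCheckNames=[
        "DEVICE_CERTIFICATE_EXPIRING_CHECK",
        "CONFLICTING_CLIENT_IDS_CHECK",
        "IOT_POLICY_OVERLY_PERMISSIVE_CHECK",
        "LOGGING_DISABLED_CHECK",
    ]
)

# Check on the audit's progress.
print(iot.describe_audit_task(taskId=task["taskId"])["taskStatus"])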

On the detection side, Device Defender looks at network connections, outbound packet and byte counts, destination IP addresses, inbound and outbound message rates, authentication failures, and more. You can set up security profiles, define acceptable behavior, and configure whitelists and blacklists of IP addresses and ports. An agent on each device is responsible for collecting device metrics and sending them to Device Defender. Devices can send metrics at intervals ranging from 5 minutes to 48 hours.

Using AWS IoT Device Defender
You can access Device Defender’s features from the AWS IoT Console, CLI, or via a full set of APIs. I’ll use the Console, as I usually do, starting at the Defend menu:

The full set of audit checks is available in Settings (any check that is enabled can be used as part of an audit):

I can see my scheduled audits by clicking Audit and Schedules. Then I can click Create to schedule a new one, or to run one immediately:

I create an audit by selecting the desired set of checks, and then save it for repeated use by clicking Create, or run it immediately:

I can choose the desired recurrence:

I can set the desired day for a weekly audit, with similar options for the other recurrence frequencies. I also enter a name for my audit, and click Create (not shown in the screen shot):

I can click Results to see the outcome of past audits:

And I can click any audit to learn more:

Device Defender allows me to create security profiles to describe the expected behavior for devices within a thing group (or for all devices). I click Detect and Security profiles to get started, and can see my profiles. Then I can click Create to make a new one:

I enter a name and a description, and then model the expected behavior. In this case, I expect each device to send and receive less than 100K of network traffic per hour:

I can choose to deliver alerts to an SNS topic (I’ll need to set up an IAM role if I do this):

I can specify a behavior for all of my devices, or for those in specific thing groups:

After setting it all up, I click Save to create my security profile:
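The console steps above map directly onto the CreateSecurityProfile and AttachSecurityProfile APIs. Here’s a hedged boto3 sketch of a similar profile; the metric name, duration, and thing group ARN are illustrative, so check the Device Defender documentation for the exact values supported:

import boto3

iot = boto3.client("iot")

# Flag any device that sends more than roughly 100 KB of outbound traffic per hour.
iot.create_security_profile(
    securityProfileName="LowTrafficDevices",
    securityProfileDescription="Expect less than 100K of outbound traffic per hour",
    behaviors=[
        {
            "name": "OutboundBytesPerHour",
            "metric": "aws:all-bytes-out",
            "criteria": {
                "comparisonOperator": "less-than",
                "value": {"count": 102400},
                "durationSeconds": 3600,
            },
        }
    ],
)

# Attach the profile to a thing group (the ARN below is a placeholder).
iot.attach_security_profile(
    securityProfileName="LowTrafficDevices",
    securityProfileTargetArn="arn:aws:iot:us-east-1:123456789012:thinggroup/SensorFleet",
)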

Next, I can click Violations to identify things that are in conflict with the behavior that is expected of them. The History tab lets me look back in time and examine past violations:

I can also view a device’s history of violations:

As you can see, Device Defender lets me know what is going on with my IoT devices, raises alarms when something suspicious occurs, and helps me to track down past issues, all from within the AWS Management Console.

Available Now
AWS IoT Device Defender is available today in the US East (N. Virginia), US West (Oregon), US East (Ohio), EU (Ireland), EU (Frankfurt), EU (London), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Seoul) Regions and you can start using it today. Pricing for audits is per-device, per-month; pricing for monitored datapoints is per datapoint, both with generous allocations in the AWS Free Tier (see the AWS IoT Device Defender page for more info).

Jeff;

New – Provisioned Throughput for Amazon Elastic File System (EFS)

Amazon Elastic File System lets you create petabyte-scale file systems that can be accessed in massively parallel fashion from hundreds or thousands of Amazon Elastic Compute Cloud (EC2) servers and on-premises resources, scaling on demand without disrupting applications. Behind the scenes, storage is distributed across multiple Availability Zones and redundant storage servers in order to provide you with file systems that are scalable, durable, and highly available. Space is allocated and billed on an as-needed basis, allowing you to consume as much as you need while keeping costs proportional to actual usage. Applications can achieve high levels of aggregate throughput and IOPS, with consistent low latencies. Our customers are using EFS for a broad spectrum of use cases including media processing workflows, big data & analytics jobs, code repositories, build trees, and content management repositories, taking advantage of the ability to simply lift-and-shift their existing file-based applications and workflows to the cloud.

A Quick Review
As you may already know, EFS lets you choose one of two performance modes each time you create a file system:

General Purpose – This is the default mode, and the one that you should start with. It is perfect for use cases that are sensitive to latency, and supports up to 7,000 operations per second per file system.

Max I/O – This mode scales to higher levels of aggregate throughput and performance, with slightly higher latency. It does not have an intrinsic limit on operations per second.

With either mode, throughput scales with the size of the file system, and can also burst to higher levels as needed. The size of the file system determines the upper bound on how high and how long you can burst. For example:

1 TiB – A 1 TiB file system can deliver 50 MiB/second continuously, and burst to 100 MiB/second for up to 12 hours each day.

10 TiB – A 10 TiB file system can deliver 500 MiB/second continuously, and burst up to 1 GiB/second for up to 12 hours each day.

EFS uses a credit system that allows you to “save up” throughput during quiet times and then use it during peak times. You can read about Amazon EFS Performance to learn more about the credit system and the two performance modes.

New Provisioned Throughput
Web server content farms, build trees, and EDA simulations (to name a few) can benefit from high throughput to a set of files that do not occupy a whole lot of space. In order to address this usage pattern, you now have the option to provision any desired level of throughput (up to 1 GiB/second) for each of your EFS file systems. You can set an initial value when you create the file system, and then increase it as often as you’d like. You can dial it back down every 24 hours, and you can also switch between provisioned throughput and bursting throughput on the same cycle. You can, for example, configure an EFS file system to provide 50 MiB/second of throughput for your web server, even if the volume contains a relatively small amount of content.

If your application has throughput requirements that exceed what is possible using the default (bursting) model, the provisioned model is for you! You can achieve the desired level of throughput as soon as you create the file system, regardless of how much or how little storage you consume.
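If you prefer to do this from code rather than the console, the same setting is exposed through the CreateFileSystem and UpdateFileSystem APIs. A minimal boto3 sketch, with a placeholder creation token:

import boto3

efs = boto3.client("efs")

# Create a General Purpose file system with 50 MiB/s of provisioned throughput.
fs = efs.create_file_system(
    CreationToken="web-content-fs",
    PerformanceMode="generalPurpose",
    ThroughputMode="provisioned",
    ProvisionedThroughputInMibps=50.0,
)

# Later, raise the provisioned level (or switch back to bursting) on the same
# file system; as noted above, you can dial it back down every 24 hours.
efs.update_file_system(
    FileSystemId=fs["FileSystemId"],
    ThroughputMode="provisioned",
    ProvisionedThroughputInMibps=100.0,
)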

Here’s how I set this up:

Using provisioned throughput means that I will be billed separately for storage (in GiB/month units) and for provisioned throughput (in MiB/second-month units).

I can monitor average throughput by using a CloudWatch metric math expression. The Amazon EFS Monitoring Tutorial contains all of the formulas, along with CloudFormation templates that I can use to set up a complete CloudWatch Dashboard in a matter of minutes:

I select the desired template, enter the Id of my EFS file system, and click through the remaining screens to create my dashboard:

The template creates an IAM role, a Lambda function, a CloudWatch Events rule, and the dashboard:

The dashboard is available within the CloudWatch Console:

Here’s the dashboard for my test file system:

To learn more about how to use the EFS performance mode that is the best fit for your application, read Amazon Elastic File System – Choosing Between the Different Throughput & Performance Modes.
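Going back to the CloudWatch metric math mentioned earlier: if you just want the average-throughput number without building the full dashboard, you can run a similar expression directly through GetMetricData. This is a simplified sketch (not the tutorial’s exact formula), with a placeholder file system ID:

from datetime import datetime, timedelta
import boto3

cloudwatch = boto3.client("cloudwatch")

# Average throughput in MiB/s over the last hour, computed from TotalIOBytes.
response = cloudwatch.get_metric_data(
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    MetricDataQueries=[
        {
            "Id": "total_io",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EFS",
                    "MetricName": "TotalIOBytes",
                    "Dimensions": [{"Name": "FileSystemId", "Value": "fs-12345678"}],
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,
        },
        {
            "Id": "avg_mibps",
            "Expression": "(total_io / 1048576) / PERIOD(total_io)",
            "Label": "Average throughput (MiB/s)",
        },
    ],
)

for result in response["MetricDataResults"]:
    if result["Id"] == "avg_mibps":
        print(list(zip(result["Timestamps"], result["Values"])))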

Available Now
This feature is available now and you can start using it today in all AWS Regions where EFS is available.

Jeff;


Thoughts On Machine Learning Accuracy

This post shares some brief thoughts on machine learning accuracy and bias.

Let’s start with some comments about a recent ACLU blog post describing a facial recognition trial. Using Rekognition, the ACLU built a face database using 25,000 publicly available arrest photos and then performed facial similarity searches of that database using public photos of all current members of Congress. They found 28 incorrect matches out of 535, using an 80% confidence level; this is a 5% misidentification (sometimes called ‘false positive’) rate and a 95% accuracy rate. The ACLU has not published its data set, methodology, or results in detail, so we can only go on what they’ve publicly said. But here are some thoughts on their claims:

  1. The default confidence threshold for facial recognition APIs in Rekognition is 80%, which is good for a broad set of general use cases (such as identifying celebrities on social media or family members who look alike in a photos app), but it’s not the right one for public safety use cases. The 80% confidence threshold used by the ACLU is far too low to ensure the accurate identification of individuals; we would expect to see false positives at this level of confidence. We recommend 99% for use cases where highly accurate face similarity matches are important (as indicated in our public documentation).

    To illustrate the impact of confidence threshold on false positives, we ran a test where we created a face collection using a dataset of over 850,000 faces commonly used in academia. We then used public photos of all members of US Congress (the Senate and House) to search against this collection in a similar way to the ACLU blog.

    When we set the confidence threshold at 99% (as we recommend in our documentation), our misidentification rate dropped to 0% despite the fact that we are comparing against a larger corpus of faces (30x larger than ACLU’s tests). This illustrates how important it is for those using technology to help with public safety issues to pick appropriate confidence levels, so they have few (if any) false positives.

  2. In real-world public safety and law enforcement scenarios, Amazon Rekognition is almost exclusively used to help narrow the field and allow humans to expeditiously review and consider options using their judgment (and not to make fully autonomous decisions). In those scenarios it can help find lost children, fight human trafficking, or prevent crimes, and Rekognition is generally only the first step in identifying an individual. In other use cases (such as social media), there isn’t the same need to double-check, so confidence thresholds can be lower.

  3. In addition to setting the confidence threshold far too low, the Rekognition results can be significantly skewed by using a facial database that is not appropriately representative and is itself skewed. In this case, the ACLU used a facial database of mugshots that may have had a material impact on the accuracy of Rekognition findings.

  4. The advantage of a cloud-based machine learning application like Rekognition is that it is constantly improving as we continue to improve the algorithm with more data. Our customers immediately get the benefit of those improvements. We continue to focus on our mission of making Rekognition the most accurate and powerful tool for identifying people, objects, and scenes – and that certainly includes ensuring that the results are free of any bias that impacts accuracy. We have already been able to add a lot of value for customers and the world at large with Rekognition in the fight against human trafficking, reuniting lost children with their families, reducing fraud for mobile payments, and improving security, and we’re excited about continuing to help our customers and society at large with Rekognition in the future.

  5. There is a general misconception that people can match faces to photos better than machines. In fact, the National Institute of Standards and Technology (“NIST”) recently shared a study of facial recognition technologies that are at least two years behind the state of the art used in Rekognition and concluded that even those older technologies can outperform human facial recognition abilities.

A final word about the misinterpreted ACLU results. When there are new technological advances, we all have to clearly understand what’s real and what’s not. There’s a difference between using machine learning to identify a food object and using machine learning to determine whether a face match should warrant considering any law enforcement action. The latter is serious business and requires much higher confidence levels. We continue to recommend that customers do not use less than 99% confidence levels for law enforcement matches, and then only use the matches as one input among others that make sense for each agency. Machine learning is a very valuable tool to help law enforcement agencies, and while we should make sure it is applied correctly, we should not throw away the oven because the temperature could be set wrong and burn the pizza. It is a very reasonable idea, however, for the government to weigh in and specify what temperature (or confidence levels) it wants law enforcement agencies to meet to assist in their public safety work.

Now Available: R5, R5d, and z1d Instances

Just last week I told you about our plans to launch EC2 Instances with Faster Processors and More Memory. Today I am happy to report that the R5, R5d, and z1d instances are available now and you can start using them today. Let’s take a look at each one!

R5 Instances
The memory-optimized R5 instances use custom Intel® Xeon® Platinum 8000 Series (Skylake-SP) processors running at up to 3.1 GHz, powered by sustained all-core Turbo Boost. They are perfect for distributed in-memory caches, in-memory analytics, and big data analytics, and are available in six sizes:

Instance Name | vCPUs | Memory  | EBS-Optimized Bandwidth | Network Bandwidth
r5.large      | 2     | 16 GiB  | Up to 3.5 Gbps          | Up to 10 Gbps
r5.xlarge     | 4     | 32 GiB  | Up to 3.5 Gbps          | Up to 10 Gbps
r5.2xlarge    | 8     | 64 GiB  | Up to 3.5 Gbps          | Up to 10 Gbps
r5.4xlarge    | 16    | 128 GiB | 3.5 Gbps                | Up to 10 Gbps
r5.12xlarge   | 48    | 384 GiB | 7.0 Gbps                | 10 Gbps
r5.24xlarge   | 96    | 768 GiB | 14.0 Gbps               | 25 Gbps

You can launch R5 instances today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) Regions. To learn more, read about R5 and R5d Instances.
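If you want to skip the console, launching one is the same as launching any other instance type; here’s a brief boto3 sketch with a placeholder AMI ID and key pair name:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a single r5.large (the AMI and key pair below are placeholders).
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="r5.large",
    KeyName="my-key-pair",
    MinCount=1,
    MaxCount=1,
)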

R5d Instances
The R5d instances share their specs with the R5 instances, and also include up to 3.6 TB of local NVMe storage. Here are the sizes:

Instance Name | vCPUs | Memory  | Local Storage       | EBS-Optimized Bandwidth | Network Bandwidth
r5d.large     | 2     | 16 GiB  | 1 x 75 GB NVMe SSD  | Up to 3.5 Gbps          | Up to 10 Gbps
r5d.xlarge    | 4     | 32 GiB  | 1 x 150 GB NVMe SSD | Up to 3.5 Gbps          | Up to 10 Gbps
r5d.2xlarge   | 8     | 64 GiB  | 1 x 300 GB NVMe SSD | Up to 3.5 Gbps          | Up to 10 Gbps
r5d.4xlarge   | 16    | 128 GiB | 2 x 300 GB NVMe SSD | 3.5 Gbps                | Up to 10 Gbps
r5d.12xlarge  | 48    | 384 GiB | 2 x 900 GB NVMe SSD | 7.0 Gbps                | 10 Gbps
r5d.24xlarge  | 96    | 768 GiB | 4 x 900 GB NVMe SSD | 14.0 Gbps               | 25 Gbps

The R5d instances are available in the US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions. To learn more, read about R5 and R5d Instances.

z1d Instances
The high frequency z1d instances use custom Intel® Xeon® Scalable Processors running at up to 4.0 GHz, powered by sustained all-core Turbo Boost, perfect for Electronic Design Automation (EDA), financial simulation, relational database, and gaming workloads that can benefit from extremely high per-core performance. They are available in six sizes:

Instance Name | vCPUs | Memory  | Local Storage       | EBS-Optimized Bandwidth | Network Bandwidth
z1d.large     | 2     | 16 GiB  | 1 x 75 GB NVMe SSD  | Up to 2.333 Gbps        | Up to 10 Gbps
z1d.xlarge    | 4     | 32 GiB  | 1 x 150 GB NVMe SSD | Up to 2.333 Gbps        | Up to 10 Gbps
z1d.2xlarge   | 8     | 64 GiB  | 1 x 300 GB NVMe SSD | 2.333 Gbps              | Up to 10 Gbps
z1d.3xlarge   | 12    | 96 GiB  | 1 x 450 GB NVMe SSD | 3.5 Gbps                | Up to 10 Gbps
z1d.6xlarge   | 24    | 192 GiB | 1 x 900 GB NVMe SSD | 7.0 Gbps                | 10 Gbps
z1d.12xlarge  | 48    | 384 GiB | 2 x 900 GB NVMe SSD | 14.0 Gbps               | 25 Gbps

The fast cores allow you to run your existing jobs to completion more quickly than ever before, giving you the ability to fine-tune your use of databases and EDA tools that are licensed on a per-core basis.

You can launch z1d instances today in the US East (N. Virginia), US West (Oregon), US West (N. California), EU (Ireland), Asia Pacific (Tokyo), and Asia Pacific (Singapore) Regions. To learn more, read about z1d Instances.

In the Works
You will be able to create Amazon Relational Database Service (RDS), Amazon ElastiCache, and Amazon Aurora instances that make use of the R5 instances in the next two or three months.

The r5.metal, r5d.metal, and z1d.metal instances are on the near-term roadmap and I’ll let you know when they are available.

Jeff;


Amazon SageMaker Adds Batch Transform Feature and Pipe Input Mode for TensorFlow Containers

At the New York Summit a few days ago we launched two new Amazon SageMaker features: a new batch inference feature called Batch Transform that allows customers to make predictions in non-real-time scenarios across petabytes of data, and Pipe Input Mode support for TensorFlow containers. SageMaker remains one of my favorite services and we’ve covered it extensively on this blog and the machine learning blog. In fact, the rapid pace of innovation from the SageMaker team is a bit hard to keep up with. Since our last post on SageMaker’s Automatic Model Tuning with Hyperparameter Optimization, the team has launched 4 new built-in algorithms and tons of new features. Let’s take a look at the new Batch Transform feature.

Batch Transform

The Batch Transform feature is a high-performance and high-throughput method for transforming data and generating inferences. It’s ideal for scenarios where you’re dealing with large batches of data, don’t need sub-second latency, or need to both preprocess and transform the training data. The best part? You don’t have to write a single additional line of code to make use of this feature. You can take all of your existing models and start batch transform jobs based on them. This feature is available at no additional charge and you pay only for the underlying resources.

Let’s take a look at how we would do this for the built-in Object Detection algorithm. I followed the example notebook to train my object detection model. Now I’ll go to the SageMaker console and open the Batch Transform sub-console.

From there I can start a new batch transform job.

Here I can name my transform job, select which of my models I want to use, and choose the number and type of instances to use. Additionally, I can configure the specifics around how many records to send for inference concurrently and the size of the payload. If I don’t manually specify these, SageMaker will select some sensible defaults.

Next I need to specify my input location. I can either use a manifest file or just load all the files in an S3 location. Since I’m dealing with images here I’ve manually specified my input content-type.

Finally, I’ll configure my output location and start the job!

Once the job is running, I can open the job detail page and follow the links to the metrics and the logs in Amazon CloudWatch.

I can see the job is running and if I look at my results in S3 I can see the predicted labels for each image.

The transform generated one output JSON file per input file containing the detected objects.

From here it would be easy to create a table for the bucket in AWS Glue and either query the results with Amazon Athena or visualize them with Amazon QuickSight.

Of course it’s also possible to start these jobs programmatically from the SageMaker API.
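For example, a minimal boto3 sketch of the job configured above might look like this; the model name, S3 paths, and instance type are placeholders:

import boto3

sagemaker = boto3.client("sagemaker")

# Start a batch transform job against an existing model.
sagemaker.create_transform_job(
    TransformJobName="object-detection-batch-1",
    ModelName="my-object-detection-model",
    MaxConcurrentTransforms=4,
    MaxPayloadInMB=6,
    BatchStrategy="SingleRecord",
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/batch-input/",
            }
        },
        "ContentType": "image/jpeg",
    },
    TransformOutput={"S3OutputPath": "s3://my-bucket/batch-output/"},
    TransformResources={"InstanceType": "ml.p3.2xlarge", "InstanceCount": 1},
)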

You can find a lot more detail on how to use batch transforms in your own containers in the documentation.

Pipe Input Mode for TensorFlow

Pipe input mode allows customers to stream their training dataset directly from Amazon Simple Storage Service (S3) into Amazon SageMaker using a highly optimized multi-threaded background process. This mode offers significantly better read throughput than File input mode, which must first download the data to the local Amazon Elastic Block Store (EBS) volume. This means your training jobs start sooner, finish faster, and use less disk space, lowering the costs associated with training your models. It has the added benefit of letting you train on datasets beyond the 16 TB EBS volume size limit.

Earlier this year, we ran some experiments with Pipe Input Mode and found that startup times were reduced by up to 87% on a 78 GB dataset, with throughput twice as fast in some benchmarks, ultimately resulting in up to a 35% reduction in total training time.

By adding support for Pipe Input Mode to TensorFlow we’re making it easier for customers to take advantage of the same increased speed available to the built-in algorithms. Let’s look at how this works in practice.

First, I need to make sure I have the sagemaker-tensorflow-extensions available for my training job. This gives us the new PipeModeDataset class which takes a channel and a record format as inputs and returns a TensorFlow dataset. We can use this in our input_fn for the TensorFlow estimator and read from the channel. The code sample below shows a simple example.

import tensorflow as tf
from sagemaker_tensorflow import PipeModeDataset

def input_fn(channel):
    # Simple example data - a labeled vector.
    features = {
        'data': tf.FixedLenFeature([], tf.string),
        'labels': tf.FixedLenFeature([], tf.int64),
    }
    
    # A function to parse record bytes to a labeled vector record
    def parse(record):
        parsed = tf.parse_single_example(record, features)
        return ({
            'data': tf.decode_raw(parsed['data'], tf.float64)
        }, parsed['labels'])

    # Construct a PipeModeDataset reading from a 'training' channel, using
    # the TF Record encoding.
    ds = PipeModeDataset(channel=channel, record_format='TFRecord')

    # The PipeModeDataset is a TensorFlow Dataset and provides standard Dataset methods
    ds = ds.repeat(20)
    ds = ds.prefetch(10)
    ds = ds.map(parse, num_parallel_calls=10)
    ds = ds.batch(64)
    
    return ds

Then you can define your model the same way you would for a normal TensorFlow estimator. When it comes time to create the estimator, you just need to pass in input_mode='Pipe' as one of the parameters.
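For reference, here’s a hedged sketch of what that might look like with the SageMaker Python SDK as it existed at the time (parameter names changed in later versions); the entry point script, role ARN, and framework version are placeholders:

from sagemaker.tensorflow import TensorFlow

# Train with the dataset streamed over the 'training' channel read by the
# PipeModeDataset in the script above.
estimator = TensorFlow(
    entry_point="train.py",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    training_steps=10000,
    evaluation_steps=100,
    train_instance_count=1,
    train_instance_type="ml.p3.2xlarge",
    framework_version="1.8.0",
    input_mode="Pipe",          # stream from S3 instead of downloading to EBS
)

estimator.fit({"training": "s3://my-bucket/training-data/"})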

Available Now

Both of these new features are available now at no additional charge, and I’m looking forward to seeing what customers can build with the batch transform feature. I can already tell you that it will help us with some of our internal ML workloads here in AWS Marketing.

As always, let us know what you think of this feature in the comments or on Twitter!

Randall

Amazon Polly Update – Time-Driven Prosody and Asynchronous Synthesis

I hope that you are enjoying the Polly-powered audio that is available for the newest posts on this blog, including the DeepLens Challenge and the Storage Gateway Recap. As part of my blogging process, I now listen to the synthesized speech for my draft blog posts in order to get a better sense for how they flow.

Today we are launching two new features for Amazon Polly:

Time-Driven Prosody – You can now specify the desired duration for the synthesized speech that corresponds to part or all of the input text.

Asynchronous Synthesis – You can now process large blocks of text and store the synthesized speech in Amazon S3 with a single call.

Both of these features are available now and you can start using them today. Let’s take a closer look!

Time-Driven Prosody
Imagine that you are creating a multi-lingual version of a video or a self-running presentation. You write the script, record the video in one language, and then use Amazon Translate and Amazon Polly to create audio tracks in other languages. In order to keep each language in sync with the visual content, you need to exercise fine-grained control over the duration of each segment. That’s where this new feature comes in. You can now specify the maximum desired duration for any desired segments, counting on Polly to adjust the speech rate in order to limit the length of each segment.

The preceding paragraph generates 19 seconds of audio if I use Amazon Polly’s Joanna voice with no other options:

<speak>
  In order to keep each language in sync with the visual content, 
  you need to exercise fine-grained control over the duration of
  each segment. That's where this new feature comes in. You can 
  now specify the maximum desired duration for any desired segments, 
  counting on Polly to adjust the speech rate in order to limit 
  the length of each segment.
</speak>

I can use a <prosody> tag to limit the length to 15 seconds:

<speak>
  <prosody amazon:max-duration="15s">
    In order to keep each language in sync with the visual content, 
    you need to exercise fine-grained control over the duration of
    each segment. That's where this new feature comes in. You can 
    now specify the maximum desired duration for any desired segments, 
    counting on Polly to adjust the speech rate in order to limit 
    the length of each segment.
 </prosody>
</speak>

I can control the duration at a more fine-grained level by using multiple <prosody> tags:

  <prosody amazon:max-duration="10s">
    In order to keep each language in sync with the visual content, 
    you need to exercise fine-grained control over the duration of
    each segment. 
  </prosody>
  <prosody amazon:max-duration="7s">
    That's where this new feature comes in. You can now specify 
    the maximum desired duration for any desired segments, 
    counting on Polly to adjust the speech rate in order to limit 
    the length of each segment.
 </prosody>

The Spanish equivalent (courtesy of Amazon Translate) of my English text is somewhat longer and the speed-up is apparent:

<speak>
  <prosody amazon:max-duration="15s">
    Para mantener cada idioma sincronizado con el contenido
    visual, es necesario ejercer un control detallado sobre
    la duración de cada segmento. Ahí es donde entra esta 
    nueva característica. Ahora puede especificar la 
    duración máxima deseada para los segmentos deseados, 
    contando con que Polly ajuste la velocidad de voz para 
    limitar la longitud de cada segmento.
 </prosody>
</speak>

The text inside of each time-limited <prosody> tag is limited to 1500 characters and nesting is not allowed (the inner tag will be ignored). In order to ensure that the audio remains comprehensible, Polly will speed up the audio by a maximum of 5x.
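If you want to hear the result of any of the SSML snippets above outside of the console, you can pass them to the regular SynthesizeSpeech API. A short boto3 sketch that writes the audio to a local file:

import boto3

polly = boto3.client("polly")

ssml = """<speak>
  <prosody amazon:max-duration="15s">
    In order to keep each language in sync with the visual content,
    you need to exercise fine-grained control over the duration of
    each segment.
  </prosody>
</speak>"""

# Synthesize the time-limited SSML and save the MP3 locally.
response = polly.synthesize_speech(
    Text=ssml, TextType="ssml", VoiceId="Joanna", OutputFormat="mp3"
)
with open("segment.mp3", "wb") as f:
    f.write(response["AudioStream"].read())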

Asynchronous Synthesis
This feature makes it easier for you to use Polly to generate speech for long-form content such as articles or book chapters by allowing you to process up to 100,000 characters of text at a time using asynchronous requests. The synthesized speech is delivered to the S3 bucket of your choice, with failure notifications optionally routed to the Amazon Simple Notification Service (SNS) topic of your choice. The generated audio can be up to 6 hours long, and is typically ready within minutes. In addition to 100,000 characters of text, each request can include an additional 100,000 characters of Speech Synthesis Markup Language (SSML) markup.

Each asynchronous request creates a new speech synthesis task. You can initiate and manage tasks from the Polly Console, CLI (start-speech-synthesis-task), or API (StartSpeechSynthesisTask).
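Here’s a hedged boto3 sketch of the API route; the input file, bucket name, and SNS topic are placeholders:

import boto3

polly = boto3.client("polly")

# Kick off an asynchronous synthesis task for a long SSML document.
with open("book-chapter.xml") as f:
    ssml_text = f.read()

task = polly.start_speech_synthesis_task(
    Text=ssml_text,
    TextType="ssml",
    VoiceId="Joanna",
    OutputFormat="mp3",
    OutputS3BucketName="my-polly-output-bucket",
    SnsTopicArn="arn:aws:sns:us-east-1:123456789012:polly-notifications",
)

# Poll the task; the finished audio lands in the S3 bucket.
status = polly.get_speech_synthesis_task(
    TaskId=task["SynthesisTask"]["TaskId"]
)["SynthesisTask"]["TaskStatus"]
print(status)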

To test this feature I created a plain-text version of my thoroughly obsolete AWS book and inserted some SSML tags, turning it into valid XML along the way. Then I open the Polly Console, click Text-to-Speech, paste the XML, and click Synthesize to S3:

I enter the name of my S3 bucket (which must be in the region where I plan to create the task), and click Synthesize to proceed:

My task is created:

And I can see it in the list of tasks:

I receive an email when the synthesis is complete:

And the file is in my bucket as expected:

I did not spend a lot of time on the markup, but the results are impressive:

Interestingly enough, most of that chapter is still relevant. The rest of the book has been overtaken by history, and is best left there! Perhaps I’ll write another one sometime.

Anyway, as you can see (and hear) the asynchronous speech synthesis is powerful and easy to use. Give it a shot, build something cool, and tell me about it.

Jeff;



Amazon EC2 Instance Update – Faster Processors and More Memory

Last month I told you about the Nitro system and explained how it will allow us to broaden the selection of EC2 instances and to pick up the pace as we do so, with an ever-widening choice of compute, storage, memory, and networking options. This will allow us to give you access to the latest technology very quickly, giving you the ability to choose the instance type that is the best match for your applications.

Today, I would like to tell you about three new instance types that are in the works and that will be available soon:

Z1d – Compute-intensive instances running at up to 4.0 GHz, powered by sustained all-core Turbo Boost. They are ideal for Electronic Design Automation (EDA) and relational database workloads, and are also a great fit for several kinds of HPC workloads.

R5 – Memory-optimized instances running at up to 3.1 GHz powered by sustained all-core Turbo Boost, with up to 50% more vCPUs and 60% more memory than R4 instances.

R5d – Memory-optimized instances equipped with local NVMe storage (up to 3.6 TB for the largest R5d instance), available in the same sizes and with the same specs as the R5 instances.

We are also planning to launch R5 Bare Metal, R5d Bare Metal, and Z1d Bare Metal instances. As is the case with the existing i3.metal instances, you will be able to access low-level hardware features and to run applications that are not licensed or supported in virtualized environments.

Z1d Instances
The Z1d instances are designed for applications that can benefit from extremely high per-core performance. These include:

Electronic Design Automation – As chips become smaller and denser, the amount of compute power needed to design and verify the chips increases non-linearly. Semiconductor customers deploy jobs that span thousands of cores; having access to faster cores reduces turnaround time for each job and can also lead to a measurable reduction in software licensing costs.

HPC – In the financial services world, jobs that run analyses or compute risks also benefit from faster cores. Manufacturing organizations can run their Finite Element Analysis (FEA) and simulation jobs to completion more quickly.

Relational Database – CPU-bound workloads that run on a database that “features” high per-core license fees will enjoy both cost and performance benefits.

Z1d instances use custom Intel® Xeon® Scalable Processors running at up to 4.0 GHz, powered by sustained all-core Turbo Boost. They will be available in 6 sizes, with up to 48 vCPUs, 384 GiB of memory, and 1.8 TB of local NVMe storage. On the network side, they feature ENA networking that will deliver up to 25 Gbps of bandwidth, and are EBS-Optimized by default for up to 14 Gbps of bandwidth. As usual, you can launch them in a Cluster Placement Group to increase throughput and reduce latency. Here are the sizes and specs:

Instance Name | vCPUs | Memory  | Local Storage       | EBS-Optimized Bandwidth | Network Bandwidth
z1d.large     | 2     | 16 GiB  | 1 x 75 GB NVMe SSD  | Up to 2.333 Gbps        | Up to 10 Gbps
z1d.xlarge    | 4     | 32 GiB  | 1 x 150 GB NVMe SSD | Up to 2.333 Gbps        | Up to 10 Gbps
z1d.2xlarge   | 8     | 64 GiB  | 1 x 300 GB NVMe SSD | 2.333 Gbps              | Up to 10 Gbps
z1d.3xlarge   | 12    | 96 GiB  | 1 x 450 GB NVMe SSD | 3.5 Gbps                | Up to 10 Gbps
z1d.6xlarge   | 24    | 192 GiB | 1 x 900 GB NVMe SSD | 7.0 Gbps                | 10 Gbps
z1d.12xlarge  | 48    | 384 GiB | 2 x 900 GB NVMe SSD | 14.0 Gbps               | 25 Gbps

The instances are HVM and VPC-only, and you will need to use an AMI with the appropriate ENA and NVMe drivers. Any AMI that runs on C5 or M5 instances will also run on Z1d instances.

R5 Instances
Building on the earlier generations of memory-intensive instance types (M2, CR1, R3, and R4), the R5 instances are designed to support high-performance databases, distributed in-memory caches, in-memory analytics, and big data analytics. They use custom Intel® Xeon® Platinum 8000 Series (Skylake-SP) processors running at up to 3.1 GHz, again powered by sustained all-core Turbo Boost. The instances will be available in 6 sizes, with up to 96 vCPUs and 768 GiB of memory. Like the Z1d instances, they feature ENA networking and are EBS-Optimized by default, and can be launched in Placement Groups. Here are the sizes and specs:

Instance Name | vCPUs | Memory  | EBS-Optimized Bandwidth | Network Bandwidth
r5.large      | 2     | 16 GiB  | Up to 3.5 Gbps          | Up to 10 Gbps
r5.xlarge     | 4     | 32 GiB  | Up to 3.5 Gbps          | Up to 10 Gbps
r5.2xlarge    | 8     | 64 GiB  | Up to 3.5 Gbps          | Up to 10 Gbps
r5.4xlarge    | 16    | 128 GiB | 3.5 Gbps                | Up to 10 Gbps
r5.12xlarge   | 48    | 384 GiB | 7.0 Gbps                | 10 Gbps
r5.24xlarge   | 96    | 768 GiB | 14.0 Gbps               | 25 Gbps

Once again, the instances are HVM and VPC-only, and you will need to use an AMI with the appropriate ENA and NVMe drivers.

Learn More
The new EC2 instances announced today highlight our plan to continue innovating in order to better meet your needs! I’ll share additional information as soon as it is available.

Jeff;



New – EC2 Compute Instances for AWS Snowball Edge

I love factories and never miss an opportunity to take a tour. Over the years, I have been lucky enough to watch as raw materials and sub-assemblies are turned into cars, locomotives, memory chips, articulated buses, and more. I’m always impressed by the speed, precision, repeatability, and the desire to automate every possible step. On one recent tour, the IT manager told me that he wanted to be able to set up and centrally manage the global collection of on-premises industrialized PCs that monitor their machinery as easily and as efficiently as he does their EC2 instances and other cloud resources.

Today we are making that manager’s dream a reality, with the introduction of EC2 instances that run on AWS Snowball Edge devices! These ruggedized devices, with 100 TB of local storage, can be used to collect and process data in hostile environments with limited or non-existent Internet connections before shipping the processed data back to AWS for storage, aggregation, and detailed analysis. Here are the instance specs:

Instance Name | vCPUs | Memory
sbe1.small    | 1     | 1 GiB
sbe1.medium   | 1     | 2 GiB
sbe1.large    | 2     | 4 GiB
sbe1.xlarge   | 4     | 8 GiB
sbe1.2xlarge  | 8     | 16 GiB
sbe1.4xlarge  | 16    | 32 GiB

Each Snowball Edge device is powered by an Intel® Xeon® D processor running at 1.8 GHz, and supports any combination of instances that consume up to 24 vCPUs and 32 GiB of memory. You can build and test AMIs (Amazon Machine Images) in the cloud and then preload them onto the device as part of the ordering process (I’ll show you how in just a minute). You can use the EC2-compatible endpoint exposed by each device to programmatically start, stop, resume, and terminate instances. This allows you to use the existing CLI commands and to build tools and scripts to manage fleets of devices. It also allows you to take advantage of your existing EC2 skills and knowledge, and to put them to good use in a new environment.
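To give a feel for what that looks like, here’s a hedged boto3 sketch against the device’s local EC2-compatible endpoint; the IP address, port, credentials, and image ID are all placeholders that come from unlocking and configuring the device:

import boto3

# Point a standard EC2 client at the endpoint exposed by the Snowball Edge.
ec2 = boto3.client(
    "ec2",
    endpoint_url="http://192.168.0.10:8008",   # placeholder address and port
    aws_access_key_id="DEVICE_ACCESS_KEY",     # obtained from the device tooling
    aws_secret_access_key="DEVICE_SECRET_KEY",
    region_name="us-east-1",                   # placeholder region for client config
)

# Launch one of the AMIs preloaded onto the device (image ID is a placeholder).
ec2.run_instances(
    ImageId="s.ami-0123456789abcdef0",
    InstanceType="sbe1.medium",
    MinCount=1,
    MaxCount=1,
)

print(ec2.describe_instances())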

There are three main setup steps:

  1. Creating a suitable AMI.
  2. Ordering a Snowball Edge Device.
  3. Connecting and Configuring the Device.

Let’s take an in-depth look at the first two steps. Time was tight and I was not able to get hands-on experience with an actual device, so the third step will have to wait for another time.

Creating a Suitable AMI
I have the ability to choose up to 10 AMIs that will be preloaded onto the device. The AMIs must be owned by my AWS account, and must be based on one of the following Marketplace AMIs:

These AMIs have been tested for use on Snowball Edge devices and can be used as a starting point for customization. We will be adding additional options over time, so let us know what you need.

I decided to start with the newest Ubuntu AMI, and launch it on an M5 instance, taking care to specify the SSH keypair that I will eventually use to connect to the instance from my terminal client:

After my instance launches, I connect to it, customize it as desired for use on my device, and then return to the EC2 Console to create an AMI. I select the running instance, choose Create Image from the Actions menu, specify the details, and click Create Image:

The size of the root volume will determine how much of the device’s SSD storage is allocated to the instance when it launches. A total of 1 TB of space is available for all running instances, so keep your local file storage needs in mind as you analyze your use case and set up your AMIs. Also, Snowball Edge devices cannot make use of additional EBS volumes, so don’t bother including them in your AMI. My AMI is ready within minutes (to learn more about how to create AMIs, read Creating an Amazon EBS-Backed Linux AMI):

Now I am ready to order my first device!

Ordering a Snowball Edge Device
The ordering procedure lets me designate a shipping address and specify how I would like my Snowball device to be configured. I open the AWS Snowball Console and click Create job:

I specify the job type (they all support EC2 compute instances):

Then I select my shipping address, entering a new one if necessary (come and visit me):

Next, I define my job. I give it a name (SJ1), select the 100 TB device, and pick the S3 bucket that will receive data when the device is returned to AWS:

Now comes the fun part! I click Enable compute with EC2 and select the AMIs to be loaded on the Snowball Edge:

I click Add an AMI and find the one that I created earlier:

I can add up to ten AMIs to my job, but will stop at one for this post:

Next, I set up my IAM role and configure encryption:

Then I configure the optional SNS notifications. I can choose to receive notification for a wide variety of job status values:

My job is almost ready! I review the settings and click Create job to create it:

Connecting and Configuring the Device
After I create the job, I wait until my Snowball Edge device arrives. I connect it to my network, power it on, and then unlock it using my manifest and device code, as detailed in Unlock the Snowball Edge. Then I configure my EC2 CLI to use the EC2 endpoint on the device and launch an instance. Since I configured my AMI for SSH access, I can connect to it as if it were an EC2 instance in the cloud.

Things to Know
Here are a couple of things to keep in mind:

Long-Term Usage – You can keep the Snowball Edge devices on your premises and hard at work for as long as you would like. You’ll be billed for a one-time setup fee for each job; after 10 days you will pay an additional, per-day fee for each device. If you want to keep a device for an extended period of time, you can also pay upfront as part of a one or three year commitment.

Dev/Test – You should be able to do much of your development and testing on an EC2 instance running in the cloud; some of our early users are working in this way as part of a “Digital Twin” strategy.

S3 Access – Each Snowball Edge device includes an S3-compatible endpoint that you can access from your on-device code. You can also make use of existing S3 tools and applications.

Now Available
You can start ordering devices today and make use of this exciting new AWS feature right away.

Jeff;