New – Managed Databases for Amazon Lightsail

Amazon Lightsail makes it easy for you to get started with AWS. You choose the operating system (and optional application) that you want to run, pick an instance plan, and create an instance, all in a matter of minutes. Lightsail offers low, predictable pricing, with instance plans that include compute power, storage, and data transfer:

Managed Databases
Today we are making Lightsail even more useful by giving you the ability to create a managed database with a couple of clicks. This has been one of our top customer requests and I am happy to be able to share this news.

This feature is going to be of interest to a very wide range of current and future Lightsail users, including students, independent developers, entrepreneurs, and IT managers. We’ve addressed the most common and complex issues that arise when setting up and running a database. As you will soon see, we have simplified and fine-tuned the process of choosing, launching, securing, accessing, monitoring, and maintaining a database!

Each Lightsail database bundle has a fixed, monthly price that includes the database instance, a generous amount of SSD-backed storage, a terabyte or more of data transfer to the Internet and other AWS regions, and automatic backups that give you point-in-time recovery for a 7-day period. You can also create manual database snapshots that are billed separately.

Creating a Managed Database
Let’s walk through the process of creating a managed database and loading an existing MySQL backup into it. I log in to the Lightsail Console and click Databases to get started. Then I click Create database to move forward:

I can see and edit all of the options at a glance. I choose a location, a database engine and version, and a plan, enter a name, and click Create database (all of these options have good defaults; a single click often suffices):

We are launching with support for MySQL 5.6 and 5.7, and will add support for PostgreSQL 9.6 and 10 very soon. The Standard database plan creates a database in one Availability Zone with no redundancy; the High Availability plan also creates a presence in a second AZ, and is recommended for production use.
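If you prefer to script the launch, the same steps can be expressed with the Lightsail CLI. Here is a rough sketch (the blueprint and bundle IDs are illustrative; the two get-* commands list the valid values, and the names and password are placeholders):

aws lightsail get-relational-database-blueprints   # available engines and versions
aws lightsail get-relational-database-bundles      # available plans (price, memory, storage)

aws lightsail create-relational-database \
  --relational-database-name Database-Oregon-1 \
  --availability-zone us-west-2a \
  --relational-database-blueprint-id mysql_5_7 \
  --relational-database-bundle-id micro_1_0 \
  --master-database-name mydb \
  --master-username dbmasteruser \
  --master-user-password 'choose-a-strong-password'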

Database creation takes just a few minutes, the status turns to Available, and my database is ready to use:

I click on Database-Oregon-1 to see the connection details and to access other management information & tools:

I’m ready to connect! I create an SSH connection to my Lightsail instance, ensure that the mysql package is installed, and connect using the information above (read Connecting to Your MySQL Database to learn more):
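For reference, the shell side of that step looks something like this (the endpoint and user name are placeholders for the values shown on the database's connection page; Amazon Linux is assumed for the package install):

sudo yum install -y mysql        # use apt-get install mysql-client on Ubuntu
mysql -h <database-endpoint> -u <master-username> -p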

Now I want to import some existing data into my database. Lightsail lets me enable Data import mode in order to defer any backup or maintenance operations:

Enabling data import mode deletes any existing automatic snapshots; you may want to take a manual snapshot before starting your import if you are importing fresh data into an existing database.

I have a large (13 GB), ancient (2013-era) MySQL backup from a long-dead personal project; I download it from S3, uncompress it, and import it:
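Here's a sketch of those three steps from the shell; the bucket, file name, and connection details are placeholders:

aws s3 cp s3://<my-backup-bucket>/project-backup-2013.sql.gz .
gunzip project-backup-2013.sql.gz

# Loading 13 GB takes a while; data import mode keeps backups and maintenance out of the way.
mysql -h <database-endpoint> -u <master-username> -p <database-name> < project-backup-2013.sql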

I can watch the metrics while the import is underway:

After the import is complete I disable data import mode, and I can run queries against my tables:

To learn more, read Importing Data into Your Database.

Lightsail manages all routine database operations. If I make a mistake and mess up my data, I can use the Emergency Restore to create a fresh database instance from an earlier point in time:

I can rewind by up to 7 days, but no earlier than the point at which I last disabled data import mode.

I can also take snapshots, and use them later to create a fresh database instance:
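Both operations are also scriptable. Here's a hedged sketch of the snapshot path using the Lightsail CLI (the names are placeholders; check the CLI reference for the full set of options):

# Take a manual snapshot of the running database:
aws lightsail create-relational-database-snapshot \
  --relational-database-name Database-Oregon-1 \
  --relational-database-snapshot-name Database-Oregon-1-snap1

# Later, create a fresh database from that snapshot:
aws lightsail create-relational-database-from-snapshot \
  --relational-database-name Database-Oregon-1-restored \
  --relational-database-snapshot-name Database-Oregon-1-snap1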

Things to Know
Here are a couple of things to keep in mind when you use this new feature:

Engine Versions – We plan to support the two latest versions of MySQL, and will do the same for other database engines as we make them available.

High Availability – As is always the case for production AWS systems, you should use the High Availability option in order to maintain a database footprint that spans two Zones. You can switch between Standard and High Availability using snapshots.

Scaling Storage – You can scale to a larger database instance by creating and then restoring a snapshot.

Data Transfer – Data transfer to and from Lightsail instances in the same AWS Region does not count against the usage that is included in your plan.

Amazon RDS – This feature shares core technology with Amazon RDS, and benefits from our operational experience with that family of services.

Available Now
Managed databases are available today in all AWS Regions where Lightsail is available:

Jeff;

re:Invent 2018 – 55 Days to Go….

As I write this, there are just 55 calendar days until AWS re:Invent 2018. My colleagues and I are working flat-out to bring you the best possible learning experience and I want to give you a quick update on a couple of things…

Transportation – Customer Obsession is the first Amazon Leadership Principle and we take your feedback seriously! The re:Invent 2018 campus is even bigger this year, and our transportation system has been tuned and scaled to match. This includes direct shuttle routes from venue to venue so that you don’t spend time waiting at other venues, access to real-time transportation info from within the re:Invent app, and on-site signage. The mobile app will even help you to navigate to your sessions while letting you know if you are on time. If you are feeling more independent and don’t want to ride the shuttles, we’ll have partnerships with ridesharing companies including Lyft and Uber. Visit the re:Invent Transportation page to learn more about our transportation plans, routes, and options.

Reserved Seating – In order to give you as many opportunities to see the technical content that matters the most to you, we are bringing back reserved seating. You will be able to make reservations starting at 10 AM PT on Thursday, October 11, so mark your calendars. Reserving a seat is the best way to ensure that you will get a seat in your favorite session without waiting in a long line, so be sure to arrive at least 10 minutes before the scheduled start. As I have mentioned before, we have already scheduled repeats of the most popular sessions, and made them available for reservation in the Session Catalog. Repeats will take place all week in all re:Invent venues, along with overflow sessions in our Content Hubs (centralized overflow rooms in every venue). We will also stream live content to the Content Hubs as the sessions fill up.

Trivia Night – Please join me at 7:30 PM on Wednesday in the Venetian Theatre for the first-ever Camp re:Invent Trivia Night. Come and test your re:Invent and AWS knowledge to see if you and your team can beat me at trivia (that should not be too difficult). The last person standing gets bragging rights and an awesome prize.

How to re:Invent – Whether you are a first-time attendee or a veteran re:Invent attendee, please take the time to watch our How to re:Invent videos. We want to make sure that you arrive fully prepared, ready to learn about the latest and greatest AWS services, meet your peers and members of the AWS teams, and to walk away with the knowledge and the skills that will help you to succeed in your career.

See you in Vegas!

Jeff;

Learn about AWS – October AWS Online Tech Talks

AWS Tech Talks

AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. Join us this month to learn about AWS services and solutions. We’ll have experts online to help answer any questions you may have.

Featured this month: check out the webinars under AR/VR, End-User Computing, and Industry Solutions. Also, register for our second fireside chat discussion on Amazon Redshift.

Register today!

Note – All sessions are free and in Pacific Time.

Tech talks featured this month:

AR/VR

October 16, 2018 | 01:00 PM – 02:00 PM PT – Creating and Publishing AR, VR and 3D Applications with Amazon Sumerian – Learn about Amazon Sumerian, the fastest and easiest way to create and publish immersive applications.

Compute

October 25, 2018 | 09:00 AM – 10:00 AM PT – Running Cost Effective Batch Workloads with AWS Batch and Amazon EC2 Spot Instances – Learn how to run complex workloads, such as analytics, image processing, and machine learning applications efficiently and cost-effectively.

Data Lakes & Analytics

October 18, 2018 | 01:00 PM – 01:45 PM PT – Fireside Chat: The Evolution of Amazon Redshift – Join Vidhya Srinivasan, General Manager of Redshift, in a candid conversation as she discusses the product’s evolution, recently shipped features, and improvements.

October 15, 2018 | 01:00 PM – 01:45 PM PT – Customer Showcase: The Secret Sauce Behind GroupM’s Marketing Analytics Platform – Learn how GroupM – the world’s largest media investment group with more than $113.8bn in billings – created a modern data analytics platform using Amazon Redshift and Matillion.

Databases

October 15, 2018 | 11:00 AM – 12:00 PM PT – Supercharge Query Caching with AWS Database Services – Learn how AWS database services, including Amazon Relational Database Service (RDS) and Amazon ElastiCache, work together to make it simpler to add a caching layer to your database, delivering high availability and performance for query-intensive apps.

October 17, 2018 | 09:00 AM – 09:45 AM PT – How to Migrate from Cassandra to DynamoDB Using the New Cassandra Connector in the AWS Database Migration Service – Learn how to migrate from Cassandra to DynamoDB using the new Cassandra Connector in the AWS Database Migration Service.

End-User Computing

October 23, 2018 | 01:00 PM – 02:00 PM PT – How to use Amazon Linux WorkSpaces for Agile Development – Learn how to integrate your Amazon Linux WorkSpaces development environment with other AWS Developer Tools.

Enterprise & Hybrid

October 23, 2018 | 09:00 AM – 10:00 AM PT – Migrating Microsoft SQL Server 2008 Databases to AWS – Learn how you can provision, monitor, and manage Microsoft SQL Server on AWS.

Industry Solutions

October 24, 2018 | 11:00 AM – 12:00 PM PT – Tape-to-Cloud Media Migration Walkthrough – Learn from media-specialist SAs as they walk through a content migration solution featuring machine learning and media services to automate processing, packaging, and metadata extraction.

IoT

October 22, 2018 | 01:00 PM – 01:45 PM PT – Using Asset Monitoring in Industrial IoT Applications – Learn how AWS IoT is used in industrial applications to understand asset health and performance.

Machine Learning

October 15, 2018 | 09:00 AM – 09:45 AM PT – Build Intelligent Applications with Machine Learning on AWS – Learn how to accelerate development of AI applications using machine learning on AWS.

Management Tools

October 24, 2018 | 01:00 PM – 02:00 PM PT – Implementing Governance and Compliance in a Multi-Account, Multi-Region Scenario – Learn AWS Config best practices on how to implement governance and compliance in a multi-account, multi-Region scenario.

Networking

October 23, 2018 | 11:00 AM – 11:45 AM PT – How to Build Intelligent Web Applications @ Edge – Explore how Lambda@Edge can help you deliver low latency web applications.

October 25, 2018 | 01:00 PM – 02:00 PM PT – Deep Dive on Bring Your Own IP – Learn how to easily migrate legacy applications that use IP addresses with Bring Your Own IP.

re:Invent

October 10, 2018 | 08:00 AM – 08:30 AM PT – Episode 6: Mobile App & Reserved Seating – Discover new innovations coming to the re:Invent 2018 mobile experience this year. Plus, learn all about reserved seating for your priority sessions.

Security, Identity & Compliance

October 22, 2018 | 11:00 AM – 11:45 AM PT – Getting to Know AWS Secrets Manager – Learn how to protect your secrets used to access your applications, services, and IT resources.

Serverless

October 17, 2018 | 11:00 AM – 12:00 PM PT – Build Enterprise-Grade Serverless Apps – Learn how developers can design, develop, deliver, and monitor cloud applications as they take advantage of the AWS serverless platform and developer toolset.

Storage

October 24, 2018 | 09:00 AM – 09:45 AM PT – Deep Dive: New AWS Storage Gateway Hardware Appliance – Learn how you can use the AWS Storage Gateway hardware appliance to connect on-premises applications to AWS storage.

Saving Koalas Using Genomics Research and Cloud Computing

Today is Save the Koala Day and a perfect time to tell you about some noteworthy and ground-breaking research that was made possible by AWS Research Credits and the AWS Cloud.

Five years ago, a research team led by Dr. Rebecca Johnson (Director of the Australian Museum Research Institute) set out to learn more about koala populations, genetics, and diseases. Because the koala is a biologically unique animal with a highly specialized diet, maintaining a healthy and genetically diverse population is a key element of any conservation plan. In addition to characterizing the genetic diversity of koala populations, the team wanted to strengthen Australia’s ability to lead large-scale genome sequencing projects.

Inside the Koala Genome
Last month the team published their results in Nature Genetics. Their paper (Adaptation and Conservation Insights from the Koala Genome) identifies the genomic basis for the koala’s unique biology. Even though I had to look up dozens of concepts as I read the paper, I was able to come away with a decent understanding of what they found. Here’s my lay summary:

Toxic Diet – The eucalyptus leaves favored by koalas contain a myriad of substances that are toxic to other species if ingested. Gene expansions and selection events in genes encoding enzymes with detoxification functions enable koalas to rapidly detoxify these substances, making them able to subsist on a diet favored by no other animal. The genetic repertoire underlying this accelerated metabolism also renders common anti-inflammatory medications and antibiotics ineffective for treating ailing koalas.

Food Choice – Koalas are, as I noted earlier, very picky eaters. Genetically speaking, this comes about because their senses of smell and taste are enhanced, with 6 genes giving them the ability to discriminate between plant metabolites on the basis of smell. The researchers also found that koalas have a gene that helps them to select eucalyptus leaves with a high water content, and another that enhances their ability to perceive bitter and umami flavors.

Reproduction – Specific genes which control ovulation and birth were identified. In the interest of frugality, female koalas produce eggs only when needed.

Koala Milk – Newborn koalas are the size of a kidney bean and weigh less than half of a gram! They nurse for about a year, taking milk that changes in composition over time, with a potential genetic correlation. The researchers also identified genes known to have antimicrobial functions.

Immune Systems – The researchers identified genes that formed the basis for resistance, immunity, or susceptibility to certain diseases that affect koalas. They also found evidence of a “genomic invasion” (their words) where the koala retrovirus actually inserts itself into the genome.

Genetic Diversity – The researchers also examined how geological events like habitat barriers and surface temperatures have shaped genetic diversity and population evolution. They found that koalas from some areas had markedly less genetic diversity than those from others, with evidence that allowed them to correlate diversity (or the lack of it) with natural barriers such as the Hunter Valley.

Powered by AWS
Creating a complete gene sequence requires (among many other things) an incredible amount of compute power and a vast amount of storage.

While I don’t fully understand the process, I do know that it works on a bottom-up basis. The DNA samples are broken up into manageable pieces, each one containing several tens of thousands of base pairs. A variety of chemicals are applied to cause the different base constituents (A, T, C, or G) to fluoresce, and the resulting emission is captured, measured, and stored. Since this study generated a koala reference genome, the sequencing reads were assembled using an overlap-layout-consensus assembly algorithm known as Falcon, which was run on AWS. The koala genome comes in at 3.42 billion base pairs, slightly larger than the human genome.

I’m happy to report that this groundbreaking work was performed on AWS. The research team used cfnCluster to create multiple clusters, each with 500 to 1000 vCPUs, and running Falcon from Pacific Biosciences. All in all, the team used 3 million EC2 core hours, most of which were EC2 Spot Instances. Having access to flexible, low-cost compute power allowed the bioinformatics team to experiment with the configuration of the Falcon pipeline as they tuned and adapted it to their workload.

We are happy to have done our small part to help with this interesting and valuable research!

Jeff;

Now Available – Amazon EC2 High Memory Instances with 6, 9, and 12 TB of Memory, Perfect for SAP HANA

The Altair 8800 computer that I built in 1977 had just 4 kilobytes of memory. Today I was able to use an EC2 instance with 12 terabytes (12 tebibytes to be exact) of memory, more than 3 billion times as much!

The new Amazon EC2 High Memory Instances let you take advantage of other AWS services including Amazon Elastic Block Store (EBS), Amazon Simple Storage Service (S3), AWS Identity and Access Management (IAM), Amazon CloudWatch, and AWS Config. They are designed to allow AWS customers to run large-scale SAP HANA installations, and can be used to build production systems that provide enterprise-grade data protection and business continuity.

Here are the specs:

Instance Name    Memory    Logical Processors    Dedicated EBS Bandwidth    Network Bandwidth
u-6tb1.metal     6 TiB     448                   14 Gbps                    25 Gbps
u-9tb1.metal     9 TiB     448                   14 Gbps                    25 Gbps
u-12tb1.metal    12 TiB    448                   14 Gbps                    25 Gbps

Each Logical Processor is a hyperthread on one of the 224 physical CPU cores. All three sizes are powered by the latest generation Intel® Xeon® Platinum 8176M (Skylake) processors running at 2.1 GHz (with Turbo Boost to 3.80 GHz), and are available as EC2 Dedicated Hosts for launch within a new or existing Amazon Virtual Private Cloud (VPC). You can launch them using the AWS Command Line Interface (CLI) or the EC2 API, and manage them with those tools or in the EC2 Console.
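Here's a rough sketch of that CLI path (the AMI, subnet, key pair, and host ID are placeholders):

# Allocate a Dedicated Host for the 12 TiB size in a specific Availability Zone:
aws ec2 allocate-hosts \
  --instance-type u-12tb1.metal \
  --availability-zone us-east-1a \
  --quantity 1

# Launch an instance onto that host:
aws ec2 run-instances \
  --instance-type u-12tb1.metal \
  --image-id ami-xxxxxxxx \
  --subnet-id subnet-xxxxxxxx \
  --key-name my-keypair \
  --placement Tenancy=host,HostId=h-xxxxxxxxxxxxxxxxx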

The instances are EBS-Optimized by default, and give you low-latency access to encrypted and unencrypted EBS volumes. You can choose between Provisioned IOPS, General Purpose (SSD), and Streaming Magnetic volumes, and can attach multiple volumes, each with a distinct type and size, to each instance.

SAP HANA in Minutes
The EC2 High Memory instances are certified by SAP for OLTP and OLAP workloads such as S/4HANA, Suite on HANA, BW/4HANA, BW on HANA, and Datamart (see the SAP HANA Hardware Directory for more information).

We ran the SAP Standard Application Benchmark and measured the instances at 480,600 SAPS, making them suitable for very large workloads. Here’s an excerpt from the benchmark:

In anticipation of today’s launch, the EC2 team provisioned a u-12tb1.metal instance for my AWS account and I located it in the Dedicated Hosts section of the EC2 Console:

Following the directions in the SAP HANA on AWS Quick Start, I copy the Host Reservation ID, hop over to the CloudFormation Console and click Create Stack to get started. I choose my template, give my stack a name, and enter all of the necessary parameters, including the ID that I copied, and click Next to proceed:

On the next page I indicate that I want to tag my resources, leave everything else as-is, and click Next:

I review my settings, acknowledge that the stack might create IAM resources, and click Next to create the stack:

The AWS resources are created and SAP HANA is installed, all in less than 40 minutes:
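If you would rather drive the Quick Start from the command line, the console steps map to a single CloudFormation call. This is only a sketch; the template URL and the parameter key are placeholders for the actual values documented in the Quick Start:

aws cloudformation create-stack \
  --stack-name sap-hana-quickstart \
  --template-url https://<quick-start-bucket>.s3.amazonaws.com/<sap-hana-template>.yaml \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=<HostIdParameter>,ParameterValue=h-xxxxxxxxxxxxxxxxx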

Using an EC2 instance on the public subnet of my VPC, I can access the new instance. Here’s the memory:

And here’s the CPU info:
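Those two views come from ordinary Linux tools; on the instance they look something like this:

free -g            # total memory, reported in GiB (roughly 12 TiB here)
nproc              # logical processors (448 on u-12tb1.metal)
lscpu              # CPU model, sockets, cores, and threads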

I can also run an hdbsql query:

SELECT 
  DISTINCT HOST, CAST(VALUE/1024/1024/1024 AS INTEGER) AS TOTAL_MEMORY_GB 
  FROM SYS.M_MEMORY
  WHERE NAME='SYSTEM_MEMORY_SIZE';

Here’s the output, showing that SAP HANA has access to 12 TiB of memory:

Another option is to have the template create a second EC2 instance, this one running Windows on a public subnet, and accessible via RDP:

I could install HANA Studio on this instance and use its visual interface to run my SAP HANA queries.

The Quick Start implementation uses high performance SSD-based EBS storage volumes for all of your data. This gives you the power to switch to a larger instance in minutes without having to migrate any data.

Available Now
Just like the existing SAP-certified X1 and X1e instances, the EC2 High Memory instances are very cost-effective. For example, the effective hourly rate for the All Upfront 3-Year Reservation for a u-12tb1.metal Dedicated Host in the US East (N. Virginia) Region is $30.539.

These instances are now available in the US East (N. Virginia) and Asia Pacific (Tokyo) Regions as Dedicated Hosts with a 3-year term, and will be available soon in the US West (Oregon), Europe (Ireland), and AWS GovCloud (US) Regions. If you are ready to get started, contact your AWS account team or use the Contact Us page to make a request.

In the Works
We’re not stopping at 12 TiB, and are planning to launch instances with 18 TiB and 24 TiB of memory in 2019.

Jeff;

PS – If you have applications that might need multiple terabytes in the future but can run comfortably in less memory today, be sure to consider the R5, X1, and X1e instances.

 

Meet the Newest AWS Heroes (September 2018 Edition)

AWS Heroes are passionate AWS enthusiasts who use their extensive knowledge to teach others about all things AWS across a range of mediums. Many Heroes eagerly share knowledge online via forums, social media, or blogs; while others lead AWS User Groups or organize AWS Community Day events. Their extensive efforts to spread AWS knowledge have a significant impact within their local communities. Today we are excited to introduce the newest AWS Heroes:

Jaroslaw Zielinski – Poznan, Poland

AWS Community Hero Jaroslaw Zielinski is a Solutions Architect at Vernity in Poznan (Poland), where his responsibility is to support customers on their road to the cloud using cloud adoption patterns. Jaroslaw is a leader of AWS User Group Poland operating in 7 different cities around Poland. Additionally, he connects the community with the biggest IT conferences in the region – PLNOG, DevOpsDay, Amazon@Innovation to name just a few.

He supports numerous projects connected with evangelism, like Zombie Apocalypse Workshops or Cloud Builder’s Day. Bringing together various IT communities, he hosts the Cloud & Datacenter Day conference – the biggest community conference in Poland. In addition, his passion for IT carries over into his own blog, Popołudnie w Sieci. He also publishes in various professional papers.

 

Jerry Hargrove – Kalama, USA

AWS Community Hero Jerry Hargrove is a cloud architect, developer and evangelist who guides companies on their journey to the cloud, helping them to build smart, secure and scalable applications. Currently with Lucidchart, a leading visual productivity platform, Jerry is a thought leader in the cloud industry and specializes in AWS product and services breakdowns, visualizations and implementation. He brings with him over 20 years of experience as a developer, architect & manager for companies like Rackspace, AWS and Intel.

You can find Jerry on Twitter compiling his famous sketch notes and creating Lucidchart templates that pinpoint practical tips for working in the cloud and helping developers increase efficiency. Jerry is the founder of the AWS Meetup Group in Salt Lake City, often contributes to meetups in the Pacific Northwest and San Francisco Bay area, and speaks at developer conferences worldwide. Jerry holds several professional AWS certifications.

 

Martin Buberl – Copenhagen, Denmark

AWS Community Hero Martin Buberl brings the New York hustle to Scandinavia. As VP Engineering at Trustpilot he is on a mission to build the best engineering teams in the Nordics and Baltics. With a person-centered approach, his focus is on high-leverage activities to maximize impact, customer value and iteration speed — and utilizing cloud technologies checks all those boxes.

His cloud-obsession made him an early adopter and evangelist of all types of AWS services throughout his career. Nowadays, he is especially passionate about Serverless, Big Data and Machine Learning and excited to leverage the cloud to transform those areas.

Martin is an AWS User Group Leader, organizer of the AWS Community Day Nordics and founder of the AWS Community Nordics Slack. He has spoken at multiple international AWS events — AWS User Groups, AWS Community Days and AWS Global Summits — and is looking forward to continuing to share his passion for software engineering and cloud technologies with the community.

To learn more about the AWS Heroes program or to connect with an AWS Hero in your community, click here.

New – Parallel Query for Amazon Aurora

Amazon Aurora is a relational database that was designed to take full advantage of the abundance of networking, processing, and storage resources available in the cloud. While maintaining compatibility with MySQL and PostgreSQL on the user-visible side, Aurora makes use of a modern, purpose-built distributed storage system under the covers. Your data is striped across hundreds of storage nodes distributed over three distinct AWS Availability Zones, with two copies per zone, on fast SSD storage. Here’s what this looks like (extracted from Getting Started with Amazon Aurora):

New Parallel Query
When we launched Aurora we also hinted at our plans to apply the same scale-out design principle to other layers of the database stack. Today I would like to tell you about our next step along that path.

Each node in the storage layer pictured above also includes plenty of processing power. Aurora is now able to make great use of that processing power by taking your analytical queries (generally those that process all or a large part of a good-sized table) and running them in parallel across hundreds or thousands of storage nodes, with speed benefits approaching two orders of magnitude. Because this new model reduces network, CPU, and buffer pool contention, you can run a mix of analytical and transactional queries simultaneously on the same table while maintaining high throughput for both types of queries.

The instance class determines the number of parallel queries that can be active at a given time:

  • db.r*.large – 1 concurrent parallel query session
  • db.r*.xlarge – 2 concurrent parallel query sessions
  • db.r*.2xlarge – 4 concurrent parallel query sessions
  • db.r*.4xlarge – 8 concurrent parallel query sessions
  • db.r*.8xlarge – 16 concurrent parallel query sessions
  • db.r4.16xlarge – 16 concurrent parallel query sessions

You can use the aurora_pq parameter to enable and disable the use of parallel queries at the global and the session level.

Parallel queries enhance the performance of over 200 types of single-table predicates and hash joins. The Aurora query optimizer will automatically decide whether to use Parallel Query based on the size of the table and the amount of table data that is already in memory; you can also use the aurora_pq_force session variable to override the optimizer for testing purposes.

Parallel Query in Action
You will need to create a fresh cluster in order to make use of the Parallel Query feature. You can create one from scratch, or you can restore a snapshot.
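Here's a hedged sketch of the snapshot-restore path from the CLI; the identifiers are placeholders, and parallelquery is the engine mode that turns the feature on:

# Restore a cluster snapshot into a new Parallel Query cluster:
aws rds restore-db-cluster-from-snapshot \
  --db-cluster-identifier aurora-pq-demo \
  --snapshot-identifier tpch-100gb-snapshot \
  --engine aurora \
  --engine-mode parallelquery

# Add an instance to the new cluster:
aws rds create-db-instance \
  --db-cluster-identifier aurora-pq-demo \
  --db-instance-identifier aurora-pq-demo-1 \
  --db-instance-class db.r4.2xlarge \
  --engine aurora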

To create a cluster that supports Parallel Query, I simply choose Provisioned with Aurora parallel query enabled as the Capacity type:

I used the CLI to restore a 100 GB snapshot for testing, and then explored one of the queries from the TPC-H benchmark. Here’s the basic query:

SELECT
  l_orderkey,
  SUM(l_extendedprice * (1-l_discount)) AS revenue,
  o_orderdate,
  o_shippriority

FROM customer, orders, lineitem

WHERE
  c_mktsegment='AUTOMOBILE'
  AND c_custkey = o_custkey
  AND l_orderkey = o_orderkey
  AND o_orderdate < date '1995-03-13'
  AND l_shipdate > date '1995-03-13'

GROUP BY
  l_orderkey,
  o_orderdate,
  o_shippriority

ORDER BY
  revenue DESC,
  o_orderdate LIMIT 15;

The EXPLAIN command shows the query plan, including the use of Parallel Query:

+----+-------------+----------+------+-------------------------------+------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table    | type | possible_keys                 | key  | key_len | ref  | rows      | Extra                                                                                                                          |
+----+-------------+----------+------+-------------------------------+------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
|  1 | SIMPLE      | customer | ALL  | PRIMARY                       | NULL | NULL    | NULL |  14354602 | Using where; Using temporary; Using filesort                                                                                   |
|  1 | SIMPLE      | orders   | ALL  | PRIMARY,o_custkey,o_orderdate | NULL | NULL    | NULL | 154545408 | Using where; Using join buffer (Hash Join Outer table orders); Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra)   |
|  1 | SIMPLE      | lineitem | ALL  | PRIMARY,l_shipdate            | NULL | NULL    | NULL | 606119300 | Using where; Using join buffer (Hash Join Outer table lineitem); Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra) |
+----+-------------+----------+------+-------------------------------+------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.01 sec)

Here is the relevant part of the Extra column:

Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra)

The query runs in less than 2 minutes when Parallel Query is used:

+------------+-------------+-------------+----------------+
| l_orderkey | revenue     | o_orderdate | o_shippriority |
+------------+-------------+-------------+----------------+
|   92511430 | 514726.4896 | 1995-03-06  |              0 |
|  593851010 | 475390.6058 | 1994-12-21  |              0 |
|  188390981 | 458617.4703 | 1995-03-11  |              0 |
|  241099140 | 457910.6038 | 1995-03-12  |              0 |
|  520521156 | 457157.6905 | 1995-03-07  |              0 |
|  160196293 | 456996.1155 | 1995-02-13  |              0 |
|  324814597 | 456802.9011 | 1995-03-12  |              0 |
|   81011334 | 455300.0146 | 1995-03-07  |              0 |
|   88281862 | 454961.1142 | 1995-03-03  |              0 |
|   28840519 | 454748.2485 | 1995-03-08  |              0 |
|  113920609 | 453897.2223 | 1995-02-06  |              0 |
|  377389669 | 453438.2989 | 1995-03-07  |              0 |
|  367200517 | 453067.7130 | 1995-02-26  |              0 |
|  232404000 | 452010.6506 | 1995-03-08  |              0 |
|   16384100 | 450935.1906 | 1995-03-02  |              0 |
+------------+-------------+-------------+----------------+
15 rows in set (1 min 53.36 sec)

I can disable Parallel Query for the session (I can use an RDS custom cluster parameter group for a longer-lasting effect):

set SESSION aurora_pq=OFF;

The query runs considerably slower without it:

+------------+-------------+-------------+----------------+
| l_orderkey | o_orderdate | revenue     | o_shippriority |
+------------+-------------+-------------+----------------+
|   92511430 | 1995-03-06  | 514726.4896 |              0 |
...
|   16384100 | 1995-03-02  | 450935.1906 |              0 |
+------------+-------------+-------------+----------------+
15 rows in set (1 hour 25 min 51.89 sec)

This was on a db.r4.2xlarge instance; other instance sizes, data sets, access patterns, and queries will perform differently. I can also override the query optimizer and insist on the use of Parallel Query for testing purposes:

set SESSION aurora_pq_force=ON;

Things to Know
Here are a couple of things to keep in mind when you start to explore Amazon Aurora Parallel Query:

Engine Support – We are launching with support for MySQL 5.6, and are working on support for MySQL 5.7 and PostgreSQL.

Table Formats – The table row format must be COMPACT; partitioned tables are not supported.

Data Types – The TEXT, BLOB, and GEOMETRY data types are not supported.

DDL – The table cannot have any pending fast online DDL operations.

Cost – You can make use of Parallel Query at no extra charge. However, because it accesses storage directly, there is a possibility that your IO cost will increase.

Give it a Shot
This feature is available now and you can start using it today!

Jeff;

 

AWS Data Transfer Price Reductions – Up to 34% (Japan) and 28% (Australia)

I’ve got good news for AWS customers who make use of our Asia Pacific (Tokyo) and Asia Pacific (Sydney) Regions. Effective September 1, 2018 we are reducing prices for data transfer from Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and Amazon CloudFront by up to 34% in Japan and 28% in Australia.

EC2 and S3 Data Transfer
Here are the new prices for data transfer from EC2 and S3 to the Internet:

EC2 & S3 Data Transfer Out to Internet    Japan                              Australia
                                          Old Rate   New Rate   Change      Old Rate   New Rate   Change
Up to 1 GB / Month                        $0.000     $0.000       0%        $0.000     $0.000       0%
Next 9.999 TB / Month                     $0.140     $0.114     -19%        $0.140     $0.114     -19%
Next 40 TB / Month                        $0.135     $0.089     -34%        $0.135     $0.098     -27%
Next 100 TB / Month                       $0.130     $0.086     -34%        $0.130     $0.094     -28%
Greater than 150 TB / Month               $0.120     $0.084     -30%        $0.120     $0.092     -23%

You can consult the EC2 Pricing and S3 Pricing pages for more information.
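To make the tiers concrete, here's a small shell sketch that estimates the monthly EC2/S3 transfer-out charge for Japan at the new rates (the GB tier boundaries are approximations of the table above, and the 60 TB figure is just an example):

usage_gb=61440   # example: roughly 60 TB transferred out in a month

awk -v gb="$usage_gb" 'BEGIN {
  split("1 10239 40960 102400", size)        # tier sizes in GB; the last tier is unbounded
  split("0 0.114 0.089 0.086 0.084", rate)   # new per-GB rates for Japan
  cost = 0
  for (i = 1; i <= 4 && gb > 0; i++) {
    used = (gb < size[i]) ? gb : size[i]
    cost += used * rate[i]
    gb -= used
  }
  cost += gb * rate[5]                       # anything above 150 TB
  printf "Estimated monthly charge: $%.2f\n", cost
}'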

CloudFront Data Transfer
Here are the new prices for data transfer from CloudFront edge nodes to the Internet:

CloudFront Data Transfer Out to Internet    Japan                              Australia
                                            Old Rate   New Rate   Change      Old Rate   New Rate   Change
Up to 10 TB / Month                         $0.140     $0.114     -19%        $0.140     $0.114     -19%
Next 40 TB / Month                          $0.135     $0.089     -34%        $0.135     $0.098     -27%
Next 100 TB / Month                         $0.120     $0.086     -28%        $0.120     $0.094     -22%
Next 350 TB / Month                         $0.100     $0.084     -16%        $0.100     $0.092      -8%
Next 524 TB / Month                         $0.080     $0.080       0%        $0.095     $0.090      -5%
Next 4 PB / Month                           $0.070     $0.070       0%        $0.090     $0.085      -6%
Over 5 PB / Month                           $0.060     $0.060       0%        $0.085     $0.080      -6%

Visit the CloudFront Pricing page for more information.

We have also reduced the price of data transfer from CloudFront to your Origin. The price for CloudFront Data Transfer to Origin from edge locations in Australia has been reduced by 20% to $0.080 per GB. This rate applies to content uploads via POST and PUT.

Things to Know
Here are a couple of interesting things that you should know about AWS and data transfer:

AWS Free Tier – You can use the AWS Free Tier to get started with, and to learn more about, EC2, S3, CloudFront, and many other AWS services. The AWS Getting Started page contains lots of resources to help you with your first project.

Data Transfer from AWS Origins to CloudFront – There is no charge for data transfers from an AWS origin (S3, EC2, Elastic Load Balancing, and so forth) to any CloudFront edge location.

CloudFront Reserved Capacity Pricing – If you routinely use CloudFront to deliver 10 TB or more of content per month, you should investigate our Reserved Capacity pricing. You can receive a significant discount by committing to transfer 10 TB or more of content from a single region, with additional discounts at higher levels of usage. To learn more or to sign up, simply Contact Us.

Jeff;

 

New – AWS Storage Gateway Hardware Appliance

AWS Storage Gateway connects your on-premises applications to AWS storage services such as Amazon Simple Storage Service (S3), Amazon Elastic Block Store (EBS), and Amazon Glacier. It runs in your existing virtualized environment and is visible to your applications and your client operating systems as a file share, a local block volume, or a virtual tape library. The resulting hybrid storage model gives our customers the ability to use their AWS Storage Gateways for backup, archiving, disaster recovery, cloud data processing, storage tiering, and migration.

New Hardware Appliance
Today we are making Storage Gateway available as a hardware appliance, adding to the existing support for VMware ESXi, Microsoft Hyper-V, and Amazon EC2. This means that you can now make use of Storage Gateway in situations where you do not have a virtualized environment, server-class hardware or IT staff with the specialized skills that are needed to manage them. You can order appliances from Amazon.com for delivery to branch offices, warehouses, and “outpost” offices that lack dedicated IT resources. Setup (as you will see in a minute) is quick and easy, and gives you access to three storage solutions:

File Gateway – A file interface to Amazon S3, accessible via NFS or SMB. The files are stored as S3 objects, allowing you to make use of specialized S3 features such as lifecycle management and cross-region replication. You can trigger AWS Lambda functions, run Amazon Athena queries, and use Amazon Macie to discover and classify sensitive data.

Volume Gateway – Cloud-backed storage volumes, accessible as local iSCSI volumes. Gateways can be configured to cache frequently accessed data locally, or to store a full copy of all data locally. You can create EBS snapshots of the volumes and use them for disaster recovery or data migration.

Tape Gateway – A cloud-based virtual tape library (VTL), accessible via iSCSI, so you can replace your on-premises tape infrastructure, without changing your backup workflow.

To learn more about each of these solutions, read What is AWS Storage Gateway.

The AWS Storage Gateway Hardware Appliance is based on a specially configured Dell EMC PowerEdge R640 Rack Server that is pre-loaded with AWS Storage Gateway software. It has 2 Intel® Xeon® processors, 128 GB of memory, 6 TB of usable SSD storage for your locally cached data, and redundant power supplies. You can order one from Amazon.com:

If you have an Amazon Business account (they’re free) you can use a purchase order for the transaction. In addition to simplifying deployment, using this standardized configuration helps to assure consistent performance for your local applications.

Hardware Setup
As you know, I like to go hands-on with new AWS products. My colleagues shipped a pre-release appliance to me; I left it under the watchful eye of my CSO (Canine Security Officer) until I was ready to write this post:

I don’t have a server room or a rack, so I set it up on my hobby table for testing:

In addition to the appliance, I also scrounged up a VGA cable, a USB keyboard, a small monitor, and a power adapter (C13 to NEMA 5-15). The adapter is necessary because the cord included with the appliance is intended to plug into a power distribution jack commonly found in a data center. I connected it all up, turned it on, watched it boot up, and then entered a new administrative password.

Following the directions in the documentation, I configured an IPv4 address, using DHCP for convenience:

I captured the IP address for use in the next step, selected Back (the UI is keyboard-driven) and then logged out. This is the only step that takes place on the local console.

Gateway Configuration
At this point I will switch from past to present, and walk you through the configuration process. As directed by the Getting Started Guide, I open the Storage Gateway Console on the same network as the appliance, select the region where I want to create my gateway, and click Get started:

I select File gateway and click Next to proceed:

I select Hardware Appliance as my host platform (I can click Buy on Amazon to purchase one if necessary), and click Next:

Then I enter the IP address of my appliance and click Connect:

I enter a name for my gateway (jbgw1), set the time zone, pick ZFS as my RAID Volume Manager, and click Activate to proceed:

My gateway is activated within a second or two and I can see it in the Hardware section of the console:

At this point I am free to use a console that is not on the same network, so I’ll switch back to my trusty WorkSpace!

Now that my hardware has been activated, I can launch the actual gateway service on it. I select the appliance, and choose Launch Gateway from the Actions menu:

I choose the desired gateway type, enter a name (fgw1) for it, and click Launch gateway:

The gateway will start off in the Offline status, and transition to Online within 3 to 5 minutes. The next step is to allocate local storage by clicking Edit local disks:

Since I am creating a file gateway, all of the local storage is used for caching:

Now I can create a file share on my appliance! I click Create file share, enter the name of an existing S3 bucket, and choose NFS or SMB, then click Next:

I configure a couple of S3 options, request creation of a new IAM role, and click Next:

I review all of my choices and click Create file share:

After I create the share I can see the commands that are used to mount it in each client environment:

I mount the share on my Ubuntu desktop (I had to install the nfs-client package first) and copy a bunch of files to it:
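For reference, the client-side commands look roughly like this on Ubuntu (the gateway IP, bucket name, and mount point are placeholders; the console shows the exact mount command for each file share):

sudo apt-get install -y nfs-common
sudo mkdir -p /mnt/fgw1
sudo mount -t nfs -o nolock,hard <gateway-ip>:/<bucket-name> /mnt/fgw1

# Copy files to the share; the gateway uploads them to the S3 bucket in the background:
cp -r ~/photos /mnt/fgw1/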

Then I visit the S3 bucket and see that the gateway has already uploaded the files:

Finally, I have the option to change the configuration of my appliance. After making sure that all network clients have unmounted the file share, I remove the existing gateway:

And launch a new one:

And there you have it. I installed and configured the appliance, created a file share that was accessible from my on-premises systems, and then copied files to it for replication to the cloud.

Now Available
The Storage Gateway Hardware Appliance is available now and you can purchase one today. Start in the AWS Storage Gateway Console and follow the steps above!

Jeff;

 

 

New – AWS Systems Manager Session Manager for Shell Access to EC2 Instances

It is a very interesting time to be a corporate IT administrator. On the one hand, developers are talking about (and implementing) an idyllic future built on infrastructure as code, where servers and other resources are treated as cattle. On the other hand, legacy systems still must be treated as pets, set up and maintained by hand or with the aid of limited automation. Many of the customers that I speak with are making the transition to the future at a rapid pace, but need to work in the world that exists today. For example, they still need shell-level access to their servers on occasion. They might need to kill runaway processes, consult server logs, fine-tune configurations, or install temporary patches, all while maintaining a strong security profile. They want to avoid the hassle that comes with running Bastion hosts and the risks that arise when opening up inbound SSH ports on the instances.

We’ve already addressed some of the need for shell-level access with the AWS Systems Manager Run Command. This AWS facility gives administrators secure access to EC2 instances. It allows them to create command documents and run them on any desired set of EC2 instances, with support for both Linux and Microsoft Windows. The commands are run asynchronously, with output captured for review.

New Session Manager
Today we are adding a new option for shell-level access. The new Session Manager makes the AWS Systems Manager even more powerful. You can now use a new browser-based interactive shell and a command-line interface (CLI) to manage your Windows and Linux instances. Here’s what you get:

Secure Access – You don’t have to manually set up user accounts, passwords, or SSH keys on the instances and you don’t have to open up any inbound ports. Session Manager communicates with the instances via the SSM Agent across an encrypted tunnel that originates on the instance, and does not require a bastion host.

Access Control – You use IAM policies and users to control access to your instances, and don’t need to distribute SSH keys. You can limit access to a desired time/maintenance window by using IAM’s Date Condition Operators.

Auditability – Commands and responses can be logged to Amazon CloudWatch and to an S3 bucket. You can arrange to receive an SNS notification when a new session is started.

Interactivity – Commands are executed synchronously in a full interactive bash (Linux) or PowerShell (Windows) environment.

Programming and Scripting – In addition to the console access that I will show you in a moment, you can also initiate sessions from the command line (aws ssm ...) or via the Session Manager APIs.
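As a quick sketch of the CLI path (the instance ID is a placeholder, and start-session also requires the Session Manager plugin for the AWS CLI):

# Open an interactive shell on an instance, with no SSH key and no open inbound port:
aws ssm start-session --target i-0123456789abcdef0

# List active sessions and end one when you are finished:
aws ssm describe-sessions --state Active
aws ssm terminate-session --session-id <session-id>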

The SSM Agent running on the EC2 instances must be able to connect to Session Manager’s public endpoint. You can also set up a PrivateLink connection to allow instances running in private VPCs (without Internet access or a public IP address) to connect to Session Manager.

Session Manager in Action
In order to use Session Manager to access my EC2 instances, the instances must be running the latest version (2.3.12 or above) of the SSM Agent. The instance role for the instances must reference a policy that allows access to the appropriate services; you can create your own or use AmazonEC2RoleForSSM. Here are my EC2 instances (sk1 and sk2 are running Amazon Linux; sk3-win and sk4-win are running Microsoft Windows):

Before I run my first command, I open AWS Systems Manager and click Preferences. Since I want to log my commands, I enter the name of my S3 bucket and my CloudWatch log group. If I enter either or both values, the instance policy must also grant access to them:

I’m ready to roll! I click Sessions, see that I have no active sessions, and click Start session to move ahead:

I select a Linux instance (sk1), and click Start session again:

The session opens up immediately:

I can do the same for one of my Windows instances:

The log streams are visible in CloudWatch:

Each stream contains the content of a single session:

In the Works
As usual, we have some additional features in the works for Session Manager. Here’s a sneak peek:

SSH Client – You will be able to create SSH sessions atop Session Manager without opening up any inbound ports.

On-Premises Access – We plan to give you the ability to access your on-premises instances (which must be running the SSM Agent) via Session Manager.

Available Now
Session Manager is available in all AWS regions (including AWS GovCloud) at no extra charge.

Jeff;