I first told you about Polly in late 2016 in my post Amazon Polly – Text to Speech in 47 Voices and 24 Languages. After that AWS re:Invent launch, we added support for Korean, introduced five new voices, and made Polly available in all Regions in the aws partition. We also added whispering, speech marks, a timbre effect, and dynamic range compression.
New WordPress Plugin
Today we are launching a WordPress plugin that uses Polly to create high-quality audio versions of your blog posts. You can access the audio from within the post or in podcast form using a feature that we call Amazon Pollycast! Both options make your content more accessible and can help you to reach a wider audience. This plugin was a joint effort between the AWS team and our friends at AWS Advanced Technology Partner WP Engine.
As you will see, the plugin is easy to install and configure. You can use it with installations of WordPress that you run on your own infrastructure or on AWS. Either way, you have access to all of Polly’s voices along with a wide variety of configuration options. The generated audio (an MP3 file for each post) can be stored alongside your WordPress content, or in Amazon Simple Storage Service (S3), with optional support for content distribution via Amazon CloudFront.
Installing the Plugin
I do not have an existing WordPress-powered blog, so I begin by launching a Lightsail instance using the WordPress 4.8.1 blueprint:
Then I follow these directions to access my login credentials:
Credentials in hand, I log in to the WordPress Dashboard:
The plugin makes calls to AWS, and needs credentials in order to do so. I hop over to the IAM Console and create a new policy. The policy allows the plugin to access a carefully selected set of S3 and Polly functions (find the full policy in the README):
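To give a feel for the shape of such a policy, here is a minimal sketch that builds a policy document in Python. It is not the plugin's actual policy (that one is in the README). The selection of actions, the statement IDs, and the bucket name `example-audio-bucket` are my own assumptions, chosen only to illustrate a policy scoped to Polly plus a single S3 bucket.

```python
import json


def build_policy(bucket_name: str) -> dict:
    """Return an illustrative policy document (NOT the plugin's real policy)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumed set of Polly actions: synthesize audio, list voices.
                "Sid": "PollyAccess",
                "Effect": "Allow",
                "Action": ["polly:SynthesizeSpeech", "polly:DescribeVoices"],
                "Resource": "*",
            },
            {
                # Assumed set of S3 actions, limited to objects in one bucket.
                "Sid": "AudioBucketAccess",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# Hypothetical bucket name, used only for this example.
print(json.dumps(build_policy("example-audio-bucket"), indent=2))
```

The printed JSON can be pasted into the policy editor in the IAM Console. Compare it against the README before using it, since the plugin may need actions that this sketch leaves out.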
Then I create an IAM user (wp-polly-user). I enter the name and indicate that it will be used for Programmatic Access:
Then I attach the policy that I just created, and click on Review:
I review my settings (not shown) and then click on Create User. Then I copy the two values (Access Key ID and Secret Access Key) into a secure location. Possession of these keys allows the bearer to make calls to AWS, so I take care not to leave them lying around.
Now I am ready to install the plugin! I go back to the WordPress Dashboard and click on Add New in the Plugins menu:
Then I click on Upload Plugin and locate the ZIP file that I downloaded from the WordPress Plugins site. After I find it I click on Install Now to proceed:
WordPress uploads and installs the plugin. Now I click on Activate Plugin to move ahead:
With the plugin installed, I click on Settings to set it up:
I enter my keys and click on Save Changes:
The General settings let me control the sample rate, voice, player position, the default setting for new posts, and the autoplay option. I can leave all of the settings as-is to get started:
The Cloud Storage settings let me store audio in S3 and use CloudFront to distribute the audio:
The Amazon Pollycast settings give me control over the iTunes parameters that are included in the generated RSS feed:
Finally, the Bulk Update button lets me regenerate all of the audio files after I change any of the other settings:
With the plugin installed and configured, I can create a new post. As you can see, the plugin can be enabled and customized for each post:
I can see how much it will cost to convert to audio with a click:
When I click on Publish, the plugin breaks the text into multiple blocks on sentence boundaries, calls the Polly SynthesizeSpeech API for each block, and accumulates the resulting audio in a single MP3 file. The published blog post references the file using the <audio> tag. Here’s the post:
I can’t seem to use an <audio> tag in this post, but you can download and play the MP3 file yourself if you’d like.
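The publish-time flow described above (split on sentence boundaries, call SynthesizeSpeech for each block, accumulate one MP3) can be sketched in a few lines of Python. This is not the plugin's code. The 1,500-character block size, the regular expression used to find sentence boundaries, the `Joanna` default voice, and the function names are all assumptions made for illustration. The `polly` argument is expected to be a boto3 Polly client, or anything else with a compatible `synthesize_speech` method.

```python
import re

# Assumed per-request character budget; check the Polly limits before relying on it.
MAX_CHARS = 1500


def split_into_blocks(text, max_chars=MAX_CHARS):
    """Group whole sentences into blocks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own oversized block.
    """
    # Naive sentence boundary: whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    blocks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            blocks.append(current)  # current block is full, start a new one
            current = sentence
        else:
            current = candidate
    if current:
        blocks.append(current)
    return blocks


def synthesize_post(polly, text, voice_id="Joanna"):
    """Call SynthesizeSpeech once per block and concatenate the MP3 bytes."""
    audio = bytearray()
    for block in split_into_blocks(text):
        response = polly.synthesize_speech(
            Text=block, OutputFormat="mp3", VoiceId=voice_id
        )
        audio.extend(response["AudioStream"].read())
    return bytes(audio)


def narrate_to_file(path, text):
    """Write the narration to a file (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    with open(path, "wb") as mp3_file:
        mp3_file.write(synthesize_post(boto3.client("polly"), text))
```

Calling `narrate_to_file("post.mp3", post_text)` with valid credentials should produce a single playable file, because the MP3 segments returned by each call can simply be appended to one another.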
The Pollycast feature generates an RSS file with links to an MP3 file for each post:
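For readers who have not looked inside a podcast feed, here is a minimal sketch of an RSS 2.0 document in which each item points at an MP3 file through an `enclosure` element. It is not the feed that the plugin generates. The function name, the dictionary keys, and the example URLs are hypothetical, and the iTunes-specific tags are left out to keep it short.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, posts):
    """Return a bare-bones RSS 2.0 feed with one MP3 enclosure per post."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        # The enclosure tells podcast clients where the audio lives.
        ET.SubElement(
            item,
            "enclosure",
            url=post["mp3_url"],
            length=str(post["mp3_bytes"]),
            type="audio/mpeg",
        )
    return ET.tostring(rss, encoding="unicode")


# Hypothetical blog and post, used only for this example.
feed_xml = build_feed(
    "Example Blog",
    "https://blog.example.com/",
    "Audio versions of Example Blog posts",
    [
        {
            "title": "Hello, Polly",
            "url": "https://blog.example.com/hello-polly/",
            "mp3_url": "https://blog.example.com/audio/hello-polly.mp3",
            "mp3_bytes": 123456,
        }
    ],
)
print(feed_xml)
```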
The plugin will make calls to Amazon Polly each time the post is saved or updated. Pricing is based on the number of characters in the speech requests, as described on the Polly Pricing page. Also, the AWS Free Tier lets you process up to 5 million characters per month at no charge, for a period of one year that starts when you make your first call to Polly.
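The arithmetic behind a cost estimate is simple enough to sketch. The function below is my own illustration, not the plugin's calculator. The 5 million character allowance comes from the Free Tier description above, while the price per million characters is a parameter that you supply from the Polly Pricing page (the value in the example is a placeholder, not a quoted price).

```python
# Monthly Free Tier allowance described above (first year only).
FREE_TIER_CHARS = 5_000_000


def estimate_monthly_cost(total_chars, price_per_million, free_tier=True):
    """Estimate one month's Polly charge for total_chars of speech requests."""
    if free_tier:
        billable = max(0, total_chars - FREE_TIER_CHARS)
    else:
        billable = total_chars
    return billable * price_per_million / 1_000_000


# Placeholder price; look up the current rate on the Polly Pricing page.
example_price = 4.0

# 40 posts of 10,000 characters, each saved 3 times, is 1.2 million characters.
chars_this_month = 40 * 10_000 * 3
print(estimate_monthly_cost(chars_this_month, example_price))
print(estimate_monthly_cost(chars_this_month, example_price, free_tier=False))
```

Because the plugin calls Polly on every save or update, the character count to plug in is the post length multiplied by the number of saves, not just the length of the final text.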
The plugin is available on GitHub in source code form and we are looking forward to your pull requests! Here are a couple of ideas to get you started:
Voice Per Author – Allow selection of a distinct Polly voice for each author.
Quoted Text – For blogs that make frequent use of embedded quotes, use a distinct voice for the quotes.
Translation – Use Amazon Translate to translate the texts into another language, and then use Polly to generate audio in that language.
Other Blogging Engines – Build a similar plugin for your favorite blogging engine.
SSML Support – Figure out an interesting way to use Polly’s SSML tags to add additional character to the audio.
Let me know what you come up with!
This article was originally published at Amazon Web Services Blog.