Which AWS service can be used to turn text into lifelike speech?

Use Cases

Content creation

Audio can be used as a complementary medium to written and/or visual communication. By voicing your content, you can provide your audience with an alternative way to consume information and meet the needs of a larger pool of readers. Amazon Polly can generate speech in dozens of languages, making it easy to add speech to applications with a global audience, such as RSS feeds, websites, or videos.


E-learning

Amazon Polly enables developers to provide their applications with an enhanced visual experience such as speech-synchronized facial animation or karaoke-style word highlighting. Amazon Polly makes it easy to request an additional stream of metadata with information about when particular sentences, words, and sounds are being pronounced. Using this metadata stream alongside the synthesized speech audio stream, customers can animate avatars and highlight text in their app as it is spoken.
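As a rough sketch of how that metadata stream could drive word highlighting: when speech marks are requested, Polly returns them as newline-delimited JSON objects, each carrying a `time` in milliseconds from the start of the audio. The helper names (`parse_speech_marks`, `word_at`) and the sample data below are illustrative assumptions, not part of any AWS SDK.

```python
from __future__ import annotations

import json
from bisect import bisect_right


def parse_speech_marks(raw: bytes) -> list[dict]:
    """Parse newline-delimited JSON speech marks into a list of dicts."""
    lines = raw.decode("utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def word_at(marks: list[dict], time_ms: int) -> str | None:
    """Return the latest word whose start time is <= time_ms, or None."""
    words = [m for m in marks if m["type"] == "word"]
    times = [m["time"] for m in words]
    idx = bisect_right(times, time_ms) - 1
    return words[idx]["value"] if idx >= 0 else None


# Hand-written sample in the speech-mark shape (not real service output).
SAMPLE = (
    b'{"time":0,"type":"sentence","start":0,"end":11,"value":"Hello world"}\n'
    b'{"time":6,"type":"word","start":0,"end":5,"value":"Hello"}\n'
    b'{"time":370,"type":"word","start":6,"end":11,"value":"world"}\n'
)

if __name__ == "__main__":
    marks = parse_speech_marks(SAMPLE)
    # Look up which word to highlight 100 ms into playback.
    print(word_at(marks, 100))
```

With a real boto3 Polly client, the raw bytes would come from a `synthesize_speech` call made with `OutputFormat="json"` and `SpeechMarkTypes=["word"]`, reading the response's `AudioStream`. The audio itself is requested in a separate call with an audio output format.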


Telephony

With Amazon Polly, your contact centers can engage customers with natural-sounding voices. You can cache and replay Amazon Polly’s speech output to prompt callers through interactive voice response (IVR) systems, such as Amazon Connect. Additionally, you can use Amazon Polly’s API to deliver automated real-time information such as service status, account and billing inquiries, addresses, and contact information.
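One way the cache-and-replay idea could look in practice is sketched below: audio for a fixed prompt is stored on disk and Polly is called only on a cache miss. The function names, the on-disk layout, and the default voice `Joanna` are assumptions for illustration. `client` is expected to be a Polly client such as the one returned by `boto3.client("polly")`.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def cache_key(text: str, voice_id: str, output_format: str) -> str:
    """Derive a stable name from everything that affects the audio."""
    payload = f"{voice_id}|{output_format}|{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def get_prompt_audio(client, cache_dir: Path, text: str,
                     voice_id: str = "Joanna",
                     output_format: str = "mp3") -> bytes:
    """Return audio for a prompt, calling Polly only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    name = f"{cache_key(text, voice_id, output_format)}.{output_format}"
    path = cache_dir / name
    if path.exists():
        # Cache hit: replay the stored audio without a service call.
        return path.read_bytes()
    response = client.synthesize_speech(
        Text=text, VoiceId=voice_id, OutputFormat=output_format
    )
    audio = response["AudioStream"].read()
    path.write_bytes(audio)
    return audio
```

Static prompts such as menu options benefit most from this pattern. Dynamic content such as account balances would still be synthesized per request.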

Example: Text-to-speech for telephony systems

Posted On: Sep 1, 2022

Amazon Polly is a service that turns text into lifelike speech. Today, we are excited to announce the general availability of all Neural Text-to-Speech (NTTS) voices in the Asia Pacific (Mumbai) Region.

Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Polly's Text-to-Speech (TTS) service uses advanced deep learning technologies to synthesize natural sounding human speech.

Amazon Polly voices can be applied to a diverse set of use cases to increase customer engagement. For example, you can give interactive voice response (IVR) systems or virtual assistants natural-sounding voices, or produce spoken versions of text-based content. For eLearning, audiobooks, newsreaders, and other content, you can also provide audio/visual experiences by synchronizing speech with facial animation or karaoke-style word highlighting.

Customers in the following 13 regions can now experience higher voice quality and lower latency when using Amazon Polly: US East (N. Virginia), US West (Oregon), Africa (Cape Town), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), and AWS GovCloud (US-West). Refer to the Global Infrastructure page for the latest list.

Amazon Polly converts input text into life-like speech. You call one of the speech synthesis methods, provide the text that you want to synthesize, choose one of the Neural Text-to-Speech (NTTS) or Standard Text-to-Speech (TTS) voices, and specify an audio output format. Amazon Polly then synthesizes the provided text into a high-quality speech audio stream.

  • Input text – Provide the text that you want to synthesize, and Amazon Polly returns an audio stream. You can provide the input as plain text or in Speech Synthesis Markup Language (SSML) format. With SSML you can control various aspects of speech, such as pronunciation, volume, pitch, and speech rate. For more information, see Generating Speech from SSML Documents.

  • Available voices – Amazon Polly provides a portfolio of languages and a variety of voices, including a bilingual voice (for both English and Hindi). For most languages you can choose from several voices, both male and female. When launching a speech synthesis task, you specify the voice ID, and then Amazon Polly uses this voice to convert the text to speech. Amazon Polly is not a translation service—the synthesized speech is in the same language as the text. However, if the text is in a different language than designated for the voice, numbers represented as digits (for example, 53, not fifty-three) are synthesized in the language of the voice and not the text. For more information, see Voices in Amazon Polly.

  • Output format – Amazon Polly can deliver the synthesized speech in multiple formats. You can select the audio format that suits your needs. For example, you might request the speech in the MP3 or Ogg Vorbis format for consumption by web and mobile applications. Or, you might request the PCM output format for consumption by AWS IoT devices and telephony solutions.
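The three inputs above map onto a single `synthesize_speech` request. The sketch below assembles that request and reads back the audio stream. The helper names and the `<speak` prefix check for detecting SSML are illustrative assumptions, and `client` is expected to be a Polly client such as `boto3.client("polly")`.

```python
from __future__ import annotations


def build_request(text: str, voice_id: str = "Joanna",
                  output_format: str = "mp3",
                  engine: str = "neural") -> dict:
    """Assemble keyword arguments for a synthesize_speech call."""
    # Heuristic (an assumption of this sketch): treat input that starts
    # with a <speak> root element as SSML, anything else as plain text.
    is_ssml = text.lstrip().startswith("<speak")
    return {
        "Text": text,
        "TextType": "ssml" if is_ssml else "text",
        "VoiceId": voice_id,
        "OutputFormat": output_format,  # e.g. "mp3", "ogg_vorbis", "pcm"
        "Engine": engine,               # "neural" or "standard"
    }


def synthesize(client, text: str, **options) -> bytes:
    """Send the request and return the synthesized audio as bytes."""
    response = client.synthesize_speech(**build_request(text, **options))
    return response["AudioStream"].read()
```

For example, `synthesize(client, "<speak>Hello <break time='300ms'/> world</speak>", output_format="ogg_vorbis")` would request SSML input with Ogg Vorbis output, assuming the chosen voice supports the selected engine.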

What's Next?

If you are new to Amazon Polly, we recommend that you read the following topics in order:

  • Getting Started with Amazon Polly

  • Example Applications

  • Quotas in Amazon Polly

Which AWS service or feature is used to send both text?

Amazon SQS is the AWS service that allows application components to communicate in the cloud. Using the Amazon SQS console, you can create and configure a message queue, send a message, receive and delete that message, and then delete the queue.

Which AWS service is designed to help users who want to use machine learning for natural language processing?

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. No machine learning experience required.

Which AWS services provide automatic replication across Availability Zones?

AWS Lambda automatically runs your code on highly available, fault-tolerant infrastructure spread across multiple Availability Zones in a single region without requiring you to provision or manage servers.