Definition

Amazon Transcribe

Tim Culverhouse, Site Editor

Published: Aug 15, 2018

Amazon Transcribe is a speech recognition service that transcribes audio files into text.

The service, which uses machine learning technology, also enables a developer to add speech-to-text capabilities to an application. A developer, for example, could build an application that uses Amazon Transcribe to create transcriptions of customer service calls in a contact center, or to generate subtitles for audio or video content in real time.

How to use Amazon Transcribe

To use Amazon Transcribe, a developer must first have an AWS account and create an AWS Identity and Access Management (IAM) user. The developer can then access the service through the AWS Management Console, the AWS Command Line Interface (CLI) or the Transcribe API.

Audio files for Transcribe -- which a developer uploads and stores in Amazon S3 -- can be in MP3, MP4, WAV or FLAC format and can be no longer than two hours. The service supports both 16-kilohertz (kHz) and 8-kHz audio streams.

A developer must specify the language and format of each audio file submitted to the service for transcription. As of mid-2018, Transcribe supports only US English and Spanish.

To begin transcribing, create a transcription job.
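As a rough illustration of creating a transcription job through the API, the sketch below uses boto3, the AWS SDK for Python. The helper names (build_job_request, start_job), the default region and the language codes en-US and es-US are assumptions made for this example; the format and language checks simply mirror the mid-2018 limits described above, and the service itself may accept more.

```python
# Limits as described in this article (mid-2018); treat them as assumptions.
SUPPORTED_FORMATS = {"mp3", "mp4", "wav", "flac"}
SUPPORTED_LANGUAGES = {"en-US", "es-US"}


def build_job_request(job_name, media_uri, media_format, language_code):
    """Build keyword arguments for a StartTranscriptionJob call."""
    if media_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported media format: {media_format}")
    if language_code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language_code}")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
    }


def start_job(request, region_name="us-east-1"):
    """Submit the request; needs boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("transcribe", region_name=region_name)
    return client.start_transcription_job(**request)
```

A call such as start_job(build_job_request("demo-job", "s3://example-bucket/call.mp3", "mp3", "en-US")) would submit the job, where the bucket and file names are placeholders. Keeping the request builder separate from the network call makes the validation easy to check without an AWS account.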

Other Amazon Transcribe features

Transcribe uses deep learning to incorporate punctuation and formatting into each text output, and to limit the amount of editing required after it completes a transcription. During each transcription, the service will also generate a timestamp for each word, in case a user needs to return to a point in time in the original audio file for clarification.
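The per-word timestamps can be read from the JSON transcript that a completed job produces. The sketch below assumes the output layout in which results.items holds one entry per word or punctuation mark, with start_time and end_time given as strings of seconds; the function name word_timestamps and the sample data are made up for illustration and are not real service output.

```python
def word_timestamps(transcript_json):
    """Return (word, start_seconds, end_seconds) for each spoken word."""
    words = []
    for item in transcript_json["results"]["items"]:
        # Punctuation items carry no timing information, so skip them.
        if item.get("type") != "pronunciation":
            continue
        words.append(
            (
                item["alternatives"][0]["content"],
                float(item["start_time"]),
                float(item["end_time"]),
            )
        )
    return words


# Hand-written sample in the assumed shape, not real service output.
sample = {
    "results": {
        "transcripts": [{"transcript": "Hello world."}],
        "items": [
            {"type": "pronunciation", "start_time": "0.04", "end_time": "0.45",
             "alternatives": [{"content": "Hello", "confidence": "0.99"}]},
            {"type": "pronunciation", "start_time": "0.45", "end_time": "0.91",
             "alternatives": [{"content": "world", "confidence": "0.98"}]},
            {"type": "punctuation",
             "alternatives": [{"content": ".", "confidence": "0.0"}]},
        ],
    }
}

print(word_timestamps(sample))
```

With a list like this, an application could jump to the moment in the original audio where a given word was spoken.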

Transcribe can identify between two and 10 different speakers in an audio file, and then label segments of its text file to indicate which speaker spoke which words. Transcribe also enables a developer to input files with custom vocabulary -- such as jargon or proper names that are relevant to a particular industry or use case -- to ensure a more accurate text output.
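Speaker identification and custom vocabularies are both requested through the Settings structure of a transcription job. The sketch below builds that structure; the helper name build_settings is hypothetical, the 2-to-10 range check reflects the limit stated above, and the vocabulary name is assumed to refer to a vocabulary that was already created in the account (for example, with the CreateVocabulary operation).

```python
def build_settings(max_speakers=None, vocabulary_name=None):
    """Build the Settings dict for a StartTranscriptionJob request."""
    settings = {}
    if max_speakers is not None:
        # Range taken from this article; the service may allow other values.
        if not 2 <= max_speakers <= 10:
            raise ValueError("max_speakers must be between 2 and 10")
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    if vocabulary_name is not None:
        settings["VocabularyName"] = vocabulary_name
    return settings


# Example: a two-party support call with a hypothetical vocabulary name.
print(build_settings(max_speakers=2, vocabulary_name="support-jargon"))
```

The resulting dictionary would be passed as the Settings argument of boto3's start_transcription_job call, alongside the job name, language and media details.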

Transcribe integrates with a range of other Amazon services, including Amazon Comprehend, a natural language processing (NLP) service; Amazon Translate, a language translation service; and Amazon Polly, a service that converts text files into speech.