Introduction
Azure Speech in Foundry Tools provides APIs that you can use to build speech-enabled applications, including:
- Speech to text: An API that enables speech recognition, so that your application can accept spoken input.
- Text to speech: An API that enables speech synthesis, so that your application can provide spoken output.
- Speech Translation: An API that you can use to translate spoken input into multiple languages.
- Voice Live: An API that you can use to build AI agents that are capable of conducting real-time conversations.
This module focuses on speech recognition and speech synthesis, which are core capabilities of any speech-enabled application.
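As a rough preview of these two capabilities, the following minimal sketch recognizes one spoken utterance and then speaks it back. It assumes the `azure-cognitiveservices-speech` package is installed, that a microphone and speaker are available, and that the resource key and region are stored in environment variables named `SPEECH_KEY` and `SPEECH_REGION` (names chosen for this sketch, not required by the SDK). The helper functions `load_speech_settings` and `recognize_then_speak` are hypothetical and exist only for illustration.

```python
import os


def load_speech_settings(env=os.environ):
    """Read the Speech resource key and region from a mapping of settings.

    SPEECH_KEY and SPEECH_REGION are variable names assumed for this sketch.
    """
    key = env.get("SPEECH_KEY")
    region = env.get("SPEECH_REGION")
    if not key or not region:
        raise ValueError("Set SPEECH_KEY and SPEECH_REGION before running.")
    return key, region


def recognize_then_speak(key, region):
    """Listen for one utterance on the default microphone, then speak it back."""
    # Imported here so the file still loads when the SDK is not installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)

    # Speech to text: with no audio config, the SDK uses the default microphone.
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
    result = recognizer.recognize_once()
    if result.reason != speechsdk.ResultReason.RecognizedSpeech:
        return None

    # Text to speech: with no audio config, output goes to the default speaker.
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    synthesizer.speak_text_async(result.text).get()
    return result.text


if __name__ == "__main__":
    try:
        key, region = load_speech_settings()
    except ValueError as err:
        print(err)
    else:
        print(recognize_then_speak(key, region))
```

Keeping the credential lookup separate from the SDK calls means the settings can be checked without audio hardware or a network connection.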
The code examples in this module are provided in Python, but Azure Speech SDK packages are available for other languages, so you can develop speech-enabled applications in your preferred language.
Note
We recognize that different people like to learn in different ways. You can choose to complete this module in video-based format or you can read the content as text and images. The text contains greater detail than the videos, so in some cases you might want to refer to it as supplemental material to the video presentation.