SpeechAnalyzer

Asked on 06/26/2025


SpeechAnalyzer is a new API from Apple that enhances speech-to-text capabilities across its platforms. It is a Swift API that performs speech-to-text processing and manages model assets on the user's device with minimal code. It is designed to support a wide range of use cases, including long-form and distant audio such as lectures, meetings, and conversations. It is already used in system apps like Notes, Voice Memos, and FaceTime for features like live captions and call summarization.

The SpeechAnalyzer API consists of the SpeechAnalyzer class and several other classes. Developers add modules to an analysis session, and each module performs a specific type of analysis, such as transcription. The API works asynchronously: audio input is decoupled from results, so an application can feed audio and display results independently of each other.
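The asynchronous split between audio input and results can be illustrated with plain Swift concurrency. The following is a minimal sketch that assumes nothing about the Speech framework itself: `AudioChunk`, `TranscriptUpdate`, `runSession`, and `demo` are hypothetical names invented for illustration, and the "analysis" is a placeholder that only counts samples. It shows the general pattern of one stream carrying input and a separate stream carrying results, not Apple's actual API.

```swift
// Hypothetical stand-ins for illustration; these are NOT Speech framework types.
struct AudioChunk: Sendable { let samples: [Float] }
struct TranscriptUpdate: Sendable { let text: String; let isFinal: Bool }

// A toy "analysis session": consumes an input stream and publishes results
// on a separate stream, so the producer and the consumer never wait on each other.
func runSession(input: AsyncStream<AudioChunk>) -> AsyncStream<TranscriptUpdate> {
    AsyncStream { continuation in
        let task = Task {
            var count = 0
            for await chunk in input {
                count += 1
                // Placeholder "analysis": report how many samples arrived.
                continuation.yield(TranscriptUpdate(
                    text: "chunk \(count): \(chunk.samples.count) samples",
                    isFinal: false))
            }
            // The input stream finished, so emit a final result and close.
            continuation.yield(TranscriptUpdate(
                text: "done after \(count) chunks", isFinal: true))
            continuation.finish()
        }
        // Stop the work if the results consumer goes away.
        continuation.onTermination = { _ in task.cancel() }
    }
}

// Feeds three silent chunks and collects every result line.
func demo() async -> [String] {
    let (input, feeder) = AsyncStream.makeStream(of: AudioChunk.self)
    let results = runSession(input: input)

    // Producer side: pushes audio without waiting for results.
    let producer = Task {
        for n in 1...3 {
            feeder.yield(AudioChunk(samples: Array(repeating: 0, count: n * 160)))
        }
        feeder.finish()
    }

    // Consumer side: reads results as they become available.
    var lines: [String] = []
    for await update in results {
        lines.append(update.text)
    }
    await producer.value
    return lines
}
```

In a real app the consumer loop would update the UI instead of appending to an array, and the producer would be driven by a microphone or file reader. The actual module and session types should be taken from Apple's Speech framework documentation rather than from this sketch.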

For more detail, see the WWDC 2025 session "Bring advanced speech-to-text to your app with SpeechAnalyzer". The session covers the API's design and capabilities and includes a live coding demo. The SpeechAnalyzer API chapter is a good place to start.
