How do I create speech input that gets filtered and combined with text from my own database?

Asked on 06/11/2025

To create speech input that gets filtered and combined with text from your own database, you can use Apple's SpeechAnalyzer API, which performs speech-to-text processing on-device with minimal code. Here's a general approach based on the WWDC session "Bring advanced speech-to-text to your app with SpeechAnalyzer":

  1. Set Up SpeechAnalyzer: Use the SpeechAnalyzer class to manage an analysis session. You can add a transcriber module to perform speech-to-text processing. This involves passing audio buffers to the analyzer, which processes them asynchronously.

  2. Transcription: The SpeechTranscriber module converts spoken audio into text, which you can process further or display in your application. The transcription results are provided as attributed strings, which include timing data for synchronization with audio playback.

  3. Combine with Database Text: Once you have the transcribed text, you can filter or combine it with text from your own database. This could involve searching your database for related content or appending the transcribed text to existing entries.

  4. Handle Results: The API provides both volatile results (immediate rough guesses that may still change) and finalized results (the transcriber's best guess, which will not change). You can display volatile results with lighter opacity and replace them with finalized results as they become available.
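
Steps 3 and 4 can be sketched in plain Swift without touching the Speech framework. This is a minimal sketch under stated assumptions, not the session's implementation: TranscriptUpdate, Entry, TranscriptBuffer, keywords(in:) and matchingEntries(for:in:) are all hypothetical names, the database is stood in for by an in-memory array, and each finalized result is assumed to carry only the newly finalized segment, so it is appended. In a real app the updates would come from the transcriber's result stream, and the attributed string would be converted to a plain String first.

```swift
import Foundation

// Hypothetical stand-in for one transcription result.
struct TranscriptUpdate {
    let text: String
    let isFinal: Bool
}

// Hypothetical stand-in for a row in your own database.
struct Entry {
    let title: String
    let body: String
}

// Keeps finalized text permanently and only the latest volatile guess.
struct TranscriptBuffer {
    private(set) var finalized = ""
    private(set) var volatile = ""

    mutating func apply(_ update: TranscriptUpdate) {
        if update.isFinal {
            finalized += update.text   // assumption: final results are incremental segments
            volatile = ""              // the final result supersedes the rough guess
        } else {
            volatile = update.text     // replace, never append, volatile text
        }
    }
}

// Filter step: lowercase, split on non-alphanumerics, drop short words and stop words.
func keywords(in transcript: String,
              stopWords: Set<String> = ["the", "and", "for", "with", "about"]) -> Set<String> {
    let tokens = transcript.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { $0.count > 2 && !stopWords.contains($0) }
    return Set(tokens)
}

// Combine step: return entries whose title or body contains any keyword as a substring.
func matchingEntries(for transcript: String, in entries: [Entry]) -> [Entry] {
    let words = keywords(in: transcript)
    return entries.filter { entry in
        let haystack = (entry.title + " " + entry.body).lowercased()
        return words.contains { haystack.contains($0) }
    }
}

// Usage: feed a volatile guess, then the finalized segment, then look up entries.
var buffer = TranscriptBuffer()
buffer.apply(TranscriptUpdate(text: "Tell me about", isFinal: false))
buffer.apply(TranscriptUpdate(text: "Tell me about dragons ", isFinal: true))

let database = [
    Entry(title: "Dragons", body: "Large winged reptiles."),
    Entry(title: "Castles", body: "Stone fortifications."),
]
let hits = matchingEntries(for: buffer.finalized, in: database)
print(hits.map(\.title))
```

Searching only on finalized text avoids querying the database with guesses that may still change. Substring matching is deliberately simple here; a real database would more likely use its own full-text search or an indexed query.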

For more detailed guidance, refer to the session "Bring advanced speech-to-text to your app with SpeechAnalyzer" (02:41), which covers the SpeechAnalyzer API and its capabilities.