Speech-to-Text Model for Voice Transcription
Voice transcription converts spoken conversations into text during voice interactions. The speech-to-text model in use determines LLM usage and language support.
Required Editions
This article applies to:
Salesforce Voice (Native Telephony)
Available in: Agentforce Contact Center with Salesforce Voice
Available in: Enterprise, Unlimited, and Developer Editions
| User Permissions Needed | |
|---|---|
| To use a speech-to-text model: | Agentforce Contact Center Admin (Salesforce Voice) Permission Set |
Voice Transcription Models
Speech-to-text models determine how spoken audio is processed and transcribed during voice interactions. This configuration is applied at the org level and supports voice call transcription for service reps. There are two primary models available:
- Low Latency Speech Model—Optimized for real-time transcription with fast response times during live voice conversations.
- Universal Speech Model—Designed to support transcription across a broad set of languages and regions.
Low Latency Speech Model
Use the low latency speech model for real-time voice interactions. This model is optimized for conversational speech recognition and supports low-latency streaming transcription for interactive voice experiences.
Key characteristics include:
- Optimized for low-latency streaming transcription
- Higher transcription accuracy for conversational audio
- Supports multilingual transcription
- Designed for faster response times in live conversations
Universal Speech Model
Use the universal speech model to support transcription across a wide range of languages used in voice interactions. This model supports streaming transcription for multiple languages and regional variants.
Key characteristics include:
- Enterprise-grade infrastructure designed for large-scale deployments
- Broad global language support
- Supports multilingual transcription
- Suitable for organizations operating across multiple countries and languages
Supported Languages
| Low Latency Speech Model | Universal Speech Model |
|---|---|
| English | English (UK, USA, and Canada) |
| German | German |
| French | French |
| Spanish | Spanish (Spain, Mexico) |
| Italian | Italian |
| Dutch | Dutch |
| Portuguese | Portuguese (Brazil, Portugal) |
| Japanese | Japanese |
| Hindi | Hindi |
| Finnish | |
| Tamil | |
| Telugu | |
| Bengali | |
| Marathi | |
| Bahasa Indonesia | |
| Malay | |
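
When planning which model to enable for an org, it can help to check the required languages against the table above. The following is a minimal Python sketch of that check, not a Salesforce API: the names `SUPPORTED` and `models_for` are hypothetical, and the language sets are transcribed from the table with regional variants (for example, Spanish for Spain and Mexico) collapsed to the base language.

```python
# Illustrative lookup only; these names are assumptions, not Salesforce identifiers.
# Languages listed for both models in the table (regional variants collapsed).
SHARED = {
    "English", "German", "French", "Spanish", "Italian",
    "Dutch", "Portuguese", "Japanese", "Hindi",
}

# Languages the table lists only under the Low Latency Speech Model.
LOW_LATENCY_ONLY = {
    "Finnish", "Tamil", "Telugu", "Bengali",
    "Marathi", "Bahasa Indonesia", "Malay",
}

SUPPORTED = {
    "Low Latency Speech Model": SHARED | LOW_LATENCY_ONLY,
    "Universal Speech Model": SHARED,
}


def models_for(languages):
    """Return the models whose language list covers every requested language."""
    needed = set(languages)
    return [model for model, langs in SUPPORTED.items() if needed <= langs]


if __name__ == "__main__":
    print(models_for(["English", "Hindi"]))
    print(models_for(["English", "Tamil"]))
```

Under these assumptions, the first call returns both models and the second returns only the Low Latency Speech Model. Because the setting applies at the org level, the check takes the full set of languages the org needs rather than one language per call.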

