          Speech-to-Text Model for Voice Transcription

          Voice transcription converts spoken conversations into text during voice interactions. The speech-to-text model that you select determines LLM usage and language support.

          Required Editions

          This article applies to:

          Salesforce Voice (Native Telephony)

          Available in: Agentforce Contact Center with Salesforce Voice
          Available in: Enterprise, Unlimited, and Developer Editions
          User Permissions Needed
          To use a speech-to-text model: Agentforce Contact Center Admin (Salesforce Voice) Permission Set

          Voice Transcription Models

          Speech-to-text models determine how spoken audio is processed and transcribed during voice interactions. The model is configured at the org level and applies to voice call transcription for service reps. Two primary models are available:

          • Low Latency Speech Model—Optimized for real-time transcription with fast response times during live voice conversations.
          • Universal Speech Model—Designed to support transcription across a broad set of languages and regions.

          Low Latency Speech Model

          Use the low latency speech model for real-time voice interactions. This model is optimized for conversational speech recognition and supports low-latency streaming transcription for interactive voice experiences.

          Key characteristics include:

          • Optimized for low-latency streaming transcription
          • Higher transcription accuracy for conversational audio
          • Supports multilingual transcription
          • Designed for faster response times in live conversations

          Universal Speech Model

          Use the universal speech model to support transcription across a wide range of languages used in voice interactions. This model supports streaming transcription for multiple languages and regional variants.

          Key characteristics include:

          • Enterprise-grade infrastructure designed for large-scale deployments
          • Broad global language support
          • Supports multilingual transcription
          • Suitable for organizations operating across multiple countries and languages

          Supported Languages

          Low Latency Speech Model    Universal Speech Model
          English                     English (UK, USA, and Canada)
          German                      German
          French                      French
          Spanish                     Spanish (Spain, Mexico)
          Italian                     Italian
          Dutch                       Dutch
          Portuguese                  Portuguese (Brazil, Portugal)
          Japanese                    Japanese
          Hindi                       Hindi
          -                           Finnish
          -                           Tamil
          -                           Telugu
          -                           Bengali
          -                           Marathi
          -                           Bahasa Indonesian
          -                           Malay
           