Agentforce and Einstein Generative AI

          Scorers and Custom Scorers (Beta)

          Scorers (Beta) are evaluation components in Agentforce Studio that analyze agent sessions and produce scores, dimensions, and measures for Agentforce Optimization and Analytics. Pair Salesforce standard scores with custom evaluations for your KPIs, create custom scorers using Next Gen Testing, apply them to sessions, and use the outputs in Optimization and Analytics to prioritize agent improvements.

          Required Editions

          Available in: Enterprise, Performance, and Unlimited Editions with an Einstein for Sales, Einstein for Platform, Einstein for Service, Einstein 1 Service, or Einstein GPT Service add-on. To purchase add-ons, contact your Salesforce account executive.
          Note

          Salesforce Standard Data Model version 1.130 or higher is required.
          Note

          Scorers are a beta service that is subject to the Beta Services Terms at Agreements - Salesforce.com or a written Unified Pilot Agreement if executed by Customer, and applicable terms in the Product Terms Directory. Use of this beta service is at the Customer's sole discretion.

          Where Scorers Appear in Optimization and Analytics

          • Next Gen Testing — Create custom scorers as part of a new or existing test suite, then refine and publish them.
          • Scorer Hub (Agentforce Studio) — Central place to view, activate, and manage scorers.
          • Sessions and Intents table — Sessions show scorer results; filter and analyze by custom scorer dimensions.
          • Session Page — Sessions show associated scorer labels and scores.
          • Analytics dashboards — Custom scorer measures appear as metrics for reporting.

          Considerations and Limitations

          • Available in Lightning Experience in Enterprise, Performance, and Unlimited editions with Salesforce Foundations or Agentforce 1 Edition where Agent Optimization and Analytics are enabled.
          • Custom Scorers are in Beta and require no additional license beyond Agentforce.
          • During Beta, only session-level scorers are supported.
          • Custom scorers are not supported for Agentforce Employee Agent (AEA) agent types.
          • SDR agent type appears in the Scorer Hub but not in the Optimization UI agent dropdown.
          • Standard scorers (Abandonment and Deflection scores, Quality Score) are created when STDM is provisioned. If they are missing, check for a provisioning issue.
          • You must clone a standard scorer before editing it; direct edit is not supported during Beta.
          • Expression-based scorers require a Boolean output type.
          • Numeric scorers on a 0–1 scale are not recommended because LLMs perform poorly on very narrow numeric ranges.
          • Testing Center sessions are not scored by custom scorers. Testing Center sessions don't receive an end timestamp, so the data action that triggers custom scorer evaluation is never fired.

          Overview

          Scorers evaluate agent sessions and generate scores, dimensions (Text values), and measures (Numeric values). They're part of Agentforce Optimization and Observability. They run automatically against a configurable percentage of sessions and surface results in Optimization and Analytics, so admins and developers can improve agent performance continuously.

          There are two kinds of scorers:

          • Standard scorers are provided by Salesforce out of the box.
          • Custom scorers are defined by your organization to evaluate business-specific criteria.

          Standard Scorers

          Standard scorers are pre-built evaluations that are created automatically with STDM provisioning. They run on agent sessions without additional customer configuration.

          Current Standard Scorers

          • Abandonment Score — Indicates whether the customer ended the session prematurely before the agent resolved their issue.
          • Deflection Score — Indicates whether the agent successfully deflected an interaction from requiring human escalation.
          • Quality Score — Measures overall session quality based on predefined Salesforce criteria.

          Standard scorers use a reserved location in the user interface and are separate from the custom scorer list.

          Important

          You can't edit a standard scorer directly. Clone it to create a new custom version with modified logic.

          Custom Scorers

          Custom scorers let you define evaluation logic and apply it to sessions. They use an LLM-as-a-judge approach (prompt-based evaluation) or expression-based logic to label, score, or classify sessions against your criteria.

          Note

          You create custom scorers in Next Gen Testing (NGT) in Agentforce Studio. See Set Up and Run Tests in Agentforce Studio (Beta). For more information, see Next Gen Testing on Slack.

          Custom scorers run against sessions and surface custom scores and dimensions in Observability.

          How Custom Scorers Work

          A custom scorer is backed by a prompt template that runs automatically as a batch job when the session ends. After a scorer runs on a session, the session is associated with that scorer, and labels and scores appear in Optimization and Analytics.

          • A scorer is a prompt template that runs on a configurable percentage of sessions (sample rate).
          • Scores are applied at session end, not during the live conversation. Sessions must receive an end timestamp for the scoring data action to fire.
          • Results appear in Observability as custom measures.
          • You can also trigger scorers manually by using the executeAgentforceScoresJob invocable action.
          • Scorers don't affect the agent's latency or performance during live conversations because scoring runs after the session ends.

          Scorer Configuration

          Key Fields

          Field | Description
          scorerApiName | Unique API name for the scorer.
          status | Lifecycle status of the scorer (for example, Draft or Available).
          isDraft | Boolean. Indicates whether this scorer definition is a draft.
          versionNumber | Integer version of the scorer definition.
          engine | Evaluation engine: LLM-based or expression-based.
          inputScope | Granularity: session-level (default for Beta and general availability) or interaction-level (planned).
          dataType | Output type for the scorer (for example, Text or Numeric).
          scorerValues / valuesSpecification | Provide exactly one. Defines output labels or the numeric range for the scorer.
          promptTemplateRef | Reference to the prompt template for LLM-based evaluation. The template must be active.
          agentApiName | Optional. Associates the scorer with a specific agent. Only one agent API name is allowed.
          samplingRate | Percentage of sessions the scorer runs against.

          Validation Rules

          • scorerApiName must not already exist.
          • promptTemplateRef must exist and be active.
          • For Text data type: scorer values must exist; exactly one value must have isFallback: true; exactly one must have isSystemFallback: true (only when isFallback is also true).
          • For Numeric data type: no fallback values; values must parse as valid doubles. If you use valuesSpecification: step must be greater than 0; min must be less than max; threshold is optional but must be between min and max.
          • Maximum number of scorer values: 101.
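
          The rules above can be checked locally before you deploy a scorer definition. The following Python sketch is illustrative only: the dictionary layout, the "value" key, and the function name validate_scorer are assumptions made for this example and aren't part of the documented schema. It also assumes that the system fallback flag must sit on the same value as the fallback flag, which is one reading of the Text rule.

```python
from typing import Any

MAX_SCORER_VALUES = 101  # limit stated in the validation rules


def _is_double(raw: Any) -> bool:
    """Return True if raw parses as a floating-point number."""
    try:
        float(raw)
        return True
    except (TypeError, ValueError):
        return False


def validate_scorer(config: dict, existing_api_names: set, active_templates: set) -> list:
    """Return a list of rule violations; an empty list means the config passes.

    The config layout is hypothetical and only mirrors the field names in the article.
    """
    errors = []
    if config.get("scorerApiName") in existing_api_names:
        errors.append("scorerApiName already exists")
    if config.get("promptTemplateRef") not in active_templates:
        errors.append("promptTemplateRef must exist and be active")

    values = config.get("scorerValues")
    spec = config.get("valuesSpecification")
    if (values is None) == (spec is None):
        errors.append("provide exactly one of scorerValues or valuesSpecification")
        return errors
    if values is not None and len(values) > MAX_SCORER_VALUES:
        errors.append("at most 101 scorer values are allowed")

    data_type = config.get("dataType")
    if data_type == "Text":
        if not values:
            errors.append("Text scorers need scorerValues")
        else:
            fallbacks = [v for v in values if v.get("isFallback")]
            system = [v for v in values if v.get("isSystemFallback")]
            if len(fallbacks) != 1:
                errors.append("exactly one value must have isFallback: true")
            # Assumed reading: the system fallback must also be the fallback value.
            if len(system) != 1 or not system[0].get("isFallback"):
                errors.append("exactly one fallback value must have isSystemFallback: true")
    elif data_type == "Numeric":
        for v in values or []:
            if v.get("isFallback") or v.get("isSystemFallback"):
                errors.append("Numeric scorers cannot have fallback values")
            if not _is_double(v.get("value")):
                errors.append(f"value {v.get('value')!r} is not a valid double")
        if spec is not None:
            low, high, step = spec["min"], spec["max"], spec["step"]
            if step <= 0:
                errors.append("step must be greater than 0")
            if low >= high:
                errors.append("min must be less than max")
            threshold = spec.get("threshold")  # optional
            if threshold is not None and not low <= threshold <= high:
                errors.append("threshold must be between min and max")
    return errors
```

          A check like this can run as a pre-deployment step in a CI/CD pipeline. The platform remains the authority on what it accepts.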

          Output Types

          • Text (labels) — The scorer assigns one of a set of predefined string labels to each session (for example, Resolved, Abandoned, or Escalated). Use for categorical classification.
          • Numeric — The scorer returns a number in a defined range (for example, 0–5). Suited to quality ratings and continuous scores. LLMs typically perform better on discrete scales (such as 0–5) than on narrow 0–1 ranges.
          • Boolean (limited support) — Returns true or false. Supported for expression-based scorers; support for other scorer types is a stretch goal or future work.
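
          If downstream reporting expects a value between 0 and 1, one option is to keep the scorer on a discrete scale such as 0–5 and rescale the result afterward. The Python sketch below shows that arithmetic. The function name rescale_score and the default range are assumptions for illustration and aren't part of the product.

```python
def rescale_score(score: float, low: float = 0.0, high: float = 5.0) -> float:
    """Map a score from the scorer's range [low, high] onto [0, 1]."""
    if not low < high:
        raise ValueError("low must be less than high")
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside {low}..{high}")
    return (score - low) / (high - low)
```

          With the default range, rescale_score(2.5) returns 0.5. The scorer prompt still works with whole-number ratings, which avoids asking the LLM to judge on a narrow range.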

          Run Scorers in Production

          Activate from Scorer Tab

          After you create and test a custom scorer, activate it from the Scorer Hub in Agentforce Optimization. Activation lets the scorer run on live sessions at the configured sample rate.

          Task | Required Permission
          Access and view Agentforce Optimization | Assign the Access Agentforce Optimization and Data Cloud User permission sets
          Use Scorer Tab (Beta) | Assign the Agentforce Scorer Beta permission set (no additional license beyond Agentforce is required)
          Create a scorer | Assign the Next Gen Testing (NGT) access permission set
          Activate a scorer | Assign the AgentforceScorerActivation permission set (included in the admin profile by default)

          Run from Salesforce Flow

          We provide a new invocable action, TriggerAgentBulkScoring, to execute custom scorers asynchronously on historical sessions. This action accepts multiple sessions and scorers and is accessible via:

          • Salesforce Flow
          • REST API
          • Apex

          Action Name: TriggerAgentBulkScoring

          Input Parameter | Type | Description
          InputIds | List<String> | List of up to 500 unique session IDs.
          InputScope | Enum | Scope of the scoring: Session, Moment, or Interaction.
          ScorerApiNames | List<String> | Up to 10 developer names for the scorers to be executed.

          Important Notes

          • All session IDs and scorerApiNames must belong to the same agent.
          • Ensure the scorer version is Available before execution.
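
          Because one call accepts at most 500 unique session IDs and 10 scorers, larger backfills have to be split into several calls. The Python sketch below shows one way to plan those calls. The helper names chunked and plan_bulk_scoring are hypothetical, the input keys follow the example payload in the REST API section, and the caller is assumed to pass sessions and scorers that all belong to the same agent.

```python
from typing import Iterator

MAX_SESSION_IDS = 500  # per-call limit on unique session IDs
MAX_SCORERS = 10       # per-call limit on scorer developer names


def chunked(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of items, each with at most size entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_bulk_scoring(session_ids: list, scorer_api_names: list) -> list:
    """Split sessions and scorers into inputs that respect the per-call limits."""
    # dict.fromkeys removes duplicates while keeping the original order.
    unique_ids = list(dict.fromkeys(session_ids))
    unique_scorers = list(dict.fromkeys(scorer_api_names))
    inputs = []
    for id_chunk in chunked(unique_ids, MAX_SESSION_IDS):
        for scorer_chunk in chunked(unique_scorers, MAX_SCORERS):
            inputs.append({
                "inputIds": id_chunk,
                "inputScope": "Session",  # only session-level scoring during Beta
                "scorerApiNames": scorer_chunk,
            })
    return inputs
```

          For example, 1,200 unique sessions and 12 scorers produce six inputs: three session chunks multiplied by two scorer chunks.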

          REST API

          Send a POST request to the standard action REST endpoint.

          /services/data/v66.0/actions/standard/triggerAgentBulkScoring

          Example payload:

          {
            "inputs": [
              {
                "inputIds": ["019c9442-b760-7d62-837c-8ab80ecc0fc2"],
                "inputScope": "Session",
                "scorerApiNames": ["language_classifier"]
              }
            ]
          }
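
          As a minimal sketch, the request can be assembled in Python with the standard library. The instance URL and access token are placeholders that you supply, and the function name build_scoring_request is hypothetical. The endpoint path and payload shape are taken from the example above. The sketch only builds the request and doesn't send it.

```python
import json
import urllib.request

ACTION_PATH = "/services/data/v66.0/actions/standard/triggerAgentBulkScoring"


def build_scoring_request(instance_url: str, access_token: str,
                          session_ids: list, scorer_api_names: list) -> urllib.request.Request:
    """Build (but don't send) the POST request for the bulk scoring action."""
    payload = {
        "inputs": [
            {
                "inputIds": session_ids,
                "inputScope": "Session",
                "scorerApiNames": scorer_api_names,
            }
        ]
    }
    return urllib.request.Request(
        url=instance_url.rstrip("/") + ACTION_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",  # OAuth access token for your org
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

          To send the request, pass the returned object to urllib.request.urlopen and read the response body. Error handling and retries are left out of this sketch.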

          Use the executeAgentforceScoresJob invocable action in Flow. The action accepts a list of session IDs and scorers and submits a job to the evaluation pipeline. Common uses include:

          • Running scorers on historical sessions
          • Batch analysis and custom reporting
          • Integration with Agentforce Grid

          Metadata API

          Create and manage custom scorers with the Salesforce Metadata API so you can deploy scorers in pro-code and DevOps workflows, including CI/CD pipelines.

          Use Cases

          Use Case | User Goal | How Scorers Help
          Monitor agent quality | Understand how well the agent resolves customer issues | Standard scorers (Quality Score, A&D) run automatically and surface results in Analytics without configuration
          Evaluate business-specific outcomes | Measure custom KPIs (for example, topic classification or compliance) | Custom scorers apply your LLM prompts to sessions and expose results as custom dimensions
          Iterate and improve agents | Find patterns in weak sessions and refine instructions | Observability shows scorer results per session so you can drill into flagged interactions
          Test scorers before production | Confirm a new scorer behaves as expected before activation | Testing Center in Agentforce Studio supports creating, running, and refining scorers against test cases
          Run scorers at scale | Evaluate large batches of historical sessions | executeAgentforceScoresJob supports batch processing and custom reporting
          Manage scorers as code | Deploy scorers through CI/CD | Metadata API support enables pro-code creation and management
          Customize evaluation criteria | Adjust or extend out-of-the-box scoring | Clone a standard scorer to create a custom version with modified prompt logic
           