Agentforce: Agent Stops Responding During Long Conversations Due to Token Limit Being Exceeded

Publish Date: Dec 22, 2025

Description

Some customers may observe that the Agentforce bot participates normally in a conversation but eventually stops responding after several exchanges. At this point, the session may be automatically passed to a fallback queue (for example, a messaging queue or human agent queue).

This behavior occurs in scenarios where:

The conversation history becomes long
Custom actions produce large inputs or outputs
Prompt templates contain long or repetitive instructions
The user and agent exchange lengthy messages
The model must retain a large memory context

Why the Agent Stops Responding

Agentforce relies on an LLM to generate responses. Every LLM has a maximum context length: the total number of tokens it can process in a single request, covering conversation history, instructions, user inputs, and function outputs. When a request exceeds this limit, the model provider returns an error and the agent cannot continue the conversation.

Example (Model-Level Limitation)

OpenAI’s GPT-4o model documents a maximum context window of 128,000 tokens:
https://platform.openai.com/docs/models/gpt-4o

This context-window limit is a standard constraint of modern LLMs and not specific to Salesforce or Agentforce.
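
As a rough illustration of how history, instructions, user inputs, and function outputs add up against a context window, the sketch below estimates the remaining token budget for one request. It uses the 128,000-token GPT-4o figure cited above. The four-characters-per-token ratio is only a common rule of thumb for English text, not the tokenizer used by Agentforce or any provider, and all function and parameter names are hypothetical.

```python
CONTEXT_WINDOW = 128_000  # GPT-4o context window cited in this article
CHARS_PER_TOKEN = 4       # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string (ceiling of chars / 4)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def remaining_budget(instructions: str, history: list[str],
                     action_outputs: list[str], user_input: str,
                     limit: int = CONTEXT_WINDOW) -> int:
    """Return the estimated tokens left; a negative value means over the limit."""
    used = estimate_tokens(instructions) + estimate_tokens(user_input)
    used += sum(estimate_tokens(message) for message in history)
    used += sum(estimate_tokens(output) for output in action_outputs)
    return limit - used


if __name__ == "__main__":
    left = remaining_budget(
        instructions="a" * 400,
        history=["b" * 40] * 10,
        action_outputs=["c" * 4000],
        user_input="d" * 8,
    )
    print(f"Estimated tokens remaining: {left}")
```

In this estimate a single 4,000-character action output costs about ten times as much as ten short messages combined, which is why the recommendations below focus on instructions and action payloads as well as message length.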

Resolution

Recommendations to Prevent the Agent From Stopping Mid-Conversation

1. Keep Prompt Templates Short and Focused

Long or repetitive instructions increase the number of tokens the LLM must process.
To reduce token usage:

Remove redundant or repeated guidance
Avoid overly long system or topic instructions
Keep prompts concise and purposeful

2. Reduce Output and Input Size From Custom Actions

Large action outputs contribute significantly to total token count.
To minimize this:

Limit large JSON structures and nested fields
Return only required fields instead of full record objects
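
The idea of returning only required fields can be sketched as follows. This is an illustration in Python rather than Agentforce code (custom actions are typically built with Apex or Flow), and the record contents, field names, and the select_fields helper are all hypothetical.

```python
import json
from typing import Any


def select_fields(record: dict[str, Any], required: list[str]) -> dict[str, Any]:
    """Keep only the named fields; fields missing from the record are skipped."""
    return {name: record[name] for name in required if name in record}


if __name__ == "__main__":
    # Hypothetical full record, including a large field the agent does not need.
    full_record = {
        "Id": "001",
        "Name": "Acme",
        "Status": "Open",
        "Description": "x" * 5000,
    }
    trimmed = select_fields(full_record, ["Id", "Name", "Status"])
    print(len(json.dumps(full_record)), "characters before trimming")
    print(len(json.dumps(trimmed)), "characters after trimming")
```

Because action output is carried in the conversation context, trimming it once at the source reduces the token count on every subsequent turn that retains it.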

3. Minimize Message Length Throughout the Conversation

Agentforce retains a minimum of 24 previous messages to maintain context.
Long messages from either the user or the agent increase the retained token count.
To avoid exceeding the model limit:

Keep responses concise
Avoid unnecessarily long conversational turns
Ensure actions do not repeatedly output the same large content
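
To show why message length matters when a fixed number of turns is retained, the sketch below sums an estimated token count over the most recent 24 messages. The window size comes from the figure above, which is stated as a minimum, so actual retention may be larger. The characters-divided-by-four estimate and the function name are assumptions made for illustration only.

```python
RETAINED_MESSAGES = 24  # minimum number of previous messages retained, per this article


def retained_tokens(messages: list[str], window: int = RETAINED_MESSAGES) -> int:
    """Estimate the tokens held in context by the most recent `window` messages."""
    recent = messages[-window:]
    # Assumption: roughly 4 characters per token, rounded up per message.
    return sum((len(message) + 3) // 4 for message in recent)


if __name__ == "__main__":
    short_turns = ["x" * 400] * 30    # 30 messages of about 100 tokens each
    long_turns = ["x" * 4000] * 30    # 30 messages of about 1,000 tokens each
    print("Short messages:", retained_tokens(short_turns))
    print("Long messages:", retained_tokens(long_turns))
```

The retained count scales directly with average message length, so shortening each turn by half roughly halves the history's share of the context window.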

4. Avoid Repetition in Actions and Instructions

Repeated action invocations or repeated instructions increase token accumulation.
Ensuring actions run only when needed helps prevent hitting the model’s context limit.

5. Agent LLM Call Limit Exceeded

If an agent makes more than 8 LLM calls within a single user turn/request, the planner stops initiating further LLM calls and displays an error message to inform the user.
The session remains active, but the current turn is terminated due to the limit being exceeded.
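
The per-turn limit can be pictured as a simple call budget. The sketch below is not Agentforce's planner implementation; it only models the behavior described above, where calls beyond the eighth are not initiated and an error is reported while the session itself continues. The function name, the error text, and the use of plain callables to stand in for LLM calls are all hypothetical.

```python
from typing import Callable, Optional

MAX_LLM_CALLS_PER_TURN = 8  # limit described in this article


def run_turn(pending_calls: list[Callable[[], object]],
             max_calls: int = MAX_LLM_CALLS_PER_TURN
             ) -> tuple[list[object], Optional[str]]:
    """Run stand-in LLM calls for one turn, stopping once the budget is spent.

    Returns the results gathered so far and an error message, or None
    if the turn finished within the limit.
    """
    results: list[object] = []
    for call in pending_calls:
        if len(results) >= max_calls:
            # Do not initiate another call; end this turn only.
            return results, "LLM call limit exceeded for this turn."
        results.append(call())
    return results, None


if __name__ == "__main__":
    ten_calls = [lambda i=i: f"step {i}" for i in range(10)]
    results, error = run_turn(ten_calls)
    print(len(results), "calls completed;", error)
    # The session is still usable: a later turn starts with a fresh budget.
    results, error = run_turn(ten_calls[:3])
    print(len(results), "calls completed;", error)
```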

Knowledge Article Number

005232510
