Some customers observe that an Agentforce agent responds normally at first but stops responding after several exchanges. At that point, the session may be passed automatically to a fallback queue (for example, a messaging queue or a human agent queue).
This behavior occurs in scenarios where:
The conversation history becomes long
Custom actions produce large inputs or outputs
Prompt templates contain long or repetitive instructions
The user and agent exchange lengthy messages
The model must retain a large memory context
Agentforce relies on an LLM to generate responses. Every LLM has a maximum context length: the total number of tokens it can process across conversation history, instructions, user inputs, and function outputs. When a request exceeds this limit, the provider returns an error and the agent cannot continue the conversation.
For example, OpenAI documents a maximum context window of 128,000 tokens for its GPT-4o model:
https://platform.openai.com/docs/models/gpt-4o
This context-window limit is a standard constraint of modern LLMs and not specific to Salesforce or Agentforce.
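To get a feel for how close a conversation is to the limit, the token total can be approximated from text length. The following is a minimal sketch, not an Agentforce or OpenAI API: the function names are hypothetical, and the ratio of roughly 4 characters per token is a rule-of-thumb assumption. For exact counts, use the model provider's tokenizer.

```python
# Rough context-usage estimate. All names here are hypothetical helpers.
CONTEXT_LIMIT_TOKENS = 128_000  # GPT-4o window cited above
CHARS_PER_TOKEN = 4             # heuristic assumption, not a provider figure


def estimate_tokens(text: str) -> int:
    """Approximate token count by rounding up len(text) / CHARS_PER_TOKEN."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def context_usage(instructions: str, history: list[str], action_outputs: list[str]):
    """Return (estimated total tokens, fraction of the context window used)."""
    parts = [instructions, *history, *action_outputs]
    total = sum(estimate_tokens(part) for part in parts)
    return total, total / CONTEXT_LIMIT_TOKENS


if __name__ == "__main__":
    total, fraction = context_usage(
        instructions="x" * 400,       # placeholder for prompt instructions
        history=["y" * 40],           # placeholder for one short message
        action_outputs=["z" * 4000],  # placeholder for one large action output
    )
    print(f"{total} estimated tokens, {fraction:.2%} of the window")
```

In this made-up example, the single action output accounts for most of the estimated total, which is why large inputs and outputs matter more than the number of turns alone.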
Long or repetitive instructions increase the number of tokens the LLM must process.
To reduce token usage:
Remove redundant or repeated guidance
Avoid overly long system or topic instructions
Keep prompts concise and purposeful
Large action outputs contribute significantly to total token count.
To minimize this:
Limit large JSON structures and nested fields
Return only required fields instead of full record objects
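The effect of returning only required fields can be illustrated with a short sketch. It is written in Python for illustration only and is not Agentforce or Apex code; the record contents, the field names, and the `select_fields` helper are all hypothetical.

```python
import json


def select_fields(record: dict, required: list[str]) -> dict:
    """Keep only the required top-level fields of a record."""
    return {key: record[key] for key in required if key in record}


# Hypothetical full record, as an action might return it without filtering.
full_record = {
    "Id": "500-EXAMPLE",
    "Subject": "Cannot log in",
    "Status": "New",
    "Description": "Long free-text description. " * 50,
    "History": [{"field": "Status", "old": None, "new": "New"}] * 20,
}

trimmed = select_fields(full_record, ["Id", "Subject", "Status"])

if __name__ == "__main__":
    # Serialized length is a rough proxy for the tokens the output will consume.
    print(len(json.dumps(full_record)), "characters before trimming")
    print(len(json.dumps(trimmed)), "characters after trimming")
```

The same idea applies in whatever language the action is implemented in: select the fields the agent needs before the output is returned.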
Agentforce retains a minimum of 24 previous messages to maintain context.
Long messages from either the user or the agent increase the retained token count.
To avoid exceeding the model limit:
Keep responses concise
Avoid unnecessarily long conversational turns
Ensure actions do not repeatedly output the same large content
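Because a window of previous messages is retained, message length multiplies quickly. The sketch below estimates the tokens held by the most recent 24 messages. It treats 24 as the window size for simplicity (the article states this as a minimum), and it again assumes roughly 4 characters per token. The function name is hypothetical and this is not how Agentforce computes its history internally.

```python
RETAINED_MESSAGES = 24  # minimum retention stated above, used here as the window
CHARS_PER_TOKEN = 4     # heuristic assumption


def retained_tokens(messages: list[str]) -> int:
    """Estimate the tokens held by the most recent RETAINED_MESSAGES messages."""
    window = messages[-RETAINED_MESSAGES:]
    # -(-n // d) is integer division rounded up.
    return sum(-(-len(message) // CHARS_PER_TOKEN) for message in window)


if __name__ == "__main__":
    short_turns = ["a" * 400] * 30   # 30 messages of about 100 tokens each
    long_turns = ["a" * 8000] * 30   # 30 messages of about 2,000 tokens each
    print(retained_tokens(short_turns))
    print(retained_tokens(long_turns))
```

With the same number of turns, the longer messages hold twenty times as many tokens in the retained window, before any instructions or action outputs are counted.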
Repeated action invocations or repeated instructions increase token accumulation.
Ensuring actions run only when needed helps prevent hitting the model’s context limit.
5. Agent LLM Call Limit Exceeded
If an agent makes more than 8 LLM calls within a single user turn/request, the planner stops initiating further LLM calls and displays an error message to inform the user.
The session remains active, but the current turn is terminated due to the limit being exceeded.
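The behavior described above can be pictured as a per-turn call budget. The following sketch is purely illustrative and is not the Agentforce planner implementation: `TurnResult`, `run_turn`, and the error text are hypothetical, and each callable stands in for one LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_LLM_CALLS_PER_TURN = 8  # limit stated in this article


@dataclass
class TurnResult:
    outputs: list = field(default_factory=list)
    error: Optional[str] = None
    session_active: bool = True  # the session stays open even if the turn ends


def run_turn(planned_calls: list[Callable[[], str]]) -> TurnResult:
    """Run planned LLM calls for one turn, stopping once the budget is used."""
    result = TurnResult()
    for call in planned_calls:
        if len(result.outputs) >= MAX_LLM_CALLS_PER_TURN:
            # A ninth call would exceed the limit, so end the turn with an error.
            result.error = "LLM call limit exceeded for this turn."
            break
        result.outputs.append(call())
    return result


if __name__ == "__main__":
    turn = run_turn([lambda: "ok"] * 10)
    print(len(turn.outputs), turn.error, turn.session_active)
```

A turn that needs eight or fewer calls completes without an error. A turn that plans more is cut off after the eighth call, while the session itself remains available for the next user message.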
005232510
