Einstein Requests Billable Usage Types
Using certain generative AI features consumes credits against subtypes of the Einstein Requests billable usage type. Usage is calculated based on the number of calls to the LLM gateway. Einstein Requests usage types apply to embedded generative AI features, such as Service Replies, that make direct requests to the LLM for specific tasks.
Einstein Requests Usage Types
Einstein Requests can be consumed with Agentforce Conversations or Flex Credits when the usage involves direct calls to the LLM gateway.
| Billing Category | Description |
|---|---|
| Starter Prompts | Usage is calculated based on two factors: the number of direct requests to the LLM via the LLM gateway, and whether the gateway uses a Bring Your Own Large Language Model (BYOLLM). Each starter prompt includes the processing of up to 2,000 tokens. Prompt usage is counted in chunks of 2,000 tokens, rounded up. Prompts that exceed this limit will be metered as multiple prompts, with each additional 2,000-token chunk counting as a new prompt. For example, a prompt with a total of 6,500 input and output tokens will be metered as 4 prompts. Tokens are units of data processed by the AI models. |
| Standard Prompts, Basic Prompts, Advanced Prompts | Usage is calculated based on two factors: the number of direct requests to the LLM via the LLM gateway, and whether the gateway uses a Salesforce managed large language model. The specific category depends on the model that is used. See Large Language Model Support to find out which usage types apply. All Standard, Basic, and Advanced prompts process up to 2,000 tokens per prompt. Token usage is rounded up in 2,000-token increments. All Standard, Basic, and Advanced prompts that exceed this limit will be metered as multiple prompts, with each additional 2,000-token chunk counting as a new prompt. For example, a prompt with a total of 6,500 input and output tokens will be metered as 4 prompts. Tokens are units of data processed by the AI models. |
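
The 2,000-token rounding rule described in the table can be sketched as a small calculation. This is an illustrative sketch only, not Salesforce code: the function name `metered_prompts` and the constant `TOKENS_PER_PROMPT` are hypothetical, and it assumes `total_tokens` is the combined input and output token count of a single request, as in the 6,500-token example. How a zero-token request is metered is not stated in the source, so the sketch rejects it rather than guess.

```python
TOKENS_PER_PROMPT = 2_000  # chunk size stated in the table; the name is hypothetical


def metered_prompts(total_tokens: int, chunk_size: int = TOKENS_PER_PROMPT) -> int:
    """Return how many prompts one request is metered as.

    Assumes total_tokens is the sum of input and output tokens for the
    request. Usage is rounded up to the next whole chunk.
    """
    if total_tokens <= 0:
        # The source does not say how an empty request is metered,
        # so this sketch refuses to guess.
        raise ValueError("total_tokens must be a positive integer")
    # Integer ceiling division: any partial chunk counts as a full prompt.
    return (total_tokens + chunk_size - 1) // chunk_size


if __name__ == "__main__":
    # The worked example from the table: 6,500 tokens is 3.25 chunks, rounded up.
    print(metered_prompts(6_500))  # 4
```

The sketch returns a prompt count only. Converting that count into credits depends on the billing category and the rates in your contract, which this page does not specify.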

