Safety and Security - Prompt Injection Detection Control

Control Name

Einstein Trust Layer - Prompt Injection Detection

Control Overview

Detects and mitigates prompt injection attacks where users attempt to overwrite system instructions to force the AI into unintended or malicious behaviors.

Description

Monitors the prompt journey to identify adversarial patterns, such as "ignore previous instructions" or "system override" commands, blocking or flagging the request before it reaches the LLM.
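As a rough illustration of the pattern-based screening described above, the sketch below checks a prompt against a short list of adversarial phrases and blocks it before it is forwarded to a model. It is a minimal sketch, not the Einstein Trust Layer's implementation: the pattern list, `scan_prompt`, `forward_if_clean`, and the `send_to_llm` callable are all hypothetical names introduced for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern list for illustration only; it is NOT the rule set
# used by the Einstein Trust Layer.
ADVERSARIAL_PATTERNS = [
    re.compile(r"\bignore\s+(all\s+)?(the\s+)?previous\s+instructions\b", re.IGNORECASE),
    re.compile(r"\bsystem\s+override\b", re.IGNORECASE),
]


@dataclass
class ScanResult:
    flagged: bool
    matches: list = field(default_factory=list)


def scan_prompt(prompt: str) -> ScanResult:
    """Report which of the illustrative patterns appear in the prompt."""
    matches = [p.pattern for p in ADVERSARIAL_PATTERNS if p.search(prompt)]
    return ScanResult(flagged=bool(matches), matches=matches)


def forward_if_clean(prompt: str, send_to_llm) -> str:
    """Block a flagged prompt; otherwise pass it to the model callable.

    `send_to_llm` is a hypothetical callable standing in for the model call.
    """
    if scan_prompt(prompt).flagged:
        return "BLOCKED"
    return send_to_llm(prompt)
```

Fixed phrase lists like this are easy to evade with paraphrasing, so they are best read as a picture of where the check sits in the request path rather than as a complete defense.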

Recommended Configuration

Enable "Prompt Injection Detection" in the Einstein Trust Layer settings. Make sure that the Einstein Audit Trail is enabled to log events for injection attempts.

Security Impact

Prevents the AI from being reprogrammed by a user to leak internal data, generate prohibited content, or bypass the ethical boundaries established in the Prompt Template.

Business Impact

Maintains the integrity of AI-driven business processes and prevents operational disruptions caused by users manipulating AI logic for personal or malicious gain.

Security Risk If Not Configured

The LLM may follow malicious user instructions over the System Prompt, leading to unauthorized data disclosure, social engineering, or the execution of unauthorized workflows.

Threat Scenarios

Prompt Injection: A user tricks the AI into performing unwanted actions, leading to data leakage.

Estimated CVSS Score Range

Critical (9.0–10.0).
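For context on where this range sits, the sketch below maps a numeric score to its qualitative band. The article does not say which CVSS version it uses; the thresholds here assume the CVSS v3.x qualitative severity rating scale, and `cvss_severity` is a hypothetical helper name.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS base score to its qualitative rating.

    Assumes the CVSS v3.x qualitative severity rating scale; the article
    does not state which CVSS version applies.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"
```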

Risk Impact Considerations

Extreme risk for agentic workflows where the AI can execute actions (for example, updating records or sending emails) based on user prompts.

Higher Risk When

Using external or third-party LLMs without the Salesforce Trust Layer as an intermediary, or when Prompt Templates are poorly constructed and give the model little context.

Low Risk When

Prompt injection detection is active, the org uses Salesforce-hosted models (which have built-in defenses), and the principle of least privilege is applied to the AI's data access.

Business and Integration Considerations

Strict detection may occasionally flag complex, legitimate user prompts as adversarial. Admins should monitor the Einstein Trust Layer audit and feedback data to fine-tune sensitivity.
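One way to reason about the sensitivity trade-off is to compare, for a candidate threshold, how many legitimate prompts would be flagged against how many real attacks would be missed. The sketch below assumes each reviewed event carries a numeric detector score and a reviewer verdict. Both fields, the `ReviewedEvent` type, and the two helper functions are hypothetical and do not reflect the actual Trust Layer audit schema or settings.

```python
from dataclasses import dataclass


@dataclass
class ReviewedEvent:
    score: float     # hypothetical detector score in [0, 1]
    is_attack: bool  # hypothetical reviewer verdict after manual triage


def false_positive_rate(events, threshold: float) -> float:
    """Share of legitimate prompts that would be flagged at this threshold."""
    benign = [e for e in events if not e.is_attack]
    if not benign:
        return 0.0
    flagged = sum(1 for e in benign if e.score >= threshold)
    return flagged / len(benign)


def missed_attack_rate(events, threshold: float) -> float:
    """Share of confirmed attacks that would pass unflagged at this threshold."""
    attacks = [e for e in events if e.is_attack]
    if not attacks:
        return 0.0
    missed = sum(1 for e in attacks if e.score < threshold)
    return missed / len(attacks)
```

Raising the threshold lowers the false positive rate but raises the missed attack rate, so any adjustment is best checked against both numbers on the same set of reviewed events.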

Security Health Review Guidance

Security Health Review scans the Einstein Trust Layer Setup to confirm that prompt injection detection is enabled.

Who Is Impacted

Security teams, AI architects, developers, and end users who interact with Prompt Builder or Agentforce agents.
