Document AI
Use Document AI in Data 360 to extract structured data from unstructured documents like invoices, resumes, lab reports, and purchase orders.
Document AI supports two processing modes.
- Real-time (Transactional) Mode: Invoke Document AI APIs directly for documents in Salesforce or sent at run time. This mode doesn't require you to create unstructured data model objects.
- Batch Mode: To process large volumes of documents from external sources, create unstructured data model objects (UDMO) that reference the files. Data 360 has several connectors built to ingest unstructured data from third-party sources.
Document AI uses AI to extract data from documents into data lake objects (DLOs), which represent the schemas for the extracted data. After extraction, your data is available for search index processing and for retrieval from automation flows or AI agent actions.
Example Use Cases for Document AI
With Document AI, you can improve your business processes across the Salesforce platform. These use cases show how teams working in various clouds can use Document AI.
| Cloud | Use Case |
|---|---|
| Sales Cloud | Ingest contract documents and link them to customer profiles to centralize all contract data. |
| Sales Cloud | Ingest invoices to programmatically compare them against data extracted from related loan applications. |
| Service Cloud | Ingest guest survey data, create customer records, and follow up on future opportunities. |
| Employee Service | Ingest tax return documents and link them to employee records. |
| Health Cloud | Extract data from patient lab reports, which can be handwritten or system generated. |
Supported Large Language Models (LLMs)
Document AI supports these LLMs.
| Model Provider | Model Family |
|---|---|
| OpenAI | GPT-4o (GPT 4 Omni) |
| Vertex AI (Google) | Gemini 2.0 Flash |
| Vertex AI (Google) | Gemini 2.5 Flash |
Document AI in Government Cloud supports only the OpenAI GPT-4o (GPT 4 Omni) model hosted in Microsoft Azure. For more information, see Agentforce in Government Cloud.
Supported File Types
Document AI supports these file formats.
- PDFs
- PNG and JPEG
See the Document AI table on Data Cloud Limits and Guidelines for specific limits.
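As a rough illustration, a pre-upload check for the formats listed above could look like the following sketch. The extension set and the `is_supported` helper are assumptions made for this example and are not part of any Salesforce API. Size and page limits are not checked here.

```python
from pathlib import Path

# Assumed mapping of the supported formats (PDF, PNG, JPEG) to file extensions.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension matches a supported format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["invoice.PDF", "scan.jpeg", "notes.docx"]:
        print(name, is_supported(name))
```

A check like this only filters by file name. It doesn't confirm that the file content is a valid PDF or image.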
Billing Considerations for Document AI
Use of Document AI has these billing implications.
- LLM gateway calls from Document AI are counted as standard Einstein requests.
- Usage of Document AI is calculated based on the amount of unstructured data that is processed.
See Billing Considerations for Unstructured Data and Search Index for more on billing.
Document AI in Connect API
You can use the Data Cloud Connect API to create Document AI configurations. See the Connect API reference for more.
Other Limits
Note these other limits for using Document AI.
- Single schema extraction: You can upload only one type of source document for a given DLO.
- Nested DLOs: Document AI supports only one level of nesting, with a maximum of five nested DLOs.
- Field Limits: A primary DLO can contain a maximum of 50 fields. A nested DLO can contain a maximum of 25 fields.
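The nesting and field limits above can be checked before you build a schema. The following is a minimal sketch. The `PrimaryDlo` and `NestedDlo` classes and the `check_limits` function are hypothetical names for this example, not Document AI types. It also assumes that fields in nested DLOs are counted separately from the primary DLO's 50 fields.

```python
from dataclasses import dataclass, field
from typing import List

# Limits taken from the list above.
MAX_PRIMARY_FIELDS = 50
MAX_NESTED_FIELDS = 25
MAX_NESTED_DLOS = 5


@dataclass
class NestedDlo:
    # A nested DLO has no children of its own, so the types allow
    # only one level of nesting.
    name: str
    fields: List[str]


@dataclass
class PrimaryDlo:
    name: str
    fields: List[str]
    nested: List[NestedDlo] = field(default_factory=list)


def check_limits(dlo: PrimaryDlo) -> List[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if len(dlo.fields) > MAX_PRIMARY_FIELDS:
        problems.append(
            f"{dlo.name}: {len(dlo.fields)} fields exceeds {MAX_PRIMARY_FIELDS}"
        )
    if len(dlo.nested) > MAX_NESTED_DLOS:
        problems.append(
            f"{dlo.name}: {len(dlo.nested)} nested DLOs exceeds {MAX_NESTED_DLOS}"
        )
    for child in dlo.nested:
        if len(child.fields) > MAX_NESTED_FIELDS:
            problems.append(
                f"{child.name}: {len(child.fields)} fields exceeds {MAX_NESTED_FIELDS}"
            )
    return problems


if __name__ == "__main__":
    invoice = PrimaryDlo(
        name="Invoice",
        fields=["invoice_number", "total"],
        nested=[NestedDlo(name="LineItem", fields=["sku", "quantity"])],
    )
    print(check_limits(invoice))
```

Returning every violation, instead of stopping at the first one, lets you fix a draft schema in a single pass.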

For more on working with Document AI, see these topics.
- Create a Document Schema Config (Auto-extraction): Configure a document schema to represent data extracted from your source files into a Data Lake Object (DLO). Use the auto-extraction option in the document schema builder to have an LLM define the structure of the document schema. The builder auto-populates all fields, tables, and columns.
- Create a Document Schema Config (Manually): Configure a document schema to represent data extracted from your source files into a Data Lake Object (DLO). Use the document schema builder to manually define the structure of the document schema by adding fields and tables.
- Create a Document Schema Config (Without Source): Configure a document schema to use in programmatic API workflows. Use the document schema builder to define a new document schema.
- Manage Document AI Configurations: You can view, delete, and rebuild Document AI configurations. You can also package them in data kits.
- Document AI Example Schemas: These prompts are JSON-formatted schemas that you can use when creating a Document AI configuration in Data 360. Use these examples as starting points, or modify them for your specific use cases.

