          Unstructured Data File Formats and Connectors

          Data 360 supports multiple file formats and connector types for unstructured data.

          Supported File Formats for Unstructured Data

          Data 360 supports these file formats for unstructured data and search index configurations.

          Text-Based Files
          File Format | File Extension | Search Indexing Notes | Harmonization Notes
          PDF | .pdf | Limited support for content formatted as tables. | See the PDF harmonization notes after this table.
          Text | .txt | - | -
          HTML | .html | Limited support for content formatted as tables. | Supported
          CSV | .csv | Each row is a chunk. | -
          Log | .log | Each row is a chunk. | -
          Excel | .xlsx | Limited support for content formatted as tables. | -
          PowerPoint | .pptx | Chunked by heading. Limited support for content formatted as tables. | -
          Rich text format | .rtf | Chunked by heading or paragraph. Limited support for content formatted as tables. | -
          Word | .docx | Chunked by heading or paragraph. Limited support for content formatted as tables. | Supported
          Messages | .msg | Chunked by body and attachments. | -
          Email | .eml | Chunked by body, headers, and attachments. | -
          XML | .xml | Chunked by XML element tag. | -
          Markdown | .md | - | Supported

          PDF harmonization notes

          These elements are supported:

          • Tables: Plain text only
          • Text: Plain text, icons, images, links, lists, and headers
          • Lists: Simple lists and single indentation lists
          • Orientation: Left to right, top down, and up to two-columned text (partially)

          These elements are not supported:

          • Tables: Icons, images, inner tables, lists in cells, multiple tables, merged cells, and deep indentations beyond a single level
          • Text: Code blocks, ligatures (two or more letters joined together as a single symbol), and non-Latin letters
          • Lists: Deep nested lists beyond a single level
          • Orientation: Right to left, rotated documents (non-vertical), and text with more than two columns
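
To make the "Each row is a chunk" note for CSV and log files concrete, here is a minimal sketch of row-based chunking. It only illustrates the idea and is not how Data 360 implements chunking; the function names `chunk_csv_rows` and `chunk_log_lines` are hypothetical and exist only in this example.

```python
import csv
import io


def chunk_csv_rows(text: str) -> list[list[str]]:
    """Return one chunk per CSV row, skipping empty rows.

    Hypothetical helper: csv.reader keeps a quoted field that spans
    several physical lines inside a single row.
    """
    reader = csv.reader(io.StringIO(text))
    return [row for row in reader if row]


def chunk_log_lines(text: str) -> list[str]:
    """Return one chunk per non-blank log line (hypothetical helper)."""
    return [line for line in text.splitlines() if line.strip()]
```

With this approach, a CSV file with three data rows produces three chunks, regardless of how long each row is.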
          Note
          Data 360 content harmonization supports only the Word, PDF, HTML, and Markdown file formats, and only from unstructured data model objects (UDMOs). It doesn't support standard data model objects (DMOs), including those that have a file attachment UDMO. Other ingested file types aren't harmonized; you can view them in their source platform, linked from the Content Viewer.
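
The table and note above can be summarized as a small lookup for pre-checking files before ingestion. This is an illustrative sketch, not part of any Data 360 API; the names `TEXT_EXTENSIONS`, `HARMONIZED_EXTENSIONS`, `is_supported_text_file`, and `is_harmonizable` are assumptions made for this example.

```python
from pathlib import Path

# Text-based extensions listed in the table above.
TEXT_EXTENSIONS = {
    ".pdf", ".txt", ".html", ".csv", ".log", ".xlsx", ".pptx",
    ".rtf", ".docx", ".msg", ".eml", ".xml", ".md",
}

# Formats the note names as eligible for content harmonization.
HARMONIZED_EXTENSIONS = {".docx", ".pdf", ".html", ".md"}


def is_supported_text_file(filename: str) -> bool:
    """True if the extension appears in the text-based file table."""
    return Path(filename).suffix.lower() in TEXT_EXTENSIONS


def is_harmonizable(filename: str, from_udmo: bool) -> bool:
    """True only for Word, PDF, HTML, or Markdown files that come from a UDMO."""
    return from_udmo and Path(filename).suffix.lower() in HARMONIZED_EXTENSIONS
```

For example, `is_harmonizable("guide.md", from_udmo=True)` is true, while the same file from a standard DMO is not harmonized.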
          Audio, Video, and Image Files
          File Format | File Extension | Search Indexing Notes
          MPEG | .mpg | Audio only
          MPEG-3 | .mp3 | Audio only
          MPEG-4 | .mp4 | Audio only
          MPGA | .mpga | Audio only
          Ogg | .ogg | Audio only
          WAV | .wav | Audio only
          FLAC | .flac | Audio only
          WebM | .webm | Audio only
          JPEG (Beta) | .jpg | Image-based files. Embeddings include extracted text and captions. The maximum size of an image that can be processed is 20 MB.
          PNG (Beta) | .png | Image-based files. Embeddings include extracted text and captions. The maximum size of an image that can be processed is 20 MB.
          PDF (Beta) | .pdf | PDFs with images. The maximum size of a PDF that can be processed is 100 MB.
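
The size limits in the table above lend themselves to a quick pre-upload check. The sketch below is a hypothetical helper, not a Data 360 API. It assumes decimal megabytes (1 MB = 1,000,000 bytes), which the article doesn't specify, and the names `IMAGE_SIZE_LIMITS`, `AUDIO_ONLY_EXTENSIONS`, and `check_media_file` are invented for illustration.

```python
from pathlib import Path

MB = 1_000_000  # Assumption: decimal megabytes; the article doesn't say which unit applies.

# Maximum processable sizes from the table (Beta formats).
IMAGE_SIZE_LIMITS = {".jpg": 20 * MB, ".png": 20 * MB, ".pdf": 100 * MB}

# Formats for which only the audio is indexed.
AUDIO_ONLY_EXTENSIONS = {
    ".mpg", ".mp3", ".mp4", ".mpga", ".ogg", ".wav", ".flac", ".webm",
}


def check_media_file(filename: str, size_bytes: int) -> str:
    """Classify a file against the audio, video, and image table."""
    ext = Path(filename).suffix.lower()
    if ext in AUDIO_ONLY_EXTENSIONS:
        return "audio only"
    limit = IMAGE_SIZE_LIMITS.get(ext)
    if limit is None:
        return "not listed"
    return "ok" if size_bytes <= limit else "too large"
```

A caller could run this check before uploading and skip or resize any file reported as "too large".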

          Unstructured Data Connectors

          Data 360 has several connectors built for ingesting unstructured data from third-party applications or sources. You can also create a data lake object to ingest unstructured data through an existing connector.

          Additionally, you can ingest unstructured data from file attachments on Salesforce objects or upload files directly into Agentforce Data Libraries.

          Unstructured Data Limits and Guidelines

          For limits and guidelines related to supported file types, see Data 360 Limits and Guidelines.

          For billing info, see Billing Considerations for Unstructured Data and Search Index.

          Monitor Unstructured Data Connectors

          For methods of monitoring your unstructured data streams and ingestion progress, see Monitor the Status of Unstructured Connectors Data Ingestion.

          Salesforce Help | Article