Enriched Indexing of Content Chunks
When you create a vector or hybrid search index from unstructured data in Data 360, you can enrich the generated content chunks by extracting metadata from the content. Extracted metadata includes keywords, entities, topic overviews, questions answered by the content, and content summaries.
Chunk enrichment improves retrieval accuracy, especially when you can’t use prepend fields (for example, in a UDMO path) and in question-and-answer agent actions. Enriched content chunks provide an alternative to intensive content curation because the LLM-generated content helps retrieval identify the right chunks.
To use enriched content chunks, first create a search index and enable enriched content chunks. In AI Models, create a new custom retriever. When prompted for the search index, select a search index that was created using the enrich chunks option.
To get the most value from an enriched index, access it through a retriever. To use the metadata chunks, a retriever applies extra logic, such as reranking and intent alignment. Calling the index directly through search APIs bypasses the retriever and results in retrieval behavior closer to standard vector or hybrid search. Although introducing a retriever sometimes adds latency, it enables higher retrieval accuracy by improving how queries map to relevant, answer-ready content.
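As a rough illustration of why a retriever layer matters, the sketch below collapses raw index hits from different chunk types back to the content chunk they came from and ranks each source by its best score. This is not the Data 360 retriever logic, which is not documented here. The function name, the tuple layout, the `ORIGINAL` label, and the scoring rule are all assumptions made for the example.

```python
def collapse_hits(hits):
    """Group raw hits by source chunk and rank sources by their best score.

    `hits` is a list of (source_chunk_id, chunk_type, score) tuples.
    This layout is hypothetical and is not a Data 360 API response format.
    """
    best = {}
    for source_id, _chunk_type, score in hits:
        # Keep the highest score seen for each source chunk, whichever
        # chunk type (original text, metadata, or question) produced it.
        if source_id not in best or score > best[source_id]:
            best[source_id] = score
    # Highest-scoring source chunks first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


raw_hits = [
    ("chunk-a", "ORIGINAL", 0.4),
    ("chunk-b", "ORIGINAL", 0.6),
    ("chunk-a", "QUESTION", 0.9),
]
ranked = collapse_hits(raw_hits)
```

In this toy input, a direct search over original text alone would rank `chunk-b` first, while the question chunk lifts `chunk-a` to the top once hits are collapsed.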
Use of this feature in Data Cloud consumes Flex Credits. For more information, see Billing Considerations for Enriched Index, or contact your account executive.
Contents of Enriched Chunks
When you enable enriched content on a vector or hybrid search index, Data 360 generates three chunks: one chunk that contains the original chunk text, one chunk that contains metadata text, and one chunk that contains questions that the chunk can answer. Retrievers access these chunks to use in RAG and AI agent workflows.
| Chunk Type | Metadata Type | Description |
|---|---|---|
| METADATA | Keywords | Extracts keywords that uniquely identify the content. |
| METADATA | Entities | Identifies and lists verbatim the entities that appear in the content. |
| METADATA | Topics | Extracts the main topics discussed in the chunk. Includes important entities and keywords in the topic names. |
| METADATA | Title | Generates a concise and informative title summarizing the content. |
| METADATA | Summary | Provides a summary of the content, highlighting key topics and entities. Where possible, the summary is between 100 and 250 words. |
| QUESTION | Questions | If applicable, compiles a comprehensive set of detailed questions that the content in the text chunk can answer. |
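To make the three-chunk structure concrete, here is a minimal sketch of how the chunks for one piece of content could be modeled. It is an illustration only, not the Data 360 storage schema. The class and function names are hypothetical, and the `ORIGINAL` label is an assumption, because this page names only the METADATA and QUESTION chunk types.

```python
from dataclasses import dataclass
from enum import Enum


class ChunkType(Enum):
    ORIGINAL = "ORIGINAL"  # assumed label for the original chunk text
    METADATA = "METADATA"
    QUESTION = "QUESTION"


@dataclass
class EnrichedChunk:
    source_chunk_id: str   # ties all three chunks back to one content chunk
    chunk_type: ChunkType
    text: str


def build_enriched_chunks(chunk_id, original_text, metadata, questions):
    """Return the three chunks for one original content chunk.

    `metadata` maps a metadata type (Keywords, Entities, Topics, Title,
    Summary) to its extracted text; `questions` is a list of strings.
    """
    metadata_text = "\n".join(f"{name}: {value}" for name, value in metadata.items())
    question_text = "\n".join(questions)
    return [
        EnrichedChunk(chunk_id, ChunkType.ORIGINAL, original_text),
        EnrichedChunk(chunk_id, ChunkType.METADATA, metadata_text),
        EnrichedChunk(chunk_id, ChunkType.QUESTION, question_text),
    ]


chunks = build_enriched_chunks(
    "chunk-001",
    "Reset the router by holding the power button for ten seconds.",
    {"Keywords": "router, reset", "Title": "Resetting the Router"},
    ["How do I reset the router?"],
)
```

Keeping a shared `source_chunk_id` on every chunk is one way a downstream consumer could map a metadata or question match back to the original text.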
Data Processing for Enriched Indexing
Enriched indexes are supported in all regions where Data 360 is available. Data for enriched indexes is processed through Amazon Bedrock managed services in these regions:
- us-west-2: US West (Oregon)
- us-east-1: US East (Virginia)
- eu-central-1: Europe (Frankfurt)
- eu-central-2: Europe (Zurich)
- eu-west-3: Europe (Paris)
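If you need to refer to this list in a script, for example in a data residency check, the regions above can be captured as a simple lookup. This is a hedged sketch: the constant and function names are hypothetical, and the mapping only mirrors the list on this page rather than any value returned by a Salesforce or AWS API.

```python
# Region codes and names copied from the list above.
BEDROCK_PROCESSING_REGIONS = {
    "us-west-2": "US West (Oregon)",
    "us-east-1": "US East (Virginia)",
    "eu-central-1": "Europe (Frankfurt)",
    "eu-central-2": "Europe (Zurich)",
    "eu-west-3": "Europe (Paris)",
}


def is_processing_region(region_code):
    """Return True if the region code is in the processing list above."""
    return region_code.strip().lower() in BEDROCK_PROCESSING_REGIONS


def region_name(region_code):
    """Return the display name for a listed region code, or None if unlisted."""
    return BEDROCK_PROCESSING_REGIONS.get(region_code.strip().lower())
```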
Enable Amazon Bedrock Models (Amazon and Anthropic) in Einstein Setup to use enriched indexing. For more information, see Amazon Bedrock documentation.
- Billing Considerations for Enriched Index
Enriched indexing affects credit consumption and billing for orgs operating Data 360 under a Data Cloud license. With enriched search indexes, you process unstructured data and make calls to an LLM to generate enriched chunks.

