Retriever Metrics and Results

Review the retriever metrics and the retrieved data chunks to understand the results of your test. To get an optimal response, modify the retriever configuration and the test data to try different test configurations. Then save your configuration as a new version of the retriever.

Retriever Metrics

You can test a retriever in a basic mode and an advanced mode. When you test a retriever in the advanced mode, the retriever playground generates a context recall score based on a reference answer. Testing retrievers in the advanced mode consumes flex credits.

Retriever Results and Metrics in Advanced Test Mode

The image shows a retriever playground in the advanced test mode and a sample configuration for a retriever. The test retriever panel shows a sample test question and an expected answer. The retriever test panel is empty.

Metric	Description	More Details
Source Recall Score	Available only in Basic mode. Measures the proportion of test data chunks that appear among the top retrieved chunks, out of the total number of chunks in the test data.	For example, in the Retriever Metrics and Results In Basic Test Mode image, the test data contains 49 chunks, of which only 4 are among the chunks retrieved by the retriever. So, the calculated Source Recall Score is 4/49 ≈ 0.08. A Source Recall Score of zero means that none of the retrieved chunks come from the test data. Because the total number of test data chunks is the denominator, the score decreases as the test data grows, even when the retrieved chunks and the test data chunks overlap.
Response Time	The time in milliseconds for the retriever to respond with results.	For example, 2589 ms.
Context Recall	Available only in Advanced mode. Measures how many of the relevant documents were successfully retrieved based on a reference answer.	Higher recall means fewer relevant documents were missed. For more information, see Ragas Context Recall.
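
The Source Recall Score arithmetic described in the table can be sketched in a few lines of Python. This is a minimal illustration that assumes chunks are compared by ID; the function name and the chunk IDs are hypothetical and aren't part of the retriever playground.

```python
def source_recall_score(test_chunk_ids, retrieved_chunk_ids):
    """Share of test data chunks that also appear among the retrieved chunks.

    Hypothetical helper: assumes chunks can be matched by a unique ID.
    """
    test_ids = set(test_chunk_ids)
    if not test_ids:
        raise ValueError("test data must contain at least one chunk")
    matched = test_ids & set(retrieved_chunk_ids)
    return len(matched) / len(test_ids)


# Made-up IDs mirroring the example: 49 test data chunks, 4 of them retrieved.
test_chunks = [f"chunk-{i}" for i in range(49)]
retrieved = test_chunks[:4] + [f"other-{i}" for i in range(16)]

score = source_recall_score(test_chunks, retrieved)
print(round(score, 2))  # 0.08
```

Because the denominator is the size of the test data, adding more test data chunks lowers the score unless the retriever also returns more of them.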

Retrieved Chunks Metrics

Retriever Metrics and Results In Basic Test Mode

The image shows retriever metrics and retrieved chunk details for a sample test question. The retriever metrics show the Source Recall Score, which is available in the basic test mode.

Metric	Description	More Details
# Chunks to Display	Determines the maximum number of test data chunks to show in the tables.	For example, in the image, the table shows 10 out of 49 test data chunks.
All Chunks	The table shows data on all source chunks, chunks in test data, chunks selected by the retriever, chunk score, and source for the chunk.	For example, in the image, the test question returns 26 source chunks.
Selected by Retriever	The table shows filtered column data on the source chunks that the retriever selects.	The retriever selected 20 chunks from all source chunks returned.
Test Data Chunks	The table shows filtered column data on test data chunks.	For example, in the image, the total number of chunks in the test data is 49.
Chunks	Document chunks retrieved by the retriever. These chunks contain the response to the question.
Chunk Score	Score value for the chunk that determines relevance ranking.
Source	The source data such as a data model object (DMO) record, a file path, or a knowledge article version.
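
The relationship between the table views and the columns above can be sketched as a simple filter over chunk rows. This is only an illustration under stated assumptions: the `ChunkRow` fields, the `chunk_view` function, the view names, and the ordering by descending chunk score are hypothetical, and the sample data is made up to match the counts in the example (26 source chunks, 20 selected by the retriever).

```python
from dataclasses import dataclass


@dataclass
class ChunkRow:
    chunk: str                   # chunk text
    chunk_score: float           # relevance score used for ranking
    source: str                  # e.g. a DMO record, file path, or article version
    selected_by_retriever: bool
    in_test_data: bool


def chunk_view(rows, view="all", chunks_to_display=10):
    """Filter rows for one view, rank by score, and cap the number shown."""
    filters = {
        "all": lambda r: True,
        "selected": lambda r: r.selected_by_retriever,
        "test_data": lambda r: r.in_test_data,
    }
    if view not in filters:
        raise ValueError(f"unknown view: {view!r}")
    kept = [r for r in rows if filters[view](r)]
    # Assumption: higher chunk scores rank first.
    kept.sort(key=lambda r: r.chunk_score, reverse=True)
    return kept[:chunks_to_display]


# Made-up sample: 26 source chunks, the first 20 selected by the retriever.
rows = [
    ChunkRow(
        chunk=f"text {i}",
        chunk_score=1.0 - i * 0.01,
        source=f"doc-{i}.pdf",
        selected_by_retriever=i < 20,
        in_test_data=i in (0, 5, 10, 15),
    )
    for i in range(26)
]

print(len(chunk_view(rows, "all")))             # 10
print(len(chunk_view(rows, "selected", 100)))   # 20
```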
