evals.retrievals

Helper functions for evaluating the retrieval step of retrieval-augmented generation.

classify_relevance(query: str, document: str, model_name: str) -> bool | None

Given a query and a document, determines whether the document contains an answer to the query.

Parameters:
  • query (str) – The query text.
  • document (str) – The document text.
  • model_name (str) – The name of the OpenAI API model to use for the classification.

Returns:

A boolean indicating whether the document contains an answer to the query (True meaning relevant, False meaning irrelevant), or None if the LLM produces an unparseable output.

Return type:

Optional[bool]
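
The three-way return value (True, False, or None) can be illustrated with a small parsing sketch. This is not the library's implementation: the helper name parse_relevance_label and the label strings "relevant" and "irrelevant" are assumptions made for illustration, since the prompt and expected model output are not shown in this section.

```python
from typing import Optional

# Hypothetical label strings; the actual prompt and labels used by
# classify_relevance are not documented in this section.
RELEVANT_LABEL = "relevant"
IRRELEVANT_LABEL = "irrelevant"


def parse_relevance_label(raw_output: str) -> Optional[bool]:
    """Map raw LLM text to True, False, or None when it cannot be parsed."""
    label = raw_output.strip().lower()
    if label == RELEVANT_LABEL:
        return True
    if label == IRRELEVANT_LABEL:
        return False
    # Anything else is treated as unparseable output.
    return None
```

Returning None instead of raising lets a caller keep the retrieved documents in order and pass the unknown entries through to a later metric computation.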

compute_precisions_at_k(relevance_classifications: List[bool | None]) -> List[float | None]

Given a list of relevance classifications, computes precision@k for k = 1, 2, …, n, where n is the length of the input list.

Parameters:

relevance_classifications (List[Optional[bool]]) – A list of relevance classifications for a set of retrieved documents, sorted by order of retrieval (i.e., the first element is the classification for the first retrieved document, the second element is the classification for the second retrieved document, etc.). The list may contain None values, which indicate that the relevance classification for the corresponding document is unknown.

Returns:

A list of precision@k values for k = 1, 2, …, n, where n is the length of the input list. The first element is the precision@1 value, the second element is the precision@2 value, etc. None values in the input are omitted when computing precision@k, so each value is based only on the documents among the first k whose classification is known.

Return type:

List[Optional[float]]
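
The computation described above can be sketched as a running count over the ranked list. This is a minimal illustration rather than the library's code: the function name precisions_at_k_sketch is hypothetical, and returning None for any k at which no known classification has been seen yet is an assumption inferred from the Optional[float] element type.

```python
from typing import List, Optional


def precisions_at_k_sketch(
    relevance_classifications: List[Optional[bool]],
) -> List[Optional[float]]:
    """Compute precision@k for k = 1..n, skipping None classifications."""
    precisions: List[Optional[float]] = []
    num_relevant = 0  # known-relevant documents seen so far
    num_known = 0     # documents with a non-None classification seen so far
    for classification in relevance_classifications:
        if classification is not None:
            num_known += 1
            num_relevant += int(classification)
        # Assumption: precision is undefined (None) until at least one
        # known classification has been seen.
        precisions.append(num_relevant / num_known if num_known else None)
    return precisions


# Example: the None at position 1 is omitted from both counts.
print(precisions_at_k_sketch([None, True, False]))  # [None, 1.0, 0.5]
```

Under this reading, a None entry leaves the running precision unchanged, so the output list keeps the same length as the input.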