evals.generate

llm_generate(dataframe, template, model, system_instruction=None, verbose=False, output_parser=None, include_prompt=False, include_response=False, run_sync=False, concurrency=None)

Generates text from a template using an LLM. This function is useful for generating synthetic data, such as irrelevant responses.

Parameters:
  • dataframe (pandas.DataFrame) – A pandas dataframe in which each row represents a record to be used as input to the template. All template variable names must appear as column names in the dataframe (extra columns unrelated to the template are permitted).

  • template (Union[PromptTemplate, str]) – The prompt template as either an instance of PromptTemplate or a string. If the latter, the variable names should be surrounded by curly braces so that a call to .format can be made to substitute variable values.

  • model (BaseEvalModel) – The LLM model used for generation, given as an instance of a BaseEvalModel subclass.

  • system_instruction (Optional[str], optional) – An optional system message.

  • verbose (bool, optional) – If True, prints detailed information to stdout such as model invocation parameters and retry info. Default False.

  • output_parser (Callable[[str, int], Dict[str, Any]], optional) – An optional function that takes each generated response and response index and parses it to a dictionary. The keys of the dictionary should correspond to the column names of the output dataframe. If None, the output dataframe will have a single column named “output”. Default None.

  • include_prompt (bool, default=False) – If True, includes a column named prompt in the output dataframe containing the prompt used for each generation.

  • include_response (bool, default=False) – If True, includes a column named response in the output dataframe containing the raw response from the LLM prior to applying the output parser.

  • run_sync (bool, default=False) – If True, forces synchronous request submission. Otherwise evaluations will be run asynchronously if possible.

  • concurrency (Optional[int], default=None) – The number of concurrent evals if async submission is possible. If not provided, a recommended default concurrency is set on a per-model basis.
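The template and output_parser parameters above can be illustrated with a small standalone sketch. It uses only the standard library and does not call llm_generate. The template text, the row contents, and the json_output_parser helper are hypothetical names chosen for illustration, and the sketch assumes the model was asked to reply with a JSON object.

```python
import json
from typing import Any, Dict

# Hypothetical string template: variable names are wrapped in curly braces.
template = "Write an irrelevant answer to the question: {question}"

# One record, as a plain dict standing in for a dataframe row.
# "source" is an extra column that the template does not use.
row = {"question": "What is the capital of France?", "source": "faq"}

# str.format ignores keyword arguments the template does not reference,
# so extra columns do not cause an error.
prompt = template.format(**row)


def json_output_parser(response: str, response_index: int) -> Dict[str, Any]:
    """Hypothetical parser matching Callable[[str, int], Dict[str, Any]].

    The keys of the returned dict would become output column names.
    """
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        # Keep the same keys on failure so every row has the same columns.
        return {"answer": None, "parse_error": True}
    if not isinstance(parsed, dict):
        return {"answer": None, "parse_error": True}
    return {"answer": parsed.get("answer"), "parse_error": False}
```

Returning the same keys on both the success and failure paths is a design choice of this sketch: it keeps the resulting columns uniform even when some responses cannot be parsed.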

Returns:

A dataframe where each row represents the generated output.

Return type:

generations_dataframe (pandas.DataFrame)
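How output_parser, include_prompt, and include_response shape the returned columns can be sketched with a simplified stand-in. This is not the library's implementation: sketch_generate and echo_model are hypothetical names, a list of dicts replaces the dataframe, a plain callable replaces the BaseEvalModel, and retries, system instructions, and async submission are omitted.

```python
from typing import Any, Callable, Dict, List, Optional


def sketch_generate(
    rows: List[Dict[str, Any]],
    template: str,
    model: Callable[[str], str],
    output_parser: Optional[Callable[[str, int], Dict[str, Any]]] = None,
    include_prompt: bool = False,
    include_response: bool = False,
) -> List[Dict[str, Any]]:
    """Simplified stand-in that mirrors the documented column behavior."""
    results = []
    for index, row in enumerate(rows):
        prompt = template.format(**row)  # substitute template variables
        response = model(prompt)         # raw text from the (stub) model
        if output_parser is None:
            record = {"output": response}  # single "output" column
        else:
            record = dict(output_parser(response, index))  # parser keys become columns
        if include_prompt:
            record["prompt"] = prompt
        if include_response:
            record["response"] = response  # raw response, before parsing
        results.append(record)
    return results


def echo_model(prompt: str) -> str:
    """Stub model used only so the sketch runs without an LLM."""
    return "generated for: " + prompt


rows = [{"question": "Why is the sky blue?"}, {"question": "What is 2 + 2?"}]
plain = sketch_generate(rows, "Q: {question}", echo_model)
detailed = sketch_generate(
    rows,
    "Q: {question}",
    echo_model,
    output_parser=lambda text, i: {"length": len(text), "row": i},
    include_prompt=True,
    include_response=True,
)
```

In this sketch, plain holds one "output" key per row, while detailed holds the parser's keys plus "prompt" and "response".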