Helpers#

Span Utilities#

Utilities for generating and regenerating OpenTelemetry-compliant span and trace IDs.

async async_get_input_output_context(client, *, start_time=None, end_time=None, project_name=None, project_identifier=None, timeout=DEFAULT_TIMEOUT_IN_SECONDS)#

Async version of get_input_output_context.

Extracts input/output data with context for RAG evaluation.

Parameters:

client – Phoenix AsyncClient instance.
start_time – Optional start time for filtering spans (inclusive lower bound).
end_time – Optional end time for filtering spans (exclusive upper bound).
project_name – Project name (alias for project_identifier). If not provided, uses the environment variable PHOENIX_PROJECT_NAME.
project_identifier – Project identifier (name or ID). Takes precedence over project_name if both are provided.
timeout – Request timeout in seconds. Defaults to 5.

Returns:

Q&A data with index context.span_id and columns:

context.trace_id: Trace ID
input: Question/query from the root span
output: Answer/response from the root span
context: Concatenated retrieved document content
metadata: Metadata from the root span

Returns None if no spans or retrieval documents are found.

Return type:

Optional[pd.DataFrame]

Examples

Basic usage:

from phoenix.client import AsyncClient
from phoenix.client.helpers.spans import async_get_input_output_context

client = AsyncClient()
qa_df = await async_get_input_output_context(client, project_name="my-rag-app")
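
The precedence between project_identifier, project_name, and the PHOENIX_PROJECT_NAME environment variable can be sketched as below. resolve_project is a hypothetical helper written for illustration and is not part of phoenix.client. What the library does when none of the three is set is not stated here, so the sketch simply returns None in that case.

```python
import os
from typing import Optional


def resolve_project(
    project_name: Optional[str] = None,
    project_identifier: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper mirroring the documented precedence rules."""
    # project_identifier wins when both arguments are given.
    if project_identifier is not None:
        return project_identifier
    # project_name is the alias used when no identifier is given.
    if project_name is not None:
        return project_name
    # Fall back to the environment variable; None if unset (assumption).
    return os.environ.get("PHOENIX_PROJECT_NAME")
```

Passing project_identifier explicitly is the least ambiguous option, since it does not depend on the process environment.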

async async_get_retrieved_documents(client, *, start_time=None, end_time=None, project_name=None, project_identifier=None, timeout=DEFAULT_TIMEOUT_IN_SECONDS)#

Async version of get_retrieved_documents.

Extracts retrieved documents from retriever spans for RAG evaluation.

Parameters:

client – Phoenix AsyncClient instance.
start_time – Optional start time for filtering spans (inclusive lower bound).
end_time – Optional end time for filtering spans (exclusive upper bound).
project_name – Project name (alias for project_identifier). If not provided, uses the environment variable PHOENIX_PROJECT_NAME.
project_identifier – Project identifier (name or ID). Takes precedence over project_name if both are provided.
timeout – Request timeout in seconds. Defaults to 5.

Returns:

Retrieved documents with multi-index (context.span_id, document_position) and columns:

context.trace_id: Trace ID
input: Input value from the retriever span
document: Document content
document_score: Document relevance score
document_metadata: Document metadata

Return type:

pd.DataFrame

Examples

Basic usage:

from phoenix.client import AsyncClient
from phoenix.client.helpers.spans import async_get_retrieved_documents

client = AsyncClient()
docs_df = await async_get_retrieved_documents(client, project_name="my-rag-app")

dataframe_to_spans(df)#

Converts a pandas DataFrame (from get_spans_dataframe) back to a list of Span objects.

This utility reconstructs Span objects from the flattened DataFrame structure returned by get_spans_dataframe. It handles the conversion of flattened column names (e.g., ‘context.span_id’) back to nested dictionaries.

Parameters:

df (pd.DataFrame) – A pandas DataFrame, typically returned by get_spans_dataframe. Timestamps in the ‘start_time’ and ‘end_time’ columns must be timezone-aware.

Returns:

A list of Span objects reconstructed from the DataFrame.

Return type:

list[v1.Span]

Raises:

ValueError – If the start_time or end_time columns contain timezone-naive timestamps.
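
Because timezone-naive timestamps raise ValueError, it can help to check values before conversion. The sketch below uses only the standard library. is_aware and ensure_utc are hypothetical helper names, and treating naive values as UTC is an assumption that may not match how your data was recorded.

```python
from datetime import datetime, timezone


def is_aware(ts: datetime) -> bool:
    """Return True if the datetime carries usable timezone information."""
    return ts.tzinfo is not None and ts.tzinfo.utcoffset(ts) is not None


def ensure_utc(ts: datetime) -> datetime:
    """Attach UTC to a naive datetime (assumption: naive values are UTC)."""
    return ts if is_aware(ts) else ts.replace(tzinfo=timezone.utc)


naive = datetime(2024, 1, 1, 12, 0, 0)
aware = ensure_utc(naive)
print(is_aware(naive), is_aware(aware))  # False True
```

For a DataFrame, the same idea applies column-wise to start_time and end_time before calling dataframe_to_spans.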

Examples

Basic usage:

from phoenix.client import Client
from phoenix.client.helpers.spans import dataframe_to_spans

client = Client()

# Get spans as DataFrame
df = client.spans.get_spans_dataframe(
    project_identifier="my-project"
)

# Filter or modify the DataFrame
filtered_df = df[df['span_kind'] == 'LLM']

# Convert back to Span objects
spans = dataframe_to_spans(filtered_df)
print(f"Converted {len(spans)} spans from DataFrame")

# Now you can use these spans with other APIs
result = client.spans.create_spans(
    project_identifier="another-project",
    spans=spans
)
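
The conversion of flattened column names back to nested dictionaries can be illustrated on a single row. This is a minimal sketch of the general technique, not the library's implementation. unflatten is a hypothetical name, and which columns dataframe_to_spans actually splits is not specified here.

```python
from typing import Any


def unflatten(row: dict[str, Any]) -> dict[str, Any]:
    """Turn {"context.span_id": "abc"} into {"context": {"span_id": "abc"}}."""
    nested: dict[str, Any] = {}
    for dotted_key, value in row.items():
        # Everything before the last dot names a parent dictionary.
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


row = {"name": "retrieve", "context.span_id": "abc", "context.trace_id": "def"}
print(unflatten(row))
```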

get_input_output_context(client, *, start_time=None, end_time=None, project_name=None, project_identifier=None, timeout=DEFAULT_TIMEOUT_IN_SECONDS)#

Extracts Q&A data with context for RAG evaluation.

Constructs a DataFrame that combines root span input/output pairs with their associated retrieved documents as context. This is formatted for RAG Q&A and hallucination evaluation with phoenix.evals.

Parameters:

client – Phoenix Client instance.
start_time – Optional start time for filtering spans (inclusive lower bound).
end_time – Optional end time for filtering spans (exclusive upper bound).
project_name – Project name (alias for project_identifier). If not provided, uses the environment variable PHOENIX_PROJECT_NAME.
project_identifier – Project identifier (name or ID). Takes precedence over project_name if both are provided.
timeout – Request timeout in seconds. Defaults to 5.

Returns:

Q&A data with index context.span_id and columns:

context.trace_id: Trace ID
input: Question/query from the root span
output: Answer/response from the root span
context: Concatenated retrieved document content
metadata: Metadata from the root span

Returns None if no spans or retrieval documents are found.

Return type:

Optional[pd.DataFrame]

Examples

Basic usage:

from phoenix.client import Client
from phoenix.client.helpers.spans import get_input_output_context

client = Client()
qa_df = get_input_output_context(client, project_name="my-rag-app")

With phoenix.evals:

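from phoenix.evals import ClassificationEvaluator, evaluate_dataframe
from phoenix.evals.llm import LLM

eval_model = LLM(provider="openai", model="gpt-4o")

# my_qa_evaluator and my_hallucination_evaluator are placeholders for
# evaluators you have already built (for example, ClassificationEvaluator
# instances that use eval_model); they are not defined in this snippet.
qa_df = get_input_output_context(client, project_name="my-rag-app")
if qa_df is not None:
    results = evaluate_dataframe(
        dataframe=qa_df,
        evaluators=[my_qa_evaluator, my_hallucination_evaluator],
    )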

get_retrieved_documents(client, *, start_time=None, end_time=None, project_name=None, project_identifier=None, timeout=DEFAULT_TIMEOUT_IN_SECONDS)#

Extracts retrieved documents from retriever spans for RAG evaluation.

Constructs a DataFrame formatted for RAG retrieval evaluation with phoenix.evals. Each row represents a single retrieved document with its associated metadata.

Parameters:

client – Phoenix Client instance.
start_time – Optional start time for filtering spans (inclusive lower bound).
end_time – Optional end time for filtering spans (exclusive upper bound).
project_name – Project name (alias for project_identifier). If not provided, uses the environment variable PHOENIX_PROJECT_NAME.
project_identifier – Project identifier (name or ID). Takes precedence over project_name if both are provided.
timeout – Request timeout in seconds. Defaults to 5.

Returns:

Retrieved documents with multi-index (context.span_id, document_position) and columns:

context.trace_id: Trace ID
input: Input value from the retriever span
document: Document content
document_score: Document relevance score
document_metadata: Document metadata

Return type:

pd.DataFrame

Examples

Basic usage:

from phoenix.client import Client
from phoenix.client.helpers.spans import get_retrieved_documents

client = Client()
docs_df = get_retrieved_documents(client, project_name="my-rag-app")

With time filtering:

from datetime import datetime, timedelta, timezone

# Use a timezone-aware timestamp so the lower bound is unambiguous
docs_df = get_retrieved_documents(
    client,
    project_name="my-rag-app",
    start_time=datetime.now(timezone.utc) - timedelta(days=1),
)
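
The start_time and end_time parameters describe a half-open interval: the lower bound is inclusive and the upper bound is exclusive. The sketch below shows that check on plain datetimes. in_window is a hypothetical helper for illustration only. The actual filtering is performed by the library and server, not by code like this.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def in_window(
    ts: datetime,
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
) -> bool:
    """Half-open check: start_time <= ts < end_time; None means unbounded."""
    if start_time is not None and ts < start_time:
        return False
    if end_time is not None and ts >= end_time:
        return False
    return True


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = start + timedelta(days=1)
print(in_window(start, start, end))  # True: lower bound is inclusive
print(in_window(end, start, end))    # False: upper bound is exclusive
```

With half-open bounds, consecutive windows that share an endpoint do not count the same span twice.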

uniquify_spans(spans, *, in_place=False)#

Regenerates span and trace IDs for a sequence of Span objects while maintaining parent-child relationships. This is typically used before creating spans with the client, so that every span has a unique OpenTelemetry ID and insertion does not fail on an ID collision.

This utility generates new valid OpenTelemetry-compliant span_ids and trace_ids for a collection of spans. The parent-child relationships within the span collection are preserved by mapping old IDs to new IDs consistently.

Parameters:

spans (Sequence[v1.Span]) – A sequence of Span objects to regenerate IDs for.
in_place (bool) – If True, modifies the original spans. If False (default), creates deep copies of the spans before modification.

Returns:

A list of Span objects with regenerated IDs. If in_place=True, returns the modified input. If in_place=False, returns a deep copy with modifications.

Return type:

list[v1.Span]

Examples

Basic usage:

from phoenix.client import Client
from phoenix.client.helpers.spans import uniquify_spans

client = Client()

# Original spans that may have duplicate IDs
spans = [...]

# Generate new IDs to ensure uniqueness
new_spans = uniquify_spans(spans)

# Now create the spans with guaranteed unique IDs
result = client.spans.create_spans(
    project_identifier="my-project",
    spans=new_spans
)
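
The idea of mapping old IDs to new IDs consistently can be sketched on plain dictionaries. This is an illustration of the technique, not the code behind uniquify_spans. remap_ids is a hypothetical name, and the span shape used here (a context dict with trace_id and span_id, plus a top-level parent_id) is an assumption about the Span structure.

```python
import copy
import secrets
from typing import Any


def remap_ids(spans: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Illustrative only: regenerate IDs on plain dicts, keeping parent links."""
    trace_map: dict[str, str] = {}
    span_map: dict[str, str] = {}
    new_spans = copy.deepcopy(spans)  # leave the input untouched

    # First pass: give every old ID exactly one new ID.
    for span in new_spans:
        ctx = span["context"]
        if ctx["trace_id"] not in trace_map:
            trace_map[ctx["trace_id"]] = secrets.token_hex(16)  # 32 hex chars
        span_map[ctx["span_id"]] = secrets.token_hex(8)  # 16 hex chars

    # Second pass: rewrite IDs and parent references using the same maps.
    for span in new_spans:
        ctx = span["context"]
        parent = span.get("parent_id")
        ctx["trace_id"] = trace_map[ctx["trace_id"]]
        ctx["span_id"] = span_map[ctx["span_id"]]
        if parent is not None:
            # Parents outside the collection keep their old ID (assumption).
            span["parent_id"] = span_map.get(parent, parent)
    return new_spans
```

Building the full map before rewriting anything means a child can appear before its parent in the input and still end up pointing at the parent's new ID.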

uniquify_spans_dataframe(df, *, in_place=False)#

Regenerates span and trace IDs for a pandas DataFrame while maintaining parent-child relationships. This is typically used before creating spans with the client, so that every span has a unique OpenTelemetry ID and insertion does not fail on an ID collision.

This utility generates new valid OpenTelemetry-compliant span_ids and trace_ids for a DataFrame of spans (typically from get_spans_dataframe). The parent-child relationships within the span collection are preserved by mapping old IDs to new IDs consistently.

Parameters:

df (pd.DataFrame) – A pandas DataFrame (typically from get_spans_dataframe) to regenerate IDs for.
in_place (bool) – If True, modifies the original DataFrame. If False (default), creates a deep copy of the DataFrame before modification.

Returns:

A DataFrame with regenerated IDs in the index and columns. If in_place=True, returns the modified input. If in_place=False, returns a deep copy with modifications.

Return type:

pd.DataFrame

Examples

Basic usage:

from phoenix.client import Client
from phoenix.client.helpers.spans import uniquify_spans_dataframe

client = Client()

# Get spans as DataFrame
df = client.spans.get_spans_dataframe(
    project_identifier="my-project"
)

# Generate new IDs for the DataFrame
new_df = uniquify_spans_dataframe(df)

# Use the DataFrame with unique IDs
print(f"Generated {len(new_df)} spans with unique IDs")

SDK Integrations#

OpenAI#

Anthropic#

Google Generative AI#

ATIF (Agent Trajectory Interchange Format)#

ATIF (Agent Trajectory Interchange Format) to Phoenix trace conversion.

Public API: upload_atif_trajectories_as_spans(client, trajectories, *, project_name)

upload_atif_trajectories_as_spans(client, trajectories, *, project_name, timeout=DEFAULT_TIMEOUT_IN_SECONDS)#

Upload one or more ATIF trajectories as spans to Phoenix.

Converts ATIF (Agent Trajectory Interchange Format) trajectory dicts into Phoenix/OpenTelemetry-compatible span trees and uploads them. Supports ATIF schema versions v1.0 through v1.7.

Trace structure

Each trajectory produces one trace. Only agent steps become spans; user and system messages appear as llm.input_messages on the LLM spans that follow them (matching how real instrumented traces work).

Single-turn trajectories are flat. LLM and TOOL spans are siblings under the AGENT — the agent runtime executes tools, not the LLM:
```
AGENT (root — input=user message, output=final agent reply)
  LLM
  TOOL
  LLM
```

Multi-turn trajectories (multiple user messages) get nested AGENT spans, one per turn. A new turn starts at each follow-up user message:

```
AGENT (root — input=first user message, output=final agent reply)
  AGENT turn_1 (input=user msg 1, output=agent reply 1)
    LLM
    TOOL
  AGENT turn_2 (input=user msg 2, output=agent reply 2)
    LLM
```
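
The turn-splitting rule can be sketched as a grouping pass over the steps. This is a simplified illustration rather than the converter's actual logic. split_into_turns is a hypothetical name, and the use of a "source" field with the value "user" to identify user messages is an assumption about the ATIF step shape.

```python
from typing import Any


def split_into_turns(steps: list[dict[str, Any]]) -> list[list[dict[str, Any]]]:
    """Group steps into turns; each follow-up user message opens a new turn."""
    turns: list[list[dict[str, Any]]] = []
    seen_user = False
    for step in steps:
        is_user = step.get("source") == "user"  # assumed ATIF field name
        # The first user message joins the opening turn; later ones start a new one.
        if not turns or (is_user and seen_user):
            turns.append([])
        seen_user = seen_user or is_user
        turns[-1].append(step)
    return turns


steps = [
    {"source": "system", "message": "You are helpful."},
    {"source": "user", "message": "Find the file."},
    {"source": "agent", "message": "Found it."},
    {"source": "user", "message": "Now open it."},
    {"source": "agent", "message": "Opened."},
]
print([len(turn) for turn in split_into_turns(steps)])  # [3, 2]
```

A trajectory with a single user message yields one group, which corresponds to the flat single-turn layout.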

Multi-agent / subagent handoffs

When trajectories in the batch reference each other via subagent_trajectory_ref, the child trajectory’s spans are nested under the parent’s tool span within a single trace. Upload the parent and child trajectories together in one call for linking to work. ATIF v1.7 embedded subagent_trajectories are flattened and uploaded automatically, with trajectory_id used as the canonical embedded reference key:

```
AGENT (parent)
  LLM
  TOOL (delegate_task)
    AGENT (child agent)
      LLM
      TOOL
```

Continuation trajectories

When an agent’s context window is exhausted, Harbor splits the session across files using continued_trajectory_ref. The continuation trajectory gets a session_id ending in -cont-{N}. These are automatically detected and merged into the same trace as the original, so the full agent session appears as one trace. The continuation’s root span is annotated with metadata.is_continuation = True.
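
Detecting the -cont-{N} suffix can be sketched with a regular expression. split_continuation is a hypothetical helper, and the assumption that N is a decimal integer is not confirmed by the text above. The converter's real detection logic may differ.

```python
import re
from typing import Optional

# Matches a trailing "-cont-<N>" suffix, where N is one or more digits.
_CONT_SUFFIX = re.compile(r"-cont-(\d+)$")


def split_continuation(session_id: str) -> tuple[str, Optional[int]]:
    """Return (base_session_id, continuation_index or None)."""
    match = _CONT_SUFFIX.search(session_id)
    if match is None:
        return session_id, None
    return session_id[: match.start()], int(match.group(1))


print(split_continuation("sess-001"))         # ('sess-001', None)
print(split_continuation("sess-001-cont-2"))  # ('sess-001', 2)
```

Grouping trajectories by the base session ID is one way to place a continuation in the same trace as its original.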

Multimodal content (v1.6+)

Messages containing image content parts (type: "image" with a source.path URL) are written using the OpenInference message.contents array format, with image URLs stored in message_content.image.image.url. Text-only messages use the standard message.content string attribute.

Copied context

Steps marked is_copied_context: true (replayed conversation history from a continuation handoff) are included in llm.input_messages as normal messages. LLM spans whose input includes any copied context steps are annotated with metadata.has_copied_context = True.

Deterministic dispatch (v1.7+)

Agent steps with llm_call_count: 0 represent non-LLM orchestration that issued tool calls. These steps do not create synthetic LLM spans; their TOOL spans are still emitted under the AGENT/turn parent.

Attribute mapping

metrics.prompt_tokens / completion_tokens → llm.token_count.prompt / completion / total
metrics.cached_tokens → llm.token_count.prompt_details.cache_read
metrics.cost_usd → llm.cost.total
agent.model_name or step model_name → llm.model_name
agent.tool_definitions → llm.tools.{i}.tool.json_schema
reasoning_content → metadata.reasoning_content
session_id → session.id on all spans
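
The metrics rows of the mapping above can be sketched as a small translation function. metrics_to_attributes is a hypothetical name, and computing the total as prompt plus completion tokens is an assumption. The converter may derive it differently.

```python
from typing import Any


def metrics_to_attributes(metrics: dict[str, Any]) -> dict[str, Any]:
    """Map ATIF step metrics onto the span attribute names listed above."""
    attrs: dict[str, Any] = {}
    prompt = metrics.get("prompt_tokens")
    completion = metrics.get("completion_tokens")
    if prompt is not None:
        attrs["llm.token_count.prompt"] = prompt
    if completion is not None:
        attrs["llm.token_count.completion"] = completion
    if prompt is not None and completion is not None:
        # Assumption: total is the sum of prompt and completion tokens.
        attrs["llm.token_count.total"] = prompt + completion
    if metrics.get("cached_tokens") is not None:
        attrs["llm.token_count.prompt_details.cache_read"] = metrics["cached_tokens"]
    if metrics.get("cost_usd") is not None:
        attrs["llm.cost.total"] = metrics["cost_usd"]
    return attrs


print(metrics_to_attributes({"prompt_tokens": 120, "completion_tokens": 30}))
```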

Deterministic IDs

Trace IDs are derived from the run-scoped session_id when present. For ATIF v1.7 standalone trajectories that omit trajectory_id and do not declare a continuation, a stable document hash is used instead so separate trajectory documents that share a run-scoped session_id do not collapse into one trace. Span IDs use document-scoped trajectory_id when available, with the same v1.7 document-hash fallback to avoid collisions.
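
Deriving an ID deterministically from a stable key can be sketched with a hash. The choice of SHA-256, the key format, and both function names are assumptions made for illustration. The source does not say which hash or key layout the converter uses.

```python
import hashlib


def derive_trace_id(session_id: str) -> str:
    """Derive a stable 128-bit trace ID (32 hex chars) from a session ID."""
    return hashlib.sha256(session_id.encode("utf-8")).hexdigest()[:32]


def derive_span_id(trajectory_id: str, step_index: int) -> str:
    """Derive a stable 64-bit span ID (16 hex chars) for one step."""
    key = f"{trajectory_id}:{step_index}"  # hypothetical key layout
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
```

Under a scheme like this, the same session_id always yields the same trace ID, while different trajectory_id values yield different span IDs even when the step index is the same.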

Known limitation: long conversations

Each LLM span includes the full conversation history up to that point as llm.input_messages attributes. For long multi-turn sessions (roughly 16+ turns with dense tool calls), this can exceed OpenTelemetry attribute size limits, causing spans to be truncated or rejected. This matches the behavior of real-time instrumentors and is a known platform-wide issue, not specific to ATIF conversion.

Parameters:

client – A Phoenix Client instance.
trajectories – A sequence of ATIF trajectory dicts conforming to the ATIF schema (v1.0–v1.7).
project_name – The Phoenix project to upload spans into.
timeout – Request timeout in seconds.

Returns:

The response body from log_spans, containing total_received and total_queued counts.

Raises:

ValueError – If any trajectory fails validation.

Example:

from phoenix.client import Client
from phoenix.client.helpers.atif import (
    upload_atif_trajectories_as_spans,
)

client = Client()
trajectories = [
    {
        "schema_version": "ATIF-v1.4",
        "session_id": "sess-001",
        "agent": {
            "name": "my-agent",
            "version": "1.0",
            "model_name": "gpt-4",
        },
        "steps": [...],
    }
]
result = upload_atif_trajectories_as_spans(
    client, trajectories, project_name="my-project"
)
print(result)  # {"total_received": 5, "total_queued": 5}