Log, query, and update LLM traces programmatically. Upload bulk traces or update evaluations and annotations after the fact.

Key Capabilities

  • List and filter spans for a project
  • Bulk upload traces from offline processing
  • Update evaluations asynchronously (LLM-as-judge patterns)
  • Add human feedback and annotations
  • Attach custom metadata for filtering and analysis
  • Export spans for offline analysis

List Spans

spans.list is currently in ALPHA. A one-time warning is emitted on first use. For downloading large volumes of spans, use export_to_df instead.
List spans for a project within an optional time window. Spans are returned in descending start-time order (most recent first). If start_time and end_time are not provided, the last seven days are queried.
from datetime import datetime

resp = client.spans.list(
    project="your-project-name-or-id",
    start_time=datetime(2024, 1, 1),  # optional
    end_time=datetime(2024, 2, 1),    # optional
    limit=100,
)

for span in resp.spans:
    print(span.span_id, span.name)

Filter Spans

Use the filter parameter to narrow results by status, evaluation labels, annotation labels, or latency:
# Filter by status
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="status_code = 'ERROR'",
)

# Filter by evaluation label
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="eval.Correctness.label = 'correct'",
)

# Filter by annotation label
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="annotation.Quality.label = 'good'",
)

# Filter by latency
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="latency_ms > 1000",
)

# Combine filters with AND / OR
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="status_code = 'ERROR' AND eval.Correctness.label = 'correct'",
)

resp = client.spans.list(
    project="your-project-name-or-id",
    filter="latency_ms > 1000 OR status_code = 'ERROR'",
)
For details on pagination, field introspection, and data conversion (to dict/JSON/DataFrame), see Response Objects.
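If you need to collect more results than a single call returns, a manual pagination loop looks roughly like the sketch below. The field names used here (next_page_token on the response, page_token on spans.list) are assumptions for illustration only; check Response Objects for the actual pagination interface.
# Hypothetical pagination loop; `page_token` and `next_page_token`
# are assumed names, not confirmed API fields.
all_spans = []
page_token = None
while True:
    resp = client.spans.list(
        project="your-project-name-or-id",
        limit=100,
        page_token=page_token,
    )
    all_spans.extend(resp.spans)
    page_token = getattr(resp, "next_page_token", None)
    if not page_token:
        break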

Log Spans

Upload traces in bulk from offline processing or batch evaluation.
import pandas as pd

# Prepare spans DataFrame
spans_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "context.trace_id": "trace-1",
        "name": "llm_call",
        "span_kind": "LLM",
        "start_time": "2024-01-15T10:00:00Z",
        "end_time": "2024-01-15T10:00:02Z",
        "attributes.llm.model_name": "gpt-4",
        "attributes.llm.input_messages": [...],
        "attributes.llm.output_messages": [...],
    },
])

# Optional: include evaluations
evals_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "name": "Correctness",
        "label": "correct",
        "score": 1.0,
    },
])

# Log spans
response = client.spans.log(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=spans_df,
    evals_dataframe=evals_df,  # Optional
)

print(f"Logged spans successfully: {response.status_code}")

Log Spans Only

client.spans.log(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=spans_df,
)

Update Evaluations

Add or update evaluations for existing spans (useful for LLM-as-judge patterns).
import pandas as pd

evals_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "name": "Relevance",
        "label": "relevant",
        "score": 0.95,
        "explanation": "The response directly answers the question.",
    },
    {
        "context.span_id": "span-2",
        "name": "Relevance",
        "label": "not_relevant",
        "score": 0.2,
    },
])

response = client.spans.update_evaluations(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=evals_df,
)

print("Updated evaluations successfully")

Batch Evaluation Pattern

# Run async LLM evaluations on existing traces.
# fetch_recent_traces() and llm_judge are placeholders for your own
# trace-fetching and LLM-as-judge implementations.
async def evaluate_traces():
    # Fetch traces to evaluate
    traces = fetch_recent_traces()

    # Run LLM-as-judge evaluations
    eval_results = []
    for trace in traces:
        score = await llm_judge.evaluate(trace)
        eval_results.append({
            "context.span_id": trace.span_id,
            "name": "Quality",
            "score": score,
        })

    # Upload evaluations
    evals_df = pd.DataFrame(eval_results)
    client.spans.update_evaluations(
        space_id="your-space-id",
        project_name="my-llm-app",
        dataframe=evals_df,
    )
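To drive the coroutine from a synchronous script, use asyncio.run:
import asyncio

asyncio.run(evaluate_traces())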

Update Annotations

Add human feedback and annotations to spans.
import pandas as pd

annotations_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "annotation.Quality.label": "correct",
        "annotation.Quality.score": 1.0,
        "annotation.Quality.text": "Verified by human reviewer",
    },
])

response = client.spans.update_annotations(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=annotations_df,
)

print("Updated annotations successfully")

Update Metadata

Attach or patch custom metadata on existing spans for filtering and analysis. The method uses JSON Merge Patch semantics and supports three input approaches.
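As a rough illustration of the merge behavior (setting aside the None caveat covered under Type Handling below), a patch only touches the keys it names; plain Python dict merging approximates the effect:
# Keys in the patch overwrite or add fields; keys absent from
# the patch are left untouched on the span.
existing = {"customer_id": "cust-123", "region": "us-east"}
patch = {"region": "us-west", "tier": "premium"}

merged = {**existing, **patch}
# merged == {"customer_id": "cust-123", "region": "us-west", "tier": "premium"}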

Method 1: Direct Field Columns

Set individual metadata fields using attributes.metadata.<field> column names. This is the simplest approach.
import pandas as pd

metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "attributes.metadata.customer_id": "cust-456",
        "attributes.metadata.experiment_version": "v2",
        "attributes.metadata.region": "us-west",
    },
    {
        "context.span_id": "span-2",
        "attributes.metadata.customer_id": "cust-789",
        "attributes.metadata.region": "eu-central",
    },
])

response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
)

print(f"Updated: {response['spans_updated']}, Failed: {response['spans_failed']}")

Method 2: Patch Document Column

Provide a JSON patch document per span for more control. The patch is applied after any field columns. The default column name is "patch_document".
metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "patch_document": {"tag": "important", "priority": "high"},
    },
    {
        "context.span_id": "span-2",
        "patch_document": {"tag": "standard"},
    },
])

response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
)
Use a custom column name with the patch_document_column_name parameter:
response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
    patch_document_column_name="my_patch_col",
)

Method 3: Combined Approach

Use both field columns and a patch document. The patch document is applied last and overrides any conflicting field column values.
metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "attributes.metadata.tag": "important",
        "patch_document": {"priority": "high"},  # Applied after field columns
    },
])

response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
)

Type Handling

Python type    Stored as
str            string
int / float    number
bool           string ("True" / "False")
None           JSON null (field is set to null, not removed)
dict / list    JSON string
Setting a field to None stores JSON null; it does not remove the field. This differs from standard JSON Merge Patch, where a null value deletes the key.
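For example, the following patch leaves a hypothetical deprecated_field present on the span, with JSON null as its stored value:
metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        # Stored as JSON null; the key remains in the span's metadata
        "patch_document": {"deprecated_field": None},
    },
])

client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
)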

Response Structure

update_metadata returns a dictionary with the following keys:
Key                Description
spans_processed    Total spans in the input DataFrame
spans_updated      Spans successfully updated
spans_failed       Spans that failed to update
errors             List of {"span_id": ..., "error_message": ...} entries, one per failure

response = client.spans.update_metadata(...)

print(f"Processed: {response['spans_processed']}")
print(f"Updated:   {response['spans_updated']}")
print(f"Failed:    {response['spans_failed']}")

for err in response.get("errors", []):
    print(f"  span {err['span_id']}: {err['error_message']}")

Export Spans

Export spans for offline analysis, custom processing, or archival.
from datetime import datetime

start_time = datetime(2024, 1, 1)
end_time = datetime(2026, 1, 1)

# Export to DataFrame
df = client.spans.export_to_df(
    space_id="your-space-id",
    project_name="my-llm-app",
    start_time=start_time,
    end_time=end_time,
)

print(f"Exported {len(df)} spans")

Export to Parquet

client.spans.export_to_parquet(
    space_id="your-space-id",
    project_name="my-llm-app",
    start_time=start_time,
    end_time=end_time,
    path="./spans_export.parquet",
)
Export capabilities:
  • Time-range filtering
  • DataFrame or Parquet output
  • Efficient Arrow Flight transport for large exports
  • Progress bars for long-running exports
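
Once exported, the Parquet file can be read back with pandas for downstream analysis:
import pandas as pd

df = pd.read_parquet("./spans_export.parquet")
print(df.columns.tolist())  # inspect the available span fields
print(f"Loaded {len(df)} spans")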