## Documentation Index

Fetch the complete documentation index at https://arize-ax.mintlify.dev/docs/llms.txt and use it to discover all available pages before exploring further.

Log, query, and update LLM traces programmatically. Upload traces in bulk, or update evaluations and annotations after the fact.
## Key Capabilities
- List and filter spans for a project
- Bulk upload traces from offline processing
- Update evaluations asynchronously (LLM-as-judge patterns)
- Annotate spans by ID or attach annotations to traces in bulk
- Attach custom metadata for filtering and analysis
- Export spans for offline analysis
- Permanently delete spans by ID
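
All examples on this page assume an authenticated client named client. A minimal setup sketch; the import path and constructor arguments here are assumptions, so check the SDK quickstart for the authoritative form:

```python
# Hypothetical client setup: the import path and constructor signature
# are assumptions -- consult the Arize SDK quickstart for the real form.
from arize import Client

client = Client(api_key="your-api-key")
```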
## List Spans

The spans.list method is currently in ALPHA. The API may change without notice, and a one-time warning is emitted on first use. For downloading large volumes of spans, use export_to_df instead.
List spans for a project within an optional time window. Spans are returned in descending start-time order (most recent first). If start_time and end_time are not provided, the last seven days are queried.
```python
from datetime import datetime

resp = client.spans.list(
    project="your-project-name-or-id",
    start_time=datetime(2024, 1, 1),  # optional
    end_time=datetime(2024, 2, 1),    # optional
    limit=100,
)

for span in resp.spans:
    print(span.span_id, span.name)
```
### Filter Spans
Use the filter parameter to narrow results by status, evaluation labels, annotation labels, or latency:
```python
# Filter by status
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="status_code = 'ERROR'",
)

# Filter by evaluation label
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="eval.Correctness.label = 'correct'",
)

# Filter by annotation label
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="annotation.Quality.label = 'good'",
)

# Filter by latency
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="latency_ms > 1000",
)

# Combine filters with AND / OR
resp = client.spans.list(
    project="your-project-name-or-id",
    filter="status_code = 'ERROR' AND eval.Correctness.label = 'correct'",
)
```
For details on pagination, field introspection, and data conversion (to dict/JSON/DataFrame), see Response Objects.
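
If you only need a quick tabular view, the spans returned by spans.list can also be collected into a DataFrame by hand using the attributes shown above; this sketch uses only span_id and name, and the built-in conversion helpers in Response Objects are the preferred route:

```python
import pandas as pd

# Manual conversion sketch using only the attributes demonstrated above;
# prefer the built-in DataFrame conversion described in Response Objects.
rows = [{"span_id": span.span_id, "name": span.name} for span in resp.spans]
df = pd.DataFrame(rows)
print(df.head())
```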
## Log Spans
Upload traces in bulk from offline processing or batch evaluation.
```python
import pandas as pd

# Prepare spans DataFrame
spans_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "context.trace_id": "trace-1",
        "name": "llm_call",
        "span_kind": "LLM",
        "start_time": "2024-01-15T10:00:00Z",
        "end_time": "2024-01-15T10:00:02Z",
        "attributes.llm.model_name": "gpt-4",
        "attributes.llm.input_messages": [...],
        "attributes.llm.output_messages": [...],
    },
])

# Optional: include evaluations
evals_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "eval.Correctness.label": "correct",
        "eval.Correctness.score": 1.0,
        "eval.Correctness.explanation": "The model's response was accurate.",
    },
])

# Log spans
response = client.spans.log(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=spans_df,
    evals_dataframe=evals_df,  # optional
)
print(f"Logged spans successfully: {response.status_code}")
```
### Log Spans Only
```python
client.spans.log(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=spans_df,
)
```
## Update Evaluations
Add or update evaluations for existing spans (useful for LLM-as-judge patterns).
```python
import pandas as pd

evals_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "eval.Relevance.label": "relevant",
        "eval.Relevance.score": 0.95,
        "eval.Relevance.explanation": "The response directly answers the question.",
    },
    {
        "context.span_id": "span-2",
        "eval.Relevance.label": "not_relevant",
        "eval.Relevance.score": 0.2,
        "eval.Relevance.explanation": "The model's response was not relevant.",
    },
])

response = client.spans.update_evaluations(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=evals_df,
)
print("Updated evaluations successfully")
```
### Batch Evaluation Pattern
```python
import pandas as pd

# Run async LLM evaluations on existing traces
async def evaluate_traces():
    # Fetch traces to evaluate (placeholder for your own retrieval logic)
    traces = fetch_recent_traces()

    # Run LLM-as-judge evaluations (llm_judge is a placeholder evaluator)
    eval_results = []
    for trace in traces:
        score = await llm_judge.evaluate(trace)
        eval_results.append({
            "context.span_id": trace.span_id,
            # Use the eval.<Name>.<field> column format documented above
            "eval.Quality.score": score,
        })

    # Upload evaluations
    evals_df = pd.DataFrame(eval_results)
    client.spans.update_evaluations(
        space_id="your-space-id",
        project_name="my-llm-app",
        dataframe=evals_df,
    )
```
## Annotate Spans

The spans.annotate_spans method is currently in ALPHA. The API may change without notice, and a one-time warning is emitted on first use.

Write human annotations to a batch of spans by ID. Annotations are upserted by annotation config name for each span: submitting the same name for the same span overwrites the previous value. Up to 1000 spans may be annotated per request. Spans are looked up within the specified time window (defaulting to the last 31 days). If any span ID in the batch is not found within the window, the entire request is rejected with a 404 error.
```python
from datetime import datetime

from arize.spans.types import AnnotateRecordInput
from arize.annotation_queues.types import AnnotationInput

client.spans.annotate_spans(
    project="your-project-name-or-id",
    space="your-space-name-or-id",  # required when project is a name
    annotations=[
        AnnotateRecordInput(
            record_id="your-span-id",
            values=[
                AnnotationInput(name="accuracy", label="correct", score=1.0),
                AnnotationInput(name="notes", text="Verified by reviewer"),
            ],
        ),
    ],
    start_time=datetime(2026, 4, 1),  # optional, defaults to 31 days ago
    end_time=datetime(2026, 5, 1),    # optional, defaults to now
)
```
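
Because a single request accepts at most 1000 spans and is rejected outright if any ID is missing, larger workloads are easier to manage in chunks. A minimal batching sketch; the 1000-record limit comes from the note above, and everything else is illustrative:

```python
# Illustrative batching: submit at most 1000 annotation records per
# request, the documented per-request limit.
BATCH_SIZE = 1000

all_annotations = [...]  # list of AnnotateRecordInput, built as above

for i in range(0, len(all_annotations), BATCH_SIZE):
    client.spans.annotate_spans(
        project="your-project-name-or-id",
        space="your-space-name-or-id",
        annotations=all_annotations[i : i + BATCH_SIZE],
    )
```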
## Update Annotations
Add human feedback and annotations to spans.
```python
import pandas as pd

annotations_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "annotation.Quality.label": "correct",
        "annotation.Quality.score": 1.0,
        "annotation.Quality.text": "Verified by human reviewer",
    },
])

response = client.spans.update_annotations(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=annotations_df,
)
print("Updated annotations successfully")
```
## Update Metadata

Attach or patch custom metadata on existing spans for filtering and analysis. The update_metadata method uses JSON Merge Patch semantics and supports three input approaches.
### Method 1: Direct Field Columns

Set individual metadata fields using `attributes.metadata.<field>` column names. This is the simplest approach.
```python
import pandas as pd

metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "attributes.metadata.customer_id": "cust-456",
        "attributes.metadata.experiment_version": "v2",
        "attributes.metadata.region": "us-west",
    },
    {
        "context.span_id": "span-2",
        "attributes.metadata.customer_id": "cust-789",
        "attributes.metadata.region": "eu-central",
    },
])

response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
)
print(f"Updated: {response['spans_updated']}, Failed: {response['spans_failed']}")
```
### Method 2: Patch Document Column
Provide a JSON patch document per span for more control. The patch is applied after any field columns. The default column name is "patch_document".
```python
metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "patch_document": {"tag": "important", "priority": "high"},
    },
    {
        "context.span_id": "span-2",
        "patch_document": {"tag": "standard"},
    },
])

response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
)
```
Use a custom column name with the patch_document_column_name parameter:
```python
response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
    patch_document_column_name="my_patch_col",
)
```
### Method 3: Combined Approach
Use both field columns and a patch document. The patch document is applied last and overrides any conflicting field column values.
```python
metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "attributes.metadata.tag": "important",
        "patch_document": {"priority": "high"},  # applied after field columns
    },
])

response = client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
)
```
### Type Handling

| Python type | Stored as |
|---|---|
| `str` | string |
| `int` / `float` | number |
| `bool` | string (`"True"` / `"False"`) |
| `None` | JSON null (the field is set to null, not removed) |
| `dict` / `list` | JSON string |
Setting a field to None stores JSON null; it does not remove the field. This differs from standard JSON Merge Patch, where a null value removes the key.
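
A short sketch of the null behavior; the field name here is illustrative:

```python
# "deprecated_flag" is set to JSON null on the span; it is NOT removed,
# unlike standard JSON Merge Patch. (Field name is illustrative.)
metadata_df = pd.DataFrame([
    {
        "context.span_id": "span-1",
        "attributes.metadata.deprecated_flag": None,
    },
])

client.spans.update_metadata(
    space_id="your-space-id",
    project_name="my-llm-app",
    dataframe=metadata_df,
)
```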
### Response Structure

update_metadata returns a dictionary with the following keys:

| Key | Description |
|---|---|
| `spans_processed` | Total spans in the input DataFrame |
| `spans_updated` | Spans successfully updated |
| `spans_failed` | Spans that failed to update |
| `errors` | List of `{"span_id": ..., "error_message": ...}` for each failure |
```python
response = client.spans.update_metadata(...)

print(f"Processed: {response['spans_processed']}")
print(f"Updated: {response['spans_updated']}")
print(f"Failed: {response['spans_failed']}")
for err in response.get("errors", []):
    print(f"  span {err['span_id']}: {err['error_message']}")
```
## Delete Spans

The spans.delete method is currently in ALPHA. The API may change without notice, and a one-time warning is emitted on first use.

Permanently delete spans by their IDs. This operation is irreversible. Only spans within the 2-year lookback window are considered; older spans are not affected. Span IDs that are not found are silently ignored.
```python
result = client.spans.delete(
    project="your-project-name-or-id",
    span_ids=["span-id-1", "span-id-2"],
    space="your-space-name-or-id",  # required when project is a name
)

# `result` is None on full deletion (HTTP 204). On partial deletion
# (HTTP 200) it carries `deleted_span_ids`; retry to complete.
if result is not None:
    print(f"Partially deleted; retry to complete: {result.deleted_span_ids}")
else:
    print("All requested spans deleted")
```
## Export Spans
Export spans for offline analysis, custom processing, or archival.
```python
from datetime import datetime

start_time = datetime(2024, 1, 1)
end_time = datetime(2026, 1, 1)

# Export to DataFrame
df = client.spans.export_to_df(
    space_id="your-space-id",
    project_name="my-llm-app",
    start_time=start_time,
    end_time=end_time,
)
print(f"Exported {len(df)} spans")
```
### Export to Parquet
```python
client.spans.export_to_parquet(
    space_id="your-space-id",
    project_name="my-llm-app",
    start_time=start_time,
    end_time=end_time,
    path="./spans_export.parquet",
)
```
Export capabilities:
- Time-range filtering
- DataFrame or Parquet output
- Efficient Arrow Flight transport for large exports
- Progress bars for long-running exports
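
An exported Parquet file can be read back with standard tooling for offline analysis. A sketch assuming the export preserves the start_time and end_time columns from the logging schema above (the exact column names in the export are an assumption):

```python
import pandas as pd

# Load the exported spans and compute a simple latency distribution.
# Column names are assumed to match the logging schema shown earlier.
df = pd.read_parquet("./spans_export.parquet")

df["latency_s"] = (
    pd.to_datetime(df["end_time"]) - pd.to_datetime(df["start_time"])
).dt.total_seconds()
print(df["latency_s"].describe())
```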