2025-11-20
Structured Outputs Support for Playground
2025-11-10
Session Annotations
- Input/Output Level: Attach insights to specific output messages, automatically linked to the root span of the trace.
- Span Level: Dive deeper into a trace and annotate individual spans for precise, context-rich feedback.
2025-11-05
Integrations Revamp

2025-11-03
OpenInference TypeScript 2.0
- Added easy manual instrumentation with the same decorators, wrappers, and attribute helpers found in the Python openinference-instrumentation package.
- Introduced function tracing utilities that automatically create spans for sync/async function execution, including specialized wrappers for chains, agents, and tools.
- Added decorator-based method tracing, enabling automatic span creation on class methods via the @observe decorator.
- Expanded attribute helper utilities for standardized OpenTelemetry metadata creation, including helpers for inputs/outputs, LLM operations, embeddings, retrievers, and tool definitions.
- Overall, tracing workflows, agent behavior, and external tool calls is now significantly simpler and more consistent across languages.
2025-10-30
API-Driven Monitors

2025-10-30
Automatic Threshold Ranges for Monitors
You can now set upper and lower bounds using our new Auto Threshold options! Arize AX can automatically determine the right threshold for your alerts based on your historical data. This is ideal for most users who want to start monitoring without manually tuning thresholds.
2025-10-29
Data Fabric

2025-10-28
New Timeline Tab for Traces

2025-10-24
Tags

- Describe source (from-prod, EHR-record)
- Encode purpose (ab-test, regression-test)
- Indicate readiness (golden, deprecated)
- Group by config (threshold-0.85, cohort_5)
Tags are managed at the Space level, under Space Settings. They can be reused across entities that belong to that space (Datasets, Experiments, and more). More on Tags.
2025-10-22
Sort Datasets and Experiment Listing Table

2025-10-15
Support for Tool Call IDs in OpenInference Messages
This update introduces full support for tool_call_id and tool_call.id in OpenInference message semantics. These identifiers are now stored alongside input and output messages. Tool call IDs now appear in the trace slideover’s input/output and attributes tabs.
2025-10-14
Add Data Region to Login Page

2025-10-13
Add Auth Failures to Tracing
This release adds tracing for authentication failures, enabling better visibility and debugging of auth-related issues across systems.
2025-10-12
Total Traces on Stats Bar

2025-10-10
Support for GPT OSS Models on Bedrock

2025-10-05
Expanded LLM Support
Expanded LLM model support to include Claude models on Bedrock and Vertex, Titan Text Premier, Amazon Nova Premier, Gemini 2.5 Flash/Pro, and new GPT OSS and DeepSeek models, offering broader coverage across top providers.
2025-10-03
Dashboard Widget Time Setting

2025-10-01
Autocomplete for Annotations on Datasets

2025-09-28
Experiment Annotations on Experiment Compare Page
Annotations are now visible directly on the Experiment Compare page, making it easier to review context and insights alongside your experiment results. This streamlines analysis by keeping annotations and metrics in one place.
2025-09-27
Autosave Annotations
2025-09-26
Improved Sessions View
2025-09-24
Dashboard Rework Enhancements
Dashboards now support per-widget time range selection and improved legend display options, giving users more granular control and clearer data visualization. These updates make it easier to tailor each dashboard view to your specific analysis needs.
2025-09-22
Large Dataset Runs in Playground
2025-09-19
Session and Trace Evals
2025-09-17
Alyx: Synthetic Data Generation Skill
2025-09-15
Position Dashboard Widgets Freely
2025-09-11
Annotations Config: Optimization Direction
The optimization_direction field now lives within the respective config columns, improving logical grouping.
2025-09-10
Improved Annotator Selection for New Labeling Queues
The annotator selection flow has been refined: users are grouped into Annotators and All Other Users (with group-level select all), and the interface now clearly highlights “You.”
2025-09-09
Search in Settings
All settings tables now include search bars, making it easier to quickly find what you need.
2025-09-08
Dashboard Improvements
Dashboards have been reworked with a richer experience: clickable legends and new widget creation forms for line charts, bar charts, experiments, monitors, and statistics.
2025-08-29
Delete Annotation Queues
You can now delete annotation queues directly from the queues listing using the “more” button, or remove multiple queues at once with checkboxes in the queues table.
2025-08-26
Playground View Defaults
The playground now loads the most recent view by default, so you can pick up right where you left off.
2025-08-20
Trace Total Cost Views
The trace tab (table view) now displays the total cost at the root span, making it easier to understand overall usage at a glance.
2025-08-19
Experiment Traces Improvements
The new traces page slide-over enhances the experiment tracing experience, with hover buttons now always visible, experiment traces added to the overflow menu, and search functionality added to the Experiments List Page.
2025-08-18
Dataset Management Upgrades
The Datasets interface has been improved with CSV upload fixes, search capabilities on the Datasets List Page, and REST API support for dataset deletion.
2025-08-16
Experiments Refinements
Color maps and diffing functionality have been improved, and the trace metadata now uses experiment IDs for better consistency. The experiment compare headers also feature a pinned experiment button for easier navigation.
2025-08-15
Dataset Filtering & REST API Updates
Datasets now support improved filtering capabilities, better column organization following semantic conventions, and expanded REST API coverage for listing datasets and examples. Text areas on dataset example pages can now expand to full column width, and dataset filter history is now preserved.
2025-08-14
Dedicated Agent Graph Tab
The tracing interface now includes a dedicated Agent Graph tab, making it clearer to visualize and explore agent interactions within traces.
2025-08-13
Alyx Copilot API Advancements
Copilot API now supports structured output, improved frontend message parsing, and streamlined post-processing workflows, delivering a major upgrade to the AI assistant architecture.
2025-08-12
Playground Performance Updates
Playground data loading has been improved to boost reliability and performance. Missing metrics displays for AWS Bedrock models have also been fixed, ensuring smoother and more consistent evaluation workflows.
2025-08-11
Project UX Enhancements
The Projects page now features improved navigation and usability, with the addition of Tasks Provider to simplify task and evaluation management.
2025-08-07
Support for GPT-5 in Prompt Playground
Prompt Playground now supports GPT-5, giving users access to the latest OpenAI model for experimentation and evaluation.
2025-08-06
Trace Interactivity Improvements
Hover states have been added for trace costs, and spans in traces are now clickable, making it easier to explore cost details and navigate through trace data.
2025-08-04
Revamped Eval and Tasks Experience
The Evals experience has been upgraded with a redesigned Tasks page and updated slideovers for a cleaner workflow. A save button has been added to Evals slideovers, counters in Datasets now stay up to date, and evaluators automatically refresh from datasets.
2025-08-05
Expanded Annotation Configuration Capabilities
Annotations now support up to five labels per configuration, giving teams more flexibility to capture nuanced judgments and tailor evaluation workflows to their needs. This release also adds improved validations, clearer table views, and multiple UI and labeling queue enhancements for a smoother annotation workflow.
2025-08-04
Image Support for Datasets and Labeling Queues
Images are now supported in both datasets and labeling queues, with updated column groupings, clearer example ID displays, and reference tokens in headers. This release also introduces download tooltips for datasets and experiments, making it easier to export data directly from the UI.
2025-08-02
Experiment Comparison & Data Visualization Improvements
The Experiments page has been redesigned with improved UX and richer charting, including new select components and diffing support on the compare page for clearer side-by-side analysis. Average aggregate metrics are now shown in experiment headers. Usability fixes such as expandable/collapsible tables, editable experiment names, and updated column headers make workflows smoother.
2025-07-18
Preview Examples in Evals UI
The Evals UI now supports previewing examples while editing eval templates, making it easier to refine and validate your evals.
2025-07-18
OpenInference Java
OpenInference Java is now available, providing a comprehensive solution for tracing AI applications using OpenTelemetry. It is fully compatible with any OpenTelemetry-compatible collector or backend, such as Arize.
Included in this release:
- openinference-semantic-conventions: Java constants for capturing model calls, embeddings, and tool usage.
- openinference-instrumentation: Core utilities for manual OpenInference instrumentation.
- openinference-instrumentation-langchain4j: Auto-instrumentation for LangChain4j applications.
2025-07-16
Alyx Build Eval Skill
You can now build and run evals directly with Alyx. The new UX enables full interaction with Alyx to create multiple evals in the same flow, run evals on experiments and datasets, and list available evals for easy tracking. When building an eval in the Evals & Tasks tab, find the Alyx button to get started with this feature. More enhancements coming soon!
2025-07-12
Prompt Hub Release Tags
You can now tag prompts in Prompt Hub as Production, Staging, or with custom labels to keep your workflow organized. This makes it easy to track which prompts are live, in testing, or under development.
2025-07-12
Prompt Learning
With Prompt Optimization Tasks, you can now optimize prompts in a few clicks using human or automated feedback loops, versioned releases, and CI-friendly workflows, with no more trial and error.
Key Features:
- Auto-generate the best prompt from your labeled dataset
- Promote the best prompt to production in Prompt Hub
- Evaluate auto-generated prompts side-by-side with originals on the Experiments page
2025-07-11
Arize Tracing Assistant
The Arize Tracing Assistant is now live, bringing docs, examples, and tracing help directly into your IDE or LLM, with no guesswork and no tab hopping:
- Instantly look up instrumentation guides without leaving your editor
- Drop in working tracing examples to adapt immediately
- Ask tracing questions in plain language and get answers as you debug
2025-07-10
Saved Filters for Traces
You can now save up to 7 filters on the Traces page to quickly revisit your most-used views. Just create a filter and hit “Save” to pin it for easy access.
2025-07-02
Customizable Hotkey for Alyx
Alyx now supports a default hotkey for quickly adding context and opening chat:
- macOS: ⌘ + L
- Windows: Ctrl + L
2025-07-01
Cost Tracking
You can now monitor model spend directly in Arize with native cost tracking. Supporting 60+ models and providers out of the box, this flexible feature adapts to various cost structures and team needs, making it easy to track and manage your AI spend in-platform.
2025-06-25
Arize Database (ADB)
We’re excited to introduce Arize Database (ADB), the powerful engine behind all Arize AX instances. Built for massive scale and speed, ADB processes billions of traces and petabytes of data with high efficiency. Its robust architecture supports real-time ingestion, bulk updates, and fast querying, powering even the heaviest AI workloads reliably. ADB has long been the unsung hero of our platform, and we’re proud to bring it to light.
Introducing ADB: Arize’s Proprietary OLAP Database
Arize AI
2025-06-25
Playground Views
The new Prompt Playground lets you save views including prompts, dataset selections, comparison views, messages, and model selections. You can iterate and test variations seamlessly in one environment and share optimal views with your team to accelerate prompt development and evaluation.
2025-06-25
Prompt Learning
We’re excited to launch Prompt Learning, a new workflow in Arize to accelerate prompt iteration and evaluation. With Prompt Learning, you can:
- Run prompt optimization experiments directly in Arize
- Incorporate text-based judgments from humans and LLMs
- Tune and compare prompt variants to systematically improve agent behavior
2025-06-25
Agent Trajectory Evaluations
With Agent Trajectory Evaluation, you can assess the sequence of tool calls and reasoning steps your agent takes to solve a task. Key benefits:
- Path Quality: See if your agent is following expected, efficient problem-solving paths.
- Tool Usage Insights: Detect redundant, inefficient, or incorrect tool call patterns.
- Debugging Visibility: Understand internal decision-making to resolve unexpected behaviors, even when outcomes appear correct.
2025-06-25
Session-level Evaluations
You can now evaluate your agents across entire sessions with new session-level evaluations, enabling deeper insight beyond trace-level metrics. Assess:
- Coherence: Does the agent maintain logical consistency throughout the session?
- Context Retention: Is it effectively remembering and building on prior exchanges?
- Goal Achievement: Does the conversation accomplish the user’s intended outcome?
- Conversational Progression: Is the agent navigating multi-step tasks in a natural, helpful way?
2025-06-25
Agent and Multi-Agent Visualization
Easily inspect and debug multi-agent workflows with the new Agent Visibility feature. Alongside Traces and Spans, the new Agents tab auto-generates an interactive flowchart showing how agents, tools, and components interact step-by-step. With Agent Visibility, you can:
- Visualize agent workflows end-to-end
- Debug bottlenecks and errors with clarity
- Link agents to traces and spans for deeper insights
- Accelerate orchestration iteration and refinement
2025-06-25
Alyx MCP Assistant
All Alyx skills are accessible via MCP, allowing seamless integration into your existing workflows. You can leverage the full suite of Alyx debugging and analysis tools wherever you build, without needing to switch contexts. This means you can debug traces directly from your IDE while building in environments like Cursor, or connect through Claude Code to identify improvement areas. Refer to the video below for setting up Alyx via MCP in Cursor.
2025-06-25
Arize Copilot v3: Alyx & Trace Troubleshooting
We are excited to introduce Alyx, a major upgrade to our Copilot assistant. You can now drop context anywhere across the app and open Copilot with Ctrl + L to instantly pull context for smarter, faster help. We’re also introducing Trace Troubleshooting, a new Copilot skill that lets you navigate the entire trace to pinpoint issues. Built with O3 under the hood, it lets you:
- @ specific spans
- Use existing span skills for span questions or evals
- Let Copilot traverse and diagnose like a pro
- Customize the hotkey if you don’t want to use Ctrl + L
2025-06-20
New Homepage & Onboarding Experience
We’ve just rolled out a revamped onboarding flow to guide first-time users smoothly into either Tracing or Experiments.
2025-05-20
Realtime Trace Ingestion for All Arize AX Instances
Realtime trace ingestion is now supported across all Arize AX tiers, including the free tier. Previously, this feature was only available for enterprise AX users and within our open-source platform, Phoenix. It is now fully rolled out to all users of Arize AX. No configuration changes are required to begin using realtime trace ingestion.
2025-05-11
More OpenAI models in prompt playground and tasks
We’ve added support for more OpenAI models in prompt playground and evaluation tasks. Experiment across models and frameworks quickly.
2025-05-09
Sleeker display of inputs and outputs on a span
We’ve improved the design of the span page to showcase the functions, inputs, and outputs, to help you debug your traces faster!
2025-05-07
Attribute search on traces
Now you can filter your span attributes right on the page, no more CMD+F!
2025-05-05
Column selection in prompt playground
You can now view all of your prompt variables and dataset values directly in playground!
2025-05-02
Latency and token counts in prompt playground
We’ve added latency and token counts to prompt playground runs! Currently supported for OpenAI, with more providers to come!
2025-04-28
Major design refresh in Arize AX
We’ve refreshed Arize AX with polished fonts, spacing, color, and iconography throughout the whole platform.
2025-04-26
Custom code evaluators
You can now run your own custom Python code evaluators in Arize against your data in a secure environment. Use background tasks to run any custom code, such as URL validation or keyword matching. Learn more
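As a rough illustration of the kinds of checks mentioned above, here is a minimal sketch of a URL validation and a keyword match written as plain Python functions. The function names, the returned dictionary shape, and the sample keywords are all assumptions made for this example; they are not Arize's evaluator interface.

```python
from urllib.parse import urlparse


def url_is_valid(text: str) -> bool:
    """Return True if text parses as an absolute http(s) URL with a host."""
    parsed = urlparse(text.strip())
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)


def contains_keyword(text: str, keywords: list[str]) -> bool:
    """Return True if any keyword appears in text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def evaluate_output(output: str, keywords: list[str]) -> dict:
    """Hypothetical result shape: one label per check (an assumption)."""
    return {
        "url_valid": "valid" if url_is_valid(output) else "invalid",
        "keyword_match": "match" if contains_keyword(output, keywords) else "no_match",
    }


result = evaluate_output("https://arize.com/docs", ["docs", "pricing"])
```

Deterministic checks like these need no LLM call, so they are cheap to run on every row.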
2025-04-25
Security audit logs for enterprise customers
Improve your compliance and policy adherence. You can now use audit logs to monitor data access in Arize. Note: This feature is completely opt-in and this tracking is not enabled unless a customer explicitly asks for it. Learn more
2025-04-24
Larger dataset runs in prompt playground
We’ve increased the row limit for datasets in the playground, so you can run prompts in parallel on up to 100 examples.
2025-04-24
Evaluations on experiments
You can now create and run evals on your experiments from the UI. Compare performance across different prompt templates, models, or configurations without code. Learn more →
2025-04-24
Cancel running background tasks
When running evaluations using background tasks, you can now cancel them mid-flight while observing task logs. Learn more →
2025-04-21
Improved UI for functions in prompt playground
We’ve made it easier to view, test, and validate your tool calls in prompt playground. Learn more →
2025-04-15
Compare prompts side by side
Compare the outputs of a new prompt and the original prompt side-by-side. Tweak model parameters and compare results across your datasets. Learn more →
2025-04-14
Image segmentation support for CV models
We now support logging image segmentation to Arize. Log your segmentation coordinates and compare your predictions vs. your actuals. Learn more →
2025-04-11
New time selector on your traces
We’ve made it way easier to drill into specific time ranges, with quick presets like “last 15 minutes” and custom shorthand for specific dates and times, such as 10d, 4/1 - 4/6, and 4/1 3:00am. Learn more →
2025-04-07
Prompt hub python SDK
Access and manage your prompts in code with support for OpenAI and VertexAI. Learn more
2025-04-04
View task run history and errors
Get full visibility into your evaluation task runs, including when each ran, what triggered it, and whether there were errors. Learn more →
2025-04-02
2025-03-24
Test online evaluation tasks in playground
Quickly debug and refine the prompts used by your online evaluators by loading them prefilled into prompt playground. Learn more →
2025-03-01
Select metadata on the sessions page
Dynamically select the fields you want to see in your sessions view.
2025-02-27
2025-02-20
Expand and collapse your traces
You can now collapse rows to see more data at a glance or expand them to view more text.
2025-02-14
Improved traces export
Choose which columns of data to export via the ArizeExportClient by specifying columns.
2025-02-14
2025-02-14
2025-01-21
Voice application tracing and evaluation
Audio tracing: Capture, process, and send audio data to Arize and observe your application behavior.
Evaluation: Assess how well your models identify emotional tones like frustration, joy, or neutrality.
Voice App Tracing
2025-01-21
2024-12-19
2024-12-19
Managed code evaluators
Use our pre-built, off-the-shelf evaluators to evaluate spans without requiring requests to an LLM-as-a-Judge. These include Regex matching, JSON validation, Contains keyword, and more!
2024-12-19
2024-12-19
LangChain Instrumentation
Support for sessions via LangChain native thread tracking in TypeScript is now available. Easily track multi-turn conversations / threads using LangChain.js.
2024-12-05
Analyze your spans with Copilot
Extract key insights quickly from your spans instead of trying to decipher meaning in hundreds of spans. Ask questions and run evals right in the trace view.
Span Chat Evaluation
2024-12-05
Generate dashboards with Copilot
Building dashboard plots just got way easier. Create time series plots and even translate code into ready to go visualizations.
Dashboard generator
2024-12-05
View your experiment traces
Experiment traces for a dataset are now consolidated and accessible under “Experiment Projects”.
Experiment Projects
2024-12-05
Multi-class calibration chart
For your multi-class ML models, you can see how your model is calibrated in one visualization.
Calibration Chart
2024-12-05
Log experiments in Python SDK
You can now log experiment data manually using a dataframe, instead of running an experiment. This is useful if you already have the data you need, and re-running the query would be expensive. SDK Reference
2024-11-07
Create custom metrics with Copilot
Users can generate their desired metric by having Copilot translate natural language descriptions or existing code (e.g., SQL, Python) into AQL. Learn more →
Copilot Custom Metric Skill
2024-11-07
Summarize embeddings with Copilot
Copilot now works for embeddings! Users can select embedding data points, and Copilot will analyze them for patterns and insights. Learn more →
Copilot Embedding Summarization Skill
2024-11-07
Local explainability support for ML models
Local Explainability is now live, providing both a table view and waterfall style plot for detailed, per-feature SHAP values on individual predictions. Learn more →
Local Explainability Support
2024-11-07
See experiment results over time
Visualize specific evaluations over time in dashboards. Learn more →
Experiment Over Time Widget
2024-11-07
Function calling replay in prompt playground
Now users can follow the full function calling tutorial from OpenAI and iterate on different functions in different messages from within the Prompt Playground.
Full Function Calling Replay
2024-11-07
Vercel AI auto-instrumentation
Users can now ingest traces created by the Vercel AI SDK into Arize. Learn more →
2024-11-07
Track sessions and context attributes in instrumentation
You can add metadata and context that will be picked up by all of our auto instrumentations and added to spans. Learn more →
2024-10-24
Easily test your online tasks and evals
Users now have the option to test a task, such as an online eval, by running it once on existing data, or to apply evaluation labels to older traces. Learn more →
2024-10-24
Experiment filters
Users can now filter experiments based on dataset attributes or experiment results, making it easy to identify areas for improvement and track their experiment progress with more precision. Learn more →
Filtering experiments by experiment name
2024-10-03
Embedding traces
With Embeddings Tracing, you can effortlessly select embedding spans and dive straight into the UMAP visualizer, simplifying troubleshooting for your genAI applications. Learn more →
Embedding traces in action
2024-10-03
Experiments Details Visualization
Users can now view a detailed breakdown of labels for their experiments on the Experiments Details page.
Experiments details visualization
2024-10-03
Support for o1-mini and o1-preview in playground
We’ve added full support for all available OpenAI models in the playground, including o1-mini and o1-preview.
2024-10-03
Improved auto-complete in playground
We’ve added better input variable behavior, autocompletion enhancements, support for mustache/f-string input variables, and more.
2024-10-03
Filter history
We now store the last three filters used by a user! Users can easily access their filter history in the query filters dropdown, making it simpler to reuse filters for future queries.
Filters history
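To make the "last three filters" behavior concrete, here is a minimal sketch of a bounded history using a fixed-size deque. The class name, the de-duplication of repeated filters, and the example filter strings are assumptions for illustration; the source only states that the last three filters are stored.

```python
from collections import deque


class FilterHistory:
    """Keeps only the most recent filters, newest first (illustrative only)."""

    def __init__(self, max_items: int = 3) -> None:
        # A deque with maxlen drops the oldest entry automatically.
        self._items: deque[str] = deque(maxlen=max_items)

    def record(self, filter_text: str) -> None:
        # Assumption: re-using a filter moves it to the front instead of duplicating it.
        if filter_text in self._items:
            self._items.remove(filter_text)
        self._items.append(filter_text)

    def recent(self) -> list[str]:
        return list(reversed(self._items))


history = FilterHistory()
for text in ["status = 'ERROR'", "latency > 2", "model = 'a'", "cost > 1"]:
    history.record(text)
recent_filters = history.recent()
```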
2024-10-03
Tracing quick filters
Apply filters directly from the table by hovering over the text to reveal the filter icon.
Quick filters
2024-10-03
New arize-otel package
We made it way simpler to add automatic tracing to your applications! It’s now just a few lines of code to use OpenTelemetry to trace your LLM application. Check out our new quickstart guide which uses our arize-otel package.
2024-10-03
Easily add spans to datasets
Easily add spans to a dataset from the Traces page using the “Add to Dataset” button.
"Add to Dataset" & "Setup Task" buttons


