We now have API-Triggered Monitors: A monitor type that only evaluates when triggered via API call, instead of on a fixed schedule. Ideal for teams running evaluations after events like batch ingestions, model retraining, or CI/CD workflows.
October 30, 2025You can now set upper and lower bounds using our new Auto Threshold options!Arize can automatically determine the right threshold for your alerts based on your historical data. This is ideal for most users who want to start monitoring without manually tuning thresholds.
We’re excited to introduce Data Fabric, a new capability that automatically synchronizes production trace data, evaluations, and annotations from Arize into your cloud data warehouse every 60 minutes in Iceberg format—giving you an always-current, query-ready source of truth.
You can now see a timeline view when you click into a trace! The new timeline view is right next to the Trace Tree and Agent Graph tabs, and it shows the execution flow and duration of each span.
Tags are a lightweight way for you to organize and label your entities across the Arize Platform. You can use tags to:
Describe source (from-prod, EHR-record)
Encode purpose (ab-test, regression-test)
Indicate readiness (golden, deprecated)
Group by config (threshold-0.85, cohort_5)
Tags live at the Space level, under Space Settings. They can be reused across entities that belong to that space (Datasets, Experiments, and more). More on Tags.
Support for Tool Call IDs in OpenInference Messages
October 15, 2025This update introduces full support for tool_call_id and tool_call.id in OpenInference message semantics. These identifiers are now stored alongside input and output messages. Tool call IDs now appear in the trace slideover’s input/output and attributes tabs.
We’ve added a Data Region selector to the login page, allowing users to choose their preferred data region during sign-in. This helps ensure compliance and improved performance based on regional data needs.
October 5, 2025Expanded LLM model support to include Claude models on Bedrock and Vertex, Titan Text Premiere, Amazon Nova Premiere, Gemini 2.5 Flash/Pro, and new GPT OSS and DeepSeek models —offering broader coverage across top providers.
You can now define time settings per widget in dashboards! This enhancement adds flexibility by letting you set custom time ranges at the widget level — without losing the ability to apply a global dashboard time range. It’s a powerful way to dig deeper into data and run more tailored analyses.
You can now autocomplete annotation variables when editing eval templates in the playground or directly from dataset slideovers. This makes building and managing evals faster and more intuitive.