Why Arize AX for Agents?
1. Agent Observability with Auto Instrumentation
Observability is critical for understanding how agents behave in real-world scenarios. Arize AI provides robust tracing through our open source OpenInference library, which automatically instruments your agent applications to capture traces and spans. These cover LLM calls, tool invocations, and data retrieval steps, giving you a detailed view of your agent's workflow. With just a few lines of code, you can set up tracing for popular frameworks like OpenAI Agents, LangGraph, and AutoGen. Learn more about Tracing.
Code Example: Auto Instrumentation for OpenAI Agents
2. Agent Evaluations with Online Evals
- Comprehensive Evaluation Templates: Arize AX provides templates for evaluating various agent components, such as Tool Calling, Path Convergence, and Planning.
- Online Evals: With Online Evals, you can run continuous evaluations on production data to monitor correctness, hallucination, relevance, and latency. This helps ensure your agents perform consistently across diverse scenarios.
- Custom Metrics and Alerts: Track key metrics on custom dashboards and receive alerts when performance deviates from the norm, allowing proactive optimization of agent behavior.
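The continuous-evaluation-plus-alerting loop described above can be sketched in plain Python. This is an illustrative sketch, not the Arize AX API: the `OnlineEval` class, the rolling window, and the `is_correct` stand-in for an LLM judge are all hypothetical names chosen for this example.

```python
# Illustrative sketch (not the Arize AX API): a custom online eval that
# scores each production response and signals an alert when the rolling
# pass rate drops below a threshold.
from collections import deque


class OnlineEval:
    def __init__(self, window: int = 100, alert_threshold: float = 0.75):
        self.scores = deque(maxlen=window)  # rolling window of 0/1 scores
        self.alert_threshold = alert_threshold

    def record(self, passed: bool) -> bool:
        """Record one eval result; return True if the alert should fire."""
        self.scores.append(1 if passed else 0)
        return self.pass_rate() < self.alert_threshold

    def pass_rate(self) -> float:
        return sum(self.scores) / len(self.scores)


# A trivial substring "correctness" check standing in for an LLM judge.
def is_correct(answer: str, expected: str) -> bool:
    return expected.lower() in answer.lower()


ev = OnlineEval(window=4, alert_threshold=0.75)
results = [ev.record(is_correct(a, "paris"))
           for a in ["Paris", "Paris, France", "London", "Berlin"]]
print(results)  # alert fires once the pass rate falls below 0.75
```

In Arize AX, the scoring step would be an eval template running over production traces and the alert would come from a monitor on a custom dashboard; the control flow is the same.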
3. Testing Agents in Prompt Playground with Tool Calling Support
- Iterate on Prompts: Test different prompt templates, models, and parameters side by side to refine how your agent responds to user inputs.
- Tool Calling Support: Debug tool calling directly in the Playground to ensure your agent selects the right tools and parameters. Learn more about Using Tools in Playground.
- Save as Experiment: Run systematic A/B tests on datasets to validate agent performance and share results with your team via experiments.
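The save-as-experiment workflow above boils down to running two configurations over the same dataset and comparing scores. A minimal sketch, with hypothetical `variant_a`/`variant_b` functions standing in for two prompt/model configurations (this is not the Arize experiments API):

```python
# Illustrative sketch: an A/B test of two prompt variants over a small
# dataset, reporting a simple accuracy score per variant.
dataset = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "paris"},
    {"input": "3*3", "expected": "9"},
]


# Stand-ins for two prompt/model configurations under test.
def variant_a(query: str) -> str:
    answers = {"2+2": "4", "capital of France": "Paris", "3*3": "6"}
    return answers[query]


def variant_b(query: str) -> str:
    answers = {"2+2": "4", "capital of France": "Paris", "3*3": "9"}
    return answers[query]


def run_experiment(variant) -> float:
    hits = sum(row["expected"].lower() in variant(row["input"]).lower()
               for row in dataset)
    return hits / len(dataset)


score_a, score_b = run_experiment(variant_a), run_experiment(variant_b)
print(f"A: {score_a:.2f}  B: {score_b:.2f}")
```

In the Playground, saving such a run as an experiment attaches the per-row outputs and scores to the dataset so teammates can inspect and rerun them.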
4. Sessions for Agent Interaction Tracking

- Session ID and User ID: Add session.id and user.id as attributes to spans to group interactions and analyze conversation flows. This helps identify where conversations break down or user frustration increases.
- Debugging Sessions: Use Arize AX to filter sessions and find underperforming groups of traces. Learn more about Sessions and Users.
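The grouping that `session.id` and `user.id` attributes enable can be sketched with plain dictionaries. The span dicts below are hypothetical stand-ins for real traced spans; only the attribute names `session.id` and `user.id` come from the conventions described above.

```python
# Illustrative sketch: group spans by their session.id attribute and
# flag sessions that contain a failed span.
from collections import defaultdict

spans = [
    {"name": "llm_call", "attributes": {"session.id": "s1", "user.id": "u1"}, "error": False},
    {"name": "tool_call", "attributes": {"session.id": "s1", "user.id": "u1"}, "error": True},
    {"name": "llm_call", "attributes": {"session.id": "s2", "user.id": "u2"}, "error": False},
]

sessions = defaultdict(list)
for span in spans:
    sessions[span["attributes"]["session.id"]].append(span)

# Sessions containing at least one failed span are candidates for debugging.
failing = [sid for sid, ss in sessions.items() if any(s["error"] for s in ss)]
print(failing)
```

This is exactly the filter-by-session view Arize AX gives you out of the box once the attributes are on your spans.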
5. Agent Replay and Agent Pathing

- Agent Replay: Replay agent interactions to debug tool calling in a controlled environment. Replay lets you simulate past sessions to test improvements without impacting live users.
- Agent Pathing: Analyze and optimize the pathways your agents take to complete tasks. Understand whether agents are taking efficient routes or getting stuck in loops, with tools to refine planning and convergence strategies.
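The loop detection that agent pathing surfaces can be sketched as a check over the ordered tool calls in a run. The `find_loops` helper and the example paths are hypothetical, chosen to illustrate the efficient-route-vs-stuck-in-a-loop distinction above.

```python
# Illustrative sketch: flag tools an agent called more than max_repeats
# times in a single run, a simple signal that it may be stuck in a loop.
from collections import Counter


def find_loops(path: list[str], max_repeats: int = 2) -> list[str]:
    """Return tools called more than max_repeats times in one run."""
    counts = Counter(path)
    return [tool for tool, n in counts.items() if n > max_repeats]


efficient = ["plan", "search", "summarize", "answer"]
stuck = ["plan", "search", "search", "search", "plan", "answer"]

print(find_loops(efficient))  # no loops
print(find_loops(stuck))      # 'search' repeated beyond the limit
```

Real pathing analysis also looks at transitions between steps (did the agent converge toward an answer?), but per-tool repeat counts are often the first signal of a stuck agent.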
Additional Resources for Agent Development
Agent Evaluation Guide
Learn how to evaluate every component of your agent.
Try our Tutorials
Explore example notebooks for agents, RAG, tracing, and evaluations.
Watch our Paper Readings
Dive into video discussions on the latest AI research, including agent architectures.
Join our Slack Community
Connect with other developers to ask questions, share insights, and provide feedback on agent development with Arize.