Deploy and orchestrate AI agents at scale: governed, observable, and integrated for enterprise transformation. Microsoft Foundry offers a library of enterprise-grade evaluators, such as the Risk and Safety evaluators, while Arize AX provides observability, evaluation, and experimentation workflows for continuous improvement. Combined, they let organizations close the loop between insight and action, turning Responsible AI from policy into practice. The result is a continuous feedback system in which the same evaluators that power offline testing also monitor live production traffic, and data moves from trace logs to evaluation results to experiment dashboards.
This tutorial follows examples illustrated in this blog:
Blog: Evaluating and Improving AI Agents at Scale with Microsoft Foundry
1. Azure AI Foundry and Arize for Agent Observability and Evaluation
This notebook demonstrates how to:
- Build a LangChain multi-chain agent on Azure AI Foundry while tracing all operations to Arize AX for observability
- Leverage Microsoft Risk and Safety Evaluators to evaluate LLM behavior
- Log evaluation results to Arize AX for visibility
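The last step in the list, logging evaluation results against traced spans, can be sketched roughly as follows. This is a minimal illustration and not the notebook's code: `to_eval_row` is a hypothetical helper, the result keys (`hate_unfairness`, `hate_unfairness_score`, `hate_unfairness_reason`) are an assumed shape for the evaluator's output, and the `context.span_id` and `eval.<name>.*` column names are an assumption about the layout Arize AX expects for span-level evaluations. Verify both against the current SDK documentation before relying on them.

```python
from typing import Any, Dict


def to_eval_row(
    span_id: str,
    eval_name: str,
    result: Dict[str, Any],
    metric_key: str = "hate_unfairness",
) -> Dict[str, Any]:
    """Flatten one evaluator result into a row keyed by span ID.

    Hypothetical helper. The input keys and output column names are
    assumptions for illustration, not a confirmed API contract.
    """
    return {
        "context.span_id": span_id,
        f"eval.{eval_name}.label": str(result[metric_key]),
        f"eval.{eval_name}.score": float(result[f"{metric_key}_score"]),
        f"eval.{eval_name}.explanation": result.get(f"{metric_key}_reason", ""),
    }


# Hand-written sample standing in for a real evaluator call (assumed shape).
sample_result = {
    "hate_unfairness": "Very low",
    "hate_unfairness_score": 0,
    "hate_unfairness_reason": "No hateful or unfair content was found.",
}

# "abc123" is a made-up span ID; in practice it comes from the exported trace.
row = to_eval_row("abc123", "hate_unfairness", sample_result)
print(row)
```

Rows built this way can be collected into a table, one row per span, and passed to whichever evaluation-logging call the Arize client in use provides.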
Notebook Tutorial - Foundry Agent Observability and Evaluation
Screenshot showing Microsoft hate and unfairness evaluation metric attached to a span.
Screenshot showing a summarized dashboard with key observability metrics and evaluation KPI metrics.
2. Azure Risk and Safety Evaluators on Arize Datasets+Experiments
This notebook demonstrates how to use Azure Risk and Safety Evaluators with Arize Datasets+Experiments to track and visualize experiments and evaluations in Arize. We will use the Hate Unfairness Evaluator to evaluate the output of an Azure AI Foundry agent.
Notebook Tutorial - Using Foundry Evaluators on Arize Datasets + Experiments
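The experiment flow described here (run a task over each dataset row, then score the output with an evaluator) can be sketched in plain Python. Everything below is a stand-in under stated assumptions: `stub_agent` replaces the Azure AI Foundry agent, `stub_hate_unfairness` replaces the Hate Unfairness Evaluator and returns an assumed result shape, and `EvalResult` and `run_experiment` are hypothetical names that only mirror the score, label, and explanation fields shown in the experiment view. The real notebook uses the Azure and Arize SDKs instead.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class EvalResult:
    """Hypothetical container for one evaluation: score, label, explanation."""
    score: float
    label: str
    explanation: str


def stub_agent(question: str) -> str:
    # Stand-in for the agent under test; returns a canned answer.
    return f"Here is a neutral answer to: {question}"


def stub_hate_unfairness(query: str, response: str) -> Dict[str, Any]:
    # Stand-in for the real evaluator. It always returns the same placeholder
    # values; the key names are an assumed output shape.
    return {
        "hate_unfairness": "Very low",
        "hate_unfairness_score": 0,
        "hate_unfairness_reason": "Placeholder explanation from the stub.",
    }


def evaluate_output(query: str, response: str) -> EvalResult:
    # Adapt the evaluator's dictionary into the score/label/explanation triple.
    raw = stub_hate_unfairness(query=query, response=response)
    return EvalResult(
        score=float(raw["hate_unfairness_score"]),
        label=str(raw["hate_unfairness"]),
        explanation=str(raw["hate_unfairness_reason"]),
    )


def run_experiment(
    dataset: List[Dict[str, str]],
    task: Callable[[str], str],
    evaluator: Callable[[str, str], EvalResult],
) -> List[Dict[str, Any]]:
    """Run the task on every example and attach the evaluator's verdict."""
    rows = []
    for example in dataset:
        output = task(example["question"])
        verdict = evaluator(example["question"], output)
        rows.append({
            "question": example["question"],
            "output": output,
            "score": verdict.score,
            "label": verdict.label,
            "explanation": verdict.explanation,
        })
    return rows


# Two made-up dataset rows for illustration.
dataset = [
    {"question": "Summarize our hiring policy."},
    {"question": "Describe the onboarding process."},
]
results = run_experiment(dataset, stub_agent, evaluate_output)
for r in results:
    print(r["question"], "->", r["label"], r["score"])
```

Keeping the task and the evaluator as separate callables means the same evaluator can be reused across experiment runs, which is what makes a row-level comparison between runs meaningful.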
Screenshot showing row level comparison of experiment runs in Arize AX with hate and unfairness scores, labels and explanations.
