
This tutorial covers how to:
- Build a customer support agent with the OpenAI Agents SDK
- Trace agent activity to monitor interactions
- Generate a benchmark dataset for performance analysis
- Evaluate agent performance using Ragas
Initial setup
We’ll set up our libraries, keys, and OpenAI tracing using Phoenix.
Install Libraries
Setup Keys
Next, connect to Arize AX and enter the relevant keys.
Setup Tracing
Create your first agent with the OpenAI Agents SDK
Here we’ve set up a basic agent that can solve math problems. It has a function tool that evaluates math equations, and we use the Runner class to run the agent and get the final output.
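A minimal sketch of such an agent is shown here; it is not the tutorial's original code. It assumes the openai-agents package is installed and OPENAI_API_KEY is set. The helper names (solve_equation, run_math_agent) and the agent instructions are illustrative, and the tool uses a small AST-based evaluator so that only plain arithmetic is accepted.

```python
import ast
import operator

# Arithmetic operators the tool is allowed to evaluate.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def solve_equation(equation: str) -> str:
    """Evaluate an arithmetic expression such as '2 + 2 * 3' and return the result as text."""
    return str(_evaluate(ast.parse(equation, mode="eval").body))


async def run_math_agent(question: str) -> str:
    """Wrap solve_equation as a function tool and run an agent on one question."""
    # Imported here so the helper above stays usable without the SDK installed.
    from agents import Agent, Runner, function_tool

    agent = Agent(
        name="Math Solver",  # illustrative name
        instructions="Solve math problems. Use the solve_equation tool for any arithmetic.",
        tools=[function_tool(solve_equation)],
    )
    result = await Runner.run(agent, question)
    return result.final_output
```

In a notebook, this could be called with await run_math_agent("What is 15 + 28?"). The returned text is generated by the model, so its exact wording will vary.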
Evaluating our agent
Agents can go awry for a variety of reasons. We can use Ragas to evaluate whether the agent responded correctly. Two Ragas measurements help with this:
- Tool call accuracy - did our agent choose the right tool with the right arguments?
- Agent goal accuracy - did our agent accomplish the stated goal and get to the right outcome?
Each metric is scored by calling multi_turn_ascore(sample) to get the results.
The AgentGoalAccuracyWithReference metric compares the final output to the reference to see if the goal was accomplished.
The ToolCallAccuracy metric compares the agent's tool calls to the reference tool calls to see if they were made correctly.
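To make the tool call comparison concrete, here is a simplified plain-Python illustration of the idea. It is not Ragas's implementation: ToolCallRecord and tool_call_match_rate are hypothetical names, and the scoring rule (position-by-position exact match on tool name and arguments) is an assumption chosen for clarity.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCallRecord:
    """Hypothetical record of one tool call: tool name plus keyword arguments."""
    name: str
    args: dict = field(default_factory=dict)


def tool_call_match_rate(actual, reference):
    """Fraction of positions where the actual call equals the reference call.

    Extra or missing calls count as misses (simplified rule, not Ragas's).
    """
    total = max(len(actual), len(reference))
    if total == 0:
        return 1.0  # nothing expected, nothing called
    matches = sum(1 for a, r in zip(actual, reference) if a == r)
    return matches / total


# What the agent did versus what we expected it to do.
actual_calls = [ToolCallRecord("solve_equation", {"equation": "2 + 2"})]
reference_calls = [ToolCallRecord("solve_equation", {"equation": "2 + 2"})]

print(tool_call_match_rate(actual_calls, reference_calls))  # 1.0
```

With Ragas itself, the conversation and the reference tool calls are packed into a MultiTurnSample, and the score comes from awaiting the metric's multi_turn_ascore(sample) method rather than from a hand-written comparison like this one.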