What Is Prompt Optimization?
Prompt optimization automatically improves your LLM application by refining its prompts. At Arize, the primary way we do data-driven prompt optimization is Prompt Learning: an iterative approach that uses feedback from evaluations to systematically improve prompt performance. Instead of manually tweaking prompts through trial and error, the SDK automates the process. Prompt learning follows this workflow (sketched in code after the list):
- Initial Prompt: Start with a baseline prompt that defines your task
- Generate Outputs: Use the prompt to generate responses on your dataset
- Evaluate Results: Run evaluators to assess output quality
- Optimize Prompt: Use feedback to generate an improved prompt
- Iterate: Repeat until performance meets your criteria
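To make the loop concrete, here is a minimal sketch of the workflow above in plain Python. It assumes an OpenAI-compatible client and a toy exact-match evaluator; helper names like `generate`, `evaluate`, and `optimize_prompt` are illustrative placeholders defined here, not the Prompt Learning SDK's API.

```python
# Minimal sketch of the prompt-learning loop. Assumes OPENAI_API_KEY is set;
# all helper names below are hypothetical placeholders, not the Arize SDK.
from openai import OpenAI

client = OpenAI()

def generate(prompt: str, question: str) -> str:
    """Step 2: generate an output for one dataset row using the current prompt."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": prompt},
                  {"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

def evaluate(output: str, expected: str) -> tuple[float, str]:
    """Step 3: score one output and return natural-language feedback."""
    if expected.lower() in output.lower():
        return 1.0, "correct"
    return 0.0, f"expected the answer to mention {expected!r}"

def optimize_prompt(prompt: str, feedback: list[str]) -> str:
    """Step 4: ask an LLM to rewrite the prompt using the collected feedback."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content":
                   f"Improve this prompt:\n{prompt}\n\nFailures observed:\n"
                   + "\n".join(feedback)}],
    )
    return resp.choices[0].message.content

dataset = [("What is the capital of France?", "Paris")]  # toy dataset
prompt = "Answer the user's question."                   # step 1: baseline prompt

for _ in range(3):  # step 5: iterate
    feedback, scores = [], []
    for question, expected in dataset:
        output = generate(prompt, question)
        score, note = evaluate(output, expected)
        scores.append(score)
        if score < 1.0:
            feedback.append(note)
    if sum(scores) / len(scores) >= 0.95:  # stop once performance meets criteria
        break
    prompt = optimize_prompt(prompt, feedback)
```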
Why Optimize Your Prompts?
Modern LLMs are already highly capable — but how you guide them matters just as much as the model itself. A strong prompt can dramatically boost reasoning, consistency, and accuracy without retraining. Even top systems like Claude 3.7 Sonnet rely on massive, hand-tuned prompts (about 24k tokens) to define their reliability and depth. Most teams can’t afford that level of manual engineering — which is why data-driven prompt optimization is so powerful. By using evaluations, natural language feedback, and production traces, prompts can be refined automatically based on real performance data. In our tests, this approach improved coding accuracy on SWE-Bench by 10–15% and reasoning scores on Big Bench Hard by up to 10%. Prompt optimization lets every team shape LLM behavior with the same rigor as top AI labs — but through automation and data, not guesswork.
Why Use Arize Prompt Learning for Prompt Optimization?
- Strong Results: Proven improvements for agents on key benchmarks, including SWE-Bench (with Claude Code) and HotPotQA.
- No-Code or SDK: Optimize prompts through the UI or directly via the Prompt Learning SDK - same optimization engine, flexible workflows.
- Version Control: All optimized prompts are automatically versioned in the Prompt Hub for easy rollback, comparison, and tracking.
- Experimentation: Run side-by-side experiments on optimized prompt versions in the Prompt Playground to identify the highest-performing configurations.
- Tutorials & Cookbooks: Explore hands-on tutorials that walk you through real examples of optimizing prompts using Prompt Learning.
Prompt Optimization Methods in Arize AX
Prompt Learning via UI
- Create and manage Prompt Optimization tasks directly in the Arize interface - no code required.
- Run feedback-driven optimization loops, compare prompt versions side-by-side, and promote the best one to production in just a few clicks.
Prompt Learning via SDK
- Automate prompt optimization programmatically using the Prompt Learning SDK, which iteratively refines prompts based on evaluation feedback and annotations.
- Supports advanced features such as built-in evaluator execution, plus the flexibility to run multiple optimization loops or define train/test splits (see the sketch after this list).
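As a rough illustration of the train/test discipline the SDK supports, the toy sketch below optimizes against a training split and reports scores on a held-out test split across multiple loops, so improvements aren't overfit to the rows that drive the edits. `optimize_once` and `accuracy` are stand-ins defined here for illustration, not the SDK's actual calls.

```python
# Conceptual sketch only: in the real SDK, the optimizer is an LLM-driven
# rewrite using evaluator feedback; these stand-ins just show the split logic.
import random

dataset = [{"question": f"2+{i}=?", "expected": str(2 + i)} for i in range(50)]
random.seed(0)
random.shuffle(dataset)
train, test = dataset[:40], dataset[40:]  # 80/20 train/test split

def accuracy(prompt: str, rows: list[dict]) -> float:
    # Toy evaluator: pretends longer, more specific prompts score better.
    return min(1.0, 0.5 + 0.01 * len(prompt.split()))

def optimize_once(prompt: str, rows: list[dict]) -> str:
    # Toy optimizer: appends an instruction instead of an LLM rewrite.
    return prompt + " Show your arithmetic step by step."

prompt = "Answer the math question."
for loop in range(3):                      # multiple optimization loops
    prompt = optimize_once(prompt, train)  # only the train split drives edits
    print(f"loop {loop}: train={accuracy(prompt, train):.2f} "
          f"test={accuracy(prompt, test):.2f}")  # held-out check
```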
AI-Powered Prompt Builder (Copilot)
- Collaborate with Alyx, your conversational copilot, to refine prompts for clarity, tone, or factuality using natural language guidance.
- Simply describe your goal - for example, “Optimize my prompt to reduce verbosity” - and Alyx will generate an improved version instantly.