Ollama runs open-source models locally and exposes an OpenAI-compatible Chat Completions API at
http://localhost:11434/v1. Because the client side speaks the OpenAI protocol, Arize AX captures every Ollama call via the openinference-instrumentation-openai package — the same instrumentor that covers OpenAI’s hosted API.
Llama 3.2 + Ollama Tracing Tutorial (Google Colab)
Prerequisites
- Python 3.9+
- An Arize AX account (sign up)
- Ollama installed and running locally (`ollama serve`)
- A small instruction-tuned model pulled; this guide uses `llama3.2:1b`, fetched as shown below.
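Assuming the standard Ollama CLI is on your PATH:

```bash
# Download the 1B-parameter Llama 3.2 model used throughout this guide
ollama pull llama3.2:1b
```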
Launch Arize AX
- Sign in to your Arize AX account.
- From Space Settings, copy your Space ID and API Key. You will set them as `ARIZE_SPACE_ID` and `ARIZE_API_KEY` below.
Install
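The guide's code needs the OpenAI Python client, the Arize OTel convenience package, and the OpenInference OpenAI instrumentor. A typical install (package names assume the publicly published PyPI distributions):

```bash
pip install openai arize-otel openinference-instrumentation-openai
```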
Configure credentials
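Export the Space ID and API Key you copied earlier; the variable names below are the ones the tracing snippet reads (bash/zsh syntax assumed):

```bash
export ARIZE_SPACE_ID="your-space-id"
export ARIZE_API_KEY="your-api-key"
```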
Set up tracing
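A minimal sketch of the bootstrap, assuming the `arize-otel` `register` helper and the project name `ollama-tracing-example` used later in this guide; place it at the top of `example.py`:

```python
import os

from arize.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

# Register an OpenTelemetry tracer provider that exports spans to Arize AX
tracer_provider = register(
    space_id=os.environ["ARIZE_SPACE_ID"],
    api_key=os.environ["ARIZE_API_KEY"],
    project_name="ollama-tracing-example",
)

# Instrument the OpenAI client library; Ollama calls are captured too,
# because the client is simply pointed at the local Ollama endpoint.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
```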
Run Ollama
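With `ollama serve` running in another terminal, the rest of `example.py` can be a single chat completion pointed at the local endpoint; this is a sketch, and the prompt is only illustrative:

```python
from openai import OpenAI

# Point the OpenAI client at the local Ollama server.
# Ollama ignores the API key, but the client requires a non-empty value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2:1b",
    messages=[{"role": "user", "content": "Write a haiku about distributed tracing."}],
)
print(response.choices[0].message.content)
```

Run it with `python example.py`.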
Expected output
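If the call succeeds, the script prints the model's reply to stdout (the exact wording varies from run to run) and the instrumented span is exported to Arize AX in the background.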
Verify in Arize AX
- Open your Arize AX space and select the project `ollama-tracing-example`.
- You should see a new trace within ~30 seconds containing a `ChatCompletion` LLM span with the prompt, response, and token usage attached. The model name on the span will be the Ollama model you ran (e.g. `llama3.2:1b`).
- If no traces appear, see Troubleshooting.
Troubleshooting
- No traces in Arize AX. Confirm `ARIZE_SPACE_ID` and `ARIZE_API_KEY` are set in the same shell that runs `example.py`. Enable OpenTelemetry debug logs with `export OTEL_LOG_LEVEL=debug` and re-run.
- `Connection refused` or `ConnectError` to `localhost:11434`. The Ollama daemon is not running. Start it with `ollama serve` (in another terminal, or as a background service).
- `model "llama3.2:1b" not found, try pulling it first`. Pull the model: `ollama pull llama3.2:1b`. Run `ollama list` to see what's pulled locally.
- Different model. Swap `llama3.2:1b` for any model in the Ollama library you've pulled (`llama3.3`, `mistral`, `qwen2.5`, etc.); see the snippet after this list. The `OpenAIInstrumentor` doesn't care which model serves the response.
- Spans show but with the wrong model name. Ollama reports the model alias you passed to the API; if you renamed the model locally (`ollama cp`), use that alias.
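For example, to switch the guide's script to Mistral (assuming you have pulled it):

```bash
ollama pull mistral   # download the model
ollama list           # confirm the alias, then set model="mistral" in example.py
```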