Instructor is a Python library that makes it easy to get structured data from LLMs. This guide shows how to instrument your Instructor application using OpenInference to send trace data to Arize for observability, allowing you to see both the Instructor-specific operations and the underlying LLM calls.
Before running your application, ensure you have the following environment variables set:
export ARIZE_SPACE_ID="YOUR_ARIZE_SPACE_ID"
export ARIZE_API_KEY="YOUR_ARIZE_API_KEY"
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY" # Needed for the OpenAI example
You can find your Arize Space ID and API Key in your Arize account settings.
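Because a missing variable usually surfaces only later as an authentication error, it can help to check the environment before instrumenting anything. The following is a minimal sketch, not part of Arize or Instructor: `missing_env_vars` is a hypothetical helper, and the variable names are the ones exported above.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the export commands in this guide.
REQUIRED_VARS = ("ARIZE_SPACE_ID", "ARIZE_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(
    names: Iterable[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names that are unset or empty in env (hypothetical helper)."""
    return [name for name in names if not env.get(name)]


# Report which of the required variables still need to be set.
missing = missing_env_vars()
print("Missing:", ", ".join(missing) if missing else "none")
```

Passing `env` explicitly keeps the helper easy to test, and an empty string is treated the same as an unset variable.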
Install Instructor, its OpenInference instrumentor, the instrumentor for the underlying LLM client (e.g., OpenAI), Arize OTel, and supporting OpenTelemetry packages.
Remember to install the OpenInference instrumentor for the specific LLM client library you are using with Instructor (e.g., openinference-instrumentation-openai for OpenAI, openinference-instrumentation-anthropic for Anthropic, etc.).
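One way to confirm the installation before wiring up tracing is to ask Python which distributions are present. This is a rough sketch under stated assumptions: `missing_packages` is a hypothetical helper, `openinference-instrumentation-openai` is the name given above, and `instructor`, `openinference-instrumentation-instructor`, and `arize-otel` are assumed distribution names that you should adjust to match what you actually installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names for the OpenAI setup described in this guide.
PACKAGES = (
    "instructor",
    "openinference-instrumentation-instructor",
    "openinference-instrumentation-openai",
    "arize-otel",
)


def missing_packages(names=PACKAGES):
    """Return the distributions from names that are not installed (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            version(name)  # Raises PackageNotFoundError if the distribution is absent
        except PackageNotFoundError:
            missing.append(name)
    return missing


print("Not installed:", ", ".join(missing_packages()) or "none")
```

If you use a different LLM client, swap the OpenAI instrumentor name for the one matching that client.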
Connect to Arize using arize.otel.register and apply the InstructorInstrumentor as well as the instrumentor for your LLM client (e.g., OpenAIInstrumentor).
import os

from arize.otel import register
from openinference.instrumentation.instructor import InstructorInstrumentor
from openinference.instrumentation.openai import OpenAIInstrumentor  # Or your LLM client's instrumentor

# Ensure your API keys are set as environment variables
# ARIZE_SPACE_ID = os.getenv("ARIZE_SPACE_ID")
# ARIZE_API_KEY = os.getenv("ARIZE_API_KEY")
# OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # For the example

# Setup OTel via Arize's convenience function
tracer_provider = register(
    space_id=os.getenv("ARIZE_SPACE_ID"),
    api_key=os.getenv("ARIZE_API_KEY"),
    project_name="my-instructor-app",  # Choose a project name
)

# Instrument Instructor
InstructorInstrumentor().instrument(tracer_provider=tracer_provider)

# Instrument the underlying LLM client
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)  # Example for OpenAI

print("Instructor and OpenAI client instrumented for Arize.")
Now you can use Instructor as you normally would. The Instructor and OpenAI instrumentors capture a trace for each call and send it to Arize.
import instructor
from pydantic import BaseModel
from openai import OpenAI  # Ensure OPENAI_API_KEY is set


# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int


# Patch the OpenAI client with Instructor
# The OpenAI client itself will be instrumented by OpenAIInstrumentor
# InstructorInstrumentor will trace the .create call patched by instructor.from_openai
client = instructor.from_openai(OpenAI())

# Extract structured data
user_info_response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserInfo,  # Instructor specific
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)

print(f"Name: {user_info_response.name}")
print(f"Age: {user_info_response.age}")

# Example with validation error
try:
    invalid_user_info = client.chat.completions.create(
        model="gpt-3.5-turbo",
        response_model=UserInfo,
        messages=[{"role": "user", "content": "The user is Jane."}],  # Age is missing
        max_retries=1,  # Optional: limit retries for demonstration
    )
except Exception as e:
    print(f"Failed to extract valid UserInfo: {e}")
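The validation-error case above depends on the response failing to validate against the Pydantic model. To see what that failure looks like without calling an LLM, here is a small sketch that assumes Pydantic v2 (`model_validate`); `validation_problems` is a hypothetical helper, not part of Instructor, and it validates hand-written payloads rather than real model output.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class UserInfo(BaseModel):
    name: str
    age: int


def validation_problems(payload: dict) -> List[str]:
    """Return the field paths that fail validation against UserInfo (hypothetical helper)."""
    try:
        UserInfo.model_validate(payload)
    except ValidationError as exc:
        # Each error carries a "loc" tuple naming the offending field.
        return [".".join(str(part) for part in err["loc"]) for err in exc.errors()]
    return []


print(validation_problems({"name": "John Doe", "age": 30}))  # -> []
print(validation_problems({"name": "Jane"}))  # -> ['age']
```

In the traced application, errors of this kind are what would be expected to appear on the Instructor span when extraction fails, while the underlying OpenAI call is recorded separately.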