Some teams have complex experiment pipelines and need to run their experiments remotely. They can still log the results to Arize AX via log_experiment, which keeps a record of each experiment for tracking and comparison.
We will log an example experiment that carries three pieces of information:
- result is the output of the LLM pipeline.
- correctness is the evaluation of that output. In the example DataFrame below it is stored in the label, score, and explanation_text columns.
- example_id is the dataset row ID. It is needed to map each result to the specific dataset row that holds the inputs and expected outputs.
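To illustrate why example_id matters, here is a minimal pandas sketch that joins results back to dataset rows by ID rather than by position. The data, the column names input and expected_output, and the IDs ex_1 and ex_2 are hypothetical stand-ins. This is not how Arize AX performs the mapping internally, only an illustration of the idea.

```python
import pandas as pd

# Hypothetical stand-in for a dataset: one row per example, keyed by "id".
dataset_df = pd.DataFrame(
    {
        "id": ["ex_1", "ex_2"],
        "input": ["Who invented the telephone?", "Who invented the light bulb?"],
        "expected_output": ["Alexander Graham Bell", "Thomas Edison"],
    }
)

# Hypothetical experiment results, deliberately listed in a different order.
results_df = pd.DataFrame(
    {
        "example_id": ["ex_2", "ex_1"],
        "result": ["Thomas Edison", "Alexander Graham Bell"],
    }
)

# Join each result to its dataset row by ID, so row order does not matter.
# validate="one_to_one" raises if an ID appears more than once on either side.
joined_df = dataset_df.merge(
    results_df,
    left_on="id",
    right_on="example_id",
    how="left",
    validate="one_to_one",
)

print(joined_df[["id", "input", "expected_output", "result"]])
```

Because the join key is the ID, each result lands next to its own input and expected output even though the two frames are ordered differently.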
    import pandas as pd

    # Example DataFrame:
    experiment_run_df = pd.DataFrame(
        {
            "result": [
                "The telephone was invented by **Alexander Graham Bell**.",
                "The invention of the light bulb is commonly attributed to **Thomas Edison**",
            ],
            "label": ["correct", "incorrect"],
            "score": [1, 0],
            "explanation_text": [
                "This statement is accurate because Alexander Graham Bell is credited with inventing the telephone.",
                "This statement is inaccurate; others like Humphry Davy and Joseph Swan made earlier versions of the light bulb.",
            ],
        }
    )
The following code sets up the field mappings: each dataset example is linked through example_id, the LLM output through result, and the evaluator outputs through label, score, and explanation_text.
    from arize.experiments import (
        ExperimentTaskFieldNames,
        EvaluationResultFieldNames,
    )

    # Define field mappings for the LLM task ID and example output
    task_fields = ExperimentTaskFieldNames(example_id="example_id", output="result")

    # Define field mappings for the evaluator
    evaluator_fields = EvaluationResultFieldNames(
        label="label",
        score="score",
        explanation="explanation_text",
    )

    # Map the dataset example IDs to the example_id column
    # (client and dataset_id are assumed to be defined already)
    dataset_examples = client.datasets.list_examples(dataset_id=dataset_id, all=True)
    dataset_df = dataset_examples.to_df()
    experiment_run_df["example_id"] = dataset_df["id"]
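The final assignment above relies on pandas index alignment: when one column is assigned from another DataFrame, values are matched by index label, so frames with different indexes or lengths can silently produce missing IDs. A minimal sketch of a more defensive version follows. It assumes the experiment rows are in the same order as the dataset examples. The helper attach_example_ids and the sample frames are hypothetical and not part of the Arize SDK.

```python
import pandas as pd


def attach_example_ids(run_df, dataset_df, id_column="id"):
    """Hypothetical helper: copy dataset row IDs onto run rows by position."""
    if len(run_df) != len(dataset_df):
        raise ValueError(
            f"Row count mismatch: {len(run_df)} runs vs {len(dataset_df)} examples"
        )
    out = run_df.copy()
    # to_numpy() drops the index, so values are assigned positionally
    # instead of being aligned on index labels.
    out["example_id"] = dataset_df[id_column].to_numpy()
    return out


# Hypothetical stand-ins; note that the two frames have different indexes.
dataset_df = pd.DataFrame({"id": ["ex_1", "ex_2"]}, index=[10, 11])
experiment_run_df = pd.DataFrame({"result": ["first output", "second output"]})

mapped_df = attach_example_ids(experiment_run_df, dataset_df)
print(mapped_df)
```

With these stand-in frames, a plain column assignment would leave example_id empty because the index labels do not overlap, while the positional version fills every row and fails loudly when the row counts differ.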