- Create a dataset from CSV
- Create a dataset from your spans
- Create a dataset with code
- Create a synthetic dataset
Create a dataset from CSV
You can upload CSV files as datasets in Arize. The columns in the file can be accessed in experiments or in the prompt playground.
Create a dataset from your spans
Arize supports adding spans from your projects to datasets. The trace data from an application with errors or faulty evals can become fuel for ongoing development. You can use our tracing filters or ✨AI search to curate your dataset.
Create a dataset with code
If you’d like to create your datasets programmatically, you can use our clients to create, update, and delete datasets. To start, let’s install the packages we need:
- Simple dataset
- Dataset with prompt template & variables
This is a simple dataset with just string values for the columns.
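As a rough illustration of what a simple dataset with string-only columns can look like, here is a minimal sketch that uses only the Python standard library. The column names (`question`, `expected_answer`), the example rows, and the helper `rows_to_csv` are hypothetical and chosen for illustration; the sketch does not call the Arize client, whose install step and API calls are not shown in this section.

```python
import csv
import io

# Hypothetical example rows: every column holds a plain string value.
rows = [
    {"question": "What is the capital of France?", "expected_answer": "Paris"},
    {"question": "What is 2 + 2?", "expected_answer": "4"},
]


def rows_to_csv(rows):
    """Serialize a list of string-valued dicts to CSV text (illustrative helper)."""
    columns = list(rows[0].keys())
    for row in rows:
        for name, value in row.items():
            # Fail early if a value is not a string, since this sketch
            # models a dataset with string-only columns.
            if not isinstance(value, str):
                raise ValueError(f"column {name!r} has a non-string value: {value!r}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)
print(csv_text)
```

The same rows could be saved to a file and uploaded as a CSV, or handed to whichever client you use for programmatic dataset creation; how the client expects the data is an assumption to verify against its own documentation.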
Create a synthetic dataset
In some cases, the data you have might not be enough to cover all the scenarios you want to test. This is where you can use Alyx for Synthetic Dataset Generation:
- Suggested Prompt: “Generate a synthetic dataset of 20 examples that cover…”
- Use When: You need labeled examples to test, fine-tune, or evaluate prompts without relying on real user data.
- Description: Creates artificial examples that mimic real-world scenarios, enabling faster experimentation.