
What is a dataset?

A dataset is a curated collection of logs (inputs/outputs + metadata) that you can evaluate, annotate, and use to power Experiments.


Steps to use

Step 1: Go to Datasets and click Create

Open the Keywords AI platform, navigate to Datasets, then click Create dataset.
Figure: Datasets page with Create dataset button

Step 2: Choose “From logs (sampling)”

In the creation modal, choose dataset type From logs (sampling).
Figure: Create dataset modal with dataset type set to From logs (sampling)

Step 3: Set filters, time range, and sampling percentage

Configure:
  • Filters: narrow down which logs to include (e.g. status_code = 200, a specific model, metadata filters)
  • Time range: the window of logs to sample from
  • Sampling percentage: the percentage of matching logs to include
Dataset from logs configuration showing filters, time range, and sampling percentage
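
To make the three settings concrete, here is a minimal local sketch of what "filter, restrict to a time range, then sample a percentage" means. It is only an illustration and not the platform's implementation: the function `sample_logs`, the log field names (`timestamp`, `status_code`, `model`), and the model name are hypothetical, and the platform may select logs differently (for example, per-log random selection rather than an exact count).

```python
import random
from datetime import datetime, timedelta, timezone


def sample_logs(logs, predicate, start, end, sampling_percentage, seed=None):
    """Return a random sample of the logs that match a filter and time range.

    Hypothetical helper for illustration only; not part of any Keywords AI SDK.
    """
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    rng = random.Random(seed)
    # Step 1: keep logs inside the time range that also pass the filter.
    matching = [
        log for log in logs
        if start <= log["timestamp"] < end and predicate(log)
    ]
    # Step 2: keep the requested percentage of the matching logs.
    sample_size = round(len(matching) * sampling_percentage / 100)
    return rng.sample(matching, sample_size)


# Illustrative data: 100 logs, one per hour, half of them with status_code 200.
t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
example_logs = [
    {
        "id": i,
        "status_code": 200 if i % 2 == 0 else 500,
        "model": "example-model",
        "timestamp": t0 + timedelta(hours=i),
    }
    for i in range(100)
]

sampled = sample_logs(
    example_logs,
    predicate=lambda log: log["status_code"] == 200,
    start=t0,
    end=t0 + timedelta(days=5),
    sampling_percentage=20,
    seed=0,
)
```

In this sketch, 50 of the 100 logs pass the filter and fall inside the time range, so a 20% sampling percentage yields 10 logs. Tightening the filters or the time range shrinks the pool that the percentage is applied to.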

Step 4: Create, wait for ingestion, and verify logs

Click Create. The dataset starts processing, and logs are added as they are ingested. Depending on the size of the time range and the filters, it may take some time for the dataset to be fully populated. When it finishes, open the dataset and confirm:
  • log_count is non-zero
  • logs appear in the dataset logs table
Figure: Dataset detail page showing dataset logs table populated