Create experiment

Creates a new experiment with one or more workflows. Three workflow types are supported:

Custom: Submit your own workflow results via API
Completion: Direct LLM completions with custom parameters
Prompt: Load and render Jinja2 prompt templates with dataset variables

For custom workflows, the system creates placeholder traces that you update with your results. For built-in workflows (prompt/completion), execution starts automatically in the background.

Authentication

All endpoints require API key authentication. Pass your key as a Bearer token in the Authorization header:

Authorization: Bearer YOUR_API_KEY
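As an illustration of the header above, here is a minimal Python sketch that builds (but does not send) an authenticated request using only the standard library. The URL is copied from the curl examples on this page; the helper name build_request is hypothetical and not part of any official SDK.

```python
import json
import urllib.request

# Endpoint as it appears in the curl examples on this page.
API_URL = "https://api.keywordsai.co/api/v2/experiments/"


def build_request(api_key, payload):
    """Return a POST request carrying the Bearer token and a JSON body.

    Hypothetical helper for illustration; nothing is sent over the network
    until the caller passes the result to urllib.request.urlopen().
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

To actually submit the request, pass the returned object to urllib.request.urlopen() and read the JSON response.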

Parameters

name

string

required

The name of the experiment.

description

string

Description of the experiment.

dataset_id

string

required

The ID of the dataset to run the experiment on.

workflows

array

required

List of workflow configurations.

Properties

type

string

required

Type of workflow. Options: custom, completion, or prompt.

custom: Submit your own workflow results
completion: Direct LLM completions
prompt: Load and render Jinja2 prompt templates

config

object

Configuration for the workflow. Structure depends on workflow type:

Custom Workflow Config

allow_submission

boolean

Allow trace updates (default: true).

timeout_hours

number

Submission timeout in hours.

Completion Workflow Config

model

string

required

Model identifier (e.g., "gpt-4o-mini").

temperature

number

Sampling temperature (0 to 2, default: 1.0).

max_tokens

integer

Maximum completion tokens (default: 150).

top_p

number

Nucleus sampling (0 to 1, default: 1.0).

frequency_penalty

number

Frequency penalty (-2 to 2, default: 0).

presence_penalty

number

Presence penalty (-2 to 2, default: 0).

stop

string or array

Stop sequences.

response_format

object

Response format (e.g., {"type": "json_object"}).

tools

array

Function calling tools.

tool_choice

string or object

Tool choice strategy.

reasoning_effort

string

Reasoning effort for o1 models.

Prompt Workflow Config

prompt_id

string

required

Prompt identifier to load and render.

evaluator_slugs

array

List of evaluator slugs to run on the experiment results.
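The documented ranges for the completion workflow config can be checked on the client before a request is sent. The sketch below is an optional pre-flight check under that assumption, not the server's validation logic; the function name check_completion_config and the message wording are hypothetical.

```python
# Numeric ranges as documented for the completion workflow config.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def check_completion_config(config):
    """Return a list of problems found in a completion config dict.

    Hypothetical client-side helper: it only covers the required 'model'
    field and the numeric ranges listed in this page.
    """
    problems = []
    if not config.get("model"):
        problems.append("model is required")
    for field, (low, high) in RANGES.items():
        if field in config and not (low <= config[field] <= high):
            problems.append(f"{field} must be between {low} and {high}")
    return problems
```

An empty list means the config passed these local checks; the API may still apply rules that are not described here.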

Response

{
  "id": "experiment-123",
  "name": "My Custom Workflow Experiment",
  "description": "Testing custom workflow implementation",
  "dataset_id": "your-dataset-id",
  "workflows": [
    {
      "type": "custom",
      "config": {
        "allow_submission": true,
        "timeout_hours": 24
      }
    }
  ],
  "evaluator_slugs": [
    "response_quality_v1",
    "factual_accuracy"
  ],
  "status": "running",
  "created_at": "2025-11-18T10:00:00Z",
  "updated_at": "2025-11-18T10:00:00Z"
}
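For reference, a response shaped like the one above could be read in Python roughly as follows. This is a sketch that assumes the fields shown in the sample are present; summarize_experiment is a hypothetical helper, and the only status value taken from this page is "running".

```python
import json
from datetime import datetime


def summarize_experiment(raw):
    """Extract a few fields from a create-experiment response body (JSON text)."""
    data = json.loads(raw)
    # Timestamps in the sample end in "Z"; convert that to an explicit UTC
    # offset so datetime.fromisoformat accepts it on older Python versions.
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    return {
        "id": data["id"],
        "status": data["status"],
        "workflow_types": [w["type"] for w in data["workflows"]],
        "created_at": created,
    }
```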

Examples

Custom Workflow

curl -X POST "https://api.keywordsai.co/api/v2/experiments/" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Custom Processing Experiment",
    "dataset_id": "dataset-123",
    "workflows": [{"type": "custom", "config": {"allow_submission": true}}],
    "evaluator_slugs": ["response_quality_v1"]
  }'

Completion Workflow

curl -X POST "https://api.keywordsai.co/api/v2/experiments/" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Completion Experiment",
    "dataset_id": "dataset-123",
    "workflows": [{
      "type": "completion",
      "config": {
        "model": "gpt-4o-mini",
        "temperature": 0.7,
        "max_tokens": 150
      }
    }],
    "evaluator_slugs": ["response_quality_v1"]
  }'

Prompt Workflow

curl -X POST "https://api.keywordsai.co/api/v2/experiments/" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Prompt Experiment",
    "dataset_id": "dataset-123",
    "workflows": [{
      "type": "prompt",
      "config": {
        "prompt_id": "6caa11b48d4d440986b3eb3b96ae795e"
      }
    }],
    "evaluator_slugs": ["response_quality_v1"]
  }'

Workflow Rules

✅ Valid Combinations:

Single custom workflow
Single built-in workflow (prompt or completion)
Multiple built-in workflows chained together

❌ Invalid Combinations:

Multiple custom workflows
A custom workflow combined with any built-in workflow

Chaining (Built-in Only): When you configure multiple built-in workflows, they execute in sequence, and the output of each workflow becomes the input of the next.
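The combination rules above can be expressed as a small client-side check. The following is a sketch under the rules as stated on this page; the function name workflow_rule_violation and its messages are hypothetical, and treating an empty list as invalid is an assumption based on workflows being a required parameter.

```python
BUILT_IN_TYPES = {"prompt", "completion"}


def workflow_rule_violation(workflows):
    """Return a message describing the first rule broken, or None if valid."""
    types = [w.get("type") for w in workflows]
    if not types:
        # Assumption: 'workflows' is required, so an empty list is rejected.
        return "at least one workflow is required"
    for t in types:
        if t != "custom" and t not in BUILT_IN_TYPES:
            return f"unknown workflow type: {t}"
    custom_count = types.count("custom")
    if custom_count > 1:
        return "only one custom workflow is allowed"
    if custom_count == 1 and len(types) > 1:
        return "custom and built-in workflows cannot be mixed"
    # A single custom workflow, or one or more built-in workflows (chained).
    return None
```

For example, a list with a prompt workflow followed by a completion workflow passes, while a custom workflow followed by a completion workflow is reported as a mixed combination.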
