Overview
Custom workflows give you complete control over processing logic while leveraging Keywords AI’s evaluation infrastructure. You submit your own workflow results via API, and the system automatically runs evaluators on your outputs.
How It Works
- Create Experiment: Configure a custom workflow with evaluators
- Get Placeholder Traces: The system creates traces with status: "pending" containing the dataset inputs
- Process Externally: Retrieve the inputs and process them with your own logic
- Submit Results: Update the traces via PATCH with your outputs
- Auto-Evaluation: The system runs evaluators and updates the trace to status: "success"
Key Benefits
- Full Control: Use any processing logic, models, or external systems
- Automatic Evaluation: Evaluators run automatically on submitted outputs
- Flexible Data Types: Any JSON-serializable input/output
- Partial Updates: PATCH only the fields you need
Constraints
- Custom and built-in workflows are mutually exclusive; they cannot be mixed in the same experiment
- Only one custom workflow per experiment
- Custom workflows are atomic (no chaining)
Configuration
Workflow Config
Type: "custom"
Config Fields (all optional):
| Field | Type | Required | Description |
|---|---|---|---|
| allow_submission | boolean | No | Allow trace updates (default: true) |
| timeout_hours | number | No | Submission timeout in hours |
API Endpoints
1. Create Custom Workflow Experiment
POST /api/v2/experiments/
Creates experiment and placeholder traces. Execution happens in the background - check status in the platform UI.
Request:
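A minimal sketch of the request in Python. The workflow type "custom" and the optional config fields come from the table above; the base URL, auth header, and the name/dataset_id/evaluator_slugs field names are placeholders for illustration only — check the API Reference for the exact schema.

```python
import requests

BASE_URL = "https://api.keywordsai.co"               # placeholder: adjust to your deployment
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder auth header

payload = {
    # hypothetical field names for illustration
    "name": "my-custom-workflow-experiment",
    "dataset_id": "dataset_123",
    "evaluator_slugs": ["relevance"],
    "workflow": {
        "type": "custom",          # documented workflow type
        "allow_submission": True,  # optional (default: true)
        "timeout_hours": 24,       # optional submission timeout
    },
}

resp = requests.post(f"{BASE_URL}/api/v2/experiments/", json=payload, headers=HEADERS)
resp.raise_for_status()
experiment_id = resp.json()["id"]  # assumes the response includes the new experiment's id
```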
2. List Placeholder Traces
GET /api/v2/experiments/{experiment_id}/logs/list/
Retrieves placeholder traces with dataset inputs.
Query Parameters:
- page: Page number (default: 1)
- page_size: Results per page (default: 100)
Response notes:
- status: "pending" indicates the trace is awaiting your submission
- input contains the dataset entry to process
- output is empty until you submit
- Use id for detail/update operations
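A sketch of retrieving the pending placeholders, reusing the placeholder BASE_URL and HEADERS from the sketch above and assuming the paginated response exposes the traces under a results key (adjust to the actual response shape).

```python
resp = requests.get(
    f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/list/",
    params={"page": 1, "page_size": 100},
    headers=HEADERS,
)
resp.raise_for_status()
traces = resp.json().get("results", [])   # assumed key for the paginated trace list
pending = [t for t in traces if t.get("status") == "pending"]
for t in pending:
    print(t["id"], t["input"])            # input holds the dataset entry to process
```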
3. Get Trace Details
GET /api/v2/experiments/{experiment_id}/logs/{trace_id}/
Get full trace with complete (untruncated) input.
Query Parameters:
- detail: Include span tree (default: 1/true)
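A one-call sketch, reusing the variables from the previous sketch:

```python
trace_id = pending[0]["id"]
resp = requests.get(
    f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/{trace_id}/",
    params={"detail": 1},        # include the span tree
    headers=HEADERS,
)
resp.raise_for_status()
trace = resp.json()
full_input = trace["input"]      # complete, untruncated input
```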
4. Submit Workflow Results
PATCH /api/v2/experiments/{experiment_id}/logs/{trace_id}/
Update placeholder with your results. Evaluators run automatically.
Request:
- input: Updated input (any JSON type)
- output: Your workflow output (any JSON type)
- name: Trace name
- customer_identifier: Your identifier
- metadata: Custom metadata object
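A minimal sketch of a submission in Python, reusing the experiment_id and trace_id from the earlier sketches. The timestamps are illustrative (2.5 seconds apart), and processor_version is just an example of an extra metadata field you might keep.

```python
update = {
    "output": {"answer": "...your workflow output..."},
    "name": "my-custom-workflow",
    "customer_identifier": "customer-42",
    "metadata": {
        "start_time": "2024-01-15T10:30:00.000Z",  # when your workflow started
        "end_time": "2024-01-15T10:30:02.500Z",    # 2.5 s later: used for accurate latency
        "processor_version": "1.4.2",              # unknown fields are preserved for your use
    },
}

resp = requests.patch(
    f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/{trace_id}/",
    json=update,
    headers=HEADERS,
)
resp.raise_for_status()
```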
Include start_time and end_time in the metadata field to get:
- Accurate Latency: Calculated from your actual execution time (2.5 seconds in the sketch above)
- Better Analytics: Meaningful performance metrics in experiment summaries
- Preserved Context: Unknown fields (such as processor_version) stay in metadata for your use
Without these timestamps:
- Latency = time between placeholder creation and result submission (inaccurate)
- Typically shows a much longer duration than the actual workflow execution
Use ISO 8601 timestamps (e.g., "2024-01-15T10:30:00.000Z").
Response (200):
- Response is optimistic - shows your submitted data immediately
- Evaluators run in the background automatically
- Status changes: pending → success (or error if an evaluator fails)
- Partial updates supported: only include the fields you want to change
5. Check Evaluator Results
GET /api/v2/experiments/{experiment_id}/logs/{trace_id}/?detail=1
Poll to see evaluator results after submission.
Response (200):
- span_name: "evaluator.{slug}"
- log_type: "score"
- output: Score results, including:
  - primary_score: Numerical score (if applicable)
  - string_value: Text evaluation
  - json_value: Structured data
  - boolean_value: Pass/fail
  - categorical_value: Category
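A sketch of reading the scores, reusing the earlier placeholders and assuming the detail response exposes its span tree under a spans key (verify the actual response shape):

```python
resp = requests.get(
    f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/{trace_id}/",
    params={"detail": 1},
    headers=HEADERS,
)
spans = resp.json().get("spans", [])   # assumed key for the span tree
for span in spans:
    if span.get("span_name", "").startswith("evaluator.") and span.get("log_type") == "score":
        output = span.get("output") or {}
        print(span["span_name"], output.get("primary_score"))
```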
Data Types
Input/Output Support
Both input and output accept any JSON-serializable type:
Strings, objects, arrays, numbers, and booleans are all valid values.
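For example, each of these is a valid output (or input) value; the values themselves are illustrative:

```python
# A plain string
string_output = "The capital of France is Paris."

# A structured object
object_output = {
    "answer": "Paris",
    "confidence": 0.97,
    "sources": ["https://en.wikipedia.org/wiki/Paris"],
}

# An array, e.g. a list of messages
array_output = [
    {"role": "assistant", "content": "Paris"},
]

# Any of these can be sent as the "output" (or "input") field of the PATCH body.
```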
Complete Example
Step 1: Create Experiment
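The steps below form one continuous Python sketch. The base URL, auth header, and the name/dataset_id/evaluator_slugs fields are placeholders; only the endpoint path and the "custom" workflow type come from the documentation above.

```python
import requests

BASE_URL = "https://api.keywordsai.co"               # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder

resp = requests.post(
    f"{BASE_URL}/api/v2/experiments/",
    json={
        "name": "support-bot-eval",        # hypothetical
        "dataset_id": "dataset_123",       # hypothetical
        "evaluator_slugs": ["relevance"],  # hypothetical
        "workflow": {"type": "custom"},
    },
    headers=HEADERS,
)
resp.raise_for_status()
experiment_id = resp.json()["id"]  # assumes the response returns the experiment id
```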
Step 2: Get Placeholders
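List the placeholder traces created for the experiment (the results key for pagination is an assumption):

```python
import time

time.sleep(10)  # placeholders are created in the background; give it a few seconds

resp = requests.get(
    f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/list/",
    params={"page": 1, "page_size": 100},
    headers=HEADERS,
)
resp.raise_for_status()
placeholders = [
    t for t in resp.json().get("results", [])  # assumed pagination key
    if t.get("status") == "pending"
]
```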
Step 3: Process Input (Your Code)
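Run your own logic over each input. The run_my_workflow function and the question field are stand-ins for whatever your pipeline and dataset actually contain; the timestamps are captured so they can be submitted in the next step.

```python
from datetime import datetime, timezone

def run_my_workflow(entry: dict) -> dict:
    """Stand-in for your own pipeline: any models, tools, or external systems."""
    question = entry.get("question", "")   # hypothetical dataset field
    return {"answer": f"Processed: {question}"}

results = {}
for t in placeholders:
    start = datetime.now(timezone.utc)
    output = run_my_workflow(t["input"])
    end = datetime.now(timezone.utc)
    results[t["id"]] = (output, start, end)
```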
Step 4: Submit Results
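PATCH each placeholder with its output and the recorded timing (ISO 8601 timestamps, as described above):

```python
def iso(dt):
    """Format an aware UTC datetime as an ISO 8601 string with a Z suffix."""
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")

for trace_id, (output, start, end) in results.items():
    resp = requests.patch(
        f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/{trace_id}/",
        json={
            "output": output,
            "metadata": {"start_time": iso(start), "end_time": iso(end)},
        },
        headers=HEADERS,
    )
    resp.raise_for_status()
```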
Step 5: Check Evaluators
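Poll each trace until it leaves "pending", then inspect the evaluator spans (the spans key is an assumption about the detail response shape):

```python
for trace_id in results:
    detail = {}
    for _ in range(10):                       # poll up to ~30 s per trace
        detail = requests.get(
            f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/{trace_id}/",
            params={"detail": 1},
            headers=HEADERS,
        ).json()
        if detail.get("status") != "pending":
            break
        time.sleep(3)
    evaluator_spans = [
        s for s in detail.get("spans", [])    # assumed key for the span tree
        if s.get("span_name", "").startswith("evaluator.")
    ]
    print(trace_id, detail.get("status"), [s.get("output") for s in evaluator_spans])
```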
Error Handling
Validation Errors
Empty Update (Valid): a PATCH with an empty body is accepted.
Evaluator Errors
If an evaluator fails, its span shows the error and the trace status becomes error.
Best Practices
1. Poll for Results
Evaluators run in the background. Poll every 2-5 seconds, for example:
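A minimal polling sketch, reusing the placeholder BASE_URL and HEADERS from earlier:

```python
import time
import requests

def wait_for_evaluation(experiment_id, trace_id, interval=3, attempts=20):
    """Poll the detail endpoint until the trace leaves 'pending' (or attempts run out)."""
    trace = {}
    for _ in range(attempts):
        trace = requests.get(
            f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/{trace_id}/",
            params={"detail": 1},
            headers=HEADERS,
        ).json()
        if trace.get("status") != "pending":
            return trace
        time.sleep(interval)
    return trace
```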
2. Use Detail Endpoint
The list view truncates input/output. Use the detail endpoint (see "3. Get Trace Details" above) for full data.
3. Handle Errors Gracefully
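For example, check the HTTP status when submitting and log failures so a single bad trace doesn't stop the batch (a sketch, reusing the earlier placeholders):

```python
import requests

def submit_result(experiment_id, trace_id, body):
    """PATCH one trace; return True on success, False on any request failure."""
    try:
        resp = requests.patch(
            f"{BASE_URL}/api/v2/experiments/{experiment_id}/logs/{trace_id}/",
            json=body,
            headers=HEADERS,
            timeout=30,
        )
        resp.raise_for_status()
        return True
    except requests.RequestException as exc:
        print(f"Submission failed for {trace_id}: {exc}")
        return False
```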
4. Include Metadata
Add debugging info (for example, a processor_version field) in metadata; unknown fields are preserved for your use.
Troubleshooting
Placeholders Not Created
Symptom: List returns empty after experiment creation
Causes:
- Experiment still processing in the background
- Dataset is empty
Solutions:
- Wait a bit longer (5-10 seconds after creation)
- Check experiment status in the platform UI
- Verify the dataset has entries
- Wait and retry
Evaluators Not Running
Symptom: No evaluator spans after submission
Causes:
- Evaluator slug doesn't exist
- Still processing in the background (wait longer)
- Evaluator configuration error
Solutions:
- Verify the evaluator exists in the platform UI
- Wait 10-20 seconds and poll again
- Check the evaluator configuration
Status Stuck on “pending”
Symptom: Trace stays pending after submission
Causes:
- PATCH request failed
- Background processing error
See Also
- API Reference - Complete endpoint documentation
- Workflows Overview - Compare workflow types
- Complete Example Notebook