| Parameter | Type | Description |
|---|---|---|
| evaluator_id | string | The unique ID of the evaluator to run |

The inputs object applies to all evaluator types (llm, human, code). The same fields are also recorded and visible on the Scores page for every evaluation.

| Field | Type | Required | Description |
|---|---|---|---|
| inputs | object | Yes | The unified input object containing all evaluation data |
| inputs.input | any JSON | Yes | The request/input to be evaluated |
| inputs.output | any JSON | Yes | The response/output being evaluated |
| inputs.metrics | object | No | System-captured metrics (e.g., tokens, latency, cost) |
| inputs.metadata | object | No | Context and custom properties you pass; also logged |
| inputs.llm_input | string | No | Legacy convenience alias for input (maps to the unified fields) |
| inputs.llm_output | string | No | Legacy convenience alias for output (maps to the unified fields) |
inputs is auto-populated from the request/response and tracing data. {{llm_input}}/{{llm_output}} placeholders remain supported and transparently map to the unified fields {{input}} and {{output}}.
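
A minimal request sketch in Python, assuming a Bearer-token REST endpoint; the endpoint URL, API key handling, and all field values here are illustrative assumptions, not part of the documented surface:

```python
import requests

# Assumed endpoint and auth scheme -- substitute your deployment's values.
API_URL = "https://api.example.com/v1/evaluators/run"  # hypothetical URL
API_KEY = "YOUR_API_KEY"

payload = {
    "evaluator_id": "evaluator-123",  # unique ID of the evaluator to run
    "inputs": {
        # Required unified fields: any JSON is accepted.
        "input": {"question": "What is the capital of France?"},
        "output": {"answer": "Paris"},
        # Optional: system-captured metrics and custom context.
        "metrics": {"total_tokens": 57, "latency_ms": 420},
        "metadata": {"customer_id": "acme", "environment": "prod"},
    },
}

response = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()
result = response.json()
```

A successful call returns an object with the fields below.
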
| Field | Type | Description |
|---|---|---|
| score | varies | The evaluation score (type depends on the evaluator's score_value_type) |
| score_type | string | The type of score: numerical, boolean, categorical, or comment |
| evaluator_id | string | ID of the evaluator that was run |
| evaluator_name | string | Name of the evaluator that was run |
| evaluation_result | object | Detailed evaluation results and reasoning |
| inputs | object | The input data that was evaluated (echoed back) |
| execution_time | number | Time taken to execute the evaluation (in seconds) |
| timestamp | string | ISO timestamp of when the evaluation was performed |
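
For orientation, an illustrative result object matching the table above; every value is invented for demonstration:

```python
# Illustrative only -- values are made up to show the shape of a response.
example_result = {
    "score": 4.5,
    "score_type": "numerical",
    "evaluator_id": "evaluator-123",
    "evaluator_name": "answer-quality",  # hypothetical evaluator name
    "evaluation_result": {"reasoning": "Answer is correct and concise."},
    "inputs": {"input": {"question": "..."}, "output": {"answer": "Paris"}},
    "execution_time": 1.27,  # seconds
    "timestamp": "2024-05-01T12:34:56Z",
}
```
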
The shape of score depends on score_type:

- numerical: a number (e.g., 4.5, 8.2) bounded by the evaluator's min_score and max_score; it is compared against the passing_score threshold.
- boolean: true or false; true = passed, false = failed.
- categorical: one or more categories (e.g., ["Good", "Accurate"]) drawn from the evaluator's categorical_choices.
- comment: free-text feedback.
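
A small dispatch sketch over score_type, assuming the response shape above; interpret is a hypothetical helper, and pass/fail for numerical scores is decided against the passing_score threshold by the evaluator itself, not here:

```python
def interpret(result: dict) -> str:
    """Summarize an evaluation result according to its score_type."""
    score = result["score"]
    score_type = result["score_type"]

    if score_type == "numerical":
        # A number within the evaluator's min_score/max_score range.
        return f"numeric score: {score}"
    if score_type == "boolean":
        # true = passed, false = failed.
        return "passed" if score else "failed"
    if score_type == "categorical":
        # One or more categories drawn from categorical_choices.
        return "categories: " + ", ".join(score)
    if score_type == "comment":
        return f"comment: {score}"  # free-text feedback
    raise ValueError(f"unknown score_type: {score_type}")


# Using the illustrative example_result from above:
print(interpret(example_result))  # -> "numeric score: 4.5"
```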