Overview

Run evaluators on experiment results to compute evaluation metrics and assess model performance.

Method Signature

# Synchronous
client.experiments.run_experiment_evals(
    experiment_id: str,
    evaluator_ids: List[str],
    run_id: Optional[str] = None
) -> Dict[str, Any]

# Asynchronous
await client.experiments.run_experiment_evals(
    experiment_id: str,
    evaluator_ids: List[str],
    run_id: Optional[str] = None
) -> Dict[str, Any]
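
A minimal sketch of calling the asynchronous form from an async context. How an async-capable client is constructed is not shown in this section, so the `client` below is assumed to be such an instance:

async def evaluate_latest_run(client, experiment_id: str) -> dict:
    # Awaits the evaluation call; `client` is assumed to be an
    # async-capable KeywordsAI client, as implied by the signature above.
    return await client.experiments.run_experiment_evals(
        experiment_id=experiment_id,
        evaluator_ids=["eval_accuracy", "eval_relevance"],
    )

# Drive it from synchronous code with, for example:
# asyncio.run(evaluate_latest_run(client, "exp_123"))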

Parameters

experiment_id (string, required)
The unique identifier of the experiment.

evaluator_ids (List[str], required)
List of evaluator IDs to run on the experiment results.

run_id (string, optional)
Specific run ID to evaluate. If not provided, the latest run is evaluated.

Returns

Returns a dictionary containing the evaluation results and metrics.

Example

from keywordsai import KeywordsAI

client = KeywordsAI(api_key="your-api-key")

# Run evaluations on the latest experiment run
evaluator_ids = ["eval_accuracy", "eval_relevance"]

result = client.experiments.run_experiment_evals(
    experiment_id="exp_123",
    evaluator_ids=evaluator_ids
)

print(f"Evaluation status: {result['status']}")
print(f"Metrics: {result['metrics']}")

# Run evaluations on a specific run
result = client.experiments.run_experiment_evals(
    experiment_id="exp_123",
    evaluator_ids=evaluator_ids,
    run_id="run_456"
)
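
The full shape of the returned dictionary is defined by the API; the sketch below reads the `status` and `metrics` keys from the example above defensively, assuming `metrics` is a mapping of metric name to value (adjust if your responses are shaped differently):

# Defensive access to the evaluation result: missing keys fall back to
# defaults instead of raising KeyError.
status = result.get("status", "unknown")
metrics = result.get("metrics", {}) or {}

print(f"Evaluation status: {status}")
for name, value in metrics.items():
    print(f"  {name}: {value}")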

Error Handling

try:
    result = client.experiments.run_experiment_evals(
        experiment_id="exp_123",
        evaluator_ids=["eval_accuracy"]
    )
except Exception as e:
    # Catching the broad Exception keeps the example SDK-agnostic;
    # narrow this to the SDK's specific error types where possible.
    print(f"Error running evaluations: {e}")