This is a beta feature. Please let us know if you encounter any issues; we'll continue to improve it.
Prerequisites
You have already created prompts in the platform. Learn how to create a prompt here.
Steps
1
Create a new evaluator
You can set up an evaluator in Evaluators. Click the + New evaluator button, and select LLM.

2
Configure an LLM evaluator
Here’s a sample evaluator configuration. We’ll describe each section in detail below.
You need to define a Slug for each evaluator. The Slug is a unique identifier for the evaluator: it is used to apply the evaluator in your LLM calls and to identify the evaluator in the Logs. We suggest you don't change it once you have created the evaluator.
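How you attach an evaluator to a call depends on your integration. As a purely illustrative sketch of the idea, the endpoint, headers, and payload field names below are hypothetical placeholders, not the platform's actual API; check the API reference for the real parameter that accepts an evaluator slug:

```python
import requests

# Hypothetical example only: the URL and the "evaluators" field are
# placeholders, not the platform's actual API.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize this article ..."}],
    # The evaluator slug created above identifies which evaluator should
    # score this call and how its results appear in the Logs.
    "evaluators": ["summary-quality"],  # hypothetical field and slug
}

response = requests.post(
    "https://api.example.com/v1/chat/completions",  # placeholder URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=30,
)
print(response.json())
```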
Then, you need to choose a model for the evaluator. The evaluator will use this model to evaluate the LLM outputs. Currently, we only support gpt-4o and gpt-4o-mini from OpenAI and Azure OpenAI.
After that, you need to write a description for the evaluator. This description helps the LLM understand the task and the expected output. You can use the following variables in the description:
{{llm_output}}: The output text from the LLM.
{{ideal_output}}: The ideal output text from the LLM. This is optional; add it if you want to give the LLM a reference output.
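For example, the description of a hypothetical summary-quality evaluator might look like this (the wording below is purely illustrative, not a built-in template):

```text
You are grading a summary produced by an LLM.

Output to evaluate:
{{llm_output}}

Reference summary (optional, may be empty):
{{ideal_output}}

Judge whether the output is factually consistent with the reference and
covers its key points.
```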
Finally, you need to define the Scoring rubric for the evaluator. The rubric helps the LLM understand the scoring criteria. The Passing score is the minimum score that the LLM output needs to achieve to be considered a passing response.
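For instance, a rubric for the same hypothetical evaluator could look like the following, with a passing score of 3 (again, purely illustrative):

```text
5 - Covers all key points and is fully consistent with the reference.
4 - Covers most key points; minor omissions, no factual errors.
3 - Covers the main point but misses details or contains a minor error.
2 - Misses several key points or contains a significant factual error.
1 - Off-topic, contradicts the reference, or is incoherent.

Passing score: 3
```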
You’re good to go now! Click the Save button to create the evaluator. Let’s move on to the next step to see how to run the evaluator in the UI.
