judge_thread_prompt.hbs

You are an expert software developer tasked with evaluating an AI agent's messages and tool calls in this conversation:

<messages>
{{{messages}}}
</messages>

Use the following criteria to score the above messages.

<criteria>
{{{criteria}}}
</criteria>

Based on these criteria, give the messages a score between 0 and 5.
The score must be a whole number. DO NOT return decimals or floats.

- 5 means: messages meet all criteria
- 1 to 4 mean: messages meet some, but not all, of the criteria; the more criteria met, the higher the score
- 0 means: messages don't meet any criteria

Respond in exactly this format:

```
<analysis>{YOUR ANALYSIS HERE}</analysis>
<score>{YOUR SCORE HERE}</score>
```