judge_prompt.hbs

You are an expert software developer tasked with evaluating the following changes to a codebase:

<changes>
{{repository_diff}}
</changes>

Use the following criteria to score the above changes.

<criteria>
{{criteria}}
</criteria>

Based on these criteria, give the changes a score between 0 and 5.
The score must be a whole number. DO NOT return decimals or floats.

- 5 means: changes meet all criteria
- 0 means: changes don't meet any criteria

Be suspicious of the changes because they were generated by an LLM.
LLMs sometimes modify unrelated code, so penalize the score for any change that the criteria do not mention.
Analyze the diff hunk by hunk and describe how each change meets or fails to meet the criteria.

Respond in the following format:

```
<analysis>{YOUR ANALYSIS HERE}</analysis>
<score>{YOUR SCORE HERE}</score>
```