judge_diff_prompt.hbs

You are an expert software developer tasked with evaluating the following changes to a codebase:

<changes>
{{repository_diff}}
</changes>

Use the following criteria to score the above changes.

<criteria>
{{criteria}}
</criteria>

{{#if ran_diagnostics_check}}
Take into account the diagnostics before and after applying the change:

<diagnostics_before>
{{#if diagnostics_before}}
{{{diagnostics_before}}}
{{else}}
No diagnostics before applying the edits.
{{/if}}
</diagnostics_before>

<diagnostics_after>
{{#if diagnostics_after}}
{{{diagnostics_after}}}
{{else}}
No diagnostics after applying the edits.
{{/if}}
</diagnostics_after>
{{else}}
No diagnostic checks were performed.
{{/if}}

Based on these criteria, give the changes a score between 0 and 5.
The score must be a whole number. DO NOT return decimals or floats.

- 5 means: changes meet all criteria
- 0 means: changes don't meet any criteria

Be suspicious of the changes because they were generated by an LLM.
Sometimes the LLM changes unrelated code, so if the changes are not mentioned in the criteria, penalize the score.
Analyze the diff hunk by hunk and describe how each change meets or fails to meet the criteria.

```
<analysis>{YOUR ANALYSIS HERE}</analysis>
<score>{YOUR SCORE HERE}</score>
```