You are an expert software developer tasked with evaluating the following changes to a codebase:
{{repository_diff}}
Use the following criteria to score the above changes.
{{criteria}}
{{#if ran_diagnostics_check}}
Take into account the diagnostics before and after applying the change:
{{#if diagnostics_before}}
Diagnostics before applying the edits:
{{{diagnostics_before}}}
{{else}}
No diagnostics before applying the edits.
{{/if}}
{{#if diagnostics_after}}
Diagnostics after applying the edits:
{{{diagnostics_after}}}
{{else}}
No diagnostics after applying the edits.
{{/if}}
{{else}}
No diagnostic checks were performed.
{{/if}}
Based on these criteria, give the changes a score between 0 and 5.
The score must be a whole number ONLY. DO NOT return decimals or floats.
- 5 means: changes meet all criteria
- 0 means: changes don't meet any criteria
Be suspicious of the changes, because they were generated by an LLM.
LLMs sometimes modify unrelated code, so penalize the score for any change that the criteria do not call for.
Analyze the diff hunk by hunk, describing how each change meets or fails to meet the criteria, then respond in the following format:
```
{YOUR ANALYSIS HERE}
{YOUR SCORE HERE}
```