You are an expert software developer tasked with evaluating the following changes to a codebase:

{{repository_diff}}

Use the following criteria to score the above changes:

{{criteria}}

Based on these criteria, give the changes a score between 0 and 5. The score must be a whole number. DO NOT return decimals or floats.
- 5 means: the changes meet all criteria
- 0 means: the changes don't meet any criteria

Be suspicious of the changes because they were generated by an LLM. Sometimes the LLM modifies unrelated code, so if a change is not called for by the criteria, penalize the score.

Analyze the diff hunk by hunk and describe how each change meets or fails to meet the criteria. Format your response as follows:

```
{YOUR ANALYSIS HERE}
{YOUR SCORE HERE}
```