Ofqual exam results algorithm

In 2020, Ofqual, the regulator of qualifications, exams and tests in England, produced a grades standardisation algorithm to combat grade inflation and moderate the teacher-predicted grades for A level and GCSE qualifications in that year, after examinations were cancelled as part of the response to the COVID-19 pandemic.

History
In late March 2020, Gavin Williamson, the secretary of state for education in Boris Johnson's Conservative government, instructed the head of Ofqual, Sally Collier, to "ensure, as far as is possible, that qualification standards are maintained and the distribution of grades follows a similar profile to that in previous years". On 31 March, he issued a ministerial direction under the Apprenticeships, Skills, Children and Learning Act 2009.

Then, in August, 82% of 'A level' grades were computed using an algorithm devised by Ofqual. More than 4.6 million GCSEs in England – about 97% of the total – were assigned solely by the algorithm. Teacher rankings were taken into consideration, but not the teacher-assessed grades submitted by schools and colleges.

On 25 August, Collier, who oversaw the development of Williamson's algorithm calculation, resigned from the post of chief regulator of Ofqual following mounting pressure.

Vocational qualifications
The algorithm was not applied to vocational and technical qualifications (VTQs), such as BTECs, which are assessed on coursework or as short modules are completed; in some cases adapted assessments were held. Nevertheless, because of the high level of grade inflation resulting from Ofqual's decision not to apply the algorithm to A levels and GCSEs, Pearson Edexcel, the BTEC examiner, cancelled the release of BTEC results on 19 August, the day before they were due, so that they could be re-moderated in line with the grade inflation in A levels and GCSEs.

The algorithm
Ofqual's Direct Centre Performance model is based on the record of each centre (school or college) in the subject being assessed. Details of the algorithm were not released until after the results of its first use in August 2020, and then only in part.


 * {| border="1" cellpadding="5" cellspacing="0"


 * Synopsis
 * The examination centre provided a list of teacher predicted grades, called 'centre assessed grades' (CAGs)
 * The students were listed in rank order with no ties.


 * For large cohorts (over 15)
 * For exams with a large cohort, the previous results of the centre were consulted. For each of the three previous years, the number of students achieving each grade (A* to U) was noted, and an average percentage was taken.
 * This distribution was then applied to the current year's students, irrespective of their individual CAGs.
 * A further standardisation adjustment could be made on the basis of previous personal historic data: at A level this could be a GCSE result, at GCSE this could be a Key Stage 2 SAT.
 * For small cohorts and minority-interest exams (under 15)
 * The individual CAG was used unchanged.


 * The formulas:
 * for large schools with $$n \ge 15$$
 * $$P_{kj} = (1-r_j)C_{kj} + r_j(C_{kj} + q_{kj} - p_{kj})$$
 * for small schools with $$n<15$$
 * $$P_{kj} = \text{CAG}$$


 * The variables
 * $$n$$ is the number of pupils in the subject being assessed
 * $$k$$ is a specific grade
 * $$j$$ indicates the school
 * $$C_{kj}$$ is the historical proportion of pupils achieving grade $$k$$ at the school (centre) $$j$$ over the last three years, 2017-19.
 * $$q_{kj}$$ is the predicted grade distribution based on the class’s prior attainment at GCSEs. A class with mostly 9s (the top grade) at GCSE will get a lot of predicted A*s; a class with mostly 1s at GCSEs will get a lot of predicted Us.
 * $$p_{kj}$$ is the predicted grade distribution of the previous years' classes, based on their GCSEs. This is needed because, if previous years were predicted to do poorly but did well, this year's class might do the same.
 * $$r_j$$ is the fraction of pupils in the class where historical data is available. If you can perfectly track down every GCSE result, then it is 1; if you cannot track down any, it is 0.
 * CAG is the centre assessed grade.
 * $$P_{kj}$$ is the result, which is the grade distribution for each grade $$k$$ at each school $$j$$.


 * }
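The large-cohort formula above can be sketched in a few lines of Python. This is only an illustration of the formula as stated here, not Ofqual's implementation: the function names, the dictionary layout and the example figures are all invented for the sketch, and details such as rounding and the treatment of cohorts near the size threshold are left out.

```python
# Illustrative sketch only: names and example data are hypothetical.
GRADES = ["A*", "A", "B", "C", "D", "E", "U"]


def historical_distribution(yearly_counts):
    """Average the per-year grade shares of a centre (C_kj).

    yearly_counts is a list with one dict per previous year,
    mapping grade -> number of students who achieved it.
    """
    n_years = len(yearly_counts)
    averaged = {g: 0.0 for g in GRADES}
    for counts in yearly_counts:
        total = sum(counts.values())
        for g in GRADES:
            averaged[g] += counts.get(g, 0) / total / n_years
    return averaged


def predicted_distribution(C, q, p, r):
    """Apply P_kj = (1 - r_j) C_kj + r_j (C_kj + q_kj - p_kj) for every grade k.

    C, q and p are dicts mapping grade -> share of pupils; r is the
    fraction of pupils with matched prior-attainment data (0 to 1).
    """
    return {g: (1 - r) * C[g] + r * (C[g] + q[g] - p[g]) for g in GRADES}


if __name__ == "__main__":
    # Invented counts for three previous years at one centre.
    history = [
        {"A": 2, "B": 6, "C": 2},
        {"A": 3, "B": 5, "C": 2},
        {"A": 1, "B": 7, "C": 2},
    ]
    C = historical_distribution(history)
    # Invented prior-attainment predictions for this year (q) and past years (p).
    q = dict(C, A=C["A"] + 0.1, B=C["B"] - 0.1)
    p = dict(C)
    print(predicted_distribution(C, q, p, r=0.5))
```

With $$r_j = 0$$ the result is simply the historical distribution $$C_{kj}$$; as $$r_j$$ approaches 1, the adjustment $$q_{kj} - p_{kj}$$ for the current class's prior attainment is applied in full.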

Schools were not only asked to make a fair and objective judgement of the grade they believed a student would have achieved, but also to rank the students within each grade. This was because the statistical standardisation process required more granular information than the grade alone. Some examining boards issued guidance on the process of forming the judgement to be used within centres, where several teachers taught a subject. This was to be submitted 29 May 2020.
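The need for a rank order can be seen in a small sketch of how a grade distribution might be mapped onto a ranked list of students. This is a simplified illustration, not Ofqual's procedure: the function name, the rounding rule (cumulative shares rounded to the nearest whole student) and the example data are assumptions made for the sketch.

```python
# Illustrative sketch only: the rounding rule and all names are assumptions.
def assign_grades_by_rank(ranked_students, distribution, grades):
    """Hand out grades down a rank order according to a target distribution.

    ranked_students: list of student identifiers, best first, no ties.
    distribution: dict mapping grade -> share of the cohort (summing to 1).
    grades: list of grades, best first.
    """
    n = len(ranked_students)
    result = {}
    cumulative = 0.0
    idx = 0
    for grade in grades:
        cumulative += distribution.get(grade, 0.0)
        # Number of students who should hold this grade or better.
        cutoff = min(round(cumulative * n), n)
        while idx < cutoff:
            result[ranked_students[idx]] = grade
            idx += 1
    # Any students left over through rounding receive the lowest grade.
    while idx < n:
        result[ranked_students[idx]] = grades[-1]
        idx += 1
    return result


if __name__ == "__main__":
    students = [f"s{i}" for i in range(1, 11)]  # s1 is ranked highest
    target = {"A": 0.2, "B": 0.5, "C": 0.3}     # invented distribution
    for student, grade in assign_grades_by_rank(students, target, ["A", "B", "C"]).items():
        print(student, grade)
```

In this sketch a student's grade depends only on their position in the rank order and on the centre's target distribution; the individual centre assessed grade plays no part, which mirrors the description of the large-cohort case above.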

For A-level students, their school had already included a predicted grade as part of the UCAS university application reference. This was submitted by 15 January (15 October 2019 for Oxbridge and medicine) and had been shared with the students. This UCAS predicted grade is not the same as the Ofqual predicted grade.

The normal way to test a predictive algorithm is to run it against the previous year's data: this was not possible as the teacher rank order was not collected in previous years. Instead, tests used the rank order that had emerged from the 2019 final results.

Effects of the algorithm
The A-level grades were announced in England, Wales and Northern Ireland on 13 August 2020. Nearly 36% were lower than teachers' assessments (the CAG) and 3% were down two grades.

Side-effects of the algorithm
Students at small schools or taking minority subjects, such as are offered at small private schools (which are also more likely to have fewer students even in popular subjects), could see their grades being higher than their teacher predictions, especially when falling into the small class/minority interest bracket. Such students traditionally have a narrower range of marks, the weaker students having been invited to leave. Students at large state schools, sixth-form colleges and FE colleges, which have open-access policies and have historically educated BAME students or vulnerable students, saw their results plummet so that they fitted the historic distribution curve.

Students found the system unfair, and pressure was put on Williamson to explain the results and to reverse his decision to use the algorithm that he had commissioned and Ofqual had implemented. On 12 August, Williamson announced 'a triple lock' that let students appeal the result using an undefined valid mock result. However, when the guidance was published on 15 August, it set eight conditions that differed from the minister's statement. Hours after the announcement, Ofqual suspended the system. On 17 August, Ofqual accepted that students should be awarded the CAG grade, instead of the grade predicted by the algorithm.

UCAS said on 19 August that 15,000 pupils had been rejected by their first-choice university on the algorithm-generated grades. After the Ofqual decision to use unmoderated teacher predictions, many affected students had the grades to meet their offers, and reapplied. 90% of them said they aimed to study at top-tier universities. The effect was that top-tier universities appeared to have a capacity problem.

The Royal Statistical Society said they had offered to help with the construction of the algorithm, but withdrew that offer when they saw the nature of the non-disclosure agreement they would have been required to sign. Ofqual was not prepared to discuss it and delayed replying by 55 days.

Legal opinion
Lord Falconer, the shadow attorney general and a former Lord Chancellor, opined that three laws had been broken, and gave an example of where Ofqual had ignored a direct instruction of the Secretary of State for Education.

Falconer said the formula for standardising grades was in breach of the overarching objectives under which Ofqual was established by the Apprenticeships, Skills, Children and Learning Act 2009. The objectives require that the grading system gives a reliable indication of the knowledge, skills and understanding of the student, and that it allows for reliable comparisons to be made with students taking exams graded by other boards and to be made with students who took comparable exams in previous years.

The Labour Party suggested that the process was unlawful in that the students were given no appeal mechanism, stating: "There will be a mass of discriminatory impacts by operating the process on the basis of reflecting the previous years' results from their institutions", and "It is bound to disadvantage a whole range of groups with protected characteristics, in breach of a range of anti-discrimination legislation."