Algorithms fail the test in exams debacle

Unconscious bias a risk as coders are more often white, male and from higher income backgrounds

What was the reasoning behind the downgrades? No one seemed able or willing to explain. Photograph: Getty Images

Have we reached a Cambridge Analytica-type inflection point with algorithms?

The Facebook/Cambridge Analytica scandal made alarmingly real the secretive data-harvesting, usage and selling operations of apps and tech platforms. The ongoing debacle over the decision to award algorithm-assigned exam grades suggests that wariness about the technology and its lack of transparency may be an emerging reality, at least across the UK.

And, as we follow the front-page headlines across the Irish Sea, it is a subject of much interest in Ireland too, especially as another algorithmic system is being used to "evaluate" Leaving Cert grades.

David McKeown, UCD assistant professor in mechanical engineering dynamics and control systems, tweeted semi-seriously, “Disappointed to hear that leaving cert grades are being calculated using a computer algorithm. The papers clearly state only non-programmable calculators may be used.”


But of course, behind that jocularity is real concern. As a former chief technology officer, and now an investor working with early-stage companies, Dermot Casey noted in a tweet on Wednesday: "When you hear 'The Leaving Cert Algorithm', [you] mentally think 'The Leaving Cert set of decision rules person(s) agreed on before handing to a machine to work out'."

Angry students

In a blog post back in January entitled “Your algorithmic future: weapons of maths creation and destruction”, Casey wrote: “The proprietary nature of many algorithms and data sets means that only certain people can look at these algorithms. Worse we are building systems in a way where we don’t necessarily understand the internal workings and rules of these systems very well at all.”

The UK's A-level algorithm is a case in point. Angry students discovered that an algorithm used by the regulator, Ofqual, had often downgraded the marks awarded by their own schools, sometimes with devastating effect: university places that many students had already been offered were withdrawn because they no longer had high enough marks.

What was the reasoning behind the downgrades? No one seemed able or willing to explain. The only emerging clarity, if it might be termed so, was that students in more privileged schools and from wealthier backgrounds were less likely to see a downgrade.

Ofqual figures showed that 39.1 per cent of teacher-assigned grades were downgraded by at least one full grade by the algorithm, while grades were increased in only 2.2 per cent of cases.

Further analysis of the lowered grades indicated that the wealthiest students saw 8.3 per cent of their teacher-assigned grades downgraded, compared with 9.5 per cent for middle-income children and 10.4 per cent for low-income students.

An article in the Guardian noted: “Ofqual’s own figures showed that pupils at independent schools received double the improvement in A* and A grades compared with those attending state comprehensives, while sixth-form colleges received only a tiny improvement.”

But why?

It’s difficult to explain such figures in any way except algorithmic bias – the same sort of bias that has been shown to tilt results on standardised assessments and IQ tests that were long viewed, and by some still are, as unassailably “neutral”. But developers bring their own assumptions about what a “fair” algorithm should include, based on their own lives, experiences and norms – and privilege.

Code writers are disproportionately white and male, and from higher-income backgrounds. Again and again, this has been shown to affect algorithms used in areas like employment screening, patient assessment and even products (such as the automatic soap dispenser I’ve written about before, which only recognised hands with pale skin; people of colour had to put a tissue on their palm to trigger it).

In addition, there’s evidence that the UK’s A-level algorithm was less likely to change the grades of students in more “elite” subjects, such as Latin or philosophy, because, with fewer students enrolled in a course, the algorithm had less comparative data and would therefore leave grades as they were.
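To make that mechanism concrete, here is a minimal, purely illustrative Python sketch of such a cohort-size rule. It is not Ofqual's published model: the grade scale, the SMALL_COHORT_CUTOFF threshold, the historical_distribution input and the moderate_grades function are all assumptions invented for this example, and reshaping larger cohorts to a school's past grade profile is only one plausible reading of how such a system could behave.

    # Purely illustrative sketch of a cohort-size rule like the one described
    # above. NOT Ofqual's actual standardisation model: the threshold, grade
    # scale and function/variable names are all assumptions for the example.

    GRADES = ["A*", "A", "B", "C", "D", "E", "U"]  # best to worst
    SMALL_COHORT_CUTOFF = 15  # assumed threshold for "too few students to model"


    def moderate_grades(teacher_grades, historical_distribution):
        """Return moderated grades for one subject in one school.

        teacher_grades: teacher-assigned grades, ranked best student first.
        historical_distribution: grade -> expected share of the cohort, based
        on the school's past results (a hypothetical input for this sketch).
        """
        n = len(teacher_grades)

        # Small cohorts (a Latin or philosophy class, say) offer too little
        # comparative data, so the teacher-assigned grades are left to stand.
        if n <= SMALL_COHORT_CUTOFF:
            return list(teacher_grades)

        # Larger cohorts are re-fitted to the school's historical grade
        # profile: ranked students are sliced into bands sized by the
        # historical shares, regardless of the grades their teachers assigned.
        moderated, rank, cumulative = [], 0, 0.0
        for grade in GRADES:
            cumulative += historical_distribution.get(grade, 0.0)
            upper = min(round(cumulative * n), n)
            while rank < upper:
                moderated.append(grade)
                rank += 1
        while rank < n:  # guard against rounding shortfall
            moderated.append(GRADES[-1])
            rank += 1
        return moderated


    if __name__ == "__main__":
        # A cohort of 20 straight-A teacher assessments at a school whose
        # history suggests a mixed profile: most students end up downgraded.
        history = {"A*": 0.05, "A": 0.15, "B": 0.30, "C": 0.30, "D": 0.15, "E": 0.05}
        print(moderate_grades(["A"] * 20, history))

Even this toy version illustrates the contrast the reports describe: a small class keeps its teacher-assigned grades, while a large cohort is reshaped to fit the school's past results, whatever its teachers said.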

But we don’t really know. Ofqual has brandished non-disclosure agreements, and the algorithm’s workings remain occluded.

Bias

Ofqual defended the algorithm. Its “analyses show no evidence that this year’s process of awarding grades has introduced bias”, it said, even though the Guardian reported that some college heads said the results were some of the worst their students had ever received.

Students, rightly, took to the streets in angry protests, carrying signs with slogans such as “The algorithm stole my future” and “Ditch the algorithm” – and others with spicier verbs.

Within days, the UK government was forced into a U-turn, allowing teacher-assigned grades to stand wherever they were higher than those the algorithm produced.

Irish students, parents and universities are now demanding greater transparency if a similar approach is taken here.

This is a good immediate result. A much more significant one may well be that tens of millions of people, including an entire generation of students in Britain and Ireland – tomorrow’s thinkers and leaders, coders and policymakers – have had the reality of algorithmic non-transparency and life-altering bias brutally exposed in a personally tangible way.

Hopefully, this will reinforce emerging campaigns to force greater algorithmic transparency across governments and industry.