Bias in Machine Learning Algorithms

Foxh0und
4 min read · Feb 17, 2021

Progress in machine learning and its capacity to solve practical problems have heralded a new era in the broader domains of artificial intelligence (AI) and technology. Machine learning algorithms can now identify clusters of cancerous cells in radiographs, write persuasive advertising copy and help keep self-driving cars safe. But alongside these benefits there is also great potential for harmful societal impacts.

Broadly, machine learning algorithms learn patterns in datasets and use the resulting model to map inputs to outputs. Though the models and algorithms appear mathematical and objective, they are nevertheless susceptible to implicit bias, and some of the most pervasive biases are reinforced by feedback loops. For example, a credit reporting agency might use a machine learning algorithm to decide whether a consumer's application for credit is approved or declined. If the application is declined, that decision is fed back into the model, further reducing the consumer's chances of obtaining credit in future and compounding the problem. Where the input data are implicitly biased, the model will reinforce the bias present in the data, as demonstrated by Amazon's failed recruitment engine.
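To see how such a loop can widen an existing gap, consider a minimal sketch in Python. The figures are invented: two groups of applicants have the same underlying ability to repay, but one starts with slightly lower recorded scores because of historical bias, and every decline further depresses the score used in the next round of decisions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical applicants: group B starts with slightly lower recorded scores,
# even though true repayment ability is identical across groups.
n = 1000
group = rng.integers(0, 2, n)            # 0 = group A, 1 = group B
score = rng.normal(650, 50, n) - 15 * group

threshold = 640
for round_no in range(5):
    approved = score >= threshold
    # Feedback loop: a decline is written back to the credit file,
    # lowering the score used in the next round of decisions.
    score = score + np.where(approved, 5, -10)
    rate_a = approved[group == 0].mean()
    rate_b = approved[group == 1].mean()
    print(f"round {round_no}: approval rate A={rate_a:.2f}, B={rate_b:.2f}")
```

Running the sketch shows the approval gap between the two groups growing each round, even though nothing about the applicants themselves has changed.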

Besides the unintentional biases that machine learning models can exacerbate, explicit bias can also be built into 'supervised' or 'reinforcement learning' algorithms. Last year the Home Office was forced to stop using an algorithm that discriminated against visa applicants from countries on a predetermined list, after legal challenges brought by Foxglove and the Joint Council for the Welfare of Immigrants (JCWI). Supervised learning algorithms iteratively optimise an objective function so that they can predict an output for a novel input. If the training data are biased (e.g. historical applications from Country X were disproportionately rejected), the algorithm will learn that prejudice and faithfully optimise for it, in effect learning to reject applications from particular countries.
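A toy illustration of this, on made-up data, is below: the historical labels encode a prejudice against applications from 'Country X', and a simple classifier trained on those labels learns a large positive weight on the country flag rather than anything about merit. The feature names and figures are invented for the sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical applications: a 'country_x' flag and an unrelated merit score.
n = 5000
country_x = rng.integers(0, 2, n)
merit = rng.normal(0, 1, n)

# Biased historical labels: applications from Country X were rejected far
# more often regardless of merit, so the labels encode the prejudice.
p_reject = 1 / (1 + np.exp(-(-0.5 * merit + 2.0 * country_x - 0.5)))
rejected = rng.random(n) < p_reject

X = np.column_stack([merit, country_x])
model = LogisticRegression().fit(X, rejected)

# The learned weight on 'country_x' shows the model reproducing the
# discrimination present in the training labels.
print(dict(zip(["merit", "country_x"], model.coef_[0].round(2))))
```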

Bias can also arise from poor-quality input data. Facial recognition technology (FRT) typically performs badly when it is used to identify people with darker skin tones, particularly when image quality is inadequate, and in recent years there have been a number of reported cases of citizens being misidentified as criminals or suspects. The US National Institute of Standards and Technology (NIST) report from December 2019 found that in one-to-one matching, most systems had a higher rate of false positive matches for Asian and African-American faces than for Caucasian faces, sometimes by a factor of 10 or even 100. Because FRT is prone to misidentifying people who belong to historically marginalised groups, its use risks further entrenching that historical discrimination.
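The disparity NIST measured is a difference in false positive rates: of all the comparisons where the two images show different people, how often the system wrongly declares a match, broken down by demographic group. A minimal sketch of that calculation, on invented results, might look like this:

```python
import numpy as np

# Hypothetical impostor comparisons (each pair shows two different people).
# 'matched' is 1 where the system wrongly declared a match.
groups  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
matched = np.array([0, 0, 0, 1, 1, 1, 0, 1])

for g in ("A", "B"):
    # False positive rate = wrong matches / impostor comparisons in the group.
    fpr = matched[groups == g].mean()
    print(f"group {g}: false positive rate = {fpr:.2f}")
```

Large ratios between the per-group rates are exactly the kind of disparity the NIST study reported.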

It is clear that machine learning and AI can be transformative and useful when applied in the right way, but how can the implicit and explicit biases embedded in such powerful technology be eliminated? Technologists have risen to the challenge posed by this kind of inequity and built tools to counter it. Academics at the Oxford Internet Institute have developed a new method for detecting bias and discrimination in machine learning systems, which has been integrated into a new Amazon Web Services toolkit.
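Metrics in that family compare how a model's outcomes are distributed across demographic groups. As a rough, hypothetical sketch (not the toolkit's own implementation), a simple disparity measure could be computed like this:

```python
import numpy as np

def demographic_disparity(group, accepted, target_group):
    """Share of rejections falling on `target_group` minus its share of
    acceptances; a positive value means the group is over-represented
    among rejections. A sketch of one metric in this family, not a
    reproduction of any particular toolkit's implementation."""
    rejected = ~accepted
    p_rejected = (group[rejected] == target_group).mean()
    p_accepted = (group[accepted] == target_group).mean()
    return p_rejected - p_accepted

# Hypothetical decisions for two groups, with group B accepted less often.
rng = np.random.default_rng(2)
group = rng.choice(["A", "B"], size=1000, p=[0.7, 0.3])
accepted = rng.random(1000) < np.where(group == "A", 0.6, 0.4)
print(round(demographic_disparity(group, accepted, "B"), 3))
```

A value well above zero flags that the model's rejections fall disproportionately on the group in question, which is the sort of signal these auditing tools surface for human review.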

Apart from technological solutions to the bias problem, there are also ways to mitigate it through legislation and enforcement. The GDPR and the UK GDPR require data protection impact assessments wherever novel technology is deployed, but the legislation contains derogations, such as processing data on the basis of performing a task in the public interest, which might allow a police force to use facial recognition technology to ensure the safety of high street shoppers, for instance. Proper regulation and enforcement of machine learning systems by governments and supervisory authorities is necessary to ensure that public and private organisations alike are disincentivised from deploying novel technology without rigorous testing and sufficient consideration of bias and discrimination first.

Lastly, it will be increasingly important that those who are vulnerable to discrimination, in all parts of society, are properly represented in the upper echelons of management structures, in corporations and public bodies alike. When the right people are no longer excluded and are consistently 'in the room', they will be better able to effect change and begin to solve these problems.

There is huge potential for machine learning and AI to be used for societal good, but it will take an attentive, humanitarian and sympathetic approach to the aforementioned issues in order to ensure the fair and non-discriminatory application of the technology.

###

Further reading: Why Fairness Cannot Be Automated: Bridging the Gap Between EU Non-discrimination Law and AI


Foxh0und

Welsh privacy professional writing short digests for the discerning tech user