CAMBRIDGE CATALYST ISSUE 04

AI SPECIAL

ARE YOU SURE YOU WANT TO POST IT?

Dr Marcus Tomalin, senior research associate at the University of Cambridge's Machine Intelligence Laboratory, looks at using AI to combat hate speech online.

There are currently deep concerns about the psychological and societal harms caused by online hate speech. Published in June, a government report called the Online Harms White Paper recognised that “hateful content on digital platforms is a growing problem in the UK, inflicting harm on victims, creating and exacerbating social divisions, and eroding trust in host platforms”. But determining the scale and scope of hateful online content is not easy. Hate speech is defined as abusive or threatening language directed at an individual (or group) who is targeted because of protected characteristics such as gender, race, religion and age. In many countries, victims of such hate speech can seek redress under existing laws. But online hate speech raises particular problems: many of the offensive messages are posted anonymously, and some of them are generated by automated chatbots.

This topic is of particular contemporary relevance because language-based artificial intelligence (AI) systems are starting to determine with reasonable accuracy whether a given utterance constitutes hate speech. These emerging technologies present the possibility of handling the growing phenomenon of offensive online language in new ways. For instance, automated hate speech detection systems could enable the responsibility of dealing with harmful messages to be delegated to the users themselves, rather than to corporations or to governments. A specific example should clarify this. If someone with the username ‘White Dragon’ tried to post a blatantly homophobic message as a comment beneath a YouTube video, then the system could flag the message automatically and, before it appeared, ask its author: are you sure you want to post it?
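To make the idea concrete, the Python sketch below shows how such a pre-posting check might be wired together. It is purely illustrative and not the Cambridge system itself: a toy bag-of-words classifier (scikit-learn's TfidfVectorizer and LogisticRegression) stands in for the far more capable language models used in research, and the example messages, the 0.5 threshold and the function name are all invented for the illustration.

# A minimal sketch of a user-side "are you sure?" check, assuming a toy
# classifier; it is NOT the detection system described in the article.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set: label 1 marks abusive messages, 0 benign.
messages = [
    "people like you should be banned from existing",
    "go back to where you came from",
    "your kind makes me sick",
    "great video, thanks for sharing",
    "I completely disagree with your argument",
    "congratulations on the new job",
]
labels = [1, 1, 1, 0, 0, 0]

# Bag-of-words features plus logistic regression: a simple stand-in for
# the large language models a real detection system would use.
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
classifier.fit(messages, labels)

def screen_before_posting(draft: str, threshold: float = 0.5) -> bool:
    """Return True if the draft may be posted without a warning."""
    # predict_proba columns follow classifier.classes_ ([0, 1]), so
    # index 1 is the estimated probability that the message is abusive.
    p_abusive = classifier.predict_proba([draft])[0][1]
    if p_abusive >= threshold:
        print(f"Are you sure you want to post it? (abuse score: {p_abusive:.2f})")
        return False  # hold the message and let the user reconsider
    return True

In a deployed system the classifier would be a large pre-trained language model and the threshold would need careful tuning, since false positives restrain legitimate speech while false negatives let abuse through.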

At present, social media companies generally deal with hate speech reactively on their platforms. An already-offended user can report an offensive message, but it will only be removed if human moderators decide to uphold the complaint. In terms of protection, this is usually too little, too late: the victim has already suffered; the damage has already been done. Keen to take a more proactive stance, Facebook recently chose to ban all overtly white nationalist material, having previously banned white supremacist content. While those decisions may well be positive and effective ones, it's important to consider whether unelected corporations should be the self-appointed gatekeepers of censorship and free speech in our modern digital democracies.

