Social media companies are finally getting more serious about tackling online harassment. But it took the public debate about the dangers of an open Internet reaching a fever pitch, and countless users, famous and everyday, declaring that the problem had reached crisis levels – especially for women, young people, and sexual, ethnic and religious minorities around the world – before they acted.
Online harassment takes many forms, ranging from negative comments to threats, stalking and doxing. Compounding the problem, bad behavior is easily amplified by “bots.” Over the years, neither law enforcement nor online platforms have responded in full force, leading to accusations that social media companies lack the will to do so.
Like others before him, writer Yair Rosenberg grew frustrated with hateful comments on Twitter; in 2017 he created a vigilante bot to outsmart anti-Semitic comment “trolls” and bots. Twitter eventually blocked Rosenberg’s bot, saying it violated rules about comment spam, but the company has taken action in other ways to reduce harassment.
Last year, Twitter reported a decrease in harassment within six months of announcing new rules, saying it had taken action on ten times as many offending accounts in 2017 as in 2016. Facebook also acknowledged the growing problem in 2017, launching new features to prevent unwanted contact as well as safety guides for people at high risk, such as victims of domestic violence and journalists.
With approximately 2.5 billion people worldwide using social media – many in languages not supported by major platforms – the task of countering harassment in all its forms calls for a mix of approaches, both human and automated. A few companies are harnessing artificial intelligence (AI) in novel ways as part of their attempts to tackle abuse on their platforms.
In 2017, Instagram began cracking down on harassing comments with an AI tool called DeepText, trained by a multilingual team to identify and filter toxic words and emojis in English, Arabic, French, German and Portuguese. Learning from the context in which toxicity appears, DeepText identifies additional words on its own. Users can customize their own word and emoji filters too. As a final check, Instagram’s AI assesses user relationships to judge whether an offending comment is really toxic or perhaps just a joke between friends.
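The basic mechanics of that kind of filtering can be illustrated with a toy sketch. This is not DeepText – the block list, scoring and function names below are invented for illustration – but it shows the idea of combining a platform-maintained list with a user’s custom words and emojis:

```python
# Toy sketch of keyword/emoji comment filtering, loosely inspired by the
# approach described above. The word lists and logic are illustrative
# assumptions, not Instagram's actual DeepText model.

# Hypothetical platform-maintained block list.
DEFAULT_BLOCKLIST = {"idiot", "loser"}

def build_filter(custom_terms=None):
    """Combine the default block list with a user's custom words/emojis."""
    blocked = set(DEFAULT_BLOCKLIST)
    if custom_terms:
        blocked.update(term.lower() for term in custom_terms)

    def is_allowed(comment):
        # Tokenize naively and strip trailing punctuation before matching.
        tokens = (tok.strip(".,!?") for tok in comment.lower().split())
        return not any(tok in blocked for tok in tokens)

    return is_allowed

# A user adds their own terms, including an emoji.
allowed = build_filter(custom_terms=["troll", "🐍"])
```

A real system would go much further: learning new toxic terms from context rather than exact matches, and, as the article notes, consulting the relationship between two users before hiding a borderline comment.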
Gfycat, a short-video platform with over 130 million daily users, uses AI with the potential to combat another form of online harassment: “revenge porn” (non-consensual pornography), in the form of “deepfakes.”
Deepfakes are AI-manipulated videos that superimpose one individual’s face onto someone else’s body. Gfycat’s AI recognizes faces and backgrounds that exist elsewhere on the Internet; if it detects a mismatch, the video is removed. That’s good news for famous victims, but less effective for individuals who are not as easily recognizable.
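The mismatch logic can be sketched in miniature. Here the “fingerprints” are plain string IDs and the appearance index is made up; a real system would use learned face embeddings and large-scale reverse video search. The point is only the decision rule: a recognized face in a scene it never legitimately appeared in gets flagged.

```python
# Toy illustration of the face/background mismatch idea described above.
# All identifiers and data are hypothetical.

# Hypothetical index: which scenes each known face legitimately appears in.
KNOWN_APPEARANCES = {
    "face_A": {"scene_1", "scene_2"},
    "face_B": {"scene_3"},
}

def should_remove(face_id, scene_id):
    """Flag a clip when a recognized face appears in a scene it was never in."""
    scenes = KNOWN_APPEARANCES.get(face_id)
    if scenes is None:
        # Face not recognized anywhere online: the check cannot help,
        # which mirrors the limitation for less recognizable individuals.
        return False
    return scene_id not in scenes
```

The `None` branch makes the article’s caveat concrete: if a victim’s face has no presence elsewhere on the Internet, there is nothing to mismatch against.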
Yet while AI can be a useful tool in the fight against online harassment, it has troubling shortcomings that put free speech at risk. In one instance, Instagram’s AI interpreted the word “Mexican” as a slur because of its frequent use alongside hateful comments against immigrants in the United States. Attempts at automated moderation on other platforms, including Facebook and Twitter, have unjustly silenced activists and people of color. Minimizing these risks should be at the core of any strategy that uses AI against online harassment, though mistakes are likely now and in the future.
Asking machines to navigate the nuances of language is tricky territory, but with an assist from humans it could help make the Internet safer. Now that platforms are starting to respond more to online harassment, they need to work with communities of users to better understand the problems and refine guidelines, rules and best practices.
Further reading:
Instagram’s Kevin Systrom Wants to Clean Up the Internet, WIRED, August 2017
Facebook’s Secret Censorship Rules Protect White Men From Hate Speech But Not Black Children, ProPublica, June 2017
Artificial Intelligence is now fighting fake porn, WIRED, February 2018
Deepfakes are disappearing from parts of the Web, but they’re not going away, The Verge, February 2018