Can AI solve the content-moderation problem?
Anyone who spends time on social media knows that it’s hard to avoid misinformation, abusive language and offensive content. Platforms like Facebook and YouTube have content-moderation systems designed to keep obnoxious material in check, but a 2021 Cato Institute survey found that just one in four users think platforms apply their standards fairly.
“Content moderation is one of the crucial issues of our day and it is essentially broken,” says Carolina Are, a researcher at the Center for Digital Citizens at Northumbria University in the U.K. One major difficulty is that different users have different ideas about what should be prohibited online. Are swear words fair game? Nudity? Violent imagery? In the Cato survey, 60% of users said they wanted social-media platforms to give them more choice over what they see and what they don’t.
Soon this kind of personalized content moderation may become a reality, thanks to generative AI tools. In a paper presented this spring at the ACM Web Conference in Sydney, Australia, researchers Syed Mahbubul Huq and Basem Suleiman created a YouTube filter based on commercially available large language models. They used four AI chatbots to analyze subtitles from 4,098 public YouTube videos across 10 popular genres, including cartoons and reality TV. Each video was assessed on 17 metrics used by the British Board of Film Classification to assign film ratings, including violence, nudity and self-harm.
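As a rough illustration of how one step of such a pipeline could work, here is a minimal sketch that asks a chat model to score a video’s subtitles against a few BBFC-style categories. The client library, model name, prompt wording and 0–3 scale are assumptions for illustration, not the authors’ actual setup.

```python
# Rough sketch (not the authors' code): ask a chat model to score a video's
# subtitles against a few BBFC-style categories. Model name, prompt wording
# and the 0-3 scale are illustrative assumptions.
import json
from openai import OpenAI

CATEGORIES = ["violence", "nudity", "self-harm"]  # the paper uses 17 BBFC metrics

client = OpenAI()  # assumes an API key is configured in the environment


def rate_subtitles(subtitles: str) -> dict:
    """Return a severity score from 0 (none) to 3 (strong) for each category."""
    prompt = (
        "Rate the following video subtitles on each of these categories: "
        f"{', '.join(CATEGORIES)}. Use a scale from 0 (none) to 3 (strong) "
        "and reply only with a JSON object mapping category to score.\n\n"
        + subtitles
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the paper compared four different chatbots
        messages=[{"role": "user", "content": prompt}],
    )
    # Assumes the model follows the instruction and returns valid JSON.
    return json.loads(response.choices[0].message.content)
```

In the study, scores like these for the 4,098 videos were then compared against ratings from human checkers.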
Two of the chatbots, GPT-4 and Claude 3.5, were able to identify content that human checkers assessed as harmful at least 80% of the time. The system isn’t perfect, and so far it can only assess language in videos, not images. It’s also expensive: “To filter every video [on YouTube] would cost trillions of dollars at today’s prices,” Huq said.
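That 80% figure can be read as agreement with the human checkers on the videos they flagged as harmful, i.e. recall against human labels. A minimal sketch of that calculation, assuming simple per-video yes/no flags:

```python
# Hypothetical sketch: of the videos human checkers flagged as harmful,
# what fraction did the model also flag? (Recall against human labels.)
def recall(model_flags: list[bool], human_flags: list[bool]) -> float:
    caught = [m for m, h in zip(model_flags, human_flags) if h]
    return sum(caught) / len(caught) if caught else 0.0

# Example: the model catches 4 of the 5 human-flagged videos -> 0.8
print(recall([True, True, False, True, True], [True] * 5))
```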
But the demonstration model points to a future in which social-media users are able to choose exactly what kinds of content they see. “If the parent of a child thinks it’s suitable for their child to see content that’s high in sexual scenes, but low intensity in drugs, they can adjust it,” says Huq. With the cost of AI access rapidly dropping, “we’re not far away from when this will be possible.”
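Personalized filtering of that kind could then sit as a thin layer on top of the per-category scores. A minimal sketch, assuming the 0–3 scores from the earlier example and hypothetical per-user thresholds:

```python
# Hypothetical sketch of a per-user filter on top of per-category scores:
# each user sets a maximum tolerated severity per category, and a video is
# hidden if any of its scores exceeds that threshold.
def allowed(scores: dict, preferences: dict, default_max: int = 1) -> bool:
    return all(
        score <= preferences.get(category, default_max)
        for category, score in scores.items()
    )

# Example: comfortable with sexual content, strict about drug references.
prefs = {"sex": 3, "drugs": 0}
print(allowed({"sex": 2, "drugs": 1, "violence": 1}, prefs))  # False: drugs score too high
```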
Maarten Sap, an assistant professor at Carnegie Mellon University, believes effective content moderation will require specialized LLMs, since off-the-shelf models are “not built for nuanced tasks.” For instance, recent research by Sap and his colleagues finds that AIs have trouble understanding “relationship backstory,” leaving them unable to distinguish between playful banter among friends and personal attacks on strangers. Still, he agrees that LLMs offer an “opportunity” to develop finer-grained moderation tools.
Customized social-media filters could raise problems as well as solve them. Zeerak Talat, a computer scientist studying content moderation at the University of Edinburgh, notes that users could fine-tune their feeds to see more hate speech rather than less: “If everyone has personalized moderation, we have no way of controlling illegal content.” One option could be to tailor AI content moderation to the laws and preferences of different countries, rather than to individuals: Sexual or political content that would be normal in Silicon Valley could be offensive or illegal in a conservative, religious society.
Still, some critics worry that by giving individual users the power to filter what they see, tech giants would be letting themselves off the hook for problems they created. “I’m reluctant to see the onus of everything being put on the user,” says Are. “I don’t want to be told that I deserve to get raped, and I have to be the one putting it in the feed.”
Talat agrees, noting that customized moderation may protect an individual user, but it doesn’t stop hateful or offensive content from spreading to others. “It defends your stream,” he says. “But it does nothing about what is said about you.”
Chris Stokel-Walker is a journalist and author based in the U.K. His latest book is “How AI Ate the World.”