Now Anyone Can Deploy Google’s Troll-Fighting AI – WIRED

Last September, a Google offshoot called Jigsaw declared war on trolls, launching a project to defeat online harassment using machine learning. Now, the team is opening up that troll-fighting system to the world.

On Thursday, Jigsaw and its partners on Google's Counter Abuse Technology Team released a new piece of code called Perspective, an API that gives any developer access to the anti-harassment tools that Jigsaw has worked on for over a year. Part of the team's broader Conversation AI initiative, Perspective uses machine learning to automatically detect insults, harassment, and abusive speech online. Enter a sentence into its interface, and Jigsaw says its AI can immediately spit out an assessment of the phrase's toxicity more accurately than any keyword blacklist, and faster than any human moderator.

The Perspective release brings Conversation AI a step closer to its goal of helping to foster troll-free discussion online, and filtering out the abusive comments that silence vulnerable voices, or, as the project's critics have less generously put it, to sanitize public discussions based on algorithmic decisions.

Conversation AI has always been an open source project. But by opening up that system further with an API, Jigsaw and Google can offer developers the ability to tap into that machine-learning-trained speech toxicity detector running on Google's servers, whether for identifying harassment and abuse on social media or more efficiently filtering invective from the comments on a news website.

"We hope this is a moment where Conversation AI goes from being 'this is interesting' to a place where everyone can start engaging and leveraging these models to improve discussion," says Conversation AI product manager CJ Adams. For anyone trying to rein in the comments on a news site or social media, Adams says, "the options have been upvotes, downvotes, turning off comments altogether or manually moderating. This gives them a new option: Take a bunch of collective intelligence (that will keep getting better over time) about what toxic comments people have said would make them leave, and use that information to help your community's discussions."

On a demonstration website launched today, Conversation AI will now let anyone type a phrase into Perspective's interface to instantaneously see how it rates on the toxicity scale. Google and Jigsaw developed that measurement tool by taking millions of comments from Wikipedia editorial discussions, the New York Times and other unnamed partners (five times as much data, Jigsaw says, as when it debuted Conversation AI in September) and then showing every one of those comments to panels of ten people Jigsaw recruited online to state whether they found the comment toxic.
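One way to picture how such panel judgments could become training data is the minimal sketch below. It assumes, purely for illustration, that a comment's label is the fraction of its raters who called it toxic. The article does not say how Jigsaw actually combines the ten votes, and the function name `toxicity_label` is hypothetical.

```python
from typing import Sequence


def toxicity_label(votes: Sequence[bool]) -> float:
    """Return the share of raters who judged a comment toxic.

    Assumption: a simple fraction of "toxic" votes. The article does not
    describe Jigsaw's real aggregation method.
    """
    if not votes:
        raise ValueError("at least one rater vote is required")
    return sum(votes) / len(votes)


# Hypothetical panel of ten raters: seven say toxic, three say not toxic.
panel = [True] * 7 + [False] * 3
label = toxicity_label(panel)
print(f"{label:.0%} of raters found the comment toxic")
```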

The resulting judgements gave Jigsaw and Google a massive set of training examples with which to teach their machine learning model, just as human children are largely taught by example what constitutes abusive language or harassment in the offline world. Type "you are not a nice person" into its text field, and Perspective will tell you it has an 8 percent similarity to phrases people consider toxic. Write "you are a nasty woman," by contrast, and Perspective will rate it 92 percent toxic, and "you are a bad hombre" gets a 78 percent rating. If one of its ratings seems wrong, the interface offers an option to report a correction, too, which will eventually be used to retrain the machine learning model.

The Perspective API will allow developers to access that test with automated code, providing answers quickly enough that publishers can integrate it into their website to show toxicity ratings to commenters even as they're typing. And Jigsaw has already partnered with online communities and publishers to implement that toxicity measurement system. Wikipedia used it to perform a study of its editorial discussion pages. The New York Times is planning to use it as a first pass of all its comments, automatically flagging abusive ones for its team of human moderators. And the Guardian and the Economist are now both experimenting with the system to see how they might use it to improve their comment sections, too. "Ultimately we want the AI to surface the toxic stuff to us faster," says Denise Law, the Economist's community editor. "If we can remove that, what we'd have left is all the really nice comments. We'd create a safe space where everyone can have intelligent debates."
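A first-pass workflow of the kind described for the New York Times might look roughly like the sketch below. It is not Perspective's real client code: the scorer is a stand-in that simply looks up the three demo scores quoted in this article, and the names `demo_scorer` and `flag_for_review`, along with the 0.75 threshold, are hypothetical choices made for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Scores the article reports from the Perspective demo, used here as a
# stand-in for a real API call. Everything else in this sketch is hypothetical.
DEMO_SCORES = {
    "you are not a nice person": 0.08,
    "you are a nasty woman": 0.92,
    "you are a bad hombre": 0.78,
}


def demo_scorer(text: str) -> float:
    """Stand-in scorer: look up a reported demo score, defaulting to 0.0."""
    return DEMO_SCORES.get(text.lower(), 0.0)


def flag_for_review(
    comments: Iterable[str],
    scorer: Callable[[str], float],
    threshold: float = 0.75,  # hypothetical cutoff, not from the article
) -> List[Tuple[str, float]]:
    """Return (comment, score) pairs at or above the threshold.

    Nothing is deleted: flagged comments are only queued for human moderators.
    """
    flagged = []
    for comment in comments:
        score = scorer(comment)
        if score >= threshold:
            flagged.append((comment, score))
    # Most toxic first, so moderators see the worst comments soonest.
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


queue = flag_for_review(DEMO_SCORES, demo_scorer)
for comment, score in queue:
    print(f"{score:.0%}  {comment}")
```

Passing the scorer in as a function keeps the human-review step separate from whatever service produces the scores, so the stand-in could later be swapped for a real API call without changing the flagging logic.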

Despite that impulse to create an increasingly necessary safe space for online discussions, critics of Conversation AI have argued that it could itself represent a form of censorship, enabling an automated system to delete comments that are either false positives (the insult "nasty woman," for instance, took on a positive connotation for some, after then-candidate Donald Trump used the phrase to describe Hillary Clinton) or in a gray area between freewheeling conversation and abuse. "People need to be able to talk in whatever register they talk," feminist writer Sady Doyle, herself a victim of online harassment, told WIRED last summer when Conversation AI launched. "Imagine what the internet would be like if you couldn't say 'Donald Trump is a moron.'"

Jigsaw has argued that its tool isn't meant to have final say as to whether a comment is published. But short-staffed social media startup or newspaper moderators might still use it that way, says Emma Llansó, director of the Free Expression Project at the nonprofit Center for Democracy and Technology. "An automated detection system can open the door to the delete-it-all option, rather than spending the time and resources to identify false positives," she says.

But Jared Cohen, Jigsaw's founder and president, counters that the alternative for many media sites has been to censor comments with clumsy blacklists of offensive words or to shut off comments altogether. "The default position right now is actually censorship," says Cohen. "We're hoping publishers will look at this and say 'we now have a better way to facilitate conversations, and we want you to come back.'"

Jigsaw also suggests that the Perspective API can offer a new tool not only to moderators, but to readers. Its online demo offers a sliding scale that changes which comments about topics like climate change and the 2016 election appear for different tolerances of toxicity, showing how readers themselves could be allowed to filter comments. And Cohen suggests that the tool is still just one step toward better online conversations; he hopes it can eventually be recreated in other languages like Russian, to counter the state-sponsored use of abusive trolling as a censorship tactic. "It's a milestone, not a solution," says Cohen. "We're not claiming to have created a panacea for the toxicity problem."
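The reader-side slider can be pictured as a simple threshold filter, as in the sketch below. It assumes each comment already carries a toxicity score between 0 and 1; the function name `visible_comments` is hypothetical, and the sample scores are the three demo ratings quoted earlier in this article, not output from a live service.

```python
from typing import Dict, List


def visible_comments(scored: Dict[str, float], tolerance: float) -> List[str]:
    """Return the comments whose toxicity score is at or below the tolerance.

    `tolerance` plays the role of the slider position: low values show only
    the mildest comments, 1.0 shows everything.
    """
    if not 0.0 <= tolerance <= 1.0:
        raise ValueError("tolerance must be between 0.0 and 1.0")
    return [text for text, score in scored.items() if score <= tolerance]


# Scores taken from the demo examples quoted in the article.
scored = {
    "you are not a nice person": 0.08,
    "you are a bad hombre": 0.78,
    "you are a nasty woman": 0.92,
}

for tolerance in (0.1, 0.8, 1.0):
    shown = visible_comments(scored, tolerance)
    print(f"tolerance {tolerance:.1f}: {len(shown)} of {len(scored)} comments shown")
```

Because the filter only hides comments from one reader's view, each reader can choose a different tolerance without anything being removed from the site.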

In an era when online discussion is more partisan and polarized than ever, and the president himself lobs insults from his Twitter feed, Jigsaw argues that a software tool for pruning comments may actually help to bring a more open atmosphere of discussion back to the internet. "We're in a situation where online conversations are becoming so toxic that we end up just talking to people we agree with," says Jigsaw's Adams. "That's made us all the more interested in creating technology to help people continue talking and continue listening to each other, even when they disagree."