
Detecting Online Trolls: How Algorithms Are Fighting Internet Toxicity

 
How could an algorithm spot trolls on the Internet?

It can be fun to engage other users in conversations on the websites you frequent, but the comment sections tend to get out of hand to an upsetting degree. We often have a certain type of personality to thank for an increasingly hostile online environment, replete with vulgarity, insults and sometimes even threats — the Internet troll.

A troll is a person who baits online comment sections with posts designed to get a rise out of people or otherwise disrupt an online community. These extremely negative users make the Internet more gloomy and antagonistic than the fount of information and entertainment it should be.

The anonymity afforded by the Internet allows people (who might comport themselves civilly in person) to shed their inhibitions and engage in antisocial behavior [sources: Academic Earth, Breeze]. Researchers at the University of Manitoba, the University of Winnipeg and the University of British Columbia in Canada conducted surveys and found strong correlations between apparent enjoyment of online trolling and higher scores on personality tests for sadism, Machiavellianism and psychopathy, especially sadism [sources: Buckels, Goldbeck, Mooney].

Researchers have even found that abusive language in an article's comment section can alter readers' perceptions of the content itself, skewing them toward a more polarized view of the topic or making them doubt the article's quality [sources: Brossard, Mooney, Applebaum, Felder]. This prompted Popular Science to shut off commenting on most of its online articles in 2013 [source: LaBarre].

To minimize the effects of trolling, some sites hire moderators to keep an eye on comment threads and censor or ban offending posts and users, but that takes time and money that not all sites can or will spend, and there are far more trolls than moderators. Others have suggested removing anonymity, although there is some evidence that this might cause most users to never post comments [source: Ingram].

But researchers at Stanford and Cornell have come up with another potential tool for combating trolls: early detection.

An Algorithm to Spot Trolls

Researchers at Stanford and Cornell (with funding from a Stanford Graduate Fellowship and a Google Faculty Research Award) conducted a study to see if they could use quantitative measures to detect antisocial users. They gained access to user comments hosted by Disqus for the sites Breitbart.com, CNN.com and IGN.com, spanning 18 months from March 2012 through August 2013. The data consisted of around 1.75 million users (nearly 49,000 of them banned), 1.26 million threads and 39 million posts (nearly 838,000 of them deleted and 1.35 million of them reported). They narrowed the banned user data down to around 12,000 users who joined the sites after March 2012, had at least five posts and were banned permanently for something other than spamming URLs [source: Cheng].

The scientists captured data including post content, user activity, community response and moderator actions. They compared the messages of users who were never banned with those of users who were permanently banned, and tracked how the banned users' behavior changed over time. The team found that the posts of future banned trolls tended to have the following traits:

  • poor spelling and grammar
  • more profanity
  • more negative words
  • less conciliatory or tentative language
  • lower readability scores on several tests (including the Automated Readability Index), which worsened toward the time of banning
  • use of different jargon and function words from non-banned community members
  • more digression from the topic
  • a much higher number of comment posts than the average user
  • a tendency to concentrate their replies in individual threads
  • a tendency to provoke more replies from others
  • worse behavior over time resulting in their posts being increasingly deleted before banning
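One of the measurable signals above, the Automated Readability Index, has a simple published formula: 4.71 times the average characters per word, plus 0.5 times the average words per sentence, minus 21.43. Here's a rough sketch of how a site might compute it — the tokenization rules below are simplistic choices of our own, not the researchers' actual implementation:

```python
import re

def automated_readability_index(text):
    """Automated Readability Index:
    4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43
    """
    words = re.findall(r"[A-Za-z']+", text)                      # crude word tokenizer
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(len(w) for w in words)                           # letters only, no punctuation
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / len(sentences)) - 21.43
```

Sloppy, abbreviation-heavy posts score low on this index, while well-constructed sentences score higher — which is why a falling ARI can serve as one quantitative hint of deteriorating post quality.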

On CNN.com, the average user posted around 22 times during the 18-month period, whereas the future banned users posted around 264 times before being banned [sources: Cheng, Collins]. The community also grew less tolerant of a troll over time.

Using these quantifiable results, the researchers were able to develop an algorithm (a set of steps used to solve a problem or perform a task) that needed as few as five comments to predict, with 80 percent accuracy, who would be banned in the future. With 10 posts, accuracy rose to 82 percent, and performance leveled off around that point. A user's earlier posts were better predictors of an eventual ban than later ones. The team achieved a similar level of accuracy across all three online communities. Post deletion by site moderators turned out to be the single most informative signal studied, but all the data in aggregate resulted in better accuracy [source: Cheng].
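To make the idea concrete, here is a toy scoring function in the spirit of those findings. The feature names, weights and threshold below are invented for illustration only; the actual study trained a statistical classifier on labeled data rather than using hand-picked weights:

```python
def troll_risk_score(features):
    """Toy risk score from a few early-post signals.

    features: dict with keys
      'deleted_frac'         - fraction of the user's posts deleted by moderators
      'ari'                  - average readability (Automated Readability Index)
      'posts_per_day'        - posting volume
      'thread_concentration' - 0..1, how concentrated replies are in few threads
    All weights here are illustrative guesses, not values from the paper.
    """
    score = 0.0
    score += 3.0 * features["deleted_frac"]           # deletions were the most informative signal
    score += 0.05 * max(0.0, 8.0 - features["ari"])   # lower readability -> higher risk
    score += 0.02 * features["posts_per_day"]         # unusually high posting volume
    score += 1.0 * features["thread_concentration"]   # replies piled into individual threads
    return score

def likely_future_ban(features, threshold=1.5):
    """Flag a user for moderator review if the toy score crosses a threshold."""
    return troll_risk_score(features) >= threshold
```

In practice such a score would only flag accounts for human review, not ban them automatically — as the next section explains, a one-in-five false positive rate is far too high to act on unattended.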

Is an algorithm viable, or enough?

Using an algorithm like this to automate troll banning is unlikely in the immediate future. It's in the academic research phase and not available as a usable software package at the moment. And at 80 percent accuracy, one in five of the users it identified would be innocent. The researchers found that being overly harsh or quick to ban or censor users tended to increase antisocial behavior, and even worsened the writing of posters who initially had higher text quality [source: Cheng]. There are also other types of trolling, including more subtle varieties, such as posting vexingly naive questions, that this algorithm wouldn't be able to detect.

Therefore it's still important for a human moderator to review the situation before anyone is banned. But a tool like this could help moderators spend their time more efficiently by flagging potentially disruptive users ahead of time. The findings can be used to develop subsequent studies and future troll-detection and moderation tools.

In the meantime, we can keep trying other measures. Some have suggested instituting user pseudonyms that stay the same across multiple sites [source: Schwartz]. Sites like Slashdot and Reddit have peer moderation systems in place. The sites xkcd and 4chan have even experimented with a bot called ROBOT9000 that mutes users when they say something that's already been said, allowing only original content [source: Munroe].
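The core ROBOT9000 idea (mute any message that has been said before) reduces to remembering a normalized form of every message seen so far. A minimal sketch, with a normalization rule of our own choosing rather than the bot's actual one:

```python
import re

class Robot9000:
    """Allows a message only if its normalized form hasn't been posted before."""

    def __init__(self):
        self.seen = set()  # normalized forms of every message allowed so far

    def _normalize(self, msg):
        # Collapse case, punctuation and whitespace so trivial variations
        # ("First post!" vs "first post") still count as repeats.
        return re.sub(r"[^a-z0-9]+", " ", msg.lower()).strip()

    def allow(self, msg):
        key = self._normalize(msg)
        if key in self.seen:
            return False  # unoriginal: mute the message
        self.seen.add(key)
        return True
```

The appeal of this approach is that it sidesteps judging tone or intent entirely: low-effort pile-ons tend to repeat themselves, so enforcing originality filters much of them out as a side effect.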

Google has made changes to YouTube to try to get a handle on its well-known trolling problem, including allowing users more leeway and tools to moderate their videos' comment sections, moving more relevant comments and comments from users' Google+ friends to the top and adding public or private commenting options [sources: Dredge, YouTube].

We, as online users, can also do things to diminish trolling, like making use of the voting and reporting features on online communities. The Stanford and Cornell researchers also found that other users engaging trolls with harsh feedback exacerbated the behavior. This reinforces a commonly repeated Internet maxim: "Don't feed the trolls!"