Trolloscopy.
May 13, 2017
Pro-Kremlin comments in the Financial Times.
Being a regular reader of the FT, I find the quality of its comments section far above the standard we can expect these days. The fact that only subscribers can comment gives it a touch of the gentlemen's club. This makes the comments section an interesting field of study for Kremlin-supporting members: the argumentation tends to be well structured, and the comments are much longer than a tweet.
The problem with pro-Kremlin commentators is that, instead of bringing facts and figures to the debate, they flood the thread with comments like this one:
I retrieved the comments for all articles related to the following keywords:
Syria, Gazprom, Brexit, MH17, NATO, Trump, Assad, Georgia, IOC, Doping, Putin, Merkel, Refugees, AfD, Le Pen, Obama, Russia.
Covering February 2009 to September 2016, this yields a dataset of 240,000 comments on 10,000 articles.
A “Like” option allows commentators to endorse each other's comments.
These are the top likers:
And the top liked:
And summarized in a matrix:
Astonishingly, the top 10 commentators in terms of quantity and “quality” (if we consider the number of likes a reasonable metric) are pro-Kremlin!
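A liker-by-liked matrix of this kind can be built from a plain list of like events. The sketch below is only an illustration: the `(liker, author)` record shape and the names in it are assumptions, not the actual structure of the FT data.

```python
from collections import Counter, defaultdict

# Hypothetical like events: (who clicked "Like", whose comment was liked).
likes = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
]

def like_matrix(events):
    """Count how often each liker liked each author."""
    matrix = defaultdict(Counter)
    for liker, author in events:
        matrix[liker][author] += 1
    return matrix

matrix = like_matrix(likes)
top_likers = Counter(liker for liker, _ in likes).most_common()
top_liked = Counter(author for _, author in likes).most_common()

# Print the matrix with likers as rows and liked authors as columns.
names = sorted({name for pair in likes for name in pair})
print("liker".ljust(8) + "".join(n.ljust(8) for n in names))
for liker in names:
    row = "".join(str(matrix[liker][author]).ljust(8) for author in names)
    print(liker.ljust(8) + row)
```

Rows that are dense in a small block of columns would point to a group of accounts that mostly like each other.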
Now, taking the top commentators, what does their activity look like?
Nota bene: UTC+3 corresponds to Saint Petersburg.
Looking at the activity curves, my take is that these are people commenting as a side job.
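An activity curve of this kind boils down to bucketing comment timestamps by hour of day in the chosen time zone. Here is a minimal sketch, assuming the timestamps are available as timezone-aware UTC datetimes; the sample values are invented for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

UTC_PLUS_3 = timezone(timedelta(hours=3))

def hourly_activity(timestamps_utc):
    """Return 24 counts: number of comments per hour of day in UTC+3."""
    counts = Counter(ts.astimezone(UTC_PLUS_3).hour for ts in timestamps_utc)
    return [counts[hour] for hour in range(24)]

# Hypothetical comment timestamps (UTC) for a single commentator.
sample = [
    datetime(2016, 3, 1, 6, 15, tzinfo=timezone.utc),  # 09:15 in UTC+3
    datetime(2016, 3, 1, 6, 50, tzinfo=timezone.utc),  # 09:50 in UTC+3
    datetime(2016, 3, 2, 22, 5, tzinfo=timezone.utc),  # 01:05 next day in UTC+3
]

profile = hourly_activity(sample)
for hour, n in enumerate(profile):
    if n:
        print(f"{hour:02d}:00  {'#' * n}")
```

A fixed offset is used on purpose: it ignores historical changes to local time, which is good enough for a rough daily profile.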
Well, there is a pattern here:
- March 2014: annexation of Crimea
- September 2015: Putin intervenes in Syria
- October 2016: doping scandal
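Peaks like these can be spotted by counting comments per month and flagging the months that stand well above the typical level. The sketch below uses a simple "more than twice the median" rule, which is an arbitrary threshold chosen for illustration, and made-up dates.

```python
from collections import Counter
from datetime import date
from statistics import median

def monthly_volume(dates):
    """Count comments per (year, month)."""
    return Counter((d.year, d.month) for d in dates)

def peak_months(volume, factor=2.0):
    """Months whose count exceeds `factor` times the median monthly count."""
    typical = median(volume.values())
    return sorted(month for month, n in volume.items() if n > factor * typical)

# Hypothetical comment dates: a quiet baseline with a burst in March 2014.
sample = (
    [date(2014, 1, 10)] * 2
    + [date(2014, 2, 10)] * 2
    + [date(2014, 3, 10)] * 7
    + [date(2014, 4, 10)] * 3
)

volume = monthly_volume(sample)
print(peak_months(volume))
```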
Now, using word vectorization and TF-IDF (Term Frequency-Inverse Document Frequency), we can extract the main components that best describe the variance between the top commentators:
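For readers unfamiliar with TF-IDF, here is a minimal standard-library sketch of the weighting step only, with one document per commentator. It assumes the plain `tf * log(N / df)` variant and uses invented texts; the decomposition into main components is not shown, and a library implementation would normally be used for both steps.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def tfidf(docs):
    """docs: {name: text}. Returns {name: {term: weight}}."""
    term_counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    vectors = {}
    for name, counts in term_counts.items():
        total = sum(counts.values())
        vectors[name] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        }
    return vectors

# Hypothetical commentators and texts, for illustration only.
docs = {
    "commenter_a": "sanctions hurt europe more than russia",
    "commenter_b": "sanctions are working and russia knows it",
    "commenter_c": "the ecb should raise rates",
}

vectors = tfidf(docs)
for name, vec in vectors.items():
    top = sorted(vec, key=vec.get, reverse=True)[:3]
    print(name, top)
```

Terms used by every commentator get a weight of zero, so what remains is the vocabulary that sets each one apart.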
Here my take is that, given how close their vocabularies are, Njegos and Maljoffre are one and the same person.
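One common way to put a number on vocabulary proximity is the cosine similarity between two weighted term vectors. The sketch below is not necessarily the measure behind the plot; the vectors are invented and the commentator names are placeholders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical term weights for three commentators.
commenter_x = {"nato": 0.5, "maidan": 0.5}
commenter_y = {"nato": 0.5, "maidan": 0.4, "coup": 0.1}
commenter_z = {"ecb": 0.7, "rates": 0.3}

print(round(cosine(commenter_x, commenter_y), 3))
print(round(cosine(commenter_x, commenter_z), 3))
```

A score close to 1 means near-identical vocabulary, which is a hint of common authorship but not proof of it: two people repeating the same talking points would also score high.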
The next step would be to build a binary classifier to filter those commentators which, combined with a Chrome extension, would simply color the comments according to their Kremlin-sensitivity.
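To give an idea of what such a classifier could look like, here is a toy multinomial naive Bayes with add-one smoothing, written with the standard library only. The model choice, the `flagged` / `other` labels and the training sentences are all assumptions made for illustration; a real version would need a properly labeled sample of comments.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_score(self, text, label):
        counts = self.word_counts[label]
        total = sum(counts.values()) + len(self.vocab)
        score = math.log(self.class_counts[label] / sum(self.class_counts.values()))
        for word in tokenize(text):
            if word in self.vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / total)
        return score

    def predict(self, text):
        return max(self.class_counts, key=lambda label: self.log_score(text, label))

# Hypothetical labeled comments, for illustration only.
texts = [
    "the west staged the coup in kiev",
    "nato provoked russia again",
    "interest rates will rise next year",
    "the ecb should cut rates",
]
labels = ["flagged", "flagged", "other", "other"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("nato staged a coup"))
print(model.predict("rates will rise"))
```

The browser extension would then only need the predicted label, or the score difference between the two classes, to pick a color for each comment.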