Research talk:Are the bots really fighting/Work log/2017-03-04
Saturday, March 4, 2017
editI got a little work in today so I wanted to record it here. I created a repository to capture the work here: https://github.com/halfak/are-the-bots-really-fighting
Specifically, I'm loading an old dataset of reverts into the database. Here's the table creation script:
CREATE TABLE staging.enwiki_reverted_20140820 (
rev_id INT UNSIGNED,
rev_timestamp VARBINARY(14),
rev_user INT UNSIGNED,
rev_user_text VARBINARY(255),
rev_page INT UNSIGNED,
rev_sha1 VARBINARY(32),
rev_minor_edit TINYINT,
rev_deleted TINYINT,
rev_parent_id INT UNSIGNED,
archived TINYINT,
reverting_id INT UNSIGNED,
reverting_timestamp VARBINARY(14),
reverting_user INT UNSIGNED,
reverting_user_text VARBINARY(255),
reverting_page INT UNSIGNED,
reverting_sha1 VARBINARY(32),
reverting_minor_edit TINYINT,
reverting_deleted TINYINT,
reverting_parent_id INT UNSIGNED,
reverting_archived TINYINT,
rev_revert_offset INT UNSIGNED,
revisions_reverted INT UNSIGNED,
reverted_to_rev_id INT UNSIGNED
);
Here's how I plan to get a dataset of bot-on-bot reverts:
SELECT * FROM staging.enwiki_reverted_20140820
INNER JOIN enwiki.user_groups AS reverted_bot ON
rev_user = reverted_bot.ug_user AND
reverted_bot.ug_group = "bot"
INNER JOIN enwiki.user_groups AS reverting_bot ON
reverting_user = reverting_bot.ug_user AND
reverting_bot.ug_group = "bot"
WHERE rev_user != reverting_user;
Oh! But first, I'll need an index:
CREATE INDEX user_to_user_ix ON enwiki_reverted_20140820 (rev_user, reverting_user);
In Tsvetkova's study, they limited their analysis to the revision before the reverting edit. We can use the rev_revert_offset and revisions_reverted fields to find that those specific edits. So I modified the query above to add an offset of 1.
SELECT * FROM staging.enwiki_reverted_20140820
INNER JOIN enwiki.user_groups AS reverted_bot ON
rev_user = reverted_bot.ug_user AND
reverted_bot.ug_group = "bot"
INNER JOIN enwiki.user_groups AS reverting_bot ON
reverting_user = reverting_bot.ug_user AND
reverting_bot.ug_group = "bot"
WHERE
rev_user != reverting_user AND
rev_revert_offset = 1;
I'm waiting for the index to build now, so I'm going to go AFK for a while and will extract the bot-on-bot reverts later. --EpochFail (talk) 20:48, 4 March 2017 (UTC)