Research talk:Anonymous editor acquisition/Signup CTA experiment/Work log/2014-06-11
Wednesday, June 11th
Picking up where I left off with revision data. To get some basic figures, I'll need to limit my analysis to the week of the experiment. Regrettably, I forgot to include the timestamp field in the last dataset, so I'll need to regenerate it.
SELECT
    wiki,
    event_token AS token,
    timestamp,
    event_revId AS rev_id
FROM log.TrackedPageContentSaveComplete_8535426
UNION
SELECT
    wiki,
    event_token AS token,
    timestamp,
    event_revId AS rev_id
FROM log.TrackedPageContentSaveComplete_7872558;
--Halfak (WMF) (talk) 16:56, 11 June 2014 (UTC)
Time to get some ballpark estimates before I go to R and make some visualizations. I'll need to limit the query to users (tokens) who were not registered before the experimental period.
SELECT
    wiki,
    bucket,
    -- Geometric mean of (revisions + 1), shifted back down by 1
    ROUND(EXP(AVG(LOG(experimental_revisions+1)))-1, 3) AS geom_mean_revisions,
    SUM(experimental_revisions > 0) AS editing_clients,
    SUM(experimental_revisions > 0)/COUNT(*) AS editing_prop,
    COUNT(*) AS relevant_tokened_clients
FROM token_info
LEFT JOIN (
    -- Revisions saved per token during the experimental week
    SELECT
        wiki,
        token,
        COUNT(rev_id) AS experimental_revisions
    FROM staging.token_revision
    WHERE timestamp BETWEEN "20140519180800" AND "20140526180800"
    GROUP BY wiki, token
) AS token_revision_count USING (wiki, token)
-- Keep only tokens with no account registered before the experiment began
WHERE (first_user_id IS NULL OR first_user_registration > "20140519180800")
    AND link_clicks > 0
GROUP BY wiki, bucket;
+--------+-----------+---------------------+-----------------+--------------+--------------------------+
| wiki   | bucket    | geom_mean_revisions | editing_clients | editing_prop | relevant_tokened_clients |
+--------+-----------+---------------------+-----------------+--------------+--------------------------+
| dewiki | control   |               1.849 |            4093 |       0.0874 |                    46835 |
| dewiki | post-edit |               1.708 |            3788 |       0.0892 |                    42454 |
| dewiki | pre-edit  |               1.896 |            3107 |       0.0717 |                    43319 |
| enwiki | control   |               1.946 |           27036 |       0.1139 |                   237262 |
| enwiki | post-edit |               1.899 |           24738 |       0.1145 |                   216138 |
| enwiki | pre-edit  |               2.043 |           20759 |       0.0921 |                   225354 |
| frwiki | control   |               1.836 |            3915 |       0.1124 |                    34821 |
| frwiki | post-edit |               1.817 |            3601 |       0.1142 |                    31543 |
| frwiki | pre-edit  |               1.966 |            3007 |       0.0912 |                    32984 |
| itwiki | control   |               2.060 |            2615 |       0.1227 |                    21306 |
| itwiki | post-edit |               2.099 |            2381 |       0.1235 |                    19280 |
| itwiki | pre-edit  |               2.199 |            2116 |       0.0997 |                    21232 |
+--------+-----------+---------------------+-----------------+--------------+--------------------------+
12 rows in set (1 min 23.37 sec)
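A note on reading geom_mean_revisions: clients with no rows in the joined subquery get a NULL experimental_revisions, and SQL's AVG skips NULLs, so the figure is effectively a geometric mean over clients who edited at least once. A minimal Python sketch of the same arithmetic follows; the function name and the example counts are made up for illustration and are not taken from the dataset.

```python
import math

def geom_mean_plus_one(counts):
    """Mirror EXP(AVG(LOG(x + 1))) - 1, skipping None the way SQL AVG skips NULL."""
    present = [c for c in counts if c is not None]
    if not present:
        return None  # AVG over only NULLs is NULL
    mean_log = sum(math.log(c + 1) for c in present) / len(present)
    return math.exp(mean_log) - 1

# Hypothetical per-client revision counts; None stands for a client with
# no matching revisions in the LEFT JOIN (no edits during the experiment).
example_counts = [1, 3, None, 7, None]
print(round(geom_mean_plus_one(example_counts), 3))  # (2 * 4 * 8) ** (1/3) - 1 = 3.0
```

The two None entries do not pull the mean toward zero, which is why the values in the table sit near 2 even though only about 10% of clients edited.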
So, it looks like we get fewer people editing in the pre-edit condition than in the control condition, but we get more edits per person who edits. Now to visualize this with some error bars and do some statistical tests to see if the differences are real. --Halfak (WMF) (talk) 17:26, 11 June 2014 (UTC)
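As a rough first check on the editing proportions, one option is a pooled two-proportion z-test. The sketch below is in Python rather than R and is only one possible test, not necessarily the one used in the follow-up analysis. It assumes clients are independent and makes no correction for comparing several wikis and buckets. The function name is hypothetical; the counts are the enwiki control and pre-edit rows from the table above.

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability under the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# enwiki: editing_clients / relevant_tokened_clients, control vs. pre-edit
z, p = two_proportion_z(27036, 237262, 20759, 225354)
print(f"z = {z:.2f}, p = {p:.3g}")
```

With samples this large, even small differences in proportion will tend to come out significant, so the size of the effect and its error bars are likely more informative than the p-value alone.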