Research talk:Anonymous editor acquisition/Signup CTA experiment/Work log/2014-05-28
Wednesday, May 28th
editToday I'm bringing together the token work and generating the first funnel bits. First I need to write a query that combines the info I have on token'd users into user classes.
SELECT
wiki,
token,
IF(RIGHT(token, 1) <= "J", "pre-edit",
IF(RIGHT(token, 1) <= "d", "post-edit", "control")) AS bucket,
first_event,
token_stats.revisions AS tokened_revisions,
token_stats.account_creations,
token_stats.creation_impressions,
token_stats.button_clicks,
token_stats.cta_impressions,
token_stats.link_clicks,
SUM(DISTINCT user_id) AS user_accounts,
MIN(user_id) AS first_user_id,
MIN(registration) AS first_user_registration
FROM staging.token_stats
LEFT JOIN staging.user_token USING (wiki, token)
GROUP BY 1,2;
--Halfak (WMF) (talk) 16:24, 28 May 2014 (UTC)
Looks like I made a mistake with the queries above. I didn't include an old version of the Schema:TrackedPageContentSaveComplete. So I need to include that now and re-run the queries (that took forever yesterday. Ugh.). --Halfak (WMF) (talk) 16:25, 28 May 2014 (UTC)
I have data! OK. Time to pick through it.
> select count(*) from token_info; +----------+ | count(*) | +----------+ | 1384785 | +----------+ 1 row in set (0.00 sec)
About 1.4 million tokens appeared in since we started tracking data. Let's get the experiment bounds so that we can draw some cutoffs.
Query
|
---|
> select left(timestamp, 10) as hour, count(*) from SignupExpCTAImpression_8101716 group by 1; +------------+----------+ | hour | count(*) | +------------+----------+ | 2014041618 | 1 | | 2014041720 | 1 | | 2014041721 | 1 | | 2014041722 | 1 | | 2014041816 | 9 | | 2014042120 | 6 | | 2014042121 | 7 | | 2014042505 | 36 | | 2014042516 | 4 | | 2014050204 | 4 | | 2014050205 | 1 | | 2014050206 | 10 | | 2014050207 | 10 | | 2014050323 | 5 | | 2014050401 | 24 | | 2014050502 | 8 | | 2014050503 | 9 | | 2014050504 | 7 | | 2014051400 | 1 | | 2014051402 | 3 | | 2014051417 | 1 | | 2014051520 | 2 | | 2014051603 | 1 | | 2014051604 | 4 | | 2014051918 | 908 | | 2014051919 | 1154 | | 2014051920 | 1639 | | 2014051921 | 1596 | | 2014051922 | 1433 | | 2014051923 | 1201 | | 2014052000 | 1197 | | 2014052001 | 1268 | | 2014052002 | 1241 | | 2014052003 | 1139 | | 2014052004 | 1194 | | 2014052005 | 1409 | | 2014052006 | 1724 | | 2014052007 | 2136 | | 2014052008 | 2473 | | 2014052009 | 2510 | | 2014052010 | 2511 | | 2014052011 | 2584 | | 2014052012 | 2928 | | 2014052013 | 3433 | | 2014052014 | 3603 | | 2014052015 | 3500 | | 2014052016 | 3366 | | 2014052017 | 3424 | | 2014052018 | 3386 | | 2014052019 | 3035 | | 2014052020 | 2790 | | 2014052021 | 2414 | | 2014052022 | 1951 | | 2014052023 | 1707 | | 2014052100 | 1642 | | 2014052101 | 1524 | | 2014052102 | 1554 | | 2014052103 | 1510 | | 2014052104 | 1481 | | 2014052105 | 1581 | | 2014052106 | 1882 | | 2014052107 | 2132 | | 2014052108 | 2508 | | 2014052109 | 2642 | | 2014052110 | 2610 | | 2014052111 | 2563 | | 2014052112 | 3063 | | 2014052113 | 3529 | | 2014052114 | 3483 | | 2014052115 | 3314 | | 2014052116 | 3219 | | 2014052117 | 3283 | | 2014052118 | 3187 | | 2014052119 | 2975 | | 2014052120 | 2675 | | 2014052121 | 2271 | | 2014052122 | 1782 | | 2014052123 | 1691 | | 2014052200 | 1527 | | 2014052201 | 1540 | | 2014052202 | 1572 | | 2014052203 | 1353 | | 2014052204 | 1420 | | 2014052205 | 1503 | | 2014052206 | 1920 | | 2014052207 | 2207 | | 2014052208 | 2583 | | 2014052209 | 2639 | | 2014052210 | 2503 | | 2014052211 | 2550 | | 2014052212 | 2933 | | 2014052213 | 3298 | | 2014052214 | 3352 | | 2014052215 | 3437 | | 2014052216 | 3233 | | 2014052217 | 3070 | | 2014052218 | 2876 | | 2014052219 | 2594 | | 2014052220 | 2512 | | 2014052221 | 2143 | | 2014052222 | 1703 | | 2014052223 | 1541 | | 2014052300 | 1534 | | 2014052301 | 1384 | | 2014052302 | 1340 | | 2014052303 | 1162 | | 2014052304 | 1150 | | 2014052305 | 1329 | | 2014052306 | 1658 | | 2014052307 | 1894 | | 2014052308 | 2026 | | 2014052309 | 2174 | | 2014052310 | 2149 | | 2014052311 | 2154 | | 2014052312 | 2435 | | 2014052313 | 2859 | | 2014052314 | 2796 | | 2014052315 | 2775 | | 2014052316 | 2686 | | 2014052317 | 2463 | | 2014052318 | 2368 | | 2014052319 | 2369 | | 2014052320 | 2292 | | 2014052321 | 1918 | | 2014052322 | 1503 | | 2014052323 | 1325 | | 2014052400 | 1122 | | 2014052401 | 1122 | | 2014052402 | 1060 | | 2014052403 | 1110 | | 2014052404 | 1178 | | 2014052405 | 1230 | | 2014052406 | 1293 | | 2014052407 | 1527 | | 2014052408 | 1747 | | 2014052409 | 1819 | | 2014052410 | 1751 | | 2014052411 | 1869 | | 2014052412 | 2049 | | 2014052413 | 2227 | | 2014052414 | 2292 | | 2014052415 | 2366 | | 2014052416 | 2201 | | 2014052417 | 2117 | | 2014052418 | 2008 | | 2014052419 | 2026 | | 2014052420 | 1888 | | 2014052421 | 1869 | | 2014052422 | 1542 | | 2014052423 | 1397 | | 2014052500 | 1288 | | 2014052501 | 1266 | | 2014052502 | 1170 | | 2014052503 | 1086 | | 2014052504 | 1184 | | 2014052505 | 1179 | | 2014052506 | 1321 | | 2014052507 | 1480 | | 2014052508 | 1745 | | 2014052509 | 1871 | | 2014052510 | 1905 | | 2014052511 | 1970 | | 2014052512 | 2125 | | 2014052513 | 2246 | | 2014052514 | 2424 | | 2014052515 | 2679 | | 2014052516 | 2506 | | 2014052517 | 2600 | | 2014052518 | 2446 | | 2014052519 | 2273 | | 2014052520 | 2266 | | 2014052521 | 1913 | | 2014052522 | 1578 | | 2014052523 | 1581 | | 2014052600 | 1429 | | 2014052601 | 1419 | | 2014052602 | 1353 | | 2014052603 | 1267 | | 2014052604 | 1314 | | 2014052605 | 1334 | | 2014052606 | 1705 | | 2014052607 | 1996 | | 2014052608 | 2317 | | 2014052609 | 2419 | | 2014052610 | 2463 | | 2014052611 | 2478 | | 2014052612 | 2526 | | 2014052613 | 2489 | | 2014052614 | 2688 | | 2014052615 | 2968 | | 2014052616 | 3017 | | 2014052617 | 2929 | | 2014052618 | 2787 | | 2014052619 | 2654 | | 2014052620 | 2463 | | 2014052621 | 2127 | | 2014052622 | 1851 | | 2014052623 | 1662 | | 2014052700 | 1609 | | 2014052701 | 1521 | | 2014052702 | 1551 | | 2014052703 | 1320 | | 2014052704 | 1482 | | 2014052705 | 1497 | | 2014052706 | 1909 | | 2014052707 | 2085 | | 2014052708 | 2230 | | 2014052709 | 2620 | | 2014052710 | 2559 | | 2014052711 | 2486 | | 2014052712 | 2720 | | 2014052713 | 2794 | | 2014052714 | 2810 | | 2014052715 | 3158 | | 2014052716 | 3060 | | 2014052717 | 3077 | | 2014052718 | 2898 | | 2014052719 | 2942 | | 2014052720 | 2899 | | 2014052721 | 2520 | | 2014052722 | 1979 | | 2014052723 | 1077 | | 2014052800 | 847 | | 2014052801 | 764 | | 2014052802 | 626 | | 2014052803 | 518 | | 2014052804 | 490 | | 2014052805 | 467 | | 2014052806 | 391 | | 2014052807 | 366 | | 2014052808 | 254 | | 2014052809 | 194 | | 2014052810 | 136 | | 2014052811 | 139 | | 2014052812 | 173 | | 2014052813 | 169 | | 2014052814 | 199 | | 2014052815 | 198 | | 2014052816 | 165 | | 2014052817 | 160 | | 2014052818 | 148 | | 2014052819 | 58 | +------------+----------+ 242 rows in set (0.93 sec) |
It looks like the experiment started in the 18th hour of 2014-05-19 and began to ramp down in the 23rd hour of 2014-05-27.
> select left(timestamp, 11) as ten_minutes, count(*) from SignupExpCTAImpression_8101716 where timestamp LIKE "2014051918%" group by 1; +-------------+----------+ | ten_minutes | count(*) | +-------------+----------+ | 20140519180 | 30 | | 20140519181 | 171 | | 20140519182 | 179 | | 20140519183 | 198 | | 20140519184 | 155 | | 20140519185 | 175 | +-------------+----------+ 6 rows in set (0.00 sec)
Looks like we're looking for the first 10 minutes of the hour.
> select left(timestamp, 12) as minutes, count(*) from SignupExpCTAImpression_8101716 where timestamp LIKE "20140519180%" group by 1; +--------------+----------+ | minutes | count(*) | +--------------+----------+ | 201405191805 | 1 | | 201405191806 | 1 | | 201405191807 | 6 | | 201405191808 | 9 | | 201405191809 | 13 | +--------------+----------+ 5 rows in set (0.00 sec)
I think that "201405191807" is the closest we're going to get. It's just aim for the next minute to be sure and say the experiment started at 20140519180800 or 2014-05-19 18:08:00 UTC.
So, for the sake of consistency and dealing with the periodic nature of the work week, I need to limit this to 7 days, so, my assumed end date will be 20140526180800 or 2014-05-16 18:00:00 UTC.
Let me make this bold.
Assumed start and end timestamps
edit- start
- 20140519180800
- end
- 20140526180800
--Halfak (WMF) (talk) 22:42, 28 May 2014 (UTC)
First results
editOK, here's the money. These represent counts of tokens based on user classes.
- pure anon -- A token that was never associated with a registered account.
- old user -- A token that is associated with an account that was registered before tracking began
- tracked user -- A token that is associated with an account that was registered after tracking began, but before the experiment
- experimental user -- A token that is associated with an account that was registered during the experiment
> SELECT -> wiki, -> bucket, -> IF(first_user_id IS NULL, -> "pure anon", -> IF(first_user_registration IS NULL OR first_user_registration <= 20140502000000, -> "old user", -> IF(first_user_registration <= 20140519180800, -> "tracked user", -> "experiment user"))) AS editor_class, -> COUNT(*) AS tokens -> FROM staging.token_info -> GROUP BY 1,2,3; +--------+-----------+-----------------+--------+ | wiki | bucket | editor_class | tokens | +--------+-----------+-----------------+--------+ | dewiki | control | experiment user | 723 | | dewiki | control | old user | 4577 | | dewiki | control | pure anon | 60441 | | dewiki | control | tracked user | 630 | | dewiki | post-edit | experiment user | 711 | | dewiki | post-edit | old user | 4342 | | dewiki | post-edit | pure anon | 54728 | | dewiki | post-edit | tracked user | 566 | | dewiki | pre-edit | experiment user | 998 | | dewiki | pre-edit | old user | 4415 | | dewiki | pre-edit | pure anon | 55410 | | dewiki | pre-edit | tracked user | 589 | | enwiki | control | experiment user | 11633 | | enwiki | control | old user | 22739 | | enwiki | control | pure anon | 296263 | | enwiki | control | tracked user | 6924 | | enwiki | post-edit | experiment user | 11047 | | enwiki | post-edit | old user | 20737 | | enwiki | post-edit | pure anon | 270362 | | enwiki | post-edit | tracked user | 6179 | | enwiki | pre-edit | experiment user | 14520 | | enwiki | pre-edit | old user | 21651 | | enwiki | pre-edit | pure anon | 275562 | | enwiki | pre-edit | tracked user | 6386 | | frwiki | control | experiment user | 1041 | | frwiki | control | old user | 3293 | | frwiki | control | pure anon | 46145 | | frwiki | control | tracked user | 747 | | frwiki | post-edit | experiment user | 1143 | | frwiki | post-edit | old user | 3052 | | frwiki | post-edit | pure anon | 41450 | | frwiki | post-edit | tracked user | 718 | | frwiki | pre-edit | experiment user | 1491 | | frwiki | pre-edit | old user | 3144 | | frwiki | pre-edit | pure anon | 42603 | | frwiki | pre-edit | tracked user | 696 | | itwiki | control | experiment user | 382 | | itwiki | control | old user | 1604 | | itwiki | control | pure anon | 27973 | | itwiki | control | tracked user | 309 | | itwiki | post-edit | experiment user | 470 | | itwiki | post-edit | old user | 1546 | | itwiki | post-edit | pure anon | 25205 | | itwiki | post-edit | tracked user | 252 | | itwiki | pre-edit | experiment user | 688 | | itwiki | pre-edit | old user | 1616 | | itwiki | pre-edit | pure anon | 26844 | | itwiki | pre-edit | tracked user | 240 | +--------+-----------+-----------------+--------+ 48 rows in set, 65535 warnings (3.71 sec)
Here, we can already see some differences between the experimental and control conditions become apparent. Let's take itwiki for example.
condition | pure anon | old user | tracked user | experiment user |
---|---|---|---|---|
control | 27973 | 1604 | 309 | 382 |
pre-edit | 26844 | 1616 | 240 | 688 |
post-edit | 25205 | 1546 | 252 | 470 |
Here, we see that about 33% more users registered during the experiment in the pre-edit condition than the control. About 25% more user registered during the experiment in the post-edit condition than the control.
OK. That's about it for today. I'll be limiting the pure-anons and experimental users to the experimental time bounds tomorrow to do a more accurate comparison. --Halfak (WMF) (talk) 23:50, 28 May 2014 (UTC)