Research:Lag between registration and first edit

This page documents a completed research project.

This sprint investigates the research question: How long does it take for new users to make an edit once they register an account?

Process

For each registered user, we generated the timestamp of their first edit ever, counting both live and deleted edits, and compared it to the user's registration date. Note: because of legacy installations of MediaWiki, user registration data may be inaccurate prior to 2005. At that time, the software would sometimes record the date of a user's first edit as the registration date. However, these accounts make up a small percentage of users, given the massive growth in registrations and editors in 2006-2007.
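The lag computation described above can be sketched as follows. This is a minimal illustration, not the query used for this project: the function names are hypothetical, and it assumes timestamps arrive as 14-digit MediaWiki-style strings (YYYYMMDDHHMMSS, UTC) with live and deleted edits already combined into one list.

```python
from datetime import datetime, timezone


def parse_mw_timestamp(ts):
    """Parse a 14-digit timestamp string (YYYYMMDDHHMMSS), assumed to be UTC."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def registration_lag_seconds(registration_ts, edit_timestamps):
    """Seconds from registration to the earliest edit, or None if no edits.

    edit_timestamps is assumed to hold both live and deleted edits.
    """
    if not edit_timestamps:
        return None
    registered = parse_mw_timestamp(registration_ts)
    first_edit = min(parse_mw_timestamp(ts) for ts in edit_timestamps)
    return (first_edit - registered).total_seconds()


# Hypothetical user: registered at noon, first edit ten minutes later.
lag = registration_lag_seconds(
    "20070301120000",
    ["20070303120500", "20070301121000"],
)
never_edited = registration_lag_seconds("20070301120000", [])
```

For legacy accounts whose registration date was recorded as the date of their first edit, a computation like this would report a lag of zero, which is the inaccuracy noted above.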

The data for all users were then fitted to a Gaussian mixture model, a clustering technique that can separate lag observations into several classes (or components). We tried fitting mixtures of N = 2, 3, and 4 components. The parameters of the model were estimated with the expectation-maximization (EM) algorithm. The data were first transformed to a logarithmic scale (base 10): if the lags are log-normally distributed, then their logarithms should follow a normal distribution.
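To make the fitting step concrete, here is a small, self-contained sketch of EM for a one-dimensional Gaussian mixture applied to log10-transformed lags. It is not the code used for this project: the function names, the quantile-based initialisation, the fixed iteration count, and the synthetic data are all assumptions made for illustration. It also assumes every lag is strictly positive, since the logarithm of a zero lag is undefined.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(values, n_components=2, n_iter=100, min_sigma=1e-3):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, sigmas)."""
    xs = sorted(values)
    n = len(xs)
    # Initialise means at evenly spaced quantiles, with a shared spread.
    means = [xs[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    overall_mean = sum(xs) / n
    overall_sigma = math.sqrt(sum((x - overall_mean) ** 2 for x in xs) / n)
    sigmas = [max(overall_sigma, min_sigma)] * n_components
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and spread of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigmas[k] = max(math.sqrt(var), min_sigma)
    return weights, means, sigmas


# Synthetic lags in days (not project data): a fast group and a slow group.
rng = random.Random(0)
lags_days = [10 ** rng.gauss(-2.0, 0.5) for _ in range(700)]
lags_days += [10 ** rng.gauss(1.0, 1.0) for _ in range(300)]

log_lags = [math.log10(lag) for lag in lags_days]  # base-10 log transform
weights, means, sigmas = fit_gmm_1d(log_lags, n_components=2)

# Sort components by mean so the fast group comes first.
fitted = sorted(zip(means, sigmas, weights))
median_days = [10 ** mean for mean, _, _ in fitted]  # back to days
```

Because the model is fitted in log10 space, raising 10 to a component's mean gives that component's median lag in days. Trying N = 3 or 4 only requires changing `n_components`.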

Results

What percentage of registered users edit?

Pie Charts

Histogram with model fit

Mean (days) Median (days) Std. Dev. (days) Prob. (mixture weight)
741.5 18.36 2.993e+04 0.2926
0.008591 0.004197 0.01534 0.7074
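The summary statistics in this table can be cross-checked against one another. The sketch below assumes that each row describes a log-normal component in days, which is what a normal fit on the log scale implies; under that assumption the standard deviation follows from the mean and the median alone. The function names are illustrative, not part of the original analysis.

```python
import math


def lognormal_std_from_mean_median(mean, median):
    """Std. dev. of a log-normal variable given its mean and median.

    If ln X ~ Normal(m, s^2), then median = exp(m),
    mean = exp(m + s^2 / 2), and std = mean * sqrt(exp(s^2) - 1).
    """
    s_squared = 2.0 * math.log(mean / median)
    return mean * math.sqrt(math.exp(s_squared) - 1.0)


def log10_parameters(mean, median):
    """Mean and std. dev. of log10(X) implied by the same two summaries."""
    s = math.sqrt(2.0 * math.log(mean / median))  # std. dev. of ln X
    return math.log10(median), s / math.log(10.0)


# Mean and median (days) of the two components, as reported in the table.
slow_std = lognormal_std_from_mean_median(741.5, 18.36)
fast_std = lognormal_std_from_mean_median(0.008591, 0.004197)

slow_mu10, slow_sigma10 = log10_parameters(741.5, 18.36)
```

Both derived standard deviations agree with the reported values to within rounding of the inputs, which is consistent with reading the table as a two-component log-normal fit. The large gap between mean and median in the slow component reflects how heavy its right tail is.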

Data

Days between registration and first edit Number of users Percent of all users
0 3477450 80.867%
1 146917 3.417%
2 48885 1.137%
3 33918 0.789%
4 28088 0.653%
5 to 10 111996 2.604%
11 to 20 94112 2.189%
21 to 31 59312 1.379%
31 to 60 73512 1.710%
61 to 180 130443 3.033%
180 to 365 95563 2.222%
Total < 1 year 4300196 100.000%

Hours between registration and first edit Number of users Percent of all users Percent of < 1 day users
0 3257914 75.762% 93.687%
1 111753 2.599% 3.214%
2 35798 0.832% 1.029%
3 18451 0.429% 0.531%
4 11214 0.261% 0.322%
5 7382 0.172% 0.212%
6 4881 0.114% 0.140%
7 3518 0.082% 0.101%
8 2631 0.061% 0.076%
9 2451 0.057% 0.070%
10 2278 0.053% 0.066%
11 2255 0.052% 0.065%
12 2200 0.051% 0.063%
13 2068 0.048% 0.059%
14 1972 0.046% 0.057%
15 1864 0.043% 0.054%
16 1777 0.041% 0.051%
17 1561 0.036% 0.045%
18 1503 0.035% 0.043%
19 1307 0.030% 0.038%
20 1061 0.025% 0.031%
21 854 0.020% 0.025%
22 562 0.013% 0.016%
23 195 0.005% 0.006%
Total < 1 day 3477450 80.867% 100.000%

Minutes between registration and first edit Number of users Percent of all users Percent of < 1 hour users
0 293625 6.828% 9.013%
1 387565 9.013% 11.896%
2 360452 8.382% 11.064%
3 290431 6.754% 8.915%
4 232709 5.412% 7.143%
5 190312 4.426% 5.842%
6 to 10 588981 13.697% 18.078%
11 to 20 484058 11.257% 14.858%
21 to 30 207025 4.814% 6.355%
31 to 40 111793 2.600% 3.431%
41 to 50 68942 1.603% 2.116%
51 to 60 42021 0.977% 1.290%
Total < 1 hour 3257914 75.762% 100.000%
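The two percentage columns are derived from the raw counts using different denominators. As a sketch, the snippet below recomputes them for the minutes table, taking the counts and the "Total < 1 year" figure of 4300196 from the tables above; the function name and row layout are illustrative.

```python
def share_table(counts, overall_total):
    """Return rows of (label, count, pct_of_all, pct_of_group).

    pct_of_all uses overall_total as the denominator; pct_of_group uses
    the sum of the counts passed in.
    """
    group_total = sum(count for _, count in counts)
    return [
        (label, count, 100.0 * count / overall_total, 100.0 * count / group_total)
        for label, count in counts
    ]


# Counts copied from the minutes table above.
minute_counts = [
    ("0", 293625), ("1", 387565), ("2", 360452), ("3", 290431),
    ("4", 232709), ("5", 190312), ("6 to 10", 588981), ("11 to 20", 484058),
    ("21 to 30", 207025), ("31 to 40", 111793), ("41 to 50", 68942),
    ("51 to 60", 42021),
]
OVERALL_TOTAL = 4300196  # "Total < 1 year" row of the days table

rows = share_table(minute_counts, OVERALL_TOTAL)
for label, count, pct_all, pct_group in rows:
    print(f"{label:>8} {count:>8} {pct_all:7.3f}% {pct_group:7.3f}%")
```

Note that "Percent of all users" is relative to the users who edited within one year of registering, since that total is the 100% row of the days table.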

Future work

Separate these data by registration cohort: has the lag changed over time?