Research talk:VisualEditor's effect on newly registered editors/May 2015 study/Work log/2015-06-15
Monday, June 15, 2015
Time to look at the editing sessions. I'm not expecting to see much of a difference here, given that all of our productivity measures were dead even.
changed_and_noswitch means that the session's abort type was not "switchwith", "switchwithout" or "nochange". These sessions are our best denominator when looking at proportions.
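As a rough illustration of that filter, here is a minimal Python sketch. The three excluded abort values come from the definition above; the record layout (a dict with an "abort_type" key, None for saved sessions) and the "abandon" value are assumptions made for the example, not the study's actual schema.

```python
# Abort types excluded from the denominator, as listed in the log.
EXCLUDED_ABORTS = {"switchwith", "switchwithout", "nochange"}


def changed_and_noswitch(sessions):
    """Keep sessions whose abort type is not one of the excluded values.

    `sessions` is a list of dicts with a hypothetical "abort_type" key;
    sessions that ended in a save are assumed to carry None there.
    """
    return [s for s in sessions if s.get("abort_type") not in EXCLUDED_ABORTS]


# Toy records for illustration only (not study data).
toy = [
    {"id": 1, "abort_type": None},          # saved
    {"id": 2, "abort_type": "nochange"},    # opened the editor, changed nothing
    {"id": 3, "abort_type": "switchwith"},  # switched editors
    {"id": 4, "abort_type": "abandon"},     # hypothetical other abort type
]
kept_ids = [s["id"] for s in changed_and_noswitch(toy)]
print(kept_ids)
```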
bucket | via_mobile | users.n | ve.k | ve.p | attempted.k | attempted.p | successful.k | successful.p | changed_and_noswitch.n | changed.n | n |
---|---|---|---|---|---|---|---|---|---|---|---|
control | 0 | 3421 | 53 | 0.007391911 | 3207 | 0.4472803 | 2980 | 0.4156206 | 4683 | 4692 | 7170 |
experimental | 0 | 3459 | 2412 | 0.3404856 | 2668 | 0.3766234 | 2452 | 0.3461321 | 4260 | 4671 | 7084 |
control | 1 | 219 | 0 | 0 | 119 | 0.3190349 | 110 | 0.2949062 | 281 | 281 | 373 |
experimental | 1 | 211 | 78 | 0.2154696 | 89 | 0.2458564 | 83 | 0.2292818 | 240 | 252 | 362 |
It looks like 34% of desktop edit sessions in the experimental bucket were VE. We also see a bit of a difference in the overall proportion of successful sessions (41.6% for control vs. 34.6% for experimental). Even if we filter out nochange and switching sessions, we see 2452/4260 = 57.6% for experimental and 2980/4683 = 63.6% for control.
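The two sets of percentages differ only in the denominator. A small Python sketch that recomputes both from the desktop rows of the table above (the dict layout and function name are illustrative choices, not part of the original analysis):

```python
# Desktop rows (via_mobile = 0) copied from the table above.
desktop = {
    "control":      {"successful": 2980, "changed_and_noswitch": 4683, "n": 7170},
    "experimental": {"successful": 2452, "changed_and_noswitch": 4260, "n": 7084},
}


def success_rates(row):
    """Return (share of all sessions, share of changed_and_noswitch sessions)."""
    return (row["successful"] / row["n"],
            row["successful"] / row["changed_and_noswitch"])


for bucket, row in desktop.items():
    overall, filtered = success_rates(row)
    print(f"{bucket}: {overall:.1%} of all sessions, {filtered:.1%} after filtering")
```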
> prop.test(c(2452,2980), c(4260, 4683))
    2-sample test for equality of proportions with continuity correction
data:  c(2452, 2980) out of c(4260, 4683)
X-squared = 34.2779, df = 1, p-value = 4.778e-09
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.08123271 -0.04028203
sample estimates:
   prop 1    prop 2
0.5755869 0.6363442
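For readers without R at hand, the following is a minimal Python sketch of the same kind of test: a two-sample chi-squared test for equality of proportions with Yates' continuity correction, plus a continuity-corrected Wald interval for the difference. It is a re-derivation using only the standard library, not the code used in the study; the function name and the hard-coded normal quantile for 95% are assumptions of the sketch.

```python
import math


def two_prop_test(k1, n1, k2, n2, z=1.959964):
    """Two-sample test for equality of proportions with continuity correction.

    Returns (chi-squared statistic, two-sided p-value, (ci_low, ci_high)),
    where the interval is for p1 - p2 at the level implied by `z`.
    """
    p1, p2 = k1 / n1, k2 / n2
    diff = p1 - p2
    inv_n = 1 / n1 + 1 / n2
    # Continuity correction, capped so it never exceeds the observed gap.
    cc = min(0.5 * inv_n, abs(diff))
    pooled = (k1 + k2) / (n1 + n2)
    stat = (abs(diff) - cc) ** 2 / (pooled * (1 - pooled) * inv_n)
    # Upper tail of a chi-squared variable with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    # Unpooled standard error for the interval, widened by the correction.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    width = z * se + cc
    return stat, p_value, (max(diff - width, -1.0), min(diff + width, 1.0))


stat, p_value, ci = two_prop_test(2452, 4260, 2980, 4683)
print(f"X-squared = {stat:.4f}, p-value = {p_value:.3e}")
print(f"95% CI for the difference: {ci[0]:.5f} to {ci[1]:.5f}")
```

The interval excludes zero, which matches the reading that the experimental bucket's success proportion is lower among changed_and_noswitch sessions.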
That difference is significant. This is surprising, since we did not see a significant difference in overall productivity. I think that we might be seeing the result of people *playing* with VE. I can imagine people suddenly noticing the two edit links and spending some time checking out what it looks like to edit with VE on a few different articles. If it were me, I'd spend some time typing and copy-pasting around an article to see what it looked like and then not hit save. It's difficult to know whether my experience and intuition match others', but I think it's safe to conclude that something more complex than "VE doesn't work as well as Wikitext for people" is going on here. --Halfak (WMF) (talk) 15:50, 15 June 2015 (UTC)
Additional questions
OK. So that roughly concludes my planned evaluation. Now, I'd like to do some descriptive statistics. Since productivity held roughly constant, I want to look at the distribution of productivity for new editors and see what level of productivity newcomers who choose to use VE are generally at.
> mean(user_metrics[week_revisions > 0 & bucket=="experimental",]$prop.ve > .5)
[1] 0.4057274
40.6% of experimental editors with at least one revision (week_revisions > 0) mostly used VE (prop.ve > 0.5). That means we should have a good set of observations.
It looks like the primary difference between mostly Wikitext and mostly VE is that mostly VE has more editors who make at least one productive edit. I'll need to run a test to be sure. --Halfak (WMF) (talk) 16:00, 15 June 2015 (UTC)
       group productive editing     n
1: mostly WT       1086    2332 11203
2: mostly VE       1138    2033  2304

> prop.test(c(1086, 1138), c(2332, 2033))
    2-sample test for equality of proportions with continuity correction
data:  c(1086, 1138) out of c(2332, 2033)
X-squared = 38.0831, df = 1, p-value = 6.779e-10
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.12411876 -0.06401967
sample estimates:
   prop 1    prop 2
0.4656947 0.5597639
Yes. It looks like the share of mostly-VE users who make at least one productive edit is about 9 percentage points higher than the share of mostly-WT users (56.0% vs. 46.6%). This does not suggest that VE increases productive editing -- just that editors who were likely to be productive are more likely to mostly use VE. --Halfak (WMF) (talk) 16:31, 15 June 2015 (UTC)
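The size of the gap between the two groups can be expressed either in percentage points or relative to the WT share, and the two readings differ quite a bit. A small Python sketch using the counts from the test above (the function name is just an illustrative choice):

```python
def compare_shares(k_a, n_a, k_b, n_b):
    """Return (share_a, share_b, gap in percentage points, gap relative to a)."""
    a, b = k_a / n_a, k_b / n_b
    return a, b, (b - a) * 100, (b - a) / a


# Productive editors / editing editors: mostly WT first, then mostly VE.
wt, ve, pp_gap, rel_gap = compare_shares(1086, 2332, 1138, 2033)
print(f"mostly WT: {wt:.1%}, mostly VE: {ve:.1%}")
print(f"gap: {pp_gap:.1f} percentage points ({rel_gap:.1%} relative to WT)")
```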
Time to completion
I had forgotten that I planned to measure the time between the start and completion of an edit. So, let's do that!
A t-test of the log values suggests this difference is significant, with an expected difference in the average edit time of ~20 seconds.
t = -4.9302, df = 5253.173, p-value = 8.468e-07
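The fractional degrees of freedom are consistent with Welch's unequal-variance t-test, which is the default in R's t.test; that is an inference from the output, not something the log states. Under that assumption, a minimal Python sketch of the statistic on log-transformed durations looks like this. The sample durations are toy values for illustration only, and the sketch stops at t and df because the standard library has no t-distribution CDF for the p-value.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch's unequal-variance t statistic and Welch-Satterthwaite df."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def welch_t_log(durations_a, durations_b):
    """Apply the test to natural-log durations (all values must be > 0)."""
    return welch_t([math.log(d) for d in durations_a],
                   [math.log(d) for d in durations_b])


# Toy durations in seconds, for illustration only (not study data).
t, df = welch_t_log([20, 35, 50, 90], [40, 80, 120, 300])
print(f"t = {t:.3f}, df = {df:.2f}")
```

Testing on the log scale compares geometric rather than arithmetic means, which is a common choice for right-skewed timing data.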
Let's look within the experimental condition at edits saved via the visual editor vs. wikitext.
It seems like editors who use wikitext are making substantially faster edits. The mode of the wikitext distribution is around 35 seconds, while the mode of the visualeditor distribution is more like 2 minutes. --Halfak (WMF) (talk) 19:22, 15 June 2015 (UTC)
So, I wonder if, when presented with VE, newcomers will perform different types of edits. We might also be seeing save delays due to the time spent waiting for the editor to load or even the newcomers spending more time exploring VE and its complex menus. --Halfak (WMF) (talk) 19:25, 15 June 2015 (UTC)