Research:Productive new editor

Productive new editor
Specification
A productive new editor is a new editor who completes at least n productive edit(s) within time t since registration.
WMF Standard
Status: completed

Productive new editor is a standardized user class used to measure, over time, the number of first-time editors in a wiki project who make productive contributions. It is used as a proxy for editor productivity and, to a lesser extent, editor activation. A "productive new editor" is a new editor who saves revisions to pages in content namespaces and whose revisions are not reverted.
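The definition above can be sketched in code. This is a minimal illustration, not the WMF implementation: the `Edit` record, its field names, and the function `is_productive_new_editor` are hypothetical, and the values passed for n and t are placeholders rather than the WMF Standard parameters.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Edit:
    """One saved revision (hypothetical record layout)."""
    timestamp: datetime
    in_content_namespace: bool
    reverted: bool  # True if the revision counts as reverted


def is_productive_new_editor(registration, edits, n, t):
    """Return True if at least n productive edits fall within t of registration."""
    productive = [
        e for e in edits
        if e.in_content_namespace
        and not e.reverted
        and registration <= e.timestamp <= registration + t
    ]
    return len(productive) >= n


# Example data: one productive edit, one reverted edit, one non-content edit.
registered = datetime(2014, 1, 1)
edits = [
    Edit(datetime(2014, 1, 1, 2), True, False),   # counts as productive
    Edit(datetime(2014, 1, 1, 3), True, True),    # reverted, ignored
    Edit(datetime(2014, 1, 1, 4), False, False),  # not a content page, ignored
]
```

With these example edits, the user qualifies for n = 1 but not for n = 2, which shows how a higher threshold shrinks the class.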

Discussion

Excluding edits to deleted content

Spammers and other non-productive new editors tend to create non-productive articles, and those articles tend to be deleted rather than having their edits reverted. Such edits would therefore escape a criterion based on reverts alone. To account for this, edits to articles that are deleted by the end of a new editor's first week since registration are not included in counts of productive edits.
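A minimal sketch of this exclusion rule follows. The data layout (tuples of page id and timestamp, and a dictionary of deletion times) and the function name `exclude_deleted` are assumptions made for illustration; whether the one-week deadline is inclusive is also an assumption.

```python
from datetime import datetime, timedelta


def exclude_deleted(edits, deletions, registration, window=timedelta(days=7)):
    """Drop edits to pages deleted by the end of the editor's first week.

    edits: list of (page_id, timestamp) tuples (hypothetical layout).
    deletions: dict mapping page_id to the time the page was deleted.
    """
    deadline = registration + window
    return [
        (page_id, ts)
        for page_id, ts in edits
        if not (page_id in deletions and deletions[page_id] <= deadline)
    ]


registered = datetime(2014, 1, 1)
edits = [
    (10, datetime(2014, 1, 1, 5)),  # page 10 is deleted within the first week
    (11, datetime(2014, 1, 2, 5)),  # page 11 is never deleted
    (12, datetime(2014, 1, 3, 5)),  # page 12 is deleted after the first week
]
deletions = {10: datetime(2014, 1, 4), 12: datetime(2014, 2, 1)}
kept = exclude_deleted(edits, deletions, registered)
```

Only the edit to page 10 is dropped here; the edit to page 12 is kept because its deletion falls outside the first week.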

The n productive edits threshold

As with choosing n for any metric based on counts (e.g. new editor and active editor), choosing a threshold here is somewhat arbitrary. A higher threshold results in a smaller proportion of newly registered users being considered productive.

The t time cutoff

There are a few ways to draw the timespan for identifying productive edits. The two most common are based on time bounds and on events. A time-bounded approach uses some cutoff t to limit observations to a certain amount of time after a user registered their account. An event-based approach uses some event as the starting point for counting user contributions. Another candidate timespan covers the edits that a newcomer performed in their first edit session. Since productive new editor qualifies the activity of a new editor, we choose t so that the class of productive new editors is a proper subset of new editors. We analyze the effect of choosing a different value for t below.

Time to revert cutoff

Because a revert can theoretically occur years after the original edit (a censoring problem, see en:Censoring_(statistics)), reverts are only counted if they occur within 48 hours of the original edit. For more details, see Research:Revert#Time to revert cutoff.
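The cutoff can be expressed as a small predicate. This is a sketch under stated assumptions: the function name `counts_as_reverted` is hypothetical, and treating a revert at exactly 48 hours as inside the window is a choice made here, not something the text specifies.

```python
from datetime import datetime, timedelta

REVERT_CUTOFF = timedelta(hours=48)


def counts_as_reverted(edit_time, revert_time, cutoff=REVERT_CUTOFF):
    """Return True only if the revert happened within the cutoff after the edit.

    revert_time is None when the edit was never reverted.
    """
    if revert_time is None:
        return False
    delay = revert_time - edit_time
    return timedelta(0) <= delay <= cutoff


edit_time = datetime(2014, 1, 1, 12)
quick_revert = datetime(2014, 1, 2, 12)  # 24 hours later
late_revert = datetime(2014, 1, 5, 12)   # 96 hours later
```

Under this rule an edit that is reverted four days later is still treated as not reverted, which is the source of the limitation about unnoticed vandalism.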

Limitations

  • This metric represents productivity as a binary attribute of a user; it does not measure how productive a new editor is. New editors who make many productive edits and contribute substantial amounts of content look identical (under this metric) to new editors who fix a few typos.
  • The cleverest vandalism may go unnoticed for more than 48 hours, in which case the edit is still counted as productive.

Lag time

Generation of this metric needs to be delayed by t + 48 hours after users' registration dates in order to allow t days for newly registered users to make edits and an additional 48 hours for other editors to have a chance to revert them. In the case of the WMF Standard parameterization, this works out to  .
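The lag arithmetic can be sketched as follows, assuming the additional delay equals the 48-hour time-to-revert cutoff described earlier. The function name and the value of t used in the example are placeholders, not the WMF Standard parameterization.

```python
from datetime import datetime, timedelta


def earliest_measurement_time(registration, t, revert_cutoff=timedelta(hours=48)):
    """Earliest time at which the metric can be generated for one user.

    The user gets t to make edits, then other editors get revert_cutoff
    to revert the last of those edits.
    """
    return registration + t + revert_cutoff


# Placeholder example: t = 7 days is an assumption for illustration only.
registered = datetime(2014, 1, 1)
ready = earliest_measurement_time(registered, t=timedelta(days=7))
```

For a cohort, the same calculation applies to the latest registration date in the cohort.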

Analysis

Besides the variables describing a productive edit, there are two variables used in this metric:

  • The value of n (the productive edits threshold)
  • The value of t (the time cutoff)

Because the raw number of productive new editors depends heavily on the raw number of new editors, this metric is best examined as a proportion. Because identifying reverts is computationally expensive, the following plots were generated from random samples of newly registered users, stratified by registration month.
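The sampling and proportion steps might look like the sketch below. It is illustrative only: the input layout (pairs of user id and registration month), the function names, and the sample size are assumptions, and the actual analysis may have been done differently.

```python
import random
from collections import defaultdict


def sample_by_month(users, per_month, seed=0):
    """Draw up to per_month users at random from each registration month.

    users: list of (user_id, "YYYY-MM") pairs (hypothetical layout).
    """
    rng = random.Random(seed)
    by_month = defaultdict(list)
    for user_id, month in users:
        by_month[month].append(user_id)
    return {
        month: rng.sample(ids, min(per_month, len(ids)))
        for month, ids in sorted(by_month.items())
    }


def proportion_productive(sample, is_productive):
    """Proportion of sampled users per month for which is_productive(user_id) is True."""
    return {
        month: sum(1 for u in ids if is_productive(u)) / len(ids)
        for month, ids in sample.items()
        if ids  # skip empty strata to avoid division by zero
    }


users = [(1, "2013-01"), (2, "2013-01"), (3, "2013-01"), (4, "2013-01"),
         (5, "2013-02"), (6, "2013-02")]
productive_ids = {2, 4, 5, 6}  # made-up labels for the example
sample = sample_by_month(users, per_month=10)
proportions = proportion_productive(sample, lambda u: u in productive_ids)
```

Sampling within each month keeps every month represented equally, so that months with many registrations do not dominate the cost of revert detection.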

German Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors is plotted by registration month for two parameter values.

Figure: Productive new editors per newly registered user. The proportion of productive new editors is plotted by registration month for two parameter values.

English Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors is plotted by registration month for two parameter values.

Figure: Productive new editors per newly registered user. The proportion of productive new editors is plotted by registration month for two parameter values.

Spanish Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors is plotted by registration month for two parameter values.

Figure: Productive new editors per newly registered user. The proportion of productive new editors is plotted by registration month for two parameter values.


French Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors is plotted by registration month for two parameter values.

Figure: Productive new editors per newly registered user. The proportion of productive new editors is plotted by registration month for two parameter values.

Polish Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors is plotted by registration month for two parameter values.

Figure: Productive new editors per newly registered user. The proportion of productive new editors is plotted by registration month for two parameter values.

Portuguese Wikipedia

Figure: Productive new editors per new editor. The proportion of productive new editors is plotted by registration month for two parameter values.

Figure: Productive new editors per newly registered user. The proportion of productive new editors is plotted by registration month for two parameter values.

Factor comparison of n and t

Figure: Factor of n. The factor of difference between proportions of productive new editors for different values of n is plotted.

Figure: Factor of t. The factor of difference between proportions of productive new editors for different values of t is plotted.
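One plausible reading of "factor of difference" is the ratio between the proportions obtained under two parameter settings. The sketch below assumes that reading; the function name and the example proportions are made up for illustration and are not results from the plots.

```python
def factor_of_difference(proportion_a, proportion_b):
    """Ratio proportion_a / proportion_b between two parameter settings.

    Assumes "factor of difference" means a simple ratio of proportions.
    """
    if proportion_b == 0:
        raise ValueError("baseline proportion must be non-zero")
    return proportion_a / proportion_b


# Made-up proportions for two hypothetical values of one parameter.
factor = factor_of_difference(0.30, 0.15)
```

A factor that stays roughly constant across registration months would suggest that changing the parameter rescales the metric without changing its trend.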

Usage

User interface experiments at the Wikimedia Foundation apply the productive new editor metric across an entire cohort to generate the productive newcomer proportion. For example, in an A/B test, we might compare the proportion of productive new editors in a control group to a test group. This allows us to have a very basic understanding of whether the A/B test led to more new editors making productive contributions.
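A comparison of this kind can be sketched as below. The two-proportion z-test is a standard technique swapped in here for illustration; the text does not say which statistical test, if any, the Wikimedia Foundation uses, and the counts in the example are invented.

```python
import math


def productive_proportion(productive_count, cohort_size):
    """Share of a cohort that met the productive new editor criteria."""
    return productive_count / cohort_size


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    x1 of n1 users are productive in the control group, x2 of n2 in the
    test group. Undefined when the pooled proportion is 0 or 1.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Invented counts: 100 of 1000 productive in control, 130 of 1000 in test.
control = productive_proportion(100, 1000)
test_group = productive_proportion(130, 1000)
z, p_value = two_proportion_z(100, 1000, 130, 1000)
```

Because the metric is binary per user, each cohort reduces to a count and a size, which is why a proportion-based comparison is enough for this basic reading of an experiment.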

See the following research reports for examples of this type of usage: