Grants:IEG/Wikiscan multi-wiki/Final

Welcome to this project's final report! This report shares the outcomes, impact and learnings from the Individual Engagement Grantee's 6-month project.

Part 1: The Project

Summary

  • Wikiscan has been successfully extended into a multi-wiki site supporting the biggest Wikimedia wikis (currently 348 wikis).
  • Several other improvements have been made: new portal and home pages, multi-language support, page views in the page statistics, etc.

Methods and activities

I have been an active Wikimedia participant since 2006 and started working on Wikiscan in 2010:

  • As a volunteer, I have a lot of ideas about statistics that would be useful for me and other participants.
  • As a software developer, I turn those ideas into workable features.

I built Wikiscan on my own over several years, following my own ideas and community feedback from French Wikipedia, where I am mainly active.

For this grant, I created a list of features with estimated times [1]. I started by working on a prototype of the main multi-wiki feature to make sure it was doable without breaking too many things.

I use a spreadsheet for the task list and to count the time spent on each task. It is a good way to stay focused on planned features, but it is sometimes hard to decide which line a piece of work belongs to, because many things are interconnected. Some important tasks took more time than expected; to stay on budget in the final part, I cut a few secondary improvements, but nothing important.

The main activity was software development; the Outcomes section below lists the main features achieved.

Outcomes and impact

Outcomes

  • Wikiscan is now extended from one wiki to all Wikimedia wikis with more than 100,000 edits, currently 348 wikis. This was the main goal of the grant and the biggest challenge, requiring many new developments and optimizations.
  • The multi-wiki setup uses a new worker system that automatically schedules updates for all wikis.
  • New global portal at http://wikiscan.org, which displays small history graphs for all wikis; the order is based on an internal score using several factors.
  • New homepage for each wiki, with several new statistics and charts.
  • Two new status pages to monitor wiki updates: one big table with detailed information for each wiki [2] and one for the master worker [3].
  • Basic multi-language support added, with an English translation.
  • Page views added to the statistics by date and for the last hours; this is a full rewrite to scale to multiple wikis and use the new datasets. All days since January 2015 have been imported and split by wiki.
  • Global statistics tables for users are now also available for each year and each month, with a filter. For example, Top Commons users in 2015 sorted by uploads.
  • New graphs on the evolution of active users based on the number of months of participation.
  • It is now possible to obtain user stats from a list of users contained in a wiki page instead of a category. For example, fr:Projet:Afripédia/Formation Douala gives this table.
  • Added last-48-hours stats for any wiki with fewer than 10,000 edits in the last 24 hours.
  • Added links to yearly stats when available (small wikis, or years with few edits on big wikis); for example, Meta most active pages in 2013.
  • New global bots table to improve bot detection.
  • User stats have been optimized for big wikis, in particular for English Wikipedia.
  • Several internal optimizations and refactoring.
  • Source code published under a free license, GPL v2 (the same as MediaWiki).
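
As a rough illustration of two rules described in the list above, the sketch below shows how wiki eligibility and the recent-activity window could be decided. It is not taken from the Wikiscan source code: the function names, the sample wikis and their edit counts, and the default 24-hour window are assumptions made for the example. Only the 100,000-edit and 10,000-edit thresholds come from this report, and the comparison uses "at least 100,000", following the wording of the measure of success.

```python
# Hypothetical sketch, not Wikiscan's actual code.
# Thresholds come from the report; names and sample data are invented.

MIN_TOTAL_EDITS = 100_000      # a wiki needs this many edits to get its own site
BUSY_EDITS_PER_DAY = 10_000    # below this, recent stats cover 48 hours


def is_supported(total_edits: int) -> bool:
    """Return True if a wiki is large enough to get its own subdomain."""
    return total_edits >= MIN_TOTAL_EDITS


def recent_window_hours(edits_last_24h: int) -> int:
    """Pick the recent-stats window: 48 h for quiet wikis, else 24 h (assumed default)."""
    return 48 if edits_last_24h < BUSY_EDITS_PER_DAY else 24


# Invented sample data: wiki name -> (total edits, edits in the last 24 hours).
sample_wikis = {
    "bigwiki": (5_000_000, 40_000),
    "midwiki": (250_000, 1_200),
    "tinywiki": (8_000, 15),
}

# Keep only supported wikis and record the window each one would use.
supported = {
    name: recent_window_hours(last_24h)
    for name, (total, last_24h) in sample_wikis.items()
    if is_supported(total)
}
print(supported)
```

With the sample data, the smallest wiki is filtered out and the quiet mid-sized wiki gets the longer window.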

Progress towards stated goals

Planned measure of success (include numeric target, if applicable) | Actual result | Explanation
Public Wikimedia wikis totaling at least 100,000 edits (currently 336 wikis) will get their own functional xxx.wikiscan.org site. | Public Wikimedia wikis with at least 100,000 edits have their own xxx.wikiscan.org site, currently 348 wikis. | New wikis crossing the 100,000-edit limit are added automatically.
Reach at least 300 unique users per month on all new subdomains (thus excluding fr.wikiscan.org and the future portal wikiscan.org). | 189 users in October, 324 in November, 449 in December. | The new subdomains have been available since 10 October 2016; the goal was reached in November.


Think back to your overall project goals. Do you feel you achieved your goals? Why or why not?

Yes, I think the goals were reached: the multi-wiki site works with all expected wikis, and the number of visitors is good and increasing. I did not expect so many visitors so quickly; the main explanation is search engines, which accounted for 33% of visitor sessions in November and December.

Global Metrics

Metric | Achieved outcome | Explanation
1. Number of active editors involved | 0 | As an external site, it is not possible to know which visitors are active wiki editors.
2. Number of new editors | 0 |
3. Number of individuals involved | 1 | Only me for the software development. Also the global number of unique visitors for the period Oct 10-Dec 31 on all new sub-domains is 913.
4. Number of new images/media added to Wikimedia articles/pages | 0 |
5. Number of articles added or improved on Wikimedia projects | 0 |
6. Absolute value of bytes added to or deleted from Wikimedia projects | 0 |


Learning question
Did your work increase the motivation of contributors, and how do you know?

Detailed user metrics and rankings may increase the participation of some contributors through gamification or competition, but this is difficult to measure.

Indicators of impact

Do you see any indication that your project has had impact towards Wikimedia's strategic priorities? We've provided 3 options below for the strategic priorities that IEG projects are most likely to impact. Select one or more that you think are relevant and share any measures of success you have that point to this impact. You might also consider any other kinds of impact you had not anticipated when you planned this project.

Option A: How did you increase participation in one or more Wikimedia projects?

As above, providing detailed user metrics and rankings may increase the participation of some contributors through gamification or competition, but this is difficult to measure.

Option B: How did you improve quality on one or more Wikimedia projects?

Option C: How did you increase the reach (readership) of one or more Wikimedia projects?

Project resources

  • http://wikiscan.org, the portal with links to all wiki subdomains.
  • The about page contains a link to the source code (the published version at the end of this grant is 0.8).
    • Source code is no longer publicly available, but a fork has been published by an independent developer, using the 0.9.2 version as the starting point. A project page has been created at mediawiki.org: mw:Wikiscan.

Learning

What worked well

What didn’t work

  • The project plan was very detailed, and it is hard to predict how much time I will spend on each task. Sometimes I had to switch rapidly from one task to another because they depend on each other. If I do another grant with a lot of software development, I will try to group similar tasks together.

Other recommendations

If you have additional recommendations or reflections that don’t fit into the above sections, please list them here.

Next steps and opportunities

This grant was the first step toward opening and improving Wikiscan. There are many possibilities for future improvements; some ideas:

  • Add support for all wikis, including small wikis with fewer than 100,000 edits; this will double the number of supported wikis and databases, but it should work with dedicated optimizations.
  • Improve wiki specialization according to language (e.g. revert summary detection) and for project types other than Wikipedia.
  • New statistics, like deleted contributions.
  • More and improved graphs: drop PNG for SVG, custom graphs.
  • Statistics tables for each wiki and global tables for all wikis.
  • Global edit counter for users, adding all projects together.
  • Improve statistics for lists/categories of users, for example by calculating total and average edits for the whole list.
  • Statistics data exports...
  • Code refactoring: the multi-wiki work added a lot of changes; for example, there are two entry points, which should be simplified.
  • Use the future WMF Analytics edits history data lake [4].
  • Develop an API to allow external tools to easily use the data (e.g. graphs embedded in wiki pages with the Graph extension).
  • Puppetize the server and migrate to a better one with SSD and more RAM.
  • See if Wikiscan can be hosted inside Wikimedia Tool Labs or turned into a MediaWiki extension.

Part 2: The Grant

Finances

Actual spending

Expense | Approved amount (€) | Actual funds spent (€) | Difference (€)
1 - Allow the site to calculate statistics directly with the remote Wikimedia Labs database and perform the necessary optimizations | 450 | 480 | +30
2 - Transform the site into a multi-wiki site with a subdomain and a database for each wiki | 540 | 540 | 0
3 - Set up a system of "workers" to update statistics for each wiki | 420 | 570 | +150
4 - Create a new tracking page to monitor the updates status of workers of all wikis | 480 | 570 | +90
5 - Create a new global homepage displaying the list of available wikis with some general statistics and graphs for each wiki | 600 | 660 | +60
6 - Create a new home page for each wiki with global statistics updated regularly | 480 | 600 | +120
7 - Adapt statistics by date for small wiki that have too few edits per day (show only months and add years, allow more than 24 hours for recent edits) | 180 | 180 | 0
8 - Transform the interface for multi-language support with all texts in a single file | 540 | 540 | 0
9 - Translate the interface in English | 240 | 240 | 0
10 - French documentation of the various statistics computed by the site with possibility to add other languages by translating a file | 240 | 0 | -240
11 - Display the overall ranking of users for each year and each month | 180 | 180 | 0
12 - Add page views on the statistics by date, the system must be highly optimized to work with hundreds of wikis | 600 | 600 | 0
13 - Allow to use a list of users contained in a wiki page instead of a category | 60 | 60 | 0
14 - Add new graphics on the evolution of active users based on the number of months of participation | 300 | 300 | 0
15 - Improve the display of the most active articles for the last 24 hours | 360 | 0 | -360
16 - UI improvements and graphics | 240 | 60 | -180
17 - Restructuring and additions of statistics | 180 | 300 | +120
18 - Optimizations and various corrections, code rewriting, code cleaning | 600 | 600 | 0
19 - User stats optimizations for big wikis, enwiki scaling | 0 | 210 | +210
Total | 6690 | 6690 | 0


I spent more time than planned on several core features in the first part of the project:

  • The worker system needed more improvements than I initially thought: +5 hours to add a master worker and +3 hours for its status page (lines 3 and 4).
  • Global home page and wiki home page: these are visual pages that matter to visitors, and it takes a long time to choose what to display, search for a graphics library or build my own, make the graphs and page layouts, etc. +2 hours for the global page, +4 hours for the wiki home page (lines 5 and 6).
  • Restructuring and additions of statistics: I spent more time on this, essentially to provide global statistics to the new home pages in an efficient way, +4 hours (line 17).
  • English Wikipedia scaling for user stats needed a lot more optimization to keep the server running smoothly; I used 7 unplanned hours for this (new line 19).

These were the most difficult and important tasks, especially scaling to 300+ wikis with support for English Wikipedia.

I cut some secondary features to fit the remaining time:

  • 10 - French documentation of the various statistics computed by the site
    There is already some documentation in tooltips; using a wiki page for documentation could be more appropriate and easier to maintain and translate.
  • 15 - Improve the display of the most active articles for the last 24 hours
    The current list works fine; improvements could be made in another grant.
  • 16 - UI improvements and graphics
    I used only two hours for minor improvements and stats optimizations.
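
The differences in the spending table can be cross-checked against the hours quoted above. The sketch below is only a consistency check: the hourly rate is not stated in this report and is derived here by dividing each overrun by its quoted hours, and the variable names are illustrative.

```python
# Consistency check on the spending table; not part of the original report.
# Budget differences in euros, keyed by expense line number (from the table).
differences = {1: 30, 3: 150, 4: 90, 5: 60, 6: 120, 10: -240,
               15: -360, 16: -180, 17: 120, 19: 210}

# Extra hours quoted in the text for the lines that ran over.
extra_hours = {3: 5, 4: 3, 5: 2, 6: 4, 17: 4, 19: 7}

# Implied hourly rate per line (an inference, not a figure from the report).
implied_rates = {line: differences[line] / hours
                 for line, hours in extra_hours.items()}

# Total spent over budget and total saved by the cut features.
overruns = sum(d for d in differences.values() if d > 0)
savings = -sum(d for d in differences.values() if d < 0)

print(implied_rates)
print(overruns, savings)
```

With these figures every overrun works out to the same implied rate, and the 780 € of extra work equals the 780 € saved on lines 10, 15 and 16, which is why the total difference is 0.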

Remaining funds

Do you have any unspent funds from the grant?

  • no

Documentation

Did you send documentation of all expenses paid with grant funds to grantsadmin wikimedia.org, according to the guidelines here?

Please answer yes or no. If no, include an explanation.

  • (All expenses are hours for software development.)

Confirmation of project status

Did you comply with the requirements specified by WMF in the grant agreement?

  • yes

Is your project completed?

  • yes

Grantee reflection

We’d love to hear any thoughts you have on what this project has meant to you, or how the experience of being an IEGrantee has gone overall. Is there something that surprised you, or that you particularly enjoyed, or that you’ll do differently going forward as a result of the IEG experience? Please share it here!