Wikimedia CEE Meeting 2015/Programme/Tools/List management bot
Title of your proposal
- Name(s) and/or username(s)
- User:Strainu
- Type of submission (Please choose one)
- Lecture (one-to-many)
- Summary
When Wiki Loves Monuments (WLM) went global, the international team asked that the lists be built using row templates, which allow for easy automated parsing and editing. They also created a set of scripts that parse the lists and perform operations such as adding categories to images and creating lists of pages with various errors. However, I found that code to be both too complicated (it requires a database) and not flexible enough, so I wrote my own. It needed to achieve the following:
- be modular: because of the large number of pages, I had to be able to parse different namespaces and websites independently. There is one script for each task; the main ones are glued together with a shell script, while secondary scripts (like the one used to create articles) are independent
- allow caching: parsing 60,000 images takes about 2 days, so the results need to be reused as much as possible
- be aware of local Infoboxes and other templates, as they can contain a great deal of information
- allow external data imports with as little preparation as possible
As time went by, I realized that it would make sense to use the same scripts for other lists (for instance, archeological sites). This meant the code needed to be even more generic and modular. I ended up with quite a large, but easy to understand, configuration for each script. As a result, the code can now be reused for almost any list that follows the WLM structure with just a few minutes of configuration.
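To make the idea of configuration-driven, cached list parsing more concrete, here is a minimal sketch in Python. It is not the code presented in the talk: the template names (`MonumentRow`, `SiteRow`), the field names, and the functions `parse_rows` and `parse_cached` are all hypothetical, and the parser assumes simple `key=value` parameters without nested templates or piped links.

```python
import json
import re
from pathlib import Path

# Hypothetical per-list configuration: which row template to look for
# and which of its parameters to keep. Names are illustrative only.
CONFIG = {
    "monuments": {"row_template": "MonumentRow", "fields": ["code", "name", "image"]},
    "sites": {"row_template": "SiteRow", "fields": ["code", "name"]},
}


def parse_rows(wikitext, list_config):
    """Return one dict per row-template call found in the wikitext.

    Simplification: nested templates and piped links inside parameter
    values are not handled.
    """
    pattern = re.compile(
        r"\{\{\s*" + re.escape(list_config["row_template"]) + r"\s*\|(.*?)\}\}",
        re.DOTALL,
    )
    rows = []
    for match in pattern.finditer(wikitext):
        params = {}
        for part in match.group(1).split("|"):
            if "=" in part:
                key, value = part.split("=", 1)
                params[key.strip()] = value.strip()
        # Keep only the configured fields; missing ones become empty strings.
        rows.append({field: params.get(field, "") for field in list_config["fields"]})
    return rows


def parse_cached(revision_id, wikitext, list_config, cache_dir):
    """Parse a page revision once and reuse the stored result afterwards."""
    cache_file = Path(cache_dir) / f"{revision_id}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    rows = parse_rows(wikitext, list_config)
    cache_file.write_text(json.dumps(rows), encoding="utf-8")
    return rows
```

In this sketch, supporting a new kind of list only means adding an entry to `CONFIG`, and keying the cache on the revision ID means an unchanged page is never parsed twice.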
- Preliminary preparation (if necessary)
- Expected outcomes
Encourage other countries to develop their own code and stop depending on the WLM international team's bots to do their work.
- Duration (without Q&A)
20-25 min
- Specific requirements
- Slides or further information
- Interested attendees (Please add yourself, and you may indicate your questions to the presenter).