Toolhub/Progress reports/2021-08-20

Report on activities in the Toolhub project for the week ending 2021-08-20.

Wikimania outcomes

Slides from lightning talk presented at Wikimania 2021

During Wikimania 2021 last weekend, Srishti, Seve, and Bryan were all able to chat with folks from the larger Wikimedia community about Toolhub.

  • Bryan presented on Saturday during the main conference with a lightning talk titled "How to find tools to improve your workflows". The slides with speaker notes from that presentation are now available on Commons. A recording of the talk is on YouTube.
  • Bryan followed up, also on Saturday, with a longer presentation of the same slide deck in the unconference space. This longer version left more time for questions and answers with the folks who showed up.
  • On Sunday, the team participated in an unconference discussion, Quality Signal Sessions: The Wikimania edition. We had some "fun" adventures in using technology during the session, but eventually ended up with about 40 minutes to give another brief presentation from the lightning talk slides. We then asked folks to share their thoughts about how Toolhub might help everyone make more informed decisions about which tools to use in their work, or which tools to spend their time maintaining and improving.

We have done one quick round of debriefing on these interactions, and expect to reflect further on what we heard before making our next plans for outreach to promote the project and to ask for more participation in shaping its next set of features.

Importing data from the wikis

Tracked in Phabricator:
Task T288977

Building on the existing toolinfo.json standard created by Hay's Directory has allowed us to have information about a number of tools that have been documented already. We hope this also offers a reasonably easy path for adding more tools, especially off-wiki tools, to the catalog.

As mentioned in the presentations at Wikimania, we know that there are many lists of tools in various wiki pages across the movement. Bryan has been thinking about these other lists of tools, and started doing some experiments this week to see how difficult it might be to build bots to add some of that information to the catalog as well.

The first challenge that Bryan discovered was how to authenticate a bot process to Toolhub so that it could use the write functions of the API. The OAuth service built into Toolhub is well suited to building web-based tools, but it is not an easy authentication method to use from an unattended command line script. The Django REST Framework library used in the backend of Toolhub provided a convenient solution in its TokenAuthentication subsystem. Bryan has patches up for review in Gerrit adding API endpoints for managing tokens and exposing that API in the UI. When these are merged, users will be able to use the Toolhub UI to create a token for their own account, which can then be used to authenticate to the API by passing an HTTP header like Authorization: Token <my token value> with requests.
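As a rough illustration of the token header described above, here is a minimal sketch of how an unattended script might attach it to a request using only the Python standard library. The helper name build_toolhub_request, the toolhub.example URL, and the payload fields are hypothetical placeholders rather than the actual Toolhub API, and the sketch only builds the request without sending it.

```python
import json
import urllib.request


def build_toolhub_request(url, token, payload=None):
    """Build a request carrying a DRF-style 'Authorization: Token ...' header.

    The URL and payload shape are assumptions for illustration only.
    """
    headers = {"Authorization": f"Token {token}"}
    data = None
    if payload is not None:
        # Send the body as JSON when there is something to write.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Hypothetical endpoint and token value; nothing is sent over the network here.
req = build_toolhub_request(
    "https://toolhub.example/api/tools/",
    "my-token-value",
    {"name": "example-tool"},
)
print(req.get_method(), req.get_header("Authorization"))
```

A script would pass the resulting object to urllib.request.urlopen to actually send it. Keeping the token in an environment variable or a file outside version control is a sensible default for a bot.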

The first experiment at actually importing content has been with the Wikipedia:User scripts/List page on enwiki. This page, maintained by the enwiki community, uses Template:User script table row, which makes extracting structured data easier than it would be from a freeform table. Bryan has managed to import information on over 400 user scripts into his development server by using the mwclient and mwparserfromhell Python libraries to parse data from that enwiki page. Bryan plans to clean up this code and turn it into a tool hosted on Toolforge so that it can be used with the production deployment of Toolhub. Hopefully Bryan and others will be able to repeat this pattern with other pages from the wikis to help get more tools documented in Toolhub.

Wrap up

We started the week with discussions at Wikimania to promote the project generally and continue gathering information on what we could do next. Bryan spent some time doing hackathon-like experiments on using Toolhub as a tool builder. Discussions that started last week about the overlooked content licensing issue continued both on Phabricator and with the Foundation's Legal team. A few patches needed for the production deployment were merged, but there is more work remaining there. Bryan expects to return his focus to the production deployment next week and, as a result, will ideally be able to make a better forecast of when https://toolhub.wikimedia.org/ will be live by next week's report.