Future Audiences/List of experiment ideas

This page documents ideas for experiments that could allow the Wikimedia movement to use new technology (e.g., generative AI tools and libraries) to share or gather knowledge in new ways. These are intended to be ideas for small-scale technical experiments (e.g., prototypes, proofs of concept, or simple gadgets) that can be executed quickly in a hackathon or as a part-time project by a volunteer developer.

These ideas have been generated by the Future Audiences team and members of the Wikimedia community, and are intended to inspire anyone in the Wikimedia movement and beyond – volunteer developers, hackathon participants, affiliates or organizers – to add and discuss new ideas, try these or other experiments, and share resources and results with other interested Wikimedians.

Please feel free to add ideas, share resources (e.g., datasets, libraries, tools that may be helpful to developers), link your experiments in the table below, or leave other comments and suggestions on the talk page.

Experiment ideas


Idea	Notes/resources	New technology or trend this addresses	Links to experiments/results	What could this help our movement learn/achieve?
Use AI to remix Wikimedia content into new formats		AI	See this experiment to turn Wikipedia article content into short videos: https://gitlab.wikimedia.org/repos/machine-learning/article-to-short-video	What new content formats (e.g., videos, podcasts, visual "stories", chatbots) are preferred/useful for learning besides longform text and images?
Create shareable visualizations of reading 'rabbit holes' and/or end of year lists (like Spotify Wrapped)		Sharing knowledge on social apps	See this experimental "Wikipedia Year in Review" tool: https://wikipediayir.netlify.app/	Could we draw in more readers by creatively sharing knowledge on external platforms?
Create an app or bot that shares fun facts from Wikipedia	Dataset of "Did You Know"s from English Wikipedia: https://huggingface.co/datasets/derenrich/enwiki-did-you-know	Sharing knowledge on social apps		Could we draw in more readers by creatively sharing knowledge on external platforms?
Create an AI-assisted Wikipedia voice assistant (like the BBC's voice assistant)		AI		Could we draw in more readers by creatively sharing knowledge on external platforms?
Create a Wikipedia chatbot on popular messaging apps (e.g., WhatsApp, Discord, Telegram)		AI		What new content formats (e.g., videos, podcasts, visual "stories", chatbots) are preferred/useful for learning besides longform text and images?
Create a game where people compete to find missing citations in the least amount of time possible		AI		Could we draw in more potential contributors through a gamified experience?
Create a version of Citation Needed that allows a Wikipedian to log in and edit to add/improve Wikipedia information		AI	See Citation Needed experiment repo here: https://gitlab.wikimedia.org/repos/future-audiences	Could we increase the productivity of current editors by letting them contribute in new ways?
Use LLMs to find new or better sources for claims on Wikipedia		AI	See CiteCheck experiment: https://github.com/masmedim/citecheck	Could we increase the productivity of current editors by letting them contribute in new ways?
Use AI to autogenerate quizzes on articles	Could be available via a "Check your knowledge" interface	AI	Could leverage H5P	What new content formats (e.g., videos, podcasts, visual "stories", chatbots) are preferred/useful for learning besides longform text and images?
Use GenAI to generate suggestions for Wikipedia articles based on given source content		AI	A first test: sv:Wikipedia:Projekt_Fredrika/SLS-AI-pilot	Generate new content. Inspire new and experienced editors who prefer working from draft suggestions to editing from scratch.
Use AI to create up-to-date spoken Wikipedia audio		AI	c:Help:Spoken Wikipedia using AI	Offer Wikipedia content in popular podcast- and audiobook-type formats.
Your idea here!

Technical pain points

Enterprise API
- Enterprise API lacks TypeScript library bindings
- Enterprise API auth mechanisms are cumbersome
- Enterprise API's image information is minimal (ideally it would include all images with their alt text, caption and license information)
MediaWiki
- Internal search isn't as good as external search engines over Wikipedia
- There is no easy way to map text to the citations backing it
- There is no way to deep link to a proposed edit. For example, we might want to link to a page with an edit already applied, so that a user could review it and click "save" (or amend it first and then save).
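
To illustrate the citation-mapping pain point, here is a minimal sketch of the kind of heuristic a developer ends up writing today. It assumes raw wikitext as input, assumes citations appear as inline <ref> tags placed after the text they support, and splits sentences naively on punctuation. The function name map_text_to_citations is hypothetical, and the approach does not handle templates such as {{sfn}}, abbreviations, or named references that are defined elsewhere on the page.

```python
import re

# Matches self-closing refs (<ref name="x" />) and paired refs (<ref>...</ref>).
REF_PATTERN = re.compile(
    r'<ref\b[^>]*?/>|<ref\b[^>]*>.*?</ref>',
    re.DOTALL | re.IGNORECASE,
)
# Internal placeholder that stands in for the n-th ref while splitting sentences.
PLACEHOLDER = re.compile(r'\x00(\d+)\x00')
# Naive sentence: runs up to ., ! or ? (plus any trailing refs) followed by
# whitespace or end of text; a trailing fragment without punctuation also counts.
SENTENCE = re.compile(r'\S.*?(?:[.!?](?:\x00\d+\x00)*(?=\s|$)|$)', re.DOTALL)


def map_text_to_citations(wikitext):
    """Return a list of (sentence, [ref markup, ...]) pairs (heuristic only)."""
    refs = []

    def stash(match):
        # Swap each ref for a placeholder so punctuation inside the ref
        # cannot confuse the sentence splitter.
        refs.append(match.group(0))
        return "\x00%d\x00" % (len(refs) - 1)

    marked = REF_PATTERN.sub(stash, wikitext)

    pairs = []
    for match in SENTENCE.finditer(marked):
        chunk = match.group(0)
        cited = [refs[int(i)] for i in PLACEHOLDER.findall(chunk)]
        text = PLACEHOLDER.sub("", chunk).strip()
        if text:
            pairs.append((text, cited))
    return pairs


sample = (
    'Paris is the capital of France.<ref name="a" /> '
    'It lies on the Seine,<ref>Atlas, p. 3.</ref> a major river.'
    '<ref name="b">Guide</ref> Uncited claim.'
)
for sentence, citations in map_text_to_citations(sample):
    print(sentence, "->", citations)
```

Sentences that come back with an empty citation list are candidates for a "citation needed" style check, but the mapping is only as good as the placement of the ref tags in the source text.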
Media
- Commons search isn't sufficient (semantic search would be ideal, as would search by resolution or dimensions)
- Getting licensing strings automatically is hard, and especially hard in multiple languages. A lot of manual, heuristic cleaning is needed.
- No easy way to identify images by "quality"
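
The licensing and resolution points above can be sketched as a small client-side workaround. This is an assumption-laden example, not an official recipe: it assumes one imageinfo record per file, shaped like the output of a MediaWiki API query with prop=imageinfo and iiprop=size|extmetadata, where extmetadata entries such as LicenseShortName, LicenseUrl and Artist each carry a "value" that may contain HTML. The helper names clean_value, summarize_image and meets_min_size are hypothetical, and the sample record is made up for illustration.

```python
import html
import re

TAG = re.compile(r'<[^>]+>')


def clean_value(raw):
    """Strip HTML tags, decode entities, and collapse whitespace."""
    text = html.unescape(TAG.sub("", raw))
    return " ".join(text.split())


def summarize_image(info):
    """Pull size and cleaned license fields out of one imageinfo record."""
    meta = info.get("extmetadata", {})

    def field(key):
        entry = meta.get(key)
        if not entry or "value" not in entry:
            return None  # field missing for this file
        return clean_value(str(entry["value"]))

    return {
        "width": info.get("width"),
        "height": info.get("height"),
        "license": field("LicenseShortName"),
        "license_url": field("LicenseUrl"),
        "artist": field("Artist"),
    }


def meets_min_size(info, min_width, min_height):
    """Client-side stand-in for a search-by-dimensions filter."""
    return (info.get("width", 0) >= min_width
            and info.get("height", 0) >= min_height)


# Made-up record for illustration only.
sample_info = {
    "width": 4000,
    "height": 3000,
    "extmetadata": {
        "LicenseShortName": {"value": "CC BY-SA 4.0"},
        "Artist": {
            "value": '<a href="//commons.wikimedia.org/wiki/User:Example">'
                     'Example &amp; Co</a>'
        },
    },
}
print(summarize_image(sample_info))
print(meets_min_size(sample_info, 1920, 1080))
```

Stripping tags this way loses links and nested attribution details, which is part of why the cleaning described above tends to stay manual and heuristic.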
Toolforge
- Toolforge resources are constrained (e.g. disk quota for builds)
- Toolforge limits our ability to use custom Docker images
- Toolforge lacks any metrics/logging infrastructure