Grants:IdeaLab/Countering systemic bias through Wikidata authority control
Project idea
Wikidata allows specifying a topic's IDs in external databases. I'll use the Virtual International Authority File (VIAF) as an example, but Wikidata has many such IDs, and similar principles apply to all. For example, Alexander Graham Bell (wikidata:Q34286) is 59263727 in the VIAF database. Thus, it is easy to see whether a given person or entity in Wikidata is linked to an external entry. If it is not, it might not be in the other database, or someone may just not have linked it yet.
However, the idea here is to go in the opposite direction: to find entries missing from Wikidata, Wikipedia, or both. For example, if we have a list of VIAF identifiers, comparing it against a Wikidata dump or query result can determine which are in VIAF but not in Wikidata. These can be considered candidates for article (and Wikidata entity) creation, provided the topics also meet Wikipedia inclusion criteria and it has been confirmed that no article exists yet.
Even if there is a Wikidata entry, its list of linked articles can show which Wikipedias lack the topic.
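The comparison described above can be sketched as a plain set difference. This is only an illustration under stated assumptions: the function names, the in-memory inputs, and every VIAF ID except Alexander Graham Bell's are made up for the example, and a real run would load the linked IDs from a Wikidata dump or query (VIAF IDs are stored under property P214) instead of a hard-coded dictionary.

```python
from typing import Iterable, List, Set


def missing_from_wikidata(viaf_ids: Iterable[str],
                          linked_viaf_ids: Iterable[str]) -> List[str]:
    """Return VIAF IDs that no Wikidata item links to (sorted for stable output)."""
    return sorted(set(viaf_ids) - set(linked_viaf_ids))


def wikis_lacking_topic(sitelinks: Set[str],
                        wikis_of_interest: Iterable[str]) -> List[str]:
    """Return the wikis of interest that have no article linked from the item."""
    return [wiki for wiki in wikis_of_interest if wiki not in sitelinks]


# Hypothetical input list; only the first ID is real (Alexander Graham Bell).
viaf_list = ["59263727", "00000001", "00000002"]

# Assumed to come from a Wikidata dump or query: VIAF ID -> Wikidata item.
wikidata_viaf = {"59263727": "Q34286"}

# IDs present in VIAF but not linked from any Wikidata item.
candidates = missing_from_wikidata(viaf_list, wikidata_viaf.keys())

# For an item that does exist, check which Wikipedias have no article.
# The sitelink set here is illustrative, not the item's real sitelinks.
lacking = wikis_lacking_topic({"enwiki", "dewiki"}, ["enwiki", "nowiki", "dewiki"])
```

An ID in `candidates` is only a lead: as noted above, the topic may already exist in Wikidata without the link, so each candidate still needs a manual check before anything is created.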
Project goals
The goal is to counter systemic bias by seeing where Wikipedias are missing topics other databases/reference works have.
Open questions
The described idea made me think about an alternative solution. A local language can be used as a source set for extracting the most viewed articles, which are then compared to a target language: the target language is checked for the existence of the highest-ranking articles in the source language. By default, the result will list both existing and non-existing articles. By ticking a checkbox, all the existing articles can be removed. A major problem is that the title must be machine translated in a lot of cases. It is possible to use fallback mechanisms like those on Wikidata, and it is also possible to cache translations, so the need for machine translation is perhaps not that large.
Such a special page could have versions for both WikibaseRepo and WikibaseClient, and it would use the label-description structure to make it possible to list the top articles. The client version of the page should list links to pages, possibly also allow changing the label into a local name, and have some helper functionality to connect any newly created page to the correct item.
The system would then work as a continuously evolving "the N most viewed articles in language X", and by creating articles in other languages the editors would continuously diffuse those articles into other languages. — Jeblad 15:37, 2 November 2014 (UTC)
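A minimal sketch of the listing logic in the comment above, not an implementation of any actual Wikibase special page. It assumes page view counts, sitelinks, and labels are already available as plain dictionaries keyed by item ID; `top_articles`, `pick_label`, `hide_existing`, and all the sample data are hypothetical names and values chosen for illustration. The label lookup walks a language fallback chain before resorting to the bare item ID, which is the point where a cached or machine translation could be plugged in.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Row(NamedTuple):
    item_id: str
    label: str
    views: int
    exists_in_target: bool


def pick_label(labels: Dict[str, str], fallback_chain: List[str]) -> Optional[str]:
    """Return the first available label along the fallback chain, if any."""
    for lang in fallback_chain:
        if lang in labels:
            return labels[lang]
    return None


def top_articles(views: Dict[str, int],
                 sitelinks: Dict[str, Set[str]],
                 labels: Dict[str, Dict[str, str]],
                 target_wiki: str,
                 fallback_chain: List[str],
                 n: int,
                 hide_existing: bool = False) -> List[Row]:
    """List the n most viewed source items, flagging or hiding those in the target."""
    # Rank by view count in the source language, highest first, and keep the top n.
    ranked = sorted(views.items(), key=lambda kv: kv[1], reverse=True)[:n]
    rows = []
    for item_id, count in ranked:
        exists = target_wiki in sitelinks.get(item_id, set())
        if hide_existing and exists:
            continue  # mirrors ticking the checkbox to drop existing articles
        # Fall back to the item ID when no label is found in any fallback language.
        label = pick_label(labels.get(item_id, {}), fallback_chain) or item_id
        rows.append(Row(item_id, label, count, exists))
    return rows


# Made-up sample data; the item IDs and view counts are not real.
views = {"item-a": 900, "item-b": 500, "item-c": 100}
sitelinks = {"item-a": {"enwiki", "nowiki"}, "item-b": {"enwiki"}, "item-c": {"enwiki"}}
labels = {"item-a": {"nb": "Eksempel A", "en": "Example A"},
          "item-b": {"en": "Example B"},
          "item-c": {}}

all_rows = top_articles(views, sitelinks, labels, "nowiki", ["nb", "en"], n=3)
missing_only = top_articles(views, sitelinks, labels, "nowiki", ["nb", "en"],
                            n=3, hide_existing=True)
```

With `hide_existing=True`, only the items without an article on the target wiki remain, which is the list editors would work through.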
Get involved
Welcome, brainstormers! Your feedback on this idea is welcome. Please click the "discussion" link at the top of the page to start the conversation and share your thoughts.
Note, I (Superm401) am not planning to implement this idea.
Participants
Endorsements
Expand your idea
Do you want to submit your idea for funding from the Wikimedia Foundation?