WikiCite 2016/Proposals/Generation of referenced Wikidata statements with StrepHit
Proposal
Background
Data quality in Wikidata is crucial, and references to trustworthy third-party sources are one way to ensure it. Many Wikidata statements are either unsourced or sourced only to Wikimedia sister projects (typically Wikipedia, via bots). Adding references to such small units of information can be a cumbersome task for human editors.
StrepHit aims to ease this effort: it is a Natural Language Processing system that reads documents from reliable Web sources and produces referenced Wikidata statements.
Aim
- Play with the current StrepHit dataset: biographies in English;
- create and fill in a Request for Comments;
- encourage referenced data donations through the primary sources tool:
- @Daniel Mietchen, Aubrey, and Thomas: follow up on past discussions with the ContentMine and Hypothes.is people.
Demo
Install the primary sources tool gadget to check out the StrepHit dataset; instructions are at wikidata:Wikidata:Primary_sources_tool#How_to_use
Skills needed
- Basic understanding of how Wikidata works;
- communication strategies for community engagement, in order to:
- raise awareness of StrepHit's potential impact;
- attract new primary sources tool users.
Phabricator task
None yet.
See also
Participants
- Hjfocs
- Aubrey (talk) 08:06, 21 May 2016 (UTC)
- add your name here