Talk:Artificial intelligence/Bellagio 2024
Looks like a fun crowd!
This is a nice paper on source reliability I found via one of the local attendees. A composite reading list might be a nice addendum.
To what extent did this help identify groups interested in pursuing or supporting these various research directions? Thinking of the EU and individual university groups as well as foundations.
Tools for creation and curation seem to me worth more than 1/8 of AI-related research. How can we [all] do more of this? Are there perceived bottlenecks to supporting and expanding that work? –SJ talk 22:42, 23 February 2024 (UTC)
- hey Sj :) I know Chris also prepared an informal reading list prior to the event, I wonder if he'd be open to sharing it here.
- Chris and Leila did most of the work to identify and invite participants, so they'll be in a better place to follow up on specifics about how they reached out, but I'll just say that the original group was just the start; there was a whole session during the event dedicated to identifying who else might be interested or should be involved. Posting the early draft here on Meta (literally during the event) is also a way for us to give this effort more visibility and attention before anything is decided.
- On your last point, I want to emphasize that this isn't the research agenda for the Foundation, but for the larger group and research community; some of those institutions have research priorities and expertise that, while useful to the field, aren't primarily focused on our specific needs as the Wikimedia movement.
- Hope this helps! Guillaume (WMF) (talk) 14:09, 26 February 2024 (UTC)
- Thanks Guillaume :) Yes, my last comment was directed at our larger community interested in the implications of AI for the knowledge commons. –SJ talk 18:13, 26 February 2024 (UTC)
Re last point: we need to reduce entry barriers for such research. Increased transparency responsibilities for training data should help. Nemo 12:03, 2 March 2024 (UTC)
Reconciliation as a challenge to look at
Great to see this group of people together.
I also suggest looking at the challenge of reconciliation (as in, matching entities across datasets). For instance, checking which names in a dataset match people entities on Wikidata, or matching author and publisher names against VIAF. The current approach is to use an emerging web protocol / API, which requires each provider of a dataset to have the technical and financial capacity to build and maintain a web service, so the process can be maintained only by well-resourced organizations in the Minority World. As a significant example, the Wikimedia movement hasn't been able to bring together this capacity for Wikidata and Wikibase (Phabricator discussion). I think that should be seen as one sign that a fresh approach is needed.
I see this reconciliation process in practice a lot in my work (both inside and beyond Wikimedia), observing users, and I also hear input from developers who want to incorporate such reconciliation processes in their software. Both sides strongly ask for more flexibility, clarity, configurability, and ease of use. The ideal is an approach that can be deployed even by an individual without financial resources or technical capacity, that is intuitive and clear for the people doing the matching, and that developers can re-use easily and flexibly when they build software that integrates a reconciliation process.
I regularly hear suggestions that AI / LLM-based approaches have potential here, and I will be grateful to anyone who will actively dive into this. Spinster (talk) 07:31, 11 March 2024 (UTC)
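For anyone who wants something concrete to poke at, here is a rough sketch of the matching step described above, run entirely locally with no web service involved. It is not the reconciliation API mentioned in this thread and not an AI / LLM approach; it is plain string similarity from the Python standard library. The function names (`normalize`, `reconcile`), the 0.85 threshold, and the candidate identifiers `X1` to `X3` are all made up for illustration and do not correspond to real Wikidata or VIAF records.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents, and turn 'Last, First' into 'first last'."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Collapse repeated whitespace and lowercase the result.
    return " ".join(without_accents.lower().split())


def reconcile(name, candidates, threshold=0.85):
    """Return (candidate_id, label, score) tuples at or above threshold, best first.

    The threshold is an arbitrary illustrative value, not a recommended setting.
    """
    query = normalize(name)
    scored = []
    for cand_id, label in candidates.items():
        score = SequenceMatcher(None, query, normalize(label)).ratio()
        if score >= threshold:
            scored.append((cand_id, label, round(score, 3)))
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical local candidate list; the ids are placeholders, not real identifiers.
CANDIDATES = {
    "X1": "Marie Curie",
    "X2": "Pierre Curie",
    "X3": "Gabriel García Márquez",
}

for raw in ["Curie, Marie", "Garcia Marquez, Gabriel", "Unknown Person"]:
    print(raw, "->", reconcile(raw, CANDIDATES))
```

A sketch like this only covers spelling and name-order variation. It does nothing for disambiguation between people with the same name, which is where the human review step and any smarter matching would still be needed.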