Wikimedia+Libraries International Convention 2025/Programme/hallucination-llm-wikibase
- Kat Thornton
- Kenneth Seals-Nutt
Time: 11:45-12:00
Room: Library Lobby
Abstract:
Who was Emmy Noether? What are some of Grace Hopper's contributions to computing? Write me a three-paragraph biography of Barbara McClintock. As more people turn to systems powered by generative artificial intelligence to find information, will they find accurate answers to these questions? We used a software pipeline built on large language models (LLMs) and LangChain to check biographical texts about scientists for factual accuracy against structured data from the Wikidata knowledge base.
Asking an LLM such as ChatGPT to compose a biography takes less effort than researching and writing one. If a use case depends on accurate information, the facts supplied by the LLM must be reviewed because of the possibility of hallucination. Researchers have found that LLMs provide incorrect answers that fall into multiple categories, including responses that contain claims with no sources and claims that conflict with reference sources or web sources. To investigate how accurately an LLM can generate factually accurate biographies, we designed a system that checks the facts in LLM-generated biographies using data from Wikidata.
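To make the idea of checking claims against Wikidata concrete, here is a minimal Python sketch. It is not the pipeline described in this abstract. It assumes that claims have already been extracted from a biography as (property ID, value) pairs, and it uses a small hand-written dictionary as a stand-in for statements retrieved from Wikidata. The function name `check_claims`, the claim format, and the three verdict labels are hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical claim format: (Wikidata property ID, value as a plain string).
Claim = Tuple[str, str]


def check_claims(
    claims: List[Claim], reference: Dict[str, Set[str]]
) -> Dict[Claim, str]:
    """Label each claim as supported, contradicted, or unverifiable."""
    verdicts: Dict[Claim, str] = {}
    for prop, value in claims:
        known = reference.get(prop)
        if known is None:
            # No reference statement for this property: nothing to compare.
            verdicts[(prop, value)] = "unverifiable"
        elif value.casefold() in {v.casefold() for v in known}:
            verdicts[(prop, value)] = "supported"
        else:
            # Reference statements exist, but none match the claimed value.
            verdicts[(prop, value)] = "contradicted"
    return verdicts


# Illustrative stand-in for statements about Emmy Noether; a real system
# would retrieve these from Wikidata rather than hard-code them.
reference = {
    "P569": {"1882-03-23"},     # date of birth
    "P19": {"Erlangen"},        # place of birth
    "P106": {"mathematician"},  # occupation
}

# Claims as they might be extracted from an LLM-generated biography.
claims = [
    ("P569", "1882-03-23"),
    ("P19", "Göttingen"),
    ("P166", "Fields Medal"),   # award received; absent from the reference
]

verdicts = check_claims(claims, reference)
for claim, verdict in verdicts.items():
    print(claim, verdict)
```

The three labels loosely mirror the categories above: claims that match a reference statement, claims that conflict with one, and claims for which no reference statement is available. Because a knowledge base can be incomplete, a "contradicted" verdict in a sketch like this only means that the value was not found among the reference statements.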
The subset of Wikidata we explored is the set of people included in our web application, sciencestories.io. The project includes professionals from STEM fields who are from underrepresented backgrounds and offers multimedia biographies for people to explore and share. An overview of the project is available here: https://www.youtube.com/watch?v=_xMjPB0b0IQ.