Library Card platform/Search
The Wikipedia Library is adding a search tool to the Library Card platform to allow users to search across the library's collections from one place. This project page summarises this work and will provide updates as it progresses.
October update: The search feature is now available to all users! Please try it out and let us know what you find on the talk page.
Background
Users of The Wikipedia Library have access to collections from more than 60 publishers. Available content totals more than 100,000 periodicals, comprising countless individual articles that editors may wish to access, in addition to books, data, and other sources.
In its current form, the Library directs users to each publisher's website individually, where they must use that website's own search and discovery tools. This presents a number of challenges. Users ideally need to know which publishers hold the content they want before they start searching, and they must learn a new interface for every publisher they visit. Advanced searches and filters, such as date ranges, have to be re-entered on each website. In effect, we require users to have a high level of research literacy to identify publishers with relevant content, and they may then spend a long time searching before finding the right information. This leads to frustration and confusion.
We want to provide editors with an easy way to search across all of their available collections from a single location, removing the need to visit individual websites and allowing cross-cutting filtering. We will present Library Bundle content (for which users simply need to meet an automatically verified activity threshold) as the default results. We will also index free-to-read content and provide links to open access versions where possible.
Building a cross-publisher search platform is well out of scope for our team, and is a problem that other organisations are already solving. Major search products are already being used by libraries around the world, including Primo, WorldCat, and EBSCO Discovery Service. These products provide fully fledged search platforms and index collections from publishers, keeping them up-to-date for libraries.
Previous discussions
Users have raised issues with the current workflows a number of times. Some relevant discussions and quotes:
- "sometimes I can guess which collections are likely to have what I'm looking for. But quite often I have to step through many of them one by one in the hope of finding what I'm after. What would be very nice is to be able to search all the Library Card collections from a single point, like a meta-search facility."
- https://en.wikipedia.org/wiki/Wikipedia_talk:The_Wikipedia_Library/A%E2%80%93Z
- https://meta.wikimedia.org/wiki/Talk:The_Wikipedia_Library#'Search_partners'
- https://de.wikipedia.org/wiki/Benutzer_Diskussion:Martin_Rulsch_(WMDE)#Wikipedia_Library_Nachklapp
User stories
- As a Library Card user, I want to search for content from all my collections in one place so I can find the right sources faster
- As a Library Card user, I want to browse content from each collection I have access to in the same place so that I don’t need to learn and use multiple interfaces
- As an experienced researcher, I want flexible filtering and advanced search options so that I can find the most suitable content
- As a novice researcher, I want guidance on how to use the interface so that I can research effectively
- As a Library Card user, I want to see open access links so that I can add free-to-read links to Wikipedia articles
- As a Wikipedia editor, I want to browse content available through The Wikipedia Library so that I can identify collections to apply for
EBSCO Discovery Service
We have a hosted instance of EBSCO Discovery Service (EDS), a library search platform, for this project. The platform was chosen for three primary reasons: our ongoing good relationship with EBSCO, a high level of customisability, and an interface that has been translated into a substantial number of languages (~30 at the time of writing).
EDS can be configured to index the content The Wikipedia Library has access to through its partnerships. EBSCO keeps these databases up to date, so we only need to flag the collections we want to index and their contents will be updated automatically. EDS also has a wide range of configurable settings for the interface presented to users. Most importantly, we can add our own JavaScript and CSS to the interface to customise the user experience.
EBSCO also makes available a range of EBSCO Apps that have been developed to support specific workflows. So far we have installed the following apps:
Designs
EBSCO Discovery Service comes with an out-of-the-box design which we would like to further customise. Many interface elements may not be needed, or could be confusing, and we want to ensure the design is consistent with the Library Card platform.
Design iterations will be posted here as we work on them.
Implementation
Technical integration details are tracked at phab:T240128 and its subtasks.
In the default view presented to users, we will only index content from the Library Bundle. This covers more than 60% of our content across ~25 collections, and we're looking to expand it further over time. While we would ideally index all of our content, we feel that this would lead to a confusing user experience in which users can't easily tell which results they do or don't have access to. Access methods would also differ between results: some content would be reached directly through authentication-based access, and other content through a publisher-specific method.
Users will have an option to browse all TWL content indexed in EDS, but individual results will not, at least in the initial deployment, indicate whether that content is accessible. This feature would be technically complex, so we will evaluate demand for it post-deployment.
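The difference between the default view and the browse-all option can be illustrated with a minimal sketch. This is not how EDS is configured or implemented; the `Record` type, the `in_bundle` flag, the `visible_results` function, and the sample records are all hypothetical names introduced only to show the scoping rule described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A single indexed search result (hypothetical shape, for illustration)."""
    title: str
    collection: str
    in_bundle: bool  # assumption: True if the collection is in the Library Bundle


def visible_results(records, browse_all=False):
    """Return the records shown to the user for the chosen view.

    Default view: Library Bundle content only.
    Browse-all view: every indexed record, with no access indicator.
    """
    if browse_all:
        return list(records)
    return [r for r in records if r.in_bundle]


# Made-up sample data, not real collections.
records = [
    Record("Article A", "Collection 1", in_bundle=True),
    Record("Article B", "Collection 2", in_bundle=False),
    Record("Article C", "Collection 1", in_bundle=True),
]

default_titles = [r.title for r in visible_results(records)]
all_titles = [r.title for r in visible_results(records, browse_all=True)]
```

In this sketch the browse-all view returns records unchanged, which mirrors the point that results there do not initially show whether the user can access them.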