Wikidata For Wikimedia Projects/Projects/cache createSchemaElement
Improve article loading time by caching SkinAfterBottomScriptsHandler::createSchemaElement
Background
Every Wikipedia article page contains a JSON-LD block in its HTML source. JSON-LD structures the linked data in article pages so that search engines and third-party consumers can use it more easily.
Currently, when building the JSON-LD schema, two database queries are executed:
- TermLookup::getDescription: Fetches the description of the associated entity (possibly from Wikidata).
- RevisionLookup::getFirstRevision: Retrieves the first revision of the article.
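To illustrate where these two lookups feed into the page, here is a minimal Python sketch of assembling a JSON-LD block. The field names `headline` and `datePublished` follow schema.org conventions and are assumptions for illustration; `build_schema` and `render_script_tag` are hypothetical helpers, not the Wikibase implementation.

```python
import json


def build_schema(title, url, description, first_revision_timestamp):
    """Assemble a schema.org-style JSON-LD dict (illustrative field choice)."""
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "name": title,
        "url": url,
    }
    # Value that would come from the description lookup (assumed mapping).
    if description is not None:
        schema["headline"] = description
    # Value that would come from the first-revision lookup (assumed mapping).
    if first_revision_timestamp is not None:
        schema["datePublished"] = first_revision_timestamp
    return schema


def render_script_tag(schema):
    """Serialize the schema into a JSON-LD script element."""
    # Escape "</" so the JSON payload cannot close the script element early.
    payload = json.dumps(schema).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"
```

Both optional fields are omitted when their lookup returns nothing, so a page without a linked entity would still produce a valid block in this sketch.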
The Problem
These function calls contribute significantly to the request time: building the JSON-LD schema takes ~5% of the total request time when loading an article page.
The Solution
- Caching the results of TermLookup::getDescription and RevisionLookup::getFirstRevision will improve article loading time by avoiding the re-execution of these queries on every request.
Function 1: Cache getFirstRevision value for Linked Data Schema
The Wikibase hook handler SkinAfterBottomScriptsHandler::createSchema runs an expensive function, getFirstRevision, every time a wiki article is loaded, accounting for approximately 2.5% of the total load time of every page view.
It determines the first (oldest) revision of the article by retrieving all revisions, sorting them by date and ID, and selecting the oldest.
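The selection described above can be sketched in a few lines of Python. The `Revision` type and its fields are hypothetical stand-ins, and the sketch assumes timestamps are ISO 8601 strings in UTC so that string order matches chronological order. Taking the minimum by (timestamp, ID) gives the same result as sorting by date and ID and picking the first element.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    rev_id: int
    timestamp: str  # ISO 8601 in UTC, so lexical order equals time order


def first_revision(revisions):
    """Return the oldest revision, breaking timestamp ties by lowest ID."""
    if not revisions:
        return None
    # Equivalent to sorting by (timestamp, rev_id) and taking element 0.
    return min(revisions, key=lambda r: (r.timestamp, r.rev_id))
```

Even in this small form the work grows with the number of revisions, which is why repeating it on every page view is costly for long-lived articles.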
Caching this revision value means the expensive function is called less often, which reduces server load. The cached value will be stored in the parser cache output.
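As a rough illustration of the idea, the following Python sketch stores the computed value alongside a cached parse result and reuses it on later requests. `FakeParserOutput`, the key name, and `first_revision_timestamp` are all hypothetical; the real patch may attach and read the value differently.

```python
class FakeParserOutput:
    """Stand-in for a cached parse result that carries extra key/value data."""

    def __init__(self):
        self._extension_data = {}

    def set_extension_data(self, key, value):
        self._extension_data[key] = value

    def get_extension_data(self, key):
        return self._extension_data.get(key)


# Hypothetical key name, chosen only for this example.
FIRST_REV_KEY = "first-revision-timestamp"


def first_revision_timestamp(parser_output, expensive_lookup):
    """Return the cached timestamp, running the lookup only on a cache miss."""
    cached = parser_output.get_extension_data(FIRST_REV_KEY)
    if cached is not None:
        return cached
    value = expensive_lookup()
    if value is not None:
        parser_output.set_extension_data(FIRST_REV_KEY, value)
    return value
```

Because the value travels with the cached parse result in this sketch, it is recomputed only when the page itself is re-parsed.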
Development
Development on this task started in December 2024.
Deployment
The patch was added to MediaWiki version MW-1.44.0-wmf.12 and was deployed to all wiki groups on January 16, 2025.
Function 2: Cache getDescription value for Linked Data Schema
The function TermLookup::getDescription() is invoked whenever a wiki page is read. It queries and returns the Wikibase entity (Wikidata item) ID, the description, and the language code. These values seldom change, so requesting this information every time a wiki page is loaded is often unnecessary.
Caching this information will significantly reduce how often the function is invoked, and with it the load on the Wikimedia servers.
The new criteria for cache invalidation will be:
- An edit is made to the client page (the page/article from which the function is invoked)
- An edit is made to the Wikidata item description, in the user language or the fallback language (English)
- Once every 30 days if neither of the above happens.
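The three criteria above can be expressed as a single staleness check. The Python sketch below is only an illustration: `CachedDescription`, its fields, and the `description_version` marker (assumed to change whenever the relevant description is edited) are hypothetical and not taken from the patch.

```python
from dataclasses import dataclass

THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60


@dataclass(frozen=True)
class CachedDescription:
    entity_id: str            # Wikibase entity (Wikidata item) ID
    description: str
    language_code: str
    page_revision_id: int     # client page revision when the entry was stored
    description_version: int  # hypothetical marker that changes on description edits
    stored_at: float          # Unix time in seconds


def is_stale(entry, current_page_revision_id, current_description_version, now):
    """Apply the three invalidation criteria in order."""
    # 1. The client page has been edited since the entry was stored.
    if entry.page_revision_id != current_page_revision_id:
        return True
    # 2. The description (user or fallback language) has been edited.
    if entry.description_version != current_description_version:
        return True
    # 3. Neither happened, but the entry is at least 30 days old.
    return now - entry.stored_at >= THIRTY_DAYS_SECONDS
```

The time-based rule acts as a safety net in this sketch: even if an edit notification were missed, a stale description would be refreshed within 30 days.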
Development
Development on this task started in December 2024.
Deployment
This patch is currently in code review.