Wikidata For Wikimedia Projects/Projects/cache createSchemaElement

Tracked in Phabricator:
Task T352019

Improve article loading time by caching SkinAfterBottomScriptsHandler::createSchemaElement

Background

Every Wikipedia article page contains a JSON-LD block in its HTML source. JSON-LD structures the linked data in article pages so that search engines and third parties can consume it more easily.
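As a rough illustration of what such a block looks like, the sketch below assembles a small schema.org description and wraps it in a script tag. The function names, the choice of fields, and the mapping of the first revision date to datePublished and of the item description to headline are assumptions made for this example, not the actual Wikibase implementation.

```python
import json


def build_schema(title, url, date_published, headline):
    """Assemble a minimal schema.org Article description as a dict.

    date_published and headline are optional; they stand in for the
    values that would come from the two lookups discussed on this page.
    """
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "name": title,
        "url": url,
    }
    if date_published is not None:
        schema["datePublished"] = date_published
    if headline is not None:
        schema["headline"] = headline
    return schema


def render_script_tag(schema):
    """Serialize the schema as a JSON-LD script element."""
    # Escape "</" so the JSON payload cannot close the script element early.
    payload = json.dumps(schema).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"
```

The two optional fields are the ones that would require a lookup, which is why they are the natural candidates for caching.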
Currently, building the JSON-LD schema executes two database queries: one through getFirstRevision and one through TermLookup::getDescription().

The Problem

These two calls contribute significantly to request time: building the JSON-LD schema takes about 5% of the total request time when loading an article page.

The Solution


Function 1: Cache getFirstRevision value for Linked Data Schema

The Wikibase hook handler SkinAfterBottomScriptsHandler::createSchemaElement calls the expensive function getFirstRevision every time a wiki article is loaded, which accounts for approximately 2.5% of the total load time of every page read.

The function determines the first (oldest) revision of the article by retrieving all revisions, sorting them by date and ID, and selecting the oldest.
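The selection described above can be sketched in a few lines. This is only an in-memory illustration: the Revision class and the first_revision function are hypothetical names, and the real lookup runs as a database query rather than over a Python list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    """Hypothetical stand-in for a stored page revision."""
    rev_id: int
    timestamp: datetime


def first_revision(revisions):
    """Return the oldest revision, breaking timestamp ties by lowest ID."""
    if not revisions:
        return None
    # Sort by (date, ID) and take the first entry, mirroring the description.
    ordered = sorted(revisions, key=lambda r: (r.timestamp, r.rev_id))
    return ordered[0]
```

Sorting every revision just to keep one of them is what makes the step costly for pages with long histories, and the result only changes in rare cases, which makes it a good fit for caching.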

Caching this revision value means the expensive function is called less often, which reduces server load. The cached value will be stored with the parser cache output.
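A minimal sketch of the idea, assuming the cached page output can carry a small dictionary of extra data next to the rendered HTML. CachedPageOutput, FIRST_REVISION_KEY, and first_revision_timestamp are hypothetical names for this example and do not correspond to MediaWiki's actual classes or methods.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical key under which the value is stored in the cached output.
FIRST_REVISION_KEY = "first-revision-timestamp"


@dataclass
class CachedPageOutput:
    """Hypothetical cached render of a page with room for extra data."""
    html: str
    extra: Dict[str, str] = field(default_factory=dict)


def first_revision_timestamp(
    output: CachedPageOutput,
    load_first_revision_timestamp: Callable[[], str],
) -> str:
    """Return the cached timestamp, running the expensive lookup only on a miss."""
    cached = output.extra.get(FIRST_REVISION_KEY)
    if cached is not None:
        return cached
    value = load_first_revision_timestamp()
    # Store the value with the page output so later reads can skip the lookup.
    output.extra[FIRST_REVISION_KEY] = value
    return value
```

Because the value travels with the cached output, it is refreshed whenever that output is regenerated and needs no separate expiry logic in this sketch.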

Development

Development on this task started December 2024.

Deployment

The patch was added to MediaWiki version MW-1.44.0-wmf.12 and was deployed to all Wiki groups on January 16, 2025.


Function 2: Cache getDescription value for Linked Data Schema

The function TermLookup::getDescription() is invoked whenever a wiki page is read. It queries and returns the Wikibase entity (Wikidata item) ID, the description, and the language code. These values rarely change, so requesting them every time a wiki page is loaded is usually unnecessary.

Caching this information will significantly reduce how often the function is invoked, and with it the load on the Wikimedia servers.

The cache will be invalidated when any of the following occurs:

  • An edit is made to the client page (the page or article from which the function is invoked)
  • An edit is made to the Wikidata item description in the user language or the fallback language (English)
  • 30 days pass without either of the above happening
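The three criteria above can be expressed as a single staleness check. The sketch below is only an illustration: CachedDescription, is_stale, and the revision markers used to detect edits are hypothetical, and the actual patch may detect changes differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Maximum age of a cached entry, taken from the 30-day criterion.
MAX_AGE = timedelta(days=30)


@dataclass(frozen=True)
class CachedDescription:
    """Hypothetical cache entry for one client page."""
    item_id: str          # Wikidata item ID
    language: str         # language code of the description
    text: str             # the description itself
    cached_at: datetime   # when the entry was stored
    page_rev_id: int      # client page revision at caching time
    description_rev: int  # hypothetical marker for the description's version


def is_stale(entry, current_page_rev_id, current_description_rev, now):
    """Return True if any of the three invalidation criteria applies."""
    if entry.page_rev_id != current_page_rev_id:
        return True  # the client page was edited
    if entry.description_rev != current_description_rev:
        return True  # the item description was edited
    return now - entry.cached_at >= MAX_AGE  # the entry is 30 days old or more
```

A caller would consult the expensive lookup only when is_stale returns True and replace the entry with a fresh one afterwards.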

Development

Development on this task started December 2024.

Deployment

This patch is currently in code review.