API Policy Update 2024

Below is a new draft text of WMF legal policy discussing the use of its APIs. Feedback is welcome on the talk page. This new document would be published at "API usage guidelines" on the Foundation wiki, unless community comments suggest the text should be appended to an existing page (for example, the User-Agent policy page) or published on a different wiki.

This page is available for community comment for at least two weeks, and for longer if new comments, questions, and suggestions are still arriving.

Context

In 2023, following a broad and detailed public consultation and drafting process, the Wikimedia Foundation updated the Terms of Use (ToU).

The section that refers to the API now states:

12. API Terms:
We make available a set of APIs, which include documentation and associated tools, to enable users to build products that promote free knowledge. By using our APIs, you agree to abide by all applicable policies governing the use of the APIs, which include but are not limited to the User-Agent Policy, the Robot Policy, and the API:Etiquette (collectively, "API Documentation"), which are incorporated into these Terms of Use by reference.

Although the APIs have always been governed by the ToU, in that update, the Wikimedia Foundation sought to clarify (specifically to malicious actors) that the APIs are part of the Terms of Use and that the Foundation has the ability to enforce rules that are part of a fair and reliable API ecosystem.

To continue that effort, the Foundation is updating the API documentation to describe how the Foundation administers the APIs.

This is not a change in the requirements to use the API. This is a statement of existing practices in greater detail, with information that describes how the Foundation differentiates between permissible good-faith use of the APIs and inappropriate, harmful use of the APIs. This update does not change the technical ways that developers interact with the APIs. Just like the API-related language in the Terms of Use, this update is meant to more fully describe existing rules that are enforceable against malicious actors.

Version 1.0

Date: August 26, 2024

The Wikimedia Foundation enforces limits on operators’ use of certain APIs, including but not limited to the MediaWiki Action API, the MediaWiki REST API, and the RESTBase API. Some of these limits are described below. The limits in this policy exist to maintain the performance and stability of our APIs, to promote the fair allocation of server resources, and to ensure that community members can use the APIs to further the free knowledge movement. You can read a FAQ about this policy below.

In this policy, an “operator” is defined as any person who deploys software that causes our APIs to be called. In other words, the operator controls how often the APIs will be called. For example, this includes people who write on-wiki “gadgets” (even if they do not run them), and people who run bots (even if they did not write them). If you are reading this and looking for useful tips on how to use Wikimedia APIs, then this is probably you. If limits are imposed on an operator's use, the operator may not circumvent them. For example, operators are required to follow all instructions that they receive in an API response to delay or reduce the rate of further requests. The specific numerical limits on any endpoint may change from time to time (for example, as current and predicted future load changes).

When using Wikimedia APIs, an operator must:

  1. Follow the User-Agent policy and otherwise correctly label user agents;
  2. Follow any rate-limiting requests (e.g., throttling notifications) they receive; and
  3. Follow the requirements of the content licenses when republishing downloaded or cached data.
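
As an illustration of the first two requirements, the following is a minimal Python sketch, not an official client. The function names `build_user_agent` and `retry_after_seconds` are hypothetical, the User-Agent layout (name/version, contact details in parentheses, then the library) is an assumption that should be checked against the User-Agent policy itself, and the use of the standard HTTP `Retry-After` header as the throttling signal is likewise an assumption about how a given endpoint reports limits.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def build_user_agent(tool, version, contact, library=None):
    """Return a descriptive User-Agent string that includes contact details."""
    ua = f"{tool}/{version} ({contact})"
    if library:
        ua += f" {library}"
    return ua


def retry_after_seconds(headers, now=None):
    """Return how long to wait according to a Retry-After header, or None.

    `headers` is assumed to be a mapping keyed by the exact header name.
    Retry-After may hold either a number of seconds or an HTTP date.
    """
    value = headers.get("Retry-After")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable value: report that no usable instruction was found.
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means there is nothing left to wait for.
    return max(0.0, (when - now).total_seconds())
```

A client would send the string from `build_user_agent` with every request and sleep for the number of seconds returned by `retry_after_seconds` before sending anything further.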

When using Wikimedia APIs, an operator must not:

  1. Send traffic via concurrent connections to Wikimedia APIs resulting in a degradation of service to others or endangering the stability of the site;
  2. Request data at a high rate, far beyond common use cases, such as in spikes or in a manner intentionally meant to circumvent this policy;
  3. Spread Wikimedia API requests over multiple user agents to hide excessive use by a single operator; or
  4. Send high traffic originating from a single source or targeting a specific wiki/resource that ends up blocking others from using or accessing that resource.
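
One way to stay clear of the patterns listed above is to send requests one at a time with a minimum gap between them. The sketch below is only an illustration under stated assumptions: the class name `SerialPacer` is hypothetical, and the interval is a placeholder chosen by the operator, since this policy does not publish a specific number and the actual limits may change.

```python
import time


class SerialPacer:
    """Space out requests so they run one at a time at a bounded rate."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `pacer.wait()` immediately before each request, from a single loop rather than from parallel workers, keeps traffic steady instead of arriving in spikes.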

Operators should use our APIs within the guidelines described in this policy and other technical documentation for each API. For the avoidance of doubt, the existence of this policy does not require members of the Wikimedia community to get prior permission from the Wikimedia Foundation before using the APIs in a manner consistent with this policy. Rather, we want people to be aware of uses that could result in disruption of their API usage, so operators know how to use Wikimedia's shared resources properly.

If your use case might fall outside the bounds of the policy described here and you would like to receive an exception or clarification, please submit a request to legal@wikimedia.org.

In situations where a limit may affect an operator’s use, the Foundation may contact the operator to discuss the nature of the limits and any exceptions that may be needed. This is only possible if the operator’s scripts adhere to the User-Agent policy and include up-to-date contact information.

The Foundation reserves the right to enforce this policy through blocking API access, disabling a program, or any similar action. Any choice to take or not take an enforcement action in a given situation will not be a waiver of any future action under this policy. In situations when this policy is enforced, any action taken can be lifted at the Foundation's discretion if the requesting party takes action to reduce the harm or unfairness caused. For example:

  • Reducing the rate of the API requests being sent;
  • Implementing exponential backoff, where the operator's client responds to a throttling notification by automatically slowing its rate of requests, waiting progressively longer before each retry; or
  • Following User-Agent naming conventions, as required in the User-Agent policy, such that you can be contacted if usage becomes problematic.
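
The exponential backoff mentioned in the list above can be sketched as follows. This is a minimal illustration rather than a prescribed implementation: `fetch` is a hypothetical callable returning `(status, headers, body)`, the assumption that throttling is signalled by HTTP status 429 or 503 with an optional `Retry-After` value in seconds may not hold for every Wikimedia endpoint, and the base delay, cap, and jitter amount are arbitrary example values.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(fetch, max_attempts=5, sleep=time.sleep, jitter=random.random):
    """Call fetch() until it stops reporting throttling or attempts run out.

    `fetch` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        if status not in (429, 503):  # assumed throttling signals
            return status, body
        if attempt == max_attempts - 1:
            break
        delay = backoff_delay(attempt)
        # A server-provided Retry-After (in seconds) wins when it is longer.
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = max(delay, float(retry_after))
        # Add up to 10% random jitter so many clients do not retry in lockstep.
        sleep(delay + delay * 0.1 * jitter())
    raise RuntimeError("still throttled after %d attempts" % max_attempts)
```

With the defaults, successive waits grow as roughly 1, 2, 4, and 8 seconds, so a throttled client quickly backs away instead of retrying at its original rate.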

Sub-licensing

Operators (or those acting on their behalf) may not sublicense, lease, assign, or guarantee the availability or functionality of a Wikimedia Foundation-managed API to any third party. It is not permissible to implement an API client that white labels in a manner that obscures the identity of the ultimate service provider of the APIs (the Wikimedia Foundation). For the avoidance of doubt, this term does nothing to limit the use and republication of Wikimedia content in accordance with the free licenses that content is licensed under.

Retiring APIs

The Foundation may retire or modify APIs. Operators that use APIs beyond the announced end-of-service date should expect the API to become unavailable without further warning or to experience significant degradation in performance. Operators are expected to update to appropriate alternatives in advance of the end-of-service date. The Foundation may provide notice regarding updates and deprecations of APIs to the contact information that is provided per the User-Agent policy.

Modifications to this policy

This policy is a public summary of some of the current limitations that the Wikimedia Foundation imposes on operators regarding their use of the Wikimedia APIs. As such, the Wikimedia Foundation may modify the policy at its discretion to more fully describe current limits or reflect future changes.

What is happening to Wikimedia's API policies?

The Wikimedia Foundation is updating the language of our API policies as part of our ongoing effort towards clearer API management. For example, the page with policies around user agents in headers has not had a substantial update since 2010. The Foundation strives to be transparent about how the APIs have been managed for the previous 10 years, so the Foundation created this policy page that conveys our existing management policies. This updated language should not be viewed as a change in the way the Foundation administers the APIs. Instead, we hope that it clarifies some points about how the API is already managed. Clearer language will also be helpful in communicating with bad actors who are already intentionally breaking the rules.

Will this update affect how I develop using the APIs?

For the vast majority of users, nothing about these clarifications should change how you use the APIs. This is a more detailed statement of existing practices. Therefore, if you have not encountered previous issues developing using the APIs, you are unlikely to encounter new ones. This updated language does not limit the functionality of the Wikimedia Foundation’s APIs. Extremely high-resource users may want to take note of the new language because it may allow them to better understand the Foundation’s resource management process and avoid errors, limits, and blocks.

What will the significant changes be?

Since these updates reflect existing expectations (found previously in places like the User-Agent policy and other documentation), the Wikimedia Foundation hopes the new language emphasizes a handful of points that are already true. For example, developers using the APIs should self-identify their uses with specifically described user agents. Extremely high-volume users should understand that there are safety, security, and resource implications to their use, which make labeling user agents with contact information even more important. Extremely high-volume users should voluntarily shut off broken uses when those uses no longer serve any purpose. Extremely high-volume users should refrain from operating multiple agents in parallel with the intent of circumventing existing limits.

How will this affect "power" community users of Wikimedia APIs?

The updated language should have zero impact on members of the Wikimedia community who follow the policy. Wikimedians almost never fall into the category of extremely high-volume users who end up encountering technical limits. Optionally, anyone wishing to seek assurances about their specific usage may submit a request to legal@wikimedia.org.

If I am a heavy user of Wikimedia APIs, can my access be limited?

Yes. We have measures in place to ensure the stability and security of our systems. We are now working to standardize how we describe these limits to help extremely high-volume users understand how to avoid disrupting the APIs with high-frequency, resource-intensive requests.

As a concrete example, our User-Agent rules do this. If there is a peak in the usage of an API by a single user and we are able to identify who that user is (because they have followed the User-Agent rules and included their contact information), we may be able to work out a solution for the resource-intensive use together. If a developer does not follow those rules, there is no way to troubleshoot the problem together.

Will volunteer tools and bots be affected?

We know that many volunteer-developed tools and bots rely on Wikimedia APIs for various tasks. If those uses have not caused concern previously, they are unlikely to do so in the future. This update is only to clarify policy. Optionally, anyone wishing to seek assurances about their specific usage may submit a request to legal@wikimedia.org.

If this language just restates existing policies, why is it being updated?

There are several reasons, all of which help improve the quality of service for those who already follow best practices.

First, being as explicit as possible about what the rules are helps extremely high-resource-using developers avoid violating those rules. Extra information provides these users with increased predictability. Hopefully, if good-faith operators are more aware of the existing rules, this could give the Foundation’s API administrators less work.

Second, this is an attempt to emphasize helpful policies that are not currently being followed universally. For example, foregrounding the rule that extremely high-resource users need to follow the existing User-Agent policy avoids a situation where we may unintentionally disrupt good-faith users when attempting to react to malicious use.

Finally, a restatement of rules attempts to clarify what is and isn’t appropriate for developers working in bad faith.

How will these changes affect researchers?

They will not affect researchers who follow the policy. We understand the importance of our APIs for researchers, developers, and small organizations. It seems unlikely that *future* good-faith research-related use would be limited, because we are unaware of any *current* good-faith research-related uses that are limited.

Will this update affect “big tech” API users?

Possibly yes, if they do not follow the policy. High-volume users who cannot or choose not to comply with the existing rules described in this update may find a solution in the Enterprise API, which has been created to support use cases that are high-volume and/or require an SLA.

Why the emphasis on increased specificity in the updated policy?

By making our policies more specific, we aim to provide users with a clear understanding of what they should and should not do to avoid technical restrictions on their API usage. The goal is to ensure smoother, more efficient interactions for all API users.