Wikilegal/Review of Legal Standards for Sensitive Personal Data


Introduction


The Internet has seen a steady increase in complaints about the handling of sensitive information, including on informational sources such as Wikipedia.[1][2][3] The size and range of online platforms, while good for sharing information, have also increased the size and range of harms from sensitive information. This has become an even greater issue in an age of heightened division and political polarization, where reference to sensitive characteristics can create significant danger for some people depending on their location and personal circumstances. These harms include misinformation, hate speech, false representation in media, infringement upon personal privacy, and automated algorithmic bias based on sensitive information.

Many new laws have been passed to counter the potential harms of the proliferation of sensitive personal data. The two most impactful have been the General Data Protection Regulation (GDPR) and the Digital Services Act. These laws may inform work done on Wikipedia, particularly when dealing with sensitive data. Courts are now showing greater willingness to order the removal of personal information from Wikipedia articles. While the Foundation legal team, as of this article's writing, still evaluates and responds to requests to remove such information on a case-by-case basis, we have prepared this research article detailing how the law around privacy and sensitive personal data is evolving.

We hope that this information will help Wikipedia communities to treat sensitive personal information with appropriate care while pursuing the sharing of free knowledge. We suggest that care towards sensitive personal information may include ensuring a consistent standard for sources in biographies. In addition, communities may want to review the notability standard for sensitive personal information to consider whether the public value of sensitive personal information outweighs the right of the subject to keep it private or at least obscure.

Sensitive Data


As of this writing in mid-2024, the definition of sensitive data is still in flux, but academic and regulatory bodies generally agree that it refers to information whose release to the general public could result in a loss of well-being or security, and which must therefore be kept confidential to some extent.[4]

The EU General Data Protection Regulation (GDPR) presents one of the most comprehensive definitions of sensitive data through the "special categories" in Art. 9(1).[5] These categories are types of data that cannot normally be used without a special justification, such as explicit consent, publication by the data subject themselves, or a clear public interest in the information. The special categories are:

1. "data revealing racial or ethnic origin";
2. data revealing "political opinions";
3. data revealing "religious or philosophical beliefs";
4. data revealing "trade union membership";
5. "genetic data";
6. "biometric data for the purpose of uniquely identifying a natural person";
7. "data concerning health";
8. "data concerning a natural person's sex life or sexual orientation".[5]

Article 9 of the GDPR is one of the most prominent examples of the "Brussels effect", a phenomenon in which EU legislation is considered a baseline for future legal discussion in other jurisdictions. The GDPR has influenced legislation in non-EU jurisdictions, including multiple US states such as California,[6] the United Kingdom,[7] and MENA countries including the UAE and Bahrain.[8]

Furthermore, other jurisdictions have started to reference principles articulated in the GDPR as deserving of particular care. A key example is the right to be forgotten, which was used to justify an Argentinian trial court's ruling that any natural person has the right to enforce deletion of false information published online.[9] Even though this decision was later reversed on appeal, subsequent cases have allowed celebrities to curate their own representation and content online in Argentina under the same reasoning.[9] The South Korean Communications Commission also enforces the right of data subjects to remove content pertaining to their person.[10]

We focus on the GDPR's articulation of special categories of data because academics generally treat it as a default definition of sensitive personal data.[11] The Foundation legal team believes that it can function as a useful guide without applying directly to the subject matter of Wikipedia articles, although a recent decision by the Italian Data Protection Authority took the opposite view.[12] As of this writing, the Foundation is appealing that case.

Digital Services Act

The 2022 EU Digital Services Act provides additional guidance for online platforms such as Wikipedia, requiring them to address risks of "societal impact" against vulnerable groups of people; the obligations for handling these risks are proportional to the size of the platform.[13] This EU legislation aims to protect vulnerable minorities against systemic risks that online platforms may pose, which often include a lack of privacy for vulnerable groups. The debate over the tradeoff between societal welfare and individual risk includes discussions of the exposure of sensitive information that some individuals may wish to keep confidential.

As a case in point, the recent contention over whether to include the former names ("deadnames") of transgender individuals on the French edition of Wikipedia led to a debate over whether publishing deadnames constitutes hate speech.[3] The debate centered on whether the topic should be discussed, how to discuss it, and how to develop criteria for when and where a transgender person's deadname could be included in their Wikipedia biography. The discussion caused enough concern that it led to legal complaints to the Wikimedia Foundation, and in response, the Foundation legal team offered to provide new legal education on the use of sensitive personal information, including gender identity, on Wikipedia pages.[14] This is part of what led us to produce this article.

Case Studies: Removal of Subject's Information Directly by Subject


As the Foundation recently had two cases (one in Portugal and one in France) in which the courts found removal was necessary, we offer these cases as a reference for the way the law is evolving. The Portuguese case involves César do Paço (also known as Caesar DePaço), who sued Wikipedia over mentions of his links to the Portuguese political right and information about accusations of illegal conduct from many years ago. The case illustrates how courts now weigh the public interest in information against a person's right to keep it private. The Foundation continues to appeal the decision as of this writing because we believe the editors in question struck the right balance by including the information, but we have not succeeded in persuading a Portuguese appellate judge to agree.

In the French case, we were sued over a name change after the article subject attempted to remove the information directly. The case, which concerned the François Billot de Lochner article (deleted as of this writing), involved a discussion of his attempted change of name for reputational reasons. The court notably ruled that the article was too incomplete to meet minimum standards for publication. The Foundation legal team has also provided details about this case and another French removal (not related to sensitive data) in a post in French.

What these cases both show is that courts have become willing to rule against Wikipedia based on their determination that information is too sensitive for the subject relative to its public value, even where the information was reasonably well-sourced and met normal standards of verifiability. We think the community should be aware of these examples as a consideration for policies like notability and verifiability across languages.

Conclusions


While the handling of sensitive personal data presents complex concerns, a standard of care that considers both the privacy of BLP subjects and society at large (particularly including marginalized demographics) can help inform what to include on the Wikimedia projects going forward. We suggest weighing the benefit to the Wikipedia reading public against the possible harm to the article subject (including whether the harm might affect many people of a marginalized demographic). Unfortunately, based on what we have seen of courts so far, the discussion of whether to include information in Wikipedia cannot end with the observation that it is available in other reliable sources: courts do not always care that the information was published elsewhere first, and may rule against Wikipedia even if the article subject has not pursued the newspaper, magazine, or similar publication that originally carried the information.

The standards for handling sensitive personal data could be aided by the formation of new or updated Wikipedia policies and guidelines that potentially raise the threshold for notability and establish a minimum level of credibility for cited sources above that used for normal articles. The editing styles and guidelines established through this ongoing process may have important implications for how members of marginalized and minority communities are addressed in public sources of information, and in general media and communications.

References

  1. Sharma, Shweta, and Hill, Michael (26 April 2024). The biggest data breach fines, penalties, and settlements so far. CSO Online. [1]
  2. FTC v. Kochava, Inc., Federal Trade Commission. [2]
  3. Tual, Morgane (12 March 2024). Wikipedia’s French-speaking community is torn apart over ‘deadnaming’ trans people. Le Monde (English). [3]
  4. Sensitive information, National Institute of Standards and Technology. [4]
  5. REGULATION (EU) 2016/679 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL (General Data Protection Regulation) (article 9). [5]
  6. 11 July 2023. Comparing U.S. State Data Privacy Laws vs. the EU’s GDPR. Bloomberg Law. [6]
  7. Data Protection Act 2018. UK Government. [7]
  8. 25 July 2023. Privacy in the Middle East: A Practical Approach. [8]
  9. Carter, Edward L. (2013). Argentina’s Right to be Forgotten. Emory International Law Review, Vol. 27, Iss. 1. [9]
  10. 2 May 2016. South Korea Releases Right to Be Forgotten Guidance. Bloomberg Law. [10]
  11. Bradford, Anu (December 2019). The Brussels Effect, in The Brussels Effect: How the European Union Rules the World. Oxford University Press. [11]
  12. However, a recent decision by the "Italian Data Protection Authority" articulates that the GDPR does apply to Wikipedia’s processing activities, specifically with regard to journalism-related information. The case centered on a request to delete information from a biographical Wikipedia article.
  13. REGULATION (EU) 2022/2065 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL (Digital Services Act) (article 76). [12]
  14. Rogers, Jacob (7 March 2024). Note from the Wikimedia Foundation Legal Department, in Discussion Wikipédia:Sondage/Mention du nom de naissance pour les personnes trans. [13]