Grants:Project/ContentMine/Diversitech
Project idea
What is the problem you're trying to solve?
"Wikipedia is a key knowledge resource for most Internet users, including those seeking information on gender and LGBT issues, public figures and events. Failure to represent this knowledge appropriately can steer people in the wrong direction, create ambiguity and cause distress. The existing Wikimedia resources require enrichment and improvement through the use of high-quality sources to ensure that they are well-informed, multilingual and up-to-date. However, locating appropriate sources is time-consuming for editors in addition to the challenges of navigating LGBT-related terminology through time and across linguistic and cultural boundaries. Those reliable sources that are found typically have a systematic bias towards the English language and global North."
Gender and LGBT issues are now part of public debate and policy in almost all countries. The Internet and Wikipedia are seen as a key resource for information about these topics and it is therefore critical that it should be well informed, up-to-date and accessible to people from different cultures speaking different languages. However, there are many challenges to representing this type of knowledge.
One challenge is incorrect use of terminology. Terminology around gender and LGBT-related concepts is so varied, ambiguous and dynamic that people often use terms which are incorrect, outdated or irrelevant to a particular geographic region or culture. This has implications: using an inappropriate term, even with good intentions, can steer knowledge seekers in the wrong direction, create ambiguity and cause distress. Inappropriate vocabulary can also be used to denigrate legitimate communities or render them invisible.
Lack of multilingual content is another concern. We have identified that pages across different language Wikipedias are lacking for many LGBT-related topics and there has been little attempt to map terminology between different languages or cluster it meaningfully. Undertaking this process requires the use of high quality sources, sensitivity and close engagement with a multi-language, diverse community who are active as researchers and editors.
A broader issue is the variable quality of LGBT articles and their reference sources. Many articles on English Wikipedia have been assessed using quality metrics and subsequently flagged for improvement by WikiProject LGBT, with 1,848 requiring substantial improvement and 12,728 classified as start or stub articles on English Wikipedia. WikiProject LGBT have collected a list of links to external sites where editors can find resources to improve articles or to scrutinise existing referencing. However, this process still requires substantial effort by those editors to locate suitable texts and overcome existing systematic biases in the literature that preference knowledge in English and from the global North. The aforementioned characteristics of LGBT-related terminology such as ambiguity and dynamism mean that individual keywords are a poor choice to explore relevant sources and articles: topic classification and broader techniques using the underlying relationships between the terms are more likely to be helpful to editors, practitioners and researchers who wish to explore and connect source documents and relevant Wikipedia content or items from other Wikimedia projects such as Wikidata.
These problems require a solution combining technology and community.
What is your solution?
“A portal to access a community-curated multilingual corpus of high quality articles on LGBT issues that encourages existing and new Wikipedia editors to create and improve articles by offering rapid referencing suggestions and prompts for related content starting from the Wikipedia articles and reference sources in which they are interested. Use of Wikidata to index the corpus and terminology will provide i) semantic links between terms in multiple languages to assist in building content across different language Wikipedias; ii) topic classification for exploration and iii) raw material for academic and practitioner-led studies on LGBT terminology and online representation in Wikimedia projects and across the wider internet.”
We have assembled a remarkable team from a global network of interest groups, organizations, research groups and communities committed to improving LGBT knowledge representation on the internet. Many have a public face and large community outreach, helping increase the visibility of Wikimedia efforts to address major problems of knowledge diversity and representation. Our partners and advisors are themselves diverse in geography, culture and language. In particular, we are pleased to have groups working in and on LGBT-related topics in the Global South and to bring everyone together for a collective project while retaining individuality.
Connecting terminology and increasing multi-language representation of LGBT topics in Wikipedia
There is significant energy within the Wikimedia community and related groups to improve LGBT-related content through upholding two pillars of Wikimedia content policy: attribution and neutrality. This requires close attention to terminology in the LGBT and gender fields as laid out in our problem statement. Our solution is to compile a multilingual corpus of papers, and adopt a text and data mining (TDM) tool that extracts from the texts the words and phrases found in community-compiled lists (dictionaries), in line with earlier ContentMine projects. The interest groups we have convened will then identify:
- topic classifications that can identify relevant sources more reliably than individual keywords (that may be loaded or ambiguous)
- cruxes (problematic terms) in texts
- proximity of terms in texts that can reveal more about their meaning or changing use
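The dictionary matching and term-proximity ideas in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ContentMine pipeline: the function names, the sample dictionary, the sample sentence and the window size are all hypothetical, and only single-word terms are handled.

```python
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def find_terms(tokens, dictionary):
    """Return (position, term) pairs for tokens that appear in the dictionary."""
    return [(i, tok) for i, tok in enumerate(tokens) if tok in dictionary]

def proximity_counts(hits, window=5):
    """Count pairs of distinct dictionary terms found within `window` tokens of each other."""
    counts = Counter()
    for (i, a), (j, b) in combinations(hits, 2):
        if a != b and abs(i - j) <= window:
            counts[tuple(sorted((a, b)))] += 1
    return counts

# Hypothetical single-word dictionary and text, for illustration only.
dictionary = {"gender", "identity", "transgender"}
text = "Gender identity is discussed alongside transgender rights and gender law."

tokens = tokenize(text)
hits = find_terms(tokens, dictionary)
pairs = proximity_counts(hits, window=5)
```

A real dictionary would also need multi-word phrases and language-specific tokenisation, which this sketch deliberately leaves out.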
We will achieve this by setting up a Wikibase platform, where the community of partner organisations can discuss terminological issues and contested aspects of knowledge. The platform will support language-specific statements and a "reliability" property for sources. The combination of these features and the community-led process will help maintain neutrality in an area where for many topics there are unlikely to be single sources of “proof” to attribute. It will also avoid confirmation bias in searching for terms and ensure that they are interpreted in context and correctly linked to their equivalents in different languages, assisting with finding gaps in multi-language Wikipedias which editors can be mobilised to address through existing communities.
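As a rough picture of what language-specific statements and a "reliability" property could look like as data, here is a small Python sketch. The class names, field names and reliability values are assumptions made for illustration; they are not the Wikibase data model or the project's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    """A concept with at most one label per language code (e.g. 'en', 'es')."""
    concept_id: str
    labels: dict = field(default_factory=dict)

    def missing_languages(self, wanted):
        """Return the wanted language codes that have no label yet."""
        return [lang for lang in wanted if lang not in self.labels]

@dataclass
class Source:
    """A corpus paper carrying a community-assigned reliability tag."""
    title: str
    language: str
    # Hypothetical values: "unreviewed", "reliable", "contested".
    reliability: str = "unreviewed"

# Hypothetical record, for illustration only.
term = Term("T1", {"en": "same-sex marriage",
                   "es": "matrimonio entre personas del mismo sexo"})
gaps = term.missing_languages(["en", "es", "pt"])
paper = Source(title="Example paper", language="es")
```

Listing the languages that lack a label is one simple way to surface the multi-language gaps that editors could then be mobilised to address.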
Leveraging Wikidata to underpin terminology and provide a portal to analyse data from within the corpus and across Wikimedia projects
Wikidata is now underpinning Wikipedia and is evolving as a go-to search for scientific (and social science) subjects. It is built on the semantic web and uses modern tools which support multilingualism, multiple meanings and multiple viewpoints making it a valuable tool to address some of the problems we identified. For example, a reader interested in a term can find the Wikidata entry and from there the "equivalent" page in another language. This is typically not a direct translation, but a separate page written by committed Wikipedia editors with specific knowledge about a particular region and/or culture, linked by the Wikidata terms. Wikidata can also be queried and could allow interesting questions to be asked once entries are linked e.g. how does the use of terminology in the LGBT reference corpus change over time? Are Wikipedia articles reflecting recent trends in use of terminology? These features will attract researchers in LGBT studies and other areas who may not already be Wikimedians and have valuable contributions to make to improving content.
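A question such as "how does coverage of a topic change over time?" could be approximated by a SPARQL query against Wikidata. The Python sketch below only builds the query text. It assumes the Wikidata Query Service prefixes `wd:` and `wdt:`, and uses the Wikidata properties P921 (main subject) and P577 (publication date); the function name and the item ID in the example are placeholders, not choices made by the project.

```python
import re

def build_yearly_count_query(subject_qid):
    """Build a SPARQL query counting items per publication year for one main subject."""
    if not re.fullmatch(r"Q[1-9]\d*", subject_qid):
        raise ValueError(f"not a Wikidata item ID: {subject_qid!r}")
    return f"""
SELECT ?year (COUNT(?paper) AS ?papers) WHERE {{
  ?paper wdt:P921 wd:{subject_qid} ;   # P921 = main subject
         wdt:P577 ?date .              # P577 = publication date
  BIND(YEAR(?date) AS ?year)
}}
GROUP BY ?year
ORDER BY ?year
""".strip()

# Arbitrary placeholder ID, used only to show the shape of the query.
query = build_yearly_count_query("Q12345")
```

Counting papers per year is only a proxy for terminology change. Tracking the terms themselves would need the mined corpus data described in this proposal.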
Making it quicker and easier for editors to match high quality sources to Wikipedia articles and for researchers and experts to find opportunities to contribute to Wikipedia
Wikidata is currently sparse for LGBT-related terms and can be overwhelming for many users, so our solution includes a dashboard front end to the Wikibase installation that should allow easy access to the sources, data and metadata.
Starting from a Wikipedia article of interest, editors will see sources that are relevant but not currently cited and could be checked for new and improved content, while interested readers of a source text will see where it is cited in Wikipedia, where it is not cited but appears to be relevant and where the text contains significant (highly relevant and repeated) terms that currently do not have a Wikipedia article at all. We aim to prompt more people to create and improve content while a relevant source is at their fingertips.
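The suggestions described above reduce to simple set comparisons once citations and term counts are known. The following Python sketch shows the idea; the function names, identifiers, counts and the repetition threshold are hypothetical, and how relevance is judged is outside its scope.

```python
def uncited_relevant_sources(cited, relevant):
    """Sources judged relevant to an article but not yet cited in it."""
    return sorted(set(relevant) - set(cited))

def terms_without_articles(term_counts, article_titles, min_count=3):
    """Terms repeated at least `min_count` times in a source that have no article yet."""
    known = {title.lower() for title in article_titles}
    return sorted(
        term for term, count in term_counts.items()
        if count >= min_count and term.lower() not in known
    )

# Hypothetical identifiers and counts, for illustration only.
cited = ["paper-A", "paper-B"]
relevant = ["paper-B", "paper-C", "paper-D"]
suggestions = uncited_relevant_sources(cited, relevant)

term_counts = {"term one": 5, "term two": 1, "term three": 4}
article_titles = ["Term one"]
gaps = terms_without_articles(term_counts, article_titles)
```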
The portal will also supply a front end for querying Wikidata and visualising the results simply, for example as clusters of Wikipedia content and/or sources that contain particular terms or fall under the same topics. This would be a useful function for researchers and other interested groups exploring these topics, and demonstrates the power of connecting different Wikimedia projects.
To summarise: our solution combines community and technology to address several challenges identified by WikiProject LGBT and other interested groups in ensuring that LGBT knowledge representation in Wikipedia is well-informed, multicultural and up-to-date. We are solving a number of these problems through providing high quality reference sources that are integrated into the Wikimedia ecosystem and exploration tools that cater directly to the needs of existing editors while also recruiting specialists and interested individuals through our partner organisations as new users and potentially editors of Wikimedia knowledge resources.
Project goals
Our project aims to help improve the quality and quantity of LGBT articles within Wikipedia, and also to promote a better understanding of its content. It will undertake the curation of a collection of multilingual papers, and also carry out research (GVSU and McGill research groups) with and on them. The field requires nuance, which the project will express as precisely as it can, and cultural awareness where it can play a role as a forum. We aim to make this area of the Social Sciences not only more digital, but more accessible. In concrete terms, it will produce results of interest both to Wikipedian editors and activists, while preserving the neutrality that defines Wikipedia's approach to reference material.
Specific Goal Description | Wikimedia Project benefit | Wikimedia community benefit |
---|---|---|
Improve LGBT-related Wikipedia articles by supplying good quality references to Wikipedia editors in the topic area. | Wikipedias | LGBT WikiProjects |
Improve LGBT-related Wikidata information, by the creation of a cache of new, relevant scholarly article items, for uploads to the DT platform, identified by means of a census of references in Wikipedia articles. | Wikidata | Wikidata:WikiProject LGBT |
Develop a more diverse community of editors, volunteers and international organizations, partnering in the work around this project. | Wikipedias, Wikidata | LGBT WikiProjects |
Project impact
How will you know if you have met your goals?
Specific Goal Description | Measurement criteria | Actions taken |
---|---|---|
Improve LGBT-related Wikipedia articles by supplying good quality references to Wikipedia editors in the topic area. | 10% improvement in the quality of LGBT articles, Citations, Edits and sentences with citations. | Tracking via pages at en:Wikipedia:WikiProject_LGBT_studies/Assessment#Statistics, es:Wikiproyecto:LGBT#Wikiestrella and pt:Wikipédia:Projetos/Estudos_LGBT/Avaliação#Matriz resumo de avaliações actuais |
Improve LGBT-related Wikidata information, by the creation of new scholarly article items, for uploads to the DT platform and found by means of a census of references in Wikipedia articles. | 10,000 scholarly article items with main subjects created on Wikidata | Wikidata analytics |
Develop a more diverse community of editors, volunteers and international organizations, partnering in the work around this project. | 100 new editors registered | WikiProject signups, and identification of editors in the topic area with WikiProject help. |
- Continuing impact
- The project's outputs will remain valuable to, and in use by, the supporting organizations after the project ends.
- The platform will have content reusable as RDF, and will be a hub for its area.
- Establishing a viable process for identifying relevant sources that go beyond English will open the way to greater multilingualism there.
- The software tools will be open source, and reusable for other projects.
- The chosen corpus, in several languages and reviewed for quality, will be kept online.
- The diverse, international community built around the project will take on its own direction.
- New articles may be incorporated into DiversiTech in the future, allowing new and relevant knowledge to be included.
Goals around participation or content?
Metrics | Numeric target | Tools & documentation |
---|---|---|
Total participants | Editors: 100 (measured by account creation); Workshop attendees: 100 (measured by attendance list); Meetup attendees: 50 (measured by attendance list); Newsletter circulation: 1,000 individuals; Webinar attendees: 100 (measured by attendance) | Number of accounts; Attendance list; Attendance list; Mailing list; Attendance list |
Number of newly registered users | New Wikimedians: 100 (measured by account creation) | Number of accounts |
Number of content pages created or improved across all Wikimedia projects | New pages: 100; Improved pages: 700 | Number of pages improved or referenced |
Project plan
Activities
Our project scope is to make available useful information extracted from a corpus of 50K multilingual scholarly papers on LGBT topics, making their usefulness as references for Wikipedia more transparent. For that we will apply standard text-mining techniques with output into a Wikibase site, and develop a front end as a dashboard, using SPARQL to serve up information from both the DiversiTech site and Wikidata.
Our project workflow starts with the selection of the corpus, followed by the development of the TDM technology and the dictionaries required for processing this large corpus. We will then create an "editor-friendly" platform to help editors work on a) main subjects, b) fact extraction and c) terminology, followed by a quality filter and the automatic upload of information into Wikidata. The output of the platform will be made available through a dashboard (also fed by Wikidata) in which interest groups can find, filter and visualise the results of this project. Additionally, this information will be used by two research groups, from Grand Valley State University (GVSU) and McGill University, to continue exploring changes in related terminology through time, this time on a bigger scale. This project will help these groups to extend their research and make it available to all.
Following standard project and software management good practices, we have divided the work into eight work packages, as shown in the table below with each work package, duration (Gantt Chart), objectives and outputs.
WP code | Work package | Objectives | Outputs |
---|---|---|---|
WP1 | Corpus selection | Selection of 50k+ papers related to LGBT topics to help improve current articles towards FA status. | Annotated corpus on Wikibase site |
WP2 | TDM tool and dictionaries | Develop software that can apply TDM techniques to a high volume of papers, with custom dictionaries compiled for the project. | Tool made available under open license, language-specific dictionaries as files made available with an account of the compilation process. |
WP3 | DiversiTech platform | Develop the DiversiTech UI so that editors can easily add value to the site. | "Editor-friendly" online data and a dump of the content at the project's end. |
WP4 | Quality filtering | Criteria for the reliability of references in the area developed and applied by the DT community to tag papers. | Sub-corpus of papers tagged as reliable made available in machine-readable form, e.g. as a Wikidata focus list. |
WP5 | Wikidata uploading | Develop an automatic tool for uploading of items about suitable corpus papers on Wikidata, and populating them with metadata. | Bot code made available as open source. |
WP6 | Dashboard | UI for organizations without wiki knowledge to reuse the information curated by editors on the platform. | Front end available to other Wikibase sites, underlying SPARQL queries made available, e.g. on a Wikidata page. |
WP7 | Comms and dissemination | Communicate about the project internally and externally, disseminate our outputs with the wider community, engage with new users and experienced editors. | Archive of newsletters online. |
WP8 | Project management | Ensure that the action runs smoothly, that there is excellent communication among all the project participants, volunteers and community, and that action outputs are delivered on time to deliver a high-quality output | Project Plan, Progress report, Risk register, Financial management, Final report. |
- Project Gantt chart (by month 1 to 12)
Work Package | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
WP1 Corpus | X | X | X | X | X | X | ||||||
WP2 TDM tool and dictionaries | X | X | X | X | ||||||||
WP3 DTbase platform | X | X | X | X | X | X | X | |||||
WP4 Quality filtering | X | X | X | X | X | X | ||||||
WP5 Wikidata uploading | X | X | X | X | X | X | X | X | ||||
WP6 Dashboard | X | X | X | X | X | X | ||||||
WP7 Comms and dissemination | X | X | X | X | X | X | X | X | X | X | X | X |
WP8 Project management | X | X | X | X | X | X | X | X | X | X | X | X |
Time expenditure
Below we present a table with the work packages each project member will be spending time on. In addition, we include the list of project advisors and the main work packages in which they may provide guidance. Finally, we list the organisations that will help us to disseminate the project to new and diverse communities.
Tech development / Comms plan / PM | Advisors (volunteers) | Organizations (volunteers) |
---|---|---|
ContentMine: WP1 to WP8; WMUK: WP7; Lane Rasberry: WP1, WP3, WP7; GVSU/McGill: WP2, WP6 | Elisabeth Jay Friedman (Latam studies expert, U. of San Francisco): WP1, WP6; Anasuya Sengupta (Whose Knowledge?): WP6, WP7; Myra Abdallah (Arab Foundation for Freedoms and Equality, AFE): WP6, WP7; Jason Moore (Wikimedia LGBT+ User Group): WP1, WP3, WP6, WP7; Gonzalo Velasquez (Movilh Chile): WP6, WP7; Tania León (Fundacion Arcoiris, Mexico): WP6, WP7 | Movilh (Chile): WP6; Fundacion Arcoiris (Mexico): WP6; AFE (Middle East): WP6; GVSU (US): WP2, WP6; McGill (Canada): WP2, WP6 |
Project budget
Work Package / Task | USD |
---|---|
WP1: Corpus selection | $2,850 |
WP2: TDM tool and dictionaries | $11,500 |
WP3: DiversiTech platform | $9,500 |
WP4: Quality filtering | $6,450 |
WP5: Wikidata uploading | $2,850 |
WP6: Dashboard | $9,000 |
WP7: Comms, dissemination & Engagement | $16,000 |
WP8: Project management | $11,000 |
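For readers who want the overall figure, the budget lines above can be totalled with a short Python check. The sketch assumes the table lists every budget line; the variable names are illustrative only.

```python
# Amounts in USD, copied from the budget table above.
budget_usd = {
    "WP1": 2850,
    "WP2": 11500,
    "WP3": 9500,
    "WP4": 6450,
    "WP5": 2850,
    "WP6": 9000,
    "WP7": 16000,
    "WP8": 11000,
}

total = sum(budget_usd.values())  # 69150
# Share of the total per work package, as a percentage rounded to one decimal.
shares = {wp: round(100 * amount / total, 1) for wp, amount in budget_usd.items()}
largest = max(budget_usd, key=budget_usd.get)  # "WP7"
```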
Achievements at the end of the project
An overall summary of what is intended:
- Helped to improve the quality of 700 articles in Wikipedia.
- Added at least 20K statements to Wikidata.
- Created a platform that the Wikimedia community can use to improve Wikipedia content, to explore that content, and to expand their knowledge of the topics that interest them.
- Formed institutional links between non-Wikimedia organizations related to the project's work and the Wikimedia world.
- Created a dashboard using the output of this project, powered also by Wikidata to be used by these (and any other) organizations to extract valuable, reusable information about the topic.
- Promoted and imported research in languages other than English.
- Created a diverse community of organizations and volunteers around this project.
- Filled gaps in knowledge about under-represented LGBT communities.
- Enabled academic research tied to the project, carried out by GVSU and McGill.
Communications
Communication and dissemination plan
Our communication and dissemination activities will be supported by Wikimedia UK, Movilh, AFE, Fundación Arcoiris, Whose Knowledge? and other organizations we are currently collaborating with on this project. Our communication plan is divided into three main activities, as follows:
- Online and social media presence. We will use monthly newsletters, place articles in relevant publications or via blogs, and make intensive use of Twitter. We will drive traffic to one of three landing pages: the project wiki main page, for participation; a Wikipedia project page, for case studies on LGBT referencing; and a mentoring page on Wikiversity, for support. All pages will contain information on the problem context and be cross-linked for easy navigation.
- In-person presentations. These will introduce the project workflows in easy steps to lower the barriers to entry for those who would like to participate. We will provide meetups and networking opportunities, editathons, webinars, and organise hands-on workshops.
- External media and communication. For audiences outside Wikimedia, the goal will be to explain the context of the project, such as the intractability of reviewing the whole open-access LGBT literature, and how important that review is to ensuring that LGBT pages and statements on Wikimedia sites are well referenced in all languages. The main opportunities are via mailing lists and attendance at relevant conferences.
Our key objectives are:
- Embed our diversity-tech tools into the workflow of Wikimedian editors and other targeted users, so that usage continues sustainably after the project's end.
- Reach a wider audience outside Wikimedia communities and bring new editors in to join the project.
- Disseminate our project's benefits to the broader Wikimedia community and the public.
Engagement plan
We have identified, contacted and engaged with: international organizations dedicated to gender studies and equality from three different regions in the Global South; an experienced team of advisors with a wide network in the field; and two university research groups. They will help us to deliver and promote our project to their audiences. This project was developed in consultation with members of WikiProject LGBT studies on English Wikipedia.
We will also design a social media campaign, centred on Twitter to reach more than 100,000 people. It will build on the above, especially our advisors and organizations supporting the project. For the broad audience, we'll circulate videos, newsletters and documentation. We'll place op-eds in the mainstream media and related publications with a campaigning editorial line.
Based on the reach of past interviews given by our team members, 100,000 views is a conservative target.
Our engagement activities will consider:
Activity | Description | Timeframe |
---|---|---|
Existing and new editors engagement | Use our network to engage new editors from their community as well as existing LGBTwiki editors | M1-M6 |
Meetup sessions | Organise and deliver a three-monthly meetup (20 people per session + video for international audience). As part of this we will look for opportunities to coordinate with existing awareness-raising events such as LGBT History Month (February in the UK) and also to involve local LGBT+ groups. | Every 3 months |
Newsletter for Wikimedia community | Deliver a monthly newsletter, reaching wikimedians (100 people on newsletter list) | 1 every month |
Newsletter for general community (multilingual) | Deliver a monthly newsletter, reaching a wider community (1000 people on newsletter list) | 1 every month |
Advisory board network | Deliver targeted content for our advisors and their networks (+10K) | M3,M6,M9 |
Attendance at non-Wikimedia conferences | Present our project progress at 10 conferences | 1 every month |
Project webpage | Ensure development work and results are communicated through ContentMine’s own site, wiki page and social media to interested communities at each step in the process | M1 |
Engagement with social media communities | Develop a social media outreach plan for the project and the organizations supporting it (multilingual) | M6 |
Press campaigns | Deliver and develop press articles in related journals | M6-M9 |
Workshops | Deliver 4 workshops during the project aiming at new potential editors | Every 3 months |
Webinars | Deliver a project webinar every three months to explain the project progress and increase awareness (in 2 languages) | every 3 months |
Training material | Preparation of videos to explain each stage of the project and how new editors and volunteers can contribute | every month |
Get involved
Participants
- Jenny Molloy
Jenny is a molecular biologist by training and manages ContentMine collaborations and business development. She spoke on synthetic biology at Wikipedia Science Conference 2015 and has been a long term supporter of open science. She is also a Director of Biomakespace, a non-profit community lab in Cambridge for engineering with biology.
- Lane Rasberry
User:Bluerasberry, Wikimedian-in-residence at the Data Science Institute at the University of Virginia. He coordinates projects between the university and Wikipedia, Wikidata, and other Wikimedia projects. He is also a member of the Wikimedia LGBT+ User Group.
- Jo Brook
User:Jkcm, ContentMine software development contractor with interests in LGBTQI+ culture and history, and a particular interest in gender identities. They have recently worked with researchers at GVSU, applying text and data mining and NLP techniques to track changes over time in the sentiment of terms for diverse gender identities. They are also active in local LGBTQ+ arts projects and communities.
- Peter Murray-Rust
Peter has been a Wikimedian since 2006 and delivered a keynote talk at Wikimania 2014 and Wikipedia Science Conference 2015, where CM also ran a hands-on workshop. Peter founded ContentMine as a Shuttleworth Foundation Fellow, and is the main software pipeline architect. He received his Doctor of Philosophy from the University of Oxford and has held academic positions at the University of Stirling and the University of Nottingham. His research interests have focused on the automated analysis of data in scientific communities. In addition to his ContentMine role, Peter is also Reader Emeritus in Molecular Informatics at the Unilever Centre, in the Department of Chemistry at the University of Cambridge, and Senior Research Fellow Emeritus of Churchill College in the University of Cambridge. Peter is renowned as a tireless advocate of open science and the principle that the right to read is the right to mine.
- Wikimedia UK
Wikimedia chapter based in London, Chief Executive Lucy Crompton-Reid. Their mission is "to support and advocate for the development of open knowledge, working in partnership with volunteers, the cultural and education sectors and other organisations to make knowledge available, usable and reusable online."
- Advisors (volunteer)
- Elisabeth Jay Friedman: Professor, University of San Francisco, author of Interpreting the Internet: Feminist and Queer Counterpublics in Latin America (University of California Press, 2016)
- Anasuya Sengupta, co-founder of Whose Knowledge?, Indian poet and activist, authority on representation for marginalized voices on the Internet
- Myra Abdallah, Middle-East and North Africa regional manager of Women in News program of the World Association of Newspapers and News Publishers (WAN-IFRA) and the Director of the Gender and Body rights Media Center of the Arab Foundation for Freedoms and Equality (AFE).
- Jason Moore of WikiProject LGBT studies and the Wikimedia LGBT+ User Group
- Gonzalo Velasquez (Movilh Chile)
- Tania Yasmín León Vázquez (Fundacion Arcoiris, Mexico)
- Organizations expressing interest in re-using the output of the project
- Whose Knowledge (USA), global campaign to center the knowledge of marginalized communities on the Internet
- Movilh (Chile), human rights advocacy organization with focus on civil rights and liberties for lesbian, gay, bisexual and transgender citizens
- Fundacion Arcoiris (Mexico), social organization with focus on the analysis of sexuality in the Latin American and Caribbean region.
- Arab Foundation for Freedoms and Equality (Middle East), NGO based in Beirut
- Grand Valley State University (USA)
- McGill University (Canada)
- Volunteer Go 188.70.18.109 22:07, 19 December 2018 (UTC)
Endorsements
Do you think this project should be selected for a Project Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page.)
- Support Wikidata now has a central place in the world as a reliable, trustable, neutral provider of up-to-date factual information. For me it is the first place I go to and I urge others likewise. This is a very important current subject where discourse can be constructive or divisive and where terminology evolves very rapidly. Wikidata is the best Open, shared, resource we have for solving and evolving discourse. This (challenging) project brings together many diverse parts of the world and community in a trusted transparent meeting point. Wikimedia has years of experience in supporting such communities. I am impressed by the range of organizations and individuals committed to the project. [Disclosure I am founder of ContentMine.] Petermr (talk) 17:47, 13 December 2018 (UTC)
- Support I've been in touch with Cesar about this proposal, and I think there is some solid enthusiasm, planning, and potential here, with the right parties/groups advising and supporting the project. I'd like to see this effort move forward. -Another Believer (talk) 23:04, 13 December 2018 (UTC)
- Support Please see the talk page for a description of my endorsement. This proposal would be high-reward to the Wikimedia community, as a data science precedent, for promoting LGBT+ interest in Wikimedia projects, and as a precedent for other institutional partnerships. I have committed to advise this project. Blue Rasberry (talk) 17:35, 17 December 2018 (UTC)
- Support ContentMine has a proven track record of working smartly and diligently, and this is an excellent proposal to work on some of the crucial 'gaps' that Wikimedia faces. ContentMine's partners in this proposed exercise are exactly the right people and organisations, and as such, this group is exactly the kind of inclusive, forward-thinking coalition (with a strong track record of working on Wikimedia projects to boot) that should be supported. aprabhala (talk) 09:27, 18 December 2018 (UTC)
- Support ContentMine could lead this project without problems. It should get the grant as it will help to bring more awareness about LGTB data, so it can improve the quality and quantity of LGBT related articles and items on Wikipedia and Wikidata. teleyinex (talk) 11:27, 18 December 2018 (UTC)
- Support This project combines the need for better access to scholarly research with the benefit of expanding the circle of those we welcome as part of the Wikimedia community. The ContentMine team is made up of community-minded individuals with experience in collaborating with Wikimedia. I would like to see this project supported. karienbez (talk) 11:59, 18 December 2018 (UTC)
- Support This project gives an under-represented community access to the knowledge that has been generated and the ability to make use of it. It will also provide a more global view of the different contexts to which we currently do not have access. Sergioleon174 (talk) 21:43, 18 December 2018 (UTC)
- Support I am a gay librarian (lesbrarian). I'm passionate about providing quality information resources in general, and especially resources that represent the gay community. 76.112.73.29 01:29, 19 December 2018 (UTC)
- Support Wikimedia UK is delighted to be collaborating with ContentMine on this project, which will uncover meaningful knowledge and data of value to the LGBT+ community. This work resonates strongly with the strategic direction of the global Wikimedia movement, to focus on the knowledge and communities that have been left out by structures of power and privilege; and also supports Wikimedia UK's commitment to developing technical solutions for the eradication of inequality and bias on the Wikimedia projects. LucyCrompton-Reid (WMUK) (talk) 12:50, 19 December 2018 (UTC)
- Support I think that this project could be very impactful in creating a much more aware and thus understanding community. A lot of people access Wikipedia for a basic understanding of concepts, and the way society is right now, there’s a huge need for accurate knowledge to be established. This project is a step closer to putting insightful and accurate information onto the web. Zavalame (talk) 13:19, 19 December 2018 (UTC)
- Support. A strong proposal with the potential for high impact in an area that could really benefit from significant scholarly resources. It's especially gratifying to see such a broad range of collaborators. MichaelMaggs (talk) 14:09, 19 December 2018 (UTC)
- Strong support I strongly support this proposal. It will address a significant problem using novel text mining techniques. The project involves an impressive international team. ContentMine enjoys a reputation for delivering projects on time and within budget; the budget request for this proposal is modest considering its likely impact. 86.165.240.86 14:18, 19 December 2018 (UTC)
- Strong support I am happy to add my strong support to and endorsement of this project. I have previously been involved in a similar initiative, 'Proud Heritage', and am aware of the significant technical and conceptual challenges which this project will address. It is an important initiative which will significantly extend and enhance access to knowledge about LGBT identities and history. [Disclosure: I am a Trustee of Wikimedia UK] Wolsey1473 (talk) 14:43, 19 December 2018 (UTC)
- Support Wikipedia and Wikidata have become the go-to sources of information for the majority of people. As such it is really crucial that they reflect information and reality without introducing a bias, which clearly is what happens when a community is not well represented. ContentMine has successfully enriched the medical information on Wikidata in the past through the use of text mining. They are in the best position to do the same with LGBTQ+ data and they get my full and strongest support in this proposal. [Disclosure: I have worked with ContentMine in the past]. (Juarsuff) (talk) 15:38, 19 December 2018 (UTC)
- Support As a faculty librarian at Grand Valley State University I endorse this project for its remarkable contribution to the base of knowledge of importance to the global LGBTQ+ community. Schultzmgvsu (talk) 15:59, 19 December 2018 (UTC)
- Support Happy to support this important and timely project that has the potential to increase the diversity and inclusivity of both Wikipedia and Wikidata in the area of LGBTQ+ articles, sources and terminology. Particularly encouraged to see the project proposing an approach that harnesses both community knowledge and technology solutions. LornaMCampbell (talk) 17:01, 19 December 2018 (UTC)
- Support This is a fascinating proposal. The need is real, LGBT issues have significant undercoverage on these target Wikipedias. The proposal also brings together such an interesting global mix of people and organizations - it is hard not to be impressed by this proposal. I have previously worked closely with ContentMine and know they have the tools to deliver here. Metacladistics (talk) 21:48, 19 December 2018 (UTC)
- Strong support I strongly support this project. I believe these cutting-edge tools can meet a key need in the social sciences and can contribute greatly to the field. Abannachbrown (talk) 22:20, 19 December 2018 (UTC)
- Support I am pleased to have been able to offer occasional advice and encouragement to some of this team's previous projects, and I'm equally delighted to see this proposal, which refines earlier work and focuses it on a very worthwhile topic. I have no doubt that the team is capable of delivering their proposed project, which has clear present value, and great future potential. --RexxS (talk) 14:14, 20 December 2018 (UTC)
- Support I support this innovative approach to raising awareness and understanding about the distinct identities represented in and by LGBTQ communities through sophisticated tools that will enable serious research to inform the most widely used sources of online knowledge. EJayFriedman (talk) 14:15, 20 December 2018 (UTC)
- Support Very important proposal, not least because it sees ContentMine applying its tools in more humanistic areas than hitherto. Technolalia (talk) 15:01, 20 December 2018 (UTC)
- Strong support This is an excellent proposal. The main benefits I see are: 1) the inclusion of information from both closed and open academic publications – this gets at the major issue of access that many clinicians and community members are encountering, 2) the opportunity for Wiki editors and others to indicate the topics on which they would like to receive updates, and to obtain that information in push format, 3) automation of “fact” identification, and 4) access to publications in multiple languages. For example, I note that the current categories for gender identity (https://www.wikidata.org/wiki/Q48264), transgender (https://www.wikidata.org/wiki/Q189125), non-binary (https://www.wikidata.org/wiki/Q48270) and transphobia (https://www.wikidata.org/wiki/Q59677) all currently have 0 references. But if I type in “gender-affirming” there are 67 responses – references to scientific articles (https://www.wikidata.org/w/index.php?search=&search=gender-affirming&title=Special:Search&go=Go). For these reasons I believe this project is positioned to offer meaningful, practical improvements. Burns23390 (talk) 16:29, 20 December 2018 (UTC)
- Strong support A well considered, realistic proposal which will effectively improve diversity. Josiefraser
- Support Great use of TDM technology combined with outstanding community engagement measures and an excellent project team. Pkraker (talk) 11:37, 21 December 2018 (UTC)
- Support Wikidata is a powerful tool for resource discovery and having LGBT subjects well represented there is important. The methodology from previous projects gives them a good grounding to work from. Richard Nevell (talk) 09:32, 24 December 2018 (UTC)
- Support Because this project is very important, especially for Arabic-language material. 46.20.111.249 11:02, 24 December 2018 (UTC)
- Support “I support this project because it will benefit a community that currently needs it. It will also help to better understand the terminology associated with LGBT.” 186.9.3.249 13:42, 26 December 2018 (UTC)
- Support Difficult but important topic. One thing that is not entirely clear to me is what kind of articles this proposal is referring to: scholarly literature or news outlets? Second, what role do you envision for Scholia? In particular, do you see use of the /topic/ aspect to visualize the link between articles and LGBT topics? --Egon Willighagen (talk) 08:06, 29 December 2018 (UTC)
- Support This is a complex and well thought-through proposal from a respected team, offering a much needed extension of tools for analysis and community outreach beyond a binary model for gender.--DarTar (talk) 23:16, 29 December 2018 (UTC)
- Support Houssem Abida (talk) 23:40, 29 December 2018 (UTC)
- Support I'm very happy about the planned efforts of bringing in new experts and I have faith in the ability of the team to pull it off. --Lydia Pintscher (WMDE) (talk) 19:32, 30 December 2018 (UTC)
- Support Useful to the LGBT community! 66.227.210.71 00:35, 31 December 2018 (UTC)
- Strong support John Samuel 11:54, 4 January 2019 (UTC)
- Strong support It sounds like this project could have a major impact on the world's understanding of LGBTQ issues. Its cross-cultural, multilingual aspect is particularly impressive. K Eng 23:21, 5 January 2019 (UTC)
- Strong support Addresses an important need for reference across languages, and can create impact in countries where it is difficult to talk about the topic and gather a community around it. Especially valuable in countries where the community is now being threatened more than ever. If successful, it would create a model of community + technology that can be used to address other topics. Cassandreces (talk) 09:13, 10 January 2019 (UTC)
- Support Great idea and a great team behind it. T.Shafee(Evo﹠Evo)talk 06:41, 11 January 2019 (UTC)
- Support Great initiative, so much needed. Thanks for doing this. JuanP Jotape (talk) 20:23, 16 January 2019 (UTC)