Talk:Future Audiences/July 2023
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Some Questions / Proposals
Thanks a lot for the great conversation on 5 May! Here are some questions/proposals that would have taken too much time to be asked/answered in the talk:
- I love your proposal for a plugin with the mention "from Wikipedia - Unchecked". If applicable, such mentions should be standard in GenAI answers, not only in the plugin.
- Obtaining from third parties the systematic crediting of Wikipedia, Wikidata, Commons, etc. in GenAI results and derivative works like videos, when applicable, would be extremely rewarding for the community of volunteers who give their time and skills to build resources with correct information. This could be a useful objective.
- In FA2 KR1 : Why do you use the term "global" when speaking from youth audiences, as we know that gender issue is very important for young people and the gender gap dramatic in Wikipedia editing ? Perhaps it would be better to segment the different youth audiences to avoid biases.
- Will we have one day our own Wikimedia GenAI tools or will it always be made by third-parties ?
- Is being an “infrastructure” of free knowledge enough attractive and rewarding for future contributors ? Don't we need to be simply (and there, we are unique and have no competitors), the “essential human ecosystem of free knowledge” ?
For the use on our own platforms, where, I believe, AI could be useful :
- Creating our own summaries for our Wikipedia articles to retain our audience (just look at articles like "Cairo" on Wikipedia with an introduction which is so long and precise that it's no more a summary, and "Cairo" in the Google search results with a summary which looks like being from Wikipedia but is not). This could be proposed as an option and would provide an easier entry point for a long and difficult article. (Also propose the mediaviewer embedded at the head of the page to see in a comfortable format all the illustrations of the article instead of having it to be clicked from any picture - I asked a lot of people and they didn't even know the mediaviewer already existed in the new look -).
- Summarizing our community discussions to understand at which moment we begin to split hairs and bring nothing more in endless discussions or at which point the conflict escalates and people begin to attack each other. We are known to edit collectively but do we really practice collective intelligence? It seems the shape of our current discussion spaces does not always favor constructive and creative discussions, and looks more like a space of confrontation than a brainstorming place. Looking in depth with intelligent tools at the mechanism of our rhetoric could help us to understand how we interact. Are our discussion spaces suboptimal and do they lead to interpersonal violence instead of gaining collective energy? It seems the type of space shapes in a certain way the style of the discussions. Arenas and forums often lead to pitched battles. Let's try another form and build a more creative-friendly discussion area. Collective Intelligence is our best value against AI, but, as a human community, we need a more comfortable space to interact in a constructive way.
- Using or building our own video summarizing tools. This would allow a quick look at video content that could contain valuable information. It's up to us then to have a closer look, search references and quote them without copy-paste on Wikipedia. There is more and more interesting video content that could be used to enrich texts of articles. But it takes too much time to look at all of it in real time. Video summarizing tools could allow a first and quick access to more video content.
- Creating AI generated portraits for biographies which have no images. Would machine made "hallucinations" be acceptable in some cases ? Which cases exactly ?
Waltercolor (talk) 19:49, 5 May 2023 (UTC)
- @Waltercolor: Thanks so much for attending the session and for these great questions! I'll do my best to respond:
- "Obtaining from third parties the systematic crediting of Wikipedia, Wikidata, Commons, etc... in the GenAI results and derivative works like videos" yes, absolutely. One of the huge benefits of this plugin experiment is that it will allow us to demonstrate how we want our content to be credited, and to bring that example to every AI assistant/company.
- "In FA2 KR1 : Why do you use the term "global" when speaking from youth audiences, as we know that gender issue is very important for young people and the gender gap dramatic in Wikipedia editing ? Perhaps it would be better to segment the different youth audiences to avoid biases." That's a very interesting point... I think the honest answer is that we don't yet know enough about youth audiences to have different segmentation strategies ("global" just means we don't want to only focus on US/English-speaking youth), but one thing we're discussing is doing some surveying to better understand this audience, and I'll be very curious to see if there are any notable trends within respondents' demographic groups that can help us create these strategies. (If we do a survey project like this, we'll make the results public and invite discussion!)
- "Will we have one day our own Wikimedia GenAI tools or will it always be made by third-parties?" It's quite tricky to make predictions in AI because it feels like anything anyone says confidently becomes obsolete in a week. My understanding is that the cost of building/training/maintaining new GenAI models is currently very high, but this is also changing rapidly. We already host ML/AI models on our platforms for helping communities with things like content translation and vandalism patrol, and some of my colleagues are currently collecting use-cases for GenAI (I'll take this excellent list you've put together & share with them!). While there are a lot of opportunities in this space, there are also many risks and unknowns to deploying any kind of GenAI tool (whether it's built in-house or taken from a third party) on our projects, so it's an area we'll likely be more cautious on than experiments off of our projects that can help us learn and don't risk damaging content quality or overwhelming community moderation processes.
- Thanks again for reaching out, and if you have any more questions, please let me know! MPinchuk (WMF) (talk) 18:34, 6 May 2023 (UTC)
- Thanks a lot @MPinchuk (WMF) for these precise answers. And waiting for the next steps ! Waltercolor (talk) 13:06, 11 May 2023 (UTC)
Discussion thread for first test of conversational AI KR
A more detailed description of our first experiment (building and testing a Wikipedia plugin for ChatGPT) is available here: Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/Future_Audiences#FA2.2_Conversational_AI.
I'd love to hear your thoughts! @Klara Sielicka-Baryłka (WMPL) @Tochiprecious @Nada kareem22 you all +1ed the plugin idea when we demoed it, so I'm curious to hear more on what you thought! @LuisVilla: you shared your view that while there's a lot of reasons to be cautious, we're at more risk of going too slow/not acting in the AI space and being left behind. @DerHexer – I believe you agreed with what Luis said, but also voiced some concerns about putting our content onto third-party platforms that aren't aligned with our values. Please feel free to continue the conversation here! MPinchuk (WMF) (talk) 19:02, 10 May 2023 (UTC)
- @MPinchuk (WMF): do you already have ChatGPT plugin API development access? I don't even have user access to plugins yet. Is there a repo for your demo, or a link, or maybe a screencast or something? Which of the https://platform.openai.com/docs/plugins/examples did you base it on, if any? I wrote the scripts at [1] for Claude (because Anthropic doesn't charge for API access!) and I continue to work on the 2nd of the two there, as Claude has recently vastly increased the size of its context window, which makes some things easier, but I still need to get chunking to work with overlaps because it's not at all uncommon for a source to be several megabytes. That and reading PDF files is actually much easier than trying to trim the article text so that the verification attempt focuses on the correct assertions, because citations don't always appear in the best locations, especially when more than one of them are supposed to cover the same article text. Sandizer (talk) 06:47, 17 May 2023 (UTC)
- @Sandizer: Thanks for getting in touch (and very cool to see your Claude/Anthropic work)! The repo for the very alpha version of the ChatGPT plugin is here and we're working out a plan to get it in more people's hands for testing (which is a little tricky since, as you noted, many ChatGPT users don't have plugin access yet). I'll be sure to ping you when I have updates! Is this a fine place to reach you, or should I send you in-wiki email? MPinchuk (WMF) (talk) 23:18, 17 May 2023 (UTC)
- @MPinchuk (WMF): absolutely ping me here or on my meta or enwiki talk page any time, please. That plugin code looks off to a great start. I can't wait to see what people do with it once plugin developer access becomes more widely available. Props to Nat! Sandizer (talk) 00:40, 18 May 2023 (UTC)
@MPinchuk (WMF) and NHillard-WMF: I finally got access to plugins, but I can't find the Wikipedia plugin anywhere on https://chat.openai.com/?model=gpt-4-plugins -- How can I load and use it? Sandizer (talk) 08:34, 28 May 2023 (UTC)
- @Sandizer, ack sorry, I missed this message somehow! You're in luck though, because we're finally ready for a round of community testing. Could you please send an email to futureaudiences wikimedia.org that includes the email address associated with your OpenAI account? I'll send you back some instructions on how you can access the plugin (it's not yet been published to the plugin store). Sorry for the delay, but thanks in advance for helping us test it out! MPinchuk (WMF) (talk) 20:54, 27 June 2023 (UTC)
- Sure, will do. I also subsequently got access to plugin development, but haven't tried that yet. I will do that first to make sure I understand how to debug and such. Thank you! Sandizer (talk) 22:31, 27 June 2023 (UTC)
- @MPinchuk (WMF): Email sent. I can't wait! Sandizer (talk) 01:26, 28 June 2023 (UTC)
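The "chunking with overlaps" that Sandizer mentions above, for sources too large to fit in a model's context window, can be sketched roughly as follows. This is only an illustration and not the code from the scripts referenced in the thread: the function name `chunk_with_overlap` and the character-based sizes are assumptions made for the example (a real script would more likely count tokens).

```python
def chunk_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Each chunk after the first starts `overlap` characters before the end
    of the previous one, so a statement that straddles a chunk boundary
    still appears whole in at least one chunk (if it is short enough).
    Hypothetical helper for illustration; sizes are in characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_with_overlap("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

A larger overlap lowers the chance of cutting a cited passage in half, at the cost of sending more repeated text to the model.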
Only CC BY-SA 3.0?
The plan is to change to CC BY-SA 4.0. Dušan Kreheľ (talk) 08:13, 12 May 2023 (UTC)
- @Dušan Kreheľ: This page is about some specific Product & Technology projects and isn't monitored by WMF Legal team staff. If you'd like to discuss CC licensing and Terms of Use updates, I think this is the page you're looking for: Talk:Terms_of_use MPinchuk (WMF) (talk) 18:14, 12 May 2023 (UTC)
- @MPinchuk (WMF): Sorry, but in Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/Future_Audiences#FA2:_Testing_hypotheses the text contains the string "CC BY-SA 3.0". Dušan Kreheľ (talk) 19:46, 12 May 2023 (UTC)
- Ah, good catch! Yes, you're right – that should be updated to CC 4.0 as soon as it's updated on our projects. Thank you! MPinchuk (WMF) (talk) 22:08, 14 May 2023 (UTC)
Chat-GPT plugin
I am pretty confused about why we would want to develop a Chat-GPT plugin to test "how we can remain the essential infrastructure of free knowledge in a possible future where AI transforms knowledge search".
For one, it seems like it's an endorsement of Chat-GPT (and AI more broadly). Given the unreliability of Chat-GPT, I don't see that as something we would want to entangle ourselves with hap-hazardly. If our plugin or Chat-GPT gives misinformation to the user (which is an eventual guarantee given the instability caused by the compression in training data) and attributes it to Wikipedia, it's like us giving that misinformation (even if no human editor had a hand in the process).
Wikipedia is fundamentally a human-driven project. Its unreliability comes from that fact. However, when we get something wrong, we can know why. The person who made the edit can be interrogated; the process scrutinized more broadly. That just isn't possible with a closed, proprietary, system like Chat-GPT.
If we are going to have a relationship with Chat-GPT, then the first thing I would want to come from that relationship is a tool that would prevent its undisclosed usage on our platforms. –MJL ‐Talk‐☖ 22:35, 18 May 2023 (UTC)
- @MJL: By "... in a possible future where AI transforms knowledge search," what we mean is: it's possible that more and more people will begin to use ChatGPT and other AI assistants instead of Google to search for information. We are already starting to see this happen, and it comes with the problems you pointed out (lack of reliability/transparency). People are already seeing knowledge from Wikipedia in ChatGPT responses, but without attribution, and/or mixed in with other sources in a confusing and sometimes misleading way.
- This plugin experiment allows us to address those problems head-on: by adding attribution, links to the source material, and more clearly stating to the end-user that while the content is coming from Wikipedia, ChatGPT is still interpreting it in its own way and it may not be 100% reliable. We hope a) that this is a better experience for ChatGPT users, and b) that if we can show that ChatGPT users prefer this as a knowledge experience over regular ChatGPT (without sources/links), this can influence OpenAI and other AI assistant providers to display our knowledge in this way as the default.
- It's possible that people will be skeptical of AI assistants for general knowledge search, but it's also possible that the readers that currently come to our projects via a traditional search engine (about 3/4th of all reading sessions currently) will begin using AI assistants to answer their questions instead, and if that happens and we're not prepared to work with these AI companies to deliver a reliable, transparent experience to those readers, we risk losing relevance and sustainability as a movement.
- Does that help clarify what we're trying to do? MPinchuk (WMF) (talk) 15:43, 22 May 2023 (UTC)
- @MPinchuk (WMF): Those are definitely good intentions. I just don't know how practical any of that is going to be. I mean, I don't know how possible it is going to be to develop such a plug-in with all the must-haves that have been documented. For example, one must-have is for users to be able to "[a]sk ChatGPT any... current events question to trigger the Wikipedia plugin". However, "ChatGPT has limited knowledge of events that occurred after September 2021" (per our article). To solve that problem, wouldn't you have to fix that underlying issue with ChatGPT first?
- Maybe if the plug-in was a little bit less ambitious? Like, what if the plug-in was just focused on finding the user an article which would answer their question instead of getting a "natural-language summary of the relevant Wikipedia knowledge"? Then the plug-in could quote a blurb and link that article as it stands.
- Basically, AI is being used to retrieve the article rather than write a response. You may ask "What's that thing where what you are thinking determines how people treat the thing you did" (very vague/tip of the tongue), and the AI can eventually get you to the article about mens rea.
- We're on the same side. I only have hang-ups because I've seen this technology misused to vandalize Wikipedia before. You can probably see why I might be a bit nervous here. –MJL ‐Talk‐☖ 02:22, 24 May 2023 (UTC)
- @MJL I think what we've built so far and what you're describing are basically the same thing. (Using a search API, we can retrieve any info from Wikipedia, even very recent info, based on a fuzzy natural-language query, and ChatGPT's summary is usually more or less the lede.) It may just be hard to get a sense of without seeing it in action... which you can do around the 35-minute, 30-second mark in the video here! The design is still very drafty and we need to do some work on the backend to get it to scale properly, but hopefully this gives you a clearer picture. (But happy to answer any more questions this demo raises, of course!)
- Also, I'd be curious to hear more about the AI vandalism you're seeing. Does it look like good-faith clueless newbies who think they're adding good content, or intentionally malicious users trying to sneak in bad content (or both)? MPinchuk (WMF) (talk) 22:24, 25 May 2023 (UTC)
- @MPinchuk (WMF): The short answer is, it's complicated. There is an ongoing effort to regulate these tools at Wikipedia:Large language models (which is where most of the conversation is happening). Here are a few case studies from AN/I:
- The first one runs the gamut of all sorts of abuses: people using ChatGPT to pad edit counts, spammers using it to defend their articles, LTAs getting involved with it, and even some people who are genuinely unaware of the risks involved with this tool. It takes very little effort to generate hoax articles with ChatGPT, but as the third case shows it can take a mountain of work to correct it.
- However, I find the fourth case equally as important because there was no evidence that this user even used ChatGPT, but the problem comes from our new inability to tell when a user is actually writing their own articles. –MJL ‐Talk‐☖ 21:00, 26 May 2023 (UTC)
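The retrieve-then-attribute flow discussed in this thread (look up an article from a fuzzy natural-language query, then credit and link Wikipedia) could look roughly like the sketch below. The thread does not say which search API the plugin calls or how it words its attribution, so everything here is an assumption made for illustration: the example uses the public MediaWiki Action API search module (`action=query&list=search`) on English Wikipedia, and the helper names and the "From Wikipedia" wording are hypothetical. No network request is made; the code only builds the request URL and the attribution line.

```python
from urllib.parse import quote, urlencode

# Assumption: English Wikipedia's Action API endpoint, not necessarily
# the endpoint the plugin itself uses.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_url(query: str, limit: int = 3) -> str:
    """Build a full-text search request for a natural-language query."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def format_attribution(title: str) -> str:
    """Return a credit line linking to the article (hypothetical wording)."""
    page_url = "https://en.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))
    return f"From Wikipedia: {title} ({page_url})"


if __name__ == "__main__":
    print(build_search_url("what you were thinking when you did the thing"))
    print(format_attribution("Mens rea"))
```

Keeping retrieval and attribution as separate steps means the link always points at the article as it stands, whatever summary the assistant writes around it.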
Call for taking a look at/testing the in-progress Wikipedia ChatGPT plugin
Hi all! The Future Audiences team is interested in getting some feedback/testing of the Wikipedia plugin before it goes out to the broader audience of ChatGPT beta plugin users.
As a reminder, the Wikipedia plugin is an optional add-on to ChatGPT that allows it to draw from and summarize content from Wikipedia for general knowledge queries (including for answering questions related to current events/breaking news/anything that happened after the GPT training data cutoff of September 2021). Caveat: Our plugin and plugins as a whole are still experimental features, so expect some bugs and rough edges – but your feedback will be very helpful in giving us a sense of:
- How often does the plugin trigger and in what cases? (ChatGPT decides when to use the plugin vs. its default knowledge store, and we'd love to get more data on this)
- How well does it do at summarizing Wikipedia content? (We've seen it do pretty well in early internal testing, but on occasion it will slip in some en:WP:OR and we're also interested in getting more data on when/why this happens)
- How well does it do at following our instructions to clearly state attribution? (Again, it seems to do this most of the time, but will fail on occasion, and we want to know if there's any pattern to when/why).
- How well does it interact with our Search API to fetch relevant content? (Be advised that ChatGPT queries are visible to plugin developers and OpenAI more generally – please don't provide any sensitive information to ChatGPT!)
To be able to test out the plugin without a ChatGPT Plus subscription, please send an email to futureaudiences wikimedia.org that includes the email address associated with your OpenAI account. I'll send back some further instructions on how you can enable the plugin and where to leave testing feedback. Thank you!!!
cc @Waltercolor, @User:Natalia Ćwik (WMPL), @Lydia Pintscher (WMDE), @Grzegorz Kopaczewski (WMPL), @Klara Sielicka-Baryłka (WMPL), @Bertux, @Sandizer, @Frank Schulenburg, @MJL, @Jklamo, @Sdkb MPinchuk (WMF) (talk) 21:18, 27 June 2023 (UTC)
- Thanks so much for the ping! I'm eager to test the plugin! Best, --Frank Schulenburg (talk) 03:50, 28 June 2023 (UTC)
- You've got mail! Bertux (talk) 06:58, 28 June 2023 (UTC)
- Hi Bertux, I didn't receive an email. I also checked my spam folder and couldn't find anything. Could you please resend? Thanks, --Frank Schulenburg (talk) 16:52, 28 June 2023 (UTC)
- Resent to futureaudiences wikimedia.org, now with all links omitted Bertux (talk) 17:01, 28 June 2023 (UTC)
- Thanks, all! I've followed up with you individually, but just confirming here that I'm working on getting you set up with plugin developer access. Stay tuned, and thank you again for volunteering to help test it out! MPinchuk (WMF) (talk) 16:50, 30 June 2023 (UTC)
- @MPinchuk (WMF) if I'm not late, I would like to test it as well. Tochiprecious (talk) 07:40, 9 July 2023 (UTC)
- @Tochiprecious: not too late at all! Can you send an email to futureaudiences wikimedia.org that includes the email address associated with your ChatGPT account? Also please let me know if you're a ChatGPT Plus subscriber or using free ChatGPT. Thank you! MPinchuk (WMF) (talk) 22:31, 11 July 2023 (UTC)
- Thank you @MPinchuk (WMF)! I'll send a mail. No, I'm not a ChatGPT Plus subscriber. Tochiprecious (talk) 14:05, 13 July 2023 (UTC)
Wikipedia ChatGPT plugin now available
Hi all – quick update to let you know that we've submitted the plugin to OpenAI and they've reviewed and approved it for the plugin store. All ChatGPT plugins are currently considered experimental beta features and are only available to ChatGPT Plus subscribers for the time being. If you are a subscriber, you should now be able to search for and install the Wikipedia plugin directly from the plugin store. If you'd like to help test it out but don't have a ChatGPT Plus account, please send an email to futureaudiences wikimedia.org and indicate the email address associated with your ChatGPT account, and we'll get you access.
Thanks so much to those who have already tested and provided feedback! We're interested to see how real-world ChatGPT users interact with the plugin and how we can continue to improve the quality of the results. Stay tuned for more updates as we gather and analyze the first batch of data! MPinchuk (WMF) (talk) 22:40, 11 July 2023 (UTC)
Sourcing for information
It was fun playing around with the plugin. One thing I would like to see get added to the "ideas for future releases" is adding more transparency about the sourcing of information. That is right now everything is being sourced as "from Wikipedia" but we all know that within Wikipedia there will be more specific sources that verify that information. Having that information be available on request, and/or presented in some of the more graphical outputs, could be useful. Best, Barkeep49 (talk) 17:12, 13 July 2023 (UTC)
- Hi @Barkeep49! We have been thinking about experimenting with exactly that idea. If a ChatGPT output to a user is based on an article about Hemingway, the output would not only give a link to the Hemingway article but also link to the relevant sources used inside the Hemingway article itself. That would help people fact check what ChatGPT is telling them.
- There isn't anything obvious in my mind that would prevent the Wikipedia plugin from doing this, but we will have to test it out and see. CAlbon (WMF) (talk) 17:46, 13 July 2023 (UTC)
- Yes your Hemingway example is what I was getting at. Glad to hear it's on the team's mind even though it wasn't on the meta page. Best, Barkeep49 (talk) 18:20, 13 July 2023 (UTC)
- Added now. The meta docs are getting an upgrade soon – it's getting a bit messy/unwieldy all as one page. This makes me think that once the plugin stuff moves to its own sub-page, the future feature ideas/enhancements section should include a call to action for anyone to add suggestions. Thanks for the feedback! MPinchuk (WMF) (talk) 17:49, 16 July 2023 (UTC)
Consent
I don't find the word "consent" on this page. Make sure you don't lead people to being exposed to LLM-generated content without their consent. It's not fine to run experiments on our fine humans. Nemo 13:34, 14 July 2023 (UTC)
- Hi Nemo, I'm a bit confused by this concern. Are you saying readers must sign consent forms before even being "exposed" to such content? Did the community members who already used LLMs or other AI models to create content for Wikipedia (e.g. en:Artwork title, fr:Fions) commit human rights violations? Regards, HaeB (talk) 22:17, 15 July 2023 (UTC)
- @Nemo bis: I assume you're not referring to the ChatGPT plugin experiment (which requires the user to explicitly consent – i.e., find and choose to install the plugin – in order for them to see and use it) but in case we ever explore using LLMs on our projects? As HaeB notes above, AI/ML tools have been used for years on our projects, and our process at WMF when building and deploying any new models is to publicly document them in order to be as transparent as possible with users about how they work, limitations, etc. (see: Machine learning models). Also, as we've demonstrated with the plugin (which we began documenting on this meta page before it was built/deployed), we're planning to share experiment ideas here as they're coming together to get feedback from the community.
- Sidenote: I know the word "experiment" carries a variety of connotations in different cultures/languages/historical context, some very negative. I hope that it becomes more clear over the course of the year that in the Future Audiences context, "experiment" means being open to trying new things that we've never tried as a movement, being open to learning what works and what doesn't, and being comfortable with stopping things that aren't working to focus on what's actually going to help us carry on the work of sharing free knowledge with everyone in the world no matter where the future of technology and user behavior take us. MPinchuk (WMF) (talk) 18:06, 16 July 2023 (UTC)
Zero legitimacy without community consensus
The Wikimedia Foundation has a conflict of interest against the ethics and values of the Wikimedia community here. The way to mitigate this conflict is by investing Wikimedia movement money into Wikimedia community organizations so that they can organize their own conversations outside of the influence of the Wikimedia Foundation's biases.
In many spaces, but especially AI, and especially where values and ethics come into play, the staff of the Wikimedia Foundation are out of sync with the ethics of the Wikimedia community. Wikimedia Foundation staff are not part of the Wikimedia community, and it is inappropriate to use Wikimedia movement funds to advance the views of staff when the volunteer community have no access to the funds and have different values.
Increasingly the Wikimedia Foundation claims the right to speak for the movement, or for the community, or just on behalf of good ethics. Stop it! Quit! There are countless problems here, but to point to one of them, when the Foundation established Wikimedia Enterprise part of the consequences was to have the commercial tech sector start intermingling socially with both staff and community when previously they were not welcome. It often happens that staff of the Wikimedia Foundation align with corporate lobbying interests when the Wikimedia community of volunteers do not.
There are millions of donor dollars being spent to hire Wikimedia Foundation staff, and staff only advance in their careers when they align with WMF ideology. When WMF ideology conflicts with community ideology, there is a conflict of interest. Stop it!
Here is what to do:
- Be radically more transparent with budgets. Anyone at WMF who ever speaks about AI policy and their entire teams needs to have their collective salaries and budgets published, so that community can have a sense of the scale of resources.
- Prohibit WMF staff to speak for the Wikimedia community or their ethics! Prohibit WMF staff from representing themselves as community members!
- Give actual money to actual Wikimedia community groups to organize their own social and ethical conversations. Community consensus is the ONLY SOURCE of LEGITIMACY for statements of values and ethics.
- Encourage dissent, because it exists. In too many public forums, WMF staff are defensive to community ethical concerns because community ideological objections disrupt staff careers. It is entirely inappropriate for paid staff to be paid to argue with Wikimedia community volunteers who are talking about ethics! Staff who work in AI should not even be participating in community conversation about ethics! It is inconceivable that staff can even be fair in surveys, focus groups, or public conversations, because the divergence between staff and community ethics is already too far gone! Fund independent university researchers to do the surveys and conversations or else there is no chance of open conversation! Invest the money to get the legitimacy.
- Instead of sponsor staff-led conversations, sponsor community led conversations! Crowdsourcing excels at ethics conversations! And again, get neutral third parties and not WMF staff to interpret the consensus and results! WMF staff too often interpret contrarian outcomes to instead align with WMF ideology!
AI ethics is a really serious issue! Invest money into the community for this! The legitimacy of any community positions on this issue is going to be comparable to the amount of financial investment in supporting the community in speaking its views on this position! Right now, the money is with the WMF, and there is no record of financial support for community engagement in this. Bluerasberry (talk) 15:57, 14 July 2023 (UTC)
- Hi Lane, with all respect, you have long shown a tendency to make sweeping proclamations about what the consensus or the values of "the community" are. (I seem to recall having mentioned this to you in other contexts years ago already, perhaps it was in a Signpost-related discussion.) I would recommend to either back up these claims with references (like links to specific community RfCs or policies), or to more clearly frame them as your personal views.
- In particular, claims that using or exploring generative AI tools is against the "ethics of the Wikimedia community" (or constitutes evidence of conspiring with "the commercial tech sector") would be in stark contrast with the fact that lots of community members have already done this (see e.g. here or here, not to speak of the longtime use of non-generative machine learning in e.g. ORES or ClueBot).
- Regards, HaeB (talk) 02:07, 16 July 2023 (UTC)
- @HaeB: I agree with you that I cannot proclaim what consensus is.
- I am within my rights to say that consensus should come from a defined process, like conventional surveys and discussions from a university research group.
- You say that I as an observer should "back up these claims with references". This is what I am asking of the WMF staffers. Would you wish the same for them when they reach conclusions?
- Also I invite you to speak with me on recorded video which we can post publicly. Bluerasberry (talk) 17:43, 16 July 2023 (UTC)
- Separately to the above comment. I am writing here under my volunteer user-account on a Sunday afternoon in order to ensure it's clear I'm replying as my own self. I wish to address the specific statement that "Wikimedia Foundation staff are not part of the Wikimedia community" and "Prohibit WMF staff from representing themselves as community members!". It is unfair and inaccurate to say that by virtue of having employment at the WMF, that somehow automatically excludes you from being considered part of the community too. HaeB didn't stop being part of the community when he was working at the WMF as User:Tbayer (WMF) and then 'become' it again afterwards. Equally I, as User:LWyatt (WMF) am currently employed at the WMF - but I was part of this community (in many volunteer, grantee, and employed ways) for long before that; I consider myself to be part of the community still now; and I will continue to be part of the community whenever I eventually no longer am employed at the WMF. As I'm the only person who is directly working on both the projects above ("Enterprise API" and "Future Audiences") I feel particularly called-out, as if I'm being excommunicated from my people.
- There are many ways of being 'part of the community' - Featured Article author, photographer, developer, grammar-fixer, local community/school organiser, professional-librarian-reference-checker, wikipedian in residence [for reference: I was not paid when I was a WiR], as well as working for an affiliate or the WMF ... but ALL of these kinds of modes of involvement are "part of the community". I would also add donor is a kind of community role too - it's certainly an expression of commitment. It's a broad church, keep the doors open and welcoming.
- If you are saying that WMF staff should not represent themselves as volunteers then I agree, but I don't know of any examples of that actually occurring. Wittylama (talk) 17:15, 16 July 2023 (UTC)
- @Wittylama: "If you are saying that WMF staff should not represent themselves as volunteers then I agree, but I don't know of any examples of that actually occurring." You are doing it right now by having logged out of your work account to debate from your volunteer account. The reason you did this is because you have a conflict of interest from posting from your work account, which is why you say "it's clear I'm replying as my own self". It is not possible for you to be your own self and also be on payroll with a career and life and paycheck obligation to ensure that WMF goals get met.
- Can you please introduce me to your supervisor? I would like to make a formal request that you and all WMF staff be forever prohibited from claiming that conflict of interest changes depending on whether posting from a work WMF account versus a volunteer account.
- Liam I do not want to make this personal about you; many WMF staff argue with Wikimedia volunteers on matters of ethics, and it is routinely problematic. You are paid to defend specific policies, you are arguing WMF policy right now, you have support from the WMF to behave in this way, and the entire system is stacked in favor of paid staff ethics positions and against unpaid, unsponsored, unfunded volunteering attempts to be heard. It is not fair that you get paid to argue for ethical positions when also the WMF decides when and how the Wikimedia community gets funding to organize their own conversations on ethics.
- WMF staff sometimes do things to which the community objects. This is their career and goals assigned by their manager. They have a conflict of interest in defending such things, because if the community objects, then their job goals change.
- To circumvent the conflict of interest, WMF staff log out of their work accounts, then log into their personal accounts, then start to argue. I object to this. Staff are the same people with the same biases whether logged in or logged out.
- Can you please introduce me to someone who will talk to me about WMF versus personal accounts on recorded video that I can post online? Thanks. Bluerasberry (talk) 17:57, 16 July 2023 (UTC)
- I did not reply to any of the substantive claims you made in your initial post using my volunteer account, precisely because that would be a misuse of that 'hat' (for the CoI reasons you outline). I restricted my comments strictly to rebutting the claim that I am not a member of the community - in effect, that I am not a Wikimedian - simply because I also currently have a job at the WMF. To respond to the specific request: in my day-job, the person I work with/for in the context of this program is MPinchuk (WMF), already very active on this page. Wittylama (talk) 18:17, 16 July 2023 (UTC)
- Thanks Liam. I am doing what I feel I need to do to advocate for my community. What I said is not about you and I like you. Sorry for personal friction and I would like to move the conversation away from you personally and to WMF roles in general. @MPinchuk (WMF): Under what circumstances, if any, would you meet me for a recorded video chat? Bluerasberry (talk) 18:41, 16 July 2023 (UTC)
- For a discussion of the "WMF roles in general" (especially vis-a-vis its role within the movement and wider community), that's something beyond the scope of this page or anyone here - I'd suggest you take that to the Board noticeboard. -- Wittylama (talk) 18:57, 16 July 2023 (UTC)
- Thanks Wittylama for replying. You did a great job. I apologize for being aggressive, too direct, and for raising issues here which are impossible to discuss in wiki text in this kind of discussion forum.
- I am convinced that you Wittylama and the entire Wikimedia Foundation team have good intents and wishes. The problems that I raise are not directed to any individual, or any team. No one has made bad decisions and no one did anything wrong.
- My objection is systemic. I want Wikimedia community conversation to greatly increase wherever there are Wikimedia decisions about values and ethics, and I think everyone wants that. There is no easy way to do this. The Talk:Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/Future_Audiences#Announcing_monthly_Future_Audiences_open_"office_hours" discussion plan is a step toward that. I wish that multiple Wikimedia community groups globally could also organize their own discussions on this issue, because in my view, we are talking about how the Wikimedia Movement will interface with the multi-trillion dollar big technology sector. I believe that Google, Meta, Amazon, and Apple all have specific plans for what they intend to do to Wikipedia, and that the appropriate response is for the Wikimedia community to discuss what we can do to decide our own place in serving future audiences. I do not doubt the sincerity of Wikimedia Foundation staff. WMF staffer Selena Deckelmann said in April 2023 in Wikimedia-l Reflecting on my listening tour that "My experience so far has been that we have a very contentious relationship with English Wikipedia." We need reconciliation and collaboration, and something has to change to achieve that. Thanks Wittylama for helping me think things through and I look forward to civil friendly constructive collaboration with you and others. Bluerasberry (talk) 17:20, 31 July 2023 (UTC)
ChatGPT plug-in: Hallucination-free?
According to this statement by a Wikimedia Foundation executive, the plug-in produces hallucination-free answers. That seems to be a major technical feat, is there more information on how it was achieved and verified? Regards, HaeB (talk) 07:05, 16 July 2023 (UTC)
- @HaeB: you're correct, it's not accurate to say we can guarantee a fully hallucination-free experience (I think Yael was being a little cheeky/throwing some shade at ChatGPT in that tweet, not intending for this to be read 100% seriously) – rather, we think because the plugin is set up to source facts from Wikipedia and provide links to source articles, it can reduce the likelihood of hallucinations, but we haven't yet measured this empirically at scale (we did some initial quality assessment looking closely at about 50 queries and responses and didn't find any outright hallucinations in the responses, but now that the plugin is available to more real ChatGPT users, we plan to do a larger assessment once we have a few weeks' worth of real-world usage data). My guess is that it's likely that queries answered by the plugin will contain far fewer outright hallucinations than queries answered by vanilla ChatGPT, but it's still always going to be possible that ChatGPT may not summarize things from Wikipedia accurately – which is why we added that disclaimer language seen in the screenshot! MPinchuk (WMF) (talk) 16:21, 16 July 2023 (UTC)
ChatGPT plug-in: Most impressive use cases?
Would some of the people who have already had a chance to try out the plug-in be willing to share screenshots of the results they found most impressive or useful?
I've got to say that I find the sole example that is highlighted here and in the Diff post a bit underwhelming. Sure, it answers the question "when will the women's world cup be? where is it?" correctly. But it throws in lots of irrelevant additional information, such as regaling the user with the rather contrived superlative "the first senior World Cup to be held across multiple confederations, as Australia is in the Asian Confederation, while New Zealand is in the Oceanian Confederation" (which also assumes knowledge about - and interest in - the global governance structure of association football). The entire answer appears to mostly just rephrase the lead section of the cited article in somewhat shortened form - but we already have a well-tested API to excerpt a page's intro; does involving ChatGPT really add that much value here?
If I type the exact same question into Google, I get a much more succinct and frankly better result right now (citing a reliable source too):
- The 2023 Women's World Cup, which kicks off in Australia and New Zealand on Thursday, will be the largest ever, with 32 teams playing 64 games over a month.
Yes, I understand that the plug-in is experimental. But in an experiment designed to answer research questions such as "whether users of AI assistants like ChatGPT are interested in getting summaries of verifiable knowledge from Wikipedia", the quality of the user experience obviously matters.
Anyway, the above remarks are obviously limited to this single showcase example and I haven't yet checked out the plugin myself, so maybe others who have can share some examples where it genuinely adds value.
Regards, HaeB (talk) 04:49, 17 July 2023 (UTC)
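For readers unfamiliar with the intro-excerpt API mentioned in the comment above, here is a minimal Python sketch of one way to request a page's lead section as plain text. The comment does not say which API is meant; this sketch assumes the MediaWiki TextExtracts query (`prop=extracts` with `exintro`). The function names, the User-Agent string, and the choice of English Wikipedia are illustrative assumptions, not anything specified in this discussion.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_intro_extract_url(title, lang="en"):
    """Build a TextExtracts query URL for the intro of one page.

    Assumption: the target wiki has the TextExtracts extension enabled.
    """
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,       # only the content before the first section heading
        "explaintext": 1,   # plain text instead of limited HTML
        "redirects": 1,     # follow redirects to the target page
        "format": "json",
        "titles": title,
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


def extract_intro_text(response):
    """Return the first extract found in a parsed API response, or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return page["extract"]
    return None


def fetch_intro(title, lang="en"):
    """Fetch the intro over the network (hypothetical User-Agent string)."""
    request = Request(
        build_intro_extract_url(title, lang),
        headers={"User-Agent": "intro-extract-sketch/0.1 (example)"},
    )
    with urlopen(request, timeout=10) as resp:
        return extract_intro_text(json.load(resp))
```

Calling `fetch_intro("2023 FIFA Women's World Cup")` would return the article's lead as plain text, without any rephrasing or summarizing step. That is roughly the baseline the plugin's answers are being compared against in the comment above.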
- @HaeB: We're currently in the process of analyzing the post-launch data – just anecdotally, based on my own usage and what I'm seeing by spot-checking the data, I can say that a) Google has gotten so full of SEO-hacking, ads, and busy UI that sometimes I use ChatGPT for similar queries/to get the same kind of quick lookup info that I can get in Google, but without all the clutter (and we're seeing Google-type quick info lookup in the general usage data, too).
- Beyond that, there are a lot of other really interesting uses of the plugin that I certainly didn't anticipate. For example:
- Knowledge synthesis (e.g., asking for very specific top ten lists, most popular XYZ – again, Google can give you this type of info in the search results page, too, but you might get a summary based on a random blog post or other questionable source. You can't limit Google Search to summarizing only from Wikipedia)
- Creative remixing (asking for encyclopedic information in the style of [novelist X] – Google Search definitely can't do that)
- Deep learning/research into complex topics (asking for progressively more specific summaries of complex/multifaceted topics in the realm of philosophy, physics, etc. – also not really something Google Search can provide)
- I'll share more as the analysis comes together, but one interesting thing to note is that queries per day (and queries per user per day) have been trending gradually up in the 2 weeks since launch, so it doesn't seem like people are just turning the plugin on, playing around for a bit, and forgetting about it. But based on overall usage (~100 unique users/day) this is a very tiny, niche audience of ChatGPT-using Wikipedia fans who are using it. MPinchuk (WMF) (talk) 22:28, 26 July 2023 (UTC)
YouTube plugin
An idea: YouTube automatic transcriptions could have key concepts linked to Wikipedia for further information. Fgnievinski (talk) 05:00, 20 July 2023 (UTC)
- @Fgnievinski: Thank you for the suggestion! I'm revamping the Meta docs over the next few days and hope to start a page where I can add all the ideas suggested so far in various fora (and a call to add/discuss more). Appreciate the nudge on this. MPinchuk (WMF) (talk) 22:31, 26 July 2023 (UTC)
Docs being updated & more opportunities to learn & give feedback
Hi all, I'm slowly revamping this page to be more informative and to serve as a hub for all the documentation on what we're doing and why, what we're learning, and more space for idea generation & discussion.
A few FYIs:
- I've fleshed out the FAQ with actual questions I've gotten about Future Audiences/AI/ChatGPT. I now encourage you all to go and pick it apart.
- As I noted to HaeB above, we're analyzing the early ChatGPT plugin data and I'll have some updates to share on that soon, both on this Meta hub and live, because...
- ... I'm working on starting up a monthly open call/"office hours" on Future Audiences where anyone interested in this work can come and hear what's going on and give input. The first one of these will likely be next week, and I'll ping the folks who've signed up here individually once we've got a date/time/Zoom link (as well as publicize in the usual channels). If you can't make it, don't worry – I'll record, and we'll be doing this all again next month! Looking forward to seeing/hearing from you all! MPinchuk (WMF) (talk) 22:42, 26 July 2023 (UTC)