Copyright strategy/IRC office hour

Chat on copyright strategy
15 September 2016
14:00 - 15:00 UTC

[14:00:27] <jsutherland> We're now starting the office hour for the Wikimedia Foundation Legal Team's copyright strategy!
[14:00:40] <jsutherland> For more information: https://meta.wikimedia.org/wiki/Copyright_strategy
[14:00:43] <Fluffernutter> quietly throws confetti
[14:00:45] <WikiGnom> waves
[14:00:56] <jsutherland> The team is in attendance now, so feel free to have at it questions!
[14:01:07] <jsutherland> Uh. *With questions.
[14:01:09] <croslof> Welcome, all!
[14:01:21] <john_WMDE> hi from Berlin
[14:01:32] <Nicole_WMDE> hi from Berlin²
[14:01:35] <jgerlach> Hello!
[14:02:12] <jsutherland> Hello, Berlin. Can we get your points please?
[14:02:14] <jsutherland> :3
[14:02:14] <jrogers55> Hi everyone
[14:02:34] <Nicole_WMDE> Hehe. Ok: Thanks for initiating this! I have a question regarding connection between the copyright strategy process and the policy website. I don't see any reference in the process to policy.wikimedia.org? Shouldn't this go hand in hand?
[14:02:53] <dimi_z> Hi
[14:03:17] <jsutherland> Hey Nicole_WMDE!
[14:03:46] <NotASpy> just come to say - I think the biggest problem with copyright, and our strategy, is that it's all too indecipherable for new users. We need to have a serious dumbing down exercise, maybe more videos/animations etc.
[14:04:20] <MER-C> agreed
[14:04:51] <WikiGnom> Hi! To second @Nicole_WMDE , I was also asking myself whether the strategy is more about the Foundation's policy work or whether it regards the general work of WMF legal, such as dealing with third-party requests etc.
[14:05:40] <NotASpy> I was talking someone through http://tools.wmflabs.org/relgen/ today, we got to the third step, and it cuts to talking about releasing media, releasing work depicted in the media, releasing both media and work depicted in the media and it confused the person I was helping.
[14:06:16] <NotASpy> the 'Release generator' is brilliant, but we're terrible at explaining it all in easy to use terms
[14:06:21] <croslof> Nicole_WMDE: Public policy is a component of the copyright strategy, but the strategy is broader than public policy. policy.wikimedia.org is primarily a site with resources relating to Wikimedia’s policy work, whereas the copyright strategy is a process for identifying, discussing, and addressing copyright issues.
[14:07:14] <jsutherland> WikiGnom, that may also answer your question ^?
[14:09:17] <jsutherland> NotASpy, thanks for raising that
[14:09:23] <slaporte> NotASpy: That OTRS release generator is intense
[14:09:23] <jrogers55> Adding to WikiGnom, copyright concerns both policy and more Wikimedia specific work because it covers a lot of different areas. Some aspects of copyright might involve looking into changes in the law, while others involve updating Foundation policies, legal analysis, or helping coordinate efforts to update community policies on the various projects.
[14:09:38] <Nicole_WMDE> Ok, I think I get it. The strategy has a much broader approach than "only" advocating for a better, modern copyright. The strategy process is built to identify the issues, and if some issues long for advocacy/public policy, then they will be included into the website. Right?
[14:09:54] <slaporte> Nicole_WMDE: Exactly.
[14:09:54] <WikiGnom> Thanks @jrogers55 !
[14:10:06] <Nicole_WMDE> ok, thanks! :)
[14:10:35] <melodykramer> Hello team! I was wondering what you've identified are the biggest blockers for end users. i.e. where do people seem to have the most trouble? Is it difficulty in understanding the issues? Difficulty in complying with licenses? Trouble tracking changes to copyright law?
[14:11:09] <melodykramer> Not knowing these issues exist at all?
[14:11:16] <jsutherland> Hey melodykramer!
[14:11:18] <Guest43052> My main concern is similar to that of notaspy, we have a wealth of material on copyright, but it needs organization and severe pruning. As part of that exercise, improving the delivery to people with limited prior knowledge is important.
[14:11:39] <melodykramer> Hey jsutherland!
[14:12:02] <croslof> NotASpy: If I understand what you’re saying, it’s that there’s a gap in copyright knowledge and expertise. There are plenty of people in the Wikimedia communities with copyright expertise, but it’s hard for new people who may want to help with copyright projects or activities (such as licensing review) to gain the expertise necessary to do so. Is that right?
[14:12:34] <MER-C> not only that, but new contributors should be made aware of our copyright policies without having to find out the hard way
[14:12:47] <jsutherland> melodykramer, the consultation (see https://meta.wikimedia.org/wiki/Copyright_strategy/Issues ) was to identify those.
[14:12:52] <Guest43052> FYI I'm Sphilbrick, not sure how to login properly (but that's a different session :)
[14:12:52] <melodykramer> ah, gotcha.
[14:12:55] <jsutherland> :)
[14:13:01] <jrogers55> melodykramer: I think that's a tough question. A big part of the reason we're holding the consultation is to answer it. I'd say at the moment that one of the big things we've seen so far is that there's a need to make the process of dealing with copyrights easier to understand and better organized, but it's by no means the only one.
[14:13:13] <john_WMDE> related to what melodykramer asks: What is the process for prioritizing within that heap of issues?
[14:13:36] <NotASpy> croslof: that's certainly a problem. The 'old hands' who have had years to get to understand copyright now have developed a solicitor/lawyer mentality and way of speaking. It's difficult to pass on that knowledge to new users and general contributors alike.
[14:13:36] <melodykramer> As part of that process, have we identified all of the situations WHEN users may need to know about copyright issues? i.e. what actions on site might need additional information, so we can best understand how to present that information?
[14:16:21] <NotASpy> we have the most common scenarios covered, I think. It's usually uploading an image or donating text.
[14:16:36] <croslof> john_WMDE: The prioritization should come out of the discussion. If people see issues that they agree are a problem, I encourage them to chime in by proposing solutions or commenting on proposed solutions. As I see it, the issues that have the most discussion and some measure of consensus about how to address them are the highest priority.
[14:17:12] <Steinsplitter> waves
[14:17:15] <jgerlach> Guest43052: How to explain policies in a simple and concise way is one of the issues raised on Meta: https://meta.wikimedia.org/wiki/Copyright_strategy/Issues#Project_copyright_guidance Feel free to add your thoughts.
[14:17:17] <jsutherland> Hi, Steinsplitter :)
[14:17:27] <john_WMDE> croslof: well, they might just be the most controversial ones, but not the most important if you ask me
[14:18:12] <Guest43052> Jgerlach, as you can see, I already have :)
[14:19:01] <jgerlach> Guest43052: ha, my bad. :)
[14:19:12] <Guest43052> That's OK :)
[14:21:36] <croslof> john_WMDE: We’ll also use our own judgment when looking at the issues. If there’s an issue with a proposed solution that only a couple people have commented on, but they agree on the solution, then that’s something we’ll definitely look at and think about acting on. (And, of course, WMF lawyers aren’t the only ones who can take action.)
[14:21:39] <Steinsplitter> A copyright crash tour would be useful, especially for new Commons users. It would likely reduce the amount of copyvios. With the new filters that have been added, we were able to reduce them a bit; linking to a crash tour (or crash course) would be helpful :) Someone from legal could likely help build them.
[14:21:54] <slaporte> melodykramer: we’ve started looking at the different ways that users encounter copyright, but it’s a tough question to answer comprehensively for Wikimedia. Just for image uploads, there are a few different paths (UploadWizard, Flickr import, etc.) that each have slightly different information around copyright.
[14:22:19] <croslof> john_WMDE: But still, I encourage people to +1 solutions they think are good, to make sure they aren't lost in the noise.
[14:22:30] <Guest43052> As a very small issue, I spend a lot of time with logos. I often advise uploaders that they may not want to provide a free license, but I'd like to get feedback on whether this general advice is good and how best to present it.
[14:23:10] <melodykramer> slaporte: Gotcha. I'm thinking about what information people might need and when. Someone new to Commons may not need to know EVERYTHING about copyright at that point in time, but they may need to know 1 or 2 things immediately, followed by something else at a later point in time (measured by an additional action or some other variable.)
[14:23:58] <slaporte> melodykramer: Yeah, that’s a great way to look at it.
[14:24:26] <melodykramer> I like the way that GitHub's help desk presents this: https://help.github.com/ - There are 4 "likely scenarios" for users at the top, each with a bit of guidance. And then sections down below, which help users who have already passed through that gateway.
[14:24:27] <Guest43052> I've also handled dozens of situations recently where people are improperly copying within Wikipedia. I wonder if an automated message could be devised because in theory it should be relatively easy to detect.
[14:24:30] <Nicole_WMDE> croslof can you give a quick update on the timeline? the collection of feedback on the process itself is now over, how long will you collect issues and comments to those?
[14:25:44] <jrogers55> Guest43052: could you explain a little bit more what you're worried about in the question about logos? Is it the worry that providing attribution for them is often difficult because of how they're used, or something else?
[14:25:55] <Guest43052> I'll echo Melody's comment: how do we avoid overwhelming someone when they have a very specific question? Obviously, if they ask it carefully we can answer narrowly, but they may ask broadly and still have only a specific issue.
[14:26:05] <melodykramer> slaporte: Slack's help might also be helpful: https://get.slack.help/hc/en-us/categories/202622877-Slack-Guides - It's separated into guides specific to each end user, and also presents a video option at the bottom.
[14:26:39] <NotASpy> we must have thought about transcluding history for attribution purposes when copying within Wikipedia - what's the reason we don't do that ?
[14:26:48] <Guest43052> Regarding logos, if it is just English Wikipedia I think the best option is not to freely license it but to use the fair use provisions. I'm often at OTRS where someone is trying to freely license a logo, which I think is a bad idea (unfortunately some exceptions)
[14:26:49] <slaporte> melodykramer: This is great, I think it’s connected to Steinsplitter's and others' suggestions around simplifying or explaining policies where we can.
[14:27:07] <croslof> Nicole_WMDE: There's no planned end date—the goal (as long as interest and participation continues) is that the copyright strategy page(s) will become a place for ongoing organization, discussion, and planning.
[14:27:07] <marktraceur> Guest43052: Generally the answer in my experience is to refuse to answer hypotheticals and only answer questions pertaining to specific examples (sorry I'm not watching the stream but I got pinged by slaporte so here I am eavesdropping)
[14:27:52] <melodykramer> slaporte: Agreed.
[14:28:03] <Nicole_WMDE> croslof: ah, I see. sounds good, thanks!
[14:28:21] <jsutherland> NotASpy, could you clarify a little bit?
[14:28:27] <Steinsplitter> A lot of users, especially new ones, are not familiar with the underlying licenses and copyright issues. Explaining them (in a short interactive tour - we have the necessary infrastructure: Extension:GuidedTour et al.) would be helpful.
[14:29:28] <slaporte> melodykramer: One challenge I see is that many (most?) of these policies, guides, and explanations are created by the community, so we are careful to make sure the changes are proposed and driven by the community too.
[14:29:41] <Steinsplitter> Regarding logos, good point. I want to +1 it. My suggestion would also solve the logo problem; I am sure most companies won't release their logos under CC ... but if they are not familiar with it, sometimes they do (then later they try to revoke the license grant).
[14:29:43] <jrogers55> Guest43052: Ah okay. For logos, I think people can get confused about the copyright vs. trademark problems there. There isn't anything wrong with freely licensing them under copyright, but that doesn't mean that reusers can just use the logo wherever they want, so letting people know that the free license might cause confusion is a good idea. You can also help them make sure
[14:29:44] <jrogers55> that things like the "this is a trademark" template wind up on the page for any logos contributed to the projects, I think that one helps reduce confusion a lot.
[14:29:52] <NotASpy> jsutherland: if you copy part of Article A into Article B, the history from Article A up to the point it's copied is transcluded into Article B. A bit like a history merge, but less complicated to reverse
[14:30:00] <slaporte> I wouldn’t want folks to assume that these policies always need to be written by a lawyer!
[14:30:27] <NotASpy> jsutherland: the history would include either the history page transcluded from Article A or some sort of more elegant linking process for attribution purposes
[14:32:06] <Guest43052> Steinsplitter Exactly. I am currently dealing with a user who wants to freely license a logo for an article that is going to be deleted. I guarantee they will not be happy to learn that they can have their article but they've given up all rights to the logo. I refused to grant the license for that reason.
[14:32:37] <Guest43052> sorry, "canNOT have their article"
[14:32:43] <Fluffernutter> NotASpy: Oh god now I'm picturing copyright policy/histmerging as a set of Ikea assembly instructions. And it's...so apt.
[14:33:00] <jsutherland> NotASpy, ah, gotcha. I'll let legal respond with their thoughts.
[14:35:06] <jgerlach> Steinsplitter: I personally like the idea of pointing people to explainer videos or other material that's out there. (even though that may also be a frustrating experience for them, at times.) I want to encourage you to add this to the discussion on Meta.
[14:35:24] <slaporte> NotASpy: I like elegant linking solutions, but keeping track of history of text is technically hard (from what I understand)
[14:35:25] <croslof> NotASpy: Currently, people can provide attribution (and are required to do so by the Terms of Use) by linking to the source article in their edit summary. I don't know everything that would be involved in transcluding article history along with article text, but it sounds extremely difficult to me.
[14:35:30] <melodykramer> slaporte: Thanks for the clarification.
[14:35:44] <slaporte> melodykramer: Thanks for the links!
[14:35:55] <NotASpy> Fluffernutter: yeah, when it was mainly us old fuddy-duddies writing and uploading content, you took time to become conversant with policy, like a carpenter does with making a table. With all these people doing drive-by promo spamming, it's just like buying a flat pack table and assembling at home
[14:36:34] <jrogers55> Guest43052: Steinsplitter: I would say on logos that you don't have to tell people no, but it's really good practice to make sure they understand what they're doing. Both to ensure that licensing it freely can't be revoked later and allows people to reuse it, but also that licensing it freely doesn't give up the trademark rights and doesn't necessarily make it available to
[14:36:35] <jrogers55> reusers for any purpose.
[14:36:38] <slaporte> melodykramer: I like Github’s choose a license page too: http://choosealicense.com/
[14:36:42] <melodykramer> slaporte: I really like thinking about documentation and the best ways to help people understand hard stuff. This is incredibly complicated stuff!
[14:36:53] <melodykramer> Oh I really like that slaporte.
[14:37:21] <Guest43052> I have some concerns about explainer videos, but my concern is alleviated if there is accompanying text. The nice thing about text is that you can do a manual scan or Ctrl+F, which is not easy to do with video
[14:37:29] <melodykramer> And the reason I like that is that that's likely how the person who is deciding is thinking.
[14:37:48] <melodykramer> It's like "What's your end goal?" And then describing what's available for that situation.
[14:37:58] <melodykramer> I think that's always a good way to approach complicated subjects.
[14:38:20] <jsutherland> Guest43052, that's true, yeah.
[14:38:44] <melodykramer> slaporte: Here's another great knowledge tree to help people understand something: http://library.pdx.edu/diy/ (In this case, how to use a library for reference material.)
[14:38:59] <Steinsplitter> jgerlach: if a filter detected that the user uploads copyvios we can link them to the interactive tour. just for example. We don't need a discussion on meta for that, it is on my to do for a while. I had no time to code it and to write a text which wouldn't be legal advice.
[14:39:01] <Guest43052> jrogers55 I agree (I said I refused but I followed that up with "I'll license it if you really want me to, but here is why you should withdraw your request")
[14:39:08] <croslof> melodykramer: I agree. It's good to design for how people actually think and behave, rather than some idealized workflow/decision tree.
[14:39:46] <melodykramer> And also how people comprehend knowledge, croslof. It's very hard to take in very dense, very long info about situations that may or may not be related to what you need to know right now.
[14:40:07] <Steinsplitter> jgerlach: I am talking about files (disclaimer) :)
[14:40:52] <jgerlach> Steinsplitter: I am curious to see that! And I think Guest43052's concerns with video are valid.
[14:43:53] <jsutherland> Does anyone have any further questions? :)
[14:43:58] <NotASpy> I notice https://phabricator.wikimedia.org/T125459 on Phab, where do we stand on access to image copyvio detection ?
[14:44:29] <Steinsplitter> +1
[14:46:23] <Nicole_WMDE> jsutherland: nope, thanks again for initiating this office hour.
[14:46:27] <slaporte> NotASpy: I’m not sure about the status of this tool, but I just wanted to point out that tools like this are usually good at detecting if images are duplicates but not necessarily verifying the copyright status of images. I think it’s funny that we call the tool “copyvio detection.”
[14:46:53] <Steinsplitter> slaporte: https://commons.wikimedia.org/wiki/Commons:Abuse_filter/Automated_copyvio_detection
[14:47:04] <Steinsplitter> ...but it is *very limited*
[14:47:13] <slaporte> As a lawyer, I like the fact that humans get to review files (not just leaving it entirely up to machines)
[14:47:32] <Steinsplitter> the tool aforementioned by NotASpy would be helpful for tagging files as "possible copyvios".
[14:47:49] <NotASpy> of course, it would be useful to prioritise files and accounts for us to review
[14:48:00] <jsutherland> Nicole_WMDE, no problem! If you do have other issues, or want to get involved with discussions and such, feel free to chip in on https://meta.wikimedia.org/wiki/Copyright_strategy/Issues
[14:48:31] <Nicole_WMDE> jsutherland: will do. :)
[14:48:43] <marktraceur> NotASpy: I believe the status of automatic copyvio detection is, at best, "stalled"
[14:48:45] <Steinsplitter> and needless to say, on some small wikis and Commons there is only a small number of users checking the recent uploads. It is impossible for them to check everything.
[14:49:01] <ankry> has question concerning planned changes in text licensing: Is there any chance to accept non-CC licensed texts for Wikisources?
[14:49:08] <jrogers55> Steinsplitter: NotASpy: I think flagging as "possible" is the right way to do it. Copyright law has shown itself to be very resistant to automated tools because whether use of a copyrighted image is permitted varies based on where and how it's used.
[14:49:27] <NotASpy> we have Flickr washing which is a perennial problem for example - user takes copyright image, uploads to Flickr, sets licence to CC-BY, imports from Flickr to Commons. We may never know if it was uploaded elsewhere.
[14:49:53] <NotASpy> I guess in such a situation, 'we' are covered though, and the rightful owner just DMCA's us and Flickr at once.
[14:50:05] <slaporte> ankry: I believe Wikisource allows public domain, do you mean more restrictive copyright?
[14:50:10] <Revent> Steinsplitter: arwiki (cringe)
[14:50:14] <ankry> I mean GFDL
[14:50:46] <slaporte> ankry: I don’t believe there has been any discussion around that
[14:50:55] <ankry> slaporte: current WMF policy explicitly forbids non-CC-BY-SA licenses for texts
[14:51:17] <jrogers55> NotASpy: Exactly. I wouldn't look at that one as a huge problem. I doubt there's any system that can completely stop users from making mistakes, especially since we draw content from other sites that aren't necessarily going to be as strict as we might want to be. But if something makes you suspicious or we get a DMCA, we can investigate that specific thing and fix the mistake.
[14:51:35] <ankry> regardless of whether they are created by wiki users or are texts from external sources stored in explicit form
[14:51:59] <ankry> slaporte: where can such a discussion be started?
[14:52:13] <ankry> I did not find an appropriate place
[14:52:21] <slaporte> ankry: the TOU allows licenses that are compatible with CC BY-SA
[14:52:30] <ankry> GFDL is not
[14:52:49] <slaporte> ankry: let’s email about this a bit if you want
[14:52:55] <NotASpy> jrogers55: I wrote a lengthy guidance note on Flickr washing many years ago - it's quite easy to spot, but you need a reason to look. It would be nice to have something look automatically for us and flag up more concerns.
[14:53:20] <Steinsplitter> +1
[14:53:48] <NotASpy> which gives me an idea for a bot to compare EXIF data for individual Flickr photostreams and flag up any with unusual patterns.
[14:53:52] <jrogers55> ankry: I would add that because text isn't discrete, it's a lot harder to have multiple licenses existing simultaneously than it is to allow choice for image licensing.
[14:54:14] <Guest43052> Good point
[14:54:22] <ankry> slaporte: and we still have some GFDL texts that *cannot* be relicensed because they were uploaded shortly before the wiki license change (but after the deadline in GFDL-1.3 for relicensing)
[14:54:46] <Revent> What I’d love to see, long term, is something like Cluebot for looking at new uploads… a machine learning system to ‘score’ new uploads for likelihood of being a copyvio.
[14:55:17] <marktraceur> Revent: There needs to be a way to analyze the images, though, and we don't have that tech. It's pretty hard to come by.
[14:55:24] <ankry> jrogers55: storing texts in Wikisource is not much different than storing images on commons
[14:55:29] <Steinsplitter> such "bad files" can bring re-users into (substantial) legal peril...
[14:55:44] <slaporte> ankry: Interesting. I have a few more detailed questions about this — shoot me an email at slaporte (at) wikimedia (dot) org
[14:55:45] <jrogers55> NotASpy: Steinsplitter: I think it would be cool to have a tool that helps the community flag issues. Cluebot style training would be hard, but not impossible, I think.
[14:55:49] <Steinsplitter> *may
[14:55:54] <ankry> slaporte: OK.
[14:56:05] <ankry> slaporte: I will include some examples
[14:56:08] <Revent> marktraceur: I meant more on simple metadata (like EXIF or lack thereof) and search results… still likely very complex, tho.
[14:56:09] <slaporte> great
[14:56:27] <jsutherland> Just going to give a five-minute warning here - we're scheduled to run to 15:00 UTC.
[14:56:30] <marktraceur> Revent: Search results are the hard part, I'd say. EXIF might be easier actually.
[14:57:18] <Revent> If we could use something like TinEye’s API… probably not cheap, tho.
[14:57:48] <marktraceur> Revent: We looked at that, without a special deal it would be several million a year
[14:57:54] <Revent> Ouch
[14:58:00] <marktraceur> That's assuming we check new uploads, and not stashed files or existing files
[14:58:22] <Revent> That’s still insanely expensive
[14:58:32] <marktraceur> And how.
[14:58:59] <Steinsplitter> a var for scanning exif in abf would be helpful, for example.
[14:59:11] <marktraceur> I bet matmarex would enjoy that...
[14:59:19] <marktraceur> (I'll actually float it to him in a second)
[14:59:34] <Steinsplitter> matmarex added a lot of <3 stuff/tools recently. i am very happy about that :)
[15:01:04] <jsutherland> Okay - we're at time. If you have further comments or proposals, please feel free to add them here: https://meta.wikimedia.org/wiki/Copyright_strategy/Issues
[15:01:09] <jsutherland> Thanks a lot, everyone!
[15:01:19] <jsutherland> We will be posting this log onto meta very soon.
[15:02:20] <slaporte> Thanks everyone! I’m glad to hear so much interest in creating clear, useful explanations for copyright
[15:03:11] <slaporte> I’ve added a few links to the meta page here: https://meta.wikimedia.org/wiki/Copyright_strategy/Issues#Project_copyright_guidance
[15:03:14] <croslof> I look forward to continuing the discussion on Meta!
[15:03:15] <NotASpy> thanks all, very good
[15:03:24] <slaporte> NotASpy: Thank you!
[15:03:44] <Revent> slaporte: As a slight ‘side issue’, I’d love it if we could figure out a better way to explain URAA issues to ‘established’ editors… it bothers me that the effective rule on Commons is to simply ignore it.