Wikipedia talk:WikiProject Medicine/Wellcome Library Editathon 2014

Licensing problems
Hello! I got notice about your event from WikiProject Medicine, a community group in which I am a member.

There are some images at Wellcome Images. Consider these examples:

On the website for each of these and many other images, it says "Copyrighted work available under Creative Commons Attribution only licence CC BY 2.0, see http://creativecommons.org/licenses/by/2.0/". Their prices page says "Hi-res historical images are also available to download from this site free of charge, for any usage, under a Creative Commons Attribution Only – CC-BY licence", and elsewhere on the site there is a notation system which I interpret as a clear indication that these example images are historical and in this group. However, Wikimedia Commons displays the metadata associated with the pictures, and in that metadata there is a message that "Copyrighted work available under Creative Commons by-nc 2.0 UK, see http://wellcomeimages.org/indexplus/page/Prices.html". I have quoted what the prices page says above, and on further reading I found nothing anywhere on the site about noncommercial licensing, which is what the "nc" means. CC-NC licenses are of course not compatible with Wikipedia.

Thanks for uploading these two images. As facilitators of the Wellcome Library editathon, can you please get explicit confirmation of the licensing status of collections in Wellcome Images? The conflict to be resolved is the presence of two contradictory licensing assertions: one in the metadata and one on the website. If these images are licensed in a way compatible with Wikimedia Commons, then I expect that either the metadata should be changed or we should have an archived message from them confirming that the metadata shown on Wikimedia Commons is incorrect and the pictures actually are CC-BY and not CC-BY-NC.

If any of you would like for me to contact Wellcome Images myself, then I would, and if you are not managing this image sharing but another Wikipedian is, then please refer me to them. I do not intend to burden you with this but I did want to give you an opportunity to respond to this if you wished to have it.

Thanks for organizing this event.  Blue Rasberry   (talk)   02:57, 8 February 2014 (UTC)
 * Hi Bluerasberry, this is already under control. There has been a (very) initial meeting with the Wellcome and Wikimedia UK, which established both that they are happy for me to complete a batch upload of all 100,000 images and that we need a discussion and workflow process for resolving/confirming the copyright status of these "non-obvious" cases, which appear to make up a relatively small percentage of the batch. Anyone who wishes to discuss issues related to the upload can contribute suggestions at Commons:Commons:Batch_uploading/Wellcome_Images_CC-BY, where I use one of the historic early ACT UP posters as an example for discussion of possible copyright issues.
 * In terms of timing, depending on availability from the right Wellcome folks or unexpected technical issues, I would expect to start testing and running some significant numbers of uploads before the end of this month, using the GWtoolset (I'm approved to use it as a steering group member), and hopefully quickly completing most of the upload in March. I will keep the schedule on the batch upload page updated as soon as there are more reliable dates to share.
 * If anyone would like a particular test set of images, such as 1,000 or so obviously public domain images based on something I can easily test for in the catalogue metadata (such as creation date) or a list of digital library photo numbers that someone else generates and checks the licences for, then I would be happy to prioritize this and try to squeeze it in before the 26th February, on the assumption that I have the right tools from the Wellcome to by-pass the CAPTCHA by then. --Fæ (talk) 07:21, 8 February 2014 (UTC)
 * Copyright concerns are resolved here. Further concerns should be presented at Commons:Commons:Batch_uploading/Wellcome_Images_CC-BY where the upload of these pictures to Commons is being managed.  Blue Rasberry    (talk)   13:31, 9 February 2014 (UTC)
 * I did raise the apparently conflicting licences with the hosts of the event, and I see Wikimedia UK made a separate approach: glad to see this is resolved. I've asked that in the event itself the Wellcome hosts clarify the licensing so that participants understand what they can do. It would be awesome if we had a bulk upload of images before the 26th, but I recognise there may be technical or other barriers or other things taking your time. I greatly appreciate your (and the bot's) work in this area. MartinPoulter Jisc (talk) 12:58, 10 February 2014 (UTC)

My contacts at Wellcome aren't aware of you. Can you tell me which named person you have been emailing? Thanks, MartinPoulter (talk) 21:14, 13 February 2014 (UTC)
 * My email might have got trapped in their spam filter. I've sent again and asked John C. to forward me on. --Fæ (talk) 22:38, 13 February 2014 (UTC)
 * I do have a contact email now, though there is no date yet for a first telephone call to discuss the pragmatics of getting an initial batch upload done. Looking at the calendar, it looks 50/50 that a significant test set will be available for the editathon, particularly as I am in Cornwall with no internet connection for a chunk of that time. --Fæ (talk) 08:50, 18 February 2014 (UTC)

Proposal
If anyone wishes to produce a preferred long list of images either by topic search or by a list of photo numbers in the digital image library (see project page) then I can offer to have these uploaded by bot on Monday/Tuesday next week in time for the editathon at the web-resolution size (~800px wide) and folks can either upload the full resolution "by hand" if they are using a file for something interesting, or we can leave it to a mass upload when they can be overwritten or redirected. You can either email me a list or if you prefer working in the open, add it to the Commons project page linked.

As an example, I am uploading around 1,300 lithographs in 'web resolution' from this list. These will be available at commons:Category:Files from Wellcome Images and will have the "credit bar" cropped off automatically. Fæ (talk) 08:50, 18 February 2014 (UTC)
 * There are large numbers of AIDS related pictures in this collection. What information would I need to get to you to have you upload any of those?
 * If I inserted images into articles at web-resolution size, then those same images would be automatically replaced if full-resolution images were ever added, right? Why not just upload full resolution now? Would it be easier for me to just upload what I want by hand? My concern would be sorting that exif data, because it is incorrect for these and I do not know how to address the problem of them being tagged with improper licenses. What is the long term plan for indicating that these are not NC licensed as the exif data says?  Blue Rasberry   (talk)  01:47, 19 February 2014 (UTC)
 * If there is a specific set of AIDS education related material you would like uploaded, just provide a search URL similar to this search, or provide me with a list of "photo numbers", these are the bold number in the search returns that look like "L0052515" (to find the search URL, you may need to right-click on the "results per page" links as the details may be hidden by website frames). Due to travelling and being away from a reliable connection, it may be Monday next week before I sort out any more uploads.
 * Posters in the last two decades or so have often been produced by other institutions; at this stage I cannot guarantee they will persist on Commons unless we can get some clarification from Wellcome as to the release from the original publishers. This is something we will pick up and detail later on the project page. Due to this slight uncertainty, I suggest we avoid uploading a large number of recent posters, so that the numbers stay fairly manageable for the test period and, if we do eventually have to delete some from Commons, we do not disappoint too many contributors.
 * There is nothing to stop you going ahead with uploading manually at full resolution, here is an example, however I do not have the access to by-pass the CAPTCHA yet. Once I do, then I'll refresh this test set with high resolution versions though realistically I am not planning on this happening before the editathon date. If you are uploading some "by hand", I suggest you stay reasonably consistent with the uploads so far by using "Wellcome "+ in the title, and linking back to the digital library in a similar way too. This should help avoid too many duplicates or help identify them more easily.
 * I can change the EXIF data; this is something mentioned on the Commons project page. However, I would like to ensure that the exact same digital version is uploaded first, so that the same checksum is available to compare against. Once we change the EXIF, the file will have a different SHA-1 checksum, and so this easy way of checking for duplicates is lost.
 * To avoid confusion, the set of images I will aim to automatically update will be my uploads during the test, as defined by this "live" catscan2 search.
 * Thanks for your questions, glad to see you contributing. --Fæ (talk) 04:51, 19 February 2014 (UTC)
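
To illustrate the checksum point above, here is a minimal sketch in Python of why changing the EXIF breaks checksum-based duplicate detection. It is not the upload bot's code: the helper names `sha1_hex` and `is_duplicate` and the sample byte strings are assumptions made up for illustration, and only the standard library `hashlib` is used.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 checksum of a file's raw bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


def is_duplicate(candidate: bytes, known_checksums: set) -> bool:
    """True if a file with exactly the same bytes has been seen before."""
    return sha1_hex(candidate) in known_checksums


# Hypothetical stand-ins for image files: same picture, different embedded metadata.
original = b"IMAGE-DATA|licence=by-nc 2.0 UK"
edited = b"IMAGE-DATA|licence=CC BY 2.0"

# Checksums recorded for files as first uploaded, before any metadata change.
known = {sha1_hex(original)}

print(is_duplicate(original, known))  # the untouched file matches
print(is_duplicate(edited, known))    # the file with changed metadata does not
```

Because any change to the bytes produces a different checksum, a record of the original checksum has to be kept if duplicates are still to be matched after the metadata is corrected.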


 * I have written to ACT UP, see Commons:User:Fæ/email/ACT UP, to confirm the copyright of their historic posters and uploaded a further sample based on this search. This appears to have doubled the size of Commons:Category:AIDS education and prevention, so this collection should have significant impact if we can provide assurance on copyright. --Fæ (talk) 09:01, 24 February 2014 (UTC)
 * I live in New York and go to ACT UP meetings sometimes, and was planning to go to a historical media exhibition they are organizing this month. If they would like to talk face to face with anyone then please refer them to me and I will talk to any of them and also demonstrate the impact of this work.  Blue Rasberry   (talk)  20:55, 24 February 2014 (UTC)
 * I will do, though I think this would be a good thing to do anyway. The exhibition sounds interesting; it could well be that they would like some of their archives released to be preserved and made available on the Wikimedia projects. I am sure the material in the Wellcome Library is just a small slice of the educational material from ACT UP that we could make good use of as part of an LGBT content creation project. --Fæ (talk) 22:31, 24 February 2014 (UTC)