Wikipedia:Wikipedia Signpost/2012-06-11/Special report

Last week, the Signpost was alerted to a blog in which a Cambridge researcher, Professor Peter Murray-Rust, observed that Springer Science+Business Media is taking Wikimedia content and asserting copyright over it. In his words, this is the "apparent systematic relicensing and relabeling" of Wikimedia content, "a breach of copyright and therefore illegal in most jurisdictions".

Springer SBM is the largest publisher of scholarly books in the world (a remarkable 7,000 a year) and, with about 2,000 titles, is the second-largest academic journal publisher after Elsevier. Springer owns 55 publishing houses, employs 6,200 people, and has considerable prestige and clout in the knowledge industry. One of its most popular websites is SpringerImages, the site on which Murray-Rust found the Wikimedia content. SpringerImages charges people for the use of its growing collection of 3.4 million scientific, technical, and medical images.

One commenter at Murray-Rust's blog sought to explain how this could have happened, rightly pointing out the copyright transfer process: "Folks have been submitting articles to Springer, using Wikimedia images in them, and during upload have ticked a box saying they were the creator of all images. During this process, Springer likely requires you to assign copyright to them. Springer now slightly lazily assumes it owns copyright on the images. No great conspiracy." Murray-Rust responded, "But laziness is no defence in law. And Springer are SELLING these. If I appropriate someone's scholarly image I check. Springer [does not]."

Murray-Rust's accusations drew a sharp response from Springer's executive vice president, Wim van der Stelt, on the Google+ SpringerOpen blog: "Mr [sic] Murray-Rust not only attributes the problem incorrectly to Springer Images, but also insinuates that Springer is selling commercial rights to use images that are already open access. This is not only outrageous and blatantly false, it also damages our reputation. ... The larger implication, that Springer is 'stealing' copyright and the insinuation that Springer is attempting to profit from 'ill-gotten gains' is false and we call upon Peter Murray-Rust to correct this allegation immediately."

Murray-Rust has indeed retracted his more trenchant allegations, including that of "copytheft". He told the Signpost, though, that material from Wikipedia and many other providers still appears to have been rebadged in many instances on SpringerImages. While Springer has been informed of this, it has made no comment, and these images continue to be offered for resale. "A typical price is US$60 for re-use in teaching/coursepacks."

Murray-Rust's interest in uncovering the misappropriation of Wikimedia materials by SpringerImages was piqued when he discovered images there from a paper on which he was a co-author. He also found cases in which content imported from other publishers such as Wiley and PLoS—or in the public domain—was incorrectly labelled or licensed.

"I was personally affected", he said, "in that my CC BY content in BioMed Central journals had been copied and recopyrighted onto the Springer site. We've asked that at least SpringerImages announce to the world that there's a problem, and they have failed to do this. I've found hundreds of such instances, including content from museums and other companies. I'd guess there are thousands of images on SpringerImages that have been rebadged."

However, poor attribution and licensing of material from Wikimedia Commons and the Wikipedias is far more widespread than Springer's practices alone suggest. Even though detailed help is available on Commons, many downloaders make no effort to comply with the terms of the licences. With the exception of public-domain content, the use of materials found on Wikimedia projects requires attribution of the copyright holders and either the text of the original licence or a link to it.

Daniel Mietchen, Wikimedian in Residence on Open Science, told the Signpost that it's disheartening to see freely licensed images with instructions like "Viewing this image requires a subscription. If you are a subscriber, please log in." But the Springer issue is just "the tip of the iceberg", he says. "Wikimedia Commons has a dedicated category for cases in which uploaded files have been re-used externally in violation of these terms."

Currently, more than 1,800 affected files are on the list, some of which have been used over 100 times outside Wikimedia platforms. The category's description reads, "sometimes, media organizations just don't understand that in most cases, you just can't rip an image off Commons and just use it."

Mietchen says "media organizations are far from the only organizations and individuals misusing Commons' content." For example, the German Federal Archives (Bundesarchiv) declined to continue donating images to the Commons in 2010. Efforts by the German Wikimedia chapter had yielded a 100,000-image donation in 2008, the largest in Commons' history, but the results were troubling for the Bundesarchiv: "more than 90% of their images, while licensed correctly on the Commons, had been re-used without proper attribution across the Internet." In one notable case, more than 3,000 of the images, all available for free online, had been cropped to remove the attribution line and then listed for sale on eBay as a "private collection" (Signpost coverage).

This is reflected on the individual level, too. Commons bureaucrat User:99of9 told the Signpost, "fewer than half of the files I've personally authored and uploaded to Commons have been attributed when re-used elsewhere on the internet; and fewer than 30% have appeared with the proper licence. Some Wikimedians have tried marking their files with a prominent notice about re-use in addition to the licence template, but this has been controversial at Commons. We've also trialled a click to re-use this image button, but there were technical problems and it's not currently in use."

"The 1,800 files in the 'misused' category at Commons," he says, "are almost certainly a vast underestimate, and re-use is a persistent problem for the site." We asked whether the solution lies in refining the warnings and making it easier for the public to understand their responsibility as re-users. "Certainly we need to educate the public, and what you suggest may be part of the answer."

Mietchen says that improper licensing sometimes starts at the source, even with publishers. For example, PAGEPress had been labelling its articles as "This work is licensed under a Creative Commons Attribution 3.0 Licence (by-nc 3.0)", with the long and short forms of the licence text in contradiction and, at one point, a bad typo. The label has since been corrected to CC BY-NC throughout, which renders the content ineligible for re-use on Wikimedia projects.

While the failure of SpringerImages to comply with Creative Commons licensing terms had been pointed out as early as 2009 by archivist Klaus Graf, Springer appears to have finally taken action to clean up the problem over the past few days. This has been done in part by removing any content generically attributed to "Wikipedia" (of which there had been 368 results on Friday) and "Wikimedia" (157), with just one remaining watermarked "SpringerImages" and attributed to "Wikipaedia" [sic].

More than a day before this edition of the Signpost was published, the company's executive vice president of corporate communications, Eric Merkel-Sobotta, told us, "We have worked all weekend to solve the issues, and will be ready to make an announcement within 24 hours. I will make sure this is sent to you." The Signpost has received no further correspondence from Springer.