Wikipedia talk:Contributor copyright investigations/20111108

Sorting - good or too much clutter?
I started working on the cohort numbered 141–160. After doing a few, it wasn't perfectly clear which were done and which were not done, so I took the liberty of sorting them into three categories: OK, reported to CP or not done. This helps the organization of the section, but it is possible it will make the entire page too cluttered, so let me know if an alternative approach is preferred.
 * Update: I decided the subheadings created too much clutter, so I changed them to simply bold entries.

FYI, I plan to look at every one in this cohort, but in some cases, a major reference isn't online. Given how many there are to do, I'll skip to ones that can be reviewed online, and come back to them later if we get through the rest, and/or leave them to someone who may have access to materials not online.-- SPhilbrick  T  16:23, 17 November 2011 (UTC)

Carrite/Davenport's assessment of this case subpage
Page 1 of 10 = largest contributions by RAN.


 * Number of articles listed = 660


 * Number of articles assessed in 14 months of investigation = 125 (18.9%)


 * Of articles assessed (n=125), number found "clear" = 74 (59.2%)


 * Of articles assessed (n=125), number found problematic = 51 (40.8%)


 * Number of problematic articles (n=51) in which the problem is an improperly attributed merge, split, or inter-wiki translation (i.e. very minor) = 8 (15.7%)


 * Number of problematic articles (n=51) in which the issue was clearly not created by RAN, but by others, or result of CCI error = 6 (11.8%)


 * Omitting these "WP Attribution" and "Misidentification" errors, number of problematic articles remaining = 37 (= 37/125 = 29.6% of all articles assessed)


 * Number of these articles (n=37) I was unable to assess due to revision deletion or history deletion in the investigation process = 8
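For anyone who wants to re-check the percentages in the tallies above, here is a minimal sketch. The variable names and the `pct` helper are illustrative assumptions, not part of the original assessment; only the counts come from the list.

```python
# Counts copied from the list above; all names here are illustrative only.
listed = 660          # articles listed on this subpage
assessed = 125        # articles assessed so far
clear = 74            # assessed and found "clear"
problematic = 51      # assessed and found problematic
attribution = 8       # improperly attributed merges, splits, translations
misidentified = 6     # issue created by others, or CCI error
unassessable = 8      # could not be reviewed (revision or history deletion)


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Problematic articles left after omitting attribution and misidentification cases.
remaining = problematic - attribution - misidentified

print("assessed:", pct(assessed, listed))
print("clear:", pct(clear, assessed))
print("problematic:", pct(problematic, assessed))
print("attribution share:", pct(attribution, problematic))
print("misidentified share:", pct(misidentified, problematic))
print("remaining:", remaining, pct(remaining, assessed))
print("remaining minus unassessable:", remaining - unassessable)
```

The percentages reproduce as stated. A straight subtraction of the 8 unassessable articles from 37 gives 29, so any other figure for what is left was presumably counted on a different basis.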

Random observations about the remaining 31 problematic articles, which I can speak to:
About 5 were instances of hiding copyvio text behind <!-- --> comment tags, not visible to readers, arguably very minor copyvio. One of these was unmasked by another editor, which remains on RAN's shoulders.

Several were flagged due to excessively long direct quotations of copyrighted material, generally in the "quote=" field of the Citation Template for footnoting.

Several were allegations of close paraphrase, ranging from serious to highly debatable.

Several were lazy pastes of long blocks of text with a single footnote, Encyclopedia Britannica 1911-style.

There remain 10 instances of what could be reasonably considered "major" copyvio infractions:


 * Blackwells Mills Canal House: Inept attempt to render as blockquotes, 3/12/06
 * Vaporware: Large paste-in of Wikia list, copyright status not positive, 5/10/10
 * Siegmund Lubin: Large paste-in of time line, 4/20/06
 * Nassau Presbyterian Church, large paste-in of official church history, 12/31/05 (removed 6/4/06)
 * Russell Earl Marker, large paste-in, 3/30/06
 * James Gerard Kennedy, Sr., two big paste-ins, 1/31/04. (one hidden from readers by RAN on 12/30/07)
 * Elrey Borge Jeppesen, large paste-in, 4/7/06
 * Morey letter, 10/2/05
 * Reinhold Schlegelmilch, 7/9/05
 * Manuel Cuevas, 1/9/07

Excluding the May 2010 Wikia paste-in, the dates of these violations range from July 2005 to January 2007, which says everything, in my view.

Carrite (talk) 02:47, 4 February 2013 (UTC)

Disclaimer
I did not double count, triple count, or quadruple count, but I did my best. Carrite (talk) 02:54, 4 February 2013 (UTC)

Discussion
You argue that copyvios in hidden comments are "very minor violations", which is very debatable, certainly considering that your own research indicates that often, other editors undid the hiding, probably finding the information interesting, and (obviously) unaware that they were copyright violations. Similarly, unattributed copying from other Wikipedia pages (written by other editors) is more than a "very minor violation"; it's denying people their attribution. It's not as serious as the "real" violations of inserting copyrighted text, but there's no need to label it "very minor" either. You also make strange claims, like with Lloyd Espenschied: "Note: And subsequently returned from limbo to mainspace, apparently not a copyvio after all..." No, it was deleted as a copyvio, and recreated with different contents afterwards. Among the deleted pages is Mabel Garrison, which was created in 2010, and Mod-2 from 2009. Looking at the other CCI pages quickly shows Perry Wilbon Howard (created 2008), this late 2010 edit, this March 2011 edit, ... It seems as if, while the majority of copyright violations (in article text) were from 2006, the problem continued afterwards right up to the CCI. And as has been established, he then continued with copyright violations in his image uploads and in references and links as well... Fram (talk) 08:54, 4 February 2013 (UTC)


 * No, the hidden comments are not "very minor violations"; they are huge PROBLEMS, but are arguably not copyright violations at all, per se, since the information is not really "published." Lawyers could debate that. The point is, it is something that needs to stop, but it is not the essence of the basic copyvio problem here. Transwiki copyright improprieties are impolite and technically illegal but do not put the project at risk. Carrite (talk) 19:45, 4 February 2013 (UTC)


 * We're not just concerned about the legal implications here. One of our core principles is that Wikipedia is a free encyclopedia, and that anyone is free to reuse our content provided they comply with some very basic requirements. Copyright violations fly in the face of this. Whether or not something is likely to lead to a lawsuit is of very little relevance to deciding whether or not it is a problem. Information included in hidden comments is undoubtedly "published" - it's made available on a publicly available webpage. That the webpage in question happens to be an edit window doesn't change that. Hut 8.5 09:37, 5 February 2013 (UTC)


 * I differ with several things you assert, but I won't argue the points of difference here. Carrite (talk) 03:27, 6 February 2013 (UTC)

Quotes
Did we ever have any consensus to systematically remove all quotes from Richard's articles? I see that Carrite has already removed them from all the userspace drafts. Articles like Bayview – New York Bay Cemetery and Susannah Lattin continue to carry large quotes. 103.6.159.87 (talk) 16:14, 16 March 2016 (UTC)


 * Just boldly remove this gunk. It's not a copyvio under the law but it doesn't need to be there either and Norton's stuff needs to be clean as clean to pave the way for his eventual (long overdue) rehabilitation to all editing rights. Carrite (talk) 17:12, 16 March 2016 (UTC)