Wikipedia:Bots/Requests for approval/J Milburn Bot 4


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.

User:J Milburn Bot
Operator:

Automatic or Manually assisted: Automatic

Programming language(s): AWB

Source code available: No.

Function overview: Tagging non-free images with multiple versions to have the old versions deleted.

Links to relevant discussions (where appropriate): None, although I will leave notes in appropriate places.

Edit period(s): Several runs. (Probably a single big run/several moderate runs to deal with the current backlog, and then a small run every week or month or something to pick up any new articles- whenever I turn it on.)

Estimated number of pages affected: Thousands

Exclusion compliant (Y/N): Yes, I believe AWB does that automatically.

Already has a bot flag (Y/N): Yes.

Function details: Old versions of non-free files fail our non-free content criteria, as they are non-free files which are no longer in use. So, we have literally thousands (perhaps tens of thousands) of files sitting around which fail our criteria. I am not proposing a bot to delete them (in some cases, the old versions may be useful); I am proposing that the bot tag the old versions for deletion, giving interested users time to revert to an old version or upload the old version separately as appropriate. We have a database report with all files that will be affected, which can be updated manually or automatically, or made far longer if necessary.
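
As a rough illustration of the tagging step described above, here is a minimal Python sketch. It is not the bot's actual AWB configuration (no source is available). The `FilePage` record, its `upload_count` field, and the choice of `{{Non-free reduced}}` as the tag are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption: the tag is the "Non-free reduced" template named in the discussion;
# the real run might use a different template or extra parameters.
TAG_NAME = "Non-free reduced"
TAG = "{{" + TAG_NAME + "}}"


@dataclass
class FilePage:
    """Hypothetical record for one entry of the database report."""
    title: str
    wikitext: str
    upload_count: int  # uploaded file versions, including the current one


def needs_tag(page: FilePage) -> bool:
    """True if the page has old file versions and is not already tagged."""
    has_old_versions = page.upload_count > 1
    # Case-insensitive prefix match, so a tag with parameters also counts.
    already_tagged = ("{{" + TAG_NAME).lower() in page.wikitext.lower()
    return has_old_versions and not already_tagged


def tag_page(page: FilePage) -> str:
    """Return the wikitext with the tag prepended, or unchanged if not needed."""
    if not needs_tag(page):
        return page.wikitext
    return f"{TAG}\n{page.wikitext}"
```

The sketch only decides what to tag. It deletes nothing, and it makes no attempt to tell a plain size reduction from any other kind of new upload.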

Discussion
What do you mean by "old versions"? If you mean previous revisions of non-free images, these need to be kept. Non-free images that have been modified here have a dual copyright: a non-free copyright on the original image and a free copyright on the derived work that includes modifications. The free copyright requires us to keep at least a list of all editors who have modified the image, and by custom we keep all the old revisions. We do not usually even delete old revisions of articles that are patent copyright violations. Carl (CBM · talk) 22:12, 19 September 2010 (UTC)
 * Also, at least some of those have multiple versions for other reasons than being reduced in size. Just from skimming through a handful of them I see File:08 - Koi No Dance Site.jpg and File:06orphansaz2.jpg were actually increased in size, and File:'Caucasian-Hawaiian', oil on canvas painting by Isami Doi, 1939.jpg was just uploaded twice for some reason. Is there a pressing need to review all of these how many thousands of articles? I'm not seeing any requirement regarding old versions of images in the NFCC. VernoWhitney (talk) 22:18, 19 September 2010 (UTC)
 * I double-checked WP:NFCC, and it does not seem to explicitly cover old revisions at all. That should be addressed before any large-scale action. This is not to say I am in favor of non-free images in general, but in all my dealings with NFCC I have always read "file" to mean a page such as File:Foo, including all revisions. Carl (CBM · talk) 22:22, 19 September 2010 (UTC)
 * Carl, I think you've misunderstood what I mean by "old versions"- I do not mean old revisions, I mean the old files, which do need to be deleted- they are non-free files not used in the article space, which quite clearly fail NFCC#7. As an example, this image is used in various articles, but the old version is not. (Also, for what it's worth, we do often delete old versions of articles that are copyright violations if a complaint is made). As for the issue of not all images being reduced as such, that's not actually an issue- the template is the only one we have for deleting old non-free content; if there is non-free content that is not in use, it has to go. J Milburn (talk) 23:15, 19 September 2010 (UTC)
 * I might have misunderstood - but from my viewpoint a "file" as described by NFCC includes both all the "text" revisions of the File: page and all the "file" revisions as well. So all the links you provided are to the same "file", namely File:'Symphony No. 1, The Transcendental', oil on canvas painting by Richard Pousette-Dart, 1941-42, Metropolitan Museum of Art.jpg. If the most recent version of the file is used in mainspace, then the entire File page has generally been thought to satisfy the "at least one use" requirement. So if you read NFCC to refer not to pages in the file namespace, but to actual revisions of the images they contain, that interpretation should be clarified within NFCC itself. It's not clear to me that it's worth the effort to manually review thousands of images that have always been considered in line with NFCC. As I pointed out, if any non-trivial change has been made then the new revision should have a dual copyright, and we would need to preserve the edit history somehow for attribution requirements. Carl (CBM · talk) 23:29, 19 September 2010 (UTC)
 * No, we don't. We don't need to leave non-free content that is not in use in the article space lying around. It's been a very long-standing practice to delete old versions of files- hence the existence of the template. I'm really, really not getting your objection. What we have are non-free files that are not in use- regardless of whether a newer version of the file is in use. To be frank, this is going to happen one way or the other- either I'm bored out of my skull for a week when I'd rather be working on articles, or a bot does it... J Milburn (talk) 23:33, 19 September 2010 (UTC)
 * I'm still not seeing any policy basis for this. A file is a page name, such as File:'Symphony No. 1, The Transcendental', oil on canvas painting by Richard Pousette-Dart, 1941-42, Metropolitan Museum of Art.jpg. That file is in use, so it does not justify any deletions for unused non-free files. There's nothing in NFCC that mentions deep links onto the image servers. Moreover, if someone edits a non-free file, I don't see how we can delete the revision history and still satisfy the attribution requirements for CC-BY-SA. Could you address the two points I am making: (1) the actual text of NFCC and (2) the attribution issue. Carl (CBM · talk) 00:22, 20 September 2010 (UTC)
 * The text of the NFCC- the word "file" is not nearly as ambiguous as you are making out. You are more of a techy than I am, but, come on, this and this are not the same file. Both are non-free, and one is not used- therefore, it needs to be deleted. As for the need for attribution, if there is any need (unlikely- the nature of non-free images is that they belong to someone else, but I appreciate that it may be required in some cases) then we have the image page for that- the image page is where we attribute the author/licensing of the image, not in the page history. J Milburn (talk) 11:17, 20 September 2010 (UTC)
 * I would say they are two revisions of the same file. I could see the bot tagging just the images that have been reduced in size, because that presumably is not creative enough to give a new copyright on the derived work. But when the newest revision has been obtained by manually editing or optimizing the previous one, I have been under the impression that a new copyright on the derived work is created, in addition to the existing copyright on the image. Those are the images I'm concerned about, since it seems like the bot will tag them as "deletable" even though we should keep them for attribution. I'm not sure that a BRFA is the right place to sort this out. Carl (CBM · talk) 11:42, 20 September 2010 (UTC)
 * They are not the same file; it's simply wrong to claim they are. Like I say, if there is any new copyright on the image (which in the vast, vast majority of cases, there won't be) it should be mentioned on the image page anyways. J Milburn (talk) 11:50, 20 September 2010 (UTC)
 * When I edit an article, it doesn't make a new article, just a revision of the old one. By analogy, when I edit a file, it doesn't make a new file. I'm not sure that we have ever given explicit advice that you have to update the image description when you edit a non-free file; the revision history is supposed to keep track of the list of editors. So if we delete the revision history, we lose that trail. Carl (CBM · talk) 11:56, 20 September 2010 (UTC)
 * I guess what I'm saying is that the bot proposal seems to assume that a non-free image can have at most one non-deleted revision. I have never seen a policy that says that, and the policy should be worked out before it's implemented. Carl (CBM · talk) 12:02, 20 September 2010 (UTC)
 * Ah, now there's a differentiation to be made here. No, when you edit an article, you do not create a new article, just as when you edit an image page, you do not create a new page. However, when you upload a new file, you create a new file- the clue's in the fact that you're uploading a new file. The policy is pretty clear, and this has been the practice for a long time; we don't need yet another policy page expanding on this particular case. Anyway, like I said, the bot will not be actually deleting anything- it won't be enforcing anything, it'll just be tagging so that others can enforce it. J Milburn (talk) 12:28, 20 September 2010 (UTC)
 * Uploading a new version of a file with the same name is simply the way that we edit files here, since we can't edit them directly in the browser. NFCC refers to the page in the file namespace. I've had plenty of experience with non-free images, and I know that neither WP:NFCC nor WP:NFC says anything like "a non-free image can only have one non-deleted revision of the image". Indeed neither of those pages mentions the issue of revised images at all. This needs to be resolved there before you start tagging files. Carl (CBM · talk) 13:19, 20 September 2010 (UTC)


 * I see absolutely no benefits. Nowhere does NFCC mention anything about old file revisions. And since the old revisions are not 'used', in any namespace (they are just stored), they technically are not governed by NFCC. I have come across several situations where old revisions were deleted, making it impossible to revert inferior revisions (for non-admins). It also removes the attribution to the original uploader. While non-free images technically do not require upload-attribution, it is a bad idea to remove any sort of history with regard to the files. — Edokter • Talk • 14:23, 20 September 2010 (UTC)


 * NFCC does apply, even if it has not been routinely applied before. Prior versions of files can be substantially different than the current version. If the prior versions are not in use (and by default they are not), there's no strong reason to retain them indefinitely. Temporarily, yes. Indefinitely, no. Wikipedia, despite having 371,000 non-free media files, is not a non-free media repository. There's no long term justification to retain non-free imagery that is not used. This bot isn't deleting things on sight, but rather tagging things. One thing of interest in looking at the linked report is the long gaps of time between versions. I picked five files at random, and found average prior version time spans to be measured in years. There's no justification for that. Since a human will still be processing these requests, I don't see there's any cause for alarm. I do think that any admin that does delete prior versions needs to update the image description page to note who the original uploader was, if the current version is just a scaled version of the original upload. --Hammersoft (talk) 13:29, 21 September 2010 (UTC)
 * Note that the original uploader can be seen in the conventional page history anyway, as they created the page. J Milburn (talk) 13:43, 21 September 2010 (UTC)
 * We don't need just the original uploader; if the image is edited several times in a way that creates a new copyright, we need to preserve that just like article history. The template Template:Non-free reduced is less problematic because just reducing an image is not going to create a new copyright. But there's no sign that that template applies to all the images on the list - it would be inaccurate, for example, for an image that has been manually enhanced. And Template:Non-free reduced is a delayed CSD template, so there's no guarantee that the pages will be reviewed before they are deleted. I don't see any NFCC-based reason to force people to do that review. Carl (CBM · talk) 13:52, 21 September 2010 (UTC)
 * @Hammersoft: as I said, I do not see anything in NFCC that refers to old revisions. I have always interpreted NFCC to mean that if the File: page is transcluded then that page is "used", and that this covers all the revisions of the file page. Carl (CBM · talk) 13:52, 21 September 2010 (UTC)
 * The original uploader cannot be seen in the conventional page history for non-admins - see for example File:Hank3.jpg which I did not originally upload but the first version was deleted to remove PII. VernoWhitney (talk) 13:55, 21 September 2010 (UTC)
 * Terrible example, as the licensing is completely messed up on that image anyways. It's not GFDL, and the uploader requires no attribution- it's PD or it's non-free. J Milburn (talk) 15:14, 21 September 2010 (UTC)
 * Regardless of the licensing for that particular example, it does show that your page history claim regarding attribution is false. VernoWhitney (talk) 15:24, 21 September 2010 (UTC)
 * We don't use page history for citations on images, as I have said many, many times. We use the image page. Furthermore, in the vast majority of cases (one where revdel/oversight has been used is a real one-in-a-million case) the page history will have it, and, furthermore furthermore, in the vast majority of cases (your case included) the uploader has no claim on the image anyway. Yes, they do sometimes (FOP issues, for instance) but in those cases, the attribution should be on the image page itself. J Milburn (talk) 15:33, 21 September 2010 (UTC)
 * I have never seen any guidance to the effect of "when you edit an image you need to also add more text to the image description page" (certainly not on NFC) and I don't think that it's common practice either. Normally the upload summary takes the role of documenting the changes that have been made, just like the edit summary for an article edit. Carl (CBM · talk) 19:07, 21 September 2010 (UTC)
 * If you want to be attributed for an image, you will add details to the file page. The point still stands that who uploaded which version of the image is, in the vast majority of cases, not at all important with regards to NFC. J Milburn (talk) 19:12, 21 September 2010 (UTC)
 * Most people assume that they will be attributed by having their user name in the file history. Your proposal here is not aimed at "the vast majority of cases"; you proposed to tag all the pages on the report for delayed deletion, with no attempt to sort out which ones have important attribution history in the file history. Carl (CBM · talk) 19:48, 21 September 2010 (UTC)
 * What's your bloody evidence for that?! Our upload guidelines (and our policies) note that authors need to be attributed- if people want to be attributed, they should add their name to the image page. J Milburn (talk) 19:54, 21 September 2010 (UTC)
 * My evidence is that I have never seen an image where people duplicate the file upload history in the text of the image description page. But when I check images that I know have been edited (like File:Pi-unrolled-720.gif), the only way to see that they were edited is to look at the file history. Regardless of whether the original image was free or non-free, any non-trivial edits made here should have their own free copyright in addition to the non-free copyright of the original image, right? Carl (CBM · talk) 20:15, 21 September 2010 (UTC)
 * I don't think so. For our purposes, they certainly remain non-free... And if the editing is anything other than trivial, why are the edits being made anyways?! Could you find me a single case where this would be a problem? J Milburn (talk) 20:18, 21 September 2010 (UTC)
 * Yes, they remain non-free, but they also have a second copyright from the person here who edited them, right? That new copyright is what I am concerned about. The non-free copyright is what is handled in the NFUR, while the edits of wikipedians are shown in the file history. If we delete the file history, it obscures the editing history of the file. Carl (CBM · talk) 20:24, 21 September 2010 (UTC)


 * @Carl. I sometimes see arguments of the structure "Your argument is not stated in policy". That sort of argument is strict constructionism. If we were to abide by that philosophy, we'd need a whole new bureaucracy with extremely detailed policy to manage this project. So, no, the policy doesn't say "prior versions of non-free images should be deleted". It does say "Non-free content is used in at least one article." (or else it is orphaned and subject to deletion). Prior versions of non-free files are not in use. They might be linked to in discussions, but they are not in use by default. --Hammersoft (talk) 14:48, 21 September 2010 (UTC)
 * My point is that the prior revisions are "used" in the sense of NFCC, because the File: page that they are revisions of is transcluded onto at least one article. NFCC is not about deep links on web servers, it's about whether the entire File: page for a non-free image should be deleted. The same is true, for example, for WP:CSD - it applies to a template page as a whole, not to individual revisions of a template page. Carl (CBM · talk) 19:07, 21 September 2010 (UTC)
 * As I have explained, there is a big difference between revisions of a page and individual files. No one believes we should delete old revisions of image pages. J Milburn (talk) 19:10, 21 September 2010 (UTC)
 * As I understand it, that is exactly what you are proposing. Image pages have two types of revisions: text revisions and file revisions. Both of them are used to record attribution history. They are separate because the editing process is different for the image part than the text part. Carl (CBM · talk) 19:48, 21 September 2010 (UTC)
 * Right. So, "files" refer, not to files, but to file pages, and revisions refer to both revisions and files. Any other definitions you'd like to tell me about? I don't get it. What's the real reason you're opposed to this so violently? J Milburn (talk) 19:53, 21 September 2010 (UTC)
 * Right: I think "revisions" refers to file revisions and text revisions, and that NFCC is about file pages, not about revisions. Carl (CBM · talk) 20:11, 21 September 2010 (UTC)
 * It's fairly evident at this point that we're not going to reach a consensus on this page. I would suggest starting a thread on WP:VPP to see if there really is consensus for this task. Mr.Z-man 20:19, 21 September 2010 (UTC)
 * Ok, withdrawn, or whatever. I'll start a discussion to find out whether non-free content criterion 7 exists. I thought it did, but apparently not. J Milburn (talk) 20:21, 21 September 2010 (UTC)

I don't understand bots and bot approval, but I would strongly oppose anything from J Milburn in that regard, as his actions in relation to images are disruptive (and not only images) to the point where the issue may end up before ArbCom. SlimVirgin talk| contribs 20:31, 21 September 2010 (UTC)
 * As usual, SlimVirgin, your insightful, reasonable comments only serve to calm the situation. J Milburn (talk) 21:14, 21 September 2010 (UTC)


 * JM, if this situation continues, I intend either to ask for community action, or to approach the ArbCom to ask that you be desysopped. It's not just the images, but the whole attitude. You wade heavily into sensitive situations, whether images of the Holocaust, or the issue of pedophiles editing Wikipedia, and you strike up the most extreme position you can think of, in post after post, on page after page, then when the situation swings against you, you start cursing at and blaming everyone else and/or going on wikibreak. It's very far from what's expected of admins. To add a bot to that mix would be a disaster. SlimVirgin  talk| contribs 21:18, 21 September 2010 (UTC)

Can this please be closed now? There is nothing more to be said. J Milburn (talk) 21:25, 21 September 2010 (UTC)
 * @SlimVirgin: You didn't like J Milburn putting File:GermanPoliceTormentingJew.JPG to a deletion debate, and posted there an insinuation of some anti-holocaust spree on his part, and you're after HIM about his attitude? Put down the knives and back away. J Milburn isn't your enemy. Also, J Milburn already has a bot, previously approved for various tasks. No apparent disaster caused. --Hammersoft (talk) 21:30, 21 September 2010 (UTC)
 * I don't see any insinuation of an anti-holocaust spree in SV's post, just a (justified or otherwise) allegation of poor judgment of a more general sort, plus a recognition that combining bad judgment with bots is a perennial recipe for madness. Having seen this kind of thing many times before (not from J Milburn, who I've never had any contact with, but in general), I know exactly why SV is concerned.  I also think CBM is right about keeping old file versions in the revision history. 66.127.54.226 (talk) 08:45, 22 September 2010 (UTC)

per Anomie⚔ 03:41, 23 September 2010 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.