Wikipedia:Bots/Requests for approval/OgreBot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

User:OgreBot
Operator:

Automatic or Manually assisted: Manually assisted.

Programming language(s): Perl

Source code available: No, unless requested.

Function overview: Bot will assist the deletion of images on commons via a two-step process. The bot will not actually delete any pages: I will do that with my account.

Links to relevant discussions (where appropriate): Please see the thread I am creating at AN: WP:AN.

Edit period(s): Supervised, thus only when I'm available to edit.

Estimated number of pages affected: Thousands (until backlog cleared). Hundreds per day.

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): N

Function details:

Stage one

Bot will look at 100 images at a time in Category:Wikipedia files on Wikimedia Commons.
 * 1) It will determine if each Commons image is a duplicate of the en image.
 * 2) It will determine if the Commons image has appropriate licensing to match the en image.
 * 3) It will determine if the Commons image page lists the uploader at en.

Bot will then print out a list via the PHP page on my local server, printing something in this format:
 * [checkbox here] [250px of en image] [250 px of commons image], [commons link here], [wording indicating if image is dupe: if not, print out each's resolution], [wording indicating if licensing is right], [wording indicating if uploader is linked], [uploader username] [Wikitext for en image], [Wikitext for commons image], [textbox for why image was not approved, may remain blank].

I will manually check the box next to each image to approve it for deletion or not.

Stage two (see )

I will input a list of images from Category:Wikipedia files with a different name on Wikimedia Commons for the bot. I will indicate either that I am approving the image or not approving the image. If I indicate I am not approving the image, I may also include a nsd or npd flag; I have no plans to worry about a puf tag at this time.
 * If I approve the image, the bot will unlink all instances from en and replace them with the instances on commons.
 * The bot will verify there is not a superseding and conflicting image on en.
 * The bot will ensure that resolution information is maintained properly (in case of higher res on commons).
 * If I do not approve the image, the bot will tag the image with . If I choose the nsd or npd flags, the bot will tag the image with {{subst:nsd}} or {{subst:npd}} and notify the original uploader.
 * If I leave the edit summary blank, the bot will simply ignore the image.
 * Where there are any errors in this, bot will remember them. Errors may include protected pages, duplicate images found (could be a problem, say, if the image was hidden in comments), or confusion due to resolution issues (e.g., resolution is listed in an infobox, and it's too hard for the bot to parse). The bot will print out a list of errors on the server side so I can manually fix them.

Finally, once unlinking has been done, bot will print out a page with a "delete this image" button for me to click. It will also print out any errors which need to be fixed manually before the image can be deleted. Magog the Ogre (talk) 03:40, 26 September 2010 (UTC)
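Editorial illustration: the unlink-and-replace step described above can be sketched roughly as follows. This is not the bot's code (the source is unpublished and the bot is written in Perl); it is a minimal Python sketch, and the names `relink` and `_title_pattern` are hypothetical. It assumes only plain `[[File:...]]` / `[[Image:...]]` links are rewritten; galleries, infobox parameters, and links hidden in comments are not handled and would fall into the "errors to fix manually" bucket.

```python
import re
from typing import Tuple


def _title_pattern(filename: str) -> str:
    """Build a regex fragment matching a file title as MediaWiki treats it.

    The first letter is matched case-insensitively, and spaces and
    underscores are treated as interchangeable.
    """
    parts = filename.replace("_", " ").split()
    if not parts:
        raise ValueError("empty file name")
    first_word = parts[0]
    head = first_word[0]
    if head.isalpha():
        head_pat = "[" + re.escape(head.upper()) + re.escape(head.lower()) + "]"
    else:
        head_pat = re.escape(head)
    words = [head_pat + re.escape(first_word[1:])]
    words += [re.escape(part) for part in parts[1:]]
    return "[ _]+".join(words)


def relink(wikitext: str, en_name: str, commons_name: str) -> Tuple[str, int]:
    """Replace links to en_name with commons_name.

    Returns the new wikitext and the number of links changed. Link options
    such as "thumb" or a caption are left untouched.
    """
    pattern = re.compile(
        r"(\[\[\s*(?i:File|Image)\s*:\s*)"   # namespace prefix, any case
        + _title_pattern(en_name)
        + r"(\s*(?:\||\]\]))"                # title must end at "|" or "]]"
    )
    # A function replacement avoids backslash surprises in commons_name.
    return pattern.subn(lambda m: m.group(1) + commons_name + m.group(2), wikitext)
```

For example, `relink(page_text, "NameOnEnglish.jpg", "NameOnCommons.jpg")` returns the rewritten text together with a count; a count of zero on a page that is reported to use the file would be one signal that the usage is in a form this sketch does not cover.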

While I can understand everyone is worried this will not save time, I must whole-heartedly disagree. I've spent time on the backlog, and I spend a huge amount of time on the menial work I've listed above, all of which the bot can do.

Discussion
Frankly, I am doubtful whether this will work well. First, I doubt this on grounds of practicality. If it were easy to validate automatically, Metsbot would still be running. Again, if it were easy, Commonshelper would do a much better job than it does. Second, I question the necessity. Some form of move-to-commons backlog has existed since forever, either of images moved and needing processed, or of images not yet moved. The sky has not yet fallen and probably never will. Next, I believe that this proposal aims to solve the wrong problem. So long as local upload of free content is allowed by default, rather than redirecting all free uploads to Commons unless the uploader jumps through hoops (similar to what it required to get a blank upload form), the problem will never go away. The very first technical step in resolving these backlogs should be to prevent them growing further by reducing local free content uploads to an absolute minimum and maximising direct uploads to Commons. Finally, with so much of the backlog having been processed already, I would argue that the remaining images include an abnormally high proportion of crap. That is to say, images which should not have been uploaded to Commons, images which are licensed incorrectly, images which lack descriptions and sources, and so on. These should be processed manually with due consideration of the appropriate action - deletion included - rather than being assumed to be ok unless some glaring error is picked up automatically. For these reasons, I strongly oppose any automation of this process. [An afterthought: I would have fewer objections to the non-backlog, that is to say the newest uploaded-to-Commons categories, being processed with automated assistance.] Angus McLellan (Talk) 16:12, 26 September 2010 (UTC)

All this is already available in various templates. The problem is that the more automation you build in, the weaker the sanity check on the copyright status becomes. ©Geni 16:14, 26 September 2010 (UTC)


 * To respond to both of your concerns: the bot is something I'm already considering writing for myself, without making any edits, to assist in the manual deletion. The only part that I really need approval for is the ability to unlink images automatically after I've already approved them for deletion. And of course, again, I'm reviewing each image manually, ensuring there are no obvious copyright issues as a sanity check. Finally, on the concern about Metsbot: again, this information is all something I'm going to create anyway on the back end (no edits = no approval needed), and it's only meant to assist me as the deleter; it will not do anything radical. Magog the Ogre (talk) 18:19, 26 September 2010 (UTC)


 * I'm a bit confused on how this bot will work, but I'm not sure it is entirely necessary. As the system already shows on the bottom of every image page whether it is a duplicate of some Commons file or not, the likelihood of the uploader's name being missed by the bot and whatnot seems too great to be efficient. Also, what is necessary is the validation of the local files first—if a local file never listed the source and/or author's name, both that file and the Commons one should be tagged. A bot would never be able to help identify this, so those checks would need to be done manually for each image anyway. Basically, I don't think this will save enough time to be necessary at this point. / ƒETCH COMMS  /  01:01, 27 September 2010 (UTC)

Restate of purpose
OK, it's become obvious I did a poor job of selling this bot and explaining its actions. Please ignore the entirety of stage one above. That's already something I'm going to write, and I'm doing it for me because it saves time. But it doesn't involve any bot edits, and as such doesn't need approval. The only important edits that this bot will make are:


 * I will input a list of files from Category:Wikipedia files with a different name on Wikimedia Commons that have been manually reviewed by me as acceptable transfers to commons. Not by the bot, by me. I have been unclear about this, apologies. The bot will then complete step two above: unlink the English image where acceptable, and replace it with the commons image. E.g., File:NameOnEnglish.jpg -> File:NameOnCommons.jpg. I may also input information into the bot indicating I've declined to transfer the image; the bot will then replace the {{subst:ncd}} tag with, and possibly add a {{subst:nsd}} or {{subst:npd}} tag to the image if I specify (and warn the uploader). This is really only a semi-automated bot; frankly, I could do it in JavaScript, which wouldn't need bot group approval; however, it would be much less time consuming to have the bot do the edits, rather than my browser. Magog the Ogre (talk) 01:24, 27 September 2010 (UTC)
 * OK, so the bot is just unlinking after you delete the files and not actually doing any "real" reviewing? That seems reasonable enough; however, I have a suggestion:
 * Make a template to place on the image page of a reviewed file. This should tell the bot to change the links for that image.
 * Let any admin add the template onto an image he/she reviews.
 * Have the bot change the template once the links are updated.
 * This should place the image in a new category for speedy deletion under F8, as all of the images will have been checked by admins beforehand and shouldn't require any extra checks, so that category can be cleared out quickly and daily.
 * I think it is possible to have a template check the revisionuser who added it, name the admin on a parameter, and have the bot relink only the images that have been checked by an admin. This is similar to the John Bot II system and is more efficient, as more users can help out. Does this sound reasonable? / ƒETCH COMMS  /  02:26, 27 September 2010 (UTC)
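Editorial illustration: the template-driven queue suggested above could look roughly like this on the bot side. This is a hedged Python sketch, not anything from the actual bot. The template name `Reviewed for Commons relink`, its `admin` and `commons` parameters, and the function names are all hypothetical. The sketch only reads the parameter from the wikitext; confirming that the named admin really added the template would need the page's revision history, which is omitted here.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical template name; no such template is named in the discussion.
TEMPLATE_NAME = "Reviewed for Commons relink"

_TEMPLATE_RE = re.compile(
    r"\{\{\s*" + re.escape(TEMPLATE_NAME) + r"\s*((?:\|[^{}]*)?)\}\}",
    re.IGNORECASE,
)


def find_review(wikitext: str) -> Optional[Dict[str, str]]:
    """Return the named parameters of the review template, or None if absent."""
    match = _TEMPLATE_RE.search(wikitext)
    if match is None:
        return None
    params: Dict[str, str] = {}
    # group(1) looks like "|admin=Name|commons=File.jpg" (or is empty).
    for piece in match.group(1).split("|")[1:]:
        key, sep, value = piece.partition("=")
        if sep:
            params[key.strip().lower()] = value.strip()
    return params


def should_relink(wikitext: str, admins: Set[str]) -> bool:
    """True only if the page carries the template and names a known admin."""
    review = find_review(wikitext)
    return review is not None and review.get("admin") in admins
```

Under these assumptions, the bot would relink only pages for which `should_relink` is true, then change the template so the file lands in the deletion category described above.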


 * I can do that, but I might want to add it as an additional function, because it will create an extra step. The admin would have to a) add the template, b) wait until the bot is run again by me, and delinks, then c) delete the image. But if you think this is a good idea, I'd be happy to implement it instead or in addition. Magog the Ogre (talk) 02:42, 27 September 2010 (UTC)
 * Well, any admin could delete the file later, like clearing out the CSD categories every 00:00 UTC or whenever, and that's a quick task with a batch-delete script. If you can add this function, it would let more users help out in clearing the backlog. / ƒETCH COMMS  /  04:11, 27 September 2010 (UTC)


 * Alright; when do I start? Magog the Ogre (talk) 01:02, 28 September 2010 (UTC)

although given my work with images, I will recuse from final approval.  MBisanz  talk 22:45, 19 October 2010 (UTC)
 * Alright, done. I went quite a bit over on the 50, more like 90, because the last image I instructed the bot to delink had 50 transclusions (!). Magog the Ogre (talk) 05:25, 26 October 2010 (UTC)
 * BAG assistance needed - when can I get an update here? Magog the Ogre (talk) 22:47, 1 November 2010 (UTC)
 * While I can't view the deleted images, it looks like everything went OK, and I don't see much harm in approving this request. Gigs (talk) 02:08, 16 November 2010 (UTC)
 * Looks fine to me (I can see the deleted images). Of course, this is a task for which most of the hard work is still performed by a human, and in that sense there's less to approve (the bot operator remains responsible for their actions). - Jarry1250 [Who? Discuss.] 17:13, 16 November 2010 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.