Wikipedia:Bots/Requests for approval/ImageRemovalBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

ImageRemovalBot
Operator: Carnildo

Automatic or Manually Assisted:
 * Automatic

Programming Language(s):
 * Perl

Function Summary:
 * This bot will periodically go through the deletion log, looking for images that have been deleted but not removed from articles, and will remove them.

Edit period(s) (e.g. Continuous, daily, one time run):
 * Hourly

Edit rate requested: X edits per TIME
 * 6 edits per minute, will increase if needed to keep up with deletions

Function Details:
 * Every hour at 35 minutes past the hour, this bot will retrieve the log of recently-deleted images. For each image in the log, it will check first to make sure it's still deleted, and second to see if the image is still used in any pages.  If it finds pages still linking to the image, it will remove the image, using the same removal code that OrphanBot uses.


 * The bot will only remove images in the article, category, and portal namespaces. Images used in the template namespace will be marked for human removal.  All other uses will be ignored.
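
The hourly pass described above could be sketched roughly as follows. This is an illustrative Python sketch, not the bot's actual Perl code: the function name `plan_actions` and the two callables passed into it are hypothetical stand-ins for the deletion-log and image-usage lookups, which the request does not describe in detail.

```python
from typing import Callable, Iterable, List, Tuple

# Namespaces the bot edits directly, per the function details above.
REMOVE_NAMESPACES = {"article", "category", "portal"}
# Uses in this namespace are only marked for human removal.
FLAG_NAMESPACES = {"template"}


def plan_actions(
    deleted_images: Iterable[str],
    is_still_deleted: Callable[[str], bool],
    pages_using: Callable[[str], Iterable[Tuple[str, str]]],
) -> List[Tuple[str, str, str]]:
    """Return (action, image, page) tuples for one pass over the deletion log.

    is_still_deleted and pages_using are hypothetical callables supplied by
    the caller; pages_using yields (page_title, namespace) pairs.
    """
    actions = []
    for image in deleted_images:
        # First check: skip images that have been restored or re-uploaded.
        if not is_still_deleted(image):
            continue
        # Second check: look at every page that still uses the image.
        for page, namespace in pages_using(image):
            if namespace in REMOVE_NAMESPACES:
                actions.append(("remove", image, page))
            elif namespace in FLAG_NAMESPACES:
                actions.append(("flag", image, page))
            # Uses in any other namespace are ignored.
    return actions
```

Separating the decision ("remove", "flag", or nothing) from the actual editing keeps the namespace policy easy to check in isolation.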

Discussion

 * How will it detect whether an image was moved to Commons? It would still be deleted in the en.wikipedia database. Also, will it log its removals somewhere? Tito xd (?!? - cool stuff) 01:25, 13 July 2007 (UTC)
 * A Commons image will not have the "No file by this name exists; you can..." wording at the top of the page, and it will have the Commons banner. Either of these is sufficient to tell the bot that it shouldn't do anything with the image.
 * There's not really any sensible place to log removals. The only permanent record is going to be the bot's edit history. --Carnildo 01:54, 13 July 2007 (UTC)
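
The Commons check described in the reply above amounts to two text tests on the image description page. A minimal Python sketch, assuming the page text is already available as a string: the notice wording is quoted from the reply, while `COMMONS_BANNER_MARKER` is a hypothetical placeholder, since the discussion does not say how the bot recognises the banner.

```python
# Wording quoted in the reply above; shown on pages for files that do not exist.
MISSING_FILE_NOTICE = "No file by this name exists"
# Hypothetical marker for the Commons banner; the real test is not specified.
COMMONS_BANNER_MARKER = "commons-banner"


def should_skip_image(description_page_text: str) -> bool:
    """Return True if the bot should leave this image alone.

    Either signal is sufficient: the missing-file notice is absent,
    or the Commons banner is present.
    """
    has_missing_notice = MISSING_FILE_NOTICE in description_page_text
    has_commons_banner = COMMONS_BANNER_MARKER in description_page_text
    return (not has_missing_notice) or has_commons_banner
```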
 * I would recommend creating a log at User:ImageRemovalBot/Log or a page like that. Matt/TheFearow (Talk) (Contribs) (Bot) 09:18, 13 July 2007 (UTC)
 * Why? Special:Contributions/ImageRemovalBot will contain the same information.  Further, if admins start to depend on the bot for image removals, the log could grow by about 100kb a day. --Carnildo 09:50, 13 July 2007 (UTC)


 * Damn, I was just thinking about writing this bot. ;) I haven't written any code yet but it seems like a pretty straightforward idea. It might be a good idea to replace deleted images with a placeholder image instead of simply delinking them to avoid any changes in layout though. --S up? 11:57, 13 July 2007 (UTC)
 * It's going to remove images the same way OrphanBot does: usually by commenting them out, or by replacing them with a placeholder in the case of certain infoboxes. There's no good way to replace images with placeholders in the general case: the bot would need one placeholder for each image aspect ratio it encounters, and would need some way to tell when a placeholder is inappropriate.  Could you give some examples of situations where the layout would be damaged by removing an image? --Carnildo 18:20, 13 July 2007 (UTC)
 * It would create a big difference, as it would only show what images were deleted, not every time one was deleted. It makes it a lot easier to find out which ones it did. Matt/TheFearow (Talk) (Contribs) (Bot) 23:24, 13 July 2007 (UTC)
 * What are you replying to, my comment about the log, my comment about placeholder images, or something else? --Carnildo 02:11, 14 July 2007 (UTC)
 * Why not either replace all images with a placeholder the same size, or use several place-holders of different aspects and choose (or even stretch) the closest one? There is a lot of value in leaving the deleted image in there because a number of images are deleted inappropriately, in ways subject to being restored, or in situations where a replacement image should and will be added.  If all of the images from an article are suddenly deleted it creates a mess for the editor to clean up...though leaving them in the comments is a good idea that will help minimize the damage.  Why not leave some visible sign?  Or a two step process where the images are replaced with a placeholder for a week or so to give people time, and if nobody has fixed it in a week, take out the placeholder.  Wikidemo 18:46, 15 July 2007 (UTC)
 * Could you give a concrete example of how a placeholder image would help an article? Specifically, something like this article (the deleted images are a gallery of non-free images). --Carnildo 20:09, 15 July 2007 (UTC)
 * Just thought of another reason to leave something in there. If someone does rehabilitate or replace the image, they should quickly be able to search for all the places the image had been used with a "what links here" or some other facility.  Anything that achieves that goal is good, it doesn't have to be a lingering link in the article file but that seems the most straightforward.  Wikidemo 18:48, 15 July 2007 (UTC)
 * "What links here" would only give a meaningful result if the bot uploaded a placeholder image for each and every image it removed. If someone wants to see all the places where the image was still in use after being deleted, they can view the bot's edit log: because of the way the bot works, all edits removing a given image will be grouped together. --Carnildo 20:09, 15 July 2007 (UTC)
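
For readers unfamiliar with the "commenting out" approach mentioned in this thread, here is a rough Python sketch of the idea. It is not OrphanBot's removal code; the function name and the regular expression are assumptions, and the pattern only handles one level of nested links inside a caption.

```python
import re


def comment_out_image(wikitext: str, image_name: str) -> str:
    """Wrap every inline [[Image:...]] use of image_name in an HTML comment.

    Simplified sketch: the name must match exactly, and captions may contain
    ordinary [[links]] but no deeper nesting.
    """
    name = re.escape(image_name)
    pattern = re.compile(
        r"\[\[\s*Image\s*:\s*" + name + r"\s*"
        # Optional "|options|caption" part, allowing [[links]] in the caption.
        r"(?:\|(?:[^\[\]]|\[\[[^\[\]]*\]\])*)?"
        r"\]\]"
    )
    return pattern.sub(lambda m: "<!-- " + m.group(0) + " -->", wikitext)
```

Because the original markup survives inside the comment, an editor who restores or replaces the image can undo the removal by deleting the comment delimiters.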

Boy, this sounds like a great bot! I fully support it. But. . . is the bot able to reliably remove images from infoboxes? Sometimes infoboxes say "image=", sometimes they say "image=Image:Sample.png", sometimes they say "image=Sample.png", and sometimes they're even less standard. (Some poorly written infoboxes even break without an image -- not sure what you should do there.) In particular, I've seen some infoboxes that say "image=Sample.png|220px", and if you just remove "Sample.png", the infobox shows 220px, which isn't right. (Then again, it wasn't right to start with either.) I'm just wondering how smart your bot will be about this. – Quadell (talk) (random) 03:27, 16 July 2007 (UTC) P.S. And what about galleries? – Quadell (talk) (random) 03:28, 16 July 2007 (UTC)


 * Galleries are not a problem except in certain extremely rare cases: if the image is also used inline in the text of the article, or if the image is also in a comment. Infoboxes are more of a problem, since anything other than the "parameter=[[Image:Sample.png]]" form needs customized instructions for removal, but I've been working on a system that will make writing these rules simple.  The bot's already got rules for the most common infoboxes, and I'll be adding more as the bot finds them. --Carnildo 04:25, 16 July 2007 (UTC)


 * You could try using templates=expand with action=raw, find where the image is added, then remove it. The code would be a bit tricky, but I think I could do it with Text_Diff.  — Madman bum and angel (talk – desk) 21:16, 16 July 2007 (UTC)


 * It's easier and less likely to break if I write up the rules as needed. --Carnildo 00:06, 17 July 2007 (UTC)
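
To make the infobox problem concrete, the following Python sketch blanks an image parameter in the forms Quadell lists ("image=Sample.png", "image=Image:Sample.png", "image=Sample.png|220px") as well as the bracketed form. It is only an illustration of a per-parameter rule, not the rule system Carnildo describes; `IMAGE_PARAMS` and `blank_infobox_image` are hypothetical names, and the sketch assumes one parameter per line.

```python
import re

# Hypothetical rule set: parameter names known to hold an image.
IMAGE_PARAMS = {"image"}


def blank_infobox_image(wikitext: str, image_name: str, params=IMAGE_PARAMS) -> str:
    """Blank "| param = value" lines whose value refers to image_name.

    The whole value is cleared, so a trailing size such as "|220px"
    does not get left behind in the infobox.
    """
    name = re.escape(image_name)
    value_re = re.compile(
        r"^(?:\[\[)?(?:Image:)?" + name + r"(?:\|[^\]]*)?(?:\]\])?$"
    )
    line_re = re.compile(r"^(\s*\|\s*([^=|]+?)\s*=)\s*(.*?)\s*$")
    out = []
    for line in wikitext.split("\n"):
        m = line_re.match(line)
        if m and m.group(2) in params and value_re.match(m.group(3)):
            out.append(m.group(1))  # keep "| image =" and drop the value
        else:
            out.append(line)  # unknown parameter or different image: untouched
    return "\n".join(out)
```

Anything the rule does not recognise is left alone, which errs on the side of not damaging an infobox.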

Sounds like a useful task. Carnildo is an experienced bot operator and programmer and I trust his skills. Just be tricky about those edge cases that Quadell pointed out above, like when image names are passed as parameters to templates without using the Image: syntax. -- Cyde Weys 03:04, 18 July 2007 (UTC)
 * Is the bot going to use a maxlag setting? If so, what value? --ais523 17:17, 24 July 2007 (UTC)
 * It's going to use the same system that OrphanBot does: if an edit takes too long to complete, it will wait longer before making the next edit. --Carnildo 07:13, 30 July 2007 (UTC)
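
The throttling described in the reply above (wait longer after a slow edit) might look something like this Python sketch. Only the base delay follows from the request (6 edits per minute is one edit every 10 seconds); the slow-edit threshold, the cap, and the doubling and halving policy are assumptions made for illustration, not OrphanBot's actual behaviour.

```python
BASE_DELAY_SECONDS = 10.0   # 6 edits per minute, as requested above
SLOW_EDIT_SECONDS = 5.0     # hypothetical threshold for "too long"
MAX_DELAY_SECONDS = 300.0   # hypothetical upper bound on the wait


def next_delay(current_delay: float, edit_duration: float) -> float:
    """Return how long to wait before the next edit.

    A slow edit doubles the delay (up to the cap); a fast edit halves it,
    never dropping below the base delay.
    """
    if edit_duration > SLOW_EDIT_SECONDS:
        return min(current_delay * 2, MAX_DELAY_SECONDS)
    return max(current_delay / 2, BASE_DELAY_SECONDS)
```

Unlike a maxlag setting, this reacts to how long the bot's own edits take rather than to replication lag reported by the server.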

Removing images that were deleted on Commons is already covered by CommonsDelinker. ( [ →]O - RLY?) 17:12, 25 July 2007 (UTC)
 * ImageRemovalBot completely ignores images that are deleted on Commons. --Carnildo 07:13, 30 July 2007 (UTC)
 * Ok, just a clarification: It will remove instances of deleted images from articles and it will ignore images from commons? How will it handle images deleted because they are on commons? Does the bot check to see if they exist? Also, I would not approve this unless it outputs a list of all the images that it has gone through and deleted. Matt/TheFearow (Talk) (Contribs) (Bot) 23:38, 31 July 2007 (UTC)
 * An image moved to Commons looks just like an image that's been deleted and re-uploaded. The bot ignores such images.
 * The bot does not delete images itself. It removes images that have already been deleted.  A list of image removals is available in the bot's contribution logs; I will not provide a separate log page because such a log could easily grow by 30 MB a month. --Carnildo 00:31, 1 August 2007 (UTC)

It sounds like a good, useful idea, and as long as you properly deal with infoboxes, it should be fine. I recommend that you hardcode all the infobox formats into the bot with limits, so that the bot doesn't accidentally screw up infoboxes it doesn't recognize or templates with "image" attributes that aren't infoboxes. Have the bot check to see that the image attribute it's removing originates in a box that it knows, and if it doesn't recognize the infobox, err on the safe side and leave it. You should also have a page on the Wiki where a list of infoboxes that the bot knows can be found, and requests for additions made. Also, don't forget exceptions for things like Image:Example.jpg, if there are things like that anywhere (and there may be) that aren't supposed to be real images. Finally, I don't think a separate log page is necessary, but I wonder if the bot should leave any messages on talk pages about its removals? Andre (talk) 05:34, 1 August 2007 (UTC)
 * That seems like overkill considering how many edits the bot will make if successful. Almost every talk page will then end up with an ImageRemovalBot message. People can check the deletion log to see what happened with the image. Garion96 (talk) 06:22, 1 August 2007 (UTC)
 * Matt/TheFearow (Talk) (Contribs) (Bot) 05:39, 1 August 2007 (UTC)
 * Image deletion notices are already taken care of, and it is usually easy enough to find why an image was deleted. Likely, the uploader would have been notified anyway. Matt/TheFearow (Talk) (Contribs) (Bot) 06:25, 1 August 2007 (UTC)


 * A 20-edit run has been completed. --Carnildo 00:08, 3 August 2007 (UTC)


 * One issue: on Half a Mill, it did two edits in a row. Maybe a good idea is to consolidate those edits, and remove them both at the same time? Shouldn't be too hard. They were both within the same minute, right next to each other. This has happened several times. Apart from that, I would recommend moving the log to a user subpage, not the usertalk page. It keeps it clear for discussion. Once those are fixed, if it needs a higher edit rate to work, just ask. Matt/TheFearow (Talk) (Contribs) (Bot) 00:51, 3 August 2007 (UTC)
 * Consolidating edits like that would require a complete re-write of the bot architecture. Right now, the bot works through the deletion log one image and one page at a time, and the entire bot framework is designed around that fact. --Carnildo 01:01, 3 August 2007 (UTC)
 * Ahh ok, that's fine then. Consider approved for trial. Matt/TheFearow (Talk) (Contribs) (Bot) 11:09, 3 August 2007 (UTC)
 * 24-hour trial is finished. --Carnildo 18:00, 4 August 2007 (UTC)


 * I've seen this bot working as it removed some redlinks from pages on my watchlist. At first I was wondering about the redlink in the image summary, but on reflection it works, because with popups I can get to the deletion log pretty easily if I care to.  I would like to say that I think that this bot's effort should be much appreciated. --After Midnight 0001 17:50, 4 August 2007 (UTC)

Is this bot's code public? I'd be interested in seeing it. --ST47 Talk·Desk 17:53, 4 August 2007 (UTC)


 * Not yet, but it's based on OrphanBot's code, which is at User:OrphanBot/orphanbot.pl, User:OrphanBot/libBot.pl, and User:OrphanBot/libPearle2.pl. --Carnildo 18:00, 4 August 2007 (UTC)

If 6epm is good for you, fine, but you're free to bring it up to 15, as I'd call this an essential task. (Plus, once this is approved, I don't have to worry about removing the images on my own) --ST47 Talk·Desk 18:12, 4 August 2007 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.