Wikipedia:Bots/Requests for approval/PNG recompression


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Denied.

PNG recompression
Operator:

Time filed: 08:40, Monday January 16, 2012 (UTC)

Automatic or Manual: Automatic unsupervised

Programming language(s): Java

Source code available: Yes, see User:PNG recompression

Function overview: Lossless PNG recompression, with certain caveats related to ancillary chunks. The initial run will read every PNG image on the wiki; future runs will check whether an image page has new revisions before trying to recompress it, and will also try to recompress any new PNG uploads.

Links to relevant discussions (where appropriate): None. This request is to seek consensus as well as for approval.

Edit period(s): Each run will be started 1 month after the last one ended.

Estimated number of pages affected: Around 175,000.

Exclusion compliant (Y/N): N or N/A

Already has a bot flag (Y/N): N

Function details:

This is a maintenance bot that will indiscriminately re-upload PNG images with recompression across the entire wiki. The aim is to allow some of the wiki's pages using many images to load more quickly, as well as to allow Web hosts redistributing these images to use less bandwidth to serve them.

Exactly one revision will be made by the bot for an image if no further uploads are made to replace its contents; this is not a repeated action bot. Therefore, the bots exclusion mechanism may or may not apply to this bot.
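The "one revision per image" rule described above could be sketched as follows. This is only an illustration under stated assumptions, not the bot's actual code: the class `RecompressionPolicy`, the `FileInfo` record of title, size and last uploader, the minimum-size parameter, and the "upload only if strictly smaller" check are all hypothetical names and choices introduced here.

```java
import java.util.Locale;

/** Hypothetical sketch of the selection rule; not taken from the bot's source. */
final class RecompressionPolicy {

    /** Minimal description of the latest revision of a file (assumed fields). */
    static final class FileInfo {
        final String title;
        final long sizeBytes;
        final String lastUploader;

        FileInfo(String title, long sizeBytes, String lastUploader) {
            this.title = title;
            this.sizeBytes = sizeBytes;
            this.lastUploader = lastUploader;
        }
    }

    /**
     * Decide whether a file is a candidate for recompression.
     * A file whose latest revision was uploaded by the bot is skipped, so the
     * bot makes at most one revision until someone else uploads a new version.
     */
    static boolean shouldRecompress(FileInfo file, String botName, long minSizeBytes) {
        if (!file.title.toLowerCase(Locale.ROOT).endsWith(".png")) {
            return false; // only PNG files are in scope
        }
        if (file.lastUploader.equals(botName)) {
            return false; // the bot already handled the current revision
        }
        return file.sizeBytes >= minSizeBytes; // assumed minimum-size cutoff
    }

    /** Assumption: a new revision is only worth uploading if it is strictly smaller. */
    static boolean shouldUpload(long originalSize, long recompressedSize) {
        return recompressedSize < originalSize;
    }
}
```

With this shape, a later run revisits a file only after another user has replaced its contents, which matches the stated behaviour of not being a repeated action bot.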

For full information regarding the function of this planned bot, please see its user page.

If approved for a trial, I would like the trial to be based on edits instead of time, if possible. This would leave me time to perform some adjustments needed to fit the existing bot code (on the other wiki on which it is running) to Wikipedia's bot policy. Additionally, due to the nature of this bot, it cannot be run in my user space. I may also test on a server designated for this purpose, if a staging area is available. A proofreader (talk) 08:40, 16 January 2012 (UTC)

Discussion

 * Support. PNG recompression bots just like this have been run in the old days successfully. I recommend doing the equivalent of "pngcrush -brute", which tries all combinations of all parameters. You will also need to implement a "maintenance" version of your bot, which recompresses all new PNG uploads. Also, obviously, you're going to want to run your bot on Commons as well, since most files on enwiki are very small and compression is not nearly as helpful here. Finally, note that recompression does nothing at all to help thumbnails, so this would only affect the full-size images - however, readers who download images would still benefit, and I believe it would also help in cases where the full-size image is used in the article, which is very common for fair use images. Dcoetzee 09:16, 16 January 2012 (UTC)
 * Comment: New PNG uploads are caught in a future run of the bot. For example, if a run ends on 2012-02-14 and a user uploads an image on 2012-02-20, the image will be caught during the visit of Special:Allpages on 2012-03-14. This is the edit period of 1 month that I wrote above.
 * I was aware of the inability to recompress thumbnails, which was also a shortcoming of the bot on the wiki it is currently running on; it's a shame, really. As you say, it does help people who download images, though. A proofreader (talk) 09:22, 16 January 2012 (UTC)
 * Oppose There are numerous things that need to be worked on in order for me to remove my opposition:
 * First, I really don't see the value in this. For most files, this would shave far less than a megabyte, and would therefore do little in the way of reducing the time it takes for people to load the image. At the same time, however, this will be asking for a lot of resources, and it will mean that people have to re-cache those pages that the bot affects. If files are so big that compression would have an effect, that can be done on a case by case basis. It shouldn't be too hard to get a list of all .png files on Wikipedia, sorted by filesize. I'm sure you'd find closer to 200 than to 200,000 files that actually need this.
 * Second, if it's going to be uploading new versions of non-free files, it needs to add a {{subst:fur}} tag on its own when it's done.
 * Third, if a file has existing metadata, what will this process do to that metadata? I've got a sinking feeling from seeing other automated/semiautomated uploading bots that the answer is 'replace it', and that's unacceptable. Unless you're able to have the bot check to make sure that the Template:Information form is fully filled out, and have the bot skip files where that isn't the case, or you have the bot edit the file description page to record the original metadata, I don't want this bot to touch free files.
 * All in all, I think this bot will create a whole lot more work for the rest of the file workers, all for almost no benefit. Sven Manguard Wha? 17:29, 16 January 2012 (UTC)
 * Comment: The metadata is removed entirely by OptiPNG. If I replace OptiPNG with PNGCrush and preserve ancillary chunks, the bot may be able to preserve metadata. I can also look into the file format of PNG files and preserve the metadata chunks in RAM before running the tools. However, EXIF metadata in JPEG is much more common, includes camera manufacturer and model etc., and will not be touched at all by PNG recompression.
 * The 8 KB threshold for looking at recompressing a PNG image from Special:Allpages could be changed to something much higher, like 512 KB, as a result of this discussion. It's an API call parameter.
 * There's also no denying that this would use a lot of resources, and I wrote a bit about this in its user page. A proofreader (talk) 21:31, 16 January 2012 (UTC)
 * I doubt this bot's usefulness because many images are used as thumbnails, and better compression does not carry over through resizing. Max Semenik (talk) 07:51, 20 January 2012 (UTC)
 * This isn't a useful thing for a bot to do. Almost no images are displayed to the user directly; rather, they're resized by the server, and the resize will undo any gains from optimizing the compression of the original. --Carnildo (talk) 02:24, 21 January 2012 (UTC)
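The idea raised in the discussion of holding the metadata chunks in RAM before running the tools can be sketched as below. This is a minimal illustration, not the bot's code: the class `PngChunkScanner` and its methods are hypothetical names. It relies only on the PNG container layout (an 8-byte signature followed by chunks of length, type, data and CRC) and on the PNG naming convention that a lowercase first letter in the chunk type marks an ancillary chunk. CRC values are read but not verified here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: list PNG chunks so ancillary ones can be kept in memory. */
final class PngChunkScanner {

    private static final byte[] SIGNATURE =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    static final class Chunk {
        final String type;
        final byte[] data;

        Chunk(String type, byte[] data) {
            this.type = type;
            this.data = data;
        }

        /** Bit 5 of the first type byte set (lowercase letter) means ancillary. */
        boolean isAncillary() {
            return (type.charAt(0) & 0x20) != 0;
        }
    }

    /** Read every chunk in the file, in order. CRCs are skipped, not checked. */
    static List<Chunk> readChunks(byte[] png) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(png));
        byte[] signature = new byte[SIGNATURE.length];
        in.readFully(signature);
        if (!Arrays.equals(signature, SIGNATURE)) {
            throw new IOException("Not a PNG file");
        }
        List<Chunk> chunks = new ArrayList<>();
        while (in.available() > 0) {
            int length = in.readInt(); // big-endian data length
            byte[] typeBytes = new byte[4];
            in.readFully(typeBytes);
            if (length < 0 || length > in.available()) {
                throw new IOException("Corrupt chunk length");
            }
            byte[] data = new byte[length];
            in.readFully(data);
            in.readInt(); // CRC, ignored in this sketch
            chunks.add(new Chunk(new String(typeBytes, StandardCharsets.US_ASCII), data));
        }
        return chunks;
    }

    /** Keep only ancillary chunks such as tEXt, so they could be restored later. */
    static List<Chunk> ancillaryChunks(byte[] png) throws IOException {
        List<Chunk> kept = new ArrayList<>();
        for (Chunk chunk : readChunks(png)) {
            if (chunk.isAncillary()) {
                kept.add(chunk);
            }
        }
        return kept;
    }
}
```

Writing the saved chunks back into the recompressed file would also require recomputing each chunk's CRC and respecting chunk ordering rules, which this sketch leaves out.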


 * Note: I have some concerns about the operator. This user has only 11 global edits to their account as of this post, and it was created 03:14, 16 September 2011. Please correct me if I'm missing something here. -- slakr  \ talk / 02:22, 21 January 2012 (UTC)

Per the concerns raised above. This is something that is best left to the MediaWiki software to do. Also, the operator seems very new, and I'd like to see someone slightly more established before approving a potentially "dangerous" task (one that could damage a large number of images) like this. -- Chris 04:20, 21 January 2012 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.