Wikipedia:Bots/Requests for approval/PDFbot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

PDFbot
Operator: Dispenser

Automatic or Manually Assisted: Automatic

Programming Language(s): Python (pywikipedia framework)

Function Summary: Updates the file size for external links tagged with PDFlink.

Edit period(s) (e.g. Continuous, daily, one time run): Monthly

Edit rate requested: 1 EPM (dependent on the responsiveness of the servers being queried)

Already has a bot flag (Y/N): N

Function Details: It examines pages that transclude PDFlink and applies fixes to the formatting of PDFlink if needed. It then queries the HTTP server indicated by the contained URL for the  and  and records these values. It inserts (or replaces) the second parameter with a binary-prefixed size given to three significant figures, which is derived from the content-length, followed by an HTML comment that contains  and   from the query. This is repeated for every instance of PDFlink found in the wikitext. If the  value is not found or the URL returns a 404, it leaves that instance unchanged. Where PDFlink is embedded as the format parameter of a citation template, it is removed. The page list is created from the "What links here" page, with all non-article pages filtered out.
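
The steps above can be sketched roughly as follows. This is a minimal illustration and not the posted pdfbot.py: the regular expression, the function names (binary_size, content_length, update_sizes), the assumed template layout {{PDFlink|[url title]|size}} and the contents of the HTML comment (here only the raw byte count) are all assumptions made for the example.

```python
import re
import urllib.request

# Assumed template layout: {{PDFlink|[url title]|optional size}}
PDFLINK = re.compile(r"\{\{PDFlink\|(\[([^\s\]]+)[^\]]*\])(?:\|[^}]*)?\}\}")
UNITS = ["bytes", "KiB", "MiB", "GiB"]


def binary_size(num_bytes):
    """Format a byte count with a binary prefix, to three significant
    figures (trailing zeros are dropped by the %g conversion)."""
    size = float(num_bytes)
    index = 0
    while size >= 1024 and index < len(UNITS) - 1:
        size /= 1024
        index += 1
    if index == 0:
        return "%d bytes" % num_bytes
    text = "%.3g" % size
    if "e" in text:  # values from 999.5 up to 1024 would print as 1e+03
        text = "%.0f" % size
    return "%s %s" % (text, UNITS[index])


def content_length(url):
    """Return the Content-Length reported for a HEAD request, or None."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=30) as response:
            value = response.headers.get("Content-Length")
    except (OSError, ValueError):  # covers HTTP errors such as 404 and bad URLs
        return None
    return int(value) if value and value.isdigit() else None


def update_sizes(wikitext, lookup=content_length):
    """Insert or replace the second parameter of every PDFlink instance."""
    def replace(match):
        size = lookup(match.group(2))
        if size is None:
            return match.group(0)  # leave this instance unchanged
        return "{{PDFlink|%s|%s<!-- %d bytes -->}}" % (
            match.group(1), binary_size(size), size)
    return PDFLINK.sub(replace, wikitext)
```

Calling update_sizes(wikitext) with the default lookup issues one HEAD request per PDFlink instance; passing a different lookup function (for example a dictionary's get method) lets the rewriting be tried without network access.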

The source code has been posted at User:PDFbot/pdfbot.py

Discussion
This seems like a good idea to me. My only concern is whether there is a consensus that PDFs should be linked in this manner. HighInBC (Need help? Ask me) 21:02, 14 February 2007 (UTC)


 * PDFlink provides a CSS wrapper around a regular external link, which changes the external link icon for IE5-6 users. It also appends PDF to the link, and the file size if a second parameter is given.
 * I've also generated a sample edit and I hope to be cleaning up the code for posting. --Dispenser 21:47, 14 February 2007 (UTC)

Neat, I like how the details are in comments, they stay out of the way but are available. Sort of like how you can get the second that an edit occurred by using the Special:Export page. HighInBC (Need help? Ask me) 22:02, 14 February 2007 (UTC)

Second trial run:    

I'm not sure if I like the unit linked as suggested by MOS:NUM. I'll remove it as it's not required. You'll also note that there's a tiny parsing bug where it doesn't parse indirect character references. --Dispenser 23:26, 14 February 2007 (UTC)

I like this idea a lot, but I worry that it will require too many edits. I counted 54,916 articles with PDF links as of the November 2006 DB dump. I guess that's only 38 days at one/minute but still, it's a lot of edits. If this were a vote I'd support. -Selket Talk 23:48, 14 February 2007 (UTC)


 * I think adding the file sizes is a good idea, but you should take out the space between the unit and the comment. — Omegatron 00:14, 15 February 2007 (UTC)


 * Changed. --Dispenser 00:33, 15 February 2007 (UTC)


 * I've been using AWB to generate the list so far. It has only reported 1,583 articles (excluding talk pages) which would take a little over a day to run. --Dispenser 00:33, 15 February 2007 (UTC)


 * I just realized that you were counting all PDFs linked from Wikipedia. No, the bot will only work on those that specifically transclude the PDFlink template. --Dispenser 03:09, 15 February 2007 (UTC)


 * Well, cheers then. I like it.  -Selket Talk 05:26, 15 February 2007 (UTC)
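
The run-time estimates in this thread follow from the requested rate of 1 edit per minute. A small sketch of the arithmetic, where run_days is a name made up for this example and one edit per page is assumed:

```python
def run_days(pages, edits_per_minute=1):
    """Days needed to edit each page once at a steady edit rate."""
    minutes = pages / edits_per_minute
    return minutes / (60 * 24)


# Figures quoted in the discussion above.
all_pdf_articles = run_days(54916)  # every article with a PDF link
pdflink_articles = run_days(1583)   # only articles transcluding PDFlink

print("%.1f days" % all_pdf_articles)
print("%.1f days" % pdflink_articles)
```

At one edit per minute the larger count works out to about 38 days and the PDFlink-only list to a little over one day, matching the figures given by Selket and Dispenser.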

Third trial run:   (These will probably be the last) --Dispenser 23:27, 16 February 2007 (UTC)
 * I like, but perhaps using more standard size abbreviations would be appropriate? MB as opposed to MiB and KB as opposed to KiB?  Locriani 09:18, 18 February 2007 (UTC)


 * I tend to agree about the units. HighInBC (Need help? Ask me) 13:19, 18 February 2007 (UTC)


 * According to the Manual of Style, using binary prefixes is the preferred (and proper) method, since the base I'm using is 1024 and not SI's 1,000. The MOS also suggests linking the unit for those unfamiliar with it.  However, I had wished to avoid creating unneeded links.  I will link the unit if there is support for it.  --Dispenser 01:13, 19 February 2007 (UTC)
 * I'd support linking it as MiB. --ais523 13:29, 19 February 2007 (UTC)
 * I agree. Please add the link. — M ETS 501 (talk) 04:48, 24 February 2007 (UTC)
 * I've linked KiB, MiB, and GiB, but not bytes. Is that alright? --Dispenser 20:19, 4 March 2007 (UTC)
 * That seems sensible to me. --ais523 13:26, 5 March 2007 (UTC)
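
The difference between the two unit systems discussed above, and the linking of the binary units, can be illustrated with a short sketch. The names scale, three_sig, linked_size and the wikilink targets in UNIT_LINKS are assumptions for this example and are not taken from the bot's source.

```python
BINARY_UNITS = ["bytes", "KiB", "MiB", "GiB"]   # base 1024
DECIMAL_UNITS = ["bytes", "KB", "MB", "GB"]     # base 1000 (SI)

# Assumed link targets; bytes is left unlinked, as agreed in the thread.
UNIT_LINKS = {
    "KiB": "[[Kibibyte|KiB]]",
    "MiB": "[[Mebibyte|MiB]]",
    "GiB": "[[Gibibyte|GiB]]",
}


def scale(num_bytes, base, units):
    """Divide by base until the value fits the largest suitable unit."""
    size = float(num_bytes)
    for unit in units:
        if size < base or unit == units[-1]:
            return size, unit
        size /= base


def three_sig(value):
    """Three significant figures, avoiding exponent notation near 1000."""
    text = "%.3g" % value
    return text if "e" not in text else "%.0f" % value


def linked_size(num_bytes):
    """Binary-prefixed size with the unit wikilinked (bytes stays plain)."""
    size, unit = scale(num_bytes, 1024, BINARY_UNITS)
    return "%s %s" % (three_sig(size), UNIT_LINKS.get(unit, unit))


size, unit = scale(1234567, 1000, DECIMAL_UNITS)
print(three_sig(size), unit)   # decimal reading of the same file
print(linked_size(1234567))    # binary reading, unit linked
```

For a file of 1,234,567 bytes the decimal reading is 1.23 MB while the binary reading is 1.18 MiB, which is why the unit label has to match the base used in the division.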

This bot shall run with a flag. — M ETS 501 (talk) 20:35, 5 March 2007 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.