Wikipedia:Bots/Requests for approval/PDFbot 2


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

PDFbot
Operator: Dispenser

Automatic or Manually Assisted: Automatic

Programming Language(s): pywikipedia

Function Summary: Dictionary based dead link repair

Edit period(s) (e.g. Continuous, daily, one time run): Monthly

Edit rate requested: 1 edit per minute

Already has a bot flag (Y/N): Yes

Function Details: Adds a task to fix dead links using a dictionary that replaces every instance of a listed URL on the pages the bot normally processes (pages with PDFlink). This task is only useful when a large number of links have been relocated, as with the Virginia State Routes.
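A minimal sketch of the dictionary-based replacement described above, not the bot's actual code. The mapping entries, the `example.org` URLs, and the function name `replace_urls` are hypothetical placeholders; the real dictionary is human generated and is not reproduced in this request.

```python
import re

# Hypothetical old-prefix -> new-prefix mapping (assumption for illustration;
# the bot's real, human-verified dictionary is not shown in this request).
URL_REPLACEMENTS = {
    "http://www.example.org/old/": "http://www.example.org/new/",
}


def replace_urls(wikitext, replacements):
    """Return wikitext with every occurrence of each old URL prefix replaced.

    All keys are matched in a single pass, so text produced by one
    replacement is never rewritten again by another entry.
    """
    if not replacements:
        return wikitext
    # Longest keys first, so a more specific prefix wins over a shorter one.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: replacements[match.group(0)], wikitext)


sample = (
    "{{PDFlink|[http://www.example.org/old/route1.pdf Route 1]}} "
    "and http://www.example.org/old/route2.pdf"
)
updated = replace_urls(sample, URL_REPLACEMENTS)
```

A single regular-expression pass is used here instead of chained `str.replace` calls so that the order of dictionary entries cannot change the result.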

Discussion
How will the bot know if a link is dead or if the server is temporarily down? — M ETS 501 (talk) 23:36, 23 March 2007 (UTC)


 * The dictionary is human generated and verified. The bot currently checks the content-type to see if it's either application/PDF or octet-stream, so it doesn't report the size of 404 pages.  I plan on using it on the VA highway pages, see . —Dispenser 00:29, 24 March 2007 (UTC)


 * Sorry, I don't understand :-( Not your fault.  You can either try to explain it more step by step to me or wait for another BAG member or user. — M ETS 501 (talk) 01:46, 24 March 2007 (UTC)


 * The WikiProject U.S. Interstate Highways has been using www.virginiadot.org as a reference and tagged the pages with PDFlink. About half a year ago, I believe, virginiadot moved their resources to a different directory. Fixing these URLs is trivial, but since I did not ask for it in my original request I can't do it. Thus, I'm filling out a second request to cover fixing URLs. —Dispenser 03:34, 24 March 2007 (UTC)


 * With the content-type precautions you've made and the use of a manually created list of what to replace in the URL to get it to work, I don't see any problem with the idea. However, in the diff you referenced above, the PDF size isn't shown in brackets afterwards (the parameter having been removed).  Is this a problem present in the bot, or is it just the fact that you were carrying out the replacement manually?  Also, I'd just like to seek assurance that the bot does check that the new links work before replacing the old ones with them, to avoid any problems with some dead links being introduced. Mart inp23  21:05, 24 March 2007 (UTC)


 * The parameter was removed manually because it was incorrect (those values were the sizes of the 404 HTML pages); this is not reflective of what the bot would do. The links will pass the normal (correct content-type) tests before they are committed. —Dispenser 22:08, 24 March 2007 (UTC)


 * No more than 50 edits, please report back when complete. Mart inp23 05:46, 26 March 2007 (UTC)


 * Trial was completed last week. —Dispenser 00:53, 4 April 2007 (UTC)


 * Everything looks OK, but it would be nice to change the edit summary so that it describes the changes more fully. On this understanding, approved. Mart inp23  09:38, 4 April 2007 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.