Wikipedia:Bots/Requests for approval/WildBot 4


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

WildBot 4
Operator: Josh Parris

Automatic or Manually assisted: Automatic

Programming language(s): Python

Source code available: https://svn.toolserver.org/svnroot/josh/redirects/

Function overview: Bypass redirects tagged R from incorrect name

Links to relevant discussions (where appropriate): Redirects for discussion/Log/2010 January 4, Bot requests/Archive 33 (missing, but can be seen at http://en.wikipedia.org/w/index.php?title=Wikipedia:Bot_requests/Archive_33&oldid=339120664 ), Bot requests/Archive 33, Bots/Requests for approval/WildBot 2

Edit period(s): periodic, perhaps daily

Estimated number of pages affected: Initial run: there are about a thousand redirects in Category:Redirects from incorrect names, so many thousands of pages could be affected. Subsequent runs: dozens of pages, perhaps not even that many, depends on the rate of use of incorrect name redirects.

Exclusion compliant (Y/N): Y, standard in pywikipedia

Already has a bot flag (Y/N): Y

Function details: Every redirect in Category:Redirects from incorrect names will be evaluated for semantic correctness with 11 tests. For examples of the checks run, see User:Josh Parris/Redirects from incorrect names. Redirects that fail any test will not be processed.

Normally bypassing redirects is strongly discouraged by WP:NOTBROKEN, but in this case R from incorrect name places these redirects into Category:Redirects from incorrect names and also Category:Unprintworthy redirects; WP:NOTBROKEN expressly permits bypassing.

For redirects that are semantically clean, the redirect will be replaced in every linking article: [[incorrectname#section|piped text]] will be changed to [[correctname#section|piped text]], where correctname is either supplied as a parameter to R from incorrect name or is the redirect target. Neither #section nor piped text needs to be present.
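The replacement described above could be sketched roughly as follows. This is a minimal illustration using plain regular expressions, not the bot's actual code; the function name `bypass_redirect` and the sample titles are hypothetical, and the sketch handles only MediaWiki's first-letter case rule, not underscores or other title normalisation.

```python
import re


def bypass_redirect(wikitext: str, incorrect: str, correct: str) -> str:
    """Point [[incorrect...]] wikilinks at `correct`, keeping #section and piped text."""
    # The first letter of a title is case-insensitive on Wikipedia.
    first, rest = incorrect[0], incorrect[1:]
    title = "[" + re.escape(first.upper()) + re.escape(first.lower()) + "]" + re.escape(rest)
    pattern = re.compile(
        r"\[\[\s*" + title + r"\s*"
        r"(#[^\]|]*)?"   # optional #section
        r"(\|[^\]]*)?"   # optional |piped text
        r"\]\]"
    )

    def repl(match):
        section = match.group(1) or ""
        label = match.group(2) or ""
        return "[[" + correct + section + label + "]]"

    return pattern.sub(repl, wikitext)


sample = "See [[Foo Barr#History|the history]] and [[foo Barr]]."
fixed = bypass_redirect(sample, "Foo Barr", "Foo Bar")
```

Links whose title merely starts with the incorrect name (for example [[Foo Barrister]]) are left alone, because the pattern requires the title to be followed by #, | or ]].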

Discussion
Will you publish the actual semantic tests?  MBisanz  talk 03:50, 24 January 2010 (UTC)
 * Various versions of User:Josh Parris/Redirects from incorrect names in the page history contain in total all the tests, they are (the final test is done from two different perspectives):
 * Page is not a redirect yet is in Category:Redirects from incorrect names
 * Redirect is in Category:Redirects from incorrect names without a redirect template
 * Redirect requires a capitalization template
 * Redirect has a correct name parameter but that correct name isn't an article
 * Redirects with a mismatch between the correct name and the redirect target - for example, the correct name is a redirect targeting a different article
 * Redirect targets a #section, but doesn't use a template appropriate for redirects targeting a #section
 * Redirect targets a #section, but doesn't use a template to specify the correct name
 * Redirect is to a #section, but is missing correct name parameter
 * Redirect targets a #section with a template, but the #section doesn't exist
 * Redirect targets an #anchor that doesn't exist in the target article
 * Once source code is available, you'll be able to inspect the evaluation routines. Josh Parris 08:06, 24 January 2010 (UTC)
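As a rough illustration of how a few of the checks listed above might be expressed, here is a sketch over a hypothetical in-memory record. The `RedirectInfo` class, its fields, and the failure messages are all assumptions made for this example; the bot's real evaluation routines query the wiki and cover more cases (for instance, the mismatch test here is reduced to a simple string comparison).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RedirectInfo:
    """Hypothetical snapshot of one page in the category."""
    is_redirect: bool
    has_template: bool                     # tagged with a redirect template
    target: str = ""
    section: Optional[str] = None          # "#section" part of the target, if any
    correct_name: Optional[str] = None     # template parameter, if supplied
    correct_name_exists: bool = True
    target_sections: frozenset = frozenset()


def failed_checks(r: RedirectInfo) -> List[str]:
    """Return the checks that fail; an empty list means the redirect may be processed."""
    if not r.is_redirect:
        return ["page is not a redirect"]
    failures = []
    if not r.has_template:
        failures.append("no redirect template")
    if r.correct_name is not None and not r.correct_name_exists:
        failures.append("correct name is not an article")
    if r.correct_name is not None and r.correct_name != r.target:
        failures.append("correct name and target mismatch")
    if r.section is not None:
        if r.correct_name is None:
            failures.append("section redirect missing correct name")
        if r.section not in r.target_sections:
            failures.append("section does not exist in target")
    return failures


clean = RedirectInfo(True, True, target="Foo Bar", correct_name="Foo Bar")
broken = RedirectInfo(True, True, target="Foo Bar", section="History",
                      target_sections=frozenset({"Career"}))
```

Under the rule stated in the function details, only redirects for which `failed_checks` returns an empty list would go on to have their incoming links rewritten.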

Apologies for the delay; I've been nursing WildBot's disambiguation activities and that's soaked up all the time I intended to devote to this. During development I discovered it's not uncommon for links to appear in the main template and its variants, so if no wikilinks can be found, the bot falls back to raw text substitution to deal with templates. I've also taken the liberty of not changing pages in talk or Wikipedia namespaces.
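The fallback and the namespace restriction could look something like the sketch below. It is an assumption-laden illustration rather than the bot's code: `should_skip` and `fix_page` are hypothetical names, namespaces are guessed from the title prefix instead of being read from the wiki, and the raw substitution is a plain string replace.

```python
import re


def should_skip(title: str) -> bool:
    """Skip talk namespaces and the Wikipedia namespace, judged by title prefix only."""
    namespace, sep, _ = title.partition(":")
    if not sep:
        return False  # no prefix, so the page is in the main namespace
    ns = namespace.lower()
    return ns == "talk" or ns.endswith(" talk") or ns == "wikipedia"


def fix_page(title: str, wikitext: str, incorrect: str, correct: str) -> str:
    if should_skip(title):
        return wikitext
    # First try proper wikilinks: [[incorrect]], [[incorrect#section]], [[incorrect|text]].
    link = re.compile(r"\[\[\s*" + re.escape(incorrect) + r"\s*(?=[#|\]])")
    new_text, count = link.subn("[[" + correct, wikitext)
    if count == 0:
        # No wikilink found: the name is probably a template parameter,
        # so fall back to raw text substitution.
        new_text = wikitext.replace(incorrect, correct)
    return new_text


from_template = fix_page("Some article", "{{main|Foo Barr}}", "Foo Barr", "Foo Bar")
from_talk = fix_page("Talk:Some article", "[[Foo Barr]]", "Foo Barr", "Foo Bar")
```

Trying wikilinks first keeps the raw substitution, which is the blunter tool, confined to pages where nothing else matched.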

I paused WildBot's normal activity to perform the trial, to leave the trial edits in a contiguous lump, but the two runs have a substantial delay - this is the bot loading all thousand redirects and validating them. The second run is broken up by a bunch of API unavailability. I ensured that trial edits would include the redirects included in Bots/Requests for approval/WildBot 2. I was expecting I'd have to do something tricky to ensure a good breadth of articles edited, but it turns out there aren't all that many links to these dodgy redirects. 30 Trial edits: http://en.wikipedia.org/w/index.php?title=Special:Contributions&offset=201002090515058&limit=30&target=WildBot - around this time I had to restart the bot, as it somehow lost its http connection. And kept losing it; I'm going to write a pile of recovery code given that a run takes about an hour.

Current status: 30/50 done. Josh Parris 06:23, 9 February 2010 (UTC)

The final 20 edits are at http://en.wikipedia.org/w/index.php?title=Special:Contributions&offset=20100209234825&limit=20&target=WildBot. That recovery code really helps; it seems that perhaps pywikipedia has some difficulty in storing large pages without getting its knickers in a knot. Josh Parris 23:53, 9 February 2010 (UTC)
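For readers wondering what recovery code of this kind might involve, one common shape is a retry wrapper with exponential backoff. The sketch below is generic and is not taken from WildBot: `with_retries` is a hypothetical helper, and it catches Python's built-in `ConnectionError` and `TimeoutError` because the exceptions pywikipedia actually raises are not stated in this discussion.

```python
import time


def with_retries(action, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call action(); on a connection failure wait and retry, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return action()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the failure
            sleep(base_delay * 2 ** attempt)


# Demonstration with a stand-in for a page save that fails twice, then succeeds.
calls = {"count": 0}
delays = []


def flaky_save():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("connection lost")
    return "saved"


result = with_retries(flaky_save, sleep=delays.append)
```

Passing `sleep` as a parameter lets the demonstration record the delays instead of actually waiting.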


 * The trial results look good, but is it really necessary to edit pages in the userspace? Many users keep database reports or personal backlogs in their userspaces, so a bot modifying this area of Wikipedia may cause trouble. The Earwig @ 18:10, 10 February 2010 (UTC)


 * I was hoping that by editing in userspace, incorrect names would be corrected before drafts became articles. Let's ignore the fact that the dump in question was very, very old (and, as such, wrong insofar as it stood when edited). The semantics behind R from incorrect name are that use of the incorrect name is wrong and ought to be corrected. Incorrect names are a dead end; they ought not be used. I can see two reasonable responses to the difficulty of userspace links: dogmatic insistence that they be changed, and permissive freedom to use incorrect names in userspace; I'm leaning towards the former. As a personal backlog, assuming this is some kind of article-in-progress, the link ought to be fixed; database reports are grayer. Given how few uses of incorrect names there are, I'm willing to check each edit individually to ensure no existing userspace pages are adversely affected, revert and tag those that are, and let the bot run free in the future. Josh Parris 09:17, 11 February 2010 (UTC)

The Earwig @ 23:25, 14 February 2010 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.