Wikipedia:Bots/Requests for approval/JL-Bot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

JL-Bot
Operator: JLaTondre

Automatic or Manually Assisted: Automatic (operator initiated & supervised)

Programming Language(s): Perl (uses perlwikipedia)

Function Summary: Replace incorrect external links to Wikipedia & sister sites with internal link syntax

Edit period(s) (e.g. Continuous, daily, one time run): Daily to weekly

Edit rate requested: 10 accesses per minute (built in delay of at least 6 seconds between any read or write)

Already has a bot flag (Y/N): N

Function Details: It will correct links to Wikipedia & sister sites that are incorrectly formatted using the external link syntax, and only within articles & templates. It uses Special:Linksearch to determine which pages to process, but each run is limited to links starting with a random character (e.g. en.wikipedia.org/wiki/A) to keep the number of pages processed per run manageable. It is conservative in the patterns that it matches.
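As an illustration of the run setup described above, here is a minimal Python sketch (the bot itself is written in Perl, so this is not its code). It assumes the random character is drawn from uppercase letters and digits, which the request does not specify, and the names Throttle and pick_linksearch_prefix are hypothetical. The 6-second minimum gap comes from the requested edit rate.

```python
import random
import string
import time


class Throttle:
    """Enforce a minimum gap between consecutive reads or writes."""

    def __init__(self, min_interval=6.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until min_interval seconds have passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def pick_linksearch_prefix(host="en.wikipedia.org", rng=random):
    """Build a Linksearch query limited to titles starting with one random character.

    The character set (A-Z, 0-9) is an assumption made for this sketch.
    """
    first_char = rng.choice(string.ascii_uppercase + string.digits)
    return f"{host}/wiki/{first_char}"
```

Calling wait() before every read or write keeps the access rate at or below 10 per minute; the clock and sleep parameters exist only so the delay can be exercised without real waiting.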

The following replacements are made:


 * <tt> [link:target name] </tt> → <tt> name </tt>
 * <tt> target[link:target] </tt> → <tt> target </tt>
 * <tt> target[link:target] </tt> → <tt> target </tt>
 * <tt> name[link:target] </tt> → <tt> name </tt>
 * <tt> <a href="link:target">name</a> </tt> → <tt> name </tt>

where <tt>link:</tt> is of the form:


 * <tt> http://\w+.wikipedia.org/wiki/ </tt>
 * <tt> http://\w+.wikipedia.org/w/index.php?title= </tt>
 * <tt> http://\w+.wikibooks.org/wiki/ </tt>
 * <tt> http://\w+.wikinews.org/wiki/ </tt>
 * <tt> http://\w+.wikisource.org/wiki/ </tt>
 * <tt> http://\w+.wiktionary.org/wiki/ </tt>
 * <tt> http://commons.wikimedia.org/wiki/ </tt>
 * <tt> http://meta.wikimedia.org/wiki/ </tt>
 * <tt> http://www.mediawiki.org/ </tt>

The replacement <tt>target</tt> is properly prefixed if it is a <tt>Category:</tt>, <tt>Image:</tt>, or interwiki link.
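A rough Python sketch of one such replacement, the bracketed external link with a label, might look like the following. It is not the bot's Perl code. The interwiki prefix table, the leading colon for Category:, Image:, and language links, and the function names are assumptions based on common English Wikipedia conventions. Links carrying a query string or fragment are deliberately left unmatched, in keeping with the conservative matching described above.

```python
import re

# Interwiki prefixes as commonly used on the English Wikipedia.
# The bot's actual table is not shown in the request; this mapping is an assumption.
PROJECT_PREFIX = {
    "wikipedia": "",
    "wikibooks": "b",
    "wikinews": "n",
    "wikisource": "s",
    "wiktionary": "wikt",
}

# Matches [http://<lang>.<project>.org/wiki/<target> <name>] with no "?" or "#" in the target.
BRACKETED_LINK = re.compile(
    r"\[http://(?P<lang>\w+)\.(?P<project>wikipedia|wikibooks|wikinews|wikisource|wiktionary)"
    r"\.org/wiki/(?P<target>[^\s\]?#]+)\s+(?P<name>[^\]]+)\]"
)


def internal_target(lang, project, target, home_lang="en"):
    """Return the link target with any project, language, or namespace prefix applied."""
    title = target.replace("_", " ")
    parts = [p for p in (PROJECT_PREFIX[project], lang if lang != home_lang else "") if p]
    if parts:
        # A bare language prefix needs a leading colon to render as an inline link.
        lead = "" if PROJECT_PREFIX[project] else ":"
        return lead + ":".join(parts) + ":" + title
    if title.startswith(("Category:", "Image:")):
        # The leading colon links to the page instead of categorizing or embedding.
        return ":" + title
    return title


def convert_links(wikitext):
    """Rewrite matching bracketed external links as internal links."""

    def repl(m):
        target = internal_target(m["lang"], m["project"], m["target"])
        name = m["name"].strip()
        return f"[[{target}]]" if name == target else f"[[{target}|{name}]]"

    return BRACKETED_LINK.sub(repl, wikitext)
```

For example, under these assumptions convert_links turns <tt> [http://en.wikipedia.org/wiki/Main_Page the main page] </tt> into <tt> [[Main Page|the main page]] </tt> and leaves links to unrelated hosts untouched.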

It also fixes minor formatting errors in internal links, but only if an external-to-internal link fix has been made.
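The request does not list which formatting fixes are applied. The discussion mentions removing excess spaces and decoding URL-encoded characters, so a sketch of that kind of clean-up in Python could look like this. Converting underscores to spaces is an additional assumption, and normalize_title is a hypothetical name.

```python
import re
from urllib.parse import unquote


def normalize_title(raw):
    """Decode percent-escapes, turn underscores into spaces, and collapse extra whitespace."""
    title = unquote(raw)             # e.g. "%C3%A9" becomes "é" (decoded as UTF-8)
    title = title.replace("_", " ")  # assumption: underscores and spaces are equivalent in titles
    return re.sub(r"\s+", " ", title).strip()
```

Decoding as UTF-8 matters here: the trial run reported below hit two UTF-8 problems, which is the kind of issue a mis-decoded title would cause.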

Discussion
 * Looks pretty good. My bot does something similar to this I believe. Does the bot change external links to interwiki links, does it correct interwiki links, or both? ~ Wikihermit 03:32, 3 July 2007 (UTC)
 * It changes external links to interwiki links. The original purpose was to deal with "external" links to enwikipedia itself, but handling interwiki links was similar enough that I added it in. As far as correcting interwiki links, it doesn't validate that interwiki links lead to the correct page. It will do some minor clean-up such as removing excess spaces and decoding URL-encoded characters. -- JLaTondre 11:23, 3 July 2007 (UTC)
 * How will this handle oldids and other cases where there is a query string? Matt/TheFearow (Talk) (Contribs) (Bot) 12:02, 3 July 2007 (UTC)
 * It ignores them (with one exception). It looks for (using examples) <tt> http://en.wikipedia.org/wiki/Page_name </tt> and <tt> http://en.wikipedia.org/w/index.php?title=Page_name </tt> patterns only. The exception is <tt> http://en.wikipedia.org/wiki/Special:Search/Page_name </tt>. It will also convert that as I cannot see a reason why, in article space, there should be a link to a search instead of linking to the actual page. -- JLaTondre 12:40, 3 July 2007 (UTC)
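The query-string handling described in this reply can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the bot's Perl code: page_title_from_url is a hypothetical name, only en.wikipedia.org is handled, and skipping URLs with a fragment is an extra conservative choice not stated in the discussion.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def page_title_from_url(url):
    """Return the page title for a plain page URL, or None if the link should be skipped."""
    parts = urlsplit(url)
    if parts.netloc != "en.wikipedia.org":
        return None
    if parts.fragment:
        return None  # assumption: section anchors are out of scope for this sketch
    if parts.path.startswith("/wiki/") and not parts.query:
        title = unquote(parts.path[len("/wiki/"):])
    elif parts.path == "/w/index.php":
        params = parse_qs(parts.query)
        # Only a lone title= parameter qualifies; oldid, action, diff, etc. are ignored.
        if set(params) != {"title"} or len(params["title"]) != 1:
            return None
        title = params["title"][0]
    else:
        return None
    # The one exception: a link to a search is converted to a link to the page itself.
    if title.startswith("Special:Search/"):
        title = title[len("Special:Search/"):]
    return title.replace("_", " ") or None
```

With this sketch, a URL such as <tt> http://en.wikipedia.org/w/index.php?title=Page_name&oldid=12345 </tt> yields None and is therefore left as an external link.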

50 edits. --ST47 Talk 12:53, 6 July 2007 (UTC)
 * 45 edits made. I had two issues at the start that I resolved. I forgot to specify an edit summary on my first edit. I also had two UTF-8 problems (edits 5-6) that I had to revert. Updating to the latest perlwikipedia resolved those & when re-run (edits 10 & 12), everything was fine. -- JLaTondre 21:33, 6 July 2007 (UTC)

 * Great, Approved! ST47 Talk 13:36, 8 July 2007 (UTC)




 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.