Wikipedia:Bots/Requests for approval/LemmeyBOT


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

LemmeyBOT
Operator: Lemmey

Automatic or Manually Assisted: Automatic

Programming Language(s): Python

Function Summary: Restores missing reference names

Edit period(s) (e.g. Continuous, daily, one time run): As needed.

Already has a bot flag (Y/N):

Function Details: The bot processes articles in Category:Pages with incorrect ref formatting. It looks for missing reference names, then searches the article history to restore them.
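A minimal sketch of the approach described under Function Details, not the bot's actual code. It assumes the older revision texts are already available as a list of strings, newest first; the names `find_orphaned_names` and `restore_from_history` are hypothetical, and the patterns only cover the common `<ref name=... />` and `<ref name=...>...</ref>` forms.

```python
import re

# Short form, e.g. <ref name="foo" />: reuses a definition made elsewhere.
SHORT_REF = re.compile(r'<ref\s+name\s*=\s*"?([^">/]+?)"?\s*/>', re.IGNORECASE)
# Full form, e.g. <ref name="foo">citation text</ref>: defines the reference.
FULL_REF = re.compile(
    r'<ref\s+name\s*=\s*"?([^">/]+?)"?\s*>(.*?)</ref>',
    re.IGNORECASE | re.DOTALL,
)


def find_orphaned_names(text):
    """Return names used in short form that have no full definition in text."""
    defined = {m.group(1).strip() for m in FULL_REF.finditer(text)}
    used = {m.group(1).strip() for m in SHORT_REF.finditer(text)}
    return used - defined


def restore_from_history(current, older_revisions):
    """Swap the first short use of each orphaned name for a definition from history."""
    for name in sorted(find_orphaned_names(current)):
        for old in older_revisions:  # assumed order: newest first
            found = next(
                (m for m in FULL_REF.finditer(old) if m.group(1).strip() == name),
                None,
            )
            if found is None:
                continue
            short = next(
                m for m in SHORT_REF.finditer(current) if m.group(1).strip() == name
            )
            current = current[:short.start()] + found.group(0) + current[short.end():]
            break
    return current
```

Names that never appear with a full definition in the supplied revisions are simply left untouched, so the page stays in the category for a human to look at.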

Discussion
Looking at the contribs, this bot has already fixed 4 articles. The best example is Overpopulation (diff), where the bot fixed 2 broken references dating back several edits. The category currently contains 1075 articles, excluding Wikipedia, Talk, and User pages. --Lemmey talk 05:30, 3 May 2008 (UTC)

That's pretty nifty. What do others think? — Werdna talk 05:18, 3 May 2008 (UTC)
 * Sounds good to me. —paranomia (formerly tim.bounceback) a door? 20:28, 3 May 2008 (UTC)

BlackListed Links
This seems useful. What happens if a ref used a now-blacklisted link? Gimmetrow 07:35, 3 May 2008 (UTC)
 * Why not check if each link is blacklisted? Also, the talk link in your signature is annoying. — Werdna talk 04:12, 4 May 2008 (UTC)
 * If the ref used a blacklisted link, the proper correction would have been for the editor to remove all mentions of the reference, not just the first named reference. Should the bot find a blacklisted link, it will restore it like any other reference. --Lemmey talk 07:40, 3 May 2008 (UTC)
 * Yes, that's what an editor *should* do, but an editor might legitimately remove a named ref and not catch all uses of the name. If the bot restores them, an editor seeing this is likely to roll back the bot edit. The bot should have some way to break the cycle. Anti-vandal bots only do one revert; perhaps this bot could add refs to an article only once per hour. Gimmetrow 08:06, 3 May 2008 (UTC)
 * Seems like the best way to end a cycle is to have better anti-vandal bots. If they are breaking named references, they obviously know that a named reference exists; the AV bot should just look for the short version ('< ?ref ?name ?= ?["\w-]+ ?/ ?>'). I can throttle the bot for a trial period and look at creating an anti-vandal bot. --Lemmey talk 14:12, 3 May 2008 (UTC)
 * How are anti-vandal bots related at all to ending the cycle, except that a common technique used by anti-vandal bots is to only revert once? -- Cobi(t 13:12, 5 May 2008 (UTC)
 * Gimmetrow seemed worried that the two bots might get into an editwar. As shown below this is impossible for blacklisted links as per current MediaWiki protection controls. --Lemmey talk 18:15, 5 May 2008 (UTC)
 * According to MaxSem and confirmed by testing it appears that it is not possible to save a blacklisted link when making an edit. It appears to be a non-issue. --Lemmey talk 08:11, 4 May 2008 (UTC)
 * Yes, if the link is in the spam blacklist, the bot won't save. What happens? Will the bot crash, or keep trying to make the same edit? But I'm also asking about links simply removed because they are not reliable sources - a soft blacklist if you will. Gimmetrow 20:53, 4 May 2008 (UTC)
 * The function throws an exception and then goes on to the next article. The bot is designed to attempt each article in "Category:Pages with incorrect ref formatting" once. I can create a list for use in future runs that will skip any articles attempted in the previous pass. This will prevent a rollback war between the bot and any editors / other bots.
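The once-per-pass behaviour described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the bot's code: `run_pass` and `fix_article` are hypothetical names, and `fix_article` stands in for whatever function edits one page and raises when the save is rejected.

```python
def run_pass(titles, fix_article, skip=frozenset()):
    """Attempt each title at most once; return (attempted, failed) sets."""
    attempted, failed = set(), set()
    for title in titles:
        if title in skip:
            continue  # already tried in the previous pass
        attempted.add(title)
        try:
            fix_article(title)
        except Exception as exc:  # e.g. a save rejected by the spam blacklist
            print(f"skipping {title}: {exc}")
            failed.add(title)
    return attempted, failed
```

Passing the `attempted` set from one run as `skip` for the next means a page that an editor rolled back is not touched again on the following pass.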


 * As for a particular named source being deemed unreliable, my view is that this likely followed a conversation on the talk page. As such, the article would likely have enough eyes on it that all instances of the named reference have already been removed. (Example: if source "BLOGGER" is deemed unreliable, a giant red broken-ref warning with the name "BLOGGER" is unlikely to go unnoticed.) Since I'm only looking at ~1100 articles in the category, I expect this particular scenario to be rare. --Lemmey talk 05:20, 5 May 2008 (UTC)

Once the trial is done, how often do you think you'll scan through the category? It will have a lot of articles at first, but eventually it will get down to just the handful that appear after the last scan by the bot. So once a day? once an hour? Related to that, I think it might be helpful to identify in the edit summary how long the named reference was missing, either by date or version number. If you're doing this like I would expect, that shouldn't be too hard. Restoring really old refs would be a flag to check the ref, I would think. Finally, if you try to edit an article and can't, it may be a blacklisted link, or it may be protected, or it may simply be an edit conflict. You would want to re-try edit conflicts after some delay. Gimmetrow 06:30, 5 May 2008 (UTC)
 * I'd say no more than once a week. It really depends on how many are left and how fast the category turnover is (how fast it grows or shrinks). The bot isn't perfect. Right now it skips  as in United States housing market correction. I'll need to add that and be able to search really deep (500+ versions), something I currently have capped for processing-time reasons. I'll look into the version number idea. --Lemmey talk 13:03, 5 May 2008 (UTC)
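Two of the suggestions in this exchange, noting in the edit summary how far back the reference was found and retrying only edit conflicts, might look something like the sketch below. Everything here is hypothetical: `EditConflict` is a made-up exception class standing in for however the bot's framework signals a conflict, and the summary wording is only an example.

```python
import time


class EditConflict(Exception):
    """Hypothetical error: another edit was saved before ours."""


def build_summary(name, revisions_back):
    """Say how far back in the history the restored definition was found."""
    return f'Restored named reference "{name}" from {revisions_back} revision(s) back'


def save_with_retry(save, retries=3, delay=30.0, sleep=time.sleep):
    """Call save(); retry only on EditConflict, waiting `delay` seconds between tries."""
    for attempt in range(retries + 1):
        try:
            return save()
        except EditConflict:
            if attempt == retries:
                raise  # give up after the last allowed retry
            sleep(delay)
```

Any other exception (a protected page, a blacklisted link) propagates immediately, so the caller can skip that article rather than retry an edit that cannot succeed.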

Issues

 * What happened here? Gimmetrow 05:42, 10 May 2008 (UTC)
 * It appears that the only occurrence of "alternate etymology" in the history is a blank ref. --Lemmey talk 06:25, 10 May 2008 (UTC)
 * I have resolved this issue. You can see the fix here. The bot will not put in any ref that is like . It keeps looking in the history for a non-blank reference. --Lemmey talk 20:41, 13 May 2008 (UTC)
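The fix described here, skipping blank definitions and continuing back through the history, can be sketched as below. This is an assumption about the general shape of such a check, not the bot's implementation; `first_non_blank_definition` is a hypothetical name, revisions are assumed to be plain strings ordered newest first, and the tag is matched in lowercase only.

```python
import re


def first_non_blank_definition(name, revisions):
    """Return the first full <ref name=...>...</ref> whose body is not empty."""
    pattern = re.compile(
        r'<ref\s+name\s*=\s*"?' + re.escape(name) + r'"?\s*>(.*?)</ref>',
        re.DOTALL,
    )
    for text in revisions:
        for match in pattern.finditer(text):
            if match.group(1).strip():  # ignore empty or whitespace-only bodies
                return match.group(0)
    return None  # nothing usable found; leave the article alone
```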

Case

 * There is a problem: your bot treats ref names as case-insensitive, but they are case-sensitive. MaxSem(Han shot first!) 17:54, 10 May 2008 (UTC)
 * The issue is that the editor considered ref names to be case-insensitive, substituting Columbia when he should have used columbia. It was a non-rendered ref and was fixed by the bot. Had it been looking for Columbia, the bot would have bottomed out and not fixed the ref. I'll grant that having a named ref stated in full more than once is unsightly, unnecessary, and inefficient, but I'd argue that it is not a more serious problem than a visible fault. I ran the bot on the article twice to fix all occurrences. --Lemmey talk 18:35, 10 May 2008 (UTC)
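One way to reconcile the two positions above is to prefer an exact, case-sensitive match and fall back to a case-insensitive one only when it is unambiguous. The sketch below illustrates that idea; it is not what the bot does, and `resolve_name` is a hypothetical helper.

```python
def resolve_name(wanted, defined_names):
    """Return the defined name matching `wanted`: exact first, then ignoring case."""
    if wanted in defined_names:
        return wanted
    folded = wanted.casefold()
    candidates = [n for n in defined_names if n.casefold() == folded]
    # Fall back only when exactly one name differs by case alone.
    return candidates[0] if len(candidates) == 1 else None
```

With this approach a lookup for Columbia would resolve to an existing columbia definition, while a page defining both spellings separately would keep them distinct.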


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.