Wikipedia:Bots/Requests for approval/Legobot 15


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Legobot 15
Operator:

Time filed: 23:50, Thursday July 26, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available:

Function overview: Replacement for User:HBC Archive Indexerbot

Links to relevant discussions (where appropriate): WP:BON thread

Edit period(s): Twice a day

Estimated number of pages affected: ~2000 (I think) pages which have the opt-in template

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details:

Functionally, this will behave exactly as HBCAI did (with minor differences in logging, edit summaries, etc.), but with brand-new code. Existing pages will not need any changes to be compatible with this system.

I've done a few tests in my userspace, see User talk:Legoktm/Index as an example, and User:Legobot/Archive Log for logging.

Discussion
For whatever it's worth, as the most recent operator of HBC Archive Indexerbot I support this request for approval. —Krellis (Talk) 00:00, 27 July 2012 (UTC)


 * Could you summarize what changes have been made in the new code? Are there any new features, or was it just a performance upgrade or a start-from-scratch sort of thing? Hers fold  non-admin (t/a/c) 14:28, 27 July 2012 (UTC)
 * The main reason I wrote new code is that I don't understand Perl well enough to feel comfortable running and maintaining that code. There are functional differences in the backend (how the instruction template is parsed, cache storage, etc.); however, all edits the new code makes should be nearly identical.
 * Currently the only new feature I added was the support of  in the masks, however I do plan on working on the current todo list once the bot is up and running. LegoKontribsTalkM 18:24, 27 July 2012 (UTC)

Well, this needs to be trialed more extensively first, but there's no reason this shouldn't eventually be approved. I'm giving you 7 days mostly because I don't know how often it needs to archive pages, but feel free to cut this short once you feel it's been trialed enough for full rollout. It's obviously expected that user feedback about malfunctions will be heard and addressed during the trial period, and that things should be stopped if something goes horribly wrong. Headbomb {talk / contribs / physics / books} 23:04, 30 July 2012 (UTC)
 * Thanks, Headbomb. I have it running on an irregular schedule right now (so that I am around to watch it). It's slower than I want, so I'll also work on some performance upgrades by using threading and a better caching system. LegoKontribsTalkM 07:36, 2 August 2012 (UTC)
 * What's the status on this? 23:44, 18 August 2012 (UTC)
 * Technically . The bot is currently operational for the most part, I'm just working on detecting null edits like this, and a few other small things. LegoKontribsTalkM 10:22, 19 August 2012 (UTC)
 * Well, I was only able to review the last hundred edits related to the trial as seven days = a ton of indexing (especially when HBC Archive Indexerbot has been gone for so long). One of the nice things about this task, though, is if there were any problems with the code the output pages could all just be overwritten. The format looks great; never saw any problems with encoding, linking, or sorting. A little over 10% of the edits in the sample I reviewed were null edits (just changing the time); has this been resolved? I think masks being delimited by \n,\n is a bit odd, especially since the comma always appears even if there's only one mask. Why wasn't /Archive 5 indexed here? What's with the  here? Thanks, &mdash; madman 16:54, 22 August 2012 (UTC)

I think I have the null edit problem figured out, but I would like to test a few more things locally first. I've fixed the problem with the commas showing up when there is only one mask.
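One way to detect the null edits discussed above (edits that change only the time) is to compare the old and new page text with the timestamp line removed. This is a minimal sketch, not the bot's actual code: the `Report generated at` prefix and the function names are hypothetical, and the real index format may mark its timestamp differently.

```python
import re

# Hypothetical assumption: the index page has one line that starts with
# "Report generated at" and holds the only text that changes on every run.
TIMESTAMP_LINE = re.compile(r"^Report generated at .*$", re.MULTILINE)


def strip_timestamp(wikitext):
    """Return the page text with the (assumed) timestamp line removed."""
    return TIMESTAMP_LINE.sub("", wikitext).strip()


def is_null_edit(old_text, new_text):
    """True when the two versions differ only in the timestamp line."""
    return strip_timestamp(old_text) == strip_timestamp(new_text)
```

A bot using this check would skip saving the page whenever `is_null_edit` returns True.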

As for the diffs you mentioned: Archive 5 was not indexed because there are no ==level 2== headers on that page. I've updated the code so that if a page has no level 2 headers, it scans for level 3 headers (and treats them as level 2 headers) before returning nothing.
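The fallback described above could look roughly like the following. This is a sketch under assumptions rather than the bot's real parser: `find_headers` and `thread_titles` are hypothetical names, and the regular expression handles only simple, balanced headers on their own line.

```python
import re


def find_headers(wikitext, level):
    """Return titles of headers with exactly `level` equals signs per side."""
    marker = "=" * level
    pattern = re.compile(
        rf"^{marker}(?!=)[ \t]*(.+?)[ \t]*(?<!=){marker}[ \t]*$",
        re.MULTILINE,
    )
    return pattern.findall(wikitext)


def thread_titles(wikitext):
    """Prefer level 2 headers; fall back to level 3; otherwise return nothing."""
    for level in (2, 3):
        titles = find_headers(wikitext, level)
        if titles:
            return titles
    return []
```

Level 3 headers are only consulted when no level 2 header exists, so pages that already index correctly are unaffected.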

User:N419BH was using a custom template here, which contained the template in pre tags as opposed to the nowiki tags that the default template uses. The bot will now scan for the nowiki tags first, then look for pre tags. Thanks, LegoKontribsTalkM 04:08, 23 August 2012 (UTC)
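The tag lookup order described above can be sketched as follows. The function name `extract_template` is hypothetical and the matching is deliberately simple (first occurrence, no nesting), so treat it as an illustration rather than the bot's implementation.

```python
import re


def extract_template(wikitext):
    """Return the text inside the first <nowiki> block, else the first <pre> block."""
    for tag in ("nowiki", "pre"):
        match = re.search(
            rf"<{tag}>(.*?)</{tag}>", wikitext, re.DOTALL | re.IGNORECASE
        )
        if match:
            return match.group(1).strip()
    return None  # neither tag found
```

Because nowiki is checked first, a page that happens to contain both kinds of tags still behaves as it would with the default template.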
 * Approved once the null edits issue is fixed. I'm confident that it will be and have no other concerns. &mdash; madman 18:15, 23 August 2012 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.