Wikipedia:Bots/Requests for approval/Joe's Null Bot 4


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Joe's Null Bot 4
Operator:

Time filed: 23:27, Monday April 29, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Perl

Source code available:  Yes, this would be a slightly modified version of the code for task 1.

Function overview: Update categories of AfC pending submissions with a purge/forcelinkupdate

Links to relevant discussions (where appropriate):  Wikipedia_talk:WikiProject_Articles_for_creation

Edit period(s): 1/day

Estimated number of pages affected: 1000 (purges, not edits)

Exclusion compliant (Yes/No): y

Already has a bot flag (Yes/No): y

Function details:  Articles for Creation tracks pending submissions from new editors by days since submission in order to ensure that new editor submissions don't get stale before review. (See: Category:AfC pending submissions by age) As we saw with task 1, this sort of categorization-by-template-based-on-what-time-it-is doesn't stay current on its own. I propose to duplicate the existing Null Bot task 1 code but traverse Category:Pending AfC submissions, essentially keeping those categories more or less up to date. The existing task 1 bot code is exclusion compliant and honors the server load indications.

This differs from task 1 in that the affected category is larger, so I'll have to tweak the category length paranoia check. The discussion at AfC linked above suggests a daily cap of 1500.
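As a rough illustration of the request described above, here is a minimal Python sketch, not the bot's actual Perl code. It assumes the standard MediaWiki API parameters (`list=categorymembers`, `action=purge` with `forcelinkupdate`, and `maxlag`). The function names, the `MAX_PAGES` constant, and the `maxlag` value of 5 are hypothetical choices made for this example. It only builds the request parameters and performs the length check, and sends nothing over the network.

```python
from typing import Dict, List

# Daily cap suggested in the AfC discussion; the constant name is hypothetical.
MAX_PAGES = 1500


def check_category_size(titles: List[str], max_pages: int = MAX_PAGES) -> None:
    """'Paranoia check': refuse to run if the category is unexpectedly large."""
    if len(titles) > max_pages:
        raise RuntimeError(
            f"{len(titles)} pages exceeds the cap of {max_pages}; aborting run"
        )


def category_members_params(category: str) -> Dict[str, str]:
    """Parameters for listing the members of a category via the MediaWiki API."""
    return {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": "max",
        "format": "json",
    }


def purge_params(title: str, maxlag: int = 5) -> Dict[str, str]:
    """Parameters for a purge that also refreshes link and category tables."""
    return {
        "action": "purge",
        "titles": title,
        "forcelinkupdate": "1",  # re-evaluates the page's categories
        "maxlag": str(maxlag),   # lets the server refuse work when lagged
        "format": "json",
    }
```

The purge parameters would be sent as a POST request to the wiki's `api.php`, one page at a time with a pause between requests.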

Since I've been asked this about previous tasks: Yes, should the problem this works around be fixed, I'll gleefully dismantle the 'bot. See, which my was marked a dupe of. I'm guessing it's a non-trivial thing to fix, though: the server won't necessarily be able to work backward through the template code to know in advance when it should reevaluate the templates, and there would certainly be a significant performance penalty for always recalculating categories on read.

Discussion
The number of runs is variable depending on the number of pending submissions. Ideally it will be under 400. Typically it will be 200-1500. Occasionally it will get over 2000. Throttling should be sensitive to server load. I recommend logging "All requested work done, N pages purged, HH:MM:SS elapsed" or something similar on a successful run, and likewise on a run that ended for any other reason, including a deliberate early termination due to too many pages to purge. It's okay if this is in "version 2.0": it's more important to get something running soon, and we can do enhancements later. davidwr/ (talk)/(contribs)/(e-mail)  00:21, 30 April 2013 (UTC)
 * davidwr: Thanks for the additional data on your experiences with the backlog size at AfC, much appreciated. Throttling is sensitive to server load: there's a nice preexisting mechanism for that in the MediaWiki software, and we're using it. My code stacks two layers of increasing backoffs when the server load exceeds a threshold.  There's also an elapsed time limit already in my code, so both "max articles handled" and "max total run time" are trivially configured. This is all the sort of thing we faced with task 1, just in somewhat smaller numbers.  Don't know if you read Perl, but you may find the source for task 1 illuminating:  --j⚛e deckertalk 01:18, 30 April 2013 (UTC)
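The "two layers of increasing backoffs" mentioned above could look something like the following Python sketch. It is an illustration under assumptions, not the bot's Perl implementation: the delay values, retry counts, and function names are all hypothetical, and `purge_once` stands in for whatever call reports that the server accepted the purge rather than rejecting it for load.

```python
import time
from typing import Callable, Iterator


def backoff_delays(inner_base: float = 5.0, inner_tries: int = 3,
                   outer_base: float = 60.0, outer_tries: int = 3) -> Iterator[float]:
    """Yield waits for a two-layer backoff (all values are illustrative).

    The inner layer doubles a short delay; after each inner round fails,
    the outer layer adds a longer delay that also doubles.
    """
    for outer in range(outer_tries):
        for inner in range(inner_tries):
            yield inner_base * (2 ** inner)
        yield outer_base * (2 ** outer)


def purge_with_backoff(purge_once: Callable[[], bool],
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Retry purge_once until it returns True or the delays run out.

    purge_once is a hypothetical callable that returns False when the
    server reports it is too loaded to accept the request.
    """
    if purge_once():
        return True
    for delay in backoff_delays():
        sleep(delay)
        if purge_once():
            return True
    return False
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting. A run-time limit could be enforced in the caller by comparing `time.monotonic()` against a deadline before each page.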

One cycle.  MBisanz  talk 22:25, 30 April 2013 (UTC)


 * Cycle started, will report when complete. Relative to the existing BLPPROD (task 1) code, I dropped the minimum interval between purges to 5 seconds, raised the maximum number of files addressed to 1500 (it should run about 800), and of course changed the category.   --j⚛e deckertalk 22:45, 30 April 2013 (UTC)


 * Cycle complete. There are a couple of things it doesn't need to hit (3-4 subcats don't need poking themselves) but they're doing no harm, and the basic functionality worked.  I did "look over its shoulder" as it was working, however, and noticed one flaw.  The individual by-age categories, e.g., Category:AfC pending submissions by age/12 days ago, do update as expected here.  The parent category, however, does not display updated totals for the by-age cats; it appears to want its own purge, and a plain old purge works fine. Tacking a single purge onto the very end of the run should do the trick.  819 purges, 5650 seconds run-time.  Before and after snapshots of the category counts and the list of titles purged at    If you'd like, I'd be happy to do another run tomorrow with that one extra purge in place.  --j⚛e deckertalk 00:48, 1 May 2013 (UTC)
 * The extra purge is fine.  MBisanz  talk 02:21, 3 May 2013 (UTC)
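The end-of-run behaviour discussed above (one extra plain purge of the parent category, plus a summary line in the suggested "N pages purged, HH:MM:SS elapsed" format) can be sketched in Python as follows. The function names and the "Run ended early" wording are hypothetical, and the parent category title in the usage note is assumed from the category names mentioned in the discussion.

```python
from typing import List, Tuple


def format_elapsed(seconds: int) -> str:
    """Render a duration in seconds as HH:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def run_summary(purged: int, seconds: int, completed: bool = True) -> str:
    """Build the one-line log message for a finished or interrupted run."""
    status = "All requested work done" if completed else "Run ended early"
    return f"{status}, {purged} pages purged, {format_elapsed(seconds)} elapsed"


def plan_run(members: List[str], parent: str) -> List[Tuple[str, bool]]:
    """Return (title, forcelinkupdate) pairs: members first, parent last.

    Members get a purge with forcelinkupdate; the parent category gets a
    single plain purge at the very end so its displayed totals refresh.
    """
    plan = [(title, True) for title in members]
    plan.append((parent, False))
    return plan
```

With the figures reported for the trial cycle, `run_summary(819, 5650)` gives "All requested work done, 819 pages purged, 01:34:10 elapsed". A call such as `plan_run(titles, "Category:AfC pending submissions by age")` would place the plain purge of the parent category last.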


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.