Wikipedia:Bots/Requests for approval/BU RoBOT 25


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

BU RoBOT 25
Operator:

Time filed: 23:30, Monday, July 18, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic (after templates/categories are manually chosen and inputted)

Programming language(s): AWB

Source code available: AWB

Function overview: Categorize stub articles in more detailed stub categories based on existing categorization. This involves replacing one stub template with another more detailed stub template.

Links to relevant discussions (where appropriate): Uncontroversial. See, for instance, this discussion relating to songs/singles categories. See also Category:Underpopulated stub categories, which is the tracking category full of stub categories which need articles merged into them, and Category:Overpopulated stub categories, which is the tracking category full of stub categories which need articles merged into subcategories. This is routine maintenance work relating to stub sorting.

Edit period(s): As needed.

Estimated number of pages affected: Depends on the categories. I think an artificial limit of 800 articles per target stub category would be sensible, above which I would file a specific BRFA for that stub template.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: This is a request for blanket approval to sort stubs into more specific stub categories as needed. Technically, this task is virtually identical to Bots/Requests for approval/BU RoBOT 24, which was just successfully run with no errors. This approval would only cover the most obvious of stub sorting cases, with anything more advanced requiring further BRFAs. Specifically, I'd like to be able to operate within the following limits:


 * 1) The existing stub category must be a parent of the target stub category. (i.e. No moving stubs across category trees without further approval.)
 * 2) Articles will be identified based on a clear categorization scheme. If identifying the articles to be moved requires regex or anything else other than existing categorization schemes, I will seek further approval. For an example of how this would work, see the previously-approved BRFA above, where song stubs were re-categorized as single stubs when found in a singles category.
 * 3) I will either not use recursive categorization or will manually check the entire category tree to ensure recursive categorization is appropriate. An example of "appropriate" recursive categorization can also be found at the above BRFA. If the categorization isn't clear-cut, I'll seek further approval.
 * 4) Either the parent category must be in Category:Overpopulated stub categories or the target category must be in Category:Underpopulated stub categories, demonstrating a clear need for further stub sorting. I want to explicitly note that I may place stub categories in those categories myself. If I do, the categorization will follow the guidelines of fewer than 60 articles for underpopulation and more than 800 articles for overpopulation.
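
Limits 1 and 4 above amount to a simple eligibility test, which can be sketched in Python as follows. This is only an illustration and not part of the bot: the name `is_eligible` and its parameters are hypothetical, and raw article counts stand in for membership in the two tracking categories, using the thresholds of fewer than 60 and more than 800 articles quoted in limit 4.

```python
# Thresholds quoted in limit 4 of the request.
UNDERPOPULATED_BELOW = 60   # fewer than 60 articles
OVERPOPULATED_ABOVE = 800   # more than 800 articles


def is_eligible(parent_count: int, target_count: int,
                target_is_child_of_parent: bool) -> bool:
    """Return True if a parent -> target stub re-sort fits limits 1 and 4.

    Assumption: article counts are used here in place of checking whether
    the categories actually sit in the tracking categories.
    """
    # Limit 1: no moving stubs across category trees.
    if not target_is_child_of_parent:
        return False
    # Limit 4: either side must show a clear need for sorting.
    parent_overpopulated = parent_count > OVERPOPULATED_ABOVE
    target_underpopulated = target_count < UNDERPOPULATED_BELOW
    return parent_overpopulated or target_underpopulated
```

Limits 2 and 3 (clear categorization, careful use of recursion) are editorial judgements made by the operator, so the sketch does not try to encode them.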

I understand blanket approval can be a bit of an ask, but there are currently 165 overpopulated stub categories and 859 underpopulated stub categories. Attempting to handle these one-by-one would quickly flood the bot approvals process. The suggested task is non-controversial and follows the maintenance instructions at the tracking categories referenced above, but I'm notifying the stub sorting project of this BRFA to allow them to comment if they wish.

Discussion

 * Hi Rob - I have a handful of comments/queries:
 * 1) just want to clarify... this will only move stubs from overpopulated stub categories to currently existing stub subcategories, right? That's how I read the proposal, but I just want to make sure I've got it right. I'd be hesitant if it was also intended to automatically create new stub categories.
 * 2) Note that sometimes a stub category deliberately holds only a handful of stubs - an example would be, which is primarily a parent-only category but also includes a few odd stubs which cover several different subcategories. The usual threshold of 50-60 stubs for a category is loosened considerably if there are subcategories, so many of the categories in aren't really that underpopulated (as mentioned at the top of the category). Hopefully the first point of the function details should cope with this problem.
 * 3) Thirdly - and probably most problematically - it's often the case where one stub template would need to be replaced by more than one subtype. So as not to drown an article in templates, there's an arbitrary limit of four templates for any one article. There needs to be some way to control that. An example would be the article Rocky Mountains. If it was a stub just marked with geo-stub it would make sense to move it into subcategories for the US and Canada, but moving it into categories for New Mexico, California, Utah, Oregon, Washington, Montana, and British Columbia would be overkill.
 * Other than those three things, the idea seems a great and very useful one. Grutness...wha?  02:43, 20 July 2016 (UTC)
 * Thanks for your comments, Grutness. In order:
 * Currently existing stub subcategories or new stub subcategories approved via the usual proposals process. My existing proposal at WP:WSS/P for Depressariinae-stub/Category:Depressariinae stubs would be something this task could be useful for, for instance. Especially for the very large categories, creating new stub subcategories may be necessary, but it would all be done within process. Keep in mind that none of the "thinking" behind stub categorization is being automated here; I'm manually choosing the two categories that I'm sorting between. I have no intention to go out of process when doing that. It's only the tedious individual edits that are being automated.
 * Yes; in that case, I would just remove the underpopulated template because no further population is needed. This all requires a bit of editorial judgement. If something isn't straightforward, I'd take it back to the stub sorting project or bot approvals. This task is meant to take care of the obvious stuff. In any event, the broadest categories wouldn't be eligible for this task because they're top-level, so there's no parent to pull from.
 * I wouldn't replace with multiple templates using this task. I hadn't spelled it out above (mostly because it hadn't even occurred to me), but this task is meant to be one-to-one. If something is complicated enough that it needs multiple stub templates or requires removing multiple templates in favor of one intersecting template, then it probably needs editor eyes on the specific article.
 * I had a difficult time concisely explaining how simple this task is meant to be in #2 above, so let's have another go at it: If it's more complicated than me typing "Category:X" into AWB and then running the find and replace, then I'd seek further approval. The sort of situation that you were talking about above where I'm supposedly looking at all categories on the article at once, multiple stub templates, etc. is well beyond the complexity of what I'm trying to get approval for here. This won't be a bot that runs loose on all articles and recategorizes them based on a bunch of complex rules. It will be a simple script where I plug in a category of articles, a stub template to look for in those articles, and the stub template to replace the first one with if it's found. That's it. Given the state of categorization on the English Wikipedia, a hypothetical fully-automatic stub sorting bot which could handle the type of sorting you gave as an example above is well outside the reach of even the best bot operator. And I'm not even close to the best bot operator. ~ Rob 13 Talk 03:20, 20 July 2016 (UTC)
 * Thanks for the explanation. Sounds good. Thumbs up from me. Grutness...wha?  06:29, 20 July 2016 (UTC)


 * BAG assistance needed ~ Rob 13 Talk 01:26, 26 July 2016 (UTC)
 * Can you list some of the categorizations you intend to make? — Earwig   talk  19:17, 29 July 2016 (UTC)
 * In addition to the example provided above from a past BRFA, here are a few more that have come to my attention:
 * Change Gelechioidea-stub to Depressariinae-stub for pages in Category:Depressariinae, defined recursively; the subcategories are genera within this subfamily. There's a lot of similar work on that tree, since a bunch of new stub templates were created there following a discussion at WP:WSS/P.
 * Change Videogame-stub to the various genre templates when articles are in categories such as Category:Strategy video games. Categories would not be defined recursively here, as there's some messiness involved, but I may selectively include things like Category:Grand strategy video games through manual review of each category where the potential sub-genres very clearly belong to the genre.
 * Change Canada-sport-bio-stub to Canadianfootball-bio-stub when players are within Category:Canadian Football League players by team (recursive, all subcategories are obviously filled with Canadian football players) or possibly to Canadianfootball-defensiveback-stub or similar when players are within Category:Canadian football defensive backs or similar.
 * Basically, the goal is to hit the easiest and least controversial ones in categories which badly need it based on the very large or very small tags. ~ Rob 13 Talk 19:42, 29 July 2016 (UTC)


 * Just a thought, but it makes sense to me to include stub cats tagged with Category diffuse in your "from" list. Jamesmcmahon0 (talk) 17:44, 31 July 2016 (UTC)
 * That would certainly make sense, with the stipulation that I will not process any to which I add Category diffuse so as to avoid any gray area where I'm accused of bending the rules. This would be a good idea. ~ Rob 13 Talk 17:48, 31 July 2016 (UTC)


 * I don't see a need for a trial here, since the technical aspect is just building off another approval. This should be uncontroversial. The one thing I want to make sure of is that a replacement of one stub template with another doesn't cause a page to be tagged twice if it happens to use both stub templates. I wasn't clear if this was considered from your description.  —  Earwig   talk  02:41, 9 August 2016 (UTC)
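
The one-to-one replacement described in this discussion, together with the double-tagging concern raised in the comment above, can be illustrated with a short Python sketch. It is not the AWB configuration used by the bot and does not show how the operator handled the concern. The names `retag_stub` and `_template_regex` are hypothetical, and the sketch assumes stub templates are transcluded without parameters, as in `{{Gelechioidea-stub}}`.

```python
import re


def _template_regex(name: str) -> re.Pattern:
    """Match {{name}} with optional inner whitespace.

    On the English Wikipedia the first letter of a page title is
    case-insensitive, so both {{Foo-stub}} and {{foo-stub}} match.
    """
    first = "[" + re.escape(name[0].upper()) + re.escape(name[0].lower()) + "]"
    return re.compile(r"\{\{\s*" + first + re.escape(name[1:]) + r"\s*\}\}")


def retag_stub(wikitext: str, old: str, new: str) -> str:
    """Swap stub template `old` for `new` without tagging the page twice."""
    old_re = _template_regex(old)
    if not old_re.search(wikitext):
        return wikitext  # nothing to do: the page does not use `old`
    if _template_regex(new).search(wikitext):
        # Page already carries the target template: drop the old tag
        # (and its trailing newline) instead of adding a second copy.
        return re.sub(old_re.pattern + r"[ \t]*\n?", "", wikitext)
    # Plain one-to-one replacement of the first occurrence.
    return old_re.sub(lambda m: "{{" + new + "}}", wikitext, count=1)
```

For example, calling `retag_stub(text, "Gelechioidea-stub", "Depressariinae-stub")` on a page that already carries both templates leaves a single `{{Depressariinae-stub}}` tag. Pages that pass parameters to the stub template are left unchanged by this sketch and would need editor attention.
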
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.