Wikipedia:Bots/Requests for approval/MusikBot 11


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

MusikBot 11
Operator:

Time filed: 18:47, Friday, February 24, 2017 (UTC)

Automatic, Supervised, or Manual:

Programming language(s): Ruby

Source code available: GitHub

Function overview: Replaces certain categories on BLPs with their BLP-specific counterparts.

Links to relevant discussions (where appropriate): Bots/Requests for approval/Yobot 28, Bots/Requests for approval/Yobot. This is an extension to a previously approved Yobot task

Edit period(s): Daily

Estimated number of pages affected: About 40 on the first run, then maybe 2-3 a day by my estimate, or fewer than 200 a month according to Magioladitis.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: For pages that are in Category:Living people, the bot will replace:
 * Category:Year of birth missing &rarr; Category:Year of birth missing (living people)
 * Category:Date of birth missing &rarr; Category:Date of birth missing (living people)
 * Category:Place of birth missing &rarr; Category:Place of birth missing (living people)

Any duplicates are removed. The bot will also preserve any sort keys used, for instance changing &rarr; Dylan, Bob. If there are duplicate categories with different sort keys, the bot will ignore the page altogether, since there's really no way to infer which sort key is the correct one.

See this example, assuming "Foo category" and "Bar category" are the ones that need to be converted. This example also illustrates how capitalization, whitespace, newlines, etc., are handled.
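The rules in the function details can be sketched in Ruby roughly as follows. This is an illustrative sketch only, not MusikBot's actual source: the names `REPLACEMENTS`, `CATEGORY_LINK`, `normalize`, `target_for` and `replace_blp_categories` are hypothetical, and the code assumes the caller has already confirmed the page is in Category:Living people.

```ruby
# Hypothetical mapping from each generic category to its BLP counterpart.
REPLACEMENTS = {
  'Year of birth missing'  => 'Year of birth missing (living people)',
  'Date of birth missing'  => 'Date of birth missing (living people)',
  'Place of birth missing' => 'Place of birth missing (living people)'
}.freeze

# Matches [[Category:Name]] or [[Category:Name|Sort key]], tolerating
# case differences in the namespace and stray whitespace.
CATEGORY_LINK = /\[\[\s*category\s*:\s*([^\]|]+?)\s*(?:\|([^\]]*))?\]\]/i
# Same pattern, plus an optional trailing newline (capture group 3).
LINK_WITH_NEWLINE = /#{CATEGORY_LINK}(\n?)/

# Normalizes a title: underscores to spaces, whitespace collapsed,
# first letter uppercased.
def normalize(title)
  title = title.tr('_', ' ').strip.squeeze(' ')
  return title if title.empty?
  title[0].upcase + title[1..-1]
end

# Returns the BLP-specific category a link should end up as, or nil if
# the link is not one of the categories this task handles.
def target_for(name)
  title = normalize(name)
  return REPLACEMENTS[title] if REPLACEMENTS.key?(title)
  title if REPLACEMENTS.value?(title)
end

# Returns the updated wikitext, or nil when duplicates carry different
# sort keys and the page should be skipped. Assumes the page is a BLP.
def replace_blp_categories(wikitext)
  # First pass: collect the sort keys seen for each target category.
  sort_keys = Hash.new { |hash, key| hash[key] = [] }
  wikitext.scan(CATEGORY_LINK) do |name, sort_key|
    target = target_for(name)
    sort_keys[target] << sort_key if target
  end
  return nil if sort_keys.each_value.any? { |keys| keys.compact.uniq.size > 1 }

  # Second pass: rewrite the first occurrence and drop later duplicates.
  emitted = {}
  wikitext.gsub(LINK_WITH_NEWLINE) do
    match = Regexp.last_match
    target = target_for(match[1])
    next match[0] unless target   # unrelated category: leave untouched
    next '' if emitted[target]    # duplicate: remove link and its newline
    emitted[target] = true
    key = sort_keys[target].compact.first
    link = key ? "[[Category:#{target}|#{key}]]" : "[[Category:#{target}]]"
    link + match[3]
  end
end
```

In this sketch, a page containing both a lowercase `[[category:year of birth missing|Dylan, Bob]]` and a plain `[[Category:Year of birth missing (living people)]]` would end up with a single `[[Category:Year of birth missing (living people)|Dylan, Bob]]`, while a page whose duplicates carry different sort keys yields `nil` and is left alone.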

Discussion

 * See Bots/Requests for approval/Yobot 28 for why I'm taking this up. Basically it was easier and more efficient (in my opinion) to query the replicas to find these categories that need updating, as opposed to intersecting all pages in each category, which is close to a million pages. I don't know exactly how AWB would have done this behind the scenes, but a single query I tried ran pretty fast. suggested I take it on, so here I am. AWB is not being used so there are no general fixes &mdash;  MusikAnimal  talk  18:47, 24 February 2017 (UTC)

Support The list generation in AWB is slower than this. Moreover, I would like more bots to do the things I do so I have more time to try new things. -- Magioladitis (talk) 19:18, 24 February 2017 (UTC)


 * Please post trial results below when done. — xaosflux  Talk 19:22, 24 February 2017 (UTC)

Everything worked as planned; however, there were two articles that had both Category:Date of birth missing (living people) and Category:Year of birth missing (living people), which are not meant to be used together. We could have the bot remove one of them, but in my opinion we should leave that to the humans, especially since "Date of birth" shouldn't be used at all in some cases if the year of birth is known (see notice at the top of Category:Date of birth missing (living people)) &mdash; MusikAnimal  talk  00:33, 26 February 2017 (UTC)

Reviewed all edits.


 * Bot duplicated a category.

Otherwise, looks fine. — HELL KNOWZ  ▎TALK 17:30, 28 February 2017 (UTC)
 * Two is better than one, right? :) This is an easy fix. Open to another trial if we want &mdash; MusikAnimal  talk  19:21, 28 February 2017 (UTC)
 * Please add to the function details how the bot handles this -- I presume it removes the parent category -- or does it skip the change? — HELL KNOWZ  ▎TALK 19:25, 28 February 2017 (UTC)
 * Can do, but I also just thought of another potential problem. What if they use sort keys on one category and not on the duplicate? For BLPs they should be using DEFAULTSORT, but it's possible you'll see things like and also  on the same page. I can't envision where you'd want a (living people) category to be sorted differently than what's specified with DEFAULTSORT. What do you think? &mdash;  MusikAnimal  talk  19:35, 2 March 2017 (UTC)
 * That's a good point. You should preserve the sort key if present. — HELL KNOWZ  ▎TALK 21:26, 2 March 2017 (UTC)
 * Function details updated, see also the example diff &mdash; MusikAnimal  talk  05:29, 7 March 2017 (UTC)

If there even are that many pages left. — HELL KNOWZ  ▎TALK 13:01, 7 March 2017 (UTC)
 * Only 22 pages left. Pretty uneventful; none contained duplicates or sort keys. We can wait until we do a full 50, or at least run into some duplicates/sort keys; it's up to you. If not, hopefully the sandbox example is enough to show it will handle those edge cases properly &mdash; MusikAnimal  talk  19:49, 7 March 2017 (UTC)
 * I think there's no need. It's an edge case and the sandbox edit looks good. — HELL KNOWZ  ▎TALK 20:14, 7 March 2017 (UTC)

Looks good. — HELL KNOWZ  ▎TALK 20:14, 7 March 2017 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.