Wikipedia:Bots/Requests for approval/WikiCleanerBot 12


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

WikiCleanerBot 12
Operator:

Time filed: 21:24, Tuesday, March 10, 2020 (UTC)

Function overview: Make edits to fix error #548 (punctuation inside link).

Automatic, Supervised, or Manual: Automatic

Programming language(s): Java (WPCleaner)

Source code available: On GitHub (especially algorithm 548)

Links to relevant discussions (where appropriate):

Edit period(s): Twice a month

Estimated number of pages affected: Unknown for the moment: I will generate a dump analysis for error #548 to see how many pages could be affected. Update: I've generated a dump analysis for #548; it reports 16867 pages. A dry run on the first 100 pages resulted in 99 pages modified (the one not modified is 1775 in music: )

Namespace(s): Main

Exclusion compliant (Yes/No): Yes

Function details: The bot will simply move punctuation (comma, semicolon or colon) found at the end of a link (internal link, external link or interwiki link) to just after the link. The period is not included, as it would create many false positives (abbreviations). If the link text contains only the punctuation, the link will also be ignored and left for manual handling. Special cases are also handled for the semicolon: things like  or   will also be ignored.
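For illustration only, the rule described above could be sketched roughly as follows. This is not the WPCleaner code for algorithm 548: the class name `LinkPunctuationSketch`, the method `fix` and both regular expressions are hypothetical. The sketch covers only piped internal links, and its semicolon guard (skipping text that appears to end in a character entity such as `&amp;`) is an assumption about what such a special case might look like, not something taken from this request.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the actual WPCleaner implementation.
public class LinkPunctuationSketch {

    // Piped internal link whose displayed text ends with a comma, semicolon or colon:
    // group 1 = target, group 2 = text without the last character, group 3 = punctuation.
    private static final Pattern PIPED_LINK =
        Pattern.compile("\\[\\[([^\\[\\]|]+)\\|([^\\[\\]|]*?)([,;:])\\]\\]");

    // Assumption: text ending in "&name" or "&#123" right before ';' is a character entity.
    private static final Pattern ENTITY_PREFIX = Pattern.compile("&#?[A-Za-z0-9]+$");

    /** Moves trailing , ; or : out of piped internal links, leaving other text untouched. */
    public static String fix(String wikitext) {
        Matcher m = PIPED_LINK.matcher(wikitext);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String target = m.group(1);
            String text = m.group(2);
            String punct = m.group(3);
            String replacement = m.group(); // default: keep the link unchanged
            if (shouldMove(text, punct)) {
                replacement = "[[" + target + "|" + text + "]]" + punct;
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static boolean shouldMove(String text, String punct) {
        // Links whose text is only punctuation are left for manual handling.
        boolean hasContent = text.chars().anyMatch(Character::isLetterOrDigit);
        if (!hasContent) {
            return false;
        }
        // Assumed semicolon special case: do not break a character entity.
        if (";".equals(punct) && ENTITY_PREFIX.matcher(text).find()) {
            return false;
        }
        return true;
    }
}
```

Under these assumptions, `LinkPunctuationSketch.fix("See [[Paris|Paris,]] France")` returns `See [[Paris|Paris]], France`, while a link ending in a period or containing only punctuation is returned unchanged.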

My bot is currently running the same task on frwiki (around 6000 pages, only for internal links in the first pass).

Discussion
One request and one question; first, yes please run the numbers and see how many pages are affected. Second, I notice that in cases like fr:14e cérémonie des Oscars the bot removes the duplicate/second link; are those cases being ignored here or is there logic to deal with that? Primefac (talk) 19:46, 15 March 2020 (UTC)
 * Hello Primefac.
 * Ok, I will generate the list of pages in the next dump analysis (also for Bots/Requests for approval/WikiCleanerBot 11), and come back here when it's available.
 * For fr:14e cérémonie des Oscars, this one was done manually by me, not automatically: as you can see on the update of the list on frwiki, such cases were not fixed by the automatic run, and they are kept for manual fixing. I will probably file a later request for approval to handle at least some of these situations (I'm starting to work on links split into several consecutive links, see Projet:Correction syntaxique/Analyse 549).
 * --NicoV (Talk on frwiki) 10:14, 16 March 2020 (UTC)
 * Hello Primefac. I've generated the list, it contains almost 17k pages. Ready to run the test on a few dozen pages. --NicoV (Talk on frwiki) 19:28, 21 March 2020 (UTC)

Primefac (talk) 22:26, 22 March 2020 (UTC)
 * Hello Primefac. I've done the 50 edits, my bot behaved as I expected. I've included the other fixes that are already approved for my bot, so some edits contain additional fixes (1922 Coppa Italia, 1911 New Zealand census). --NicoV (Talk on frwiki) 11:53, 23 March 2020 (UTC)

As per usual, if amendments to - or clarifications regarding - this approval are needed, please start a discussion on the talk page and ping. -- The SandDoctor Talk 18:27, 23 March 2020 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.