Wikipedia:Bots/Requests for approval/PrimeBOT 17


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

PrimeBOT 17
Operator:

Time filed: 14:24, Saturday, May 27, 2017 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AWB

Source code available: AWB

Function overview: Remove UTM parameters (Google analytics) from external links and references (i.e. resurrect Theo's Little Bot task #23)

Links to relevant discussions (where appropriate): Bot requests/Archive 55

Edit period(s): Once a month

Estimated number of pages affected: 16000 in the initial run, and maybe 200 a month after that? Theo's task ran in batches of 500, which also works, but I couldn't then give a timeframe.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Straightforward find-and-remove. Regex: As near as I can tell, I've managed to cover all of the edge cases that were of concern in the original BRFA. The blue section covers the case where ?utm_ is followed by an & not followed by another utm_ (e.g. ). The red hits everything else (i.e. where the utm_ term(s) are only at the end of the URL). Green covers the case where a utm_ term falls in between two other codes.
 * (test cases)
 * (tests)
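
For illustration only, the three cases described above (a utm_ term directly after ? with other parameters following, utm_ terms at the end of the URL, and a utm_ term between two other parameters) can be sketched in Python. This is not the bot's actual AWB regex, which is not reproduced here; the function name strip_utm, the pattern _TRACKING, and the example.com URLs are assumptions made up for this sketch.

```python
import re

# Hypothetical pattern for one tracking parameter ("utm_xxx=value").
# The value is assumed to end at the next "&", "#", whitespace, or a
# character that commonly closes a link in wikitext.
_TRACKING = r"utm_[A-Za-z0-9_]+=[^&#\s\]|}<]*"


def strip_utm(url: str) -> str:
    """Remove utm_* query parameters from a single URL string."""
    # utm_ term preceded by "&": covers terms at the end of the URL and
    # terms sitting between two other parameters.
    url = re.sub(r"&" + _TRACKING, "", url)
    # utm_ term directly after "?" with another parameter following:
    # keep the "?" so the remaining parameters stay attached.
    url = re.sub(r"\?" + _TRACKING + r"&", "?", url)
    # utm_ term was the only parameter left: drop the "?" as well.
    url = re.sub(r"\?" + _TRACKING, "", url)
    return url


if __name__ == "__main__":
    samples = [
        "http://example.com/a?utm_source=x&id=5",
        "http://example.com/a?id=5&utm_source=x&utm_medium=y",
        "http://example.com/a?id=5&utm_source=x&page=2",
        "http://example.com/a?utm_source=x&utm_medium=y",
    ]
    for sample in samples:
        print(sample, "->", strip_utm(sample))
```

Like the AWB task, this sketch only rewrites the text of the URL; it does not check whether the resulting URL is still valid. Other tracking parameters could be handled by widening the assumed _TRACKING pattern.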

Discussion

 * As a note, unlike the original bot run this will not be checking to see if the URLs are still valid. AWB doesn't do that. Primefac (talk) 14:24, 27 May 2017 (UTC)


 * Please post results here when done. — xaosflux  Talk 14:27, 27 May 2017 (UTC)
 * . Edits. Note that there were three errors (1, 2, and 3), which I undid and corrected (1, 2, and 3) with new regex, which I've amended above to reflect the changes. Primefac (talk) 15:30, 27 May 2017 (UTC)

In addition to the UTM parameters, there's also "?cmpid", and probably others. DS (talk) 16:14, 1 June 2017 (UTC)
 * An easy addition: just replace  with   in the regex. Primefac (talk) 18:37, 1 June 2017 (UTC)


 * Task approved. — xaosflux  Talk 03:44, 6 June 2017 (UTC)


 * Amended (00:29, 7 August 2017 (UTC)) to include  parameter cleanup as well; "speedily approved" in lieu of another task, as this is low volume. —  xaosflux  Talk 00:29, 7 August 2017 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.


 * Amended to include removing tracking from New York Times URLs; see talk. — The Earwig (talk) 15:36, 25 March 2024 (UTC)