Wikipedia:Bots/Requests for approval/IznoBot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

IznoBot
Operator:

Time filed: 15:02, Saturday, November 11, 2017 (UTC)

Automatic, Supervised, or Manual: Supervised/Manual

Programming language(s): WP:AWB

Source code available: AWB

Function overview: WP:Lint <center> inside of US Census population

Links to relevant discussions (where appropriate): None available

Edit period(s): One-time run

Estimated number of pages affected: 20,000

Namespace(s): Main

Exclusion compliant (Yes/No): AWB default

Rationale: I identified an opportunity to WP:Lint for <center> in footnote of Template:US Census population a few weeks ago (to work on our 8 million errors-worth of obsolete HTML tags). Yesterday I took the time to start hacking at this project on User:IznoRepeat. When I got through the list of items I knew about, I went to see how large the problem was and found that there were 20k pages in mainspace alone. I was already concerned about the rate I was making the edits, so I'm here to request a bot flag for a separate account (User:IznoBot) to work on this problem.

Function details: The exact regex I ended with yesterday was the following:
 * Find (with regex):
 * Replace:

This is an extremely permissive find pattern and I would be willing to modify the regex if desired to look for the exact parameter name (footnote). I will be reviewing most/all edits regardless. This exact find and replace is evidenced at.
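
For illustration only, restricting the match to the footnote parameter, as offered above, might look roughly like the following Python sketch. This is not the AWB regex used for the task (that pattern is not reproduced here). The pattern names, the helper `strip_center_from_footnote`, and the assumption that each template parameter sits on its own line are all hypothetical. Limiting the run to pages that transclude US Census population is assumed to be handled by the page list, not by the pattern.

```python
import re

# Hypothetical pattern: the "footnote" parameter and the rest of its line.
# Assumes one template parameter per line, which real wikitext does not guarantee.
FOOTNOTE_RE = re.compile(r"(\|\s*footnote\s*=)([^\n]*)", re.IGNORECASE)

# Matches an opening or closing center tag, so an unclosed <center> is removed too.
CENTER_TAG_RE = re.compile(r"</?center\s*>", re.IGNORECASE)


def strip_center_from_footnote(wikitext: str) -> str:
    """Remove center tags from footnote parameter values only."""

    def fix(match: re.Match) -> str:
        prefix, value = match.group(1), match.group(2)
        return prefix + CENTER_TAG_RE.sub("", value)

    return FOOTNOTE_RE.sub(fix, wikitext)


if __name__ == "__main__":
    sample = (
        "{{US Census population\n"
        "|2010= 100\n"
        "|footnote=<center>Source: Census</center>\n"
        "}}\n"
        "<center>Other</center>"
    )
    print(strip_center_from_footnote(sample))
```

A center tag outside the footnote parameter is left alone in this sketch, which is the point of narrowing the find pattern.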

I also plan to run with general fixes on, which suggested several fixes to me yesterday. One with gen fixes accepted as-is; one with gen fixes suggested which I modified manually.

Discussion
That search pattern is indeed too permissive; at the least, change it to start with the parameter you care about:  —  xaosflux  Talk 00:00, 12 November 2017 (UTC)
 * Correct me if I am wrong, but this seems like a WP:COSMETICBOT. It also seems controversial since it is a low priority lint error. If you are going to supervise the edits, why do you need the bot flag? (Also, I'm not sure I like having my subpages be copied without my knowledge and/or permission.) Nihlus  04:01, 12 November 2017 (UTC)
 * It may be, I haven't looked at good examples yet. For 20000 repeated edits, it should be a flagged account to avoid watchlist flooding etc (assuming it should happen at all). —  xaosflux  Talk 04:32, 12 November 2017 (UTC)
 * That's a fair point; however, I got yelled at in multiple areas about clogging up users' watchlists with my bot when doing medium level lint fixes. I don't think a low priority run would be a good idea. Nihlus  04:35, 12 November 2017 (UTC)
 * <center> will stop working on Wikimedia wikis at some point in the future (this is a fact), at which point the change is clearly no longer cosmetic. I would call it "egregiously invalid HTML" given that it's obsolete in the version of HTML that Wikipedia outputs (that is, DOCTYPE html aka HTML 5). This is regardless of its priority for linting, which is assigned by an engineer without solicitation from the community.
 * I suspect you were having problems mostly because your edits were being made outside the main space (which is critical), but maybe I'm not aware of some specific edits. The bot will only run in the mainspace, so "ensuring Wikipedia continues to look beautiful" is the acceptable rationale for most/all people, whereas it is difficult to defend signature cleaning in the same way as it is not outward-facing.
 * For the flag, Xaosflux covers that nicely. For the supervision, that's due to running gen fixes as well as taking the opportunity to make "better" edits than are suggested for gen fixes, if I identify such (optional behavior; I am happy not to make these suggested changes). I don't expect false positives, but there is always that potential as well. I have no problem with performing the task fully-automated, but you will find no requirement to do so in the policy for the flag. --Izno (talk) 04:55, 12 November 2017 (UTC)
 * That's fine. Does finding  and replacing with   work for you? --Izno (talk) 04:55, 12 November 2017 (UTC)
 * Anything coming in the future with these types of lint errors, considering they are a low priority? Nihlus  05:16, 12 November 2017 (UTC)
 * We prioritized the linter categories in relation to the goal of replacing Tidy. At this time, on the parsing team, we don't have any immediate parsing related work that depends on the other linter categories. It is up to wikis what they wish to do with these issues. But, I know that some UI folks and designers at the foundation prefer that the obsolete tags not be used (See T175709). Editors on the Italian Wikipedia have been replacing the obsolete tags and have even set up abuse filters for discouraging their use in edits. Hope this context is helpful. SSastry (WMF) (talk) 23:35, 12 November 2017 (UTC)
 * I'll retract my objection then. I still think a manual approach to fixing 8 million tags is not the best way to go about it, but I won't stop people from trying. Nihlus  23:16, 13 November 2017 (UTC)
 * For 20,000 edits, we're going to need a discussion. Could you start one at the Village Pump explaining what you plan to do, why you're doing it, and asking for feedback or support? As a side note, I'm not comfortable approving the "extra" edits beyond the main task and genfixes. That would basically give your bot broad leeway to make any useful edit under the bot flag, which is not a good idea, in my opinion. ~ Rob 13 Talk 15:59, 18 December 2017 (UTC)
 * Yes, I'll start one soon-ly. No problem nixing the extra edits, but if so, I'd prefer to run semi for trials and full-automatic for the full 20k. --Izno (talk) 18:10, 18 December 2017 (UTC)
 * Posted at VPT and will cross-post to VPPRO shortly. --Izno (talk) 17:55, 28 December 2017 (UTC)

NOTE: There are 34 pages with  but don't use Template:US Census population which you'll need to watch out for, see here. -- WOSlinker (talk) 18:10, 28 December 2017 (UTC)
 * If they don't use US Census population they won't be in the edited batch. There is also no intersection between the two templates with center. --Izno (talk) 18:39, 28 December 2017 (UTC)
 * US Census population could be coded to strip  from the  parameter, and behave like   if it's present. It's not efficient to do this in every call but is it worth 20000 edits to avoid it? Will  cause problems if it's still in the source but not the output? The template is transcluded in 30000 articles. PrimeHunter (talk) 00:17, 29 December 2017 (UTC)
 * I left a response to that question on WP:VPT. My short answer is "yes, that's possible, but I don't see the value given we have 60k other articles from which to remove center". --Izno (talk) 00:20, 29 December 2017 (UTC)

One thing I want to mention here is that, to my understanding, something not being HTML 5 doesn't by itself mean it will stop being supported. It's certainly possible  will stop working, but not being HTML 5 alone usually isn't reason enough to assume this. I'd be uncomfortable approving this as is, short of confirmation from WMF devs that the center tag will actually stop working. Headbomb {t · c · p · b} 13:56, 17 January 2018 (UTC)
 * There are three ways it could stop working, two of which WMF engineers control and one of which the browsers control, namely, 1) removal from the whitelist, meaning we see visible <center>; 2) stripping during parsing, meaning we lose centering; and 3) browsers drop support (which could cause either of those two depending on what way they feel like going). There isn't a timeline on any of those, yes. My assumption however, is that WMF engineers won't do anything without support from the community (and general browser feeling has unfortunately been "it's still used, so we also should respect it!"). So then you're proposing a chicken and egg (:. As it is, we have validation errors today on any page where center is used.
 * The thread (now in archive 161 of VPT) was archived with general support. We can ask Subu (SSastry above) about whether the WMF would actually remove support for it, but I think we really should take care of the low-hanging fruit like this case. --Izno (talk) 15:30, 22 January 2018 (UTC)
 * Well, I'm not sure there was general support for removal of center tags in general, but there wasn't any real opposition to the removal from this template. So let's go to trial. Headbomb {t · c · p · b} 16:47, 25 January 2018 (UTC)
 * Two things: 1) Waiting on checkpage addition, and 2) I will be running the task to remove center entirely within the template, per the VPT discussion. --Izno (talk) 01:21, 28 January 2018 (UTC)
 * ✅; see Special:Contributions/IznoBot.
 * I filed T185840 regarding this diff: general fixes suggested replacing the emphasis tags with wikitext italics. I've added a regex F+R for it ( which runs before gen fixes.
 * I added a second find regex to catch the unclosed case, which was caught by the list generated by the search but not by the original regex. (That regex is  replaced by , running after the first regex.)
 * --Izno (talk) 15:26, 28 January 2018 (UTC)
 * Poke. --Izno (talk) 19:54, 14 February 2018 (UTC)

As far as I know, there is no consensus to disallow general fixes to be done in addition to main tasks for AWB bots. Thus, I strongly support general fixes to be performed in addition to the main task. -- Magioladitis (talk) 18:55, 25 January 2018 (UTC)
 * AWB granted. Proceed. Primefac (talk) 04:57, 28 January 2018 (UTC)

No issue found with trial, no objections raised. Headbomb {t · c · p · b} 20:23, 14 February 2018 (UTC)