Wikipedia:Bots/Requests for approval/Xenobot 6.2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Xenobot 6.2
Operator: –xenotalk contribs

Automatic or Manually assisted: Automatic

Programming language(s): AWB or Python

Source code available: On request

Function overview: The bot will deploy a new ISO region code parameter to Infobox settlement, reducing excessive parserFunctions and template calls to CountryAbbr and its children. It will also set coordinates_display where appropriate.

Links to relevant discussions (where appropriate): Bot requests (perm), also here (perm), and here (perm)

Edit period(s): One-time run, with occasional patrols as necessary

Estimated number of pages affected: close to 200,000 (revised from 160,000)

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): Y

Function details: (Modified from original) The bot will use the subdivision_name & subdivision_name1 parameters to generate a new parameter, coordinates_region. This is currently done through parserFunctions and fairly expensive template calls to CountryAbbr which are asserted to have a noticeable slowing effect when editing settlement articles.

| coordinates_region = $SUBDIVISION_NAME

The bot will also set inline,title where appropriate (see for additional details).
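The parameter generation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the lookup tables and the helper names (get_param, region_code, add_region) are hypothetical, the field values are assumed to be plain names with no flag templates, and each parameter is assumed to sit on its own line. The actual task relied on a substituted fork of CountryAbbr rather than tables like these.

```python
import re

# Hypothetical, deliberately tiny lookup tables (assumption for illustration).
COUNTRY_CODES = {"Canada": "CA", "Poland": "PL", "United States": "US"}
SUBDIVISION_CODES = {
    ("CA", "Ontario"): "ON",
    ("US", "California"): "CA",
}


def get_param(wikitext, name):
    """Return the stripped value of an infobox parameter, or None if absent."""
    pattern = r"^[ \t]*\|[ \t]*" + re.escape(name) + r"[ \t]*=[ \t]*(.*)$"
    match = re.search(pattern, wikitext, re.MULTILINE)
    return match.group(1).strip() if match else None


def region_code(wikitext):
    """Build an ISO 3166 style region string, or '' when the lookup fails."""
    country = COUNTRY_CODES.get(get_param(wikitext, "subdivision_name") or "")
    if country is None:
        return ""  # graceful failure: the value is left blank
    subdivision = get_param(wikitext, "subdivision_name1") or ""
    sub_code = SUBDIVISION_CODES.get((country, subdivision))
    return f"{country}-{sub_code}" if sub_code else country


def add_region(wikitext):
    """Add coordinates_region before the last closing braces if it is absent."""
    if get_param(wikitext, "coordinates_region") is not None:
        return wikitext  # already present: leave the page alone
    head, sep, tail = wikitext.rpartition("}}")
    if not sep:
        return wikitext  # no closing braces found: do nothing
    return head + f"| coordinates_region = {region_code(wikitext)}\n" + sep + tail


sample = """{{Infobox settlement
| name = Toronto
| subdivision_name = Canada
| subdivision_name1 = Ontario
}}"""
print(add_region(sample))
```

An unrecognised country yields a blank value instead of a guess, which corresponds to the "graceful failure" discussed further down.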

Discussion
— Andrwsc (talk · contribs) 22:31, 17 March 2010 (UTC)
 * The bot should also work on articles that use Geobox, since its internal Geobox2 coor and Geobox2 coor title templates also use CountryAbbr and its ilk. The problem there is that multiple   fields are possible.
 * Also, perhaps the simple subst solution you propose might not be the best idea, long-term. I think it's a good idea to build the parameter string to coord based on other existing infobox parameters (e.g. use the population field for city types) and that coding within Infobox settlement and Geobox seems sound.  The only problem we need to fix is the parsing of country/subdivision/etc. fields to construct the ISO region string.  Therefore, I think a better solution here might be to add a new   parameter to Infobox settlement and Geobox, so that the ISO region string is specified directly, and then use this bot to add that line of markup to all the affected geographical articles.
 * coordinates_region = $SUBDIVISION_NAME
 * For my part this makes things worlds easier, so I'll modify the task to suit the new parameter(s). – xeno talk 23:36, 17 March 2010 (UTC)
 * How long would you think the bot would need to get through all those articles? I'm wondering if we need to create some intermediate "backwards compatibility" mode for the infobox templates during the time we transition from CountryAbbr? Or perhaps just make the final infobox template changes immediately prior to the bot run and let 'er rip? — Andrwsc (talk · contribs) 23:45, 17 March 2010 (UTC)
 * There are close to 200,000 infobox settlements so it will definitely take some time. =) (14 days @ 10 epm) – xeno talk  00:10, 18 March 2010 (UTC)
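The 14-day figure follows directly from the two numbers quoted above (about 200,000 pages at 10 edits per minute). A quick check, assuming the bot runs around the clock with no pauses (run_days is a hypothetical helper name):

```python
def run_days(total_edits, edits_per_minute):
    """Days needed for a run, assuming continuous editing at a fixed rate."""
    minutes = total_edits / edits_per_minute
    return minutes / (60 * 24)


# 200,000 edits / 10 per minute = 20,000 minutes
print(f"{run_days(200_000, 10):.1f} days")
```

Any throttling or downtime would push the real duration past this lower bound.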


 * The bot will also strip out commas from population totals, which (correct me if I'm wrong) aren't valid inputs to coord. –xenotalk 00:46, 18 March 2010 (UTC)
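A minimal sketch of the comma stripping, assuming the population fields are named population_ plus a suffix and sit one per line (both are assumptions, as is the helper name strip_population_commas). It only touches values made up entirely of comma-grouped digits, so anything carrying a reference or extra text is left alone.

```python
import re

# Matches e.g. "| population_total = 2,503,281" and captures prefix and number.
POPULATION_LINE = re.compile(
    r"^([ \t]*\|[ \t]*population_\w+[ \t]*=[ \t]*)(\d{1,3}(?:,\d{3})+)[ \t]*$",
    re.MULTILINE,
)


def strip_population_commas(wikitext):
    """Remove thousands separators from purely numeric population values."""
    return POPULATION_LINE.sub(
        lambda m: m.group(1) + m.group(2).replace(",", ""), wikitext
    )


before = "| population_total = 2,503,281\n| population_as_of = 2006"
print(strip_population_commas(before))
```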

Err, what's the benefit to this bot? Reducing the number of ParserFunctions sounds like worrying about performance. --MZMcBride (talk) 00:34, 19 March 2010 (UTC)
 * It's more the CountryAbbr. If you look at the template and its children, you'll see how tough it has been for them to keep up with the various ways names are passed (flags, etc.). Having the standard ISO code in there would futureproof the template. It might make sense to split the parameter in two for the country and province/state. –xenotalk 00:59, 19 March 2010 (UTC)
 * Agree. CountryAbbr is a disaster and must be replaced. — Andrwsc (talk · contribs) 05:04, 19 March 2010 (UTC)
 * However, is it really necessary to make 200,000+ edits (I think that's the correct number) that aren't doing anything to the final rendered page? Wouldn't this be better as a bot that doesn't make edits solely to substitute CountryAbbr, but makes other edits and does this substitution along with those (e.g., AWB's genfixes), so as to avoid making a bunch of edits that could easily be combined into others? I hope you understand my point; I'm concerned about a bot making a very large number of edits that do not affect the final page. — The Earwig (talk) 22:51, 19 March 2010 (UTC)
 * To increase the utility of the edits, general fixes could be turned on if approval was granted for that. –xenotalk 03:03, 20 March 2010 (UTC)
 * Alternatively, we could add this logic to the AWB genfixes and then the templates would be improved over time, while users made other useful edits. Rjwilmsi  22:23, 28 March 2010 (UTC)
 * I considered that - someone is also working on optimizing CountryAbbr2 (though I think it's dependent on "plainifying" the names passed through subdivision_name). My main concern with adding it to general fixes and just letting it happen over time is that depending on the input provided by subdivision_name and subdivision_name1, the coordinates_region may not actually be valid input. Though it would likely be a graceful failure, there is no easy way to track when the substitution actually occurs out there 'in the cloud'. Further, an AWB general fix would then be dependent on the CountryAbbr and its child-templates not changing. Lastly, at close to 200,000 articles, the time for an AWB general fix to propagate completely without someone focused on the task probably approaches infinity.
 * Some investigation into the claim that editing these articles is noticeably slower due to CountryAbbr should be conducted. I'll admit I've done none. If there is indeed a substantial lag, and the suggested improvements to CountryAbbr don't significantly reduce it, the cost-benefit analysis may favour the bot completing this task: Multiply the time saved by bypassing CountryAbbr by the number of edits to the settlement articles over a year and you have a strong case for recovering that aggregated human time. There are also other tweaks that may be done to the infoboxes while the bot runs - stripping commas from population fields, for example. – xeno talk 23:41, 28 March 2010 (UTC)
 * It's not just a speed issue, it's a maintenance one. I do a lot of flag template maintenance work, and this is severely impacted by CountryAbbr.  Even with the changes mentioned above, I still see many thousands of "false positives".  For example, Special:WhatLinksHere/Template:Country data Canada shows Aalborg Municipality near the top of the list, and the only reason it appears is because of CountryAbbr et al.  It simply must go. — Andrwsc (talk · contribs) 23:51, 30 March 2010 (UTC)
 * This really should be done, although I have no preference whether it may be best to first make it part of AWB's genfixes and wait a month or two before working through the rest of them. Worrying about performance is only a minor aspect, as far as I'm concerned. The current solution is an unmaintainable hack with weird side effects. It's patently absurd that Chojnów renders Country data Macedonia, Country data Mauritius, Country data Nepal, Country data People's Republic of China and several others as part of comparisons to figure out that 🇵🇱 is "PL". I'm sure it was intended to be a smart hack, but it's really just a hack. Infoboxes should get the standardized input to generate the more complex representations, not vice versa. Amalthea 13:48, 20 April 2010 (UTC)


 * BAG assistance needed. FYI, four days ago I left a message at VPM (perm) inviting comments or objections to this BRFA, and so far there have been no objections raised. I would like to move forward with a trial, as the fixes mentioned above and recently introduced by Wikid77 have not solved all the problems (in fact, they apparently introduced some more issues) that led to this request. – xeno talk 14:51, 20 April 2010 (UTC)


 * I really don't know what to say about this. I'm a little undecided on the task, but it isn't really harmful, appears to have adequate support, and no opposition has been raised recently. Let's try it out. — The Earwig (talk) 20:18, 21 April 2010 (UTC)
 * Trial complete. 100 edits. I tried to give a good cross-section of countries, with a few U.S. and Canada sprinkled in for good measure to show the State/Prov Abbr is working. Here is a graceful failure (subdivision_name did not contain a Country, it improperly had the subdivision_name1 value). This error was my fault; I temporarily fudged up the regex (now fixed). – xeno talk 19:10, 3 May 2010 (UTC)
 * This looks good!! I especially like that you are also cleaning up the flag template usage, such as   to  .  Are you also able to replace the hard-coded image syntax (e.g.  ) with the flag template equivalent? That would also solve a current WP:Accessibility problem. — Andrwsc (talk · contribs) 19:20, 3 May 2010 (UTC)
 * I can try, though I stopped collapsing flagicon+Countryname because sometimes it would have strange things like  and it was hard to pick out the right one of the two. But with some clever parserFunctions, I can probably overcome this and still do the work when it's the same. How common is the second thing you mention? I don't think I saw it in the trial run. – xeno talk 19:25, 3 May 2010 (UTC)
 * I pulled that example directly out of CountryAbbr, and there are several countries like that in the #switch statement, so I presume it is common enough that led to it being included there. — Andrwsc (talk · contribs) 21:56, 3 May 2010 (UTC)

I just stopped by to comment how happy I am to see the bot helping with the task for which I summoned it. Keep up the good work, all. --Stepheng3 (talk) 03:03, 4 May 2010 (UTC)

Revision to functionality

 * FYI I'll probably add this request (perm) to do work on coordinates_display into the task and increase the utility of the edits. – xeno talk 12:50, 4 May 2010 (UTC)
 * Please update the Function details appropriately, and perform a trial of this newly extended functionality Josh Parris 14:03, 4 May 2010 (UTC)
 * Shall do, as soon as I understand the additional task a bit more. – xeno talk 14:04, 4 May 2010 (UTC)
 * 31 edits (AWB's new "max edit" feature goofed ;>). – xeno talk 13:10, 5 May 2010 (UTC)
 * On Los Angeles, why did the bot leave coordinates_region= blank? --Stepheng3 (talk) 16:45, 5 May 2010 (UTC)
 * Probably because  is an invalid input to my fork of CountryAbbr - I'll clean that up in the production run. I think a more important question is why such a large city didn't already have title coordinates. Are we sure these are desired across-the-board? – xeno talk 16:50, 5 May 2010 (UTC)
 * This example perfectly illustrates why  and your bot are necessary.  Editors have freedom to use whatever markup they like for the infobox parameter, so that meant CountryAbbr needed constant maintenance (like a game of whack-a-mole) to catch all the variations, and of course, always had some articles that fell through the cracks.  MUCH better to have a distinct infobox parameter for this purpose. After the bot run, a hidden category can be used to manually check all the articles with blank  . — Andrwsc (talk · contribs) 17:00, 5 May 2010 (UTC)
 * Quite right - though, it is probably a good idea to add the maintenance category before the bot run, so that graceful failures such as the Los Angeles example can be fixed promptly. – xeno talk 17:09, 5 May 2010 (UTC)

"Are we sure these are desired across-the-board?" There seems to be consensus on the template talk page, but of course there may be editors attached to particular articles not having title coordinates -- they are free to override by setting.

"invalid input to my fork". What if someone were now to add this variant to CountryAbbr? Since the bot added a blank  to the template, that would block upgrades to CountryAbbr from affecting the region code in the article's coordinates. I'm thinking we should modify Infobox settlement to treat blank  the same as missing. Or else keep the bot from adding the blank parameter.--Stepheng3 (talk) 00:49, 7 May 2010 (UTC)
 * Hopefully we won't have to maintain CountryAbbr much longer. We should consider a maintenance category for blank & missing , fixing them manually as the bot progresses. — Andrwsc (talk · contribs) 02:42, 7 May 2010 (UTC)
 * That seems an acceptable route. --Stepheng3 (talk) 04:05, 7 May 2010 (UTC)
 * Yep, the bot has no way to know if the subst of the subdivision names will come up with a proper or blank result, so the template needs to go into a maintenance category if it is blank. – xeno talk 04:21, 7 May 2010 (UTC)
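The blank-versus-missing distinction discussed above could be checked offline along these lines. This is a hedged sketch, not how the template or the bot actually does it: region_status and needs_manual_review are hypothetical names, and the pattern assumes one infobox parameter per line.

```python
import re

REGION_PARAM = re.compile(
    r"^[ \t]*\|[ \t]*coordinates_region[ \t]*=[ \t]*(.*)$", re.MULTILINE
)


def region_status(wikitext):
    """Classify coordinates_region as 'missing', 'blank' or 'set'."""
    match = REGION_PARAM.search(wikitext)
    if match is None:
        return "missing"
    return "set" if match.group(1).strip() else "blank"


def needs_manual_review(wikitext):
    """Blank is treated the same as missing: both need a human look."""
    return region_status(wikitext) != "set"


print(region_status("{{Infobox settlement\n| coordinates_region = \n}}"))
```

Treating the two cases alike matches the suggestion that a blank value should not be handled differently from an absent one.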

Why does the edit summary for this edit to Herat claim that the bot is doing something with coordinates_region, when it isn't? Josh Parris 08:48, 7 May 2010 (UTC)
 * Because the botop didn't exclude articles from the first trial from the second =) (Here's hoping for a feature to allow AWB to dynamically change the edit summary) – xeno talk 12:29, 7 May 2010 (UTC)

I see you're constrained by your tools. Josh Parris 13:02, 7 May 2010 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.