Template talk:Infobox City/Template parameter changes bot

Specification
Does anyone have a bot that could be used to change/update parameters to a template? The specific task I have in mind is to split references to the subdivision_type/subdivision_name parameters to template:Infobox City that include embedded <br>s, so that they use individual pairs of parameters for each "row" of these. For example, many invocations of this template currently look like:

which creates two "rows" within a single row of the resultant HTML table. My understanding is that most screen readers read tables row by row, cell by cell, so would read this as "country state, united states california" rather than as indicated by the visual appearance, which would be "country united states, state california". I've added additional pairs of parameters so in the structure of the HTML table these entries can be separate rows, which requires changing the example above to

There are roughly 2000 references to template:Infobox City. Lacking a bot to do this, I've started doing this by hand and have convened a project to manually go through every reference to this template and update per the above (see Template talk:Infobox City/links). Note that the precise format for the template parameter reference varies considerably, and there are a variable number of "virtual rows" (I haven't seen more than 3 yet, but who knows?). I'd also be interested in such a bot flagging articles for further examination if any other parameters have the string "<br>" in them.

If anyone has a bot they've used to manipulate template parameters, please let me know. Given how long it might take to do this by hand, I think I'd be willing to modify an existing bot to do this.

Thanks. -- Rick Block (talk) 16:42, 4 September 2006 (UTC)


 * Could be done pretty easily with AWB using pretty simple regexes, e.g.

Find: |subdivision_type = (.*?)<br>(.*?)

Replacewith: |subdivision_type = $1\r\n|subdivision_type1 = $2

or something similar should do the job. Martin 17:04, 4 September 2006 (UTC)


 * If there is consensus for doing this change, I would be willing to help doing the edits. I think I can provide the regexes. I invented Replace special function for this kind of stuff (although that's probably not needed here, given these specific parameter names). If you want to use the above regexes, you might want to take white space into account:

Find: |[\s]*subdivision_type[\s]*=[\s]*(.*?)<br>(.*?)

Replacewith: |subdivision_type = $1\r\n|subdivision_type1 = $2


 * Also the above regex assume there is no other use of a subdivision_type parameter in any other template call. So each diff should be checked to make sure everything goes well ;-). I could probably provide the settings file so that we could split the task and work together (In case you do have a Windows 2000 box available ;-) --Ligulem 17:46, 4 September 2006 (UTC)
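For readers who want to try the whitespace-tolerant rule outside AWB, here is a minimal Python sketch of the same idea. It assumes the separator between the two values is an HTML `<br>` tag (in any of its usual spellings) and that each parameter sits on its own line with a leading pipe; the function name `split_subdivision_type` is hypothetical. It handles only the two-value case, so a three-value line would leave a `<br>` in the second parameter and still need a manual check.

```python
import re

# Assumption: the values are separated by <br>, <br/> or <br />, any letter case.
BR = r"<br\s*/?>"

# One "| subdivision_type = first<br>second" line, tolerating spaces or tabs
# around the pipe, the parameter name, and the equals sign.
PATTERN = re.compile(
    r"^[ \t]*\|[ \t]*subdivision_type[ \t]*=[ \t]*(.*?)[ \t]*"
    + BR
    + r"[ \t]*(.*?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_subdivision_type(wikitext: str) -> str:
    """Rewrite a two-value subdivision_type line as two numbered parameters."""
    return PATTERN.sub(r"|subdivision_type = \1\n|subdivision_type1 = \2", wikitext)


if __name__ == "__main__":
    sample = "| subdivision_type = Country<br />State\n| subdivision_name = X"
    print(split_subdivision_type(sample))
```

As with the AWB rules, this rewrites any matching line in the page, including one belonging to a different template call, so each diff would still need to be reviewed.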


 * It's a little more complicated than these regexes. For example, there might be two pairs of BRs (or three or more, although I haven't seen this yet).  The number of BRs in the pairs of parameters might not match (needs manual attention), the "|" might be at the beginning of the line or the end of the line, there might be whitespace before or after the "|", etc.  I generally use a Mac, but could work on a Windows box (if necessary).  I could write a script to download the source for the articles and filter out the ones that don't need any change or look weird in some way (or, alternatively, produce lists of articles that all need to be changed in exactly the same way).  Before going too far with this, maybe I should do some more analysis like this.  Regarding consensus to make the changes, the parameters exist in the template so I think there's implied consensus already.-- Rick Block (talk) 20:07, 4 September 2006 (UTC)
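The filtering script described above could look roughly like the following Python sketch. It is only an illustration: it assumes the separator is an HTML `<br>` tag, the helper names `param_value` and `classify` are made up, and the naive value extraction stops at the first pipe, so values containing piped wikilinks would be truncated and should be treated as needing manual attention.

```python
import re

# Assumption: values are separated by <br>, <br/> or <br />, any letter case.
BR = re.compile(r"<br\s*/?>", re.IGNORECASE)


def param_value(wikitext: str, name: str):
    """Return the raw value of parameter `name`, or None if it is absent."""
    # The pipe may sit at the end of the previous line or the start of this one;
    # the value runs up to the next pipe, the closing braces, or end of text.
    pattern = r"\|\s*" + re.escape(name) + r"\s*=(.*?)(?=\||\}\}|\Z)"
    match = re.search(pattern, wikitext, re.DOTALL)
    return match.group(1).strip() if match else None


def classify(wikitext: str) -> str:
    """Sort a page into: missing, no change, manual, or 'split into N'."""
    type_value = param_value(wikitext, "subdivision_type")
    name_value = param_value(wikitext, "subdivision_name")
    if type_value is None or name_value is None:
        return "missing"
    type_breaks = len(BR.findall(type_value))
    name_breaks = len(BR.findall(name_value))
    if type_breaks == 0 and name_breaks == 0:
        return "no change"
    if type_breaks != name_breaks:
        return "manual"  # mismatched counts need a human to look at them
    return f"split into {type_breaks + 1}"
```

Grouping pages by the returned label would give the lists of articles that can all be changed in the same way, with the "manual" group set aside.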


 * Forget about the analysis. If there is consensus, I would say I just start doing this. I'm going to use MWB. I tend to build up some rules and see how it flies. If I have some useful settings, I'll let you know; you could then help editing if you do have a Windows box (needs at least Windows 2000 or newer, same prerequisites as for AWB). BTW, MWB is not a bot and it displays each diff before saving (which is needed for the special cases). There is also a text edit window for the nonstandard cases. An example of older template migration settings I developed in the past can be seen here (we migrated thousands of template calls). I'll report to your talk page. We can then move the detailed discussions into userspace somewhere. --Ligulem 22:01, 4 September 2006 (UTC)


 * Well, OK, and thanks. I assume you've looked at the template syntax and noticed that there are at this point 3 sets of subdivision parameters: the original subdivision_type/subdivision_name (perhaps with embedded breaks), subdivision_type1/subdivision_name1 (per the example above), and subdivision_type2/subdivision_name2.  When you get a rule set together I'd be interested in seeing it.  -- Rick Block (talk) 22:48, 4 September 2006 (UTC)

I have a confirmation question: what should be done to Davenport, Iowa? There we currently have
 * subdivision_name = United States<br>Iowa<br>Scott County
Am I correct in assuming this should be transformed to the following?
 * subdivision_name = United States
 * subdivision_name1 = Iowa
 * subdivision_name2 = Scott County
--Ligulem 00:54, 5 September 2006 (UTC)


 * Yes. Similarly it should end up with 3 "subdivision_type" params, i.e.


 * subdivision_type = Country
 * subdivision_type1 = State
 * subdivision_type2 = County
 * -- Rick Block (talk) 03:06, 5 September 2006 (UTC)
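The Davenport transformation can be sketched in a few lines of Python. This is an illustration rather than the MWB rule set: it assumes the values are separated by HTML `<br>` tags, and the function name `numbered_params` is hypothetical. The first value keeps the bare parameter name and later values get the suffixes 1, 2, and so on.

```python
import re

# Assumption: values are separated by <br>, <br/> or <br />, any letter case.
BR = re.compile(r"\s*<br\s*/?>\s*", re.IGNORECASE)


def numbered_params(base: str, value: str) -> list[tuple[str, str]]:
    """Split a multi-value parameter into (name, value) pairs.

    The first piece keeps the bare name; the rest get suffixes 1, 2, ...
    """
    parts = [part.strip() for part in BR.split(value)]
    return [
        (base if index == 0 else f"{base}{index}", part)
        for index, part in enumerate(parts)
    ]


if __name__ == "__main__":
    for name, value in numbered_params(
        "subdivision_name", "United States<br>Iowa<br>Scott County"
    ):
        print(f"| {name} = {value}")
```

Running the same function on the subdivision_type value gives the matching Country/State/County rows, provided both values contain the same number of breaks.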

Journal
I started creating some MWB settings at User:Ligulem/work/Infobox City. I did this edit, with these settings. It will surely need more subcases for values made up of two pieces (one br) or a single piece. This one worked for the full triple case (two br's). --Ligulem 13:31, 5 September 2006 (UTC)

In Concord, New Hampshire there is: | leader_title = City Manager Legislative body | leader_name = Thomas J. Aspell, Jr. City Council There are no leader_title1, leader_name1 params in the template. Ok, you said only subdivision_name and subdivision_type should be changed, right? So I'll move on with my list, ignoring that. --Ligulem 14:38, 5 September 2006 (UTC)

On Dayton, Ohio (diff) I had to manually remove a series of &nbsp;. If this should show up more often, I'll try to integrate that into the rules. --Ligulem 14:54, 5 September 2006 (UTC)


 * There are leader_title1/leader_name1 params (and these should be changed as well). As you surmise, formatting "cruft" (e.g. nonbreaking spaces or dashes or combinations) should be deleted. -- Rick Block (talk) 17:43, 5 September 2006 (UTC)


 * Ok: edit on Concord, New Hampshire / new settings used. The formatting "cruft" removal is a bit complicated. Not yet done in settings. --Ligulem 18:08, 5 September 2006 (UTC)
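One possible way to approach the "cruft" removal, sketched in Python under stated assumptions: cruft is taken to mean non-breaking space entities, whitespace, and hyphens or dashes at the very start or end of a value, and the name `strip_cruft` is made up for this example. Dashes inside a value (as in riding names) are left alone. Whether a leading dash is always cruft is a judgment call, so the output would still need checking.

```python
import re

# Assumed definition of cruft: &nbsp; entities, whitespace, hyphens,
# en dashes (U+2013) and em dashes (U+2014), in any combination.
CRUFT = r"(?:&nbsp;|\s|[-\u2013\u2014])+"

# Match a run of cruft only at the start or at the end of the value.
EDGE_CRUFT = re.compile(rf"^{CRUFT}|{CRUFT}$")


def strip_cruft(value: str) -> str:
    """Remove leading and trailing formatting cruft from a parameter value."""
    return EDGE_CRUFT.sub("", value)
```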

New uncovered case: Ajax, Ontario: Needs up to leader_title3/leader_name3. Do you update the template? --Ligulem 18:17, 5 September 2006 (UTC)
 * leader_title          = Mayor Governing Body MP MPPs
 * leader_name           = Steve Parish Ajax Town Council Mark Holland (Ajax-Pickering) Christine Elliott (Whitby-Ajax) Wayne Arthurs (Pickering—Ajax—Uxbridge)
 * There was a proposed addition for city council members a while ago, that most folks seemed not to like. This seems kind of similar.  Perhaps rather than expand the template we can propose at talk:Ajax, Ontario to delete this information from the infobox.  I'll do this.  -- Rick Block (talk) 20:10, 5 September 2006 (UTC)
 * Ok. I'll move on then ;-) --Ligulem 22:10, 5 September 2006 (UTC)

Just as a side note: this task here is definitely not "bottable". Too many special cases. However, with MWB this is much faster to do than with pure manual editing. But great care must be exercised and each page manually checked. 2057 pages left to go. --Ligulem 00:17, 6 September 2006 (UTC)


 * Who needs a bot when we've got Ligulem? You are a machine! :) Just kidding, but thanks to you and Rick Block for all the work on this. --MattWright (talk) 15:04, 6 September 2006 (UTC)

I think I'll give up listing Canadian cities with problems caused by leader issues. These are epidemic. --Ligulem 22:52, 8 September 2006 (UTC)
 * Is there a list someplace? I'll start going through the whatlinkshere list I made a few days ago, looking for Canadian cities.  There are now 4 pairs of leader params (no suffix, 1, 2, and 3) which is usually sufficient.  Where this has not been sufficient I've done other things to make it work (like move items out of the infobox into the article). -- Rick Block (talk) 01:01, 9 September 2006 (UTC)
 * I've updated the references from all Canadian cities I can find that reference this template. -- Rick Block (talk) 04:51, 9 September 2006 (UTC)

Progress
Re : I do hereby promise to finish this bot task, but it *is* very time consuming (even with MWB). There are tons of exceptional cases that need hand tweaking. I also have to carefully check each diff (in order not to lose my good bot reputation/license; I have a bunch of serial edit haters who watch me and my AWB colleagues ;-). I also have some ideas to change MWB to make work like this easier. In short: there is a lot that could be done. I could also speed up my edits, since at present I have to wait at least 20 seconds between two consecutive edits. In theory, this value is 30s per WP:BOTS, but I'm bold enough to go down to 20 seconds. If I have a series of no-problem edits I could go with around 7..10 seconds per page. Instead, I have to sit watching my timer until it is on 20 seconds and then click on save. I must say, MWB work could be made a bit easier if we had a bit less red tape, instruction creep and self-declared police officers on this project here. Just some info for the non-botters. --Ligulem 09:13, 10 September 2006 (UTC)

Task completed (I did approx. 500 edits) --Ligulem 22:32, 10 September 2006 (UTC)

Problem cases
All fixed (including Washington, D.C.).

Problems in template

 * established_title / _date use a different numbering scheme (number 1 is left out) ← very odd, but needs to be kept for compatibility with existing calls
 * established_title     =
 * established_date      =
 * established_title2    =
 * established_date2     =
 * established_title3    =
 * established_date3     =
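Any script that generates these parameter names has to account for the gap. A small Python sketch of the two numbering schemes follows; the function `param_name` and its `skips_one` flag are hypothetical names introduced for this example.

```python
def param_name(base: str, index: int, skips_one: bool = False) -> str:
    """Return the parameter name for the zero-based row `index`.

    Row 0 always uses the bare name. With skips_one=True (the established_*
    scheme) later rows are numbered 2, 3, ...; otherwise (the subdivision_*
    and leader_* scheme) they are numbered 1, 2, ...
    """
    if index == 0:
        return base
    suffix = index + 1 if skips_one else index
    return f"{base}{suffix}"


if __name__ == "__main__":
    print([param_name("established_title", i, skips_one=True) for i in range(3)])
    print([param_name("subdivision_type", i) for i in range(3)])
```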