Wikipedia:Bots/Requests for approval/BU RoBOT 6


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.

BU RoBOT 6
Operator:

Time filed: 04:12, Saturday, September 26, 2015 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available: AWB

Function overview: Substitute from sandbox for Infobox China station and Infobox Japan station as per TfDs.

Links to relevant discussions (where appropriate):
 * Templates_for_discussion/Log/2015_February_8
 * Templates_for_discussion/Log/2015_May_9

Edit period(s): One time run

Estimated number of pages affected: 1,722 + 5,474 = 7,196

Exclusion compliant (Yes/No): Yes, per AWB default

Already has a bot flag (Yes/No): Yes

Function details: This task will substitute from the sandbox of both templates to complete the merge into Infobox station. You can find the sandbox versions at Infobox China station/sandbox and Infobox Japan station/sandbox. Testcases are at Template:Infobox China station/testcases and Template:Infobox Japan station/testcases. The sandbox versions (which were created by another editor) appear to work fine, and the task itself is technically trivial.

Discussion
I promoted the China sandbox to live before noticing this BRFA. It's working cleanly on the 25 I substed. Should be no issues. Alakzi did a good job on the rewrite. Bazj (talk) 13:25, 26 September 2015 (UTC)

—  Earwig   talk 05:16, 27 September 2015 (UTC)
 * Is there any particular way you want me to divide up this trial? Since it involves substituting two distinct infoboxes, I imagine there should be edits for both. I can just split it 25/25, or if you prefer to see 50 edits of each, I can do that too. ~ RobTalk 19:56, 27 September 2015 (UTC)
 * Hmm, good point. Let's do 50 each. —  Earwig   talk 21:01, 27 September 2015 (UTC)

Overall went well, but a few notes:
 * The sandbox versions as they stand add a lot of empty parameters. I consider this to be a positive thing, personally. The Japan/China station infoboxes have evolved over time and have not always had functionality to include many of these parameters. I do not believe alt names were an option under these templates, for instance. By adding in the empty parameters, it encourages editors to fill in relevant information that isn't currently present. I can edit the sandbox versions to take out empty parameters if you'd like, but I think that's a net negative considering we lose nothing but a few bytes of storage space by keeping them in.
 * Infobox China station had in-template categorization by administrative district, which appears to violate WP:TEMPLATECAT. Alakzi's solution when creating the sandbox version was apparently to have the previously automatic categories be inserted after the new infobox. This violates WP:ORDER. The order issue can't be fixed by genfixes, because AWB doesn't receive the output of the substitution before running the genfixes. It's possible to handle that part with AWB instead and then enable genfixes to fix the order issue. I'll look into that later today or tomorrow, as time permits.
 * When this bot task is approved, either template protection or full protection should be placed on both sandboxes. Any vandalism that occurred in the middle of the bot run would be automatically substituted onto potentially thousands of articles, which is a huge risk that doesn't have a very easy fix (mass rollback, I guess). Please don't place that protection yet, because I don't have the template editor permission and I'll need to remove a few lines if the categorization is handled by AWB instead of the sandbox version. ~ RobTalk 22:25, 27 September 2015 (UTC)
 * I've written regex to handle categorization in AWB and tested it. It works as expected. It was needed in the China station template, where Alakzi had created a piecemeal solution, but also in the Japan station template, where the automatic categorization was missed. Let me know if you want an extended trial on this; it probably wouldn't hurt. As a side note, both sandboxes are ready for template or full protection now. ~ RobTalk 02:01, 28 September 2015 (UTC)
 * Some things I noticed (so far):
 * Weird category placement: A few of these had that dangling category (e.g. Category:Railway stations in Shanghai). This then encourages a followup bot edit, which is less ideal. Is that what you're saying you've already addressed?
 * Template-merge formatting: Somewhat less of a bot issue and more of a template issue (possibly), but already a few editors have added to presumably make things easier to read.  It might be an idea, if this is a reasonable problem with merging the two templates, to either account for this in the template or potentially just use it pre-emptively while merging. I don't truly know if this applies (as I'm not a fluent speaker or reader of either Japanese or Mandarin), but examining the before-and-after versions does lead me to see why the bolded headers might be problematic when it comes to reading these languages (e.g., pre-merge version vs post-merge version of the subtitle, whereby the post-merge clearly looks more difficult to read, at least in my browser on my platform).
 * -- slakr \ talk / 00:59, 30 September 2015 (UTC)
 * The categories are already fixed, yes. Basically, the problem was that the templates used to automatically categorize articles, which is against guidelines. The editor who created the wrapper that I'm substituting handled these in a way that caused the weird category placement. I've taken that part of his code out and replaced it with additional find and replace rules in AWB which will handle this better. General fixes will need to be turned on for my fix, by the way.
 * I'll discuss this with WikiProject Japan. The easiest fix, in my opinion, is to make use of the Nobold template within templates such as Nihongo, which should contain all instances of Japanese text. I can't imagine any circumstance where bolded Japanese text would be desirable, but maybe they'll have a different take on this. ~ RobTalk 01:29, 30 September 2015 (UTC)
 * Link to the WikiProject Japan discussion is here. I've also alerted the accessibility WikiProject, as this concerns them. I don't believe bolded Japanese text can be considered to be accessible to those with visual impairments, which would certainly violate the spirit of WP:ACCESS and the word of the WMF's non-discrimination policy. ~ RobTalk 01:42, 30 September 2015 (UTC)
 * The replacement template has a lot of differences from the original (at least for the Japanese ones that I looked at), some subtle and some more glaring. In Hakata Station, for example, the image size gets mucked up and we lose the Station/駅 text. The layout is also slightly different – I guess we're going with the Infobox station standardized form, but honestly I prefer Infobox Japan station.
 * In the few examples I looked at, the empty parameters didn't concern me. Will need to look at more, but this isn't like the baseball player task where a lot of the added parameters would never apply, so I don't think we need to worry about that too much here. —  Earwig   talk 06:26, 30 September 2015 (UTC)
 * I've fixed that issue with the station. This task is a bit tricky for me because I'm inheriting the wrapper from an indefinitely blocked user, so I'm unable to consult them in how they put it together. In this case, it appears they left out if statements that are present in the original template. I've re-added them. If Station (or the Japanese character for it) appears in Infobox Japan station, it will now also appear in Infobox station. The merge discussion did not address stylistic uniformity within this template, and that can be achieved at a later date if anyone cares enough to get consensus on it. I doubt anyone will. ~ RobTalk 08:34, 30 September 2015 (UTC)
 * A couple of things about Infobox Japan station:–
 * Alakzi removed the image size parameter, probably without taking into account the portrait images used in the infobox (landscape is near-universally used in Infobox station); this is because the parameter can only accept pixels for size, whereas images with size of a multiple of  (which evidently he prefers) scale according to user preferences.
 * I removed "駅" and "Station" because typically in Infobox station and the other remaining railway station infoboxes, with the exception of Singapore only the station name (e.g. "Great Victoria Street" (Belfast), "Charles de Gaulle – Étoile" (Paris), "Corrientes" (Buenos Aires), "München Ost" (Munich), "Pacific Central" (Vancouver), "Central" (Sydney), "Shanghai Hongqiao" (Shanghai), "34th Street – Hudson Yards" (New York), "King's Cross" (London), "Seoul" (Seoul)) is shown in the header. I personally think it should be removed for consistency, but I don't particularly mind if it's kept.
 * Jc86035 (talk • contribs) Use {{re|Jc86035}} to reply to me 14:22, 30 September 2015 (UTC)
 * Reviewing more edits, it looks pretty good. The Chinese infobox seems to have been modified to transclude the main template already, so it's hard to tell if there are meaningful visual differences there, but I don't see hints of anything too problematic. Did you manage to fix the nobold issue with the Japanese infoboxes? I see the Chinese ones have the same issue, in case that wasn't noticed (1). Jc86035 presents a good case for leaving "Station" out, but I'm not entirely sure, since it has been the standard in this group of articles until now. (Singapore was brought up as an exception; is there a particular reason for that, or could we consider China/Japan exceptions too?) Perhaps we should consult the relevant WikiProject(s). That just leaves the image size bug, which I admit I don't understand fully. Any other concerns? —  Earwig   talk 07:36, 2 October 2015 (UTC)
 * The image size isn't a bug; there was previous consensus that Infobox station should have a larger default image size than is typical. We can debate whether that consensus was a good idea (it wasn't), but until a discussion about that is started, we should stick with the default for the merge target. To do otherwise would be to ignore the consensus there. I've corrected the nobold issue in the wrappers. The discussion taking place at the Japanese WikiProject shows no opposition to making these non-bold, and a more general solution (using nobold as the default for templates like Nihongo) does not affect this bot task. ~ RobTalk 07:49, 2 October 2015 (UTC)
 * Noted about image size; the reason I brought it up is because we are dropping that parameter regardless of its value (although the station infobox doesn't even have an image size parameter, so I guess we have no choice there). Maybe I misunderstood, but doesn't Nihongo already apply the no-bold effect? (That's what I gather from reading its source.) —  Earwig   talk 14:28, 2 October 2015 (UTC)
 * Please, please do not add templates like nobold through the wrappers which are going to be substituted; add them in Infobox station as something like . This makes it easier to remove if for some reason we decide we don't want it anymore. Jc86035 (talk • contribs) Use {{re|Jc86035}} to reply to me 10:39, 3 October 2015 (UTC)
 * Jc86035, if you'd like to code it that way, go ahead, and we can place this on hold until you make the necessary edits. My activity level has dropped off a cliff the past week, and I don't have time to do it myself. I think it's a non-issue since the WikiProject Japan discussion appears to conclude that bold in these subheadings is not desirable for the Japanese characters, and the Chinese characters are very similar. ~ RobTalk 15:13, 3 October 2015 (UTC)
 * I agree that we should keep the nobold/nihongo invocations out of the individual transclusions, but is there really no more general way of doing this than special cases for each language code? —  Earwig   talk 19:34, 3 October 2015 (UTC)
 * A yes/no switch nativename_nobold makes the most sense. Just don't have time to add it myself at the moment. ~ RobTalk 07:02, 4 October 2015 (UTC)
 * Alternatively we could have a style template for each infobox (like Amtrak style or TTC style), since Infobox station already has a parameter (style template; e.g. "Amtrak") for calling these like S-line does, and then change Infobox station so that it calls something from the switch parser function of the style template:
 * (Additions to actual template shown in bold.) Then Infobox station will call  from the style template. The addition to the class declaration makes Nihongo2 unnecessary, since all Nihongo2 does is add the class declaration   and the   language code. Jc86035 (talk • contribs) Use {{re|Jc86035}} to reply to me 14:46, 4 October 2015 (UTC)
 * That sounds like a good idea to me. Are you or BU Rob13 willing to implement it? —  Earwig   talk 03:39, 8 October 2015 (UTC)


 * Is it not more complicated to go that route than to just put in a nobold switch that works for all languages and all styles? Nihongo/nihongo2 are used outside of this template, so doing away with them here isn't really simplifying the template space. I don't really care what the final implementation looks like, to be honest, but this seems particularly messy to me. My time has disappeared for the coming weeks, as I'm trying to rush to finish my thesis around half a year ahead of schedule so I can send it off to potential grad schools as part of my application package, so I unfortunately couldn't code anything much more difficult than a nobold switch. ~ RobTalk 05:10, 8 October 2015 (UTC)
 * Possibly. You know, I think you might be right; it seems like the style templates are used for a different purpose than this and the native_name_lang is most closely related to what we're trying to do here. I'll let the final implementer do what they think is best. Don't worry about your free time; we'll get this stuff sorted out and you can run the bot when you are able. Hope the thesis goes well. —  Earwig   talk 05:29, 8 October 2015 (UTC)
 * I thought it seemed very oddly specific to make a style parameter simply for giving the local name, when we already have style templates which are much less arbitrary in what styling one could apply. Additionally, except for style and custom_header, there are currently no other options for styling, and adding something like native_name_nobold just adds complexity to the template as a whole. Again, adding this change would be helpful for other style templates for the rest of the 20,000 stations which Infobox station is used on.  I cannot apply this change to Infobox station without administrator or template editor assistance, although if it can go ahead I will experiment in the template sandbox. However, I will be unavailable to edit for much of the next week. Jc86035 (talk • contribs) Use {{re|Jc86035}} to reply to me 10:19, 9 October 2015 (UTC)

Since I've been mentioned repeatedly here, and, though it is unclear to me why Rob was "unable to consult with [me]", as I'd emailed him twice previously - and he'd responded the first time - I thought I'd make a note of the following:


 * WP:THUMBSIZE is the policy which states that images should not be specified in pixels without very good reason. Portraits can be scaled down using |image_upright=, if there are any. The |image_size= parameter was removed from the target infobox a few months prior, as a means to put a stop to its proliferation. (Images can still be specified in pixels using regular image syntax, but don't tell anybody.)
 * I do not believe that I removed any other information; at least, not intentionally. The "station" suffix was chopped off by Jc86035, IIRC.
 * The Chinese and Japanese Wikipedias use bold text for headings and titles, and infobox headers. Native speakers have got no trouble parsing emboldened characters, presumably. At any rate, this has got wider implications, and should be discussed at a more proximal location.
 * Yes, a successive edit would've been required to reorder the cats. If the categorisation can be reliably reproduced using AWB, then by all means.
 * I don't recall why I left the cat out for the Japan station infobox, but there must've been a reason. Perhaps the Japan station articles are already placed in all of the relevant categories? Perhaps the categorisation was unreliable, or the categories non-diffusing?

Well, I shall now retreat back to the shadows. Good luck with the replacement. Alakzi (talk) 14:10, 9 October 2015 (UTC)


 * Thanks for your comments, Alakzi. I was considering asking for your input on this, but given all that's been going on lately, I didn't want to burden you with even more things to deal with. I believe the only two outstanding concerns are regarding the "Station" suffix and the bolding issue. For the former: I like keeping the templates as similar as possible to their pre-merge states, in which case we should be preserving the suffixes (I note this is how jawiki displays them, for what it's worth), but I believe wider discussion may be warranted here. For the latter: based on the WPJapan discussion, I think we should be making the text non-bold. I'll repeat Rob's comment there, which says "The Japanese wiki sidesteps this by making their default font size larger than ours", as a response to "The Chinese and Japanese Wikipedias use bold text for headings and titles, and infobox headers". —  Earwig   talk 08:27, 11 October 2015 (UTC)

I've updated so that if you use, text will be unbolded. So, just be sure the bot adds native_name_lang=ja (for Japanese) and leaves the text for the parameter normal (e.g., "native_name=博多") and it should automagically work. The end result, html-wise, is a bit redundant as far as prettiness and nesting goes, but it should work (e.g., like this). -- slakr \ talk / 05:51, 5 November 2015 (UTC)


 * I made a couple tweaks to the substitution sandboxes. "Station" suffixes are removed, and slakr's change is supported now; the bot should add native_name_lang correctly and it should all work without per-template nobold invocations. Another trial to check over everything would be good. Rob, it does seem you are busy at the moment, so please take your time with this. I don't think you need to do anything aside from running the bot. —  Earwig   talk  07:02, 5 November 2015 (UTC)
 * Big thanks for stepping in and helping out with this. I'll run the trial probably in late November, after my first round of applications go out. I'm in crunch mode until then. Again, thanks for your help! ~ RobTalk 02:38, 8 November 2015 (UTC)
 * Started the trial, but there were some issues with the regex. I need to consider a better way to handle the "open" dates. We want categories to be created for "railway stations opened in 1988", but not for "railway stations opened in December 19, 1988", which means I need more specific regex here. I'll think on this more and have another go with better regex. ~ RobTalk 22:07, 13 November 2015 (UTC)
 * For the record, I went through the bot's original trial edits with AWB and updated the templates to use the new system. —  Earwig   talk 11:22, 20 November 2015 (UTC)
 * I'd like to withdraw this for now. It's just clogging up the BRFA queue at the moment, and if I'm being realistic, I'm not getting to this within the next month. It won't take long, but it's near the bottom of my priorities with everything else going on. I'll pick it back up if the task hasn't been done by someone else at a later date (January?) ~ RobTalk 20:13, 30 November 2015 (UTC)
 * I'll give it a shot: Bots/Requests for approval/EarwigBot 20. —  Earwig   talk 04:46, 1 December 2015 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.
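The opening-date problem raised in Rob's 13 November comment (a category is wanted for "railway stations opened in 1988", never for "railway stations opened in December 19, 1988") can be sketched as follows. This is only an illustration in Python, not the AWB find-and-replace rules actually used in the trial, which are not shown in the discussion. The pattern, the function names, the 1800-2099 year range, and the exact category wikitext are all assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical pattern: a four-digit year between 1800 and 2099 that is not
# part of a longer run of digits. The day "19" in "December 19, 1988" cannot
# match because the pattern needs four consecutive digits.
YEAR_RE = re.compile(r"(?<!\d)(1[89]\d{2}|20\d{2})(?!\d)")


def opening_year(opened: str) -> Optional[str]:
    """Return the first plausible year in an 'opened' value, or None."""
    match = YEAR_RE.search(opened)
    return match.group(1) if match else None


def opening_category(opened: str) -> Optional[str]:
    """Build the category wikitext from the year only, never the full date."""
    year = opening_year(opened)
    if year is None:
        # No recognisable year: add no category rather than a malformed one.
        return None
    return f"[[Category:Railway stations opened in {year}]]"


if __name__ == "__main__":
    for value in ["1988", "December 19, 1988", "19 December 1988", ""]:
        print(repr(value), "->", opening_category(value))
```

When a value contains more than one year, this sketch takes the first one; whether that is the right choice for values such as a reopening date would need checking against real articles before any bot run.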