Wikipedia:Bots/Requests for approval/Hazard-Bot 21


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Request Expired.

Hazard-Bot 21
Operator:

Time filed: 01:33, Tuesday May 28, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: GitHub

Function overview: Fixing citation style

Links to relevant discussions (where appropriate): bot request

Edit period(s): Periodic

Estimated number of pages affected: thousands

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Currently, it only fixes Help:CS1_errors, but I might implement more fixes as well (would separate approval be required in that case, since it's still fixing CS1 errors?).  Hazard-SJ  ✈   01:33, 28 May 2013 (UTC)

Discussion

 * ...simply beautiful, my friend. Pywikipedia strikes again! :) On a related note, task 13 on my bot also deals with a CS1 error.  Theopolisme ( talk )  01:41, 28 May 2013 (UTC)
 * Yes, thanks :) Also, I believe I remember seeing that task, and I'm currently considering implementation of Help:CS1_errors. In such a case, it might be useful to add your task to it as well ;)  Hazard-SJ  ✈   02:44, 28 May 2013 (UTC)
 * Questions
 * Does it fix parameters in citation, or only those that start with "cite" or "web"? (noob question based on line 52)
 * Does it fix parameters that contain templates (e.g. ), or should that be done in a separate bot task?
 * Thanks! GoingBatty (talk) 01:58, 28 May 2013 (UTC)


 * Answers:
 * With that version of the code, only those starting with either "cite" or "web". I've just added support for citation as well in this change, though I plan on coming up with a better and more specific list soon.
 * As of this change, it should remove templates as well.  Hazard-SJ  ✈   02:44, 28 May 2013 (UTC)
 * You're welcome, and thanks for the code review ;)  Hazard-SJ  ✈   02:44, 28 May 2013 (UTC)
 *  ·Add§hore·  Talk To Me! 08:33, 28 May 2013 (UTC)
 * with [//en.wikipedia.org/w/index.php?title=Special:Contributions/Hazard-Bot&offset=20130529002700&limit=52&target=Hazard-Bot 52 edits]. Also, would I need separate approvals per what I mentioned above?  Hazard-SJ  ✈   00:29, 29 May 2013 (UTC)
 * I don't know if it requires separate approvals or not, but if it is all the same type of errors, couldn't you just trial an assortment of errors, BAG folk? -68.107.136.227 (talk) 03:13, 29 May 2013 (UTC)
 * Seems useful, catches an error, although the explanation template is too much to read. -68.107.136.227 (talk) 22:43, 28 May 2013 (UTC)
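The template-name filter described in the answers above (names starting with "cite" or "web", plus citation) could be sketched roughly as follows. This is only an illustration of the rule as stated in the discussion, not the bot's actual code; the function name is hypothetical.

```python
def is_citation_template(name: str) -> bool:
    """Return True if a template name falls under the rule described above.

    Assumption: names are compared case-insensitively after trimming
    whitespace. The function name is hypothetical.
    """
    normalized = name.strip().lower()
    # Names starting with "cite" or "web", or exactly "citation".
    return normalized.startswith(("cite", "web")) or normalized == "citation"
```

A prefix test like this is broad, which is presumably why the operator mentions wanting a more specific list later.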

These are surely not a good idea:

This is also not ideal, a better outcome would be to shift the subscription flag to the citation's via parameter, e.g.:

Dragons flight (talk) 03:51, 29 May 2013 (UTC)


 * [//github.com/HazardSJ/English-Wikipedia/commit/1b18ce6db904ec6655b517959ec93f668f2a21dc This commit] fixes the subscription issue (and potentially others?). I'm still thinking about the lang issues. Should I just remove the template, but leave the value of 2 which is the text itself, or put the entire citation template in the lang template (I think this is unwise)? Otherwise I'd probably have to either just skip those errors, or set 2 to a null value before the citation template, leaving the actual value in the citation template. What do you suggest?  Hazard-SJ  ✈   07:02, 1 June 2013 (UTC)
 * I suggest converting lang to language and leaving the value of 2 in the title parameter (e.g. ), and then deleting any duplicated language parameter. GoingBatty (talk) 15:16, 1 June 2013 (UTC)
 * I have an uncommitted fix for it, but I'm trying to work out [//en.wikipedia.org/w/index.php?title=Alexandrov_topology&diff=557905433&oldid=557900529 this issue] (see bug 2700).  Hazard-SJ  ✈   01:24, 2 June 2013 (UTC)
 * Fixed in [//github.com/HazardSJ/English-Wikipedia/commit/04e7f89af005d06f6648e259376bf63d777c4b73 this commit].  Hazard-SJ  ✈   02:52, 2 June 2013 (UTC)
 * Is this ready for another trial? please ping me with your response :)  ·addshore·  talk to me! 09:12, 2 June 2013 (UTC)
 * No, I'm not yet ready, I'd like to improve the code and add a few more features first. I'll keep you updated.  Hazard-SJ  ✈   22:59, 3 June 2013 (UTC)
 * [//github.com/HazardSJ/English-Wikipedia/commit/8600350a784904e292fe7dcd8dfdf56d106cc96e (diff)] I added some more features and did some code clean-up, so I think I'm ready again. As a side note, [//en.wikipedia.org/w/index.php?title=Special:Contributions/Hazard-Bot&offset=20130606004042&limit=31&target=Hazard-Bot these edits] were accidentally made, though with an outdated version of the code.  Hazard-SJ  ✈   02:36, 7 June 2013 (UTC)
 *  ·addshore·  talk to me! 08:17, 7 June 2013 (UTC)
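The suggestion made earlier in this thread (convert lang to language, keep the text in the title parameter, and avoid a duplicated language parameter) might look roughly like the sketch below. It is not the bot's code: the function name, the tiny language table and the regular expressions are all assumptions, and it only handles simple citations where the lang template contains no pipes or nested braces.

```python
import re

# Hypothetical, deliberately tiny lookup; a real task would need a full table.
LANGUAGE_NAMES = {"fr": "French", "de": "German", "ru": "Russian"}

# Matches |title={{lang|xx|text}} where text contains no pipes or braces.
LANG_IN_TITLE = re.compile(
    r"(\|\s*title\s*=\s*)\{\{\s*lang\s*\|\s*([a-z]{2,3})\s*\|([^{}|]*)\}\}",
    re.IGNORECASE,
)
# Matches a language parameter that already has a non-empty value.
HAS_LANGUAGE = re.compile(r"\|\s*language\s*=\s*[^|}\s]", re.IGNORECASE)


def unwrap_lang_in_title(citation: str) -> str:
    """Replace {{lang|xx|text}} in |title= with plain text plus |language=."""
    match = LANG_IN_TITLE.search(citation)
    if match is None:
        return citation
    name = LANGUAGE_NAMES.get(match.group(2).lower())
    if name is None:
        # Unknown language code: leave the citation alone rather than guess.
        return citation
    text = match.group(3).strip()
    result = citation[:match.start()] + match.group(1) + text + citation[match.end():]
    if not HAS_LANGUAGE.search(result):
        # Append the parameter just before the citation's closing braces.
        result = re.sub(r"\s*\}\}\s*$", " |language=" + name + "}}", result, count=1)
    return result
```

Returning the input unchanged for unknown codes keeps the sketch conservative, in line with the concern about false positives raised in this discussion.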

([//en.wikipedia.org/wiki/Special:Contributions/Hazard-Bot?offset=20130611041500&limit=55 edits])  Hazard-SJ  ✈   04:20, 11 June 2013 (UTC)
 * Edits such as and  look a bit odd.  GoingBatty (talk) 04:36, 11 June 2013 (UTC)


 * etc.
 * (Don't see this task explicitly listed, but I see you mention it in the discussion, so posting here) I strongly suggest you limit the automatic edits to article space, to avoid edits like this. We generally treat all bots as article-only, unless otherwise stated. In this case, it is not uncommon to have wrong citation style as examples or problem tests, and the bot should not assume other namespaces require automatic correction. — HELL KNOWZ  ▎TALK 07:37, 11 June 2013 (UTC)
 * [//github.com/HazardSJ/enwiki/commit/e28e21eacd09159346156aaa89fd1d25e4f951b2 Fixed]  Hazard-SJ  ✈   03:46, 16 June 2013 (UTC)
 *  ·addshore·  talk to me! 18:46, 16 June 2013 (UTC)
 * Hazard, I'm not a pywikipedia expert and don't pretend to be, but doesn't  just return the namespace number? I bet I'm just missing something, but an explanation'd be great. Thanks!  Theopolisme  ( talk )  04:35, 17 June 2013 (UTC)
 * You're correct about it returning the namespace number, but remember that 0 in Python evaluates as False, and any other number evaluates as True, so in other words, if the namespace number is not zero, it continues to the next page.   Hazard-SJ  ✈   05:54, 18 June 2013 (UTC)
 * *headdesk*, duh :p  Theopolisme ( talk )  14:18, 18 June 2013 (UTC)
 * ([//en.wikipedia.org/wiki/Special:Contributions/Hazard-Bot?limit=49&offset=20130622171000 edits]); I haven't checked them all as yet (it somehow only did 49, though), but an obvious problem so far is the comments being copied from archiveurl to the url where present.  Hazard-SJ  ✈   17:14, 22 June 2013 (UTC)
 * I only checked half the edits. Some of these edits may be garbage in, garbage out, but they look strange:
 * In and, I wouldn't expect citation needed within .
 * In and, the URLs were already in the deadurl parameter.
 * moved | from the title parameter to the end of the reference. — Preceding unsigned comment added by GoingBatty (talk • contribs) 03:21, 23 June 2013 (UTC)
 * I have raised similar concerns on your talk page and [//en.wikipedia.org/w/index.php?title=User:Hazard-Bot/DoTask/21&diff=561209352&oldid=557255710 disabled the task] for good measure. Graham 87 14:13, 23 June 2013 (UTC)
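The namespace check explained earlier in this thread relies on Python treating 0 as false and every other integer as true. A minimal sketch, without pywikipedia: the (title, namespace number) pairs are hypothetical stand-ins for whatever the page object reports, and the function name is an assumption.

```python
def should_skip(namespace_number: int) -> bool:
    """Return True for any page outside the article (main) namespace."""
    # 0 is falsy and every other integer is truthy, so this is
    # equivalent to `namespace_number != 0`.
    return bool(namespace_number)


# Hypothetical sample: main namespace is 0, Talk is 1, User is 2.
pages = [
    ("Alexandrov topology", 0),
    ("Talk:Example", 1),
    ("User:Example/sandbox", 2),
]
to_edit = [title for title, ns in pages if not should_skip(ns)]
```

Writing the comparison out as `!= 0` would behave identically and may be easier for reviewers to read.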

The ongoing errors are a bit of a worry for me, especially as this bot is running in the article space. I'm leaning towards denying this task. -- Chris 13:20, 24 June 2013 (UTC)
 * I will leave out the part of the code that moves templates out of citation templates (maybe the language replacements are okay, since that's specifically hard-coded?). Also, I can code the bot to not make replacements in ref tags (as I did on a recently approved task). Also, I will have it check for deadurl as well.  Hazard-SJ  ✈   02:41, 25 June 2013 (UTC)
 * BAGAssistanceNeeded In response to what I've said, [//github.com/HazardSJ/enwiki/commit/0c8bf814b0663cbc00cce52a9a286adec4ab5218 may I have another trial please]?  Hazard-SJ  ✈   00:35, 3 July 2013 (UTC)
 *  ·addshore·  talk to me! 12:01, 21 July 2013 (UTC)
 * Started, only did these so far, I'll try to finish when I get back online.  Hazard-SJ  ✈   04:02, 24 August 2013 (UTC)
 * In, the bot added 02 March 2012 - the leading zero is not needed. It would also be nice (but too much to ask for?) if the bot could have detected that the reference already had an archivedate and was just missing the pipe.  GoingBatty (talk) 14:07, 24 August 2013 (UTC)
 * This change should strip the leading "0" if present. As for the pipe issue, there are hopefully few such cases, and I'm not sure that all such mistakes would be in the same format, but if it's a frequent issue I could write a pattern to attempt it (while steering clear of false positives).  Hazard-SJ  ✈   02:45, 25 August 2013 (UTC)
 * I thought this task was about removing links and templates from citation templates, but these edits are fixing archive link errors...:Jay8g [ V•T•E ] 18:07, 25 August 2013 (UTC)
 * This task focuses on errors related to citation templates, which include the archive links. As for the templates, it's better to have that part more specific (hard-coded for certain templates only, because, as seen above, a general approach can cause many problems). IIRC, there isn't a problem with links. However, thanks for the mention, it caused me to look back at the code and notice that I disabled the entire link/template section rather than just the part that isn't specific as it pertains to templates (which, as I said, can be very troublesome). [//github.com/HazardSJ/enwiki/commit/c39f0004605a53fe5fc3bc60a34fdf54ac259d9b That has now been fixed in the code].  Hazard-SJ  ✈   05:15, 28 August 2013 (UTC)
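The leading-zero fix mentioned above (02 March 2012 becoming 2 March 2012) can be illustrated with a small sketch. It is an assumption about how such a fix might be written, not the linked change itself, and it only covers day-first dates where the day is followed by a month name; the function name is hypothetical.

```python
import re

# A leading "0" followed by one digit, whitespace and a letter,
# e.g. "02 March 2012". ISO-style dates such as "2012-03-02" do not match.
LEADING_ZERO_DAY = re.compile(r"^0(?=\d\s+[A-Za-z])")


def strip_leading_zero(date_text: str) -> str:
    """Drop a leading zero from the day in a 'DD Month YYYY' date."""
    return LEADING_ZERO_DAY.sub("", date_text.strip())
```

Dates in other layouts (for example month-first) are returned unchanged by this sketch.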

I'll also be adding [//en.wikipedia.org/w/index.php?title=Wikipedia:Bot_requests&oldid=570496002#Category:Pages_using_citations_with_accessdate_and_no_URL this], and as I mentioned before, possibly others in the future.  Hazard-SJ  ✈   06:07, 28 August 2013 (UTC)
 * OK, [//github.com/HazardSJ/enwiki/commit/3755e84dfcf56716d60370fd2e6cde61e6d1a344 here it is].  Hazard-SJ  ✈   06:25, 28 August 2013 (UTC)
 * I just resumed the trial, and from the above code, got [//en.wikipedia.org/wiki/Special:Contributions/Hazard-Bot?limit=15&offset=20130828062828 these]. I, however, stopped the trial to disable that part for now, so I can get some of the other parts involved.  Hazard-SJ  ✈   06:30, 28 August 2013 (UTC)
 * OK, continuing from above, I resumed with a trial that actually made [//en.wikipedia.org/wiki/Special:Contributions/Hazard-Bot?limit=49&offset=20130828071314 49 more edits]. It would have actually been 50, had the attempt to edit Georgia O'Keeffe actually been successful. The attempt was:
 * which, according to my checks, failed because of a spam filter for medaloffreedom.com on MediaWiki:Spam-blacklist.
 * Error-wise, I picked up things like [//en.wikipedia.org/w/index.php?title=Grave_of_the_Fireflies_%282005_film%29&diff=prev&oldid=570499977], [//en.wikipedia.org/w/index.php?title=FMetro&diff=prev&oldid=570499540], [//en.wikipedia.org/w/index.php?title=Gaoqiao,_Kai_County&diff=prev&oldid=570499696], and [//en.wikipedia.org/w/index.php?title=Der_K%C3%B6nig_Kandaules&diff=prev&oldid=570498803], all of which are as a result of the bot not having a record of those language codes (well, at least one of them was invalid), and [//en.wikipedia.org/w/index.php?title=Dutch_East_Indies&diff=prev&oldid=570498980 this], which the bot wouldn't have been able to correctly fix. As for the first issue, I'll simply fix it by, firstly, updating the list of languages it's aware of, and secondly, to avoid repetition of this, the bot will no longer add, but rather, either leave the lang template or remove it if there's already a language parameter set. I hope this request is more promising now. Thanks,   Hazard-SJ  ✈   07:39, 28 August 2013 (UTC)

Language
lang and language do different things. Please read up on the former, which should not be removed without thought. This page is difficult to read because of your garish sig. Please tone it down. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 19:58, 28 August 2013 (UTC)
 * - Template:Lang states "This template also includes a categorisation link when used by main namespace pages, therefore it should not be included inside a wikilink." Since the title parameter of cite web contains a wikilink, using the lang template within the title causes a visible categorisation error in the reference: see Compagnie des Transports Strasbourgeois for an example.  Your thoughts on the best way to fix these errors would be appreciated.  Thanks!  GoingBatty (talk) 23:28, 28 August 2013 (UTC)
 * Correct. Hazard SJ 08:17, 29 August 2013 (UTC)
 * Good: Zut alors!. Bad: Zut alors! . Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 10:16, 29 August 2013 (UTC)
 * - Also Bad:  — Preceding unsigned comment added by GoingBatty (talk • contribs) 17:20, 29 August 2013‎
 * How else do you suggest that we mark up the titles of non-English works, such that the emitted HTML complies with HTML and WCAG standards? (And where would be a better place to discuss this issue; which is probably drifting from relevance here?) Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 22:30, 29 August 2013 (UTC)
 * What exactly is your issue with language? Hazard SJ 02:18, 30 August 2013 (UTC)
 * None whatsoever. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 16:29, 30 August 2013 (UTC)
 * - I was suggesting that this bot be coded so it would change my example above to .  If Hazard-SJ doesn't want to include that in the scope of this bot, then I agree we should stop discussing it here.  GoingBatty (talk) 16:48, 31 August 2013 (UTC)
 * Yes; and my point is that that contains nothing that indicates that the phrase "Zut alors!" is not English. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 17:10, 31 August 2013 (UTC)
 * - The language parameter indicates that the reference text (and therefore probably the title too) are not English. I'm open to alternate suggestions that do not produce errors.  GoingBatty (talk) 17:40, 31 August 2013 (UTC)
 * The use of language may suggest a probability that the title is in another language, but it does not guarantee it; and it does not indicate it in the emitted HTML, as does lang, through the use of the appropriate HTML attribute, as described in the latter's documentation, to which I referred you earlier.  Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 17:58, 31 August 2013 (UTC)
 * - Could you please provide an alternate suggestion for Hazard-SJ for removing the errors generated by using lang in the title parameter? GoingBatty (talk) 18:17, 31 August 2013 (UTC)
 * What errors? Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 17:21, 1 September 2013 (UTC)
 * - Sorry I haven't been able to explain this properly, so let me try again. Up above on June 1, I provided, which fixed the Wikilink embedded in URL title error on .  Similarly,  by another editor fixed similar errors on .  Both of these examples demonstrate that using lang in the title parameter of a citation template produces a visible error and categorizes the article in Category:Pages with citations having wikilinks embedded in URL titles, and that my suggestion to Hazard-SJ for fixing them is to remove lang and use language instead.  GoingBatty (talk) 00:30, 2 September 2013 (UTC)

Thank you; I wasn't aware of you having attempted an explanation previously. lang does not emit or cause to be emitted a link in the text which it contains; the issue appears to be the emitted category. The solution would therefore seem to be one of: ask for that template to not emit a category when used in a reference; change the way in which it emits a category; have a sister template for use in references, which does not emit a category; or have the functionality embedded in the citation template core itself. The last is probably the most elegant solution. As I said above, language does not have the same functionality as lang and is not a workable alternative to it. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 10:07, 2 September 2013 (UTC)
 * P.S. In the interim, the template could be commented out, allowing its later reinstatement. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 11:08, 2 September 2013 (UTC)

I seem to have missed out a lot here, but to put it simply, my bot has already had that feature for some time now. The current version works for both lang and style templates, as should be seen from the trials. In that case, if there's anything I missed, please let me know. Hazard SJ 01:29, 3 September 2013 (UTC)

Experienced wikilink error remover offering help
I have fixed about 5,000 of the 8,000+ articles that were in Category:Pages_with_citations_having_wikilinks_embedded_in_URL_titles. I have often thought about having a bot do the tasks, but they are so varied that it seems like less than 50% of them should be fixed by a bot. I'll be happy to have a discussion with you here or elsewhere about the fixes I have been making and the kinds of edge cases I have encountered. You can also look at my contribution history to see how I have handled any number of cases. – Jonesey95 (talk) 14:33, 30 August 2013 (UTC)
 * - You've done great work cleaning up this category! I hope the bot could fix the most common cases, but we'll still need passionate editors such as yourself to manually fix the edge cases.  Thanks!  GoingBatty (talk) 18:21, 31 August 2013 (UTC)
 * My bot already supports removing wikilinks from titles, but the category also lists those with templates in the titles (I believe, though I'm not sure), which is where an issue has been raised. I therefore have to hard-code this to handle specific instances of this (currently supports lang (see section above), and subscription required), so the rest would need manual review, unless they are general simple cases, which I could also hard-code. As we've established from the earlier trials, I can't just remove random templates, because there are far too many false positives. Hazard SJ 01:37, 3 September 2013 (UTC)
 * The templates that add articles to this category are those that generate wikilinks, either to articles, e.g. sic, or to categories, e.g. fr icon and lang and nihongo. If you're interested in bot-fixable editing of these citations, here's my advice:
 * Unilaterally change sic to an appropriate form that conforms to WP:MOS, if there is one. I haven't looked. Maybe [sic] or [sic].
 * Move subscription required outside the cite template but before the closing tag. Make sure to put a space between the closing braces for the cite template and the opening braces for the subscription template.
 * lang can be dealt with by stripping the lang template from the title parameter (or commenting it out), but you need to make sure that there is an appropriate language=XXXXX (using the full name of the language) parameter in the citation. As described in the discussion above, there does not appear to be an error-free way to indicate "This title (as opposed to the publisher's name or the work's name) is in language X" without generating a Lua error. Maybe someone will modify the cite template to make that option available.
 * nihongo and nihongo2 should follow the same rule, except that the language is always Japanese. nihongo and its ilk take multiple parameters separated by pipes, so they may not be bot-fixable. This template often requires the addition of the trans_title parameter to make it look right, and I don't think there is a general way for a bot to decide how to fix it in each case.
 * When I find XX icon, where XX is the two- or three-letter code for a language, I have been moving the template outside the cite template but before the closing tag. Make sure to put a space between the closing braces for the cite template and the opening braces for the XX icon template.
 * Note that many XX icon templates have redirects of the form XX, such as fr for fr icon. Not all two-letter versions of this template are redirects to the XX icon template, however.
 * Another note: for some reason, most articles that use ru icon have the incorrect parameter value language=ru instead of language=Russian . If your bot wants to fix those, that would be great.
 * There are other templates that cause problems, but I don't think they are bot-fixable. – Jonesey95 (talk) 19:19, 3 September 2013 (UTC)
 * If you want your bot to handle some or all of these situations, here are some options:
 * For sic, you could add y to suppress the wikilink.
 * You could change to y.
 * You could also change XX icon to the appropriate language parameter.
 * Thanks! GoingBatty (talk) 16:47, 4 September 2013 (UTC)
 * Thanks. For subscription required I've been switching them to via. Hazard SJ 04:43, 5 September 2013 (UTC)
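One of the options listed above, moving subscription required outside the cite template with a space between the two sets of braces, could be sketched as below. This illustrates that manual approach only (the operator says the bot switches to via instead); the function name and regular expression are assumptions, and the input is expected to be a single citation template.

```python
import re

# {{subscription required}} plus any whitespace immediately before it.
SUBSCRIPTION = re.compile(r"\s*\{\{\s*subscription required\s*\}\}", re.IGNORECASE)


def move_subscription_outside(citation: str) -> str:
    """Move {{subscription required}} from inside a cite template to after it."""
    inner = citation.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        return citation
    # Text between the citation's own outer braces.
    body = inner[2:-2]
    cleaned, count = SUBSCRIPTION.subn("", body)
    if count == 0:
        # Nothing nested inside the citation: leave the text as it was.
        return citation
    # Note the single space between the closing and opening braces.
    return "{{" + cleaned + "}} {{subscription required}}"
```

Because only a template nested inside the outer braces is matched, text where the subscription template already follows the citation is returned unchanged.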


 * For the templates that have been converted to Lua (,, , , , , , , , and ) a better solution to  issues is to set the CS1 template parameter yes.  When  contains via, then also set the CS1 template's via parameter. — Preceding unsigned comment added by Trappist the monk (talk • contribs) 13:14, 5 September 2013‎


 * The Lua-based CS1 templates now support ISO 639-1 codes in language. This provides the same categorization as the  templates.


 * —Trappist the monk (talk) 10:00, 25 September 2013 (UTC)

There have been several improvements and extended trials. Are there any outstanding issues left? Hasteur (talk) 01:12, 21 September 2013 (UTC)
 * Well, with what is in mind from the last trial, with the fixes I said, that's basically it. There are probably one or two cases of the wikilink removal instances above I could hard-code as well, but, again, as far as I'm aware, that's about it. Hazard SJ 03:16, 25 September 2013 (UTC)
 * I read that response as: since the last trial there have been code changes. Is that right? Josh Parris 10:33, 5 November 2013 (UTC)

Operator has edited on four days in the last two months and has become unresponsive. I'm expiring this without prejudice; the operator is welcome to re-open. Josh Parris 07:40, 17 November 2013 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.