User talk:Citation bot/Archive 20

Fails to expand cite arxiv on a specific page
https://en.wikipedia.org/w/index.php?title=List_of_exoplanets_discovered_in_2019&type=revision&diff=947872230&oldid=947858354 Back in the old days the bot would have run on this page. We do a lot better checking now, since things can go really off the rails when the templates are not right. AManWithNoPlan (talk) 23:48, 28 March 2020 (UTC)
 * Old days? Since when, and for what reason, does the bot not expand a cite arxiv? Headbomb {t · c · p · b} 23:56, 28 March 2020 (UTC)
 * Never mind, a bad template somewhere else on the page. I'll check for those in the future. However, it would be good to know what the error was that caused the bot not to edit. Headbomb {t · c · p · b}
 * it will often report an error, but it’s really hard to figure out what exactly is wrong. It’s quite possible that after the mis-parsing there was no cite template found. AManWithNoPlan (talk) 01:37, 29 March 2020 (UTC)

Needs two runs to get this almost right

 * Is it ever correct to add journal to a cite book, as this sequence of edits did? I would think series, as in your final edit, would be more appropriate. I think adding journal to cite book is unlikely enough to be correct that it should count as a bug by itself, even though it can be patched up. —David Eppstein (talk) 06:31, 29 March 2020 (UTC)
 * It's sometimes correct; I recall seeing cases where papers from Frontiers Media journals are republished as physical books. It's a garbage practice, but it is what it is. I could be mixing it up with the Frontiers series of books by Karger though (see Frontiers search for Karger). But some journal issues are assigned ISBNs, confusing the bot about whether they are journals or books. Headbomb {t · c · p · b} 06:49, 29 March 2020 (UTC)

Step 1 done: https://github.com/ms609/citation-bot/pull/2769 and now waiting on Step 2: https://github.com/ms609/citation-bot/pull/2770 AManWithNoPlan (talk) 12:11, 29 March 2020 (UTC)


 * It now takes two runs to get this right. It is one of those "you don't know what you don't know" situations. I will have to think about whether we can get it down to one without changing the bot so that it wastes a huge amount of effort 99.9% of the time. AManWithNoPlan (talk) 12:28, 29 March 2020 (UTC)
 * So, when I have free time, I will poke around what Crossref sends us. AManWithNoPlan (talk) 12:52, 29 March 2020 (UTC)

Publication place vs location
I've read the documentation before making citations, so was annoyed about the bot changing the parameters the "wrong" way. So I reverted it and tried to file a bug report, but upon finding User_talk:Citation_bot/Archive_15 I learned that the bot was "correct". But that discussion was a year ago, and nothing happened since. Quick fix would be to stop bot from making these edits until documentation is changed. Having it inconsistent wastes time of uninvolved people (like me) who need to search through bug report archives just to learn what is going on. Attomir (talk) 15:40, 31 March 2020 (UTC)
 * Simple, fix the doc. And you don't need to do anything, location is simply preferable to publication-place. Both work just fine, they are aliases of each other. Headbomb {t · c · p · b} 16:31, 31 March 2020 (UTC)

Removal of citation page range
I gather that some citation style guides say that referencing page numbers for journal articles (as opposed to books) is not needed, but I do not see an explicit prohibition in any enwiki guidelines. Is the bot doing this because it is a journal article or because it is short?

As an aside, I do not think getting the page range of the whole journal article is all that important nowadays, as no one reads them in print anymore... Attomir (talk) 15:43, 31 March 2020 (UTC)


 * the bot only updates single pages when the page listed is the first page. In the extremely rare case that one actually wants the first page only, you can add the bot bypass to that parameter.  AManWithNoPlan (talk) 16:13, 31 March 2020 (UTC)
 * "when the page listed is the first page" - but this was changed from "1493" to "1491–3", so not a first page (that would be "1491" if I understand you correctly). Attomir (talk) 21:35, 31 March 2020 (UTC)
 * Not sure why it changed it then. Will investigate. AManWithNoPlan (talk) 22:21, 31 March 2020 (UTC)
 * Figured it out: https://github.com/ms609/citation-bot/pull/2784 Once this is on the server, the bug will be fixed. The code expects 123-125, not 123-5. AManWithNoPlan (talk) 23:21, 31 March 2020 (UTC)
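
The mismatch described above (a full range such as 123-125 versus an abbreviated one such as 123-5) can be illustrated with a minimal sketch. This is written in Python for illustration and is not the bot's own code; the function names `expand_page_range` and `page_in_range` are hypothetical.

```python
import re


def expand_page_range(pages: str) -> str:
    """Expand an abbreviated range such as '1491–3' to '1491-1493'."""
    match = re.fullmatch(r"(\d+)\s*[-\u2013]\s*(\d+)", pages.strip())
    if not match:
        return pages  # not a simple numeric range; leave it untouched
    start, end = match.group(1), match.group(2)
    if len(end) < len(start):
        # Borrow the leading digits of the start page: '1491' + '3' -> '1493'
        end = start[: len(start) - len(end)] + end
    return f"{start}-{end}"


def page_in_range(page: str, pages: str) -> bool:
    """True if a single page number lies inside the (possibly abbreviated) range."""
    full = re.fullmatch(r"(\d+)-(\d+)", expand_page_range(pages))
    single = re.fullmatch(r"\d+", page.strip())
    if not full or not single:
        return False
    return int(full.group(1)) <= int(single.group(0)) <= int(full.group(2))
```

With this kind of expansion, "1493" is recognised as lying inside "1491–3" and not mistaken for a first page.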

Redundant sciencemag.org/cgi/ URLs

 * Uh? Not similar enough? goes to https://science.sciencemag.org/content/321/5891/931.abstract while the DOI goes to https://science.sciencemag.org/content/321/5891/931 . The only difference is the ".abstract" suffix and that's a bit like the "/doi/pdf/" etc. prefixes in wiley.com and others. It's not unreasonable to account for it given there are only a few variants on the theme. Nemo 21:16, 1 April 2020 (UTC)
 * Weird. It went to a different URL when I tried, but now to the one you list.  Grrrrr.  Seems to depend upon your client (ie. my phone).  AManWithNoPlan (talk) 21:47, 1 April 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2787 AManWithNoPlan (talk) 21:50, 1 April 2020 (UTC)
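
One way to treat the ".abstract" variant as equivalent to the DOI target is to normalise both URLs before comparing them. The following is only a sketch in Python, not what the pull request above does; `VIEW_SUFFIXES`, `normalize_article_url` and `is_redundant_with_doi` are hypothetical names, and the DOI target is assumed to have been resolved already (no network access is attempted, so client-dependent redirects are out of scope).

```python
from urllib.parse import urlsplit

# Suffixes assumed to select a view of the same article.
# Only ".abstract" comes from the discussion; further variants would be added here.
VIEW_SUFFIXES = (".abstract",)


def normalize_article_url(url: str) -> str:
    """Reduce a URL to host + path, dropping scheme, query and view suffixes."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    for suffix in VIEW_SUFFIXES:
        if path.endswith(suffix):
            path = path[: -len(suffix)]
            break
    return f"{parts.netloc.lower()}{path}"


def is_redundant_with_doi(url: str, doi_target: str) -> bool:
    """True if the URL points at the same article as the resolved DOI target."""
    return normalize_article_url(url) == normalize_article_url(doi_target)
```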

MOS:STRAIGHT
I'm reporting this because the user (User:Sb008) prefers to just have a go at the person who activated the bot rather than addressing the underlying issue. Timrollpickering (Talk) 16:15, 30 March 2020 (UTC)


 * This is not meant as a general problem but specifically in cites (and probably quotations). If the value of the title parameter matches with the title at the source, it should be left alone no matter what. Even if it contains spelling mistakes. What if the author put a deliberate spelling mistake in the title? Likewise formats used at the source should be respected and not be subjected to Anglo-Saxonization. After all it's a cite and not a personal description. The trans-title is a different story, changes can be made there. Solution: Don't perform any automated changes on titles in cites if the language parameter indicates a language different than English. Even with English cites one should be cautious; what seems a mistake could be intended. --Sb008 (talk) 17:33, 30 March 2020 (UTC)


 * It is a shame that some people incorrectly complain to the user that suggested that the bot look at a page instead of the person responsible (the bot). It also means that these issues never get fixed.  AManWithNoPlan (talk) 00:31, 31 March 2020 (UTC)
 * I think that the complaint actually goes against the manual of style on this. The MOS tells us to change the case of titles in many situations also, so perfect source matching is not desired.  AManWithNoPlan (talk) 00:39, 31 March 2020 (UTC)

VisualEditor vs. Citation bot
Two automatic systems, VisualEditor and Citation bot, are at odds over
 * 1) the URL parameter, and
 * 2) the oclc parameter.
The automatically generated WorldCat URL in the automatically generated Template:cite book seems to be this bone of contention.

For details, see also User:Lent/sandbox/VisualEditor_vs._Citation_bot This automation-driven WP:EDITWAR seems counter-productive. Or maybe we use it as entertainment: "Let's get ready to rumble!" :) Lent (talk) 01:44, 29 March 2020 (UTC)
 * This isn't an edit war. The visual editor puts things sub-optimally. Citation bot cleans it. VE doesn't re-add things after cleanup. The solution is to fix the visual editor to output a correctly-formatted citation in the first place. Headbomb {t · c · p · b} 02:06, 29 March 2020 (UTC)
 * Agreed. This sort of cleanup of frequently-inserted problems in citations is exactly what Citation bot is good for; it is not a bug. It would be better for VE to get it right the first time, so that Citation bot didn't need to step in, but there is no good reason to preserve the URL in this case and Citation bot is doing the right thing by removing it. —David Eppstein (talk) 02:26, 29 March 2020 (UTC)
 * I think something similar happens when citing journal articles: the Visual Editor inserts URLs that Citation bot finds redundant for duplicating the DOI. XOR'easter (talk) 02:38, 29 March 2020 (UTC)

As I was working backwards from the last change, I started with Citation bot, but I have also opened a task on VisualEditor [T248770] VisualEditor Automatic cite book generated from isbn vs. Citation bot
 * This has been merged into [T232771] In cite journal, do not add URL expanded from existing identifier like DOI which was opened Sep 12 2019, 3:01 PM.

Would knowledgeable editors please go instead to [T232771] and add more details to the task, clarifying the problem and suggested fixes? Lent (talk) 03:07, 29 March 2020 (UTC)

Title param added when chapter and encyclopedia are already present
This may be problem with book's metadata, but is there a legitimate situation in which all three of "title", "chapter", and "work/encyclopedia/etc." are needed in one citation?

As an aside, what is the rationale of changing title/work to chapter/encyclopedia? When I was creating the citations for Ullmann's in the first place, I could see there were 2 conventions for existing articles, one was using "chapter" for chapter name and "title" for encyclopedia name, other used "title" and "work" respectively. Why is the former preferred? Attomir (talk) 15:49, 31 March 2020 (UTC)

Turns range of hyphenated page numbers into range of ranges of page numbers
https://github.com/ms609/citation-bot/pull/2798 This should fix it once implemented. If not, I will keep digging, since all dash changes should go through this function first. The use of HTML dashes was not anticipated, since that's not supposed to be done in general. AManWithNoPlan (talk) 11:53, 4 April 2020 (UTC)
 * I will also add detection for html like "span", since that really is gross. AManWithNoPlan (talk) 12:08, 4 April 2020 (UTC)
 * That is a citation template issue not a bot issue, but if you have some idea how to persuade the template to show the spaces around the dash with less-gross coding, please let me know. I guess double parens is what the template doc suggests; it's also ugly, but not quite so bad. The bot understands those, right? —David Eppstein (talk) 17:18, 4 April 2020 (UTC)
 * Yes, that is something we got a while ago. It is nasty wiki markup, but it looks nice. AManWithNoPlan (talk) 19:10, 4 April 2020 (UTC)
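
The three cases raised in this thread (HTML dash entities, hand-written HTML such as "span", and the double-parens markup) can be sketched as a single guard in front of the dash conversion. This is a Python illustration under stated assumptions, not the bot's function; `normalize_pages` is a hypothetical name, and the rule that an explicit dash already marks the range separator is an assumption drawn from the bug title.

```python
import html
import re


def normalize_pages(value: str) -> str:
    """Convert a plain 'start-end' range to an en dash, leaving special cases alone."""
    stripped = value.strip()
    # ((...)) asks the citation template to accept the value as written.
    if stripped.startswith("((") and stripped.endswith("))"):
        return value
    # HTML tags such as <span> signal hand-crafted layout; do not touch it.
    if re.search(r"<[a-zA-Z]", stripped):
        return value
    decoded = html.unescape(stripped)  # e.g. '&ndash;' becomes an en dash
    if "\u2013" in decoded or "\u2014" in decoded:
        # A dash already separates the range, so any hyphens belong to the
        # page numbers themselves (e.g. 3-1 to 3-15) and must stay hyphens.
        return decoded
    # Only a simple numeric range gets its hyphen converted.
    return re.sub(r"^(\d+)-(\d+)$", "\\1\u2013\\2", decoded)
```

Decoding the entity before looking for dashes is the step that keeps "3-1&ndash;3-15" from becoming a range of ranges.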

splitting out links from title to title-link
Yesterday I realised Citation bot is possibly running another unapproved bot task, i.e. transforming wikilinks in the "title" parameter to content of the "title-link" parameter, see Help talk:Citation Style 1. Is there a reason why the bot operates these transformations? --Francis Schonken (talk) 06:39, 5 April 2020 (UTC)
 * Linked titles corrupt COINS metadata, so the link is put in its own parameter, per template doc. Headbomb {t · c · p · b} 08:41, 5 April 2020 (UTC)
 * Afaics, template doc nowhere says that a wikilink in the "title" parameter corrupts COinS:
 * Template:Cite book/doc expresses no preference for either a wikilink in the title parameter or a wikilink defined via the title-link parameter.
 * Template:Cite book/doc is a template-copy of the csdoc COinS guidance (Template:Citation Style documentation), which apparently nowhere warns against wikilinks in the "title" (or any other) parameter: the COinS guidance does not even mention the existence of the title-link parameter as an alternative for a direct wikilink in the title parameter.
 * Hence my efforts to get an improved cite template documentation at Help talk:Citation Style 1. I'd continue the topic of an appropriate cite template documentation there. Unless the documentation is clear on the point, or the bot acquires a permission for the task elsewhere (e.g. a successful WP:BRFA), it should however not continue edits that would seem WP:COSMETICBOT and/or adding unnecessary layers of complexity to most editors, including those who have read the template documentation. --Francis Schonken (talk) 09:38, 5 April 2020 (UTC)
 * No. Wikilinks in title do not corrupt the COinS metadata.  Here are two examples; one has a wikilinked title and the other uses title-link.  The metadata for both are exactly the same
 * The only differences between these two renderings is where the module places the title's italic markup and the templatestyles stripmarkers. For completeness, here is the same example template without a linked title:
 * Metadata are the same.
 * —Trappist the monk (talk) 10:54, 5 April 2020 (UTC)

Thank you for the update on the handling of COINS data in CS1/CS2. It used to be much different. Once this is deployed, the bot will stop the splitting out. https://github.com/ms609/citation-bot/pull/2805 AManWithNoPlan (talk) 12:19, 5 April 2020 (UTC)


 * used to be notabug, but with improved Template code, it can now be flagged as fixed. AManWithNoPlan (talk) 17:49, 5 April 2020 (UTC)

some odd changes
Both the changes to Vincenty's formulae are bogus. Prepositions in French titles shouldn't be capitalized. The "issue=3" for Rainsford's paper is just wrong.
 * For the prepositions, the issue is that they follow dots, which means these are very nonstandard abbreviations. As for the third issue, that's because that's what's listed in the associated ADSABS entry. The problem might be a conflict between an old numbering scheme and a new numbering scheme. It could also be that ADSABS has the wrong information. Headbomb {t · c · p · b} 09:56, 5 April 2020 (UTC)
 * That journal's numbering is either changed or wrong in the bibcode. I added a comment to the page to stop the adding of issue numbers. Added French abbreviation to foreign words list. fixed AManWithNoPlan (talk) 17:46, 5 April 2020 (UTC)

URL removal that duplicates DOI
User:SandyGeorgia has a complaint about this in medical articles. The idea is that URLs that duplicate DOIs should be left in place if they are free. AManWithNoPlan (talk) 20:05, 23 March 2020 (UTC)


 * Thoughts on it staying if the url is free parameter is set? AManWithNoPlan (talk) 20:13, 23 March 2020 (UTC)
 * Perhaps some way to detect medical articles? AManWithNoPlan (talk) 20:28, 23 March 2020 (UTC)
 * I don't personally see the point of keeping duplicate URL/DOIs in hard code, but ideally those would be marked as free and the template should use that to automatically create links like it does for PMCs. And since this isn't currently possible, there might be a case to keep them for now. Headbomb {t · c · p · b} 21:20, 23 March 2020 (UTC)
 * There's nothing special about medical articles in this. Given "doi-access=free" is already used on almost 10k articles, maybe the time is ripe for a discussion at Help talk:Citation Style 1 on making it linkify the title (if no PMC is available). Nemo 07:24, 24 March 2020 (UTC)

Samples

From REM Sleep Behavior Disorder Single-Question Screen, for all the citations in the article, here is what our readers see:
 * 1) when the URL for non-PMC free full text is included and
 * 2) when the URL is deleted, because it duplicates what is in the DOI link.
In the second case, the impression that could be given to readers is that they can't read the full text of the second source. We cannot assume that our readers know that they can click on the DOI link in that one case to get to the free full text. We can't even assume they know what a DOI link is. We can't expect them, in a larger article, to click through to every DOI to see if, by chance, free full text is available. We WANT our readers to be able to access text as often as possible, and to be able to verify text, so in effect "hiding" free full text from the uninitiated reader is a disservice. Our readers may know that in ALL Wikipedia articles, a blue link in the title means they can read the article. The citation style I use is consistent with the Wikipedia-wide convention: that is, to provide a blue link in the title whenever free full text is available. We do this automatically when a PMC is available, but we have to do it manually when a PMC is not available, but free full text is otherwise available. I have been asking for a long time to stop changing the citation style in articles I edit; I hope not to have to install a deny bot template, because that would eliminate valuable bot edits. Please stop removing URLs to free full text when a PMC is not available, so citations will be consistent. Sandy Georgia (Talk) 22:53, 23 March 2020 (UTC)
 * I would simply like to confirm I have concerns similar to . Thank you. Djm-leighpark (talk) 08:18, 24 March 2020 (UTC)


 * https://github.com/ms609/citation-bot/pull/2757 Will only do it when there is a PMC and a DOI once this is deployed AManWithNoPlan (talk) 01:23, 26 March 2020 (UTC)


 * fixed, looking into rebooting bot to stop existing runs that do not load new code. AManWithNoPlan (talk) 11:16, 26 March 2020 (UTC)


 * This is a terrible decision. The bot should follow the best practices and what the documentation recommends. There is no purpose in keeping redundant URLs, often added by VisualEditor without the user even asking for them. Leaving redundant links makes it very hard to respect recommendations on putting only the best URL in the url parameter. Nemo 12:57, 26 March 2020 (UTC)


 * Could someone bring in links to policies etc that are relevant. Will revert if they support.  Honestly how is someone who can’t figure out links gonna understand a linked medical article. AManWithNoPlan (talk) 13:26, 26 March 2020 (UTC)
 * Help:Citation_Style_1 reflects the long-held consensus on not having redundant URLs and Citing sources is the overall guideline on avoiding paywalled links when possible. A discussion on Help talk:Citation Style 1 is the way to go to change the recommendations on the usage of the templates and what is turned into a link. Nemo 13:30, 26 March 2020 (UTC)

fixed closed for now. AManWithNoPlan (talk) 20:35, 6 April 2020 (UTC)
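
The rule announced for pull 2757 in this thread (remove a DOI-duplicating URL only when both a PMC and a DOI are present) reduces to a small predicate. The sketch below is in Python and is an assumption about the shape of the check, not the bot's code; `may_drop_duplicate_url` is a hypothetical name, and the caller is assumed to have already decided whether the URL duplicates the DOI.

```python
def may_drop_duplicate_url(citation: dict[str, str], url_matches_doi: bool) -> bool:
    """Allow removal of a DOI-duplicating URL only if a PMC still links the title."""
    has_doi = bool(citation.get("doi", "").strip())
    has_pmc = bool(citation.get("pmc", "").strip())
    # Without a PMC the title would lose its link to the free full text,
    # which is the outcome objected to in this thread.
    return url_matches_doi and has_doi and has_pmc
```

Under this rule a citation with a DOI but no PMC keeps its URL, so the title stays linked.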

ODNBsub template (again) etc.
The problem outlined (and supposedly fixed) here is back or still happening. We're still seeing a double "(subscription or UK public library membership required)", as happened here.

Can bot drivers also stop the bloody bot changing book url field to a chapter url field, and stop changing cite web to cite book every time they see a citation from WorldCat, as happened at, and ? I'm getting bored having to clear up when the bot passes through; the idea of a bot is to make life easier for editors, not add to their workload. - SchroCat (talk) 07:14, 5 April 2020 (UTC)


 * https://github.com/ms609/citation-bot/pull/2806 - much better non-chapter URL detection. AManWithNoPlan (talk) 11:53, 5 April 2020 (UTC)


 * And the other problems? - SchroCat (talk) 12:02, 5 April 2020 (UTC)

https://github.com/ms609/citation-bot/pull/2807 - if the OCLC URL has the word "edition" in it and the citation is not "cite book", then the URL conversion will not be done. AManWithNoPlan (talk) 12:03, 5 April 2020 (UTC)

https://github.com/ms609/citation-bot/pull/2808 - this should once deployed remove the library template in the case where the ref= parameter is set to a template. AManWithNoPlan (talk) 12:34, 5 April 2020 (UTC)

https://github.com/ms609/citation-bot/pull/2809 And a little more generic regex. AManWithNoPlan (talk) 12:49, 5 April 2020 (UTC)

That should do it. AManWithNoPlan (talk) 13:06, 5 April 2020 (UTC)

fixed AManWithNoPlan (talk) 20:32, 6 April 2020 (UTC)
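
The "edition" rule described for pull 2807 above can be written as a short guard. This is a Python sketch under assumptions, not the bot's implementation; `should_convert_worldcat_url` is a hypothetical name, and a plain substring match on the URL is assumed because the thread does not say how the word is detected.

```python
def should_convert_worldcat_url(url: str, template_name: str) -> bool:
    """Skip the URL conversion for 'edition' links outside cite book."""
    is_cite_book = template_name.strip().lower() == "cite book"
    if "edition" in url.lower() and not is_cite_book:
        # Edition listings in, say, a cite web are left as the editor wrote them.
        return False
    return True
```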

Cosmetic edit in parameter: ISBN -> isbn

 * It would be quite annoying to stop doing this because then we'd need to have two modes of operation for the bot and gadget, and users would then perform those changes with their personal accounts, which is both allowed and more noisy. Nemo 14:58, 10 April 2020 (UTC)
 * I do not understand this comment. It is annoying to do it (i.e. make a cosmetic edit), not to stop doing it. There is no need, and it is mildly disruptive, to change ISBN to isbn without making any other changes. If that is the only change that the bot or tool sees as an option, it should skip editing the article entirely, moving on to the next article. The bot/tool already has the option to not perform an edit to a selected article if it does not see anything in the article that needs changing; the bot/tool should not make cosmetic edits.
 * Is making this type of cosmetic edit approved as part of this bot's BRFA? If so, please link to it. – Jonesey95 (talk) 15:19, 10 April 2020 (UTC)
 * From Template:Cite book: "All parameter names must be in lowercase." Grimes2 (talk) 15:56, 10 April 2020 (UTC)
 * Thanks for catching that overly prescriptive language, which has never been enforced, and which conflicts with WP:CITEVAR. I have updated it to reflect long-standing practice. In any event, the edit linked above is still cosmetic, which is typically not allowed, per WP:COSMETICBOT. Does the bot have BRFA approval to perform these cosmetic edits? If not, it should stop performing them. – Jonesey95 (talk) 16:42, 10 April 2020 (UTC)
 * I think that wasn't meant to be prescriptive but rather a descriptive notification to users that the template will (generally) not work with sentence/upper case parameters. I do however agree that this is an instance of a cosmetic edit. --Izno (talk) 16:51, 10 April 2020 (UTC)
 * Are you sure? "Note: None of the parameters should be capitalised, in order to avoid the article being tagged as having a broken citation. For example, use "url", "title", etc. - not "URL", "Title", etc." Grimes2 (talk) 16:58, 10 April 2020 (UTC)
 * Yup. Module:Citation/CS1/Whitelist lists all the parameters; the only ones with upper case support are the identifiers, and those always have an all-lowercase variant, which, while it may be the canonical parameter, is an alias of the upper case form. As they do the same thing, the change is cosmetic and should not be performed unless there are other changes to make in the edit. The bot is clearly wrong here. --Izno (talk) 17:15, 10 April 2020 (UTC)
 * If the bot wants to pointlessly change ISBN to isbn while also adding or fixing useful information that changes the appearance of the article for readers, I will not complain. Cosmetic edits are no good, though. Does the bot have BRFA approval to perform these cosmetic edits? If not, it should stop performing them. – Jonesey95 (talk) 17:57, 10 April 2020 (UTC)
 * We could change the templates so that they fail when the uppercase parameter is used. Then the edits would no longer be cosmetic. Either way, the incorrect parameters will need to be fixed. Nemo 19:52, 10 April 2020 (UTC)
 * The parameters are not incorrect however. I do not know where this is coming from, but you're definitely into Help talk:CS1 territory and strictly out of the remit of the bot talk page. The bot is making cosmetic edits and should stop. --Izno (talk) 21:01, 10 April 2020 (UTC)

Izno is right here. It's fine when you're previewing the page and then manually saving it. But as an automatic-no-review edit, those (and many other edits) shouldn't be done when it's the only thing being done. Headbomb {t · c · p · b} 05:02, 11 April 2020 (UTC)


 * I think should catch most of them. https://github.com/ms609/citation-bot/pull/2812 AManWithNoPlan (talk) 11:54, 11 April 2020 (UTC)
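
The position taken in this thread (a parameter-case change such as ISBN to isbn is acceptable alongside other changes, but should not be saved on its own) can be sketched as a check run before saving. This is a Python illustration, not the code in the pull request above; `canonical` and `is_cosmetic_only` are hypothetical names, and a citation is modelled simply as a list of (name, value) pairs.

```python
Params = list[tuple[str, str]]


def canonical(params: Params) -> Params:
    """Lower-case parameter names and trim whitespace so aliases compare equal."""
    return [(name.strip().lower(), value.strip()) for name, value in params]


def is_cosmetic_only(before: Params, after: Params) -> bool:
    """True if the edit changes only the case or spacing of parameter names."""
    return before != after and canonical(before) == canonical(after)
```

A caller would skip the page when every changed template satisfies `is_cosmetic_only`, and save normally otherwise.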

After your edit to my draft
Not a single reference is working. Please make it proper as it was. Maizbhandariya (talk) 02:33, 12 April 2020 (UTC)


 * No one has any way of knowing what you are talking about. AManWithNoPlan (talk) 11:02, 12 April 2020 (UTC)


 * I think, he is talking about https://en.wikipedia.org/w/index.php?title=Draft:Saqib_Iqbal_Shami&diff=950435595&oldid=950435029 Grimes2 (talk) 11:29, 12 April 2020 (UTC)


 * So, a broken page was broken after the bot ran, so blame the bot?! notabug AManWithNoPlan (talk) 11:48, 12 April 2020 (UTC)

bad arxiv expansion
expands to when it should expand to Headbomb {t · c · p · b} 23:00, 9 April 2020 (UTC)


 * https://github.com/ms609/citation-bot/pull/2813 fixed AManWithNoPlan (talk) 19:18, 12 April 2020 (UTC)

CITEBOTREQ
What about creating a redirect WP:CITEBOTREQ (following the WP:*REQ standard) added to the top of WP:BOTREQ page along with the other WP:*REQ pages for automated requests? Understood this page is not primarily a "request" page, but given its active development and frequent feature requests it seems reasonable. -- Green  C  15:17, 9 April 2020 (UTC)
 * Seems like a reasonable idea to me. AManWithNoPlan (talk) 12:20, 13 April 2020 (UTC)


 * Done. -- Green  C  15:49, 13 April 2020 (UTC)


 * Flag to archive:fixed. thank you. AManWithNoPlan (talk) 21:25, 13 April 2020 (UTC)

Google books top domain
In this edit the bot changes  to , but in this edit it does not do the same with. Jonatan Svensson Glad (talk) 21:49, 12 April 2020 (UTC)


 * https://github.com/ms609/citation-bot/pull/2814 AManWithNoPlan (talk) 12:19, 13 April 2020 (UTC)


 * fixed AManWithNoPlan (talk) 13:15, 13 April 2020 (UTC)

Adding Google books url as a floater

 * Same at the bottom at https://en.wikipedia.org/w/index.php?title=Bolt_action&diff=prev&oldid=950728112 Jonatan Svensson Glad (talk) 14:50, 13 April 2020 (UTC)


 * https://github.com/ms609/citation-bot/pull/2815 AManWithNoPlan (talk) 15:24, 13 April 2020 (UTC)

Bot down
Not sure what. After someone restarts it, need to run https://tools.wmflabs.org/citations/gitpull.php to make sure up to date. AManWithNoPlan (talk) 15:50, 13 April 2020 (UTC)

fixed AManWithNoPlan (talk) 16:44, 13 April 2020 (UTC)

Changing wikilink

 * This looks like GIGO to me. --Izno (talk) 22:44, 13 April 2020 (UTC)
 * Well, changing from a working wikilink to a red wikilink is not a desired result regardless of garbage. Also, the bot used to replace this with e.g. The Worst Journey in the World chapter 9 instead. Jonatan Svensson Glad (talk) 22:56, 13 April 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2817 AManWithNoPlan (talk) 00:13, 14 April 2020 (UTC)

Citation bot removed a valid URL that was a preview (viewport) from WorldCat when an OCLC # was displayed
My suggestion is to also search for the viewport string & to exempt it when found from removal. Here is the difference between the record & the preview: Peaceray (talk) 15:21, 15 April 2020 (UTC)
 * WorldCat record: https://www.worldcat.org/title/utopian-communities-of-illinois-heaven-on-the-prairie/oclc/1004538134
 * WorldCat preview: https://www.worldcat.org/title/utopian-communities-of-illinois-heaven-on-the-prairie/oclc/1004538134/viewport
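
The suggested exemption amounts to telling the two URL forms above apart. A minimal Python sketch follows; it illustrates only the detection, not whether the bot should honour it, and `is_worldcat_viewport` is a hypothetical name.

```python
from urllib.parse import urlsplit


def is_worldcat_viewport(url: str) -> bool:
    """True if the URL is a WorldCat record whose last path segment is 'viewport'."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    on_worldcat = host == "worldcat.org" or host.endswith(".worldcat.org")
    segments = [segment for segment in parts.path.split("/") if segment]
    return on_worldcat and bool(segments) and segments[-1] == "viewport"
```

Checking the final path segment, instead of searching the whole URL for the string, avoids matching a title that happens to contain the word "viewport".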


 * these are google books previews wrapped in a worldcat URL. One should link to those instead since the worldcat URLs are notorious for being unstable.  AManWithNoPlan (talk) 15:44, 15 April 2020 (UTC)
 * this has already been discussed https://en.wikipedia.org/wiki/User_talk:Citation_bot/Archive_19#removing_links_to_worldcat AManWithNoPlan (talk) 15:48, 15 April 2020 (UTC)
 * Okay, I get it now. You have decided that my curation of previews does not matter. For instance, if the page range that I am referencing appears in the preview ... too bad. If I prefer Worldcat's preview wrapper to Google's presentation ... too bad.
 * It is unfortunate, because Citation bot does a lot of good, such as converting JSTOR links, substituting endashes for hyphens, & inserting PMIDs, for instance. I wonder at the need to switch the publication-place parameter for location when Citer uses publication-place, but that is not a bother to me. That you assume that your judgement about online links that are previews is better than this former reference librarian who has been an editor at least as long as you have been, is a problem however. Now that I know, I can disable citation bot for the article or the citation, or use one of a couple of possible workarounds. It's a shame that the useful edits may not be always applied, though.
 * Previously I have had bots screened out of my watchlist. I have turned it back on so that I can now be aware of bots behaving badly.
 * Peaceray (talk) 04:05, 16 April 2020 (UTC)
 * "if the page range that I am referencing appears in the preview" the viewport page displayed can change over time. If you really want a stable link to a page, you should link to the google books page (or archive.org copy's specific page), at least there you can specify a page in the URL, which you cannot do with viewport.  With the worldcat wrapper you might think you have the right page only to have the default page change later.   Direct links also fill the page without half the screen being a false advertisement for worldcat.  AManWithNoPlan (talk) 11:27, 16 April 2020 (UTC)
 * Again, you are substituting your judgement for individual editor's judgement. We both know that the argument that you make about a page changing can apply to many pages on the web. I have over the course of time found many media sites that have removed a story or reorganized their site, & that those pages were not archived.
 * I find your argument to be unconvincing that a Worldcat preview changes over time. It has been my experience that the OCLC # refers to one edition, & that the preview has pointed to that one edition. Maybe the preview has gone away, but I have not encountered it pointing to a different edition.
 * However, since you are convinced of your point, I invite you to post some examples where the Worldcat preview has posted to a different edition than the one linked to in the record.
 * Peaceray (talk) 14:39, 16 April 2020 (UTC)
 * The Worldcat preview has never loaded anything on my browser (it's a less than secure frame), so it's a bit hard to tell what it might have loaded on your browser. We should link URLs which can work for most people. Nemo 14:50, 16 April 2020 (UTC)
 * "since you are convinced of your point, I invite you to post some examples where the Worldcat preview has posted to a different edition" I NEVER claimed that. Please do not use straw man arguments, even if on accident.  AManWithNoPlan (talk) 14:59, 16 April 2020 (UTC)
 * Maybe the preview has gone away... and therein is the issue. Links to Google books, either directly, or indirectly through agents like WorldCat, do rot.  There is an essay, WP:GBWP that discusses this issue among others.
 * —Trappist the monk (talk) 15:22, 16 April 2020 (UTC)
 * My apologies. I misread "the viewport page displayed can change over time" (maybe due to not finishing my morning coffee). That has not been my experience, & I am wondering if you may have examples. But again this is a possible problem with any web page that is unarchived: content can change over time.
 * Wow, that is a problem & more of an argument. I am curious what browser / OS that you use. I have successfully used the Worldcat preview on Firefox, Chrome, IE, & Edge on Windows OS, Safari years ago on a Mac, & typically Chrome on an iPhone+ using the Wikipedia desktop view. Is it just the Worldcat preview, or does the Google preview present a problem as well?
 * Thanks, I am aware of WP:GBWP & have previously read it. Link rot of unarchived URLs is a problem, but it certainly is not limited to Google Books.
 * Peaceray (talk) 15:38, 16 April 2020 (UTC)
 * OCLC prominently displays a "preview this item" button. It is not the job of Wikipedia to duplicate the OCLC functionality, especially since which pages are accessible via OCLC/Google Preview varies by location, country, and time. Headbomb {t · c · p · b} 16:38, 16 April 2020 (UTC)

I do consider it my responsibility as an editor & a former reference librarian to make it as easy as possible for users to get to sources. I do not expect most users to know that they may be able to get to a preview by clicking on an OCLC #. If I have educated them through a WorldCat preview that OCLC's WorldCat is a resource for them to use, then so be it. I would much rather send someone to WorldCat's unencumbered interface than to Google Books for philosophical reasons. OCLC, as a non-profit cooperative whose mission is to get bibliographic & holding information to users, aligns far better with Wikipedia's missions than Google. In the matter of how to wrap that or not, I recognize that we differ. I am just asking that my decisions as an editor be respected.

I am sensitive to the argument that Google Books preview content may vary by time & space. Do you have any resources that you would recommend to me for further reading about that?

Peaceray (talk) 02:26, 18 April 2020 (UTC)
 * And that's why Wikipedia templates support OCLC. Those are no more special than any other identifiers, nor is the OCLC partnership with google so special that OCLC links should take the place of actual free-to-read stable links. &#32; Headbomb {t · c · p · b} 04:23, 18 April 2020 (UTC)
 * Peaceray, I love your work here as a reference librarian! See some information on Google Books and Wikipedia. Personally I don't like those /viewport links precisely because they "force" me to load JavaScript and other resources from Google. In my browser I block such third party JavaScript for privacy reasons, so I don't see anything at all on that page apart from a small head, and I have to go back to the main record to read any significant bibliographic information. (That page, then, tries to load Google Maps, which is even worse.)
 * If you're interested in avoiding Google for philosophical reasons, the correct approach is to send people to the Internet Archive or other digital libraries which actually host the content and provide it without infesting it with Google JavaScript and other privacy-unfriendly surveillance devices. (For a 2017 book it's going to be hard to find anything outside e-commerce websites like Google Books or Amazon: in such cases I personally may prefer to just link the publisher, if their page has some usefulness.) Nemo 07:33, 18 April 2020 (UTC)

Old wrong handle

 * This came to my attention today when OABot marked this as being an open-access handle. Which may be true, but it's the wrong handle. Searching found 16 other articles with the same mistake. I'm getting really tired of running across months-old damage to citations by the bot, repeated across many articles, and having to fix them. We shouldn't have to manually check every edit it makes; if we do, the bot isn't really saving us any work. —David Eppstein (talk) 18:24, 15 April 2020 (UTC)
 * I suppose you mean 10338.dmlcz/143192 . You're right, there was a major upstream bug in that period which has since been resolved, as far as I know. I think I manually removed many of those wrong matches at the time but evidently I missed some. Nemo 18:31, 15 April 2020 (UTC)
 * For the record, citation bot no longer adds the handle in this specific case. Nemo 18:18, 16 April 2020 (UTC)

Metadata doesn't match date text was published
Moving here, as suggested by, from their talk page. They and I both used the bot on 500 Queer Scientists, and it tagged a lot of links to the site the article is about with the publish date from their metadata, which was 5 February 2020. The problem is that the text appears to have been written before that date: for instance, [ Anson W. Mackay, as originally created] by, contains a link (access date 4 July 2019) to what appears to be the same page as the link I added.

The link on the old page is dead (I've been meaning to go through Wikipedia and update them); what seems to have happened is that the website has been substantially restructured/redesigned, so that the date of the page is not the same as the date the text was first published (which in most cases is not given/is unknown). What's the right thing to do in this circumstance? YorkshireLad ✿  (talk) 09:18, 12 April 2020 (UTC)


 * the bot won’t add dates newer than access dates, nor will it add dates newer than the archive date. Sounds like adding archive-url from the webarchive (if they exist) would be a good start. AManWithNoPlan (talk) 11:04, 12 April 2020 (UTC)
 * Sadly the Wayback Machine doesn't seem to have archived it. YorkshireLad ✿  (talk) 11:06, 12 April 2020 (UTC)
 * if the date is unknowable then one can always add a comment to the date field to let editors and bots know that fact, so they don’t try to “fix” it. AManWithNoPlan (talk) 11:50, 12 April 2020 (UTC)

notabug we can fix. Hopefully, running bots sooner or adding access-date or date sooner will help. AManWithNoPlan (talk) 14:46, 18 April 2020 (UTC)
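
The date rule described in this thread can be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code; the function name `may_add_date` and its parameters are hypothetical.

```python
from datetime import date
from typing import Optional


def may_add_date(candidate: date,
                 access_date: Optional[date] = None,
                 archive_date: Optional[date] = None) -> bool:
    """Return True if a metadata date is plausible for this citation.

    Assumed rule, taken from the discussion above: a publication date
    newer than the access date or the archive date is rejected.
    """
    for limit in (access_date, archive_date):
        if limit is not None and candidate > limit:
            return False
    return True
```

Under this sketch, a metadata date of 5 February 2020 would be rejected for a citation whose access date is 4 July 2019, but accepted for a citation with neither an access date nor an archive date, which matches the behaviour reported in this section.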

Stop imposing hdl parameter
notabug Please stop imposing the use of the hdl parameter: when it is wanted editors would likely have used it. Most editors won't know what it is, which makes it particularly user-unfriendly for future improvement of articles. --Francis Schonken (talk) 08:58, 4 April 2020 (UTC)
 * How can you claim at the same time that most editors don't know it, and that everyone who wants it already uses it? Please pick one. Nemo 04:41, 5 April 2020 (UTC)
 * Someone put it in the citation template's code, so they know what it is. Someone put it in Citation bot's code, so they know what it is. That's at least one or two persons who know what it is, and who can use it when they're writing article content, and are updating the references that go with that content. I'm sure there are more people knowing what it is. Those who think it is a feature that should be used more often can make some publicity for it (e.g., bring it up at WP:VPT, file a WP:BRFA – which, even if not approved, at least will make the feature more widely known). What they can't do, is run an unapproved bot task, forcing this additional layer of complexity (taking into account that citation templates are already complex) where it is not wanted. --Francis Schonken (talk) 06:20, 5 April 2020 (UTC)
 * You write as if you are not, yourself, one of the people who know what it is. Perhaps you should learn before you make demands on its use or non-use. (Hint: it's a way of identifying web resources with a more-permanent address than a url; see Handle System. Why do you think it's a bad thing to replace often-out-of-date urls with a more-persistent way of making the same link?) —David Eppstein (talk) 06:26, 5 April 2020 (UTC)
 * Sure, since yesterday I know what it is. I'm probably not going to use it (at least not very often), although it is my plan to get it at least listed in the "full parameter set"s of cite book's documentation (see Help talk:Citation Style 1). --Francis Schonken (talk) 06:39, 5 April 2020 (UTC)

Thank you for working on the documenting of this. AManWithNoPlan (talk) 20:27, 6 April 2020 (UTC)

fixed

Question: what was fixed? --Francis Schonken (talk) 21:38, 6 April 2020 (UTC)


 * Once again, I think the underlying concern (titles not getting linkified) will be solved (only) by running a bot which adds hdl-access=free and doi-access=free based on Unpaywall data. Meanwhile you can ask at Help talk:Citation Style 1 that such parameters turn the title into a link. Nemo 07:17, 8 April 2020 (UTC)
 * Interesting, but not a reply to my question. In this edit a "fixed" template was added to this section, hence my question: what was fixed? --Francis Schonken (talk) 08:36, 10 April 2020 (UTC)
 * I know this is "not a bug", it was never reported as such. One editor saying something was fixed, another implying nothing needed to be fixed ... so what is it? --Francis Schonken (talk) 13:40, 23 April 2020 (UTC)
 * There is nothing to fix, because the behaviour is correct. &#32; Headbomb {t · c · p · b} 13:49, 23 April 2020 (UTC)
 * Sure, but another editor indicated that something *was* fixed: so my question: what was fixed in the eyes of the editor who posted the "fixed" tag? Tx for a reply this time. --Francis Schonken (talk) 14:19, 23 April 2020 (UTC)
 * fixed here meaning "yo archive bot, there's nothing to do here, archive this". &#32; Headbomb {t · c · p · b} 14:46, 23 April 2020 (UTC)

I will more properly tag it as wontfix AManWithNoPlan (talk) 00:25, 24 April 2020 (UTC)

urldate → pridate
pridate is valid in some templates. This will add urldate to the list of 'dead' parameters to not be changed. AManWithNoPlan (talk) 13:31, 23 April 2020 (UTC)

Caps

 * not a bug That's because this is a bunk abbreviation. GIGO, this is the fix. &#32; Headbomb {t · c · p · b} 14:12, 24 April 2020 (UTC)

remove pp.&amp;nbsp; in pages parameter
Can you remove also p.&amp;nbsp; in page and pages? Grimes2 (talk) 22:34, 24 April 2020 (UTC)

Convert hard-coded special spaces into regular spaces
Example for a hidden non-breaking space. &#32; Headbomb {t · c · p · b} 14:50, 24 April 2020 (UTC)
 * We already do many of those. For some reasons already discussed we do not do these non-breaking ones.  I do not remember why though or if I agreed. AManWithNoPlan (talk) 18:33, 24 April 2020 (UTC)
 * As part of the recent module release, they are now identified in Category:CS1 errors: invisible characters as errors which I suspect is what Headbomb plans to run the bot on shortly. There shouldn't be an issue removing the Unicode version of the character (given that it is invisible and likely unintended). Intentional non-breaking spaces are almost always the &amp;nbsp; form, which should not be removed. --Izno (talk) 19:39, 24 April 2020 (UTC)
 * Wasn't planning on running the bot against that category, but it would be a good idea to do so in a way that didn't hog down too many resources from Citation bot. &#32; Headbomb {t · c · p · b} 20:17, 24 April 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2822 AManWithNoPlan (talk) 21:09, 24 April 2020 (UTC)
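
A minimal sketch of the distinction drawn in this thread, assuming the goal is to replace the invisible Unicode characters while leaving the explicit entity form untouched. It is written in Python for illustration and is not taken from the pull request; the names `HIDDEN_SPACES` and `normalize_hidden_spaces` are hypothetical, and including U+202F alongside U+00A0 is an assumption.

```python
# Map invisible Unicode spaces to a regular space (assumed character set).
HIDDEN_SPACES = str.maketrans({
    "\u00a0": " ",  # NO-BREAK SPACE
    "\u202f": " ",  # NARROW NO-BREAK SPACE
})


def normalize_hidden_spaces(value: str) -> str:
    """Replace hard-coded Unicode non-breaking spaces with regular spaces.

    The literal text "&nbsp;" is ordinary ASCII, so it passes through
    unchanged and an intentional non-breaking space is preserved.
    """
    return value.translate(HIDDEN_SPACES)
```

Because the entity form consists only of ASCII characters, no special case is needed to protect it; only the invisible code points are rewritten.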

Invalid date
That's actually the date on the website! I will add some code to catch that sloppiness. AManWithNoPlan (talk) 21:00, 24 April 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2821 AManWithNoPlan (talk) 21:02, 24 April 2020 (UTC)

remove ref=harv
wontfix harv is now the default in citations, and having the explicit version no longer does anything. &#32; Headbomb {t · c · p · b} 17:26, 18 April 2020 (UTC)
 * We might want to let the CS1 module updates settle for a few weeks before taking systematic action to remove ref=harv from articles. Sometimes module updates cause problems that result in rollbacks of parts of the code. It does no harm in the meantime, as far as I know. – Jonesey95 (talk) 18:23, 18 April 2020 (UTC)
 * Waiting is a good idea. I think that another bot would be the way to go the first big run. AManWithNoPlan (talk) 01:03, 19 April 2020 (UTC)

Well, a related BOTREQ is at WP:BOTREQ – came here because I thought this would be something up Citation bot's alley? Anyhow, there's no formal BOTREQ yet, but the issue is discussed at Help_talk:Citation_Style_1/Archive 69. Is it possible to let your light shine there, possibly advising what would be desirable and what not, what the bot can handle, and what not, etc? Tx. --Francis Schonken (talk) 07:52, 19 April 2020 (UTC)
 * how many pages need this? My main concern is that this bot would take way too long. I could run a slimmed down version for this run, which I have done in the past.  AManWithNoPlan (talk) 11:19, 19 April 2020 (UTC)
 * . Removal of this parameter or its value is cosmetic if that is the only change made.  But, when combined with other fixes that this bot might do, then the bot can pick away at the articles in that category.  There is no hurry, no pressing  to clear the category or remove/replace these parameters.
 * —Trappist the monk (talk) 11:37, 19 April 2020 (UTC)
 * (EC) This isn't about running the bot on Category:CS1 maint: ref=harv directly (this would be a violation of WP:COSMETICBOT anyway), but rather taking care of opportunity targets and tidying up old useless code like removing none when those do nothing. Like the ISBN/isbn capitalization, this shouldn't be the only edit being done on a page. &#32; Headbomb {t · c · p · b} 11:37, 19 April 2020 (UTC)
 * I only just saw this discussion. I understand ref=harv is now a default to all CS1/CS2 templates, and the affected templates are being modified to assume the change is permanent. Was there an open discussion and vote about this? - I probably missed it because I'm not usually tuned in to these things. In any case, I'd like to point out that it will cause a significant increase in false positive error messages from HarvErrors.js. In the one case I'm always banging on about, EB1911 is intended to be referenced in a footnote or other style of Harvard reference, so it included ref=harv. But Cite EB1911 is more usually just in a "Further Reading" section, generally without the need for a footnote. On the occasions when it is so referenced, we've been adding ref=harv explicitly. There are several other paired templates with the same pattern. So HarvErrors.js has switched from usually correct error messages (you've generated an unmatched reference) to a lot of false positives. This time, it doesn't matter to the general reader and is probably safe on balance, but, again, was this discussed somewhere first? David Brooks (talk) 15:35, 19 April 2020 (UTC)
 * Changes to cs1|2 are always discussed at WT:CS1. For this particular change, a series of discussions ending at.
 * —Trappist the monk (talk) 15:44, 19 April 2020 (UTC)
 * Yet another spectacular example of a change affecting tens of thousands of pages being waved through with little (or, apparently this time, no) discussion whatsoever. ——  SN  54129  15:52, 19 April 2020 (UTC)
 * There was plenty of discussions, see Help_talk:Citation_Style_1 &#32; Headbomb {t · c · p · b} 11:25, 20 April 2020 (UTC)
 * While I agree in principle with 54129's comment, after consideration I think it's probably a good change, apart from the little matter of adding untold numbers of false positives to the warning message from HarvErrors.js (which I admit I have suppressed). But I did want to verify one thing: will the set of changes to CS1/2, and to templates that call them (e.g. here) still honor a specific value of ref (other than "harv" or "none")? My ability to read template code is rudimentary. Also, this discussion is happening in what is probably the wrong place, but I'll keep it here for now. David Brooks (talk) 21:59, 19 April 2020 (UTC)
 * User scripts/Most imported scripts would seem to suggest that the cohort of editors using the HarvErrors.js script is rather small.  will honor whatever you give it.
 * The change that you note is incomplete. That template uses Module:Template wrapper which, unless instructed to do otherwise, passes everything it gets to .   also uses Module:Template wrapper so every parameter it gets from  and every parameter that it has inside gets passed to .  Here is a  linking to  (prescript turned off for clarity):
 * → – links to the second EB1911 showing that the template honors what you give it
 * —Trappist the monk (talk) 22:29, 19 April 2020 (UTC)
 * As far as I can tell, custom ref values will still work, as long as template editors make correct modifications to templates. If something goes wrong, we are likely to see a spike in the article count at (32,775 articles at this writing). So far, I have seen no such spike. Please post at that category page or on my talk page if you see any EB1911 template usages with broken links that you are unable to fix. – Jonesey95 (talk) 22:32, 19 April 2020 (UTC)
 * I was lazy yesterday, so I actually tested it. Yes, an explicit "ref=CITEREFNobody2020" is handled correctly, after HeadBomb removed the default "harv" setting and then Trappist removed the  entirely. Clever wrappers. Thanks for the verification. David Brooks (talk) 13:55, 20 April 2020 (UTC) ETA: Thanks for the patient explanation, Trappist. David Brooks (talk) 14:00, 20 April 2020 (UTC)


 * Headbomb, you say "this shouldn't be the only edit being done on a page", but if this change gets added to citation bot, and there are 110k articles currently affected, the chances will be quite high that the bot will end up doing such changes while not doing anything else. Or, if it's configured to abort a series of changes as merely cosmetic (not so trivial), it will waste a lot of resources preparing edits only to discard them. So I agree with AManWithNoPlan: get a pywikibot bot to perform this change on existing articles and only then come here to propose the addition. Nemo 16:06, 19 April 2020 (UTC)
 * *OR* get permission for the COSMETICBOT edits via BRFA. It has happened before that COSMETICBOT edits were explicitly approved, when they were due to a change in setup (e.g., removal of commented-out metadata in mainspace when Wikidata became operational). --Francis Schonken (talk) 16:25, 19 April 2020 (UTC)
 * the chances will be quite high that the bot will end up doing such changes while not doing anything else. If the bot is coded to not do that on its own, then the chances of the bot doing that on its own would be 0%. The processing power lost on this would be trivial. &#32; Headbomb {t · c · p · b} 16:59, 19 April 2020 (UTC)
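
The "never as the only edit" rule debated in this thread can be sketched as below. This is an illustrative Python outline under stated assumptions, not how Citation bot is implemented: `apply_edits`, `remove_ref_harv` and the regular expression are all hypothetical.

```python
import re
from typing import Callable, List

Fix = Callable[[str], str]


def remove_ref_harv(text: str) -> str:
    """Hypothetical cosmetic fix: drop a |ref=harv parameter."""
    return re.sub(r"\s*\|\s*ref\s*=\s*harv\s*(?=[|}])", "", text)


def apply_edits(text: str, substantive: List[Fix], cosmetic: List[Fix]) -> str:
    """Run cosmetic fixes only when a substantive fix changed the text."""
    new_text = text
    for fix in substantive:
        new_text = fix(new_text)
    if new_text == text:
        # Nothing substantive changed, so skip the cosmetic-only edit.
        return text
    for fix in cosmetic:
        new_text = fix(new_text)
    return new_text
```

In this arrangement the cosmetic fixes are only computed after a substantive change has been found, so a page with nothing else to fix is returned untouched. Whether the work of looking for substantive changes counts as wasted resources is the point on which the editors above disagree.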

I oppose removing "ref=harv". Having that parameter in place gives editors who are not technically inclined a marker that there is a parameter available that controls the production of an anchor. As you know, editors will sometimes move a citation from Works cited to Further reading and vice-versa, which involves switching between "ref=harv" and "ref=none". It doesn't matter to that group of editors which of those is the default, and it is helpful to them to be able to use either value for the parameter without somebody removing one or the other. Software works best when it's designed to accommodate the working practices of the users, not when it forces the working practices of the users to accommodate its idiosyncrasies. --RexxS (talk) 15:12, 21 April 2020 (UTC)
 * It's there and it does nothing. Why keep it? Users that aren't technically inclined won't be using this to begin with.&#32; Headbomb {t · c · p · b} 15:30, 21 April 2020 (UTC)
 * you're quite wrong. I've helped lots of technically-unskilled editors install scripts that they have learned to use without having to understand the finer points of what the default value of an optional parameter is. For the majority of users of the Harverrors script, they have become accustomed to using it in a particular way, i.e. adding ref=harv to full CS1 citations that they use in conjunction with sfn, and then checking that there are no errors. Now, you're asking them to switch their workflow to not adding anything to full CS1 citations that they use in conjunction with sfn, but having to add ref=none to full CS1 citations in Further reading sections, and then checking that there are no errors. Why can't you acknowledge that the change has resulted in a disjunction in how they are accustomed to work? It is utterly unhelpful to suggest they switch off the harv error checking, because they have made an effort to get to grips with the functionality of the script, and want to retain the benefits of seeing errors, all of them.
 * "... it does nothing" Of course it does something. It allows folks like me to explain to other editors that there is a parameter which needs to be set to "harv" when they want to connect the citation to a sfn, and set to "none" when they use the full citation elsewhere. That allows them to move a source from reference to reading and vice-versa with minimal effort. It's cognitively easier to learn how to switch a parameter from one value to another than it is to learn when to omit it and what value it has when it must be included. If we had encouraged editors to add "ref=harv" and "ref=none" in the past, hardly anybody would have even noticed when the default changed. Leave the "ref=harv" parameter alone: it's not hurting anything, and it would allow any future change in default to happen transparently. Robust systems don't rely on defaults. --RexxS (talk) 01:01, 22 April 2020 (UTC)
 * There is zero newcomer out there that will see harv and go "Oh, there's a variety of shortened footnotes templates you can use to make a shortened foot note to this citation!".   And robust systems rely on defaults all the time. &#32; Headbomb {t · c · p · b} 14:31, 22 April 2020 (UTC)
 * I'm not talking about newcomers. I'm talking about content editors who have learned to use sfn, but are not technically inclined and won't want to re-learn a different way of working just to do the same job. I'm sure won't mind me using her as an example. I can remember discussing with her several years ago the pros and cons of citation templates against hand-crafted citations before we had the speed of Lua implementations. She's learned over the years how to use the cite templates and sfn to good effect, and to use Ucucha's script to spot errors, without ever wanting to go into technical detail about their inner workings. Those are the editors who will find it far more logical to switch between "ref=harv" and "ref=none" when moving a source from a reference to further reading or back. The systems that rely on defaults all the time are never going to be robust. --RexxS (talk) 18:53, 22 April 2020 (UTC)

Oppose removing "ref=harv". We need that extra functionality when moving citations around from source sections to FR to Selected works. Without it, the scripts are malfunctioning, articles look ugly, and there's no clear way to fix them. SarahSV (talk) 22:18, 21 April 2020 (UTC)
 * Clearly you don't understand what harv does. There is zero difference with a citation with harv and one without. If your script is broken, update your script, or get a new one made at WP:SCRIPTREQ. &#32; Headbomb {t · c · p · b} 22:42, 21 April 2020 (UTC)
 * I oppose these changes, the original change and this bot proposal. SarahSV (talk) 00:09, 22 April 2020 (UTC)

No particular opinion on whether the bot should remove harv or not, but as I stated on the CS1 talk page: As things currently stand this isn't a WP:COSMETICBOT task - harv currently throws a maintenance category. Maybe it shouldn't, and maybe a bot removal isn't a good idea, but removing the parameters is not a cosmetic edit according to the definition in the policy. Jo-Jo Eumerus (talk) 08:30, 22 April 2020 (UTC)
 * It's not technically cosmetic, in the sense that there's a maintenance category that goes with it, but the category simply tracks if harv is there or not. It's something purely self-referential. It falls in the broader definition of "edits of such little value that the community deems them to not be worth making in bulk". &#32; Headbomb {t · c · p · b} 14:26, 22 April 2020 (UTC)
 * To be fair, and clear to those coming newly to the discussion, the maintenance category is only 6 days old, and was created to manage the consequences of this (surprisingly impactful) change. Seven days ago it would have been cosmetic. David Brooks (talk) 15:45, 22 April 2020 (UTC)
 * Seven days ago, it would have broken a crapload of footnotes. &#32; Headbomb {t · c · p · b} 15:53, 22 April 2020 (UTC)
 * You're right; I forgot that this is where we came in (i.e. even without the scripts, we had bluelinks that went nowhere). I withdraw the comment. David Brooks (talk) 16:05, 22 April 2020 (UTC)


 * To those voting, this isn’t a vote. You already lost the discussion and vote.  AManWithNoPlan (talk) 14:21, 22 April 2020 (UTC)
 * There's no need for a WP:BATTLEGROUND approach here. &#32; Headbomb {t · c · p · b} 14:27, 22 April 2020 (UTC)
 * I'm voting to make clear why I think removal of "ref=harv" would result in a net negative effect, nothing more and nothing less. As far as I can see, the discussion on that is open, and different opinions of how desirable that would be have been presented in a reasonable manner.
 * If you'd like me to reopen the discussion on changing the default value of the ref parameter with an RfC at a central location, simply to be able to discuss issues around the implementation, just say and we can re-litigate the entire issue with a much broader participation. Otherwise, why not make constructive contributions to how we can meet different editors' concerns on the consequences of the change? --RexxS (talk) 18:37, 22 April 2020 (UTC)
 * Another option, instead of removing harv from citation templates, would be to remove the code from the CS1 module that generates a maintenance category and maintenance message when harv is used. We don't generate a maintenance message when type is present but empty, and that is code that does the same thing as harv, i.e. nothing. This might be a good discussion to have at Help Talk:CS1. As I proposed at the top of this thread, I propose closing this discussion as being premature, since there is not consensus to remove these ref parameters yet. – Jonesey95 (talk) 21:56, 22 April 2020 (UTC)
 * The maintenance category isn't the issue here, the clutter is. &#32; Headbomb {t · c · p · b} 22:30, 22 April 2020 (UTC)
 * As above, empty parameters are just as much clutter as harv, IMO, but I don't think there is consensus to remove empty parameters. Consensus to remove clutter from CS1 citations should be sought at Help Talk:CS1, not here. – Jonesey95 (talk) 22:45, 22 April 2020 (UTC)
 * Citation bot already removes plenty of clutter, this would be no different. &#32; Headbomb {t · c · p · b} 05:53, 23 April 2020 (UTC)
 * there is no plan to implement this until after some other bot Terraforms all of Wikipedia with this change and makes it the clear default. Unless someone does that, as I already said we won’t do it.  Secondly and more importantly, since this is a bot page, such policy votes here don’t matter: you should really take this discussion elsewhere as others have suggested before someone else decides to fire up their bot and just does it.  AManWithNoPlan (talk) 01:32, 23 April 2020 (UTC)

, Headbomb has now started implementing this. Does the above count as consensus? SarahSV (talk) 20:41, 24 April 2020 (UTC)
 * Not really sure what you mean, but yes, I edited the documentation to reflect the current state of the templates and did some AWB runs in template space. Really not sure what that has to do with Citation bot.&#32; Headbomb {t · c · p · b} 20:43, 24 April 2020 (UTC)
 * I can see you've removed "ref=harv" from something like 50 templates using AWB today. Am I right that you're working on the assumption that citations hard-coded into templates will never be moved from a Works cited into a Further reading section, so would not inconvenience editors like ? I trust you know that bot-like edits (around 50 pages in less than two minutes) to create a fait accompli are strongly frowned on when there is dispute over the appropriateness of the edits. I should caution you against doing similar removals in articles where there is a chance of other editors objecting.
 * This section is discussing whether or not Citation Bot should be programmed to remove "ref=harv" from articles. The operators are aware of those of us objecting and the reasons. I usually find bot operators are particularly sensitive to making edits that are known to be controversial, so I expect rather more discussion before the bot is authorised to remove the parameter. Today Headbomb used AWB to manually remove the parameters from citations hard-coded into some templates. Neither of us will see any effect from those, because those citations are inside other templates and we will never need to move them into a Further reading section. I expect that he understands our concerns about removing the parameter from articles – particularly from articles that we might curate –  and I sincerely hope that before he considers broadening his manual removals to articles, he will weigh up the benefits of removing what he considers "clutter" against the objections made here (since the same objections apply to manual removals as to removals by bot). I hope this helps. --RexxS (talk) 21:23, 24 April 2020 (UTC)
 * Removing ref=harv has absolutely no effect on Sarah's script, regardless of what section the citation appears in. So I don't understand why you think it would be problematic or could become problematic to remove it. —David Eppstein (talk) 22:08, 24 April 2020 (UTC)
 * At the risk of repeating myself as you seemingly didn't see my previous posts, removing "ref=harv" will have an effect on Sarah's workflow, not her script. For example, she told us that she sometimes uses a source from the Further reading section in the article and moves it to the Works cited section. And sometimes she does the reverse, taking a source out of an article but leaving it as further reading.
 * It is far easier for her (or anybody else) to locate a "ref=harv" in citations that are being moved from works cited to further reading and change it to "ref=none" (which is needed to avoid spurious error messages from the script) than to remember to add a parameter that was deliberately and unnecessarily removed. The same goes for moves in the other direction. Why is it wrong for editors to prefer always having a ref parameter whose value is simply changed, rather than having it in some cases and omitting it in others (depending on which is the changeable default)? --RexxS (talk) 22:41, 24 April 2020 (UTC)
 * If removing harv from these templates, which does absolutely nothing but remove a maintenance category from the template page and the articles these are transcluded onto, disrupts her "workflow", she is really going out of her way to have it disrupted, because no script is affected, no anchors are affected, and there are zero rendering changes anywhere on any page associated with that removal. &#32; Headbomb {t · c · p · b} 22:54, 24 April 2020 (UTC)
 * As far as a content editor is concerned, it does a lot more than removing a maintenance category from the template page, but it seems you're not able to put yourself into the position of someone not steeped in the technical side of the project. If all you want is to remove the maintenance category, I could do that for you at a stroke by disabling it in the module. --RexxS (talk) 23:03, 24 April 2020 (UTC)
 * Go ahead, explain to me how anyone's workflow is affected by this. &#32; Headbomb {t · c · p · b} 23:09, 24 April 2020 (UTC)
 * I think I addressed that a couple of hours ago, and concluded that editing templates wouldn't affect her. Did you not read You're really not going to understand anybody else's view if you ignore what they write. --RexxS (talk) 23:52, 24 April 2020 (UTC)
 * The views of someone who is working from misconceptions and complains for the sake of complaining are very unlikely to sway me, correct. And if you consider removing 50 harv from template pages in a sea of 110K+ articles a fait accompli, you and I have a very different definition of what that is. Headbomb {t · c · p · b} 23:57, 24 April 2020 (UTC)
 * Sigh. I made no complaint about you removing ref=harv from template pages, and I was expecting you to be able to read that. I was also expecting you to understand that I didn't think citations hard-coded in template pages could ever be moved into a Further reading section, but it seems I was to be disappointed.
 * As an aside, we clearly do disagree about whether editing 50 pages in less than two minutes creates a fait accompli, and although I had no issue with you doing that in template space, I would have an issue with you doing that in mainspace. I really hope that we don't have to test which of us is right about what constitutes WP:FAIT, because I'm pretty sure I understand what's written on that page. --RexxS (talk) 00:35, 25 April 2020 (UTC)
 * , thank you for commenting. The template change has disrupted the workflow for some of us, and this feels like more of it, even if minor. When I need a citation (whether as a source, Selected works, FR, talk-page discussion, etc), I try to remember where I last added it, then I copy it over. For some sections, we'll have to add ref=none now, so it would be helpful not to remove ref=. SarahSV (talk) 22:46, 24 April 2020 (UTC)
 * And none is indeed not removed. When I start removing none, you'll have a valid complaint. &#32; Headbomb {t · c · p · b} 22:55, 24 April 2020 (UTC)

I've been away for a while, but... doesn't "removing ref=harv from 50 templates" imply that Trappist's addition of ref=harv as a default for CS1 is permanent? I know that change seemed to have no downside (other than more warnings in HarvErrors.js), but I don't recall a strong consensus. Since this is being discussed in many places I may have missed an RFC. Now that we've made more changes that rely on the CS1 change, it would be difficult to roll back. David Brooks (talk) 23:14, 25 April 2020 (UTC)

3RR
notabug

Your recent editing history at BWV Anh. shows that you are currently engaged in an edit war; that means that you are repeatedly changing content back to how you think it should be, when you have seen that other editors disagree. To resolve the content dispute, please do not revert or change the edits of others when you are reverted. Instead of reverting, please use the talk page to work toward making a version that represents consensus among editors. The best practice at this stage is to discuss, not edit-war. See the bold, revert, discuss cycle for how this is done. If discussions reach an impasse, you can then post a request for help at a relevant noticeboard or seek dispute resolution. In some cases, you may wish to request temporary page protection.

Being involved in an edit war can result in you being blocked from editing, especially if you violate the three-revert rule, which states that an editor must not perform more than three reverts on a single page within a 24-hour period. Undoing another editor's work (whether in whole or in part, whether involving the same or different material each time) counts as a revert. Also keep in mind that while violating the three-revert rule often leads to a block, you can still be blocked for edit warring, even if you don't violate the three-revert rule, should your behavior indicate that you intend to continue reverting repeatedly.--Francis Schonken (talk) 06:27, 25 April 2020 (UTC)
 * Are you deliberately trying to be a drama-monger or what? It appears that you have not looked at this bot's user page, User:Citation bot. If you had, you would have seen the section "Stopping the bot from editing" on steps you could have easily taken to preserve your dubious citation formatting preferences on the article in question. —David Eppstein (talk) 06:53, 25 April 2020 (UTC)
 * The bot is, afaics, running a number of tasks that seem to be outside the tasks for which it received WP:BRFA approval. I tried to bring this up in a soft way on this page on a number of previous occasions, which were largely ignored. I think I wasn't explicit enough in these previous instances. Anyhow, I'm still trying to get this sorted on this talk page. --Francis Schonken (talk) 06:59, 25 April 2020 (UTC)
 * , I can't see anything wrong with the bot's edits on the page for which you gave this warning, which is in my opinion way out of proportion. This does not seem like an edit war between 2 users at all right now; this seems like a disagreement between you and, well, the preferred way of citations. The bot is also user activated, so posting a warning here normally wouldn't do anything - the user in question or someone else will just activate it again, but I'm pretty sure Headbomb will read this in this case. If the bot is making actual mistakes please explain what they are, and it is clearly explained how to prevent the bot from operating. Without any explanation this just seems like WP:OWNBEHAVIOR on that page you mentioned combined with some personal preference. It's a bit hard to explain why the bot does something, why a certain way of citation is used or anything about approval if we don't know what the problem is; as far as I can see other bugs or problems are always discussed and fixed here. If you do feel ignored that is a problem, and I think if you have some valid points they will for sure be discussed now, but without this we aren't getting any further either. Also telling the bot to go to the talk page is pretty pointless as well; the bot can't read and it's quite likely that the user only checks for bugs and then leaves the page so does not see your edit summary at all. If you feel ignored because of that just ping the user in question? Redalert2fan (talk) 08:53, 25 April 2020 (UTC)
 * Re. "...the preferred way of citations..." – by whom? All I can see is a succession of local consensuses for tasks which could be approved at BRFA, or fail to do so. Some of these tasks are neither explicitly nor implicitly mandated by policies, guidance of all sorts or whatever, so would need at least a bit broader approval than a local consensus before such tasks can be commissioned to a bot. --Francis Schonken (talk) 09:06, 25 April 2020 (UTC)
 * The only "questionable" behavior I see is the changing of non-free jstor URLs to the jstor parameter. AManWithNoPlan (talk) 12:38, 25 April 2020 (UTC)
 * What a ridiculous warning. Also using the dedicated JSTOR parameter to put the JSTOR link is standard and hardly "questionable". If those are free links, you can mark them as such with free, and if those aren't, then they shouldn't be taking the place of the URL in the first place, gives the impression a free article is available there. &#32; Headbomb {t · c · p · b} 15:12, 25 April 2020 (UTC)
 * Does any open access URL exist at jstor.org? I'm not aware of a single one. Nemo 15:50, 25 April 2020 (UTC)
 * Old out-of-copyright sources at jstor (example: ) appear to be publicly accessible. At least, I can still open that one in an incognito window. —David Eppstein (talk) 16:21, 25 April 2020 (UTC)
 * Old sources will often be. Newer things tend not to be, however. I don't know if there's a specific cutoff date, or if it's a journal-by-journal thing. Headbomb {t · c · p · b} 16:29, 25 April 2020 (UTC)
 * After the Aaron Swartz dump of JSTOR in 2011, they went ahead and released about 6% of their holdings as open access, all public domain. But not all their public domain content is open access, unclear how they decide. -- Green  C  16:33, 25 April 2020 (UTC)

I was just going to say that - good job Green. Also, one-page articles are often free accidentally. AManWithNoPlan (talk) 16:39, 25 April 2020 (UTC)
 * Yes, I can download the PDF for too, but only after an interstitial which asks me to press a button to store a cookie;  doesn't work directly. So I think no JSTOR ID qualifies for jstor-access=free at the moment, unless I misunderstood what it means. The only option is to link the copies on the Internet Archive (example). Also, I believe JSTOR might behave differently for USA users as opposed to people in the rest of the world, although the differences are not as wild as with Hathi Trust. Nemo 18:29, 25 April 2020 (UTC)
 * P.s.: Aaron never published a JSTOR dump. User:Gmaxwell did (but that's only a small part of what got mirrored on the Internet Archive a few years later).
 * Direct vs indirect access is not a condition of open access. Most open access dois involve going to an interstitial; some of them then involve a second interstitial that tries to display the contents inline before actually letting you download the pdf. They are nevertheless open access. —David Eppstein (talk) 20:19, 25 April 2020 (UTC)
 * "Most" definitely gets a [citation needed]. I would use url-access=registration for any such walled gardens. Nemo 20:41, 25 April 2020 (UTC)
 * registration is when you must log in to access the material, not for interstitial 'ads' or cookies. (subscription is when you must pay to access the material.) --Izno (talk) 21:03, 25 April 2020 (UTC)
 * Please provide me an example of an academic-publication doi that does not go to an interstitial and instead leads directly to the pdf document. I don't recall ever seeing one. Certainly the major academic society and commercial publishers all have interstitials. —David Eppstein (talk) 21:04, 25 April 2020 (UTC)
 * I have certainly seen a few things that go straight to PDF. Of those few times, I do remember one journal that did.  It was unusual enough to shock me.  But that’s one time, and I was surprised enough to poke around, and it was a real journal, but not major at all. AManWithNoPlan (talk) 21:50, 25 April 2020 (UTC)

"Bad Bot" Bug

 * JSTOR URLs are put in the JSTOR parameter. Likewise for DOI URLs, which are redundant with the DOI parameter. This is normal and not a bug. Headbomb {t · c · p · b} 17:02, 26 April 2020 (UTC)
 * It's not just JSTOR links. It removed a Springer link too (not to the show). -  Neutralhomer  •  Talk  • 17:05 on April 26, 2020 (UTC) •  #StayAtHome
 * Also, with the Springer link, it removed the access date completely as well. -  Neutralhomer •  Talk  • 17:07 on April 26, 2020 (UTC) •  #StayAtHome
 * Yes, those are redundant with the DOI. And when there's no URL, the access-date is removed too.&#32; Headbomb {t · c · p · b} 17:13, 26 April 2020 (UTC)
 * What about the "books" part removed from Google Books URLs? -  Neutralhomer •  Talk  • 17:15 on April 26, 2020 (UTC) •  #StayAtHome
 * Those are not required either, that's exactly why they are removed. Have you tried following the links yourself? You end up at the exact same page. Redalert2fan (talk) 17:35, 26 April 2020 (UTC)
 * Oh and the reason why the bot edit period seems random is because it is activated by users as can be seen in the edit summary, it does not run around freely on its own. Redalert2fan (talk) 17:41, 26 April 2020 (UTC)
 * OK, it would be very helpful if the edit summaries were a little easier to read and less alphabet soup.
 * I'm more of a "stop, contact the user, see what the problem is" and a "if it ain't broke..." kind of editor. -  Neutralhomer •  Talk  • 17:53 on April 26, 2020 (UTC) •  #StayAtHome
 * "Removed or converted URL. Removed accessdate with no specified URL." seems clear enough here. &#32; Headbomb {t · c · p · b} 18:26, 26 April 2020 (UTC)
 * Clearly not for everyone. :) Clear terms, just think about it. -  Neutralhomer  •  Talk  • 18:53 on April 26, 2020 (UTC) •  #StayAtHome
 * It’s evolving over time, but we are limited by length limits and some things are hard to describe.  Perhaps we should work on making a long list of all the things the bot does and an explanation of why. AManWithNoPlan (talk) 19:00, 26 April 2020 (UTC)
 * An FAQ would be useful, yes. &#32; Headbomb {t · c · p · b} 19:05, 26 April 2020 (UTC)
 * That's all I ask, is that it is considered. Thanks. :) -  Neutralhomer  •  Talk  • 19:15 on April 26, 2020 (UTC) •  #StayAtHome
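
As an illustration of the behaviour described in this thread (JSTOR URLs moved to the jstor parameter, DOI URLs dropped when they duplicate the doi parameter, and the access date removed once no URL remains), here is a minimal Python sketch. It is not the bot's actual code: the function name, the regular expressions, and the use of a plain dict for template parameters are all assumptions made for the example, and it only recognises doi.org links, not publisher links such as the Springer one mentioned above.

```python
import re

# Hypothetical patterns; the real bot recognises many more URL shapes.
JSTOR_URL = re.compile(r"^https?://(?:www\.)?jstor\.org/stable/([^/?#]+)", re.I)
DOI_URL = re.compile(r"^https?://(?:dx\.)?doi\.org/(10\.\S+)$", re.I)


def tidy_identifier_url(params):
    """Return a copy of `params` with redundant identifier URLs folded away."""
    cleaned = dict(params)
    url = cleaned.get("url", "")

    jstor = JSTOR_URL.match(url)
    doi = DOI_URL.match(url)

    if jstor:
        # Move the stable ID into the dedicated parameter and drop the URL.
        cleaned.setdefault("jstor", jstor.group(1))
        del cleaned["url"]
    elif doi and cleaned.get("doi", "").lower() == doi.group(1).lower():
        # The URL only repeats what the doi parameter already links to.
        del cleaned["url"]

    if "url" not in cleaned:
        # An access date is meaningless without a URL.
        cleaned.pop("access-date", None)
        cleaned.pop("accessdate", None)
    return cleaned
```

For example, a citation holding only a JSTOR stable URL and an access date would come back with a jstor value and neither url nor access-date, which is what the edit summary "Removed or converted URL. Removed accessdate with no specified URL." is reporting.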

� used in title
Funky Unicode punctuation instead of normal characters are a real pain, often not present in the meta data. I am making assumptions since we have no electricity today. :-{ AManWithNoPlan (talk) 14:11, 18 April 2020 (UTC)
 * The page source at that URL says
 * I don't see funky Unicode. I see U+2019, described at https://www.fileformat.info/info/unicode/char/2019/index.htm as "this is the preferred character to use for apostrophe". – Jonesey95 (talk) 15:31, 18 April 2020 (UTC)
 * With Google Chrome running the bot this is the exact thing I got: "Red Velvet��s Irene, new face of world��s bestselling liquor brand Chamisul soju - Pulse by Maeil Business News Korea" - on the page itself I can see it as "Red Velvet’s Irene, new face of world’s bestselling liquor brand Chamisul soju" normally. Maybe something went wrong while extracting it? Redalert2fan (talk) 16:35, 18 April 2020 (UTC)
 * I have electricity again. You are correct, the website's metadata is actually right.  Will investigate at some point. AManWithNoPlan (talk) 16:45, 18 April 2020 (UTC)
 * Such mistakes usually happen when the encoding is stated incorrectly. The page declares to have charset=euc-kr, not UTF-8. Nemo 18:30, 18 April 2020 (UTC)
 * I only just read this, thanks for the explanation. So it seems the bot should always either convert to UTF-8 if possible or only "read" in UTF-8 itself. Redalert2fan (talk) 20:11, 29 April 2020 (UTC)
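
The encoding mismatch described above can be reproduced in a few lines. The following Python sketch is only an illustration under stated assumptions (the bot itself is not written in Python, and `decode_page` is a hypothetical helper): it encodes a title containing U+2019 as EUC-KR, as a page declaring charset=euc-kr would serve it, then compares decoding with the declared charset against blindly assuming UTF-8.

```python
def decode_page(raw, declared_charset=None):
    """Decode fetched bytes using the page's declared charset, falling back to UTF-8."""
    charset = declared_charset or "utf-8"
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong label: fall back, marking undecodable bytes with U+FFFD.
        return raw.decode("utf-8", errors="replace")


title = "Red Velvet\u2019s Irene"
raw = title.encode("euc-kr")  # bytes as a server declaring charset=euc-kr would send them

# Ignoring the declaration and assuming UTF-8 yields U+FFFD replacement characters.
garbled = raw.decode("utf-8", errors="replace")

# Honouring the declaration recovers the original apostrophe.
correct = decode_page(raw, "euc-kr")
```

Here `garbled` contains replacement characters where the apostrophe should be, much like the title quoted in the report, while `correct` equals the original string.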

More JSTOR cleanup

 * semi-related to a few sections above, this one / turns out to be an example of a free jstor link. &#32; Headbomb {t · c · p · b} 15:42, 26 April 2020 (UTC)
 * For me it looks like all the others, maybe it's different for North American users. The version at http://hdl.handle.net/2027/emu.300000407463 should be OA in USA but for me it says "This item is not available online ( Limited - search only) due to copyright restrictions" and "Rights: Public Domain in the United States". Nemo 16:11, 26 April 2020 (UTC)
 * The JSTOR landing page looks like this for me, where it's clearly marked open-access. The HDL version isn't. &#32; Headbomb {t · c · p · b} 16:58, 26 April 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2827 AManWithNoPlan (talk) 01:15, 1 May 2020 (UTC)

Remove dead URL when doi-access=free

 * Probably should wait until the autolinking RFC concludes, and then this would be a cinch. Headbomb {t · c · p · b} 18:36, 26 April 2020 (UTC)
 * It can wait, but it's unrelated. Garbage URLs are garbage nonetheless. Nemo 19:08, 26 April 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2826 AManWithNoPlan (talk) 01:09, 1 May 2020 (UTC)

PLOS ONE inconsistent journal name change
https://github.com/ms609/citation-bot/pull/2825 AManWithNoPlan (talk) 22:20, 30 April 2020 (UTC)

Data dryad 'book' chapter
https://github.com/ms609/citation-bot/pull/2829 AManWithNoPlan (talk) 22:24, 1 May 2020 (UTC)

Inappropriate arXiv-bibcode

 * While those are indeed valid bibcodes for the preprint associated with the papers, and many of those won't ever be updated to a "final" bibcode simply because these journals aren't indexed in ADSABS, I also kind of agree with David here that bibcodes should be reserved for the version-of-record version of papers. So if there's a DOI, and the bibcode is YYYYarXiv... (or similar arxiv-bibcode patterns) it shouldn't be added and existing ones should be removed. &#32; Headbomb {t · c · p · b} 07:43, 2 May 2020 (UTC)
 * Just a personal opinion: I hate these bibcodes too. They're only a hindrance. Nemo 10:58, 2 May 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2830 New arXiv bibcode will not be added (since they are useless) and existing ones will be replaced if a non-arxiv one is found. AManWithNoPlan (talk) 11:54, 2 May 2020 (UTC)
 * Doesn't seem to be . &#32; Headbomb {t · c · p · b} 18:28, 2 May 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2833 That's what happens when the code is non-standard. Cleaned up.  AManWithNoPlan (talk) 20:21, 2 May 2020 (UTC)
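
The rule stated above (never add a new arXiv bibcode, and replace an existing one when a non-arXiv bibcode is found) can be sketched as follows. This is a hedged Python illustration rather than the code in the linked pull requests: the function names and the "four digits followed by arXiv" pattern are assumptions, and the bibcodes in the usage note are made up for the example.

```python
import re

# Assumed pattern: ADS preprint bibcodes start with a year followed by "arXiv".
ARXIV_BIBCODE = re.compile(r"^\d{4}arXiv", re.I)


def is_arxiv_bibcode(bibcode):
    """True when the bibcode points at the arXiv preprint rather than the version of record."""
    return bool(ARXIV_BIBCODE.match(bibcode or ""))


def choose_bibcode(existing, candidate):
    """Return the bibcode to keep in the citation, or None for no bibcode."""
    if candidate and not is_arxiv_bibcode(candidate):
        # A version-of-record bibcode fills an empty slot or replaces a preprint one.
        if not existing or is_arxiv_bibcode(existing):
            return candidate
    # Otherwise keep whatever was there; a new arXiv bibcode is never added.
    return existing or None
```

With these definitions, `choose_bibcode(None, "2020arXiv200412345X")` returns `None`, while `choose_bibcode("2020arXiv200412345X", "2020ApJ...890..100S")` returns the journal bibcode.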

Better cite report cleanup
Applies to the other identifier urls too. &#32; Headbomb {t · c · p · b} 15:57, 2 May 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2832 AManWithNoPlan (talk) 16:56, 2 May 2020 (UTC)

Format of suppression comment
The instructions under Stopping the bot from editing say "add a comment...such as". I think it would be helpful to clarify the "such as". Will any comment do? An empty comment? Or should it contain some key words? David Brooks (talk) 15:08, 2 May 2020 (UTC)
 * Any comment will do, empty or not. &#32; Headbomb {t · c · p · b} 15:14, 2 May 2020 (UTC)
 * We do encourage text within the comment that is useful to human beings. Sometimes it is useful to say something like "HDL data contains bad issue number, journal does not actually have issues". AManWithNoPlan (talk) 15:37, 2 May 2020 (UTC)
 * Thanks for confirming; can you clarify on the userpage? Encouraging a more targeted explanation is also good. David Brooks (talk) 16:00, 2 May 2020 (UTC)
 * Go for it. Please do! AManWithNoPlan (talk) 16:52, 2 May 2020 (UTC)
 * Flag as fixed to archive, since I added more text. Others feel free to expand or improve.  AManWithNoPlan (talk) 14:10, 3 May 2020 (UTC)
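
To make the "any comment will do, empty or not" answer concrete, here is a small Python sketch of how such a check could look. It is an assumption-laden illustration, not the bot's implementation: `is_suppressed` and `fill_parameter` are hypothetical names, and template parameters are modelled as a plain dict.

```python
import re

# Any HTML comment counts, including the empty one: <!---->
HTML_COMMENT = re.compile(r"<!--.*?-->", re.S)


def is_suppressed(value):
    """True when a parameter value carries an HTML comment of any kind."""
    return bool(HTML_COMMENT.search(value))


def fill_parameter(params, name, new_value):
    """Set params[name] unless an editor has marked it with a comment or already filled it."""
    current = params.get(name, "")
    if is_suppressed(current) or current.strip():
        return params
    updated = dict(params)
    updated[name] = new_value
    return updated
```

A value such as `<!-- HDL data contains bad issue number, journal does not actually have issues -->` both blocks the change and tells the next human editor why.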

Changing Google Books links
The bot is changing Google Books links, which Google changes back again, so the change seems pointless at best. For example, the bot changes https://books.google.com/books?id=7zoaKIolT9oC to https://books.google.com/?id=7zoaKIolT9oC. SarahSV (talk) 05:58, 2 May 2020 (UTC)
 * Google doesn't "change" the links, which remain undisturbed in our wikitext and HTML. Linking an URL which redirects to another more complex URL is not some strange exercise in pointlessness, it's a desirable practice used for permalinking. For instance Special:Diff/954401894 links https://en.wikipedia.org/wiki/Special:Diff/954401894 which redirects to https://en.wikipedia.org/w/index.php?diff=954401894 . Nemo 06:06, 2 May 2020 (UTC)
 * Nemo, I'm sorry, I didn't understand how your answer relates to my question. I'll rephrase. What is the point of the bot changing  to   We could also write https://google.com/books?id=7zoaKIolT9oC, but again, why change it? SarahSV (talk) 06:35, 2 May 2020 (UTC)
 * When performed by a bot it is an edit that falls under the WP:COSMETICBOT policy. Simplest solution is the bot not engaging in COSMETICBOT edits, which in and by themselves are not "helpful", but only an attempt to force down a bot programmer's personal preference as WP:FAITACCOMPLI. --Francis Schonken (talk) 06:47, 2 May 2020 (UTC)
 * Those are not cosmetic changes because they change the rendered output when printed. Removing  is more efficient/cleaner, but I'll also agree that removing   only to have google re-add it is sort of pointless. WP:FAITACCOMPLI here is a red herring.&#32; Headbomb {t · c · p · b} 07:50, 2 May 2020 (UTC)
 * , Is the change of google books link done on its own or only in conjunction with other tasks that include a substantive change?
 * As you say per WP:COSMETICBOT Changes that are typically considered substantive affect something visible to readers and consumers of Wikipedia, such as the output text or HTML in ways that make a difference to the audio or visual rendering of a page in web browsers, screen readers, when printed, in PDFs. By that it is not a cosmetic edit. However, if it is considered a cosmetic edit, again according to WP:COSMETICBOT Such changes should not usually be done on their own, but may be allowed in an edit that also includes a substantive change. So it is still allowed if not done as the only change. Now, whether this specific change is really useful is debatable, yes. Redalert2fan (talk) 09:51, 2 May 2020 (UTC)
 * Further, the bot's edit changed a situation where the print version would contain:
 * one time "books.google.be/books?",
 * and ten times "books.google.com/books?"
 * ... to a situation where the print version contains
 * three times "books.google.com/?"
 * one time "books.google.be/books?"
 * and seven times "books.google.com/books?"
 * The one useful edit would have been to change a ".be" link to a ".com" link (which the bot left untouched), so I changed all eleven books.google links to what was the dominant version both before and after the bot edit. The bot should not be used to introduce inconsistency with the dominant style, which is a CITEVAR issue. --Francis Schonken (talk) 10:06, 2 May 2020 (UTC)
 * Changes that are typically considered substantive affect something visible to readers and consumers of Wikipedia, such as the output text or HTML in ways that make a difference to the audio or visual rendering of a page in web browsers, screen readers, when printed, in PDFs <--- See that part in bold? Not cosmetic. &#32; Headbomb {t · c · p · b} 14:18, 2 May 2020 (UTC)
 * For clarity, the "CITEVAR issue" mentioned in my 10:06, 2 May 2020 comment is already an acknowledgement it is not cosmetic: it is worse, it is a guideline infraction. --Francis Schonken (talk) 14:23, 2 May 2020 (UTC)
 * , Could you provide a diff or actually submit a bug report for that? It sounds like an actual bug in my opinion, or something that can be improved. Whether "books.google.com/books?" should be changed to "books.google.com/?" has come under reconsideration, but as you say, changing .be to .com should be done. Redalert2fan (talk) 10:25, 2 May 2020 (UTC)
 * Oops, sorry, it's the first diff given under the subsection above (for clarity: this one), forgot we moved to another section here. Yes, was thinking about a bug report, but rather about the bot introducing inconsistency which is a bug and/or CITEVAR guideline transgression, than about the .be thing (which may be useful, but no guideline is wronged when it isn't done). --Francis Schonken (talk) 10:31, 2 May 2020 (UTC)
 * SarahSV, sorry for being less than clear: the point is explained at the page Permalink. I can share more resources on the topic if needed. Nemo 10:57, 2 May 2020 (UTC)
 * Sorry it is not. The Permalink article has a few banner tags on top, indicating its content is *questionable*, so can not be used to explain anything; further it does not explain whether or not Google Books *has* permalinks, and even less, if it would have, what the Permalink URLs of these pages would look like. In short: red herring. --Francis Schonken (talk) 11:18, 2 May 2020 (UTC)
 * Google Books does NOT have permalinks. Please follow the link and you will see that there is a link to try out the NEW google books -- which is a new link :-(    So, google has "Perma-until-we-change-our-minds-links".  As for why the bot is doing this now: well, the bot has always done this, but in some cases because of a minor code bug, this change did not occur.  So, now when you run the bot again, even more google book url simplification occurs.  Anyway, if google ever retires the current URL format, then it will be some poor bot's job to go and remove a bazillion google book urls from wikipedia.  AManWithNoPlan (talk) 11:36, 2 May 2020 (UTC)

Anyhow:" Is a bug" The reply "even more google book url simplification occurs" does not sound like the bug is addressed. Either the bot is liable to introducing CITEVAR variation and/or inconsistency where there was none, or the bug is sorted, ending that undesirable bot behaviour. I suppose enough info has been given above to describe the bug so it can be sorted, which would make posting a formal bug report redundant. --Francis Schonken (talk) 12:19, 2 May 2020 (UTC)


 * The changes made to google book urls consist of two types: removal of unneeded extraneous text that does nothing but make the URL longer (the word book in the URL) and removal of unneeded extraneous text that helps identify the entity that did the original books search (all the other changes). The only possible bug is that the bot might only remove "books" from some urls and nothing more, and thus be cosmetic in nature.  Historically, the edits in question have been done for over a decade, but a small bug meant that some of them were not being done fully (books was removed from most but not all URLS), so that now on some pages the bot will shorten the urls a second time in a way that could be viewed as cosmetic.  AManWithNoPlan (talk) 12:51, 2 May 2020 (UTC)
 * ... which is still unclear on whether the bug (whether it is qualified as "small" or not) is being sorted? --Francis Schonken (talk) 12:55, 2 May 2020 (UTC)
 * I think the question is (correct me if wrong) "Does only (and nothing else) removing "book" from Google Books URLs count as a cosmetic edit that is not allowed within the cosmetic bot rules?" The follow-on question would be "If yes, then what is being done about it?" (I do know that the answer is currently "nothing", but I have been busy with the other topics on this page and my own real life.)  I am off to teach a class, so see you all this afternoon or perhaps a little later. AManWithNoPlan (talk) 13:25, 2 May 2020 (UTC)
 * No, that is not the question: it shifted to "is the bot edit an infringement of WP:CITEVAR" after the example I gave. --Francis Schonken (talk) 13:35, 2 May 2020 (UTC)
 * What parts of WP:CITEVAR are being violated by specific URL anonymizations? It is a long (and yet incomplete) set of standards, suggestions, and examples. AManWithNoPlan (talk) 14:15, 2 May 2020 (UTC)
 * See above. --Francis Schonken (talk) 14:17, 2 May 2020 (UTC)
 * That's not a WP:CITEVAR violation by any stretch of the imagination, nor it is a style issue. &#32; Headbomb {t · c · p · b} 14:20, 2 May 2020 (UTC)
 * It is: the bot edit produces an inconsistent style when the article is printed or converted to PDF. --Francis Schonken (talk) 14:25, 2 May 2020 (UTC)
 * I never thought about the shortening being non-cosmetic (other than helping editors by making the source easier to read) since it shows up in PDFs and other printed versions. Good to know. AManWithNoPlan (talk) 15:41, 2 May 2020 (UTC)
 * Well, neither did I, till Headbomb made that argument, convincingly, earlier today. But, this makes bots performing tasks resulting in an inconsistent style (of the PDF/printed version), and/or moving away from the dominant style (as visible in the PDF/printed version), an issue.
 * Note that Wikipedia output of PDF & printed versions has changed a few times (non-trivially!) over the period since when Citation bot had its last approval of a task... nine years ago. --Francis Schonken (talk) 15:50, 2 May 2020 (UTC)
 * I am curious what the point of the discussion is now. Is there some reason that anonymizing or shortening URLS should not be done?  AManWithNoPlan (talk) 20:44, 2 May 2020 (UTC)
 * The only reason I can think of that it shouldn't be done is if URLs become incomprehensible/confusing to humans. Also normalizing the domain from .be to .com is not a style issue or CITEVAR issue. Headbomb {t · c · p · b} 21:18, 2 May 2020 (UTC)

please fix the bug, tx. --Francis Schonken (talk) 02:34, 3 May 2020 (UTC)


 * Carefully looking at the Google documentation, they suggest the word books be part of the url. https://github.com/ms609/citation-bot/pull/2834 I should note that when books is in the url before the ID, the hostname does not need books in it.    AManWithNoPlan (talk) 12:13, 3 May 2020 (UTC)
 * Flagging as fixed AManWithNoPlan (talk) 20:15, 3 May 2020 (UTC)

One last note We are now adding (instead of removing) the /books. This is because google is moving away from books.google.com to just google.com and putting the books in the URL path instead of the hostname. AManWithNoPlan (talk) 13:19, 4 May 2020 (UTC)
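
For readers following the outcome of this section, here is a rough Python sketch of the kind of normalization discussed: keep the /books path, normalize a national domain such as .be to .com, and drop query parameters that only record who searched for what. It is an assumption-heavy illustration and not the code in the linked pull request; the function name, the choice of books.google.com as the target host, and the allowlist of kept parameters (`id`, `pg`) are all hypothetical, and the sketch simply discards any URL fragment.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these query parameters identify the book and page.
KEEP_PARAMS = ("id", "pg")


def normalize_google_books_url(url):
    """Return a shortened Google Books URL, or the input unchanged if it is not one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    on_books_host = host.startswith("books.google.")
    on_books_path = (
        re.fullmatch(r"(?:www\.)?google\.[a-z.]+", host) is not None
        and parts.path.startswith("/books")
    )
    if not (on_books_host or on_books_path):
        return url

    query = dict(parse_qsl(parts.query))
    if "id" not in query:
        return url  # nothing to anchor on, so leave the URL alone

    kept = [(key, query[key]) for key in KEEP_PARAMS if key in query]
    # Keep "books" in the path, as the discussion above settled on.
    return urlunsplit(("https", "books.google.com", "/books", urlencode(kept), ""))
```

Under these assumptions, both `https://books.google.com/?id=7zoaKIolT9oC` and `https://google.com/books?id=7zoaKIolT9oC` come out as `https://books.google.com/books?id=7zoaKIolT9oC`, so a page that mixes the forms ends up with one consistent style.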

Caps "sur"
For reference:. Redalert2fan (talk) 16:00, 3 May 2020 (UTC)

https://github.com/ms609/citation-bot/pull/2838 AManWithNoPlan (talk) 17:20, 3 May 2020 (UTC)

Citation number: dash is incorrectly modified

 * This is an extremely unusual situation, the solution here is to use hyphen like so, (or alternatively put a comment in the number parameter).&#32; Headbomb {t · c · p · b} 05:17, 5 May 2020 (UTC)
 * Once this is deployed (https://github.com/ms609/citation-bot/pull/2843) pages/issues/numbers with both whitespace and a letter will not be changed. AManWithNoPlan (talk) 12:07, 5 May 2020 (UTC)
 * Thanks for tips and updates! + m t  21:14, 5 May 2020 (UTC)
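
The guard described above (values containing both whitespace and a letter are left alone) might look like the following Python sketch. The guard follows the wording of the reply; the normalization step itself, turning a hyphen between digits into an en dash, is an assumption added only to give the guard something to protect, and both function names are hypothetical.

```python
import re


def should_leave_untouched(value):
    """True when the value contains both whitespace and a letter."""
    has_space = re.search(r"\s", value) is not None
    has_letter = re.search(r"[^\W\d_]", value) is not None
    return has_space and has_letter


def normalize_range(value):
    """Normalize a pages/issue/number value unless the guard says to skip it."""
    if should_leave_untouched(value):
        return value
    # Assumed normalization: a hyphen between two digits becomes an en dash.
    return re.sub(r"(?<=\d)-(?=\d)", "\u2013", value)
```

So a plain range such as `12-15` is still normalized, while something like `Suppl 1-2` is returned exactly as written.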

[Feature request] capitalize C in cite template

 * I like them lowercase. I have mostly only ever seen them lowercase. Most of the actual examples in Help:Citation Style 1 are lowercase. I would be quite unhappy at a bot that started converting the case of them, just as I would be unhappy at a bot that changed one-line-one-parameter formatting of these templates to no-line-breaks formatting (both of which I use in different circumstances). —David Eppstein (talk) 05:47, 6 May 2020 (UTC)
 * Likewise, I prefer them lowercase. This isn't a suitable task for the bot. &#32; Headbomb {t · c · p · b} 10:34, 6 May 2020 (UTC)
 * I personally wish that a giant bot would run and make them all the same (one way or another - just pick one!). AManWithNoPlan (talk) 12:22, 6 May 2020 (UTC)
 * Thank you all for the information. I had assumed, since the "cite" tool in the visual editor uses an uppercase C, that that was what was expected. RayScript (talk) 15:51, 6 May 2020 (UTC)

Cleanup of cite chapter
Cite chapter is just cite book under a different name. You can apply any cleanup done to cite book to cite chapter. &#32; Headbomb {t · c · p · b} 14:51, 6 May 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2845 AManWithNoPlan (talk) 17:20, 6 May 2020 (UTC)

Better series handling

 * Possibly coded with exceptions for the hyphenated version of the series. Or a general "hyphen = not a significant difference" logic. Headbomb {t · c · p · b} 17:25, 7 May 2020 (UTC)
 * https://github.com/ms609/citation-bot/pull/2848 AManWithNoPlan (talk) 18:51, 7 May 2020 (UTC)

Title=Log in or register to view

 * I thought this seemed a little familiar to me; User_talk:Citation_bot/Archive_19. Redalert2fan (talk) 08:29, 8 May 2020 (UTC)
 * Already fixed. We no longer even try on Facebook URLs.  We also have blacklisted that title.  AManWithNoPlan (talk) 11:44, 8 May 2020 (UTC)
 * Some people might be annoyed at you reporting an old bug, but the reality is that it is very possible that you will run across something the bot did in 2019, and it is still doing it, so please report even these old problems. AManWithNoPlan (talk) 11:44, 8 May 2020 (UTC)
 * For the record, I wasn't annoyed, I actually thought it might have gotten broken again :) Redalert2fan (talk) 12:16, 8 May 2020 (UTC)
 * For the record, I didn’t think you were annoyed, I was referring to other people (myself in particular, since I have been bombastic when under serious stress in the real world) AManWithNoPlan (talk) 13:20, 8 May 2020 (UTC)
 * , Okay my apologies then, totally understandable together with some of the other discussions here as well. I just removed the page from my watch list for a while but it would be hard for you to do the same. Redalert2fan (talk) 15:17, 8 May 2020 (UTC)
 * 👍🏻 This talk page is a unique place. AManWithNoPlan (talk) 18:15, 8 May 2020 (UTC)

Create a status page
Since we don’t run the tools, then probably no. https://tools.wmflabs.org runs it. AManWithNoPlan (talk) 16:38, 6 May 2020 (UTC)
 * IABOT (https://tools.wmflabs.org/iabot/index.php) and AnomieBot (https://tools.wmflabs.org/anomiebot/) both have similar sorts of pages. With IABOT you can see the status of your specific request and AnomieBot just lets you see jobs that are running or happened recently RayScript (talk) 16:44, 6 May 2020 (UTC)
 * Sounds like a lot of work. What we really need is for the translation-server process to reboot itself every six hours automatically.  AManWithNoPlan (talk) 18:06, 6 May 2020 (UTC)
 * Also, the translation-server is a node.js machine. It's not quite the same as a PHP process which has a queue of tasks to run, quite the opposite. The difference may sound immaterial but it can make things harder. Nemo 19:05, 6 May 2020 (UTC)
 * Is the Zotero server something controlled by citationbot? If not, who is hosting it? I am willing to message them to ask about this. I do think it would be nice to have a status page to say how big the queue is or average response time but if it's too much work maybe this scheduled reboot of Zotero would suffice. Thanks RayScript (talk) 20:58, 6 May 2020 (UTC)
 * The bot dying is VERY rare these days. The Zotero server is another story.  That is one reason why it runs as a separate process so we don't get killed when it dies, we just get slow.  The zotero server is run by the same person in the same tool farm as the citation bot server.  I really think at least 90% of the problems could be fixed by a restart cron task.  I think the zotero server is like my old wifi router (public service announcement - do not waste your time on cheap routers, you will hate yourself) -- it just 'gets tired' over time (I once had to reboot my Christmas lights, so anything can crash).  AManWithNoPlan (talk) 21:33, 6 May 2020 (UTC)


 * Cron is the right answer here, I think, but unfortunately doesn't seem to be available on the wmflabs server. I'm sure an equivalent is available. I've not got time to look into it, but if you let me know what script to run I can run it!  Martin  (Smith609 – Talk)  07:29, 7 May 2020 (UTC)
 * How would a status page give you more than https://en.wikipedia.org/wiki/Special:Contributions/Citation_bot does AManWithNoPlan (talk) 12:23, 7 May 2020 (UTC)
 * Perhaps something like this: it says "I'm working on this page (or list of pages) or category now, which was requested by X", or "I'm currently working on page 20/100" (and states the page), so you actually know how far along it is? That would be more like how IAbot does it; not that that means we should do the same or that it is even possible, just an idea. I had another idea: displaying the https://tools.wmflabs.org/citations/index.html output for everyone's requests together, visible to everyone in real time, but I think I'm gonna scrap that suggestion myself for bandwidth considerations alone... Redalert2fan (talk) 14:24, 7 May 2020 (UTC)
 * I think the request is to see somehow if the server load (zotero) is high or unavailable and it is worth it to use the bot, or just wait, but as you say that can be fixed by actually working on zotero itself, which would mean you don't have to worry about a status in the first place. Redalert2fan (talk) 14:27, 7 May 2020 (UTC)
 * Indeed, Zotero is the weak link. And when it’s down, it cannot tell you that it’s down.   It seems to die of old age, and just needs to be killed regularly and restarted.  Anyone have a way to do that easily with the tool server? AManWithNoPlan (talk) 14:40, 7 May 2020 (UTC)
 * you might have some insights here. &#32; Headbomb {t · c · p · b} 14:46, 7 May 2020 (UTC)
 * I don't know exactly what script to run as I don't know how this bot is setup in toolforge but I did find the toolforge guides for setting up cron jobs. In grid, kubernetes, and the general page. RayScript (talk) 14:55, 7 May 2020 (UTC)
 * Thanks for the links. Cron job set up to (re)start server at 12 minutes past each hour. Martin  (Smith609 – Talk)  09:16, 8 May 2020 (UTC)
 * I agree that a page showing all the realtime logs could be a bit much. However, a simple page showing how many pages are currently in queue (or what's in queue) would be helpful. That being said, if adding this small cron to zotero makes it so most issues go away then maybe this work would be superfluous. Also, it looks like the zotero server has been down for a couple days now, so maybe we should reboot it now. Is this the tool that we are having issues with? Are we on the latest version? If we are and we are still having this issue, I'd be willing to communicate with their project to see if they've experienced similar things or are interested in trying to help fix this. RayScript (talk) 14:55, 7 May 2020 (UTC)
 * The Zotero server has just been updated to the newest version. Hoping for an improvement.  Yes, this is it.  AManWithNoPlan (talk) 16:24, 7 May 2020 (UTC)
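
For reference, a restart "at 12 minutes past each hour" corresponds to the cron schedule `12 * * * *`. The following is only a minimal sketch of what that schedule means, assuming nothing about the actual Toolforge job or the restart script; `next_restart` is a hypothetical helper name, not part of the bot or the translation-server.

```python
from datetime import datetime, timedelta

# "12 minutes past each hour", i.e. the minute field of the cron line "12 * * * *".
RESTART_MINUTE = 12


def next_restart(now: datetime, minute: int = RESTART_MINUTE) -> datetime:
    """Return the first moment strictly after `now` that falls on `minute` past the hour."""
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # This hour's slot has already passed (or is exactly now), so use the next hour.
        candidate += timedelta(hours=1)
    return candidate
```

With an hourly restart, a hung server stays down for at most about an hour before the next restart picks it up.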

Bad pages number
Looking at "Numéro d'article 0007" is stated, but "Nombre de pages 13" is actually also given. Taking a look at the actual pdf in question I can confirm it has 13 pages. Redalert2fan (talk) 16:06, 3 May 2020 (UTC)
 * Same on where 1-13 is changed to 0007. Redalert2fan (talk) 16:09, 3 May 2020 (UTC)
 * The trials and tribulations of what is a page number in the modern era. Is it the article number or 1-N, where N is the number of pages? If your intent with a page number is to find the article, then 0007 is better; if your goal is to know how long the article is, then you want 1-N.  The general belief is that the number of total pages is largely pointless.  AManWithNoPlan (talk) 17:26, 3 May 2020 (UTC)
 * I see. I didn't notice it was article 7 of a larger part; I thought it was a separate single article that just happened to be number 7. My bad. However, article 0007 does not have to be on page 0007. Redalert2fan (talk) 18:23, 3 May 2020 (UTC)
 * This article has no page numbers, in the sense that it is not part of a larger work, such as a published work. As the Cite Journal docs say "Pages in the source that support the content (not an indication of the number of pages in the source; displays after 'pp.'", and as previous discussions have pointed out the article number (0007) is the closest thing that this article has to page numbers for location purposes.  To be more precise, the page numbers are 0007–1—0007–13 (but this is strongly discouraged as ugly since many bigger journals have large article numbers). AManWithNoPlan (talk) 20:10, 3 May 2020 (UTC)
 * Not helpful, but this would be a funnier discussion if the journal used three numbers instead of four; then we could be talking about article double-O seven. Author Bond, James Bond. AManWithNoPlan (talk) 20:11, 3 May 2020 (UTC)
 * Okay, that's understood, so even though I know the relevant train is on page 7-8 of the document by counting, since there are no page numbers (or 0007–7—0007–8 would be discouraged) there is no way to indicate this to readers. It just reads weird to me that article number 0007 would be in the page parameter, since by itself it's not really a page number either. Anyway, this is not really citation bot related if that is what the documentation says. Redalert2fan (talk) 20:44, 3 May 2020 (UTC)
 * Standard format would be 0007 7–8 → 0007-7–8 to indicate a range from 0007-7 to 0007-8. &#32; Headbomb {t · c · p · b} 20:30, 4 May 2020 (UTC)
 * I manually adjusted it per the suggestion above on the pages in question. Redalert2fan (talk) 16:09, 9 May 2020 (UTC)
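
The format suggested above (article number, hyphen, then an en-dash page range, as in 0007-7–8) can be sketched as below. This is only an illustration under that reading of the suggestion; `article_pages` is a hypothetical helper and not something the bot is known to implement.

```python
from typing import Optional

EN_DASH = "\u2013"  # the range separator used in "0007-7–8"


def article_pages(article_number: str, first: int, last: Optional[int] = None) -> str:
    """Build a locator such as '0007-7–8' from an article number and a page or page range."""
    if last is None or last == first:
        # A single page within the article, e.g. '0007-7'.
        return f"{article_number}-{first}"
    if last < first:
        raise ValueError("last page must not precede first page")
    return f"{article_number}-{first}{EN_DASH}{last}"
```

The article number is kept as a string so that leading zeros such as those in "0007" survive.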

Caps: SpringerPlus
https://github.com/ms609/citation-bot/pull/2851 AManWithNoPlan (talk) 11:35, 9 May 2020 (UTC)

Bot still converting ISBN= to isbn=
See this discussion in the archive from a few weeks ago. This was reported to have been fixed, but it is still happening (the diff linked above is from the most recent 24 hours). – Jonesey95 (talk) 15:08, 9 May 2020 (UTC)
 * The bot can't do anything about people manually saving the suggested output when they are just doing general tidying up. &#32; Headbomb {t · c · p · b} 15:26, 9 May 2020 (UTC)