User talk:Citation bot

Note that the bot's maintainer and assistants (Thing 1 and Thing 2) can go weeks without logging in to Wikipedia. The code is open source and interested parties are invited to assist with the operation and extension of the bot. Before reporting a bug, please note: Addition of  to citation templates by this bot is a feature. When there are two identical parameters in a citation template, the bot renames one to. The bot is pointing out the problem with the template. The solution is to choose one of the two parameters and remove the other one, or to convert it to an appropriate parameter. A 503 error means that the bot is overloaded and you should try again later – wait at least 15 minutes and then complain here.

Please [//en.wikipedia.org/w/index.php?title=User_talk:Citation_bot&action=edit&section=new&preload=User_talk:Citation_bot/preload&preloadtitle=Untitled_new_bug click here] to report an error. Or, for a faster response from the maintainers, submit a pull request with appropriate code fix on GitHub, if you can write the needed code.

Feature requests
Example: https://en.wikipedia.org/w/index.php?title=Friern_Hospital&diff=prev&oldid=1167644213
 * Implement support to expand from https://doi.org/10.1093/ww/9780199540884.013.U192476 to
 * Implement support to convert cite web to BioRef and GBIF
 * Use https://www.crossref.org/blog/news-crossref-and-retraction-watch/
 * https://www.ncbi.nlm.nih.gov/books/NBK25497/ set NLM_APIKEY and NLM_EMAIL
 * journal/publisher that only differ by 'and' and '&' should be treated as identical https://en.wikipedia.org/w/index.php?title=Congenital_cartilaginous_rest_of_the_neck&diff=prev&oldid=1199200383
 * Free archive.org links such as curl -sH "Accept: application/json" "https://scholar.archive.org/search?q=doi:10.1080/14786449908621245" | jq -r .results[0].fulltext.access_url
 * Use GET instead of POST when talking to databases, where possible, so that proxy caches work better.
 * Start to convert Google Books URL to "new" format https://www.google.com/books/edition/_/m8W2AgAAQBAJ?gbpv=1&pg=PA379
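
The request above to treat journal/publisher names that differ only by 'and' and '&' as identical could be sketched roughly as follows. This is a minimal illustration in Python, not the bot's actual code; the function names `normalize_name` and `same_name` are hypothetical, and folding case and whitespace as well is an added assumption.

```python
import re

def normalize_name(name: str) -> str:
    """Fold case, whitespace, and 'and'/'&' so near-identical names compare equal."""
    # Assumption: an HTML-escaped ampersand should count as a plain '&'.
    folded = name.casefold().replace("&amp;", "&")
    # Rewrite every '&' (with any surrounding spaces) as the word 'and'.
    folded = re.sub(r"\s*&\s*", " and ", folded)
    # Collapse runs of whitespace left over from the substitution.
    return re.sub(r"\s+", " ", folded).strip()

def same_name(a: str, b: str) -> bool:
    """True when two journal or publisher names differ only by 'and' vs '&'."""
    return normalize_name(a) == normalize_name(b)
```

Comparing normalized copies, rather than rewriting the stored value, would leave whatever spelling the editor chose untouched.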

Changing every citation of a publisher's webpage to Cite book
I have remained silent on this issue even though it has irritated me for a while now. And now that there is discussion above about the widespread useless cosmetic edits this bot continues to waste everyone's time with, I'll raise it: Why must every citation of a publisher's webpage be changed to Cite book? I can only speak for myself, but every time I cite such book webpages I am not citing the book itself. I am specifically referencing the information published on the webpage. So of course I do not want the citation to be changed to Cite book with a bunch of parameters of the book itself (ISBN, date, etc) added. So I inevitably stop the bot or replace the reference with a third-party source. I realise the defense will be "It doesn't hurt" or that some users are actually citing the book. And I realise this is not the most pressing issue, but why must the bot come to its own conclusion of the editor's intent? I see another user complained of this issue last year. Οἶδα (talk) 22:25, 27 September 2023 (UTC)


 * This may be the kind of situation where it's safest to explicitly tell citation bot not to muck with the citation. It's hard to automatically judge whether the human editor actually wanted "cite web" or "cite book". (There are many examples of people using "cite web" to cite resources that should actually be books, journal articles, etc.) –jacobolus (t) 01:38, 28 September 2023 (UTC)
 * I understand. But it still feels like another unnecessary task for this bot to insert itself into every article it can possibly find. For example, this edit is completely useless and actually corrupts my intention of the citation. Call me crazy but I don't want or need a bot telling me what I am citing (and actively altering my citations accordingly). Οἶδα (talk) 21:32, 13 October 2023 (UTC)
 * When I've quoted publisher blurbs in the past, I usually set publisher's blurb for clarity. In the specific case you've linked just above, another option would be not to cite the publisher's landing page at all, and add the book to a "Selected works" subsection or something. Indeed, the altered citation is sequential to another one, and so seems a bit superfluous. Or, alternatively, use "Citation bot bypass" somewhere in your citation as suggested above. Given the overall lazy referencing culture of less experienced editors, it's likely that in the majority of cases, people who drop a link to a publisher landing page are probably trying to cite the book itself, so this behaviour of assuming that's the case is net beneficial. Folly Mox (talk) 22:13, 13 October 2023 (UTC)
 * I cannot personally maintain that the majority of users citing a publisher's webpage are lazily intending to cite the book itself. My experience suggests otherwise which is why I have taken issue, but I realise my editing purview might be skewed. However, if that is observably true then I will resign to accepting this as a forgivable externality. Οἶδα (talk) 06:35, 14 October 2023 (UTC)
 * In fairness to your point, I haven't looked into the data about how frequently this sort of change is appropriate; it could be the case that my own perspective is the skewed one. Folly Mox (talk) 08:32, 14 October 2023 (UTC)
 * I couldn't find a list of tasks that the bot has been approved for (other than the very first approval) nor a thorough description of all of its mystical activities. I was surprised to find it would change "Cite web" to "Cite book" (for unclear reasons). The only cure, if the bot is unchanged, seems to be the   mechanism documented at User:Citation_bot - R. S. Shaw (talk) 04:12, 6 December 2023 (UTC)

Why is Citation Bot removing a page # from a cite's URL
On Charles Clinton, Citation bot removes "?seq=9" from this URL. That bit of code gives the Page # within the larger cite, so why does Citation bot remove it? It makes sense to me to leave that bit of code in there but the bot doesn't seem to think so. It's removed it twice, once here and once here, so maybe I'm wrong... Would appreciate some clarification. Thanks, Shearonink (talk) 03:36, 4 March 2024 (UTC)
 * Also, if by some chance I am correct, is there any way to stop people from running the Bot needlessly on this supposed issue? Thanks, Shearonink (talk) 03:37, 4 March 2024 (UTC)
 * The landing page is the same in either case. Headbomb {t · c · p · b} 04:09, 4 March 2024 (UTC)
 * It isn't the same for me... The one without the ?seq lands me on the main page, the URL with the ?seq=9" lands me on the exact page with the quoted text... Shearonink (talk) 04:55, 4 March 2024 (UTC)
 * Yes, I also see the preview page showing page 9 of 17 with the ?seq parameter. —David Eppstein (talk) 08:06, 4 March 2024 (UTC)
 * Oh good it isn't just me... The ?seq code might be taking us & other registered editors to the exact page because we have a JSTOR account through the WP Library I guess... But even if people don't have a JSTOR account the *code* should be left there, otherwise the URL seems useless. I like to give readers the option of going down the rabbithole of verifiability if they want to. Why is WP giving readers a URL that is to the entire book or article as the Citation bot default when the bot is run on the article? Shearonink (talk) 15:49, 4 March 2024 (UTC)
 * Actually currently JSTOR thinks it is providing me access through UT Dallas, I guess because I was there for a conference last summer. But yes, this should be left in place, like the pg= parameter of Google Books links, for the same reason. —David Eppstein (talk) 17:27, 4 March 2024 (UTC)
 * @Headbomb 2409:4070:4381:EF12:0:0:1BF5:A5 (talk) 12:34, 29 April 2024 (UTC)
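
The behaviour asked for in this thread, keeping ?seq= on JSTOR links in the same way pg= is kept on Google Books links, might look something like the sketch below. It is only an illustration under assumptions: the `KEEP_PARAMS` allow-list and the function name `tidy_jstor_url` are hypothetical, and nothing here is taken from the bot's real URL clean-up code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: 'seq' is the only query parameter that changes where a reader lands.
KEEP_PARAMS = {"seq"}

def tidy_jstor_url(url: str) -> str:
    """Drop query parameters from a JSTOR URL, but keep the page locator."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host != "jstor.org" and not host.endswith(".jstor.org"):
        return url  # not a JSTOR link, leave it alone
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in KEEP_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

An allow-list means any parameter not known to matter is still removed, while the page locator survives repeated bot runs.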

CITEVAR and manually formatted references
I asked this in the discussion of an earlier bug but it was archived without providing an answer. Can you please explain —David Eppstein (talk) 20:29, 2 April 2024 (UTC)
 * 1) How is it not a violation of WP:CITEVAR for Citation bot to convert manually-formatted references into templates, as it is doing e.g. at Special:Diff/1216926071? A human might do this but a bot automatically doing it is completely something else, especially in cases such as here where it does not even improve the consistency of formatting (the article is still a mix of CS1, CS2, and manually-formatted references).
 * 2) For those of us who might deliberately format references manually because we don't want bots messing with our citations, or we made a deliberate decision that the citation templates were inadequate for some specific citation, do we now have to start explicitly locking the bots out of articles altogether?
 * 3) Where is this included in the BAG-approved tasks for this bot?
 * 4) I find the bot's edit summary "Changed bare reference" to be significantly misleading. This is not a bare-url reference. It is a well-formatted reference that happens to be manually formatted. Where is there any guideline or policy suggesting that such references are a problem that needs to be fixed?


 * There's already citation templates on that page. No CITEVAR violation happened. Headbomb {t · c · p · b} 23:09, 2 April 2024 (UTC)
 * I mix manually formatted citations and template-formatted citations on pages all the time, deliberately. I would be extremely annoyed if a bot took it upon itself to change that deliberate decision. —David Eppstein (talk) 23:17, 2 April 2024 (UTC)
 * It should however, preserve the editors. Headbomb {t · c · p · b} 23:11, 2 April 2024 (UTC)

Use cite biorxiv
When encountering a cite journal or citation with bioRxiv or bioRxiv: The Preprint Server for Biology [case insensitive], the bot should convert the citation to a proper cite bioRxiv, i.e.



The bot should keep author/last/first/date/year/title/language, convert doi to biorxiv, and throw the rest away.



If it was from a citation, append cs2 to it.

To be extra safe, this should only be done when the DOI starts with 10.1101. Headbomb {t · c · p · b} 08:20, 20 June 2024 (UTC)
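
As a rough sketch of the rule proposed above (keep author/last/first/date/year/title/language, move doi to biorxiv, discard the rest, and only act when the DOI starts with 10.1101), something like the following could work. The function name and the dict-based representation of a template are hypothetical, treating numbered variants such as last1 or first2 as kept is an assumption, and the extra step for templates that started out as a citation is left out.

```python
import re

PREPRINT_NAMES = {"biorxiv", "biorxiv: the preprint server for biology"}
KEPT = {"author", "last", "first", "date", "year", "title", "language"}

def to_cite_biorxiv(template: str, params: dict) -> tuple:
    """Return (template_name, params), converted only when every check passes."""
    journal = params.get("journal", "").strip().casefold()
    doi = params.get("doi", "").strip()
    if template.casefold() not in {"cite journal", "citation"}:
        return template, params
    if journal not in PREPRINT_NAMES or not doi.startswith("10.1101/"):
        return template, params
    # Assumption: numbered forms (last1, first2, ...) count as the same parameter.
    kept = {k: v for k, v in params.items() if re.sub(r"\d+$", "", k) in KEPT}
    kept["biorxiv"] = doi  # the DOI moves to the biorxiv parameter
    return "cite bioRxiv", kept
```

Returning the input unchanged whenever a check fails keeps the conversion conservative, in line with the "extra safe" condition on the DOI prefix.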

Convert cite biorxiv to cite journal
If encountering a cite bioRxiv that is fully published, convert it to a cite journal For example says "Now published in eLife doi: 10.7554/eLife.05856"

So TNT the citation and expand it

Headbomb {t · c · p · b} 08:36, 20 June 2024 (UTC)


 * Note to self. New DOI is in crossref.  https://api.crossref.org/works/10.1101/007237 AManWithNoPlan (talk) 15:03, 22 June 2024 (UTC)
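
For what it is worth, a minimal sketch of pulling the published DOI out of a Crossref works record is given below. It assumes the link appears under `relation` → `is-preprint-of` in the record's `message` object, which is how Crossref preprint records commonly expose it; the field names should be checked against the live response before relying on them. The function name `published_doi` is hypothetical.

```python
from typing import Optional

def published_doi(message: dict) -> Optional[str]:
    """Return the DOI of the published version, if the preprint record names one."""
    # Assumption: 'message' is the "message" object of a Crossref works response.
    relations = message.get("relation", {}).get("is-preprint-of", [])
    for relation in relations:
        if relation.get("id-type") == "doi" and relation.get("id"):
            return relation["id"]
    return None  # no published version recorded
```

A `None` result would mean the citation stays a cite bioRxiv; only a non-empty DOI would justify rebuilding it as a cite journal.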

STILL creating new CS1 errors
Changing an incorrect cite journal to cite book : Good (although would have been better as cite conference).

Creating a new CS1 error where there was none before, because it left the paper title in the book title parameter and did not change the journal parameter to a book title parameter: doubleplusungood.

Stop it.

Posting as a message rather than a new bug because this is not a new bug. It is an old bug that has been ignored far too long by the developers (see, above). It needs to be fixed. —David Eppstein (talk) 23:07, 20 June 2024 (UTC)


 * It's not creating an error; it's flagging errors that were already there, but not reported. FM 2014: Formal Methods was wrong before. That the bot didn't manage to fix it doesn't make it a new error. Now the error is reported. This is an improvement, even though ideally the bot would be able to figure out and fix the error itself. Headbomb {t · c · p · b} 23:11, 20 June 2024 (UTC)
 * INCORRECT. It is creating an error, because formerly readers could see the paper title, see the book title (called a journal, but still formatted in italics the way readers would expect a book title to look), and see that it was a paper in a book with that title. After the edit, readers were presented only with the paper title, formatted as a book title, falsely telling them both in visible appearance and reference metadata that the reference was to an entire book-length work. It is not merely that it is creating CS1 errors, although that is bad enough. It is also making the reference less accurate in both its metadata and in its visible appearance. —David Eppstein (talk) 23:20, 20 June 2024 (UTC)
 * I've gotten really exhausted with this category of error introduced by Citation bot, which I encounter every day I edit. I used to creep its contributions and clean up after it, but I've started just reverting its edits that cause this kind of template error, regardless of any value added, and only sometimes actually fix up the citations myself. Few of the editors who call Citation bot on large sets of pages ever check in after it to see if it's causing errors, so typically no one notices my reverts. I saw a few weeks back that for one subset of conferences (IEEE maybe? or SPIE?) Citation bot has successfully been changing cite journal to cite book without introducing errors and growing the backlogs. So there has been a partial fix, but it's pretty frustrating that this known error has been perpetuated in thousands of edits spanning months. Citation bot does not have an approved BRFA task to change citation template types, and changing to cite book has been the one that's particularly fraught and error-prone ever since support for the aliases of periodical was dropped from cite book a year ago. The easiest thing would be if support were readded, but that seems highly unlikely. I do think that eventually, if this bug isn't fixed, I'll end up asking BAG to ban Citation bot changing template type to cite book. Disabling the functionality would be an improvement over the current situation. Folly Mox (talk) 00:02, 21 June 2024 (UTC)

"Page" parameter acting incorrectly

 * I have raised this issue in the past.  supports article-number (as does  when it has journal with an assigned value).  The bot should be using article-number for article numbers and should not be shoehorning them into page(s).
 * —Trappist the monk (talk) 17:24, 1 July 2024 (UTC)

Mathematical Reviews is not a book
This is likely a case where a reference to MR is confused with a reference to the work reviewed by MR. Headbomb {t · c · p · b} 20:12, 3 July 2024 (UTC)
 * The majority of that reference is to the book itself (DOI, ISBN, volume, etc) and not the MR.   AManWithNoPlan (talk) 00:59, 4 July 2024 (UTC)
 * You are completely missing the point.
 * After multiple passes of citation-cleaning bots including Citation bot and OAbot, what was originally a reference purely to a review in Mathematical Reviews gradually became more and more borked, in the process resembling a reference to the reviewed work. The most recent pass of Citation bot took a reference that, by then, resembled a citation to a book and made it look more like a citation to the book. But that was only the latest step of this borkage. Sometime longer ago a bot planted a turd in the citation and then the bots kept on polishing it, making it shinier and shinier but not any less smelly.
 * The problem here is not the individual edit. The problem is that when bots repeatedly replace and replace and replace bits of citations, without intelligence or oversight, they have a tendency to amplify their earlier mistakes. All it takes is a month or two of a bug where bad dois or bad hdls get added to citations (and we've seen such bugs, not just in this bot) and then later iterations take that as gospel and keep massaging the citation to more closely resemble that bad piece of the reference. One or two passes of Citation bot is usually an improvement. After that, further passes are as likely to break things and make more work for human editors as they are to make anything better.
 * We need some sort of cone of shame that can stop the bots from continuing to worry the same sore spots over and over, without keeping them away from new citations in need of bot cleanup. —David Eppstein (talk) 19:19, 5 July 2024 (UTC)
 * The reciprocal operation seems more common in my experience: DOIs to book reviews where the citation points to the reviewed book. Perhaps the least fun is where the same content is published originally in a journal and later as a book chapter, and the citation scripts pick the opposite publication to the original editor, resulting in wholly mixy-match metadata that can take twenty or thirty minutes to untangle. Whenever I find myself fixing citations that Citation bot has mucked up in this way (which can often as not be blamed on Crossref), I'll drop a hidden html comment so it ignores the citation in the future, but it would be nice not to have to do that every time. However, bots sprinkling script-embargo-date or suchlike all over doesn't feel like a super premium solution either. Folly Mox (talk) 16:40, 7 July 2024 (UTC)
 * There is no such parameter as script-embargo-date. What did you really mean?
 * —Trappist the monk (talk) 16:46, 7 July 2024 (UTC)
 * Sorry. I was workshopping ideas of how to slow down or arrest the process of citation scripts, and what it might be like to implement them. I think I skipped a step where I typed out the immediately rejected ideas of scripts keeping track of which citations they had previously edited (too resource intensive), or checking revision histories for their own activity (ditto). Then I leapt straight into rejecting the third idea, where bots drop themselves and each other little reminder notes using an invented parameter for the purpose. Unlike a few other problems that get mention on this talkpage, I don't have any clear idea how to prevent the sort of error described in this bug report. I forgot to type out some of my unclear bad ideas, probably due to being in an IRL conversation during the edit. Folly Mox (talk) 17:13, 7 July 2024 (UTC)

Link articles from theprint.in to ThePrint
Currently it is being changed to the Theprint, without the wikilink Skratata69 (talk) 10:19, 3 July 2024 (UTC)


 * Diff? Headbomb {t · c · p · b} 20:09, 3 July 2024 (UTC)

Changing to (cite journal) without giving a (journal) value
The proper conversion would be to a cite ssrn. Headbomb {t · c · p · b} 06:34, 7 July 2024 (UTC)
 * True, and best case outcome for this specifically. For the general case, it might be prudent for Citation bot to check whether or not it has parsed out a journal parameter from the metadata, and abort the change in template type if the value is empty. Folly Mox (talk) 16:26, 7 July 2024 (UTC)
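
The general-case guard suggested above, aborting the change of template type when no journal value was parsed from the metadata, could be sketched as below. The function name `choose_template` and the dict of parsed metadata are hypothetical stand-ins, not the bot's real data structures.

```python
def choose_template(current: str, proposed: str, metadata: dict) -> str:
    """Keep the current template type unless a switch to cite journal is backed by a journal name."""
    if proposed == "cite journal":
        journal = (metadata.get("journal") or "").strip()
        if not journal:
            return current  # nothing to fill the journal parameter with: abort the change
    return proposed
```

For example, `choose_template("cite web", "cite journal", {"title": "Some paper"})` would return `"cite web"`, so the template type is only changed when a journal value is actually available.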

can we please not "Upgrade ISBN10 to 13"?
As far as I can tell this has no practical advantage at all, and only serves to make the opaque identifier take up more space at readers' expense. –jacobolus (t) 23:48, 15 July 2024 (UTC)

Invisible characters
Decodes to "Cyber%20%E2%80%8B%E2%80%8BPartisans" and "Weekly%20%E2%80%94".

So whatever the characters %E2 %80 and %8B are, they should get the boot. Headbomb {t · c · p · b} 07:02, 16 July 2024 (UTC)
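
For reference, %E2%80%8B is the UTF-8 encoding of a single character, U+200B ZERO WIDTH SPACE, while %E2%80%94 in the second string is U+2014 EM DASH, which is visible and presumably should stay. A minimal sketch of stripping only the invisible character from a percent-encoded string follows; the `INVISIBLE` set and the function name are hypothetical, and a real fix might want to cover other zero-width characters too.

```python
from urllib.parse import quote, unquote

# Assumption: only the zero-width space is targeted; extend the set as needed.
INVISIBLE = {"\u200b"}

def strip_invisible(encoded: str) -> str:
    """Decode a percent-encoded string, drop invisible characters, re-encode."""
    text = unquote(encoded)  # the bytes E2 80 8B decode to one character, U+200B
    cleaned = "".join(ch for ch in text if ch not in INVISIBLE)
    return quote(cleaned)
```

Decoding before filtering matters here: removing the byte sequences %E2 and %80 individually would also damage the em dash, which shares its first two bytes with the zero-width space.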

10.4230 is open access, LIPIcs
10.4230 is open access, for LIPIcs. Headbomb {t · c · p · b} 21:09, 17 July 2024 (UTC)

Geophysical Journal International is open access
This should cover DOI prefixes


 * 10.1093/gji
 * 10.1111/j.1365-246X
 * 10.1046/j.1365-246X

Also RASTI
 * 10.1093/rasti

And MNRAS Letters Headbomb {t · c · p · b} 23:39, 17 July 2024 (UTC)
 * 10.1093/mnrasl
 * 10.1111/j.1745-3933
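
The prefixes listed in this section and in the LIPIcs request above could be collected into a simple lookup, sketched below. The tuple name, the function name, and the case-insensitive match are assumptions; the trailing slash on 10.4230 is added here so that a longer registrant prefix such as 10.42301 would not match.

```python
FREE_DOI_PREFIXES = (
    "10.4230/",             # LIPIcs
    "10.1093/gji",          # Geophysical Journal International
    "10.1111/j.1365-246x",
    "10.1046/j.1365-246x",
    "10.1093/rasti",        # RASTI
    "10.1093/mnrasl",       # MNRAS Letters
    "10.1111/j.1745-3933",
)

def is_free_doi(doi: str) -> bool:
    """True when the DOI starts with one of the prefixes listed as open access."""
    # str.startswith accepts a tuple and checks each prefix in turn.
    return doi.strip().casefold().startswith(FREE_DOI_PREFIXES)
```

Note that a DOI under 10.1093/mnras/ does not match the 10.1093/mnrasl prefix, so the main MNRAS journal is not swept in along with MNRAS Letters.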