User talk:Citation bot/Archive 13

Don't remove final, from URL
https://github.com/ms609/citation-bot/pull/1165 AManWithNoPlan (talk) 18:53, 4 January 2019 (UTC)


 * Yeah, never modify URLs without testing that the URL works. This is what I have learned with WAYBACKMEDIC. It is continually finding crazy things in URLs that are not predictable. One cannot safely say a URL ending in a set of characters should be changed, or added to. Same with encoding schemes: they can be all over the place, such as %20 vs +; there is no right way, even within the same URL. Standards are out the window these days; the only "right" URL is the one that works. -- Green  C  16:19, 5 January 2019 (UTC)
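The "test before you change" rule above can be sketched as follows. This is only an illustration in Python, not the bot's code: the function names `url_works` and `choose_url` and the User-Agent string are made up for the example, and the check is injectable so the decision logic can be exercised without network access.

```python
from typing import Callable
from urllib.error import URLError
from urllib.request import Request, urlopen


def url_works(url: str, timeout: float = 10.0) -> bool:
    """Return True if a GET to the URL answers with a 2xx status."""
    try:
        # Hypothetical User-Agent, used only for this sketch.
        req = Request(url, headers={"User-Agent": "url-check-sketch/0.1"})
        with urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (URLError, ValueError, OSError):
        # HTTP errors, malformed URLs and timeouts all count as "does not work".
        return False


def choose_url(original: str, candidate: str,
               check: Callable[[str], bool] = url_works) -> str:
    """Keep the original URL unless the modified candidate is verified to work."""
    if candidate == original:
        return original
    return candidate if check(candidate) else original
```

For example, `choose_url(url, url.rstrip(","))` would only drop a trailing comma if the shortened URL actually responds.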

Upgrades Arxiv to Journal for no apparent reason
See also. Headbomb {t · c · p · b} 17:46, 5 January 2019 (UTC)
 * it gets updated because of the bibcode. We have a blacklist of bibcodes that are actually arXiv despite claiming to be journals.   Obviously, you found a new liar to add. AManWithNoPlan (talk) 19:04, 5 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1167 AManWithNoPlan (talk) 19:33, 5 January 2019 (UTC)

Adding redundant duplicate alias of "work" parameter
https://github.com/ms609/citation-bot/pull/1172 Once implemented it should fix this. AManWithNoPlan (talk) 01:17, 6 January 2019 (UTC)

API: New feature request, run from links on page
Let's have something like This would be super useful. We could build lists of pages with crappy citations with AWB's database scanner or with clever insource:// search (e.g. pages with raw GoogleBooks links, pages with raw DOI links, ...), then put the list of pages to be edited somewhere (e.g. User:Headbomb/Sandbox5), then tell the bot to run against those pages (follow redirects if they exist). Headbomb {t · c · p · b} 14:45, 22 August 2018 (UTC)
 * since you seem to be the one to ask about API features, how doable is this? Headbomb {t · c · p · b} 11:11, 24 August 2018 (UTC)
 * Does the new "run on multiple pages separated by pipes" functionality address this request? Martin  (Smith609 – Talk)  07:46, 25 August 2018 (UTC)
 * not really. Those lists would have to be built and fed manually every time. It's OK for a one-time list, but the idea is that you could have a one-click way of running the bot on a list of links. Book:Canada would be a prime example (or cleanup-centric lists, like WP:JCW/J30 and fix a crap ton of capitalization mistakes in one click). If you could have something like , that would find all links on the page (likely direct links for simplicity) and run the bot on those pages, that would be great. That is if you have   somewhere on the page, get Foobar (follow redirects if there are any), and run the bot on that. Repeat for all other links it finds. Headbomb {t · c · p · b} 22:30, 26 August 2018 (UTC)

This is still something that would be incredibly useful. Headbomb {t · c · p · b} 15:02, 30 November 2018 (UTC)
 * For example: https://en.wikipedia.org/w/api.php?action=parse&prop=links&page=Chemistry&format=json Wikipedia can make the list for us.  Would obviously need to remove talk and other namespaces AManWithNoPlan (talk) 23:10, 23 December 2018 (UTC)


 * - while there would be uses where non-mainspace would be useful, I think restricting this to mainspace+draft would be best, at least for now. Headbomb {t · c · p · b} 23:24, 23 December 2018 (UTC)

This is basically a request that would allow any user to run a fully automated bot without needing WP:BRFA. Given this is a tool designed for manual watching of diffs, I wonder how wise it would be to turn the bot keys over. -- Green  C  16:27, 5 January 2019 (UTC)
 * Indeed. It is not even a category, so one could do this on a fashion article and find a link to quantum mechanics because the designer's uncle was a physics professor.  AManWithNoPlan (talk) 16:48, 5 January 2019 (UTC)


 * Yeah I agree there's a concern there. While running on Book:Canada (and other books) is no different than running on a category, maybe build a whitelist of users that could use it in such a fashion on other pages? Or some other whitelisting (e.g. any page that starts with "Book:", "Wikipedia:WikiProject ..." + specific pages "User:EXAMPLE/SANDBOX2"). Headbomb {t · c · p · b} 18:07, 5 January 2019 (UTC)


 * https://tools.wmflabs.org/citations/get_linked_pages.php?page=   This will give you a list of all linked pages (we have a short blacklist to remove things like doi, isbn, etc).  This way a human has to do a little work and think about it rather than just yelling “git her done” and leaving the scene of the crime.  Note that the extraneous html is removed in a pull that is not yet committed.  AManWithNoPlan (talk) 19:20, 7 January 2019 (UTC)
 * Yeah, but that's not extremely useful. I know what pages are on Book:Canada (or say WP:JCW/Sandbox ), the goal is to kick the bot into action once the list of pages to run on has been built, much like it does with a category. Headbomb {t · c · p · b} 20:58, 7 January 2019 (UTC)
 * It only takes a little copy and paste to make a pipe separated list. AManWithNoPlan (talk) 01:38, 8 January 2019 (UTC)


 * Which is extra work, for little reason, to have a piped list that no one knows what to do with. Headbomb {t · c · p · b} 01:40, 8 January 2019 (UTC)

fixed prints with pipes now. AManWithNoPlan (talk) 13:43, 8 January 2019 (UTC)


 * So now we have a piped list of things no one knows what to do with, and articles that still don't get edited by Citation bot. Headbomb {t · c · p · b} 13:57, 8 January 2019 (UTC)
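For readers following the thread, here is a rough sketch of the idea discussed above: take the JSON returned by the `action=parse&prop=links` query, keep only mainspace and Draft links, and join the titles with pipes. It assumes the legacy JSON layout in which each link is an object with `ns`, `*` (the title) and an `exists` key for pages that exist, and it assumes English Wikipedia's namespace numbers; the function names are invented for the example and this is not how get_linked_pages.php is written.

```python
from typing import Any

# Assumption: English Wikipedia namespace numbers (0 = article, 118 = Draft).
ALLOWED_NAMESPACES = {0, 118}


def linked_titles(parse_response: dict[str, Any]) -> list[str]:
    """Extract titles of existing mainspace/Draft pages from a parse-links response."""
    links = parse_response.get("parse", {}).get("links", [])
    titles = []
    for link in links:
        # Skip talk, user, template, etc., and skip red links (no "exists" key).
        if link.get("ns") in ALLOWED_NAMESPACES and "exists" in link:
            titles.append(link["*"])
    return titles


def pipe_list(titles: list[str]) -> str:
    """Join titles into the pipe-separated form accepted for multi-page runs."""
    return "|".join(titles)
```

The output still has to be handed to the bot by a human, which is the point of contention in the thread.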

Adding unrelated bibcode
The adsabs database seems to be more generous with matches suddenly. I have already submitted two fixes. https://github.com/ms609/citation-bot/pull/1174  https://github.com/ms609/citation-bot/pull/1169    AManWithNoPlan (talk) 14:59, 6 January 2019 (UTC)

Valid bibcode (book?) not expanded, but details added to a different citation in the same article
There is something wrong with that one bibcode: it redirects to another one. That makes us not expand it, since one check we do is to make sure the bibcode we get back is the one we sent out. This is unfixable, since we will not remove the double check. The second issue is that the bot currently rejects expansion of any book bibcodes, since that requires us to write code that we have not written yet. I might look into writing that code. AManWithNoPlan (talk) 22:53, 6 January 2019 (UTC)
 * The query changing the bibcode and then mixing the text is horrible. AManWithNoPlan (talk) 22:58, 6 January 2019 (UTC)
 * this is the evil bibcode: http://adsabs.harvard.edu/abs/1982MSS...C03....0H AManWithNoPlan (talk) 23:00, 6 January 2019 (UTC)
 * looks like we need to make sure we did not get a bibcode back that is new. AManWithNoPlan (talk) 23:03, 6 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1178 Detect corrupt query, I hope. AManWithNoPlan (talk) 23:22, 6 January 2019 (UTC)
 * No citations are mangled now, at least not in that example. Bibcode 1982MSS...C03....0H is ignored.  Book bibcodes are ignored, except that cite journal templates are changed to cite book templates.  Lithopsian (talk) 14:58, 7 January 2019 (UTC)
 * 1982MSS...C03....0H is defective and we will never expand that. AManWithNoPlan (talk) 20:10, 7 January 2019 (UTC)
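The two checks described in this thread (the returned bibcode must equal the one sent, and book bibcodes are not expanded) can be illustrated roughly as below. This is a sketch, not the code from the pull request; `should_expand` is a hypothetical name, and treating ".book" in the bibcode as the book marker is an assumption based on the section title further down this page.

```python
BIBCODE_LENGTH = 19  # bibcodes are 19 characters long


def is_book_bibcode(bibcode: str) -> bool:
    """Assumption: book records carry '.book' inside the bibcode."""
    return ".book" in bibcode


def should_expand(sent: str, returned: str) -> bool:
    """Decide whether a query result may be used to expand a citation."""
    if len(sent) != BIBCODE_LENGTH:
        return False  # malformed input
    if returned != sent:
        return False  # the database redirected or answered for another record
    if is_book_bibcode(sent):
        return False  # book expansion is not supported in this sketch
    return True
```

With this rule a redirecting record such as 1982MSS...C03....0H is simply skipped rather than having another record's metadata mixed into the citation.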

Fails to convert hdl url
Every handle resolver has to be added separately. https://github.com/ms609/citation-bot/pull/1181 AManWithNoPlan (talk) 18:48, 7 January 2019 (UTC)

New feature request: merge templates into CS1|2
Commonly seen:


 * Bovver boot:


 * Anthony Chenevix-Trench:

On both subscription status is noted with the template, which can be inside or outside the CS1|2.

The better format would be:


 * Bovver boot:


 * Anthony Chenevix-Trench:

The is replaced with url-access and if there is a via argument, with a via in the CS1|2. The subscription template goes by many names. -- Green  C  19:30, 7 January 2019 (UTC)


 * Would also want to only do this if there was one cite template in the ref tag, since one might be applying this to more than one cite template. Given that this is not easily done within the bot’s code, it might be best to make a Bot request.  AManWithNoPlan (talk) 20:02, 7 January 2019 (UTC)
 * I thought about making a bot but not sure it would pass COSMETIC. Understood about matching up is tricky. Will keep the idea in mind. -- Green C  20:23, 7 January 2019 (UTC)
 * Converting data into inline data within a template is really useful in so many ways. AManWithNoPlan (talk) 20:29, 7 January 2019 (UTC)
 * I agree. Started a discussion Template talk:Subscription required. Maybe it will need to be an RfC to 1) make the conversions and 2) change the docs to only use in free-form citations not CS1|2. -- Green  C  20:38, 7 January 2019 (UTC)
 * Is this the same as User talk:Citation bot/Archive 11? (t) Josve05a  (c) 20:38, 7 January 2019 (UTC)
 * yes it is. AManWithNoPlan (talk) 20:41, 7 January 2019 (UTC)
 * What can be said about the advantages of converting? --  Green  C  20:44, 7 January 2019 (UTC)

I will close this item as wontfix and have moved a link to the discussion area above. AManWithNoPlan (talk) 17:02, 8 January 2019 (UTC)

Remove format=pdf and variants when URLs end in .pdf
If you have something like
 * , giving
 * , giving
 * , giving

Citation templates automatically append (PDF) next to the link. So there's no point in having So if you find PDF or similar (e.g. pdf / Portable Document Format / pdf), remove it as pointless. Headbomb {t · c · p · b} 17:41, 5 January 2019 (UTC)
 * , giving


 * I think pdf exist in case the URL does not have an apparent ".pdf", so this suggestion would only be done when the URL has a ".pdf". But I wonder if there is any other reason for using pdf? -- Green  C  18:22, 5 January 2019 (UTC)


 * I find those rather pointless personally, but the above request was for when URLs end in PDF. I'll update the header. Headbomb {t · c · p · b} 18:37, 5 January 2019 (UTC)
 * I agree. To make sure the removal doesn't introduce an unknown problem, maybe some other reason for it to exist, I posted a question Help_talk:Citation_Style_1. -- Green  C  18:56, 5 January 2019 (UTC)

Flag to archive notabug. Moving link above. AManWithNoPlan (talk) 14:03, 7 January 2019 (UTC)


 * De-flag, it's been confirmed redundant and useless. Headbomb {t · c · p · b} 15:05, 7 January 2019 (UTC)
 * that was fast.  I was expecting to hear back in a week or three.  AManWithNoPlan (talk) 15:18, 7 January 2019 (UTC)
 * According to Xover: "An URL ending in ".pdf" can (and not infrequently does) return something other than a PDF." Trappist also brought up the concern that other wiki-languages don't support the PDF icon unless there is format=pdf thus when they copy cites from enwiki they lose this meta information. Those are the two concerns that came up.  --  Green  C  15:50, 7 January 2019 (UTC)
 * Yeah, Headbomb's description there about the result of the discussion is clearly biased or otherwise misleading in its intent. That discussion has not completed at this time. --Izno (talk) 15:51, 7 January 2019 (UTC)
 * a) If a url ending in .pdf returns anything but a PDF, then PDF will STILL be displayed. b) This is the English Wikipedia. Unlike English other wikis can easily implement automatic PDF detection, and would be better off doing so. Headbomb {t · c · p · b} 16:44, 7 January 2019 (UTC)
 * Isn't (a) a reason to remove automatic detection of the PDF format from the module? If that's your intent, best be off to argue for that instead. --Izno (talk) 17:07, 7 January 2019 (UTC)


 * No, the opposite. The only time the landing page will not be a PDF is if the PDF is behind a paywall. Headbomb {t · c · p · b} 18:28, 7 January 2019 (UTC)
 * Auto-detection is useful even if not always accurate (heuristics noted by Xover). -- Green  C  19:14, 7 January 2019 (UTC)

This is the wrong bot for the initial cleanup. Something else needs to fix this and then we can play whack a mole on new ones. Assuming this is a good idea of course. AManWithNoPlan (talk) 19:17, 7 January 2019 (UTC)
 * There wouldn't be any 'initial cleanup' really, it's a cosmetic cleanup, so that's akin to removing . or . It simplifies the edit window and makes references easier/more consistent to edit. Headbomb {t · c · p · b} 21:03, 7 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1190 AManWithNoPlan (talk) 22:18, 8 January 2019 (UTC)
 * fixed AManWithNoPlan (talk) 14:47, 9 January 2019 (UTC)
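As an illustration of the request in this section, the rule "drop format=pdf only when the URL itself ends in .pdf" might look like the sketch below. It is not taken from the pull request: the template is modelled as a plain dict of parameters, and the function name and the set of accepted spellings are assumptions for the example.

```python
from urllib.parse import urlsplit

# Assumption: spellings treated as "PDF", compared case-insensitively.
PDF_FORMAT_VALUES = {"pdf", "portable document format"}


def drop_redundant_pdf_format(params: dict[str, str]) -> dict[str, str]:
    """Return a copy of the parameters without a redundant format=pdf."""
    # Strip surrounding wikilink brackets such as [[PDF]] before comparing.
    fmt = params.get("format", "").strip().strip("[]").strip().lower()
    # Look at the path only, so query strings do not hide the extension.
    path = urlsplit(params.get("url", "")).path.lower()
    if fmt in PDF_FORMAT_VALUES and path.endswith(".pdf"):
        return {k: v for k, v in params.items() if k != "format"}
    return dict(params)
```

A URL without a visible ".pdf" keeps its format parameter, which matches the narrower scope agreed in the thread.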

Don't add journal= to citations with bibcode with a '.book' in them
Something with the bibcode database has gone wonky suddenly. Adding lots of data integrity checks. Obviously more needs done. AManWithNoPlan (talk) 19:08, 7 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1186 AManWithNoPlan (talk) 18:13, 8 January 2019 (UTC)

Does not remove stray dot at the end of pp
https://github.com/ms609/citation-bot/pull/1188 AManWithNoPlan (talk) 18:19, 8 January 2019 (UTC)

Book Reviews added to book citations
https://github.com/ms609/citation-bot/pull/1187 AManWithNoPlan (talk) 18:13, 8 January 2019 (UTC)

Timeout at Deim_Zubeir
It’s an internal php bug. Work around: https://github.com/ms609/citation-bot/pull/1193 AManWithNoPlan (talk) 22:14, 9 January 2019 (UTC)

Fails at Russian passport
wontfix at this time. It does run, just too slowly. AManWithNoPlan (talk) 22:04, 10 January 2019 (UTC)

Deal with both url and chapter-url
I will have to think about this and all the possible combinations AManWithNoPlan (talk) 22:37, 26 December 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/1206 AManWithNoPlan (talk) 00:37, 11 January 2019 (UTC)

2001gpm..book.....L

 * Well that's not a bug. is a book. Headbomb {t · c · p · b} 01:53, 11 January 2019 (UTC)
 * Yes, but it seems to be adding it to all cite journals with Genetics, which is a bug. Link to old edit. I saw the script trying to make this change on a page a few moments before I reported this as well, so it is still doing it. (t) Josve05a  (c) 02:13, 11 January 2019 (UTC)


 * That is just spiffy. Might have to block that bibcode explicitly.  I will investigate later tonight.  Probably will need to search for it and remove it wherever it is.  AManWithNoPlan (talk) 02:45, 11 January 2019 (UTC)


 * Let us all take a moment to ponder Headbomb being wrong about something.  This is a rare event.  Please observe a moment of silence.  🤣😁😂 AManWithNoPlan (talk) 03:58, 11 January 2019 (UTC)


 * Well perhaps if the initial report had included a diff... Headbomb {t · c · p · b} 06:20, 11 January 2019 (UTC)


 * I cleaned up the current uses, btw. The only thing in common they had is they all were concerning citations for various articles of Genetics. Headbomb {t · c · p · b} 06:24, 11 January 2019 (UTC)


 * The cause is that the journal Genetics is not indexed, but this one book has Journal=Genetics set in its record. Thus, any search for journal=genetics gets a hit. AManWithNoPlan (talk) 06:30, 11 January 2019 (UTC)
 * Fix: https://github.com/ms609/citation-bot/pull/1208 AManWithNoPlan (talk) 06:30, 11 January 2019 (UTC)

Wrongly upgrades arxiv, again
https://github.com/ms609/citation-bot/pull/1210 AManWithNoPlan (talk) 06:55, 11 January 2019 (UTC)

Removes URL

 * That's not a bug, that url is redundant with the DOI. Headbomb {t · c · p · b} 00:47, 12 January 2019 (UTC)


 * I beg to differ. The url is an alternate way to the source, independent of the doi. Who said we must not have both? ♦ J. Johnson (JJ) (talk) 01:03, 12 January 2019 (UTC)


 * The style guides and template documentation strongly discourage the use of urls unless they link to a 100% free full copy. Also, URLs that duplicate other identifiers are discouraged even if free. One reason is that with a doi you know you are going to a publisher, a link is without context. AManWithNoPlan (talk) 01:12, 12 January 2019 (UTC)

Bot breaks URL in pages field of citation template by changing hyphen to en dash in hidden URL
This bug was previously reported at and  but apparently was not completely fixed.

This bug may occur in this case because the link is a protocol-relative URL, which is a deprecated link format on Wikipedia. In such cases, citation bot should update the link format instead of breaking the URL with the unfortunate hyphen/dash exchange. Biogeographist (talk) 16:14, 10 January 2019 (UTC)


 * URLs should almost never be modified unless it can issue a GET to verify the new URL works, or in known cases of URL changes. -- Green  C  16:21, 10 January 2019 (UTC)


 * fascinating how a url can be hiding within non url text. Surprising that it took so long for this bug to be reported.  AManWithNoPlan (talk) 17:11, 10 January 2019 (UTC)


 * Ah a PRURL inside an incorrectly placed square-bracket - gigo.  --  Green  C  17:35, 10 January 2019 (UTC)


 * Side bar: people often talk about old-crusty-unreadable code. They say things like: we need to replace this Fortran with C/C++/Java/Go/etc..  Then they do that and discover that the old code was unreadable since 90% of the code was error/exception handling.  The same is true of the Citation Bot: if the template were always used right and they did not have six different names for the exact same parameter, then the bot would be 75% smaller.  This is GIGO, but I think we can prevent the GO half.  AManWithNoPlan (talk) 17:48, 10 January 2019 (UTC)
 * Yes for sure. An infinite tail of exceptions and edge cases -- Green  C  18:52, 10 January 2019 (UTC)


 * New code will ignore PRURL once it is git pulled in and will add the https: when it is the very first characters of the page. AManWithNoPlan (talk) 00:25, 11 January 2019 (UTC)


 * https://github.com/ms609/citation-bot/pull/1212 this needed too. AManWithNoPlan (talk) 17:38, 11 January 2019 (UTC)
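A minimal sketch of the safeguard discussed in this section: convert hyphens between page numbers to en dashes, but leave the value alone as soon as it contains anything that looks like a URL, including a protocol-relative one. The pattern and the function name are assumptions made for the example and are not the bot's actual rules.

```python
import re

# Matches http://, https:// and protocol-relative //host/... links.
URL_PATTERN = re.compile(r"(?:https?:)?//\S+", re.IGNORECASE)

EN_DASH = "\u2013"


def dashify_pages(value: str) -> str:
    """Replace digit-hyphen-digit with an en dash unless a URL is present."""
    if URL_PATTERN.search(value):
        # A hidden URL could be broken by changing its hyphens; do nothing.
        return value
    return re.sub(r"(?<=\d)\s*-\s*(?=\d)", EN_DASH, value)
```

Skipping the whole field is deliberately conservative: a page range next to a link stays unconverted, which is a smaller harm than a broken link.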

Fails to edit/finish on List of gravitationally rounded objects of the Solar System
https://github.com/ms609/citation-bot/pull/1215 AManWithNoPlan (talk) 17:39, 11 January 2019 (UTC)
 * Thank you for reporting these. AManWithNoPlan (talk) 02:03, 12 January 2019 (UTC)
 * fixed AManWithNoPlan (talk) 14:46, 16 January 2019 (UTC)

Fails to edit PageRank
We got it fixed so now it fails on all pages 🙄 AManWithNoPlan (talk) 05:44, 16 January 2019 (UTC)
 * fixed AManWithNoPlan (talk) 14:46, 16 January 2019 (UTC)

Time out on 2016 Turkish coup d'état attempt
wontfix There are so many links, and so many that block us or time out, that it only finishes after a long time, if you (and your web browser) will let it. Probably best to run section by section. AManWithNoPlan (talk) 16:49, 16 January 2019 (UTC)
 * Thanks for looking at it anyways, I'll try it section by section. Redalert2fan (talk) 16:54, 16 January 2019 (UTC)

Time out on List of Flashpoint episodes
Running it in the debugger I find that there are mostly pdf files, which have no usable metadata. Once this pull is implemented https://github.com/ms609/citation-bot/pull/1229/ the bot will have a lot more cites on its "don't try too hard" list. AManWithNoPlan (talk) 17:00, 16 January 2019 (UTC)
 * wontfix AManWithNoPlan (talk) 17:00, 16 January 2019 (UTC)

Invalid last1 and first1
There also is "first=SPIEGEL ONLINE, Hamburg|last=Germany" on the page already which also does not seem to be correct, however this was not added by the bot.


 * yeah, we try not to do too much fixing existing bad data. wontfix AManWithNoPlan (talk) 18:16, 16 January 2019 (UTC)

Fails to add bibcode
They have changed the format to be longer. AManWithNoPlan (talk) 19:45, 13 January 2019 (UTC)


 * It's still 19 characters... ? Headbomb {t · c · p · b} 20:01, 13 January 2019 (UTC)


 * counting on an iPhone is not as easy as I thought AManWithNoPlan (talk) 20:17, 13 January 2019 (UTC)


 * I will look at it after the bibcode searches stabilize AManWithNoPlan (talk) 00:35, 14 January 2019 (UTC)


 * wontfix the doi search fails to return anything. They need to update their data files. AManWithNoPlan (talk) 20:19, 16 January 2019 (UTC)
 * The arxiv id does, however. Headbomb {t · c · p · b} 21:21, 16 January 2019 (UTC)
 * Why is this "2nd rate information"? Bibcode bot and others will add it. Headbomb {t · c · p · b} 21:32, 16 January 2019 (UTC)
 * once we have a doi to search with, we do not search adsabs using arXiv. If the bibcode does not know about the doi, then it is outdated information.  AManWithNoPlan (talk) 21:36, 16 January 2019 (UTC)


 * Not really no, there's a slew of citations, mostly in mathematics, that never get anything but an arxiv bibcode. Headbomb {t · c · p · b} 21:37, 16 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1231 AManWithNoPlan (talk) 00:30, 17 January 2019 (UTC)

Failed to pickup another bibcode
The bibcode title does not match very well, so we reject it. Perhaps we are too picky. AManWithNoPlan (talk) 18:16, 17 January 2019 (UTC)
 * at the very least I should combine the two title checking codes into a function call and remove dashes before doing the compare since bibcodeland seems to eat em dashes and leave an empty plate of white space in its place. AManWithNoPlan (talk) 19:28, 17 January 2019 (UTC)


 * Isn't a doi query enough? I never found any wrong result when querying ADSABS via doi. Headbomb {t · c · p · b} 20:05, 17 January 2019 (UTC)
 * Trust me, some of the way off partial matches we get are nuts and we have to deal with GIGO on our end too. AManWithNoPlan (talk) 21:02, 17 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1233 more forgiving title compare. AManWithNoPlan (talk) 02:45, 18 January 2019 (UTC)

Mostly fixed, but this bibcode is still too different of a title to match. AManWithNoPlan (talk) 18:28, 18 January 2019 (UTC)
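The "more forgiving title compare" idea mentioned above (one shared function, with dashes turned into white space before comparing) could be sketched like this. It is an illustration under stated assumptions, not the code from the pull request: the list of dash characters, the punctuation stripping and the function names are choices made for the example.

```python
import re
import unicodedata

# Hyphen-minus, Unicode hyphens, figure/en/em dashes and the minus sign.
DASHES = "-\u2010\u2011\u2012\u2013\u2014\u2212"
DASH_TO_SPACE = str.maketrans({ch: " " for ch in DASHES})


def normalise_title(title: str) -> str:
    """Lower-case, turn dashes into spaces, drop punctuation, collapse spaces."""
    text = unicodedata.normalize("NFKC", title).lower()
    text = text.translate(DASH_TO_SPACE)
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())


def titles_match(ours: str, theirs: str) -> bool:
    """Single comparison function to be shared by every title check."""
    return normalise_title(ours) == normalise_title(theirs)
```

An exact match after normalisation is still strict; titles that differ in wording, like the remaining case noted above, would continue to be rejected.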

Cite Journal
Why does the bot remove publisher and location from the "Cite journal" template? Especially for magazines that have been published for a long time, these things change and may perhaps be of interest?  Mr.choppers |  ✎  04:20, 19 January 2019 (UTC)
 * please see above discussion links and join in. One might ask why is the publisher information almost always wrong.  You might also ask why do people use cite journal for non-journals such as magazine? AManWithNoPlan (talk) 04:43, 19 January 2019 (UTC)
 * May be of interest is not a worthwhile reason - the citation is for a reference in an article and not intended for a treatise on the magazine itself. If such information is useful, then please wikilink the magazine name and create a nice article for it. AManWithNoPlan (talk) 04:46, 19 January 2019 (UTC)
 * Simply put, because the information is near useless and because no style guide recommends it. Headbomb {t · c · p · b} 06:07, 19 January 2019 (UTC)
 * Please consider contributing to this discussion. --Izno (talk) 15:14, 19 January 2019 (UTC)
 * If it is useless, then why do the parameters exist? I only found out about the existence of "cite magazine" a little while ago, hence the occasional reappearance of "cite journal."  Mr.choppers |   ✎  16:34, 19 January 2019 (UTC)
 * they exist because all the citation templates are based upon the same core code and core documentation. So, there are lots of useless parameters.   AManWithNoPlan (talk) 16:45, 19 January 2019 (UTC)

Flagging for archiving since links exist above notabug. The documentation is lacking considering the publisher location removal has been standard for a decade.

Fails to expand doi
notabug tell them to publish metadata. Seriously it is just a doi.org url. AManWithNoPlan (talk) 04:46, 21 January 2019 (UTC)
 * these non-crossref dois usually have second rate meta data at doi.org, but this one has nothing. AManWithNoPlan (talk) 04:48, 21 January 2019 (UTC)

Can you take-over/merge reFill?
The maintainer of reFill is looking to pass the torch Village_pump_(technical). Is the functionality of reFill already part of Citation bot? I know this tool is very popular though it has a long list of bugs to be worked out and the code base is PHP. -- Green  C  13:16, 8 January 2019 (UTC)
 * We run our own Citoid installation.  He uses Wikipedia’s install.  The Wikipedia install would have to be willing to allow us to hit them much more aggressively than their policy allows, but that would make it easier for us.  We do nothing with combining equivalent references.  AManWithNoPlan (talk) 13:32, 8 January 2019 (UTC)
 * The Wikipedia Citoid is better than ours, so reFill does a better job than ours, which is why I suggest that they whitelist us. AManWithNoPlan (talk) 13:48, 8 January 2019 (UTC)
 * I can't imagine the Wikimedia Foundation opposing such a usage of their Citoid instance. Nemo 21:25, 8 January 2019 (UTC)
 * I have looked at the reFill code base and it appears to not use the Citoid instance, at least not for everything. That is one reason it seems to handle international stuff much better. AManWithNoPlan (talk) 00:19, 11 January 2019 (UTC)
 * https://tools.wmflabs.org/refill-api perhaps we use them AManWithNoPlan (talk) 23:16, 14 January 2019 (UTC)

notabug seems like others are taking it over and a 2.0 version is moving fast. AManWithNoPlan (talk) 16:36, 21 January 2019 (UTC)

Converting bare links to cite journal
I know that some users are tirelessly working on converting bare links to journal articles into cite journal calls (which then citation bot can clean up). What are your preferred ways? Do you have regular expressions or other aids to share for the purpose? I see that a simplistic regex search for DOI URLs in bare links, like, finds several thousands of pages and I'm not sure what's the best way to help. Nemo 18:07, 16 January 2019 (UTC)
 * I usually just search for something like > or search for specific publisher links and try to "fix all" from that domain. (t)  Josve05a  (c) 18:20, 16 January 2019 (UTC)
 * And then you make edits like this by manually adding the basic cite journal call and using the citation expander to do the rest? Nemo 21:27, 16 January 2019 (UTC)
 * On that edit I only used the citation expander. The bot/tool can convert bare refs (with only URL) to the proper cite template, so no need to add basic cite journal fields manually. (t) Josve05a  (c) 15:35, 18 January 2019 (UTC)
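The search patterns used in this thread did not survive archiving, so the following is only a guess at the kind of regular expression meant: it finds ref tags whose entire content is a bare doi.org URL and returns the DOIs. The pattern and the function name are assumptions for illustration; real wikitext has many more variants (bracketed links, trailing text, other resolvers).

```python
import re

# A <ref> (optionally with attributes) containing nothing but a doi.org URL.
BARE_DOI_REF = re.compile(
    r"<ref(?:\s[^>/]*)?>\s*"
    r"(https?://(?:dx\.)?doi\.org/(10\.\d{4,9}/[^\s<\]|]+))"
    r"\s*</ref>",
    re.IGNORECASE,
)


def find_bare_doi_refs(wikitext: str) -> list[str]:
    """Return the DOIs of all bare doi.org references found in the wikitext."""
    return [match.group(2) for match in BARE_DOI_REF.finditer(wikitext)]
```

Pages found this way can then be passed to the citation expander, which converts the bare reference as described above.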

Here are 5000+ examples for whoever is interested: P8007. Nemo 14:49, 18 January 2019 (UTC)

https://github.com/ms609/citation-bot/pull/1236 Any thoughts on this? AManWithNoPlan (talk) 05:49, 20 January 2019 (UTC)

fixed bot now does more AManWithNoPlan (talk) 16:35, 21 January 2019 (UTC)

Fails to add issue?
10.1016/j.agee.2010.07.017 AManWithNoPlan (talk) 18:43, 18 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1238 we still won’t add issue of zero or one AManWithNoPlan (talk) 03:56, 19 January 2019 (UTC)

DOI glitch?
This is a bit of a garbage DOI (someone at science made the doi instead of  like a sane person would), but it's a valid one nonetheless. Headbomb {t · c · p · b} 17:32, 13 January 2019 (UTC)


 * WOW! AManWithNoPlan (talk) 20:18, 13 January 2019 (UTC)


 * WOW! even? Headbomb {t · c · p · b} 21:55, 13 January 2019 (UTC)


 * not that wowing https://github.com/ms609/citation-bot/pull/1225 AManWithNoPlan (talk) 22:32, 13 January 2019 (UTC)

Takes forever to run
Is it just me, or has the bot been considerably slower for about a week? We're talking 30 minutes + to run on articles. Sometimes several hours. Headbomb {t · c · p · b} 03:24, 21 January 2019 (UTC)
 * It’s just you. 😁🤣😂😯. Actually we seem to have gained popularity and load AManWithNoPlan (talk) 03:26, 21 January 2019 (UTC)
 * Well, is there a way to get more server resources? Or a new server? Headbomb {t · c · p · b} 03:27, 21 January 2019 (UTC)
 * I was wondering the same thing earlier today when I could not fully submit patches since they could not test themselves. AManWithNoPlan (talk) 03:30, 21 January 2019 (UTC)
 * I'm not sure, but other tools experience 500 errors due to a buildup of connections and work around the problem by periodically doing a "webservice restart". It seems kubernetes doesn't yet support increasing parallelism. Nemo 11:56, 21 January 2019 (UTC)
 * I can confirm getting a lot of 500 errors. Headbomb {t · c · p · b} 12:09, 21 January 2019 (UTC)
 * There may be a related discussion about Cyberbot at WP:BOTN. --Izno (talk) 13:11, 21 January 2019 (UTC)


 * Seems to be resolved. Maybe it's temporary though. Headbomb {t · c · p · b} 05:32, 22 January 2019 (UTC)


 * all evidence points to it being a toolbar problem. notabug AManWithNoPlan (talk) 13:29, 22 January 2019 (UTC)

Do not automatically add Citeseerx

 * Users are always responsible for the edits of the bot, since they are the ones that asked the bot to make the edit in the first place, so nothing is automatically added. The best way to deal with (the very small number of) copyvios on CiteSeerX is to contact them to take down the offending file (and possibly put a comment in the citeseerx parameter such as citeseerx, although the CiteSeerX page contains more than just the file and the metadata it gives is useful). Headbomb {t · c · p · b} 16:59, 7 November 2018 (UTC)
 * The number of copyvios is not small, because citeseerx copies all sorts of copies of papers — often copies made available for some course by someone else – that are neither author copies nor licensed from the publisher. They may be fair use for a course but that doesn't make them fair use for citeseerx and for us. And if the edit cannot be attributed to the specific user who caused it (and that user convinced or prevented from continuing to make bad edits) or if the process does not involve the user specifically vetting the edits that are made, with a big warning about COPYLINK, then it should not be happening at all. —David Eppstein (talk) 17:23, 7 November 2018 (UTC)
 * since we do not link to the PDF directly, does that make it okay? honest question about how close to the illegal copy do we need to be in order to be evil.  AManWithNoPlan (talk) 18:00, 7 November 2018 (UTC)
 * I doubt it. We're linking to a site whose only purpose is to provide the link. WP:ELNEVER seems unambiguous: "If there is reason to believe that a website has a copy of a work in violation of its copyright, do not link to it." —David Eppstein (talk) 18:07, 7 November 2018 (UTC)
 * So, slightly better, but not better enough. AManWithNoPlan (talk) 18:20, 7 November 2018 (UTC)

They have a takedown link on each page now, and they seem to be within the law as an NSF site http://vondranlegal.com/what-to-do-when-the-federal-government-infringes-your-copyright/ AManWithNoPlan (talk) 15:40, 22 January 2019 (UTC)
 * A funny case of sovereign immunity used to dismiss a copyright violation case brought by a photographer against a state university in USA: Indiana 1:16-cv-02463-TWP-DML and similarly in Kentucky. And Ohio, Indiana, Florida (more elaborately, with consideration of "established state procedure to deprive of property" and due process), Michigan, Michigan again. Nemo 18:09, 22 January 2019 (UTC)
 * goes double for people. The Pope and the queen of England are both exempt from all criminal prosecution worldwide.  They have sovereign immunity at home and diplomatic immunity everywhere else.  AManWithNoPlan (talk) 21:26, 22 January 2019 (UTC)

Why is cite book preferred to cite web for online “books”?
Why is this bot constantly changing cite web to cite book here? I am using the online version of this dictionary, not the paper version. Peacemaker67 (click to talk to me) 23:04, 22 January 2019 (UTC)
 * When we query the website, it gives us publisher information and says “I am a book online”. Many of the journals/books/patents/etc people reference are through websites.  The issue of which template is preferable is another issue.  AManWithNoPlan (talk) 23:26, 22 January 2019 (UTC)
 * The problem is, the online version doesn't give page numbers in the paper edition, and I am not citing a book, I'm citing a web page. I've never seen the book. I don't know why a bot is overriding legitimate editor choice of template. To me it seems a perverse outcome. Peacemaker67 (click to talk to me) 23:49, 22 January 2019 (UTC)


 * See User:Citation bot/use, third example. Headbomb {t · c · p · b} 02:35, 23 January 2019 (UTC)
 * Thanks Headbomb! Peacemaker67 (click to talk to me) 02:55, 23 January 2019 (UTC)
 * I am shocked that works. I forgot I added support for comments in the template type a while ago. I will blame being 35000 feet up in the air.  AManWithNoPlan (talk) 03:15, 23 January 2019 (UTC)

fixed flag for archiving AManWithNoPlan (talk) 05:09, 23 January 2019 (UTC)

Fails to remove location in cite journal
It converts the place to location after the removal of location occurs. AManWithNoPlan (talk) 19:00, 22 January 2019 (UTC)
 * I am prone to leave code as it is. Otherwise we start looping over stuff again and again just for a few obscure edge cases.    AManWithNoPlan (talk) 21:31, 22 January 2019 (UTC)
 * wontfix not a high priority. Requires location and publication-place to both be set.  AManWithNoPlan (talk) 21:21, 23 January 2019 (UTC)

Two issues with book chapter on JSTOR
https://en.wikipedia.org/w/index.php?title=User%3AJosve05a%2Fcite-sandbox&diff=prev&oldid=879725824


 * 1) The link is converted from bare to a doi when it should have been added as a jstor (as well/instead).
 * 2) cite book with chapter should be used, not cite journal

(t) Josve05a  (c) 00:25, 23 January 2019 (UTC)


 * number one causes number two. AManWithNoPlan (talk) 01:32, 23 January 2019 (UTC)


 * much enhanced jstor support added. More coming.  fixed AManWithNoPlan (talk) 15:05, 24 January 2019 (UTC)

Expand bare doi templates
should be treated as 10.1111/jep.12752 (t) Josve05a  (c) 00:34, 23 January 2019 (UTC)


 * That could be generalized to other identifiers too. Headbomb {t · c · p · b} 14:48, 23 January 2019 (UTC)


 * https://github.com/ms609/citation-bot/pull/1255 AManWithNoPlan (talk) 18:48, 24 January 2019 (UTC)


 * fixed AManWithNoPlan (talk) 19:36, 24 January 2019 (UTC)

Bare converting
Meanwhile, I've checked the result of the recent bare ref conversion change and I've not found any mistake, only good edits. Special:Diff/879655266, Special:Diff/879653934, Special:Diff/879649478, Special:Diff/879639416, Special:Diff/879626624, Special:Diff/879617148, Special:Diff/879616120, Special:Diff/879615613, Special:Diff/879614147, Special:Diff/879613874, Special:Diff/879611681, Special:Diff/879611148, Special:Diff/879609896, Special:Diff/879601520, Special:Diff/879598740, Special:Diff/879592444, Special:Diff/879590812, Special:Diff/879581261, Special:Diff/879574435, Special:Diff/879568947, Special:Diff/879566233. Nemo 17:32, 22 January 2019 (UTC)


 * Things like this are fine, but edits like this or this are... iffy and can create WP:CITEVAR issues. Headbomb {t · c · p · b} 18:12, 22 January 2019 (UTC)


 * That said, I do love the feature, but I would restrict it with an API call / additional checkbox in so that usage is intentional, and users are warned to  only use this on articles they plan to fully cleanup citations after the bot. Headbomb {t · c · p · b} 18:22, 22 January 2019 (UTC)


 * Perhaps only do it if there are at least two citation templates on the page already? AManWithNoPlan (talk) 18:32, 22 January 2019 (UTC)


 * Still iffy. The first one is a bare link, but when it tries to reformat a manual citation, you'll get in trouble and some will demand heads on pikes. An 'advanced' checkbox would probably be OK, but by default this is likely too risky. Headbomb {t · c · p · b} 18:36, 22 January 2019 (UTC)


 * At least in Special:PermaLink/879639416, however, the citation ends up using a style consistent with the pre-existing one and all the others, which is why the specific case seemed fine to me. Isn't it? As for the general case, there ought to be a way to check whether the existing references use an inconsistent style or a (non-)style falling outside the realm of "where Wikipedia does not mandate". Nemo 19:03, 22 January 2019 (UTC)

"With the pre-existing ones and all the others?" Hardly so. Headbomb {t · c · p · b} 20:07, 22 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1250 best I can do on a phone in an airport AManWithNoPlan (talk) 21:02, 22 January 2019 (UTC)
 * that pull is now active fixed AManWithNoPlan (talk) 13:49, 25 January 2019 (UTC)

Please remove this jstor proxy
They did not include proxy in their url, annoying. AManWithNoPlan (talk) 18:46, 24 January 2019 (UTC)
 * Will add code to see www-jstor-org.some-stuff/stable/1234 AManWithNoPlan (talk) 18:47, 24 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1258 AManWithNoPlan (talk) 22:41, 24 January 2019 (UTC)
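
For illustration only, the proxy form mentioned above (www-jstor-org.some-stuff/stable/1234) could be recognised along these lines. This is a minimal sketch, not the code from the linked pull: the pattern and the function name `jstor_id_from_proxy_url` are assumptions, and it only handles hosts that begin with `www-jstor-org.`.

```python
import re

# Assumed pattern: the proxy rewrites "www.jstor.org" to "www-jstor-org."
# followed by the proxy's own domain, and keeps the /stable/<id> path.
PROXIED_JSTOR = re.compile(
    r"^https?://www-jstor-org\.[^/\s]+/stable/([^/?#\s]+)",
    re.IGNORECASE,
)


def jstor_id_from_proxy_url(url):
    """Return the JSTOR stable id from a proxied URL, or None if no match."""
    match = PROXIED_JSTOR.match(url)
    return match.group(1) if match else None
```

A plain `https://www.jstor.org/stable/1234` URL is deliberately not matched here, since it is not a proxy link.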

Final Ed

 * I don’t remember why Ed is “ed”.  Might just add the string as a whole. AManWithNoPlan (talk) 13:50, 25 January 2019 (UTC)
 * because of 2nd ed.? (t) Josve05a  (c) 14:10, 25 January 2019 (UTC)
 * that’s right.  AManWithNoPlan (talk) 15:07, 25 January 2019 (UTC)
 * It's also a preposition in some Latin languages I think. Italian maybe. Headbomb {t · c · p · b} 20:22, 25 January 2019 (UTC)
 * In English ED is something else... 🤣😲😂🙄 AManWithNoPlan (talk) 20:25, 25 January 2019 (UTC)
 * What's so funny about ED? Headbomb {t · c · p · b} 20:39, 25 January 2019 (UTC)

Wrong date format

 * Deprecated? In what parallel universe? https://xkcd.com/1179/ Nemo 10:22, 27 January 2019 (UTC)
 * 🤣😂😁 notabug AManWithNoPlan (talk) 14:03, 27 January 2019 (UTC)
 * seriously, that page is flagged to demand the use of the date format that the bot used. I wish everyone would use yyyy-mm-dd for computer stuff.  In my writing I use 7 MAY 2001 format.  AManWithNoPlan (talk) 14:08, 27 January 2019 (UTC)

Publisher removed

 * You will be interested in this current discussion, I imagine. --Izno (talk) 15:56, 27 January 2019 (UTC)
 * As is often the case, the publisher of that journal has changed multiple times. AManWithNoPlan (talk) 17:49, 27 January 2019 (UTC)
 * I should note that there was consensus for over a decade to always remove publishers and recently this consensus is being challenged. I only note this since multiple people incorrectly believe that this is a new feature of the Bot. AManWithNoPlan (talk) 18:00, 27 January 2019 (UTC)

notabug

Subscription site / bad title

 * The correct title is actually on the reference page; it can be found after "Subscribe to the FT to read:", in this case it's: "Hanjin bankruptcy brings chaos but no capacity cut". Not sure if it's feasible/possible to make the bot search for this. It seems to be like this for all Financial Times pages (ft.com), so preventing all links to that site from being edited by the bot is also a possibility. Redalert2fan (talk) 19:39, 26 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1261 AManWithNoPlan (talk) 20:11, 26 January 2019 (UTC)

Invalid ISO dates added
https://github.com/ms609/citation-bot/pull/1262 when we know, we will pad now (after this is live on Wikipedia, of course). AManWithNoPlan (talk) 18:56, 27 January 2019 (UTC)
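
As a rough illustration of what "padding" could mean here, the sketch below zero-pads a numeric year-month-day date and leaves everything else untouched. It assumes the invalid dates were unpadded ones such as 2019-1-5, which this thread does not spell out; `pad_iso_date` is a hypothetical name, not a function from the bot.

```python
import datetime
import re


def pad_iso_date(date):
    """Zero-pad a Y-M-D date; return the input unchanged if it is not one."""
    match = re.fullmatch(r"(\d{4})-(\d{1,2})-(\d{1,2})", date.strip())
    if not match:
        return date  # not a numeric year-month-day date, leave it alone
    try:
        parsed = datetime.date(*(int(part) for part in match.groups()))
    except ValueError:
        return date  # impossible date such as 2019-2-30, do not guess
    return parsed.isoformat()
```

Leaving impossible dates untouched is a design choice in this sketch: it seems safer to keep garbage visible than to turn it into a valid-looking date.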

Bot does not fully report reasons for actions
See bug report template. Both an access date and a complete URL were removed by Citation Bot from a "Cite journal" template. RobDuch (talk) 20:50, 28 January 2019 (UTC)
 * The url removal is described merely as parameters removed. The reason, which the person using the bot would see, is that the url is redundant with the DOI.  AManWithNoPlan (talk) 21:11, 28 January 2019 (UTC)
 * The description of what is done is always hard to write, since it is summarizing possibly 40 changed templates with 100 changes in one line. That is impossible to get right every time. AManWithNoPlan (talk) 22:06, 28 January 2019 (UTC)

More invalid dates added

 * https://github.com/ms609/citation-bot/pull/1264 AManWithNoPlan (talk) 01:18, 29 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1265 also AManWithNoPlan (talk) 17:36, 29 January 2019 (UTC)

CrossRef gives bad last=&Na? Please reject
This is a possible placeholder / shorthand for no authors or N/A. Maybe. Headbomb {t · c · p · b} 05:31, 29 January 2019 (UTC)
 * That makes sense. I will look at it. AManWithNoPlan (talk) 05:52, 29 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1270 AManWithNoPlan (talk) 17:19, 29 January 2019 (UTC)

Odd edits
Citation bot is making odd changes to references like this where it converts a cite journal to a cite book (when the reference in question very much is a journal, not a book) and removes valid publisher information. See also here where the bot simply removed parameters with no discernible reason. Can anyone explain why the bot is doing this? Parsecboy (talk) 12:40, 29 January 2019 (UTC)
 * The first edit is a bit strange, an alleged journal with an ISBN. Worldcat and Google Books seem to indicate that it is a book in a series rather than a typical journal. The bot might be able to be coded to avoid doing what it did, since context is important.


 * Removing the publisher parameter from journal citations is a long-standing feature. I'm not saying I support it, just explaining. See this RFC for more information. – Jonesey95 (talk) 14:44, 29 January 2019 (UTC)
 * So, the bot removed the publisher since it is a journal and changed the type since it is a book. So, is it a bournal or jook?  I am only 90% joking.  The distinction is not always clear between a series of books and a journal.  AManWithNoPlan (talk) 15:28, 29 January 2019 (UTC)

notabug data from databases is not clear. AManWithNoPlan (talk) 14:01, 30 January 2019 (UTC)

New feature coming - more DOIs
There are ten different DOI providers. We have always supported Crossref. We added more recently. Now even more are coming. We also are adding tests for the ones that don’t work so we know if they suddenly start working and can check for bugs. Who knew that movies had dois? And no, we don’t expand the black panther marvel movie doi even with the new code. https://github.com/ms609/citation-bot/pull/1253 AManWithNoPlan (talk) 18:18, 26 January 2019 (UTC)

fixed and running great. AManWithNoPlan (talk) 14:48, 31 January 2019 (UTC)

Citation template changes
Looks like this issue is similar/related to the previously reported bug where cite web was changed to cite journal. What are the criteria with which this bot is changing citation templates from one to another? I think we can assume that most of these templates have been specifically chosen by editors, what is the bot supposed to be "fixing"? Thanks.— TAnthonyTalk 18:43, 30 January 2019 (UTC)
 * the website tells us that it is a book.  I will look into it. Honestly editors generally don’t put much thought into which one they choose. AManWithNoPlan (talk) 18:48, 30 January 2019 (UTC)
 * Well they are online book reviews, not books. And I've found that even sloppy editors can see the difference between cite web and cite book, I do a lot of citation cleanup and have almost never had to change a template like this.— TAnthonyTalk 18:51, 30 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1274 should catch a lot of them once active. AManWithNoPlan (talk) 21:54, 30 January 2019 (UTC)

Washington, D.c
https://github.com/ms609/citation-bot/pull/1275 Will not remove trailing period, if there is another period in the last word. AManWithNoPlan (talk) 03:31, 31 January 2019 (UTC)


 * It should never remove the trailing period in a journal... too much WP:CONTEXT, for cases like Int. J. Med. Sci. or something. Headbomb {t · c · p · b} 20:56, 31 January 2019 (UTC)
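
The rule described for pull 1275 ("will not remove trailing period, if there is another period in the last word") can be sketched as below. This is only an illustration of the stated rule, with the hypothetical name `strip_trailing_period`; it is not the bot's code.

```python
def strip_trailing_period(value):
    """Drop a final period unless the last word contains another period."""
    text = value.rstrip()
    if not text.endswith("."):
        return text
    last_word = text.split()[-1]
    if "." in last_word[:-1]:
        return text  # e.g. "D.C." looks like an abbreviation, keep it
    return text[:-1]
```

Under this rule "Washington, D.C." keeps its period, but "Int. J. Med. Sci." would still lose it because "Sci." contains only one period, which is exactly the journal case raised in the reply above.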

Citoid discus
Citoid usage discussion on MediaWiki.org

wontfix — flag to archive

URLs containing an ISSN-DOI
The DOI is valid and points to the correct journal, but you are right that these ISSN-only DOIs are problematic and should probably be 100% ignored. AManWithNoPlan (talk) 16:57, 29 January 2019 (UTC)
 * Code to simply not ever add those. https://github.com/ms609/citation-bot/pull/1269 AManWithNoPlan (talk) 17:02, 29 January 2019 (UTC)
 * The bot adds the doi because of the ISSN in the URL. However, the doi goes to the journal main page, even if the URL was pointing to another page (e.g. the listing of the editorial board). What was being referenced here was a page on the journal's website, not an article published in the journal, so "cite web" was correct and "cite journal" is not. Note that the ISSN-containing URL has been abandoned by Wiley and pages have gotten new URLs that don't contain the ISSN. The old URLs are still functional; they are redirected to the new (non-ISSN) URL. Ideally, the bot would replace the old URL with the new one, but I have no idea how easy/difficult that is. If it's too hard, the bot should leave these instances alone. --Randykitty (talk) 17:07, 29 January 2019 (UTC)
 * Sorry, but this is not fixed. I just reverted the above diff and ran the bot again, with almost the same result, except that there are now erroneous DOIs and it still is changed to "cite journal"... --Randykitty (talk) 15:15, 30 January 2019 (UTC)
 * The perennial problem with GIGO: there is always another code path AManWithNoPlan (talk) 15:42, 30 January 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1273 AManWithNoPlan (talk) 21:56, 30 January 2019 (UTC)

Much better now. These too will help: https://github.com/ms609/citation-bot/pull/1279 https://github.com/ms609/citation-bot/pull/1277 https://github.com/ms609/citation-bot/pull/1278 https://github.com/ms609/citation-bot/pull/1280 So many small improvements for such a rare problem. AManWithNoPlan (talk) 17:52, 31 January 2019 (UTC)
 * I have looked at those github links, but must admit my ignorance here and have no idea what all that means. Meanwhile, the bot is still doing this . I've corrected a few by hand, but that's quite tedious. This is such a wonderful tool and I really appreciate all the work and effort of you guys to keep this running, so I feel really bad to keep pestering you about this... --Randykitty (talk) 11:46, 2 February 2019 (UTC)
 * until they are merged, they are not alive yet. AManWithNoPlan (talk) 13:59, 2 February 2019 (UTC)

Remove URL if DOI resolves to the same place
Actually this could apply to any URLs that resolve to the same place as the DOI. Headbomb {t · c · p · b} 01:24, 15 January 2019 (UTC)
 * Indeed: according to https://www.crossref.org/blog/urls-and-dois-a-complicated-relationship/, resolving the URL is needed for over a hundred publishers. A simple HTTP request with cookies enabled, plus some custom HTML parsing for the most frequent DOI prefixes, would go a long way. Nemo 14:14, 15 January 2019 (UTC)
 * thoughts on this approach https://github.com/ms609/citation-bot/pull/1260 AManWithNoPlan (talk) 19:07, 26 January 2019 (UTC)


 * What is 'this approach', exactly? Headbomb {t · c · p · b} 19:37, 26 January 2019 (UTC)
 * The linked code. I will convert to English.   Drop url if:

 * Citation is complete
 * The doi is not an ISSN-only doi (points to article not journal)
 * The url hostname is on the list of canonical publishers
 * The url does not contain 'pdf', 'image', 'plate', 'figure', or 'picture'
 * The doi resolves to something
AManWithNoPlan (talk) 18:41, 27 January 2019 (UTC)
 * The title is slightly misleading because this code doesn't check at all whether there is a match, it just relies on whoever has previously compiled the template to have verified and stated an identity between the URL and the DOI. You could argue that if they didn't it's just GIGO (example) and that this assumption works in the large majority of cases but I'd be curious if it's 99 %, 95 % or 80 % or whatever. Maybe I'll run some regex on the dumps so that whoever wants can check a sample of URLs.
 * Speaking of which, it may be helpful to use a slightly different constant than CANONICAL_PUBLISHER_URLS, where several domains are unlikely to be the target of a DOI: for instance link.springer.com receives nearly all DOI redirects, while www.springer.com is more likely to contain journal descriptions where the URL patterns can get tricky. Nemo 20:02, 27 January 2019 (UTC)


 * I have changed the pull to now also check if the doi url matches the url in the template and also if it matches what the url redirects to when actually polled. AManWithNoPlan (talk) 01:20, 29 January 2019 (UTC)
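
Read together, the conditions in this thread amount to a predicate roughly like the one below. It is a sketch under stated assumptions, not the code in pull 1260: the function name, its parameters and the example host set are hypothetical, the network lookups are assumed to have been done by the caller, and "matches the url or its redirect target" is one reading of the last comment.

```python
from urllib.parse import urlparse

# Words taken from the list in this thread; a url containing any is kept.
BLOCKED_WORDS = ("pdf", "image", "plate", "figure", "picture")


def should_drop_url(url, url_final, doi_target, citation_complete,
                    doi_is_issn_only, canonical_hosts):
    """Decide whether a url is redundant with the doi.

    url_final is where the url redirects to (or None if not polled);
    doi_target is where the doi resolves to (or None if it does not).
    """
    if not citation_complete or doi_is_issn_only:
        return False
    if doi_target is None:  # the doi must resolve to something
        return False
    host = (urlparse(url).hostname or "").lower()
    if host not in canonical_hosts:
        return False
    if any(word in url.lower() for word in BLOCKED_WORDS):
        return False
    # Assumed reading of the later check: same place directly or via redirect.
    return doi_target in (url, url_final)
```

The plain substring test also rejects urls that merely contain one of the words inside a longer one (for example "template" contains "plate"), which errs on the side of keeping the url.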

Please try to detect open access database errors
https://github.com/ms609/citation-bot/pull/1294 Will raise the bar. AManWithNoPlan (talk) 05:01, 7 February 2019 (UTC)

Better publisher handling
https://github.com/ms609/citation-bot/pull/1286 documenting improvements fixed

Redlink title-link
I'm not convinced the citation in question belongs in the article at all, but that's beside the point. —David Eppstein (talk) 22:06, 7 February 2019 (UTC)
 * We do not add a new title link, we just convert the inline link to the superior title plus title-link. AManWithNoPlan (talk) 22:24, 7 February 2019 (UTC)
 * Ok, thanks for the clarification. That change seems harmless enough to me. —David Eppstein (talk) 22:29, 7 February 2019 (UTC)
 * It is really hard to see in the diff notabug. AManWithNoPlan (talk) 22:56, 7 February 2019 (UTC)

Changing publisher of a book series into a journal
That is annoying that their meta-data has the publisher listed as journal. I will investigate. AManWithNoPlan (talk) 21:30, 7 February 2019 (UTC)
 * Annoying metadata elsewhere or not, the bot should never turn valid citations on this site into invalid ones. —David Eppstein (talk) 21:35, 7 February 2019 (UTC)
 * Fix one: https://github.com/ms609/citation-bot/pull/1303 AManWithNoPlan (talk) 21:36, 7 February 2019 (UTC)
 * Fix two: https://github.com/ms609/citation-bot/pull/1304 AManWithNoPlan (talk) 21:41, 7 February 2019 (UTC)

arXiv vs eprint
According to the documentation, the bots actions are correct. cite arxiv is an odd beast that does things its own way. AManWithNoPlan (talk) 21:42, 7 February 2019 (UTC)


 * Converting arxiv to eprint could probably be removed at this point, since that dates back to a time where arxiv was not supported. The addition of class to a cite arxiv is fine though. Headbomb {t · c · p · b} 22:32, 7 February 2019 (UTC)


 * While the conversion is technically correct, it is just one more pointless change to tick people off -- or at least confuse. Also, if the citation ever gets upgraded to cite journal we have to convert it back.  AManWithNoPlan (talk) 22:55, 7 February 2019 (UTC)

https://github.com/ms609/citation-bot/pull/1306 AManWithNoPlan (talk) 23:01, 7 February 2019 (UTC)

When expanding preprint into conference paper, deletes the paper title
I don't understand how this one happened. Citation bot did correctly find a publication matching the arXiv preprint. To do so, it must have matched title and authors, because that's the only information in common between the arXiv preprint and the published version. When I ask for bibtex metadata from doi.org, I get

 @incollection{Grier_2013,
   doi = {10.1007/978-3-642-39206-1_42},
   url = {https://doi.org/10.1007%2F978-3-642-39206-1_42},
   year = 2013,
   publisher = {Springer Berlin Heidelberg},
   pages = {497--503},
   author = {Daniel Grier},
   title = {Deciding the Winner of an Arbitrary Finite Poset Game Is {PSPACE}-Complete},
   booktitle = {Automata, Languages, and Programming}
 }

which does correctly include the title of the paper (but not the series). So the information was obviously there. But Citation bot chose to remove it. —David Eppstein (talk) 21:47, 7 February 2019 (UTC)


 * Two points, we check DOIs in this order: 1.  CrossRef       2.  dx.doi.org JSON (not bibtex)     3.  Zotero on the website itself (yuck!).   So, your information is doubly irrelevant, it is not the dx.doi.org JSON, and we use CrossRef.  We get this:  AManWithNoPlan (talk) 22:02, 7 February 2019 (UTC)

978-3-642-39205-4 978-3-642-39206-1  0302-9743  1611-3349  Lecture Notes in Computer Science Automata, Languages, and Programming 7965  Daniel Grier Chapter 42</component_number> <year media_type="print">2013 <first_page>497</first_page> <last_page>503</last_page> <doi type="book_content">10.1007/978-3-642-39206-1_42 <publication_type>full_text</publication_type> <article_title> Deciding the Winner of an Arbitrary Finite Poset Game Is PSPACE-Complete </article_title>


 * Time to dig through the CrossRef parsing code AManWithNoPlan (talk) 22:04, 7 February 2019 (UTC)
 * This should fix it. I included a test too. https://github.com/ms609/citation-bot/pull/1305 AManWithNoPlan (talk) 22:22, 7 February 2019 (UTC)

Removes access date when there is no urls
Get a better url. DOIs do not have access dates. AManWithNoPlan (talk) 02:23, 9 February 2019 (UTC)


 * The assessment date is clear. '2004'. No need for an accessdate. Change 2004 to 30 April 2004 to be more specific. Headbomb {t · c · p · b} 02:50, 9 February 2019 (UTC)

More ISSN DOI
Not sure if this is still happening:. Nemo 10:10, 10 February 2019 (UTC) fixed

better etal handling

 * See the actual test in Module:Citation/CS1 at.
 * The naive suggested implementation above can cause duplicate parameters (as in display-authors is already set and/or happens to be set to the exact number of authors in the list i.e. author1, 2, and display-authors=2 is set), or it can cross over into pages listed in Category:CS1 maint: display-authors. You can find some of the former in the contribution history there. I would say this is a bit context sensitive, which is why it's not an error at this time. might have an opinion. --Izno (talk) 04:22, 7 February 2019 (UTC)
 * E.g. [//en.wikipedia.org/w/index.php?diff=882151093 this one]. Here is another [//en.wikipedia.org/w/index.php?diff=882151249 fixed GIGO] of a different sort i.e. the name separators. You also need or want to catch italics, which I've had a few to do. --Izno (talk) 04:30, 7 February 2019 (UTC)
 * Then for editors, there's also the 2 or 3 different ways to use the parameters. [//en.wikipedia.org/w/index.php?diff=882151552 This one has internal numbers]. A different one may have external numbers with/without dash i.e. |editorfirst1. --Izno (talk) 04:33, 7 February 2019 (UTC)
 * Consider the regex above to be pseudocode for the general idea, rather than finalized solution. Headbomb {t · c · p · b} 04:36, 7 February 2019 (UTC)
 * Sure, just there's some pitfalls to be aware of.
 * Also, one other thing I've been doing in the run is taking care of uses of authors where I see it, which are often used in combination. --Izno (talk) 04:43, 7 February 2019 (UTC)
 * And then there's dumb garbage like [//en.wikipedia.org/w/index.php?diff=882156664 this]. --Izno (talk) 05:18, 7 February 2019 (UTC)
 * Perhaps this is better done as a separate task and not this BOT? AManWithNoPlan (talk) 17:48, 7 February 2019 (UTC)


 * Well GIGO can be handled by a different bot/AWB thing, but cases similar to the ones I linked in the diff should be able to be handled by this bot relatively easily. Headbomb {t · c · p · b} 18:09, 7 February 2019 (UTC)

This will handle the simplest cases: https://github.com/ms609/citation-bot/pull/1302 AManWithNoPlan (talk) 21:22, 7 February 2019 (UTC)

Labour / Le Travail
In general '/' should be treated the same way as ':' is. Headbomb {t · c · p · b} 20:47, 8 February 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1314 AManWithNoPlan (talk) 21:44, 8 February 2019 (UTC)

Bad title
The "dead" page contains "Deze pagina is niet gevonden", which means "this page was not found", while the archived copy is a pdf which does not seem to contain a specific title (other than the file name). Redalert2fan (talk) 22:25, 8 February 2019 (UTC)


 * https://github.com/ms609/citation-bot/pull/1315/files will fix these specific examples once merged.  And this is why lazy webservers that simply say "page not found" but do not set the error code are a bad idea. AManWithNoPlan (talk) 23:06, 8 February 2019 (UTC)


 * I think I found one more for you, exactly the same style of problem. This time it's in Vietnamese diff. title= "Bao phu nu - Đọc báo phụ nữ Việt Nam online tin tức mới nhất 24h" is added. I don't speak Vietnamese but according to google translate this means "Bao phu nu - Read newspaper Vietnamese women online latest news 24h" which seems like a title for the whole website and not for the specific article. By looking at the link a correct title should be something like "Bao-Trung-Quoc-noi-ve-may-bay-tuan-tieu-M28-cua-Viet-Nam". Thanks Redalert2fan (talk) 19:45, 9 February 2019 (UTC)
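
The "page not found with a success status" problem in this thread is often guarded against with a phrase check on the fetched title. A minimal sketch follows, assuming a short hand-maintained phrase list; only the Dutch phrase comes from this thread, the English one and the name `is_usable_title` are illustrative assumptions, and it would not catch the site-wide title in the Vietnamese example.

```python
# Phrases that suggest an error page. Only the Dutch one is from this
# thread; the English one is an illustrative assumption.
BAD_TITLE_PHRASES = (
    "deze pagina is niet gevonden",
    "page not found",
)


def is_usable_title(status_code, title):
    """Reject titles from error responses and from 'soft 404' pages."""
    if status_code >= 400:
        return False  # the server did report an error
    cleaned = title.strip().lower()
    if not cleaned:
        return False
    return not any(phrase in cleaned for phrase in BAD_TITLE_PHRASES)
```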

Fix line feeds in titles
https://github.com/ms609/citation-bot/pull/1317 AManWithNoPlan (talk) 01:31, 9 February 2019 (UTC)


 * Just make sure they're not start/end of line. Headbomb {t · c · p · b} 01:45, 9 February 2019 (UTC)

Miscapitalized journal
Bad metadata for this is so common that we actually have a whole list of capitalization rules and exceptions. In fact it is so bad that we don’t trust the metadata and change the capitalization after we get it. AManWithNoPlan (talk) 14:11, 9 February 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1318 AManWithNoPlan (talk) 15:23, 9 February 2019 (UTC)

Titles from russianplanes.net
Note: this happens with all aircraft pages like this from russianplanes.net. Thanks, Redalert2fan (talk) 20:56, 9 February 2019 (UTC)


 * Cannot fix. Tell Russian to give proper titles.  AManWithNoPlan (talk) 23:14, 9 February 2019 (UTC)
 * I'll have to learn Russian then! haha. On a serious note would it be an option to block the title from being added? Redalert2fan (talk) 23:22, 9 February 2019 (UTC)
 * It is better than no title, so I am not sure. It is technically the correct title.  AManWithNoPlan (talk) 23:28, 9 February 2019 (UTC)

Capitalization: AIAA Journal
https://github.com/ms609/citation-bot/pull/1318 AManWithNoPlan (talk) 23:15, 9 February 2019 (UTC)

Alternative ID used as page number
The full page number is 89017-1–89017-5. So which is more useful? AManWithNoPlan (talk) 00:50, 11 February 2019 (UTC)
 * we use open url API since it allows bots. AManWithNoPlan (talk) 02:05, 11 February 2019 (UTC)
 * it's pubmed actually. https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?tool=DOIbot&email=martins@gmail.com&db=pubmed&id=18317533 AManWithNoPlan (talk) 18:01, 11 February 2019 (UTC)
 * Well, it's not wrong and PubMed has its reasons I guess, so it could be left as is. Nemo 18:15, 11 February 2019 (UTC)
 * we generally leave as is, but bogus preprint page ranges with one as the first are too common. notabug. AManWithNoPlan (talk) 18:18, 11 February 2019 (UTC)

Incorrect date
I cannot reproduce it. Very odd. AManWithNoPlan (talk) 23:28, 9 February 2019 (UTC)
 * Quite interesting, I also tried running it again and for the link it gave "Operation timed out after 10001 milliseconds with 0 bytes received" but in that case last time it probably didn't time out. Thanks for taking a look. Redalert2fan (talk) 23:35, 9 February 2019 (UTC)
 * no wonder I couldn't reproduce it, the bot timed out. AManWithNoPlan (talk) 00:21, 10 February 2019 (UTC)


 * Could always have a "future date" date where anything 2 days in the future doesn't get added. Headbomb {t · c · p · b} 00:28, 10 February 2019 (UTC)
 * lots of magazines and journals have future dates.  Might be best to put off adding them.  AManWithNoPlan (talk) 01:15, 10 February 2019 (UTC)
 * Holy crud strtotime('3 October, 2016') gives that date!!!! AManWithNoPlan (talk) 20:11, 11 February 2019 (UTC)

https://github.com/ms609/citation-bot/pull/1329 AManWithNoPlan (talk) 20:34, 11 February 2019 (UTC)
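
The "nothing more than 2 days in the future" idea suggested above can be sketched as a simple guard. This is not what pull 1329 does (its contents are not described here); `is_acceptable_date` and its `max_days_ahead` parameter are hypothetical.

```python
import datetime


def is_acceptable_date(candidate, today, max_days_ahead=2):
    """Return True unless the date lies more than max_days_ahead after today."""
    limit = today + datetime.timedelta(days=max_days_ahead)
    return candidate <= limit
```

As noted in the replies, magazines and journals legitimately carry future cover dates, so a guard like this would hold back some correct dates as well as the bogus ones.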

Capital om
https://github.com/ms609/citation-bot/pull/1330 AManWithNoPlan (talk) 23:59, 11 February 2019 (UTC)

Removes accessdate for citations with chapterurl
I wonder when that broke? AManWithNoPlan (talk) 00:43, 11 February 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1324 AManWithNoPlan (talk) 02:05, 11 February 2019 (UTC)

Adds No Authorship Indicated
https://github.com/ms609/citation-bot/pull/1326 AManWithNoPlan (talk) 17:05, 11 February 2019 (UTC)

Broken links to www3.interscience.wiley.com
I noticed we have some 1000 links to www3.interscience.wiley.com/cgi-bin/ which seem to all give an HTTP 403 error. Do they work for anyone? Should they be removed? Is it a job for a bot? For this bot or some other? Nemo 09:35, 8 February 2019 (UTC)
 * I just checked them from a computer that has a subscription to a multitude of journals. They do not work and they should be removed. AManWithNoPlan (talk) 15:55, 8 February 2019 (UTC)

wontfix by this bot. Some other bot should grab them all. Verify they are dead and then remove. AManWithNoPlan (talk) 17:01, 11 February 2019 (UTC)

Don't change page numbers to reflect entire range of article

 * This was already partly addressed, perhaps a regression? Some more complicated example which may be useful for additional unit testing: . Nemo 22:21, 13 February 2019 (UTC)
 * This has never been addressed. Addressing it has been discussed and the code is written, but it is not deployed to wikipedia.    AManWithNoPlan (talk) 00:08, 14 February 2019 (UTC)
 * Flag for archiving : Duplicate Issue


 * Not a bug! You appear to be referring to the cite/citation pages parameter, which is supposed to be a range, as appropriate for the full citation. And not the in-source specifier of where specific material is to be found, which is appropriate for individual (and multiple) short-cites within the article.


 * I suspect your complaint stems from, which replaced things like "|pages= 64, 66, 70" and "|pages= 396, 422" with "pages= 55–76" and "pages= 381–429". This is an instance of the perennial trying to "reuse a citation" with "named-refs". The problem is that while the "&lt;ref name=" construction can make a note appear in more than one point in the text, it is still just one note applied to multiple, and usually differing, instances. The proper solution is to use short-cites (such as is done with the harv family of templates), which can be individually customized.


 * The problem here is you don't want to lose the specific page information. Which I think is legitimate. The proper way to preserve that information is put them into short-cites. But that can't be done in the bot, as the correct page number to use at each point in the text is indeterminable. E.g., one of the examples above has three page numbers, and appears in two places. Correct assignment of those page numbers requires comparison of the text with the source at each location. Until someone comes along to do that, I would like to suggest the following: that the incorrect page "range" being replaced be preserved as a comment. Also: we should have a maintenance category for such misplaced in-source specifiers. &diams; J. Johnson (JJ) (talk) 00:23, 14 February 2019 (UTC)
 * It could easily be a bug, in the (not infrequent) case that the doi goes to a collection of smaller articles and the citation goes to an individual one of those smaller articles. For instance, some journals publish collections of book reviews under a single doi, but each review within that collection has its own smaller page range and its own author. Example:
 * I would be quite annoyed if I found Citation bot "fixing" these by expanding the page range to the whole book review column given by the metadata for the doi (pp. 241–247 in this example).
 * Also, putting detailed page information into short-cites only works for citation styles that use both short-cites and long-cites. Because our citation templates are unable to handle it, my usual solution for citing specific material within a longer journal paper is to write it out in untemplated text after the template. —David Eppstein (talk) 00:40, 14 February 2019 (UTC)

dead discussion
should publisher be removed – discussion about the above discussion

fixed - discussion above archives, so archive our link to it

not directly related discussion
merging subscription needed into cite templates

notabug looks like they have it all under control.

Again converts good combination of parameters to bad combination
CS2 sucks. I think I have a solution, I can work on. AManWithNoPlan (talk) 02:29, 11 February 2019 (UTC)
 * I'm sure nothing would be different if the template also had cs1. So it's not the style, but the all-in-one template parameterization that you're complaining about. But that has its advantages, too: for instance, that way you don't have quite as much of a problem with people using cite journal for conference papers. —David Eppstein (talk) 03:13, 11 February 2019 (UTC)
 * Yeah, CS1 encourages people to do wrong, CS2 encourages templates to guess wrong. AManWithNoPlan (talk) 17:02, 11 February 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1327 (I also already have added code to detect this specific instance (ie. it detects that Proc. === Proceedings)) AManWithNoPlan (talk) 17:57, 11 February 2019 (UTC)

url parameter removed
Urls that match the DOI are removed. AManWithNoPlan (talk) 21:23, 13 February 2019 (UTC)
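
For readers unfamiliar with the rule, a minimal sketch of what "the URL matches the DOI" could mean is given here. It is not the bot's actual code: the resolver host list, the function names, and the dictionary representation of template parameters are all assumptions made for illustration.

```python
from urllib.parse import unquote, urlparse

# Assumed list of DOI resolver hosts; the bot's real list may differ.
DOI_RESOLVERS = {"doi.org", "dx.doi.org"}


def url_matches_doi(url: str, doi: str) -> bool:
    """Return True if url is just a resolver link for the given DOI."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in DOI_RESOLVERS:
        return False
    # DOIs are case-insensitive, so compare in lower case.
    return unquote(parsed.path).lstrip("/").lower() == doi.strip().lower()


def drop_redundant_url(params: dict) -> dict:
    """Return a copy of the parameters without url when it duplicates doi."""
    url, doi = params.get("url"), params.get("doi")
    if url and doi and url_matches_doi(url, doi):
        return {k: v for k, v in params.items() if k != "url"}
    return dict(params)
```

Under these assumptions, a url that points anywhere other than a DOI resolver is left alone, since it may carry information the doi does not.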

url with "&" character in search query (books.google.com)
This is probably a Pale Moon browser fault, which apparently doesn't encode the URL properly. On SeaMonkey "&" is encoded as %26, and entering the full URL with an unencoded "&" trims it just like the bot did. (It apparently was a temporary browser glitch, because after testing in Pale Moon, the URL was properly encoded too.) Cause found: automatic cite in Visual Editor decodes %26 in q= to "&" (VisualEditor/Feedback). --MarMi wiki (talk) 19:53, 14 February 2019 (UTC)
 * Thank you for following up. notabug AManWithNoPlan (talk) 20:03, 14 February 2019 (UTC)
 * I wouldn't mind if everything except id= and pg= were trimmed from Google Books links, but I think others disagree. Presumably, because this is a subject of editor disagreement, it shouldn't be overridden by the bot making a choice on what to trim. —David Eppstein (talk) 23:14, 14 February 2019 (UTC)
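
The two points in this thread can be illustrated with a short sketch using Python's standard urllib.parse. It shows why an unencoded "&" inside q= truncates the search term, and what trimming a Google Books link down to id= and pg= might look like. The helper names and the example URLs are hypothetical, and this is not what the bot does.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

# Parameters to keep, following the suggestion in the thread (an assumption,
# not an agreed rule).
KEEP = ("id", "pg")


def query_value(url: str, name: str):
    """Return the first value of a query parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get(name)
    return values[0] if values else None


def trim_google_books_url(url: str) -> str:
    """Rebuild the URL keeping only the parameters listed in KEEP."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    kept = [(key, query[key][0]) for key in KEEP if key in query]
    return urlunparse(parsed._replace(query=urlencode(kept)))
```

With these helpers, `query_value(".../books?id=abc123&q=salt%26pepper", "q")` yields the full term, while the same URL with a raw "&" yields only the part before it, because the parser treats "&" as a parameter separator.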

journal = Methods in Molecular Biology (Clifton, N.j.)
Most likely not fixable, will look at meta data AManWithNoPlan (talk) 02:34, 9 February 2019 (UTC)


 * You could specify an exception for that journal/series. It's really really common, and I need to cleanup about 30-40 conversions from some weird Methods in Molecular Biology → Methods in Molecular Biology → Methods in Molecular Biology (Clifton, N.j) + Methods in Molecular Biology → journal + Methods in Molecular Biology cycle per dump. Headbomb {t · c · p · b} 02:55, 9 February 2019 (UTC)
 * https://github.com/ms609/citation-bot/pull/1332 AManWithNoPlan (talk) 20:07, 12 February 2019 (UTC)

Incorrect Publisher Removed
That’s an interesting question. What should be done when a decade-old consensus is challenged? Should we stop and wait, or what? I don’t know. AManWithNoPlan (talk) 01:26, 10 February 2019 (UTC)
 * Related discussion: Help talk:Citation Style 1. Cunard (talk) 01:33, 10 February 2019 (UTC)
 * we know about that discussion. AManWithNoPlan (talk) 01:39, 10 February 2019 (UTC)
 * your example is funny since the publisher removed is wrong 😁🤣😲🤯 AManWithNoPlan (talk) 01:40, 10 February 2019 (UTC)


 * It almost always is because journals get purchased and repurchased over their history. Sure, the Journal of Foo may be published by the Foo Society today, but in 2 years it might get published by Elsevier. Which means that all instances of Foo Society would need to be changed to Elsevier. This is one of the many reasons why it's completely pointless to include publisher information, against the advice of every style guide out there. Headbomb {t · c · p · b} 04:19, 10 February 2019 (UTC)
 * My ability to assume good faith is stretched pretty much to the limit by the behaviour surrounding CitationBot recently (not yours in particular), but since you asked an apparently sincere question I'll make an effort to answer it in kind.
 * First of all, a consensus is not a consensus if you can't link to it. CitationBot has no bot authorization for removing these parameters, and there is no community discussion supporting removing them. That means that what you have is not a consensus, but mere absence of challenge. And I didn't challenge it back in 2009 because I had no idea CitationBot existed and never saw it edit: if I did I would have challenged it then. The argument that this behaviour has consensus is thus extremely weak. Lack of objections ("implied consensus") is the very weakest form of consensus to begin with, and lack of objection due to obscurity weakens it yet further. It is sufficient to support that CitationBot's behaviour over that time was in good faith, but not sufficient to lean on when objections became evident.
 * In addition, the long standing and strong consensus on Wikipedia, exemplified in BRD and CON etc., is that when any consensus (both strong and weak) is challenged, the status quo prevails until a new consensus is reached. But note what status quo means in this context: article content should remain the way it was and changing it is considered edit-warring, pointy, gaming, and generally disruptive behaviour. This is why I say that, , and you are actually at peril of sanctions here! Once such edits are challenged, all edits should cease until consensus is reached! And in this case, not only are the edits challenged, but the first close of the RfC concluded that the consensus was against making these changes. Under these circumstances, the only constructive and collegial and respectful (of consensus, I mean) thing to do is to disable this function (or rule or module or however it's implemented) until the question is resolved. You should have done that the second the RfC was launched and waited for consensus to emerge, but if it didn't become clear to you sooner it certainly should have at the first close. It's always possible that consensus will turn out in your favour (unlikely at this point, yes, but by no means impossible), in which case you can re-enable the function afterwards and now with an actual consensus to back it up.
 * I'll add that if it is accurate that there's been a significant uptick in removals recently (after the start of the RfC, or, worse, after the first close that indicated consensus was against you) that would actually constitute using automated editing to enforce your preference against consensus and would have to end up at the drama boards. I really really hope that isn't the case, because the project never wins when that happens (at best we just limit the damage).
 * But that's why I say my ability to assume good faith is stretched to the breaking point where CitationBot is concerned: at every single crossroads its proponents make the choice concomitant with "What can I get away with?" and "How can I furthest advance my preference in spite of those pesky other editors?" and "I know better than those other editors that whine and complain.". I have so far seen not a single instance where the choice indicated any kind of respect for other editors or community consensus processes. It doesn't even matter if the community is wrong, by whatever standard you choose to apply: consensus and cooperation and respect for others' opinions is the fundament of how Wikipedia functions.
 * So apologies for the wall of text, but I really want CitationBot to succeed, because the state of citations on the project is shockingly bad and in desperate need of improvement. But not at the expense of fundamental pillars of the project. And all this, currently, over optional parameters that do no harm, even when used incorrectly, and are required in relatively few instances; and merely because they offend the sensibilities of a few (that is, the case against is essentially a style issue, much like whether commas or full stops separate datums in citations). Strident advocacy may appear to lead to "success", for CitationBot, in the short term; but in the long term it pretty much only leads to disruption, drama, and more loss of editors that we cannot afford. Please reconsider your (collective) priorities and mode of interaction with the wider community: I would love to be a cheerleader for CitationBot, but absent at least some measure of humility towards the community, that just cannot be. --Xover (talk) 08:32, 10 February 2019 (UTC)


 * The bot is user-activated. If you don't want the bot to remove publishers because of a misguided belief that this information belongs there, don't use the bot. Or put a comment in the publisher field. Headbomb {t · c · p · b} 08:55, 10 February 2019 (UTC)
 * Case in point. --Xover (talk) 09:46, 10 February 2019 (UTC)
 * "Consensus needs a link" is a common fallacy, to the point Consensus disproves it in the first sentence: «Editors usually reach consensus as a natural process [...] Consensus is a normal and usually implicit and invisible process» (cf. 2009).
 * Personally I wish this feature wasn't there, because I think very few people care about it either way, but I accept that it's been there for a long while for a reason. As for the rest, maybe the more discussions there are the more popular a tool becomes (and vice versa)? Nemo 11:24, 10 February 2019 (UTC)
 * Just asserting that something is a fallacy does not make it so. That certain forms of consensus can be presumed from "implied consensus" does not mean all consensus must be implied or even that all consensus can be implied. And this is the second time I've had to ask you to refrain from strawman arguments: I even acknowledge implied consensus in the message you presumably read since you're replying to it, and explain why "implied consensus" is not sufficient foundation for mass automated edits against explicit consensus. Even the very policy you cite (selectively) explains that an implied consensus does not hold once challenged: at which point you're supposed to engage in consensus building before editing further. --Xover (talk) 15:49, 10 February 2019 (UTC)

I don’t have a strong opinion, I am here to code. Wow! That’s a lot of explanation! My one opinion is that people should remove publisher and location (which are almost always wrong, sadly) and wikilink to a page about the journal, making it if needed: a permanent fix that makes Wikipedia better and everyone happy. I just find it funny that pretty much everyone who complains is pointing to journals with incorrect publishers listed or journals so obscure that even that information won’t help much. AManWithNoPlan (talk) 14:06, 10 February 2019 (UTC)
 * My apologies: I misunderstood the intent of your previous comment. Since it was phrased as a question and accompanied by a direct indication that you lacked knowledge, I took it to mean that you were soliciting answers to the apparent question. In light of your more recent comment I realize that was not the case. I shall bother you no further with either information or attempts to engage in constructive dialogue. --Xover (talk) 15:49, 10 February 2019 (UTC)
 * it was a real question. The wow! was a real response of being impressed.  AManWithNoPlan (talk) 17:30, 10 February 2019 (UTC)

Removing only when there's a unique identifier (https://github.com/ms609/citation-bot/pull/1323) seems a good way to address everyone's concerns. Nemo 10:20, 11 February 2019 (UTC)


 * I agree that's a workable way forward. Headbomb {t · c · p · b} 00:13, 14 February 2019 (UTC)

Another bad title
I have fixed this specific link with IABot and added the correct title myself. Redalert2fan (talk) 18:59, 13 February 2019 (UTC)
 * Returning HTTP 200 for what's in effect a deleted website is nasty. Is this a temporary state? Nemo 19:22, 13 February 2019 (UTC)
 * According to this tweet on 22 July 2018 the company decided to stop their activities. So this is a permanent state. Trying any link to any page from japakomusic.com redirects to http://japakomusic.com/cgi-sys/suspendedpage.cgi . Redalert2fan (talk) 20:56, 14 February 2019 (UTC)

Cosmeticbot issue
Note, also, in an edit the bot made earlier this month, it altered the same citation but without changing the spacing... so I'm not sure why it made the change as a separate edit a couple of weeks later. EdChem (talk) 14:21, 15 February 2019 (UTC)
 * The Bot does some mostly cosmetic changes and some very important changes. On rare occasions the changes are all cosmetic. Since this is very rare, we do not track changes and then not make the edit if only cosmetic changes are made.  AManWithNoPlan (talk) 14:50, 15 February 2019 (UTC)
 * The Bot does white space normalization. There are quite a few white space characters that we convert to spaces, and the last step is combining multiple spaces into one so that the wiki text matches the rendering.  AManWithNoPlan (talk) 14:57, 15 February 2019 (UTC)


 * Agreed that this should be avoided on its own when it's just regular spacing if possible, but at the same time, the coding complexity for it might be too much. Normalizing other spacing (like converting invisible non-breaking spaces to regular spaces) has enough advantages to do it on its own though. Headbomb {t · c · p · b} 17:30, 15 February 2019 (UTC)


 * It is only 99 44/100% cosmetic since it improves the editor's view of the page by making the editable text more in line with what is displayed. Humor intended.    AManWithNoPlan (talk) 19:40, 15 February 2019 (UTC)
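
The two-step normalization described in this thread (convert unusual white space characters to plain spaces, then collapse runs of spaces) can be sketched roughly as follows. The character set is an assumption and is not the bot's actual list. The `is_cosmetic_only` helper is likewise hypothetical: it shows one way an edit could be checked for being white-space-only, which the thread says the bot does not currently do.

```python
import re

# Assumed sample of "exotic" white space: no-break space, en/em/thin/hair
# spaces, narrow no-break space, ideographic space.
EXOTIC_SPACES = "\u00a0\u2002\u2003\u2009\u200a\u202f\u3000"
_EXOTIC_RE = re.compile("[" + EXOTIC_SPACES + "]")
_MULTI_SPACE_RE = re.compile(" {2,}")


def normalize_spaces(text: str) -> str:
    """Convert exotic white space to spaces, then collapse runs of spaces."""
    text = _EXOTIC_RE.sub(" ", text)
    return _MULTI_SPACE_RE.sub(" ", text)


def is_cosmetic_only(before: str, after: str) -> bool:
    """True if the two texts differ only in the white space handled above."""
    return before != after and normalize_spaces(before) == normalize_spaces(after)
```

Newlines and tabs are deliberately left alone in this sketch, since they can be significant in wikitext.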

Bot does not detect bad wiki code
Really hard to see in that diff, but I think this will do it. At the very least, it will crank down the greediness. https://github.com/ms609/citation-bot/pull/1343 AManWithNoPlan (talk)


 * I do not know if that is fixable.  Note that the template does not END! AManWithNoPlan (talk) 23:26, 16 February 2019 (UTC)


 * Fixed on page. https://en.wikipedia.org/w/index.php?title=Frederick_the_Great&type=revision&diff=883687734&oldid=883684916 AManWithNoPlan (talk) 23:27, 16 February 2019 (UTC)
 * Ah, right. In some of those regular expressions we could add the newline to the excluded characters (I would hope no DOI includes a newline!) but a broken template call is a broken template call... Nemo 23:32, 16 February 2019 (UTC)
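
A small sketch of the idea in the last comment, excluding the newline from the characters a DOI match may consume: both patterns below are hypothetical simplifications and not the bot's actual regular expressions, and the sample text is invented to mimic a template call that never ends.

```python
import re

# Hypothetical pattern: stops only at "|" or "}", so it can run across lines.
GREEDY_DOI = re.compile(r"doi\s*=\s*(10\.\d{4,9}/[^|}]+)")
# Same pattern with the newline added to the excluded characters.
BOUNDED_DOI = re.compile(r"doi\s*=\s*(10\.\d{4,9}/[^|}\n]+)")

# A broken template call: the closing braces are missing.
BROKEN = "{{cite journal |title=Foo |doi=10.1234/abc\nNext paragraph of article text."


def extract_doi(pattern: re.Pattern, text: str):
    """Return the captured DOI candidate, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None
```

On the broken sample, the first pattern swallows the following paragraph, while the second stops at the end of the line. This limits the damage but, as noted above, does not repair the broken template call itself.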