User talk:Citation bot/Archive 10

Citoid finds doi, Citation bot does not
When expanding from a raw JSTOR URL (http://www.jstor.org/stable/3363372), the bot does not find DOIs, while Citoid does. Can we somehow call the same resources as Citoid?
 * Citation bot


 * Citoid (+Citation bot afterwards)

(t) Josve05a  (c) 21:18, 30 August 2018 (UTC)
 * We do not add DOIs that are not in CrossRef at this time. AManWithNoPlan (talk) 23:56, 30 August 2018 (UTC)
 * Two more comments. It adds nothing, since it is just a JSTOR stable ID DOI.  Also, we based our JSTOR code on Citoid's, so they have nothing that we don't.  AManWithNoPlan (talk) 00:18, 31 August 2018 (UTC)

notabug

cite journal -> cite book wtf?
Looks like a false positive, but I can't reproduce from the citation alone. Did you get any clue from the bot's output as to what was happening here? Can you reproduce from the page? Martin  (Smith609 – Talk)  18:25, 24 August 2018 (UTC)
 * API gives...
 Checking AdsAbs database
 > AdsAbs search 3476/50000: title:"Music and Connectionism"
 + Adding bibcode: 1994ASAJ...96.1218T
 + Adding journal: Acoustical Society of America Journal
 - Dropping parameter "publisher"
 + Adding volume: 96
 + Adding issue: 2
 + Adding pages: 1218
 + Adding doi: 10.1121/1.410341
 Headbomb {t · c · p · b} 18:30, 24 August 2018 (UTC)
 * So, it found a review of the book. Probably matching on name alone.  AManWithNoPlan (talk) 19:21, 24 August 2018 (UTC)

Whitespace issue
{{bot bug | what should happen = Retain pre-existing whitespace. There will be blood! }}
 * status = {{tl|fixed}} not sure when, but it is
 * reported by = Martin  (Smith609 – Talk)  15:07, 21 August 2018 (UTC)
 * what happens = Replacement of {Cite arxiv with {Cite journal modifies whitespace
 * link showing what happens = https://en.wikipedia.org/w/index.php?title=Black_hole&diff=prev&oldid=855893224
 * how to replicate the bug =

Removal of trailing full stop
This should apply to all such abbreviations (unspaced or S.H.I.E.L.D. or spaced R. G.), plus a small list of words like "Inc., Ltd." Headbomb {t · c · p · b} 14:14, 27 August 2018 (UTC)
 * I do not know why the bot even does this. There are just too many cases when it should be there. AManWithNoPlan (talk) 14:58, 27 August 2018 (UTC)
 * Actually if you leave the abbreviations alone, there are very very few false positives left. Headbomb {t · c · p · b} 16:32, 27 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/703/files AManWithNoPlan (talk) 00:27, 1 September 2018 (UTC)
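
The rule suggested above (leave abbreviations alone, strip the full stop otherwise) could be sketched roughly as follows. This is only an illustration in Python, not the bot's actual PHP code from the pull request; the function name `strip_trailing_period` and the word list `KEEP_PERIOD_WORDS` are hypothetical, and the list only holds the two words mentioned in the thread.

```python
import re

# Hypothetical short list of words that keep their trailing period.
KEEP_PERIOD_WORDS = {"Inc.", "Ltd."}

# Value ends in a single capital letter plus a period. This covers both the
# unspaced form ("S.H.I.E.L.D.") and the spaced form ("R. G.").
INITIAL_AT_END = re.compile(r"(?:^|[\s.])[A-Z]\.$")


def strip_trailing_period(value: str) -> str:
    """Drop one trailing full stop unless it belongs to an abbreviation."""
    text = value.rstrip()
    if not text.endswith("."):
        return text
    last_word = text.split()[-1]
    if last_word in KEEP_PERIOD_WORDS or INITIAL_AT_END.search(text):
        return text  # the period is part of an abbreviation, keep it
    return text[:-1]
```

A value such as "Plan B." would keep its period under this sketch, which is the cautious direction: it errs toward leaving text alone rather than removing a period that belongs there.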

Caps: I
That will take a special case for the journal name. AManWithNoPlan (talk) 13:28, 30 August 2018 (UTC)


 * https://github.com/ms609/citation-bot/pull/699

Caps: per
https://github.com/ms609/citation-bot/pull/710 AManWithNoPlan (talk) 02:45, 2 September 2018 (UTC)

volume / issue demixupification
See for a small sample of what is screwed up. The regex would catch more cases though. Headbomb {t · c · p · b} 23:53, 22 August 2018 (UTC)
 * I think if issue is set, then look for and remove (issue).

If not set, then look for ^([A-Z0-9]+)(\([0-9].\))$ Thus volumes are numbers and capitals. Issues start with numbers. AManWithNoPlan (talk) 03:45, 2 September 2018 (UTC)


 * Except for all the issues that don't (e.g. 'Suppl. 1', 'Fasc. 1', 'Special Issue'). If the issue is set, skip this fix. There's likely a problem with the citation, but it's not something the bot could reliably fix. (e.g. if you have weird stuff in issue volume number/year/pagenumber chances are you'll have weird stuff everywhere in volume/issue/page). My regex has been fairly well tested in User:CitationCleanerBot, and I don't recall running into any issue with it. The only thing I can't make it do with AWB is clean up the volume if issue is already set and = $6, because I'm skipping on "if issue is set", which you presumably could do with citation bot (if ≠ $6, should be skipped, per above). Headbomb {t · c · p · b} 11:19, 2 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/713 AManWithNoPlan (talk) 15:16, 2 September 2018 (UTC)
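
For readers following the logic, the combined rule from this thread might look something like the Python sketch below. It is not the code from the pull request. The pattern is an assumed variant of the one quoted above (the quoted form is kept as posted), and `split_volume_issue` plus the plain dict of parameters are hypothetical stand-ins for the bot's template object.

```python
import re

# Assumed variant of the pattern quoted above: a volume made of capitals and
# digits, then a parenthesised issue that starts with a digit.
VOLUME_WITH_ISSUE = re.compile(r"^([A-Z0-9]+)\s*\(([0-9][^()]*)\)$")


def split_volume_issue(params: dict) -> dict:
    """Return a copy of params with "12(3)" in volume split into volume and issue."""
    result = dict(params)
    match = VOLUME_WITH_ISSUE.match(result.get("volume", "").strip())
    if not match:
        return result
    volume, issue = match.group(1), match.group(2).strip()
    existing = result.get("issue", "").strip()
    if existing and existing != issue:
        # Issue is already set to something else: too ambiguous, leave untouched.
        return result
    result["volume"] = volume
    result["issue"] = issue
    return result
```

Issues that do not start with a digit ('Suppl. 1', 'Special Issue') never match, so those citations are left for a human to sort out.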

cite article
I had to do this to get the bot to do this. (t) Josve05a  (c) 12:57, 23 August 2018 (UTC)
 * too many citation templates AManWithNoPlan (talk) 23:10, 23 August 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/709 AManWithNoPlan (talk) 23:37, 1 September 2018 (UTC)

fixed

eJournal of...

 * Let's see how recurrent this issue is, and re-evaluate if adding individual exceptions becomes unmanageable. Martin  (Smith609 – Talk)  18:29, 24 August 2018 (UTC)

Well, eLife has about 375 uses on Wikipedia WP:JCW/E12, and eJournal / e-Journal appear a crap ton. (Note that they will often display as ELife / EJournal /E-Journal due to how JL-Bot presents that information.) So most could probably be handled with an exception for eLife / eJournal / e-Journal. Headbomb {t · c · p · b} 18:36, 24 August 2018 (UTC)
 * My suggestion would be to (somehow) code so that "{lowercase}{Uppercase}bar" (such as iPhone, eLife, aJournal) not be cap/case-adjusted. (t) Josve05a  (c) 19:54, 24 August 2018 (UTC)
 * On top of eLife, e-?Journals?, there's also bioRxiv, eNeuro, engrXiv, ePlasty, e?Prints?, eVolo, hprints, mAbs, mBio, mSphere, mSystems. Headbomb {t · c · p · b} 13:37, 25 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/699 AManWithNoPlan (talk) 18:27, 1 September 2018 (UTC)
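
A rough Python sketch of the "{lowercase}{Uppercase}bar" idea floated above, combined with a short exception list for the names it misses. It is illustrative only and not taken from the pull request; `capitalize_first_letter` and `EXCEPTIONS` are hypothetical names, and the exception list holds only names mentioned in this thread.

```python
import re

# One lowercase letter, one uppercase letter, then more letters:
# iPhone, eLife, eNeuro, mBio, mSphere.
LOWER_THEN_UPPER = re.compile(r"^[a-z][A-Z][A-Za-z]+$")

# Names from the thread that the pattern above does not catch
# (hypothetical exception list).
EXCEPTIONS = {"bioRxiv", "engrXiv", "hprints", "e-Journal"}


def capitalize_first_letter(name: str) -> str:
    """Upper-case the first letter of a journal name unless its first word is protected."""
    first_word = name.split(" ", 1)[0]
    if first_word in EXCEPTIONS or LOWER_THEN_UPPER.match(first_word):
        return name
    return name[:1].upper() + name[1:]
```

The pattern handles most of the list on its own, so only a handful of names need an explicit exception.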

Arrows and not always quotes
Unless there is a pair of them, we should not assume these are quotation marks; they may in fact be arrows, as here. (t) Josve05a  (c) 13:52, 27 August 2018 (UTC)
 * You are correct, they seem to be misused often as arrows. &laquo;  &#171;  «   &#187; &raquo;  » AManWithNoPlan (talk) 21:11, 1 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/708 AManWithNoPlan (talk) 19:53, 2 September 2018 (UTC)
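
The "only when paired" rule could be sketched like this in Python. It assumes the conversion target is straight double quotes, which is a guess about the bot's behaviour, and `straighten_paired_guillemets` is a hypothetical name rather than anything in the bot's code.

```python
import re

# An opening guillemet, some text containing no guillemets, then a closing one.
PAIRED_GUILLEMETS = re.compile(r"«\s*([^«»]+?)\s*»")


def straighten_paired_guillemets(title: str) -> str:
    """Replace «...» pairs with straight double quotes; leave lone marks alone."""
    return PAIRED_GUILLEMETS.sub(r'"\1"', title)
```

A lone » used as an arrow (as in a breadcrumb like "Home » News") never matches, so it passes through untouched.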

Wrongly sets class in generic citation template
https://github.com/ms609/citation-bot/pull/720 AManWithNoPlan (talk) 21:58, 3 September 2018 (UTC)

Errors detected in PMID search (SimpleXMLElement Object(   [FieldNotFound] => 161:SPASCN )
Not sure if this is a bug or not, but it feels odd to see an error message, so I wanted to confirm what it meant....
 > Extracting information from SICI
 > Found and used SICI
 [..> rifydoi] > Checking that DOI 10.3417/1055-3177(2006)16[161:SPASCN]2.0.CO;2 is operational... DOI ok.
 . Initial authors exist, skipping authorlink in tidy
 . Initial authors exist, skipping authorlink in tidy
 . Initial authors exist, skipping authorlink in tidy
 . Initial authors exist, skipping authorlink in tidy
 > Checking AdsAbs database
 > AdsAbs search 4720/50000: doi:"10.3417/1055-3177(2006)16[161:SPASCN]2.0.CO;2"
 > AdsAbs search 4721/50000: pub:"Novon: A Journal for Botanical Nomenclature" year:2006 issn:1055-3177 volume:"16" page:"161–167"
 [..> indpmid] > Searching PubMed...
 - Errors detected in PMID search (SimpleXMLElement Object ( [FieldNotFound] => 161:SPASCN ) ); abandoned. nothing found.
(t) Josve05a  (c) 14:33, 22 August 2018 (UTC)
 * Expand citation: Solanum perlongistylum and S. catilliflorum, New Endemic Peruvian Species of Solanum, Section Basarthrum, Are Close Relatives of the Domesticated Pepino, S. muricatum
 * pubmed does not support URLs with square brackets, except when they do🙄.  If someone can figure out how to do this more reliably we are all ears.  AManWithNoPlan (talk) 15:23, 22 August 2018 (UTC)
 * There is a reason square brackets got banned from DOIs.  Pretty much killed SICI too.  AManWithNoPlan (talk) 15:24, 22 August 2018 (UTC)
 * I have a trouble ticket submitted to pubmed about how to search for these evil DOIs AManWithNoPlan (talk) 02:20, 2 September 2018 (UTC)
 * wontfix sucks AManWithNoPlan (talk) 00:22, 5 September 2018 (UTC)
 * We will no longer attempt this search though: https://github.com/ms609/citation-bot/pull/724  AManWithNoPlan (talk) 03:48, 5 September 2018 (UTC)
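
In spirit, a guard of this kind is a one-line check before the PubMed lookup. The Python sketch below is an assumption about the general shape, not a copy of the pull request; `should_search_pubmed_by_doi` is a hypothetical name.

```python
def should_search_pubmed_by_doi(doi: str) -> bool:
    """Return False for empty DOIs and for DOIs containing square brackets.

    Bracketed DOIs (typically SICI-style ones) are the case that triggered
    the FieldNotFound error reported in this thread.
    """
    return bool(doi) and "[" not in doi and "]" not in doi
```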

Question to the maintainers
What is preferable? Filing bug reports and feature requests here, or on GitHub (as issues)? (t) Josve05a  (c) 14:13, 26 August 2018 (UTC)
 * Personally (not a maintainer), I prefer here, since I don't need to register on another site, we've got access to familiar wikitext, plus it's easier to link / browse / search / track issues, and we have watchlists. It's also where the bot summaries say to report bugs, and if you report something here it lets others know the issue was reported. If you report it on GitHub, then I additionally need to check GitHub to make sure I'm not filing a duplicate bug report. Headbomb {t · c · p · b} 15:28, 26 August 2018 (UTC)
 * I was just thinking that it is easier to connect code fixes (pulls) with issues, and search for issues on GitHub (and see which are fixed/to be fixed). And other free software coders may be able to find reported issues and help out. Plus all ’contributors’ to an issue get notified when updates to their reported issue are made. I’m not promoting the usage of one or the other, just asking if one was preferred or not from a maintenance POV. (t) Josve05a  (c) 15:34, 26 August 2018 (UTC)
 * I prefer here only. Most people cannot post issues on GitHub.  AManWithNoPlan (talk) 17:34, 26 August 2018 (UTC)
 * This page is definitely more accessible for bug reporters, which is our principal aim. But as a maintainer, I see a number of advantages to GitHub issues: firstly, I'm more likely to spot them in a timely fashion; secondly, they are integrated with GitHub edits, so it's less overhead to keep track of what has been fixed, and it's possible to link code edits to the issue that has motivated them; thirdly, it's much easier for me to see which issues would benefit from my attention (particularly on occasions when ClueBot III is down!).  So I personally would encourage bug reporters who are comfortable doing so to report bugs on GitHub, so long as it doesn't cost them additional time or inconvenience – but certainly want anyone to feel welcome to submit bug reports in whichever format suits them best.  That said, it is a rare thing for me to have much time to contribute to the bot's maintenance, so the preferences of AManWithNoPlan are more pertinent than my own! Martin  (Smith609 – Talk)  08:51, 27 August 2018 (UTC)
 * emergency should go to both. AManWithNoPlan (talk) 22:13, 27 August 2018 (UTC)
 * notabug archive AManWithNoPlan (talk) 14:19, 5 September 2018 (UTC)

Vietnam War page fails
I think that it is too big. AManWithNoPlan (talk) 20:18, 24 August 2018 (UTC)
 * [Edit conflict] Was saying the same thing. I wonder what the easiest fix is here?  Increase the server's timeout? Martin  (Smith609 – Talk)  20:18, 24 August 2018 (UTC)
 * Either that, or perhaps process each section of the page separately or something (as a batch). (t) Josve05a  (c) 20:19, 24 August 2018 (UTC)
 * fractional pages is something one can do by hand. Annoying but doable.  AManWithNoPlan (talk) 22:28, 26 August 2018 (UTC)
 * It is not a bug in citation bot. https://bugs.php.net/bug.php?id=45735 This is the line that seg faults in Page.php     while(preg_match($regexp, $text, $match)) AManWithNoPlan (talk) 14:35, 5 September 2018 (UTC)
 * notabug Flagging and moving to GitHub for us to remember and think about. AManWithNoPlan (talk) 15:33, 5 September 2018 (UTC)

parses arxiv data incorrectly when page numbers are huge
https://github.com/ms609/citation-bot/pull/664 AManWithNoPlan (talk) 16:35, 28 August 2018 (UTC)

Maybe live, maybe not: Headbomb {t · c · p · b} 00:45, 2 September 2018 (UTC)

Adds bibcode to cite arxiv (and also an extra eprint to cite arxiv)
https://github.com/ms609/citation-bot/pull/716 https://github.com/ms609/citation-bot/pull/715 AManWithNoPlan (talk) 00:22, 3 September 2018 (UTC) https://github.com/ms609/citation-bot/pull/717 AManWithNoPlan (talk) 01:27, 3 September 2018 (UTC)

Duplicate journal name
The decision to include the stupid generic work in the citation templates is a bane to bots everywhere. AManWithNoPlan (talk) 14:28, 28 August 2018 (UTC)
 * Is there a reason not to use work for all citation templates? It is a global/generic parameter which works for all citation templates, and journal, website etc. are just synonyms. (t) Josve05a  (c) 16:27, 28 August 2018 (UTC)
 * work is vague and unclear to most people. What's a work? It can be a book title, a conference proceeding titles, a journal title, a website, ... journal or website or whatever is clear and cannot be confused. Headbomb {t · c · p · b} 16:44, 28 August 2018 (UTC)
 * Exactly! It can be anything, and the user doesn't have to specify the specific type of work, that should be done with the template (such as or . It's much easier to use the parameter work. It looks terrible when the bot changes "cite news [...] BBC News" to BBC News instead of changing it to  and keeping work. Now it creates more work for editors, both to change from website to work or newspaper, and to, instead of just to  which still results in the same output. (t)  Josve05a  (c) 17:08, 28 August 2018 (UTC)
 * Work is confusing AF for most people. If people are citing a journal, they go "this is the journal's name", not "this is the work's name". Recognizable and understandable parameter names are important. Maybe cite web should be excluded from renaming 'work', because all sorts of crap gets put in there, but everywhere else 'work' should be purged. Headbomb {t · c · p · b} 17:20, 28 August 2018 (UTC)
 * work is not the wrong choice all the time, but almost anytime you see it used, it is the wrong choice AManWithNoPlan (talk) 19:05, 29 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/719 AManWithNoPlan (talk) 03:30, 3 September 2018 (UTC)

Changing "work" to "website"
In this edit, all the bot does is replace the cite web parameter  with. The template's documentation says they are aliases. Even if the bot was also doing something useful, these changes clutter up the diff screen for a style that is not in any way preferable. Why is the bot changing these? — Bilorv(c)(talk) 16:51, 5 September 2018 (UTC)
 * Yes, agree completely! Have reported it as a bug... the "work" parameters should be left untouched. —Joeyconnick (talk) 20:19, 5 September 2018 (UTC)
 * flagging as fixed to archive discussion since all the real action is in the bug report AManWithNoPlan (talk) 21:25, 12 September 2018 (UTC)

Use on non-Wikipedia wikis?
Is it possible to use this on MediaWiki installations that are not part of wikipedia.org? — Omegatron (talk) 01:49, 7 September 2018 (UTC)
 * Theoretically Yes, but you would have to run it yourself and remove the en.wikipedia.org stuff and deal with authentication etc. AManWithNoPlan (talk) 16:29, 9 September 2018 (UTC)
 * I proposed some changes to aid in this. Still a long way off.  https://github.com/ms609/citation-bot/pull/743  AManWithNoPlan (talk) 21:21, 9 September 2018 (UTC)
 * we have done what we easily can for now. come back again later if still interested when we have fewer bugs to deal with wontfix AManWithNoPlan (talk) 21:26, 12 September 2018 (UTC)

API: Category refinements
Much, much better. However, it could still be a bit better: When you start, you have -- [12:13:02] Processing page '2018 FFA Cup preliminary rounds' — edit—history This should be simplified to -- [12:13:02] Processing page '2018 FFA Cup preliminary rounds' – edit – history This eliminates redundancy and, the spaces help + use endashes. When no changes are required, you have This should be simplified to When you have a change, you have Written to Peace Pledge Union The Peace Pledge Union (PPU) is a British pacifist ... This should be simplified to (with a line break after "Writing to Peace Pledge Union...")
 * Processing page '{2018 FFA Cup preliminary rounds}' : 12:13:01
 * 1) No changes required.
 * 1) No changes required.
 * 1) Writing to Peace Pledge Union...
 * 1) Writing to Peace Pledge Union...

The Peace Pledge Union (PPU) is a British pacifist ... And when you end with history / last edit This could be much simpler/clearer with diff Headbomb {t · c · p · b} 12:16, 21 August 2018 (UTC)

wontfix not going to pollute code with all sorts of "if ( running a category) then " code. AManWithNoPlan (talk) 21:56, 12 September 2018 (UTC)

Add JSTOR links
Not sure if there is a reason why this isn't done yet, but would it be possible to add JSTOR links in cases where this isn't already added? For example, source 12 in Brachiosaurus has a doi, but I know it is also on JSTOR, so shouldn't the bot be able to cross check? FunkMonk (talk) 00:22, 14 September 2018 (UTC)

wontfix nope. Not searchable. JSTOR disabled that years ago. AManWithNoPlan (talk)
 * Don't be this guy https://en.wikipedia.org/wiki/Aaron_Swartz AManWithNoPlan (talk) 01:18, 14 September 2018 (UTC)
 * jstor -> data <-> doi. there is no jstor to doi mapping. AManWithNoPlan (talk) 02:13, 14 September 2018 (UTC)
 * Not true. https://www.jstor.org/openurl?doi=10.2307/455826 > https://www.jstor.org/stable/455826. Some doi mapping exists, just does not work for dois with brackets etc. in them from what I can tell... (t) Josve05a  (c) 07:41, 14 September 2018 (UTC)


 * This is a jstor-specific doi, which is mapped JSTOR --> DOI. You still can't query the JSTOR database by DOI, and get a JSTOR match. Headbomb {t · c · p · b} 12:06, 14 September 2018 (UTC)


 * You do not need to query a database to map that DOI. That's JSTOR's DOI prefix.  AManWithNoPlan (talk) 13:11, 14 September 2018 (UTC)
 * wrong again. when we plug your doi into that url we find nothing.  

AManWithNoPlan (talk) 13:19, 14 September 2018 (UTC)


 * We might be able to find a few things, but we need an ISSN. AManWithNoPlan (talk) 13:24, 14 September 2018 (UTC)

https://support.jstor.org/hc/en-us/articles/115005079047-JSTOR-OpenURL-Linking-
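
To make the "that's JSTOR's DOI prefix" point concrete: for JSTOR-issued DOIs like the one in the example above (10.2307/455826 and stable/455826), the mapping is plain string handling with no database lookup. A minimal Python sketch follows; the function names are hypothetical, and it only covers purely numeric stable IDs under the 10.2307 prefix. It says nothing about publisher DOIs (which is what the original request was about), and a DOI derived this way is not guaranteed to be registered in CrossRef.

```python
import re
from typing import Optional

JSTOR_DOI_PREFIX = "10.2307/"
STABLE_URL = re.compile(r"^https?://www\.jstor\.org/stable/(\d+)$")


def jstor_doi_to_stable_url(doi: str) -> Optional[str]:
    """Map a JSTOR-prefixed DOI to its stable URL, or None for any other DOI."""
    if not doi.startswith(JSTOR_DOI_PREFIX):
        return None
    stable_id = doi[len(JSTOR_DOI_PREFIX):]
    if not stable_id.isdigit():
        return None
    return "https://www.jstor.org/stable/" + stable_id


def stable_url_to_jstor_doi(url: str) -> Optional[str]:
    """Map a numeric JSTOR stable URL to the corresponding JSTOR-prefixed DOI."""
    match = STABLE_URL.match(url)
    return JSTOR_DOI_PREFIX + match.group(1) if match else None
```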

tiny font bug
Fixed

FYI, I noticed, which is not valid CSS. The property is. --Izno (talk) 22:34, 9 September 2018 (UTC)


 * https://github.com/ms609/citation-bot/pull/750 AManWithNoPlan (talk)

Misuse of Format

 * In that same edit: Full text, Accepted manuscript, Submitted manuscript are all inappropriate uses of format (not a 'new' parameter); see documentation.
 * —Trappist the monk (talk) 09:41, 24 July 2018 (UTC)

diff The bot continues to add format with inappropriate values, in this case. The purpose of the 'format' parameters is to identify for the reader the file format of the linked source, PDF, XLS, DOC, etc. (see the documentation).
 * This should make you happier. https://github.com/ms609/citation-bot/pull/513 AManWithNoPlan (talk) 14:18, 9 August 2018 (UTC)

But the just-as-inappropriate case 'submittedVersion': $format = 'Submitted manuscript'; break; case 'acceptedVersion': $format = 'Accepted manuscript'; break; remain. Headbomb {t · c · p · b} 14:26, 9 August 2018 (UTC)
 * Sorry, it doesn't. None of Full text, Accepted manuscript, Submitted manuscript are appropriate in format ever.  Read the template documentation.  The only thing that belongs in format is the electronic file format: PDF, XLS, DOC, MP3, etc.
 * —Trappist the monk (talk) 14:30, 9 August 2018 (UTC)
 * I said happier, not happy. Make a pull request to comment out the other two and discuss with the maintainer.   My change was a no brainer.  AManWithNoPlan (talk) 18:06, 9 August 2018 (UTC)
 * Here's a request that someone pulls this. Headbomb {t · c · p · b} 21:39, 9 August 2018 (UTC)
 * Perhaps a question for the template page rather than here: but how, if not through 'format', ought a link to a pre-print be indicated? If my institution has access to a full text, I want to be using the DOI link rather than scrubbing through an unformatted preprint, but if the title links to a formatted PDF, I'd rather click that and avoid navigating a paywall.  So I think it's worth indicating the destination of the URL. Martin  (Smith609 – Talk)  07:26, 11 August 2018 (UTC)
 * We should probably have a preprint-url that appends "preprint" at the end of the template. Headbomb {t · c · p · b} 13:50, 11 August 2018 (UTC)
 * To be honest, I have no idea why adding (PDF) to a link is useful. Only would make sense to me in the case of (Proprietary CAD program file) AManWithNoPlan (talk) 15:23, 11 August 2018 (UTC)

Unless it is a file format, nothing should be added by the bot in format. Think what you want about the existence of such a parameter, but don't misuse it. (t) Josve05a  (c) 20:29, 24 August 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/771  by Josve05a
 * https://github.com/ms609/citation-bot/pull/780 also fixes old esots AManWithNoPlan (talk) 20:53, 14 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:04, 20 September 2018 (UTC)

Cite web for Google Books
Should perhaps have been fixed with https://github.com/ms609/citation-bot/pull/652 but in https://en.wikipedia.org/w/index.php?title=Vallejo_%28ferry%29&diff=prev&oldid=856647186 the bot converted raw Google Book URLs to. (Is this just because I'm using the gadget tool now and not the user script, and that pull fixes are delayed, or is this a new error?) (t) Josve05a  (c) 18:17, 26 August 2018 (UTC)
 * All tools use the same code base (unless you specify citations-dev in the URL). AManWithNoPlan (talk) 18:27, 26 August 2018 (UTC)
 * The Bot does not think that it’s a book because there is no evidence that it is a book: isbn etc. AManWithNoPlan (talk) 20:54, 27 August 2018 (UTC)
 * Except you know... being on Google Books. Headbomb {t · c · p · b} 01:44, 28 August 2018 (UTC)
 * All things on Google Books are books. --Izno (talk) 01:56, 28 August 2018 (UTC)
 * Didn't say it was a guarantee. Not sure what's best in those cases. Cite journal/magazine would be ideal, obviously, but failing that should we have a cite web or cite book. Cite web is really the shittiest of templates for citations, a sort of 'when all else fails' type of thing. Question is it is better to have magazines as books, or books as websites, which of the two are more widespread? Headbomb {t · c · p · b} 02:29, 28 August 2018 (UTC)
 * I'd rather use citation for when the bot does not know if a Google Books URL (and only those for now) is a book or other kind of media, since it isn't the webpage itself you reference, but the media it describes. Citoid uses citation widespread in cases such as this. (t) Josve05a  (c) 15:56, 3 September 2018 (UTC) (t)  Josve05a  (c) 15:55, 3 September 2018 (UTC)

wontfix for now. We have some improvements coming. AManWithNoPlan (talk) 17:08, 20 September 2018 (UTC)

I disagree with the Consensus that drives the bot's actions
Women's liberation movement in North America Don't even know where to start. Changing isbn numbers from those given in source viewed, changing publishing information or deleting publisher, removing publishing location, all distort and are incompatible with accuracy of citation. As a historian, this bot failed to improve any of the citations and distorted the accuracy of information about source and material. Improving citations is always welcome, but deleting information which identifies sourcing accurately is not worthwhile. SusunW (talk) 04:16, 8 September 2018 (UTC)
 * Exactly same comments apply to Women's liberation movement. Inaccurate data, or changes to data, which dilute sourcing is unacceptable. SusunW (talk) 04:22, 8 September 2018 (UTC)
 * Actually, those are all in line. Citing journals never includes location or publisher information in any style guide etc..., and ISBN 13 are preferred over ISBN 10. Headbomb {t · c · p · b} 11:26, 8 September 2018 (UTC)
 * The second biggest problem with those articles is all the google books links that need deleted.  The Reference section reads like a paid shill for google wrote it.  The biggest problem is the use of session specific urls that lead to nothing-not even google previews. AManWithNoPlan (talk) 13:28, 8 September 2018 (UTC)
 * AManWithNoPlan, I have never and will never accept pay for writing an article for Wikipedia. Your comment is unfounded and way off base, as a group of editors, none of whom received any pay, wrote the article(s) using sources with links that were available to them in their areas. As these multiple editors live in locations around the globe, they may well have access to sources you are unable to access. That neither makes the links invalid nor advertisements for google. I am uninterested in discussing your edits further, as your manner is very aggressive and accusatory. as a historian, where you got it includes the publisher information and location to facilitate others in finding the material. As someone who does not live in the global north, it is often impossible to find a source shown on a web search without knowing the publisher/location. Going to the publisher's website, one can oft times locate the article of interest or ask for it to be provided. If the source gives 10 digits, it is a revision to make it 13; however, that is a minor issue. SusunW (talk) 16:05, 9 September 2018 (UTC)
 * The discussion would be facilitated if you provided diffs of specific edits as examples. Otherwise we may end up with everyone looking at different parts of the elephant. &diams; J. Johnson (JJ) (talk) 22:12, 8 September 2018 (UTC)
 * I never accused you of being a paid shill. I said it read like that (believe me, I have over google books articles and been chastised for it).  If you are referencing specific pages in a book, then please list page numbers and link to those pages in google.  If you are not linking to specific free pages then you should not include the google links and should let isbn be the link out.  As for 13 vs 10 on ISBN, Wikipedia style guides say they should be converted; it is equivalent to adding 1 to a USA phone number to specify country code - no real change.  AManWithNoPlan (talk) 16:17, 9 September 2018 (UTC)
 * ISSN seems to be the Wikipedia approved method instead of publisher + location.  This bot automatically removes issn once a doi is added.  AManWithNoPlan (talk) 16:20, 9 September 2018 (UTC)
 * "Seems to be"? If there is no explicit basis for that then it becomes an unsanctioned and questionable alteration. And I would object to replacing an explicit identifier of a periodical with an identifier where the publisher might adopt a naming scheme for articles that does not expressly identify the periodical. Or does so in their own peculiar way, so that across a range of DOIs the only general way of identifying a periodical is to look at the record for the article. And  the ISSN is no longer available if a specific DOI becomes unavailable. I don't know that ISSNs should be required, or even routinely added, but where an editor sees fit to add one it should not be removed. &diams; J. Johnson (JJ) (talk) 21:56, 9 September 2018 (UTC)
 * my mistake. I only remove ISSN if it adds the doi.  AManWithNoPlan (talk) 22:30, 9 September 2018 (UTC)
 * You missed my point: I'm saying don't remove an ISSN even if you add a DOI. &diams; J. Johnson (JJ) (talk) 21:26, 10 September 2018 (UTC)


 * Thank you, you seem to grasp the problem. As I stated, I do not live in the global north. What someone in the US/UK, etc may have access to, does not mean that others can access it. Tying a citation to one instance of a document, like the DOI, limits accessibility. There are very often multiple access points to a single document. Giving broader information on where the source can be found improves the ability of both writers and readers to obtain source materials. If I am writing an anchor article, such as WLM, removing links to citations or altering them to links that I may not be able to access makes it far more difficult to write the biographies of the redlinked people in the article. While citing a DOI may allow someone who cannot access the link I used to access the source, it is also likely that it won't, leaving them searching for another point of access. I get that technicians and writers/researchers don't speak the same language, but we can always try to understand each other if our goal is improvement of the encyclopedia. SusunW (talk) 02:51, 10 September 2018 (UTC)
 * For journals, including Publisher and Location is considered over-linking and thus considered to be incorrect. This is especially true for things such as J Chem Phys or J Phys Chem where they are super easy to find online.  For obscure journals (which seem to be very common in the articles you are discussing--seriously they are hard to find if at all), it might be best to include some extra contact information in the id since en.wikipedia.org has decided to remove publisher and location for all journals.  AManWithNoPlan (talk) 03:35, 10 September 2018 (UTC)
 * the Consensus i believe is over a decade old. I was able to find proof a while back that the function was not new 7 years ago.  AManWithNoPlan (talk) 03:56, 10 September 2018 (UTC)
 * and as a long time editor and person interested in Scientometrics, academia, technical writing and library science, I can tell you that absolutely no one ever includes journal locations and publishers when citing journals. No style guide recommends doing so out there, and for good reason: The information is pointless and doesn't help anyone locate anything. Whether it's me in Canada, or someone in Djibouti, no one will look up e.g. Signs and ever need the information that it's published in New York or Syracuse or Philadelphia or London or Milan or Chicago to read the article or access them. Likewise, that Journal of Physics is published by the IOP is not information anyone needs to care about when accessing those articles. If you're at a library, journals aren't catalogued by publishers. If you're online, you use DOIs and websites. The only time the location or publisher might ever be useful is if you have two or more journals named the same way, e.g. Open Medicine, and then you're better off disambiguating them via ISSN. Headbomb {t · c · p · b} 11:37, 10 September 2018 (UTC)
 * for the last decade, I have lived in various places in Latin America and the Caribbean. In no place that I lived was there a public library, so going into a library and "looking it up" is an impossibility. Most publishings from this region are not listed in World Cat (or digitized), as possibly you have seen me post at Women in Red. When repeatedly I have sourcing from RS stating that various persons have published 200-300 books and articles and there are 0 entries in Google Scholar, Scopus, PubMed, World Cat, or any other compilation system, you understand that it is part of the skew of systemic bias toward the global north. While "absolutely no one" might use this information in your location, the publisher and location are the primary way that I find sources from journals. If I can find the publisher, I can often back into the ISSN, if there is one, or sometimes find an accessible link. With an ISSN I can determine if there is a library which holds the work and ask Megalibrarygirl to try to find it. If that doesn't work, I try to find a Wikipedian who lives in that area to photograph the source or I write to the publisher to see if they will copy and send it to me. Yes, it is a lot of work, but it is the reality in much of the world. Open knowledge platforms should help overcome the difficulties of research, not reinforce the problems, but we cannot do that if people refuse to recognize that the diversity in the world doesn't often make single solutions viable. SusunW (talk) 13:57, 10 September 2018 (UTC)


 * Again, that's what the DOI is for (and occasionally ISSNs when the journal isn't obvious). And if something isn't listed in Scopus/Google Scholar/whatever it's not knowing that something is published in Madrid vs Chicago vs Montreal vs Shanghai that will help you find it. brings things closer to how citations should be presented (possibly with misfires when cite magazine / cite web should be used rather than cite journal), and that's a good thing. Headbomb {t · c · p · b} 15:38, 10 September 2018 (UTC)

The real solution is wiki linking the journal name and making an article about the journal. AManWithNoPlan (talk) 15:47, 10 September 2018 (UTC)


 * Have you considered that journals so obscure that the publisher's name and location would be useful are most likely non-notable? ♦ J. Johnson (JJ) (talk) 21:29, 10 September 2018 (UTC)
 * If that is true, then they might fail to meet wikipedia's requirement of verifiability and thus should not be used as reference (this is mostly a joke). Seriously, a single central article will allow people all over the world to find this journal.  AManWithNoPlan (talk) 22:25, 10 September 2018 (UTC)


 * Not at all. You have confused WP:V with WP:N. Verifiability requires reliability, but neither is the same criterion as notability. The latter is about "significant attention by the world at large and over a period of time". So it is quite possible to have a journal (likely very specialized) that is well-known and highly respected in a narrow field of experts, and has published articles relevant to some topic, but which has not gained "significant attention by the world at large". WP:V does not require a WP:N source. ♦ J. Johnson (JJ) (talk) 00:14, 11 September 2018 (UTC)


 * I know the difference; that’s why I said mostly joking. But links from references to the journal would indicate notability.  Just because a journal is published by Adair county library and ribs joint does not mean it is not notable.  90% of Wikipedia fails the notability test.  AManWithNoPlan (talk) 01:48, 11 September 2018 (UTC)

notabug AManWithNoPlan (talk) 14:04, 20 September 2018 (UTC)

Don't forget to add author dots
The meta data does not have them. That’s the problem. AManWithNoPlan (talk) 22:15, 27 August 2018 (UTC)
 * Bot logic in those cases: If in any of first# in a citation you find the pattern, replace   with   in all other first# found in the citation.
 * Could be retroactive on existing uses of first# too. It's a really really widespread problem. Headbomb {t · c · p · b} 01:49, 28 August 2018 (UTC)
 * Harry S Truman would not approve, but we can fix this. AManWithNoPlan (talk) 16:25, 28 August 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/760  AManWithNoPlan (talk) 03:10, 12 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:05, 20 September 2018 (UTC)

Edit summary when expanding raw url
In https://en.wikipedia.org/w/index.php?title=Khanate_of_Kazan&diff=prev&oldid=856050147 the bot expands from a URL to a, which is amazing! However, the bot should mention this in the edit summary somehow. (t) Josve05a  (c) 14:38, 22 August 2018 (UTC)
 * report to user https://github.com/ms609/citation-bot/pull/764 AManWithNoPlan (talk) 22:23, 12 September 2018 (UTC)
 * report to edit summary https://github.com/ms609/citation-bot/pull/765 AManWithNoPlan (talk) 22:23, 12 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:07, 20 September 2018 (UTC)

When running from a category, mention the category
this edit was triggered via With the edit summary This should instead be
 * Alter: title. You can use this bot yourself. Report bugs here. | Headbomb
 * Alter: title. You can use this bot yourself. Report bugs here. | Headbomb running on Category:Livestock stubs


 * https://github.com/ms609/citation-bot/pull/763 AManWithNoPlan (talk) 21:37, 12 September 2018 (UTC)


 * fixed AManWithNoPlan (talk) 14:09, 20 September 2018 (UTC)

API: Make output more user friendly
When you run the page, you are presented with
 * Follow Citation bot’s progress below.
 * More details | Bot’s recent edits | Report bugs | Source code





This would be much clearer/less intimidating if it was something like
 * Follow Citation bot’s progress below.
 * How to Use / Tips and Tricks | Bot’s recent edits | Report bugs | Source code





Headbomb {t · c · p · b} 00:55, 30 August 2018 (UTC)


 * https://github.com/ms609/citation-bot/pull/772 AManWithNoPlan (talk) 23:25, 13 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:09, 20 September 2018 (UTC)

OCLC url → OCLC parameter
https://github.com/ms609/citation-bot/pull/741  AManWithNoPlan (talk) 03:02, 9 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:10, 20 September 2018 (UTC)

Running the bot again results in new changes being made
Running the bot multiple times after each edit on the same page results in new edits being made. All possible edits should be done before saving the article. (t) Josve05a  (c) 21:33, 26 August 2018 (UTC)
 * Edit 1: https://en.wikipedia.org/w/index.php?title=Matilde_Marcolli&diff=prev&oldid=856672723
 * Edit 2: (1 min later; no edits in between): https://en.wikipedia.org/w/index.php?title=Matilde_Marcolli&diff=prev&oldid=856672850
 * (In edit 2 the bot performed an edit which is a bug, reported above as User_talk:Citation_bot.) (t) Josve05a  (c) 21:35, 26 August 2018 (UTC)
 * There was a major refactoring of the code in the last few days. That must be why.  The code is more efficient now.  It used to check things again and again and again.  AManWithNoPlan (talk) 02:35, 27 August 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/749 AManWithNoPlan (talk) 21:18, 9 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:11, 20 September 2018 (UTC)

Better arxiv url recognition
https://github.com/ms609/citation-bot/pull/711 AManWithNoPlan (talk) 02:54, 2 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:11, 20 September 2018 (UTC)

Bot can't decide which dash to use
I ran into the same issue. Sometimes I run the bot through an article twice because it appears in multiple reference cleanup required sections, and I notice that the bot would add a page number with a regular hyphen (-), then clean it up later with an en dash (–). Examples are and. If the intention of the bot is to have en dashes for page numbers, maybe it could do that when adding the page number so it does not have to make a subsequent edit. -- AquaDTRS (talk) 20:07, 6 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/747 AManWithNoPlan (talk) 21:01, 9 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:12, 20 September 2018 (UTC)
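For illustration only, the "use the en dash at the time the page number is added" idea could be sketched roughly as below. This is a Python sketch, not the bot's actual PHP code from the pull request; the function name `normalize_page_range` and the choice to touch only purely numeric ranges are assumptions.

```python
import re

EN_DASH = "\u2013"

def normalize_page_range(pages: str) -> str:
    """Return `pages` with a hyphen between two numbers replaced by an en dash.

    Hypothetical helper: only simple numeric ranges such as "13-25" are
    changed; anything else (e.g. "S12-4") is returned untouched.
    """
    return re.sub(r"^\s*([0-9]+)\s*-\s*([0-9]+)\s*$", rf"\1{EN_DASH}\2", pages)
```

Applying this before the value is written into the citation would avoid the second, dash-only edit described above.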

remove website=Google for books
Which is better via= or delete? AManWithNoPlan (talk) 22:32, 2 September 2018 (UTC)
 * Delete, imo. (t) Josve05a  (c) 22:35, 2 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/773 AManWithNoPlan (talk) 23:32, 13 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:12, 20 September 2018 (UTC)

In CS1 templates (i.e. all but Template:Citation), remove postscript =.
What are your thoughts on removing an empty postscript from citation as well, since it does nothing? AManWithNoPlan (talk) 21:35, 8 September 2018 (UTC)
 * with postscript
 * without postscript
 * Code in progress https://github.com/ms609/citation-bot/pull/740 AManWithNoPlan (talk) 21:47, 8 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:12, 20 September 2018 (UTC)

Try and fix broken dois
In case a DOI does not resolve (i.e. is broken/inactive), check if the DOI has more than one forward-slash. If it does, remove the second and all content after it. Real example:  to. If it resolves and gives matching metadata, replace the doi field. (t) Josve05a  (c) 01:49, 6 September 2018 (UTC)
 * Alternatively, find and match snippets such as  and remove them, if at the end of a broken DOI. (t)  Josve05a  (c) 01:50, 6 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/737 AManWithNoPlan (talk) 21:33, 8 September 2018 (UTC)
 * Same for  and  . Headbomb {t · c · p · b} 13:41, 18 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:13, 20 September 2018 (UTC)
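A rough sketch of the trimming rule proposed above, in Python rather than the bot's PHP, and not taken from the pull request. The `resolves` callback is a hypothetical stand-in for "the DOI resolves and gives matching metadata", and the DOIs in the usage note are made-up placeholders.

```python
from typing import Callable, Optional

def trim_broken_doi(doi: str, resolves: Callable[[str], bool]) -> Optional[str]:
    """Try to repair a broken DOI by cutting it at the second forward slash.

    `resolves` is an assumed callback that returns True when a DOI resolves
    (and, ideally, its metadata matches the citation). Returns the working
    DOI, or None when neither the original nor the trimmed form works.
    """
    if resolves(doi):
        return doi  # nothing to fix
    parts = doi.split("/")
    if len(parts) > 2:
        # Keep the prefix and first suffix segment; drop the second slash
        # and everything after it.
        candidate = "/".join(parts[:2])
        if resolves(candidate):
            return candidate
    return None
```

For example, with a resolver that only knows the placeholder `10.1000/abc123`, the input `10.1000/abc123/full` would be trimmed to `10.1000/abc123`.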

Don't remove PDF URLs simply because it has a DOI in its path
Should we really remove url simply because the URL has a known doi in it? I think if the URL is a PDF file, it is worth keeping since it is linking to the journal article directly (as open source). We don't always remove url when doi is present, only if that specific URL happens to have the DOI in its path. Either we should always delete the URL, or never in my own opinion, but if we should, we shouldn't do so when the URL is a PDF. (t) Josve05a  (c) 22:28, 27 August 2018 (UTC)
 * I'm not quite following the argument here; why is a PDF with a DOI in its URL any more likely to be open source than an HTML page with a DOI in its URL? Martin  (Smith609 – Talk)  07:11, 28 August 2018 (UTC)
 * Sorry, my bad. Not open source - open access. If it is a PDF-link it most likely links directly to a freely available version, while the identifiers (such as DOI) might link to a paywall. (t) Josve05a  (c) 13:13, 28 August 2018 (UTC)
 * I'd be interested to see an example of a URL that contains a DOI where the DOI resolves to a paywall, but the URL leads to an open-access PDF. Martin  (Smith609 – Talk)  06:00, 1 September 2018 (UTC)
 * I’m sure the OABOT project has examples. (t) Josve05a  (c) 10:28, 1 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/704 AManWithNoPlan (talk) 01:38, 1 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:14, 20 September 2018 (UTC)

Converts cite book to cite journal erroneously
Gotta love bad meta data. The bibcode has a journal parameter. AManWithNoPlan (talk) 02:46, 28 August 2018 (UTC)
 * ISBN should have higher precedence than journal, at least on ADSABS. Headbomb {t · c · p · b} 02:53, 28 August 2018 (UTC)
 * I've seen lots of edits in the last few days where the bot has mangled book references due to thinking they were journal references. The bot should be shut down until this is fixed.--Srleffler (talk) 01:43, 7 September 2018 (UTC)
 * Can you point to pages people edited with the bot's help where things went wrong? It will help us figure out a heuristic to decide if meta data is bad.   I should note that all bot edits are human initiated.  AManWithNoPlan (talk) 01:50, 7 September 2018 (UTC)
 * This edit had a bunch of problems, some of which looked like the bot mistaking books for journal articles. Check my subsequent edits for what I reversed.
 * This edit. The bot converted a cite web to a cite journal. The linked document was a book. (It was also the wrong URL, but the correct reference was not a journal article either.)
 * I think I saw several other bad edits by the bot in the last few days, but I can't find the others right now.--Srleffler (talk) 02:27, 7 September 2018 (UTC)
 * The citeseerx one is GIGO, so hard to fix. We made it better, just not perfect.   AManWithNoPlan (talk) 02:57, 7 September 2018 (UTC)
 * Thank you for the examples from the bibcode database. The books all look like: 2003hoe..book.....K and such.  That's really easy to notice. AManWithNoPlan (talk) 02:57, 7 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/736 AManWithNoPlan (talk) 03:14, 7 September 2018 (UTC)
 * Thanks for the quick fix!--Srleffler (talk) 04:45, 7 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 15:19, 20 September 2018 (UTC)
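As a hedged illustration of why such bibcodes are "easy to notice": in the example quoted above, the word `book` sits where a journal bibcode would carry its volume number. The Python check below is only a sketch of that observation, not the bot's actual heuristic; the function name and the fixed character positions (9 to 12 of a 19-character bibcode) are assumptions based on that single example.

```python
def looks_like_book_bibcode(bibcode: str) -> bool:
    """Guess whether an ADS bibcode refers to a book.

    Assumption: bibcodes are 19 characters long and book records carry the
    literal text "book" in characters 9-12, as in "2003hoe..book.....K".
    """
    return len(bibcode) == 19 and bibcode[9:13] == "book"
```

With this check, `2003hoe..book.....K` is flagged, while the journal bibcode `1994ASAJ...96.1218T` is not.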

Possible GIGO or bug

 * In the amendment to the citation between the tags is mangled.  Although the item cited isn't really an issue of a "journal", the problem here isn't with the bot's substitution of "journal" for "book", but with the choice of "Gyros" for the title of the supposed journal.  The item cited is actually one of an aperiodically issued series of "Lecture Notes", and "Gyros" is simply the first word in the title of that issue.  There seems to me to be no ideal way for any of the  citation templates to cover this situation, but  seems to me to be among the best of a few reasonably acceptable alternatives.
 * David Wilson (talk · cont) 02:55, 11 September 2018 (UTC)


 * All but the Gyros GIGO fixed AManWithNoPlan (talk) 15:22, 20 September 2018 (UTC)
 * I verified that it is GIGO.

 [numFound] => 1
 [start] => 0
 [docs] => Array (
     [0] => stdClass Object (
         [arxiv_class] => Array ( [0] => gr-qc )
         [identifier] => Array (
             [0] => 2001gcit.conf..195H
             [1] => 2001gr.qc.....3067H
             [2] => 2001LNP...562..195H
             [3] => 10.1007/3-540-40988-2_10
             [4] => 2001gcit.conf..195H
             [5] => gr-qc/0103067
             [6] => 10.1007/3-540-40988-2_10
             [7] => 2001gr.qc.....3067H
         )
         [year] => 2001
         [page] => Array ( [0] => 195 )
         [bibcode] => 2001LNP...562..195H
         [pubdate] => 2001-00-00
         [author] => Array (
             [0] => Haugan, Mark P.
             [1] => Lämmerzahl, C.
         )
         [volume] => 562
         [doi] => Array ( [0] => 10.1007/3-540-40988-2_10 )
         [pub] => Gyros, Clocks, Interferometers ...: Testing Relativistic Gravity in Space
         [doctype] => inbook
         [title] => Array (
             [0] => Principles of Equivalence: Their Role in Gravitation Physics and Experiments That Test Them
         )
     )
 )

bot is converting "work" parameters to "website" in Cite web
In cite web, since work is an alias for website which is the template native parameter. Might not be ideal for many things that should actually be cite news. AManWithNoPlan (talk) 20:22, 5 September 2018 (UTC)
 * Since it is an alias, then it is making edits for the sake of making edits, which as I understand it is relatively undesirable. Also, most "news" citations these days are from online sources, so cite web is widely used for these types of citations. —Joeyconnick (talk) 21:21, 5 September 2018 (UTC)
 * I'm not sure I follow. Looking at the documentation I can't see anything that should be cite news rather than cite web, unless it's a news article that's not available online (which would produce an error with cite web). — Bilorv(c)(talk) 01:51, 6 September 2018 (UTC)
 * Is anyone working on this issue? What's the process for getting it fixed? The bot just made this useless edit. — Bilorv(c)(talk) 17:26, 12 September 2018 (UTC)
 * there have been bigger fish to fry. i will look into it. AManWithNoPlan (talk) 18:14, 12 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/762 AManWithNoPlan (talk) 18:24, 12 September 2018 (UTC)

. This is very disruptive. Might suggest disabling the tool until you are able to find and fix the problem, particularly if this is not even a "biggest fish" bug. It should not be converting work -> website at the rate of 1,000s or 10's of thousands. -- Green  C  13:40, 16 September 2018 (UTC)
 * The fix is merged in; it just needs to be deployed. The reason it is small fish is that it in no way affects what humans see on Wikipedia.   AManWithNoPlan (talk) 16:30, 16 September 2018 (UTC)
 * Is there any plan to perform clean-up and revert the incorrect changes made by the bot? Keith D (talk) 20:30, 16 September 2018 (UTC)
 * There are no plans to generate a new series of edits that have no effect on what users see. In many cases website is the better choice, so a mass revert would be disruptive.   Lastly, no one has volunteered to do it who is able to do it.  Also, the bot itself never does edits that are not under the guidance of a human.  AManWithNoPlan (talk) 23:36, 16 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:15, 20 September 2018 (UTC)

Follow-up on removal of URLs for broken DOIs
Follow-up from User_talk:Citation_bot/Archive_9, still not fixed. (t) Josve05a  (c) 01:46, 6 September 2018 (UTC)
 * Weird.  Looks like we need to remove /full from DOIs too. AManWithNoPlan (talk) 02:35, 6 September 2018 (UTC)
 * Also, see . (t) Josve05a  (c) 02:36, 6 September 2018 (UTC)
 * we do some of that already, but that’s a good addition to the tools. AManWithNoPlan (talk) 02:38, 6 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/737 once rolled out this will remove bad stuff from new and existing DOIs. It will probably not remove the broken notice unless you run the bot again. AManWithNoPlan (talk) 22:41, 7 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 15:23, 20 September 2018 (UTC)

Caps: voor
https://github.com/ms609/citation-bot/pull/729 AManWithNoPlan (talk) 15:10, 6 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:15, 20 September 2018 (UTC)

Caps: till, av, och, för, mot, zum, non
https://github.com/ms609/citation-bot/pull/729 AManWithNoPlan (talk) 15:11, 6 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:15, 20 September 2018 (UTC)

Invalid dates caused by arXiv data containing page numbers that look like dates
Thanks for reporting the issue. I believe this might occur for a number of articles which I ran the bot through, although I won't know which ones until the list of articles with invalid dates gets populated again in the next cycle. Also, I was thinking maybe the bot could include a feature to check for an invalid year before it replaces it, just in case it finds a set of numbers that look like dates elsewhere again. -- AquaDTRS (talk) 19:38, 6 September 2018 (UTC)
 * this bug has been fixed. Just waiting for the new code to get deployed on Wikipedia AManWithNoPlan (talk) 01:53, 7 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:15, 20 September 2018 (UTC)

Fouling causes internal server error
https://github.com/ms609/citation-bot/pull/733 AManWithNoPlan (talk) 23:53, 6 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:17, 20 September 2018 (UTC)

Bot stops in Neutrino
https://github.com/ms609/citation-bot/pull/732 AManWithNoPlan (talk) 20:27, 6 September 2018 (UTC)
 * Pull has been merged, but issue has not been fixed. (t) Josve05a  (c) 19:59, 16 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:20, 20 September 2018 (UTC)

bug fixed code not propagated to Wikipedia
Regression of User_talk:Citation_bot/Archive_10
 * fixed AManWithNoPlan (talk) 14:17, 20 September 2018 (UTC)

Bot marks working DOI as broken
https://github.com/ms609/citation-bot/pull/751 AManWithNoPlan (talk) 16:24, 11 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:18, 20 September 2018 (UTC)

Did not expand raw url in first edit; required two edits
https://en.wikipedia.org/w/index.php?title=Andragogy&diff=859059054&oldid=859058657


 * https://github.com/ms609/citation-bot/pull/755 AManWithNoPlan (talk) 18:13, 11 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:18, 20 September 2018 (UTC)

Replaced specific at with non-specific pages
https://github.com/ms609/citation-bot/pull/756 AManWithNoPlan (talk) 20:36, 11 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:18, 20 September 2018 (UTC)

CrossRef has � instead of ü, ä and ö in metadata
Sorry, but the data is wrong in CrossRef. We could detect it, but we cannot fix it. AManWithNoPlan (talk) 16:19, 11 September 2018 (UTC)
 * Where there should be an ü and an ä there is a http://www.fileformat.info/info/unicode/char/0fffd/index.htm AManWithNoPlan (talk) 16:32, 11 September 2018 (UTC)
 * Unfixable AManWithNoPlan (talk) 00:11, 14 September 2018 (UTC)
 * How about not adding any author names at all if such a character appears, since it is nasty and puts garbage on Wikipedia. Or at least warn users somehow when it adds this. (t) Josve05a  (c) 07:53, 14 September 2018 (UTC)
 * You might complain to CrossRef. I think people would rather see The German Lust f�r Science than Error: no title specified in a reference.  AManWithNoPlan (talk) 16:00, 14 September 2018 (UTC)
 * This has come up before and the consensus has always been in favor of the bot's actions. AManWithNoPlan (talk) 16:13, 14 September 2018 (UTC)
 * I have complained to CrossRef. AManWithNoPlan (talk) 16:14, 14 September 2018 (UTC)
 * wontfix AManWithNoPlan (talk) 14:20, 20 September 2018 (UTC)

Bot fails on Science
I think that's fixed in our GitHub development tree. AManWithNoPlan (talk) 01:05, 14 September 2018 (UTC)
 * PHP can be a little aggressive with memory usage AManWithNoPlan (talk) 01:08, 14 September 2018 (UTC)


 * It seems to happen on JSTOR links mostly. Headbomb {t · c · p · b} 17:17, 16 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:21, 20 September 2018 (UTC)

Square brackets with more than one pipe are unsupported

 * https://github.com/ms609/citation-bot/pull/783 fix code written AManWithNoPlan (talk) 04:28, 16 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:22, 20 September 2018 (UTC)

Not showing my name

 * did you try your canonical username? i.e. with underscores instead of spaces? AManWithNoPlan (talk) 19:32, 15 September 2018 (UTC)
 * Sorted! Thanks!--5 albert square (talk) 20:29, 15 September 2018 (UTC)
 * We should fix that, since most people use spaces.  We should also remove user: if people add that. AManWithNoPlan (talk) 21:31, 15 September 2018 (UTC)
 * It would be helpful. I must admit I didn't think to try it with underscores.  Thanks again!--5 albert square (talk) 22:38, 15 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/782 AManWithNoPlan (talk) 00:14, 16 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:22, 20 September 2018 (UTC)
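The normalization discussed in this thread (accept spaces, strip a leading user: prefix) might look roughly like the Python sketch below. It is an illustration under assumptions, not the code from the pull request; the function name `canonical_username` is hypothetical.

```python
def canonical_username(raw: str) -> str:
    """Turn user input into the underscore form of a username.

    Hypothetical helper: strips surrounding whitespace, drops a leading
    "User:" prefix (case-insensitive), and replaces runs of spaces with
    single underscores.
    """
    name = raw.strip()
    if name.lower().startswith("user:"):
        name = name[len("user:"):].strip()
    return "_".join(name.split())
```

So both `5 albert square` and `User:5 albert square` would be treated as `5_albert_square`.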

Bot stops in Fluvoxamine
assuming php memory bug AManWithNoPlan (talk) 03:19, 18 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:23, 20 September 2018 (UTC)

Adds redundant doi-broken-date when doi-broken already present

 * That's what happens when a template is designed by people who do not plan ahead. AManWithNoPlan (talk) 02:33, 18 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/788 AManWithNoPlan (talk) 20:23, 18 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:24, 20 September 2018 (UTC)

Confusing links for conference papers
This is GIGO. Headbomb {t · c · p · b} 13:37, 18 September 2018 (UTC)
 * In that case notabug AManWithNoPlan (talk) 14:38, 20 September 2018 (UTC)

Remove leftover deadurl
fixed; the bot now removes this when removing url. AManWithNoPlan (talk) 14:25, 20 September 2018 (UTC)

Redundant duplication of author name parameters
Thank you for changing the bug title. AManWithNoPlan (talk) 18:24, 9 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/746 AManWithNoPlan (talk) 23:39, 11 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:08, 21 September 2018 (UTC)

Remove access date when there is no URL.
cite web should be the exception. Leave that one alone, unless it's converted. e.g.. Headbomb {t · c · p · b} 23:18, 18 September 2018 (UTC)


 * true, because people are clueless. AManWithNoPlan (talk) 23:25, 18 September 2018 (UTC)
 * Recommend reviewing the criteria for orphan access-date removal at User:GreenC_bot/Job_5 that was arrived at by lengthy community input over a 5 month period. --  Green  C  01:48, 19 September 2018 (UTC)
 * notabug and another one bites the dust. AManWithNoPlan (talk) 03:42, 22 September 2018 (UTC)

Cleanup chapter-url too
https://github.com/ms609/citation-bot/pull/810 AManWithNoPlan (talk) 23:15, 20 September 2018 (UTC)
 * fixed

Bot converts garbage parameter to another garbage parameter

 * https://github.com/ms609/citation-bot/pull/791 AManWithNoPlan (talk) 23:08, 20 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 22:06, 21 September 2018 (UTC)

Bad bibcode data for arxiv

 * https://github.com/ms609/citation-bot/pull/807 AManWithNoPlan (talk) 16:27, 20 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 22:04, 21 September 2018 (UTC)

Title vs script-title
Same with  in https://en.wikipedia.org/w/index.php?title=7.62×39mm&oldid=860197219 (t)  Josve05a  (c) 00:34, 19 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/809 AManWithNoPlan (talk) 19:58, 20 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 13:08, 22 September 2018 (UTC)

Weird edit summary
unfixable without massive effort AManWithNoPlan (talk) 02:49, 22 September 2018 (UTC)
 * Since we do not see the title, the bot could add one AManWithNoPlan (talk) 02:51, 22 September 2018 (UTC)
 * wontfix assuming rare and weird edit summary is a warning. AManWithNoPlan (talk) 19:00, 23 September 2018 (UTC)

Follow-up on the follow-up on removal of URLs for broken DOIs
Perhaps even replace with a new one, as in this case the doi was missing a character. AManWithNoPlan (talk) 21:06, 21 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 13:36, 24 September 2018 (UTC)

Dropping pdf urls
do not drop urls that point to .pdf even if they have doi AManWithNoPlan (talk) 03:14, 23 September 2018 (UTC)
 * this got flagged as el fixo on accident AManWithNoPlan (talk) 03:15, 23 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:46, 25 September 2018 (UTC)
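The rule stated in this thread (drop a url that merely repeats the DOI, unless it points to a .pdf) could be sketched as follows. This is a Python illustration with assumed names, not the bot's implementation; the example.org URLs and the DOI in the usage note are placeholders.

```python
from urllib.parse import unquote, urlparse

def should_drop_url(url: str, doi: str) -> bool:
    """Decide whether a url is redundant with the citation's DOI.

    Hypothetical rule: the url is redundant only when it contains the DOI
    (after percent-decoding) and its path does not end in ".pdf".
    """
    if not doi:
        return False
    if doi.lower() not in unquote(url).lower():
        return False  # the url does not embed this DOI, so keep it
    # Keep direct PDF links even though they embed the DOI.
    return not urlparse(url).path.lower().endswith(".pdf")
```

Under this sketch, `https://example.org/doi/10.1000/xyz123` would be dropped for DOI `10.1000/xyz123`, while `https://example.org/doi/pdf/10.1000/xyz123.pdf` would be kept.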

Change parameters to better choices
I don't know if this is the case (pretty sure it isn't), but the bot should convert If location / date aren't set / are empty Headbomb {t · c · p · b} 13:48, 12 August 2018 (UTC)
 * publication-date → date
 * publication-place → location
 * thoughts on other things that we should upgrade. AManWithNoPlan (talk) 15:18, 12 August 2018 (UTC)

Only other one I can think of is Headbomb {t · c · p · b} 16:19, 12 August 2018 (UTC)
 * orig-year/origyear → year
 * I think (though I could be wrong) that orig-year should be converted to year only if (a) year is empty and (b) orig-year contains only a valid four-digit year. Both must be true. If orig-year contains additional text, it should not be moved to year; that will cause an error message to appear. – Jonesey95 (talk) 17:45, 12 August 2018 (UTC)
 * Yes, I mean doing those conversions only when they don't overwrite existing parameters. Slightly clarified the title of this section to reflect that. Headbomb {t · c · p · b} 02:37, 13 August 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/824 AManWithNoPlan (talk) 16:34, 25 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 15:03, 27 September 2018 (UTC)
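The two conditions described above for orig-year (year must be empty, and orig-year must contain only a four-digit year) can be written down as a small Python sketch. It models a citation as a plain dict of parameter names to values, which is an assumption for illustration and not how the bot represents templates; the function name is hypothetical.

```python
import re

def promote_orig_year(params: dict) -> dict:
    """Move orig-year/origyear to year only when it is safe to do so.

    Conditions taken from the discussion: (a) year is empty or absent, and
    (b) the orig-year value is nothing but a four-digit year. Returns a new
    dict and leaves the input unchanged.
    """
    result = dict(params)
    if result.get("year", "").strip():
        return result  # never overwrite an existing year
    for key in ("orig-year", "origyear"):
        value = result.get(key, "").strip()
        if re.fullmatch(r"[0-9]{4}", value):
            result["year"] = value
            del result[key]
            break
    return result
```

A value such as "First published 1923" fails condition (b) and is left where it is, which avoids the error message mentioned above.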

wayback.archive.org
https://github.com/ms609/citation-bot/pull/822 AManWithNoPlan (talk) 15:49, 25 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 15:29, 27 September 2018 (UTC)

Bot fails to respect comment exclusion in Titles (capitalization and quotes) and others
https://en.wikipedia.org/w/index.php?title=CKMT1B&diff=858892445&oldid=858892287

https://en.wikipedia.org/w/index.php?title=%CA%BBOumuamua&diff=prev&oldid=861195731


 * Working on it https://github.com/ms609/citation-bot/pull/826 AManWithNoPlan (talk) 18:56, 26 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 16:04, 27 September 2018 (UTC)

Forget Amazon just the same as Google Books
https://github.com/ms609/citation-bot/pull/823 AManWithNoPlan (talk) 15:54, 25 September 2018 (UTC)
 * fixed AManWithNoPlan (talk) 16:19, 27 September 2018 (UTC)

Bot adds redundant parameters

 * I don't see this as a CITEVAR thing (and would vehemently challenge you to defend it there beyond the support for vauthors and the like for name-formatting or the use of CS1 versus CS2).
 * Anyway, this is a feature IMO, as the DOI may not always resolve to the same named identifier. --Izno (talk) 20:45, 24 September 2018 (UTC)
 * I'm not sure I understand your response. It's a CITEVAR issue because editors may disagree not only on how to format citations, but on what information to include in them. For example, whether to specify the medium is part of the "style" of a citation; the word is used in policy pages with that meaning. Do you mean that the cite template should only be used with one specific style? As best I know "Citation Style 1" refers primarily to formatting and is not a complete citation style of its own.
 * I make the suggestion because this is a specifically avoidable example of the "add as much as I can find" approach the bot uses. Other cases are not as easily detected; multiple identifiers may sometimes be useful. Unless the bot knows what citation style (if any consistent one) is in use then it often won't be clear whether adding an identifier is helpful or harmful to style conformance. Kim Post (talk) 21:33, 24 September 2018 (UTC)
 * This comes up from time to time and it always gets overwhelmingly resolved that users are more important than editors. AManWithNoPlan (talk) 22:38, 24 September 2018 (UTC)
 * Why do editors curate the bibliography if not for the benefit of the reader? (I'd concede that redundant identifiers have some potential use to editors.) I searched the archives for "identifier" and the closest parallel seems to be the removal of ASINs that were redundant with ISBNs. Please do link to a relevant consensus though; I'm happy to adapt if there is one. Kim Post (talk) 23:21, 24 September 2018 (UTC)
 * I do not have the time to look it up at this point. Several other people who watch this board will most likely do it.  They tracked down the justification for removing publisher and location from all journal references; so they can find anything. AManWithNoPlan (talk) 23:27, 24 September 2018 (UTC)


 * This is not a bug, and is desirable. I know if I have access to JSTOR or not, I don't know if I have access to a generic DOI or not. Plus, if in the future the DOI points to a different database than JSTOR, the JSTOR link will still be functional. Headbomb {t · c · p · b} 17:04, 25 September 2018 (UTC)
 * notabug AManWithNoPlan (talk) 14:35, 27 September 2018 (UTC)

Better cleanup of date/year, page/pages/at, via
page and pages are aliases. year and date are aliases. via online makes sense if a URL is provided, so remove it if there is no url provided.
 * 13-25 should be converted to 13–25
 * If any of page/pages/at is set, remove the others (if they are empty / redundant)
 * 2008 should be converted to 2008
 * If either of year/date is set, remove the other (if it is empty)

So a citation like

cleans up to

Headbomb {t · c · p · b} 17:01, 25 September 2018 (UTC)
 * I would prefer to see the canonical version, which is date. Otherwise these are reasonable suggestions. --Izno (talk) 17:03, 25 September 2018 (UTC)
 * If you want to just present the year, it makes it clearer that it's not to be expanded to full dates. And if you want to present full dates, it also makes it much easier to find year-only dates. Headbomb {t · c · p · b} 17:07, 25 September 2018 (UTC)
 * date is not a true alias of year – true aliases cause the 'more than one of param and param' error message as here with work and journal:
 * Because there are occasions when both date and year are required (or desired), they cannot be aliases. I agree with Editor Izno that date should be preferred over year when both are not required.
 * —Trappist the monk (talk) 17:45, 25 September 2018 (UTC)
 * "there are occasions when both date and year are required (or desired)" what would those occasions be? Having 2008-04-26 and 2008 just presents redundant information. Headbomb {t · c · p · b} 18:39, 25 September 2018 (UTC)
 * The requirement is described in the documentation. There are editors who do not like to have the disambiguator character displayed in the final rendering.
 * —Trappist the monk (talk) 11:14, 26 September 2018 (UTC)
 * And the game goes to Trappist the monk who scored the winning shot. But, obviously if pages is set, then blank page and at should be removed and such.  AManWithNoPlan (talk) 14:38, 26 September 2018 (UTC)

https://github.com/ms609/citation-bot/pull/829 AManWithNoPlan (talk) 23:37, 26 September 2018 (UTC)
 * fixed
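
The cleanup rules discussed in this thread can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the linked pull request: the function name `clean_citation_params` and the dict-of-strings representation of a citation are hypothetical, and the en dash conversion is applied only to `pages` here. Following the discussion, `year` and `date` are not treated as true aliases, so only an empty counterpart is removed.

```python
import re


def clean_citation_params(params):
    """Return a cleaned copy of a citation's parameters (hypothetical helper)."""
    cleaned = dict(params)

    # Convert a plain numeric range such as 13-25 to an en dash range.
    # Assumption: only applied to |pages=, and only to simple digit ranges.
    if "pages" in cleaned:
        cleaned["pages"] = re.sub(
            r"^(\d+)\s*-\s*(\d+)$", "\\1\u2013\\2", cleaned["pages"].strip()
        )

    # If any of page/pages/at has a value, drop the ones that are empty.
    # If both of year/date are set, keep both: only an empty one is removed.
    for group in (("page", "pages", "at"), ("year", "date")):
        if any(cleaned.get(name, "").strip() for name in group):
            for name in group:
                if name in cleaned and not cleaned[name].strip():
                    del cleaned[name]

    return cleaned
```

For example, a citation with `pages=13-25`, empty `page` and `at`, `year=2008` and an empty `date` would come out with just `pages=13–25` and `year=2008`, while a citation carrying both a full `date` and a disambiguated `year` is left alone.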

Causing cite date errors
Add tests and soon code. https://github.com/ms609/citation-bot/pull/825 AManWithNoPlan (talk) 18:27, 26 September 2018 (UTC)
 * Code added. Just waiting for deployment to wikipedia now. AManWithNoPlan (talk) 18:44, 26 September 2018 (UTC)
 * fixed

Wikilinks in journal name removed
Unless the entire journal name is wiki-linked, the data is almost always wrong. Secondly, partial links corrupt the COINS data and should not be done that way. AManWithNoPlan (talk) 20:02, 27 September 2018 (UTC)
 * That's probably OK. Some of the other links were to things like The Astrophysical Journal Letters, those can always be done with a redirect of the whole title if they're considered important enough.  Lithopsian (talk) 20:39, 27 September 2018 (UTC)
 * just link the whole title and the bot leaves it alone. notabug AManWithNoPlan (talk) 02:10, 28 September 2018 (UTC)

Bizarre last1 field generated for cite web
fixed AManWithNoPlan (talk) 21:02, 27 September 2018 (UTC)

Adds time element to ref date
fixed AManWithNoPlan (talk) 21:01, 27 September 2018 (UTC)

Can anything be done with ScienceDirect.com urls?
For example or Those URLs are extremely common and if they can be parsed (similar to DOI urls), that would be fantastic. And then they could be removed since they'll be redundant with DOIs. Headbomb {t · c · p · b} 22:52, 29 August 2018 (UTC) We would need to grab the HTML, parse it as XML, and read <meta name="citation_doi" content="10.1016/j.laa.2012.05.036" /> Or we could use: https://api.elsevier.com/content/object/pii/S0024379512004405 AManWithNoPlan (talk) 14:31, 30 August 2018 (UTC)
 * In other words super simple. If DOI is found, then add DOI and forget URL.  But, if DOI is already set, then forget URL if DOI is the same.  AManWithNoPlan (talk) 14:57, 30 August 2018 (UTC)
 * Glad to hear it's not a hard thing to do! Let's do it then! Headbomb {t · c · p · b} 15:59, 30 August 2018 (UTC)
 * More generally, the thing to do will be to use the Citoid API to extract information from any relevant URL. Citoid have an enormous and maintained database of journal web pages, including, I am sure, ScienceDirect, listing how to obtain relevant metadata from them.  I'm tapping away at getting something up and running on this basis (though I'm low on time again at the moment).  Martin  (Smith609 – Talk)  05:51, 1 September 2018 (UTC)
 * We have code to query Citoid in GitHub (original jstor code). We got throttled by Citoid so we just looked at Citoid jstor code and incorporated it. AManWithNoPlan (talk) 03:14, 9 September 2018 (UTC)


 * Should be Martin  (Smith609 – Talk)  11:55, 28 September 2018 (UTC)
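
The approach described earlier in this thread (read the `citation_doi` meta tag from the page, then add the DOI and forget the URL, or forget the URL if the same DOI is already set) might look something like the sketch below. It is only an illustration: `CitationDoiParser`, `extract_citation_doi` and `apply_doi` are hypothetical names, the citation is assumed to be a plain dict, and fetching the page itself is left out. It uses Python's standard `html.parser` rather than an XML parser.

```python
from html.parser import HTMLParser


class CitationDoiParser(HTMLParser):
    """Collects the first <meta name="citation_doi" content="..."> value."""

    def __init__(self):
        super().__init__()
        self.doi = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.doi is None:
            attr = dict(attrs)
            if attr.get("name") == "citation_doi" and attr.get("content"):
                self.doi = attr["content"].strip()


def extract_citation_doi(html):
    """Return the DOI advertised in the page's meta tags, or None."""
    parser = CitationDoiParser()
    parser.feed(html)
    return parser.doi


def apply_doi(params, found_doi):
    """If no DOI is set, add it and drop the URL; if the same DOI is set, drop the URL."""
    updated = dict(params)
    if not found_doi:
        return updated
    existing = updated.get("doi", "").strip()
    if not existing:
        updated["doi"] = found_doi
        updated.pop("url", None)
    elif existing.lower() == found_doi.lower():  # DOIs are case-insensitive
        updated.pop("url", None)
    return updated
```

If the citation already carries a different DOI, the sketch leaves both the DOI and the URL untouched, since the thread does not say what should happen in that case.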

API: New feature, reference rebuild
I'd like the option to have a 'rebuild references' when they are so crappy we need to TNT them (for whatever reason), and start anew. Two options would be present. This would present things in a 'standardized' parameter order with 'standardized' whitespace
 * &rebuild=multiline (multiline option)
 * &rebuild=inline (inline option)
 * cite arXiv
 * cite book
 * cite journal

Whatever is marked  would be carried over from the old citation, with URLs/Identifiers used to rebuild the rest of the citation. The rest would be present (if the bot can/would fill them), or omitted (if the bot can't/wouldn't fill them). Headbomb {t · c · p · b} 17:59, 1 September 2018 (UTC)
 * cite web (you can imagine the inline option)
 * The idea is that this would facilitate this type of cleanup and standardization. Headbomb {t · c · p · b} 18:10, 1 September 2018 (UTC)
 * If multi-line every argument ideally would have its own line as it's confusing for other bots when there is a combination, they can't determine automatically what kind of template it is supposed to be multi-line or single-line. -- Green  C  13:48, 24 September 2018 (UTC)
 * Not sure I follow. The point is for one time runs to rebuild stuff in an easily-reviewable way. What happens after that is business as usual. Other bots have nothing to do with this. Headbomb {t · c · p · b} 14:43, 24 September 2018 (UTC)
 * I think that other bots expect either multi-line or one-big-line, and having multiple lines but with last and first on one line confuses them. AManWithNoPlan (talk) 14:48, 24 September 2018 (UTC)
 * Well, that's already the case, and not the problem this request is trying to solve. Headbomb {t · c · p · b} 15:00, 24 September 2018 (UTC)


 * From past experience, this type of behaviour has the potential to be unpopular. I'm not sure that it's quite within the remit of the bot, as it stands. If you think it's important, I would suggest opening a new bot request for approval to determine the parameters under which this behaviour would be acceptable. Martin  (Smith609 – Talk)  11:58, 28 September 2018 (UTC)
 * I agree. This puts a bullseye on the bot for complaints. wontfix AManWithNoPlan (talk) 22:34, 28 September 2018 (UTC)

Capitalization of journals
AManWithNoPlan (talk) 21:37, 27 September 2018 (UTC)

Don't expand raw URLs using Citoid/Zotero

 * The bot's behavior just a day ago, when it did not add titles to cite web and cite news, was perfect in my view. I feel that the current version of the bot is too unstable and is adding a lot of junk. Can an option not to run Zotero expansion be added (or the reverse)? Such as  <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t)  Josve05a  (c) 21:42, 27 September 2018 (UTC)
 * That function went a lot overboard initially. No data validation, etc.. I was surprised to see it up and running.   AManWithNoPlan (talk) 22:47, 27 September 2018 (UTC)
 * We've scaled back our use of Zotero, and will continue to monitor until we strike the right balance. Marking as resolved. Martin  (Smith609 – Talk)  12:06, 28 September 2018 (UTC)

Capitalization and punctuation in Washington, D.C.
https://github.com/ms609/citation-bot/pull/858 AManWithNoPlan (talk) 16:08, 28 September 2018 (UTC)

More script-title issues
What do you suggest? The template supports two different title parameters and then shows them both. We added code that prevents duplicates, but in these cases the script title and title are very different (or maybe it basically one is printed and the other is cursive styling of the same words). Perhaps: if (has script-title and new title is not all western characters) then ignore new title else add title end if AManWithNoPlan (talk) 16:12, 28 September 2018 (UTC)
 * Yes, that would be a nice solution. However, a "non-script" title isn't necessary when a script-title is present, and in most cases where script-title is used, there is no "western title" available at all, only a trans-title.

if (has script-title) then ignore new title else add title end if <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 16:18, 28 September 2018 (UTC)
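
Both rules proposed above can be written out as a short sketch. Everything here is illustrative: `is_western` and `should_add_title` are hypothetical names, and "all western characters" is approximated as "every letter is a Latin-script letter according to its Unicode name", which is an assumption rather than anything stated in the thread. With `strict=False` the function follows the first proposal; with `strict=True` it follows the simpler second one.

```python
import unicodedata


def is_western(text):
    """True if every alphabetic character is a Latin-script letter (assumed definition)."""
    for ch in text:
        if ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN"):
            return False
    return True


def should_add_title(params, new_title, strict=False):
    """Decide whether a newly found title should be added to the citation."""
    has_script_title = bool(params.get("script-title", "").strip())
    if not has_script_title:
        return True  # nothing to conflict with, so add the title
    if strict:
        return False  # second proposal: a script-title always blocks a new title
    # First proposal: ignore the new title only if it is not all western characters.
    return is_western(new_title)
```

The strict variant never adds a title next to an existing script-title, which matches the point that a trans-title is usually the only western-language text available in those citations.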

Bot hasn't edited in a few hours / fails to edit when triggered
Activating the bot sends it into an endless loop of doing absolutely nothing. Can't really explain more save it just fails to run properly on any page you try to run it on. Headbomb {t · c · p · b} 14:39, 29 September 2018 (UTC)

fixed AManWithNoPlan (talk) 20:36, 29 September 2018 (UTC)

Adds invalid date

 * Thanks for the report. Looking into it. Martin  (Smith609 – Talk)  12:08, 28 September 2018 (UTC)

Cite LSA support
Why???? What do you expect the bot to do? AManWithNoPlan (talk) 22:48, 3 October 2018 (UTC)
 * wontfix template is basically a fancy formatting tool. AManWithNoPlan (talk) 22:55, 3 October 2018 (UTC)
 * Perhaps expand:
 * to
 * Or something at least. <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 23:09, 3 October 2018 (UTC)
 * Rarely used and actually not easy at all to implement. Bug us again when bug queue is empty.  AManWithNoPlan (talk) 03:03, 4 October 2018 (UTC)

url-access
Or, actually, in this case it should have added 11245/1.309707 instead of the URL, but in general when adding a free URL, it should add free. <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 22:42, 3 October 2018 (UTC)
 * No. free is not supported by cs1|2 because values in url are presumed to be free-to-read.
 * —Trappist the monk (talk) 22:45, 3 October 2018 (UTC)
 * Hmm...sorry! My bad. I thought that was not true with cite journal. <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 22:57, 3 October 2018 (UTC)

Hdl
wontfix the meta data is poor quality AManWithNoPlan (talk) 03:05, 4 October 2018 (UTC)

Recognize NCBI bookshelf links?
The treasure trove of URL readers used by Citoid do actually parse this page: https://github.com/zotero/translators  AManWithNoPlan (talk) 18:37, 21 August 2018 (UTC)
 * The books have a lot of stuff like 2001 AManWithNoPlan (talk) 18:42, 21 August 2018 (UTC)

Further researchgate support

 * Would want to not add the researchgate specific dois automatically, but that one would be good. AManWithNoPlan (talk) 14:03, 22 September 2018 (UTC)
 * 10.13140 is the researchgate prefix. It is not a CrossRef DOI, but a DataCite DOI.  AManWithNoPlan (talk) 14:49, 25 September 2018 (UTC)
 * Does the Zotero API now handle this? Martin  (Smith609 – Talk)  12:03, 28 September 2018 (UTC)
 * Strangely, I think it did and now it doesn't. I do not know why.  Maybe I remember wrong, or maybe we added data integrity checks that judged this data too questionable.   For Citoid it works, but for us it does not.  This test is commented out in the GitHub tests:  https://www.researchgate.net/publication/23445361  AManWithNoPlan (talk) 14:51, 28 September 2018 (UTC)

wontfix they block us and anything that looks like scraping. AManWithNoPlan (talk) 02:01, 5 October 2018 (UTC)

Bare references in []s
Those were not bare urls though. Headbomb {t · c · p · b} 13:43, 4 October 2018 (UTC)
 * Would converting [http....html] to http....html be considered a significant enough improvement to be justified -- even if not expanded into anything? I hate those references that are just another number in square braces.  AManWithNoPlan (talk) 14:37, 4 October 2018 (UTC)
 * Well [http//...html] is a bare url. But in that diff, you have FOOBAR, which isn't bare. Headbomb {t · c · p · b} 16:52, 4 October 2018 (UTC)
 * I realize that, but what are your thoughts on references that are bare and have no title and yet have square brackets around them? AManWithNoPlan (talk) 02:49, 5 October 2018 (UTC)
 * Aren't they currently expanded? I thought they were? Headbomb {t · c · p · b} 03:01, 5 October 2018 (UTC)
 * Why yes they are..... AManWithNoPlan (talk) 03:05, 5 October 2018 (UTC)
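
The distinction drawn in this thread, between a bracketed link with no label (bare) and one with a label such as FOOBAR (not bare), can be illustrated with a small sketch. The regular expression and the `classify_bracketed_link` helper are assumptions made for illustration and are not taken from the bot's code.

```python
import re

# An external link in single square brackets: [URL] or [URL label].
BRACKETED_LINK = re.compile(r"\[(https?://[^\s\]]+)(?:\s+([^\]]*))?\]")


def classify_bracketed_link(text):
    """Return ("bare", url), ("titled", url), or None if text is not a bracketed link."""
    match = BRACKETED_LINK.fullmatch(text.strip())
    if match is None:
        return None
    url = match.group(1)
    label = (match.group(2) or "").strip()
    return ("bare", url) if not label else ("titled", url)
```

Under this sketch `[http://example.org/page.html]` is classified as bare, and so would be a candidate for expansion, while `[http://example.org/page.html FOOBAR]` is classified as titled and left alone.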

Adding strange template
This is a GIGO problem, since the info exists in the headers of the web page(s) in question, but it would be great if the tool could ignore this junk instead of inserting it. See the archives of User talk:Zhaofeng Li/reFill, another tool that editors have been using to semi-automatically insert this junk for years. Gnomes remove it manually if tool-using editors fail to see it in Preview. – Jonesey95 (talk) 15:02, 29 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/873 AManWithNoPlan (talk) 03:06, 4 October 2018 (UTC)

ILR Review
https://github.com/ms609/citation-bot/pull/874 AManWithNoPlan (talk) 03:07, 4 October 2018 (UTC)

Bot chokes up on Signet ring cell carcinoma
https://github.com/ms609/citation-bot/pull/877 AManWithNoPlan (talk) 03:04, 4 October 2018 (UTC)

Adding book title as journal
Newer code avoids those book bibcodes AManWithNoPlan (talk) 03:08, 5 October 2018 (UTC)

API: New feature, random edit
I'm currently using https://tools.wmflabs.org/citations/category.php?cat=1980_births&slow=1 which makes one edit and then stops (this seems to be a bug from above discussions). I like to make the bot run on random pages and then stop when it has made an edit, I, however, don't want to specify a category. I'd love to be able to use a link such as https://tools.wmflabs.org/citations/random.php and just have the bot find a page where it will make an edit. <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 20:59, 19 August 2018 (UTC)
 * That would be cool. The bot would grab a random page and then make the edit.  I tried https://tools.wmflabs.org/citations/doibot.php?edit=toolbar&slow=1&page=Special:Random and it did not just work, so new code would be needed.  AManWithNoPlan (talk) 21:19, 19 August 2018 (UTC)
 * API:Random. --Izno (talk) 21:50, 19 August 2018 (UTC)
 * in the mean time just open up new pages by clicking on the Random page link and then clicking the citation link off to the left side. AManWithNoPlan (talk) 22:19, 19 August 2018 (UTC)
 * Nah, in the meantime I'll just use the (broken) category API and let the bot run on a category with lots of articles, since it will continue to run on articles until it finds an article where an edit will be made, as with running it on individual pages using Special:Random, there is a great chance of no edits being made. <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 22:29, 19 August 2018 (UTC)

The bot historically logged each page that it visited to a database, and could be run on the page that had been longest without a visit. The database didn't make the migration to ToolForge, but some of the code still exists. Something like what you suggest would be a good step towards the bot running unsupervised again (which had to be discontinued because I didn't have AManWithNoPlan to keep up with bug reports!) Martin  (Smith609 – Talk)  13:56, 21 August 2018 (UTC)
 * you still want this? I mean there's so much to work on that a random article seems wasted. Headbomb {t · c · p · b} 22:21, 26 August 2018 (UTC)
 * It would be a nice feature, but I'm not demanding it. Just thought I should ask to see... <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 07:13, 27 August 2018 (UTC)

wontfix too many other things to do AManWithNoPlan (talk) 14:49, 8 October 2018 (UTC)
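
For anyone revisiting this later, the API:Random route mentioned above could be approached along these lines. This is a sketch only: the two helper names are hypothetical, the endpoint is assumed to be the English Wikipedia `api.php`, and no request is actually sent here. It builds a `list=random` query restricted to article namespace and reads page titles out of the JSON response.

```python
import json
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"  # assumed target wiki


def random_page_query_url(limit=1):
    """Build a MediaWiki API URL asking for random article-namespace pages."""
    query = {
        "action": "query",
        "list": "random",
        "rnnamespace": 0,  # articles only
        "rnlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(query)


def titles_from_response(body):
    """Extract page titles from the JSON body of a list=random response."""
    data = json.loads(body)
    return [page["title"] for page in data.get("query", {}).get("random", [])]
```

A hypothetical `random.php` could fetch that URL, run the bot on each returned title in turn, and stop after the first page on which an edit is actually made.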

remove even more google books
Regression of User talk:Citation bot/Archive 10 <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 23:06, 29 September 2018 (UTC)
 * NOT a regression. Just more ways that google books likes to describe itself.  https://github.com/ms609/citation-bot/pull/896  AManWithNoPlan (talk) 02:43, 7 October 2018 (UTC)
 * fixed AManWithNoPlan (talk) 14:32, 8 October 2018 (UTC)

cite arxiv --> cite journal misfires
https://github.com/ms609/citation-bot/pull/902 AManWithNoPlan (talk) 15:05, 8 October 2018 (UTC)
 * fixed AManWithNoPlan (talk) 17:02, 8 October 2018 (UTC)

Springer support
https://github.com/ms609/citation-bot/pull/886 AManWithNoPlan (talk) 04:36, 5 October 2018 (UTC)

https://github.com/ms609/citation-bot/pull/885 AManWithNoPlan (talk) 04:36, 5 October 2018 (UTC)

fixed AManWithNoPlan (talk) 14:31, 8 October 2018 (UTC)

More date errors
fixed AManWithNoPlan (talk) 14:30, 8 October 2018 (UTC)

Stop converting ... to …
This is very annoying. Headbomb {t · c · p · b} 06:13, 5 October 2018 (UTC)
 * I did not realize the bot did that. Just curious why annoying. AManWithNoPlan (talk) 12:42, 5 October 2018 (UTC)
 * MOS:ELLIPSIS. --Izno (talk) 13:52, 5 October 2018 (UTC)


 * I noticed it started doing this around this time. Or maybe it was this time. I can't say if it's related or not though. Headbomb {t · c · p · b} 13:57, 5 October 2018 (UTC)
 * It was actually requested recently. https://github.com/ms609/citation-bot/pull/889 AManWithNoPlan (talk) 03:56, 6 October 2018 (UTC)

--AManWithNoPlan (talk) 02:08, 7 October 2018 (UTC)


 * fixed AManWithNoPlan (talk) 14:30, 8 October 2018 (UTC)

Follow redirects
That website uses invalid SSL certs, so the redirects get stopped by the HTTPS libraries. I really do not want to turn that off. AManWithNoPlan (talk) 04:21, 8 October 2018 (UTC)
 * wontfix sadly. AManWithNoPlan (talk) 14:27, 8 October 2018 (UTC)

more non-standard jstor URLS

 * Actually all the j.something are books/book chapters. Headbomb {t · c · p · b} 23:28, 2 September 2018 (UTC)
 * AManWithNoPlan (talk) 23:36, 9 September 2018 (UTC)

fixed AManWithNoPlan (talk) 15:15, 9 October 2018 (UTC)

hardcoded hdl url
we will probably add regex's to catch the more common ones. We do the same with pubmed. AManWithNoPlan (talk) 22:11, 30 September 2018 (UTC)
 * wrote regex code. should make it easy to add more over time. https://github.com/ms609/citation-bot/pull/903  AManWithNoPlan (talk) 23:49, 8 October 2018 (UTC)
 * fixed AManWithNoPlan (talk) 15:16, 9 October 2018 (UTC)

Missing spaces
Problem with euppublishing.com? <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t) Josve05a  (c) 22:02, 7 October 2018 (UTC)
 * Complain to the publisher about giving bad meta data to crossref (and you will be ignored most likely). We do not get data from the web information you quote AManWithNoPlan (talk) 04:15, 8 October 2018 (UTC)
 * wontfix sadly. AManWithNoPlan (talk) 14:27, 8 October 2018 (UTC)
 * Can't we make a manual "fix" for this publisher? Check whether there is a space before any parenthesis; if not, consider adding one. Or something like that? Or, if there is no space, make the bot try to scrape the landing page and see if the HTML there has a space? <span style="background: turquoise;font-family: 'Segoe Script', 'Comic Sans MS';">(t)  Josve05a  (c) 17:43, 8 October 2018 (UTC)
 * No way, no how that we are going to make a website correct the crossref data. Also, no space before a ( is correct in many contexts.  AManWithNoPlan (talk) 20:19, 8 October 2018 (UTC)