User talk:Citation bot/Archive 11

Convert hard spaces (various kinds) to regular spaces
This covers

It creates diffs like this. Headbomb {t · c · p · b} 12:30, 3 September 2018 (UTC)


 * Is it always the case that a user who enters one of these hard spaces truly wished to include a regular space? I think this is an instance where we would do well to respect manual input.  I'm not aware of the bot introducing hard spaces from any of its own data sources. Martin  (Smith609 – Talk)  12:03, 28 September 2018 (UTC)
 * See MOS/Text formatting: The only invisible characters in the editable text should be spaces and tabs. However, other invisible characters are often inserted inadvertently by pasting from a word processor. – Jonesey95 (talk) 14:22, 28 September 2018 (UTC)
 * Any hard-coded non-regular spaces should be changed and normalized to regular spaces. If someone writes &nbsp; explicitly, sure, respect that, but hard-coded ones should be converted to normal spaces. Headbomb {t · c · p · b} 14:26, 28 September 2018 (UTC)


 * https://github.com/ms609/citation-bot/pull/894 AManWithNoPlan (talk) 02:52, 11 October 2018 (UTC)

fixed
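
A minimal Python sketch of the normalization requested above, for illustration only (the bot itself is PHP, and the real change is in the linked pull request). The characters in `HARD_SPACES` are an assumption, since the list of covered characters is not reproduced in this thread, and `normalize_hard_spaces` is a hypothetical name. An explicitly typed `&nbsp;` entity is plain ASCII text, so it is left alone.

```python
import re

# Assumed set of literal "hard" space characters; the exact list the bot
# handles is not given in this thread.
HARD_SPACES = (
    "\u00a0"  # no-break space
    "\u2002"  # en space
    "\u2003"  # em space
    "\u2009"  # thin space
    "\u200a"  # hair space
    "\u202f"  # narrow no-break space
    "\u3000"  # ideographic space
)
_HARD_SPACE_RE = re.compile("[" + HARD_SPACES + "]")


def normalize_hard_spaces(text: str) -> str:
    """Replace each literal hard-space character with a regular space."""
    return _HARD_SPACE_RE.sub(" ", text)


if __name__ == "__main__":
    print(normalize_hard_spaces("Smith,\u00a0J."))  # literal NBSP is converted
    print(normalize_hard_spaces("Smith,&nbsp;J."))  # explicit entity is kept
```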

Expand based upon chapter-url

 * To be clear, the request here is to do all the cleanup you do on url to chapter-url as well, not just Research Gate chapter-urls. Headbomb {t · c · p · b} 23:59, 20 September 2018 (UTC)
 * If the citation has both, probably do chapter first since more specific? AManWithNoPlan (talk) 00:29, 21 September 2018 (UTC)
 * Doable, but more complicated than it looks. Will need to create a do_the_url($url, $param) function that is called, where $param=FALSE for new ones. AManWithNoPlan (talk) 13:34, 21 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/882 AManWithNoPlan (talk) 02:56, 5 October 2018 (UTC)

fixed

Edit that only deletes empty parameter

 * I don't have a problem with some user-activated limited cosmetic edits. The real problem with this edit is that url is arguably a parameter we likely want to be used down the road, so removing it doesn't prevent bad usage or encourage standard usage. Unlike, say, removing an empty page when pages is set. Headbomb {t · c · p · b} 11:18, 29 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/912 AManWithNoPlan (talk) 04:27, 10 October 2018 (UTC)

Some (specific) bibcodes still not expanded

 * There is also a warning about arxiv text parsing in the bot output from AM Herculis. Not sure if it's a bug or not.  Lithopsian (talk) 20:54, 29 September 2018 (UTC)
 * Some more info: the problem is not specific to the bibcodes, but something related to the internal workings of the bot. See  for an example where bibcode 1995A&AS..114..269D was expanded without a problem.  The bot wrote

> Checking AdsAbs database
> AdsAbs search 255/50000: bibcode:"1995A       AS..114..269D"
 * For the AM Herculis case, it wrote - oh dear, it wrote something else that I've now lost. A rerun gives:

> Expanding from BibCodes via AdsAbs API
> AdsAbs 'big-query' request 26/1000:
> Found match for bibcode 1977ApJ...216L..45K
> Found match for bibcode 1977ApJ...212L.125T
> Found match for bibcode 1924AN....220..249H
! No match for bibcode identifier: 2000A&A...361..952H; 1995A&AS..114..269D; 1977S&T....53..351L
> Checking that DOI 10.1002/asna.19232201505 is operational... DOI ok.


 * Must be the & symbol AManWithNoPlan (talk) 18:23, 2 October 2018 (UTC)


 * The bigquery API accepts CSV-style form data in a POST request. However the bot is urlencoding it and I don't think this is correct.  If so, it would explain the ampersand issue.  Lithopsian (talk) 18:16, 3 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/883 AManWithNoPlan (talk) 03:16, 5 October 2018 (UTC)
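
To illustrate the suspected ampersand problem, here is a small Python sketch (not the bot's PHP code) comparing a URL-encoded request body with a plain one. The body layout, a `bibcode` header line followed by one bibcode per line, is an assumption based on the "CSV-style" description above. The function names are hypothetical and no request is actually sent.

```python
from urllib.parse import quote

# The three bibcodes reported as "No match" in the bot output above.
BIBCODES = ["2000A&A...361..952H", "1995A&AS..114..269D", "1977S&T....53..351L"]


def plain_body(codes):
    """Newline-separated body with a header row (assumed big-query layout)."""
    return "bibcode\n" + "\n".join(codes)


def urlencoded_body(codes):
    """Same body, but with every bibcode percent-encoded."""
    return "bibcode\n" + "\n".join(quote(code, safe="") for code in codes)


if __name__ == "__main__":
    print(plain_body(BIBCODES))
    print(urlencoded_body(BIBCODES))
```

In the encoded variant each `&` becomes `%26`, so a server that compares the lines literally against stored bibcodes would not find them. That would be consistent with the failures being limited to bibcodes containing an ampersand.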

Incorrectly converting to Journal
BBC News is reasonably/actually a work. --Izno (talk) 06:21, 30 September 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/915 AManWithNoPlan (talk) 00:35, 11 October 2018 (UTC)

JSTOR URL redirects
https://github.com/ms609/citation-bot/pull/888 AManWithNoPlan (talk) 03:17, 6 October 2018 (UTC)

Associated Press
Unless ap.org is the domain in url, then  or   should be agency and not publisher. (t) Josve05a  (c) 21:52, 3 October 2018 (UTC)
 * Also, remove  as author. (t)  Josve05a  (c) 21:52, 3 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/884 AManWithNoPlan (talk) 03:33, 5 October 2018 (UTC)
 * not fixed. must be some other path in the code. AManWithNoPlan (talk) 03:25, 9 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/913 AManWithNoPlan (talk) 15:32, 10 October 2018 (UTC)

Clean up meta data put in title
Is there any way to detect this and have the bot "recreate" the title and remove all that metadata from the title when expanding? (t) Josve05a  (c) 19:40, 6 October 2018 (UTC)


 * Nope, because a lot of the times Errata and Comments look like that too. Headbomb {t · c · p · b} 22:49, 6 October 2018 (UTC)


 * indeed this would be an awesome feature, but very hard to do right without false positives. Also not too common. AManWithNoPlan (talk) 01:17, 7 October 2018 (UTC)


 * Which is why it would be awesome as a user-activated thing... **cough cough**. Headbomb {t · c · p · b} 01:20, 7 October 2018 (UTC)
 * sorry but i cannot hear your request over all the coughing AManWithNoPlan (talk) 02:07, 7 October 2018 (UTC)

wontfix Bring it up again if it continues to be a problem, and there are no other bugs. AManWithNoPlan (talk) 14:33, 11 October 2018 (UTC)

Bot uses " # # # citation_bot_placeholder_comment 15 # # # title " in edit summary

 * same as User_talk:Citation_bot/Archive_10. (t) Josve05a  (c) 12:17, 9 October 2018 (UTC)
 * That happens when there is a comment before the parameter name. parameters.php needs to put comments into the white space, not the parameter name. Currently parameters with comments before the name are ignored completely.  AManWithNoPlan (talk) 13:26, 9 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/909 AManWithNoPlan (talk) 16:22, 9 October 2018 (UTC)
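
The idea of moving a leading comment placeholder into the whitespace can be sketched as follows. This is a hedged Python illustration, not the code in parameters.php. The placeholder pattern is inferred from the section heading, and `split_parameter` is a hypothetical helper.

```python
import re

# Placeholder format inferred from the edit summary quoted in the heading.
_LEADING_PLACEHOLDERS = re.compile(
    r"^(\s*(?:# # # citation_bot_placeholder_comment \d+ # # #\s*)+)(.*)$",
    re.DOTALL,
)


def split_parameter(raw: str):
    """Split 'name=value' into (pre, name, value).

    Any comment placeholder in front of the name is kept in `pre`, the
    leading whitespace, so the parameter name itself stays clean.
    """
    name_part, _, value = raw.partition("=")
    match = _LEADING_PLACEHOLDERS.match(name_part)
    if match:
        pre, name = match.group(1), match.group(2)
    else:
        name = name_part.lstrip()
        pre = name_part[: len(name_part) - len(name)]
    return pre, name.strip(), value


if __name__ == "__main__":
    raw = " # # # citation_bot_placeholder_comment 15 # # # title = Some title"
    print(split_parameter(raw))
```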

redundant hdl url added
https://github.com/ms609/citation-bot/pull/911 AManWithNoPlan (talk) 17:33, 9 October 2018 (UTC)

Bot does not recognize 'encyclopedia=' in encyclopedia tag

 * This also applies to cite newspaper (see this edit line 425). -- AquaDTRS (talk) 21:41, 10 October 2018 (UTC)
 * You are correct, although that reference is messed up and should use series, volume, etc. AManWithNoPlan (talk) 21:45, 10 October 2018 (UTC)
 * I don't entirely understand why your first diff is using cite encyclopedia. Is that work actually an encyclopedia? --Izno (talk) 21:45, 10 October 2018 (UTC)
 * Now that you've mentioned it, the source does seem more like a book than an encyclopedia. -- AquaDTRS (talk) 21:52, 10 October 2018 (UTC)
 * It is the wrong template, but the bot still did the wrong thing. AManWithNoPlan (talk) 21:56, 10 October 2018 (UTC)

https://github.com/ms609/citation-bot/pull/914/ AManWithNoPlan (talk) 23:49, 10 October 2018 (UTC)

Cite newspaper
https://github.com/ms609/citation-bot/pull/876 AManWithNoPlan (talk) 03:23, 4 October 2018 (UTC)

Remove "subscription required" or replace with parameter

 * That is actually really hard.  Will have to think about. AManWithNoPlan (talk) 18:11, 7 October 2018 (UTC)
 * wontfix we are not set up to do that AManWithNoPlan (talk) 20:45, 16 October 2018 (UTC)

A lot of metadata from biodiversitylibrary.org is junk
Not really junk as the publisher is not specified. [s.n.] is for Sine nomine ie "without a name". Perhaps better to either omit - or include the text "Publisher not specified". - Aa77zz (talk) 18:48, 12 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/925 AManWithNoPlan (talk) 21:25, 12 October 2018 (UTC)
 * Also note there is a trailing comma at the end for some reason... (t) Josve05a  (c) 21:26, 12 October 2018 (UTC)

Bot seems to have problems with Korean

 * Just a note: using REFLINKS these problems do not happen so it seems to be a problem with the bot. Redalert2fan (talk) 20:07, 12 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/926 AManWithNoPlan (talk) 21:24, 12 October 2018 (UTC)

Adding dates that are not English
This will require 19 arrays: one for each month and each day of the week, padded with spaces. Each one will include a bunch of non-English words. Then, using a Unicode-aware case-insensitive regex search and replace, we would pad punctuation and the string itself with spaces, then search and replace on the arrays, then de-pad, and lastly call our date handler and pray. AManWithNoPlan (talk) 13:00, 14 October 2018 (UTC)
 * In the meantime we should look for a year at the end. That would catch 99%, I guess. AManWithNoPlan (talk) 13:02, 14 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/931 AManWithNoPlan (talk) 21:28, 14 October 2018 (UTC)
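
The pad, replace, de-pad approach described above, plus the year-at-end fallback, can be sketched roughly like this. It is a Python illustration under stated assumptions, not the bot's PHP. The word lists are a tiny made-up sample covering four of the 19 names, and `translate_date` and `year_at_end` are hypothetical function names.

```python
import re

# Tiny illustrative sample; real arrays would need many more words and
# languages, one entry per English month or weekday name (19 in total).
FOREIGN_WORDS = {
    "January": ("janvier", "enero", "januar"),
    "February": ("février", "febrero", "februar"),
    "March": ("mars", "marzo", "märz"),
    "Monday": ("lundi", "lunes", "montag"),
}


def translate_date(text: str) -> str:
    """Replace known non-English month and weekday words with English ones."""
    # Pad punctuation and the whole string with spaces so that every word
    # is delimited by whitespace on both sides.
    padded = " " + re.sub(r"([,.;])", r" \1 ", text) + " "
    for english, words in FOREIGN_WORDS.items():
        for word in words:
            pattern = r"(?<=\s)" + re.escape(word) + r"(?=\s)"
            padded = re.sub(pattern, english, padded, flags=re.IGNORECASE)
    # De-pad: drop the space before punctuation and collapse runs of spaces.
    depadded = re.sub(r"\s+([,.;])", r"\1", padded)
    return re.sub(r"\s+", " ", depadded).strip()


def year_at_end(text: str):
    """Fallback: return a four-digit year at the end of the string, if any."""
    match = re.search(r"(\d{4})\s*$", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(translate_date("lunes, 5 marzo 2018"))
    print(year_at_end("3 únor 2018"))
```

The translated string would then be handed to the normal date handler. The fallback only recovers the year, which is the cheaper interim fix suggested above.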

Bot chokes on Bram van Leer
Headbomb {t · c · p · b} 22:55, 13 October 2018 (UTC)
 * Github code choked until I added debug printout. AManWithNoPlan (talk) 13:01, 14 October 2018 (UTC)
 * The bot is choking on essentially every page I try for a couple of days now. I don't see any obvious theme in the diagnostic output.  Here's a couple of examples:
 * Tests of general relativity
 * SU Andromedae
 * Lithopsian (talk) 18:46, 15 October 2018 (UTC)

fixed

If adding newspaper, remove the publisher if it is the same
https://github.com/ms609/citation-bot/pull/933 AManWithNoPlan (talk) 17:04, 14 October 2018 (UTC)

CrossRef meta data provides invalid XML: the bot should fix it

 * In the api it states it was Published online: 30 Oct 2017, but published in 2006. (t) Josve05a  (c) 14:19, 14 October 2018 (UTC)


 * they give us two dates. Time to parse them all and pick print.

2017 2006
 * AManWithNoPlan (talk) 16:33, 14 October 2018 (UTC)

Honour the df=mdy-all card

 * df makes whatever date is present display in the format in question--it is not a requirement on that date to be X or Y format. This is not an incorrect behavior. --Izno (talk) 19:52, 15 October 2018 (UTC)
 * Also, this parameter was created specifically for bots and other automated tools so that they would not have to worry about date formatting. IABot, last time I looked, even provides a blank df parameter for editors to use.
 * —Trappist the monk (talk) 19:54, 15 October 2018 (UTC)
 * The df parameter does not do that - it displays wrongly, as you can see. I've added a nobots card to suppress the Citation bot.  Hawkeye7   (discuss)  22:04, 15 October 2018 (UTC)
 * The df parameter does not do that - it displays wrongly, as you can see.
 * You are going to have to prove that. Here are the templates that you modified in  immediately subsequent to the ; here these templates are as the bot left them:
 * (legend: ✅ – template has mdy-all; ❌ – template does not have df):
 * The three templates where you changed date to year are excluded here as not relevant to this discussion.
 * —Trappist the monk (talk) 22:43, 15 October 2018 (UTC)
 * Three of them were not archives, so there was no df parameter. (I've forgotten what the special meaning of df without a parameter is.) There was a Use mdy dates card which the Bot should have honoured. If the choice comes down to adding df to every citation, or a bots card to every article, then the latter wins hands down.   Hawkeye7   (discuss)  01:57, 16 October 2018 (UTC)
 * That [three] of them were not archives does not prove your claim that the df parameter ... displays wrongly.
 * df has nothing to do with archives per se, just dates. When df is included in a cs1|2 template without a value, it has the same meaning as when the parameter is omitted entirely.  Perhaps you are thinking of dead-url, which when empty means yes, the default state when dead-url is omitted from the cs1|2 template.  Because people forget this stuff, there is documentation at the template page.  When you forget how a template parameter works, consult the documentation.
 * —Trappist the monk (talk) 12:05, 16 October 2018 (UTC)
 * And I've spent a great deal of time updating template documentation that was missing or incorrect. In this case, the documentation doesn't say what meaning is when the df parameter is omitted entirely. It should default to the value of the use dmy dates or use mdy dates card, if present. The documentation implies that it does this, because it says: Use same format as other publication dates in the citations. I'm not going to update the documentation without confirmation from you. Hawkeye7   (discuss)  22:36, 16 October 2018 (UTC)
 * The only cs1|2 parameter that has meaning when empty or omitted is dead-url as I described above. cs1|2 templates cannot see what is outside of their bounding   and  ; for them,  and  do not exist.  Use same format as other publication dates in the citations is a directive to the user, not an indication of what the template does.
 * Where does the term 'card' come from? You have used card in this discussion as a synonym for 'template' and 'parameter'.
 * —Trappist the monk (talk) 09:12, 17 October 2018 (UTC)

Okay, I am withdrawing this. The title is wrong. Am raising two new bug reports. The bug report was wrong; I expected the Bot to use the correct date format and not rely on the df card, which is not normally present, or on its default behaviour when it is not, which is undocumented. But the CS template is correctly reformatting the date. Hawkeye7  (discuss)  22:58, 16 October 2018 (UTC)

notabug good discussion all around AManWithNoPlan (talk) 03:32, 17 October 2018 (UTC)

Changes of cite book to cite journal and book reviews
Re run of the bot on Glossary of bird terms initiated by https://en.wikipedia.org/w/index.php?title=Glossary_of_bird_terms&diff=861412349&oldid=861412133

Two references to books with google scans were changed to cite journal. The books were Klein 1795 and Stark (& Sclater) 1900. The bot confused the books themselves with reviews of the books published in the journal Nature. Aa77zz (talk) 08:18, 27 September 2018 (UTC)


 * https://github.com/ms609/citation-bot/pull/942 AManWithNoPlan (talk) 17:19, 16 October 2018 (UTC)


 * mostly fixed AManWithNoPlan (talk) 18:03, 17 October 2018 (UTC)

URL retained on expansion
Would have to input doi and follow urls and see if it kind of matched. AManWithNoPlan (talk) 01:37, 9 October 2018 (UTC)
 * Could also generate a list of publisher websites such as:

 const PUBLISHER_WEBSITES = array('elsevier.com', 'springer.com', 'sciencedirect.com', 'tandfonline.com',
                                  'taylorandfrancis.com', 'wiley.com', 'sagepub.com', 'sagepublications.com',
                                  'scielo.org', 'scielo.br', 'degruyter.com', 'hindawi.com', 'inderscience.com',
                                  'cambridge.org', '.oup.com', 'nature.com', 'macmillan.com', 'ieeexplore.ieee.org',
                                  'worldscientific.com', 'iospress.com', 'iospress.nl', 'pnas.org');

and delete such urls if a DOI is present. AManWithNoPlan (talk) 14:39, 11 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/935 AManWithNoPlan (talk) 21:16, 14 October 2018 (UTC)
 * fixed AManWithNoPlan (talk) 13:57, 17 October 2018 (UTC)
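
A rough Python sketch of the "delete publisher urls when a DOI is present" idea from the list above (the bot itself is PHP; see the linked pull request for the real change). The domain list is shortened here, matching on the host suffix is an assumption about how the list would be applied, and `is_publisher_url` and `drop_redundant_url` are hypothetical names.

```python
from urllib.parse import urlparse

# Shortened copy of the list proposed above, for illustration only.
PUBLISHER_WEBSITES = (
    "elsevier.com", "springer.com", "sciencedirect.com", "tandfonline.com",
    "wiley.com", ".oup.com", "nature.com", "pnas.org",
)


def is_publisher_url(url: str) -> bool:
    """True if the URL's host is one of the listed domains or a subdomain."""
    host = (urlparse(url).hostname or "").lower()
    for site in PUBLISHER_WEBSITES:
        domain = site.lstrip(".")
        if host == domain or host.endswith("." + domain):
            return True
    return False


def drop_redundant_url(params: dict) -> dict:
    """Return a copy without 'url' when a DOI is set and url is a publisher link."""
    if params.get("doi") and is_publisher_url(params.get("url", "")):
        return {key: value for key, value in params.items() if key != "url"}
    return dict(params)


if __name__ == "__main__":
    citation = {
        "doi": "10.1000/example",
        "url": "https://www.sciencedirect.com/science/article/pii/X",
    }
    print(drop_redundant_url(citation))
```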

If removing url, also remove website
That is really hard to fix. That will take some thought. AManWithNoPlan (talk) 16:04, 16 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/940 and https://github.com/ms609/citation-bot/pull/939 AManWithNoPlan (talk) 16:18, 16 October 2018 (UTC)
 * fixed AManWithNoPlan (talk) 13:34, 17 October 2018 (UTC)

Use Tags (perhaps instead of edit summary)
In edits such as https://en.wikipedia.org/w/index.php?title=Arabs&diff=prev&oldid=857900782 we should start using Special:Tags (as ProveIt does in https://en.wikipedia.org/w/index.php?title=Radiation_Research&diff=prev&oldid=856780199) (t) Josve05a  (c) 19:40, 3 September 2018 (UTC)
 * Please give an example of what you feel would be better. AManWithNoPlan (talk) 13:26, 22 September 2018 (UTC)
 * The edit summary could be the same for now (unless other has better input), but implement a Special:Tag so these edits can be filtered etc. (t) Josve05a  (c) 14:15, 22 September 2018 (UTC)
 * What would be the purpose of filtering?  i.e. what problem is this proposal trying to solve?  We already have Special:Contributions/Citation_bot to view the bot's edits. Martin  (Smith609 – Talk)  12:01, 28 September 2018 (UTC)
 * When it is an edit assisted with Citation bot (i.e. using my account), those do not show up on the bots contrib-page. (t) Josve05a  (c) 13:12, 28 September 2018 (UTC)
 * Three steps. A tag would need to be created (no idea how). The bot could then tag its own edits (useless and duplicates contributions). To make it useful, another bot would then find all edits with our "assisted by cite bot" text in the summary and then tag them (no idea who would do that, but I would think it would not be very hard).  AManWithNoPlan (talk) 03:22, 11 October 2018 (UTC)
 * In step 3, it should be the user who ‘adds the tag’ automatically when making the edit (see the first link for how that tool automatically adds the tag if their (JavaScript?) tool was used when editing). Not a bot that tags the edits after the fact, but the users themselves in real time. (t) Josve05a  (c) 15:08, 11 October 2018 (UTC)
 * Tags has the info--admins can add tags on their wiki. --Izno (talk) 15:26, 11 October 2018 (UTC)
 * Orthogonal to tags would be for the bot to use Special:OAuth so that the edits can be attributed directly to the user who used the bot. --Izno (talk) 15:26, 11 October 2018 (UTC)
 * not fixed but closing discussion and making github issue since we agree. AManWithNoPlan (talk) 13:33, 18 October 2018 (UTC)
 * Would you mind linking that here before archiving? --Izno (talk) 15:03, 18 October 2018 (UTC)
 * You can find it here: https://github.com/ms609/citation-bot/issues/949 Redalert2fan (talk) 15:21, 18 October 2018 (UTC)

Question
How does one find out which editor is actually running the bot for a particular edit? I cannot believe that in a transparent and collegiate editing environment such as we (presumably) have, it is possible to edit truly anonymously via Citationbot? I note that the page reminds editors that they are responsible for every edit they make with the bot, but I do not see how that can be enforced if no-one knows who it actually is! Hopefully, I'm missing something blindingly obvious. Any thoughts? —SerialNumber54129 paranoia / cheap sh*t room 14:48, 2 October 2018 (UTC)
 * In most edits this is noted at the end of edit summaries. (t) Josve05a  (c) 15:57, 2 October 2018 (UTC)
 * Indeed it is; and as you can probably imagine, it is those that do not that I am interested in :) viz, they that merely say (...You can use this bot yourself. Report bugs here.), if you see what I mean... —SerialNumber54129  paranoia / cheap sh*t room 16:10, 2 October 2018 (UTC)
 * Yeah, those are when a user runs the bot without  in the URL. One could make that a prerequisite, but you can type whatever username you want...so not stopping anybody pretending to be "user foo". (t)  Josve05a  (c) 16:25, 2 October 2018 (UTC)
 * While it is probably unlikely to happen, If I understand correctly from the explanation above it is also possible to put in a username of another user, which may be prone to abuse and could be seen as unintended behavior. Redalert2fan (talk) 13:35, 4 October 2018 (UTC)
 * Some links relevant I think. AManWithNoPlan (talk) 15:44, 11 October 2018 (UTC)

https://www.mediawiki.org/wiki/Special:OAuthListConsumers/view/369e0e1d1c504d1956b87af5942879c4 https://tools.wmflabs.org/oauth-hello-world/index.php?action=download https://www.mediawiki.org/wiki/OAuth/For_Developers#PHP_client_without_using_any_libraries

Not fixed at this time, but closing discussion and making a GitHub issue AManWithNoPlan (talk) 13:30, 18 October 2018 (UTC)
 * Would you mind linking that here before archiving? --Izno (talk) 15:03, 18 October 2018 (UTC)
 * Seems like this is it: https://github.com/ms609/citation-bot/issues/948 Redalert2fan (talk) 15:18, 18 October 2018 (UTC)

Changing case of title is suspicious
In this edit the bot capitalized the first letter of each non-trivial word in a journal title. However, some citation styles use sentence case capitalization for titles. Also, the case in a citation is independent of how the source chooses to write the title, so grabbing it from some database is invalid. The citation style for templates does not specify whether titles should be so-called title case or sentence case. So why is the bot making this change? Jc3s5h (talk) 14:53, 15 October 2018 (UTC)
 * This has been the standard for over a decade -- even back when the bot ran automatically without a human requesting it. Others can chime in on this topic -- and we know that they will.   AManWithNoPlan (talk) 15:53, 15 October 2018 (UTC)
 * See MOS:TITLECAPS. Headbomb {t · c · p · b} 19:15, 15 October 2018 (UTC)
 * Inasmuch as WP:CITEVAR permits any consistent citation style, I believe that other parts of WP:MOS, including the page suggested by Headbomb, "Manual of Style/Titles", do not apply to citations. I note that citation bot only operates on citation templates. In "Help:Citation Style 1", the "CS1 compliance with Wikipedia's Manual of Style" section goes out of its way to explain that the date guidelines from "Manual of Style/Dates and numbers" § Dates, months and years apply to dates in citation templates. The absence of other references to the "Manual of Style" also suggests that other aspects of citations with templates are not controlled by the "Manual of Style". Jc3s5h (talk) 21:22, 15 October 2018 (UTC)


 * They are. The reason why dates are singled out is because the CS1 templates will throw out errors when dates are badly presented, and the templates aren't smart enough to throw out errors when things aren't capitalized properly, so there's less of a need for explanations. MOS still applies though. Headbomb {t · c · p · b} 22:55, 15 October 2018 (UTC)

Looking at Help:CS1 more closely, I see this passage for title case: "Use title case unless the cited source covers a scientific, legal or other technical topic and sentence case is the predominant style in journals on that topic. Use either title case or sentence case consistently throughout the article."

Manual of Style/Titles explicitly recognizes that citations may use sentence case titles if called for by the citation style used in a particular article: "permits the use of pre-defined, off-Wikipedia citation styles within Wikipedia, and some of these expect sentence case for certain titles (usually article and chapter titles). Title case should not be imposed on such titles under such a citation style when that style is the one consistently used in an article."

So the bot should not be going around changing titles from sentence case to title case. Jc3s5h (talk) 00:21, 16 October 2018 (UTC)


 * That's chapter titles, not work titles. And yes the bot should change them, per longstanding consensus to do so and other bots that do similar things. Headbomb {t · c · p · b} 00:38, 16 October 2018 (UTC)
 * The type of work covered by the passage in Manual of Style/Titles is journals. Journals don't have chapters, they have articles. It is common for titles of journal articles to be rendered in sentence case; the titles of the journals are typically title case. And bots designed to edit citation templates should obey the documentation for those citation templates. Jc3s5h (talk) 01:01, 16 October 2018 (UTC)


 * Same thing, chapter = article for journals, and bots follow both template docs and the MOS. And on Wikipedia, journal titles are capitalized in title case. See WP:JCW/Target1 for typical usage. Leaving obvious typos out, I count about 50ish cases out of 507819 citations (or <0.01%). And most of those were added by external tools by mistake. Headbomb {t · c · p · b} 01:10, 16 October 2018 (UTC)


 * notabug thank you head bombster AManWithNoPlan (talk) 13:16, 18 October 2018 (UTC)

Problems with twitter post / external links in title

 * cite tweet... (t) Josve05a  (c) 15:16, 18 October 2018 (UTC)
 * No twitter for us: https://github.com/ms609/citation-bot/pull/951  AManWithNoPlan (talk) 15:57, 18 October 2018 (UTC)
 * No URLs in titles: https://github.com/ms609/citation-bot/pull/950 AManWithNoPlan (talk) 15:57, 18 October 2018 (UTC)

Added foreign date

 * this should help a lot https://github.com/ms609/citation-bot/pull/953 AManWithNoPlan (talk) 17:26, 18 October 2018 (UTC)

Dates that are newer than access dates
That does point to data that has changed since being accessed. An example? AManWithNoPlan (talk) 20:18, 3 October 2018 (UTC)
 * When something was last updated should not matter in most cases if it was accessed prior to the last update. What matters is when it was published. If it was published after the accessdate, then something is wrong with either the date or access-date and the bot should disengage due to GiGo causing more GiGo. (t) Josve05a  (c) 20:37, 3 October 2018 (UTC)
 * Examples from Spanish flu:

(t) Josve05a  (c) 20:37, 3 October 2018 (UTC)
 * In ref 1 above, it was even archived (archive-date) on Wayback prior to the date. (t) Josve05a  (c) 20:38, 3 October 2018 (UTC)
 * Yeah, date should never be more recent than archive-date. See https://en.wikipedia.org/w/index.php?title=Aviation_fuel&type=revision&diff=864377933&oldid=864377553 (t) Josve05a  (c) 20:33, 16 October 2018 (UTC)
 * Occasionally a work published near the end of a year will be assigned a publication date of the next calendar year. I've only seen this with printed books; traditionally access dates are not put in a citation for a book, and also, access dates are not used when there is no URL. But there could be other kinds of work where the publisher gives a publication date later than the date the work is actually available, and is of a type where an access date would be appropriate. Jc3s5h (talk) 22:55, 16 October 2018 (UTC)


 * https://github.com/ms609/citation-bot/pull/955   AManWithNoPlan (talk) 04:15, 19 October 2018 (UTC)
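
The sanity check discussed above could look roughly like the following Python sketch (illustrative only; the actual change is in the linked pull request). `date_is_suspicious` is a hypothetical name, and the dates are assumed to be already parsed. It does not special-case the books that carry next year's publication date, mentioned above.

```python
from datetime import date
from typing import Optional


def date_is_suspicious(
    published: date,
    access_date: Optional[date] = None,
    archive_date: Optional[date] = None,
) -> bool:
    """True if the publication date is later than access-date or archive-date."""
    limits = [d for d in (access_date, archive_date) if d is not None]
    return any(published > limit for limit in limits)


if __name__ == "__main__":
    # A source "published" in 2018 cannot have been accessed in 2009.
    print(date_is_suspicious(date(2018, 5, 1), access_date=date(2009, 3, 4)))
    # No access-date or archive-date: nothing to compare against.
    print(date_is_suspicious(date(2018, 5, 1)))
```

A caller would skip adding the date, or leave the citation alone, whenever the check returns true.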

Honour the use card in citations
Since all dates are added in add_if_new it should not be too hard. I should note that when editing part of a page it will ignore the use template if it is not within the area being edited, but that is on user. AManWithNoPlan (talk) 03:35, 17 October 2018 (UTC)
 * Just by the way, the bot is not required to listen to this template, since it does not have to be obeyed in references. https://en.wikipedia.org/wiki/Template:Use_mdy_dates  We will try to support it, but I think this is worth noting for the talk record.  AManWithNoPlan (talk) 16:32, 18 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/952 AManWithNoPlan (talk) 17:51, 18 October 2018 (UTC)

eLS (again)
https://github.com/ms609/citation-bot/pull/945 AManWithNoPlan (talk) 01:15, 18 October 2018 (UTC)

author oddities
https://github.com/ms609/citation-bot/pull/947 AManWithNoPlan (talk) 03:44, 18 October 2018 (UTC)
 * the cause was that the first author is set, but to an empty string. AManWithNoPlan (talk) 13:17, 18 October 2018 (UTC)

chapter-format
https://github.com/ms609/citation-bot/pull/956 AManWithNoPlan (talk) 19:16, 19 October 2018 (UTC)

bot has to be run twice (Submitted manuscript)
just the database not resolving one time. AManWithNoPlan (talk) 14:15, 21 October 2018 (UTC)

ISO dates with foreign character separators
Playing whac-a-mole with dates https://github.com/ms609/citation-bot/pull/957 AManWithNoPlan (talk) 19:26, 19 October 2018 (UTC)

Caps bioRxiv
Really should make use of cite biorxiv when possible. Headbomb {t · c · p · b} 00:48, 21 October 2018 (UTC)
 * Or at least convert url to biorxiv (t) Josve05a  (c) 00:52, 21 October 2018 (UTC)

https://github.com/ms609/citation-bot/pull/964 AManWithNoPlan (talk) 14:31, 21 October 2018 (UTC)

Archived copy
https://github.com/ms609/citation-bot/pull/966 AManWithNoPlan (talk) 20:47, 21 October 2018 (UTC)

weird dates
https://github.com/ms609/citation-bot/pull/957 AManWithNoPlan (talk) 20:39, 21 October 2018 (UTC)

bioone.org
https://github.com/ms609/citation-bot/pull/965 AManWithNoPlan (talk) 20:38, 21 October 2018 (UTC)
 * this happens because bioone claims to not be the primary server in the open access database.  AManWithNoPlan (talk) 02:32, 22 October 2018 (UTC)

502 Bad Gateway
interestingly that is the current title. once github is online i will add bad gateway to the magic list of bad title fragments. AManWithNoPlan (talk) 02:31, 22 October 2018 (UTC)

Recognize date=24/01/2014 16:01:06
thank you https://github.com/ms609/citation-bot/pull/957 AManWithNoPlan (talk) 20:39, 21 October 2018 (UTC)

Added dead link
we need to recognize that as a handle. As for link, it is sadly not dead, so detecting is hard. AManWithNoPlan (talk) 21:21, 20 October 2018 (UTC)
 * Is it possible to detect how big (in bytes) a page is or if there is visible content (or any content besides HTML tags) on it? (t) Josve05a  (c) 21:24, 20 October 2018 (UTC)
 * we already do that.  that website lies to us.  AManWithNoPlan (talk) 22:55, 20 October 2018 (UTC)
 * wontfix link works now too. AManWithNoPlan (talk) 19:54, 23 October 2018 (UTC)

dspace.library.uu.nl/handle
https://github.com/ms609/citation-bot/pull/962 AManWithNoPlan (talk) 15:42, 23 October 2018 (UTC)

More biodiversitylibrary junk
https://github.com/ms609/citation-bot/pull/973 AManWithNoPlan (talk) 15:41, 23 October 2018 (UTC)

added incorrect date
https://stackoverflow.com/questions/29917598/why-does-0000-00-00-000000-return-0001-11-30-000000 AManWithNoPlan (talk) 02:41, 24 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/blob/GlazerMann-patch-10/expandFns.php AManWithNoPlan (talk) 02:41, 24 October 2018 (UTC)
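The linked question is about an all-zero timestamp being parsed into a bogus date. One possible guard, sketched in Python under the assumption that such placeholders should be discarded before any date parsing is attempted (the function name is hypothetical and this is not the code in the linked branch):

```python
import re
from typing import Optional


def reject_zero_date(raw: str) -> Optional[str]:
    """Return raw unchanged, or None when every digit in it is zero (e.g. '0000-00-00')."""
    digits = re.sub(r"\D", "", raw)  # keep only the digits
    if digits and set(digits) == {"0"}:
        return None
    return raw
```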

Bug: doi-broken-date moves around
notabug the bot updates the broken date, even if same AManWithNoPlan (talk) 03:44, 25 October 2018 (UTC)

OAbot
Has anyone looked at the source code for https://github.com/dissemin/oabot and seen if some of that code could be of any use for this bot? Finding open access links etc. (t) Josve05a  (c) 16:25, 21 October 2018 (UTC)
 * link added to issue https://github.com/ms609/citation-bot/issues/948 AManWithNoPlan (talk) 18:36, 23 October 2018 (UTC)
 * not fixed not flag to archive. Thank you.  AManWithNoPlan (talk) 03:19, 25 October 2018 (UTC)

Request: Strip unnecessary code in URL (which just redirects)
wontfix there are so few of this website, not worth risk of bugs. AManWithNoPlan (talk) 02:01, 25 October 2018 (UTC)

Request: Unsupported response for URL
notabug it has two dois. which one should we pick. AManWithNoPlan (talk) 01:56, 25 October 2018 (UTC)

Better author cleanup
When the bot comes across a citation like

It should expand it to include the other authors. However, if etal is set, it shouldn't expand the authors. And if n is set, then it should only expand up to lastn and firstn.

Likewise, if it comes across

Then it should remove all the empty lastn/firstn (or authorn). This would also apply to editors. Headbomb {t · c · p · b} 17:30, 16 September 2018 (UTC)


 * way too much whining when we do that. The bot actually has special code to not do this.  AManWithNoPlan (talk) 13:14, 18 October 2018 (UTC)
 * Not if it's done the way I described above. Headbomb {t · c · p · b} 13:15, 18 October 2018 (UTC)
 * you would be surprised ..... AManWithNoPlan (talk) 21:23, 18 October 2018 (UTC)


 * Past complaints about the bot's behaviour with respect to authors was because it messed with style, and added authors when etal was specified, or beyond lastn/firstn/authorn when n was specified. I should know, I was one of those making those complaints. Headbomb {t · c · p · b} 21:29, 18 October 2018 (UTC)
 * okay. not high priority since notta bug. AManWithNoPlan (talk) 21:53, 18 October 2018 (UTC)

Is this helpful https://github.com/ms609/citation-bot/pull/954 AManWithNoPlan (talk) 00:07, 19 October 2018 (UTC)
 * I'd have to see it in action, because I have no idea what I'm looking at. Headbomb {t · c · p · b} 00:29, 19 October 2018 (UTC)

that removes all blank author parameters if at least one is set. AManWithNoPlan (talk) 02:12, 19 October 2018 (UTC)

this is the current code that you do not like: // If we already have name parameters for author, don't add more if ($this->initial_author_params && in_array($param_name, FLATTENED_AUTHOR_PARAMETERS)) { return FALSE; } we have to write quite a bit of code to deal with all the crazy existing data possibilities. pages with last1,2, and 3 and authors 4-7 all in last4 with commas AManWithNoPlan (talk) 02:12, 19 October 2018 (UTC)

fixed the easy part. If any author type parameter is set, then all the blank ones are deleted AManWithNoPlan (talk) 13:17, 25 October 2018 (UTC)
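A rough Python sketch of the "easy part" as described: once any author-type parameter has a value, the blank ones are dropped. The bot is PHP and uses its own parameter list; the regular expression and function name below are assumptions made for illustration, and editor parameters are left out.

```python
import re

# Assumed name pattern (last1, first2, author3, ...); not the bot's real list.
AUTHOR_PARAM = re.compile(r"^(?:last|first|author)\d*$")


def drop_blank_authors(params: dict) -> dict:
    """Remove empty author parameters, but only if at least one author parameter is set."""
    author_keys = [key for key in params if AUTHOR_PARAM.match(key)]
    if not any(params[key].strip() for key in author_keys):
        # No author information at all: keep the blanks as placeholders.
        return dict(params)
    return {
        key: value
        for key, value in params.items()
        if key not in author_keys or value.strip()
    }
```

Non-author parameters, blank or not, pass through untouched.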

Book conversion: convert journal to series
i have seen series with journals with books with chapters. AManWithNoPlan (talk) 02:15, 19 October 2018 (UTC)
 * wontfix since it is not always the case. AManWithNoPlan (talk) 13:15, 25 October 2018 (UTC)

Bug: Adding handle link when hdl already exists
https://github.com/ms609/citation-bot/pull/963 AManWithNoPlan (talk) 14:28, 21 October 2018 (UTC)

Caps AMC
https://github.com/ms609/citation-bot/pull/979 AManWithNoPlan (talk) 02:01, 25 October 2018 (UTC)

Request: title={title}
https://github.com/ms609/citation-bot/pull/981 AManWithNoPlan (talk) 03:15, 25 October 2018 (UTC)

Bug: Captcha in title
https://github.com/ms609/citation-bot/pull/980 AManWithNoPlan (talk) 02:02, 25 October 2018 (UTC)

Error in shifting publisher info to author name parameter
That's mostly because the wrong citation template was used: cite journal instead of news or book. The bot is a little overtrusting of humans at times. AManWithNoPlan (talk) 17:26, 26 October 2018 (UTC)
 * It will work on the parts that are not related to the wrong template being used. AManWithNoPlan (talk) 17:31, 26 October 2018 (UTC)

Request: google.com.au
https://github.com/ms609/citation-bot/pull/983 AManWithNoPlan (talk) 03:29, 25 October 2018 (UTC)

Request: please replace existing URL in title
https://github.com/ms609/citation-bot/pull/978 AManWithNoPlan (talk) 03:16, 25 October 2018 (UTC)

Bug: PMC Journal Matter
That's horrible of them. Other than checking for meta-data or scraping a webpage I cannot see any way to tell. Am I missing an obvious clue? AManWithNoPlan (talk) 22:38, 22 October 2018 (UTC)
 * Not that I'm aware, I'm afraid... (t) Josve05a  (c) 22:41, 22 October 2018 (UTC)
 * crossing fingers that headers check will tell us. AManWithNoPlan (talk) 00:59, 23 October 2018 (UTC)
 * headers give 404 AManWithNoPlan (talk) 00:34, 24 October 2018 (UTC)
 * todo check for pdf. if so check https://stackoverflow.com/questions/408405/easy-way-to-test-a-url-for-404-in-php on simplified pmc link. if good then drop url. if bad then keep. AManWithNoPlan (talk) 04:28, 27 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/994 AManWithNoPlan (talk) 20:05, 27 October 2018 (UTC)

Caps EFSA
https://github.com/ms609/citation-bot/pull/991 AManWithNoPlan (talk) 17:35, 26 October 2018 (UTC)

Bug: + in DOI
https://github.com/ms609/citation-bot/pull/993 AManWithNoPlan (talk) 04:06, 27 October 2018 (UTC)

Bug: Crashes
Environmental impact of hydraulic fracturing in the United States Inductive programming Headbomb {t · c · p · b}
 * https://github.com/ms609/citation-bot/pull/998 AManWithNoPlan (talk) 22:39, 29 October 2018 (UTC)
 * fixed AManWithNoPlan (talk) 12:33, 30 October 2018 (UTC)

Caps AAP
https://github.com/ms609/citation-bot/pull/991 AManWithNoPlan (talk) 16:10, 28 October 2018 (UTC)

Use ISBN as actually printed on the book
Older books with SBN's get listed with the equivalent ISBN. The agreement has always been strong when discussed. AManWithNoPlan (talk) 15:52, 16 October 2018 (UTC)
 * The agreement is to cite the actually used source per WP:SAYWHERE. There is no agreement to systematically change ISBN-10s to ISBN-13s unless the source provided an ISBN-13 as well, in particular not automatically. It wouldn't be a problem if ISBN-13s were a super-set of ISBN-10s, but the somewhat odd application of the checksum causes it to be different enough from the original number to no longer match searches - thereby making it more difficult for humans to look up and verify information. (You can't expect them to understand the inner semantics of an ISBN number or use an ISBN calculator.) This problem does not occur when SBNs are zero-expanded to ISBN-10s. --Matthiaspaul (talk) 16:04, 17 October 2018 (UTC)
 * During Featured Article nominations, we are always asked to change to ISBN 13, which indicates there is a consensus for that. If the bot can do this for us, it's a bonus. So this proposal needs to be made at MOS level, not here. Only if a new consensus is reached at MOS, the bot should be changed. FunkMonk (talk) 11:25, 19 October 2018 (UTC)
 * FAC often has separate (often-curious) requirements which are not guidelines across the board. --Izno (talk) 16:12, 19 October 2018 (UTC)
 * It's not necessarily a feature proposal, but more a request to refrain from doing something that's causing inconvenience to readers and editors (not to machines, because they can easily convert between the two schemes), and therefore is undesirable. WP:SAYWHERE is a guideline, and I can't find anything in the MOS which would override it.
 * AFAIR we also have a policy for bots not to carry out unnecessary edits, and while there are cases where switching out ISBNs is perfectly fine (within the parameters given above), systematically changing ISBN-10s into ISBN-13s (without even knowing if they can be actually found by humans printed on the book) is neither necessary nor an improvement. After all, the project is for humans, not machines.
 * It is not as if ISBN-10s would lack some vital information. So, if the bot cannot adhere to a ruleset similar to that suggested above, it should better just leave it alone and only add a known ISBN when a reference is lacking one (because that's an improvement). --Matthiaspaul (talk) 12:19, 20 October 2018 (UTC)
 * If the book is reprinted, it will have the isbn 13. Converting the isbn is like adding the area code to a phone number - sadly the last number might change or might not.  ISBN organization does want people to use the 13 everywhere.  AManWithNoPlan (talk) 13:24, 20 October 2018 (UTC)
 * Regarding "reprint edition", yes, they will very likely have ISBN-13s (however, I am also aware of a few examples, where this has not been the case). If so, and if the editor actually cites from the reprint edition, using the ISBN-13 is fine. I'm also fine with using the ISBN-13 from a reprint edition even if the editor cites from the original edition, for as long as the reprint is really a 1:1 reproduction of the original including all errata etc. - many reprints, however, have known errata corrected (sometimes even "silently"), so it is not identical and therefore the ISBN from the actually cited source should be used.
 * Regarding "ISBN organization", while they are not authoritative for us, can you point to anything official from them saying so? Most probably, they just mean that new books should use the ISBN-13 (obvious). After all, they can't change the fact that books used shorter ISBNs for decades, and those books don't disappear or somehow magically change, so ISBN-10s will have to be supported ad infinitum. As an encyclopedia, we have the duty to not rewrite history either. --Matthiaspaul (talk) 21:53, 20 October 2018 (UTC)
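To make the checksum point in this thread concrete, here is a small Python sketch of the two conversions being discussed. It uses the standard ISBN arithmetic; the function names are illustrative and this is not the bot's code. Zero-expanding an SBN leaves every printed digit intact, while the ISBN-13 form gets a "978" prefix and a recomputed check digit, so its last digit usually differs from what is printed on an older book.

```python
def sbn_to_isbn10(sbn: str) -> str:
    """Zero-expand a 9-character SBN to an ISBN-10; the check digit is unchanged."""
    digits = sbn.replace("-", "").replace(" ", "")
    if len(digits) != 9:
        raise ValueError("an SBN has 9 characters")
    return "0" + digits


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13: prefix 978, drop the old check digit, compute a new one."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("an ISBN-10 has 10 characters")
    core = "978" + digits[:9]
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    return core + str((10 - total % 10) % 10)
```

For example, `isbn10_to_isbn13("0-306-40615-2")` gives `"9780306406157"`: the final 2 becomes a 7, which is why a plain text search for the printed number no longer matches.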

notabug comment will block AManWithNoPlan (talk) 04:38, 31 October 2018 (UTC)

BBC listed as a newspaper - they are not a newspaper
Normally in this case we use cite news with BBC. Hawkeye7  (discuss)  23:01, 17 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/946 AManWithNoPlan (talk) 02:18, 18 October 2018 (UTC)

Handles are not journals
PS: I don't even know what is about. Why does it think that is a journal? Hawkeye7  (discuss)  01:57, 16 October 2018 (UTC)
 * almost everything with a hdl is a journal or journal like. AManWithNoPlan (talk) 02:20, 16 October 2018 (UTC)
 * Apparently hdls are not used for journals, but for ephemeral web sites. I'm not sure whether they should be used in cite web templates. In any case, the reviewers want access-dates so the sites can be retrieved from archive, and that requires URLs.  Hawkeye7   (discuss)  11:06, 16 October 2018 (UTC)
 * Another example of a handle. How would the citation bot handle this?  Hawkeye7   (discuss)  22:17, 16 October 2018 (UTC)
 * The last one is a journal.  Journal was an agreed upon compromise since web is wrong AManWithNoPlan (talk) 00:29, 17 October 2018 (UTC)


 * No, it isn't a journal. Look again.  Hawkeye7   (discuss)  05:04, 17 October 2018 (UTC)
 * Handles are used for all kinds of things, not just journals. --Matthiaspaul (talk) 15:31, 17 October 2018 (UTC)


 * Cite document is a good generic non-cite web alternative. Headbomb {t · c · p · b} 15:49, 17 October 2018 (UTC)
 * many of the options being discussed are actually the same template, just aliases. Since journal is not an alias, it was the choice made. AManWithNoPlan (talk) 19:53, 17 October 2018 (UTC)
 * I have no objections regarding the usage of cite document if it supports all provided parameters. However, cite journal is wrong, because "John Glenn Archives" is no journal. (I would probably use |work=John Glenn Archives |publisher=Ohio State University.) If the bot actually changed cite web to cite journal because of the existence of a handle, then that's wrong as well, because cite web might not have been the best possible choice, but it is not a wrong choice.
 * In cases where there is no 100% clear solution (or it is not known), the best solution for a bot is to just leave it alone because of the high risk of causing much damage in little time if it doesn't work properly. I mean, there certainly are clear-cut cases and it is a relief if a bot can fix them for us, however, it is counter-productive if we cannot trust in a near-perfect behaviour of a bot and have to monitor and clean up after it. I'm somewhat shocked by the large number of reported issues recently. --Matthiaspaul (talk) 21:42, 17 October 2018 (UTC)


 * https://github.com/ms609/citation-bot/pull/944 AManWithNoPlan (talk) 22:52, 17 October 2018 (UTC)
 * the number of issues is mostly feature requests and people using it a lot more. AManWithNoPlan (talk) 22:52, 17 October 2018 (UTC)

Submitted manuscript
Now that I think about it, what is even the point of adding Submitted manuscript? I've seen no WP:MOS describing that this should be done, and nobody but this bot has ever added such comments about URLs. (t) Josve05a  (c) 00:12, 21 October 2018 (UTC)


 * It's pointless bloat. I remove those whenever I see them. Headbomb {t · c · p · b} 00:48, 21 October 2018 (UTC)

wontfix as my momma said, yah gotta take them good with dem bad. AManWithNoPlan (talk) 04:01, 31 October 2018 (UTC)

Request: Zenodo support
Not a publisher: https://github.com/ms609/citation-bot/pull/999 AManWithNoPlan (talk) 03:56, 30 October 2018 (UTC)
 * Normalize URLs: https://github.com/ms609/citation-bot/pull/1000 AManWithNoPlan (talk) 03:56, 30 October 2018 (UTC)
 * fixed AManWithNoPlan (talk) 21:49, 30 October 2018 (UTC)

Publishers being deleted & specific pages being changed to page ranges...
Issue: Citation bot seems to be mistaking a single page as an error with the dash/etc and then changes that single page to a page range. Issue: Citation bot is deleting the names of publishers in this article and in at least some of the cases I *know* - because I did the initial research - that the previous form of the publisher was not incorrect. For instance: Shearonink (talk) 21:43, 23 October 2018 (UTC)
 * journal= Slate, January 18, 2006 was changed to journal=Slate, January 18, 2006,
 * publisher=Omohundro Institute of Early American History and Culture -> deleted,
 * publisher=Presidential Studies Quarterly; Center for the Study of the Presidency and Congress -> deleted,
 * publisher=Archeological Society of Virginia -> deleted
 * The deletion of publishers for journals is a feature AManWithNoPlan (talk) 22:27, 23 October 2018 (UTC)
 * Your use of pages is less than ideal: you should use at if you do not want the page numbers for the entire article. AManWithNoPlan (talk) 22:29, 23 October 2018 (UTC)
 * if you feel that a journal is obscure enough that people need publisher information, then the correct solution is to create a wikipage for the journal and wikilink to it and fix the problem once and for all globally. AManWithNoPlan (talk) 22:31, 23 October 2018 (UTC)
 * No, that's the wrong attitude. While an article about a journal is always appreciated, this is not a solution to the problem. By default, the publisher information belongs into a reference as much as the journal info. The solution is that your bot should simply refrain from performing actions, which remove info from citations humans felt useful or necessary to add in the first place. Your bot is not entitled to perform any actions overruling humans, except for correcting obvious errors. --Matthiaspaul (talk) 10:46, 24 October 2018 (UTC)
 * This is suboptimal. If he has a specific page, that is not only sufficient but preferred. The bot should not be making a change here. --Izno (talk) 04:17, 24 October 2018 (UTC)
 * Please disable the removal of publishers/publication locations in journals. I've been silently watching and I think we're at the point where if that is the way the bot should operate, that consensus should be assessed by an RFC or similar. I'm willing to walk over to WP:BOTN to see the bot blocked over this issue given how many complaints have come up here. --Izno (talk) 04:15, 24 October 2018 (UTC)
 * I second this, it is a bug, not a "feature". The whole idea of removing parameters with valid contents is silly, and it becomes outright dangerous if it is performed by a bot. Seeing edit summaries in articles and the wall of complaints on this page, it seems as if this bot is causing more damage to the project than doing good stuff - it is in no time destroying the work of human editors, who spent a lot of time to research proper references. In the case of rare references or less frequently visited articles, it means that it is causing damage which is likely to remain permanent. This disruption is not acceptable.
 * Regarding publishers, some users feel that the publisher is redundant info if it is named almost identical to the name of the journal, but other users don't agree with it. The template parameters exist not only for display purposes, but also to populate meta data, and if there really would be consensus (which I don't think it is) that the publisher name should not show up in rendered citations when it is identical to the journal, it is the citation template that should suppress it in the rendered display output, not a bot to remove the information from the reference at all. It's trying to fix a (perceived) problem at the wrong level.
 * --Matthiaspaul (talk) 10:46, 24 October 2018 (UTC)


 * I don't understand why the deletion of the publisher parameter is a feature. And being "obscure" has nothing to do with it, I thought the whole point of cites was to give readers as much information about the source as possible, to make it easy for readers to verify asserted facts. Why does the bot delete the publisher? If that is a clearly-approved part of that particular citation template why does Citation bot over-write and remove editors' valid contributions? I don't understand the logic of the deletion.
 * The past content/edit was "page=" and the bot changed that parameter to "pages=". Template: Cite journal/Template:Cite journal states "page=" "The number of a single page in the source that supports the content" and doesn't say "at=" is preferred. Also, the "Templates" option in the editing window's toolbar only gives editors "page=", there is no "at=" included for "cite journal"... Maybe that's one reason why "at=" doesn't appear within these cites.
 * As an aside, I was posting here because I didn't understand why something was happening, I thought it might be a bug in the bot. I understand that you might get a lot of queries about issues that possibly seem self-evident to you but people ask questions or post about a possible problem because they don't understand, because they want to know and want to learn. None of us came to Wikipedia knowing everything there is to know about it, even the most experienced editor around here was a complete Wikibaby at some point and there is so much Wikicoding and so many areas to edit in, we all continue to be Wikibabies to some degree.  Shearonink (talk) 05:48, 24 October 2018 (UTC)
 * I appreciate your complaint. If I had a dollar for every time someone said that they had seen this bug for years and were only now reporting it......   AManWithNoPlan (talk) 13:13, 24 October 2018 (UTC)


 * What style guide out there requires/recommends putting the publisher for a journal citation? None. So that's why the bot does what it does concerning publishers in journal citations. For the other thing, that's due to parameter misuse. Put the date in date and the bot will behave. Headbomb {t · c · p · b} 13:21, 24 October 2018 (UTC)
 * What style guide out there requires/recommends putting the publisher for a journal citation? Irrelevant. If there is evidence of non-consensus regarding some action of the bot, WP:BOTPOL is clear. --Izno (talk) 13:37, 24 October 2018 (UTC)
 * I'd ask for consensus to include that information in the first place. No style guide out there recommends that. No mainstream professional publications includes them in citation. Not even our own Citing sources mentions including publishers (see also CS1 documentation). The only people who want to include it are people under the misguided impression that just because a parameter exist, it must be used, and that citations need maximal information. By that logic, we'd include author emails, author addresses, ... just because this too is information. But it's not pertinent information. No one goes to a library and ask "I need Tattoli et al (2012) 'Bacterial autophagy'... I don't know the journal, but at the time, it was published by Landes Bioscience, who was acquired by Taylor & Francis." Headbomb {t · c · p · b} 14:44, 24 October 2018 (UTC)
 * I'd ask for consensus to include that information in the first place No, that's not how BOTPOL works. Do I actually need to recommend a block on the bot at BOTN? --Izno (talk) 14:45, 24 October 2018 (UTC)
 * You're the one that wants to change longstanding behaviour, I'd argue the onus is on you to show that consensus changed. Headbomb {t · c · p · b} 14:53, 24 October 2018 (UTC)
 * As for blocking the bot, it does not make edits on its own. It is always user initiated.  It is authorized to run unattended, but we do not do that at this time.  AManWithNoPlan (talk) 15:22, 24 October 2018 (UTC)
 * That doesn't answer the question. Will you disable the specific functionality related to removal or will I need to go to BOTN? --Izno (talk) 16:00, 24 October 2018 (UTC)
 * Still irrelevant. "Longstanding behavior" is actually "Headbomb made this request solo within the past month or 3" and since that time people have objected to it. That means it clearly does not have consensus at this time. BOTPOL is clear on the point. --Izno (talk) 16:00, 24 October 2018 (UTC)
 * I did not make that request, and, as AManWithNoPlan said, the bot is user-activated. Whoever activates it is responsible for its edits. If they want to have a special snowflake article that violates every style manual out there, that's on them. Headbomb {t · c · p · b} 16:31, 24 October 2018 (UTC)
 * this feature has been around for almost a decade (possibly longer), not several months. AManWithNoPlan (talk) 16:52, 24 October 2018 (UTC)
 * It is only now that I have seen this bot removing publisher information and doing all kind of other questionable things, and I'm around for much longer than a decade. So, either its behaviour has changed or it is used much more than in the past, or it is now used by people, who do use it to get rid of publisher info because that's their preferred style. Either case, fact is that there are now several complaints regarding the removal of publisher information on this page, indicating that this behaviour is not wanted. Therefore, remove this behaviour. --Matthiaspaul (talk) 22:39, 24 October 2018 (UTC)
 * scholarly journals do NOT include the publisher of journals in their footnotes. Style manuals like the CHICAGO MANUAL do not recommend publishers for journals. One big problem is that publishers change very often and the current publisher had nothing to do with the article in question. Rjensen (talk) 00:13, 25 October 2018 (UTC)
 * Who cares about Chicago style? We are Wikipedia and have our own style(s), which allow such info to be included because it is useful to build the web (inside and outside of WP) and helps further research and reverse lookup. We are electronic, we are machine readable, space is no issue.
 * While it is true that publishers often change, even this is important information for historical research. There have been several cases already where knowing a publisher helped me to locate historical journal articles I would not have been able to identify without this information because of abbreviations and liberal spelling changes. And since we cannot predict the future, what might seem redundant info now might help future readers in a couple of decades to locate present sources. So, by default, publisher info is definitely useful and must not be removed.
 * Nobody can force you to add it if you just don't want to include it, but it is nothing but hubris to remove publisher info added by another editor because you don't find it useful. The other editor obviously did.
 * --Matthiaspaul (talk) 04:49, 25 October 2018 (UTC)
 * the publisher of a journal article is not useful info in any way for Wiki readers or editors and no one here has claimed it to be useful. When it comes to books the publisher is useful and important information because the publisher makes the decision on the publication and content of the book. In Scholarly journals, on the other hand, the publisher only handles subscriptions, printing, and mailing and online distribution of current issues.  They were in no way responsible for issues before they became publisher and it is seriously misleading to suggest that to readers.   Editorial decisions about the content are not made by the publisher but by an entirely separate organization called the editorial board of the Journal.  Rjensen (talk) 05:11, 25 October 2018 (UTC)
 * If a journal is obscure enough that you need publisher information to find it then please create a page for that journal and help the world. The wiki style guides state even ISBNs are of questionable usefulness, so there is certainly precedent for not adding every citation parameter. AManWithNoPlan (talk) 13:29, 25 October 2018 (UTC)
 * It is shocking to see that someone operating a bot has this attitude to problem solving - you are thereby serving your bot, but not the project.
 * Not every editor citing from a journal source is prepared to create an article about the journal, and why should s/he, anyway? As much as I appreciate it when someone writes an article, it is not necessary. --Matthiaspaul (talk) 21:32, 25 October 2018 (UTC)


 * It happens that I am one of those editors who find them useful for research, including the research necessary to further improve Wikipedia. I even gave examples. You will simply have to accept that different people have different expectations and needs. If you remove (correct) publisher info added by other editors, this is disruptive.
 * There is one exception: If the publisher name is identical to the journal name, this looks a bit odd in a citation (although it is technically correct and not redundant). Only in this case the publisher info can be suppressed, but this is something that should happen in the code of the citation template, not by removing the parameter value itself (and thereby losing the information that they are identical). --Matthiaspaul (talk) 21:32, 25 October 2018 (UTC)

The real solution is wikilinking to page about the journal. The publisher is relevant to the journal itself, not the page it is referenced on. AManWithNoPlan (talk) 02:10, 26 October 2018 (UTC)
 * Didn't really want to get involved here, but I think it is quite clear there is disagreement over the cite bot's ability to alter publisher info and it should be suspended pending further discussion. I don't think wikilinking every journal name is the solution, especially when there are journals that have no apparent notability. As for the comments about the limited usefulness of ISBNs (as if to say one style guide comment represents full consensus on a matter), I'd like to point out that every time I've brought an article through FA or A-class review (at the MilHist project) I've always been asked to provide a number identifier for books and journals. I would also like to note that the citation bot is removing publication location info too, and in my experience my peers have also preferred it when I include this info. -Indy beetle (talk) 07:01, 26 October 2018 (UTC)


 * Solution: Don't use the bot if you want to have an article that violates every style guide out there. The bot only removes locations for journals, since that too is useless. It leaves them where style guides recommends them (e.g. books) Headbomb {t · c · p · b} 11:12, 26 October 2018 (UTC)
 * Solution failure! I'm not initiating the bot, some other editor is and I'm getting tired of reverting them.--Sturmvogel 66 (talk) 00:40, 28 October 2018 (UTC)


 * Then talk to that editor and gain consensus for having non-standard citations that violate style guides. Or use nobots or equivalent. Headbomb {t · c · p · b} 01:15, 28 October 2018 (UTC)
 * Yes, I just discovered that bit of code, but I shouldn't have to be using it.--Sturmvogel 66 (talk) 02:06, 28 October 2018 (UTC)
 * And linking the publisher makes no difference; it still gets deleted. --Sturmvogel 66 (talk) 12:28, 28 October 2018 (UTC)


 * Because it's still a cite journal, and that information is still useless for journals. What people said was to wikilink the journal (i.e. Warship International) to have readers find information about the publication if they want to know who the publisher is. Alternatively, you could nobots or, or use cite magazine to cite it as magazine. Headbomb {t · c · p · b} 12:46, 28 October 2018 (UTC)
 * My mistake.--Sturmvogel 66 (talk) 16:47, 28 October 2018 (UTC)
 * "It is shocking to see that someone operating a bot...." odd comment considering that the operator is not involved in this conversation.  AManWithNoPlan (talk) 01:31, 28 October 2018 (UTC)
 * "And linking the publisher makes no difference; it still gets deleted" that was never suggested by anyone that I saw. Linking the journal was.  AManWithNoPlan (talk) 16:16, 28 October 2018 (UTC)

notabug standard blocks work.

Request: via=PubMed

 * If pmid exists and url points to the same page, then url along with PubMed should be deleted. Boghog (talk) 09:41, 28 October 2018 (UTC)
 * So, the point is: if there is no url then remove via. AManWithNoPlan (talk) 16:01, 28 October 2018 (UTC)
 * There are two points. In addition to removing via if url is empty, also remove via (and the url) if the url points to the same page as pmid. Those are not the same thing. Is citation bot already doing the latter? Boghog (talk) 16:38, 28 October 2018 (UTC)
 * It has been doing part two for years. If you have examples of it not doing that then please let us know. AManWithNoPlan (talk) 16:49, 28 October 2018 (UTC)
 * The bot used to not remove via=, thus it looks like a lot of PubMed and PubMed Central Vias need removed. AManWithNoPlan (talk) 18:00, 28 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/995 AManWithNoPlan (talk) 18:22, 28 October 2018 (UTC)
 * Also, JSTOR please https://en.wikipedia.org/w/index.php?title=Ingria&diff=866458590&oldid=866458289 (t) Josve05a  (c) 13:24, 30 October 2018 (UTC)
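The two points raised in this thread can be sketched as follows, in Python for illustration (not the bot's PHP, and not the code in the linked pull request). The URL pattern, the set of via values treated as PubMed, and the function name are all assumptions.

```python
import re

# Assumed shape of a PubMed abstract URL; other forms exist and are not handled here.
PUBMED_URL = re.compile(r"^https?://(?:www\.)?ncbi\.nlm\.nih\.gov/pubmed/(\d+)/?$")


def tidy_pubmed_via(params: dict) -> dict:
    """Drop a url that duplicates pmid, then drop a PubMed via that has no url left."""
    out = dict(params)
    url = out.get("url", "").strip()
    match = PUBMED_URL.match(url)
    # Point two: the url points at the same page as pmid, so it is redundant.
    if match and match.group(1) == out.get("pmid", "").strip():
        out.pop("url", None)
        url = ""
    # Point one: with no url, a via naming PubMed has nothing to describe.
    if not url and out.get("via", "").strip().lower() in ("pubmed", "pubmed central"):
        out.pop("via", None)
    return out
```

A citation whose url points somewhere other than PubMed keeps both its url and its via.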

Bug: Bot chokes/crashes
Running https://tools.wmflabs.org/citations/process_page.php?edit=toolbar&slow=1&user=USERNAME&page=Polyphenol causes the bot to choke/stop mid-way. Polyphenol (t) Josve05a  (c) 12:52, 30 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/1002 AManWithNoPlan (talk) 16:24, 30 October 2018 (UTC)

fixed

NIAAA Publications
wontfix Not that common (t) Josve05a  (c) 15:34, 30 October 2018 (UTC)

Request: Drop non-USA Amazon ASIN that match ISBN

 * The second thing that's wrong in that diff is that the bot should use asin. --Izno (talk) 20:34, 20 October 2018 (UTC)
 * No, not always. asin always links to Amazon.com, not Amazon.co.uk. They may differ and sometimes do not carry the same titles, which might make a former .co.uk link converted to an asin become a dead link. (t) Josve05a  (c) 20:37, 20 October 2018 (UTC)
 * If you must link to amazon and you must link to the uk amazon then you should use asin and set co.uk. But, in this:
 * we have 9780582382107 which links to Special:BookSources where there are links to all of the amazon tlds and which holds the first 9 digits of the value in so  can and should be deleted (we are not here to feed prospective customers to amazon or to any other book monger).
 * Also, Dr Andrew should be Andrew.
 * —Trappist the monk (talk) 00:20, 25 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/1003 AManWithNoPlan (talk) 22:03, 30 October 2018 (UTC)
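
The redundancy check described by Trappist the monk can be sketched like this. It is a rough illustration in Python under stated assumptions, not the code from the pull request: the function name `asin_duplicates_isbn` is hypothetical, and the check relies on the fact that a book-style ASIN is an ISBN-10, whose first nine digits reappear after the 978 prefix of the ISBN-13.

```python
def asin_duplicates_isbn(asin: str, isbn: str) -> bool:
    """True when the ASIN is just the ISBN-10 form of the given ISBN."""
    asin = asin.replace("-", "").strip().upper()
    isbn = isbn.replace("-", "").replace(" ", "").strip().upper()

    # Non-book ASINs (for example those starting with "B") never match.
    if len(asin) != 10 or not asin[:9].isdigit():
        return False

    # ISBN-13 with the 978 prefix: compare the nine core digits only,
    # because the final check digit is computed differently in each form.
    if len(isbn) == 13 and isbn.startswith("978"):
        return isbn[3:12] == asin[:9]

    # ISBN-10: the two identifiers must be identical.
    if len(isbn) == 10:
        return isbn == asin
    return False


# The ASIN below is the ISBN-10 derived from the ISBN quoted in the thread,
# used for illustration; it is not taken from the diff itself.
print(asin_duplicates_isbn("0582382106", "9780582382107"))
```

When the function returns true, the asin adds nothing that the isbn link to Special:BookSources does not already provide.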

Request: Don't add DOI if broken
I keep removing this DOI (since a free PDF version is available and linked) and Citation Bot keeps putting it back. I first reported the broken DOI to the publisher in 2016; clearly, they're not going to fix it! MeegsC (talk) 13:35, 22 October 2018 (UTC)


 * See this. The fix is to put doi or similar. Headbomb {t · c · p · b} 13:40, 22 October 2018 (UTC)
 * And did you report the error from this page, or from the Tordoff article? Because you need to report the error from the publisher page. Headbomb {t · c · p · b} 13:41, 22 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/972 AManWithNoPlan (talk) 22:19, 22 October 2018 (UTC)
 * wontfix  99% of the time the doi soon activates.   Just use comment to block.  Also, despite being broken the doi is still usable using google. AManWithNoPlan (talk) 13:08, 31 October 2018 (UTC)

Decode HTML characters
wikify_external_text missed that. Odd. AManWithNoPlan (talk) 23:06, 30 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/1004 Apostrophes were never handled, only quotes.   AManWithNoPlan (talk) 04:17, 31 October 2018 (UTC)
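
For illustration, decoding both kinds of entity in one pass looks like this in Python. This is a sketch, not the bot's wikify_external_text; the wrapper name `decode_entities` is an assumption, while `html.unescape` is the standard-library call that handles named and numeric entities alike.

```python
import html


def decode_entities(text: str) -> str:
    """Decode HTML character references such as &quot; and &#39;."""
    return html.unescape(text)


print(decode_entities("Alzheimer&#39;s &quot;disease&quot; &amp; ageing"))
```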

Request: Don't add web.archive.org in url

 * Also, the dates should be added as archive-date and not as date. (t) Josve05a  (c) 17:49, 8 October 2018 (UTC)
 * Archive.org is the most common, but there is also webarchive.org and archive.is and others -- see WP:WEBARCHIVES for domain name particulars. Also they should have yes -- Green  C  19:36, 8 October 2018 (UTC)
 * No need to add yes because it does nothing;  is the default state when dead-url is empty or omitted.
 * —Trappist the monk (talk) 02:53, 9 October 2018 (UTC)
 * Isn't there another bot that cleans this up for us? AManWithNoPlan (talk) 17:41, 14 October 2018 (UTC)
 * wontfix at this time. I think there are other bots that handle all these various issues right now.  AManWithNoPlan (talk)
 * No bot corrects 1997-01-16 to 1997-01-16, so we should at least not add date for such URLs. (t) Josve05a  (c) 14:39, 1 November 2018 (UTC)
 * Agreed the date is wrong and non-fixable by bot. IABot and WaybackMedic will do the rest but no guarantees if or when they get to it, they don't seek them out, it's incidental. It is involved to get it right due to the many archive services and URL patterns to extract the source URL and identify an archive URL. Should have a standard library for web archives, I have one but it's in a language no one else on Wikipedia uses. Some day I should learn PHP to port it for wider use. -- Green  C  14:44, 1 November 2018 (UTC)
 * It was said  No bot corrects 1997-01-16 to 1997-01-16, so we should at least not add date for such URLs.  I am curious about your rationale.  In this case, the date and archive-date should be the same.  It is the date that the URL is from.  AManWithNoPlan (talk) 21:21, 1 November 2018 (UTC)
 * date is the date of publication of the source document, not the date it was archived at archive.org (not the same thing). Looking at the bottom of the page https://web.archive.org/web/19970116221538/http://www.bell-labs.com/project/dali/ one can see the correct date of publication is November 13, 1996. However determining this is beyond the scope of any bot. -- Green  C  21:26, 1 November 2018 (UTC)
 * Think "a book published in March 1912, but added on Google Books in 2017". In date we would add "March 1912" not "2017". Same with archive dates. Just because archive.org archived it on a specific date, that is not the date the document/page/news article was published. (t) Josve05a  (c) 21:35, 1 November 2018 (UTC)

https://github.com/ms609/citation-bot/pull/1007 AManWithNoPlan (talk) 00:27, 2 November 2018 (UTC)
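
The distinction drawn above (the snapshot timestamp belongs in archive-date, never in date) can be sketched as follows. This is an illustration in Python, not the change made in the pull request: the function name `split_wayback` and the returned dictionary are assumptions, and only web.archive.org URLs with a full 14-digit timestamp are recognized, not the other services listed at WP:WEBARCHIVES.

```python
import re
from datetime import datetime

# 14-digit timestamp, optionally followed by a modifier such as "id_".
WAYBACK = re.compile(r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/(.+)$")


def split_wayback(url: str):
    """Split a Wayback Machine URL into url, archive-url and archive-date.

    Returns None for anything that is not a recognized snapshot URL.
    No publication date is produced: the timestamp only says when the
    snapshot was taken.
    """
    match = WAYBACK.match(url)
    if not match:
        return None
    taken = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return {
        "url": match.group(2),
        "archive-url": url,
        "archive-date": taken.strftime("%Y-%m-%d"),
    }


print(split_wayback(
    "https://web.archive.org/web/19970116221538/http://www.bell-labs.com/project/dali/"
))
```

For the Bell Labs example the snapshot date is 1997-01-16, while the publication date printed on the page (November 13, 1996) still has to be supplied by a human.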

Request: Encyclopedia
What to do: https://en.wikipedia.org/w/index.php?title=Shea_butter&diff=864947033&oldid=864946581 (t) Josve05a  (c) 17:10, 20 October 2018 (UTC)
 * probably should have dropped blank editorn stuff instead of fixing too. AManWithNoPlan (talk) 17:29, 20 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/1010 AManWithNoPlan (talk) 14:12, 1 November 2018 (UTC)

bug: NCBI bookshelf
https://github.com/ms609/citation-bot/pull/1008 AManWithNoPlan (talk) 00:01, 1 November 2018 (UTC)

upgrade: complex by-lines confusion
AManWithNoPlan (talk) 03:44, 29 October 2018 (UTC)
 * Not sure how to fix that. Not sure if there is any way for a non-human to understand that AManWithNoPlan (talk) 03:46, 29 October 2018 (UTC)

Perhaps check for  or  (t)  Josve05a  (c) 12:58, 30 October 2018 (UTC)
 * I should note that the above HTML is irrelevant since the code in question uses meta-data and that is sadly "byline":"Hunter Felt at Fenway Park".   AManWithNoPlan (talk) 16:25, 30 October 2018 (UTC)
 * Perhaps if four or more spaces, do not split into first and last AManWithNoPlan (talk) 00:55, 1 November 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/1009 AManWithNoPlan (talk) 02:19, 1 November 2018 (UTC)
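
The "four or more spaces" heuristic suggested above can be sketched like this. It is an illustration in Python, not the code from the pull request: the function name `split_byline`, the use of an unsplit author field as the fallback, and the rule of splitting at the last space are all assumptions for the example.

```python
def split_byline(byline: str) -> dict:
    """Split a metadata byline into first/last unless it looks complex."""
    name = " ".join(byline.split())  # collapse runs of whitespace

    # Four or more spaces: probably not a plain personal name, so keep it whole.
    if name.count(" ") >= 4:
        return {"author": name}

    # A single word cannot be split.
    if " " not in name:
        return {"last": name}

    first, last = name.rsplit(" ", 1)
    return {"first": first, "last": last}


print(split_byline("Hunter Felt at Fenway Park"))
print(split_byline("Hunter Felt"))
```

The heuristic only avoids a wrong split; it does not recover "Hunter Felt" from the longer byline, which still needs a human.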

Bug: Do not add dates to Wikipedia links
Do not add date to Wikipedia links, since, as we know, Wikipedia may be updated daily. What counts is the accessdate. (t) Josve05a  (c) 23:17, 31 October 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/1007 AManWithNoPlan (talk) 23:28, 31 October 2018 (UTC)
 * A, why are we citing Wikipedia, and B, why is the correct fix not to point to a permanent version of the page instead, if there is some specific reason to cite Wikipedia? Citation bot shouldn't make a specific change regarding Wikipedia. --Izno (talk) 23:38, 31 October 2018 (UTC)
 * A & B: Ask the writers of the articles with Wikipedia references (there are a lot). The bot does not touch most references; however, these are formatted as a cite template without dates, and that is a common parameter which should otherwise always be added. In this case, though, it will not work. (t) Josve05a  (c) 23:40, 31 October 2018 (UTC)
 * The bot cannot fix that type of a problem. We do what we can do.    AManWithNoPlan (talk) 23:54, 31 October 2018 (UTC)
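
A guard of the kind requested here could look like the following sketch. It is written in Python for illustration and is not the bot's code: the function name `may_add_date` is hypothetical, and treating every host under wikipedia.org as off limits for date is an assumption about the intended scope.

```python
from urllib.parse import urlsplit


def may_add_date(url: str) -> bool:
    """Return False for Wikipedia URLs, where only accessdate is meaningful."""
    host = (urlsplit(url).hostname or "").lower()
    is_wikipedia = host == "wikipedia.org" or host.endswith(".wikipedia.org")
    return not is_wikipedia


print(may_add_date("https://en.wikipedia.org/wiki/Polyphenol"))
print(may_add_date("https://www.example.org/article"))
```

Matching on the parsed host rather than on the whole string avoids false hits on URLs that merely contain "wikipedia.org" in their path.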

Bot adds dead links to gateway.isiknowledge.com

 * Note that there is a URL inside the URL... (t) Josve05a  (c) 23:09, 1 November 2018 (UTC)
 * What kind of idiot sets that for their paper!!!! AManWithNoPlan (talk) 23:51, 1 November 2018 (UTC)
 * https://github.com/ms609/citation-bot/pull/1015 AManWithNoPlan (talk) 03:06, 2 November 2018 (UTC)