User talk:Citation bot/Archive 33

Suggestion: Could citebot learn from its mistakes? (cont)
As a quick update to an earlier discussion here, I now see there's been under way for the last couple of years a WM project along these lines (of keeping a centralised citation db) called Shared Citations. — Guarapiranga ☎ 01:59, 20 July 2022 (UTC)

Regular expression failures
I'm seeing a lot of "Regular expression failures" but I don't see what's getting things flagged. Example:

!Regular expression failure in Jules de Cuverville when extracting Templates The following text might help you figure out where the error on the page is (Look for lone { and } characters)

Jules de Cuverville (28 July 1834 – 14 March 1912) was a French naval officer who rose to become Chief of Staff of the French Navy. He entered politics in later life, elected to the senate where between 1901 and 1912 he represented Finistère.
 * 1) # # CITATION_BOT_PLACEHOLDER_TEMPLATE 2 # # #

Jules Marie Armand de Cuverville was born at Allineuc, a small village a short distance inland from Saint-Brieuc in Brittany. His father was Louis-Paul de Cuverville who represented the locality on the monarchist benches in the National Assembly between 1849 and 1853. Louis-Paul's family was descended from the lords of the manor at Maucomble in Normandy, some of whom had been Squires to French kings. Other kinsmen included sailors and naval officers, such as his grandfather, the Rear Admiral Louis-Hyacinthe Cavelier de Cuverville.

Jules Marie attended school at Saint Sauveur de Redon and the lycée in Rennes before entering naval college in 1850. He emerged in 1852 and participated at the Siege of Sevastopol (1854–55), where he was badly wounded. There were further missions in Africa and in the Crimea. He served in Algeria as deputy to Vice-admiral de Gueydon between 1871 and 1873. He was briefly given command of the avisos "Kleber" and "Cuvier" before being seconded to the diplomatic service, serving as naval attache at the French embassy in London during the middle 1870s. He then returned to France, serving aboard the Infernet as a commander with the South Atlantic Naval Division between 1875 and 1879, and promoted to the rank of ship's captain in 1878, taking command of a succession of training ships. He was promoted to the rank of rear admiral in 1888 and appointed a member of the Admiralty Council. Between 1890 and 1892 he served as head of the North Atlantic Naval division and was involved in the Pacification of Dahomey.

He became a vice-admiral in 1893, and then Maritime Prefect for Cherbourg, a member of the Upper Admiralty Council, commander of the Reserve Mediterranean Squadron in 1897 and inspector-general of the marine in 1898. He was Chief of Staff of the French Navy between 1898 and 1899.

Jules de Cuverville was elected to the senate on 31 March 1901 in a bye-election caused by the death of the previous incumbent, General Arsène Lambert, who had died. He was re-elected in the general election of 4 January 1903. He lost his seat to Maurice Fenoux by a narrow margin on 7 January 1912, however.

Brittany, then as subsequently, was relatively conservative in religious terms, and during the summer of 1902 Jules de Cuverville was among those at Le Folgoët, vigorously opposing the closure of the school of the Daughters of the Holy Spirit, and the concomitant expulsion of the nuns there. The sisters had fallen foul of the anti-congragationist legislation which was part of the Paris-based government's determined pursuit of Laïcité (separation of secular and religious institutions).

A friend and political ally of Jacques Piou, he joined the Popular Liberal Action (political party), becoming one of its most passionate supporters. He was appointed a party vice-president in 1907. Deeply Catholic and steadfast in his commitment to the Third Republic, Admiral Count de Cuverville had two principal political priorities: defence of the church and support for the navy.

Several months after losing his seat in the senate, Jules de Cuverville was crushed by a truck while crossing the street in Paris. He survived long enough to be taken to his home at 15, rue Dugay-Trouin, but died a few hours later.
 * 1) # # CITATION_BOT_PLACEHOLDER_COMMENT 0 # # #

References
 * 1) # # CITATION_BOT_PLACEHOLDER_TEMPLATE 9 # # #


 * 1) # # CITATION_BOT_PLACEHOLDER_TEMPLATE 0 # # #


 * 1) # # CITATION_BOT_PLACEHOLDER_TEMPLATE 1 # # #

!Report this problem please about page Jules de Cuverville >No changes required.

# # #

Done with page. Abductive (reasoning) 15:18, 21 July 2022 (UTC)
 * I came here to echo that I have the same problem. Here's the error from "Los_Angeles_Unified_School_District":

> The following text might help you figure out where the error on the page is (Look for lone { and } characters) > # # # CITATION_BOT_PLACEHOLDER_TEMPLATE 17 # # # # # # CITATION_BOT_PLACEHOLDER_TEMPLATE 18 # # # # # # CITATION_BOT_PLACEHOLDER_TEMPLATE 19 # # # # # # CITATION_BOT_PLACEHOLDER_TEMPLATE 23 # # # Mason (talk)

Fixed. Sorry, AManWithNoPlan (talk) 15:24, 21 July 2022 (UTC)

Cleanup of pubmed urls
All have PMCs set (or free DOIs), and thus should be overridden by autolinking. Headbomb {t · c · p · b} 01:58, 20 July 2022 (UTC)

Unhandled write error
Unfixable. Although since it seems to be more common, I might have the code sleep and try again. AManWithNoPlan (talk) 22:39, 23 July 2022 (UTC)
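
The sleep-and-retry idea mentioned above could look roughly like the following. This is a minimal sketch in Python, not the bot's actual code; `write_with_retry`, `WriteError`, the attempt count, and the delays are all hypothetical names and values chosen for illustration.

```python
import time


class WriteError(Exception):
    """Hypothetical stand-in for whatever a failed page write raises."""


def write_with_retry(write, attempts=3, delay=1.0, sleep=time.sleep):
    """Call write() until it succeeds or the attempts are used up.

    The pause doubles after each failure, so a briefly overloaded
    server gets more time before the next try.
    """
    for attempt in range(1, attempts + 1):
        try:
            return write()
        except WriteError:
            if attempt == attempts:
                raise  # give up and let the caller report the failure
            sleep(delay)
            delay *= 2
```

Passing `sleep` as a parameter is only a convenience so the pauses can be observed or skipped when trying the sketch out.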

Addition of page publication dates
Citation Bot adds page publication dates that are stated in the page's HTML metadata but not on the page itself.

For instance, the publication dates for just about every page on LeatherLicensePlates.com:


 * http://leatherlicenseplates.com/old-delaware-license-plates-vintage-delaware-license-plates/ (published August 27, 2015 according to the metadata)
 * http://leatherlicenseplates.com/old-nebraska-license-plates-vintage-nebraska-license-plates/ (also published August 27, 2015 according to the metadata)
 * http://leatherlicenseplates.com/old-washington-license-plates-washington-state-vintage-license-plates/ (published August 28, 2015 according to the metadata)

Is it legit to add these dates when they're not actually stated on the page itself, even if they *are* correct?

Klondike53226 (talk) 00:54, 17 June 2022 (UTC)
 * Yes. Headbomb {t · c · p · b} 02:18, 17 June 2022 (UTC)
 * Uh... anyone going to back Headbomb up here?


 * All I want here is to be 110% sure that it *is* acceptable to add page publication dates when they're stated in the metadata but not on the page itself, being someone who is so used to adding such dates when they *are* stated on the page (as with news or magazine articles).


 * Nothing more, nothing less. And, as ever, no disrespect intended. :)


 * Thanks, Klondike53226 (talk) 22:48, 17 June 2022 (UTC)
 * No, it is not acceptable in cases like this where there is a range of dates between when the web site was first published and when it was last modified. The bot has no way of knowing which date is applicable to the version of the site used as a source. That's what access dates are for, but even if an access date is provided, and lands within that publication-modification range, it is impossible to infer the correct date of the cited version of the source, because it could have been modified multiple times within that range. The only reasonable thing to do with web sites with this sort of date issue is to flag them for human attention so that a human can check that the current version still sources the content and update the access-date. But even if you did that, you'd be very likely to run into "helpful" gnomes (bot-like humans) who run through lists of these flags, check only that the web page exists and not that it is still accurate, and set a bad access-date. —David Eppstein (talk) 21:13, 18 June 2022 (UTC)
 * Well, I was already far from 110% sure that it was acceptable to add publication dates in cases such as this one with LeatherLicensePlates.com, when the discussion two below this one popped up - and now I'm as far from 110% sure as one can get. That is, 0% sure.


 * I find myself agreeing with you that it's better to leave a source undated when the corresponding page/site does not display a visible publication date, even if the publication and modification dates stated in the metadata are correct.


 * I've already put the "deny Citation Bot" comment in the date parameter for some of the LeatherLicensePlates.com sources used, and I shall waste no time in tackling the rest of these sources. Klondike53226 (talk) 22:18, 18 June 2022 (UTC)


 * Have to agree that the date should be shown on the displayed page or there is no easy way to verify that you are looking at the source that it is claiming to be. You should not add just from the HTML code. Keith D (talk) 16:08, 19 June 2022 (UTC)

The bot's still doing this - compare this edit to this source. That date being in the HTML isn't ironclad proof that that's truly the proper publishing date. I'm seeing that I'm the fourth person here to question this - is there a consensus somewhere that pulling these dates from the HTML is proper? Hog Farm Talk 22:20, 23 June 2022 (UTC)


 * This is one reason why the bot never adds access-date and it never adds a date that is newer than the access-date. AManWithNoPlan (talk) 13:23, 24 June 2022 (UTC)
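
The rule stated in the last reply (never add a date that is newer than the access-date) can be written down as a small check. This is an illustrative sketch under stated assumptions, not the bot's code: `may_add_date` is a hypothetical name, and what should happen when no access-date is present is an assumption of this sketch rather than something the discussion settles.

```python
from datetime import date


def may_add_date(candidate, access_date):
    """Return True if a metadata date passes the access-date check.

    candidate   -- publication date found in the page metadata
    access_date -- the citation's access-date, or None if absent
    """
    if access_date is None:
        # Assumption: with nothing to compare against, allow the date.
        return True
    # A date later than the access-date cannot describe the version
    # of the page that the editor actually read.
    return candidate <= access_date
```

Note that this only rules out impossible dates. It does not answer the objection raised above that a metadata date inside the publication-to-modification range may still not match the cited version.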

bot adds |chapter= to cite journal and to cite news
I watch that error category and fix them by hand. Almost universally, the bot points out that the cite was broken to begin with. I will add code to log these pages. AManWithNoPlan (talk) 11:40, 24 July 2022 (UTC)

Mostly fixed; many will still require human interaction to clean up. But they are now being logged to a file when they are found. The error category does not include Draft/sandboxes/etc. AManWithNoPlan (talk) 15:04, 25 July 2022 (UTC)
 * Umm, cs1|2 errors in draft namespace are categorized in all appropriate cs1|2 categories except when the page is a /sandbox, /testcases, /log, or /archive subpage; see line 12 et seq. If you have evidence to the contrary, I would like to see it.
 * —Trappist the monk (talk) 15:10, 25 July 2022 (UTC)
 * That is news to me. I am glad to hear that.  I will keep the logging so that I can correct these by hand when they occur. AManWithNoPlan (talk) 17:27, 25 July 2022 (UTC)

New Google Books

 * Side note, the old version of Google Books did not generate a publisher in this example, while the new version added Berghahn Books. Jonatan Svensson Glad (talk) 13:42, 1 August 2022 (UTC)

CS1 errors
This bot routinely causes CS1 errors - for example, [this edit] caused the article to be listed at https://en.wikipedia.org/wiki/Category:CS1_maint:_numeric_names:_authors_list. It happens often, making unnecessary busywork to those of us who do CS1 maintenance. The bot should be modified so that it doesn't make any CS1 errors and cause pages to be listed at either https://en.wikipedia.org/wiki/Category:CS1_maintenance or https://en.wikipedia.org/wiki/Category:CS1_errors

Ira Leviton (talk) 20:37, 28 July 2022 (UTC)


 * Bad metadata is an ongoing issue yes. There's no way to anticipate all cases. Headbomb {t · c · p · b} 21:37, 28 July 2022 (UTC)
 * As with any other script, users of Citation bot are responsible for verifying that edits made using that feature are correct. -- Red rose64 🌹 (talk) 22:46, 28 July 2022 (UTC)
 * New York added to non-human list. AManWithNoPlan (talk) 17:33, 31 July 2022 (UTC)

World Bird List
The bot has repeatedly changed the "cite web" to "cite journal" in this reference though it is clearly a web page and not a journal (<> and {} brackets removed so it shows here): cite web ref name=IOC12.1 /ref. Craigthebirder (talk) 17:45, 27 July 2022 (UTC)
 * Not only in bird lists, in individual species' pages. Craigthebirder (talk) 18:52, 27 July 2022 (UTC)
 * Why are you citing version 12.1 in the title and version 11.2 in the doi (which doesn't actually link to version 11.2 of the list)? And for specific bird species why aren't you linking to the particular place in the list for that species – shouldn't you link to https://www.worldbirdnames.org/new/bow/buttonquail/ from Plover?
 * —Trappist the monk (talk) 19:01, 27 July 2022 (UTC)
 * First question - because I messed up; have corrected for future and will try to correct earlier entries. Second question - because that page is the link to the whole species list. But will link to family sections in the future. Thanks for pointing these out. Craigthebirder (talk) 20:34, 27 July 2022 (UTC)

I flagged 10.14344 as a non-journal DOI. AManWithNoPlan (talk) 20:01, 27 July 2022 (UTC)
 * Thank you. Craigthebirder (talk) 20:34, 27 July 2022 (UTC)

DOING MORE HARM THAN GOOD! Stop this bot!
'''Your bot MUST be removed! You are damaging articles with no real benefit to balance that.'''

At Hezekiah (governor), this edit added data of little importance (doi, jstor, s2cid) since the URL is already indicated, while removing the very much essential page number and replacing it with incorrect or insufficient ones: What really matters is the actual page the ref is citing. Some sources can't be fully accessed online, but the ref page can be. If one can indicate the start & finish pages, that can benefit some who do have access (subscribers, access to hard copies), but is not essential. The form "x-y [z]" is well accepted in academic publications and transparent enough for the user: "the article goes from x to y, but look up only z." I have also seen it as "x-y (z)". Now you have forced me to remove the start & end pages of the article, in order to preserve the relevant page number(s) I'm actually citing. What's far worse, elsewhere the intervention of the bot has most likely not been reverted and now the cited page is missing. Now you must go and manually fix all the damage already done.
 * "page= 122–126 [125]" is correct data, in both form and content. It means that the entire article covers pp. 122–126, while the relevant ref page is 125 only.
 * "pages= 8, 13" are the correct, relevant ref pages. The bot replaced them with who knows what ("109–118"), maybe the general pages of the article.

'''The bot harms the user. We're far better off without it.''' Arminden (talk) 06:17, 24 July 2022 (UTC)


 * 122–126 [125] should be kept when converting to the proper pages, yes. But 8, 13 is definitely incorrect. Those are the pages of the preprint PDF, not the pages of the published book, which is what the reader [thus bot] is expecting.
 * Also, you were not harmed by the bot. Don't be dramatic. Headbomb {t · c · p · b} 06:52, 24 July 2022 (UTC)


 * The template is problematic, as there's only one page of reference, 125, so definitely singular, page. But it can be useful to indicate the size and position of the article, 122–126. I'm not coding, those who are should find a solution.
 * As I wrote on your page: your manual edits are absolutely perfect, detailed, and most welcome; the bot however is not ready to be let go. Unless and until it is, it is totally wrong to activate it.
 * "Being dramatic" is a touch too personal, if you don't mind. Wiki IS better off without the unfixed bot, broken bots tend to damage lots of articles, and those in charge aren't always quick to react, so clear words can only help. Arminden (talk) 10:45, 24 July 2022 (UTC)
 * Fixed: now detects square brackets or commas. AManWithNoPlan (talk) 11:37, 24 July 2022 (UTC)
 * Great, thanks! Now that I know you can do it, let's see all aspects.
 * "page= X–Y [Z]" means that only Z is directly relevant, so "page" in the singular. Or not? Ideally the template should lead to something like "pp. X-Y (see p. Z)", or if needed "pp. X-Y (see pp. V, W, Z)".
 * "page= X–Y (Z)" is also used by some. I also seem to remember that in some templates or contexts, straight brackets create a lot of chaos, so round ones are an important option. Arminden (talk) 11:56, 24 July 2022 (UTC)
 * PS: All the edits produced by the bot before it reaching the final, mature and correct form must be fixed retroactively. Does that happen automatically, or must they be identified and fixed by hand, one by one? If the latter, how and who? Arminden (talk) 12:00, 24 July 2022 (UTC)
 * Round parentheses added too. The bot has been fully approved for well over a decade. People are expected to report problems as they see them. AManWithNoPlan (talk) 12:10, 24 July 2022 (UTC)
 * Why on Earth would the reader be expecting the pages of the published book and not the linked copy? On the various templates we provide links to preprints, partial google books previews, and chapter excerpts, in spite of the bulk of the citation indicating the full published work, entirely because we fully expect the bulk of the readers are not going to have journal subscriptions or a complete next-door library and thus, should they want to verify a statement, they should be given as direct a route as possible.
 * Changing page numbers in any form is absolutely uncalled for unless you, personally, verify they still point to the sourced information. SamuelRiv (talk) 12:17, 24 July 2022 (UTC)
 * "Why on Earth would the reader be expecting the pages of the published book and not the linked copy?" Because that is the published version. I'm the one that added the preprint link. It wasn't linked before. If I tell you to look up something from, you're expected to look at the published version of that, not a preprint of it. Headbomb {t · c · p · b} 13:52, 24 July 2022 (UTC)
 * , hi. We are almost of the same opinion, but just almost. Indicating the pages of the actually linked PDF (a preprint) is what we had and what's most needed, the bot changed that, and then the editor who apparently activated it, Headbomb, set in both infos.
 * Why the full info can be of some use? Because links rot, people do have access to hard or digital copies, and I for instance often bump into JSTOR docs I can only read online and knowing if the needed page is not too far in (not, say, on the 20th page of the doc) helps me decide right away whether to bother clicking through the doc. I do read some of the referenced sources, as they educate me about topics I'm interested in, and often bump into "failed verification" cases. For some important titles I do have & use hard copies. So it's not all that hypothetical. And if it does no harm by replacing potentially useful, valid data with bad, let it be, it's a bot, doesn't cause sweat.
 * What I didn't realise: apparently, the bot isn't automatically crawling through the entire enWiki and doing edits, but needs to be activated by a specific editor for each article. Or not? Very different concepts. Arminden (talk) 13:27, 24 July 2022 (UTC)
 * Both modes are authorized. Generally a user has a specific issue that needs fixing, such as all pages where the jstor url has #meta_blah_blah_blah included, or all pages with a certain invalid DOI, etc. AManWithNoPlan (talk) 13:42, 24 July 2022 (UTC)
 * In the case where both the page range and pinpoint page(s) are given, the pinpoint should be preferred as the page range is almost always easily determined with the rest of the citation information, and the pinpoint is far more important for verifiability. If the bot can't distinguish in a template whether the editor was referencing page range, pinpoint, both, or neither, the proper action is to leave it alone or, if the metadata are corrupted, convert to at. SamuelRiv (talk) 18:06, 27 July 2022 (UTC)
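
The fix reported in this thread (leave the value alone when it contains square brackets, commas, or round parentheses) can be illustrated with a short check. This is a sketch of the idea only, assuming a simple character test; `looks_hand_curated` is a hypothetical name and the bot's real detection may differ.

```python
import re

# Characters that suggest the editor supplied specific pages by hand:
# "122–126 [125]", "122–126 (125)", or "8, 13".
_PINPOINT = re.compile(r"[\[\]\(\),]")


def looks_hand_curated(pages):
    """Return True if a page/pages value should be left untouched."""
    return bool(_PINPOINT.search(pages))
```

With this test, a plain range such as "109–118" would still be eligible for replacement from metadata, while any value carrying a pinpoint page is preserved.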

Parameter with only a non-breaking space
More generally, starting or ending with one. AManWithNoPlan (talk) 11:27, 2 August 2022 (UTC)

Edit Summary: "Upgrade ISBN10 to 13" but no ISBNs changed
There's not even an ISBN anywhere in the article! Headbomb {t · c · p · b} 23:14, 2 August 2022 (UTC)
 * The bot detects the number of "978" present in the article before and after. One PMID added had that.  The code now counts "978-", since that is what the bot adds.  The mistake will still happen if for example the title of a journal article added is "The 978- area code: demographics", but that is rare enough to live with.  AManWithNoPlan (talk) 12:49, 3 August 2022 (UTC)
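
The before-and-after counting described above can be sketched as follows. This is an illustration in Python rather than the bot's code, and `isbn13_upgraded` is a hypothetical name; the point is only to show why counting "978-" is less error-prone than counting "978".

```python
def isbn13_upgraded(before, after):
    """Guess whether an edit added an ISBN-13 by counting "978-".

    Counting the bare string "978" would also match digits inside
    an added PMID, which is the false positive reported above.
    """
    return after.count("978-") > before.count("978-")
```

As noted in the reply, a title that happens to contain "978-" would still fool this heuristic.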

Fails to run on Scintillator
I've looked everywhere for a stray bracket, and I just can't find what the issue is. Headbomb {t · c · p · b} 23:04, 2 August 2022 (UTC)
 * The fix was to make this template-looking-thing in a math block less template looking. Headbomb {t · c · p · b} 23:20, 2 August 2022 (UTC)
 * The inner brackets around the \tau are unnecessary: \tau_f and \tau_s would have worked. —David Eppstein (talk) 00:09, 3 August 2022 (UTC)
 * indeed {{math stuff} more math} does confuse the bot. About once a week I add spaces to markup on a page.  All such page failures are logged.  AManWithNoPlan (talk) 11:14, 3 August 2022 (UTC)
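
To see why markup shaped like {{math stuff} more math} trips a template extractor, here is a deliberately crude sketch. It assumes an extractor that swaps well-formed {{...}} templates for placeholders and then treats any leftover brace as a failure; that is a guess at the general approach, not the bot's actual code, and `has_lone_braces` and the sample strings are made up for illustration.

```python
import re

# Innermost template: "{{", then anything without braces, then "}}".
_INNERMOST = re.compile(r"\{\{[^{}]*\}\}")


def has_lone_braces(text):
    """Replace templates innermost-first, then look for stray braces."""
    previous = None
    while previous != text:
        previous = text
        text = _INNERMOST.sub("PLACEHOLDER", text)
    return "{" in text or "}" in text
```

In this sketch, "{{\tau_f} \over {\tau_s}}" opens like a template but never closes like one, so braces are left behind. The check also flags ordinary single braces in math markup, which a real implementation would need to exempt.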

Mark 10.4249/... DOI as free access.
Like so.

The doi prefix 10.4249 is for Scholarpedia, and all those DOIs will be freely accessible. Headbomb {t · c · p · b} 11:11, 3 August 2022 (UTC)

once deployed. AManWithNoPlan (talk) 12:32, 3 August 2022 (UTC)
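
A prefix-based rule like the one requested here amounts to a small lookup. The sketch below is illustrative only: `FREE_DOI_PREFIXES` and `doi_is_free` are hypothetical names, and the set contains just the one prefix named in this request.

```python
# Registrant prefixes whose DOIs are assumed to be free to read.
FREE_DOI_PREFIXES = {"10.4249"}  # Scholarpedia, per the request above


def doi_is_free(doi):
    """Return True if the DOI's registrant prefix is on the free list."""
    prefix = doi.split("/", 1)[0]
    return prefix in FREE_DOI_PREFIXES
```

A citation whose DOI passes this check would then be a candidate for doi-access=free.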

Adds pseudonymous collective author, already listed, as redundant editor
Please see publishers page (linked via DOI for 2nd edition) and the MR review for first edition. AManWithNoPlan (talk) 13:07, 4 August 2022 (UTC)
 * The book itself and a correct description of its content is more definitive than what the publisher lists it as in some database. Lothaire is a name for the group of people who wrote the content (its author). It is not a name of an editor. —David Eppstein (talk) 16:37, 4 August 2022 (UTC)
 * I see that now. I have flagged the page to avoid the using of the bad meta-data. AManWithNoPlan (talk) 16:44, 4 August 2022 (UTC)
 * Flag all pages with this DOI. AManWithNoPlan (talk) 16:56, 4 August 2022 (UTC)
 * Thanks. Since this appears to be bad publisher metadata rather than a bot bug, I think there is nothing more to do. —David Eppstein (talk) 19:48, 4 August 2022 (UTC)

Robert Koch-Institut is not the name of an author
Nor is the cited source a journal...

—Trappist the monk (talk) 22:05, 7 August 2022 (UTC)

Error processing the List of new religious movements
Fixed the wiki-problems. https://en.wikipedia.org/w/index.php?title=List_of_new_religious_movements&type=revision&diff=1103126764&oldid=1102985631 BUT a topic expert needs to look over and probably double check. AManWithNoPlan (talk) 12:13, 8 August 2022 (UTC)

The BBC is not a newspaper
Yeah, only a minor thing. I am mostly puzzled about how the bot came to do this, because it's so odd that it suggests the possibility of a wider glitch in how the bot decides that a URL is that of a newspaper. -- Brown HairedGirl  (talk) • (contribs) 05:36, 7 August 2022 (UTC)

It is Citoid:

 [
   {
     "key": "4HM5G9UK",
     "version": 0,
     "itemType": "newspaperArticle",
     "creators": [],
     "tags": [
       { "tag": "Война России с Украиной", "type": 1 },
       { "tag": "Украина", "type": 1 }
     ],
     "title": "Леонид Кучма: \"Путин хотел уничтожить Украину, а получит наше второе рождение\"",
     "publicationTitle": "BBC News Русская служба",
     "url": "https://www.bbc.com/russian/news-62419765",
     "abstractNote": "Экс-президент Украины Леонид Кучма встретил российское вторжение у себя дома в Киеве и уезжать не захотел. BBC News Украина удалось взять у него первое и пока единственное интервью за время войны.",
     "language": "ru",
     "libraryCatalog": "www.bbc.com",
     "accessDate": "2022-08-08",
     "shortTitle": "Леонид Кучма"
   }
 ]
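
The Citoid response above reports itemType "newspaperArticle" for a BBC page, so any tool that trusts that field will treat the page as a newspaper article. The following sketch shows how that could happen; the mapping table and `template_for` are hypothetical and do not describe how the bot actually chooses templates or parameters.

```python
import json

# Hypothetical mapping from Citoid item types to CS1 templates.
TEMPLATE_FOR_ITEM_TYPE = {
    "newspaperArticle": "cite news",
    "journalArticle": "cite journal",
    "webpage": "cite web",
}


def template_for(citoid_response):
    """Pick a template from the first item of a Citoid JSON response."""
    items = json.loads(citoid_response)
    item_type = items[0].get("itemType", "webpage")
    # Unknown item types fall back to the most generic template.
    return TEMPLATE_FOR_ITEM_TYPE.get(item_type, "cite web")
```

Under this kind of mapping the oddity originates in the upstream metadata, not in the code that consumes it.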

Hi
Hi bot i give you a barnstar 😁 Einahr (talk) 11:23, 12 August 2022 (UTC)

Citation bot tries to fetch from known dead URLs

 * The citation that matches that URL has no url-status flag set. Neither does the other citation to openvms.compaq.com. Do you get the same error if you set ? Sideswipe9th (talk) 01:01, 3 August 2022 (UTC)
 * First, not flagging a url as live does not mean the url is dead. Second, even if it was flagged dead, websites often come back to life, so in thorough mode, it should definitely check the url if it can. Headbomb {t · c · p · b} 01:41, 3 August 2022 (UTC)
 * "First, not flagging a url as live does not mean the url is dead" The documentation for those parameters is not of the highest quality.
 * Help:Citation Style 1 says:
 * url-status: To change the order with the title retaining the original link and the archive linked at the end, set live
 * When the original URL has been usurped for the purposes of spam, advertising, or is otherwise unsuitable, setting unfit or usurped suppresses display of the original URL (but url and archive-url are still required).
 * and Template:Cite web says:
 * By default, if "archive-url" is used, the parameter dead is assumed and the resulting main link is to the archived version:
 * and Template:Citation Style documentation/url says of url-status:
 * this optional parameter is ignored if archive-url is not set. If omitted, or with null value, the default value is dead.
 * (and also mentions a "deviated" value for url-status).
 * So either not flagging a url as live does mean it's dead, if there's an archive-url parameter, or the documentation needs to be fixed to indicate that omitting url-status isn't the same as saying url-status=dead. Guy Harris (talk) 02:05, 3 August 2022 (UTC)
 * "Second, even if it was flagged dead, websites often come back to life, so in thorough mode, it should definitely check the url if it can." ...but not complain if the check fails, because it was warned that it's (probably) dead. Guy Harris (talk) 02:09, 3 August 2022 (UTC)


 * "So either not flagging a url as live does mean it's dead" No, it doesn't. It's an assumption the template makes as far as the presentation of links goes. It does not mean the URL is dead. Plenty of people simply archive links preemptively, but don't bother flagging the url as live, mostly because it's rather pointless to do so.
 * "...but not complain if the check fails" The bot doesn't complain about it; it reports that the check failed.
 * The message is in gray, because it's not a very important thing to report. Unlike messages that are in red/orange/other colors. Headbomb {t · c · p · b} 03:00, 3 August 2022 (UTC)
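
The documentation quoted in this thread can be summarised as a display rule: url-status is ignored without archive-url, and defaults to dead when archive-url is set. The sketch below encodes only that quoted rule; `effective_url_status` is a hypothetical name, and the result describes how the template presents links, not whether the site is actually reachable.

```python
def effective_url_status(params):
    """Return the url-status the template would assume for display.

    params -- dict of citation parameters, e.g. {"url": ..., "archive-url": ...}
    """
    if not params.get("archive-url"):
        return None  # url-status is ignored when there is no archive-url
    # Omitted or empty url-status is treated as "dead" for presentation.
    return params.get("url-status") or "dead"
```

This is why an unflagged URL with an archive link is displayed as if dead even though nobody has checked it.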

Wikipedia Signpost/2022-08-01/Tips and tricks
Just a heads up that my little handy dandy guide on Citation bot finally got in the Signpost, as intended years ago. Feel free to leave a comment (or make suggestions for other guides on different topics). Headbomb {t · c · p · b} 20:04, 1 August 2022 (UTC)
 * Thank you. AManWithNoPlan (talk) 00:05, 2 August 2022 (UTC)
 * Just read it, and it is good. AManWithNoPlan (talk) 00:14, 2 August 2022 (UTC)
 * Good work, @Headbomb. And it's probably just as well that this fine guide did not appear sooner, 'cos until Citation bot got a massive amount of extra capacity a few months ago, the result of the extra attempts to use CB would have been a lot of frustrated editors as requests timed out.   Brown HairedGirl  (talk) • (contribs) 07:09, 2 August 2022 (UTC)
 * Thanks for the words, though Wikipedia talk:Wikipedia Signpost/2022-08-01/Tips and tricks is probably the place to centralize discussion about the article. Headbomb {t · c · p · b} 07:35, 2 August 2022 (UTC)

RfC: Should you use cite web, or cite magazine, or cite news?

 * Please don't hold RfCs in user talk space (the practice of holding RfCs to discuss user conduct ceased some years ago). This matter should be discussed at Help talk:Citation Style 1. -- Red rose64 🌹 (talk) 21:29, 1 July 2022 (UTC)
 * I had asked previously where to hold this and there was an even split between this talk page, and the CS1 talk page being the appropriate venue. I'm not opposed to moving it, if it was to be relaunched, so to speak, with the same question, would copy and pasting what has already been discussed and !voted on be acceptable?
 * I'm also not opposed to using this opportunity to address the concerns raised by both, which may result in a different question being asked however. I'd just like to check what the options are. Sideswipe9th (talk) 21:41, 1 July 2022 (UTC)
 * I've moved the whole RfC unchanged, apart from a slight adjustment to the section heading and also omitting the two posts above. -- Red rose64 🌹 (talk) 08:14, 2 July 2022 (UTC)

RFC closed
The RFC has been closed by @ScottishFinnishRadish: "There is a strong consensus for Citation bot to use cite news and cite magazine in cases where online content doesn't appear in a print edition of a publication."

It is helpful to have this community endorsement of Citation bot's good work. This outcome was clearly foreseeable, but it is a relief that the saga is finally over.

However, I remain very sad that so much time and effort was spent in resolving the complaints of the 12 Angry Marvelites who created the drama. Their refusal to listen to more experienced editors led to this vast time-sink, and their aggressive rudeness poisoned the atmosphere. I hope that the 12 Angry Marvelites will reflect on their conduct, and pursue any further concerns without the groupthink and assumptions of bad faith which characterised their approach to this issue.

And thanks to @ for their hard work in preparing the RFC. I wish that the RFC had not been needed, but when the 12 Angry Marvelites refused to listen, it was the only way to end the drama. -- Brown HairedGirl  (talk) • (contribs) 07:35, 2 August 2022 (UTC)
 * Thank you all. AManWithNoPlan (talk) 12:52, 3 August 2022 (UTC)
 * Some of the people on other side to you seem more experienced with the websites involved with dispute which is more relevant, even though you seem to be more experienced with references. No need to be uncivil, we were acting in good-faith, don't believe we were rude, this is not the place to discuss user conduct. Indagate (talk) 13:34, 3 August 2022 (UTC)
 * There is nothing uncivil in pointing out the time wasted by this drama. The rudeness is well-documented, and even extended to falsely accusing the bot of vandalism, an outrageous allegation which was repeated even after the rude Marvelite was pointed to WP:NOTVAND.  Not one of the tag-teaming 12 Angry Marvelites objected to that smear, but instead I was attacked for asking for a retraction.  And Indagate may not have been actually rude, but it was not civil for Indagate to waste the time of other editors by a silly claim that 9 to 8 amounted to a consensus for their view, and it was not civil for Indagate to falsely claim experience which they actually lacked.
 * I still hope that the Angry Marvelites may learn from this saga, and raise any further concerns more collegially.  Brown HairedGirl  (talk) • (contribs) 20:04, 3 August 2022 (UTC)
 * I have no desire to get into this again, but I'd like to note a few things:
 * This RfC was not a "waste of time" to address our concerns. The lack of consensus in the prior discussion left us with no other option to resolve the dispute, unless you would have preferred that the discussion drag on forever and Citation bot's edits continue to be reverted, or the bot be blocked outright on a large number of pages. I too am deeply saddened that you continue to regard our concerns as petty and insignificant.
 * Just because a group of editors has more "experience" does not mean the opinions of editors with less experience do not matter. We did not refuse to listen to you, we provided very clear counterclaims (magazine websites have different content, etc.) and questioned why Citation bot should continue making these changes despite there being no consensus at the discussion. Do you recall what response we received? "Online magazines are magazines." (Not the point, we're talking about magazine websites.) "There is consensus because most edits are not reverted." (We're talking about consensus in the discussion, not in CB's edits.) And for the record, some of the editors who disagreed with you had comparable levels of experience, so your repeating that you have "more experience" is beginning to feel condescending.
 * Darkwarriorblake is not a "Marvelite". I've almost never seen them pop up on a Marvel-related article, nor are they part of our taskforce.
 * it was not civil for Indagate to waste the time of other editors by a silly claim that 9 to 8 amounted to a consensus for their view – 9 to 8? Really? Unless my counting is significantly off, the participants of the prior discussion were pretty evenly split. If it really were 9 to 8 I wouldn't have called for an RfC, because consensus would have been clear in that case.
 * You continue to admonish us for being rude and uncivil, yet you're once again attacking Indagate, who did nothing wrong. Yes, maybe they shouldn't have claimed experience based on their account age, but that's not the definition of "uncivil", which implies malice. Mind retracting that statement?
 * To repeat, I do not want another lengthy debate, so I won't reply further. Also, I'd like to thank Sideswipe9th for their efforts as well. But I'll close my comment with this: while it's totally appropriate to reflect on the results of the RfC, do you really think it's appropriate to dance on someone's grave with this patronizing subsection of yours? InfiniteNexus (talk) 05:56, 4 August 2022 (UTC)
 * @InfiniteNexus: there is no grave to dance on. Nobody has died or been banned or blocked.
 * Your choice to reply with absurdly dramatic hyperbole is another reminder of how unpleasant the whole saga was made by the 12 Angry Marvelites. And no, I will not alter my view that making false claims to try to sway a discussion is uncivil.   Brown HairedGirl  (talk) • (contribs) 07:14, 4 August 2022 (UTC)

Pages vs Issue vs at vs ...
The edit at Aberration_(astronomy) changed the pages= entry to the issue number (e.g., A91), removing the page range (e.g. 1-6) of the article. This happened several times. I recently corrected the pages= entries.
 * You're actually reintroducing errors the bot is fixing (full proper fix is here, there were some GIGO issues). A&A and similar journals use an 'article ID' instead of a page number. This is what goes in page and how people cite these journals. There are no issue numbers for those. Headbomb {t · c · p · b} 17:07, 31 July 2022 (UTC)
 * I'm puzzled by this usage. There are clearly page numbers in the pdf version of the article; where should they be provided if not in the page(s)= entry of the citation template.  The article ID seems analogous to an issue number. Could you point me to appropriate reference supporting this strange usage of the citation template. --SteveMcCluskey (talk) 18:08, 31 July 2022 (UTC)
 * OK, I see that you're using the page= value to place the ArticleID without parentheses, which are used for issue numbers. It's a bit of a kludge but it reproduces the style used in AandA. However, it leaves no place for the page numbers (which this historian of science finds standard for a complete citation). I'm not going to make a big deal of it. --SteveMcCluskey (talk) 18:31, 31 July 2022 (UTC)
 * So when I tag those citations with Page needed, what is the suggested remedy? SamuelRiv (talk) 18:51, 31 July 2022 (UTC)
 * Don't misuse parameters to hold something that they are not intended to hold. An article number is not an in-source location so page, pages, and at are not the right place for an article number.  cs1|2 does not have any support for article numbers.  I think that there has been some discussion at Help talk:Citation Style 1 but never to the point of action.  Quite often I see article numbers in number (an alias of issue) which, really, is also incorrect.  I suppose that if you  include an article number number is the best place for it pending some decision to add support for article numbers – but don't be surprised when someone like me comes along and removes it.
 * —Trappist the monk (talk) 19:07, 31 July 2022 (UTC)
 * Perhaps use the id parameter. -- Red rose64 🌹 (talk) 22:41, 31 July 2022 (UTC)
 * id is for bibliographic identifiers like Zbl etc... (when they aren't natively supported), not the article number. page is perfectly fine for it, though if you want to be an überpedant, at is probably what you're looking for. Headbomb {t · c · p · b} 22:46, 31 July 2022 (UTC)
 * Trappist has already stamped on at. I'm trying to find a parameter which would not be misused. -- Red rose64 🌹 (talk) 23:33, 31 July 2022 (UTC)
 * I think Trappist is wrong. For journals that identify articles by article numbers rather than by page numbers, the article number must be included in the citation, in the position that the page numbers would go for page-number-using journals, and page= or at= are the only ways to put it there. id= is no good because it puts the key article-identifying information in the wrong part of the citation. —David Eppstein (talk) 23:50, 31 July 2022 (UTC)
 * Trappist is wrong indeed. Headbomb {t · c · p · b} 00:15, 1 August 2022 (UTC)
 * I've been using id for press release numbers but reading this I think that may have been wrong. Perhaps I should have been using number instead, but there is no documentation to support this. Hawkeye7   (discuss)  01:26, 1 August 2022 (UTC)
 * Yes. Article numbers go in the same slot as page numbers; the journals themselves even do it that way (Philosophical Transactions of the Royal Society springs to mind as an example that I've seen recently). XOR'easter (talk) 16:09, 1 August 2022 (UTC)
 * I'm pleased to see that I wasn't the only editor disturbed by changes involving article IDs. Articles with article IDs also have page numbers within the article. There should be a separate place for both within the citation template. It's beyond my abilities but those involved in maintaining the citation templates should be asked to deal with this new citation style, which seems to be proliferating in the science journals. --SteveMcCluskey (talk) 21:06, 1 August 2022 (UTC)
 * Standard practice is to use brackets A9 [17] if you want to refer to a specific page/page range. Headbomb {t · c · p · b} 21:10, 1 August 2022 (UTC)
 * Another convention I've seen (working better when the article numbers are short) is to use artnum:firstpage–artnum:lastpage as the range of pages, so Headbomb's example would become A9:1–A9:17. That's how they're numbered by LIPIcs (an open-access publisher of many computer science conference proceedings), for instance. —David Eppstein (talk) 07:22, 4 August 2022 (UTC)

Suggestion: convert Google books URLs to GBurl templates
Instead of modifying Google books URLs, use the GBurl template where the search parameters are encompassed by the template. This change may need an RFC. Possibly the Citer tool (python) may provide more documentation on the parameters of the Query part of the URL, for which there doesn't seem to be easily findable documentation. In *Template.php:expand_by_google_books_inner* the current code doesn't seem to cope with the munged title now being part of the string, so that may be worth enhancing.

 if (preg_match("~^https?://www\.google\.(?:[^\./]+)/books/edition/_/(.+)$~", $url, $matches)) {

Here's some perl snippets since I'm not PHP literate.

 use URI;
 use URI::Split qw(uri_split uri_join);

 LINE: while (<>) {
     chomp;
     next if /^$/;
     if (/^\s*#/) { print $_, "\n"; next; }
     my($uri) = URI->new($_);
     my($sch, $aut, $pth, $qry, $frg) = uri_split($uri);
     if (defined($sch) && ($sch eq 'http' || $sch eq 'https')
         && defined($aut) && defined($pth)) {
         my($template);
         substr($aut, -3, 3) = '' if substr($aut, -3) eq ':80';
         ...
         if ($aut =~ /^www\.google\.c(?:a|o(?:m|m\.au|\.uk|\.in))$/
             && $pth =~ m|^/books/edition/[^/]+/([-A-Za-z0-9_]{12})$|) {
             my($gb_arg) = $1;
             foreach my $gb_restrict (split('&', $qry)) {
                 if ($gb_restrict =~ /^(pg|dq)=/) {
                     if (substr($gb_restrict, 0, 5) eq 'pg=PA') {
                         $gb_arg .= '|p='. substr($gb_restrict, 5);
                     }
                     else {
                         $gb_arg .= '|'. $gb_restrict;
                     }
                 }
                 elsif ($gb_restrict =~ /^bsq=(.+)/) {
                     $gb_arg .= '|dq='. $1;
                 }
                 # Safe to skip: hl= (Normally en), gbpv=
                 # printsec= - skip only if there's a pg=
                 # Shouldn't convert if there's anything else?
             }
             $template = "";
         }

--- Example URLs

 https://www.google.com.au/books/edition/D%C3%BCsenj%C3%A4ger/r_cuAAAACAAJ?hl=en
 https://www.google.com.au/books/edition/%EC%8A%A4%ED%83%80%ED%83%80%EC%9D%B4%EB%93%9C_%EB%9D%BC%EC%9D%B4%EC%A7%95_2_%EC%96%91%EC%9E%A5%EB%B3%B8_Ha/Cn_9jgEACAAJ?hl=en
 https://www.google.com.au/books/edition/Machiavelli_The_Prince/05R7kYOKD0cC?hl=en&gbpv=1&dq=Cambridge%2520University&pg=PT12&printsec=frontcover
 https://www.google.com.au/books/edition/Machiavelli_The_Prince/05R7kYOKD0cC?hl=en&gbpv=1&pg=PR4&printsec=frontcover&dq=Cambridge
 https://www.google.com.au/books/edition/Machiavelli_The_Prince/05R7kYOKD0cC?hl=en&gbpv=1&pg=PT12&printsec=frontcover&dq=Cambridge%20University
 https://www.google.com.au/books/edition/The_Existential_Graphs_of_Charles_S_Peir/Q4K30wCAf-gC?hl=en&gbpv=0
 https://www.google.com.au/books/edition/The_Existential_Graphs_of_Charles_S_Peir/Q4K30wCAf-gC?hl=en&gbpv=1&pg=PA113&printsec=frontcoverG
 https://www.google.com.au/books/edition/The_Golden_Enclaves/7kRYEAAAQBAJ?hl=en
 https://www.google.com.au/books/edition/When_Computers_Went_to_Sea/Mi8MhzheOokC?hl=en&gbpv=0
 https://www.google.co.uk/books/edition/When_Computers_Went_to_Sea/Mi8MhzheOokC?hl=en&gbpv=0
 https://www.google.co.in/books/edition/Conflict_and_Conquest_in_the_Islamic_Wor/jBBYD2J2oE4C?hl=en&gbpv=1
 https://www.google.com/books/edition/Playwriting_for_Profit/jhwLAAAAMAAJ?hl=en&gbpv=1&bsq=%22show%20not%20tell%22
 https://www.google.com/books/edition/_/A-c7AAAAIAAJ?hl=en&gbpv=1&pg=PA419&dq=%27Barnstead%20charter%27 Provincial
 https://www.google.com.au/books/edition/Provincial_and_State_Papers/A-c7AAAAIAAJ?hl=en&gbpv=1&dq='Barnstead%20charter'%20Provincial&pg=PA419&printsec=frontcover
 https://www.google.ca/books/edition/Principes_d_exp%C3%A9ditive_fran%C3%A7aise_pour/FgVJ55_weywC?&gbpv=0

No, this is a bad idea, especially for a bot. --Izno (talk) 01:53, 15 August 2022 (UTC)

Postscript again
This was marked as fixed in February 2020 – either that was in error or it's subsequently been broken again. – Arms & Hearts (talk) 16:45, 5 August 2022 (UTC)
 * postscript specifies the citation's terminal character. For cs1 templates, the default character is a dot; for cs2 templates, the default is no terminal punctuation.  When the value assigned to postscript has more than one character, Module:Citation/CS1 emits a CS1 maint: postscript message and adds the article to Category:CS1 maint: postscript.  At, both cs1 templates have &amp;nbsp;– which Module:Citation/CS1 sees as 7 characters.  The exception to this one-character limit is the keyword   which suppresses the normal cs1 terminal dot (not allowed in cs2 templates because redundant).  Consider writing:
 * for cs2 templates:
 * —Trappist the monk (talk) 19:01, 5 August 2022 (UTC)
 * Thanks Trappist the monk, now fixed in the article as per. I see that the documentation for cite journal and cite web both say, but that doesn't seem to have happened here. – Arms & Hearts (talk) 12:17, 6 August 2022 (UTC)
 * Are you sure? When I look at this citation in your revert of the bot edit, I see both maintenance messages.  Remember that maintenance messages are hidden by default.  Editors who wish to see the maint messages must enable them (see ).
 * You came to this discussion with the complaint that the bot [removed] a necessary non-breaking space yet in this edit you did the same thing?
 * —Trappist the monk (talk) 13:13, 6 August 2022 (UTC)
 * I didn't realise maintenance messages were opt-in – from the documentation I assumed it would be like the red "Text 'Foo' ignored" and similar messages that I see without having to enable anything. Re the space, it only needed to be a non-breaking space while in the postscript parameter, as a leading space is ignored otherwise. If it's not in the template a normal space is fine (but per MOS:DASH en dashes still need to be spaced). – Arms & Hearts (talk) 14:19, 6 August 2022 (UTC)
 * cs1|2 emits preview messaging in the at the top of a previewed page which advertises the existence of cs1|2 error and maintenance messages and links to the instructions that describe how to display or hide those messages.
 * The heading in MOS:DASH says (in part):
 * Ideally, use a non-breaking space before the en dash, which prevents the en dash from occurring at the beginning of a line (markup: or  or  )
 * There is no reason to believe that a citation in won't wrap at a place that would place the en dash at the beginning of a new line; where a wrap occurs depends on user screen size, window size, font size, zoom level, ...
 * —Trappist the monk (talk) 14:57, 6 August 2022 (UTC)
 * There are sometimes good reasons not to use non-breaking spaces, i.e. to minimise potentially confusing and easily breakable code for the benefit of newer editors using the source editor. I assume that's why that bit of the guideline's framed as it is ("ideally") and isn't that commonly adhered to. But no objection to using them in this case. – Arms & Hearts (talk) 15:42, 6 August 2022 (UTC)
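For readers following the character count in this thread, here is a minimal Python sketch of a one-character check. The function name and the `exempt` parameter are hypothetical, and this is not Module:Citation/CS1's actual code. It only assumes that the value is measured as written in the wikitext, so the entity `&nbsp;` contributes six characters and the en dash one.

```python
import html


def postscript_too_long(value: str, exempt: frozenset = frozenset()) -> bool:
    """Return True when a postscript value exceeds one character.

    Assumption: length is measured on the raw wikitext string. `exempt` is a
    hypothetical hook for keyword values that are allowed to be longer.
    """
    if value in exempt:
        return False
    return len(value) > 1


raw = "&nbsp;–"                 # the value discussed in the thread
decoded = html.unescape(raw)    # non-breaking space followed by an en dash

print(len(raw), postscript_too_long(raw))
print(len(decoded), postscript_too_long(decoded))
```

Under these assumptions the raw value has seven characters, matching the count quoted above. Decoding the entity first would still leave two characters, so the value would be flagged either way.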

Adds broken DOI
This has been discussed in the past, and there was a very strong agreement that adding these was a good idea. The DOI can be found via google, etc. and that makes it a useful identifier, even if broken. Also, it encourages people to report and get them fixed. Note that the dx.doi.org reporting method does not necessarily work often, since the DOI might be owned by a different company than the journal owner. I have actually reached out to various publishers and gotten many of them fixed. AManWithNoPlan (talk) 13:08, 15 August 2022 (UTC)
 * We do remove/block some DOIs, such as the Angle Orthodontist DOIs, since the old ones will never be registered (I have verified that with the publisher via email). AManWithNoPlan (talk) 13:39, 16 August 2022 (UTC)

Cleanup of date
Will do, if not book, and if no existing date, and id year matches any existing year. Now to write the code, and unit tests. AManWithNoPlan (talk) 11:35, 13 June 2022 (UTC)
 * Please make this work only with the volume parameter, and not title.
 * A date in the title may be genuinely part of the title, e.g. "Hospital opening delayed to August 2023". -- Brown HairedGirl  (talk) • (contribs) 14:48, 13 June 2022 (UTC)
 * I am still not sure how to distinguish dates and Issue numbers/names that are dates. AManWithNoPlan (talk) 18:55, 22 June 2022 (UTC)
 * Issue is for issue number, not the issue date/issue name.
 * should be converted to
 * Same in cite journal, etc. Headbomb {t · c · p · b} 04:31, 23 June 2022 (UTC)
 * Please remove year if you add date in this case. Izno (talk) 05:15, 23 June 2022 (UTC)
 * It will take time to get these right, so I will program in tests before implementing it. AManWithNoPlan (talk) 13:17, 24 June 2022 (UTC)
 * How common is this problem? AManWithNoPlan (talk) 13:13, 12 August 2022 (UTC)

Double slashes in DOIs
I will look into doing that. I have run across at least one DOI where the first character in the suffix is a slash. Absolutely nuts. AManWithNoPlan (talk) 13:47, 18 August 2022 (UTC)
 * I have fixed this and am now running the bot over those pages. Here is a better REGEX.  Note that the difference is that it first limits to pages with the phrase "doi" on them.  This shrinks the number of pages to REGEX so you get more hits.  https://en.wikipedia.org/w/index.php?title=Special:Search&limit=500&offset=0&ns0=1&ns118=1&search=insource%3Adoi++insource%3A%2F10%5C.%5B0-9%5D%5B0-9%5D%5B0-9%5D%5B0-9%5D%5B0-9%5D%3F%5C%2F%5C%2F%2F. AManWithNoPlan (talk) 14:18, 18 August 2022 (UTC)
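As an illustration of what the search pattern above matches, here is a rough Python sketch. It is not the bot's PHP code. The function names are made up for this example, and the sample DOIs are invented. Because at least one real DOI has a suffix that starts with a slash, the second function only proposes a candidate that would still need checking against the registry.

```python
import re

# Same shape as the search regex quoted above: "10." followed by four or
# five digits and then two consecutive slashes.
DOUBLE_SLASH_DOI = re.compile(r"^10\.[0-9]{4,5}//")


def has_double_slash(doi: str) -> bool:
    """True when the registrant prefix is followed by a doubled slash."""
    return DOUBLE_SLASH_DOI.match(doi) is not None


def candidate_fix(doi: str) -> str:
    """Collapse the doubled slash after the prefix into a single slash.

    This is only a candidate: a DOI whose suffix really begins with "/"
    would be damaged by it, so the result should be verified before use.
    """
    return re.sub(r"^(10\.[0-9]{4,5})//+", r"\1/", doi)


for doi in ("10.1000//example.1", "10.1000/example.1"):
    print(doi, has_double_slash(doi), candidate_fix(doi))
```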

403 unauthorized
The site in question seems to have some kind of browser validation. Perhaps it doesn't like Citation bot and directs it to a different page than normal users. Jc3s5h (talk) 15:45, 19 August 2022 (UTC)
 * Thank you for reporting. Interesting that this is the "title" for the page, since the page sends that as the valid title instead of sending an actual 403 code.  AManWithNoPlan (talk) 17:18, 19 August 2022 (UTC)
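Since the page described here returns its error text as an ordinary title instead of an HTTP 403 status, a status-code check alone would miss it. Below is a hedged Python sketch of a title-based guard. The pattern, the phrase list, and the function name are assumptions chosen for illustration, not what Citation bot actually does.

```python
import re

# Hypothetical pattern: an HTTP-style 4xx/5xx number followed by a common
# error phrase, optionally preceded by the word "error".
ERROR_TITLE = re.compile(
    r"^\s*(?:error\s+)?[45]\d{2}\s*[-:]?\s*"
    r"(?:unauthorized|forbidden|not found|access denied|service unavailable)\b",
    re.IGNORECASE,
)


def looks_like_error_title(title: str) -> bool:
    """Guess whether a fetched page title is really an error message."""
    return ERROR_TITLE.search(title) is not None


for title in ("403 unauthorized", "404 Squadron history"):
    print(repr(title), looks_like_error_title(title))
```

Requiring an error phrase after the number keeps genuine titles that merely start with a three-digit number from being rejected, at the cost of missing unusual wordings.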

Suggestions
Is it possible to add ghostarchive.org and archive.ph alongside the current website/(s) this bot uses? Jenkowelten (talk) 08:35, 19 August 2022 (UTC)


 * . Hopefully, no CAPTCHA or such will block us. AManWithNoPlan (talk) 13:27, 19 August 2022 (UTC)

Adding a blank "title" parameter
script-title was not noticed and so a title was added, but then it was removed for being basically the same. AManWithNoPlan (talk) 23:11, 19 August 2022 (UTC)

Shouldn't NYT render as newspaper, and be linked, instead of work (with no link)? (bug or improvement?)
Whether to link a work is a page-specific choice. As for work versus newspaper, work is (effectively) an alias of the other without possibly mis-stating what the value of the parameter actually is. In CS1, it is sufficient. (In CS2, it would require a bit more work than to use work in general.) --Izno (talk) 04:14, 1 August 2022 (UTC)
 * I see. What does that choice typically depend on? — Guarapiranga ☎ 07:50, 1 August 2022 (UTC)
 * Couldn't say since I feel no inclination to do so in my work, but either way it's not an appropriate bot action. :) Izno (talk) 15:54, 1 August 2022 (UTC)
 * Can you point me to any policy on this, ? I couldn't find it on WP:REF. — Guarapiranga ☎ 05:57, 2 August 2022 (UTC)
 * WP:CITEVAR is the policy. Headbomb {t · c · p · b} 05:59, 2 August 2022 (UTC)
 * "this"? That it is a per-page matter? WP:DUPLINK is a good start; Citations stand alone in their usage, so there is no problem with repeating the same link in many citations within an article; e.g. The Guardian. However, this is in the overall context of "don't make duplicate links", so editors should have local control over which works to link in citations, and whether to link them in multiple or not at all. Izno (talk) 06:01, 2 August 2022 (UTC)
 * There's also collected bibliographies (structured lists of works cited, usually only the first instance is linked) vs unstructured lists (where often all instances are linked). The bot can't distinguish between these. Headbomb {t · c · p · b} 06:05, 2 August 2022 (UTC)
 * Absolutely, but is there a downside to CITEBOT suggesting notable newspapers, magazines, news channels, etc, to be linked by default? — Guarapiranga ☎ 09:20, 2 August 2022 (UTC)
 * Violations of WP:CITEVAR and some of the worst infighting you'll see on Wikipedia, with bot blocks, and weeks/months of wasted time on RFCs. Headbomb {t · c · p · b} 12:00, 2 August 2022 (UTC)
 * Are you referring to this, ? I didn't suggest existing citations, just simply linking the source (in the RS sense of source, not the actual cited work), when generating a new one. Whether this should be included in a requested run, is another question. I'd say yes, if there are other improvements, similarly to what the bot does renaming parameters (from first and last to first1 and last1, for example). — Guarapiranga ☎ 03:41, 12 August 2022 (UTC)
 * Linking the source is a change in citations. People got blocked for edit warring over it. This is not a good task to be done by bots. Headbomb {t · c · p · b} 05:50, 12 August 2022 (UTC)
 * I'm talking about the citation tool in the editor. Am I asking at the wrong place? — Guarapiranga ☎ 22:48, 13 August 2022 (UTC)
 * Yes, you are. Izno (talk) 01:53, 15 August 2022 (UTC)
 * (Which doesn't make your suggestion doable or desirable in that context .) Izno (talk) 01:53, 15 August 2022 (UTC)

recognize Foobar and Barfoo / Foobar & Barfoo as equivalent in journal/publisher
Same for Foobar / The Foobar. Headbomb {t · c · p · b} 20:27, 19 August 2022 (UTC)
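One way the requested equivalence could be sketched, in Python and not the bot's PHP, is to reduce both names to a comparison key. The helper names below are hypothetical, and the rules cover only the two cases raised in this request: "&" versus "and", and a leading "The".

```python
import re


def name_key(name: str) -> str:
    """Reduce a journal or publisher name to a key used only for comparison."""
    key = name.casefold().replace("&", " and ")
    key = re.sub(r"\s+", " ", key).strip()   # collapse runs of whitespace
    return re.sub(r"^the\s+", "", key)       # drop a leading article


def same_name(a: str, b: str) -> bool:
    """True when two names differ only by '&'/'and', case, or a leading 'The'."""
    return name_key(a) == name_key(b)


print(same_name("Foobar and Barfoo", "Foobar & Barfoo"))
print(same_name("Foobar", "The Foobar"))
print(same_name("Foobar", "Barfoo"))
```

The key is meant for matching only; the value already present in the citation would be left as written.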

Bot removed archive-url
Yes, this is GIGO, and this sort of junk is tricky to handle. But I think that it might be possible to fix this particular piece of garbage rather than making things worse. -- Brown HairedGirl  (talk) • (contribs) 10:59, 21 August 2022 (UTC)


 * Surprising that having an archive-date without an archive-url is not reported as a CS1 error. User-duck (talk) 12:54, 21 August 2022 (UTC)
 * The bot should not remove the URL. It is a separate malformed parameter. User-duck (talk) 12:54, 21 August 2022 (UTC)

Replacement characters are being put into citation titles.
About 200 new entries appeared in Category:CS1 errors: invisible characters. This is an example: International Young Design Entrepreneur of the Year

It apparently occurred with this edit: " curprev 23:03, August 19, 2022‎ Citation bot talk contribs‎ 22,440 bytes +778‎  Alter: title. | Use this bot. Report bugs. | Suggested by BrownHairedGirl | #UCB_webform 2196/3663 undo ". — Preceding unsigned comment added by User-duck (talk • contribs)


 * An actual diff would be more useful than copy-pasting the thing you see on your watchlist. Headbomb {t · c · p · b} 03:32, 21 August 2022 (UTC)


 * Is this what you want: https://en.wikipedia.org/w/index.php?title=International_Young_Design_Entrepreneur_of_the_Year&type=revision&diff=1105372455&oldid=1093163191 . "diff" means something different to a programmer. User-duck (talk) 12:44, 21 August 2022 (UTC)
 * PS: The title in the archive page is "Don’t trash this". The apostrophe is probably an extended version of a single quote. (My browser's "Find" no longer detects this difference.) User-duck (talk) 12:44, 21 August 2022 (UTC)


 * Ah dreaded replacement characters. Took me a while to figure out how to detect those. If you want my solution which is not pretty but is reliable let me know. PHP might have better support than my language does. -- Green  C  03:52, 21 August 2022 (UTC)


 * Even CrossRef gives us those sometimes. We accept a certain amount of them since often you know that f< >r is some language's word for for.  I have added a bunch of new code to detect character sets and such, so this might be fixed already.  Depending upon when an archive was made, web.archive.org formats their html very differently.  AManWithNoPlan (talk) 11:40, 21 August 2022 (UTC)
 * With archived pages, is the bot's new code smart enough to tell when the archive site itself and the page being archived use different character sets/encodings (and switch between charsets/encodings accordingly)? Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 13:34, 21 August 2022 (UTC)

There are over 300 articles in the category now, I have been fixing CS1 errors for some time now. This is the first time I have seen so many new articles in this category so I knew it was the result of a bot run. Either the bot has changed or the input to the bot is garbage. User-duck (talk) 12:44, 21 August 2022 (UTC)


 * A bunch more fixes implemented. AManWithNoPlan (talk) 18:35, 21 August 2022 (UTC)
 * Special bot run that fixes these. https://en.wikipedia.org/w/index.php?title=Minamib%C5%8Ds%C5%8D&diff=prev&oldid=1105777971 AManWithNoPlan (talk) 19:19, 21 August 2022 (UTC)
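To show how a replacement character can end up in a title, here is a small Python sketch. It assumes, purely for illustration, that the archived page was stored as windows-1252 but decoded as UTF-8; the thread does not establish the actual cause. The function names and the fallback order are hypothetical and are not the bot's implementation.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def has_replacement_char(title: str) -> bool:
    """True when a decoded title contains U+FFFD."""
    return REPLACEMENT in title


def decode_title(raw: bytes, declared: str = "utf-8") -> str:
    """Try the declared charset strictly, then windows-1252, then give up.

    The fallback order is an assumption made for this sketch.
    """
    for encoding in (declared, "windows-1252"):
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue
    return raw.decode("utf-8", errors="replace")


# The title from the report, with a curly apostrophe, stored as windows-1252.
raw = "Don\u2019t trash this".encode("windows-1252")

print(has_replacement_char(raw.decode("utf-8", errors="replace")))
print(has_replacement_char(decode_title(raw)))
```

A check like `has_replacement_char` could also serve as a last-line guard before a title is written into a citation, whatever decoding strategy is used upstream.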

gadget
Ah -- my browser runs with NoScript enabled. The bot would not work until I enabled Javascript from citations.toolforge.org to run in my browser. After I enabled that in NoScript, it worked. This was not obvious from the documentation of how to enable CitationBot. Gnuish (talk) 22:39, 24 August 2022 (UTC)

Hebrew sources fault
Hello. I've noticed that the bot adds titles like ??????? ???? ??? when fetching Hebrew source titles (e.g. this). Not sure if this is a fault specific to Hebrew, but seen it on a few articles. Cheers, Number   5  7  20:53, 22 August 2022 (UTC)


 * https://en.wikipedia.org/w/index.php?title=User%3AAManWithNoPlan%2Fsandbox5&diff=prev&oldid=1106163921 Fixed. That was obscure PHP failure.  AManWithNoPlan (talk) 12:38, 23 August 2022 (UTC)
 * If anyone finds any others, please report them. PHP is not 100% perfect on all character sets.  I believe most of these have now been fixed.  On rare occasions, the actual web archive has the title of ???????? which usually means try to find/make a better archive. AManWithNoPlan (talk) 16:24, 23 August 2022 (UTC)
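Titles made entirely of question marks typically come from converting text into a character set that cannot represent it. The Python sketch below reproduces the effect with ASCII as the target and adds a simple guard. Both function names are hypothetical, the Hebrew sample is an arbitrary word, and the actual PHP failure mentioned above may have been different.

```python
def lossy_to_ascii(title: str) -> str:
    """Simulate a lossy conversion: unencodable characters become '?'."""
    return title.encode("ascii", errors="replace").decode("ascii")


def looks_lossy(title: str) -> bool:
    """True when a title is nothing but question marks and spaces."""
    stripped = title.replace(" ", "")
    return len(stripped) >= 3 and set(stripped) == {"?"}


hebrew = "\u05e9\u05dc\u05d5\u05dd"   # a four-letter Hebrew word
print(lossy_to_ascii(hebrew))
print(looks_lossy("??????? ???? ???"))
print(looks_lossy("Why?"))
```

A guard like this cannot recover the title. It only signals that the fetched value should be discarded, or that a better archive is needed when the archive itself stores the question marks.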

Replaces legit citations with Chinese links to porno sites

 * The bot did nothing that wasn't already there. The link still is there too. Headbomb {t · c · p · b} 08:01, 25 August 2022 (UTC)

cite web and twitter
Apparently Citation bot can get the author and the tweet so why didn't it use separate parameters Matthew Green and The pragmatist in me says ...? And where did  come from? That shortened url isn't part of the original or the archived tweets – the url doesn't point to anything of value (unless a poodle in a chair has value...). It is interesting to me that the author text in title has the non-English 'Twitterren' (Google translate thinks that it's Basque for 'on Twitter' which is what I normally see: Matthew Green on Twitter: "The pragmatist in me says..."). This odd use of 'Twitterren' is what provoked me to investigate the history of this citation.

—Trappist the monk (talk) 15:30, 25 August 2022 (UTC)


 * This is from the web archive title. AManWithNoPlan (talk) 16:22, 25 August 2022 (UTC)

Remove url where identifier already exist for the same thing
The bot should treat https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC6077754/ the same way as https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6077754/, i.e. remove it from a cite journal if 6077754 is already present. Jonatan Svensson Glad (talk) 04:59, 31 August 2022 (UTC)

AManWithNoPlan (talk) 15:36, 31 August 2022 (UTC)
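A short Python sketch of the request, treating the `/labs/pmc/` and `/pmc/` URL forms alike. The regular expression and function names are illustrative assumptions, not the bot's PHP code.

```python
import re
from typing import Optional

# Accepts both URL forms quoted in the request; "labs/" is optional.
PMC_URL = re.compile(
    r"^https?://www\.ncbi\.nlm\.nih\.gov/(?:labs/)?pmc/articles/PMC(\d+)/?$"
)


def pmc_id_from_url(url: str) -> Optional[str]:
    """Return the numeric PMC identifier embedded in a URL, if any."""
    match = PMC_URL.match(url)
    return match.group(1) if match else None


def url_is_redundant(url: str, pmc: str) -> bool:
    """True when the URL points at the same article as an existing pmc value."""
    existing = re.sub(r"^PMC", "", pmc.strip(), flags=re.IGNORECASE)
    return existing != "" and pmc_id_from_url(url) == existing


for url in (
    "https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC6077754/",
    "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6077754/",
):
    print(pmc_id_from_url(url), url_is_redundant(url, "6077754"))
```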

Update arxiv bibcodes
Any bibcode that starts with ####arXiv... should be updated when possible. Headbomb {t · c · p · b} 13:05, 28 August 2022 (UTC)
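Finding the affected bibcodes is the easy half of this request and can be sketched in Python as below. The function name is hypothetical and the arXiv-style sample value is invented for illustration. Replacing such a bibcode with the published one would need a lookup against the bibliographic database, which this sketch does not attempt.

```python
import re

# "####arXiv": a four-digit year immediately followed by the literal "arXiv".
ARXIV_BIBCODE = re.compile(r"^\d{4}arXiv")


def is_arxiv_bibcode(bibcode: str) -> bool:
    """True when a bibcode refers to the arXiv preprint, not a journal."""
    return ARXIV_BIBCODE.match(bibcode.strip()) is not None


print(is_arxiv_bibcode("2013arXiv1301.1234A"))   # invented preprint bibcode
print(is_arxiv_bibcode("1960Natur.186Q.211."))   # journal bibcode from this page
```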

Okina replaced by apostrophe
The original citation at incorrectly uses a left single quotation mark U+2018 where okina U+02BB was intended. This was corrected when the citation was added to the article. The bot should have left this alone. GA-RT-22 (talk) 21:44, 30 August 2022 (UTC)
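The two characters in this report look alike but are distinct code points with different Unicode categories, which is what a quote-normalising step could key on. The Python sketch below is illustrative only: the function name is hypothetical, the sample word is an arbitrary example, and the bot's real quote handling is not shown in this thread.

```python
import unicodedata

OKINA = "\u02bb"              # a letter, used in Hawaiian orthography
LEFT_SINGLE_QUOTE = "\u2018"  # punctuation

# Only true quotation marks are mapped; U+02BB is deliberately absent.
CURLY_TO_STRAIGHT = {0x2018: "'", 0x2019: "'"}


def straighten_quotes(text: str) -> str:
    """Replace curly single quotes with apostrophes, leaving the okina alone."""
    return text.translate(CURLY_TO_STRAIGHT)


for char in (OKINA, LEFT_SINGLE_QUOTE):
    print(f"U+{ord(char):04X}", unicodedata.name(char), unicodedata.category(char))

print(straighten_quotes("Hawai\u02bbi") == "Hawai\u02bbi")
```

Restricting the mapping to characters whose category is punctuation is one way to keep letters such as the okina from being rewritten.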

Mislabeling Associated Press and Reuters as a "work" rather than an "agency"
Not a bug. See the template documentation.

—Trappist the monk (talk) 23:03, 25 June 2022 (UTC)
 * What TMK is referring to is the statement "Do not use for sources published on the agency's own website; e.g. apnews.com or reuters.com; instead, use work or publisher". Using the work parameter presents in italics while publisher does not. I wonder why the indifference. I suggest we use the publisher parameter (again, for consistency with "agency" styling). Dawnseeker2000  23:25, 25 June 2022 (UTC)

I agree with Trappist. In these cases, the news agency is acting as a publisher rather than as an agency. Neutral on whether to use work or publisher. -- Brown HairedGirl  (talk) • (contribs) 13:03, 26 June 2022 (UTC)
 * When the publisher and work are the same, publisher is not usually used. AManWithNoPlan (talk) 14:29, 26 June 2022 (UTC)
 * Associated Press and Reuters are not works though. They are agencies/publishers, and should not be converted to works. Headbomb {t · c · p · b} 15:09, 26 June 2022 (UTC)
 * Review the template documentation. Realize accordingly that your comment here is in direct contradiction to your comments above about Cite magazine. Yes, these are the same exact issue. Izno (talk) 16:40, 26 June 2022 (UTC)
 * If this was directed at me... which contradiction? Dawnseeker2000  00:03, 27 June 2022 (UTC)
 * AP and Reuters are both news agencies, and news works in their own right. When citing the AP or Reuters website directly they are works. When citing a story on another news organisation's website that says that the reporting is from AP or Reuters, then they are acting as an agency. Sideswipe9th (talk) 17:06, 26 June 2022 (UTC)
 * When used as the name of a work (for an article directly from the AP web site), it should have AP News or Associated Press News. That is the name they give to that part of their site, that is, the work. It is incorrect to list AP or Associated Press . That is the name of the organization, not the name of their web site, and should appear in publisher or agency instead. —David Eppstein (talk) 19:51, 26 June 2022 (UTC)
 * Yes, that's what I'm saying. News outlets (regardless of media type) are considered publishers Dawnseeker2000  01:18, 27 June 2022 (UTC)

Dead links
You need to look more closely at the edits. AManWithNoPlan (talk) 20:42, 6 September 2022 (UTC)

Job stalled, won't die
My category run (Category:Genes on human chromosome X) stalled at 11:54 this morning after this edit to item 218 of 542, Collagen, type IV, alpha 5. Around 15:10, I tried to kill the stalled job with https://citations.toolforge.org/kill_big_job.php, which responded with, but, as of now, nearly an hour and a half later, my attempts to start a new category run are still throwing up. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 21:39, 15 September 2022 (UTC)

both hung job and what killed it. AManWithNoPlan (talk) 16:53, 18 September 2022 (UTC)

Another stalled job
Another one of my category runs has stalled (Category:Corrosion inhibitors this time), having made this edit to Benzotriazole (item 26 of 32); the reason I know it's stalled rather than simply having finished running through that category is that, when I attempted to start a new category run, it threw up, which it continued to do even after I tried to kill the stalled job with https://citations.toolforge.org/kill_big_job.php (which obligingly came back with  ). Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 19:36, 21 September 2022 (UTC)
 * You should be unlocked now. No idea why.  I see no error messages.  AManWithNoPlan (talk) 01:15, 22 September 2022 (UTC)
 * And now I'm being locked out again by yet another stalled job (I'm not sure what it is this time!). Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 19:50, 22 September 2022 (UTC)
 * try yet again. AManWithNoPlan (talk) 21:02, 22 September 2022 (UTC)
 * Working now, thanx! Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 21:26, 22 September 2022 (UTC)
 * Actually, scratch that; now I'm being locked out even after the run in question's finished! (Category:Cultural depictions of Xaviera Hollander, in case you were wondering.) Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 21:54, 22 September 2022 (UTC)

Sorry to pester you, but I'm still being locked out of starting a new run most of a day later. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 12:12, 23 September 2022 (UTC)

AManWithNoPlan (talk) 13:43, 23 September 2022 (UTC)
 * Thanx! :-) Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 16:59, 23 September 2022 (UTC)

Change parameter for cite web but not cite news?
Why does it change  to but not  to or am I missing something? Jonatan Svensson Glad (talk) 22:07, 12 September 2022 (UTC)
 * Can you point me to some specific examples? AManWithNoPlan (talk) 12:51, 21 September 2022 (UTC)
 * See diff of the above. Sidenote, it did not remove via when work had content. Jonatan Svensson Glad (talk) 17:48, 22 September 2022 (UTC)

fixed AManWithNoPlan (talk) 19:46, 24 September 2022 (UTC)

Error Adding Category:Articles lacking reliable references

 * Umm, that was you. You included  when you created the article.  That template adds  (see info at the top of the category page).
 * —Trappist the monk (talk) 01:07, 27 September 2022 (UTC)
 * Thank you, Trappist. I missed that connection, but it makes perfect sense. I wanted to keep these entries for a while that were in the previous version of the list to give other editors a chance to provide reliable references that I have so far been unable to find. I concur the bot’s behavior is correct and not a bug. I appreciate the quick response. Skeet Shooter (talk) 01:30, 27 September 2022 (UTC)

wrong doi!
The bot added  bibcode = 1960Natur.186Q.211. to the But the Nature article (the doi) is only the review of the real article (the url) --Stone (talk) 08:54, 29 September 2022 (UTC)


 * The existing S2CID points to Nature https://en.wikipedia.org/w/index.php?title=Iridium&type=revision&diff=1112758294&oldid=1112673378 GIGO AManWithNoPlan (talk) 11:30, 29 September 2022 (UTC)
 * Also the URL is wrong. AManWithNoPlan (talk) 13:06, 30 September 2022 (UTC)

Reuters

 * I don't think this is a news article syndicated by agency Reuters and published by someone else, so the use of |agency=Reuters is incorrect. However, turning it into |work=Reuters is also incorrect. The name of the work is not "Reuters". I think that the correct parameters for this citation would be |work=Reuters Graphics and/or |publisher=Reuters. —David Eppstein (talk) 00:20, 21 September 2022 (UTC)


 * I totally agree. The main issue for me is the italicization of 'Reuters', which is just wrong. At least by utilizing the 'agency' parameter, the name won't be italicized, even though it seems not to be the correct option either. — Mannofthomas (talk) 00:43, 21 September 2022 (UTC)
 * |work= is always italicized, and should be. If Reuters (the company) published this content on a site (work) whose name was also "Reuters", then italicizing it as the name of a work would be correct. The only problem here is that the name of the site is not "Reuters" but instead "Reuters Graphics". —David Eppstein (talk) 00:59, 21 September 2022 (UTC)
 * What I've learned is that not all websites are considered works. This is illustrated by the numerous articles we have about websites that don't employ the template. Also, when an editor considers whether a particular source should be a publisher or work it's helpful to look at things with a focus on the publisher/work relationship. In old media, there was a pretty consistently identifiable relationship between newspapers and their associated publishing companies, but with new media, that is becoming a thing of the past. Going back to the numerous articles we have about websites that do not have their titles italicized: I think that that is correct because of this lack of a publishing company. Yes, websites can have a parent company, but those are not the same as a publishing company, so it can be confusing. But what I think the editors of these articles have realized, as have I, is that if there's no identifiable publishing company then the website itself is the publisher. In my work, I try to be consistent with how I decide whether a source should be italicized or not, and this idea that Reuters should be italicized only sometimes is way too perplexing. Whoever deemed that the right way to go needs to think again. Consider the numerous other news agencies that don't have that rule applied. Also consider that every other news agency or publisher is not italicized. Our own article on Reuters is not even italicized.  Dawnseeker2000  01:57, 21 September 2022 (UTC)
 * The rule is very straightforward. When a name is used as the name of a company or organization (for instance, as a publisher or agency), it is upright Roman. When a name is a name of a collective body of work, like a periodical or an edited volume or a web site, it is italic. Some names are used for both kinds of things and depending on how they are used can be formatted differently. Trying to choose a parameter based on how you would like to see it formatted is a mistake. Choose a parameter based on what kind of thing you are naming and let the formatting handle itself. —David Eppstein (talk) 03:59, 21 September 2022 (UTC)
 * The main thing to understand is that editors are indeed noticing the discrepancy of Reuters being italicized and that there will be more who speak up in the future. Dawnseeker2000  04:14, 21 September 2022 (UTC)
 * There are many places in Wikipedia where details are imperfect and can be improved. But your focus on whether it is italicized or not is not a useful way to think about it. Focus on whether the name Reuters is used to mean the company Reuters or the web site maintained by the company. They are two different things and our citation formatting (when used with the correct parameters, based on meaning rather than intended appearance) helps readers understand which of those two meanings is intended by the name. —David Eppstein (talk) 04:22, 21 September 2022 (UTC)
 * The bot's edit was more-or-less correct. When Reuters (the corporate entity) publishes something on its eponymously named website (the work) then the 'Reuters' name goes in work.  In this specific case, at the bottom of the source is this:
 * Rocket illustrations by Wen Foo
 * Additional work by Ashlyn Still and Travis Hartman
 * Editing by Christine Chan and Thomas Brown
 * REUTERS GRAPHICS
 * so |work=Reuters Graphics is more correct than |work=Reuters. Use |agency=Reuters when Reuters licences another corporate entity to publish a Reuters article in the 'other corporate entity's' work (commonly a newspaper, or other news source).  Example: Reuters grants a licence to The New York Times Company to publish a Reuters article in The New York Times so: |agency=Reuters |work=The New York Times.  |agency= should never be used in the absence of |work= (or an alias) – I wonder if cs1|2 should enforce that; it might reduce the number of occasions that this topic must be discussed...
 * In periodical cs1|2 templates |publisher= should be omitted most of the time.  Do not use |publisher=Reuters to avoid italicizing 'Reuters'.
 * cs1|2 templates create metadata that is used by reference management software (Zotero, for example). The standard that cs1|2 uses does not have support for |publisher= in periodical metadata and no support for |agency= at all.  |agency= is only rendered in the visual.  Those who consume cs1|2 citations through the metadata have no idea where an article comes from if that information is not included in the correct cs1|2 parameter (|work= or an alias).  Do not deprive those consumers of this essential information.
 * —Trappist the monk (talk) 14:03, 21 September 2022 (UTC)
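The enforcement idea floated above (flagging agency when no work or alias is present) could be sketched roughly as follows. This is only an illustration, not how cs1|2 is implemented: the function name `agency_without_work` is hypothetical, and the alias tuple is an assumed, possibly incomplete list of work aliases.

```python
# Assumed (possibly incomplete) list of parameters that act as "work" in cs1|2.
WORK_ALIASES = ("work", "newspaper", "journal", "magazine", "periodical", "website")


def agency_without_work(params):
    """Return True when 'agency' has a value but no work-type parameter does.

    `params` is a plain dict of citation parameter names to string values.
    Hypothetical helper for illustration only.
    """
    has_agency = bool(params.get("agency", "").strip())
    has_work = any(params.get(name, "").strip() for name in WORK_ALIASES)
    return has_agency and not has_work


# A syndicated article: agency plus the work that actually published it.
syndicated = {"agency": "Reuters", "work": "The New York Times"}
# The problematic pattern: agency used on its own.
lone_agency = {"agency": "Reuters", "title": "Some headline"}

print(agency_without_work(syndicated))   # False
print(agency_without_work(lone_agency))  # True
```

A check like this would only flag the citation for human review; it cannot decide whether the name belongs in work or publisher.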

Gimme Some Lovin’
Thanks for the help.😁 Hamsterbird (talk) 01:01, 1 October 2022 (UTC)

Did not fix author
This would not generally be a good solution to the error noted. --Izno (talk) 01:14, 5 October 2022 (UTC)
 * Then what was the purpose of deleting the empty "last" field at all? The proposed action would at least do something useful: although it is not the best solution, it is still a solution, rather than an action that neither fixes anything nor does anything useful, as is the case now. 85.238.101.64 (talk) 01:19, 5 October 2022 (UTC)
 * It would be good for you to join here as an expert. What do you think about the current automated proposal? Would it ease editing in some way by resolving "no last name" automatically instead of placing the page in the appropriate category, or not? 85.238.101.64 (talk) 01:24, 5 October 2022 (UTC)
 * If any author parameters are set, then all empty author parameters are dropped. Same with editors.  The same goes for aliases: for example, if issue is set then number is dropped.  It is all about removing empty extra parameters.  AManWithNoPlan (talk) 14:13, 6 October 2022 (UTC)
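The rule described in the last reply can be illustrated with a minimal sketch. It is not the bot's actual code: the function name `drop_empty_extras`, the author-name stems, and the single alias group shown are all assumptions chosen for illustration.

```python
# Assumed stems for author-type parameters (last1, first2, author3, ...).
AUTHOR_STEMS = ("last", "first", "author")
# Assumed alias groups; only the issue/number pair is mentioned in the discussion.
ALIAS_GROUPS = [("issue", "number")]


def is_blank(value):
    """True for None, empty, or whitespace-only values."""
    return value is None or value.strip() == ""


def drop_empty_extras(params):
    """Return a copy of `params` with redundant empty parameters removed.

    - If any author-type parameter has a value, empty author-type ones are dropped.
    - If any member of an alias group has a value, empty members are dropped.
    Everything else is left untouched. Hypothetical helper for illustration.
    """
    result = dict(params)

    # Strip trailing digits so that "last2" is recognised as an author parameter.
    author_keys = [k for k in result if k.rstrip("0123456789") in AUTHOR_STEMS]
    if any(not is_blank(result[k]) for k in author_keys):
        for key in author_keys:
            if is_blank(result[key]):
                del result[key]

    for group in ALIAS_GROUPS:
        present = [k for k in group if k in result]
        if any(not is_blank(result[k]) for k in present):
            for key in present:
                if is_blank(result[key]):
                    del result[key]

    return result


cleaned = drop_empty_extras(
    {"last1": "Smith", "first1": "", "last2": "", "issue": "3", "number": "", "title": ""}
)
print(cleaned)  # {'last1': 'Smith', 'issue': '3', 'title': ''}
```

Note that an empty last on its own is kept in this sketch, because no other author parameter is set; that matches the behaviour described above, where only empty extras are removed.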

User-activated instance of bot removes disambiguators from publication year
That was me, not the bot. I didn't see harv in the notes or refs and forgot to check in the text. RDBrown (talk) 12:41, 12 October 2022 (UTC)
 * Ok, thanks for clarifying. I'll tag this "not a bug". —David Eppstein (talk) 14:37, 12 October 2022 (UTC)

Edit comment of "Formatted dashes" is not accurate
The bot should only report "Formatted dashes" if the dashes it is formatting were already in place. It shouldn't report this for dashes it added. - UtherSRG (talk) 18:35, 11 October 2022 (UTC)
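One way to express the requested behaviour is to compare each parameter before and after the edit and report dash formatting only when a value that already existed had its hyphens converted. The sketch below is an assumption about how such a check might look; `dashes_were_formatted` and the list of checked parameters are hypothetical and are not taken from the bot's source.

```python
EN_DASH = "\u2013"


def dashes_were_formatted(old_params, new_params, keys=("pages", "page")):
    """Return True only if an existing value had its hyphens turned into en dashes.

    A parameter that was absent (or empty) before the edit never counts,
    so dashes that arrive with newly added data are not reported.
    """
    for key in keys:
        old = old_params.get(key)
        new = new_params.get(key)
        if not old or not new or old == new:
            continue
        if old.replace("-", EN_DASH) == new:
            return True
    return False


# Existing page range reformatted: worth reporting.
print(dashes_were_formatted({"pages": "12-15"}, {"pages": "12\u201315"}))  # True
# Page range newly added by the same edit: not a formatting change.
print(dashes_were_formatted({}, {"pages": "12\u201315"}))  # False
```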

Untitled_new_bug
Earlier this afternoon I ran the bot on several small categories, all now complete. However hours later when I try to run it on a new category I just get "Run blocked by your existing big run." with no explanation as to what that is when the run should have completed ages ago. Timrollpickering (talk) 18:48, 14 October 2022 (UTC)

Error in ref re: comma
That is really odd. The original character is a single quote, Spanish style. AManWithNoPlan (talk) 11:28, 20 October 2022 (UTC)

Small error
Strange error, in this edit Zehnder's first paper was marked with year 2010, even though it is from 1975. Gumshoe2 (talk) 01:11, 23 October 2022 (UTC)


 * The journal put the wrong data in CrossRef

https://search.crossref.org/?from_ui=&q=10.1002%2Fcpa.3160280104 AManWithNoPlan (talk) 11:19, 23 October 2022 (UTC)

Removal of "format" parameter in cs1 citations
Incidentally, it also says it upgraded an ISBN10 to 13 but all it did was add dashes to an ISBN13. --Izno (talk) 19:30, 17 October 2022 (UTC)
 * Removing PDF is correct, as the cite template now knows to display the PDF sign regardless. - UtherSRG (talk) 21:01, 17 October 2022 (UTC)
 * The reason it should not be removed is because PDF displays to everyone, whereas the icon is not available to all users. Izno (talk) 21:07, 17 October 2022 (UTC)
 * (PDF) displays regardless of whether or not format is declared. It can be removed without any difference to the output. Headbomb {t · c · p · b} 23:03, 17 October 2022 (UTC)
 * How does it determine that it's a PDF? If it's from the file extension in the URL, then is it not possible that the automatic determination could miss PDFs that are reached via URLs that don't have that extension?
 * Regardless, it strikes me as odd to go to such lengths to remove a piece of human-verified metadata that is 1) potentially useful and 2) absolutely harmless when it isn't useful, simply because some uses of it are redundant. XAM2175  (T) 00:11, 18 October 2022 (UTC)
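The concern about extension-based detection can be made concrete with a small sketch. It assumes, purely for illustration, that detection looks only at the path of the URL; the thread does not confirm that cs1|2 works this way, and `looks_like_pdf` is a hypothetical name.

```python
from urllib.parse import urlparse


def looks_like_pdf(url):
    """Guess whether a URL points at a PDF from its path extension alone."""
    path = urlparse(url).path
    return path.lower().endswith(".pdf")


# Detected: the path ends in .pdf (case-insensitive, query string ignored).
print(looks_like_pdf("https://example.org/papers/report.PDF?download=1"))  # True
# Missed: the server may return a PDF, but nothing in the URL says so.
print(looks_like_pdf("https://example.org/download?id=123"))  # False
```

Under that assumption, the second kind of URL is exactly the case where a human-supplied format value would carry information the automatic check cannot recover.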

Free journal
* should be marked as free. Example: 10.1210/jendso/bvab049. Jonatan Svensson Glad (talk) 19:26, 24 October 2022 (UTC)

url in |title=

 * Restricting urls to something-url parameters in the citation templates is, as usual, too mechanical, prescriptive, and procrustean to handle real-world citations. It is frequently necessary to put urls in other parameters including id, page, at, etc. The only parameters for which urls should be excluded are the ones for which it is possible to make a link using a different parameter. However, title is one of those otherwise-linkable parameters, so I agree that it should not have urls. —David Eppstein (talk) 18:28, 23 October 2022 (UTC)
 * Sorry about that. I am running a special version of the bot that allows me to override the normal GIGO prevention systems to deal with titles of "archived copy".  Each and every title has to be manually approved.  That was obviously a mistake on my part.  AManWithNoPlan (talk) 14:02, 24 October 2022 (UTC)

Bad journal title
Another instance Special:Diff/1114768642 "journal=Includes:reports from Commissioners, Inspectors and Others" jnestorius(talk) 06:26, 14 October 2022 (UTC)
 * Thank you for reporting. Investigating how to detect this bad meta-data.  AManWithNoPlan (talk) 12:41, 21 October 2022 (UTC)

Duplicates hdl from inside url parameter
On 23 September 2022, on Partridge the Citation Bot ignored the parameter url=, which had a fully functional URL in it, and added hdl=, with a url. The result was that the citation displayed both with the clickable icon for a url, and the full, redundant hdl= url. I had fixed this untidy semiduplication before, but apparently to no avail. Clearly the bot is ignoring the url parameter, even when it is populated. Can this be fixed? Regards, Acad Ronin (talk) 16:58, 21 October 2022 (UTC)
 * Looks like it was 8 Sep 2022: Special:Diff/1109275325 - UtherSRG (talk) 17:07, 21 October 2022 (UTC)
 * This is not a bug. The standard for references is to add identifiers, etc. The more links the better.  AManWithNoPlan (talk) 15:22, 22 October 2022 (UTC)

Big run doesn't stop
I haven't been able to use the bot in a while, apparently there's a big run done for me, but I can't seem to kill it or find what edit it's wanting to do. Headbomb {t · c · p · b} 02:00, 25 October 2022 (UTC)
 * The server got rebooted and corrupted the files. I have deleted everyone's sessions.  AManWithNoPlan (talk) 11:41, 25 October 2022 (UTC)

Template change error

 * And the p changes should have been to pages. --Izno (talk) 20:24, 27 October 2022 (UTC)
 * the page to pages issue. AManWithNoPlan (talk) 21:30, 27 October 2022 (UTC)

bot adds |chapter= to cite journal

 * This appears to be an entry in a book, A Dictionary of Political Biography. The metadata for the book, showing it to be a book and giving its title, is what one gets from the doi database. So the correct change here would be to add the book title and change the citation template to cite book or maybe cite encyclopedia. —David Eppstein (talk) 23:25, 26 October 2022 (UTC)
 * Those are actually logged by the bot and I go in and fix them. They are almost universally GIGO that needs some human TLC.  AManWithNoPlan (talk) 14:13, 27 October 2022 (UTC)