User talk:Citation bot/Archive 26

treat "dos" like "its" in trust always code
Per Headbomb's suggestion. AManWithNoPlan (talk) 11:26, 19 June 2021 (UTC)
 * Its should be capitalized though. Headbomb {t · c · p · b} 17:50, 19 June 2021 (UTC)


 * fixed AManWithNoPlan (talk) 13:13, 23 June 2021 (UTC)

Journal volume value "cleaned up" unnecessarily
Not a bug, volume is for volume numbers (e.g. II, 30B), not running text. 'Vol.' should not be in volume. The documentation mentions this (volume numbers should be entered just as a numeral (e.g. 37)). Headbomb {t · c · p · b} 10:33, 22 June 2021 (UTC) Headbomb {t · c · p · b} 10:31, 22 June 2021 (UTC)
 * Again, as mentioned, the same documentation says: "Any alphanumeric value of five or more characters will not appear in bold. In rare cases, publications carry both, an ongoing volume and a year-related value; if so, provide them both, for example |volume=IV / #10." How would you explain this?  Jay (Talk) 10:48, 22 June 2021 (UTC)
 * See IV / #10, and not Vol. IV / #10. Headbomb {t · c · p · b} 10:53, 22 June 2021 (UTC)
 * Oh ok, I didn't really understand the Vol. IV / #10 part. Apologies for quoting that. The line I am interested in is: "Any alphanumeric value of five or more characters will not appear in bold." It gives us the flexibility to add alphanumeric text with the caveat that it will not be in bold, which is exactly what I want, and did, in the Iridium Communications page.  Jay (Talk) 11:02, 22 June 2021 (UTC)
 * https://en.wikipedia.org/wiki/Category:CS1_errors:_extra_text:_volume "The templates emit this error message when some form of the word "volume" ("volume", "vol.") is found in the value assigned to volume. To resolve this error, remove the extraneous text from the parameter value." Grimes2 (talk) 11:14, 22 June 2021 (UTC)
 * Thanks, so it is the choice of text I used ("Vol. ") that created the error. And any other text would have been fine. I'll try to clarify the documentation on the volume param on the citation page on this one.  Jay (Talk) 11:51, 22 June 2021 (UTC)
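
The check described in this thread can be sketched as follows. This is only an illustration in Python, not the CS1 module's or the bot's actual code: the pattern and both function names are assumptions, and the sketch only looks at a leading label, whereas the category text quoted above says the templates look for the word anywhere in the value.

```python
import re

# Hypothetical pattern: a leading "volume", "vol." or "vol" label.
# The word boundaries keep values such as "Volta 3" from matching.
_VOLUME_LABEL = re.compile(r"^\s*(?:volume\b|vol\.|vol\b)\s*", re.IGNORECASE)


def has_volume_label(value: str) -> bool:
    """Return True when the value starts with some form of the word 'volume'."""
    return bool(_VOLUME_LABEL.match(value))


def strip_volume_label(value: str) -> str:
    """Remove one leading volume label and surrounding whitespace."""
    return _VOLUME_LABEL.sub("", value, count=1).strip()
```

With this sketch, "Vol. IV / #10" is flagged and reduced to "IV / #10", while "IV / #10", "30B" and "37" are left alone.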

Bot dropping jobs
The bot seems to be halting on the regular. Abductive (reasoning) 04:24, 23 June 2021 (UTC)


 * Seems fixed now AManWithNoPlan (talk) 01:04, 26 June 2021 (UTC)

Pointless titlelink=doi
Stopped. AManWithNoPlan (talk) 18:41, 26 June 2021 (UTC)

bot uses abstract id from bibcode abstract as value for |pages=
In that same edit, bot added 2011AGUFMPP43D..07E; should it not have then removed preexisting https://ui.adsabs.harvard.edu/abs/2011AGUFMPP43D..07E/abstract as redundant?

—Trappist the monk (talk) 23:49, 27 June 2021 (UTC)
 * Remember the RFC about removing urls that duplicate identifiers? Headbomb {t · c · p · b} 00:10, 28 June 2021 (UTC)
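
For illustration, detecting that a URL merely restates the bibcode could look like the sketch below. It assumes the URL shape seen in this thread (`https://ui.adsabs.harvard.edu/abs/<bibcode>/abstract`); the function names are hypothetical, and whether such a URL should then be removed is a policy question that this code does not answer.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Assumed URL shape, taken from the example in this thread.
_ADS_ABS_URL = re.compile(
    r"^https?://ui\.adsabs\.harvard\.edu/abs/([^/?#]+)(?:/abstract)?/?$"
)


def bibcode_from_ads_url(url: str) -> Optional[str]:
    """Extract the bibcode from an ADS abstract URL, or None if the URL does not fit."""
    match = _ADS_ABS_URL.match(url.strip())
    return unquote(match.group(1)) if match else None


def url_duplicates_bibcode(url: str, bibcode: str) -> bool:
    """True only when the URL is an ADS abstract link for exactly this bibcode."""
    found = bibcode_from_ads_url(url)
    return found is not None and found == bibcode
```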

Springer-Verlag
This is not a journal, there's already the publisher/series information. Headbomb {t · c · p · b} 00:09, 28 June 2021 (UTC)

Removing MDPI URLs, Bot Approval?
Where's the bot approval for this kind of edit? I don't see it in the user page list. Also, what is this edit supposed to do? The URL is not dead, it works just fine, and Resources IS an MDPI journal, so it's not a proxy link. Silver seren C 07:11, 19 June 2021 (UTC)
 * The link is still there. It's just automatically generated. Compare
 * to
 * Headbomb {t · c · p · b} 17:51, 19 June 2021 (UTC)
 * Doesn't the continued existence of the link make these edits fall under WP:COSMETICBOT? (I agree with their removal, though, assuming they are part of some non-cosmetic set of changes.) —David Eppstein (talk) 18:52, 19 June 2021 (UTC)
 * It does change printed output and interactions with certain scripts that operate on url-recognition. It's borderline, but it has several benefits. For instance it future-proofs the link against linkrot if MDPI ever decides to sell the journal or to change their URL structure. Headbomb {t · c · p · b} 20:38, 19 June 2021 (UTC)
 * notabug

DOI for 'domain for sale' not marked as broken?
I have reported this journal's problems to its publisher. Depending upon the age of the articles, the DOIs point to one of two for-sale websites. Investigating how to detect this problem: Crossref has data and the DOI does resolve to a valid URL; it is just that the final website is non-ideal. AManWithNoPlan (talk) 15:57, 29 June 2021 (UTC)
 * points to the other website. AManWithNoPlan (talk) 01:25, 30 June 2021 (UTC)

TNT n/a volumes/issues/pages, revisited
Seems to work most of the time, but it's missing a few places. Headbomb {t · c · p · b} 15:57, 30 June 2021 (UTC)

Weirdness
This is weird, and shouldn't happen. Headbomb {t · c · p · b} 21:46, 30 June 2021 (UTC)

Regular expression failure in List of common misconceptions when extracting Templates

 * Alright... Garbage it, nothing out. Jonatan Svensson Glad (talk)
 * Yeah, the bot is a reasonable test for bad wikitext. We reject it since sometimes terrible edits happen when the templates are not where they should be. AManWithNoPlan (talk) 19:47, 2 July 2021 (UTC)

Remove certain publishers
Per this revision, Citation bot removed certain publishers, but not others. I could not see anything that would suggest whether Citation bot is not meant to remove publishers for news sources/these specific sources, or rather that it is meant to and that it was unable to detect and drop the others. Either way it seems like there is possibility for optimisation here. RoanokeVirginia (talk) 21:16, 2 July 2021 (UTC)
 * The bot uses a Whitelist of publishers to remove that are almost identical to the actual work in question. AManWithNoPlan (talk) 22:24, 2 July 2021 (UTC)
 * fixed by adding more. AManWithNoPlan (talk) 15:43, 4 July 2021 (UTC)
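
A list-based check of this kind might be sketched as below. The bot's real list and comparison are not shown in this discussion, so the normalisation rules, the function names and the sample entries in the usage note are all assumptions made for illustration.

```python
import re

# Hypothetical filler words ignored when comparing names.
_FILLER = {"the", "company", "co", "ltd", "inc", "llc"}


def _normalize(name: str) -> str:
    """Lower-case, drop punctuation and filler words, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(word for word in cleaned.split() if word not in _FILLER)


def publisher_is_redundant(publisher: str, work: str, removable: set) -> bool:
    """True when the publisher is on the list and matches the cited work."""
    normalized = _normalize(publisher)
    return bool(normalized) and normalized in removable and normalized == _normalize(work)
```

With a made-up list `{"example gazette"}`, the publisher "The Example Gazette Company" is treated as redundant for the work "Example Gazette", while "Example University Press" is kept. A publisher that is not on the list is never removed, which matches the behaviour reported at the top of this thread.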

Whitelist Category:CS1 maint: PMC embargo expired to be run from the link
Similar to what's being done for the link present in Category:CS1 maint: PMC format. Headbomb {t · c · p · b} 05:44, 4 July 2021 (UTC)
 * Same for Category:CS1 maint: ref=harv. Headbomb {t · c · p · b} 05:48, 4 July 2021 (UTC)


 * fixed

incorrectly changing "Quotation mark" of german language title?
That's what should be done according to MOS:CONFORM. Headbomb {t · c · p · b} 15:04, 5 July 2021 (UTC)

Bogus date added
The website claims that is the right date. I have added special code to reject that date, since it is date zero in some calendars. AManWithNoPlan (talk) 12:28, 6 July 2021 (UTC)
 * Thanks! -- Brown HairedGirl  (talk) • (contribs) 12:59, 6 July 2021 (UTC)

Another bogus date
oireachtas.ie flagged as having bad dates. AManWithNoPlan (talk) 13:27, 6 July 2021 (UTC)
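
The two date fixes above (rejecting a "date zero" and flagging a site whose dates are unreliable) can be pictured with the following sketch. The thread does not say which date was rejected, so the suspect dates here (the Unix epoch and the earliest date Python can represent) are assumptions, as are all names; only the host `oireachtas.ie` comes from the discussion.

```python
from datetime import date
from typing import Optional

# Assumed "date zero" values; the actual rejected date is not given in the thread.
SUSPECT_DATES = {date(1970, 1, 1), date.min}

# Hosts whose reported dates are not trusted (entry taken from the thread).
BAD_DATE_HOSTS = {"oireachtas.ie"}


def host_has_bad_dates(host: str) -> bool:
    """True for a listed host or any of its subdomains."""
    host = host.lower().rstrip(".")
    return any(host == bad or host.endswith("." + bad) for bad in BAD_DATE_HOSTS)


def accept_metadata_date(found: date, host: str, today: Optional[date] = None) -> bool:
    """Decide whether a date scraped from a page's metadata should be added."""
    today = today or date.today()
    if host_has_bad_dates(host) or found in SUSPECT_DATES:
        return False
    return found <= today  # also reject dates in the future
```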

Cleanup tag not removed after problem fixed
The anger that this edit would create is more than we want at this time. AManWithNoPlan (talk) 13:53, 6 July 2021 (UTC)
 * I don't see how removing a redundant cleanup tag would create any anger at all. It needs to be removed, so why not remove it in the same edit as the fixes? -- Brown HairedGirl  (talk) • (contribs) 14:33, 6 July 2021 (UTC)
 * In part because there may be other URLs not in ref tags that the bot doesn't see. However, do feel free to make a WP:BOTREQ to have a bot remove that tag once it can be confirmed no bare url remains. Headbomb {t · c · p · b} 15:17, 6 July 2021 (UTC)
 * what are the other parts? -- Brown HairedGirl  (talk) • (contribs) 16:25, 6 July 2021 (UTC)
 * Every URL that isn't in cite templates or in ref tags. Headbomb {t · c · p · b} 17:55, 6 July 2021 (UTC)

Retraction watch?
Does CitationBot check journal citations against RetractionWatch.com, to verify that we aren't compounding errors? Apparently it is quite common: "Of the 13,000 retracted papers that were cited at least once, 84% had a post-retraction citation. ... Together, the 20,000 papers in the [Retraction Watch] archive were cited in 95,000 articles [on Semantic Scholar] after their retractions. In turn, these were cited in 1.65m further papers." – The Economist --John Maynard Friedman (talk) 13:33, 9 July 2021 (UTC)


 * No, but there is . It hasn't run in a while, so I'd contact about that. &#32; Headbomb {t · c · p · b} 13:56, 9 July 2021 (UTC)
 * wontfix, since we do not want to deal with problems that lead to being shut down. AManWithNoPlan (talk) 21:22, 9 July 2021 (UTC)
 * I just didn't find time to fix the problems with RetractionBot and get it running again. Anyone would be welcome to jump into the repo and work on it. As I recall the two primary issues with the bot were that Crossref was reporting certain review articles as being 'retracted' when that wasn't accurate, and it was capable of edit warring. I think I fixed both problems but didn't want to get it running again until I'd implemented some tests. If I find the time I'll dig out my notes and write something up about what needed work. Sam Walton (talk) 21:32, 9 July 2021 (UTC)

Issues with WP:COSMETICBOT
This edit. Yes, the |ref= parameter is useless in this case, but in this case it was empty anyway and there was no visible output difference, so it looks like it's running afoul of COSMETICBOT. And the article is up for deletion anyway, so I'm not seeing how this is a useful edit. Is there a way to tweak bot code to prevent it from making throwaway edits like that? It's not a big deal; I'm just struggling to see the rationale behind that edit or how it helps anything whatsoever. Aren't there better uses of bot run time? Hog Farm Talk 01:37, 14 July 2021 (UTC)
 * I will look into that. It is hard to catch all cosmetic only edits.  Blank ref tags are dropped since they are a temptation to add bad data and there is almost never a good reason to set it.  AManWithNoPlan (talk) 14:10, 14 July 2021 (UTC)
 * mostly fixed - improving minor edit detection. AManWithNoPlan (talk) 12:57, 26 July 2021 (UTC)
 * It seems to me that this issue probably has almost no effect on bot run time.
 * This is about what the bot does after it has processed the page and checked all the databases. That processing is time intensive, but in most cases saving the changes is relatively quick in comparison to all those lookups.
 * So the question is whether, after all that processing, the edit is worth saving. -- Brown HairedGirl  (talk) • (contribs) 13:19, 26 July 2021 (UTC)

Also not a big deal, but wondering the rationale for something
I was recently conducting a GA review and in response to a query I made about a weird citation quirk I asked about, I was pointed to a citation bot edit. Essentially, the bot added a doi (makes sense), but the doi is inactive and was at the time the bot added it, so the bot also marked it as an inactive doi. I'm not the best with dois, but I'm a bit confused as to why it was added if it didn't work. Is there a rationale for the bot to add broken doi links that I'm not aware of? Hog Farm Talk 01:44, 14 July 2021 (UTC)
 * The idea is that broken DOIs can be repaired. The /html part was bad, but I've reported, which is found here, as borked. In my experience, these get fixed within a week to a month after being reported. Headbomb {t · c · p · b} 02:57, 14 July 2021 (UTC)
 * It is also common to be able to google the non-functioning DOIs. I have personally gotten over 1,000 DOIs fixed.  I watch for new pages in https://en.wikipedia.org/wiki/Category:CS1_maint:_DOI_inactive and https://en.wikipedia.org/wiki/Category:Pages_with_DOIs_broken_vcite and fix/report the new broken DOIs.  AManWithNoPlan (talk) 14:13, 14 July 2021 (UTC)

notabug

Duplicate citations
Which bot is it that deduplicates duplicate citations (e.g., changing  to  )? I thought it was Citation Bot, but I ran it on Hookworm infection and it didn't seem to deduplicate the two "Veterans Administration Technical Bulletin" citations, which appear to be identical to me. —2d37 (talk) 01:51, 15 July 2021 (UTC)
 * Refill can do this, BUT it can get confused by existing named refs and leave junk behind. AManWithNoPlan (talk) 02:00, 15 July 2021 (UTC)
 * AWB is pretty good at handling this, as long as one named ref already is in the article. Headbomb {t · c · p · b} 02:35, 15 July 2021 (UTC)
 * Ah, I imagine it was AWB that I saw doing this. Thanks. —2d37 (talk) 04:38, 15 July 2021 (UTC)
 * AWB will combine duplicate refs. It will give them a usually meaningful name. The :0 pattern particularly is from VisualEditor where someone reuses a reference. VE does not automatically remove duplication like AWB. Izno (talk) 16:57, 15 July 2021 (UTC)

reFill 2 also deduplicates duplicate citations. It uses the names "auto", "auto1", "auto2", etc. -- Brown HairedGirl  (talk) • (contribs) 10:51, 19 July 2021 (UTC)
 * Always preview those edits since if you have an existing named reference that gets its name changed, the tool does not update existing pointers to that reference. AManWithNoPlan (talk) 13:44, 19 July 2021 (UTC)


 * notabug - it would be nice for the bot to do this, but that is a complexity that I do not personally feel comfortable taking on. AManWithNoPlan (talk) 13:31, 22 July 2021 (UTC)
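
To show what deduplication involves, here is a minimal sketch that merges byte-identical unnamed `<ref>` bodies into named references. It is not the code of AWB, reFill or Citation bot: the function name is hypothetical, and the "auto", "auto1", ... naming only mimics the pattern mentioned above. It also ignores refs that already carry a name, which is exactly the case this thread warns can go wrong, so any result would need previewing.

```python
import re
from collections import Counter

# Matches only unnamed references; named or self-closing refs are left alone.
_UNNAMED_REF = re.compile(r"<ref>(.*?)</ref>", re.DOTALL)


def dedupe_unnamed_refs(wikitext: str) -> str:
    """Turn repeated identical unnamed refs into one named ref plus short reuses."""
    counts = Counter(m.group(1).strip() for m in _UNNAMED_REF.finditer(wikitext))
    names = {}

    def replace(match):
        body = match.group(1).strip()
        if counts[body] < 2:
            return match.group(0)  # unique reference: keep untouched
        if body in names:
            return '<ref name="%s" />' % names[body]  # later copy: reuse by name
        index = len(names)
        names[body] = "auto" if index == 0 else "auto%d" % index
        return '<ref name="%s">%s</ref>' % (names[body], body)

    return _UNNAMED_REF.sub(replace, wikitext)
```

A real tool would also have to check that the generated names do not collide with names already present in the article.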

Linked from pagename
When using the "Category" option, the bot helpfully links the category. See e.g. this edit, with edit summary ‎Alter: template type. Add: magazine. Removed parameters. Some additions/deletions were parameter name changes.

However, when using the "Linked pages" option, the page is named but not linked. See e.g. this edit, with edit summary Alter: url. URLs might have been anonymized. Add: title. Changed bare reference to CS1/2.

It would take only 4 extra characters to link the page as  ... and that would increase transparency by making it easier for other editors to view the list.

Even better, the bot could link the actual revision used to make the list. In the edit above from my list, that would be : User:BrownHairedGirl/Articles_with_bare_links. That's 33 characters more than the bare pagename, but the bot rarely runs anywhere near the limit on length of edit summaries.

Also, tho less importantly, could the username be linked? Piping the username could add a lot of extra characters, but an unpiped link of the form  adds only 9 characters. -- Brown HairedGirl  (talk) • (contribs) 03:26, 20 July 2021 (UTC)


 * Edit summaries are already too long.  Links to pages and usernames were removed because too many editors are either clueless or rude.  AManWithNoPlan (talk) 11:46, 20 July 2021 (UTC)
 * Hmm. The edit summary lengths look fine to me. And it seems to me to be fundamentally wrong to impede scrutiny because of the incivility of some editors.  The bots edits are attributed to editors, so there should be a link, as with any other edit.
 * Do you have a pointer to the previous discussions? -- Brown HairedGirl  (talk) • (contribs) 13:32, 20 July 2021 (UTC)
 * The edit summary lengths sometimes get truncated, so adding even one unneeded character is bad. Anyone that is too lazy or clueless to use the edit summary without a link is probably not someone you want editing your talk page.  Trust me, you do not want those links.  I will not point you to previous discussions since pointing out how other editors are sacks of shit is not productive.  AManWithNoPlan (talk) 19:25, 20 July 2021 (UTC)
 * notabug, since the bot used to do this, but had to stop. AManWithNoPlan (talk) 13:02, 22 July 2021 (UTC)

Adds a bad DOI
Bad pubmed data. AManWithNoPlan (talk) 12:42, 23 July 2021 (UTC)

"title = Redirect Notice"
Just went and removed all of these by hand. AManWithNoPlan (talk) 17:15, 23 July 2021 (UTC)

Job dropped, cannot restart
My job of 1771 pages dropped at 14:50 UTC, after editing page 591/1771.

At 15:09 I edited my list to comment out the articles already done, and have tried three times in the last 2 hours to start a new job on the reduced list, but with no evident result: see the bot's contribs for the last few hours.

What's going on? -- Brown HairedGirl  (talk) • (contribs) 16:53, 23 July 2021 (UTC)
 * simple problem is too many people doing enormous jobs trying to flow through the same small pipe — Chris Capoccia 💬 17:01, 23 July 2021 (UTC)
 * So it seems. But dropping existing jobs doesn't help.
 * I would like to know if the dropping is intentional, or a malfunction? -- Brown HairedGirl  (talk) • (contribs) 17:07, 23 July 2021 (UTC)

And lo, my job restarted at 1707 (see bot contribs). Hallelujah! -- Brown HairedGirl  (talk) • (contribs) 17:10, 23 July 2021 (UTC)

Seems fixed AManWithNoPlan (talk) 20:20, 27 July 2021 (UTC)

Put references in numerical order?
Would it be possible to add a function that puts references not separated by text into numerical order? It's a tedious task to go through and re-order references after an editing session, and the next editor to add a reference might get them out of order again. —valereee (talk) 17:29, 23 July 2021 (UTC)
 * what do you mean? please post a link. AManWithNoPlan (talk) 17:33, 23 July 2021 (UTC)

I presume to fix the likes of this:

Hello world.

Goodbye universe.

In that example, the order of the refs to "goodbye universe" should be reversed. -- Brown HairedGirl  (talk) • (contribs) 17:42, 23 July 2021 (UTC)


 * I don't like that request. The multiple references should be in the order of importance. Grimes2 (talk) 18:22, 23 July 2021 (UTC)
 * Simply adding another reference can re-order them. This is something that can easily shift around.  This is something that a dedicated Bot would have to do, if and only if it was decided to be a good idea. While almost no editors make an effort to put in priority order, that would seem like a better idea. I would assume that the most common ordering is just people adding them to the end or the front just because that is where the editor always adds new ones.  This is a notes/references global idea and not unique to the content of references.  I certainly do not want to be the bot runner that changed:

info to: info wontfix AManWithNoPlan (talk) 20:13, 23 July 2021 (UTC)
 * To explain a little more, this feature was in AWB. It was removed in the past few years due to loud complaining. This is something a personal script might be developed for, but I would expect similar complaints here otherwise. --Izno (talk) 12:11, 24 July 2021 (UTC)
 * That makes it notabug AManWithNoPlan (talk) 00:59, 26 July 2021 (UTC)

Mangled ref to India Today
In this edit, the bot did something pointless to an already-mangled ref to 'https://www.indiatoday.in/education-today/gk-current-affairs/story/stree-shakti-puraskar-and-nari-shakti-puraskar-presented-to-6-and-8-indian-women-respectively-243468-2015-03-09

I have cleaned up the ref and added a "deny" comment:

I dunno whether this is due to bad metadata, or to the bot being led astray down a GIGO path by the human error of the article's creator in adding a malformed ref, but I thought I would flag it up in case there is some systemic issue wrt to this website. -- Brown HairedGirl  (talk) • (contribs) 00:06, 24 July 2021 (UTC)


 * this is some bad GIGO. I added some phrases to the bad author blacklist. AManWithNoPlan (talk) 01:22, 24 July 2021 (UTC)
 * Thanks. -- Brown HairedGirl  (talk) • (contribs) 08:53, 24 July 2021 (UTC)
 * fixed - just verified it. AManWithNoPlan (talk) 12:58, 26 July 2021 (UTC)

Job dropped on write error
[11:28:52] Processing page 'Madhumila' — edit—history
>Remedial work to prepare citations
>Consult APIs to expand templates
>Using Zotero translation server to retrieve details from URLs.
>Retrieved info from https://tamil.indianexpress.com/entertainment/vijay-tv-office-serial-actress-madhumila/
+Adding title: 'ஆபிஸ்' மதுமிலாவா இது? அதிர்ச்சியில் ரசிகர்கள்!
>Retrieved info from https://timesofindia.indiatimes.com/tv/news/tamil/Vijay-Television-awards-launched/articleshow/35617411.cms
>Retrieved info from https://www.deccanchronicle.com/entertainment/movie-reviews/080417/senjittale-en-kaadhala-review-inconsistency-in-presentation-and-lacks-the-fizz.html
+Adding title: Senjittale en Kaadhala review: Inconsistency in presentation and lacks the fizz
+Adding date: 8 April 2017
>Retrieved info from https://timesofindia.indiatimes.com/tv/news/tamil/Vinnai-Thandi-Varuyava-new-serial-on-Vijay/articleshow/54604051.cms
+Adding title: Vinnai Thandi Varuyava- new serial on Vijay - Times of India
>Expand individual templates by API calls
>Checking CrossRef database for doi.
>Searching PubMed... nothing found.
>Checking AdsAbs database no record retrieved.
>Checking CrossRef database for doi.
>Searching PubMed... nothing found.
>Checking AdsAbs database no record retrieved.
>Checking CrossRef database for doi.
>Searching PubMed... nothing found.
>Checking AdsAbs database no record retrieved.
>Checking CrossRef database for doi.
>Searching PubMed... nothing found.
>Checking AdsAbs database no record retrieved.
>Remedial work to clean up templates
>Writing to Madhumila...
!Unhandled write error. Please copy this output and report a bug.. There is no need to report the database being locked unless it continues to be a problem.
 * Full text of report on the edit that failed:

It seems to me that a write error should not trigger the dropping of the whole job. -- Brown HairedGirl  (talk) • (contribs) 12:40, 26 July 2021 (UTC)

Restarted job was also dropped
I restarted the job above, after removing from the list the articles already done.

The resumed job dropped after editing article 229/2132 at 17:28: see this list of bot contribs. -- Brown HairedGirl  (talk) • (contribs) 18:41, 26 July 2021 (UTC)

Bot inserts wrong title
The bot will now also remove the ones already added. I am running the bot on all those pages right now. AManWithNoPlan (talk) 21:59, 26 July 2021 (UTC)

Does the bot fill in ref calls in body?
I'm not sure, but I've encountered a couple of times this month where citations are getting filled with title, url etc.--in the body of an article when it's supposed to be only a references call to a citation, in the references subheading, which already has the title, url, dates, etc. I keep clearing the references in the body, yet someone keeps putting the citation parameters back. Carl Francis (talk) 20:41, 26 July 2021 (UTC)


 * Please provide links to examples. AManWithNoPlan (talk) 21:13, 26 July 2021 (UTC)


 * reFill 2 does this. Refs in the reflist section get moved into the body. -- Brown HairedGirl  (talk) • (contribs) 23:05, 26 July 2021 (UTC)
 * notabug, since not us. Seems to be https://en.wikipedia.org/wiki/Wikipedia:ReFill and other tools.  AManWithNoPlan (talk) 20:08, 27 July 2021 (UTC)

Umlaut symbols are not displayed correctly in classical references
This is really the fault of the publisher for uploading bad data to CrossRef. AManWithNoPlan (talk) 14:22, 27 July 2021 (UTC)


 * Such as https://search.crossref.org/?from_ui=&q=10.1007%2FBF01457933 AManWithNoPlan (talk) 20:09, 27 July 2021 (UTC)

American Museum Novitates uses issues, not volumes
Look at the code you have for other journals like this for inspiration, like ZooKeys, e.g.. &#32; Headbomb {t · c · p · b} 07:34, 27 July 2021 (UTC)
 * Added to the list AManWithNoPlan (talk) 14:21, 27 July 2021 (UTC)

Please translate languages
The bot does not create new cite errors; it just makes the error clearer. AManWithNoPlan (talk) 14:18, 27 July 2021 (UTC)
 * wontfix - too rare and way too many languages. AManWithNoPlan (talk) 20:01, 27 July 2021 (UTC)

ZooKeys is open access
All DOIs beginning with  can be marked as free to read. Headbomb {t · c · p · b} 17:00, 28 July 2021 (UTC)
 * Before marking any doi in as , make sure that title does not have any wikilinks; cs1|2 cannot link to doi.org at the same time that it is supposed to be linking to some internal en.wiki article. It is ok to add free when title-link is assigned one of its allowed keywords:  ,  ,  .  When title-link is assigned anything other than one of the allowed keywords, do not add free; cs1|2 cannot link to doi.org at the same time ...
 * —Trappist the monk (talk) 17:19, 28 July 2021 (UTC)
 * Really, the CS1/2 templates should be updated to not autolink when there's a conflict. The DOI doesn't cease to be freely accessible just because someone wikilinked a work in the title. Headbomb {t · c · p · b} 17:24, 28 July 2021 (UTC)
 * No. The thing that cs1|2 should do is not autolink.
 * —Trappist the monk (talk) 17:50, 28 July 2021 (UTC)
 * Consensus is against you, people want autolinking, and the current functionality should be updated to minimize conflicts. Headbomb {t · c · p · b} 19:42, 28 July 2021 (UTC)
 * Regardless, this is not the place to argue about it. Ttm is making a point for the bot operator to be cognizant of. Izno (talk) 21:01, 28 July 2021 (UTC)
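
The precondition raised in this thread can be written down as a small guard. This is a sketch under stated assumptions, not the bot's code: the function name is hypothetical, and because the DOI prefix and the allowed title-link keywords are not legible in this archive, both are passed in as parameters rather than guessed.

```python
import re
from typing import Collection, Optional

# A wikilink anywhere in the title, e.g. [[Ants]] or [[Ant|ants]].
_WIKILINK = re.compile(r"\[\[.+?\]\]")


def may_mark_doi_free(
    doi: str,
    title: str,
    title_link: Optional[str],
    free_prefixes: Collection[str],
    allowed_title_link_keywords: Collection[str],
) -> bool:
    """True only when marking the DOI as free cannot clash with another title link."""
    if not any(doi.startswith(prefix) for prefix in free_prefixes):
        return False  # not a DOI known to be free to read
    if _WIKILINK.search(title):
        return False  # the title already links to an internal article
    if title_link and title_link not in allowed_title_link_keywords:
        return False  # title-link points at a specific page
    return True
```

Keeping the prefixes and keywords as arguments means the rule itself can be tested with placeholder values, and the real lists can be supplied by whoever maintains them.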

Regular expression failure in Emulator when extracting Templates
2021-07-30, 22:13:56 GMT [22:13:56] Processing page 'Emulator' — edit—history !Regular expression failure in Emulator when extracting Templates

The following text might help you figure out where the error on the page is (Look for lone { and } characters)

notabug. I have fixed the page and run the bot on it. I have added code to bold the text about it being a problem with the page and not the bot. AManWithNoPlan (talk) 13:42, 31 July 2021 (UTC)
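
The hint to "look for lone { and } characters" can be automated with a simple scan. The sketch below is only a rough aid with a hypothetical function name: it pairs braces character by character and does not understand nowiki sections, comments or math markup, where unpaired braces can be legitimate.

```python
from typing import List


def unmatched_braces(wikitext: str) -> List[int]:
    """Return the character offsets of braces that have no partner."""
    open_positions = []   # offsets of "{" still waiting for a "}"
    stray_closers = []    # offsets of "}" seen with nothing open
    for offset, char in enumerate(wikitext):
        if char == "{":
            open_positions.append(offset)
        elif char == "}":
            if open_positions:
                open_positions.pop()
            else:
                stray_closers.append(offset)
    return sorted(open_positions + stray_closers)
```

An empty result means the braces pair up; otherwise the offsets point at the places in the page source worth inspecting first.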

Cosmetic edits: quotation mark or apostrophe

 * None of these are cosmetic by the definitions in WP:COSMETICBOT. --Izno (talk) 18:21, 1 August 2021 (UTC)
 * The relevant part of WP:COSMETICBOT seems to be the first bulleted item in the definition of " substantive": the output text or HTML in ways that make a difference to the audio or visual rendering of a page in web browsers, screen readers, when printed, in PDFs, or when accessed through other forms of assistive technology (e.g. removing a deleted category, updating a template parameter, changing whitespace in bulleted vertical lists);.
 * That doesn't seem to me to endorse changing the type of quote mark. What do you think I have missed? -- Brown HairedGirl  (talk) • (contribs) 18:44, 1 August 2021 (UTC)
 * Different quotation marks are a different visual rendering. Izno (talk) 18:52, 1 August 2021 (UTC)
 * But it's such a tiny difference that it is barely noticeable, and it makes no difference at all to the meaning. -- Brown HairedGirl  (talk) • (contribs) 19:25, 1 August 2021 (UTC)
 * MOS:APOSTROPHE recommends straight. I notice them because they are a big warning sign someone has probably copy-pasted content from another site. For citations, they probably copy-pasted the title, or imported with a tool like VE. -- Green  C  19:32, 1 August 2021 (UTC)
 * For citations, I always copy-paste the title if I can. Why on earth retype it? More work, less accuracy. --  Brown HairedGirl  (talk) • (contribs) 19:36, 1 August 2021 (UTC)
 * I guess for titles it's not a copyvio signal, but for main text content it often is. That's why I notice them, sort of a habit now, they stick out like something to be fixed. And should be, at least according to the MOS recommendation. -- Green  C  20:12, 1 August 2021 (UTC)
 * A cited title should usually be an exact match. That's a wholly difft issue to copyvios. --  Brown HairedGirl  (talk) • (contribs) 23:38, 1 August 2021 (UTC)
 * "Barely noticeable" and "cosmetic" are not equivalent for the policy in question. (NB I notice the difference between the kinds of apostrophes all the time, so notice-ability is in the eye of the noticer. ;) Izno (talk) 19:32, 1 August 2021 (UTC)

Jobs being dropped again
Any speculation as to why? Abductive (reasoning) 02:59, 3 July 2021 (UTC)
 * Weirdly, it stopped on Wladimir Klitschko (hist) twice and made edits to the article four times in a row (including the two times it halted), then when I used the Citations button, then by the bot, then by my using the Citations button, then the bot, followed by two in a row by me using the Citations button. This is ten edits in a row by the bot without any human changes to the article. What could possibly be wrong? Abductive  (reasoning) 08:03, 4 July 2021 (UTC)
 * Jobs aren't generally dropped. They just take longer than your web browser thinks they should, and the web browser gives up.  The multiple edits come from the Zotero instance, which is run by Wikipedia, finding new information each time it is run.  AManWithNoPlan (talk) 11:38, 4 July 2021 (UTC)
 * So, the bot gets to, say, article # 123 out of 2300 in the run, and doesn't make any more edits to remaining articles in the category. And the reason the bot decided to stop was because it asked my browser, which I had closed right after I started the run, how it was feeling? Abductive  (reasoning) 21:54, 4 July 2021 (UTC)
 * That's a different issue. Wikipedia can time out and the bot eventually gives up.  It is possible, but unlikely, that the bot might crash on a page.  AManWithNoPlan (talk) 00:06, 5 July 2021 (UTC)
 * I see. It might be having trouble with Wladimir Klitschko because that article has 522 citations. Abductive  (reasoning) 00:48, 5 July 2021 (UTC)
 * Now it keeps aborting on American Jews, another article with lots of citations. Abductive  (reasoning) 21:02, 8 July 2021 (UTC)
 *  !Curl error: Operation timed out after 20001 milliseconds with 0 bytes received  !Wikipedia responce was not decoded.  !Unhandled write error.  Please copy this output...  Abductive  (reasoning) 21:32, 8 July 2021 (UTC)

I have been experiencing issues similar to those reported by @.

For the last week, I have been feeding the bot with long lists of articles with bare URLs, which I prepare using WP:AWB in pre-parse mode. I paste the lists into User:BrownHairedGirl/Articles with bare links, and feed that to the bot.

On most of these lists, the bot drops the job, and I make a new request. When the bot restarts, it processes articles which it skipped on the first pass, and also makes further changes to articles which it had already processed.

This is all kinda weird, and a bit annoying ... because while the bot can do a great job, I can see no discernible pattern to whether with a given article it will do any job at all, or a partial job, or a great job. I have never come across such a fuzzy bot, and while I do understand that it is doing a very complex job and relies on multiple external lookups, this fuzziness makes it frustrating to use. The unpredictability also seems to me to be inefficient, because it leads to the same articles being processed multiple times, which exacerbates the bot's problem of capacity well short of demand. -- Brown HairedGirl  (talk) • (contribs) 00:40, 15 July 2021 (UTC)

Is it relevant that I also made two individual page requests (using the toolbar) close together about 21:30? Does that kill an existing job? -- Brown HairedGirl  (talk) • (contribs) 22:19, 20 July 2021 (UTC)

After 30 minutes of inactivity, I trimmed the list to remove articles already processed, and restarted the remaining 950 pages at 14:49 (bot contribs after restart). The bot is doing great work on this set of international relations articles, making cleanup edits to nearly all the pages it processes ... but it's very time-consuming to have to monitor it for job drops, which has happened to my jobs 6 times in the last 6 days (see the of my article list). It would be great if this could be fixed. -- Brown HairedGirl  (talk) • (contribs) 15:07, 29 July 2021 (UTC)
 * I have had the same big job dropped twice today. The second time was shortly after 21:30 UTC.  See this set of bot contribs; at 21:30, there is the last edit on the set of 2198 pages, No. 1195/2198 ... then a gap until page 40/2198 at 21:56.  I had restarted the job at about 21:45, and I assume that if the previous job was not already dead, the bot would have given me its message that I already had a big job running.
 * Yet another job dropped today. A run of 2191 articles, dropped after editing 1241/2191 at 14:07 UTC: see list of the bot's contribs.


 * there is no way that I know of to debug these, unless there is a specific page that is crashing. AManWithNoPlan (talk) 15:25, 30 July 2021 (UTC)
 * There are several code paths that kill the bot, but those are reproducible on each page. After a minute of trying to write a page 6 times and failing without reason, the bot will also die (this does not include normal write failures).   The bot can also die because of some PHP timeouts.  I have just added a bunch more calls to the "give me more time" function, so that should be eliminated unless the bot is truly hung. AManWithNoPlan (talk) 16:29, 30 July 2021 (UTC)
 * Many, many thanks. The bot is a hugely valuable tool (which is why I have been using it so heavily in the last few weeks), and your work to fix glitches is hugely appreciated. --  Brown HairedGirl  (talk) • (contribs) 16:53, 30 July 2021 (UTC)

I think that the problem of dying jobs is now mostly fixed. AManWithNoPlan (talk) 13:35, 3 August 2021 (UTC)
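For readers trying to picture the write-failure path described above (six attempts within roughly a minute, then the job dies), here is a minimal sketch in Python. It is not the bot's actual code, which is written in PHP; `write_with_retries`, `write_page`, and the default values are hypothetical names and numbers taken only from the description in this thread.

```python
import time
from typing import Callable


def write_with_retries(write_page: Callable[[], bool],
                       max_attempts: int = 6,
                       time_budget: float = 60.0,
                       pause: float = 0.0,
                       clock: Callable[[], float] = time.monotonic) -> bool:
    """Call write_page() until it succeeds.

    Gives up (raises RuntimeError) once max_attempts writes have failed
    or time_budget seconds have elapsed, whichever comes first.
    All names and defaults here are illustrative assumptions.
    """
    start = clock()
    attempt = 0
    for attempt in range(1, max_attempts + 1):
        if write_page():
            return True                      # page saved, job continues
        if clock() - start >= time_budget:
            break                            # out of time, stop retrying
        time.sleep(pause)                    # optional back-off between tries
    raise RuntimeError(f"gave up after {attempt} failed write attempts")
```

Raising an exception instead of exiting silently would at least leave a trace of which page a job stopped on, which is the information that was missing when these drops were reported.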

Curly Quotes
MOS:CONFORM says otherwise, and that they should be changed. Headbomb {t · c · p · b} 17:44, 14 July 2021 (UTC)
 * MOS:CURLY clarifies that quotation marks "internal to quoted non-English text" should be preserved, so the bot should change them only if they surround the entire title. Drahtlos (talk) 18:13, 14 July 2021 (UTC)
 * We do not edit quotes, only titles. AManWithNoPlan (talk) 13:30, 22 July 2021 (UTC)
 * In the example, the bot replaced
 * with
 * , i.e. it replaced quotes internal to non-English text. Drahtlos (talk) 17:05, 22 July 2021 (UTC)
 * I do not believe that is a "quote" AManWithNoPlan (talk) 01:00, 26 July 2021 (UTC)
 * Are you sure MOS:CONFORM doesn't say that they shouldn't be changed? I'm looking at the third bullet point, and the footnote that clarifies that quoted text, for MOS:CONFORM's purposes, includes titles of works. At least to me, titles of works cited seem to be titles of works. —2d37 (talk) 04:25, 3 August 2021 (UTC)
 * I see that the style guides have changed since the bot started doing this. MOS:CONFORMTITLE.  The problem with the MOS is that it is way too large to be easily decipherable and yet way too small to be a complete MOS.   AManWithNoPlan (talk) 13:42, 3 August 2021 (UTC)

bot creates cite web with |chapter=
For Oxford Dictionary of National Biography (ODNB) the correct cs1 template would be. There is which only requires doi and the entry name in title

—Trappist the monk (talk) 14:28, 19 July 2021 (UTC)
 * The chapter was fixed quickly. Will get to ODNB soon. AManWithNoPlan (talk) 13:11, 22 July 2021 (UTC)

{{cite news}} or not {{cite news}}?
I noticed that Citation bot inconsistently changes citation templates for links to Rock Paper Shotgun articles from cite web to cite news. For example, in a test run yesterday, the bot performed these changes in this and this edit but did not do so here or here. Is there a reason for this discrepancy? Either way, wouldn't cite web be the more appropriate template for this website? Regards, IceWelder  [ ✉ ] 06:59, 27 July 2021 (UTC)
 * Related: Wired is being classified as cite journal, although the "ARE_MAGAZINES" constant stipulates that it should be a cite magazine. IceWelder  [ ✉ ] 14:20, 27 July 2021 (UTC)
 * I fixed the wired thing - it was the wikilink that caused the issue. AManWithNoPlan (talk) 14:25, 27 July 2021 (UTC)
 * The difference depends upon the meta-data that the website presents. AManWithNoPlan (talk) 14:27, 27 July 2021 (UTC)
 * Which metadata field is that? For example, this ref was converted whereas this one was not. Apart from the article-specific stuff (title, author, date, keywords), the metadata appear almost identical. IceWelder  [ ✉ ] 14:44, 27 July 2021 (UTC)
 * The citoid server probably timed out on the second one and thus there was no data. AManWithNoPlan (talk) 15:58, 27 July 2021 (UTC)
 * notabug they will slowly change to news as bots/people update. AManWithNoPlan (talk) 21:23, 2 August 2021 (UTC)
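The behaviour discussed in this thread (a site-level override such as "ARE_MAGAZINES", otherwise a decision driven by the metadata lookup, and no change at all when the lookup times out) can be pictured with the sketch below. It is only an illustration in Python, not the bot's PHP code: `pick_template` is a hypothetical function, the contents of the set are assumed, and the use of an `itemType` value of `newspaperArticle` as the deciding field is an assumption about what the lookup returns.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed contents; the real constant lives in the bot's source.
ARE_MAGAZINES = {"wired.com"}


def pick_template(current: str, url: str, metadata: Optional[dict]) -> str:
    """Return the template name to use for a citation of `url`.

    `metadata` is None when the lookup returned nothing, e.g. a timeout.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in ARE_MAGAZINES:
        return "cite magazine"           # site-level rule wins
    if metadata is None:
        return current                   # no data, so leave the template alone
    if metadata.get("itemType") == "newspaperArticle":
        return "cite news"               # assumed metadata field and value
    return current
```

Under these assumptions, two references to the same site end up with different templates whenever one lookup succeeds and the other does not, which matches the inconsistency reported above.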

"Removed proxy/dead URL that duplicated identifier."
I find it annoying that Citation Bot currently removes direct links to paginated PDF versions of articles, since in many cases exact page numbers for the quote are provided. Can it be rewritten to avoid that behaviour when exact page numbers for the quote exist? The Perennial Hugger (talk) 08:09, 31 July 2021 (UTC)
 * Do you have a diff? Headbomb {t · c · p · b} 23:38, 31 July 2021 (UTC)
 * Here you go: https://en.wikipedia.org/w/index.php?title=Miscanthus_x_giganteus&type=revision&diff=1036386445&oldid=1034640398 The Perennial Hugger (talk) 12:58, 1 August 2021 (UTC)
 * in this case, best use is to remove the url. parameter doi-access=free already creates a title url that sends you to the article using the DOI  — Chris Capoccia 💬 20:07, 2 August 2021 (UTC)
 * notabug - PDF links are brittle. AManWithNoPlan (talk) 19:16, 3 August 2021 (UTC)

More journal/series = Methods in ... cleanup
Covers both Methods in Enzymology, and Methods in Molecular Biology, and the interference of the ™ glyph. Headbomb {t · c · p · b} 04:02, 4 August 2021 (UTC)

Figure out URL from archive-URL
It's hard to parse all 20+ archive providers; the URL formats are at wp:List of web archives on Wikipedia, and they can get messy (e.g. NLA Australia). But the big three, archive.org, archive.today (7 domains) and webcitation.org, represent over 99% of all archive links on Wikipedia. See https://tools-static.wmflabs.org/botwikiawk/dashboard.html in row 22, compare columns G -> I with cell E22, i.e. the big three have about 8 million links, the remaining archive providers 47k combined. -- Green  C  19:11, 4 August 2021 (UTC)
 * The loss of the web.archive.org in some runs was an error. AManWithNoPlan (talk) 20:19, 4 August 2021 (UTC)
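As a rough illustration of what recovering the URL from an archive-URL involves for the largest providers, here is a minimal sketch in Python. It is not taken from the bot. The function name `original_url`, the host lists (including the seven archive.today domains), and the path patterns are assumptions; only the "long form" links that embed the original URL after a timestamp are handled, so short archive.today links and webcitation.org IDs return `None`.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumed host lists; adjust against wp:List of web archives on Wikipedia.
WAYBACK_HOSTS = {"web.archive.org"}
ARCHIVE_TODAY_HOSTS = {
    "archive.today", "archive.is", "archive.ph", "archive.fo",
    "archive.li", "archive.md", "archive.vn",
}

# /web/<timestamp>[modifier such as id_ or if_]/<original URL>
_WAYBACK_PATH = re.compile(r"^/web/\d{1,14}(?:[a-z]{2}_)?/(.+)$")
# /<timestamp, digits possibly with dots or dashes>/<original URL>
_TODAY_PATH = re.compile(r"^/\d{4}[\d.\-]*/(.+)$")


def original_url(archive_url: str) -> Optional[str]:
    """Return the archived page's original URL, or None if not recoverable."""
    parts = urlsplit(archive_url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if not parts.netloc:
        return None
    # Work on the raw text after the host so the original URL keeps its
    # own query string and fragment.
    rest = archive_url.split(parts.netloc, 1)[1]
    if host in WAYBACK_HOSTS:
        match = _WAYBACK_PATH.match(rest)
    elif host in ARCHIVE_TODAY_HOSTS:
        match = _TODAY_PATH.match(rest)
    else:
        return None
    if match is None:
        return None
    target = match.group(1)
    if not re.match(r"https?://", target, re.IGNORECASE):
        target = "http://" + target      # assumption: schemeless means http
    return target
```

For example, `original_url("https://web.archive.org/web/20210804191100/https://example.com/page?x=1")` returns `https://example.com/page?x=1`, while a short link such as `https://archive.ph/AbCdE` returns `None` and would need a lookup instead.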

Zotero and newspaper redirects
https://github.com/ms609/citation-bot/commit/e2b68b00e615a7d386cb7e6bd5e9f146b91c7c41 AManWithNoPlan (talk) 21:04, 7 August 2021 (UTC)

Chose wrong title for source
https://github.com/ms609/citation-bot/commit/9696d3b8a42c531d65ccb035a1ad16e45d173360 AManWithNoPlan (talk) 21:01, 7 August 2021 (UTC)