Wikipedia:Bots/Requests for approval/Monkbot 5


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Monkbot 5
Operator:

Time filed: 12:40, Wednesday April 2, 2014 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available: Yes (source)

Function overview: Working in, replace deprecated CS1 parameters coauthor and coauthors with individual authorn parameters (n is a number 2–10). Task 5 operates on CS1 citations that have |coauthor= parameters that contain a comma-separated list of names where the names are in the general form: First Middle Last.

Links to relevant discussions (where appropriate):

Edit period(s): Occasional

Estimated number of pages affected: At the time of this writing, has 99,101 pages.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Full details are listed with the source.
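
As an illustration only (the bot itself is an AWB script, and its real rules are in the linked source), the renumbering described in the function overview can be sketched in Python. The function name `split_coauthors`, the `NAME` pattern, and the "two or more words" test are assumptions made for this sketch; only the author2 to author10 range comes from the overview.

```python
import re

# Assumption for this sketch: a usable name is two or more
# whitespace-separated words (First Last or First Middle Last).
NAME = re.compile(r"^\S+(?:\s+\S+)+$")

# author2 .. author10 gives room for at most nine coauthors.
MAX_COAUTHORS = 9


def split_coauthors(value):
    """Map a comma-separated coauthor list to author2..authorN.

    Returns a dict such as {"author2": "Jane Doe"}, or None when the
    value should be left alone (too many names, or a name that does
    not look like First Middle Last).
    """
    names = [part.strip() for part in value.split(",")]
    if len(names) > MAX_COAUTHORS:
        return None
    if not all(NAME.match(name) for name in names):
        return None
    return {f"author{n}": name for n, name in enumerate(names, start=2)}
```

Returning None rather than a partial result mirrors the conservative behaviour discussed below: a citation is either converted in full or not touched.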

Discussion
task appears routine, just watch each edit for the first bit to check for bugs please, thanks -- Tawker (talk) 07:12, 6 April 2014 (UTC)


 * Thank you. First 50 done.  For the most part, I didn't see anything untoward in those edits.  In, the bot did what it was supposed to given the original content:
 * Knights of Columbus, Catholic Truth Committee
 * Three words on either side of the comma, each could be first middle last. This, I think, is a case of a misused template rather than a failure of Monkbot task 5.  For example, location lists the New York Public Library which is not a publication place; neither Knights of Columbus, nor Catholic Truth Committee – note the cover's punctuation which differs from the citation – are listed on the book's title page (they are on what appears to be a Google-created cover image, though not listed in the bibliographic information on the about this book page).


 * The test edits are listed at Special:Contributions/Monkbot beginning at 11:20, 6 April 2014 (UTC) and ending at 11:30, 6 April 2014 (UTC). The test edits have this edit summary: Task 5: Fix CS1 deprecated coauthor parameter errors (bot trial).


 * —Trappist the monk (talk) 12:08, 6 April 2014 (UTC)
 * Comments. I checked all 50 edits, and as far as I can see, the bot did not introduce any citation errors or make any mistakes with the author names.


 * I saw at least four cases in which there were nine authors in the resulting citation, and the bot properly added 9.


 * The bot properly ignored Dunbar, G. Davidson, R. in this edit.


 * The bot dealt properly with "&" and ", and " before the last author's name. Nicely done.


 * Possible minor bug: The bot added an extra period to "et al." in this edit.


 * There's some GIGO here, but the bot is not to blame.


 * Just a curiosity, and nothing that should block bot approval: Why did the bot ignore the coauthors containing "Pribanic" in this edit but operate on the coauthors containing "Nguyen" in this edit? I read the source code and the description of it, and I could not figure out the difference between how they should be treated. Would one of the other Monkbot tasks have operated on the "Pribanic" citation? – Jonesey95 (talk) 18:37, 6 April 2014 (UTC)


 * Fixed the et al. issue, I think.


 * For the two Pribanic cites in AKAP10, the first has ten coauthors, too many for task 5. In the second, et al. is the last author.  To task 5, the name et al. looks like first name et, followed by last name al. with a terminal period. Task 5 is intended to find First Middle Last style names. This can be confusing when you see it fixing Vancouver style names (Last FM) because to task 5 Last FM looks like first name 'Last' followed by last name 'FM'.


 * The task 5 documentation is a bit misleading. Task 5 will accept et al without terminal punctuation as all or part of the last coauthor name. I'll fix the documentation.


 * —Trappist the monk (talk) 19:25, 6 April 2014 (UTC)


 * (period outside the italic wikimarkup) . Task 4 already shares this same regex and I'll update tasks 2 and 3 to use it as well.


 * —Trappist the monk (talk) 22:28, 6 April 2014 (UTC)

Why remove the italics from et al.? DrKiernan (talk) 07:00, 7 April 2014 (UTC)


 * Primarily because the wikimarkup contaminates the citation's COinS metadata. Also because italicized et al. is different from the automatic et al. rendered by CS1 citations when they include displayauthors. And because et al. is properly not italicized (see Help:CS1; and et al. at MOS:ABBR).


 * —Trappist the monk (talk) 09:51, 7 April 2014 (UTC)
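
A minimal Python sketch of the italics removal discussed above, for illustration only. The pattern `ITALIC_ET_AL`, the function name, and the choice to always emit a single terminal period are assumptions; the regex actually shared by the Monkbot tasks is in the bot's source and may differ.

```python
import re

# Assumption: et al. wrapped in two to five apostrophes (italic or
# bold-italic wikimarkup), with an optional period inside and/or
# outside the markup.
ITALIC_ET_AL = re.compile(r"'{2,5}\s*et\s+al\.?\s*'{2,5}\.?", re.IGNORECASE)


def strip_et_al_italics(text):
    """Replace italicized et al. with plain 'et al.' (one period only)."""
    return ITALIC_ET_AL.sub("et al.", text)
```

Because the optional period is matched both inside and outside the markup, a value such as `''et al.''.` collapses to a single period instead of gaining an extra one.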

Where the primary author is given using 'first=/last=' style rather than 'author=', is there any possibility that coauthors could be similarly rendered? If the programming effort is unreasonable don't worry; this is more of a question than a request, let alone a demand! Regards, Martin of Sheffield (talk) 14:36, 7 April 2014 (UTC)


 * I have thought about that, and it's in the back of my mind for a future project. When I first conceived of this project, I discovered that dissecting coauthor was a sizable enough challenge, given how clever our editors are when it comes to inventing ways to format names, that I thought it best to simply and reliably extract the names from coauthor into authorn.  That's why there are four separate tasks.


 * —Trappist the monk (talk) 14:59, 7 April 2014 (UTC)


 * Fair enough, thanks for the prompt reply. Martin of Sheffield (talk) 15:07, 7 April 2014 (UTC)

I have checked a few dozen more of this bot's test edits, and I found no errors.

I'm curious why the bot ignored some of the coauthors parameters here. Examples of ignored parameters: "|last=Sacristán |first=Catarina |authorlink= |coauthors=Tussié-Luna María Isabel, Logan Sheila M, Roy Ananda L" and "|last=Guo |first=B |authorlink= |coauthors=Kato R M, Garcia-Lloret M, Wahl M I, Rawlings D J". Is it possible that it ignores hyphenated names? I couldn't see anything else that would cause the bot to pass by these citations (i.e. no "ref=harv", and they aren't "citation" templates).

For any bot approvers watching, this note is not an objection to the bot's behavior. On the contrary, the bot is being very conservative by avoiding making potentially-erroneous edits. That's good. – Jonesey95 (talk) 06:42, 12 April 2014 (UTC)


 * Yep, hyphenated 'first' names are ignored. Task 5 is looking for names in first middle last order.  In your first example it looks like the editor who added the citation has listed the names in last first middle order but without a comma to separate last from first. Monkbot can't tell the difference except in the cases where there are hyphens or apostrophes which are only allowed in last names.


 * —Trappist the monk (talk) 10:34, 12 April 2014 (UTC)
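
The hyphen and apostrophe rule described in the reply above can be sketched roughly as follows. This is an illustrative Python approximation, not the bot's AWB regex: the patterns `GIVEN` and `LAST` and both function names are hypothetical.

```python
import re

# Assumption: given and middle names are runs of letters with an
# optional trailing period; only the final word (the last name) may
# contain hyphens or apostrophes.
GIVEN = r"[^\W\d_]+\.?"
LAST = r"[^\W\d_]+(?:[-'][^\W\d_]+)*"
FIRST_MIDDLE_LAST = re.compile(rf"^(?:{GIVEN}\s+)+{LAST}$")


def looks_like_first_middle_last(name):
    """True when a single name fits the First [Middle] Last shape."""
    return FIRST_MIDDLE_LAST.match(name.strip()) is not None


def citation_is_eligible(coauthors):
    """A citation is converted only if every listed name fits."""
    return all(looks_like_first_middle_last(n) for n in coauthors.split(","))
```

Under this sketch, "Tussié-Luna María Isabel" is rejected because the hyphen sits in the first word, so the whole citation is skipped. "Kato R M" on its own would pass, which matches the earlier remark that a Vancouver-style name simply looks like first name 'Last' followed by last name 'FM'.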

Following the edits described above, I ran task 5 once a day, 500 edits per day, and frequently inspected random edits. Until today, I found nothing to cause concern. All of these edits are listed at Special:Contributions/Monkbot with the edit summary Task 5: Fix CS1 deprecated coauthor parameter errors (bot trial) and in this edit summary search result.

Today, popped up. Here, task 5 doesn't see the difference between Jr, a last name and Jr, an abbreviation of junior. So, I reverted that edit and tweaked the script. When I submitted the article to Monkbot a second time, it properly skipped it and continued on with other articles. The script now ignores citations where Jr, jr, II, ii, III, or iii precede the comma; this will prevent fixes to certain citations where an author's initials are II.

—Trappist the monk (talk) 19:30, 12 April 2014 (UTC)
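
For readers who want to see the shape of the suffix check, here is a rough Python sketch. The pattern name, the function name, and the tolerance for an optional period after the suffix are assumptions for illustration; the real test lives in the AWB script.

```python
import re

# Suffixes named in the comment above. Assumption: the suffix may be
# followed by an optional period and whitespace before the comma.
SUFFIX_BEFORE_COMMA = re.compile(r"\b(?:Jr|jr|II|ii|III|iii)\.?\s*,")


def has_suffix_before_comma(coauthors):
    """True when a generational suffix sits directly before a comma,
    in which case the citation would be skipped."""
    return SUFFIX_BEFORE_COMMA.search(coauthors) is not None
```

As noted above, the check cannot tell a suffix from initials, so a value like "Ito II, Jane Doe" is also skipped. That trades a few missed fixes for never splitting "Smith, Jr" into two authors.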

—Trappist the monk (talk) 11:56, 27 April 2014 (UTC)  MBisanz  talk 05:07, 4 May 2014 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.