Wikipedia:Bots/Requests for approval/EarwigBot 21


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard. The result of the discussion was

EarwigBot 21
Operator:

Time filed: 03:37, Monday, May 31, 2021 (UTC)

Function overview: Correct mismatched synonym authorities in taxon articles created by Qbugbot

Automatic, Supervised, or Manual: Automatic, partial supervision

Programming language(s): Python

Source code available: synonym_authorities.py

Links to relevant discussions (where appropriate): User talk:Qbugbot (permalink), Wikipedia talk:WikiProject Insects

Edit period(s): One time run

Estimated number of pages affected: 400–500

Namespace(s): Mainspace

Exclusion compliant (Yes/No): Yes

Function details:

In 2018, Qbugbot created 18,000 taxon stubs on insects and other arthropods. 1234qwer1234qwer4 noticed a systemic error in the bot-generated lists of synonyms in the taxoboxes, which were based on data from ITIS. In many cases, the bot alphabetized the list of synonym names but did not reorder the corresponding author names, leaving the two lists mismatched. Here is an example of this being fixed. The correct authors can be checked against ITIS as well as other sources like GBIF, which reference the original literature.

This bot task operates as follows:


 * 1) Examine each page created by Qbugbot
 * 2) Parse the taxobox for the list of synonyms and authors
 * 3) Pull the correct synonym-to-author mapping from ITIS (obtained via database download)
 * 4) If the authors in the article do not match ITIS, but match after reordering in the same manner Qbugbot used to order the synonyms, fix the order and save the page
 * 5) Any exceptional cases (if the article contains synonyms not listed in ITIS, or the author names do not match ITIS after reordering) are left for manual review
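
The check in steps 4 and 5 can be sketched roughly as follows. This is an illustration only, not the code in synonym_authorities.py: the function name `reorder_authors`, the status strings, and the sample taxa are hypothetical, and it assumes Qbugbot's ordering was a plain alphabetical sort of the synonym names with the authors left in ITIS order.

```python
from typing import List, Optional, Tuple

Pair = Tuple[str, str]  # (synonym name, author citation)


def reorder_authors(article: List[Pair], itis: List[Pair]) -> Tuple[str, Optional[List[Pair]]]:
    """Return ("ok" | "fixed" | "review", corrected pairs or None).

    article: pairs as they appear in the taxobox, in article order.
    itis:    pairs from ITIS, in ITIS order (assumed to be the order
             Qbugbot originally read them in).
    """
    correct = dict(itis)
    names = [name for name, _ in article]
    authors = [author for _, author in article]

    # Step 5: a synonym that ITIS does not list is left for manual review.
    if any(name not in correct for name in names):
        return "review", None

    expected = [correct[name] for name in names]
    if authors == expected:
        return "ok", None  # nothing to fix

    # Step 4: reproduce the suspected mistake (names sorted, authors
    # left in source order) and see whether it explains the article.
    wanted = set(names)
    source = [(n, a) for n, a in itis if n in wanted]
    buggy_names = sorted(n for n, _ in source)  # assumed sort order
    buggy_authors = [a for _, a in source]
    if names == buggy_names and authors == buggy_authors:
        return "fixed", list(zip(names, expected))

    # Authors differ from ITIS in some other way: manual review.
    return "review", None


itis = [("Zeta alba", "Smith, 1900"), ("Alpha beta", "Jones, 1850"), ("Mu nu", "Lee, 1920")]
article = [("Alpha beta", "Smith, 1900"), ("Mu nu", "Jones, 1850"), ("Zeta alba", "Lee, 1920")]
status, fixed = reorder_authors(article, itis)
print(status, fixed)
```

Only a mismatch that is fully explained by the reordering is corrected automatically; any other difference falls through to "review", which mirrors the conservative behaviour described in step 5.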

Example edits the bot would make, saved manually: 1, 2, 3

Edits the bot would make are listed here: User:EarwigBot/Task 21/Edits

Of the 18,000 pages created by Qbugbot, about 500 have this synonym issue, and EarwigBot can fix about 425 of them using the logic above. For the other 50–100 pages, editors have made a partial attempt to fix this issue, changed authors to diverge from ITIS, or the data in ITIS has changed; these cases will be reviewed manually.

Discussion
Should be straightforward, but let's make sure the wheels don't fall off. Primefac (talk) 11:28, 31 May 2021 (UTC)


 * Edits. No issues noticed. — The Earwig (talk) 02:45, 2 June 2021 (UTC)
 * Thank you for taking on this task! Your rules for what cases to leave for manual review seem very rational to me (though allowing the reordered author list to be a subset of the ITIS one in rule #4 and adding the synonyms missing in the article as part of the bot task would have been okay as well IMO). User:1234qwer1234qwer4 (talk)  14:28, 5 June 2021 (UTC)
 * Hey 1234qwer1234qwer4, my description may have been a bit imprecise. The bot will still fix pages if ITIS contains additional synonyms that are not in the article, as long as every synonym in the article is in ITIS (i.e. article ⊆ ITIS is OK but not ITIS ⊆ article). For this bot I only want to correct existing data, so I wasn't planning to add new synonyms from ITIS. — The Earwig (talk) 04:08, 6 June 2021 (UTC)
 * Ok. Do you log these somehow? User:1234qwer1234qwer4 (talk)  07:53, 6 June 2021 (UTC)
 * Sure, at the end of the run I'll make a list of all the pages that need manual review, including those with extra synonyms in ITIS. — The Earwig (talk) 18:27, 6 June 2021 (UTC)
 * Cool. That wasn't clear to me from point 5. User:1234qwer1234qwer4 (talk)  18:29, 6 June 2021 (UTC)
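
For readers following the subset rule discussed above (article ⊆ ITIS is acceptable, the reverse is not), here is a minimal sketch of how the synonym sets could be compared and labelled for the end-of-run log. The function name `classify_synonym_sets` and the label strings are hypothetical and are not taken from the bot's source.

```python
from typing import Iterable


def classify_synonym_sets(article_names: Iterable[str], itis_names: Iterable[str]) -> str:
    """Label the relationship between the article's synonyms and ITIS's."""
    article, itis = set(article_names), set(itis_names)
    if article - itis:
        # The article lists a synonym ITIS does not know: never auto-fixed.
        return "article_has_unknown"
    if itis - article:
        # article is a proper subset of ITIS: authors can still be fixed,
        # but the page is logged because ITIS has extra synonyms.
        return "itis_has_extra"
    return "same"


print(classify_synonym_sets(["Alpha beta"], ["Alpha beta", "Mu nu"]))  # itis_has_extra
```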

Primefac (talk) 11:33, 12 June 2021 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard.