Wikipedia:Bots/Requests for approval/William Avery Bot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

William Avery Bot 2
Operator: William Avery

Time filed: 12:11, Wednesday, February 12, 2020 (UTC)

Function overview: The template used on redirects from scientific names of organisms to their common names can take a parameter ('fish', 'insect', etc.) to subcategorise them. The parameter is often missing in cases where it could usefully be supplied. This bot will add the parameter where the correct value can be determined from the taxobox of the article that is the target of the redirect.

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python, pywikibot

Source code available: https://bitbucket.org/WilliamAvery/wikipythonics (entry point for this task: redirectClassifierBot.py)

Links to relevant discussions (where appropriate): I am not aware of any.

Edit period(s): One-time run to remove the current backlog. I will divide it into tranches using gcmstartsortkeyprefix and gcmendsortkeyprefix generator parameters. It might be re-run periodically in the future.
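As a rough illustration of the tranche idea, the sketch below builds raw MediaWiki API query strings for consecutive sort-key ranges of the category. The helper names, the example prefixes and the direct use of the API URL are assumptions made for illustration; the bot itself passes these generator parameters through pywikibot, and nothing here is taken from its source. No request is sent.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
CATEGORY = "Category:Redirects from scientific names"


def tranche_params(start_prefix, end_prefix, limit=500):
    """Build the query parameters for one sort-key tranche (hypothetical helper)."""
    return {
        "action": "query",
        "format": "json",
        "generator": "categorymembers",
        "gcmtitle": CATEGORY,
        "gcmsort": "sortkey",
        "gcmstartsortkeyprefix": start_prefix,  # listing starts at this prefix
        "gcmendsortkeyprefix": end_prefix,      # listing stops before this prefix
        "gcmlimit": str(limit),
    }


def tranche_boundaries(prefixes):
    """Pair consecutive prefixes into (start, end) tranches."""
    return list(zip(prefixes, prefixes[1:]))


# Example prefixes only; the real tranche boundaries are not stated in the request.
tranches = tranche_boundaries(["A", "H", "P", "Z"])
urls = [API_URL + "?" + urlencode(tranche_params(s, e)) for s, e in tranches]
```

Because each tranche ends where the next begins, the ranges do not overlap. Titles sorting at or after the last prefix would need one further run with no end prefix.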

Estimated number of pages affected: 6000–8000 (25–30% of 25000 category members)

Namespace(s): Mainspace

Exclusion compliant (Yes/No): Yes

Function details:

The task can be accomplished with a simple Python script.


 * 1) Retrieve pages in Category:Redirects from scientific names
 * 2) Fetch the HTML for the target page of each redirect, and use BeautifulSoup to get taxonomic data from the taxobox. Because the automatic taxobox system may be involved, examining the output HTML of the relevant taxobox is the cleanest route I can see to the required taxonomic information.
 * 3) Run the algorithm to determine the parameter value from the taxonomy
 * 4) Use mwparserfromhell to add the parameter value to the wikitext. There is a slew of redirects to the template, so it may appear under any of several names.
 * 5) Update page
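The taxobox-reading step (2) can be sketched roughly as follows. This is not the bot's code: it swaps in the standard library's `html.parser` for BeautifulSoup so that it runs without third-party packages, and the sample HTML is a simplified, hand-written stand-in for real taxobox output. The class name `TaxoboxParser` is hypothetical.

```python
from html.parser import HTMLParser


class TaxoboxParser(HTMLParser):
    """Collect 'Rank:' / taxon pairs from two-cell table rows."""

    def __init__(self):
        super().__init__()
        self.taxonomy = {}
        self._cells = None     # cell texts of the current row, or None outside a row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag == "td" and self._cells is not None:
            self._cells.append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            cells = [c.strip() for c in self._cells]
            # Keep only rows shaped like "Class:" | "Actinopterygii".
            if len(cells) == 2 and cells[0].endswith(":") and cells[1]:
                self.taxonomy[cells[0][:-1].lower()] = cells[1]
            self._cells = None

    def handle_data(self, data):
        if self._in_cell and self._cells:
            self._cells[-1] += data


# Simplified sample; real taxobox HTML carries more rows, classes and markup.
SAMPLE_HTML = """
<table class="infobox biota">
<tr><th colspan="2">Atlantic cod</th></tr>
<tr><td>Kingdom:</td><td><a href="/wiki/Animal">Animalia</a></td></tr>
<tr><td>Class:</td><td><a href="/wiki/Actinopterygii">Actinopterygii</a></td></tr>
<tr><td>Genus:</td><td><i><a href="/wiki/Gadus">Gadus</a></i></td></tr>
</table>
"""

parser = TaxoboxParser()
parser.feed(SAMPLE_HTML)
taxonomy = parser.taxonomy
```

Reading the rendered HTML rather than the wikitext means the same code sees the taxonomy whether it was typed into the article by hand or generated by the automatic taxobox system.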

Preliminary examination of a couple of thousand pages indicates many cases where 'fish' or 'insect' needs to be added, a few requiring 'crustacean', 'spider' or 'fungus', and none for 'plant'. (Plant articles are mostly *at* their scientific names.)
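One way the parameter value might be derived from the taxonomy is a small rule table, sketched below. The rules are illustrative assumptions, not the bot's actual algorithm: the request does not say which taxa map to which value, and 'fish' in particular covers more groups than the two classes listed here.

```python
# Hypothetical (rank, taxon, parameter value) rules, checked in order.
RULES = [
    ("class", "Insecta", "insect"),
    ("order", "Araneae", "spider"),
    ("subphylum", "Crustacea", "crustacean"),
    ("class", "Actinopterygii", "fish"),
    ("class", "Chondrichthyes", "fish"),
    ("kingdom", "Fungi", "fungus"),
    ("kingdom", "Plantae", "plant"),
]


def classify(taxonomy):
    """Return the parameter value for a rank -> taxon mapping, or None if unknown."""
    for rank, taxon, value in RULES:
        if taxonomy.get(rank) == taxon:
            return value
    return None  # no confident match: leave the redirect untouched


beetle = classify({"kingdom": "Animalia", "class": "Insecta", "order": "Coleoptera"})
mammal = classify({"kingdom": "Animalia", "class": "Mammalia"})
```

Returning `None` for anything unmatched keeps the sketch in line with the stated scope: the parameter is added only where the value can be determined.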

Discussion

 * Let's see how this does and go from there. Take all the time you need (no rush). -- The SandDoctor Talk 20:53, 7 March 2020 (UTC)
 * Thank you. I just this evening started looking at implementing this task with pywikibot, as the novelty of node.js is wearing off. William Avery (talk) 21:07, 7 March 2020 (UTC)
 * I have edited the BRFA information above to reflect my revised approach. William Avery (talk) 20:51, 10 March 2020 (UTC)
 * 50 edits made and results checked.
 * N.B. During the development process, on 9 March, I logged in under the bot account to create a bot password to use with pywikibot. I then inadvertently moved a page and did a couple of edits whilst logged in under the bot account. I'm aware I shouldn't make such edits using a flagged account. William Avery (talk) 07:50, 15 April 2020 (UTC)


 * Primefac (talk) 18:15, 22 May 2020 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.