Wikipedia talk:Automated taxobox system

This talk page can be used to discuss issues with the automated taxobox system that are common to the entire system, not just one of its templates. Discussions of this nature prior to 2017 can be found at Template talk:Automatic taxobox

Those familiar with the system prior to mid-2016 are advised to read Notes for "old hands".

30 June 2024 use stats update
30 June update

Mammal subprojects with articles tagged for both mammals and subproject:

Method: For the most part I use Petscan to search for articles with a talk page banner for a particular WikiProject and either Taxobox, or any of Automatic taxobox+Speciesbox+(Infraspeciesbox and/or Subspeciesbox (depending on whether botanical/zoological code is relevant)), and record the results. Example search for algae with automatic taxoboxes (search terms are in the Templates&Links tab in Petscan). For viruses, I search for Virusbox rather than the other automatic taxobox templates. For plants, I sum the results for the Plants, Banksia, Carnivorous plants and Hypericaceae projects. "Total" is derived from the Template Transclusion Count tool (e.g. results for Speciesbox: https://templatecount.toolforge.org/index.php?lang=en&namespace=10&name=Speciesbox#bottom), and is not actually the sum of the results for individual projects (some articles have talk page banners for multiple WikiProjects, and would be counted twice if rows were summed). I started compiling these stats in April 2017, and have been updating roughly every six months since December 2017. I've kept my method consistent; perhaps I should have included all of the automatic taxobox templates (Hybridbox, Ichnobox, etc.), but I didn't do so at the beginning, and the other templates aren't used in very many articles.
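To illustrate why "Total" is not the sum of the per-project rows, here is a minimal Python sketch. The project names and article titles are made up for illustration and are not Petscan results; the point is only that an article tagged for two WikiProjects appears in both rows but once in a count of distinct articles.

```python
# Hypothetical data: article titles carrying each WikiProject's talk page banner.
project_articles = {
    "Mammals": {"Lion", "Tiger", "Bat"},
    "Cats": {"Lion", "Tiger"},
    "Bats": {"Bat"},
}


def summed_rows(projects):
    """Add up the per-project counts; multiply-tagged articles are counted repeatedly."""
    return sum(len(articles) for articles in projects.values())


def distinct_total(projects):
    """Count each article once, regardless of how many projects tag it."""
    return len(set().union(*projects.values()))


print(summed_rows(project_articles))     # 6
print(distinct_total(project_articles))  # 3
```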

Caveat: The remaining manual taxoboxes in projects with a high percentage of automatic taxoboxes mostly have some kind of "problem". I have periodically reviewed all the manual taxobox articles in projects with fewer than 207 manual taxoboxes, and chose not to convert them to automatic taxoboxes at that time (however, it has been a while since my last review, so there are probably a few recently included articles I haven't reviewed). "Problems" may include:
 * Fossil taxa; fossil classifications may be derived from multiple sources, and the present classification on Wikipedia may include mutually incompatible hypotheses. Fossil taxa are often not linked from extant parent taxa.
 * Synonymy; there is some obvious synonymy issue; e.g., a species is in a genus which redirects (as a synonym) to another genus; maybe the species article needs to be moved or maybe the genus should be reinstated
 * Common names; articles with common name titles may not correspond to taxa, but still have manual taxoboxes. In some cases Paraphyletic group may be appropriate, in others the taxobox should be removed
 * Parasites and pathogens; articles on parasites and pathogens may be tagged for the WikiProject of the organisms they infect. Higher level taxonomy templates for the parasites may not yet exist, and the classification presented in manual taxoboxes may not be up to date.

I've added WikiProject Extinction to the table this time. WikiProject Protista continues to have tags added to existing articles, with a net increase in the number of tagged articles with a manual taxobox. WikiProject Dinosaurs recently merged a bunch of largely redundant articles for nodes in a cladogram, resulting in a net decrease in the number of articles tagged for that project. Plantdrew (talk) 17:12, 30 June 2024 (UTC)


 * Thanks for doing these updates. Good to see progress.  Did you include WikiProject Cacti in with the Plants totals?  It doesn't look like that template automatically adds it to the parent WP like the other plants subprojects. awkwafaba (📥) 19:01, 1 July 2024 (UTC)
 * I did not include WikiProject Cacti in the totals. However, for the past several years, I've been running the "Taxon pages not tagged in WP ToL clade projects" query on your user page to ensure all taxobox articles are tagged for a project immediately before I start compiling an update of these numbers (and in general I run your query every couple of weeks, but haven't made it a priority to tag redirects). I did pick up several cacti articles and added WikiProject Plants tags before I started this update. None of the plant subprojects get picked up in a Petscan search for WikiProject Plants, so I have always done a separate search for Banksia/Carnivorous plants/Hypericaceae and added those results to the results for Plants when presenting these numbers (the other 3 subprojects aside from Cacti do contribute to the numbers reported in the assessment table for WikiProject Plants). Plantdrew (talk) 20:07, 1 July 2024 (UTC)

Automatic child taxa?
I don't really know how else to title this. I'm one of the editors on a wiki which focuses on recording fictional species made for a large collaborative speculative evolution project, and at some point, for much the same reason you all did, we came up with an automated taxonomy system to reduce the pain of updating taxonomy for hundreds, even thousands of species. However, ours works a bit differently from Wikipedia's, storing all taxonomy data in a centralized place--a JSON file. As all the data is in one place, it also allowed us to easily reverse the direction and display, for instance, all descendant taxa as well.

Looking at how Wikipedia does taxonomy, I noticed that there are places where it would make sense to automatically generate a list of descendant taxa. Most notably, the subdivision section of the automatic taxobox, and perhaps various other lists of genera and species around the wiki. I can't imagine pages like the list of Asteraceae genera being anything short of a nightmare to update and maintain, assuming its reputation among botanists is earned, and I could see it being worse for decently large mid-level taxa that are in a state of flux due to several new studies being published.

I think that the current system Wikipedia is using might make generating lists of child taxa impractical, but on the other hand, I wonder if the changes needed to support it would actually be considered worthwhile to those involved in this wikiproject. I know that for the aforementioned wiki I'm part of, this also made it much easier to browse taxonomy in general because readers and editors alike could reliably access related and descendant taxa from anywhere. And while editing is moderated on our wiki so we haven't had need for this, I can't help but imagine it would make it a bit easier to spot and fix vandalism as well because it would be plainly visible from higher taxa (which one might be more likely to view in some cases). Any thoughts on the idea? Disgustedorite (talk) 21:24, 20 July 2024 (UTC)
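For readers unfamiliar with the centralized approach described above, here is a minimal Python sketch of how a single child-to-parent file can be reversed to list descendants. The JSON layout, the taxon entries, and the function names are all assumptions made for illustration; this is not the other wiki's actual file format or code.

```python
import json
from collections import defaultdict

# Hypothetical centralized taxonomy file: each taxon maps to its parent taxon.
TAXONOMY_JSON = """
{
  "Felidae": "Carnivora",
  "Canidae": "Carnivora",
  "Panthera": "Felidae",
  "Felis": "Felidae",
  "Canis": "Canidae"
}
"""


def build_children_index(parent_of):
    """Reverse the child -> parent map into a parent -> [children] map."""
    children = defaultdict(list)
    for taxon, parent in parent_of.items():
        children[parent].append(taxon)
    return children


def descendants(taxon, children):
    """List all descendants depth-first, siblings in alphabetical order.

    Assumes the data contains no cycles.
    """
    result = []
    for child in sorted(children.get(taxon, [])):
        result.append(child)
        result.extend(descendants(child, children))
    return result


parent_of = json.loads(TAXONOMY_JSON)
children = build_children_index(parent_of)
print(descendants("Carnivora", children))
# ['Canidae', 'Canis', 'Felidae', 'Felis', 'Panthera']
```

Because the whole map is loaded at once, the reverse index can be rebuilt from scratch on every run, which is only economical while the number of taxa stays small.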


 * When automatic taxoboxes were first being developed (ca. 2011), there was an attempt to include automatic child taxa that was eventually abandoned. I don't know the details about why it didn't work.
 * There is a script (User:Jts1882/taxonomybrowser.js) that allows you to see the taxonomy in a tree view, with children.
 * However, not all articles are using automatic taxoboxes (~88% are using them, but that still leaves 50,000+ articles with manual taxoboxes). And Wikipedia doesn't have articles for every genus, let alone every species. Plantdrew (talk) 22:44, 20 July 2024 (UTC)
 * On our wiki we actually manually update some lower taxa on a species by species basis while the higher taxa are what is automated, since for a collaborative speculative evolution project with nearly 10 times as many species as there are dinosaurs and upwards of 200 more added every year, the frequency at which those are defined and updated can make dinosaur researchers jealous.
 * I just skimmed the source code of the taxonomy browser and...well, I suppose the processing impact doesn't matter that much when it's run on your own machine, lol. I will say using the search API and taking advantage of taxonomy being stored exclusively within the template namespace is pretty smart. Our strategy was to index child taxa and then search that index taxon by taxon, though having far fewer species than have been described in real life (and not actually having a page for every member of a genus of insects) gives us the advantage of not needing to actually maintain an index by hand (we have few enough taxa that it's economical to index it over again every time).
 * If I were to take a guess, I can see that Wikipedia has no extensions like Semantic MediaWiki (understandable given its current state), Cargo (I wouldn't use it either), or even DynamicPageList3 (performance hell), which leaves the search API and checking each and every result as basically the only option, which, even if it was possible in a module, I could imagine hitting memory limits fast. On our wiki, we're looking into making a sort of poor man's Semantic MediaWiki using an autonomous bot that records and indexes information about pages in various dedicated JSON files...but a bot-dependent system wouldn't fly here, right?
 * Although, Wikipedia does have CategoryTree, which I think is what our wiki attempted to use for taxonomy browsing...in 2007, when there were only a few hundred species. But in any case, using the various parameters of its parser function, it might be possible to twist it into something like a poor man's DPL3 by having each taxonomy template page automatically be added to a category like "child taxon of Parentsnameidae" and then using several instances of the CategoryTree parser function (or just one that's been quite heavily altered by the Lua script after it was generated, if that's possible) to display it. But that's also a lot of potential categories... Disgustedorite (talk) 00:33, 21 July 2024 (UTC)
 * While I agree it would be nice to have automatic child taxa, it really isn't practical for Wikipedia. Wikipedia has to be open for everyone to edit, which is why we have the template system over centralised JSON or Lua module methods, and NPOV means we have to be able to show alternative taxonomies over one agreed system.
 * The taxonomy browser was developed as a tool to manage the taxonomy templates. It picks up the parent-child relationship of templates, which most of the time is a taxonomic relationship but gets more confusing where there are alternative taxonomies. And any JS additions have to be opt-in. For a Wiki that could impose one taxonomy with a centralised JSON source, a JS addition to the taxobox would be possible. —  Jts1882  | talk 10:37, 21 July 2024 (UTC)
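As a rough illustration of the search-API idea mentioned in this thread, the Python sketch below only builds a MediaWiki search request restricted to the Template namespace (namespace 10); it does not send it. The search string is an assumption about how child taxonomy templates might be found, and the function name is hypothetical. It is not taken from User:Jts1882/taxonomybrowser.js and may not match what that script actually does.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"
TEMPLATE_NAMESPACE = 10  # taxonomy templates live in the Template namespace


def child_template_search_url(parent, limit=50):
    """Build (but do not send) a search request for templates naming `parent` as their parent."""
    params = {
        "action": "query",
        "list": "search",
        "srnamespace": TEMPLATE_NAMESPACE,
        "srlimit": limit,
        "format": "json",
        # Assumed search string: taxonomy template pages whose wikitext
        # contains "parent=<name>". The real script may query differently.
        "srsearch": f'intitle:"Taxonomy/" insource:"parent={parent}"',
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(child_template_search_url("Felidae"))
```

Each result would still need to be checked individually, and the parent-child links found this way are template relationships, which can diverge from a single taxonomy where alternative classifications are supported.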