User:Plantdrew/Wikipedia and taxonomic databases

PREMISE: Wikipedia should (at least) mention every known species of organism in the world.

The situation
Taxonomy is subjective to a degree.

Under one definition of species, species are delimited as populations that can reproduce with each other, but not with other populations. But nature doesn't operate in absolutes, and has gray areas. Consider populations of birds on two islands. Sometimes birds fly from one island to the other and reproduce successfully. If "sometimes" is once in 10,000 years, they may be considered different species. If "sometimes" is 10% of matings every year, they are unlikely to be considered distinct at a level that should be recognized in taxonomic nomenclature. At some intermediate level of reproduction, they might be considered different subspecies in a single species.

Taxonomists can disagree about precisely which species should be recognized. However, for many species (or putative species), there is a broad consensus among relevant experts as to whether they should be recognized (or not).

Because disagreement is possible, many species names have been published which are not recognized as distinct species by general consensus. The number of species names ever published is several times the number of species recognized as distinct. Some published names fail to meet all the requirements of the relevant nomenclatural code; these are easy to deal with. Most published names are not recognized as distinct species, according to a subjective consensus.

Additionally, a single species may have been described under different names by researchers working independently of each other. This was more common in the past.

There are rules that must be followed in order to establish a new species name. As long as those rules are followed, anybody can establish a new species name. It is entirely possible to establish a new species name for your pet cat within the framework of the rules (but that doesn't mean that anybody else has to seriously entertain your proposal that your cat is a different species from other housecats).

On taxonomic databases
There is no single authoritative taxonomic database (or any other resource) that lists every known species of organism in the world. Encyclopedia of Life, GBIF and Catalogue of Life are attempts at this. These databases have been constructed by aggregating other databases that focus on smaller groups of organisms. Because of the sheer number of records, individual species records have not reliably been curated by humans. Source databases may disagree on subjective matters, with mutually incompatible taxonomic opinions all presented as correct in one of the aggregating databases. And objective errors may persist in an aggregating database after they have been corrected in a source database. The quality of the global all-species databases is insufficient for Wikipedia, and for the most part, they are not used as references in Wikipedia.
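The aggregation problem described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the source names ("SourceA", "SourceB"), the species names, and the record layout are placeholders invented for illustration, not the data model of any real database. It only shows how merging records from sources that hold different subjective opinions can leave one name with two incompatible statuses.

```python
from collections import defaultdict

# Hypothetical records: (source database, scientific name, status).
# The sources and names are placeholders, not real taxonomic data.
RECORDS = [
    ("SourceA", "Examplea alba", "accepted"),
    ("SourceB", "Examplea alba", "accepted"),
    ("SourceA", "Examplea minor", "accepted"),
    ("SourceB", "Examplea minor", "synonym of Examplea alba"),
]


def find_conflicts(records):
    """Return the names whose status differs between source databases."""
    statuses = defaultdict(dict)
    for source, name, status in records:
        statuses[name][source] = status
    # Keep only names for which the sources give more than one status.
    return {
        name: by_source
        for name, by_source in statuses.items()
        if len(set(by_source.values())) > 1
    }


conflicts = find_conflicts(RECORDS)
for name, by_source in conflicts.items():
    print(name, by_source)
```

An aggregator that simply keeps both records presents "Examplea minor" as an accepted species and as a synonym at the same time; resolving that requires a human to choose which opinion to follow.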

There are high-quality databases with a narrower focus than all the species in the world. Some are focused on all species (or a particular group of species) in a particular part of the world, and for the most part, these are not used as references on Wikipedia (unless their data is rich enough to support a substantial amount of prose content). Some are focused on all species within a particular taxonomic group, across the world. These are very important as sources for Wikipedia.

There are several kinds of taxonomic databases. Some databases seek only to record all scientific names published for a particular group of organisms, with the bibliographic details of their publication; they may detail objective reasons that a name was not published in compliance with the relevant nomenclatural code, but do not take a position on subjective taxonomic matters.

A few databases seek to record subjective taxonomic matters, and the sources where these positions were expressed, but the database itself doesn't endorse a particular subjective view.

Many taxonomic databases seek to present a single authoritative classification, endorsing particular subjective views. These are the databases that are valuable to Wikipedia.

Taxonomic databases accept corrections, and Wikipedia (and Wikispecies) editors have pointed out errors that have been corrected. Wikispecies editors have coauthored scientific papers establishing names for taxa that lacked a name published in compliance with the relevant nomenclatural code (having noticed the problem in the course of their edits).

The reliable academic publication that SPECIESOUTCOMES refers to is, in this case, Steindachner's 1877 original description of the species, available on BHL. It's in German and runs not quite 2 pages. Few species articles currently cite the original description directly (but there's almost always an indirect citation with the authority in the taxobox). Original descriptions may not be available online; BHL has a lot of taxonomic literature that is out of copyright, but uploading a publication that describes a single species isn't going to be a high priority for them. Other original descriptions may not be out of copyright. Many are in a foreign language (particularly Latin in the 19th century). The original description of a species described a long time ago may not be sufficient to distinguish it from the larger number of related species that are known now (this is why H. johnii was redescribed in 2017).

WP:SPECIESOUTCOMES begins with "Species that have a correct name (botany) or valid name (zoology) are generally kept. Their names and at least a brief description must have been published in a reliable academic publication to be recognized as correct or valid. Because of this, they generally survive AfD." So far in this thread, nobody has mentioned why species articles are sourced to taxonomic databases. The database is the source that shows that a name is correct/valid. Taxonomy is subjective to a degree; taxonomists can disagree about whether two populations represent different subspecies, or aren't worth recognizing as taxonomically distinct at all; going the other way, they can disagree about whether there are two species or one species with two subspecies. If we are going to present a list of species regarded as valid/correct in a genus article, that list needs to come from a single source, and in 2022 that source is usually going to be a taxonomic database (we can't just piece together a list of species in a genus from the original descriptions; there are several times as many species that have been described as are currently recognized by a consensus of taxonomists). Original descriptions are primary sources; the act of describing a species means that the person doing so thinks it is valid. Taxonomic databases are secondary or tertiary sources that show that somebody other than the original describer agrees a species is valid.

example doesn't mention the most important fact; it is a valid species, according to FishBase. Probably because that's taken as a given; taxonomic databases list invalid species as well as valid ones, and absolutely nobody is churning out sub-stubs for species that aren't considered valid. There are two high-quality taxonomic databases for fish; the other is Catalog of Fishes. For the vast majority of fish species, both databases are in agreement as to their validity. WikiProject Fishes has chosen to follow FishBase (in the early years of Wikipedia, it wasn't possible to link directly to species records in Catalog of Fishes). Every article for a fish species should cite FishBase; that's the baseline to establish that the species is valid. Fish articles ought to cite the original description (if the species was described recently), or a detailed redescription (if it was described long ago). If there is active disagreement among taxonomists about the validity of a species, that ought to be mentioned as well (e.g. in the article for species X, "Catalog of Fishes recognizes species X and Y as distinct, but FishBase considers Y a synonym of X"). Plantdrew (talk) 21:22, 20 September 2022 (UTC)

Related

 * User:Rschen7754/FAQ
 * WikiProject Highways/Rockland County Scenario (cf. Apororhynchus)