User:Plantdrew/pinegoogletest

Pine googletests
Google result number for pine scientific/vernacular names. All searches were performed with -wikipedia as a search term. Reported results are the value given at the bottom of the last page of results (arrived at last page by entering "990" for Google's "start=" parameter). Bolded entries are vernacular names beating the scientific name, or the scientific name if unbeaten by vernacular names.
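As a rough sketch of the search procedure described above, the URL for one query could be assembled like this. The function name `last_page_url` is hypothetical, and the sketch assumes Google's usual `q` and `start` query parameters; it only builds the URL and does not fetch or parse results.

```python
from urllib.parse import urlencode


def last_page_url(name: str) -> str:
    """Build a Google search URL for an exact-phrase query that excludes 'wikipedia'."""
    # Quote the name for an exact-phrase match and add -wikipedia as a search term.
    query = f'"{name}" -wikipedia'
    # start=990 jumps to the last page of results, as described in the note above.
    return "https://www.google.com/search?" + urlencode({"q": query, "start": 990})


print(last_page_url("Pinus cembroides"))
```

The result count still has to be read by hand from the bottom of the page that loads.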


 * "Pinus maximartinezii" 302
 * "Big cone pinyon" 69
 * "Martinez pinyon" 59
 * "maxipiñon" 42


 * "Pinus cembroides" 421
 * "Mexican pinyon" 390
 * "Mexican nut pine" 104
 * "Mexican stone pine" 49


 * "Pinus pinaster" 259
 * "Pinus pinaster" -"bark extract" 238
 * "Maritime pine" 336 (lots of results for a medicinal extract from the bark)
 * "Maritime pine" -"bark extract" 282 (still has results with just bark or extract though)
 * "Cluster pine" 409 (includes addresses on streets named "Cluster Pine")
 * "Cluster pine" -"bark extract" 417


 * "Pinus durangensis" 362
 * "Durango pine" 482 (lots of irrelevant results though). Some English sources are using the ambiguous Spanish term "ocote"


 * "Pinus merkusii" 337
 * "Sumatran pine" 347
 * "Merkus pine" 300
 * "Mindoro pine" 192


 * "Pinus squamata" 320
 * "Qiaojia pine" 167


 * "Pinus sibirica" 404
 * "Siberian pine" 305
 * "Siberian stone pine" 232


 * "Pinus kesiya" 300
 * "Khasi pine" 358
 * "Benguet pine" 317
 * "Luzon pine" 125
 * "Khasia pine" 169


 * "Pinus krempfii" 237
 * "Krempf's pine" 154
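The bolding rule stated at the top (bold every vernacular name that beats the scientific name, otherwise bold the scientific name) can be sketched as a small function. The name `bolded` is hypothetical; the counts are taken from the lists above.

```python
def bolded(scientific, vernacular):
    """Return the names that would be bolded for one species group.

    scientific: (name, count) for the scientific name.
    vernacular: list of (name, count) for the vernacular names.
    """
    sci_name, sci_count = scientific
    # Vernacular names with strictly more results than the scientific name.
    winners = [name for name, count in vernacular if count > sci_count]
    # If nothing beats the scientific name, the scientific name is bolded.
    return winners if winners else [sci_name]


merkusii = bolded(
    ("Pinus merkusii", 337),
    [("Sumatran pine", 347), ("Merkus pine", 300), ("Mindoro pine", 192)],
)
sibirica = bolded(
    ("Pinus sibirica", 404),
    [("Siberian pine", 305), ("Siberian stone pine", 232)],
)
print(merkusii, sibirica)
```

Raw counts are noisy (see the notes on "Cluster pine" and "Durango pine"), so the output is only as good as the relevance of the underlying hits.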

Dab term tables
23 March 2015

Updated table 16 October 2018

A couple of notes. These figures are quick and dirty. "Total" for a project is ALL parenthetically dabbed articles (including anatomical terms and biologists). For most projects, genera account for almost all of the total, but for plants, mammals and birds, there are a large number of breeds/cultivars and individual domestic animals with a dab term. And the figures in the common name column may include articles that aren't on genera (e.g. Operculum (fish)).

comment
Ambiguity of "(genus)" as a disambiguator should not be underestimated. The last time this came up for plants, I did some research. I looked at the first 60 (alphabetically) plant genera that had a parenthetical disambiguator of "(plant)" or "(genus)". 22 of 60 were already disambiguated against an animal genus on Wikipedia. Checking against GBIF and WoRMS, I found that 45 of the 60 plant genera had a corresponding animal genus published. Granted, that included some animal genera best treated as synonyms, but there were also recognized animal genera not yet on Wikipedia. At any rate, for the plants I looked at, "(genus)" is still too ambiguous in somewhere between 35% and 75% of cases.
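For reference, the two bounds follow directly from the sample counts. This is only a quick check of the arithmetic; the variable names are illustrative.

```python
sample = 60          # plant genera examined
on_wikipedia = 22    # already disambiguated against an animal genus on Wikipedia
published = 45       # have a corresponding published animal genus (GBIF/WoRMS)

low = on_wikipedia / sample   # lower bound on the share where "(genus)" is ambiguous
high = published / sample     # upper bound, counting animal genera that may be synonyms

print(f"{low:.1%} to {high:.1%}")
```

The lower bound works out to a little under 37%, so the "35%" in the text is a rough, slightly conservative figure.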

Are 35%+ of all plant genera ambiguous with animal genera? No. But many scientific names are constructed by slapping a Latinate suffix on a non-Latin personal name (as in Gordonia), or by creating a Greco-Latinate compound that nobody would've recognized 2000 years ago (as in Actinopeltis). If these weird hybrid terms are ambiguous with anything, it's almost always going to be another genus constructed using the same rules, and "(genus)" won't work to disambiguate. Then there are scientific names taken directly from a term that had meaning in classical Greek/Latin. Obviously, these are ambiguous from the start with the classical term, but "(genus)" might work to disambiguate these, at least as long as another biologist didn't also find the classical term evocative and appropriate it for another genus (as happened with Laelaps and Echidna (disambiguation)).

Going through WikiProject Arthropod tagged articles with this search, I found 132 articles that had a parenthetical disambiguatory term. 106 of these articles were on genera (the remainder being anatomical terms, biologists or taxa at ranks other than genus). 61 used "(genus)" as the dab term, and 45 used a different dab term (indeed, a horrifying mess of terms; a crab genus might be dabbed with "(crab)", "(arthropod)", "(crustacean)" or "(decapod)"). Of the 45 not using "(genus)", 31 can NOT use genus as there is already an article or redirect on Wikipedia for a different genus with the same name. Best case, if you try to standardize "(genus)" as the dab term for arthropods, you get 75 of 106 articles where it is sufficient to disambiguate (the 61 already using it plus the 14 that could switch). That doesn't strike me as especially consistent. If you go with "(animal)" it works in 99 of 106 cases; "(animal)" doesn't work for 2 articles where a genus is ambiguous with an anatomical feature in an animal, and 5 articles where there's an article or redirect for a homonymous animal genus (in two of these, there doesn't appear to be a replacement name yet for the junior homonym). If you go with "(arthropod)", it works in 101 of 106 cases (doesn't work for 1 genus/anatomy term (scutellum) and 4 homonymous genera that are arthropods in both uses (Battus, Cyclopyge, Harpagomorpha and Zalmoxis)).
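The coverage figures for each candidate dab term can be reproduced from the counts in the paragraph above. This is just a tally of those counts, with illustrative variable names.

```python
genera = 106              # arthropod genus articles with a parenthetical dab term
uses_genus = 61           # already use "(genus)"
other_term = 45           # use some other dab term
blocked_for_genus = 31    # of those 45, cannot use "(genus)" (homonymous genus exists)

coverage = {
    # Articles already at "(genus)" plus those that could move to it.
    "genus": uses_genus + (other_term - blocked_for_genus),
    # Fails for 2 genus/anatomy clashes and 5 homonymous animal genera.
    "animal": genera - 2 - 5,
    # Fails for 1 genus/anatomy clash and 4 homonymous arthropod genera.
    "arthropod": genera - 1 - 4,
}

for term, works in coverage.items():
    print(f"({term}): {works} of {genera} ({works / genera:.1%})")
```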

No one dab term is going to work 100% of the time. If you want the fewest number of dab terms covering the largest number of cases, (animal) is going to work far better than (genus). But (animal) is currently hardly used at all on Wikipedia and implementing it would require many moves. The current status quo is that across many organismal WikiProjects, most genera that need to be dabbed use the broadest common name(s) associated with the WikiProject's scope (e.g. fish, gastropod, etc.). The table below shows the current status quo.

April 10 2017
Queried Petscan on 4/10/17 for the WikiProject template on the talk page and either "taxobox" (manual) or "automatic taxobox" and "speciesbox" (auto) on the article page. Total is via transclusion count of the template.

21 April 2018
I seem to have gotten into the schedule of running statistics on automatic taxobox usage every 4 months, so here is the latest update, one year in from when I began tracking this (stats compiled April 15):

Commentary
Numbers for the individual projects are derived from searches for project banner templates on the talk page and the various manual/auto taxobox templates in the article. This searches for Amphibian and Reptiles articles with manual taxoboxes; modify the values under the Templates&links tab to run other searches. The Wikipedia line is derived from transclusion counts for the various taxobox templates (e.g. Taxobox), and doesn't take WikiProject banners into account. Some taxon articles have multiple WikiProject banners (most frequently with Palaeontology and something else) and are included in the count above for each WikiProject. Some taxon articles don't have any WikiProject banners; the Wikipedia line does catch these (most frequently these are weird protists and palaeontological animals).

Some highlights. In December, 5 WikiProjects used automatic taxoboxes in more than 50% of articles. Now 13 WikiProjects (of 26 tracked) are using automatic taxoboxes in the majority of articles. The 3 WikiProjects with the largest number of articles (Lepidoptera, Plants and Insects) each saw several thousand articles converted to automatic taxoboxes. Overall, almost 1/3 of taxon articles are now using automatic taxoboxes, and if the pace over the last year continues, all taxon articles could have automatic taxoboxes in 4.5 years.

However, I'm not sure that universal use of automatic taxoboxes should be a goal. WikiProject Gastropods (December 2016), WikiProject Fungi (November 2014) and WikiProject Arthropods (March 2012) have had proposals to switch to automatic taxoboxes that have attracted opposition. These three projects are largely still using manual taxoboxes.

1 January 2019
Mammal subprojects with articles tagged for both mammals and subproject:

26 June 2019
Mammal subprojects with articles tagged for both mammals and subproject:

1 January 2020
Mammal subprojects with articles tagged for both mammals and subproject:

1 July 2020
Mammal subprojects with articles tagged for both mammals and subproject:

6 January 2021 update
Mammal subprojects with articles tagged for both mammals and subproject:

30 June 2021
30 June 2021 update

Mammal subprojects with articles tagged for both mammals and subproject:

30 December 2021
30 December 2021 update

Mammal subprojects with articles tagged for both mammals and subproject:

30 June 2022
30 June 2022 update

Mammal subprojects with articles tagged for both mammals and subproject:

30 December 2022
30 December 2022 update

Mammal subprojects with articles tagged for both mammals and subproject:

30 June 2023
30 June 2023 update

Mammal subprojects with articles tagged for both mammals and subproject:

30 December 2023
30 December update

Mammal subprojects with articles tagged for both mammals and subproject:

30 June 2024
30 June update

Mammal subprojects with articles tagged for both mammals and subproject:

Blank usage table
update

Mammal subprojects with articles tagged for both mammals and subproject:

IOC exceptions
Talk:Fawn-coloured lark Talk:Common gull Talk:Australian wood duck Talk:American purple gallinule Talk:Common raven Talk:Gray-lined hawk Talk:Gray thrasher Talk:Slate-coloured grosbeak Talk:Tricoloured munia Talk:Dull-coloured grassquit Talk:Slate-coloured seedeater Talk:Mouse-coloured tapaculo Talk:Gray-throated warbling finch Talk:Multicoloured tanager Talk:Gray-barred wren Talk:Mouse-coloured penduline tit Talk:Grayish baywing Talk:Sand-coloured nighthawk Talk:Gray-headed kite Talk:Blue-gray tanager Talk:Gray kingbird Talk:American gray flycatcher Talk:Gray catbird Talk:Gray-crowned rosy finch Talk:Blue-gray gnatcatcher Talk:Black-throated gray warbler Talk:Gray vireo Talk:Gray hawk Talk:Osprey Talk:Kererū Talk:Northern New Zealand dotterel Talk:New Zealand dotterel

Other
DAB cleanup
 * Olive (disambiguation) (no Neelix involvement, olive snail entry seems legit, though the bird and primate entries are partial title matches)
 * Pearly (no Neelix involvement, a bunch of legit entries, but also a bunch of partial title matches for organism common names)
 * Western Wood (non organism entries, but the organisms are partial title matches)
 * Giant spiny (move over redirect to giant spiny stick insect or giant spiny walking stick?)
 * Hoary (not just organisms listed, but all entries are partial title matches or a dictionary definition. redirect to Wiktionary?)

Draft
Incorporate with User:Plantdrew/Essay

Equating the avian sense of "cardinal" with Cardinalidae appears to be a Wikipedia innovation.

To head off WP:COMMONNAME immediately: "the name that is most commonly used (as determined by its prevalence in a significant majority of independent, reliable English-language sources)" for the subject of this article is Cardinalidae. The WP:OFFICIAL English name that isn't the scientific name is "Cardinals, Grosbeaks and allies".


 * There's never been an encyclopedia the size of Wikipedia, and I'm not sure that it can really be considered to be written overall for the "general public" at this point. While there are many general interest articles, there are even more articles that are of interest to some particular niche audience. Sports fans read articles on professional athletes; if an athlete article mentions a notable achievement, it's not going to give the full rules of the sport that provide full context to that achievement. If an article on an episode of a TV series mentions an event relevant to an ongoing plot point, it's not going to give a full discussion of how that plot point has developed over all previous episodes. While that doesn't mean we should embrace jargon, I do think we can assume that for most niche topics, the likely audience already has some background in the broader topic. The majority of our organism articles are only going to be of interest to a small subset of people, who will be familiar with the concept of scientific names, and not necessarily intimidated by them.

WP:COMMONNAME was written at a time before it was clear that Wikipedia was going to be the 5th most popular website in the world, and before search engines were quite as good as they are today. In the early days, we couldn't be sure that people searching for "Bill Clinton" would necessarily find a Wikipedia article titled William Jefferson Clinton. In order to make sure people found that article we put it at the COMMONNAME ("Bill Clinton" is clearly the most likely search term). "Common name" happens to be a term of art in taxonomy; some taxonomically interested Wikipedians back in the early days interpreted COMMONNAME to mean "avoid scientific names" at all costs. This has led to some long-standing titles where scientific names are avoided, but it was never considered whether the title used actually helped readers find the subject they were interested in. People looking for a bird called a cardinal are almost certainly looking for northern cardinal, an iconic and common species in eastern North America, but cardinal (bird) is about a family of some 50 species, only 3 of which are known as "cardinals"; but at least the awful scientific name Cardinalidae was avoided. Pig was first created as an article about domestic pigs but early in its history was changed to cover the genus Sus; scientific name avoided, but do people searching for "pig" really want to read about the genus (and if they're interested in a larger group of animals than domestic pigs, who are we to assume they aren't also interested in javelinas, which are in the same suborder but a different family)? Turkey (bird) is about the genus Meleagris. Every year it sees a massive spike in page views during the American Thanksgiving holiday and a smaller spike for Christmas.

(genus)

 * 1) Mandragora (genus): Mandragora is a dab page
 * 2) Cassia (genus): Cassia
 * 3) Musa (genus): Musa
 * 4) Lotus (genus): Lotus
 * 5) Asparagus (genus): Asparagus
 * 6) Aster (genus): Aster
 * 7) Gloxinia (genus): Gloxinia
 * 8) Sequoia (genus): Sequoia
 * 9) Stevia (genus): Stevia
 * 10) Vanilla (genus): Vanilla
 * 11) Citronella (genus): Citronella
 * 12) Opopanax (genus): Opopanax
 * 13) Cotyledon (genus): Cotyledon is an article about plant anatomy, and a clear primary topic
 * 14) Crypsis (genus): Crypsis is a biology topic and mentions plants
 * 15) Jarilla (genus): Jarilla is a dab page
 * 16) Phyla (genus): Phyla is a dab page mentioning another biology sense; the plural of phylum. Traditional botanical equivalent of phylum is division (botany), which redirects to phylum. Division (horticulture) another plant related meaning. "Phylum (plant)"/"Phyla (plant)" not really likely search terms for the taxonomic rank between kingdom and class.
 * 17) Pimenta (genus): Pimenta
 * 18) Pleomele (genus): Pleomele

draft reply

 * Keep, Oppose redirecting. The CONTENTFORK is the result of the nominator copying content from species/genus articles into the tribe article shortly before they opened this AfD proposing to redirect the species/genus article they had copied from. Using AfD to arrive at a consensus for which CONTENTFORKed title should be preserved when the nominating editor DISCOVERS a content fork is appropriate. Using AfD to ratify a resolution to a content fork the nominating editor CREATES is not appropriate. There is a WP:TRAINWRECK; Knulliana had significantly more information when the AfD was initiated than any other nominated article. I think it could stand alone as a stub (not a sub-stub) in its state at the time of the AfD nomination. It has since been expanded, along with other articles nominated in this AfD. The genus article Knulliana gets more page views than the tribe article Bothriospilini (12/day long term vs. 1/day). Tribes are taxonomic minutiae; they are rarely mentioned in sources with SIGCOV of a species. Families are regularly mentioned. Genera are inherently mentioned as the first word in the binomial name of a species. Readers might reasonably expect to find Wikipedia's most detailed coverage of a species in a genus article. As a test case, this AfD seems to seek to set a precedent that Wikipedia's most detailed coverage of a species could be found wherever it "fits" (by article size). Species might be covered in detail in an article for genus/tribe/subfamily/family. This isn't an outcome that would help readers at all. I'm much less opposed to merging species sub-stubs to genus articles than I

Some thoughts in no particular order. The reason taxon articles don't get deleted isn't because of WP:SPECIESOUTCOMES. It is because there is always an alternative to deletion; an article can be redirected to an accepted name if it is a synonym, or redirected to a higher taxon if it is decided that a stand-alone article is unwarranted. However, in practice, deciding that a stand-alone article is unwarranted and redirecting to a higher taxon has not been an outcome of any recent AfD. The idea that Wikipedia should have stand-alone articles for taxa to the extent that it currently does (and should continue to add more articles until all (extant) taxa have stand-alone articles) has been challenged. But the challengers don't seem to be aware

What would make a taxon non-notable? I'm not sure. I know what makes a taxon article go unread (editors start as readers, and unread articles don't get expanded). Articles about taxa that are uncommon, small, difficult to identify and which don't occur in English-speaking countries don't get read.

What taxa shouldn't have articles? There's already agreement not to have stand-alone articles for fossil species and monotypic taxa. I would add in articles that present one of the two possible cladistic topologies between three clades (i.e., don't name every node on a cladogram). Antennopoda and Tactopoda (as well as Lobopodia if it isn't treated as paraphyletic) represent competing topologies/hypothesized relationships between three clades. I'm not aware of any other examples like this, but there may be a few.


 * Botanists do include years in the types of publications I think you're referring to (revisions, monographs, etc.) and do order synonyms chronologically in those publications. Lists of synonyms in Wikipedia articles on plants don't typically include years (and are generally sourced from databases that arrange synonyms alphabetically, and which may not include years (or any publication details) in the view of the synonyms of a particular accepted species; the database may have year/publication details if you click through to the record for a particular synonym). Sorting lists of plant synonyms chronologically would entail significant effort in adding years to the lists before the list could even be re-sorted. For animals, the years would already be present as part of the standard zoological authority citation.


 * I searched for taxonomic revisions of a plant and animal genus (chosen more or less at random). Here's how a synonym is presented in a plant revision (it was in a chronologically ordered list):


 * "Casearia celtidifolia Kunth, Nov. Gen. Sp. [H.B.K.] (quarto ed.) 5: 363. 1821. TYPE: Venezuela. Terr. Fed. Amazonas: ‘‘prope Angostura et Carichana, ad ripam fluminis Orinoco,’’ Humboldt & Bonpland 1047 (holotype, P not seen; isotypes, B, F neg. 13663 of B)"


 * And here is a synonym from the animal revision:


 * "Andrena subsquamularis Noskiewicz 1960: 85–89 (female). Type location: Bulgaria ("Sandansky"). Type depository: Universität von Wrocław, Poland"


 * The animal has the year immediately following the authority, followed by a page number (without the title of the publication). The plant has the publication immediately following the authority, followed by a page number and then the year. The format for the plant is typical for botanical publications; when the ICNafp itself cites a publication the order is: publication title, page number, year. From a quick glance through the zoological code, I'm not seeing any publication titles at all aside from the two in Article 3.


 * I'm getting off topic, but the basic point is that since the codes differ in whether the year is generally included in the author citation, there are differences in how the year is presented in the (botanical) contexts where it is necessary to present it. And that makes it more difficult to copy-paste a list of botanical synonyms with years (and then arrange it chronologically if it wasn't already).


 * But do start a thread on WT:TOL. Plantdrew (talk)

category draft reply
Wikipedia's system of categories kind of sucks. It doesn't necessarily work very well for what editors want it to do. Intersectional categories (categories covering two or more characteristics) are particularly problematic. With intersectional categories, Wikipedia editors are making decisions about what characteristics to include in the categorization, which may not be the characteristics of interest to readers (or to other editors). E.g. there is Category:20th-century American male actors, Category:American male television actors and Category:American male film actors. Somebody might be interested only in 20th-century film actors, but they won't find a category for them. It would be better if we had something like a system of tags (metadata), where users can select the tags that define the group of interest to them (e.g. 20th-century+film actors+Americans+males). But we don't have that. However, Wikidata has properties that can be searched on and in that way function similarly to tags. Wikipedia articles/Wikidata items do not necessarily have all the categories/properties that are relevant present, and even if they did there are going to be some edge cases that fall through the gaps (I don't think Wikidata has a property for "20th-century people"; while it does have birth/death dates and occupations, most actors born in 1992 were not active as actors in the 20th century, but a few were).

Theoretically with Petscan, Wikipedia users can search for any intersection of categories they care to. That could be 20th-century film actors, or it could be Cercozoa genera. I say theoretically, because again, this depends on articles having the relevant categories, which they might not. Category:Cercozoa genera is unnecessary if Category:Rhizaria genera and Category:Cercozoa are populated with all relevant articles; Petscan will find the article from the intersection of Cercozoa+Rhizaria genera. But Category:Rhizaria genera isn't necessary either; we could just have Category:Genera (essentially dumping all intersectional categories for taxon ranks and treating categories more like metadata tags).
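A toy model of that intersection, treating each category as a set of article titles. The article titles ("Genus A" and so on) and the memberships are made up for illustration, and this is not how Petscan is actually invoked; it only shows why an intersectional category becomes redundant when its parent categories are fully populated, and how a missing category silently drops an article.

```python
# Hypothetical category memberships; the titles are placeholders, not real articles.
categories = {
    "Category:Cercozoa": {"Genus A", "Genus B", "Family X"},
    # "Genus D" is a cercozoan genus whose article lacks Category:Cercozoa.
    "Category:Rhizaria genera": {"Genus A", "Genus B", "Genus C", "Genus D"},
}


def intersect(*names):
    """Return the articles that appear in every named category."""
    return set.intersection(*(categories[name] for name in names))


cercozoa_genera = intersect("Category:Cercozoa", "Category:Rhizaria genera")
print(sorted(cercozoa_genera))
```

"Genus D" is missed even though it belongs in the result, which is the caveat about articles lacking relevant categories.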

However, even with a well-thought-out system of categories, you can't be sure that other editors won't create new subcategories that might pollute your Petscan results with unexpected articles. You certainly can't guarantee that categories won't end up being created for groups with fewer than 2000 species. You can't guarantee that newly created articles will be placed in the appropriate categories (I have seen new articles on fungus species that lack Category:Fungus species; I add the category when I notice it is missing, but I'm not sure I always notice its absence, nor am I sure if anybody else (Esculenta?) is checking for fungus species articles that lack the category).


 * 1) "[Taxon] taxa" has subcategories for e.g. species, genera, families, orders. What else goes in "[Taxon] taxa"? OK, unranked clades. Also infraorders, superfamilies, subgenera? Or do you also create subcategories for each minor rank? Category:Plant unranked clades exists; some of the clades are within families, some are above order rank, at least one is above phylum/division rank.


 * 2) "Creation of "[Taxon] species" categories". Wikipedia hasn't really had

WikiProject Plants/Categorization

Categories_for_discussion/Log/2020_February_6

Reply to NSPECIES proposal
Some comments pertaining to the FAQ section:


 * SARS-CoV-2 is not a species. It is certainly the most prominent recent example of a biological entity that was notable before receiving any kind of formal name (but the ICTV doesn't regulate names below species rank, so it's not a formal name of the sort covered by this proposal). Areas where we might find species that are notable prior to being named are: newly emerging pathogens (Middle East respiratory syndrome–related coronavirus is a species), fossil hominids (Denisovan, although it is not clear whether they should be considered a species or subspecies), truly exceptional non-hominid fossils (the Suncor nodosaur had an article before being named as a taxon, but maybe the notability there is really for the fossil itself and not its status or non-status as a species), and maybe some fish and invertebrates that are commercially available in the aquarium trade (the color morph Caridina cf. cantonensis var. blue tiger had an article two years before the species it is now known to belong to was named). The MERS virus might be the best example of a species that was notable prior to being named.


 * 2 million species. Is that intended to be the number that don't have articles yet? Stats on GBIF for the Catalogue of Life dataset estimate 2.3 million extant species recognized by taxonomists (I don't know of any estimates for the number of fossil species recognized by taxonomists). According to the GBIF stats, COL includes 2.581 million accepted names and 2.542 million synonyms; I believe these numbers include names at higher ranks, not just species. I saw elsewhere that WAID had estimated 300k articles for species. For a more precise estimate, I'd go with 323,872 articles using Speciesbox plus 32,988 articles using Taxobox with binomial.


 * 25k prokaryotes includes synonyms. 20,413 is the number of prokaryote species excluding synonyms.
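For the species-article estimate in the second point above, the two template counts can be summed and set against the Catalogue of Life figure. The coverage share is only indicative: it assumes each counted article covers exactly one extant species, which the source does not claim (some articles are on fossil species).

```python
speciesbox_articles = 323_872        # articles using Speciesbox
taxobox_binomial_articles = 32_988   # articles using Taxobox with a binomial
extant_species_col = 2_300_000       # COL estimate of extant species (via GBIF stats)

species_articles = speciesbox_articles + taxobox_binomial_articles
# Assumption: one article per extant species, so this share is a rough upper bound.
share = species_articles / extant_species_col

print(f"{species_articles:,} species articles, about {share:.1%} of extant species")
```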