Wikipedia:Bots/Requests for approval/ShortDescBot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard. The result of the discussion was

ShortDescBot 2
Operator:

Time filed: 14:22, Friday, January 22, 2021 (UTC)

Function overview: (a) Add new short descriptions to organism articles. (b) Improve some existing moth short descriptions

Automatic, Supervised, or Manual: Automatic, after pre-review

Programming language(s): Pywikibot

Source code available: GitHub

Links to relevant discussions (where appropriate): WikiProject. Also noted on the WP short description page. Not a lot of interest, but there wasn't much for the moths task either, and that was entirely uncontentious.

Edit period(s): One time

Estimated number of pages affected: (a) 210,000 with relevant infobox; (b) 2000 moth articles

Namespace(s): Mainspace

Exclusion compliant: Yes

Already has a bot flag: Yes

Function details: ShortDescBot has successfully completed its addition of new short descriptions to all the moth articles. Next, I want to move on to categories of other organisms. This is a good bot task since non-technical short descriptions complying with WP:HOWTOSD can’t automatically be generated from the usual infoboxes, at least without expensive Lua calls.

Each bot run is based on a single category at some level in the tree that I can manually associate with a suitable common generic name. Sometimes that may be the same as the category name (Category:Butterflies --> "butterfly"), but often not (Category:Poaceae --> "grass" or Category:Onychophorans --> "velvet worm"). The bot then constructs and adds new short descriptions such as "Species of butterfly", "Genus of velvet worms", "Family of grasses" and so on. The text is deliberately simple so that a low error rate (<1%) can be maintained while minimising the number of non-standard articles that the bot has to skip as 'too difficult to parse'. For each category the procedure is:


 * 1) With the bot in trial mode, write the proposed descriptions to a local spreadsheet; review and repeat until the error rate is sufficiently low
 * 2) Manually remove from the list any obvious classes of article that the bot will not realistically be able to handle [not had to do this so far in testing]
 * 3) Re-run the bot in edit mode, making live changes only to the pages in the final corrected list.
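
The three steps above can be sketched roughly as follows. This is only an illustration of the trial-then-edit workflow, not the bot's actual source: the function names (`trial_run`, `write_sheet`, `read_sheet`, `edit_run`), the CSV column names, and the `describe` and `save` callbacks are all hypothetical stand-ins, and the real bot would call Pywikibot where `save` is used here.

```python
import csv
import io


def trial_run(titles, describe):
    """Trial mode: collect proposed descriptions without editing any page.

    `describe` is a hypothetical callback returning a description string,
    or None when the article is too difficult to parse and must be skipped.
    """
    rows = []
    for title in titles:
        description = describe(title)
        if description is not None:
            rows.append({"title": title, "description": description})
    return rows


def write_sheet(rows, handle):
    """Write the proposals to a CSV 'spreadsheet' for manual review."""
    writer = csv.DictWriter(handle, fieldnames=["title", "description"])
    writer.writeheader()
    writer.writerows(rows)


def read_sheet(handle):
    """Read the reviewed (and possibly hand-pruned) list back in."""
    return list(csv.DictReader(handle))


def edit_run(approved_rows, save):
    """Edit mode: touch only the pages that survived review.

    `save` is a hypothetical callback that would perform the live edit.
    """
    for row in approved_rows:
        save(row["title"], row["description"])
    return len(approved_rows)
```

Keeping the live edit behind a reviewed list means that anything removed by hand in step 2 can never be edited in step 3, whatever the parser would have proposed.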

The bot won't change existing short descriptions, with one small exception. A new feature this time is the inclusion of "Extinct ..." in the bot-created description of extinct organism articles, and also "Single-species .." in Monotypic genus articles (where that can be done without making the text too long). 2000 or so moth short descriptions of the form "Genus of moths" etc can be improved.
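
As a rough illustration of how such descriptions might be assembled, here is a minimal sketch. It assumes a hand-maintained table mapping each category to singular and plural common names; the table `COMMON_NAMES`, the function `build_description`, and its parameters are hypothetical and not taken from the bot's code. The sketch covers only the "Extinct" prefix and the rule against overwriting existing descriptions. It leaves out the moth exception and the "Single-species" wording, which was dropped later in the discussion.

```python
# Hypothetical lookup table: category -> (singular, plural) common name.
COMMON_NAMES = {
    "Category:Butterflies": ("butterfly", "butterflies"),
    "Category:Poaceae": ("grass", "grasses"),
    "Category:Onychophorans": ("velvet worm", "velvet worms"),
}


def build_description(category, rank, extinct=False, existing=None):
    """Return a new short description, or None if the page should be skipped.

    `rank` is a lower-case taxonomic rank such as "species" or "genus".
    """
    if existing:
        # Assumption for this sketch: never overwrite an existing description.
        return None
    singular, plural = COMMON_NAMES[category]
    # "Species of butterfly" takes the singular; higher ranks take the plural.
    noun = singular if rank == "species" else plural
    text = f"{rank} of {noun}"
    if extinct:
        text = "extinct " + text
    # Capitalise only the first letter, leaving the rest untouched.
    return text[0].upper() + text[1:]
```

For example, `build_description("Category:Onychophorans", "genus")` yields "Genus of velvet worms", and passing `extinct=True` for a grass genus yields "Extinct genus of grasses".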

You can see a sample of suggested edits from a variety of categories at User:MichaelMaggs/ShortDesc.

Discussion
Primefac (talk) 16:10, 22 January 2021 (UTC)

The results look good, I think, although I did notice that in a few cases such as Phomatosphaeropsis and Pecoramyces the bot used "Genus of fungi" rather than "Single-species genus of fungi" which would have been better. So far I've not been adding "Single-species ..." to genus articles solely on the basis that the article is in a monotypic-specific category, as categorisations can very often be wrong. But in practice, monotypic categorisation seems to be done carefully, by specialists, and I suspect that using the name of the category will pick up a few more instances that can't be parsed from the lead: things like Wollemia, for example, where the fact that the genus is monotypic is well-hidden in the body of the article but can easily be seen from the category. I'll do that from now on. MichaelMaggs (talk) 15:35, 23 January 2021 (UTC)

BAG assistance needed Hi, hope it's OK for me to request follow-up, as I haven't heard anything for just over a week now. MichaelMaggs (talk) 17:48, 30 January 2021 (UTC)

BAG assistance needed Any chance of a conclusion please? I posted the results of the trial two weeks ago so there's been more than ample time for community comments. MichaelMaggs (talk) 09:35, 6 February 2021 (UTC)
 * Allow me to nitpick wording a bit, because this will affect many thousands of pages. Reviewing User:MichaelMaggs/ShortDesc, "Single-species extinct genus of monkeys" should perhaps be "Extinct single-species genus of monkeys"? This matches the order in the article ("extinct monotypic genus") and sounds more correct to my ear. Second: why "single-species" over "monotypic" in the first place? These articles typically use "monotypic" or "monospecific". — The Earwig   talk  00:14, 31 January 2021 (UTC)
 * I've avoided "monotypic" based on the guidance at WP:HOWTOSD which says "avoid jargon, and use simple, readily comprehensible terms that do not require pre-existing detailed knowledge of the subject". On the question of word-order, I was in two minds about it, but on reconsideration I agree that "Extinct single-species genus .." is indeed probably best. Happy to make the change. MichaelMaggs (talk) 03:33, 31 January 2021 (UTC)
 * I suppose this edit is one instance of an editor disagreeing with the "single-species" wording, though I understand your point above about picking more layperson terminology. Apokryltaros, would you like to comment on this? —  The Earwig ⟨talk⟩ 09:06, 7 February 2021 (UTC)
 * I personally find it better to go with conciseness, wordflow and specificity over jargon-avoidance, if only because, in my experience, translating something out of jargon while trying to keep it concise makes things sound not-right.--Mr Fink (talk) 14:57, 7 February 2021 (UTC)
 * Thanks for the comment. As it seems this may need more discussion for a consensus, I've asked once more at Wikipedia_talk:WikiProject_Short_descriptions. MichaelMaggs (talk) 17:22, 7 February 2021 (UTC)
 * The overwhelmingly most common use of short descriptions is to help readers searching on the mobile interface to find the article that they want among similarly titled articles. If I search on mobile view for, I am presented with:  It seems to me that a reader trying to look up a word like "Geosiphon" might be better off with a jargon-free short description like this:  I don't see any advantage in using jargon that may not be understood by a large proportion of readers when a more comprehensible word is readily available. --RexxS (talk) 18:55, 7 February 2021 (UTC)
 * I, personally, disagree with the use of single-species on the basis of it not being an actual term, nor a particularly clear one. Monotypic is not a particularly obscure piece of jargon and it is not hard for someone to find out what it means. --SilverTiger12 (talk) 21:22, 7 February 2021 (UTC)
 * I disagree. "monotypic" is a particularly obscure piece of jargon, and "single-species" is a completely understandable term. We should not require 99% of readers just wanting to see if they have found the right article to have to look up what "monotypic" means when we can give them exactly the same information using simple terms. We are writing the encyclopedia for a lay audience, not for specialists in ivory towers and MOS:JARGON is a guideline with site-wide consensus: "Do not introduce new and specialized words simply to teach them to the reader when more common alternatives will do." --RexxS (talk) 22:15, 7 February 2021 (UTC)
 * I also disagree with the use of the term single-species over monotypic. Many words may start out as jargon but this one has certainly been used extensively in many areas including in mainstream media, for example in discussions of endangered species. I get the point on presuming to teach jargon to a general audience, and agree with the MOS:JARGON principle, but as words gain more usage they move from jargon to mainstream. This is not an uncommon unknown word only used in science. It is used in definitions of species and higher taxa across the board. I mean, extending this, if you add a short description to a Family that is monotypic, are you going to call it a single-genus family? Monotypic is a generally understood term and can be provided as a link in case. Cheers Scott Thomson  ( Faendalimas ) talk 04:21, 8 February 2021 (UTC)
 * How is somebody searching for an article going to follow a link in the short description? --RexxS (talk) 05:21, 8 February 2021 (UTC)
 * I assumed by looking at the 50 samples above, eg this one that contains three links this was possible. I meant in the article after you follow the link. Cheers Scott Thomson  ( Faendalimas ) talk 05:58, 8 February 2021 (UTC)
 * Please observe MOS:LISTGAP. Accessibility is not optional.
 * I think you've missed the point of short descriptions. The idea is that a mobile user is searching for an article. They start typing into the search box on any Wikipedia page. They see a list of suggested article titles with their short description immediately below each one. After typing a few characters, for example, they will see suggested articles that have similar names like Geosiphon, Geosiris, Geositta. They will each have a short description (if it exists), and the idea is that the reader can use that to decide that they want the first one if they know that they are looking for some sort of fungus, rather than a flowering plant or a bird. If the short description contains words that they do not understand, they have no means at that point of clicking on any links (not that the links will show in the search list). By the time they reach the article to click on links, they have already decided which is the article they want, so they have no use for the short description at that point. That makes it especially important that short descriptions should use the most recognisable words available, because the user can't look them up while typing into a search box on a mobile phone! --RexxS (talk) 06:20, 8 February 2021 (UTC)
 * I'm requesting bot approval for "Single-species", as that's the only option that's widely comprehensible to non-specialists when seen without context, as it will be, embedded within a mobile phone search list of article titles. While I do understand that specialists may prefer their own specialist term, strongly arguing for that here is likely to be counterproductive. Lack of consensus will mean I can't do either, and tens of thousands of genus articles will have a less informative short description than I could easily provide. MichaelMaggs (talk) 09:44, 8 February 2021 (UTC)
 * I can't see that "monospecific" is less comprehensible to readers of an encyclopedia than "single-species". Those who don't understand the prefix "mono-" are unlikely to understand most of the species articles that I'm familiar with. This is an encyclopedia, remember. I have some sympathy with avoiding "monotypic", which is more jargon-y. More generally, the suggestions at User:MichaelMaggs/ShortDesc for the areas in which I usually edit (plants and spiders in particular) are certainly better than those often present in Wikidata – particularly "species of spider" rather than "species of arachnid" (I change Wikidata whenever I'm updating a spider item there). However, there are cases where the suggestions shorten a description, which I wouldn't. "Species of mygalomorph spider" is better than just "species of spider". A reader who doesn't understand "mygalomorph" can still see that this is a spider; those who do understand "mygalomorph" gain. The same applies when families are given and there's no common term for the family. I agree that "species of grass" or "species of orchid" are better than "species of plant in the family Poaceae" or "species of plant in the family Orchidaceae", but where there's no common term, "species of plant in family Xaceae" is more informative than just "species of plant" without taking anything away, and the same applies to other groups of organisms. Peter coxhead (talk) 09:52, 8 February 2021 (UTC)
 * Peter, the question of whether short descriptions should include family details was discussed at some length here before I requested bot approval. The conclusion was they should not, partly because in many fields species are constantly being moved between families and it would require long-term efforts to keep the descriptions up to date. Editors sometimes add them manually, which may in particular cases be OK, but something so complicated and contentious isn't really appropriate for a bot. MichaelMaggs (talk) 11:18, 8 February 2021 (UTC)

The Earwig Further discussions here have persuaded me that it's best to avoid both "Single-species" and "Monotypic", and I've changed the bot so that it doesn't make use of either. Are you OK to approve the bot please? MichaelMaggs (talk) 09:48, 12 February 2021 (UTC)
 * Agree with abandoning this terminology. —  The  Earwig ⟨talk⟩ 05:21, 14 February 2021 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard.