Talk:Phylogenetic Assignment of Named Global Outbreak Lineages

Spelling and Grammar Mistakes
Three sentences in this article presently begin with 'pangolin', pretending not to know English sentence structure.


 * pangolin is a key component underpinning the Pango nomenclature system.
 * pangolin was created by Áine O'Toole and the Rambaut lab on 5 April 2020.
 * pangolin is available as a command-line-based tool, ...

Even 'van' from 'van Dijk' is capitalized at sentence initial position.


 * Oh noes! van prefix is not in capital at beginning of sentence — 2015

The \textcite command uses the standard capitalization; however the biblatex package provides \Textcite for use at the beginning of a sentence (section 3.7.2 of the manual).

Way off topic—what makes C > Fortran is that you access your data matrix D as D[i,j,k] instead of D[k,j,i].

C, Fortran, and biblatex all date back to the era when Stone Tablets of Gravitas ruled the Earth. Let's keep all four. MaxEnt 17:27, 9 March 2021 (UTC)

PANGOLIN as an acronym
PANGOLIN is an abbreviation and should be written in all-caps in the article text, which I fixed throughout the article. Ain92 (talk) 13:11, 19 March 2021 (UTC)

Here we provide clarification regarding the name of the dynamic nomenclature system for SARS-CoV-2 lineages published in our Article. Our nomenclature system should be referred to as the ‘Pango’ nomenclature system (from the first-person Latin verb meaning ‘I set’, ‘I fix’ or ‘I record’). It should not be referred to as the ‘Rambaut et al.’ or ‘Pangolin’ nomenclature system. Lineages defined by this nomenclature system should be referred to as ‘Pango lineages’.
 * I mean, an acronym. The fact that the name may be written in lower case in IT contexts should not lead us astray: since nobody would ever use, the name of the package manager itself is occasionally written "apt" on the Internet, but the Wikipedia page is still APT (software), not apt. Ain92 (talk) 10:43, 20 March 2021 (UTC)
 * I am not necessarily convinced by this line of reasoning. There are quite reputable examples of the acronym not being written in all caps. Consider, for example, the original article that suggested and defined the Pango nomenclature system by Rambaut et al., as well as their January 2021 addendum. In the July 2020 article, the pangolin software is treated in all lowercase. In the addendum, it is treated as a proper noun (i.e. "Pangolin"). There is possibly no more reputable source than the original researchers. PeregrineFlight (talk) 02:37, 13 April 2021 (UTC)
 * Hm-m, that last addendum is worth quoting in full:

We emphasize that the Pango nomenclature is distinct from, and independent of, any algorithms that are used to implement the system. A growing set of computational tools for working with the Pango nomenclature are provided at https://cov-lineages.org/ and the corresponding GitHub repository, https://github.com/cov-lineages. One popular tool hosted at https://cov-lineages.org/ for classifying SARS-CoV-2 genome sequences is named Pangolin (an acronym for Phylogenetic Assignment of Named Global Outbreak LINeages). The term ‘Pangolin lineage’ in particular should be avoided because it may be misinterpreted as referring to viral host species, rather than to lineages defined by the Pango nomenclature system.
 * Given that, I can understand your point, but I still disagree with your last statement (cf. GIF, for example). It's very inconvenient to have an acronym spelled identically to a common noun, so it seems unlikely that this variant becomes prevalent. We may have orthographic preferences different from the software developers if it's supported by other reliable sources. Unfortunately I couldn't come up with a search term in Google Scholar that distinguishes PANGOLIN from pangolins (the latter are much more numerous), but it appears that many scientists spell PANGO in all caps, including WHO (all links from 2021, so at least half a year passed since the addendum). Ain92 (talk) 13 April 2021


 * You make a good argument. I will concede that it makes the most sense for PANGOLIN to be written in all-caps, given the potential confusion. I do find it interesting that the name of the nomenclature system is often treated in all-caps as well. I wonder if that's a carry-over from the PANGOLIN software, which is truly an acronym. I might suggest, however, that writing Pango in capitals is in fact incorrect, given the Rambaut et al. addendum. (Also, don't forget to sign!) PeregrineFlight (talk) 03:25, 18 April 2021 (UTC)

Pangolin vs Pango Nomenclature system
At the moment, this page is the only Wikipedia article that discusses the Pango dynamic nomenclature system developed by the Rambaut lab at the University of Edinburgh Institute of Evolutionary Biology. In the absence of an official international nomenclature system, the Pango system has become the dominant non-geographic way to refer to the most prevalent SARS-CoV-2 variants of concern. However, the researchers themselves highlight that the Pango nomenclature system is distinct from the Pangolin software tool developed by the same researchers, and should be treated as such. Given this, should this article on the software either be retooled to cover the nomenclature system, or should a new article be created to cover the Pango system itself? I would opt for the latter, but I'm not the most familiar with Wikipedia's policies. PeregrineFlight (talk) 03:59, 13 April 2021 (UTC)

Number of patients per mutation?
Thanks very much to you for having created this article and to all the others who have contributed to it. Since 2021-01-22 it has averaged 151 pageviews per day (up to 2021-09-07). That looks to me like a great return on the work you've put into this article.

QUESTION: Is it fair to say that the mutation rate is best described as roughly a constant probability of a new mutation with each new patient? If yes, what's a reasonable estimate of the number of patients between mutations, or its reciprocal, the probability that each new patient will generate a mutation?

As of today, the Wikipedia article on COVID-19 says there have been 221,244,447 confirmed cases and 4,578,393 deaths, according to the Template:Cases in the COVID-19 pandemic. Is there a way I can easily get the cumulative number of variants in PANGOLIN up to today?

I read, "The PANGOLIN web application has assigned more that 512,000 unique SARS-CoV-2 sequences as of January 2021." However, I don't know how to update that number, and I don't know how to easily get the number of confirmed cases as of January 2021. Dividing 221 million by the 646 days between 2019-12-01, an approximate date for the first case in Wuhan, and today, 2021-09-06, gives me 342,000 new cases per day. Obviously, that rate has not been constant but mostly growing. And dividing 512,000 by the 427 days between 2019-12-01 and 2021-01-31 gives me 1,200 new variants per day. Then 342000 / 1200 = 285 patients per new variant. It's probably closer to one new variant for each 200 or 250 new cases.
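The back-of-envelope arithmetic above can be reproduced with a short Python sketch. It only restates the figures given in this comment (221 million cases over 646 days, 512,000 sequences over 427 days) and, like the comment, treats every assigned sequence as a "new variant", which is an assumption rather than an established equivalence. The function name `patients_per_sequence` is made up for illustration.

```python
def patients_per_sequence(cases, case_days, sequences, sequence_days):
    """Ratio of average daily cases to average daily assigned sequences.

    Assumes both rates are constant over their periods, which the
    comment itself notes is not really true.
    """
    cases_per_day = cases / case_days
    sequences_per_day = sequences / sequence_days
    return cases_per_day / sequences_per_day


# Figures as quoted in the comment (rounded, not independently checked).
cases = 221_000_000      # confirmed cases
case_days = 646          # days since 2019-12-01, as stated
sequences = 512_000      # sequences assigned as of January 2021
sequence_days = 427      # days from 2019-12-01 to 2021-01-31

print(round(cases / case_days, -3))          # average cases per day
print(round(sequences / sequence_days, -2))  # average sequences per day
print(round(patients_per_sequence(cases, case_days, sequences, sequence_days)))
```

Because the two rates are averaged over different periods, the ratio is only a rough order-of-magnitude figure.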

If someone could provide a more current number of the unique SARS-CoV-2 sequences in PANGOLIN, that would be great. It would be even better if it were easier for me to see how to get that number. And it would be even better if I had a source that provided an estimate of the number of new cases between new variants in PANGOLIN -- or a credible explanation why that's not relevant.

Thanks, DavidMCEddy (talk) 15:34, 7 September 2021 (UTC)


 * Thanks for your acknowledgement. I'm having a bit of a wikiholiday at the moment while pursuing a local environmental cause which has seized my attention; nevertheless your question grabbed my attention and dragged me briefly back. Sadly, tho', I can't see where to find any more of the data needed to answer it; I can only suggest that we ask AineToole whether they can offer a suitable source. In an ideal world, such data could also appear in a potential new subsection at the article Variants of SARS-CoV-2. PS your Q- while only likely to give a very broad-brush result- seems fascinating, and expanding it in detail to sub-populations to see whether specific locations or demographic groups appear more likely to produce variants and still more so 'persisting' variants would be a potential extension... Yadsalohcin (talk) 08:05, 9 September 2021 (UTC)

Pangolin a host??
Scientists are busy looking at this new variant in South Africa. I'm surprised at the name of this software, the timelines, and the fact that scientists were aware of the pangolin link as a host to humans. I'm from Gauteng, South Africa, and not a scientist or microbiologist, but I can confirm that people in the rural areas have eaten this mammal, and that the scales are sold to Asia. 105.224.242.168 (talk) 00:05, 30 November 2021 (UTC)