Talk:Haplogroup R1a (Y-DNA)/Nomenclature

The most recent publications on R1a have greatly increased knowledge of the complexity of R1a. This research demonstrates that the most commonly found type of R1a, now known as R1a1a, represents only one branch of a bigger "family tree". Two lower branches of the R1a tree each have a corresponding set of known mutations, called SNPs. Evaluating these for cases typed as R1, R1a's parent clade, improves classification. The SRY1532.2 and M17 SNPs have been the most commonly tested markers for R1a. Evaluation of a new marker, M420, has altered our knowledge of the placement of SRY1532.2 and M17 within haplogroup R1. In addition, new clades that group with M17 have been identified, and at least two have region-specific distributions.

Roots of R1a
Haplogroup R1 is the parent haplogroup (or ancestor) of R1a and is defined by a SNP called M173. R1a evolved from a male line of ancestors, or genetic lineage, who propagated the M173 genetic marker. During this process, four or more new SNPs evolved, including a SNP referred to as 'M420'. Subsequent to the evolution of M420, a split or radiation occurred, giving rise to R1a1 and R1a* (see R1a below; the R1a* clade is defined by an unknown number of variants). However, only one branch, R1a1, encompasses the vast majority of R1a. R1a has at least one sister clade, R1b, which is defined by the M343 mutation and is primarily Western Eurasian in its Old World distribution. Despite the geographic proximity of R1a-bearing to R1b-bearing populations, and assuming a Eurasian origin, there is no scientific consensus on where ancestral R1a evolved. However, the most divergent lineages cluster about Western Asia, including parts of Europe and South Asia. A recent study using pre-2009 R1 markers uncovered a large number of R1* mutants in India that may also be R1a variants.

Recent 'shift' in the R1a family tree
The structure of the family tree (or phylogenetic tree) commonly associated with 'R1a' differs between recent published sources. Although a new, more comprehensive, naming convention (nomenclature) has been created, this nomenclature has only been used in the most recent publications. In this new phylogenetic tree, the branch labeled 'R1a' defines a broader phylogenetic tree, which includes all R1 that are also M420 positive. This contrasts with the SRY1532.2 mutation that previously defined all cases of R1a. Consequently, this change has resulted in a shift in the positions of the labels "R1a", "R1a*", "R1a1", "R1a1*", and "R1a1a", and a renaming of M17 positive subclades "R1a1b" and "R1a1c".

R1a
In studies prior to mid-2009, R1a was defined by SRY1532.2. Four additional SNPs between ancestral R1 and the SRY1532.2-defined clade were recently described. Evaluating any one of these new SNPs improves classification within haplogroup R1, but only one SNP, M420, is currently evaluated because M420 positives also have the other defining mutations. When tested for both, SRY1532.2 positives are always M420 positive; however, some M420 positives do not have SRY1532.2. These cases were previously misclassified as R1* in the old scheme, because R1* represented R1 cases that were neither SRY1532.2 positive nor R1b (see cladogram on right). Underhill et al. (2009) found that this distinction was important, as these cases were not isolated but clustered, although present in only about 1% or less of males sampled in Oman, Iran, United Arab Emirates and Turkey. These cases may represent lineages with no subclade-defining mutations, lineages with undetected mutations that define one or more new clades, and/or lineages with mutations that require the insertion of new nodes between R1a and the SRY1532.2-defined clade.

Because of these newly detected variants on the lineage from R1 to SRY1532.2, the old R1a clade shifted 'outward' after a new node was inserted between R1 and the SRY1532.2-defined clade (see cladogram on right). This required the renaming of all higher branches. The inserted node and its branches took the name 'R1a', and the M420 positive/SRY1532.2 negative cases were assigned to R1a*. SRY1532.2 positives were then relabeled 'R1a1'.
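
The relabeling described in this section can be sketched in code. The following Python sketch is illustrative only: the function name `classify` and the dictionary-of-booleans input are assumptions made for this example, and it covers only the M343, M420 and SRY1532.2 markers, for a sample already known to be M173 positive (R1).

```python
def classify(calls):
    """Return (old_label, new_label) for an R1 (M173 positive) sample.

    `calls` maps SNP names to True (derived) or False (ancestral).
    Hypothetical helper for illustration; not taken from any publication.
    """
    if calls.get("M343"):
        # R1b is defined by M343 in both naming schemes.
        return ("R1b", "R1b")

    sry_positive = calls.get("SRY1532.2", False)
    m420_positive = calls.get("M420", False)

    # Old scheme: SRY1532.2 defined R1a; everything else stayed in R1*.
    old_label = "R1a" if sry_positive else "R1*"

    # New scheme: M420 defines R1a, SRY1532.2 defines R1a1.
    if sry_positive:
        # Per the text, SRY1532.2 positives are always M420 positive.
        new_label = "R1a1"
    elif m420_positive:
        new_label = "R1a*"
    else:
        new_label = "R1*"
    return (old_label, new_label)


# An M420 positive, SRY1532.2 negative case moves from old R1* to new R1a*.
print(classify({"M420": True, "SRY1532.2": False}))
```

The sketch returns both labels side by side so that a result typed under the old scheme can be compared directly with its label under the new one.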

R1a1
Prior to late 2009, R1a1 was defined as M17 or M198 positive. As SRY1532.2 positives assumed a new name, 'R1a1', the M17 positives were relabeled as 'R1a1a'. Under the previous nomenclature, R1a* identified cases positive for SRY1532.2 but lacking M17 or M198; with the shift, these old R1a* cases were relabeled R1a1*. Also, three new mutations were discovered that are always found with SRY1532.2. This R1a1* category remains unrefined and may contain cases that have no subclade-defining mutations, cases with undetected subclade-defining mutations, and/or cases with mutations that require the insertion of additional branch-points between R1a1 and M17 positives. Recent studies indicate that R1a1* is rare in most regions, occurring at low levels, between 0.2 and 2%, in Scandinavia, the Balkans, the Caucasus region and Kashmir Brahmans. However, 13 of 57 (about 23%) of those tested from the Saharia tribe of northern Madhya Pradesh, India had R1a1*.

R1a1a
Because old R1a1 assumed the name R1a1a, old R1a1a was renamed R1a1a1. R1a1b and R1a1c assumed the new names R1a1a2 and R1a1a3, respectively. As part of the new studies, five new mutations that are always found with M17 or M198 were discovered. In addition, in late 2009 five new subclades were detected, R1a1a4 to R1a1a8; the identifying SNPs are depicted in the cladogram to the right. Two of these subclades, R1a1a6 (M434 positive) and R1a1a7 (M458 positive), are observed in multiple countries.
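
The renamings listed in the draft sections above can be collected into one lookup table. This is a minimal sketch: the names `OLD_TO_NEW` and `to_new_label` are hypothetical, and the table contains only the label pairs stated in the text.

```python
# Old (pre-2009) label -> new label, as stated in the draft text above.
OLD_TO_NEW = {
    "R1a": "R1a1",
    "R1a*": "R1a1*",
    "R1a1": "R1a1a",
    "R1a1a": "R1a1a1",
    "R1a1b": "R1a1a2",
    "R1a1c": "R1a1a3",
}


def to_new_label(old_label):
    """Translate an old R1a-family label; other labels pass through unchanged."""
    return OLD_TO_NEW.get(old_label, old_label)


print(to_new_label("R1a1b"))
```

Old R1* is deliberately absent from the table. Whether such a case is new R1* or new R1a* depends on its M420 status, so it cannot be translated by name alone.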

Roots of R1a
Current version:-
 * R1a evolved from male-line ancestor who was in haplogroup R1 (R-M173) but who also had SNP mutations M420, M449, M511, M513 etc. It is therefore the sister clade of R1b - another R1 lineage, but defined by the M343 mutation, and others. There is no simple consensus concerning the places in Eurasia where R1, R1a or R1b evolved, although Underhill et al. (2009) recently suggested that "the most distantly related R1a chromosomes, that is, both R1a* and R1a1* [...] have been detected at low frequency in Europe, Turkey, United Arab Emirates, Caucasus and Iran".

Newer version:
 * Haplogroup R1 is defined by a SNP called M173. R1a evolved from a male-line of ancestors, or genetic lineage, who propagated the M173 genetic marker. During this process, four or more new SNPs evolved, including a SNP referred to as 'M420'. Subsequent to the evolution of M420, the R1a clade radiated, but only one branch, R1a1, encompasses the vast majority of R1a. R1a has at least one sister clade, R1b, that is defined by the M343 mutation and primarily European in its Old World distribution. Despite the proximity of some R1a to R1b and assuming a Eurasian origin there is no scientific consensus for where ancestral R1a evolved. However, the most divergent lineages cluster about Western Asia.[2]

Comments.
 * First sentence introduces R1 without explaining how it is related to this article.
 * Removal of extra SNPs fine by me. It was PB666 who insisted on these. Not removal, but noting.
 * R1b is not primarily European in any simple sense. Most clades within R1b are rare in Europe and more common in Middle East. Some found in Central Africa. In Europe a recent branch got very successful. To explain all this would be a big diversion though, and inappropriate.
 * Unsigned comment by PB666 says: the word primarily you are interpreting as primal, in reference to Europe. Therefore the wordage is changed to Western Eurasian, and aware of the R1b in Africa notice Old World in its distribution, an indication of frequency not genetic diversity, however to satisfy Andrew the wordage European was changed to Western European. African instances are not material in primacy or abundance.
 * I have no problem believing that you can explain what you mean if given enough time. But the point is that this quick remark is not clear, and because it is not the core subject of the article this raises the question of whether it needs to be included. Please note, as per a previous discussion we have had, that "Western Eurasia" is a term with no clear definition, which we agreed to avoid. It makes it worse.--Andrew Lancaster (talk) 16:06, 19 November 2009 (UTC)


 * Odd sentence: Subsequent to the evolution of M420, the R1a clade radiated, but only one branch, R1a1, encompasses the vast majority of R1a. Comments:-
 * What is the source for saying that M420 "radiated" before its known sub-clades, which is the implication? It might have stayed in one village until a few hundred years ago, or it might have had multiple lineages successful all over Eurasia for thousands of years.


 * Radiate refers to the creation of at least two subclades, new R1a* and new R1a1, has nothing to do with villages, it has to do with genetic diversity.PB666 yap


 * The question is what, if anything, the quoted sentence will mean to a reasonably educated but non-specialist reader of Wikipedia. I personally do not understand what point it is trying to get across. The two things which it clearly is saying seem to be said many times already:
 * (a) R-M420 came into existence, and then R1a, and then R1a1a, in a line of descent.
 * (b) Within the whole group R1a1a is dominant in modern populations.
 * Is it saying anything else?--Andrew Lancaster (talk) 16:06, 19 November 2009 (UTC)


 * Why use the word radiated, and is it intended to mean that carriers of this mutation spread out?
 * (Note: this question reinserted after PB666 deleted it in a subsequent edit.)--Andrew Lancaster (talk) 16:31, 19 November 2009 (UTC)


 * Radiation at the genetic level refers to diversification. Diversification can be the result of population size increases, in the case of very old 'nodes', or it may refer to existing alleles in a population.PB666 yap


 * Again, what is the point you are trying to communicate which is not already in the article? I do not follow it easily myself, and I am trying. Again, I can see that your explanation, when parsed down, means some things, but they are such obvious things. For example, does it mean much more than just "R-M420 has descendants"?--Andrew Lancaster (talk) 16:06, 19 November 2009 (UTC)


 * The sentence will be altered for wordage. The logic of rewriting the section. As a reader and someone who is informed about the nature of molecular anthropology in terms of where PMRCA existed in this case the PMRCA of the lineage R1-R1a. There are two points of anchoring, from the base, for example L3-M lineage anchors in Africa, whereas M anchors in East Africa or Eurasia. This provides information about the context where the lineage evolved. In the case of R1a, the lineage evolution is not as clear as we might want it to be. First, R1b's point of origin is not clear; second, R1a's point of radiation is not clear. The term radiates makes Andrew uncomfortable because it is not clear what is in the R1a* clade. However, given that there are 4 mutations from M420 to SRY1532.2 to M17 and 7 known mutations between the R1a1 node and the R1a1a node, plus subclade-defining mutations, it is reasonable to assume that an R1a2 or 3 will appear; even if that is not the case, Redding appears to have uncovered length variation between R1a and the R1* clade. Consequently we can at least assume a bilateral split, if not a sequence radiation. So the question an educated person would ask is: does the split indicate a reasonable point of origin? I would answer probably, from around the region of the Fertile Crescent; however, because of Sharma's undefined R1* it is not clear that these should also be included, so that the conservative response is no.PB666 yap 16:18, 19 November 2009 (UTC)


 * This was supposed to be me assisting you toward improving that section of the page, not vice versa; you seem to be in a state of denial that there was anything wrong with numbered list format. We can go about this two ways: we can continue the process, which is productive in its own way, but it's not you doing the improvements, and you constantly biting my back, which really will not help the other Y-DNA pages, or you could take over the process, examine the style and structure and create your own version. I did not want to put this much work into this; I was simply trying to point the direction in which a GA format should be arranged. The issue of R-M173 versus R-M is an important point: when presenting said lingo, which may be useful, you must take the time to explain to the lay reader, and remember that, while reading, about 4 or 5 new concepts can be stored in short term memory at a time, so they need a chance to commit important points to long term memory before proceeding to other materials. We do not need 4 lingos for the same thing, M17, R-M17, R1a1a, old R1a1 and rs123456789. That is a point of excessive behavior.PB666 yap 17:22, 19 November 2009 (UTC)


 * Cutting out all the excessive chatter which is irrelevant, you write: "The sentence will be altered for wordage"? So are we discussing your real proposals for better wording in the main article or are these not your real proposals yet? Do you need more time before we discuss them? I believe the article is good enough right now that it at least does not need the insertion of poor text with the excuse that the "wordage" can be fixed later?--Andrew Lancaster (talk) 16:37, 19 November 2009 (UTC)


 * It is not actually irrelevant. Recently I have been making improvements to the mitochondrial Eve page. A researcher of notability read the page and found issue with material that he, as a molecular and physical anthropologist, thought was overlooked; consequently, through back channels he sent me a copy of his most recent review. Consequently his thoughts and ideas on certain subjects were appended to the page. Note, I did not write a letter arguing with him ad nauseam about the exact wording of his review or his critiques, trying to belittle his POV, etc. Indeed, if I have a concern, it means that other experts in the same field probably have the same concern; the concern is addressed, but as per WP:NPOV no opinion is given. Your defense of what can and cannot be discussed aligns with the comments below. Again, this should be your baby, and there are about 6 days left before GA occurs; if by that time we haven't gotten around the basic issues of style and working, then I might replace the sections. However I would hope that you will take the initiative at this point, looking at other GA articles and these edits, and go about making the repairs yourself. I will focus on the lede, henceforth.PB666 yap 17:57, 19 November 2009 (UTC)


 * Indicating the lack of ability to reflect on oneself. The page will not self-improve if you also do not self-improve. You have to get over your WP:OWN bias.PB666 yap 17:24, 19 November 2009 (UTC)


 * Sounds like a threat to edit war to me, and not your first. You are also continuing to make demands of other editors in an entirely inappropriate manner. I am not asking to be taught a lesson by you and I am not writing to a WP:DEADLINE. Please note: this talkpage is made by you and is your proposal for versions of text which you claim are superior to existing versions. My questions and concerns all over this talkpage and others, none of which have been answered in any respectable way, are all basically asking you to please make some sort of believable case that these wordings are better. You are simply refusing to do this. Above, you have (as you have done many times) now deferred to future editing as a possible way to solve the details of "wordage". It appears that this is an enormous waste of time, and you are apparently committed to an aggressive course of action since your first abusive posts when your first massive edit was rejected recently (the article split). If you start making unilateral mass edits again as per your threat this will not be a good thing.--Andrew Lancaster (talk) 18:57, 19 November 2009 (UTC)


 * "but only one branch..." This is a run-on sentence. What is the "but" for? What contrast is being made? Is it being implied that no other M420 lineages were ever similarly successful? How do we know that? The point being made is vague.


 * Despite the proximity of some R1a to R1b and assuming a Eurasian origin there is no scientific consensus for where ancestral R1a evolved. What a sentence. Doesn't it just mean There is no simple consensus concerning the places in Eurasia where R1, R1a or R1b evolved as in the version being proposed for scrapping by you?


 * ''When you critique, try to avoid the appearance of WP:OWN.PB666 yap


 * When you give advice like this, can you please explain what you are looking at as a basis of the criticism? How can I improve if you make vague accusations? Is this simply based on me saying that I find your wording worse than mine, or is there more to it? If this is the case you are making it seems to apply equally to you, or in fact far more to you given that you hopefully won't accuse me of not going to enormous efforts to try to explain every preference I have registered. Maybe it would be more constructive to actually try to convince other Wikipedians about the merits of your approach, rather than about speculations concerning psychology? WP:AGF--Andrew Lancaster (talk) 16:06, 19 November 2009 (UTC)

No, it does not mean where R1a and R1b evolved; it concerns the places where the R1 to R1a lineage evolved. It's one male lineage, in ranges of those male ancestors. ''PB666 yap


 * Can you explain in terms of article quality why the extra words are necessary? It sounds like you are saying that the main difference is that the older version said more, in less words?--Andrew Lancaster (talk) 16:06, 19 November 2009 (UTC)


 * the most divergent lineages cluster about Western Asia. You know very well that you are using stronger wording than the original authors and this is the type of thing that causes edit wars, and this is why the existing version uses a direct quote. What is wrong with the existing version?


 * I am trying to avoid direct quotation. Direct quotations should be used in instances where the statement is backed up with a powerful logic or strong historical importance (e.g. quoted on many occasions)


 * Is that just another personal preference that you are quoting as if it were a rule?--Andrew Lancaster (talk) 18:57, 19 November 2009 (UTC)


 * There has also been a repeated attempt by Geog1 to try to get discussion about the use of the term Western Asia.--Andrew Lancaster (talk) 10:21, 19 November 2009 (UTC)


 * I dislike the term, but we should try to conform to the Wikipedia guidelines.


 * Which guideline are we referring to please?--Andrew Lancaster (talk) 18:57, 19 November 2009 (UTC)

Keep in mind the point here is to improve, there has been no injection of opinion into the article, and therefore no need for WP:CRYSTAL what will happen.PB666 yap 17:24, 19 November 2009 (UTC)
 * Please explain your point?--Andrew Lancaster (talk) 18:57, 19 November 2009 (UTC)

Different meanings of "R1a" -> Recent 'shift' in the R1a family tree?
First question: why is the change of title being proposed? Mentioning different meanings of a term tells readers exactly which confusion to expect, even if not specialized. The shift metaphor is totally unclear and undefined even for someone who knows the field. PB666 should, as usual, at least try to make a case that might convince others.--Andrew Lancaster (talk) 10:34, 19 November 2009 (UTC)

Old Proposal:-
 * The phylogenetic ("family tree") naming system commonly used for this haplogroup remains inconsistent in different published sources. Although it has not yet been used much in published surveys, a more comprehensive survey of the known mutations is listed by ISOGG, and an equivalent tree is given in Underhill et al. (2009).[4]
 * Prior to 2009 the mutation SRY1532.2 (or SRY10831.2) defined R1a, and this is also how the term R1a is most often used in publications before 2009.
 * However the term R1a is also now increasingly used to refer to a broader family including not only this "old" R1a, but also other related R1 (R-M173) lineages which have been found to share several unique mutations with R1a, including M420. In this newer system, the clade defined by SRY1532.2/SRY10831.2 moves from "R1a" to "R1a1".
 * The family tree of R1a as a whole can be divided into three levels of branching. The following summary is based upon the large survey of Underhill et al. (2009) as follows:

New proposal:-
 * The structure of the genetic tree (or phylogenetic tree) commonly associated with 'R1a' differs between recent published sources. Although a new, more comprehensive, naming convention (nomenclature) has been created, this nomenclature has only been used in the most recent publications. In this new phylogenetic tree, the branch labeled 'R1a' defines a broader phylogenetic tree, which includes all R1 that are also M420 positive. This contrasts with the SRY1532.2 mutation that previously defined all cases of R1a. Consequently, this change has resulted in a shift in the positions of the labels "R1a", "R1a*", "R1a1", "R1a1*", and "R1a1a", and a renaming of M17 positive subclades "R1a1b" and "R1a1c".

Discussion about the proposal.
 * Please explain how this first sentence is improved by inventing a non-term "genetic tree" in order to replace the non jargon word "family tree". What is the point here?--Andrew Lancaster (talk) 11:15, 19 November 2009 (UTC)
 * Removal of a key part of the second sentence, explaining what it means to say that there have been changes/updates, i.e. the subject of this section. Who can make such changes for example? Most readers will not easily follow. I note that the core of the matter is moved to footnotes. Footnotes should be for side issues.--Andrew Lancaster (talk) 11:21, 19 November 2009 (UTC)
 * I can understand why we should not mention the double name of SRY1532.2/SRY10831.2 all the time necessarily, but we do need to mention the equivalence somewhere because both terms are used in the literature. I do not see this as something for a footnote, because people will be coming to this article looking for this type of information up front. By the way the footnote itself is garbled and needs to be fixed.--Andrew Lancaster (talk) 10:47, 19 November 2009 (UTC)
 * Prior to 2009 the mutation SRY1532.2 (or SRY10831.2) defined R1a, and this is also how the term R1a is most often used in publications before 2009. This sentence is removed, and a similar one has been moved down into the R1a section. I can not follow this logic because this sentence is about the changes in the tree, and the R1a section is about R1a as currently defined surely?--Andrew Lancaster (talk) 11:23, 19 November 2009 (UTC)
 * In both versions, the last part is the description in words of the changes in the tree. I think it is easiest to just say that I can not see what improvement the new draft is aiming at? Is this really more clear in any way? I do hope other editors will comment.--Andrew Lancaster (talk) 11:21, 19 November 2009 (UTC)

R1a (subsection)
Older version:
 * R1a* (new nomenclature). Defined as M420 positive but SRY1532.2 negative. (Articles published before the discovery of M420 will not have distinguished these from other R1-M173 lineages.) Underhill et al. (2009) believe this clade or clades to be rare, and have so far found 1/121 Omanis, 2/150 Iranians, 1/164 in the United Arab Emirates, 3/612 in Turkey. 7224 more tests in 73 other Eurasian populations showed no sign of this category so far. Mutations understood to be equivalent to M420 include M449, M511, M513, L62, L63.[2][5]

New proposal:-
 * In studies prior to mid-year 2009, R1a was defined by SRY1532.2. Four additional SNPs between ancestral R1 and the 'SRY1532.2' defined clade were recently described. Evaluating anyone of these new SNPs improves classification within haplogroup R1, but only one SNP, M420, is currently evaluated because M420 positives also have the the other defining mutations. When tested for both, SRY1532.2 positives are always M420 positive, however some M420 positives do not have SRY1532.2. These cases were previously misclassified as R1* in the old scheme, because R1* represented R1 cases that were neither SRY1532.2 positive or R1b (see cladogram on right). Underhill et al.(2009) found that this distinction was important as these cases were not isolated but clustered, but present at less than 1% of males in Oman, Iran, United Arab Emirates and Turkey. These cases may represent lineages with: no subclade defining mutations, with undetected mutations that define a new clade(s), and/or mutations that require the insertion of new nodes between the R1a and the SRY1532.2 defined clade.
 * Because of these newly detected R1 to SRY1532.2 lineage variants, the old R1a clade shifted 'outward' after a new node was inserted between the R1 and 'SRY1532.2' (see cladogram on right). This required the renaming of all higher branches. The inserted node and its branches drafted the name 'R1a' and the M420 positive/SRY1532.2 negative cases were assigned to R1a*. 'SRY1532.2' positives were then relabeled 'R1a1'.

Comments:-
 * Most obvious remark: this is now massively expanded without adding anything except more complex ways of wording and yet more repetition of things said in other parts of the article? What is the aim of this?--Andrew Lancaster (talk) 11:31, 19 November 2009 (UTC)
 * Evaluating anyone of these new SNPs improves classification within haplogroup R1. What does this mean?
 * The footnote is wrong. All these SNPs were described earlier, at least the Underhill ones.--Andrew Lancaster (talk) 10:47, 19 November 2009 (UTC)
 * only one SNP, M420, is currently evaluated because M420 positives also have the the other defining mutations. There has only been one article which did this. The implication here that there is a standard procedure in different surveys is wrong.--Andrew Lancaster (talk) 10:47, 19 November 2009 (UTC)
 * As discussed on talk pages, "misclassified" is a totally inappropriate word. These phylogenetic names are meant only to show the current state of knowledge and ignorance. If this was a misclassification then you have to call the entire system of nomenclature wrong, which would be ridiculous.--Andrew Lancaster (talk) 10:47, 19 November 2009 (UTC)
 * not isolated but clustered, but present: double "but".--Andrew Lancaster (talk) 10:47, 19 November 2009 (UTC)
 * Where do Underhill et al. say that the M420 discovery is important BECAUSE of the geographical distribution?--Andrew Lancaster (talk) 10:47, 19 November 2009 (UTC)
 * Clustering is a poor choice of word in a genetics article, if you are referring to something geographical.--Andrew Lancaster (talk) 10:47, 19 November 2009 (UTC)
 * These cases may represent lineages with: no subclade defining mutations, with undetected mutations that define a new clade(s), and/or mutations that require the insertion of new nodes between the R1a and the SRY1532.2 defined clade. Remarks:-
 * Very poor sentence structure. Not acceptable.
 * The facts being explained are in a state of confusion, and wrong. Of course R-M420* cases DO have MANY other mutations downstream from M420, which have not yet been discovered. This is apparently coming from the same misunderstanding leading to you calling previous phylogenies wrong, when they were not wrong. Complete phylogenies are many years away because we would need full sequencing. We do not have that now, and so acting like this is the standard is misleading and wrong.--Andrew Lancaster (talk) 10:51, 19 November 2009 (UTC)


 * Because of these newly detected R1 to SRY1532.2 lineage variants, the old R1a clade shifted 'outward' after a new node was inserted between the R1 and 'SRY1532.2' (see cladogram on right). This required the renaming of all higher branches. The inserted node and its branches drafted the name 'R1a' and the M420 positive/SRY1532.2 negative cases were assigned to R1a*. 'SRY1532.2' positives were then relabeled 'R1a1'. Is anyone seriously going to make the case that this is a good clear bit of English? What was wrong with the current version? Does it not already explain this?--Andrew Lancaster (talk) 10:53, 19 November 2009 (UTC)

R1a1 sub section
Old version:-
 * R1a1* (old R1a*) SRY1532.2 positive, but M17 and/or M198 negative. Underhill et al. (2009) found 1/51 in Norway, 3/305 in Sweden, 1/57 Greek Macedonians, 1/150 Iranians, 2/734 Ethnic Armenians, 1/141 Kabardians. Sharma et al. (2009) also found 13/57 people tested from the Saharia tribe of Madhya Pradesh, and 2/51 amongst Kashmir Pandits. SNP mutations understood to be always occurring with SRY10831.2 include M448, M459, and M516.[2]

New version:-
 * Prior to late 2009, R1a1 was defined as M17 or M198 positive. As SRY1532.2 positives assumed a new name, 'R1a1', the M17 positives were relabeled as 'R1a1a'. Under the previous nomenclature R1a* identified cases positive for SRY1532.2 but lacking M17 or M198,[note 4] but with the shift these old R1a*, were relabeled R1a1*. Also, three new mutations were discovered that are always found with SRY1532.2.[note 5] The R1a cases that have SRY1532.2 but not the M17 or M198 markers and are now referred to as R1a1*. This R1a1* category remains unrefined and may contain cases that: have no subclade defining mutations, subclade defining mutations and/or mutations that require the insertion of additional branch-points between R1a1 and M17 positives. Recent studies indicate that R1a1* is rate in most regions, at low levels, between 0.2 and 2%, in Scandinavia, Balkans, the Caucasus region and Kashmir Brahmans. However approximately 25% of Saharians from Northern Madhya Pradesh, India had R1a1*.[note 6]


 * Once more expanded. Should be a reason.
 * and/or is correct
 * Not sure what is wrong with giving numbers of positive tests in small cases like this. Normal readers can understand and visualize this. Despite being more simple for everyone, using a format like 1/150 gives a good indication about how significant a number is, which a % does not. So it is simply better.
 * The compound term "SRY1532.2 positive" as in "the SRY1532.2 positives assumed..." is a very awkward piece of English. No need to use such odd constructions.
 * Once again, this section goes through the whole re-naming story. Avoiding such repetition was the reason for having a special introductory section just about this subject. Apparently PB666 does not realize this.
 * This R1a1* category remains unrefined and may contain cases that: have no subclade defining mutations, subclade defining mutations and/or mutations that require the insertion of additional branch-points between R1a1 and M17 positives. This is fundamentally confused. See my remarks about R-M420*: Of course "R-M420* cases DO have MANY other mutations downstream from M420, which have not yet been discovered. This is apparently coming from the same misunderstanding leading to you calling previous phylogenies wrong, when they were not wrong. Complete phylogenies are many years away because we would need full sequencing. We do not have that now, and so acting like this is the standard is misleading and wrong."
 * rate should be rare
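
To make the counts-versus-percentages comparison concrete, here is a minimal sketch that converts the positive/tested counts quoted in the old version above into percentages. The names `R1A1_STAR_COUNTS` and `percent` are hypothetical, and the figures are copied from the quoted text rather than checked against the publications.

```python
# (positives, tested) for R1a1* as quoted in the old version above.
R1A1_STAR_COUNTS = {
    "Norway": (1, 51),
    "Sweden": (3, 305),
    "Greek Macedonians": (1, 57),
    "Iranians": (1, 150),
    "Ethnic Armenians": (2, 734),
    "Kabardians": (1, 141),
    "Saharia (Madhya Pradesh)": (13, 57),
    "Kashmir Pandits": (2, 51),
}


def percent(positives, tested):
    """Return positives/tested as a percentage."""
    return 100.0 * positives / tested


# Print each count next to its percentage so both forms can be compared.
for group, (positives, tested) in R1A1_STAR_COUNTS.items():
    print(f"{group}: {positives}/{tested} = {percent(positives, tested):.1f}%")
```

Printing both forms together keeps the sample size visible, which a bare percentage hides.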

R1a1a sub section
Old version:-
 * R1a1a (old R1a1) is defined in various articles by M17 or M198 (two mutations which always appear together so far). Such lineages make up the dominant majority of all R1a, and most statistical or other analysis is by definition focused upon it. This clade also has some sub-clades of its own, although a large proportion of R-M17/R-M198 has however not yet been categorized into branches defined by mutations, and is therefore referred to as R1a1* (old nomenclature) or R1a1a* (new nomenclature). SNP mutations understood to be always occurring with M17 and M198 include M417, M512, M514, M515, and rs34297606.[2]

New version:-
 * ''Because old R1a1 assumed the name R1a1a, old R1a1a was renamed R1a1a1. R1a1b and R1a1c assumed the new names R1a1a2 and R1a1a3, respectively. As part of the new studies, 5 new mutations that are always found with M17 or M198 were discovered.[note 7] In addition, in late 2009 five new subclades were detected, R1a1a4 to R1a1a8, the identifying SNPs are depicting in the cladogram to the right. Two of these subclades, R1a1a6 (M434 positive) and R1a1a7 (M458 positive) are observed in multiple countries.

Comments:-
 * Again the repetition of the whole re-naming story
 * Again this focus upon counting SNPs
 * Strong assertion that no sub-clades have been found in more than one country based on what source? Why important to say this anyway?
 * Verb "were" implies all at the same time.
 * Removal of mention of the important fact that most of this clade is un-categorized.--Andrew Lancaster (talk) 11:54, 19 November 2009 (UTC)