Talk:Haplogroup R1a/Archive 3

Starting on condensing this article, bit by bit
===Central Asia===

There are big differences in R1a frequency between populations in Central Asia.

Exceptionally high frequencies of M17 are found among the Ishkashimi (68%), the Tajik population of Khojant (64%), and the Kyrgyz (63%), but are likely "due to drift, as these populations are less diverse, and are characterized by relatively small numbers of individuals living in isolated mountain valleys." (The frequency of the Tajik/Dushanbe population is, at 19%, far lower than the 64% frequency of the Tajik/Khojant population.)

Haplogroup R1a is also common among Mongolic- and Turkic-speaking populations of Northwestern China, such as the Bonan, Dongxiang, Salar, and Uyghur peoples.

note that Turkish and Azeri populations are atypical among Altaic speakers R1a1-M17 haplotypes. Rather, these two Turkic-speaking groups seem to be closer to populations from the Middle East and Caucasus, characterized by high frequencies of M96- and/or M89-related haplotypes. This finding is consistent with a model in which the Turkic languages, originating in the Altai-Sayan region of Central Asia and northwestern Mongolia (31), were imposed on the Caucasian and Anatolian peoples with relatively little genetic admixture—another possible example of elite dominance-driven linguistic replacement.

This section needs help.
 * 1) First off, the quote from Wells is from 2001, which is rather dated for Y-chromosomal studies. Is there any reason that we consider this conclusion to be current?
 * 2) HLA presents a number of markers in Central Asia that are of South Asian or Iranian origin. These markers are enriched in the Balochi/Baloch of the western Indus and S. Iranian region and appear to have arrived more recently from Africa, namely W. Africa. Therefore it is possible that these have been translocated from other parts of Asia recently, and the Y-invasion hypothesis may be applicable. In particular, the DR3-DQ2.5 haplotype is out of place in Central Asia. Given gene flow between Central Asia and East Asia and the New World, its lack of presence in Japan, parts of Siberia and the Americas, the fact that it is not linked to the A1-B8 or any other known haplotype in Europe, its complete absence in the Middle East, and its association with A33-B58 and other haplotypes found in West Africa, it seems that there was a very strong migration within the Holocene period and that its expansion has continued until recently (i.e. Korea).
 * 3) M17 is sort of dropped into the piece, not well explained. I recommend 1 of 2 nomenclatures: M17 (R1a1a, ISOGG) or R1a1 (M17).
 * 4) This section and the entire distribution section are sort of choppy in their organization. They read something like a technical manual written in English by a non-English speaker.


 * I agree this section mentions old theories that cannot be taken seriously anymore. (The high Tajik levels are now known not to be isolated, and we can no longer say that Turks and Azeris are atypical when compared to the bigger data set we now have for Turkic speakers going all the way to East Asia.) The most neutral approach seems to be to shorten it a lot for now.--Andrew Lancaster (talk) 14:37, 1 November 2009 (UTC)

Central Asia and Western Asian cleaned
NorthEast Asia was merged into Central Asia and Northeast Asia; the anecdotal comment about Costa Rica was removed. There were a substantial number of Turkic people among the leadership of the Moors, and I know of Hispanics who can trace their ancestry back to Turkic peoples, so. . .PB666 yap 17:15, 2 November 2009 (UTC)

Audience for an encyclopedia piece
While I applaud efforts to elucidate this piece, one needs to keep in mind that its intended audience is the general public. Such sentences as "The phylogenetic analyses indicate a high degree of population admixture and a greater genetic proximity for the studied population groups when compared with other world populations" are beyond the reach of the average audience. This piece needs to be edited to conform with normal everyday English usage, and the thicket of verbiage needs to be pruned. Regards, MarmadukePercy (talk) 01:53, 3 November 2009 (UTC)


 * Hi Marmaduke. I do not think anyone will fight efforts to de-jargonize. I certainly agree with you. I think PB666 is working hard and fast to compress, and not to fix things like this.--Andrew Lancaster (talk) 20:10, 3 November 2009 (UTC)


 * Yes, I certainly agree with you that PB666 is doing a good job of compressing. At some point later we'll need to dejargonize and improve sentence structure. MarmadukePercy (talk) 20:40, 3 November 2009 (UTC)


 * Was that my sentence? lol. Feel free to dejargonize. I am finished with the distribution section.PB666 yap 23:41, 4 November 2009 (UTC)

Moving on to the origins section next, if anyone feels that my edits are too overwhelming please feel free to condense.PB666 yap 23:45, 4 November 2009 (UTC)

Another article!
Anyone have this? http://www.nature.com/ejhg/journal/vaop/ncurrent/abs/ejhg2009194a.html (I do not.)--Andrew Lancaster (talk) 18:05, 4 November 2009 (UTC)

Marker M434 has a low frequency and a late origin in West Asia bearing witness to recent gene flow over the Arabian Sea. Conversely, marker M458 has a significant frequency in Europe, exceeding 30% in its core area in Eastern Europe and comprising up to 70% of all M17 chromosomes present there. The diversity and frequency profiles of M458 suggest its origin during the early Holocene and a subsequent expansion likely related to a number of prehistoric cultural developments in the region.

Kurgan Map?
Hi all,

Hate to be a stickler on the details and whatnot, but does the Kurgan hypo. map need to be in here? Hate constantly pointing things like this out, but it clutters up the body of the article in a very unappealing fashion, isn't a genetic map but rather an archaeological one, and seems to add undue weight to the hypo., since no maps for other hypos are in the body of the text. Essentially I need feedback as to why it needs to be here, as it is presented already in the Kurgan Hypo. article, where it has its most relevance and really belongs. Best.

Geog1 (talk) 15:46, 6 November 2009 (UTC)Geog1


 * I added the image and I have no vested interest in the theory; however, the Kurgan Hypothesis was mentioned and this image helps to wikify the page, something that it is lacking in. Andrew and I are in the process of redoing the Origins section with a greater focus on the more recent literature. If the Kurgan hypothesis becomes a minor hypothesis for R1a origin, then it is likely that the image and much of the text will be removed. If you see a way to better condense and consolidate the Origins section, please be my guest; however, consider at the same time that to improve the article it helps to present graphics and tables instead of having a long dirty laundry list of origin theories [sic].PB666 yap 16:36, 6 November 2009 (UTC)

Right yeah...I was thinking about making a map that shows the Kurgan hypothesis in relationship to recent Ancient DNA findings, but then I'd have to do a map for every section, and since no other archaeological theory has been espoused for Out of India or anywhere else leading into Europe, I'm afraid that such an endeavor is really pointless. Graphics are good, but finding the best way to represent them and fit them into the scope of the article is always most challenging.

Maybe a comprehensive theory map is most appropriate showing R1a being spread originally from multiple locations. I think this would resolve our cartographic origins dilemma.71.179.199.39 (talk) 17:07, 8 November 2009 (UTC)Geog1


 * I can not see any non-controversial way to do this. I do think a new frequency map might be handy. The ones used in this article until now are not good enough with respect to up to date Asian data. The new Underhill article has pretty good maps which could perhaps be emulated?--Andrew Lancaster (talk) 16:01, 9 November 2009 (UTC)

Like I said before, I haven't read Underhill 2009 but hopefully will get around to it. I'm not really interested in making a choropleth/frequency map as much as I'd like to make a multiple origins map. That said, here's a cartographic to-do list:

1) Multiple Origins Map

2) New Frequency Map

Anyone else have any other mapping suggestions? Geog1 (talk) 19:48, 9 November 2009 (UTC)Geog1

Proposal for deleting two sections
I propose we should delete the Haplotypes and Popular Science sections. Certainly the way they stand now is not acceptable, so if anyone wants these to stay please do something about them quickly. They have been in a sorry state, with tags, for a long time.--Andrew Lancaster (talk) 09:54, 7 November 2009 (UTC)
 * Clearly with this new R1a paper, and the division into new haplogroups R1a1a7* and R1a1a*, quite a bit of this entire piece is going to need revision. These old modals were configured before this paper, which looks to be darn important. As the Haplotype section is written now, it is irrelevant, it seems to me. MarmadukePercy (talk) 10:14, 7 November 2009 (UTC)


 * This morning I have tried to adapt most sections. The phylogeny section and Eastern European sections still need work of course as they are the most affected. The reason for mentioning these two separately is that I see no way to save them myself and they've been tagged waiting for help for a long time.--Andrew Lancaster (talk) 12:49, 7 November 2009 (UTC)

The Eastern European and Central Asian origin theories can essentially be reduced to 1 sentence under the major heading "Origins", with its references. That will leave the South and West Asian origin theory and hypothesis, respectively. The whole section on the Kurgan hypothesis and image is obsolete; if it needs to be mentioned, let's reduce it to a note.

I have added more space to the cladogram. I understand there is one more subclade that needs to be added on an existing peripheral clade; I will add it on Monday or so.PB666 yap 23:04, 7 November 2009 (UTC)

I should remind everyone that the very popular anthropologists have gotten a lot of traction out of these theories; however, the goal of the encyclopedia is not to serve the career of molecular pundits, but to produce an encyclopedic page based on the weighed information on hand. I cannot repeat this enough: even as R1 is one of the most covered Y chromosomal types, so many more types are missing discerning mutations, and yet pundits have a great variety of theories with these things. We are not here to propagate theories or promote points of view based on bad data.PB666 yap 23:04, 7 November 2009 (UTC)

This comment is specifically for Andrew. Read the paper carefully: they are claiming that the origin is in Western India approximately 17,000 years ago. That is close to the end of the LGM. There are seven clades branching off of a single type. This represents a major expansion in a relatively short period of time; otherwise we would see short branching clades. This may be due to some missing SNPs; however, at 7 I find it unlikely that a large number of tiers can be created. Consequently we must assume that this is an expansion. The question they stipulate becomes when, and so the mode in which they calibrated the mutation rate is everything to the argument.

To keep this short, read the mitochondrial Eve page, particularly the section on comparison to the Y chromosome. There is a claim that Y is evolving at 55% of the commonly stated rate based on chimpanzee sperm evolution. Others are claiming that recent mutations are pruned from the population. An expansion date of 17,000 years ago could become 31,000 years ago, and entry dates for M458 of 7,000 years ago could become entry dates that approximate the end of the Younger Dryas.
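The rescaling in the comment above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming a date estimate scales inversely with the assumed mutation rate; the function name `rescale_date` is hypothetical and is not taken from the paper or from any library.

```python
def rescale_date(date_years, rate_fraction):
    """Rescale a date estimate when the true mutation rate is only
    `rate_fraction` of the rate that was assumed (assumption: the date
    is inversely proportional to the rate)."""
    return date_years / rate_fraction


# Figures quoted in the comment: 55% of the commonly stated rate.
expansion = rescale_date(17_000, 0.55)   # roughly 31,000 years
m458_entry = rescale_date(7_000, 0.55)   # roughly 12,700 years

print(round(expansion), round(m458_entry))
```

Under that assumption the 17,000-year date moves to about 31,000 years and the 7,000-year date to a little under 13,000 years, which is consistent with the figures given in the comment.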

There is cultural evidence of the spread of culture from the East, from the eastern Black Sea region into Central Europe, during the epipaleolithic/early mesolithic period; this was deemed to be a seasonal migratory culture. Some authors have argued that the Neolithic spread on preexisting mesolithic transmigration routes. IOW, we can present their data, but let's not lock the page into a new set of 'theorization' errors.PB666 yap 23:21, 7 November 2009 (UTC)


 * This team of authors constantly uses an age calculation method which roughly triples the age implied purely by what we know of the male line mutation rate, using the so-called Zhivotovsky fudge factor. It is not uncontroversial, and therefore I have tended not to emphasize age estimations in Wikipedia articles. However, if you calculate in a simpler way, the dispersal of M458 has already been argued online to match the dispersal of Slavic languages. In other words, there are lots of different types of cultural evidence from different periods. Concerning where the authors think R1a originated, I think you are going further than they do. They write:


 * --Andrew Lancaster (talk) 08:36, 8 November 2009 (UTC)


 * Apparently the age stuff is a problem, and, as Andrew says, needs recalibrating. But the discovery of a new SNP that apparently tracks with Slavic migration is pretty big, it seems to me. Certainly from the list of authors, this is no gang of crackpots. MarmadukePercy (talk) 11:03, 8 November 2009 (UTC)


 * Ah, but I think they link it to much earlier events, not Slavic languages.--Andrew Lancaster (talk) 17:32, 8 November 2009 (UTC)


 * agree this is an earlier migration, somewhere during the neolithic or very early copper/bronze age.PB666 yap 14:57, 9 November 2009 (UTC)


 * The age issue is a problem; that is why we need to discuss the problem here. I have seen papers with large numbers of authors get it wrong, so to speak. Putting that aside, I think this is a good paper, but applying even a fudge factor leaves a broad range for a confidence interval. That is something I want to be thought out, that's all that I am saying. I have no problem with this migration from the holo-Indus region; HLA also suggests some backflow out of India, and also measures, by haplotypes, population movement events that had not been previously observed by archaeology. However, the problems are:


 * 1) Out of India, but little R1a in south India.
 * 2) Within India, the castes tend to have had dependency on an Indo-Aryan source (that, or the Indo-Aryan theory is not as robust). Consider that other studies such as HLA suggest admixture from regions west of the Indus, although there are other Y possibly involved.
 * 3) If India was a source that preserved diversity then we would expect more R1a*, and R1a1* to be spread more geographically. Did these authors cover increased sampling over a broader area of India looking for either of these two with their new sets of markers?
 * 4) If not, then this R1a1a phenomenon is a rapid entry and expansion from India. Timing is then everything, and that is what causes a problem.
 * 5) HLA connections with India are generally not in the region where M458 is found. Although at some point I can test the theory specifically, the HLA suggests a shared origin between some Indian and some southern Mediterranean (particularly western Mediterranean) haplotypes. I have assumed this is a result of recent common ancestors in West Africa between the western Indus valley region and Iberia.


 * Scenario 1. Last glacial maximum forces a migration of R1a1 into a locale in India, where it evolves to R1a1a locally and then radiates as it moves westward. Chronology is 25-18 kya. If the calibration is correct, this scenario is very marginal, suggesting more correction is in order. In terms of silent migration, however, this is probably the best explanation.
 * Scenario 2. The growth of agriculture moves the R1a1 into a locale of India, where it evolves and then radiates as it moves westward. Chronology 12-8 kya. Where is the evidence, however? This is recent enough that we would see specific evidence on the archaeological horizons.
 * Scenario 3. Epipaleolithic. Optimal conditions on the western Indus deteriorated after the LGM, causing a migration into India, which then reversed, establishing the Harappan culture as it spread back westward. 18-6 kya. Not sure there is any evidence of this having occurred. Harappan culture has close ties to the very cultures where R1a1a derivatives are very low.

With each of these scenarios there is no real explanation of why the sharp cline exists from west central India to south India, although the second (short chronology) is most compatible. Why are East-Central Iranian levels so low, etc.?

I think it is clear now that the M458 migrated in response to earlier events; for all intents and purposes it erases most of the current theories. I have a problem with the very low information regarding R1a* and R1a1*. These categories of mutations also identify places of origin; if they are spread loosely about all over the place, it lowers the certainty of their conclusions.PB666 yap 14:57, 9 November 2009 (UTC)


 * I am having some trouble following all your thoughts here, but here are some I can answer:-
 * I do not see the problem you keep mentioning with the lack of M198/M17 testing. In fact if you look at the data article (which does not include the new data) it seems quite a lot of these regions have now been typed.
 * I think what the Underhill team are suggesting is that R1a* (M17/M198 negative) is spread in the old agricultural region from SE Europe to Northern India. Their maps, when compared to older ones, now show the oldest R1a sweeping through the Middle East, UNDER the Black, Caspian and Aral seas. So they presumably see everything to the north of this as something more recent perhaps spread by farming?


 * So my question here is: did we have a post-migratory displacement (for example, migration into the Baloch regions of the western Indus), or are we talking about typing problems? My opinion, based on years of analysis, is that when a center of diversity is detected, one is then required to oversample those regions; making errors in peripheral branches is less logically problematic than making errors in the basal branches, particularly with regard to clocking. If indeed they believe that the Eastern Indus to Bombay region is the place of origin of the R1a1a radiation, then they need to be sampling like crazy around these regions for R1a to confirm whether or not that is so.


 * What do you think the problem for mtDNA has been in Africa? Why are there arguments about the origin of M1 and U6? Simply because, in this site of diversity, sampling is not representing the level of diversity.


 * Just to let you know my own opinion, I do not think the Zhivotovsky fudge factor is justified. It triples age calculations, in a way which is very hard to justify. The unadjusted germline rate seems to be usable, especially if you compare whole sub-clades to determine common ancestors.


 * How bad is the STR clock? "6.9 x 10^-4 per 25 years, with a standard deviation across loci of 5.7 x 10^-4". That in fact is an error of presentation: they should have given high and low variances separately. I would be willing to bet the 95% confidence interval is skewed; in fact I know it is skewed. 5.7 x 2 = 11.4, so
 * 6.9 - 11.4 = -4.5 x 10^-4 and 6.9 + 11.4 = 18.3 x 10^-4, i.e. a range of about -4.5 x 10^-4 to 18 x 10^-4. You can almost bet that the real range is from 10^-5 to 3 x 10^-3; that is one hell of a range. Cautious about talking about HLA? Have they asserted a pairwise distance in the clade that can be used for SNP clocking? If we had a SNP pairwise distance, one could simply multiply the CHLCA-derived date by 1.8; if it agrees with their STR date then there is concordance, and if it disagrees, then there are problems. PB666 yap 18:31, 9 November 2009 (UTC)
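The interval arithmetic in the comment above can be reproduced with a short Python sketch. It assumes a symmetric mean ± 2 standard deviations interval, which is exactly the kind of presentation the comment is objecting to; the names `MEAN_RATE`, `SD_ACROSS_LOCI` and `symmetric_interval` are illustrative only.

```python
# Values as quoted from the paper in the comment above
# (mutations per locus per 25 years).
MEAN_RATE = 6.9e-4
SD_ACROSS_LOCI = 5.7e-4


def symmetric_interval(mean, sd, k=2.0):
    """Return (low, high) for mean +/- k standard deviations.
    Assumption: a symmetric, roughly normal spread, which a rate
    that cannot be negative will not really have."""
    return mean - k * sd, mean + k * sd


low, high = symmetric_interval(MEAN_RATE, SD_ACROSS_LOCI)
print(f"low = {low:.2e}, high = {high:.2e}")

# A negative lower bound is impossible for a mutation rate, which is
# why a symmetric interval is a poor summary here.
print("lower bound is negative:", low < 0)
```

The lower bound comes out negative, which is the point being made: a mutation rate cannot be below zero, so the true uncertainty has to be skewed rather than symmetric.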


 * I think the STR mutation rates are reasonably well known. The problem is how to estimate what factors have shaped the modern population (i.e. the sampling available today). The Zhivotovsky technique can be summarized as follows: calculate based on the known mutation rate, and then multiply by about 3 in order to compensate for possible bottlenecks etc.--Andrew Lancaster (talk) 11:18, 10 November 2009 (UTC)
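As a rough illustration of the two-step summary above, here is a minimal Python sketch. The unadjusted estimator (average STR variance divided by the mutation rate) is an assumption chosen for simplicity and is not claimed to be the authors' exact procedure; all input numbers are hypothetical placeholders, and only the "multiply by about 3" step comes from the comment.

```python
def unadjusted_age_years(str_variance, rate_per_generation, years_per_generation=25):
    """Simple variance-based age: generations = variance / rate.
    Assumption: this stands in for 'calculate based on the known
    mutation rate'; the real method may differ in detail."""
    generations = str_variance / rate_per_generation
    return generations * years_per_generation


def adjusted_age_years(age_years, correction_factor=3.0):
    """Apply the 'multiply by about 3' correction described above."""
    return age_years * correction_factor


# Hypothetical inputs, for illustration only.
raw = unadjusted_age_years(str_variance=0.2, rate_per_generation=0.002)
print(raw, adjusted_age_years(raw))
```

Whatever the unadjusted estimate is, the correction simply triples it, which is why the choice of whether to apply it dominates any comparison with archaeological or linguistic dates.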


 * There obviously is a quite decent presence of R1a1 right into southern India and even Sri Lanka, though it is obviously patchy.


 * I have not seen anything yet demonstrating R1a* (xR1a1) or R1a1* (xR1a1a) from the broader south Asian region in their paper. India 728-Y 0-R1a* 0-R1a1* 115-R1a. IOW they have this R1a1a radiating from India, but not one single instance of its parent clade; that presents a large problem with their interpretation of STR diversity. This would indicate that Indian R1a1, despite its diversity, is the result of a considerable displacement from elsewhere. One expects to see a greater spread of R1a basal branches at some level in India; if this is not observed in India, then India is unlikely to be the place where R1a1a* evolved or initially radiated. We could tolerate an argument from ignorance if the clade was really old, due to lineage loss and drift, but not for something that is younger than the LGM.


 * Turkey 2 of 3 R1a are R1a*
 * Iran  2 of 19 R1a are R1a*, more importantly 1 of 19 are R1a1*
 * Oman  1 of 11 are R1a*
 * Armenia 2 of 37 are R1a1*
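The counts listed above can be put side by side as proportions. The sketch below only tabulates the numbers quoted in this thread (including the India figure of 0 basal out of 115 R1a); the dictionary and function names are illustrative, and with samples this small the proportions are indicative at best.

```python
# (basal R1a* plus R1a1* count, total R1a) as quoted in the comments above.
# Iran combines 2 R1a* and 1 R1a1*.
basal_counts = {
    "Turkey": (2, 3),
    "Iran": (3, 19),
    "Oman": (1, 11),
    "Armenia": (2, 37),
    "India": (0, 115),
}


def basal_share(basal, total):
    """Fraction of R1a chromosomes that fall in a basal (starred) category."""
    return basal / total


for region, (basal, total) in basal_counts.items():
    print(f"{region:8s} {basal:3d}/{total:<4d} {basal_share(basal, total):.1%}")
```

India has by far the largest R1a sample here and the only zero, which is the contrast the comment is drawing.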


 * Without more Indian data this places R1a1 origin in the vicinity of NW Iran and Armenia. It is very interesting that this is the same area believed to be the site of hexaploid Triticum aestivum domestication at 8500 years ago. If they claim that R1a1 spread with agriculture, then one has a major problem, because R1a1a has to be younger than R1a1, which means their dating is really flawed. If this was indeed the case, then full sequencing of R1a and R1a1 would discriminate between the possibility of a relic type and the branching having to be much earlier.


 * Bearing on this interpretation, ethnographic and HLA data suggest, and one other study supports, the migration of Africans into the Eastern Indus; the guess is in the early Holocene, prior to the Harappan culture. This could be what drove R1a1a diversity to the East and away from Iran.


 * Yes, the prehistory of India and Pakistan would encourage us to think the centre of gravity for the major population there may have shifted a bit over time.--Andrew Lancaster (talk) 11:18, 10 November 2009 (UTC)


 * I am cautious about using HLA data to guide us here.--Andrew Lancaster (talk) 15:59, 9 November 2009 (UTC)


 * HLA not as a guide but as a critique: are the conclusions consistent or inconsistent? Since mtDNA can have settlement bias and Y has invasion bias, one needs to look for middle ground, particularly if multiple migrations (more than 2) have been implicated. Again, we were waiting for more literature to change the page; however, practically we need to make sure these authors are being critical of their own work, for example regarding calibration problems or basal clade sampling problems. For example, you should be asking the question: why haven't the few R1a* and R1a1* Y been sequenced in the NRY so that we can see their defining mutations, have these been adequately investigated around the Indus, etc. If one is using SNPs for clock estimates (they are not, but others might choose to), consider the worthlessness of doing pairwise tests when one or the other branches have one or more undetected SNPs. These are potential errors that might change the conclusion with the next group that publishes. The real question at this point, based on this paper, is whether we should actually have subdivided origin sections. IMO SW Asian and S Asian origin should be combined under a single heading, since the authors themselves cannot discriminate the two. No other origin sections are needed, unless one desires to create a page on R1a1a7.


 * I like the term invasion bias LOL! Anyway I know what you mean, but Y haplogroups are not necessarily spread by Genghis Khans. I write this mainly for the public who'll read this.--Andrew Lancaster (talk) 11:18, 10 November 2009 (UTC)

CWC & R1A
Question/discussion point:

How did the Haak study demonstrate that R1a spread during or with CWC? The strontium isotope analysis suggested the males were local inhabitants and I can't recall any genetic analysis that proved it spread during or with the horizon.Geog1 (talk) 13:41, 9 November 2009 (UTC)Geog1


 * By CWC you presumably mean corded ware horizon? I do not know if the Wikipedia article currently makes this claim? I think a claim like this is derived in the new Underhill et al. article though. It might be better to reference that as well because another problem with the current way of referencing is that I am not sure if Haak actually identified the Y haplogroup? I am writing all this without looking anything up.--Andrew Lancaster (talk) 15:47, 9 November 2009 (UTC)

CWC is still the way the Anglo archaeological world abbreviates Corded Ware Horizon (or "culture"). But don't get me started on the whole persistent archaeological horizons/persistent frontier debate in archaeology and its usefulness.

As for the study in question Haak did indeed identify R1a as the lineage extracted from the analysis among the males. So the Haak reference is fine and should be there but there is another reference next to the Haak ([22] I believe) that I don't think has any relevance to the debate. The real issue is whether or not Haak actually said R1a spread with CWC which I know from reading the article he did not.

The only article that I know of that claims R1a spread with CWC is Dupuy 2006 and this was based on modern population samples.

I think that statement in the article needs to change, as it's still up in the air whether or not R1a made it further west during or before the Neolithic.Geog1 (talk) 16:10, 9 November 2009 (UTC)Geog1


 * Have you seen the new Underhill paper?--Andrew Lancaster (talk) 16:36, 9 November 2009 (UTC)

The Underhill et al 2009 (U2009) suggests that R1a1a7 (M458) may have spread with LBK or expanded with CWC. They imply, but have not directly tested, that the 3 male skeletal remains at Eulau, Germany are likely R1a1a7.

U2009 dates the expansion to 7.9 Ka. The LBK expansion begins in earnest as a prototypic culture at around 8.5 Ka; by 8.2 Ka, IIRC, it had a presence in the Iron Gates gorges region, and by 7.5 Ka it was present within the Loess belt. Some cattle culture had reached England by 7.1 Ka, and there is evidence of cultural admixing within the Paris Basin previously. In spite of evidence supporting in-situ cultural evolution, the HLA suggests some contributions to German and Czech regions specifically from populations to the East. Recent evidence for mtDNA suggests that there was cultural replacement as a consequence of LBK cultural expansion.

The problem is getting R1a1a7 into the Iron Gates gorges region. The cattle for this culture appear to have been of Thessalonian stock, and the wheat varieties appear to have been of Anatolian/Balkan stock, so therein lies one of the problems with that theory.

I have to disagree with these authors on one point: there appears to have been a lot of genetic replacement within Poland, possibly due to the effects of the Hunnic invasion of the 4th to 5th century AD. I think that while they can establish R1a1a7 as earlier, the types found in Scandinavia, but not found to any degree in Ireland and only at low levels in England, are more than likely the result of more recent migrations. I would not attribute Poland's Y chromosomes to ancient migrations if other theories might explain these Y.

My reasoning is this: the Irish and Scandinavians share many common HLA haplotypes, which are deemed the Ancestral Haplotypes of likely Canto-Iberian origin. In addition there is a cross gradient of Asian haplotypes that declines from about 25% in Swedes to about 5% in Ireland and has nothing to do with India or any known or historic migration; it tends to align with cultural evidence of East Asian-like cultures (Tibet, Yakuts, Koreans, Japanese, Orochon, etc.) hunting at the boundaries of glacial retreat around the Pleistocene-Holocene boundary. While Ireland has had more recent influences from Eastern Atlantic cultures on the Western boundary, Scandinavia shows somewhat more recent influences from the SE. Poland in particular has a large number of haplotypes that are inconsistent with western, northern or central European origin and set it apart from any other European group. These haplotypes might be more consistent with Greek, Bulgar, Romanian, Anatolian, Caucasus, or other external origin; there is no specific origin within any typed people, and these haplotypes could be from an untyped group within the Russian republic or from a subgroup within Eastern Europe. Has anyone bothered to type the Zoroastrian communities yet?PB666 yap 17:30, 9 November 2009 (UTC)


 * You have it wrong about the Underhill paper implying the three Eulau skeletons are R1a1a7. In fact, they say the opposite. From the study: "Although haplogroup affiliation cannot be inferred with certainty from STR data alone, a composite 15-locus YSTR haplotype representing the ancient lineage suggests its potential R1a1a*(xM458) membership due to four alleles (DYS391=11, DYS439=10, DYS389B=17 and DYS458=15) shared with the median R1a1a*(xM458) haplotype (Supplementary Tables S4 and S7). Interestingly, from the list of regional median haplotypes, the ancient haplotype is most similar to the German R1a1a*(xM458) type." MarmadukePercy (talk) 18:14, 9 November 2009 (UTC)


 * Thanks, I did not catch the "x"M458. Why did they not mention the type it was most similar to?PB666 yap 18:37, 9 November 2009 (UTC)


 * I think the number of markers reported would not be helpful for that purpose. Recently discovered clades like this are getting to the point where you need a lot of STRs to predict them well.--Andrew Lancaster (talk) 19:09, 9 November 2009 (UTC)

Good conversation, but I think we need more ancient DNA studies like Keyser et al. 2009 to shed more light on the subject here. I remain skeptical as to the validity of the case that modern population studies make for macro-level demographic changes during a specific archaeological horizon.

Haak's 2005 study was particularly good for supporting the "Paleolithic Survival" theory, in that most Europeans were descended from indigenous groups during the LBK, though 25% of ancient specimens sampled were from mtDNA haplogroups that could reasonably be regarded as Near Eastern. So some genetic contribution was given to Europe from the Near East that corresponded with the spread of agriculture. Again, this becomes most clear from the methodology used. Regardless, I'll check out Underhill to hear what the study says.Geog1 (talk) 18:52, 9 November 2009 (UTC)Geog1


 * Geog1, it does not really matter if you are sceptical. We are trying to summarize the state of play on this subject in the published literature. But anyway, please note that several Y tests from the Bronze Age sites have been R1a, not just one, so while this is not enough data to say HOW common R1a was, it does make a quite strong case that it was present. However, I do not think this means anything for any survival theories. It just means it has probably become less common.--Andrew Lancaster (talk) 19:09, 9 November 2009 (UTC)

I was merely stating my opinion on the current state of research, Andrew. And really, the way the modern population sample studies are done is not enough to effectively prove any one theory or stance. As a result of trying to cram every bit of modern population studies into articles like this without any proper attention to Ancient DNA findings, we constantly get endless revisions and an article that never has a well established body, which is more of the point for an encyclopedic entry. Therefore a degree of skepticism or discretion is justly due, rather than dumping in, and I must be brutally honest, the same old garbage. There was even a NY Times article that interviewed members of the genetic community, and it said that these modern population studies for genetics are no sure-fire way to prove time, depth, or spread. The Haak study from 2005 (not the one from 2008) was the one I was referring to in regards to LBK and demographic change and continuance. I suggest you read it if you haven't already.Geog1 (talk) 19:24, 9 November 2009 (UTC)Geog1


 * Sadly, there are very few ancient DNA studies. In any case, for this article we are lucky that there are a few relevant ones. Thanks for the tip on the other Haak study.--Andrew Lancaster (talk) 20:23, 9 November 2009 (UTC)


 * Geog, we are trying to redo this article with a much more conservative (less speculative) approach. I am not criticizing Andrew; I am trying to predict the minefield of problems that might occur if we simply assume that one paper, any paper, the latest paper, is the authority. Since the latest paper alludes to the fact that previous research has been in error, we have to ask the question from a future tense: what problems will this paper show, versus what can we be convinced of based on the strengths of the paper(s)? While ancient DNA typing papers are interesting, we also have to have a broad basis of Mesolithic and Epipaleolithic DNA to discern what DNA has evolved in situ and what level of replacement has occurred. As the recent reports of LBK mtDNA show, the previous studies were in error because they did not type enough Mesolithic mtDNAs. If you have any strong opinions about what is too speculative or what should be cut, please present those issues; you will probably find some agreement. PB666 yap 00:05, 10 November 2009 (UTC)

I think what is too speculative is the notion that you can base what is a Paleolithic vs. what is a Mesolithic vs. what is a Neolithic gene on modern population samples. So my problem for the past few years has been the general methodology used to make such claims. Other problems that abound regard marrying some of these hypothetical migrations/spreads to archaeological horizons or even Paleolithic "techno-complexes". Take for instance the whole Late Glacial stance for R1a. It's actually been fairly traditional among those who specialize in eastern European late Paleolithic archaeology to view the situation as stadial, without any firm archaeological evidence from any refugium wherein long distance expansion/migration from west to east or even east to west was occurring (see Hoffecker 2002). Likewise there is no archaeological evidence for a migration out of India and into Russia, Ukraine, and then Scandinavia during the various phases of the Paleolithic. There is no ancient Y-chromosome data from the Kostenki, Khvalynsk, Sredny Stog, or Yamna archaeological cultural blocs to confirm a particularly early date for R1a on the Pontic-Caspian steppes, and likewise none from India, which is the only way I believe one could really effectively date these lineages (or at least begin to). Essentially hard evidence is missing from all this, and again it's really more the methodology used over the years which I have a problem with. It should be noted that this problem is obviously not limited to R1a but also applies to R1b, J, and countless other haplogroups. At this point there isn't really much we can do except sit back and wait for more ancient DNA studies to come out. Hope this makes more sense now where I am coming from.
However I am not saying that we should censor the modern population studies... I just in all due honesty feel they are not that useful in pinpointing origins and establishing migration patterns for some areas, though obviously Y-chromosome haplogroup Q's spatial distribution illuminates the crossing of Beringia and a significant peopling of America. Geog1 (talk) 02:44, 10 November 2009 (UTC)Geog1


 * All sounds reasonable to me. I would remark that Underhill et al's new approach seems to see R1a branching out of the Fertile Crescent, not going from India to Europe, although they do not say that in a clear way. I'd say this is certainly worth considering. I would also say that their approach to age estimates (Zhivotovsky method) is becoming increasingly controversial. If you take these two remarks and put them together, my personal opinion (just mentioned in order to avoid misunderstandings here, not as a proposal for the article) is that most of the big clades in the Middle East, Europe, Central and South Asia (R clades, J clades, E clades etc) are dominated by Holocene expansions out of the Middle East (Neolithic, Bronze Age etc). Even the distribution of N and I Y haplogroups seems to be affected by such movements, although their starting points may have been different.--Andrew Lancaster (talk) 08:31, 10 November 2009 (UTC)

Interesting. Yeah, it seems like every once in a while there's an approach used like "coalescence theory", microsatellite diversity and whatnot to try and pinpoint a temporal date, but none of these methods seem concrete. I'm unaware of the Zhivotovsky method but will have to read more on that, so thanks for the heads up on that one. Also, in relation to the problems of archaeological horizons and marrying the modern population genetic data to them, Pdeitker brings up an interesting point regarding the spread of certain cattle associated with LBK, in that such breeds strengthen a south to north "agro. wave of advance" and would invite greater possibility for Near Easterners to have accompanied such movement or migration. Archaeologists usually speak of LBK as stemming from the "Danubian" sphere, alluding to more southern farming groups, and thus it would have spread along the river. Connections are sometimes drawn to the Starcevo Cris horizon, again strengthening the view that the LBK, Lengyel, and Rossen archaeological groups would in turn have resulted from southern farming influence in Europe. Therefore, the spatial distribution of R1a would in essence not correspond all that well with it spreading via the LBK horizon. So I would agree with what Pdeitker is essentially alluding to. However, who knows for sure if the prehistoric record can always clearly illuminate migration or directional influence, or if it masks it in some cases. So many problems and seemingly not enough good data. Geog1 (talk) 14:14, 10 November 2009 (UTC)Geog1


 * Well I happen to think the evidence is very strong that there was indeed a large Middle Eastern population movement into SE Europe, and into the Danubian basin, which at least in terms of Y haplogroups may dominate Europe's whole population. Presumably the genetic effect gets less with each push these cultures made, with LBK areas having less than the Balkans, and the areas beyond the LBK less again. However I see no problem with saying R1a might come from the Middle East in this period and yet show no modern evidence of having been on the farming train into Europe. Because:
 * It might not have caught that particular train. Indeed R1a may have been from the East of the Middle East so to speak, indeed towards the areas where it is still most common today.
 * It might have entered Europe early (as per Eulau and Lichtenstein cave evidence) and then these old European R1a lineages might simply have not been winners in the luck stakes.
 * Indeed I think the chances are good that there were several waves of Middle Eastern immigration after the first one, perhaps right into the Bronze Age, possibly carrying R1b-M269, and these may have over-run R1a in frequency terms. (Note how there are many scattered I clades which survived, presumably native to Europe, but we would not see these as distinct old European clades if they were R1a, because R1a in Europe today is clearly dominated by an expansion from the East.)--Andrew Lancaster (talk) 14:28, 10 November 2009 (UTC)

I think that this has been a good conversation. Thanks everyone for sharing your views. Just to verify though, in relation to what the title of this section was about: Haak simply discovered R1a among CWC males. Underhill said it may have expanded during the LBK or CWC, using his own methodology. Dupuy is currently the only one that I know of who said the lineage spread with CWC based on modern population samples. Kasperaviciute (2004) is the only study that I know of that discusses R1a arriving in the Baltic area during the time of the CWC as a possibility based on modern population samples.

So this statement: "The discovery demonstrated the spread of R1a with Corded Ware culture into Central Europe."...is it technically correct?Geog1 (talk) 15:12, 10 November 2009 (UTC)Geog1


 * I think I see your point. I guess it would be controversial to deny that it showed the "presence" of R1a in the CWC, but I agree that it seems controversial to make any claims about when it got into the area. As PB said, I reckon no one will object to you fixing the wording.--Andrew Lancaster (talk) 15:25, 10 November 2009 (UTC)

Ok I made a minor change then. Glad we could talk this out.Geog1 (talk) 15:37, 10 November 2009 (UTC)Geog1

Calibration issue, U2009
Sorry to present a bit of original research, but Andrew brought up the issue of calibration, and their technique raises issues all by itself: STR clocking is highly subject to lineage-specific rate variances (evident in their stated confidence interval). To test which of these is likely correct I counted all the SNPs between the Y seqMRCA and R1a1a7 as 52 mutations. In agreement with White et al. 2009, I set the age of the CHLCA for humans and chimps at 8 Ma; this resets the Soares mtDNA TMRCA to 236,000 years ago. I set the Y TMRCA as the time of the L0k-L0d/mt-else (Khoisan/East African) split, at 169,000 years using the above long chronology (Soares et al., 138 kya for that split, × 8/6.5). This is a liberally high estimate of the Y chromosome MRCA; it will make all Y values high, but not as high as Soares' or Gonder's estimate of the mtDNA TMRCA, and it allows for a lower male population size prior to expansion. For example, with this chronology L3 expands at 100 ka with an effectively larger population, leaving a constriction period of 135 ka; with Ne for Y at 1/2 that of mtDNA, that would be 65 ka + 100 ka or 165 ka, consistent with gender estimates. These are the dates for various mutations that I came up with:


 * R < 72,000 (compare 26,000 in Tatiana M et al. 2008, Genome Research)
 * R1 < 45,000
 * R1a < 34,000
 * R1a1 < 23,000
 * R1a1a < 6300
 * R1a1a7 < 3500

Again, these timings come from someone very critical of the Y-chromosomal molecular clock who strongly favors a long chronology, and one should assume great variances. The mutation counts are very similar to those I obtained from the E1b1b1a1a1-seqMRCA distance. Even with that strong favoritism I cannot approach Underhill's aged timing for Indian R1a1a diversity.
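
The clock arithmetic described above can be sketched as follows. This is only an illustration under stated assumptions: the anchor of 169,000 years and the count of 52 SNPs come from the comment above, while the function names and the per-branch SNP count used in the example are hypothetical. The per-node counts behind the listed dates were not given, so the sketch does not try to reproduce that list.

```python
# Minimal sketch of a SNP-count clock (illustrative only).
# Assumed inputs from the comment above:
Y_TMRCA_YEARS = 169_000   # assumed Y-chromosome TMRCA under the long chronology
SNPS_ROOT_TO_TIP = 52     # SNPs counted between the Y seqMRCA and R1a1a7


def years_per_snp(tmrca_years, snp_count):
    """Average time per SNP if mutations accumulate evenly along the lineage."""
    return tmrca_years / snp_count


def upper_bound_age(snps_below_node, rate):
    """Upper bound on a node's age: SNPs between the node and the tip, times the rate."""
    return snps_below_node * rate


rate = years_per_snp(Y_TMRCA_YEARS, SNPS_ROOT_TO_TIP)
print(rate)  # 3250.0 years per SNP

# Hypothetical example: a node separated from the tip by 2 SNPs.
print(upper_bound_age(2, rate))
```

Because every age is a simple multiple of the rate, any error in the anchor date or in the SNP count scales all of the dates together.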

On face value I would argue that Andrew is correct; however, does this hold up to scrutiny of the various issues in the technique?

If we argue that CT is about the same time as the first Y leaving Africa, then Y left Africa 134,000 years ago, which would place the exodus about 30,000 years before the expansion of the L3 mtDNA lineage using the same anchors. There are two possible explanations: many missing mutations in the basal Y clades, or the Y TMRCA is much more recent. If one packs the basal branches with mutations then 52 increases and the mutation rate increases, and if one sets the Y TMRCA more recent then the mutation rate is also faster. Increased mutation rates mean decreased times between branch points: t = 1 / f.
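
A small sketch of that sensitivity, under the same even-accumulation assumption: either adding undiscovered basal SNPs or lowering the TMRCA shortens the time assigned to each mutation. The function name, the count of 8 hidden SNPs, and the alternative TMRCA of 120,000 years are hypothetical values chosen only to show the direction of the effect.

```python
# Illustrative sensitivity check for a SNP-count clock (hypothetical values).

def years_per_snp(tmrca_years, snp_count):
    """Average time between mutations: t = 1 / f, with f = snp_count / tmrca_years."""
    return tmrca_years / snp_count


baseline = years_per_snp(169_000, 52)        # anchor and count stated above
hidden_snps = years_per_snp(169_000, 52 + 8) # assume 8 undiscovered basal SNPs
younger_root = years_per_snp(120_000, 52)    # assume a more recent Y TMRCA

for label, value in [("baseline", baseline),
                     ("more basal SNPs", hidden_snps),
                     ("younger TMRCA", younger_root)]:
    print(f"{label}: {value:.0f} years per SNP")
```

In both alternative cases each SNP accounts for less time, so every branch point dated by counting SNPs back from the tip becomes younger.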

OTOH there could be a number of missing mutations in the R1a1a subclade which would make the dates much older. However, given the above, they would have to hope that cryptic basal mutations or lower TMRCAs would be compensated for by a much higher relative number of mutations found in the peripheral clades in order for their early dating to hold up. Generally I find this plausible, but specifically I would warn against an 'If and If argument' when both premises are unknown and when one relies on two specific premises being true for one's argument to also be true, particularly in this case, when there has been much more study of peripheral clades than inter-comparison between basal clades. Therefore, with critique of potential errors, it seems likely that Andrew is correct. We have to be very judicious with our wording on the main page:

I recommend a rewritten Origins section:

1. Using this paper to pare down the various regional origin hypotheses. Both the Central/East European and Central Asian origins can be eliminated (however one might mention the Northern Caucasus and Hindu-Kush as two possibilities) while mentioning these two in passing.

2. Since no R1a* or R1a1* has been found in India yet, but a center of diversity exists on the Iranian/Armenian border region for R1a basal clades and since the center of diversity in about the East Indus region supports at least an early entry of the R1a1 clade, I think we can effectively limit the points of origin from the Northern Caucasus to Anatolia to S.Iran and the E. Indus river in a single section. While explaining the limitations in the sampling (only 117 R1a typed in India for example). E.g. South or West Asian origin hypothesis. I think the problems can be described as being under a single umbrella, a single section.

3. If anyone has published a critique of their clocking method then we can present their U2009 approach and also the critique, and suggest other expansion chronologies. I had a paper about 2 years back that criticized STR clocking, however I am not sure where it is now. If we cannot find a current critique of the method then we have to be very careful about how chronology is discussed, otherwise the article takes on a non-NPOV. PB666 yap 22:38, 9 November 2009 (UTC)


 * I should state that I am personally under the influence of people like Ken Nordtvedt, who having once done maths for NASA, is now a genetic genealogist. His approach is to admit that STR variance on its own is a mess, but he is opposed to inventing fudge factors to try to fix it. What he encourages is STR comparisons of whole clades, "interclade estimates" - clades being groups with known common ancestors. This gets rid of a lot of noise caused by population dynamics because you know you are dealing with two real father-to-son lines. The key then is to use as many markers as possible, and a lot of discussion has gone on about which mix of markers is optimal in terms of mutation "speed". Ken's analysis of this approach is that it should be accurate within the Holocene, and possibly further. A lot of this is un-citeable, because it appears on internet forums, but I know that "real geneticists" have taken some note of it. What Ken has published (on JOGG and on his own webpages) has tended to concern the details of STR estimation maths. The basic method and logic is quite simple and not exactly new. Using SNPs is of course another approach which will probably come into its own in the near future, but I do not know if we have the right datasets to use it much yet. You need to compare big stretches of sequenced chromosome, I think. Many known SNPs are discovered by variance coincidences, for example because they are next to an STR and make a test fail. So just counting them does not mean much.--Andrew Lancaster (talk) 08:47, 10 November 2009 (UTC)

Talk:Haplogroup_R1a1a_(Y-DNA)
Secretly Cardenas2008 created the main for this page. I have edited this new page, and moved all the discussions regarding this topic to the appropriate talk page. Also copied origins section from this main to that main, editing for R1a --> R1a1a changes where appropriate, I am sure I did not do it perfectly, Andrew and Marmaduke so please feel free to correct and balance.

In addition I completely rewrote the Origin section to reflect the 2009 understanding. Since the previous studies were talking about R1a, that remains here, and that page discusses R1a1a, which is relatively new; all origin theories were eliminated and the discussion focuses on what is gleaned from Underhill et al. 2009.

Now we have two articles to bring up to B-class. I will redo the phylogenetics this evening to reflect and link to that new page.

CheersPB666 yap 00:04, 11 November 2009 (UTC)

I would like to hear what others think, but I am not really sure I can agree with either of the following:- I believe each article should make sense on its own, and as much as possible so should each talk page. There might come a day when R-SRY1532.2/SRY10831.2 and R-M17/R/M198 need to be discussed separately but I believe we have not reached it.--Andrew Lancaster (talk) 07:33, 11 November 2009 (UTC)
 * Having separate articles for R-SRY1532.2/SRY10831.2 and R-M17/R/M198. The two clades overlap almost entirely in all discussion so far. You'd either have to have almost identical articles, or else, if you avoid extensive redundancy, you'd need to cripple both articles.
 * Moving all the talk page material you moved. Some is quite relevant to this article still, no matter what happens.


 * I guess I'm not following on the reasoning for breaking this piece into two. I don't see the logic there. Can someone explain? MarmadukePercy (talk) 12:49, 11 November 2009 (UTC)


 * I've been trying to work out the edits that have been made and I must say the articles hardly make sense anymore, and a lot of hard work that had gone into developing something which made sense is no longer easy to recover. Chunks of information have simply been left hanging and incomplete, or else screwed up into material which is full of confusion and error. It is simply not good enough to make major changes like this if you have no time to do it properly. These major changes were a major mistake.--Andrew Lancaster (talk) 21:24, 11 November 2009 (UTC)


 * I think the R1a parts should all stay together in one piece. Otherwise, it becomes too diffuse and confusing. R1b, for instance, for all its subclades, remains one piece. MarmadukePercy (talk) 22:10, 11 November 2009 (UTC)


 * Since I did not create the new page, I simply worked to improve that which already existed, however as I told Andrew prior to realizing this was done. I had hoped to have two pages sandboxed for couple of weeks, particularly the R1a1a page.


 * Underhill and others have produced 3 more R1a-R1a1 mutations (M448, M459, M516) and 5 more R1a1-R1a1a mutations (M417, M512, M514, M515, Page07). This is not to mention 3 more R1a1a haplotypes and one branching haplotype within R1a1a, R1a1a7-M334. This means we are dealing with 4 tiers when considering R1a. The total number of SNPs from R1 to R1a1a7a now exceeds the number of mutations between R1 and R1b variants. The cladistics was becoming excessively difficult. In addition the page logic is following the exact same logic that the E1b1b page took: first the E1b1b page was split to form the E1b1b1a page, and last the E1b1b page was split to form the E1b1b1 page. In this situation R1a1 was not subsplit (since there are simply too few examples).
 * In the discussion of origins R1a origins were being confused with R1a1a origins, these are two clearly separate issues, and we can add to this issue the intermediate R1a1. IOW from a phylogenetics standpoint the origin was confusing three different branch points.
 * This problem is further exacerbated by the discussion of STRs, which radiate from SNPs once they form. IOW, as the site that Andrew pointed to points out, STR clocking is highly dependent on the correct assignment of SNPs. We can look at the STR mutations on the R1a1a page, as these have not been changed.
 * Andrew, after reading the Origins section of the R1a page, it was still a mess of facts, an improved mess; we were still hanging onto a lot of unnecessary chaff. Rather than spending a lot of time fixing that mess of facts, I thought that if a split is inevitable (it is), then create the two pages now and fix the origins of both pages now.
 * This also calls for an expert opinion. My opinion is that we should not be confusing the origins of R1a with the origins of R1a1 and/or R1a1a; this will only make future matters and page discussions worse. In contrast, we should be setting up a framework to handle these discussions separately so as not to confuse the readers. Are we there yet? No. As more markers and better studies come forth we are going to be in a situation of having to deal with all sorts of origin phenomena. Let's break these issues into more manageable bits now. R1a1a makes the perfect example: when discussing the origin, are we talking about the origins of R1a1a7 that is in Europe, or of all of R1a1a spread from India to Iceland? The intercomparisons may be difficult now, but since Underhill retested many of the previously tested markers I think the time is appropriate. Based on a SNP clock of a conservative 1.5 ka per mutation, 6000 years separated the origin of R1a and R1a1 and 10,000 years separated the origin of R1a1 and R1a1a. Once it appeared that two long chains of SNPs separated R1a and R1a1a, I think there is an inevitability to splitting. Dealing with the origin of R1a and R1a1 is a task all unto itself, IMHO, notwithstanding dealing with the origin of R1a1a.
 * R1a was listed as needing immediate attention, and so it did. It was suggested that I pause until some new very big paper on R1a was published. It was published, and it appears there are two major discussions in that page: (1) the spread and diversification of R1a1a (R1a1a7, R1a1a*, R1a1a6) and (2) the origin of R1a1a. That singular topic shows that it is worthy of its own page. While Cardenas2008 jumped the gun, take a look at the references that can be devoted to the R1a1a page alone (either new R1a1a* or R1a1*): the list is longer than the reference list for most wikipedia pages.
 * I have not examined R1b yet, since I am tackling the most critical pages at the moment. Are you trying to draw my attention to an R1b page, does it need splitting?

If it is such an emotional obstacle to split now, then revert the page back to where it was; however, Wikipedia does not allow for content forks or page duplication. What this means is that the R1a1a page that was created must be blanked and redirected to R1a. And frankly I spent a lot of time on the cladograms, and creating a single comprehensive clade for R1a now is not going to look encyclopedic nor attractive, and I would probably leave it split along the lines on which these two currently exist. I really think Andrew should read through both pages, particularly the old version of the R1a page, and consider heavily that the origins section was not divided very logically; there was no clear delineation of the discussion of R1a and R1a1 or R1a1a, and these are three different and well separated nodes. I have waited for almost a month for the R1a page, particularly the origin section, to be improved, and it has not moved in a direction toward becoming more encyclopedic. Just old sets of contrasting facts replaced with a combination of new and old hard to follow facts. I do appreciate the work Andrew has put into this, however it's time we moved this along and got it up to snuff. There has been far too much discussion on this talk page for this article to remain a start class, and there are many other pages in the project that need work.

In response to Andrew's suggestion that articles should make sense on their own: I would point out, Andrew, that ISOGG 2009 created the problem, a situation that needs to be solved, and Underhill adds to the problem with 9 new defining mutations. The problem that exists is a natural situation that needs to be explained. Simply because the complexity is difficult for you to deal with during this brief transitory period does not mean the page is more or less coherent; it will be less coherent unless you begin organizing it along the branch structure from the lowest to the highest branch. I find mixing the discussion of R1a and R1a1 and R1a1a origin in a single page the most incoherent situation. In contrast to the R1b page that has a cohesive population structure (more or less unimodal) R1a has a multi-tier multi-modal structure.

The highest tier is R1a1a7 which is spread in a given direction across Europe, and R1a1a6 is spread across the Arabian Gulf. The next Highest tier is spread from Iceland to China to S. India and into Arabia. The R1a1*(xR1a1a) tier is spread from Greece to Scandinavia to Kashmir to North central India, to Oman and Back to Greece. There appear to be two modes also in this structure, one in NC India and Another between Iran and the Caucasus. The lowest tier (xR1a1) is spread about Southwest Asia. In discussing all of these relationships on one page one has to be careful not to confuse or mix up the discussion.PB666 yap 23:50, 11 November 2009 (UTC)


 * "I find mixing the discussion of R1a and R1a1 and R1a1a origin in a single page the most incoherant situation. In contrast to the R1b page that has a cohesive population structure (more or less unimodal) R1a has a multi-tier multi-modal structure." I take exactly the opposite position. The fact that R1a is, so far, something of a monolith argues for its being presented in one clear, concise encyclopedic entry. By breaking it apart, you're doing a disservice to readers. In a sense, it would make more sense to break apart R1b, which has been subdivided into various clades. By doing this to R1a instead, you are going to prevent readers from seeing the full picture. A big mistake in my book. MarmadukePercy (talk) 01:19, 12 November 2009 (UTC)


 * I did not split the page, someone else did, and you guys did not even notify the person who split the page that you combined it back together; however, in all fairness he did not notify anyone he had split the page either. This is the way we have been doing things around here for the last six months. R1b is reasonably geographically contained with a more or less sensible structure. I am not going to argue the point anymore. I have been waiting for a month or more now for one section to undergo a cleansing process and it still reads pretty much like a dirty laundry list: poorly organized, not encyclopedic (few figures other than the clades I added, no explanations, no graphs, no maps), nothing but jabbering back and forth in a way that is not very digestible to the common reader. The way I am going to handle this is to remove entire sections that are unencyclopedic, drop them on this talk page and see then if you guys can fix them, instead of talking about what should be done. WP:BOLD, I was, let's see what you can do. PB666 yap 04:35, 12 November 2009 (UTC)


 * You're not bold, you're simply reckless. I had to point out to you that you confused one of the points of the Underhill paper. You have no business editing this piece, let alone taking radical surgery to it, until you slow down and do your homework. Perhaps once you've done that, then we can work together to make it fit. Until then, you have no business citing some wikipedia policy about being bold. Further, the division of this piece should never have happened without consultation between the editors involved. This entire process is being done in a way that violates a central tenet of wikipedia: editors conferring and reaching consensus. I, for one, am distressed at these changes and the way they were done without adequate consultation. The idea of breaking this thing into pieces is a disaster as far as I'm concerned. MarmadukePercy (talk) 06:24, 12 November 2009 (UTC)

Just a couple of notes here I thought I would share: the frequencies of Underhill (2009) http://www.nature.com/ejhg/journal/vaop/ncurrent/extref/ejhg2009194x4.pdf

Also, the Sharma (2009) study posted here discusses R1a*, R1a1* and R1a1a (R1a1a1), R1a1b (R1a1a2), and R1a1c (R1a1a3):

"However, there is a scanty representation of Y-haplogroup R1a1 subgroups in the literature as well as in this study. The known subgroups (R1a1a, R1a1b and R1a1c), which are defined by binary markers M56, M157 or M87, respectively (Supplementary Figure 1), were not observed. In such a situation, it is likely that this haplogroup (R1a1*) is a polyphyletic (or paraphyletic) group of Y-lineages. It is,therefore, very important to discover novel Y chromosomal binary marker(s) for defining monophyletic subhaplogroup(s) belonging to Y-R1a1* with a higher resolution to confirm the present conclusion."

Hope this helps you guys in any way. Thanks! HonestopL 05:14, 12 November 2009 (UTC)

PB666, there is no emotional issue, and this is not a content issue as such. Your edits created two articles which are both garbled. You seem to think it is normal to allow articles to become unusable while you play with them and try to find what you are looking for, but this is not true, and this was an extreme case. You mentioned yourself that you should have tried working on a draft somewhere. The problem with this particular proposal for splitting, which you proposed on this talkpage (you ignored the responses), is that as you have mentioned yourself, it is difficult to find discussions which make anything but side remarks concerning the clades within R1a. All discussion we can cite mixes discussion about the broader defined R1a with the more narrow but dominating M17/M198. If you have a way to split things up neatly you certainly have not shown it with your recent edits! Blaming ISOGG or Underhill et al for publishing new information, and confusing you, or blaming Cadenas2008 for making the original article you used to move material to, is a bit unrealistic.--Andrew Lancaster (talk) 07:02, 12 November 2009 (UTC)

A few responses on particular points made above:-
 * I do not see the R1b article as exemplary, and I am not sure why this is being discussed.
 * I do not see the case of the split we did of E1b1b1a from E1b1b1 as necessarily relevant here.
 * The reality is that there is no citable source for separate discussions about the origins of the different levels of clade here. This is not just due to mistakes by authors but also due to the paucity of data about the smaller clades. Everything we can say revolves around the dominant sub-clade, and so we have to work with that reality.
 * Some of the things directing your thoughts are clearly, by your own description, your own musings and doubts, for example about age estimates. No matter how interesting these thoughts are, these are confusing the issue. We are only supposed to be reporting what is already written out there in the "real world".


 * I am tolerant of their age estimates, however it needs also to be critiqued.


 * I have no problem with the assertion that various parts of the article needed re-writing in order to avoid being confusing. So we should fix that first, at least as much as possible, as we have been doing. (As Marmaduke pointed out above, some of the most confusing wording has been recently added by you. Please be careful to finish your edits and leave the articles in a readable form when you log off.) This editing process is not helped by splitting the discussion up into two separate articles.--Andrew Lancaster (talk) 08:23, 12 November 2009 (UTC)

Replying to Andrew and taking into consideration what HonestopL has stated. I want to be brief as to not dilute the point.

This process has gone on too long for the small amount of construction that has taken place. I have no vested interest in any origin theory; simply stated, I reject placing speculation and belief as origin. R1b and R1a, as these are the only two R1 subclades, along with the placement of R1a basal branches, have everything to do with R1a's origin. The fact that Andrew still does not get this is problematic.

The flaws in the R1a page have been obvious for quite some time and to many people. The problems have not been corrected, IMHO. As long as there is destructive speculation marching around as theory this page will never achieve a B-class rating according to the criteria (a copy of which is also on my user page).

As HonestopL has stated, more discriminating mutations of R1a* and R1a1* are needed; the fact that these have not been discriminated has created bias among those involved in the current discussion. Since R1a1a* has at least three potent subclades and a cogent and encyclopedic discussion can be had, and because this process on this page has gone on for so long, I am setting a concise deadline: if by Saturday the problems that I have stated about the origins section have not been cleaned up to at least a C-class standard, I will exercise my solution and revert the R1a1a page back to my last edit and work on an improved version in my sandbox. The last version is better than the one Cardenas2008 dropped there on the 23rd. It is inevitable that the indistinction in the literature Andrew mentioned will be clarified. That is the way evolutionary biology works, and for me that is not and should not be an issue; the article misstated factual points.


 * 1) I believe that the R1a1a page can be easily made to a B-class standard, it is not a stub, it is well referenced and it has a concise and relatively clear story to tell.
 * 2) As long as major editors on this page are unclear about the phylogenetics and evolution process and obstruct major edits (i.e. deletion of sheer speculation) this page will never achieve a B-class rating. Therefore there is no longer any sense in trying to improve this page. The major edit warring on this page is now gone; it was a perfect opportunity to clean up the page and wikify it (graphically). However, that opportunity has been wasted by stalling tactics and preservation of unnecessary speculation dressed as theory. If some Fox News anchor person states there were 45,000 people at a rally when 10,000 were present, and the same person fakes photos of the rally, that may be news, but it may not be worthy of inclusion into this encyclopedia. The same standard goes for beliefs of molecular pundits.
 * 3) Under the auspices of WP:Bold, edits are not to be reverted without a consensus. I made a Bold edit of the R1a1a page; the page existed because it reflected someone else's opinion. If the R1a1a page can be reclassed and the R1a page cannot, and if a merger of these two is forced, then the edits on the R1a1a page will assume a priority status. That may solve the problem we are having here.

Saturday. PB666 yap 13:41, 12 November 2009 (UTC)


 * Actually it is Bold, Revert, Discuss -- WP:BRD. Which makes sense. If you make bold changes and no one objects, fine. But if they do, don't be surprised if you get reverted. You should then acknowledge that there is disagreement and work towards a consensus. Dougweller (talk) 20:25, 12 November 2009 (UTC)


 * You have no right to make such ultimatums, and you are exaggerating terribly. Two new articles came into consideration while we were already quite busy re-working this article. This should continue. There is no WP:deadline. No one is stopping you from taking part in this. There was only a lot of objection to your splitting the article, and making it (and this talkpage) incoherent. If you make a split of this article into two pieces which are incoherent again, you can be sure you will create a big controversy. You have not yet explained WHY the article needs to be split. The phylogeny of R1a, no matter how simple or complicated, needs to be explained within one article. If you split it, it will then simply be needed in both those articles.--Andrew Lancaster (talk) 15:53, 12 November 2009 (UTC)


 * For the R1a1a origins section only this segment from the R1a page needs to be adopted.

"*R1a1* (old R1a*) SRY1532.2/SRY10831.2 positive, but M17 and/or M198 negative. 1/51 in Norway, 3/305 in Sweden, 1/57 Greek Macedonians, 1/150 Iranians, 2/734 Ethnic Armenians, 1/141 Kabardians. also found 13/57 people tested from the Saharia tribe of Madhya Pradesh, and 2/51 amongst Kashmir Pandits."

However, for the R1a page, a whole slew of factual inaccuracies need to be dealt with, including the treatment of Central and West Asian origins as a Theory. Why argue with me about this? Fix the page. I told you, I think the most constructive means of fixing the origin section is to split the page. If you have a better solution, WP:SOFIXIT, and if you can't it means you don't have a solution in mind. I have not set a deadline; what I am doing is giving you guys an opportunity to correct long-held problems with the page. You have all the info that you need; it should already be done. Both of you guys should read the A and B-class guidelines, follow the examples of B-class pages in the MCBR project and make some tough decisions about what is worthy of being kept and what needs to go. Enough discussion, spend your time improving the article. PB666 yap 16:01, 12 November 2009 (UTC)


 * I do not get your point. The longer origins sections are mainly about R-M17/M198. We have no sources to write much about the parent clade as a separate entity. If you think it is not clear which bit of the origin section are about which this is understandable given the new articles which we are trying to fit in. But this will not be fixed by splitting the article.--Andrew Lancaster (talk) 16:06, 12 November 2009 (UTC)


 * Then why are we retaining, in fact quoting in big quotation marks, sources that make claims that are completely unfounded based on the facts as we know them right now and have known them for at least a year? I disagree with you: prior to Underhill there was substantially adequate information to define R1a points of diversity (places with higher density of R1a* and R1a1*). Central Asia only has R1a1a, West Asia has only R1a1a. Only N Europe, SE Europe, SW Asia and India have levels of expected diversity; among these sites only SE Europe and SW Asia have the level of R1a* diversity, by itself, to suggest places of origin. The data was already there; the only thing Underhill did was better parse R1a* and R1a1*. In addition to R1a one can also examine, as an outgroup, R1b diversity, but the R1b PMRCA is based on a belief also and nothing more. I cannot accept a postulation of origin of R1a in Central and West Asia based on a belief and no data, can you? Is this something you want to project to casual readers? Likewise I cannot accept a conclusion of R1a or R1a1 origin in India based on 2 data points, both R1a1*, and no STR or SNP data to back these up. I look at this purely from an expert perspective and based on a cumulative knowledge of 25 years of flawed thinking in aspects of molecular anthropology: speculation based on poor or missing data, 9 out of 10 times, is dead on arrival. The Sharma et al. paper did do STR analysis, but the only thing they looked at was STR within their two local groups, the Kashmir and Saharia. They did not compare STRs with any other R1a1* currently identified, and as stated they found no evidence of R1a* in India.


 * You have to learn to be far more critical of these results. Frankly I am glad that wikipedia disallows the demonstration of Cline maps. Sharma Figure 1 and Figure 4 are so troubled by ascertainment bias I am surprised it got published. As I stated, you guys really don't want someone who is so critical of Y chromosomal studies (a 20 year disaster in the making that has only recently improved). From a global molecular anthropology story, the only story here is R1a1a and there are still relatively large facts. For R1a, the story of the page is very bad theories based on 'absence of evidence' and very confusing and inadequate discussion of the facts. Marmaduke criticized me for chopping up the origins. If it were not for Underhill and the new R1a1a factoids which I laid out on its own page, the Origin section, on first inspection, needed to be deleted and completely started from scratch; it was a train wreck and still is.


 * "West Asia has only R1a1a"? Really? Have you actually been reading anything about this subject or have you now decided to work based on feeling? Underhill et al wrote that "The most distantly related R1a chromosomes, that is, both R1a* and R1a1* (inset, Figure 1), have been detected at low frequency in Europe, Turkey, United Arab Emirates, Caucasus and Iran". Your posting above is nonsense. Sharma was a massive survey, not just a small one of two ethnic groups. R1b origins have never been mentioned by me and they are not mentioned in the article etc etc. Slow down and take a deep breath.--Andrew Lancaster (talk) 19:31, 12 November 2009 (UTC)


 * You know what I was trying to get at. Europe (Europe, Turkey), Southwest Asia (United Arab Emirates, Caucasus, Iran). You wanna call Southwest Asia West Asia, fine. PB666 yap 23:24, 12 November 2009 (UTC)


 * I agree with mr.lancaster...the Sharma (2009) study was pretty big. here are some the frequencies published from that study: http://img208.imageshack.us/img208/6250/88127356.jpg hope you guys can solve the problems at hand. good luck! HonestopL 19:11, 12 November 2009 (UTC)


 * The inconsistencies are even bigger. R1a* = R1a1*, R1* (not R1b*)?, their results do not agree with Underhill even though they typed in many of the same regions. Old marker sets, failed PCR?

Concerns of one user, Pdeitiker
PB666 has now placed comments within the text, eccentric as usual, but I will try take this as a chance for discussion!
 * . Fact tag placed at "On the other hand, until 2009 claims concerning which R1a populations show signs of being oldest varied greatly between different articles, with analyses focused on Asia proposing Asia to be the origin, and articles focused on Europe, arguing the opposite." My response. This sentence was brought in to replace a long rendition of all the old articles which had been there previously. As usual, when people are in a bad mood on Wikipedia, they force each other to bloat the article with over-explanation. This is however one habit Pdeitiker has usually been highly critical of. So, do we need to put in a full literature review or not?
 * Disputed tags placed on 6 sections, all referring to the talk page. All the sections are short dry recitations of theories to be found in the literature. Looking over this talkpage I find no claims being made that the literature does not contain these theories. So I presume these tags must be removed, or else justified more clearly.--Andrew Lancaster (talk) 20:04, 12 November 2009 (UTC)
 * Although I've been guilty of it before, tags like that are clearly meant to be followed up by the editor doing the tagging starting a discussion on the talk page. So, either they should be removed or the editor who tagged the sections needs to start discussions on each section. Dougweller (talk) 20:19, 12 November 2009 (UTC)


 * Concerns are mentioned above and also highlighted within the raw text. These tags are not just about the claims but the clarity of the claims. For example, with Central Asia origins: what exactly was typed by those making the claims, what does that mean in the modern lingo, and say such and such studied the problem in 2006 and believed all kinds of wild stuff. If it is only belief (because of poor sampling, sampling bias, old markers, faulty techniques) it is not worthy of a theory section. Again see the definition of a theory: it has to be upholdable under most situations, whereas a hypothesis only needs a few seeds of evidence. We need to maintain a consistent and logically verifiable standard here; not every morsel of pundit wisdom belongs on a wikipedia main page, and definitely not in superquotes. Most of these sections are not Theories, simply speculation based on old data and insufficient comparisons. Underhill was a coauthor on the Mirabal study, but as the lead author on this new study does not argue that Central Asian origin is likely; quite the opposite, they take the argument that the origin of R1a1a is more likely South Asia and take all but no stance on R1a* or R1a1* even though they have the most complete data sets. That speaks volumes; these incompatible theory sections must go. A couple of sentences for each one is all that is required. PB666 yap 23:46, 12 November 2009 (UTC)
 * If I am wrong about these theories then point me to the new-improved and correct definition of a theory. The way I look at this at current R1a*2009 SW asian hypothesis, SE european hypothesis. R1a1*2009 SE European hypothesis, SW asian hypothesis, South Asia hypothesis. R1a1a*2009 South/West Asian origin hypothesis, all kinds of other speculation. I am gathering this from the most recent papers themselves, anyone disagree, if not why do we have 4 theory sections describing god-only-knows what origins. Keep it real folks.PB666 yap 23:46, 12 November 2009 (UTC)

I have been trying to understand the concerns, and I see one thing that keeps repeating: Pdeitiker is concerned that all discussion (age estimates, migration theories etc) should explain which PART of the R1a clade was being discussed in any particular article. I do of course understand this concern, but with all due respect I think this is totally ignoring the challenge that reality confronts us with on this subject, which is by the way not all that difficult... It is my belief that we can avoid most misunderstanding simply by adding the odd extra word to remove any vagueness. I've been trying to do so this evening, but have not completed the task. Do others think this is a worthwhile effort?--Andrew Lancaster (talk) 20:50, 12 November 2009 (UTC)
 * R1a is a name which gets used both casually and in published journal articles to refer to different but related clades. People might be looking up ANY of these clades, and expecting to find an explanation. (Therefore by definition the article must distinguish and compare the DIFFERENT meanings.)
 * On the other hand, this gives less problems than you might expect because all discussion of these clades, (statistics, age estimates etc), is dominated by one sub-clade, the one defined by M17 and/or M198 (which is now officially R1a1a).

Self-indulgence about European male genetics
The meter on the blogosphere and Wikipedia always goes off the charts when there is a publication about European male history. No such excitement occurs when publications concerning female mitochondrial DNA lineages are published. Much worse for Non-European lineages. A case of WP:BIAS, I suppose. However, R1a is an important haplogroup, at least concerning prehistory, and who wouldn't be obsessed with their own genetic history. Nonetheless this obsession is what causes some of these disputes. Within a few days there were over 140 comments on Dienekes blog concerning the Underhill publication, which is unusually high, most of it is bloggers talking past each other, almost everyone was an expert. I don't have a stake in this haplogroup so I could be of some use here. Wapondaponda (talk) 21:28, 12 November 2009 (UTC)


 * Constructive comments welcome.--Andrew Lancaster (talk) 21:36, 12 November 2009 (UTC)
 * BTW I want to point out that whatever the faults of this discussion, your implied accusation that this is a "typical" POV biased argument as we often find on this type of article is not really fair. I do not detect any issue being made about particular regions or ethnicities or similar. PD seems to be frustrated at the messiness of the literature we must cite as much as anything. Both of us share a concern with trying to finish off a longer term project of tidying up an article that has certainly had its share of problems in the past. I think the size of the changes needed has been the source of the problem, with PD wanting to push faster than Marmaduke and I think is wise. I see no real content dispute at all worth speaking of. That does not mean your perspective might not help.--Andrew Lancaster (talk) 21:51, 12 November 2009 (UTC)


 * I have no stake either, it's simply time this article was brought up to a standard. And if you look at the mitochondrial eve page, I have made a large number of edits based on the last 3 important publications. Which, to tell the truth, is what I would rather work on. The crappy state of Y chromosomal studies, IMHO, gets far too much coverage. How hard is it to sequence a Y chromosome in 2009? Geeze. The fact we are retaining decade-old opinions based on an incomplete marker set just goes to show how difficult it is to remove the chaff from the wheat (or the other way around from GFers). PB666 yap 23:11, 12 November 2009 (UTC)


 * Andrew you said wait until this new big paper is out, its out now, but I notice your latest edits are removing much of the new material added from that paper, backtracking only makes the future direction for the page all the more difficult.PB666 yap 23:26, 12 November 2009 (UTC)


 * Which material from the new paper have I removed?--Andrew Lancaster (talk) 07:30, 13 November 2009 (UTC)

Section (Central Asia origin)

 * 1) Literature in support is an interpretation of three relatively old papers that used old markers.


 * That you pointed out Underhill does not appear to be supporting this speculation.PB666 yap 15:18, 13 November 2009 (UTC)


 * 1) "This position is also considered likely by Mirabal et al. (2009) after their larger analysis of recent data.[vague]"
 * I have the paper right here.
 * Six groups I will label: 1- Arkhangelsk 28, 2- Khanty 27, 3- Komi I 54, 4- Komi P 49, 5- Kursk 40, 6- Tver 38
 * R* (not M124, R2; not M173, R1) - 1-0, 2-0, 3-0, 4-0, 5-0, 6-0
 * R1* (not SRY1532.2, R1a1*2009; not M343, R1b) 1-0, 2-0, 3-0, 4-0, 5-0, 6-0
 * R1a* (SRY1532.2; not M198, R1a1a*) 1-0, 2-0, 3-0, 4-0, 5-0, 6-0
 * R1a1a* (M198; not M56, M157, or M64b) 1-5, 2-4, 3-16, 4-16, 5-21, 6-22
 * Haplotype variance R1a1a* 1- .271, 3 - .226, 5- .191, 6 - .280,
 * Compare South India - 0.505, South Pakistan - 0.475, West India - 0.426,
 * This is what I call a 'Daily Show' moment [Pause and smirk for a Stewart effect]. Based on what data do they affirm the quoted statement?


 * Ancestral haplotype (R1a1*2009) in Central Asia. None
 * SNP variants of R1a1a detected in Central Asia. None
 * Rank of central Asia types in STR diversity for R1a1a types. Intermediate.

Therefore the conclusion is that R1a1a evolved in Central Asia. Good thinking, completely wrong and out of step with molecular phylogenetics, but who needs that, we are talking about the Y.


 * I want you to show me what credible evidence suggest that Wikipedians should promote this point of view as a theory? That is a choice by editors on how they treat this, not the authors of the paper. And BTW we are allowed to critique this based on what other authors have said. Basically a POV has promoted this to a theory, and then failed to adequately critique the speculation, so . . . . .PB666 yap 15:23, 13 November 2009 (UTC)

3. Sheer speculation does not belong in Wikipedia, if god says it, it still does not belong in Wikipedia unless its part of a biography on god. PB666 yap 00:15, 13 November 2009 (UTC)

I will repeat the whole passage found in the Discussion (i.e. speculation) part of the paper. First let's go through the results that support the discussion: "It is readily observed that the diversity of Asian haplotypes is far greater than that found in the European population". Last month yes, today no. The Central Europeans and proximal Eastern Europeans have the highest level of SNP diversity of any R1a1a-bearing group; M458 diversity stops at the Ural Mountains. By SNP diversity, Poland is more diverse than any other place for R1a1a and the Persian Gulf is the most diverse with regard to the broader clade. So it is not observed any more. In addition both R1a1a* and R1a1* are found scattered about Europe, in the North, in the Caucasus, in Greece and in Turkey. STR diversity is greater in Southern Asia; however, here are the 95% confidence intervals for Northern India and Southern Pakistan: 2.1 to 58.7 Ka and 0.4 to 53 Ka. Not real confident.

"There are several clades exclusive to Asia groups; however, the same is not true for Europeans." False, M458 is specific to Europeans and so is M334. "The microsatellite distributions are especially interesting in Turkey (the only Anatolian group included), given the plethora of haplotypes present in the population." Variance = .298, for Serbia .295. These are pretty much the results they draw on to make the following 'discussion'. Statements that argue from ignorance, as many of the papers cited in the main page do, are generally proven wrong by later data, exactly as above.

"Alternatively, Sengupta et al [2006] and Wells et al [2001] have proposed that the haplogroup originated in Northwestern India and in the Central Asia steppes [both places, at once, a rare homoplastic Y mutation (actually 7 homoplastic mutations but who is counting)], respectively, given the wide variety of R1a1 Y-STR haplotypes throughout the areas.[..talk about age estimates we all agree are useless.....].These results along with time estimates for several other populations across Europe and Asia support the finding of Sengupta et al regarding the central[Sic] Asian origins of the mutation."

Note STR based dates for central Asia are ~10 kya compared to ~18 kya for India, Pakistan, Serbia and Turkey.

Now I am not so great in English, however isn't central supposed to be capitalized if the meaning is Central Asia, and lower case if it means central, for example core, or constitutive? Was this double talk (note the vague template at the end of the sentence on the Main page, put there for a reason)? PB666 yap 02:34, 13 November 2009 (UTC)

Shall we move to the next section. . . . . . . . . . . One last thing. From Underhill et al. "Also noteworthy is the drop of R1a1a* diversity away from the Indus Valley toward central Asia (Kyrgyzstan 5.6 KYA) and the Altai region (8.1 KYA)." Clearly, Underhill does not believe R1a1a originated in Central Asia.

Eastern Europe
First off, there is no R1a* (xSRY1532.2), so the question is: should we be talking about R1a origins, generally, when basal diversity is outside of Europe? If R1b or R1* showed an increasing presence and diversity in Eastern Europe, there is the possible use of the Carpathian Ice Age refuge; however, Andrew says don't bring R1b into the discussion. Without R1b diversity there is no basis to place basal R1 in Europe, and without basal R1 in Europe it is unlikely that R1a evolved in Europe. Let's see what we have got.

"Researchers using this estimation method therefore believe any Bronze Age or more recent dispersals affecting modern R1a diversity must be specific to certain sub-clades, such as R-M458."

Hmmmm, this is what Underhill stated. ",whereas the R1a1a* diversity declines toward Europe where its maximum diversity and coalescent times of 11.2 KYA are observed in Poland, Slovakia, and Crete." This does not discuss the contribution of M458 and M334 diversity. European Ages.
 * Epipaleolithic 16 KYA to 11.66 KYA (10 KYA in parts of Europe).
 * Mesolithic 10 KYA to <7.5 KYA (depending on the people)
 * Neolithic 9 KYA to 3.4 KYA (Balkans to Ireland)
 * Copper/Bronze Age 5.3 KYA to 2.8 KYA (Europe)

Conservatively Underhill would argue for a Mesolithic origin, and they discuss Neolithic expansion of R1a1a* in central Asia. "Haplogroup R1a1a7-M458 diversity and frequency are highest in River Basins known to be associated with several late and early Neolithic cultures." That would be LBK culture. However, as discussed here, their molecular clocking technique is prone to excessive variance. So we adhere to the implication that the Bronze Age datings are the latest time R1a1a could have reached Europe. So between the Epipaleolithic and Early Bronze Age R1a1a* reached Central Europe (11.2 to 4.6 Ka).

"Researchers using this estimation method therefore believe any Bronze Age or more recent dispersals affecting modern R1a diversity [R1a1a* diversity] must be specific to certain sub-clades, such as R-M458." Right, but the introduction to that is that R1a1a* entered Europe between 11.2 and 4.6 Ka.

Bronze Age (Indo Europeans, Indo-Aryans, Kurgans and Horses) I thought we were going to get rid of the Kurgan hypothesis? And now we are discussing horses? What is the evidence for horse use before the early Bronze Age? When was the IE expansion supposed to have taken place, 4.5 Ka? R1a1a had already spread into Europe by that time. Spread from Europe to S. Siberia is evidence of Indo-Aryan gene flow?

Historic era (Slavic languages): Movements within Europe I have no big problem with this section; however, the evidence is based upon expansions in Lithuania and the Czech Republic, and no R1a*2009 or R1a1*2009 has been uncovered in either place, so that any expansion of R1a that we are discussing is more specifically R1a1a*. Otherwise the discussion is vague. How R1a1* got into Scandinavians is a matter for which we have no information. BTW the period you are discussing is called, properly, the Migration Age or Migration Period. I removed the 'factual accuracy' tag and replaced it with a 'vague' sentence tag.

South Asian Origin
I have no problem with this section except that it is once again vague in several places and is to be considered factually inaccurate. In my view South Asian and West Asian origin should be combined. Segregating these two sections is like talking apples and oranges.

Southwest Asian
A broken clock is correct at least twice a day.

User Pdeteiker
I don't give a damn how many posts you make pointing out various 'data' and 'subclades.' Because you don't know how to write a proper sentence in English. You also don't do your homework. You confused a central point in the Underhill paper concerning the Eulau remains. So your language skills are deficient; your science is lacking; and like many doctors and scientists, you hide behind a thicket of verbiage. For all your protestations about the value of ancient y-Dna, you confused the only mention in the Underhill report of a-Dna and got it backwards. My point is this: Your arrogance notwithstanding, you need help conveying your ideas. You also need to tone down your gibberish. Perhaps you're a scientist. That's nice. I know some scientists and I know some geneticists. Take a deep breath, learn to interact with other breathing humans and cooperate in this process. You are becoming the obstruction. MarmadukePercy (talk) 04:23, 13 November 2009 (UTC)


 * MamrudakePycre, I have never seen such a reduction and removal of current material as has occurred in this article. The defining mutations between R1a and R1a1a* are important, since they are a temporal indicator of the distance between the branch points. Andrew feels it is necessary to remove these to allow his sense of the article to remain. In addition he does not want to reduce inaccurate statements or at least condition these inaccurate opinions in the article. This verifies my point of view, that the article as it stands is encumbered by a structure of three phenomena that sometimes masquerade as one or two phenomena; this is a major flaw in the logical organization of the article. At best, this article could see two major divisions in the origin section that discuss papers solely devoted to the origin of R1a (and R1a1*) by the old criteria, and R1a1a* by the new criteria. However, if this cannot be done I will split the page.
 * As for interacting one editor created a new page and I modified it. Andrew started hyperventilating and then destroyed that page, he made no concerted effort to merge my edits, and guess what, the Science of the page has fallen as a result. What he has done verifies my point of view, it is difficult to separate two complicated structures in the context of a single page without confusing the issues or confusing the readers. Maybe I am wrong, maybe by some miracle, after a month and a half of this page being sucky, all of a sudden you guys can fix the problem. And Marmaduke, where are your improving edits on the Main?PB666 yap 14:52, 13 November 2009 (UTC)


 * PD, you are not reading what people are writing. Marmaduke and I keep saying that we object to your illegible editing, (EDIT: which has never been thought through or checked), and not ANY points of content. Apparently you can not believe this? If you see something wrong with the "science" no one is stopping you from trying to communicate it. Please do attempt to communicate.--Andrew Lancaster (talk) 15:26, 13 November 2009 (UTC)


 * Apparently, Andrew, this is not the case: you did find in that 'illegible editing' exactly the points I was trying to make. There are two ways of looking at this: my writing problem, or the fact that there were so many tweaks that needed to be made. Don't take me wrong, those tags I put on the article needed to be placed in October; however, since there was the appearance of denying that there were problems, the tags and the 'unfashionable' critiques were necessary to bring your attention. Despite the terse discussion I think you realize the effort it takes to move a 'roadkill' article to an alpha'd class status. Despite what happens on the Talk page, what is important is that the Main page improves as a consequence. I defend my writing under WP:BOLD; it got things, finally, moving.
 * That's why I am not complaining about MarmadukePercy's clearly ad hominem attack that violates WP:talk page guidelines. I sent you the data from Mirabal et al. (2009) and even put it in a nice table, so if the form bothered you, go check your talk page history. You should have some apology for me having to repeat the information that I already gave you once. PB666 yap 18:40, 13 November 2009 (UTC)


 * I could argue a similar thing, and I think I would win the argument, but that is not important.--Andrew Lancaster (talk) 20:32, 13 November 2009 (UTC)

South West Asia vs. West Asia
Hey Pdeitker I know you prefer SW Asia as well but are you saying West Asia is a Wikipedia (WP) standard? I'm just not following the whole logic behind the stance. Usually in the field of Geography the Middle East is referred to as South-West Asia rather than West Asia. Even archaeologists who specialize in Middle Eastern archaeology will refer to their area of speciality as "South-West Asian archaeology". West Asia is still rather ambiguous/confusing. Geog1 (talk) 21:11, 13 November 2009 (UTC)Geog1


 * Well, you're right, it's more intuitive. What does everyone else think, maybe we can alter the wiki-standard? When I think of West Asia I think of the Ural mountain region to the Caucasus. Andrew pointed out last night that that was a grossly horrific and terrible to have. lol. However if you type Southwest Asia you are redirected to this page. Some terribly important Wiki-Cabal has decided that this is the case as it should be. PB666 yap 00:04, 14 November 2009 (UTC)


 * Geographic terms have certain Wikipedia policies. We need to pick the most commonly used and easy to understand name, and yes preferably it should link to the most relevant article. Is Middle East perhaps the most clear?--Andrew Lancaster (talk) 12:46, 14 November 2009 (UTC)

Regarding GA nomination
I noticed that this article has been nominated -- I'm not quite ready to sign up to review it, but thought I would give my first reaction. It seems to me that the article is not accessible to a broad enough class of readers. It is easily possible that a reader interested in anthropology will come to this article, but such a reader won't be able to make sense of it, and doesn't get any guidance toward the necessary background. The main background needed is to understand the special features of inheritance of the Y chromosome, and what a haplogroup is. Most of the related Wikipedia articles are totally unreadable for non-specialists -- the genetic genealogy article is the most helpful, so a pointer to it would be a start. The best thing, though, would be for this article itself to sketch the basic facts that a reader needs to know. Looie496 (talk) 18:50, 15 November 2009 (UTC)


 * Thank you for your input. This has been my feeling all along. The jargon needs to be removed, and the article needs to be made accessible to the average reader. (See my comments above.) Some editors have done a good job in inserting the science. Now the language needs improvement. MarmadukePercy (talk) 19:17, 15 November 2009 (UTC)


 * Thanks for the quick reply.PB666 yap 02:04, 16 November 2009 (UTC)

Next round of concerns
PB666 has posted notes directly into the text, after I adapted it in many places today, in an attempt to meet his concerns, so now I'll respond to those:-
 * . "Accuracy" tag added to section on Central Asian origins. Comment also made: "not the only problem with that speculation, this section title is a contradiction, which current authors calls it a theory, what aspects of diversity are missing from Central Asia.--Why does " Underhill et al. (2009) took to the data to be consistent with Western Asian origins"." I get the impression that this remark has something to do with some subtle methodological point about the definition of a "theory" as opposed to a "speculation" or something like that? And of course there is the point made previously, that PD wants reference to this theory cut out (even as a non-leading theory) because he feels it is not worthy. But why call this an "accuracy" issue then? Basically this remark is garbled and I need help to understand the point. This is a great example of the real problems now happening on this article. There is no excuse for this way of writing. It is simply showing no big interest in communication.--Andrew Lancaster (talk) 15:40, 13 November 2009 (UTC)


 * I'll try to pick up the thread of the discussion about this where it occurred above:


 * 1. Version when PD posted first extended comments above:


 * 2. Our "discussion" at that stage (re-factored with easier indents to follow)...


 * PB666 yap 00:15, 13 November 2009 (UTC):-
 * I have the paper right here.
 * Six groups I will label: 1- Arkhangelsk 28, 2- Khanty 27, 3- Komi I 54, 4- Komi P 49, 5- Kursk 40, 6- Tver 38
 * R* (not M124, R2; not M173, R1) - 1-0, 2-0, 3-0, 4-0, 5-0, 6-0
 * R1* (not SRY1532.2, R1a1*2009; not M343, R1b) 1-0, 2-0, 3-0, 4-0, 5-0, 6-0
 * R1a* (SRY1532.2; not M198, R1a1a*) 1-0, 2-0, 3-0, 4-0, 5-0, 6-0
 * R1a1a* (M198; not M56, M157, or M64b) 1-5, 2-4, 3-16, 4-16, 5-21, 6-22
 * Haplotype variance R1a1a* 1- .271, 3 - .226, 5- .191, 6 - .280,
 * Compare South India - 0.505, South Pakistan - 0.475, West India - 0.426,
 * This is what I call a 'Daily Show' moment [Pause and smirk for a Stewart effect]. Based on what data do they affirm the quoted statement?
 * Ancestral haplotype (R1a1*2009) in Central Asia. None
 * SNP variants of R1a1a detected in Central Asia. None
 * Rank of central Asia types in STR diversity for R1a1a types. Intermediate.
 * Therefore the conclusion is that R1a1a evolved in Central Asia. Good thinking, completely wrong and out of step with molecular phylogenetics, but who needs that, we are talking about the Y.


 * Andrew Lancaster (talk) 08:23, 13 November 2009 (UTC):-
 * Here is what the authors wrote (emphasis added): "Network age estimations from this study suggest that two separate groups exist within R1a1 with similar ages for populations found at the western (Serbia 17.3±5.4) and eastern (South Pakistan 18.7±4.7) poles of the expansion. These results along with time estimates for several other populations across Europe and Asia support the findings by Sengupta et al18 regarding the central Asian origins of the mutation. NETWORK projections also support an Asian origin to this haplogroup, given the plethora of STR haplotypes present in these groups versus those found in European populations (Figure 4a)." FWIW my take on this is that it is a bit like your opinion mentioned below, that we can not distinguish South Asian and West Asian origins. These authors are also saying that we can not really distinguish Central Asian from South Asian either? In any case their words are quite clear, and the summary I proposed for the article is not "vague" but also quite short and clear. The article is also not a "fringe" article by any means. How can we ignore it? Your remark about no basal haplotypes is not very convincing because both these authors and readers who know the subject would know that R-M173* haplotypes are fairly widespread, and not yet widely typed for M420. Furthermore basal haplotypes are extremely rare in pretty much all places - just singletons and small local clusters - so it is not as if there is a strong case to be made for one part of Asia against another so far.


 * PB666 yap 15:23, 13 November 2009 (UTC):-
 * I want you to show me what credible evidence suggest that Wikipedians should promote this point of view as a theory? That is a choice by editors on how they treat this, not the authors of the paper. And BTW we are allowed to critique this based on what other authors have said. Basically a POV has promoted this to a theory, and then failed to adequately critique the speculation, so . . . ..


 * ...so indeed. The most obvious question is whether PD has really considered the changes made now to the passage...


 * 3. Version as it has been adapted today by me:


 * If this is what PD has really read before writing the above, then I do not follow it so far. What the text now says is that one article considered Central Asia possible, but only in a way that did not distinguish the case from Asia generally. Is it just a simple case of PD asking for a theory he does not like to be removed from view?--Andrew Lancaster (talk) 16:07, 13 November 2009 (UTC)


 * No, Andrew, you have made your point, it is substantial enough to be kept, however it is controversial enough to also be critiqued and hold a lower status. What is missing in Central Asia that Europe has? Defining (diversifying) SNPs. What is missing in Central Asia that India has? High STR diversity. How did Underhill delineate South Pakistan and India from two populations in Central Asia? By age estimates. One might not be faulted for pointing out the first, however considering the last two criteria, failure to mention these increases the POV of the section.PB666 yap 16:19, 13 November 2009 (UTC)


 * The new version gives it an extremely low status I think, and also it now explicitly casts doubt over whether there is any argument for Central Asia as such, distinct from Asia in general. What more do you want? I honestly do not know why the Mirabal article inserts the term Central Asia, but maybe the authors were thinking a bit like Cordaux was, simply looking at "geometry" like I mentioned above: if you have two poles of recent expansion, then you look in between those poles for any areas that also look pretty R1a rich. Also keep in mind that it is hard to be sure about anything subtle, like paraclades and STR variation, in Central Asia given the potential for massive movements and founder effects, as well as no data for Afghanistan or indeed most of Iran. But I am just speculating. The point is that:
 * they did use these words,
 * these are not fringe authors,
 * this is not an old article.--Andrew Lancaster (talk) 16:30, 13 November 2009 (UTC)


 * . "Accuracy" tag added to Phylogeny of R1a section. Comment also made: "Removing newly defined SNPs which makes clades look closer together is a POV edit. We either can choose to place no mutations in the cladograms or place all mutations, cherry picking mutations is POV". I can reply to this very simply: this is normal procedure. EVERY Y haplogroup can be distinguished by large numbers of equivalent SNPs. As has been discussed before, the exact number of these does not reflect an accurate measure of age gaps because there has not yet been enough standardized sequencing to know if we are comparing apples with apples. Many SNPs are just found by chance while looking at something else.--Andrew Lancaster (talk) 15:40, 13 November 2009 (UTC)


 * If the R1a1a page existed, it would be customary to add all the defining mutations in the Haplogroup box. Since you have decided to scrap the page, the onus is still on you not to lose or hide this information. If you choose to hide the information then that becomes a criterion for creating a new page: since you find it normal procedure to delete this information, it is thus normal procedure to create a new page to prevent information from being lost and allow a balanced consideration of the information. On which of the E1b1b or subpages have you suppressed defining mutations (doesn't seem to be normal procedure there)? Even on the E1b1b1a page you have mentioned all defining mutations. In addition, the removal of the R1a1a cladogram also removed one of the subbranches, which I see as a form of information suppression. The M334 mutation is as important as any other R1a1a mutation.PB666 yap 16:28, 13 November 2009 (UTC)


 * I think you simply do not realize how many mutations are known, and how randomly they are discovered. You are simply wrong about this. The E1b1b article does NOT show all the phylogenetically equivalent SNPs, and neither does any article I can think of! For example where we mention M173 in this article should we mention "M173/P241, P225, P231, P233, P234, P236, P238, P242, P286, P294"?
 * It is not only normal procedure, it is also absolutely necessary. You would need to insert dozens of SNPs all over the place in all the Y haplogroup articles. I want to point out that already with all the M17/M198s I have been adding all day, you are forcing this article now in the exact same bloated direction which you have criticized so often when you've remarked on articles that have been a victim to edit warring. Nonconstructive editing tends to push other editors into putting in more words than necessary and not focusing on more important stuff. --Andrew Lancaster (talk) 16:37, 13 November 2009 (UTC)


 * I am very much aware of that fact, this is something I keep reminding you of, but to say the least more defining mutations is better than fewer. Do you consider the article, at this moment, to be bloated? If so, add comments to the comments section as to why. See section below:


 * Andrew, I would remind you that 'argument from ignorance' anti-logic as a precept for 'absence of evidence is not evidence of absence' logic generally does not improve by removing information. One doesn't need to place M17, M198, ..., ..., everywhere in the article. However, in the phylogenetics section you can state: Currently, 7 mutations ([7 mutations]) distinguish R1a1 from R1a1a; however, only two mutations, M17 and/or M198, are used in Y-DNA typing to define the R1a1a subclade (R1a1 in older literature). The same can also be done for R1a1; in this way you are not hiding information, but at the same time you are not bloating the article. As per working to re-add M17 and M198, that improves the accuracy, but in terms of readability I would recommend in parentheses adding (R1a1a2009), that is the nomenclature that is most picked up when _I_ read the article and so I am biased to wanting an accurate phylogenetic tag. If this does not upset you greatly I will add the tag myself for clarity.PB666 yap 17:44, 13 November 2009 (UTC)


 * Two things. Concerning finding solutions by putting notes about the notations and assumptions once only, this is the direction I thought we were going until you went massively unilateral. Concerning the superscript, I make no conclusion yet. To me it does not seem obvious for a normal reader at first sight.--Andrew Lancaster (talk) 19:55, 13 November 2009 (UTC)


 * A prescript might be better, for example 2009R1a1a. BTW there are not that many outstanding issues to get this article to a class-B status, if we put our elbow-grease into this it could be done by Monday. Higher class articles are easier to protect by Administrative action relative to low class articles. Something to think about.PB666 yap 20:24, 13 November 2009 (UTC)


 * What's wrong with using something that will make sense to any person with a reasonable education and reading skills like "R1a (M17/M198)" or "R1a (old)"? You sure are keen on jargon.--Andrew Lancaster (talk) 20:31, 13 November 2009 (UTC)


 * Old is relative. Remember the point about mutations still missing, which happens when the next round of SNP hunts is done, and there is only a month left in 2009! (famous last words). I would desire a tag that is specific for a current version. Jargon is good; sometimes it keeps us honest: if one places a specific tag and then spews speculation, that is a leverage point by which we can correct the person's POV. The problem I have with R1a M17/M198 is that it is two nomenclature revisions from current status. I would feel much better with R1a1a (M17/M198). Again, this is my opinion from watching HLA for the last 25 years: nomenclature only seldom reverses itself. For example the B*0701 allele was not occupied when found to be an erroneous sequence of B*0702, they simply retired the number. In the immediate sense, if you want to use R1a1 (M17..) or R1a1a (M17 . . .) either is fine.PB666 yap 20:51, 13 November 2009 (UTC)


 * BTW there are not that many outstanding issues to get this article to a class-B status, if we put our elbow-grease into this it could be done by Monday. Higher class articles are easier to protect by Administrative action relative to low class articles; we would have to worry less about the bannable pests. If we establish clear goals and meet those goals, reversion becomes much more justifiable. Something to think about. In terms of bloating, the article went from a size of 90 kbts to its current 53 kbts, actually smaller when we consider text. However, the impetus should be on increased wikification of the article with appropriate graphics to make it more attractive to a general audience; while this is not a requirement for class B articles, given the dry nature of the topic it is really helpful. Concerning the split, I think the jury is still out on the issue, but recent actions have deferred its necessity. I think you will see future actions to split the page, and if this is your dislike, then I would certainly make sure the split-reasoning is minimized to its lowest level without belt-busting the article.PB666 yap 20:51, 13 November 2009 (UTC)


 * Concerning the need for superscripts I see that your ideas are focusing on a hypothetical future, so let's watch for it. Personally I think mutational names will take over. Concerning "R1a1a (M17/M198)" the article is now full to bursting with such stuff. Anyway, if we have common sense, like we have generally had between us, we'll always find a way. Concerning classification systems I do not want to criticize you for doing something that needed to be done by pointing out that you are effectively citing yourself here, but on the other hand it is not what drives all good faith Wikipedians. B class would be good but there is no WP:deadline, particularly when you have other stuff to do also.--Andrew Lancaster (talk) 21:09, 13 November 2009 (UTC)


 * Andrew, I have a deep dark secret to tell you, so secret that just the words may burn your ears for a wwwwwhhhhhhhooollllleeee millisecond. Wikipedia:WikiProject Human Genetic History - participants - Andrew_Lancaster (Shhhhh quiet, keep it to yourself, burn your CRT and throw it in a live volcano). You're such a super secret member you forgot to tell yourself. That does entitle you to make comments about HGH articles and judge the merits of the Article by the standards. More importantly, the-powers-that-be at Wikipedia (whom I can name but you can find a reference on my user page) would like it very much that you did comment on articles and did rate articles.PB666 yap 00:14, 14 November 2009 (UTC)


 * Please read what you reply to before you write rude replies. I said I have no problem with you doing ranking work per se, but that I find the discussion style odd (to say the least) when a person in the middle of a disagreement about an article goes out, makes rankings and then immediately starts posting remarks on talkpages where he is in disagreements, citing those rankings as if they were done by someone else. --Andrew Lancaster (talk) 13:47, 17 November 2009 (UTC)

Status change
Due to recent edits on the main page and in concert with the comments I made on 14 October 2009 and 22 October 2009 (listed here), most of the problems on the main page that prevented this article from being promoted have been resolved (and I still have a few hairs left on my head to boot). I am promoting this article to C-class, since the two remaining issues are dealing with how to handle the most recent literature. If anyone has any comments that they think would demote the page back to start class, or changes needed to promote the article, please place them there. I do not want to be seen as 'raising the hoop' ad libitum. Since these recent criteria have been passed I think the page should be promoted. The attention needed tag has also been removed. Congratulations.PB666 yap 17:14, 13 November 2009 (UTC)

Other tasks available. WP:WikiProject_Human_Genetic_History/to_do. Please take some time to scan the article for grammatical and spelling errors.

The article is mostly complete and without major issues, but requires some further work to reach good article standards.

The article meets the nine B-Class criteria when:
 * 1) The article is suitably referenced, with inline citations where necessary.
 * 2) It has reliable sources, and any important or controversial material which is likely to be challenged is cited.
 * 3) The use of citation templates such as cite web is not required, but the use of tags is encouraged. [Note: Thanks to the work of Andrew, I think we are going to move articles to the Harvard reference system with the Citation style; however we need to move away from {{Harvtxt|Doe et al.|Date}} to the official {{Harvtxt|last1|last2|last3|last4|year}} format]. I am working on producing a database of Harvard tags and citations in the WP:HGH page. You can store formal citation tags there for future use. Also, now that the Diberri tool is back online, it would be nice to see those full citations again. Note that the Diberri tool output, which uses a Cite Web template, can be easily converted to the citation template by rearranging the authors' names as |last1 = |first1 = |last2 = |first2 =. This combines the ease of using Diberri with the improved in-text Harvard referencing of the citation template.
 * 4) The article reasonably covers the topic, and does not contain obvious omissions or inaccuracies.
 * 5) It contains a large proportion of the material necessary for an A-Class article, although some sections may need expansion, and some less important topics may be missing.
 * 6) The article has a defined structure. Content should be organized into groups of related material, including a lead section and all the sections that can reasonably be included in an article of its kind.
 * 7) The article is reasonably well-written. The prose contains no major grammatical errors and flows sensibly, but it certainly need not be "brilliant". The Manual of Style need not be followed rigorously.
 * 8) The article contains supporting materials where appropriate. Illustrations are encouraged, though not required. Diagrams and an infobox etc. should be included where they are relevant and useful to the content.
 * 9) The article presents its content in an appropriately accessible way. It is written with as broad an audience in mind as possible. Although Wikipedia is more than just a general encyclopedia, the article should not assume unnecessary technical background and technical terms should be explained or avoided where possible.


 * I should also note on the HGH page I removed R1a (Y-DNA) from articles needing immediate cleanup and attention. However it has been moved to Wikify, this is a reminder that graphics would be nice.PB666 yap 18:43, 13 November 2009 (UTC)

I do not wish to be negative about your good intentioned and positive efforts to start grading genetics articles, but let's all please be a bit more clear. You have recently been grading these articles yourself, so when you cite the grades, you are just citing yourself. Please make it clear when you are doing so, especially when it concerns articles where you have been editing yourself. Now you are giving rules also but what is the source of these criteria, for example the preference concerning citation method (which oddly seem like your personal preferences)? I note how you have even explicitly pointed to my name. Here is what I see on the appropriate Wikipedia page:- the six B-Class criteria: Notice the differences? Who is the source of the changes, and in particular the ones which say "Phil's referencing method is better than Andrew's"? I presume some sections of what you present above as Wikipedia guidelines are actually your personal notes about them? Looking to the Wikipedia guidelines as I have just quoted them, my own opinion is as follows...
 * 1) The article is suitably referenced, with inline citations where necessary. It has reliable sources, and any important or controversial material which is likely to be challenged is cited. The use of citation templates such as cite web is not required, but the use of tags is encouraged.
 * 2) The article reasonably covers the topic, and does not contain obvious omissions or inaccuracies. It contains a large proportion of the material necessary for an A-Class article, although some sections may need expansion, and some less important topics may be missing.
 * 3) The article has a defined structure. Content should be organized into groups of related material, including a lead section and all the sections that can reasonably be included in an article of its kind.
 * 4) The article is reasonably well-written. The prose contains no major grammatical errors and flows sensibly, but it certainly need not be "brilliant". The Manual of Style need not be followed rigorously.
 * 5) The article contains supporting materials where appropriate. Illustrations are encouraged, though not required. Diagrams and an infobox etc. should be included where they are relevant and useful to the content.
 * 6) The article presents its content in an appropriately accessible way. It is written with as broad an audience in mind as possible. Although Wikipedia is more than just a general encyclopedia, the article should not assume unnecessary technical background and technical terms should be explained or avoided where possible.
 * 1) The article is suitably referenced: DONE.
 * 2) The article reasonably covers the topic, and does not contain obvious omissions or inaccuracies. DONE
 * 3) The article has a defined structure. DONE
 * 4) The article is reasonably well-written. DEBATABLY DONE: certainly not brilliant, but certainly reasonable.
 * 5) The article contains supporting materials where appropriate. Illustrations are encouraged, though not required. Diagrams and an infobox etc. should be included where they are relevant and useful to the content. COULD BE BETTER BUT DONE
 * 6) The article presents its content in an appropriately accessible way. It is written with as broad an audience in mind as possible. Although Wikipedia is more than just a general encyclopedia, the article should not assume unnecessary technical background and technical terms should be explained or avoided where possible. PROGRESS HAS BEEN MADE TO THE POINT WHERE IT IS DEBATABLY DONE

I do not demand others to agree with my opinion, but opinions are being called for I take it.--Andrew Lancaster (talk) 13:00, 14 November 2009 (UTC)

For ease of reference, here is the previous discussion PB666 and I (and others) had about how to reference this exact type of article:. Note that this discussion also touched upon the last criterion for quality mentioned above (accessibility), and also note the last sentence of the discussion where I wrote "A hybrid system where different parts of the article work different ways is a poor system". This point was a PRACTICAL point which was very similar to the concerns recently raised by the unilateral article split: making a major edit which PARTIALLY rebuilds the whole structure of an entire article and then demanding other editors to finish off the work, simply does not work in practice. You end up with a mixture of structures. See quality criteria above about structure. In other words, every proposal about an article's basic structure or referencing system should ALWAYS take into account that the structure should be easy to maintain for editors in the future.--Andrew Lancaster (talk) 13:11, 14 November 2009 (UTC)


 * Andrew, I agree with that. However, again, on writing style: if you are going to drop the names of 5 different authors in two sentences, such that the subject of the sentence changes into 'who the hell are all these authors', the reader loses. The two footnote/citation system is great; however, one critique: the replacement of a list of coauthor names with Doe et al. does not exactly comply with citation standards. If the referencing is distracting from the topic, and the footnoting method does not, use the footnote. Many articles have both a reference section and a further reading section and so I don't see having two sections as a problem. This is Wikipedia, not a paper journal, we like to wiki-link in the friendliest and easiest way possible. Put your reader first is a general rule for article promotion.
 * I know the difficulty in using the Harvtxt/Citation template, that is why I think we should keep a database of frequently cited articles, for example Mirabal can be used on 5 different Y-DNA pages. Lets try, whenever possible, to comply with conventions. This is my only outstanding problem with the references other than the gargantuanly long note.
 * I am going to repeat myself because it appears not to be clear: the Harvtxt system, when used properly, is completely acceptable (see Mitochondrial Eve); however the authors of the template admit there is a known bug that makes referencing difficult. Ergo one has to be extremely careful when creating the references. I have a critique that we are using, when it should be  . I had the hardest time trying to get your method to work until I scrapped it, went to Citation, downloaded the citation form, went to Harvtxt and selected the appropriate template. That is the source of the problem: it's not very user friendly, and what you have done is created a lot of half-assed citations. I don't think most Wikipedians would judge this to be a problem for class B rating. But I found a much, much easier way, a way that would please you if you tried it. I am trying to help you out here, not criticize you. I appreciate the Harvard referencing system as it has some advantages, but it has a major bug, so we need to be clever about how we use it.


 * Most of the papers we deal with are cited on Pubmed and have Pubmed IDs. These can be placed into the following:

Diberri template filler (or as many in WPMed refer to it, a godsend). I simply cut the PMID off the Pubmed abstract or search-results page, paste it into the text box next to [Submit] and click, and I have a fully formed {{Cite journal}} reference. This takes about 6 seconds and replaces a couple of minutes of typing. Next I replace {{cite journal | .... in the reference with {{Citation | .... and now you have a citation. If you do nothing more it will work; however, if you want it to work with Harvard Referencing templates you will need to modify the name stream. Replace the list of authors |author = with ''|last1 = Doe |first1 = J |last2 = Smith |first2 = J |last3 = Jonese |first3 = B |last4 = Chan |first4 = J |. . . . .'' This is a little work, however I then just cut and paste the names into {{Harvtxt|last1|last2|last3|last4|2009}} and voilà: beautiful, complete, accurate citations that work with Harvard referencing. I assume you misunderstood what I was saying above and so I am not going to pester you about this anymore (although I spent several hours trying to figure it out just to see things from your point of view: can this really improve articles?). And that is the bottom line, too much of a good thing is too much of a good thing (Master Pi): highly dense use of Harvard referencing is not encyclopedic, it can be a distraction. I would argue one other thing: linking an article once per section is adequate, multiple linkings increase the size of the article, and WP:MOS indicates to limit same Wikilinks to 1 or few per page. Within a section, if a person has a Harvtxt link for a paper and one references the author a second time, non-hyperlinked Doe et al.(2015) is fine.PB666 yap 19:32, 14 November 2009 (UTC)
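
For anyone who would rather not do the name-stream rearrangement by hand, the steps described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: it assumes the filled template has a single |author= field holding a comma-separated "Lastname Initials" list (as in the template-filler output quoted on this page), and the function names split_authors, cite_journal_to_citation and harvtxt_tag are hypothetical helpers invented for this example, not part of any Wikipedia tool.

```python
import re


def split_authors(author_field):
    """Split "Underhill PA, Myres NM" into [("Underhill", "PA"), ("Myres", "NM")].

    Assumption: each name is "Lastname Initials", with the initials as the
    final space-separated token.
    """
    pairs = []
    for name in author_field.split(","):
        name = name.strip()
        if not name:
            continue
        last, _, initials = name.rpartition(" ")
        if not last:  # single token: treat it as a last name with no initials
            last, initials = initials, ""
        pairs.append((last, initials))
    return pairs


def cite_journal_to_citation(template):
    """Turn {{cite journal |author=...}} into {{citation |last1=... |first1=...}}."""
    match = re.search(r"\|\s*author\s*=\s*([^|}]*)", template)
    if match is None:
        raise ValueError("no |author= field found")
    pairs = split_authors(match.group(1))
    name_fields = " ".join(
        f"|last{i}={last} |first{i}={first}"
        for i, (last, first) in enumerate(pairs, start=1)
    )
    body = template[:match.start()] + name_fields + " " + template[match.end():]
    # Swap the template name, leaving every other field untouched.
    return re.sub(r"^\{\{\s*cite journal\s*", "{{citation ", body,
                  count=1, flags=re.IGNORECASE)


def harvtxt_tag(pairs, year, max_authors=4):
    """Build the in-text tag from the first few last names plus the year."""
    lasts = [last for last, _ in pairs[:max_authors]]
    return "{{Harvtxt|" + "|".join(lasts + [str(year)]) + "}}"


if __name__ == "__main__":
    filled = ("{{cite journal |author=Underhill PA, Myres NM, Rootsi S "
              "|title=Example title |year=2009}}")
    print(cite_journal_to_citation(filled))
    print(harvtxt_tag(split_authors("Underhill PA, Myres NM, Rootsi S"), 2009))
```

The sketch leaves all other fields (title, journal, pmid, doi) exactly as the template filler produced them, so it only automates the part described above as "a little work"; the result should still be checked by eye before saving.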

Anyway that's good enough for me, B-class it is. Lets try to keep it there.PB666 yap 19:58, 14 November 2009 (UTC)

B-class articles have some reward. Portal:Molecular_Anthropology-->Portal:Molecular_Anthropology/Selected_article (Also hint: This as a lead?) PB666 {{sup|yap}} 21:12, 14 November 2009 (UTC)


 * So your point is just about the "et al" instead of listing all authors? The Harvcoltxt template will work if you put in all the authors in the Citation template in the references section, so why not simply improve the citations in the reference section rather than arguing that there should be lots of footnotes instead of Harvard templates?--Andrew Lancaster


 * Too much caffeine I suspect is the problem here. No, don't replace Harvard references unless they are junking up a particular passage; else leave them alone. No good reason, no replace. I should point out, in the mt Eve article I replaced footnotes with Harvard references, so.....PB666 {{sup|yap}} 00:22, 15 November 2009 (UTC)


 * I put in a lot of references which work properly by all standard norms, using both the Harvard citation template and the ref format, and these allow people to find the names of all authors anyway.--Andrew Lancaster


 * 1. It sets a bad precedent,
 * 2. Some references may not have a pubmed or Doi link, in those cases the full author list may help finding the article or parts of the article on the internet and library searches.
 * 3. Below you argue we should stick to the Geographic standards, OK we should stick to the reference standards
 * I don't think you have a cogent counter argument, simply griping cause I complained about the reference quality.PB666 {{sup|yap}} 00:22, 15 November 2009 (UTC)


 * I have never had the pleasure of seeing other editors, including you, helping to add in anything more than one or two wrongly formatted refs that are needed in any genetics article I have been working on.--Andrew Lancaster


 * Dog hand bite feeds. Where exactly did you get U2009, Mirabel, OK, so . . . . . .PB666 {{sup|yap}} 00:22, 15 November 2009 (UTC)

I spend an enormous amount of time putting them in already and see it as a job which otherwise simply would not be done at all. So let's please be a bit realistic.--Andrew Lancaster (talk) 23:04, 14 November 2009 (UTC)


 * And I showed a relatively easy way to get that same product, better. So why are you griping? Have you tried it? What I am hearing here is the 'I did a lot of work, it worked for me, no-one else did squat, I'm perfect, don't criticize me' argument.

I will time it for you. From the beginning of the search until the full reference. 6:01 PM to 6:02:15 {{cite journal |author=Underhill PA, Myres NM, Rootsi S, Metspalu M, Zhivotovsky LA, King RJ, Lin AA, Chow CE, Semino O, Battaglia V, Kutuev I, Järve M, Chaubey G, Ayub Q, Mohyuddin A, Mehdi SQ, Sengupta S, Rogaev EI, Khusnutdinova EK, Pshenichnov A, Balanovsky O, Balanovska E, Jeran N, Augustin DH, Baldovic M, Herrera RJ, Thangaraj K, Singh V, Singh L, Majumder P, Rudan P, Primorac D, Villems R, Kivisild T |title=Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a |journal=Eur. J. Hum. Genet. |volume= |issue= |pages= |year=2009 |month=November |pmid=19888303 |doi=10.1038/ejhg.2009.194 |url=}} 6:03:22 to 6:06:10 reformat reference. {{citation| last1=Underhill | first1=PA | last2=Myres | first2=NM | last3=Rootsi | first3=S | last4=Metspalu | first4=M | last5=Zhivotovsky | first5=LA | last6=King | first6=RJ | last7= Lin |first7= AA |last8= Chow |first8 = CE |title=Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a |journal=Eur. J. Hum. Genet. |volume= |issue= |pages= |year=2009 |month=November |pmid=19888303 |doi=10.1038/ejhg.2009.194 |url=}} 6:07:09 to 6:08:07 {{Harvtxt|Underhill|Myres|Rootsi|Metspalu|2009}} done. ~5 minutes, filled three different templates. Now, as you say, I have never added a reference to a 'damn article', so how is it that it only takes me a few minutes to come up with a full Harvard reference of the latest article and the footnote version? BTW, where did you get that article from? WikiProject_Human_Genetic_History/Mt-DNA. 
Here is an example of a featured article to which I have made major contributions: Coeliac_disease. Do you see a single reference that is Doe et al.? Cause it's going to seem like a damn shame that other Wikiprojects' core articles are promoted to featured article status, but we are not able to promote articles beyond a certain level because our foolish pride gets too easily hurt. Stop whining - man up.PB666 {{sup|yap}} 00:22, 15 November 2009 (UTC)


 * Stop assigning pointless jobs to people. And stop trying to force the issue of assigning jobs by starting a job and leaving articles half changed. Please just accept that not everyone agrees with your opinions on every little matter of format.--Andrew Lancaster (talk) 08:34, 15 November 2009 (UTC)


 * You are not arguing with me, you are arguing with Wikipedia, and when the time comes that you want your favorite page promoted, don't be surprised if they ask why all your citations in the list begin with et al. Every single last one of them. I am pointing this out to you, and I don't believe that, at some level, it is useless. Be stubborn if you want. If you want to do a real test, since this is a newly class-B article, let's see how long it takes to get promoted to GA status. - Good_article_nominations- PB666 {{sup|yap}} 17:58, 15 November 2009 (UTC)


 * OK, you just nominated it. So what? I must say this is the weirdest new way of pushing a position I have seen. Can you please stop citing yourself in third person? It is confusing.--Andrew Lancaster (talk) 22:48, 17 November 2009 (UTC)

Review of structure
A while back we split the article into two sections, BOTH of which go through the full range of geographical regions - first just discussing the distributions and the second concerning origins and migrations theories relevant to those areas. Although this might sound odd, it cleaned up the flow of thought a lot because previously the migration theories had been mixed together in a confusion of raw data and original thinking. NOW, I note that there have been some small edits which are causing the Middle Eastern DISTRIBUTION section to become a bit disorganized. It also means the section is starting to cover migration theories. PB666 just made this remark also so he also noticed it. In the long run we might want to consider reviewing the structure, but only when someone is ready to change it fully and not leave the job half done.Andrew Lancaster (talk) 14:44, 14 November 2009 (UTC)


 * Split, Split, Split, Split, ra, ra, ra.PB666 yap 18:56, 14 November 2009 (UTC)

For the time being my advice is that people should always be careful about pasting in snippets into sentences bit by bit. It makes sentences longer and longer. It is better to look EVERY time at whether the sentences also need to be rearranged in order to avoid them becoming bloated. Always check to make sure how your edits FIT.--Andrew Lancaster (talk) 14:44, 14 November 2009 (UTC)

A related problem that I see has now developed is that PB666 has forced a whole new phylogeny section into the beginning of the origins section, and then apparently unaware of the irony he has posted a note in the original phylogeny section pointing out that it is redundant! I can see how it is tempting to start an origins section with a phylogeny discussion though, just as I can fully understand why editors want to make sure the interesting case of Iran gets full discussion, and so this problem also raises the question of whether the origins section should be re-merged into the other sections - with this new phylogeny discussion moved to the other, and each migration section merged back into the relevant geographical distribution sections.--Andrew Lancaster (talk) 15:57, 14 November 2009 (UTC)


 * The solution you provided is adequate: it separates the problem of R1a origins at the beginning of the page, and thus allows R1a1a origins not to be hindered by the confusion dilemma. Although structurally I think it belongs in the Origins section, origins sections by HGH standards are typically at the front of the article, and it is now where it should be; however, it is not named as such.


 * Although I did not change the cladogram, for comparison's sake, and because in some papers, like Sharma et al., they do discuss R1* that is neither R1b nor R1a, indicating that the elimination-style cladistics is potentially throwing M420-positive, SRY1532.2-negative samples into a mish-mash clade; it also gives me a way of tidying up that section of the page and adhering to the new suggestions for cladistic style. Now if you want to critique the approach, find out what Sharma et al. typed, because they found R1* in many places in India, but Underhill found no M420 positives in India. It's possible the R1* in Europe is 2009 R1a* and the R1* in India is R1c, R1d, etc. IOW, a mish-mash clade of R1a monophyletics and paraphyletics. The basic arguments in this group have largely stemmed from bad nomenclature; 'tis it not nice that we have mitochondrial DNA: sequence the entire genome and be done with it.PB666 yap 18:56, 14 November 2009 (UTC)


 * The above is a comment to a remark on your talkpage, not here. It is confusing that you post on my talk page AND here. Can I suggest anything about a particular article should be on the article talkpage, and responses should be near what they are responding to, or perhaps linking to them? Here is how I responded to a similar message on my talkpage: How can any reader possibly understand that this is the logic behind putting a mutation in the OLD phylogeny which was NOT KNOWN in "OLD phylogeny" times?????? If we can put in ghost clades for ones not yet discovered then maybe you should put a few in the new phylogeny also? If you are going to have a special diagram to show what people USED TO THINK, why criticize the diagram for not being up to date?? I want to also make a comment: You constantly write as if it is obvious that you are the only person who understands the literature and everyone else is a confused idiot. Are you sure about that?--Andrew Lancaster (talk) 22:53, 14 November 2009 (UTC)


 * Point one, if it's article related - like please change this - it should have been posted here first. So that would be your fault (and BTW, pedantic). If you want the latest online copy of Sports Illustrated Swimsuit edition, it belongs on a talk page (and in case you're wondering, I don't have it). Point two, we are talking about Y-chromosome, the dirty stray little dog of molecular anthropology; what I understand is irrelevant because the literature will change tomorrow. The better we can point to the faults of the methods used, the better we have informed the reader. Stop beachin.PB666 yap 23:36, 14 November 2009 (UTC)


 * No I am not being pedantic. Once again you are simply not believing it when people say that your way of communicating is confusing. As far as I can tell, it is simply not your priority to worry about this. Concerning point 2, this is not a response at all.--Andrew Lancaster (talk) 08:30, 15 November 2009 (UTC)
 * The M420 you put in the tree is not even in the right place!--Andrew Lancaster (talk) 08:40, 15 November 2009 (UTC)
 * It is, in fact, in the right place; it precisely depicts what you described in the text. However, if what you wrote was wrong, and I don't think that it is, then we can change it, but don't go reverting images to a poor-quality previous version without some discussion. And the version was restored and improved. Andrew, you have been leaving inappropriate comments in the history section; you are very emotionally tied up in what you are doing, and the last few edits you have made have not improved the article. I suggest you back off and work on another page or something else until your head cools a little.PB666 yap 17:51, 15 November 2009 (UTC)


 * That's a tremendously unconstructive response. I think it is obvious that:-
 * There was quite a big attempt to discuss, but you are not even trying to understand what is written to you.
 * M420 is in the wrong place after your new revert, for the obvious reason that it shows M420 as a sibling to SRY1532.2 whereas it is a parent!--Andrew Lancaster (talk) 13:20, 17 November 2009 (UTC)


 * Gee wilikers, you got it, that's why it's an obsolete cladogram, that's why we should be using the new nomenclature in priority over the old nomenclature. The cladogram is in error to show the error in the previous nomenclature. You ever seen one of those ads on TV with the old fat lady standing next to the picture of the buff/plastic surgery version? It's called a contrast. What I have done is contrasted the old (wrong) version with the new (right) version, because the text you have written is less fun than getting wisdom teeth pulled without an anesthetic. Of course we all know that given the sorry state of Y-cladistics, there will probably be shown problems with the new clade at some future point. But that is unimportant. Prior to revamping this page we just had to have inserted the latest, super-duper, R1a paper; well, we got that paper. And based on that paper we can say why the old cladistics, which was known to be flawed, was flawed:
 * The M420 mutation, which was ancestral to R1a but descendant relative to R1, was being sorted as R1*, presumably a sister clade of R1b and R1a. This indicates critical deficiencies in the cladistic efforts regarding molecular systematics and clocking. With the new cladogram and M420 uncovered, it moves between R1 and R1a, where it should be.PB666 yap 20:13, 17 November 2009 (UTC)
 * After all these many discussions it is apparent that you want to treat the old cladistics as equivalent to the new cladistics. If so I ask one question: why did we need to wait for Underhill? We could have just done the article with the wrong cladistics. Either it's important or it isn't; if it's important to keep the material on how cladistics evolved, then I think it's important to keep the cladogram showing how that undetected mutation was improperly sorted. Simple as that. Now, of course if you could clean up the text, remove the jargon, make it understandable to mad dogs and Englishmen, then I will think about whether we need to be showing the position of the misplaced mutation. Until such time, I think readers like myself would greatly prefer looking at the cladograms to tell the story. And BTW, that is the name of the game at this point: clean up the massively concentrated jargon within the associated text. I am working on the Lead; when it appears that you want to involve yourself in making the article more encyclopedic we can talk about how to better structure the very limited graphics on the page. PB666 yap 20:13, 17 November 2009 (UTC)
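
For anyone trying to follow the sibling-versus-parent point in this thread, here is a minimal sketch of the two placements being argued about. It only illustrates the relationships as described in the comments above (M420 between R1 and SRY1532.2, versus M420 drawn beside SRY1532.2); the dictionary layout and the names <code>new_tree</code>, <code>disputed_diagram</code>, <code>ancestors</code> and <code>relation</code> are hypothetical, made up for illustration, and are not taken from any paper or tool.

```python
# Illustrative parent-pointer trees (hypothetical encoding, not from any paper).
# Each key is a marker or clade; each value is its immediate parent (None = root).

# Placement described in the thread for the new phylogeny:
# M420 sits between R1 and SRY1532.2.
new_tree = {"R1": None, "R1b": "R1", "M420": "R1", "SRY1532.2": "M420"}

# Placement objected to in the thread: M420 drawn beside SRY1532.2.
disputed_diagram = {"R1": None, "R1b": "R1", "M420": "R1", "SRY1532.2": "R1"}


def ancestors(tree, node):
    """Return the chain of parents of `node`, nearest first."""
    chain = []
    parent = tree[node]
    while parent is not None:
        chain.append(parent)
        parent = tree[parent]
    return chain


def relation(tree, a, b):
    """Describe how marker `a` relates to marker `b` in `tree`."""
    if a in ancestors(tree, b):
        return "ancestor"
    if b in ancestors(tree, a):
        return "descendant"
    if tree[a] == tree[b]:
        return "sibling"
    return "other"


if __name__ == "__main__":
    for name, tree in (("new", new_tree), ("disputed", disputed_diagram)):
        print(name, "-> M420 is a(n)", relation(tree, "M420", "SRY1532.2"),
              "of SRY1532.2")
```

Under these assumptions the first tree reports M420 as an ancestor of SRY1532.2 and the second reports it as a sibling, which is the whole of the disagreement about the graphic.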


 * In your patronizing response above, as usual you do not deign to bother explaining your own positions. What exactly is wrong with the graphic you reverted from?
 * If my wording is strong and clear, this is because discussion with you is such a one-sided effort, and clearly not because I am emotionally attached to any content, and it is tendentious of you to make such accusations. Please check the facts. No particular facts or theories are being debated between us, and concerning format changes I've been happily allowing massive changes except in specific cases where I explain my concern (and then you do not read what I write).
 * If people want to see examples of someone getting emotional about their sunk costs on this article, look on my Talkpage and PB666's loopy tirades.--Andrew Lancaster (talk) 13:20, 17 November 2009 (UTC)

Other people trying to follow the discussions about this article will surely find it even more difficult than me, because there are bits and pieces all over the place. Concerning the cladogram question, people trying to follow will need to see this prior discussion, dropped as usual, just before PB666 went into unilateral mode without trying to understand the point... http://en.wikipedia.org/w/index.php?title=User_talk:Pdeitiker&diff=325814200&oldid=325801439, http://en.wikipedia.org/wiki/User_talk:Andrew_Lancaster#Cladogram --Andrew Lancaster (talk) 19:32, 17 November 2009 (UTC)


 * Yes, and why did you post that there instead of on this page? As I said, you complained about the scattered discussion, but you actually triggered the scattering. Stop whining.PB666 yap 20:13, 17 November 2009 (UTC)


 * I am simply posting links to the discussion.--Andrew Lancaster (talk) 20:55, 17 November 2009 (UTC)