Wednesday, July 29, 2015

The ancient DNA case against the Anatolian hypothesis

In the debate over the location of the Proto-Indo-European urheimat, Colin Renfrew's Anatolian hypothesis is usually mentioned as the most viable alternative to the steppe or Kurgan hypothesis. But probably not for very much longer.

Below is a Principal Component Analysis (PCA) featuring extant Indo-European and non-Indo-European groups from West Eurasia, a couple of typical early Neolithic farmers from Central Europe, a typical Western Hunter-Gatherer, also from Central Europe, and the Iceman from the Copper Age Tyrolean Alps, again typical of his time and place.*

It's just a taste of the ancient genomic data we have available from prehistoric Europe, but it has almost everything that is pertinent to the issue at hand.

You don't need to be familiar with PCA methodology to be able to read the plot. Basically, it shows that the present-day European population structure is the result of two main events:

- the arrival of early farmers from Anatolia during the Neolithic transition, which eventually caused the extinction of people like the Western Hunter-Gatherer, who is the most obvious outlier on the plot

- the expansion of Kurgan groups such as the Yamnaya, which led to the formation of the Corded Ware horizon across much of Europe and shifted the genetic structure of almost all Europeans to the east, away from the Neolithic and Copper Age samples.

These were massive population turnovers, and, as a rule, massive population turnovers are accompanied by language change. So it's highly unlikely that any Europeans today are speaking languages derived from those of the Western Hunter-Gatherers or early Neolithic farmers of Central Europe (i.e., according to Renfrew, the ancestors of Celts, Germanics and other Indo-Europeans). Moreover, consider this:

- most present-day Indo-European speaking Europeans form an elongated cluster between the Neolithic farmers and the Corded Ware sample, pointing to the steppe-derived Corded Ware Culture as the proximate agent of the Indo-European expansion in much of Europe

- the only present-day Europeans who closely resemble Neolithic farmers are some Sardinians (the small Romance cluster just above the two Neolithic samples), but Sardinians spoke Paleo-Sardinian or Nuragic languages until they adopted Indo-European speech, in the form of Latin, from the Romans (see page 118 here).
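As a rough illustration of why a mixed population ends up between its source populations on a PCA plot, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the genotype values are invented, the sample names are hypothetical, and the first principal component is found by simple power iteration rather than with the software used for the plot above.

```python
# Toy genotype matrix: rows are samples, columns are SNPs,
# values are allele counts (0, 1 or 2). All values are invented.
SAMPLES = ["farmer_1", "farmer_2", "steppe_1", "steppe_2", "mixed"]
GENOTYPES = [
    [2, 2, 0, 0, 1],
    [2, 1, 0, 0, 1],
    [0, 0, 2, 2, 1],
    [0, 0, 2, 1, 1],
    [1, 1, 1, 1, 1],
]


def center_columns(matrix):
    """Subtract each SNP's mean allele count from that SNP's column."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in matrix]


def first_pc_scores(genotypes, iterations=200):
    """Return each sample's PC1 coordinate, up to scale and sign.

    Uses power iteration on the sample-by-sample Gram matrix of the
    centered genotypes.
    """
    x = center_columns(genotypes)
    n = len(x)
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(t * t for t in w) ** 0.5
        if norm == 0:
            return [0.0] * n
        v = [t / norm for t in w]
    return v


scores = first_pc_scores(GENOTYPES)
for name, score in zip(SAMPLES, scores):
    print(f"{name:10s} {score:+.3f}")
```

In this toy case the two farmer-like samples fall on one side of the axis, the two steppe-like samples on the other, and the mixed sample lands in between, which is the same logic behind reading an elongated cluster as a cline between two sources.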

Also, this isn't shown on the plot, but the dominant Y-chromosome haplogroup of early Neolithic farmers is G2a, which is a low frequency marker in Europe today. The two most common Y-chromosome haplogroups among present-day Europeans are R-M198 and R-M269, which are also typical of Corded Ware and Yamnaya males, respectively, and probably originally from the steppe.

So is there any way to rework the Anatolian hypothesis so that it can be salvaged? I doubt it. Even making the steppe a homeland for all of the main Indo-European branches apart from Anatolian and Armenian probably won't help.

It is true that the Yamnaya nomads carried Near Eastern-related ancestry which may represent Proto-Indo-European admixture from outside of the steppe. But there's no evidence that it came from Anatolia.

In fact, if Neolithic Anatolians were basically identical to early Neolithic European farmers, which seems to be the case (see here and here), then it's unlikely that it did, because the latter carried a peculiar genome-wide signal that is missing in Yamnaya genomes (orange cluster in the ADMIXTURE bar graph below).** Heck, even the early Corded Ware genomes from Germany barely show any of it.

I won't go into the linguistic arguments for why the Anatolian hypothesis is implausible. But it might be worth checking out a new book on the topic by linguists Asya Pereltsvaig and Martin W. Lewis: The Indo-European Controversy: Facts and Fallacies in Historical Linguistics. I haven't read it yet, so I welcome the opinions here of those who have. I did, however, read a lot of the online articles on which the book is based. As far as I know most of them are still available here and here.

*Another version of the same PCA, with the samples labeled individually, is available here. All possible combinations of dimensions 1 to 4 are shown here. The samples are listed here. All of the samples are from Haak et al. and Allentoft et al. The PCA was run using ~56K high confidence SNPs listed here.

The Corded Ware sample is a composite of Corded Ware sequences from Germany, Scandinavia, Estonia and Poland. The Yamnaya sample is a composite of Yamnaya sequences from the Kalmykia and Samara regions of Russia.

I chose to use these composites instead of individual sequences because I didn't want to run any samples with genotype rates of less than 98%.
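To make the genotype rate idea concrete, here is a minimal sketch in plain Python. It assumes that a composite is built by taking, at each SNP, the first non-missing call among the individuals being merged. That merging rule, the function names and the toy data are all assumptions made for illustration, not a description of how the composites in this post were actually made.

```python
MIN_GENOTYPE_RATE = 0.98  # the threshold mentioned in the post


def genotype_rate(calls):
    """Fraction of SNPs with a non-missing call (None = missing)."""
    if not calls:
        return 0.0
    called = sum(1 for c in calls if c is not None)
    return called / len(calls)


def build_composite(individuals):
    """Merge individuals SNP by SNP, keeping the first non-missing call.

    This is an assumed merging rule, chosen only because it is simple.
    """
    n_snps = len(individuals[0])
    composite = []
    for i in range(n_snps):
        call = next((ind[i] for ind in individuals if ind[i] is not None), None)
        composite.append(call)
    return composite


# Hypothetical low-coverage individuals from the same culture.
ind_a = [0, None, 2, None, 1]
ind_b = [None, 1, None, 2, 1]
composite = build_composite([ind_a, ind_b])

for label, calls in [("ind_a", ind_a), ("ind_b", ind_b), ("composite", composite)]:
    rate = genotype_rate(calls)
    print(label, rate, "keep" if rate >= MIN_GENOTYPE_RATE else "drop")
```

Neither toy individual passes the threshold on its own, but the composite does, which is the practical reason for pooling sparse ancient sequences.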

** For a more detailed ADMIXTURE analysis comparing early Neolithic farmers to Yamnaya refer to Haak et al. Supplementary Information 6. Note the minimal sharing of components at the higher K between the early Neolithic farmers and Yamnaya, especially at K=16, which has the lowest median cross-validation (CV) error. This is in agreement with the PCA above.
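For readers unfamiliar with how a K is picked from cross-validation error, the selection step can be sketched as below. The CV error values are invented placeholders, not the figures from Haak et al., and the function name is hypothetical. The only thing the sketch shows is the rule itself: take the median CV error across replicate runs for each K and pick the K with the smallest median.

```python
from statistics import median


def best_k(cv_errors):
    """Return (K with the lowest median CV error, dict of medians per K)."""
    medians = {k: median(errors) for k, errors in cv_errors.items()}
    return min(medians, key=medians.get), medians


# Invented CV errors from three replicate runs per K (placeholders only).
cv_errors = {
    14: [0.5610, 0.5624, 0.5617],
    15: [0.5581, 0.5590, 0.5602],
    16: [0.5552, 0.5561, 0.5549],
    17: [0.5570, 0.5566, 0.5583],
}

k, medians = best_k(cv_errors)
print("lowest median CV error at K =", k)
```

Using the median rather than the minimum makes the choice less sensitive to a single unusually good or bad replicate run.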

See also...

Population genomics of Early Bronze Age Europe in three simple graphs


Nirjhar007 said...

This Post of yours is highly childish....

Davidski said...

Which parts specifically?

Mike Thomas said...


One interpretation could be :

That Yamnaya is not ancestral to CWC, but in fact a cousin (which you mention) and a rather distant one at that (separated sometime in the post-glacial period).

The Teal appears to be a Caucasus-northern Iran component, whether or not it was solely due to women.

This entered via the Caucasus, one to incipient Yamnaya groups in Russia, the other wave to (Belarus, Ukraine) proto-Corded Ware groups.

Simon_W said...

The teal component is bimodal, there is one peak in the south-central Asia-Hindukush region and another one in the Caucasus. Judging from David's analyses it appears to be a mix of ANE and West Asian, so it's natural that it's strongest wherever there is a lot of ANE in Western to South-central Asia. Neither ANE nor teal are particularly strong in Iran. (Northern Iran might be different, I don't know.)

Simon_W said...

Nirjhar, I don't find it childish; it's nothing new for sure, but necessary and useful, considering that there are still people believing PIE may have originated in central Europe with the WHG resurgence, or in southeastern Europe with the thriving Copper Age cultures or even be the legacy of R1b carrying HG at the Atlantic fringe. These views simply don't gel with the archaeogenetic data. (The regular mention of the Anatolian farmer theory as the main rival of the Steppe hypothesis is probably more a convention among geneticists, a habit that has become common because everyone does it.)

Davidski said...

All possible combinations of dimensions 1 to 4 for the above PCA...

postneo said...

Yamnaya itself may not be steppe derived. How can you refute anything if you don't have samples of relevant age from Anatolia, Armenia, Iran, etc.? We already know ENF is linked to Anatolia. How can you prove that the late Anatolian migrants did not speak some form of IE? This would have primed European populations to switch to a steppe IE component even faster.

The British were surprised by the rapid adoption of English in India. Something they never saw in other colonies.

Karl_K said...

"So it's highly unlikely that any Europeans today are speaking languages derived from those of the Western Hunter-Gatherers or early Neolithic farmers from Central Europe."

What about Basque? If it's not Indo-European, then it must have been left over from one of these. (Or be the original Bell Beaker language from the Iberian Steppe.)

Karl_K said...

"Yamnaya itself may not be steppe derived. How can you refute anything"

I must say, this seems like a pretty solid argument that postneo is making.

Illya P. Constant said...

That's a false dichotomy fallacy. The event from Anatolia that's relevant to the Indo-European discussion is the well-established spread of superior Bronze Age technology and civilizations from the region, not farming. When genetic data from ancient Anatolians becomes available, if we see that there was no significant genetic influx from the Chalcolithic all the way to the Palaics, Luwians, Lycians, Milyans, Carians, Sidetics, Pisidians, Lydians and Hittites, then a Bronze Age Anatolian homeland for Indo-European languages will be proven. Chalcolithic Central and Eastern Europe seem to support that idea, as the people in those regions were genetically mediterranized between that era and the actual appearance of Indo-European languages in the region.

Romulus said...

"- the arrival of early farmers from Anatolia during the Neolithic transition, which eventually caused the extinction of people like the Western Hunter-Gatherer, who is the most obvious outlier on the plot"

Then how does Y DNA I make up a third of Euro Y Lines if they went extinct?

truth said...

I think he meant they don't exist in pure form, but they are indeed a large part of our ancestry.

Alberto said...

In the picture I still miss the "Teal" people. Now we know that they existed, that they represent about 50% of Yamnaya/Afanasievo, and that they went to all of Europe at that time. We also know they plot with Asian IE speakers (say, Tajiks, Pathans,...), and we cannot model Indo-Aryan speakers without the "teal" people (or the Georgians as a proxy, since we don't actually have samples from these people yet).

So these people were probably numerous, and they went everywhere: from India to Ireland. From what we know (little, really, so far), there's no reason to think they were not R1 (clues point that they were, but still too little evidence to say).

The only other option would be EHGs (!?). They went as extinct as WHGs. That is, their genes survived, and maybe their lineages, mixed with farmer populations (in this case, the teal people) who really spread the culture (obviously, HG cultures went extinct). We really don't know if EHGs ever went south (long after mixing with the teal people, in any case). Though I wouldn't bet on it at this point.

So any model that does not include these teal people is basically ignoring the most basic and central piece of the puzzle. But then again, we don't (or hardly) have DNA from them directly, so I understand that the picture is still very incomplete.

The main question at this point is where exactly these people came from. We know they were not originally from Europe (including the steppe). We know they didn't come from western Anatolia. Or from North Africa. That's basically what we know right now. So still quite a few options, but all of them in Asia.

But in general I do agree with the basic point: it's unlikely that IE spread with farming from Anatolia around 7000 BC.

Taymas said...

I'm a complete novice, but if I understand correctly, PC1/PC3 is the next most-explanatory duo after PC1/PC2, and I suggest people take a look. You can see a clear arc of early-farmer-ancestry, plus two prominent clines, one toward Yamnaya. Note how Iranians pull slightly, and Eastern Iranians pull strongly towards Yamnaya.

Also, which population is the other cline leading to? I'd guess BedouinB, but that'd be very unexpected to my noob intuition, if in PC1/PC3 the Bedouin and early farmers are opposite poles. Is this other cline showing the Semitic expansion?

Thanks for such interesting visualizations Davidski! I'm getting greedy, but it'd be really interesting to get a few more populations colored and see where EHG/SHG/BAR100 fit in all the additional dimension pairs.

tew said...

A not so commonly mentioned but strong linguistic argument against the Anatolian Hypothesis (I haven't read the whole book, but I think Lewis mentions something to this effect only in passing) is that the terms for some of the most basic agricultural items and produce like grains are very different (i.e., often not cognate) between western (Europe/Caucasus) and eastern (Tarim basin/South Asia) IE languages - whereas important terms related to animal husbandry and metallurgy tend to be cognate. What is more, this mismatch is mostly geographical rather than dependent on linguistic genealogy, and languages from similar geographical areas tend to share agricultural terms no matter what their IE subfamily. That would be a really unusual outcome if different IE groups had spread and diversified along with agricultural expansions from a central location in or around Anatolia, and in fact seems to suggest multiple events in the adoption of large-scale settled agriculture among IE-speaking populations.

Krefter said...

"In the picture I still miss the "Teal" people. Now we know that they existed, that they represent about 50% of Yamnaya/Afanasievo, and that they went to all of Europe at that time. We also know they plot with Asian IE speakers (say, Tajiks, Pathans,...)"

Those IEs are Indo Iranian. Sintashta/Andronovo were likely the proto-Iranians, and were Teal+EHG+WHG+EEF. They were very similar to Corded Ware.

I don't know if you were making the argument IE started in S/C Asia, and Indo Iranian is the only one that stayed, but anyways since Indo Iranian is the only IE language there that wouldn't make sense.

If S/C Asian IRs are something like 40-70% Sintashta/Andronovo, there must have never been a Teal pop in S/C Asia. Instead a mixture of Sintashta/Andronovo and very unique and extinct populations recreated something similar to Teal.

Arch Hades said...

One wonders why the teal people are so phylogenetically distinct from the Orange Sardinianesque EEFs, especially since those farmers had most their ancestry coming from Eastern Anatolia. The Caucasus (where the teal component in the Yamnaya comes from) is not that far away from Eastern Anatolia. Two distinct West Asian populations who would have been pretty close to one another.

Alberto said...


"If S/C Asian IRs are something like 40-70% Sintashta/Andronovo, there must have never been a Teal pop in S/C Asia."

Well, first we have to see if S/C Asians are 40-70% Sintashta, or maybe 0-15% Sintashta. Both possibilities exist at this point. I don't know if you give much credit to the first one, but even if you do, you need the teal people to make that model work. So no matter if you go for the Sintashta model, or for the Afanasievo model, or for one that excludes both, you cannot exclude the teal people. Without them no model will work.

So yes, for what we know, there had to be a teal population in S/C Asia.

Colin Welling said...

Despite the critical comments I actually found this post very forthright and logically argued.

David gave his simple assumptions and explained why it leads to the given conclusion. It was a good argument!

He pretty much didn't overextend or leave out decent counterarguments.

I don't understand the complaints.

Colin Welling said...


"That Yamnaya is not ancestral to CWC, but in fact a cousin (which you mention) and a rather distant one at that (separated sometime in the post-glacial period).

The Teal appears to be a Caucasus-northern Iran component, whether or not it was solely due to women.

This entered via the Caucasus, one to incipient Yamnaya groups in Russia, the other wave to (Belarus, Ukraine) proto-Corded Ware groups."

He didn't actually say that the teal component would only be reflected in female lineages. He even conceded that PIE might have come from these teal people (however likely or unlikely that is). The point he was making was that even in such a case, Anatolia is not suggested by the data.

He stuck to the point at hand, which is that the Anatolian hypothesis is incredibly weak.

Rokus said...

I was in California and saw Mexicans and Chinese all over, but it was the tiny Anglo-Saxon component that brought the English language. Isn't international admixture also the inevitable result of cultural success? Gene flow in Britain also caused a huge dislocation of the original components (AS, Iron Age) in both western and eastern directions. Kurganist geneticists love to mess things up with prefab mixtures like EEF, EHG. Having expanded so close to Anatolia and Central Asia, wouldn't the Yamnaya kind have met and mixed with loads of direct descendants of Neolithic farmers, who were more like Anatolians than EEF, and overrun an Amerind-like original population, apparently likewise extinct ever since? Why would mixed Karelian EHG have mixed more thoroughly with Yamnaya than more typical WHG, ANE, etc. specimens separately? I agree WHG had an ancient distribution as a component, though it also existed contemporarily as a single component in the west. So please quit the horsecrap and show me without cheating how the ANE and 'teal' autosomal increments west of Yamnaya/CW exceed the CE Neolithic and WHG increments east of BB/CW?

Colin Welling said...

"Yamnaya itself may not be steppe derived."

Yamnaya is on the steppe and is closely related, in both Y-DNA and autosomal DNA, to the preceding people on the steppe.

It's insane to argue the Yamnaya aren't steppe derived. If you mean entirely steppe "derived", well... sure... nobody here really disagrees with that, except perhaps David Anthony who has suggested the teal component itself might be steppe derived.

Mike Thomas said...


I'm certainly not advocating the Anatolian-Neolithic scenario. I very much agree with the copper age period (if not later !). Even before genetics, I found the Neolithic hypothesis linguistically absurd, for it assumes 5000 years of minimal language change- and is thus devoid of basic reason.

Moreover, it assumes that language change can only happen with epic phenomena (like farming). In this regard it's not too different to the Kurgan hypothesis (the ineffable "PIE warrior ideology" and alleged mounted combat, for which no real evidence exists).

Rather, I was merely commenting on the permutations possible for CWC -Yamnaya relationships.

Davidski said...

I've added a footnote about the ADMIXTURE analysis.

Btw, thanks for the comments. As usual, lots of useful stuff. However, I'm yet to see a single coherent critical point about what I've written.

Remember, if you're going to criticize, make sure you have something coherent to say. Thanks in advance.

Davidski said...


Claiming that Yamnaya aren't Europeans, simply because they're relative outliers on the intra-West Eurasian PCA above in dimension 2, is not a coherent argument, especially in light of the fact that on global plots Yamnaya cluster among Europeans.

Basically you're assuming that European-like populations lived across a great swath of Eurasia until recently, when, apparently, everyone else began to stir and move out of wherever they were hiding, presumably their hovels.

That's quite a stunning example of inadvertent Eurocentric chauvinism. Never seen anything like it.

Mike Thomas said...


3 more comments/questions

1) Why do you think Oetzi is so far "SW" on this plot?
2) I'm not sure I see such an 'extinction' of WHG, given the obvious continuity of the WHG autosomal component and Y-DNA.
3) I'm not sure why Karelia and Samara HG are excluded? It cuts out half the steppe story, IMO.

Mike Thomas said...

That book looks excellent.
Need to buy the e-version.
Very good critique of lexicostatistics, etc.
One minor critique, though, is that it perhaps too readily assumes that Neolithic farmers were static and not mobile. A common perception, but they must have been fairly mobile to colonize Europe. Indeed, according to recent isotopic and C14 studies (cf. Price, Boric), they had reached the Carpathian Basin not too long after Thessaly/Macedonia.

Davidski said...

1) Middle Neolithic and Copper Age Europeans behave like this on these sorts of plots (have a look at the PCA from Haak et al. in which they didn't use projection to position the ancient samples), which might be due to strong isolation and drift from the early Neolithic to the Copper Age across much of Europe.

2) It depends how you define extinct. Usually if something no longer exists except in mixed form then it's regarded as extinct.

3) EHG aren't relevant in this context, because there never were any expansions of pure EHG groups into Central Europe. Most of the EHG admixture in Europe today is from Yamnaya, Corded Ware and related cultures, except perhaps among some groups in Russia.

Mike Thomas said...

Dave, thanks for your reply.
In turn

1) The isolation and drift might be relative only to Western Europe, in light of a possible '2nd Neolithic wave' into SEE.

2) Sure, there are no Palaeolithic foragers in modern Europe, genetically or socially (apart from those who are on the Palaeo diet :) ). But the Palaeolithic component of ancestry, whether from CWC types or WHG 'revivals', appears dominant, albeit moderated via much later events.

3) Granted, there was no pure EHG movement, apart from NEE. But EHG roots and contextualizes Yamnaya and CWC. Quite clearly, there was some kind of admixture occurring prior to Yamnaya, akin to the Neolithicization of Central Europe, albeit somewhat later, and by an apparently different population to that seen in EEFs.

Davidski said...

All of the Middle Neolithic and Copper Age samples - from Germany, Hungary, Italy, Sweden and Spain - are shifted west relative to the earliest Neolithic farmers. So it appears to have been a phenomenon not restricted to Western Europe.

And I don't think the minor (a lot less than 50% in most cases) survival of Western Hunter-Gatherer ancestry across much of Europe is relevant to the Anatolian hypothesis.

Neither is the mixing between the hunter-gatherers of Eastern Europe and Near Eastern-related groups, who in all likelihood didn't come from Anatolia, and certainly not from the part of Anatolia favored by Renfrew.

Mike Thomas said...

Again, I'm not making those points in context of a Neolithic linguistic hypothesis, which I certainly do not favour.

Nirjhar007 said...

Just take a look at this paper by Mallory. It clearly gives importance to the role of farming among the IE people, which can't be on the steppes!

Davidski said...

I know, but my focus here was on the implausibility of the Anatolian hypothesis, and I didn't want to wander too far from this brief because it's complex enough as it is.

I plan to do a similar write up about the Armenian Plateau hypothesis. But I'll have to wait for some more ancient DNA from the Near East, particularly from Armenia and Iran.

Davidski said...

Nirjhar, how did you work out that farming wasn't practiced on the steppe?

Ever heard of millet? Look it up.

Nirjhar007 said...

David, just read the paper.

Mike Thomas said...


I don't think that farming and pastoralism are mutually exclusive. In fact, they're ends of a spectrum. True pastoral nomadism, the highly mobile form, didn't exist until the Iron Age, or something.

And I believe you could farm on the steppe, albeit limited to the big river valleys. Both forms of economy co-existed south and north of the Caucasus, and south and north of the Caspian, as early as the 4th millennium.

Clearly, there was some complex interaction going on between Kura-Arax and Majkop. The latter provided the impetus for expansion to the Urals and Europe, the former expanded toward Anatolia, Mesopotamia and Iran.

I suspect even with a lot of aDNA the complexity of these interactions will be difficult, but certainly possible, to unravel.

Nirjhar007 said...

This is very true in the case of PIE history:
"It is the consistency of the information that matters for a good story, not its completeness. Indeed, you will often find that knowing little makes it easier to fit everything you know into a coherent pattern" (Kahneman 2011, 87).

Davidski said...

My arguments here against the Anatolian hypothesis are very coherent and consistent.

The question remains though, can you put together a coherent rebuttal?

Nirjhar007 said...

Hi Mike,
// Both forms of economy co-existed south and north of the Caucasus, and south and north of the Caspian - as early as the 4th millennium.

Clearly, there was some complex interaction going on between Kura-Arax and Majkop. The latter provided the impetus for expansion to the Urals and Europe, the former expanded toward Anatolia, Mesopotamia and Iran. //
Yes, there was a network of archaeological complexes from the Neolithic period (even the Mesolithic) stretching from Northern Iran to S/C Asia, Central Asia and the Urals, and to the areas of Maykop and the Near East. Farming and economic products were heavily imported and exported. Catastrophic climate changes also created shifts in such patterns, provoking migrations in large numbers, and often those migrations took place along the known routes of economic interaction.
All of that can be shown from archaeology, etc.

Nirjhar007 said...

I'm not promoting the Anatolian Hypothesis. I just gave a scientific paper reflecting the limitations and shortcomings of the AHT (Anatolian Homeland Theory) and also the SHT (Steppe Homeland Theory).
But we don't even have enough genomes from Anatolia itself, especially from the eastern areas!

Davidski said...

Then what were you protesting about until this point?

Even if the Neolithic genomes from eastern Anatolia look like the ancestors of the Near Eastern half of the Yamnaya it won't salvage Renfrew's Anatolian hypothesis, because as per above, it's already extremely unlikely that early Neolithic European farmers spoke Indo-European languages and passed them on to the steppe hordes of the Bronze Age.

So the Anatolian hypothesis will have to be reworked into the eastern Anatolian hypothesis. Oh wait, we already have something like that, called the Armenian Plateau hypothesis.

Don't worry, I'll get around to the Armenian Plateau hypothesis in good time. But you probably won't like what I have to say, so...

Mike Thomas said...


"And I don't think the minor (a lot less than 50% in most cases) survival of Western Hunter-Gatherer ancestry across much of Europe "

To me it looks like it is the major component in non-Mediterranean Europeans.

And no, I'm not suggesting PCT (ludicrous pseudo-science).

Davidski said...

Even some of the stuff in Northern Europe that can often pass for WHG is actually from Anatolia. And some of the rest is actually EHG, from Corded Ware/Yamnaya.

We'll have to revisit the issue of Mesolithic survival when early Neolithic genomes from western Anatolia become available, and then run the analysis also taking into account that EHG can look very WHG-like.

Grey said...


"So any model that does not include this teal people in it is basically ignoring the most basic and central piece of the puzzle. But then again, we don't (or hardly) have DNA from them directly, so I understand that the picture is still very incomplete.

The main question at this point is where exactly did these people come from."

Somewhere near Apple valley would be my guess (Almaty) - maybe those Swedes(?) researching in Kazakhstan will find another (and older) Gobekli buried under an apple forest with some skeletons.


"caused the extinction of people like the Western Hunter-Gatherer ... Basically, it shows that the present-day European population structure is the result of two main events:"

It seems to me there was a third event caused by a dramatic population expansion along the Atlantic coast, but it doesn't affect the main steppe vs Anatolia argument.

Alberto said...


"Somewhere near Apple valley would be my guess (Almaty)"

Yes, I usually regard that area as the extreme end of the teal people's homeland. Basically I think that for a long time that area remained mostly pure ANE (and possibly R1a), and only quite late the ENF component arrived to mix and form the "teal". And soon after from there they mixed with steppe people to form Afanasievo types.

But from that area that you mention, to the South Caspian, there must have been a lot of teal (who expanded to the Caucasus, Iran, S/C Asia,... and to the steppe). That's my own guess, at least.

Davidski said...

Afanasievo wasn't native to the Altai. Really it wasn't.

Many years ago archeologists worked this out and even found rest stations along the way from the north Caspian to the Altai that the western steppe nomads used when they traveled east.

Mike Thomas said...

Not stating either way, but Dave you had earlier suggested it was a refugium for R1, ANE populations

Alberto said...

Yes, they could have migrated east too. What I find strange is that both Yamnaya and Afanasievo are contemporary, and in between them there are cultures like Botai that are clearly distinct. I guess some Khvalynsk DNA could help to sort it out (or some Afanasievo Y-DNA).

If Afanasievo came from the west, then they might have mixed along the way with small amounts of MA-1 type of people (kind of extreme EHGs, with more ANE and less WHG, which makes geographical sense).

Mike Thomas said...

^ by "it" I meant the Altai region.

Alberto said...

An interesting paper debating the origin of pastoralism in the eastern steppe. It refers specifically to the possible relationship between Yamnaya and Afanasievo:

Read from "Early Pastoralism in the Eurasian Steppe", page 285.

It also mentions later (next chapter) that "wild sheep species from SE Kazakhstan are genetically unique from other Eurasian wild sheep, while domestic sheep from this region find their closest neighbors to the south in Tajikistan, rather than from European Breeds".

Davidski said...

Let's put this into some context.

This is where Afanasievo clusters on a global PCA. So if they were native to the Altai region it means that Europe stretched from the Atlantic to the Altai.

Possible, but unlikely. My prediction, based on what I've seen and heard, is that the pre-Afanasievo people of the Altai will not cluster with Europeans, but somewhere around MA-1.

Mike Thomas said...

Yes, I think Frachetti links the type of pastoralism in Afanasievo to regions to its south rather than west, and argues that it's as early as or earlier than Yamnaya.
Whatever the case, the actual movement of people (rather than the sheep) might be different, and we'll soon find out.

Mike Thomas said...

"My prediction, based on what I've seen and heard, is that the pre-Afanasievo people of the Altai will not cluster with Europeans, but somewhere around MA-1."

That certainly makes sense to me.

Alberto said...

Yes, it's difficult to say the exact place where WHG ancestry would start to sink when going to the east. In Samara it was still high, but whether it fell rapidly from there to the east we don't know. But yes, I agree that's more than possible.

I certainly don't argue strongly against Afanasievo being connected to Yamnaya. Their autosomal resemblance is obvious, and they're distinct from any other population. But I still leave some room for doubt, given some arguments about the chronology, distance separating them and some differences in their economy and culture. Plus the lack of continuity in the regions in between them. So let's say that a possible independent formation of Afanasievo is kind of a plan B.

Taymas said...

Davidski, any thoughts on what PC1/3 is showing us? Thanks!

Matt said...

The PC3 looks to be some shared factor which pushes Mediterranean-like populations further away from Bedouin, HG and Yamnaya than they would appear from PC1 and PC2, while putting Bedouin HG and Yamnaya slightly closer together than they would be on PC1 and PC2 (presuming Bedouin are where I think they are in PC1 and PC2). I can't think of how it can be interpreted as any population movement. This sort of thing is normal in PCA, where you have a lower dimension that places two populations closer together than they are relative to a third, and then another dimension later adds some nuance to that.

Davidski said...

In PC3 the grey outliers are the BedouinB.

It's hard to say what that plot means. It looks like a Copper Age Europe vs present-day Middle East and Yamnaya/WHG thing.

Grey said...


"Basically I think that for a long time that area remained mostly pure ANE (and possibly R1a), and only quite late the ENF component arrived to mix and form the "teal". And soon after from there they mixed with steppe people to form Afanasievo types."

What my fuzzy logic was looking for was a population that was adjacent to the steppe but not adapted to it with a naturally occurring reason for becoming sedentary and access to goats.

And then from that wondering in what directions might they expand assuming the later Silk Road routes were the paths of least resistance.

So for me less of a near-steppe population than a steppe one.

Grey said...

"So for me less of a near-steppe population than a steppe one."

*more* of a near-steppe population

Davidski said...

Afanasievo isn't an ANE/ENF mixture. It's EHG/ENF, same as Yamnaya.

So Afanasievo is from Europe. Again, you can see that on the global plots.

Kristiina said...

"So Afanasievo is from Europe."

Maybe from Eastern Europe, but Dave, on your own map, the Afanasievo specimens (RISE509, RISE5011) do not cluster with Slavs such as Ukrainians or Czechs. Instead, RISE509 clusters with Finns, Mordovians and Kargopol Russians, and RISE5011 falls between those and North Ossetians.

Nevertheless, according to the K=20 admixture chart in the Allentoft et al. paper, the Afanasievo specimens, in addition to EHG (?), do not have the Sardinian component (=ENF?) but instead the Kalash/Makrani/Pathan component (25-30%) plus a small amount of Native American/Siberian stuff.

Davidski said...

Kristiina, what's above North Ossetia? Last time I looked it was the Pontic-Caspian Steppe.

And the reason Afanasievo don't show any of the "Sardinian component" is because they don't have any Early European/Anatolian Farmer ancestry.

So what can we make of these facts? Probably that Afanasievo came from the Pontic-Caspian Steppe, and didn't have any ancestry from Anatolia, which is another nail in the coffin for the Anatolia hypothesis. Wouldn't you agree?

Kristiina said...

Yes, I agree on that. So, we should say that basically Afanasievo are EHG + Central Asia/Teal.

It is interesting to see that the core Uralic speakers cluster with Afanasievo and Yamnaya and speak languages that have close lexical and structural parallels with IE stuff. The biggest difference is the Siberian/Native American portion that is significant in some northern groups such as Ob-Ugrics and Saami.

Mike Thomas said...

Yes. Maybe this all comes back to the substrate hypothesis, with the Central Asian input acting on a Uralic-type language to form PIE. But whatever the case, maybe a more neutral description like "Afanasievo were northwest Eurasians" would be better?

Kristiina said...

or Uralic languages are based on PIE + Siberian/Native American type substrates.

Kristiina said...

Or perhaps more accurately Uralic languages are based on Indo-Uralic + Siberian/Native American type substrates and IE languages on Indo-Uralic + Caucasus/Teal stuff.

pequerobles said...


Could you give a key to the colours in the chart, including the minor admix stuff I see in some of the populations?


Mike Thomas said...

Possible I guess. Although I've never heard of PIE being a substrate for Uralic. Usually the other way round. And how would you explain a Siberian expansion towards Europe? Unless one invokes one of those fanciful out of the Americas scenarios

Kristiina said...

Okunevo (c. 1800 BC) are c. 20% Native American and 30% Siberian + Eskimo, and the rest is EHG and Teal. It is a pity that we do not have their yDNA. I am pretty sure that the oldest layer of Western Siberian Baraba Steppe stuff (4000 BC) is very EHG, Native American and Siberian + Eskimo, and that later periods bring Teal to the area. Similarly, people who lived on the Kola Peninsula in 1500 BC in all probability had a lot of Siberian along with EHG.

My presumption is that the Siberian component differs from Native American stuff in that it has some Han ancestry in it. It has been shown that there is Southeast Asian mtDNA such as F1a in Siberia already 5000 BC (Baikal area).

I think that Native American type people have always existed in Siberia but the so called Siberian/Nganasan component seems to have expanded in Siberia from c. 5000-4000 onwards. Their language(s) certainly had similarities with modern Uralic languages that are spoken in the same area.

DMXX said...

Hi Kristiina,

"Yes, I agree on that. So, we should say that basically Afanasievo are EHG + Central Asia/Teal."

Is this not what Yamnaya itself was? Allentoft et al.'s ADMIXTURE plots show Afanasievo was almost identical to Yamnaya.

Davidski said...


See here...

Mike Thomas said...


Is this Siberian and Amerindian actually found in Uralic speakers, apart from the most eastern ones and Samoyeds?

Kristiina said...

According to my personal Geno2 test, I have 2% Native American and 5% Siberian. According to the admixture chart K=20 of Allentoft et al paper, Estonians have only trace amounts of Siberian, if any. Nenets have c. 70% Siberian and 30% European. The admixture chart of Anzick paper shows that in particular Ob-Ugrics have some Native American ancestry (maybe 10% range, cannot remember exactly).

This chart may give you more information on Siberian ancestry:

However, not all the Siberian ancestry in Finno-Ugrics is from the proto-language period as many groups have important Turkic admixtures, such as Maris. Finns should basically lack this late admixture.

Mike Thomas said...

Ok thanks !
So it doesn't appear to be particularly large west of the Urals, and dates from a much later period than the reconstructed age of proto-Uralic: Mesolithic.

So I'm not sure that I see it being linked to the spread of Uralic in any significant way.

Whatever the case, I'm (personally) going to try to withhold any genetics-based language hypothesis until I see data from Greece, the Hittite core lands in Anatolia, and northern India. Otherwise we're hypothesising on assumptions on assumptions.

Kristiina said...

Mike, if proto-Uralic developed in Volga Ural during the Bronze Age, Siberian component probably had already spread there at that time but it was probably a minor component. If proto-Uralic developed further west and closer to the Baltic area, Siberian component may have been absent. If proto-Uralic developed in Altai, it surely had Siberian component, but in that case their yDNA should rather be R1a or Q, as in the light of ancient yDNA from the Bronze Age Altai, the local yDNA was R1a or Q.

I understand your point, and it is not nice to be disproved by later research. However, I am ready to change my views according to the new evidence.

Nirjhar007 said...

Hi Kristiina! What is the word for winter in Proto-Uralic?

Mike Thomas said...

I think I understand what you're saying now. So you think that Uralic spread in the Bronze Age? I thought its reconstructed lexicon is Mesolithic, at least according to Juha Janhunen.

Kristiina said...

Nirjhar, reconstruction seems to be tälwä as in Finnish talvi, but Eastern Khanty word is teləγ. By way of curiosity, Nivkh word for winter is thulf, but of course, it may just accidentally resemble the Uralic construction.

Kristiina said...

Mike, Jaakko Häkkinen concludes in his remarkable essay on Proto-Uralic that, contrary to what has been claimed (for example Janhunen 2008), the proto-Uralic lexicon, considering its quantity and quality, does not reflect an early stage of development. Instead, the Aryan loanwords, the name for a mixed metal and the agricultural lexicon show that at the beginning of the northern Bronze Age, around 2000 BC, proto-Uralic was still a quite uniform language spoken in a restricted area.

Alberto said...

The other day I was reading a paper by one Anahit Khudaverdyan examining nonmetric craniological traits of Armenians, from early Bronze Age samples to modern ones, and the conclusion was basically continuity. Which is what ancient DNA says too. Today I found another rather fascinating paper by the same author. It's from 2012, and maybe it didn't get much attention because the method isn't widely trusted or whatever. But now we see it matches quite well what ancient DNA is showing, so I would recommend revisiting it:

It takes samples from all over Eurasia and uses both craniological and dental nonmetric traits to compare them. Apart from predicting what Haak et al. showed about Yamnaya and Allentoft et al. about Afanasievo, it has a lot more samples from many other places. The way the results are shown is a bit impractical (built into trees without labels, only the numbers of the sample groups), so it takes a bit of effort to check them. But, as an example, I was lately interested in where the early Kurgan type of burials in the Balkans came from, since they predate Yamnaya by some 500-1000 years. Here they have 2 groups, labeled:

159 Romania Total group (burials with ochre) c. 4500-3500 BC
160 East Romania Total group (burials with ochre) c. 4500-3500 BC

On the first tree (Figure 1), the first group (159) clusters with a group (64) labelled:

64 Central Asia Kapuztepe c. 4000-3000 BC

The second group does not have a twin branch, but the closest branches are:

66 Central Asia Parhai 3000 BC
79 Daghestan Ginchi c. 4000-3000 BC
178 Western Europe Total group (Globular Amphora Culture) c. 3400-2800 BC

On that first tree (Figure 1), the first branch to split (less related to others) is:

93 Ural Mellitamak c. 5000-4000 BC

Which makes sense.

From the dental analysis:

"The Armenian highlands sample (2) exhibit closest affinities to sample from Turkmenia (Gonur-Depe) (figure 5). The Balanovo culture sample from the Volga region (7) are identified as the steppe samples with closest affinities samples from Ukraine (26, Cucuteni-Trypillian culture) in particular. Intersample affinities among samples the Georgia (3) and Ural (16, Timber Grave culture) also show up. The Turkmenia sample (21, painted ceramics culture) and the Bronze Age sample from Ukraine (25, Pit Grave culture) exhibit very close affinities to one another. Analysis has shown close affinities samples from Turkmenia (20, Altyn-Depe) and Altai (17, culture Andronovo)."

While this method is no substitute for ancient DNA, and some results make more sense than others, the overall picture seems quite solid. Worth a look. (If someone does check it and finds interesting matches, please post them; it's hard to check all the numbers one by one.)

Nirjhar007 said...

Thanks Kristiina, so it's not similar to PIE, which is *Wend.
Alberto, very impressive! Thanks a lot.

PF said...

"So it's highly unlikely that any Europeans today are speaking languages derived from those of the Western Hunter-Gatherers or early Neolithic farmers from Central Europe."

I think you meant "the vast majority" instead of "any." In fact this may be a good time to look at the surviving non-IE languages and see whether there are any clues to be gleaned from that info.

Besides Basque, there are the Kartvelian languages (Georgian, etc.) and some Northern Caucasian languages. For a while I've been curious that these regions also show the highest concentrations of G2a anywhere, the dominant haplogroup of Neolithic Europe. The correlation is intriguing.

However, looking at G2a in more detail, one can see distinct clusters which separate the Caucasus from Sardinia/Oetzi/Europe. Perhaps, just before the advent and spread of agriculture in the area, some G2a men split off and settled in the Caucasus (e.g. G2a1), while others went on to figure out farming and colonize Europe (e.g. G2a2). To speculate further, this early Caucasian G2a might be related to the "Teal" we're looking for, either as a drifted form of an original G2a population, or via admixture with another unique/ancient group local to the region.

Taymas said...

Matt and Davidski,

To my untrained eyes, PC3 seems to say the W Asians are really a clump, closer to EEF than is clear in PC1/2, with certain groups pulled toward Bedouin or Yamnaya poles. To put it graphically, for W Asians PC1/2 is a bird's-eye view of a U shape, which thus just appears as a line, no? This seems to add to the BAR100 story: early Near Eastern farmers were closer than moderns to EEF. PC3 also shows that E Iranians, despite overlapping with other W Asians in PC1/2, actually differ by being much closer to Yamnaya. Let me know if I'm out of my mind.

Matt, thank you for your prior, thorough explanation of qpAdm. It was really informative, I only got to read it a few days ago (work/travel).

Matt said...

Taymas: PC3 also shows that E Iranians, despite overlapping with other W Asians in PC1/2, actually differ by being much closer to Yamnaya.

Ah, I didn't even notice that the Eastern Indo-Iranian samples had a different position in the PC3. Based on the PC1 and PC2, I'd expect those to be the 8 Tajik Pomiri samples, and the North Ossetians to still sit with the Caucasus samples? Re: U shape, yes, I think I see what you mean (assuming Jews / Cypriots covered under West Asians).

Looking at the West Asian samples there on PC3, it does look like there is a cline, with some samples who flex very much more towards the Sardinian-Oetzi direction than others.

Also Corded Ware is just underneath where the Slavic samples touch the Baltic samples in this dimension, and very much similar in this dimension, rather than just out beyond them towards Yamnaya. You have to zoom in a lot on the plot to see it.

Overall, I think that taking the other PCs into account shows what directly measured genetic distances show: relative to the other samples, the Middle Eastern and BedouinB samples are a little closer to Yamnaya and a little further from Mediterranean and West Asians than PC1 and PC2 alone imply (PC3), and Mid East and Med are also a little more different from one another and from everyone else than all the previous PCs show (PC4). (Slightly relabelled version of PC2, 3 and 4 plotted against PC1.)

Grey said...

"So Afanasievo is from Europe."

I don't disagree with that personally. I think a population X came onto the steppe somewhere near the Caucasus (from some direction or other) and either sparked PIE directly (if they were R1b themselves) or got into a conflict with the locals which they lost (if they weren't R1b themselves) thus sparking PIE indirectly with the catalyst population then retreating back into the Caucasus and PIE -> Yamnaya -> Afanasievo. My quibble is over where population X originated: Caucasus itself or somewhere among the mountain foothills that run all along the edge of the steppe further east.

Mike Thomas said...


Thank you. Was that paper by Jaska published in a peer-reviewed journal?

Davidski said...


I don't think Basque is a language derived from the languages of early Neolithic farmers. I'd suggest it arrived in Europe from Anatolia with a later wave of Neolithic farmers.

And even though Georgia is often included within Europe, I don't consider it a part of Europe in this context.

Kristiina said...

Mike, Jaska’s article has been published in Journal de la Société Finno-Ougrienne: Häkkinen, Jaakko: Kantauralin ajoitus ja paikannus: perustelut puntarissa. Journal de la Société Finno-Ougrienne, 2009, 92. vsk, s. 9–56.

I go back to your earlier comment ”And how would you explain a Siberian expansion towards Europe? Unless one invokes one of those fanciful out of the Americas scenarios”.

Take a look at Fig 15

At present, the genetic ancestry of the Mesolithic Oleni Ostrov man is available to us: yDNA R1a, mtDNA C1, EHG+NA. His culture was a mixture of western/Baltic Kunda culture and Siberian Butovo culture. I assume that Kunda is R1a, and mtDNA C1 comes from the Butovo culture. It looks like the Nganasan component had not yet arrived in Fennoscandia at that time and the Butovo Siberians were old Native American-like Siberians. I do not claim that they came from America. Probably they had survived somewhere in Ice Age Siberia.

The arrival of the Nganasan component in Fennoscandia is less clear. In all probability, it was present in Bol'shoy Oleni Ostrov (Kola Peninsula) (1500 BC). Of this culture, Sarkissian et al. note that ”A later migration from the East was associated with the spread of the Imiyakhtakhskaya culture from Yakutia (East Siberia) through northwestern Siberia to the Kola Peninsula during the Early Metal Age (3,000–4,000 yBP)”. It would be highly interesting to get their yDNA. My bet is that it is among those numerous lineages that have gone extinct, but maybe not...

However, my Siberian ancestry probably comes from there and is not linked with Volga Ural.

postneo said...

There was gene and technology flow across the Urals, both from Siberia to Europe and in the other direction. In particular, tin bronzes came from the east in the Late Bronze Age (Seima-Turbino). Some ENF and WHG would also have moved east, completely bypassing the Caucasian-influenced Yamnaya region. This can account for the aDNA match between Sintashta and CW, where Yamnaya is an outlier.

Mike Thomas said...

Thanks Kristiina.
I'll have a detailed read of the article.
Is that Oleni Ostrov man from the paper on northern Russia, c. 3-3000 BC, which was mentioned here a while ago?

Kristiina said...

Should be the same guy as there is no other ancient yDNA from North Russia.

The Sarkissian paper reports three samples with mtDNA C1 from Yuzhnyy Oleni Ostrov dated 7,500 uncal. yBP. In Haak et al., the Karelian hunter-gatherer is dated 5500-5000 BCE, so your 3000 BC should not be correct.

Mike Thomas said...

Oh, of course, the Haak guy. Most people just call him "Karelia HG" (sorry for misunderstanding you).
Kristiina, I think we also have aDNA (incl. Y) from around that region (Smolensk) from c. 2 or 3000 BC. R1a and two N1, wasn't it?

Kristiina said...

Yes, that is true, but the Yuzhnyy Oleni Ostrov site is on Lake Onega, deep in Karelia, and the distance to Smolensk, which is close to Belarus, is 1,130 km according to Google Maps. I did not count Smolensk as North Russia.

Unfortunately, we do not know the autosomal composition of Smolensk yDNAs.

German Dziebel said...


"I do not claim that they came from America. Probably they had survived somewhere in Ice Age Siberia"

Why not? If the data supports the former and not the latter.

Fanty said...

"And even though Georgia is often included within Europe"

I wonder why. Possibly because it's a Christian country instead of a Muslim one.
And maybe because it participates in the "European song contest" (so does Israel!).


1. Georgia is not inside the geographic borders officially assigned to the continent of "Europe".

2. On PCA Georgia usually clusters on the SW Asian-W Asian cline and not on the European one. It doesn't even appear as a bridge between the 2 sides like Italy and Greece do.

Krefter said...

"I assume that Kunda is R1a, and mtDNA C1 comes from the Butovo culture."

Both his Y DNA and mtDNA are probably from Siberia. The C1 is most interesting to me. EHG shows some type of relation to East Asians that WHG/ANE lack. His C1 might be connected.

Krefter said...

My bet is.....

Magdalenian and Gravettian genomes will come out very WHG, not proto-WHG or whatever. A very strong WHG-signal in Mesolithic Russia and Neolithic Turkey suggests it's very old.

EHG shared U5 with Mesolithic West Europeans, but one was U5b and the other U5a (with a few exceptions). They had been split for many thousands of years, but had the same WHG-signal.

Martin Clifford Styan said...

Since the new ancient DNA data appeared, I have looked especially at the K12 admixture data, which distinguishes the “Caucasus” and “Gedrosia” components. I have not been able to find a complete set of data, but I have collected data for various prehistoric people from different web sites and combined them with the data for modern populations, already available for several years.

I found that the Yamnaya and Corded Ware people have high levels of the Gedrosia component and only a little Caucasus. (e.g. M020637 Yamnaya Sok River I0443: North European 61%, Gedrosia 27%, Atlantic Med 6%, Caucasus 3%, Siberia 3%). This surprised me because, on the basis of several things I have read, I expected Yamnaya to be connected more with the Balkans, Anatolia and the Caucasus, but if so they would have less Gedrosia, more Caucasus and probably also more Atlantic Med and some South-west Asian.

The new data provides evidence of a migration from the Steppes to Central and Western Europe, which can credibly be identified with the origin of the Germanic and Italo-Celtic branches of Indo-European. However, the Kurgan Hypothesis also proposes migrations from the Steppes to the Balkans, Anatolia, Iran and India. I think the evidence is not so good here.

The Gedrosia component occurs from the British Isles to South India, but reaches its highest levels around the western border of Pakistan. I think Gedrosia in Asia cannot come from Europe, because in Europeans, including the Yamnaya people, Gedrosia is always found mixed with a much higher percentage of North European. In Asia, Gedrosia occurs either with low levels of North European or with none at all. I think Gedrosians must have moved north or north-west across Iran and Central Asia and mixed with North Europeans to form the Yamnaya people.

It is also interesting that while modern Western Europeans tend to have around 10% Gedrosia, modern Eastern Europeans have little or no Gedrosia but significant levels of Caucasus. This also applies to the present population of much of the territories of the Yamnaya and Corded Ware cultures. It seems that the people with a significant amount of Gedrosia passed through Eastern Europe, then disappeared from there, but survived in the West. The modern Balto-Slav populations probably represent a later migration, presumably from the Balkans.

Thus, it appears that the present evidence opposes the Anatolian hypothesis, but only partly supports the Kurgan hypothesis. I am looking forward to further developments.

Davidski said...


The so called Gedrosia component is a cluster created by modern endogamy and drift in the Hindu Kush.

Forcing ancient samples into such clusters is like putting the cart before the horse.

The results from the Admixture analysis I put up above came from a run that at least included the relevant ancient samples, thus they had an important effect on the outcome of the analysis.

Also, the K12 is an old test that suffers from the calculator effect.

Davidski said...

And if you'd like to see evidence of population movements from the steppe into South Asia, here it is...

From this blog entry:

Mike Thomas said...

Will this promised upcoming aDNA from India help resolve which level of K is most realistic, and indeed whether "Gedrosia" actually existed?

Davidski said...

The so called Gedrosia does exist but it's not useful for tracking ancient population movements. It can't be because it wasn't based on any ancient populations. It can only be used to demonstrate that the Pakistani cohorts from the HGDP dataset are the result of extreme endogamy and genetic drift.

It looks like Martin launched into this area straight from Dienekes and GEDmatch, without paying much attention to the studies that published the ancient genomes he was testing.

Ancient DNA from India will be very useful, because it'll show us the structure of Gedrosia and other such clusters. In other words, the proportions of admixture that came from native South Asians, Neolithic farmers and steppe nomads.

Mike Thomas said...

Yes I see
I've had a hunch that some population existed southeast of the Caucasus that might have been "steppe-like". If so, this might go toward explaining the massive admixture rates we're seeing. If not, then those Copper age Eastern Europeans really were supermen :)

Davidski said...

Just take a look at the geographic spread of all of the R1a subclades under the M198 mutation.

Bronze Age Eastern Europe had a massive impact on Eurasia.

Mike Thomas said...

Yes, from an R1a perspective, as current phylogeny holds, it's undeniable.

Kristiina said...

Krefter, you say that "both his Y DNA and mtDNA are probably from Siberia." Do you mean that we see in this Mesolithic Karelian the arrival of R1a1 from Russia to Europe? In any case, he is probably autosomally a mixture between Siberian Butovo and Baltic Kunda cultures, as according to Eurogenes Steppe K10 he has 37% WHG, 48% Steppe, 13% Amerind and 1.5% Siberian. However, it is possible that Kunda were predominantly I2, like the Swedish hunter-gatherers.

I checked my Geno2 results yesterday and they were 4% Northeast Asian and 2% Native American. I assume that that 2% comes from Butovo. If so, my Mesolithic Fennoscandian ancestry could even be in the 25% range. That may not be far out, as U+V account for 28% of mtDNA in Finland and even 34% in the Northeast.

Davidski said...

Karelia HG (and EHG in general) is certainly intermediate between WHG and ANE at some level, but based on qpGraph models this doesn't look to be the result of recent mixture between WHG and ANE. In other words, all indications at this point are that EHG were at least in large part unadmixed.

ADMIXTURE is simply forcing them into a specific model because there aren't enough pure EHG samples in the dataset to create an EHG cluster. That's one of the problems with using ADMIXTURE to infer ancient mixture events.

So what I think Krefter is saying is that Butovo may have formed after the migration of EHG groups from Siberia, bringing both R1a and certain mtDNA C subclades.

It's likely that Kunda was indeed primarily I2, and eventually mixed with the Butovo people, but there's no evidence at this stage that Karelia HG was partly of Kunda origin.

Martin Clifford Styan said...

I would like to comment on the replies to my comment.

I think Davidski’s assessment: “It looks like Martin launched into this area straight from Dienekes and GEDmatch“, is more or less accurate. I have followed Dienekes’ blog for many years. He is now less active than he was a few years ago, and I am disappointed that he has not added analyses of the new ancient autosomal data to his Dodecad blog. I know a lot about history, geography, archaeology and languages, and I have spent a lot of time working out how to identify Y-haplogroups from STRs, but I know rather less about analysis of complex data and the ways of presenting the results. I am mainly interested in the big historical questions, rather than my own individual DNA, which is still untested.

When looking at autosomal data, I find the admixture results giving percentages for a number of different components rather more convincing than the methods that turn the results from an individual or population into a single point on a two-dimensional chart. I prefer to take spreadsheets and look at the actual figures, dividing the individuals or populations into groups according to which component is largest, which come second and third etc.
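As a rough illustration of the kind of spreadsheet sorting described here, the following minimal sketch ranks one sample's components by size and files the sample under its largest component. The only figures used are the K12 values quoted earlier in this thread for Yamnaya Sok River I0443; the function names are hypothetical helpers, not part of any calculator.

```python
from collections import defaultdict

def rank_components(profile, top=3):
    """Return the names of the `top` largest components, biggest first."""
    ordered = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ordered[:top]]

def group_by_largest(samples):
    """Group sample names by whichever component is largest in each."""
    groups = defaultdict(list)
    for sample_name, profile in samples.items():
        groups[rank_components(profile, top=1)[0]].append(sample_name)
    return dict(groups)

# K12 percentages as quoted earlier in this thread for one Yamnaya sample.
samples = {
    "Yamnaya Sok River I0443": {
        "North European": 61,
        "Gedrosia": 27,
        "Atlantic Med": 6,
        "Caucasus": 3,
        "Siberia": 3,
    },
}

top_three = rank_components(samples["Yamnaya Sok River I0443"])
groups = group_by_largest(samples)
print(top_three)
print(groups)
```

With more rows in `samples`, the same two helpers would reproduce the first, second and third component groupings; whether those groupings mean anything historically is exactly what the replies above dispute.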

I am not surprised that there is much to criticize in the data I have used, and I would like to ask if there are some good quality spreadsheets available on the internet giving admixture results combining the new ancient data with data from modern populations.

In addition:
If “the so called Gedrosia component is a cluster created by modern endogamy and drift in the Hindu Kush”, why is this component found very consistently at high levels in all modern Indo-Iranian and Dravidian populations, and at lower levels in most Western Europeans? At the same time, it is consistently absent from various other modern populations, including the Austro-Asiatic groups in India and most Eastern Europeans?

Kristiina said...

Thank you Dave! Fascinating. I never imagined that Butovo would be R1a1. I hope we will soon see which yDNA(s) brought Combed Ware to Europe. N, Q or a surprise yDNA.

Davidski said...


I'd say the reason Gedrosia shows up in some populations and not in others seems to mainly depend on two things: the level of shared ancestry with the Pakistani populations from the HGDP and demographic history.

So, for instance, Western and Eastern Europeans are more or less symmetrically related to the Pakistanis. However, Eastern Europeans have gone through some recent rapid expansions and founder effects of their own, which means they'll have to be differentiated at higher K by the algorithm in some way, and in the K12 this is simply done by pushing up their North Euro component and lowering their Gedrosia.

Different tests deal with this in different ways. It depends on the availability or your choice of test populations and your choice of K. For instance, in this test, which actually has many of the Yamnaya genomes, they don't show membership in the South Central Asian (or Gedrosia-like) cluster, but rather in the Transcaucasian cluster.

I suspect that we'll soon learn that this is indeed correct. In other words, we'll see that an ancient North Caucasian population is in part ancestral to Yamnaya.

Also, for a direct comparison of many of the recently released ancient genomes across several components or K, see here...

And here's an attempt to estimate ancient steppe ancestry in modern West Eurasians...

I'll be posting more Admixture runs in the near future after I become more familiar with the data, and also when more ancient genomes are released.

Have you ever tried using Admixture? If so, you might find this interesting, although if you're going to run the Allentoft et al. genomes it's best IMHO if you only use transversion SNPs.

Martin Clifford Styan said...

Thank you for your detailed reply. You seem to have given links to several things I would like to study in more detail when I have time.

a said...

@Martin Clifford Styan
Look at the plain facts. Many of the oldest Kurgan finds in both Haak et al. and Allentoft et al. share the same yDNA and/or autosomal SNP qualities with the most ancient samples tested to date from various regions of Siberia and the steppe, as shown in the K6 basal run. This includes Ust-Ishim (45k+/-), Ko1 Kostenki (37k+/-) and Ma1 Malta boy (24k+/-). In the case of Malta1, his SNP is even classified as basal R*, perhaps an extinct line. Two ancient Kurgan finds, Rise 548 and Yamnaya I0370, have yielded the same structure in both the R1b Z2103+ SNP phylogenetic tree and autosomal components, shared with the above ancient samples, despite being separated by a distance of 1200+/- km and tested in two separate scientific papers.

Matt said...

When Dodecad's K12b values for West Eurasia are forced onto a 2D MDS (using a method that turns the results from populations into single points on a two-dimensional chart), then it looks something like this:

Very similar to a lot of the conventional PCA of West Eurasia, with some slight shift in the West Asian and European clines from the Caucasus vs Gedrosia distinction.
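For anyone who wants to try this kind of projection themselves, here is a minimal sketch of classical (Torgerson) MDS in NumPy. It is only an assumption that something like this was used for the plot; the population names and percentages are invented placeholders with three components, not real K12b values.

```python
import numpy as np

def classical_mds(X, k=2):
    """Embed the rows of X in k dimensions with classical (Torgerson) MDS."""
    # Squared Euclidean distances between every pair of rows.
    diff = X[:, None, :] - X[None, :, :]
    D2 = (diff ** 2).sum(axis=2)
    # Double-centre the squared distances to get a Gram matrix.
    n = X.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J
    # Keep the k largest eigenvalues and their eigenvectors.
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))

# Invented placeholder admixture rows (percent), NOT real results.
pops = ["pop_a", "pop_b", "pop_c", "pop_d"]
X = np.array([
    [70.0, 20.0, 10.0],
    [50.0, 30.0, 20.0],
    [20.0, 60.0, 20.0],
    [10.0, 30.0, 60.0],
])

coords = classical_mds(X, k=2)
for name, (x, y) in zip(pops, coords):
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

With only three components that sum to 100, the rows already lie in a plane, so the 2D picture loses nothing. With twelve components, two dimensions can only approximate the distances, which is one reason such charts end up resembling a conventional PCA.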

Krefter said...

Map of Pre-Historic carriers of Red hair.

It looks like a lot, but it isn't. 1/5 of people in South Europe, 1/3 in North Europe, and 5/10 in the British isles have at least one of these mutations. Only 4 of these people probably actually had Red hair. Some of the others probably had Red beards though.

Motala Sweden, Sintashta/Andronovo, LN/BA Europe, and Yamnaya/Afanasievo have the highest frequency. Today the highest frequency is in the British Isles and in Finno-Ugric/Turkic Russians (Chuvash, Komi, Udmurt).

Alberto said...


I agree that calculators based on modern populations cannot be taken literally when looking at ancient samples. In the case of the K12 Gedrosia component, it's a bit misleading because it comes from the K7b "West Asian", which splits into Gedrosia and Caucasus, but Caucasus also takes in Southern, which in the end becomes a strange mix.

So I would rather look at the "West Asian" component in K7, which is more straightforward. It actually is composed of ANE (Ancient North Eurasian) and Near Eastern, but there was a population that had these 2 components, so probably the "West Asian" does represent this population to some good degree. (By the way, this "West Asian" population is the same as the "Teal people" referred to above. We called them "teal" because the Haak et al. 2015 paper used this colour in the admixture graph to represent the component.)

I personally agree with what you stated above about Asian "Gedrosia" not coming from Europe (it's rather the other way around, IMO), but that's the big debate at this point, since many people still think there was a mass migration from Europe to South/Central Asia. We need ancient DNA from Asia to really solve the question.

BTW, the Samara_HG belonged to the Elshanskaya culture, IIRC. This paper discusses the origins of it:

Davidski said...

Let's wait for those Maikop genomes.

Rokus said...

'Karelia HG (and EHG in general) is certainly intermediate between WHG and ANE at some level, but based on qpGraph models this doesn't look to be the result of recent mixture between WHG and ANE.'
Definitely ANE must have been old east of the Baltic, anyway much older than 'Siberian' as Kristiina elaborated. This should make us all wonder how dominant this component must have been just before the appearance of Yamnaya/Afanasievo etc. Most likely admixture of WHG started at an early stage, though there are reasons to assume there should have been a strong (cultural) Middle-Neolithic influence - as much as there are reasons to assume that Funnel Beaker was not a genetically homogenized culture fully represented by gok2. From here on ANE can't be considered an expansive component, it should be understood as the genetic backflow of an incorporated substrate. One significant cultural difference between PIE and Uralic cultures must have been the level of contact over a wider area, since Uralic people most of all were self-contained hunters, or not in large-distance contact where this was otherwise. As such, incidental substrates like ANE (not Siberian!) must have travelled well by PIE related gene flow, much better than each of the Uralic components by Uralic gene flow. The ANE that PIE people introduced in the west must have been a tiny bit in comparison with the ANE that used to be around in the east, and where the component apparently diminished significantly by immigration.
BTW, I gather the Neolithic genetic components in PIE must have been equally hard to avoid as ANE. Indeed, unless PIE was originally just a tiny subset of the Neolithic culture, I don't think the Neolithic component can be considered anything but an incorporated (part of a) substrate anywhere.

Mike Thomas said...

There is a growing consensus, at least amongst Georgian archaeologists, that the earliest kurgans in Georgia represent a pre-Kura-Araxes phase of Ubaid/pre-Uruk expansion into the Caucasus, which then also (slightly later) reached the North Caucasus (i.e. Majkop). In their view, this was nothing short of a significant colonization by Mesopotamian colonists. In this regard, then, "Teal" might indeed be what Dave described: the fusion of Near Eastern and ANE groups.

"Teal's" presence south of the Caucasus could represent a reflux back from the steppe, unless ANE was already present in the pre-Bronze Age South Caspian, as Alberto suggests.

a said...

K6 is a nice concise test. Parsing the data from Samara H.G. IMO looks like Mamonov might have been on to something.
"One of the most controversial questions is the periodisation of the process of Neolithisation. Mamonov (2000.158) takes the 14C dates of bivalve shells found in the occupation debris of Chekalino IV, Ilyinskaya and Lebyazhinka IV sites from c. 8600 to 7940 BP to show that Elshanka culture was autochthonous. He suggests that Elshanka pottery was formed in the Povolzhye forest-steppe because “there is no chronological possibility of a substratum or cultural centre from which the ceramic tradition could be borrowed” (Mamonov 2006a.274). The supporters of the Balkan origins of Elshanka type sites oppose such early dates. They point to the natural occurrence of shells in the layers (Viskalin 2006), and consider the Balkan-Carpathian analogies that date these sites to the 6th and the begin"

In contrast "Researchers have thus suggested sources in Asia Minor for the Early Neolithic cultures in the steppes of European Russia and Ukraine (Danilenko 1969)"
Samara H.G. K6=Samara_HG I0124 0.00001 0.00001 European-.807905
Oceanian-0.015214 0.00001

a said...

When DNA/stats on Maykop are released, maybe a comparison can also be done using K6 by comparing to modern Georgians.
For example, in K6, Georgians are roughly 70+/-% Middle Eastern-25%+/-

Georgian mg49 0.731425 0.00001 0.241437 0.00001 0.00001 0.027108
Georgian mg43 0.699782 0.00001 0.271694 0.00001 0.009702 0.018802
Georgian mg47 0.71141 0.00001 0.261468 0.00001 0.009472 0.017629
Georgian mg22 0.73271 0.00001 0.250135 0.00001 0.001955 0.01518
Georgian mg27 0.730982 0.00001 0.236263 0.00136 0.017816 0.013568
Georgian mg23 0.738832 0.00001 0.245283 0.00001 0.00267 0.013195
Georgian mg62 0.727563 0.00001 0.259655 0.00001 0.00001 0.012752
Georgian mg31 0.709616 0.00001 0.273842 0.005767 0.00001 0.010755
Georgian mg40 0.73718 0.00001 0.241467 0.00001 0.010781 0.010552
Georgian mg34 0.694608 0.00001 0.296314 0.00001 0.009048 0.00001
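
As a quick check on the "roughly 70%" figure, the per-component averages of the ten rows above can be computed with a short script. This is only a sketch: the rows are copied from the table as posted, the name `column_means` is made up for illustration, and the components are numbered 1 to 6 because the column labels aren't given here.

```python
from statistics import mean

# K6 rows for the ten Georgian samples, as posted above.
K6_ROWS = """\
Georgian mg49 0.731425 0.00001 0.241437 0.00001 0.00001 0.027108
Georgian mg43 0.699782 0.00001 0.271694 0.00001 0.009702 0.018802
Georgian mg47 0.71141 0.00001 0.261468 0.00001 0.009472 0.017629
Georgian mg22 0.73271 0.00001 0.250135 0.00001 0.001955 0.01518
Georgian mg27 0.730982 0.00001 0.236263 0.00136 0.017816 0.013568
Georgian mg23 0.738832 0.00001 0.245283 0.00001 0.00267 0.013195
Georgian mg62 0.727563 0.00001 0.259655 0.00001 0.00001 0.012752
Georgian mg31 0.709616 0.00001 0.273842 0.005767 0.00001 0.010755
Georgian mg40 0.73718 0.00001 0.241467 0.00001 0.010781 0.010552
Georgian mg34 0.694608 0.00001 0.296314 0.00001 0.009048 0.00001
"""

def column_means(text):
    """Average each numeric column; the first two tokens per row are labels."""
    values = [[float(v) for v in line.split()[2:]]
              for line in text.strip().splitlines()]
    return [mean(col) for col in zip(*values)]

means = column_means(K6_ROWS)
for i, m in enumerate(means, start=1):
    print(f"component {i}: {m:.4f}")
```

The first and third components come out at about 0.72 and 0.26 on average, with the other four close to zero.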

PF said...

I wouldn't want to argue whether Georgia or the rest of the Caucasus are in Europe or not. Let's say they're on the border. ;-)

More pertinently, I was wondering out loud if it's possible that some of the old G2a seen in the Caucasus could be related to the Near Eastern input into "Teal," along with some of the non-IE language survival as well.

Davidski said...

Maybe, but all Caucasians show EEF-related ancestry, presumably from Anatolia, while Yamnaya lack it, and they also lack G2a.

So "teal" may have moved onto the steppe from the North Caucasus before both of these markers spread out across the Caucasus.

Grey said...


"Map of Pre-Historic carriers of Red hair."

Cool - pretty sure personally that the red hair gene will be a significant clue to something or other - not entirely sure what it will turn out to be but something.

Mike Thomas said...

Copper miners who boomeranged their lactase persistence ??

Unknown said...

I'm not an expert in genetics, but as far as I know the oldest subclades of Y-DNA haplogroups R1b and R1a are found between Iran, Turkmenistan, Afghanistan and Northern India. Therefore I would assume that Indo-European (or proto-Indo-European) languages originated in this region.

The male populations carrying these genes may have spread to the Caucasus and Anatolia via Iran on one hand and on the other to Eastern Europe and then Western Europe. This would explain why most people in the Caucasus speak Iranic languages as opposed to those living in Eastern and Central Europe.

It's also possible that Indo-European languages were introduced in Anatolia via Eastern Europe by people related to the ancient Kurgan culture.

Davidski said...

It makes no difference where the "oldest" subclades of R1a and R1b are found today if they weren't found there 3,000 years ago.

Eastern European hunter-gatherers carried R1a and R1b, so let's see if Central Asian hunter-gatherers did too. But I seriously doubt it.

Also, 99% of the R1a in the world today is R1a-M417, which certainly expanded from Eastern Europe during the Copper Age with the Corded Ware and derived cultures. Almost all of Asian R1a is a subset of this R1a.

Unknown said...

On what basis do you say those subclades didn't originate in that region (Iran, Turkmenistan, Afghanistan and Northern India)?

Is there archaeogenetic evidence to prove it?

I would guess it's also possible that R1a-M417 may have originated somewhere between Eastern Europe and the region mentioned above.

Davidski said...

Have a close look at the phylogenetic structure of R1a-M417. All of the main branches are found in Europe: R1a-CTS4385, R1a-Z282 and R1a-Z93. The first one is restricted to NW Europe.

On the other hand, 99% of the R1a in Asia is R1a-Z93, but the most basal subclades are found in Poland and Russia.

R1b is more complex. But we have genomes of Eastern European Hunter-Gatherers carrying R1a and R1b that don't show any Near Eastern or South Asian admixture. They're purely European/Siberian.

So it looks like R1 fanned out from Siberia into Europe during the Mesolithic, and then expanded from the Eastern European steppe to Western Europe and Asia during the Bronze Age. Rare subclades of R1a and R1b survived in the Near East because the steppe has been a major migration highway, but they're not from the Near East.

The trail of R1a to India now matches linguistics and archaeology very well. Corded Ware R1a-M417 > Sintashta R1a-Z93 > Andronovo R1a-Z93 > Indo-Iranians R1a-Z93 > Indo-Aryans R1a-Z93.

The R1a-Z93 Sintashta and Andronovo genomes we have are northern European, with some Siberian admixture among the latter.

Grey said...

"Copper miners who boomeranged their lactase persistence ??"

Not so much LP except in the west but yes *if* copper working is somehow associated with R1b from the Urals then it might correlate with MC1R as well.

If correct it should show up in populations like the Newars in Nepal who are c. 10% R1b. They have a caste system so if correct then you might expect their copper worker caste (surname Tamrakar) to have a higher frequency of R1b and MC1R.

If they didn't then the idea would be wrong.
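
A rough sketch of how that prediction could be checked once someone has counts: a one-sided Fisher exact test on a 2x2 table of R1b carriers versus non-carriers, copper worker caste versus everyone else. The counts below are invented purely for illustration, the function name `fisher_one_sided` is made up, and the same table could be built for MC1R variants.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b = carriers and non-carriers in the group of interest
    c, d = carriers and non-carriers in everyone else
    Returns the probability of seeing at least `a` carriers in the group
    of interest if carriers were spread across both groups at random.
    """
    group, carriers, total = a + b, a + c, a + b + c + d
    upper = min(group, carriers)
    ways = comb(total, group)
    return sum(
        comb(carriers, k) * comb(total - carriers, group - k)
        for k in range(a, upper + 1)
    ) / ways

# Invented counts: 8 of 40 copper workers vs 10 of 160 others carry R1b.
p = fisher_one_sided(8, 32, 10, 150)
print(f"one-sided p-value: {p:.4f}")
```

A small p-value would be consistent with the caste having a higher R1b frequency; a large one would count against the idea, as said above.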