Friday, March 13, 2015
Yamnaya-related ancestry proportions in Europe and west Asia
Here's a quick and dirty attempt to flush out a Yamnaya-specific ancestral component with the ADMIXTURE software and a few Yamnaya genomes from the recent Haak et al. paper: K6 spreadsheet.
Obviously, we'll need many more ancient samples from the vast Yamnaya horizon to be able to estimate direct Yamnaya ancestry in modern populations with any great confidence. But I'd say this looks like a very reasonable attempt, with more or less comparable results to those published by Haak et al. (for instance, see Figure 3 from the study here).
Please note that this wasn't a supervised run. In other words, I didn't mark the Yamnaya genomes as reference samples with the aim of creating a cluster from them.
However, I initially excluded all individuals from northeastern Europe, the north Caucasus and South Asia from the analysis. I did this because samples from these regions have a peculiar habit of creating very robust clusters in ADMIXTURE, which is useful when looking at recent variation and wanting low cross-validation errors, but not so great when trying to resurrect genetic components from the depths of prehistory.
Once I had a dataset that was forcing the algorithm to focus its attention on the ancient genomes and producing consistent results, I tested the problem samples in batches of 5-10, thus making sure they didn't skew the analysis.
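The hold-out-and-batch procedure described above can be sketched roughly as follows. This is an illustration only, not the actual workflow behind the spreadsheet: the (population, individual) sample layout, the population labels and all function names are hypothetical, and the ADMIXTURE runs themselves are left out.

```python
from typing import List, Sequence, Set, Tuple

# Hypothetical sample record: (population label, individual ID).
Sample = Tuple[str, str]


def split_core_and_held_out(
    samples: Sequence[Sample], held_out_pops: Set[str]
) -> Tuple[List[Sample], List[Sample]]:
    """Separate the core dataset from samples in the held-out populations."""
    core = [s for s in samples if s[0] not in held_out_pops]
    held = [s for s in samples if s[0] in held_out_pops]
    return core, held


def batches(items: Sequence[Sample], max_size: int = 10) -> List[List[Sample]]:
    """Cut items into consecutive batches of at most max_size samples."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return [list(items[i:i + max_size]) for i in range(0, len(items), max_size)]


def build_runs(
    samples: Sequence[Sample], held_out_pops: Set[str], max_size: int = 10
) -> List[List[Sample]]:
    """Return one sample list per run: the full core plus one held-out batch."""
    core, held = split_core_and_held_out(samples, held_out_pops)
    return [core + batch for batch in batches(held, max_size)]
```

Each list returned by build_runs could then be written out as a keep file (for example for PLINK's --keep option) and analyzed in its own unsupervised run, so that only a handful of the cluster-forming samples is present at any one time.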
Interestingly, the Yamnaya-specific component peaks in Udmurts, who live close to where the Yamnaya samples were collected. This can hardly be a coincidence.
In any case, I'm hoping to look at this issue in more detail soon with the help of qpAdm, a program recently released with the updated ADMIXTOOLS package (see here). qpAdm is based on f4 statistics and is specifically designed for analyzing ancient admixture events.
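For readers unfamiliar with f4 statistics, here is a minimal sketch of the basic quantity: the average over SNPs of (pA - pB) * (pC - pD), where each p is a population's allele frequency at that SNP. This is not qpAdm and not part of ADMIXTOOLS; the function name and the plain-list input format are assumptions made for illustration, and real analyses also need standard errors, which this ignores.

```python
from typing import Sequence


def f4(
    p_a: Sequence[float],
    p_b: Sequence[float],
    p_c: Sequence[float],
    p_d: Sequence[float],
) -> float:
    """Average of (pA - pB) * (pC - pD) across SNPs.

    Each argument holds one population's allele frequencies,
    with the SNPs in the same order in all four sequences.
    """
    n = len(p_a)
    if n == 0:
        raise ValueError("at least one SNP is required")
    if not (len(p_b) == n and len(p_c) == n and len(p_d) == n):
        raise ValueError("all populations must cover the same SNPs")
    total = sum(
        (a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)
    )
    return total / n
```

If A and B have identical allele frequencies at every SNP, the statistic is exactly zero. More generally, values near zero suggest that the frequency differences between A and B are uncorrelated with those between C and D.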
Haak et al., Massive migration from the steppe was a source for Indo-European languages in Europe, Nature, Advance online publication, doi:10.1038/nature14317