Saturday, March 8, 2014
Ancient North Eurasian (ANE) admixture across Asia
Update 4/9/2014: Ancient North Eurasian (ANE) admixture across Europe & Asia
Studies of ancient genomes usually feature unsupervised analyses with the ADMIXTURE software. These are very informative, but only if interpreted in the right context and with caution, because they attempt to fit the ancient samples, often thousands of years old, into ancestral clusters mostly derived from present-day populations. That's like putting the cart before the horse.
So I thought I'd try a different approach, in the hope of achieving more straightforward results, and run ADMIXTURE in supervised mode, with the 24,000-year-old MA-1 or Mal'ta boy genome from South Siberia as one of the reference samples. After a lot of tweaking of the dataset, the experiment seems to have worked, because the cluster created from the ancient genome is basically identical to the MA-1-derived Ancient North Eurasian (ANE) component recently described in the Lazaridis et al. preprint.
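For anyone wanting to try something similar: supervised mode in ADMIXTURE expects a .pop file next to the genotype file, with one line per individual in the same order, holding a population label for each reference sample and "-" for each sample whose ancestry is to be estimated. Below is a minimal Python sketch of building those lines from a PLINK .fam file. The sample IDs and the `ref_labels` mapping are made-up placeholders, not the dataset used in this post.

```python
def build_pop_lines(fam_lines, ref_labels):
    """Return one .pop entry per individual, in .fam order.

    fam_lines:  lines of a PLINK .fam file (family ID, individual ID, ...).
    ref_labels: hypothetical mapping from individual ID to reference cluster.
    Individuals missing from ref_labels get "-", meaning their ancestry
    proportions are left for ADMIXTURE to estimate.
    """
    pop_lines = []
    for line in fam_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        individual_id = fields[1]
        pop_lines.append(ref_labels.get(individual_id, "-"))
    return pop_lines


# Toy input with invented sample IDs (assumption, for illustration only)
fam = [
    "MA1 MA1 0 0 1 -9",
    "Han Han_1 0 0 1 -9",
    "Karitiana Karitiana_1 0 0 1 -9",
]
refs = {"MA1": "ANE", "Han_1": "ENA"}

pop = build_pop_lines(fam, refs)
print("\n".join(pop))
```

The resulting lines would be written to a .pop file sharing the prefix of the .bed file, and ADMIXTURE would then be run with its --supervised flag and K equal to the number of distinct reference labels (four in this test).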
Note also that the ANE in my analysis peaks among the Karitiana Indians at around 43%. This is very much in line with a TreeMix graph in Raghavan et al., which shows a Karitiana individual with 41.6% (plus or minus 3.4%) admixture from a clade ancestral to MA-1, or a range of 38.2% to 45.0% (see image here).
Nevertheless, there are clearly some issues with this test. For instance, many South Asians show unexpectedly high levels of Sub-Saharan admixture (in particular, the Austroasiatic samples from India score around 6-7%, which has never been reported before). I'd say this is because they carry genetic variation indigenous to South Asia that doesn't fit well into any of the four ancestral components. The Eastern non-African (ENA) cluster, based on Han Chinese samples, captures most of this diversity, but some of it appears to be siphoned off into the other three clusters. I think the only way to really solve this problem is to include pre-Neolithic genomes from South Asia in the analysis.
By the way, I used 53K SNPs covered at a read depth of 2x or more in the MA-1 genome, but varying the minimum read depth from 1x to 3x doesn't change the results very much.
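To make the read depth cutoff concrete, here's a rough Python sketch of that kind of filter. It assumes per-SNP read depths for the ancient genome have already been tabulated; the SNP names and depths are invented, and this is not the actual pipeline behind the 53K set.

```python
def filter_by_depth(snp_depths, min_depth):
    """Return the SNP IDs whose read depth is at least min_depth."""
    return [snp for snp, depth in snp_depths.items() if depth >= min_depth]


# Invented depths for five placeholder SNPs (assumption, for illustration only)
depths = {"snp_a": 1, "snp_b": 2, "snp_c": 3, "snp_d": 0, "snp_e": 5}

# Build one SNP list per threshold, to compare runs at 1x, 2x and 3x
kept = {d: filter_by_depth(depths, d) for d in (1, 2, 3)}
for d, snps in kept.items():
    print(f"min depth {d}x: {len(snps)} SNPs")
```

A stricter cutoff leaves fewer but more reliable markers, so rerunning the analysis on each list is a simple way to check that the ancestry proportions are stable.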
Key: red = Ancient North Eurasian (ANE); green = Middle Eastern (ME); aqua = Eastern non-African (ENA); purple = Sub-Saharan African (SSA).
ANE K=4 ADMIXTURE Test spreadsheet
Raghavan et al., Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans, Nature, published online 20 November 2013, doi:10.1038/nature12736
Iosif Lazaridis, Nick Patterson, Alissa Mittnik, et al., Ancient human genomes suggest three ancestral populations for present-day Europeans, bioRxiv, posted 23 December 2013, doi:10.1101/001552
First genome of an Upper Paleolithic human
Ancient human genomes suggest (more than) three ancestral populations for present-day Europeans