
Wednesday, August 16, 2017

A homeland, but not the homeland #2

Back in May, in a post titled A homeland but not the homeland, I said this:

It seems increasingly likely that ancient DNA has identified a massive expansion, or a series of expansions, from Mesopotamia and/or surrounds in basically all directions dating to the Chalcolithic (ChL) and Bronze Age (BA). This phenomenon is mainly characterized by the simultaneous spread of:

- Iran_ChL-related genome-wide ancestry

- Y-haplogroup J

- South Caspian-specific mitochondrial haplogroups such as R2 and U7

In the same post I also included a list of ancient populations that showed at least two of these characteristics. I can now add two more populations to this list: the Minoans and Mycenaeans.

- Anatolia_BA, Western Turkey, 2836-1800 calBCE (Lazaridis et al. 2017)

- Egyptian mummies, Middle Egypt, 776-2 calBCE (Schuenemann et al. 2017)

- Iran_ChL, Western Iran, 4839-3796 calBCE (Lazaridis et al. 2016)

- Levant_BA, Northwestern Jordan, 2489-1966 calBCE (Lazaridis et al. 2016)

- Minoans, Crete, Greece, 2900-1700 BCE (Lazaridis et al. 2017)

- Mycenaeans, Greece, 1700-1200 BCE (Lazaridis et al. 2017)

- Sidon_BA, Southern Lebanon, 1750-1600 BCE (Haber et al. 2017)

Out of all of these groups, only the Mycenaeans are generally accepted to have been speakers of an Indo-European language. However, they differ from the others in that they harbor minor but significant ancestry from a source, or multiple sources, closely related to Yamnaya, Sintashta and other Bronze Age peoples of the Pontic-Caspian steppe (see here).

Possible question for the discussion in the comments: what does this say about where the Mycenaeans got their Indo-European language? Also, who wants to bet that Bronze Age samples from the Indus Valley Civilization will also make it onto my list?

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Monday, August 14, 2017

CHG or no CHG in Bronze Age western Iberia?

Here's what Martiniano et al. had to say recently about the genetic shifts that they saw in their ancient DNA data from what is now Portugal, western Iberia, during the Bronze Age:

A recurring feature of ADMIXTURE analyses of ancient northern Europeans is the appearance and subsequent dissemination within the Bronze Age of a component (teal) that is earliest identified in our dataset in HGs from the Caucasus (CHG). Unlike contemporaries elsewhere (but similarly to earlier Hungarian BA), Portuguese BA individuals show no signal of this component, although a slight but discernible increase in European HG ancestry (red component) is apparent. D-Statistic tests would suggest this increase is associated not with Western HG ancestry, but instead reveal significant introgression from several steppe populations into the Portuguese BA relative to the preceding LNCA (S4 Text, S6 Table).


In the present analysis, fineSTRUCTURE has identified the 3 Portuguese Bronze Age individuals as a genetically distinct population (S23 Fig). When compared to Central or Northern European populations such as Ireland [11], the degree of discontinuity between the Neolithic and Bronze Age in Portugal is not pronounced. However, despite the small sample size we have evidence suggesting complete discontinuity at the level of Y-chromosome lineages with all 3 male Bronze Age samples presenting derived alleles at marker M269.

Although in ADMIXTURE analysis we were not able to observe the presence of the CHG-related cluster in the ancestry proportions of the Portuguese Bronze Age samples, with D(Mbuti, X; Portuguese MN/LNCA, Portuguese BA) we find support for CHG/Yamnaya related introgression and also an increase in EHG [Eastern European Hunter-Gatherer] ancestry.

Despite the authors' conclusion that steppe-related admixture was present in their Portuguese BA samples, the ambiguity created by their ADMIXTURE analysis encouraged some heated debates in the comments at this blog and elsewhere about whether their findings were legitimate, and also suggestions that the Portuguese BA R1b-M269 Y-chromosomes were not derived from the steppe.

To try to put this debate to bed, at least on this blog, let's run the same samples with the qpAdm mixture modeling algorithm. I don't want to get into the details here about the difference between ADMIXTURE and qpAdm, because I don't feel it's something that I can explain accurately. But suffice it to say that qpAdm is a more direct way of estimating ancestry proportions, so, in my experience, it's less likely to lose minor but significant admixture signals in a carefully designed analysis.

First up, I need to test whether these Portuguese BA (Portugal_BA) individuals can be modeled as a two-way mixture between EHG and Portuguese Late Neolithic farmers (Portugal_LN).

EHG 0.093±0.036
Portugal_LN 0.907±0.036
P-value 0.0102798873
chisq 20.015
Full output

Nope, they can't. But what happens if I add CHG to the model?

CHG 0.106±0.048
EHG 0.042±0.042
Portugal_LN 0.852±0.042
P-value 0.0367007784
chisq 14.946
Full output

The statistical fit improves, but it's still lousy, which perhaps suggests that I need a temporally more proximate CHG-related reference sample. How about Yamnaya?

Portugal_LN 0.849±0.045
Yamnaya_Samara 0.151±0.045
P-value 0.0725988319
chisq 14.371
Full output

That's not too bad. But let's try a more proximate Yamnaya-related population: Bell Beakers from Germany. Note that some of these Beakers belong to Y-haplogroup R1b-M269(P312+), which is the most common Y-chromosome lineage among present-day Iberians.

Bell_Beaker_Germany 0.328±0.089
Portugal_LN 0.672±0.089
P-value 0.109643502
chisq 13.065
Full output

Somewhat better, and we could probably keep going like this, improving the fits each time, with more relevant reference samples if they were available, like, say, late Beakers from what is now France. I suspect also that using more westerly Hunter-Gatherers than EHG, perhaps from what is now Ukraine, might significantly improve the second model. In any case, my qpAdm analysis provides strong evidence that, unlike Portugal_LN, Portugal_BA harbored CHG-related ancestry that was probably mediated via Yamnaya- and Beaker-related groups.
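For anyone who wants to sanity-check the P-values listed above: a qpAdm P-value is the upper-tail probability of the chisq value under a chi-square distribution. The sketch below recomputes that tail using only Python's standard library. Note that the degrees of freedom are not given in the summaries above, so the values used here (8 for the two-way models, 7 for the three-way model) are assumptions, chosen because they reproduce the reported P-values. The full output files are the place to confirm them.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer dof."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) times a finite sum of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, dof // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd dof: an erfc term plus a finite sum over half-integer powers
    total = 0.0
    for j in range(1, (dof - 1) // 2 + 1):
        total += half ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# (model, chisq, ASSUMED degrees of freedom, P-value reported in the post)
models = [
    ("EHG + Portugal_LN", 20.015, 8, 0.0102798873),
    ("CHG + EHG + Portugal_LN", 14.946, 7, 0.0367007784),
    ("Portugal_LN + Yamnaya_Samara", 14.371, 8, 0.0725988319),
    ("Bell_Beaker_Germany + Portugal_LN", 13.065, 8, 0.109643502),
]

for name, chisq, dof, reported in models:
    print(f"{name}: recomputed {chi2_sf(chisq, dof):.4f}, reported {reported:.4f}")
```

Read this way, a lower chisq at the same degrees of freedom means a higher P-value, which is why the Beaker model counts as the best fit of the four.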


Martiniano R, Cassidy LM, Ó'Maoldúin R, McLaughlin R, Silva NM, Manco L, et al. (2017) The population genomics of archaeological transition in west Iberia: Investigation of ancient substructure using imputation and haplotype-based methods. PLoS Genet 13(7): e1006852.

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Steppe admixture in Mycenaeans, lots of Caucasus admixture already in Minoans (Lazaridis et al. 2017)

Saturday, August 12, 2017

The Iron Age Iranian (?)

After the recent publication of Bronze Age genomes from present-day Greece and Portugal, you'd have to be a desperate fool not to accept that the Pontic-Caspian steppe in Eastern Europe is the most likely homeland of all surviving branches of the Indo-European language family. I don't want to say I told you so, but, well, I told you so (see here).

Yes, we're still waiting for those ancient genomes from South Asia. But don't expect any surprises when they do arrive, probably in a couple of months. Indeed, if you've still got a thing for the Out-of-India Theory (OIT), then it might be time to start looking around for a different hobby than following ancient DNA results. My advice is try meditation.

Thus, pending the sequencing of Hittite and other bona fide early Indo-European genomes from Bronze Age Anatolia, which should be able to help pinpoint the Proto-Indo-European (as opposed to just the Late Proto-Indo-European) Urheimat to the satisfaction of most, I suggest that we shift focus in the comments here in a big way, and, instead of wasting time arguing whether the early Indo-European expansions from the steppe happened, we get stuck into the details of how they happened.

Worthy subjects of discussion in this context, I'd say, are a couple of intriguing ancient West Asian individuals whose genotypes are now available for download at the Reich Lab website: Kumtepe4 from Chalcolithic Anatolia and F38 from an Iron Age burial at Tepe Hasanlu in what is now Iran.

Let's start with F38, whose genome was originally published back in 2016 as part of Broushaki et al. (see here):

Furthermore, our male Iron Age genome (F38; 971-832 BCE; sequenced to 1.9x) from Tepe Hasanlu in NW-Iran shares greatest similarity with Kumtepe6 (fig. S21) even when compared to Neolithic Iranians (table S20). We inferred additional non-Iranian or non-Anatolian ancestry in F38 from sources such as European Neolithics and even post-Neolithic Steppe populations (table S20). Consistent with this, F38 carried a N1a sub-clade mtDNA, which is common in early European and NW-Anatolian farmers (3). In contrast, his Y-chromosome belongs to sub-haplogroup R1b1a2a2, also found in five Yamnaya individuals (17) and in two individuals from the Poltavka culture (3). These patterns indicate that post-Neolithic homogenization in SW-Asia involved substantial bidirectional gene flow between the East and West of the region, as well as possible gene flow from the Steppe.

In other words, it's almost certain that F38 had recent ancestry from elsewhere than the South Caspian region, and probably from the Pontic-Caspian steppe.

However, interestingly, when F38 was alive, Tepe Hasanlu was more likely to have been an ethnically Hurrian or Urartian site, rather than an Iranian one, and the Iron Age settlement there has a fascinating and tragic final story (see here).

Also, F38 shows a great deal of genetic similarity to three Early Bronze Age (EBA) samples from Kura-Araxes culture burials in what is now Armenia (labeled together as Armenia_EBA). Indeed, one of these Kura-Araxes individuals belongs to Y-haplogroup R1b, albeit to a different subclade than F38. Moreover, Kura-Araxes people are hypothesized to have been early speakers of Hurro-Urartian languages.

This is where Armenia_EBA and F38 cluster in my Principal Component Analysis (PCA) of ancient and present-day West Eurasian populations.

Like four peas in a pod, right? Not necessarily, because this outcome might be a simple coincidence. And, in fact, that's what my qpAdm analysis suggests. Using no fewer than 16 ancient outgroups, I found that the models below produced the best fits. Obviously, Anatolia_BA stands for Anatolia Bronze Age, CHG for Caucasus Hunter-Gatherer, Iran_ChL for Iran Chalcolithic, and Tepecik_Ciftlik_N for Tepecik Ciftlik Neolithic.

Iran_IA F38 (2-way)
Iran_ChL 0.815±0.066
Poltavka_outlier 0.185±0.066
P-value 0.72807065
chisq 10.457
Full output

Iran_IA F38 (3-way)
Anatolia_BA 0.122±0.107
Iran_ChL 0.717±0.098
Poltavka_outlier 0.161±0.070
P-value 0.773758066
chisq 8.989
Full output

Armenia_EBA (2-way)
CHG 0.582±0.042
Tepecik_Ciftlik_N 0.418±0.042
P-value 0.817374811
chisq 9.210
Full output

Admittedly, a more systematic and exhaustive search might be able to dig up even better fitting models and show that F38 does share recent ancestry with Armenia_EBA. But in any case, after running these tests, I'm now certain that F38 had significant admixture from the European steppe, probably via a population very similar to Poltavka_outlier.
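One quick way to read the ± figures in the models above is to divide each mixture coefficient by its standard error. The sketch below does just that for the two F38 models. Treating the ratio as an approximate Z-score is only a rule of thumb and an assumption on my part here, not something qpAdm reports in this form, and the function name is purely illustrative.

```python
def z_score(coefficient, std_error):
    """Ratio of a mixture coefficient to its standard error (rough Z-score)."""
    return coefficient / std_error


# Coefficients and standard errors copied from the F38 models above
f38_models = {
    "2-way": {
        "Iran_ChL": (0.815, 0.066),
        "Poltavka_outlier": (0.185, 0.066),
    },
    "3-way": {
        "Anatolia_BA": (0.122, 0.107),
        "Iran_ChL": (0.717, 0.098),
        "Poltavka_outlier": (0.161, 0.070),
    },
}

for model, sources in f38_models.items():
    for source, (coef, se) in sources.items():
        print(f"{model} {source}: {coef:.3f} / {se:.3f} = {z_score(coef, se):.2f}")
```

On this reading the Poltavka_outlier coefficient sits more than two standard errors above zero in both models, while the Anatolia_BA coefficient in the 3-way model is barely more than one standard error from zero.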

On the other hand, I'd say that if Armenia_EBA had any steppe ancestry, then it's only a few per cent, and likely from a less northern-shifted source than Poltavka_outlier. This is what the 2-way models look like on the same PCA as above. Armenia_EBA and F38: so similar, yet potentially so different.

F38's probable steppe connection, of course, suggests that he was at least partly of Indo-European origin, and possibly a speaker of an Iranic language, because the Poltavka culture has been associated by some scholars with early Indo-Iranians.

Unfortunately, I don't have a decent enough diploid version of F38's genome to test his fine scale genetic affinities with a haplotype analysis. So I'd say that the most useful thing I can do, that wasn't already done in Broushaki et al., is to run an Identical-by-State (IBS) affinity test. This method is generally pretty good at picking up recent ethnic-specific genetic drift. These are F38's top 25 matches out of over 100 present-day populations:

Georgian 0.676468
Armenian 0.676024
Abkhasian 0.675791
Iranian_Jew 0.675418
Iraqi_Jew 0.675224
Lezgin 0.675124
Cypriot 0.674942
Greek 0.674824
Kurdish 0.674795
Uzbek_Jew 0.674770
Azeri_Jew 0.674701
Greek_Macedonia 0.674700
Italian_South 0.674556
Kosovar 0.674489
Chechen 0.674463
Sicilian_East 0.674334
Turkish 0.674315
Sicilian_West 0.674247
Sephardic_Jew 0.674198
North_Ossetian 0.674125
Kumyk 0.674045
Romanian 0.674017
Greek_Peloponnese 0.673945
Iranian 0.673911
Yemenite_Jew 0.673875
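For readers unfamiliar with scores like the ones listed above, here's a minimal sketch of how a pairwise IBS similarity can be computed. It assumes genotypes coded as 0, 1 or 2 copies of a reference allele, with None for missing calls, and it averages over the sites called in both individuals. The exact software, and the way individual scores were averaged into population scores, aren't specified in this post, so the coding scheme and the function name are assumptions for illustration only.

```python
def ibs_similarity(genotypes_a, genotypes_b):
    """Mean proportion of alleles shared identical-by-state across called sites.

    Genotypes are allele counts (0, 1 or 2); None marks a missing call.
    Per site, the score is 1.0 (same genotype), 0.5 (one allele differs)
    or 0.0 (opposite homozygotes).
    """
    shared = 0.0
    sites = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip sites missing in either individual
        shared += 1.0 - abs(a - b) / 2.0
        sites += 1
    if sites == 0:
        raise ValueError("no overlapping genotype calls")
    return shared / sites


# Toy genotypes, made up for illustration (not real data)
ancient = [0, 1, 2, 2, None, 1]
modern = [0, 1, 2, 0, 1, 2]
print(ibs_similarity(ancient, modern))  # five shared sites are scored
```

Because most common variants are shared across all West Eurasians, scores computed this way fall in a narrow range, which is consistent with the top 25 values above differing only in the third decimal place.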

The top three hits are from the Caucasus, which I suspect is due to F38's high ratio of CHG-related ancestry. Iranian and Iraqi Jews are both in the top five, probably because they're relatively similar to Iran_ChL. Armenians are the highest scoring Indo-European speakers, but Kurds also make the top ten, and it's interesting to see several different Greek and Italian groups in the top 25. No idea what that might mean, though. To wrap things up, I'll suggest a few questions for the ensuing discussion in the comments:

- Was F38 a Hurro-Urartian or an Indo-European, or a Hurro-Urartian with some Indo-European ancestry? If Indo-European or partly Indo-European, then what type? Armenian, Cimmerian, Iranian, or...?

- Is F38's R1b1a2a2 lineage a reflection of his potential Poltavka ancestry from the steppe or Kura-Araxes ancestry from the Caucasus?

- What explains F38's strong affinity to many modern-day European groups?

- Does the southern, non-Eastern European Hunter-Gatherer (EHG), part of Yamnaya's ancestry perhaps derive from a Bronze Age South Caspian population closely related to F38 and rich in R1b1a2a2?

Nah, I'm just trolling with that last one. I thought I'd save some of you the trouble. Let's be honest, what are the chances that this will ever pan out? I'll give it a probability of 5%.

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Steppe admixture in Mycenaeans, lots of Caucasus admixture already in Minoans (Lazaridis et al. 2017)

Yamnaya-related migrations into Iberia: infiltration rather than invasion (Martiniano et al. 2017)

Thursday, August 10, 2017

Basal-rich K7 & Global 10 updates (10/08/2017)

I've updated the Basal-rich K7 spreadsheet and the Global 10 datasheets with a plethora of ancient individuals and populations, including Anglo-Saxons, British Celts (labeled England_IA), Minoans, Mycenaeans, Bronze Age Iberians and many more.

Basal-rich K7 spreadsheet

Global 10 main datasheet

Global 10 ancient averages datasheet

Please keep in mind that the K7 can be somewhat conservative with minor ancestry proportions, especially Ancient North Eurasian (ANE) admixture, and low coverage samples can behave in odd ways in the Global 10. So when modeling ancestry with ancient samples it might be useful to stick to high coverage individuals that show consistent results. If you don't know what the Basal-rich K7 and Global 10 are, then these links will be useful.

The Basal-rich K7

Global 10: A fresh look at global genetic diversity

An nMonte and 4mix guide for the participants of the Basal-rich K7 and/or Global 10 tests
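To give a rough idea of what modeling ancestry with datasheet coordinates involves, here's a minimal two-source sketch. It finds the mixture weight that brings a weighted average of two reference points as close as possible to a target point. This is not nMonte or 4mix, just the same general idea in its simplest form, and the coordinates and labels below are made up for illustration rather than taken from the Global 10 datasheets.

```python
import math


def two_source_fit(target, source_a, source_b):
    """Best weight w in [0, 1] so that w*source_a + (1-w)*source_b is nearest target.

    Returns (w, distance), where distance is the Euclidean gap left by the fit.
    """
    diff = [a - b for a, b in zip(source_a, source_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two sources have identical coordinates")
    # Project the target onto the line through the two sources
    w = sum((t - b) * d for t, b, d in zip(target, source_b, diff)) / denom
    # Clamp to the segment so both weights stay between 0 and 1
    w = min(1.0, max(0.0, w))
    fitted = [w * a + (1.0 - w) * b for a, b in zip(source_a, source_b)]
    distance = math.sqrt(sum((t - f) ** 2 for t, f in zip(target, fitted)))
    return w, distance


# Hypothetical three-dimensional coordinates (NOT real Global 10 values)
farmer_like = [0.020, -0.010, 0.005]
steppe_like = [-0.030, 0.040, 0.010]
target = [0.010, 0.001, 0.006]

weight, gap = two_source_fit(target, farmer_like, steppe_like)
print(f"farmer_like {weight:.2f}, steppe_like {1 - weight:.2f}, distance {gap:.4f}")
```

A large leftover distance is a hint that the chosen sources don't describe the target well, which is one reason why noisy low coverage samples can produce misleading fits.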

Tuesday, August 8, 2017

Pots were people in Bronze Age southern Central Asia too

New archaeological evidence of potentially significant Bronze Age migrations from the Eurasian steppe into present-day southern Turkmenistan is coming to light thanks to the Archaeological Map of the Murghab Delta (AMMD) project. The new findings are discussed in a paper in Quaternary International available here or here. From the paper:

Adding to the number of questions was the fact that the AMMD project also recorded hundreds of small campsites, particularly in the northern distal reaches of the fan, that bore ceramics [my note: called Incised Coarse Ware or steppe ware] unlike those of other Murghab communities, but with unmistakable affinities to the so-called Andronovo cultural group occupying regions to the north and east during this same period (Cattani, 2008; Cattani et al., 2008; Cerasetti, 2008, 2012). These campsites are interpreted as representing the influx of a new socio-cultural group of mobile pastoralists who began to occupy first more remote areas and gradually move toward more physical and subsistence integration with settled farming groups in the Murghab (Cerasetti et al., in press; see also; Rouse and Cerasetti, 2014). However, the question of whether such encounters upset a careful ecological balance struck by Murghab farming settlements for over a millennium, or whether they were merely coincidental with environmental changes, could not be sufficiently addressed with the coarseness of survey data; targeted research agendas were (and are) still needed to address such questions specifically. Nonetheless, up to this point, it is clear that at the end of the Bronze Age, major social, demographic, and environmental changes were coinciding.

Southern Turkmenistan is, of course, not too far away from South Asia, which was also potentially a target of large scale Bronze Age migrations from the Eurasian steppe that may have brought Indo-European languages to the region. Archaeological evidence of such population movements into South Asia is, for now, apparently minimal or, as some claim, even non-existent. However, ancient DNA evidence in favor of the so-called Aryan Invasion or Migration Theory (AIT/AMT) is rapidly building up (see here). By the way, if you're wondering about the title of this post, then this might help: "Kossinna's Smile" (Heyd, 2017).


Rouse and Cerasetti, Micro-dynamics and macro-patterns: Exploring new archaeological data for the late Holocene human-water relationship in the Murghab alluvial fan, Turkmenistan, Quaternary International, Volume 437, Part B, 5 May 2017, Pages 20-34

See also...

Maybe first direct hints of Yamnaya-related gene flow into South Central Asia

Swat Valley "early Indo-Aryans" at the lab

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Wednesday, August 2, 2017

Steppe admixture in Mycenaeans, lots of Caucasus admixture already in Minoans (Lazaridis et al. 2017)

Over at Nature at this LINK. Why is the presence of steppe admixture in Mycenaeans important? And why does it matter if the Minoans already had a lot of ancestry from the Caucasus or surrounds? Because Mycenaeans were Indo-Europeans and Minoans weren't. I'm still reading the paper and will update this entry regularly over the next few days. Below is the abstract and, in my opinion, a key quote. Emphasis is mine.

The origins of the Bronze Age Minoan and Mycenaean cultures have puzzled archaeologists for more than a century. We have assembled genome-wide data from 19 ancient individuals, including Minoans from Crete, Mycenaeans from mainland Greece, and their eastern neighbours from southwestern Anatolia. Here we show that Minoans and Mycenaeans were genetically similar, having at least three-quarters of their ancestry from the first Neolithic farmers of western Anatolia and the Aegean [1, 2], and most of the remainder from ancient populations related to those of the Caucasus [3] and Iran [4, 5]. However, the Mycenaeans differed from Minoans in deriving additional ancestry from an ultimate source related to the hunter–gatherers of eastern Europe and Siberia [6, 7, 8], introduced via a proximal source related to the inhabitants of either the Eurasian steppe [1, 6, 9] or Armenia [4, 9]. Modern Greeks resemble the Mycenaeans, but with some additional dilution of the Early Neolithic ancestry. Our results support the idea of continuity but not isolation in the history of populations of the Aegean, before and after the time of its earliest civilizations.


The simulation framework also allows us to compare different models directly. Suppose that there are two models (Simulated1, Simulated2) and we wish to examine whether either of them is a better description of a population of interest (in this case, Mycenaeans). We test f4(Simulated1, Simulated2; Mycenaean, Chimp), which directly determines whether the observed Mycenaeans shares more alleles with one or the other of the two models. When we apply this intuition to the best models for the Mycenaeans (Extended Data Fig. 6), we observe that none of them clearly outperforms the others as there are no statistics with |Z|>3 (Table S2.28). However, we do notice that the model 79%Minoan_Lasithi+21%Europe_LNBA tends to share more drift with Mycenaeans (at the |Z|>2 level). Europe_LNBA is a diverse group of steppe-admixed Late Neolithic/Bronze Age individuals from mainland Europe, and we think that the further study of areas to the north of Greece might identify a surrogate for this admixture event – if, indeed, the Minoan_Lasithi+Europe_LNBA model represents the true history.

Lazaridis, Mittnik et al., Genetic origins of the Minoans and Mycenaeans, Nature, Published online 02 August 2017, doi:10.1038/nature23310

Update 03/08/2017: This is my own Principal Component Analysis (PCA) of the Minoan and Mycenaean samples, which are freely available at the Reich Lab website here. The Armenian angle for the eastern admixture in Mycenaeans looks forced. The trajectory of this admixture obviously runs from Northern or Eastern Europe to the Minoans. If it did arrive from Armenia, then realistically only via a heavily steppe-admixed population.

Update 05/08/2017: Much like Lazaridis et al., I ran a series of qpAdm analyses to find the best mixture model for the Mycenaeans. However, just to see what would happen, unlike Lazaridis et al., I didn't group any of the archaeological populations into larger clusters based on their genetic affinities. The three models below stood out from the rest in terms of their statistical fits.

Minoan_Lasithi 0.786±0.049
Sintashta 0.214±0.049
P-value 0.96574059
chisq 6.030
Full output

Corded_Ware_Germany 0.210±0.043
Minoan_Lasithi 0.790±0.043
P-value 0.961238695
chisq 6.198
Full output

Minoan_Lasithi 0.791±0.043
Srubnaya 0.209±0.043
P-value 0.950419642
chisq 6.558
Full output

So it's essentially the same outcome as the one obtained by Lazaridis et al., because Sintashta and Srubnaya are part of their Steppe_MLBA cluster, while Corded Ware is part of their Europe_LNBA cluster, and it's these clusters that, along with Minoan_Lasithi, provided their most successful mixture models for the Mycenaeans. But it's nice to see Sintashta at the top of my results, because it fits so well with the long postulated archaeological links between Sintashta and the Mycenaeans (for instance, see here).

By the way, here's what I said back in May when the Mathieson et al. 2017 preprint came out (see here). So things are falling into place rather nicely.

The same paper also includes the following individual from present-day Bulgaria dated to the start of the Late Bronze Age (LBA), which is roughly when the Mycenaeans appeared nearby in what is now Greece:

Bulgaria_MLBA I2163: Y-hg R1a1a1b2 mt-hg U5a2 1750-1625 calBCE

This guy is the most Yamnaya-like of all of the Balkan samples in Mathieson et al. 2017, and, as far as I can see based on his overall genome-wide results, probably indistinguishable from the contemporaneous Srubnaya people of the Pontic-Caspian steppe. He also belongs to Y-haplogroup R1a-Z93, which is a marker typical of Srubnaya and other closely related steppe groups such as Andronovo, Potapovka and Sintashta. So there's very little doubt that he's either a migrant or a recent descendant of migrants to the Balkans from the Pontic-Caspian steppe.

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Tuesday, August 1, 2017

A Bronze Age dominion from the Atlantic to the Altai

The BEAGLE analysis that I foreshadowed a couple of days ago (see here) is finally done. The output is available for download as a matrix of shared genomic tracts in centimorgans (cM) here.

I haven't yet had a chance to look at the results in detail, but I'd say that the outcomes for the three Early Bronze Age (EBA) Afanasievo and Yamnaya individuals make a lot of sense. The high affinity that these individuals show to the Irish EBA samples is not at all surprising, but striking nonetheless. The Afanasievo people, after all, lived in the Altai Mountains deep in Asia, more than 6,000 km from Ireland.

Update 02/08/2017: Interestingly, the graphs below, based on the cM values in my coancestry matrix, suggest that upper caste Indo-Aryan-speaking Brahmins from Northern India share relatively more ancestry with the Afanasievo genome than Iranic-speakers such as Pamir Tajiks, who generally share relatively more ancestry with the younger Andronovo and Sintashta samples. The relevant datasheet is available here.

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts