Ashkenazi
Anatolia_ChL 7.9
Arab_Israel_1 15.65
Avar 0.6
Bashkir 0.05
Cossack 0
Italian_Tuscan 30.45
Polish 11.75
Samaritan 33.6
Uygur 0
distance%=0.2874 / distance=0.002874

In any case, that looks like a fairly sensible outcome, considering that it only took me a few minutes to put together. I've seen much worse in scientific literature.
Monday, September 26, 2016
Estonian Biocentre Human Genome Diversity Panel (EGDP)
Published along with Pagani et al. 2016, the EGDP dataset is freely available at the Estonian Biocentre website as VCF and PLINK binary files here. It overlaps at ~550K SNPs with Broad MIT/Harvard's Human Origins dataset, and at an impressive ~1.1 million SNPs with the ~1.2 million SNP ancient DNA chip used by the Reich Lab and others.

To see what's what, I ran a Principal Component Analysis (PCA) of all of the samples except the Congo Pygmies. I then removed four Siberians that behaved as if they had very recent European ancestry, and reran the PCA. Below are a few screen grabs from the latter analysis. The datasheet is available here. Look for the individual IDs with the GS prefix.

Using the K7 spreadsheet and nMonte, here's a model for Ashkenazi Jews with some of the new EGDP populations as references, including Avars from the North Caucasus and Arabs from Israel. The Arabs do help to improve the fit, but they're not as important as Samaritans and Tuscans.
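For anyone curious what a fit like this involves, here's a rough Python sketch of the general idea: find the mix of reference populations whose averaged component values sit closest, in Euclidean distance, to the target population. This is not the nMonte script itself, only a simplified illustration under my own assumptions. The function name `fit_mixture`, the batch size, the number of cycles and the toy reference values are all hypothetical, and the real K7 spreadsheet values are not reproduced here.

```python
import random


def fit_mixture(target, refs, batch=200, cycles=200, seed=0):
    """Estimate the percentage mix of `refs` that best approximates `target`.

    target: list of component values for the population being modelled.
    refs:   dict mapping reference name -> list of component values
            (same length and order as `target`).
    Returns (percentages, distance), where distance is the Euclidean
    distance between the target and the fitted mixture.
    """
    rng = random.Random(seed)
    names = list(refs)
    k = len(target)

    # Start from a random batch of reference "slots".
    slots = [rng.choice(names) for _ in range(batch)]
    # Keep a running column-wise sum so the batch mean is cheap to update.
    total = [sum(refs[s][i] for s in slots) for i in range(k)]

    def distance(tot):
        return sum((tot[i] / batch - target[i]) ** 2 for i in range(k)) ** 0.5

    best = distance(total)
    for _ in range(cycles):
        for pos in range(batch):
            old, new = slots[pos], rng.choice(names)
            if new == old:
                continue
            # Try swapping one slot; keep the swap only if the fit improves.
            trial = [total[i] - refs[old][i] + refs[new][i] for i in range(k)]
            d = distance(trial)
            if d < best:
                slots[pos], total, best = new, trial, d

    # Each reference's share of the batch is its estimated percentage.
    percentages = {n: 100.0 * slots.count(n) / batch for n in names}
    return percentages, best


if __name__ == "__main__":
    # Toy, made-up values purely for illustration (not real K7 data).
    toy_refs = {
        "RefA": [0.60, 0.30, 0.10],
        "RefB": [0.10, 0.70, 0.20],
        "RefC": [0.20, 0.20, 0.60],
    }
    toy_target = [0.40, 0.45, 0.15]
    pct, dist = fit_mixture(toy_target, toy_refs)
    for name, value in pct.items():
        print(name, value)
    print("distance =", round(dist, 6))
```

Because the percentages are just slot counts in a fixed-size batch, they can't go negative and always sum to 100, as the figures in an output like the one for the Ashkenazi model do. The resolution is limited by the batch size (0.5% steps with 200 slots), so a larger batch gives finer-grained estimates at the cost of more computation.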