Showing posts with label Harappa Project. Show all posts

August 20, 2012

Visualizing admixture differences with ACD tool

Vaêdhya has created a new ACD tool that allows one to visualize differences between sets of populations in terms of admixture components. He also posts two examples of the application of his tool on data generated by myself in the Dodecad Project, as well as by the Harappa Project.

I have speculated about the origins of Indo-Iranians before, noting that the evidence links even the Kurds with a "South Asian" component; in subsequent higher-resolution analysis, such as the K12b, it appeared that this component was related to the Gedrosia component. In any case, the evidence is clear about the links of different Iranian and Indo-Aryan groups, so it is nice that this can be made evident with the ACD tool and data from the Harappa Project. Notice the excess of the Baloch (~Gedrosia) component in Kurds and Iranians in contradistinction to the Indo-European Armenians and Semitic Assyrians. It is fairly clear to me that the Iranian ancestral homeland is to be sought to the east, with the Bactria-Margiana Archaeological Complex (BMAC) being a good candidate for its location.

In a second plot, Vaêdhya uses Dodecad data to contrast patterns of differences in Northeastern Europe. Here, too, the patterns are clear, with Finns, and secondarily Russians, showing an excess of Siberian ancestry relative to Poles. This is, no doubt, due to the Finnic element, which links Finns, and the Uralic substratum in Russians, with Siberia. A second contrast is between Finns and Russians/Poles. The latter have more of the Caucasus component, a probable legacy of the Bronze Age Indo-European invasion of Europe. A final contrast is the higher Atlantic_Med element in Poles, which suggests an excess of early Neolithic farmer ancestry, or admixture with West European populations such as Germans and others who possess more of this component than Slavs.
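The kind of contrast described above can be sketched in a few lines of Python. This is only an illustration of the idea of component-wise differences between two populations, not Vaêdhya's ACD tool itself; the function names are hypothetical, and the percentages are made-up placeholders rather than actual Dodecad averages.

```python
# Placeholder admixture proportions in percent.
# These are NOT real Dodecad values; they only illustrate the calculation.
populations = {
    "Finns": {"Siberian": 6.0, "Caucasus": 1.0, "Atlantic_Med": 20.0},
    "Russians": {"Siberian": 2.0, "Caucasus": 8.0, "Atlantic_Med": 22.0},
    "Poles": {"Siberian": 0.5, "Caucasus": 8.0, "Atlantic_Med": 27.0},
}


def component_differences(a, b):
    """Return {component: a - b} over the union of components in a and b."""
    components = sorted(set(a) | set(b))
    return {c: a.get(c, 0.0) - b.get(c, 0.0) for c in components}


def largest_contrasts(diffs, top=3):
    """Return the `top` components with the largest absolute difference."""
    ranked = sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top]


if __name__ == "__main__":
    diffs = component_differences(populations["Finns"], populations["Poles"])
    for component, value in largest_contrasts(diffs):
        # A positive value means an excess in the first population.
        print(f"{component}: {value:+.1f}")
```

Plotting the resulting differences as signed bars, one per component, gives roughly the kind of picture discussed here: an excess shows up as a bar on one side of zero and a deficit as a bar on the other.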

July 03, 2012

Ancient European DNA using DIYHarappaWorld and MDLP5

In the Bronze Indo-European invasion of Europe I compiled the assessment of different ancient autosomal DNA data from Europe using tools developed by the Dodecad Project. I thought it might be a good idea to do the same using a different calculator, the DIYHarappaWorld.

The results are shown on the left, and should be compared against the HarappaWorld Admixture spreadsheet. I am not very familiar with the HW components, and you should not assume that similarly named Dodecad and HarappaWorld components reflect the same entities. Any mismatch is probably due to differences in data (e.g. lots more West Eurasian vs. South Asian participants) and methodology between the two projects.

Nonetheless, I would say that these results largely confirm the "Mediterranean" character of the "farmers" Gok4 from Sweden and Oetzi from Italy, and the "North European" character of the hunter-gatherers Ajv52 and Ajv70 from Sweden and Bra from Iberia. I glanced through the HW spreadsheet, and it seems that even small discrepancies, e.g., Oetzi's 12% NE_Euro from DIYHarappaWorld vs. the 0% North_European from Dodecad K12b, are consistent: note that Sardinians, the most Oetzi-like population, have a similar 12% NE_Euro in the HarappaWorld spreadsheet, and 0% in the Dodecad K12b one. Hence, this is probably a case of different anchoring of the similarly named components in the two projects, rather than a manifestation of different results of the ancient samples vis-à-vis modern populations.
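The anchoring argument can be made explicit with a small Python sketch: instead of comparing raw percentages across calculators, compare each ancient sample to the same modern reference population within each calculator. The figures are the ones quoted above (the Sardinian NE_Euro value is taken as approximately 12%); the function name is a hypothetical helper, not part of either project's tools.

```python
def offset_from_reference(sample_pct, reference_pct):
    """Difference between a sample and a reference population, in percent."""
    return sample_pct - reference_pct


# Values as quoted in the text; the Sardinian NE_Euro figure is approximate.
readings = {
    "DIYHarappaWorld NE_Euro": {"Oetzi": 12.0, "Sardinians": 12.0},
    "Dodecad K12b North_European": {"Oetzi": 0.0, "Sardinians": 0.0},
}

offsets = {
    calculator: offset_from_reference(values["Oetzi"], values["Sardinians"])
    for calculator, values in readings.items()
}

if __name__ == "__main__":
    for calculator, offset in offsets.items():
        print(f"{calculator}: Oetzi minus Sardinians = {offset:+.1f}")
```

The raw numbers differ by 12 points between the two calculators, but the offset from Sardinians is about the same in both, which is what one would expect if the components are simply anchored differently.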

DIYHarappaWorld, like K12b, is a high-resolution calculator, and hence does not show the clear absence of the West_Asian component in prehistoric Europeans; this component spans the West Asian highlands, and is replaced in both calculators by the twin Caucasus and Gedrosia/Baloch components, which are much more localized in their respective areas. Even though both Caucasus and Gedrosia/Baloch are related to the West_Asian component, they are not equivalent to it; for example, the K12b Caucasus component includes both the K7b West_Asian and Southern in the local blend of the Caucasus, and this explains the paradox that Oetzi has no West_Asian but 1/5 Caucasus.

Nonetheless, even using the DIYHarappaWorld and K12b, we can intuit that something did change in Europe in the last 5,000 years. This is most evident in the levels of the "Baloch" component (=Gedrosia of Dodecad Project), which is near zero in all ancient individuals but reaches levels of ~10% in West European populations today. Certainly, the interesting fact that the rather uniform West_Asian Indo-European adstratum in Europe is transformed (at a higher resolution) into Gedrosia in the West (e.g., Irish) and Caucasus in the East (e.g., Russians) of the continent merits further investigation.


I have also tried the MDLP5 calculator on the same data (on the right). I was not able to find population average data for this calculator, but it seems that the difference between farmers and hunter-gatherers is captured by the "Paleo-mediterranean" vs. "West-Eurasian" dichotomy.

Overall, I think this was a worthy exercise which largely confirmed the picture drawn from the original papers and my application of the Dodecad calculators over their data. It certainly seems now that a Mesolithic, North-European-like substratum experienced gene flow from a Mediterranean, South-European-like population that conveyed the new agricultural way of life during the Neolithic. Hopefully, we will get new data from the last 4-5 thousand years before too long, so that my hypothesis about the Indo-European mediated final formation of the European population during that period can be tested.

February 06, 2011

Harappa Ancestry Project

Zack Ajmal has been doing some great work on the Harappa Ancestry Project, including posting some code and detailed instructions on how to process various source datasets. Zack contacted me only a couple of months ago for some tips about starting his own project, inspired by the Dodecad Project, and it's great to see that he is already outputting results.

This should be a good lesson to all people out there who possess (i) reasonable computer skills, (ii) reasonable understanding of genetics, (iii) reasonable computing power (my own platform most of the time is a virtual Linux box with 512MB of RAM), and (iv) time to spare on a hobby, that DIY genomics is a manageable endeavor. You will certainly learn a lot, both about genetics itself, and about the various genetics software. Most importantly, you will learn about human populations, and -if you have genetic data of your own- about yourself.