May 02, 2011

Six months of the Dodecad Project

The Dodecad Ancestry Project has recently reached its half-year milestone, and to celebrate, I present a small analysis of the Dodecad populations with at least 10 participants. With 17 such populations, and 254 individuals, we have now reached a point where the Dodecad dataset is equivalent to many published datasets in the literature.

The total sample now exceeds 600 members. That includes both individuals from single populations that have not yet reached the 5+ or 10+ person mark and individuals of mixed heritage.

The first PCA dimension distinguishes between Indians and West Eurasians. The second one is clinal between Finns and North Africans.
In the plots that follow, the first dimension is plotted against the 3rd, 4th, etc.

The third dimension distinguishes between North Africans and the rest:
The fourth dimension separates the Ashkenazi Jews:
The fifth dimension distinguishes Finns:

Here are the results of the Clusters Galore analysis. I usually perform this over MDS data, but it works just as well on PCA data, as I mentioned in the post where I introduced the idea.

Most Greeks align themselves with Italians, but some do so with the Balkans and West Asia. I would not draw many conclusions from this, as it might be a consequence of the composition of the Greek sample, but it's not a coincidence, I think, that these three places are Greece's geographical neighbors, as well as the only ones where Greek has continued to be spoken until quite recently.

Another interesting statistic is the average intra-population identity-by-state (IBS), which is a good measure of a population's homogeneity:

Here is the full IBS matrix:
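For readers unfamiliar with the statistic, here is a minimal sketch of how average intra-population IBS could be computed. It assumes genotypes coded as 0/1/2 copies of a reference allele and uses the common "proportion of alleles shared" definition of IBS. The toy genotypes, the population labels pop_A and pop_B, and the function names are all hypothetical, and this is not necessarily the software or exact definition used for the Project.

```python
from itertools import combinations
from statistics import mean


def ibs(g1, g2):
    """Proportion of alleles shared identical-by-state by two individuals.

    Genotypes are coded 0/1/2 (assumed encoding); None marks a missing call.
    Assumes the two individuals share at least one non-missing SNP.
    """
    shared = [2 - abs(a - b) for a, b in zip(g1, g2)
              if a is not None and b is not None]
    return sum(shared) / (2 * len(shared))


def mean_intra_population_ibs(genotypes, labels):
    """Average pairwise IBS within each population that has 2+ members."""
    by_pop = {}
    for individual, pop in labels.items():
        by_pop.setdefault(pop, []).append(individual)
    return {
        pop: mean(ibs(genotypes[a], genotypes[b])
                  for a, b in combinations(members, 2))
        for pop, members in by_pop.items()
        if len(members) >= 2
    }


# Toy data (hypothetical): pop_A is homogeneous, pop_B is not.
genotypes = {
    "a1": [0, 0, 2, 1],
    "a2": [0, 0, 2, 2],
    "a3": [0, 1, 2, 1],
    "b1": [2, 0, 0, 1],
    "b2": [0, 2, 1, 0],
}
labels = {"a1": "pop_A", "a2": "pop_A", "a3": "pop_A",
          "b1": "pop_B", "b2": "pop_B"}

result = mean_intra_population_ibs(genotypes, labels)
print(result)
```

In this toy example the homogeneous pop_A gets a much higher average than pop_B, which is the sense in which the statistic measures homogeneity.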


I have recently introduced the idea of the population concordance ratio γ(A, B), the value of which becomes 1 if any two individuals from the row population A are always more similar to each other than to any individual from the column population B:

As expected, most populations are perfectly concordant with respect to Indians and North Africans, the two clear outgroups in this set. Another interesting observation is that the most heterogeneous populations are the least concordant, as they include individuals of quite varying ancestry (e.g., a "white Mexican" may be closer to an Englishman than to a very Amerindian-admixed Mexican).
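As an illustration only, the concordance ratio can be sketched as follows. The exact formula is not given in this post, so the sketch assumes one plausible reading: γ(A, B) is the fraction of triples (a1, a2, b), with a1 and a2 distinct members of A and b a member of B, for which a1 is more similar to a2 than to b. The similarity matrix and the function name are hypothetical.

```python
from itertools import permutations


def concordance_ratio(sim, pop_a, pop_b):
    """Fraction of triples (a1, a2, b) with sim[a1][a2] > sim[a1][b].

    This is an assumed reading of gamma(A, B), not a confirmed definition.
    sim is a symmetric similarity matrix (e.g. pairwise IBS);
    pop_a and pop_b are lists of row/column indices into it.
    """
    hits = 0
    total = 0
    for a1, a2 in permutations(pop_a, 2):
        for b in pop_b:
            total += 1
            if sim[a1][a2] > sim[a1][b]:
                hits += 1
    return hits / total


# Toy similarity matrix (hypothetical values).
# Individuals 0 and 1 form a homogeneous population A.
# Individuals 2 and 3 form a heterogeneous population B:
# individual 2 is closer to A than to its fellow member 3.
sim = [
    [1.0, 0.9, 0.8, 0.5],
    [0.9, 1.0, 0.8, 0.5],
    [0.8, 0.8, 1.0, 0.6],
    [0.5, 0.5, 0.6, 1.0],
]
A, B = [0, 1], [2, 3]

gamma_ab = concordance_ratio(sim, A, B)  # 1.0: A is perfectly concordant w.r.t. B
gamma_ba = concordance_ratio(sim, B, A)  # 0.5: B is not concordant w.r.t. A
print(gamma_ab, gamma_ba)
```

Under this reading the ratio need not be symmetric: the homogeneous population is perfectly concordant with respect to the heterogeneous one, but not the other way around.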

I decided to test this idea by calculating the correlation between a population's average concordance ratio and its average IBS similarity. It turned out to be -0.36, which is not significant, but in the right direction. Perhaps average IBS similarity is less than ideal for the purpose of gauging homogeneity as it applies to the concordance ratio.
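To make the significance statement concrete, here is a rough sketch of a Pearson correlation and the usual t-test for it. It assumes the correlation was taken over the 17 populations (so 15 degrees of freedom) and that a two-tailed test at the 5% level is meant; both are assumptions, as are the function names.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Reported correlation; n = 17 populations is an assumption.
t = t_statistic(-0.36, 17)

# Two-tailed 5% critical value of Student's t for 15 degrees of freedom.
T_CRIT_DF15 = 2.131

significant = abs(t) > T_CRIT_DF15
print(round(t, 2), significant)  # about -1.49, False
```

With these assumptions, |t| falls well short of the critical value, which agrees with the correlation not being significant.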

Notice also the asymmetry between γ(A, B) and γ(B, A). For example, the sample from the Balkans consists of an assortment of non-Greek people from the Balkans, so it's not particularly concordant with respect to many North European populations: people from the Balkans differ from each other substantially in their North European-ness. North European populations, however, tend to be concordant with respect to people from the Balkans.

Submission to the Project is currently closed, but I do encourage all individuals with 4 grandparents from the same European, North/East African, West/Central/South Asian group to contact me at dodecad@gmail.com for possible inclusion in the Project. Send e-mail first, not data, to check whether I can process your data. Or, follow the project blog for future submission opportunities.

14 comments:

George said...

And it's been an immensely productive half year. Excellent job. I look forward to more!

Diogenes said...

Interesting to see North Africans kind of lining up with Finns at dimension 3. And Indians still lining into another direction.

Cuah123 said...

"(e.g., a "white Mexican" may be closer to an Englishman than to a very Amerindian-admixed Mexican"

Thank you, I feel vindicated. My theory is that the majority of colonist "Spaniards" were actually descendants of Northern Germans or Englishmen who were Normans/Crusaders. Perhaps entering through Lisbon, Portugal, then to the center of Spain. 200 years later they leave for Mexico. Many families only married into known families (even regardless of poverty level) as long as they had a pedigree. This is really apparent in the Highlands of Jalisco.

Phoenix33 said...

I agree -- excellent work.

Average Joe said...

Well done! Nice to see how Dodecad is growing into a real powerhouse of genetic information. Keep up the excellent work!

Dienekes said...

"Interesting to see North Africans kind of lining up with Finns at dimension 3. And Indians still lining into another direction."

Not sure what you mean by still as the first dimension is always plotted against the others in these plots.

As for North Africans "lining up with Finns", that is quite intriguing given the unexpected mtDNA link between Berbers and Saami:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1199377/

organistrum said...

Cuah123: "My theory is that the majority of colonist "Spaniards" were actually descendants of Northern Germans or Englishmen who were Normans/Crusaders."
Did you notice how closely Spaniards and British cluster in every dimension? No need for a very personal hypothesis to explain why "a white Mexican may be closer to an Englishman than to a very Amerindian-admixed Mexican".

sykes.1 said...

I'm rather surprised that the Ashkenazi Jews separate out that strongly. I had thought that there was a large European component to their ancestry.

Dienekes said...

"I'm rather surprised that the Ashkenazi Jews separate out that strongly. I had thought that there was a large European component to their ancestry."

The two propositions are not incompatible.

pconroy said...

Dienekes,

Yes, North Africans and Finns lining up together is absolutely fascinating, as it opens the possibility that both populations represent the relics of Pre-Neolithic Europe better than any other Caucasian populations, as in all other Europeans are essentially intrusive, at least parentally.

It also brings up questions such as whether R1b entered Europe via North Africa, as some, such as Anatole Klyosov, have suggested. He also suggests that R1b-carrying men were originally what he calls "Turkic":

http://s155239215.onlinehome.us/turkic/60_Genetics/Klyosov2010DNK-GenealogyEn.htm

ssas said...

I wonder how some Balkanians cluster with, or are very close on the graphic to, the Scandinavians.
Goth ancestry was always ignored in the Balkans, while insisting on autochthonous, Slavic or even Asian Steppe origin.

Creative said...

Maybe I am wrong, but the overlapping of Assyrians and Armenians kind of actuates my opinion on Assyrians that they must be seen as a pre-Islamic Church rather than a "race". Kind of makes me wonder how many Assyrians are actually just Armenian converts or represent ancient pre-Semitic layers in the region.

agit123 said...

@Creative
exactly my point. Most of the Neo Assyrians are actually Armenians. Otherwise they should have been genetically closer to Levant than Armenia.

Ar-Man said...

Creative & agit123...
They are simply Semitic-speaking Armenians, and that goes back a long time, to the time of the conversion of Armenia to Christianity. Christian Armenians were reading in Aramaic. But many Armenians actually could not understand the language at all, and that was the time that Mesrop Mashtots created an alphabet in the early fifth century based on an older alphabet. So, yes, basically the people that today are known under the name of Assyrians speak a neo-Aramaic language, which is a completely different branch from the ancient Assyrian, which was another branch of the more ancient Semitic Akkadian language. A good example of the distance between the two would be a comparison of, say, Indo-European Irish and Sanskrit. So the people that we know under the name of Assyrians are those who continued to speak the Aramaic language, and even today many have a great deal of recent Armenian genetic admixture.