The Dodecad Ancestry Project has recently reached its half-year milestone, and to celebrate, I present a small analysis of the Dodecad populations with at least 10 participants. With 17 such populations and 254 individuals, the Dodecad dataset is now comparable to many published datasets in the literature.
The total sample now exceeds 600 members. That includes individuals from single populations that have not yet reached the 5+ or 10+ person mark, as well as individuals of mixed heritage.
The first PCA dimension distinguishes between Indians and West Eurasians. The second one is clinal between Finns and North Africans.
In the following plots, I pair the first dimension with the third, fourth, and so on.
The third dimension distinguishes between North Africans and the rest:
The fourth dimension separates the Ashkenazi Jews:
The fifth dimension distinguishes Finns:
Here are the results of the Clusters Galore analysis. I usually perform this over MDS data, but it works just as well on PCA data, as I mentioned in the post where I introduced the idea.
Most Greeks align themselves with Italians, but some do so with the Balkans and West Asia. I would not draw many conclusions from this, as it might be a consequence of the composition of the Greek sample, but it's not a coincidence, I think, that these three places are Greece's geographical neighbors, as well as the only ones where Greek has continued to be spoken until quite recently.
Another interesting statistic is the average intra-population identity-by-state (IBS), which is a good measure of a population's homogeneity:
Here is the full IBS matrix:
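For readers who want to reproduce this kind of statistic on their own data, here is a minimal sketch of an average intra-population IBS calculation. It is not the code used for the figures in this post. It assumes genotypes coded as minor-allele counts (0, 1, 2) with `None` for missing calls, and it scores each SNP as the proportion of alleles shared identical-by-state, 1 - |g1 - g2| / 2. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def ibs_similarity(g1, g2):
    """Proportion of alleles shared identical-by-state between two individuals.

    Genotypes are minor-allele counts (0, 1 or 2); None marks a missing call.
    SNPs missing in either individual are skipped.
    """
    shared = 0
    total = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this SNP
        total += 2
    if total == 0:
        raise ValueError("no SNPs genotyped in both individuals")
    return shared / total


def mean_intra_population_ibs(genotypes, labels, population):
    """Average IBS similarity over all pairs of individuals in one population."""
    members = [g for g, lab in zip(genotypes, labels) if lab == population]
    pairs = list(combinations(members, 2))
    if not pairs:
        raise ValueError("need at least two individuals in the population")
    return sum(ibs_similarity(a, b) for a, b in pairs) / len(pairs)


# Toy data (assumption): four individuals, four SNPs, two populations.
genotypes = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [2, 1, 0, 0],
    [1, 0, 0, 1],
]
labels = ["A", "A", "B", "B"]

print(mean_intra_population_ibs(genotypes, labels, "A"))  # 0.875
print(mean_intra_population_ibs(genotypes, labels, "B"))  # 0.625
```

In this toy example population A comes out as more homogeneous than population B, since its members share more alleles on average.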
I have recently introduced the idea of the population concordance ratio γ(A, B), which equals 1 if any two individuals from the row population A are always more similar to each other than to any individual from the column population B:
As expected, most populations are perfectly concordant with respect to Indians and North Africans, the two clear outgroups in this set. Another interesting observation is that the most heterogeneous populations are the least concordant, as they include individuals of quite varying ancestry (e.g., a "white Mexican" may be closer to an Englishman than to a very Amerindian-admixed Mexican).
I decided to test this idea by calculating the correlation between a population's average concordance ratio and its average IBS similarity. It turned out to be -0.36, which is not significant, but is in the right direction. Perhaps average IBS similarity is less than ideal for the purpose of gauging homogeneity as it applies to the concordance ratio.
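As a rough check on the "not significant" statement, the arithmetic can be sketched as follows. This assumes a Pearson correlation computed across the 17 populations and a two-tailed test at the 5% level; the post does not state which coefficient or test was actually used, so treat both as assumptions.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n points against zero."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


r = -0.36  # correlation reported in the post
n = 17     # assumption: one data point per population

t = correlation_t_statistic(r, n)

# Two-tailed 5% critical value of Student's t with n - 2 = 15 degrees of freedom.
T_CRIT_DF15 = 2.131
significant = abs(t) > T_CRIT_DF15

print(f"t = {t:.2f}, significant at 5%: {significant}")
# t = -1.49, significant at 5%: False
```

Under these assumptions, a correlation of this size would need a considerably larger number of populations to reach significance.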
Notice also the asymmetry between γ(A, B) and γ(B, A). For example, the sample from the Balkans consists of an assortment of non-Greek people from the Balkans, so it's not particularly concordant with respect to many North European populations: people from the Balkans differ from each other substantially in their North European-ness. North European populations, however, tend to be concordant with respect to people from the Balkans.
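To make the concordance ratio and its asymmetry concrete, here is a minimal sketch under one possible reading of the definition above: γ(A, B) is taken as the fraction of triples (a1, a2, b), with a1 and a2 distinct members of A and b a member of B, in which a1 is more similar to a2 than to b. This reading, the function name, and the toy similarity matrix are all assumptions made for illustration, and may differ in detail from the calculation behind the table.

```python
from itertools import permutations


def concordance_ratio(sim, pop_a, pop_b):
    """Fraction of triples (a1, a2, b) with sim[a1][a2] > sim[a1][b].

    sim is a symmetric similarity matrix (e.g. pairwise IBS);
    pop_a and pop_b are lists of row/column indices into sim.
    """
    concordant = 0
    total = 0
    # Ordered pairs, so each member of A takes a turn as the reference a1.
    for a1, a2 in permutations(pop_a, 2):
        for b in pop_b:
            total += 1
            if sim[a1][a2] > sim[a1][b]:
                concordant += 1
    if total == 0:
        raise ValueError("need at least two individuals in A and one in B")
    return concordant / total


# Toy similarity matrix (assumption): individuals 0 and 1 form a tight
# population A; individuals 2 and 3 form a more heterogeneous population B,
# with individual 2 leaning toward A.
sim = [
    [1.00, 0.80, 0.72, 0.65],
    [0.80, 1.00, 0.74, 0.66],
    [0.72, 0.74, 1.00, 0.70],
    [0.65, 0.66, 0.70, 1.00],
]
A = [0, 1]
B = [2, 3]

print(concordance_ratio(sim, A, B))  # 1.0
print(concordance_ratio(sim, B, A))  # 0.5
```

The homogeneous population is perfectly concordant with respect to the heterogeneous one, but not the other way around, because one member of B is closer to both members of A than to the other member of B.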
Submission to the Project is currently closed, but I do encourage all individuals with 4 grandparents from the same European, North/East African, West/Central/South Asian group to contact me at email@example.com for possible inclusion in the Project. Send e-mail first, not data, to check whether I can process your data. Or, follow the project blog for future submission opportunities.