December 27, 2011

Lack of significant population structure in Spain

I took the Iberian Spanish (IBS) regional populations, and ran multidimensional scaling on them (left). Most of the populations form a tight cluster, with Basques and Canary Islanders having averages further removed from the main cluster.

It should be noted that the Canarias sample consists of only two individuals (big red dots): one removed from the main cluster, one in the midst of individuals from Castilla Y Leon (small red dots).
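
For readers who want to see what multidimensional scaling does, here is a minimal sketch of classical (metric) MDS reduced to a single dimension. It is not the code used for the plot above: the function name `classical_mds_1d` and the toy distance matrix are assumptions made for illustration, and a real analysis would start from pairwise genetic distances between individuals rather than made-up numbers.

```python
import math
import random

def classical_mds_1d(dist):
    """Return 1-D classical MDS coordinates for a symmetric distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the leading eigenvector of B
    rng = random.Random(0)
    v = [rng.random() - 0.5 for _ in range(n)]
    eigval = 0.0
    for _ in range(500):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
        eigval = norm  # converges to the leading eigenvalue
    # Coordinates are the eigenvector scaled by sqrt(eigenvalue)
    return [x * math.sqrt(eigval) for x in v]

# Hypothetical toy data: five "individuals" placed on a line,
# three close together and two further away.
positions = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in positions] for a in positions]

coords = classical_mds_1d(dist)
for p, c in zip(positions, coords):
    print(f"original {p:5.1f} -> MDS coordinate {c:7.3f}")
```

The recovered coordinates match the original positions only up to a shift and a sign flip, which is why MDS plots are read in terms of relative distances rather than absolute axis values.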

MCLUST analysis using 1 dimension reveals 2 clusters: one consisting of Pais Vasco individuals (blue dots), the other of everyone else, with one Aragonese and one Cantabrian individual showing mixed probabilities between the two clusters.
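
To illustrate what "mixed probabilities" means here, the sketch below fits a two-component Gaussian mixture to one-dimensional coordinates with the EM algorithm, which is the kind of model-based clustering MCLUST performs. It is a simplified stand-in, not the MCLUST run itself: the number of clusters is fixed at two instead of being chosen by BIC, the function `fit_two_gaussians` is a hypothetical helper, and the coordinates are synthetic values invented for the example.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(xs, iters=200):
    """EM for a 2-component 1-D Gaussian mixture.

    Returns (means, variances, weights, memberships), where memberships[i][k]
    is the probability that point i belongs to component k.
    """
    means = [min(xs), max(xs)]  # start the components at the extremes
    overall = sum(xs) / len(xs)
    var0 = sum((x - overall) ** 2 for x in xs) / len(xs)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: membership probabilities for every point
        resp = []
        for x in xs:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k])
                    for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)  # guard against collapse
    return means, variances, weights, resp

# Synthetic 1-D coordinates (assumed values, not real data):
# a main cluster, a separate cluster, and one individual in between.
main_cluster = [-0.3, -0.2, -0.1, 0.0, 0.05, 0.1, 0.2, 0.3]
second_cluster = [1.8, 1.9, 2.0, 2.1, 2.2]
in_between = [1.0]
xs = main_cluster + second_cluster + in_between

means, variances, weights, memberships = fit_two_gaussians(xs)
for x, m in zip(xs, memberships):
    print(f"x = {x:5.2f}  P(cluster 1) = {m[0]:.3f}  P(cluster 2) = {m[1]:.3f}")
```

Individuals deep inside either group get a membership probability close to 1 for one component, while a point lying between the groups can end up with its probability split between them.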

The overall impression is that there may be additional population structure here (e.g., for Canarians or Galicians), but the sample sizes are not sufficient to make additional clusters unambiguously evident, except in the case of Basques vs. non-Basques.


Maju said...

A previous study (Gayan 2010), which excluded Basques and Canarians (most notably) but included large samples of Castilians, Asturians, Andalusians, Valencians and Catalans, and which also used the (rather limited) method of PC analysis, showed that:

1. Catalans and Andalusians diverged from the main Spanish cluster (Valencians and Asturians did not).

2. When compared with CEU and TSI, the appearance of divergence decreased but did not totally disappear, especially not for Catalans. These showed a tendency towards both CEU and TSI, while the less marked divergence of Andalusians was only towards Tuscany.

Published more or less at the same time, Athanasiadis 2010 studied Iberians in the context of the broader Mediterranean area. While all Iberians clustered on the European side of the first PC, there were some notable differences between them in PC2, with Asturians and Andalusians (and Occitans to some extent as well) tending towards the Eastern Mediterranean, while Catalans, Basques and Pasiegos clustered together in the high PC1, low PC2 corner (discussed by me here).

I think it is important to consider all these previous studies when analyzing the possible structure (or lack of it) of Iberian peoples. In this case, it looks as if the dominance of the Basque and maybe North African (Canarian) influences is forcing all the rest to appear undifferentiated. Under different conditions this is not necessarily the case, as the papers mentioned above show.

That's why different viewpoints are most convenient in order to understand the real, multidimensional, structure of populations. When you look at things from just one angle, you miss most of the richness.

Now that I have begun dabbling with the Admixture program, I think I can do that different-angle part myself. However, I lack the IBS sample. Razib suggested that I ask you for it, and so I do here - up to you of course. My email is in my Blogger profile (remove the anti-spam "DELETETHIS" insertion), feel free and thanks in advance.

truth said...

Nice to see this. I would like to see these populations in the context of a European (or West Eurasian) MDS plot. Would that be possible?

anthrospain said...

Hey Dienekes, could you create a new category "Spanish_D" in the map with the dots of the Spanish Dodecad individuals, to see how they compare?

Fanty said...

I bet they indeed need other populations placed next to them.

I had a very close look at the Germans in the Eurogenes "Gnuplot" experiments.

The Germans did not sort into a meaningful pattern on their own. Only when at least one other country was placed next to them did the Germans separate into 3 clusters: North Germans, South Germans plus Austrians, and Prussian Germans.

Maybe something like this is also true for the Spanish.

truth said...

@ Maju

That study of Athanasiadis has nothing to do with autosomal genome-wide DNA. It's a PCA of mtDNA.

Maju said...

No, it is autosomal DNA. However (I had forgotten the details), it covers only certain specific regions of the genome:

"Analyses were carried out on a diverse set of neutral and functional polymorphisms located in and around the coagulation factor VII and XII genomic regions (F7 and F12)".