June 05, 2009

Regional population structure in Iceland (Price et al. 2009)

In 2008 Ethnicity Struck Back, but in 2009 it already seems we're well on our way to finding structure within ethnic groups themselves, as has proved possible in Sardinia and Estonia.

A new paper shows that such structure exists in Iceland, a population often used in association studies because of its presumed homogeneity.

Individuals with most of their ancestry from one of several regions tend to cluster together in the PCA plot (top), but "random" individuals with ancestry from many regions tend to be all over the map. 

From the paper:
The ancestry predictions were correct for 47% of samples, correct to within a distance of one region for 74% of samples, and correct to within a distance of two regions for 93% of samples. The accuracy increased to 58% (87% to within one region, 97% to within two regions) when restricting to the 98 (of 250) samples with at least 16 of 32 ancestors from a single region.
Figure 3 shows the Icelandic populations in the context of Scotland and Norway. From the paper:
Based on the available data, the optimal linear combination yielded an estimate of 64% Norse and 36% Scottish ancestry, with a standard error of less than 2%.  [...] For each region, the estimate of Norse ancestry was between 62% and 65%, with a standard error of less than 2% (except region 1, for which we obtained 61% with a standard error of less than 3%).
Admixture took place so long ago that it has spread evenly across Iceland, with no region being particularly "Norse" or "Scottish" in ancestry. But while the ancestral components are the same from region to region, it is clear from the accuracy of the region estimates that Icelanders have not been panmictic: even in this very homogeneous population, some barriers to gene flow have allowed regional structure to emerge.
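
To make the idea of an "optimal linear combination" concrete, here is a minimal Python sketch. It is not the paper's actual procedure, only one simple way such an estimate could be obtained: treat the admixed population's allele frequencies as a weighted average of two source populations and solve for the weight by least squares. The function name `estimate_admixture`, the variable names, and all the frequencies are made up for illustration; the data are synthetic, and only the 64% target value is taken from the quote above.

```python
import numpy as np


def estimate_admixture(target, source_a, source_b):
    """Least-squares weight alpha so that
    target ~ alpha * source_a + (1 - alpha) * source_b.

    All arguments are per-SNP allele frequencies (hypothetical inputs).
    """
    target = np.asarray(target, dtype=float)
    source_a = np.asarray(source_a, dtype=float)
    source_b = np.asarray(source_b, dtype=float)

    diff = source_a - source_b
    denom = np.dot(diff, diff)
    if denom == 0:
        raise ValueError("source populations have identical allele frequencies")

    # Closed-form solution of the one-parameter least-squares problem.
    alpha = np.dot(target - source_b, diff) / denom
    # An ancestry fraction must lie between 0 and 1.
    return float(np.clip(alpha, 0.0, 1.0))


# Synthetic example: two closely related source populations.
rng = np.random.default_rng(0)
n_snps = 5000
source_a = rng.uniform(0.05, 0.95, n_snps)
source_b = np.clip(source_a + rng.normal(0.0, 0.05, n_snps), 0.01, 0.99)

true_alpha = 0.64  # proportion quoted in the post; everything else is invented
mixed = true_alpha * source_a + (1.0 - true_alpha) * source_b
# Small perturbation standing in for drift and sampling noise.
noisy = np.clip(mixed + rng.normal(0.0, 0.01, n_snps), 0.0, 1.0)

alpha_hat = estimate_admixture(noisy, source_a, source_b)
print(f"estimated ancestry from source_a: {alpha_hat:.3f}")
```

With many SNPs the estimate is tight even when the two sources differ only slightly, which is at least consistent with the small standard errors reported. Running the same function separately on each region's frequencies would be the analogue of the per-region estimates in the quote.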

A consequence of the recent origin of the genetic differences between Icelandic subpopulations is that allele frequency differences follow the null distribution predicted by neutral drift. Thus, there is little risk of false positive associations due to population stratification in disease association studies, despite the fact that there are genuine differences between regions.
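
What "following the null distribution" might look like in practice can be sketched with a small simulation. The Python below is my own illustration under stated assumptions, not the test used in the paper: drift is approximated by a normal perturbation with variance proportional to p(1-p), sampling error is ignored, and the function name `drift_z_scores` and every number in the example are hypothetical. Under pure drift, frequency differences scaled by the amount of drift should look like a standard normal with no extreme outliers.

```python
import numpy as np


def drift_z_scores(freq_1, freq_2):
    """Standardize per-SNP frequency differences between two subpopulations.

    Returns (z, drift), where drift is a single genome-wide estimate of the
    amount of drift separating the two groups (an Fst-like quantity).
    """
    f1 = np.asarray(freq_1, dtype=float)
    f2 = np.asarray(freq_2, dtype=float)

    mean_freq = (f1 + f2) / 2.0
    het = mean_freq * (1.0 - mean_freq)
    keep = het > 0  # drop SNPs that are fixed in both groups

    diff = f1[keep] - f2[keep]
    het = het[keep]

    # Under drift alone, E[diff^2] is about 2 * drift * p * (1 - p).
    drift = np.sum(diff ** 2) / np.sum(2.0 * het)
    z = diff / np.sqrt(2.0 * drift * het)
    return z, drift


# Synthetic example: two subpopulations drifting from one ancestral pool.
rng = np.random.default_rng(1)
n_snps = 20000
ancestral = rng.uniform(0.1, 0.9, n_snps)
true_drift = 0.001  # invented value, chosen only to represent "recent" divergence
sd = np.sqrt(true_drift * ancestral * (1.0 - ancestral))
pop_1 = np.clip(ancestral + rng.normal(0.0, 1.0, n_snps) * sd, 0.001, 0.999)
pop_2 = np.clip(ancestral + rng.normal(0.0, 1.0, n_snps) * sd, 0.001, 0.999)

z, drift_hat = drift_z_scores(pop_1, pop_2)
outlier_fraction = float(np.mean(np.abs(z) > 5.0))
print(f"estimated drift: {drift_hat:.5f}")
print(f"std of z-scores: {np.std(z):.3f}")
print(f"fraction with |z| > 5: {outlier_fraction:.5f}")
```

If some SNPs had been pushed apart by selection, or if the groups differed through ancient divergence rather than recent drift, one would expect a heavier tail of large |z| values than the simulation produces.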

PLoS Genetics doi:10.1371/journal.pgen.1000505

The Impact of Divergence Time on the Nature of Population Structure: An Example from Iceland

Alkes Price et al.


The Icelandic population has been sampled in many disease association studies, providing a strong motivation to understand the structure of this population and its ramifications for disease gene mapping. Previous work using 40 microsatellites showed that the Icelandic population is relatively homogeneous, but exhibits subtle population structure that can bias disease association statistics. Here, we show that regional geographic ancestries of individuals from Iceland can be distinguished using 292,289 autosomal single-nucleotide polymorphisms (SNPs). We further show that subpopulation differences are due to genetic drift since the settlement of Iceland 1100 years ago, and not to varying contributions from different ancestral populations. A consequence of the recent origin of Icelandic population structure is that allele frequency differences follow a null distribution devoid of outliers, so that the risk of false positive associations due to stratification is minimal. Our results highlight an important distinction between population differences attributable to recent drift and those arising from more ancient divergence, which has implications both for association studies and for efforts to detect natural selection using population differentiation.


1 comment:

Paul_Johnsen said...

They should have used a Western Norwegian sample as the "Norwegian" component, and not one from Oslo.