There have been a bunch of studies on Hispanic Americans, Native Americans, and African Americans, but very little work on European Americans (if we exclude the perennial fascination of the genetics community with Ashkenazi Jews, and some studies which included European Americans of known European parentage).
This is one of the first studies I've seen where the objective was to look at a geographically defined population of European Americans and study its diverse origins in Europe itself. While there are many European Americans whose ancestry is no mystery at all (because their ancestors arrived within living memory), there are also large numbers of them with much older ancestry, and these should at some point become the object of study, both for their own sake and because they may represent a separate evolutionary road for their ancestral European gene pool.
The STRUCTURE result, beautified by CLUMPP, is really fascinating. Most studies put sub-population labels on the chart for the clustered individuals; in this case individuals do not necessarily report a single ancestry, so they cannot be placed under a single population label. Yet it is really evident that "European Americans from New Hampshire" can be broken down into several groups, each with a distinctive ancestry.
Bayesian clustering conducted using the structure software revealed distinct subpopulations, with the highest and most reliable probabilities between a K of 5 and 7. The bar plots are shown for K = 2 to K = 8 from the CLUMPP software (aligns multiple runs of structure) from 10 runs at each K (Figure 1a). As expected, individuals in the sample appear highly admixed; however, distinct populations are discernible. The FST's increase consistently as K increases, with the average FST's for K = 4 to K = 7 around the level of “little genetic differentiation” as defined by Wright (approx. 0.05) (Figure 1c,d). The admixture values increase for lower K's, but begin to drop at K = 6 to values between 0.6–0.7 (Table S2). In selecting the most correct K, parsimony is an important consideration, i.e. that the simpler answer tends to be correct. Though there may be some validity to further subdividing the groups, the most statistically consistent and the most parsimonious K based on the structure output is K = 6. Further analysis using the ancestral data is used to describe the groupings and lends support to our selection of K = 6.
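To give a feel for what an FST of about 0.05 means, here is a minimal sketch of the textbook definition for a single biallelic SNP, FST = (H_T - H_S) / H_T, assuming equally weighted subpopulations. This is not the quantity the structure software reports (its FST values are estimated per cluster under its own model), and the function name `wright_fst` and the allele frequencies used are hypothetical illustrations, not values from the paper.

```python
def wright_fst(subpop_freqs):
    """Textbook FST for one biallelic locus.

    subpop_freqs: allele frequency of one allele in each subpopulation.
    Assumes equal subpopulation weights (an illustrative simplification).
    """
    if not subpop_freqs:
        raise ValueError("need at least one subpopulation frequency")

    n = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / n

    # Expected heterozygosity of the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # Locus is fixed everywhere: no variation to partition.
        return 0.0

    # Mean expected heterozygosity within subpopulations.
    h_within = sum(2 * p * (1 - p) for p in subpop_freqs) / n

    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical frequencies: two groups at 0.4 and 0.6.
    print(round(wright_fst([0.4, 0.6]), 4))
```

With these made-up frequencies, two groups whose allele frequencies differ by 0.2 already sit near the "little genetic differentiation" range quoted above, which illustrates how subtle the structure among such groups can be.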
These results suggest that genetic population structure is detectable in a highly admixed US population and that this structure correlates with self-reported ancestry. To our knowledge, this is the first time such an investigation has uncovered a strong link between structure and ancestry in what would otherwise be assumed to be a homogeneous US state where most individuals are of European ancestry. Our data indicate that admixture has not eliminated the genetic structure found within Europe, and descendants of the Russian, Polish and Lithuanian immigrants remain genetically distinct from the rest of the population and are closely related to one another.
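One simple way to ask whether a self-reported ancestry is over-represented in a genetic cluster is a one-sided hypergeometric test. The sketch below is only an illustration of that idea; the excerpt does not say which statistic the authors used, and the function name `hypergeom_tail` and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_tail(total, with_ancestry, cluster_size, observed):
    """P(X >= observed) when cluster_size individuals are drawn without
    replacement from `total`, of whom `with_ancestry` report the ancestry.

    A small p-value suggests the ancestry is enriched in the cluster.
    """
    if not (0 <= with_ancestry <= total and 0 <= cluster_size <= total):
        raise ValueError("counts must lie between 0 and total")

    upper = min(with_ancestry, cluster_size)
    favourable = sum(
        comb(with_ancestry, i) * comb(total - with_ancestry, cluster_size - i)
        for i in range(max(observed, 0), upper + 1)
    )
    return favourable / comb(total, cluster_size)


if __name__ == "__main__":
    # Hypothetical counts: 200 people, 20 report the ancestry,
    # a cluster of 30 people contains 10 of them.
    print(hypergeom_tail(200, 20, 30, 10))
```

Because individuals in this sample can report more than one ancestry, a test like this would have to be run per ancestry and corrected for multiple comparisons; that step is left out of the sketch.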
Exploratory analysis revealed that among the ancestries, those reported by at least five individuals were: American Indian (n = 32), Austria (n = 5), Belgium (n = 5), Canadian Indian (n = 14), Canada (n = 113), Czech Republic (n = 5), England (n = 355), Finland (n = 7), French-Canadian (n = 54), France (n = 173), Germanic (countries where Germanic languages spoken) (n = 5), Germany (n = 110), Greece (n = 9), Ireland (n = 218), Italy (n = 41), Jewish (n = 6), Lithuania (n = 12), Canadian Maritime Provinces (n = 6), Netherlands (n = 25), Poland (n = 44), Russia (n = 13), Scotland (n = 157), Sweden (n = 24), Switzerland (n = 7), UK (n = 11), US (n = 42), Wales (n = 24).
PLoS ONE doi:10.1371/journal.pone.0006928
Genetic Population Structure Analysis in New Hampshire Reveals Eastern European Ancestry
Chantel D. Sloan et al.
Genetic structure due to ancestry has been well documented among many divergent human populations. However, the ability to associate ancestry with genetic substructure without using supervised clustering has not been explored in more presumably homogeneous and admixed US populations. The goal of this study was to determine if genetic structure could be detected in a United States population from a single state where the individuals have mixed European ancestry. Using Bayesian clustering with a set of 960 single nucleotide polymorphisms (SNPs) we found evidence of population stratification in 864 individuals from New Hampshire that can be used to differentiate the population into six distinct genetic subgroups. We then correlated self-reported ancestry of the individuals with the Bayesian clustering results. Finnish and Russian/Polish/Lithuanian ancestries were most notably found to be associated with genetic substructure. The ancestral results were further explained and substantiated using New Hampshire census data from 1870 to 1930 when the largest waves of European immigrants came to the area. We also discerned distinct patterns of linkage disequilibrium (LD) between the genetic groups in the growth hormone receptor gene (GHR). To our knowledge, this is the first time such an investigation has uncovered a strong link between genetic structure and ancestry in what would otherwise be considered a homogenous US population.