This article is a review that presents a genetic map of the British Isles from an upcoming study by Leslie et al. (2014) that is listed in the references as being "in press" in Nature. This may very well be the big POBI study of the British Isles that has been talked about for years now.
Genetics February 1, 2015 vol. 199 no. 2 267-279
Genetic Characterization of Human Populations: From ABO to a Genetic Map of the British People
Walter Bodmer
From 1900, when Landsteiner first described the ABO blood groups, to the present, the methods used to characterize the genetics of human populations have undergone a remarkable development. Concomitantly, our understanding of the history and spread of human populations across the earth has become much more detailed. As has often been said, a better understanding of the genetic relationships among the peoples of the world is one of the best antidotes to racial prejudices. Such an understanding provides us with a fascinating, improved insight into our origins as well as with valuable information about population differences that are of medical relevance. The study of genetic polymorphisms has been essential to the analysis of the relationships between human populations. The evolution of methods used to study human polymorphisms and the resulting contributions to our understanding of human health and history is the subject of this Perspectives.
Link
February 11, 2015
June 15, 2014
Genetic structure of Mexico
This article is free to read with registration.
Science 13 June 2014:
Vol. 344 no. 6189 pp. 1280-1285
The genetics of Mexico recapitulates Native American substructure and affects biomedical traits
Andrés Moreno-Estrada
Mexico harbors great cultural and ethnic diversity, yet fine-scale patterns of human genome-wide variation from this region remain largely uncharacterized. We studied genomic variation within Mexico from over 1000 individuals representing 20 indigenous and 11 mestizo populations. We found striking genetic stratification among indigenous populations within Mexico at varying degrees of geographic isolation. Some groups were as differentiated as Europeans are from East Asians. Pre-Columbian genetic substructure is recapitulated in the indigenous ancestry of admixed mestizo individuals across the country. Furthermore, two independently phenotyped cohorts of Mexicans and Mexican Americans showed a significant association between subcontinental ancestry and lung function. Thus, accounting for fine-scale ancestry patterns is critical for medical and population genetic studies within Mexico, in Mexican-descent populations, and likely in many other populations worldwide.
Link
May 05, 2014
SPAMIX for spatial localization of admixed individuals
A new preprint on bioRxiv suggests that it is possible to place a person's four grandparents on a map. This is often a problem for people of mixed ancestry, who tend to plot in PCAs at some average location between their ancestors (so someone who is Swedish+Italian+Spanish+Russian might end up somewhere in central Europe even though none of his ancestors are central European).
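To make the averaging effect concrete, here is a minimal sketch. It is not taken from the preprint: it assumes, purely for illustration, that capital-city coordinates can stand in for the four ancestral populations, and that a plain arithmetic mean of latitude and longitude is a fair proxy for where such a person would plot.

```python
# Rough illustration: an admixed person tends to plot near the average of
# the ancestors' positions. Capital-city coordinates stand in for the four
# ancestral populations (an assumption made purely for illustration).
grandparents = {
    "Swedish": (59.33, 18.07),   # Stockholm (lat, lon)
    "Italian": (41.90, 12.50),   # Rome
    "Spanish": (40.42, -3.70),   # Madrid
    "Russian": (55.76, 37.62),   # Moscow
}


def naive_centroid(points):
    """Arithmetic mean of (lat, lon) pairs; a crude approximation that
    ignores the Earth's curvature."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)


lat, lon = naive_centroid(list(grandparents.values()))
print(f"average position: {lat:.2f} N, {lon:.2f} E")
```

The average lands at roughly 49 N, 16 E, which is in central Europe, far from any of the four assumed origins.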
This preprint appeared shortly after the GPS method of Elhaik et al. (2014), which presents evidence of being more accurate than SPA, so it will be interesting to see a comparison between SPAMIX and GPS. My experience in the Dodecad Project suggests that this is a useful feature: the Dodecad Oracle could sometimes be used for this purpose (e.g., it could infer that a person with one Ashkenazi Jewish grandparent and three English ones was a ~3/4 British + ~1/4 Jewish mix), but it is limited to mixtures of two populations, so it could not cope with the case of 3-4 grandparents with different origins. There is an under-appreciated pool of adoptees who would love a tool like that, and there are also obvious forensic implications if something like this really works.
bioRxiv doi: 10.1101/004713
Spatial localization of recent ancestors for admixed individuals
Wen-Yun Yang et al.
Ancestry analysis from genetic data plays a critical role in studies of human disease and evolution. Recent work has introduced explicit models for the geographic distribution of genetic variation and has shown that such explicit models yield superior accuracy in ancestry inference over non-model-based methods. Here we extend such work to introduce a method that models admixture between ancestors from multiple sources across a geographic continuum. We devise efficient algorithms based on hidden Markov models to localize on a map the recent ancestors (e.g. grandparents) of admixed individuals, joint with assigning ancestry at each locus in the genome. We validate our methods using empirical data from individuals with mixed European ancestry from the POPRES study and show that our approach is able to localize their recent ancestors within an average of 470Km of the reported locations of their grandparents. Furthermore, simulations from real POPRES genotype data show that our method attains high accuracy in localizing recent ancestors of admixed individuals in Europe (an average of 550Km from their true location for localization of 2 ancestries in Europe, 4 generations ago). We explore the limits of ancestry localization under our approach and find that performance decreases as the number of distinct ancestries and generations since admixture increases. Finally, we build a map of expected localization accuracy across admixed individuals according to the location of origin within Europe of their ancestors.
Link
May 01, 2014
The Geographic Position Structure (GPS) algorithm of Elhaik et al. (2014) is basically wrong
In the previous post I showed that the new paper by Elhaik et al. presents as its own (without citation) two of my ideas that were published on the web ~2 years before the paper was submitted.
I also took some time to evaluate the new aspect of their "geographical positioning system" (GPS) which is an algorithm to determine the geographic position of samples given their genetic distances to a group of reference populations. This is described under the heading "Calculating the biogeographical origin of a test sample" of their paper and I include a screenshot of it on the left to help you follow along.
From Equation (2) it is easy to see that the predicted position of the test sample is shifted away from the position of the best-matching reference population (Position_best) and towards the other reference populations (Position(m)), with the contribution of each reference population weighted by w_m, the ratio of the distance of the closest population to the distance of the m-th reference population.
A little basic geometry (right) informs us that points on the circle have a constant ratio of distances (d2/d1) to points B and A.
That is, BC/CA = BD/DA = BP/PA
In terms of the Elhaik et al. (2014) algorithm this constant ratio is w_m, so if A and B are two reference populations, and the test sample is e.g., either C or D, then the same constant weight applies (and this is true for all points on the circle).
In practical terms, the algorithm of Elhaik et al. (2014) will predict the same geographical locations for all points on the circle. This will be perfectly accurate for C and biased for every other point on the circle (with D being the absolute worst).
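The constant-ratio property can be checked numerically. The sketch below does not implement Equation (2) of the paper; it only computes the weight as described above (distance to the closest reference over distance to the other one). The coordinates of A and B and the ratio of 2 are made-up values, and distances are treated as ordinary planar distances.

```python
import math

A, B = (0.0, 0.0), (3.0, 0.0)   # two hypothetical reference populations
# Chosen ratio for this example: every point P on the circle has PB / PA = 2.

# Circle of Apollonius for that ratio: its diameter runs between the points
# that divide AB internally (C) and externally (D) in the ratio 2.
C = (1.0, 0.0)    # CA = 1, CB = 2
D = (-3.0, 0.0)   # DA = 3, DB = 6
center = ((C[0] + D[0]) / 2, (C[1] + D[1]) / 2)
radius = math.dist(C, D) / 2


def weight(p):
    """Distance to the closest reference (A) over the distance to B."""
    return math.dist(p, A) / math.dist(p, B)


weights = []
for deg in range(0, 360, 45):
    t = math.radians(deg)
    p = (center[0] + radius * math.cos(t), center[1] + radius * math.sin(t))
    weights.append(weight(p))
    print(f"{deg:3d} deg  P=({p[0]:+.2f}, {p[1]:+.2f})  w={weights[-1]:.3f}")
```

Every sampled point, including C (0 degrees) and D (180 degrees), gets the same weight of 0.5, so a method that sees only this ratio cannot tell them apart.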
It is actually easy to test whether the test population is like C or like D; in the case of C, CA+CB=AB. This is a simple test of collinearity that exploits not only the distances of the test population to the reference ones, but also the distance of the two reference populations from each other. And, indeed, it's easy to see that for a test population P we can estimate genetic distances AP, PB, and AB, and these uniquely define the circle on which the point must lie. Do this for all pairs of reference populations, find the distribution of the intersections of these circles, find a peak of this distribution (if one exists), et voilà: you have a sound mechanism for localizing individuals based on genetic distances. I expect to see something like this in Nature Communications circa 2016.
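A minimal sketch of the CA+CB=AB test, and of how the three distances constrain P, might look as follows. It assumes the distances behave like Euclidean distances on a flat two-dimensional map, which genetic distances only approximate; under that assumption the three distances fix P up to a reflection across the line AB. The function names are hypothetical.

```python
import math


def is_between(ap, pb, ab, tol=1e-9):
    """True when P lies on the segment AB, i.e. AP + PB equals AB."""
    return abs(ap + pb - ab) <= tol * max(ab, 1.0)


def locate(ap, pb, ab):
    """Place A at (0, 0) and B at (ab, 0); return (x, |y|) of P.
    The sign of y cannot be recovered from the three distances alone."""
    x = (ap**2 - pb**2 + ab**2) / (2 * ab)
    y_sq = ap**2 - x**2
    return x, math.sqrt(max(y_sq, 0.0))


# With A = (0, 0) and B = (3, 0) as hypothetical references:
print(is_between(1, 2, 3))   # a sample like C, between A and B
print(is_between(3, 6, 3))   # a sample like D, on the line but outside AB
print(locate(math.sqrt(5), math.sqrt(20), 3))   # a sample off the line
```

Both C-like and D-like samples have the same distance ratio, but only the first passes the betweenness test, which is the extra information the pairwise reference distance AB provides.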
April 30, 2014
Nature Communications, the Genographic Project, Elhaik et al. re-discover zombies, the Oracle, etc. 3 years after the fact...
... and (sadly) do not care to cite my lowly blog.
From the new paper's Methods:
To infer the putative ancestral populations, we applied ADMIXTURE (ref. 46) in an unsupervised mode to the filtered data set. This analysis uses a maximum likelihood approach to determine the admixture proportions of the individuals in question assuming they emerged from K hypothetical populations. We speculated that our method will be the most accurate when populations have uniform admixture assignments. In choosing the value of K that seemed to best satisfy this condition, we experimented with different Ks ranging from 6 to 12. We identified a substructure at K=10 in which populations appeared homogeneous in their admixture composition. Higher values of K yielded noise that appeared as ancestry shared by very few individuals within the same populations. ADMIXTURE outputs the speculated allele frequencies of each SNP for each hypothetical population.
Using these data, we simulated 15 samples for each hypothetical population and plotted them in a PCA analysis with the Genographic populations. We observed that two hypothetical populations were markedly close to one another, suggesting they share the same ancestry and eliminated one of them to avoid redundancy. The remaining nine populations were considered the putative ancestral populations and were used in all further analyses.
Given nine admixture proportions for a sample of unknown geographic origin obtained using ADMIXTURE’s supervised approach with the nine putative ancestral populations, we calculated the Euclidean distance between its admixture proportions and the N reference populations (GEN). All reference populations were sorted in an ascending order according to their genetic distance from the sample.
I'm sure my readers and users of DIYDodecad know exactly why this is a carbon copy of the tools I developed for the Dodecad Project. But, in any case...
The most exciting use of "zombies" is to convert unsupervised ADMIXTURE runs into supervised ones. In unsupervised mode, ADMIXTURE treats all individuals alike, and tries to infer their ancestral proportions. In supervised mode, some individuals are treated as "fixed" (belonging 100% in one of K ancestral components), and the ancestry of the rest is inferred.
The idea is fairly simple: run an unsupervised ADMIXTURE analysis once to generate allele frequencies for your K ancestral components; then generate zombie populations using these allele frequencies; whenever you want to estimate admixture proportions in new samples run supervised ADMIXTURE analysis using the zombie populations.
... and the first post on the Oracle, which shows how to find proximity to a population by calculating Euclidean distance in the space of admixture proportions between reference populations and a test individual (and also considers mixtures of populations).
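The two ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not the DIYDodecad code or the paper's implementation: the allele frequencies, population names, and admixture proportions are invented, the function names are hypothetical, and ADMIXTURE itself is not called. The mixture mode of the Oracle is left out.

```python
import math
import random


def make_zombies(freqs, n, seed=0):
    """Simulate n diploid genotypes (0, 1 or 2 allele copies per SNP) from
    one ancestral component's allele frequencies."""
    rng = random.Random(seed)
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n)]


def oracle(sample, references):
    """Rank reference populations by Euclidean distance in the space of
    admixture proportions, closest first."""
    ranked = [(math.dist(sample, props), name)
              for name, props in references.items()]
    return sorted(ranked)


# Hypothetical allele frequencies for one component at five SNPs.
zombies = make_zombies([0.1, 0.5, 0.9, 0.0, 1.0], n=15)

# Hypothetical admixture proportions over K=3 components.
references = {
    "PopA": (0.90, 0.05, 0.05),
    "PopB": (0.10, 0.80, 0.10),
    "PopC": (0.30, 0.30, 0.40),
}
for dist, name in oracle((0.85, 0.10, 0.05), references):
    print(f"{name}: {dist:.3f}")
```

In a real workflow the simulated genotypes would be written out in the genotype format the supervised run expects and labeled as belonging 100% to their component; that step is omitted here.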
I am flattered that the zombie approach has been copied and tested, but I doubt that all of the paper's 32 authors were unaware of the previous publication of the gist of their "new" method.
Nature Communications 5, Article number: 3513 doi:10.1038/ncomms4513
Geographic population structure analysis of worldwide human populations infers their biogeographical origins
Eran Elhaik et al.
The search for a method that utilizes biological information to predict humans’ place of origin has occupied scientists for millennia. Over the past four decades, scientists have employed genetic data in an effort to achieve this goal but with limited success. While biogeographical algorithms using next-generation sequencing data have achieved an accuracy of 700 km in Europe, they were inaccurate elsewhere. Here we describe the Geographic Population Structure (GPS) algorithm and demonstrate its accuracy with three data sets using 40,000–130,000 SNPs. GPS placed 83% of worldwide individuals in their country of origin. Applied to over 200 Sardinians villagers, GPS placed a quarter of them in their villages and most of the rest within 50 km of their villages. GPS’s accuracy and power to infer the biogeography of worldwide individuals down to their country or, in some cases, village, of origin, underscores the promise of admixture-based methods for biogeography and has ramifications for genetic ancestry testing.
Link
March 16, 2014
Back-migration of Yeniseian into Asia from Beringia
PLOS One DOI: 10.1371/journal.pone.0091722
Linguistic Phylogenies Support Back-Migration from Beringia to Asia
Mark A. Sicoli, Gary Holton
Recent arguments connecting Na-Dene languages of North America with Yeniseian languages of Siberia have been used to assert proof for the origin of Native Americans in central or western Asia. We apply phylogenetic methods to test support for this hypothesis against an alternative hypothesis that Yeniseian represents a back-migration to Asia from a Beringian ancestral population. We coded a linguistic dataset of typological features and used neighbor-joining network algorithms and Bayesian model comparison based on Bayes factors to test the fit between the data and the linguistic phylogenies modeling two dispersal hypotheses. Our results support that a Dene-Yeniseian connection more likely represents radiation out of Beringia with back-migration into central Asia than a migration from central or western Asia to North America.
Link
March 04, 2014
Admixture in US populations
An interesting blog post from 23andMe:
In an update to that work, our researcher Kasia Bryc found that about 4 percent of whites have at least 1 percent or more African ancestry.
and:
Although it is a relatively small percentage, the percentage indicates that an individual with at least 1 percent African ancestry had an African ancestor within the last six generations, or in the last 200 years. This data also suggests that individuals with mixed parentage at some point were absorbed into the white population.
Looking a little more deeply into the data, Kasia also found that the percentage of whites with hidden African ancestry differed significantly from state-to-state. Southern states with the highest African American populations, tended to have the highest percentages of hidden African ancestry. In South Carolina at least 13 percent of self-identified whites have 1 percent or more African ancestry, while in Louisiana the number is a little more than 12 percent. In Georgia and Alabama the number is about 9 percent. The differences perhaps point to different social and cultural histories within the south.
Previous published studies estimate that on average African Americans had about 82 percent African ancestry and about 18 percent European ancestry. But in self-identified African Americans in 23andMe’s database, Kasia found the average amount of African ancestry was closer to 73 percent.
I don't think that is necessarily the average percentage in the general African American population, as the subset of African Americans who take 23andMe tests may not be representative (e.g., it may come more from cities where African Americans may have more opportunity to admix with European Americans).
and:
On average Latinos had about 70 percent European ancestry, 14 percent Native American ancestry and 6 percent African ancestry. The remainder ancestry is difficult to assign because the DNA is either shared by a number of different populations around the world, or because it’s from understudied populations, such as Native Americans. Obviously that large “unassigned” percentage means that those “averages” could be higher. As with African Americans, looking at the regional and state-to-state numbers for self-identified Latinos, the differences are striking.
For example, some Latinos have no discernible Native American ancestry, while others have as much as 50 percent of the ancestry being Native American. Latinos in states in the Southwest bordering Mexico (New Mexico, Texas, California and Arizona) have the greatest percentage of Native American ancestry. Latinos in states with the largest proportion of African Americans in their population (South Carolina, Louisiana and Alabama) have the highest percentage of African Ancestry.
23andMe may have a couple of orders of magnitude more sampled individuals than anything that appears in most published studies and it's great to see this being put to good use.
It'd be great if someone at 23andMe did some more analyses over their huge database. I can only imagine what a flashPCA with half a million individuals from around the world would look like; even if it told us nothing new about human history it would be quite a cool picture to look at.
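The "six generations" figure in the 23andMe quote above is consistent with simple halving arithmetic. The sketch below assumes a single, fully African ancestor g generations back and uses the expected autosomal share of 1/2^g; actual inherited shares vary around that expectation because of recombination, so this is a back-of-the-envelope check rather than 23andMe's method.

```python
def expected_share(generations):
    """Expected fraction of the autosomal genome inherited from one
    ancestor the given number of generations back."""
    return 1 / 2**generations


for g in range(1, 9):
    print(f"{g} generations back: {expected_share(g):.2%} expected")
```

The expected share is 1/64 (about 1.56 percent) at six generations and 1/128 (about 0.78 percent) at seven, so 1 percent falls between the two.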
June 20, 2013
Genetic load accumulation during range expansions
arXiv:1306.1652 [q-bio.PE]
On the accumulation of deleterious mutations during range expansions
Stephan Peischl et al.
We investigate the effect of spatial range expansions on the evolution of fitness when beneficial and deleterious mutations co-segregate. We perform individual-based simulations of a uniform linear habitat and complement them with analytical approximations for the evolution of mean fitness at the edge of the expansion. We find that deleterious mutations accumulate steadily on the wave front during range expansions, thus creating an expansion load. Reduced fitness due to the expansion load is not restricted to the wave front but occurs over a large proportion of newly colonized habitats. The expansion load can persist and represent a major fraction of the total mutation load thousands of generations after the expansion. Our results extend qualitatively and quantitatively to two-dimensional expansions. The phenomenon of expansion load may explain growing evidence that populations that have recently expanded, including humans, show an excess of deleterious mutations. To test the predictions of our model, we analyze patterns of neutral and non-neutral genetic diversity in humans and find an excellent fit between theory and data.
Link
June 03, 2013
mtDNA from Nepal and Tibet (Gayden et al. 2013)
Am J Phys Anthropol. 2013 Jun;151(2):169-82. doi: 10.1002/ajpa.22240. Epub 2013 Apr 12.
The Himalayas: Barrier and conduit for gene flow.
Gayden T, Perez A, Persad PJ, Bukhari A, Chennakrishnaiah S, Simms T, Maloney T, Rodriguez K, Herrera RJ.
Abstract
The Himalayan mountain range is strategically located at the crossroads of the major cultural centers in Asia, the Middle East and Europe. Although previous Y-chromosome studies indicate that the Himalayas served as a natural barrier for gene flow from the south to the Tibetan plateau, this region is believed to have played an important role as a corridor for human migrations between East and West Eurasia along the ancient Silk Road. To evaluate the effects of the Himalayan mountain range in shaping the maternal lineages of populations residing on either side of the cordillera, we analyzed mitochondrial DNA variation in 344 samples from three Nepalese collections (Newar, Kathmandu and Tamang) and a general population of Tibet. Our results revealed a predominantly East Asian-specific component in Tibet and Tamang, whereas Newar and Kathmandu are both characterized by a combination of East and South Central Asian lineages. Interestingly, Newar and Kathmandu harbor several deep-rooted Indian lineages, including M2, R5, and U2, whose coalescent times from this study (U2, >40 kya) and previous reports (M2 and R5, >50 kya) suggest that Nepal was inhabited during the initial peopling of South Central Asia. Comparisons with our previous Y-chromosome data indicate sex-biased migrations in Tamang and a founder effect and/or genetic drift in Tamang and Newar. Altogether, our results confirm that while the Himalayas acted as a geographic barrier for human movement from the Indian subcontinent to the Tibetan highland, it also served as a conduit for gene flow between Central and East Asia.
Link
May 31, 2013
LOCO-LD paper and software
Link to software.
AJHG doi: 10.1016/j.ajhg.2013.04.023
Enhanced Localization of Genetic Samples through Linkage-Disequilibrium Correction
Yael Baran et al.
Characterizing the spatial patterns of genetic diversity in human populations has a wide range of applications, from detecting genetic mutations associated with disease to inferring human history. Current approaches, including the widely used principal-component analysis, are not suited for the analysis of linked markers, and local and long-range linkage disequilibrium (LD) can dramatically reduce the accuracy of spatial localization when unaccounted for. To overcome this, we have introduced an approach that performs spatial localization of individuals on the basis of their genetic data and explicitly models LD among markers by using a multivariate normal distribution. By leveraging external reference panels, we derive closed-form solutions to the optimization procedure to achieve a computationally efficient method that can handle large data sets. We validate the method on empirical data from a large sample of European individuals from the POPRES data set, as well as on a large sample of individuals of Spanish ancestry. First, we show that by modeling LD, we achieve accuracy superior to that of existing methods. Importantly, whereas other methods show decreased performance when dense marker panels are used in the inference, our approach improves in accuracy as more markers become available. Second, we show that accurate localization of genetic data can be achieved with only a part of the genome, and this could potentially enable the spatial localization of admixed samples that have a fraction of their genome originating from a given continent. Finally, we demonstrate that our approach is resistant to distortions resulting from long-range LD regions; such distortions can dramatically bias the results when unaccounted for.
Link
April 17, 2013
Y-chromosomes of Native South Americans (Roewer et al. 2013)
It would be useful to sequence these South American C3* Y-chromosomes to see how they are related to the C3b-P39 found in some native North Americans as well as other unresolved C3* from Asia. It would also be worthwhile to look at autosomal data from these populations, to see if they are wholly descended from First Americans, or have evidence of more recent gene flow from East Asia.
PLoS Genet 9(4): e1003460. doi:10.1371/journal.pgen.1003460
Continent-Wide Decoupling of Y-Chromosomal Genetic Variation from Language and Geography in Native South Americans
Lutz Roewer et al.
Numerous studies of human populations in Europe and Asia have revealed a concordance between their extant genetic structure and the prevailing regional pattern of geography and language. For native South Americans, however, such evidence has been lacking so far. Therefore, we examined the relationship between Y-chromosomal genotype on the one hand, and male geographic origin and linguistic affiliation on the other, in the largest study of South American natives to date in terms of sampled individuals and populations. A total of 1,011 individuals, representing 50 tribal populations from 81 settlements, were genotyped for up to 17 short tandem repeat (STR) markers and 16 single nucleotide polymorphisms (Y-SNPs), the latter resolving phylogenetic lineages Q and C. Virtually no structure became apparent for the extant Y-chromosomal genetic variation of South American males that could sensibly be related to their inter-tribal geographic and linguistic relationships. This continent-wide decoupling is consistent with a rapid peopling of the continent followed by long periods of isolation in small groups. Furthermore, for the first time, we identified a distinct geographical cluster of Y-SNP lineages C-M217 (C3*) in South America. Such haplotypes are virtually absent from North and Central America, but occur at high frequency in Asia. Together with the locally confined Y-STR autocorrelation observed in our study as a whole, the available data therefore suggest a late introduction of C3* into South America no more than 6,000 years ago, perhaps via coastal or trans-Pacific routes. Extensive simulations revealed that the observed lack of haplogroup C3* among extant North and Central American natives is only compatible with low levels of migration between the ancestor populations of C3* carriers and non-carriers. 
In summary, our data highlight the fact that a pronounced correlation between genetic and geographic/cultural structure can only be expected under very specific conditions, most of which are likely not to have been met by the ancestors of native South Americans.
Link
January 25, 2013
The case for earlier Out-of-Africa (Boivin et al. 2013)
An informative review critical of the ~60kya coastal-Out-of-Africa hypothesis. On the left, the authors' estimate of the distribution of hominin groups during MIS5.
From the paper:
Human Dispersal Across Diverse Environments of Asia during the Upper Pleistocene
Nicole Boivin et al.
The initial out of Africa dispersal of H. sapiens, which saw anatomically modern humans reach the Levant in Marine Isotope Stage 5, is generally regarded as a ‘failed dispersal’. Fossil, archaeological and genetic findings are seen to converge around a consensus view that a single population of H. sapiens exited Africa sometime around 60 thousand years ago (ka), and rapidly reached Australia by following a coastal dispersal corridor. We challenge the notion that current evidence supports this straightforward model. We argue that the fossil and archaeological records are too incomplete, the coastal route too problematic, and recent genomic evidence too incompatible for researchers not to remain fully open to other hypotheses. We specifically explore the possibility of a sustained exit by anatomically modern humans, drawing in particular upon palaeoenvironmental data across southern Asia to demonstrate its feasibility. Current archaeological, genetic and fossil data are not incompatible with the model presented, and appear to increasingly favour a more complex out of Africa scenario involving multiple exits, varying terrestrial routes, a sub-divided African source population, slower progress to Australia, and a degree of interbreeding with archaic varieties of Homo.
Link
From the paper:
Another under-appreciated issue is the anomalous nature of the genetic evidence for a rapid spread of modern humans from Africa to Asia. Echoing the fossil date anomaly, the mtDNA branch lengths for sampled populations are longest for those which are farthest east, in Near Oceania, and shortest in the Asian areas that would have been encountered first (Merriwether et al., 2005; Oppenheimer, 2009). The real problem, however, is that the variation in branch lengths suggests that a single genotype engaged in the expansion actually existed for 30 ka, which does not support a rapid expansion. The anomaly can be explained by what we call an ‘M buffer’ effect (see Supplementary material A) which implies that the branch ages we observe are considerable underestimates of the time of arrival of the genotype to these areas. Such anomalously long-lived genotypes have been directly observed through ancient DNA in species such as the Iberian lynx (Dalen et al., 2011).
and:
We have focused here on the possibility that the modern human exit recorded by fossil evidence in the Levant in MIS 5 does not represent a failed dispersal, and that in fact our species was not only in the Levant but also the Arabian peninsula during this marine isotope stage, and spread to India before the Toba eruption at 74 ka (Petraglia et al., 2007). Another valid hypothesis we do not explore here is that H. sapiens was able to leave Africa in MIS 6 via a grassland corridor (Frumkin et al., 2011; see also Scally and Durbin, 2012). Yet another is that our species dispersed out of Africa shortly after its first appearance c. 195 ka, in MIS 7 (Dennell and Roebroeks, 2005: 1102). One other possibility is that there were several, separate dispersals of our species out of Africa (Dennell and Petraglia, 2012). At the same time, we acknowledge that major demographic changes occurred in MIS 4 and MIS 3, perhaps explaining the relatively young mtDNA coalescence age in living populations. The increasing evidence for complexity as well as the clear patterns of bias for all records, whether archaeological, fossil or genetic, suggests the need for an open mind to multiple scenarios for Out of Africa, as well as for more rather than less complex models of H. sapiens dispersal across Eurasia.
Quaternary International doi:10.1016/j.quaint.2013.01.008
January 03, 2013
Body form variation of prehistoric Jomon (Fukase et al. 2012)
Am J Phys Anthropol DOI: 10.1002/ajpa.22112
Geographic variation in body form of prehistoric Jomon males in the Japanese archipelago: Its ecogeographic implications
Hitoshi Fukase et al.
Diversity of human body size and shape is often biogeographically interpreted in association with climatic conditions. According to Bergmann's and Allen's rules, populations in regions with a cold climate are expected to display an overall larger body and smaller/shorter extremities than those in warm/hot environments. In the present study, the skeletal limb size and proportions of prehistoric Jomon hunter-gatherers, who extensively inhabited subarctic to subtropical areas in the ancient Japanese archipelago, were examined to evaluate whether or not the inter-regional differences follow such ecogeographic patterns. Results showed that the Jomon intralimb proportions including relative distal limb lengths did not differ significantly among five regions from northern Hokkaido to the southern Okinawa Islands. This suggests a limited co-variability of the intralimb proportions with climate, particularly within genealogically close populations. In contrast, femoral head breadth (associated with body mass) and skeletal limb lengths were found to be significantly and positively correlated with latitude, suggesting a north-south geographical cline in the body size. This gradient therefore comprehensively conforms to Bergmann's rule, and may stem from multiple potential factors such as phylogenetic constraints, microevolutionary adaptation to climatic/geographic conditions during the Jomon period, and nutritional and physiological response during ontogeny. Specifically, the remarkably small-bodied Jomon in the Okinawa Islands can also be explained as an adjustment to subtropical and insular environments. Thus, the findings obtained in this study indicate that Jomon people, while maintaining fundamental intralimb proportions, displayed body size variation in concert with ambient surroundings.
Link
December 27, 2012
Zoogeographic map of the world
I have written informally about the Sahara-Arabia belt in conjunction with my "two deserts" theory of modern human origins (=pre-100kya in North Africa, post-70kya from Arabia), so it's nice to see that it corresponds to some real zoogeographic entity derived from the distribution of thousands of species. So, perhaps an early evolution of modern humans in that area, followed by their dispersal and admixture with other hominins living in the Palearctic and Afrotropical regions might make sense.
Science DOI: 10.1126/science.1228282
An Update of Wallace's Zoogeographic Regions of the World
Ben G. Holt et al.
Modern attempts to produce biogeographic maps focus on the distribution of species and are typically drawn without phylogenetic considerations. Here, we generate a global map of zoogeographic regions by combining data on the distributions and phylogenetic relationships of 21,037 species of amphibians, birds, and mammals. We identify 20 distinct zoogeographic regions, which are grouped into 11 larger realms. We document the lack of support for several regions previously defined based on distributional data and show that spatial turnover in the phylogenetic composition of vertebrate assemblages is higher in the Southern than in the Northern Hemisphere. We further show that the integration of phylogenetic information provides valuable insight on historical relationships among regions, permitting the identification of evolutionarily unique regions of the world.
Link
December 08, 2012
Main orientations of human genetic differentiation (Jay et al. 2012)
Mol Biol Evol (2012) doi: 10.1093/molbev/mss259
Anisotropic isolation by distance: the main orientations of human genetic differentiation
Flora Jay et al.
Genetic differentiation among human populations is greatly influenced by geography due to the accumulation of local allele frequency differences. However, little is known about the possibly different increment of genetic differentiation along the different geographical axes (north-south, east-west, etc). Here we provide new methods to examine the asymmetrical patterns of genetic differentiation. We analyzed genome-wide polymorphism data from populations in Africa (n = 29), Asia (n = 26), America (n = 9) and Europe (n = 38), and we found that the major orientations of genetic differentiation are north-south in Europe and Africa, east-west in Asia, but no preferential orientation was found in the Americas. Additionally, we showed that the localization of the individual geographic origins based on SNP data was not equally precise along all orientations. Confirming our findings, we obtained that in each continent, the orientation along which the precision is maximal corresponds to the orientation of maximum differentiation. Our results have implications for interpreting human genetic variation in terms of isolation by distance and spatial range expansion processes. In Europe for instance, the precise NNW-SSE axis of main European differentiation can not be explained by a simple Neolithic demic diffusion model without admixture with the local populations because in that case the orientation of greatest differentiation should be perpendicular to the direction of expansion. In addition to humans, anisotropic analyses can guide the description of genetic differentiation for other organisms and provide information on expansions of invasive species or the processes of plant dispersal.
Link
November 14, 2012
High altitude adaptation in Ethiopia
The anthropometric characteristics on pp. 49-50 may also be of interest. It seems that, among males, Amhara highlanders are shorter, thinner, and lighter than their co-ethnic lowlanders. Oromo highlanders, on the other hand, appear to be heavier and less thin.
arXiv:1211.3053 [q-bio.PE]
The genetic architecture of adaptations to high altitude in Ethiopia
Gorka Alkorta-Aranburu, Cynthia M. Beall, David B. Witonsky, Amha Gebremedhin, Jonathan K. Pritchard, Anna Di Rienzo
Although hypoxia is a major stress on physiological processes, several human populations have survived for millennia at high altitudes, suggesting that they have adapted to hypoxic conditions. This hypothesis was recently corroborated by studies of Tibetan highlanders, which showed that polymorphisms in candidate genes show signatures of natural selection as well as well-replicated association signals for variation in hemoglobin levels. We extended genomic analysis to two Ethiopian ethnic groups: Amhara and Oromo. For each ethnic group, we sampled low and high altitude residents, thus allowing genetic and phenotypic comparisons across altitudes and across ethnic groups. Genome-wide SNP genotype data were collected in these samples by using Illumina arrays. We find that variants associated with hemoglobin variation among Tibetans or other variants at the same loci do not influence the trait in Ethiopians. However, in the Amhara, SNP rs10803083 is associated with hemoglobin levels at genome-wide levels of significance. No significant genotype association was observed for oxygen saturation levels in either ethnic group. Approaches based on allele frequency divergence did not detect outliers in candidate hypoxia genes, but the most differentiated variants between high- and lowlanders have a clear role in pathogen defense. Interestingly, a significant excess of allele frequency divergence was consistently detected for genes involved in cell cycle control, DNA damage and repair, thus pointing to new pathways for high altitude adaptations. Finally, a comparison of CpG methylation levels between high- and lowlanders found several significant signals at individual genes in the Oromo.
Link
October 03, 2012
rolloff analysis of South Indian Brahmins as Armenian+Chamar
The first analysis of this population showed that there were negative f3(Brahmin; X, Y) signals when X was any of a variety of West European, Balkan, and West Asian populations, and Y was either the Chamar or the North Kannadi. In the first analysis I used Orcadians and North Kannadi. I have now carried out a new rolloff analysis on 470,559 SNPs, using Armenians_Y and Chamar_M as the reference populations.
The exponential fit can be seen below.
The admixture date is 142.814 +/- 15.010 generations, or 4,140 +/- 440 years, which seems to correspond quite well with commonly accepted dates for the formation of Indo-Iranian.
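The conversion from generations to years can be checked with a short sketch. It assumes a generation length of 29 years, which is not stated above but is the value that reproduces the quoted figures; the function name is only illustrative.

```python
# Assumption: 29 years per generation, inferred from the figures quoted in the post.
YEARS_PER_GENERATION = 29


def generations_to_years(generations, std_err, years_per_generation=YEARS_PER_GENERATION):
    """Scale an admixture date and its standard error from generations to years."""
    return generations * years_per_generation, std_err * years_per_generation


years, years_se = generations_to_years(142.814, 15.010)

# Round both values to the nearest 10 years, as in the text.
print(f"{round(years, -1):.0f} +/- {round(years_se, -1):.0f} years")
```

A different generation length (25 years, say) would scale both the date and its error proportionally, so the archaeological correlation depends on that choice.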
I have previously observed that:
These patterns can be well-explained, I believe, if we accept that Indo-Iranians are partially descended not only from the early Proto-Indo-Europeans of the Near East, but also from a second element that conceivably had "South Asian" affiliations. The most likely candidate for the "second element" is the population of the Bactria Margiana Archaeological Complex (BMAC). The rise and demise of the BMAC fits well with the relative shallowness of the Indo-Iranian language family and its 2nd millennium BC breakup, and the complex has been assigned an Indo-Iranian identity on other grounds by its excavator. As climate change led to the decline and abandonment of BMAC sites, its population must have spread outward: to the Iranian plateau, the steppe, and into South Asia, reinforcing the linguistic differentiation that must have already begun over the extensive territory of the complex.
Quite possibly, as the West Asian element began mixing with the Sardinian-like population in Greece, another branch of the Indo-Europeans made its appearance east of the Caspian, in the territory of the BMAC, admixing with South Asian-like populations. Thus, it might seem that the Graeco-Aryan clade of Indo-European broke down during the Bronze Age, with one branch heading off to the Balkans, and another to the east.
This scenario would also explain how the likely J2-bearing population associated with the earliest Proto-Indo-Europeans may have acquired the contrasting pattern I have previously described: the western (cis-Caspian) population would have admixed with R1b-bearers who occupy the "small arc" west and south of the Caspian, while the eastern (trans-Caspian) populations would have admixed with R1a-bearers who occupy the "large arc" in the flatlands north and east of the Caspian. It would also explain how the "western" branch (Graeco-Armenian) would have picked up Sardinian-like "Atlantic_Med" admixture, which is absent in the "eastern" Indo-Iranian branch.
At the same time, this scenario would explain the lack of "North European" admixture in the "western" branch (since this was shielded by the Caucasus and Black Sea from the northern Europeoids who may have lived north of these barriers), and its presence in the "eastern" branch (since the BMAC agriculturalists were in contact with presumably northern Europeoid groups inhabiting the steppelands, unhindered by any major physical barriers). (The relative absence of this admixture in the Graeco-Armenian branch may be advanced on the strength of its absence in Armenians, the evidence of a Sardinian-like Iron Age individual from Bulgaria, and the historical-era timing of admixture for the Greek population.)
It would be interesting to carry out similar experiments on Iranian groups, to see if they, too, present a similar pattern of admixture.
September 18, 2012
In favor of recent Out-of-Africa (Eriksson et al. 2012)
A new paper in PNAS argues for a recent (~60kya) expansion of modern humans Out-of-Africa. After reading the title, I was not sure what date the authors were arguing for, and I went straight for the movie in the supplemental material, which is a pretty cool depiction of the authors' scenario.
However, I disagree with the conclusions of this paper, for a variety of reasons. First, the climate history of Africa is consistent with older dispersal scenarios. The authors of the current paper follow other researchers in attributing the Skhul/Qafzeh hominins to an Out-of-Africa-that-failed, but that proposition is increasingly untenable.
The halving of the human autosomal mutation rate implies that Eurasians and Africans split before 100 thousand years ago, and African hunter-gatherers may have split as much as 300 thousand years ago. These dates are not set in stone, and can be lowered if one allows for substantial archaic African admixture. But doing so weakens the case for a sub-Saharan origin of Homo sapiens.
Second, the Nubian Complex is a direct archaeological link between NE Africa and S Arabia pre-100ka. It becomes increasingly difficult to argue that the pre-100ka expansion fizzled when you have the triple evidence of Mt. Carmel, the Nubian Complex, and Jebel Faya, providing a combination of anthropological and archaeological evidence for African-Asian interaction prior to 100ka.
Some of the conclusions of this paper may be influenced by their choice of the dinucleotide mutation rate:
There is a good reason to favor a recent human expansion: if Out-of-Africa happened pre-100ka, as I have argued, then what did modern humans wait for to conquer the rest of the planet (a greater than 50ka hiatus until they begin appearing all over Eurasia). However, that problem can be solved if we acknowledge the fact that modern humans prior to 100ka may have been anatomically "like us", but behaviorally they were not much different from other hominins living on the planet at the time. These pre-100ka H. sapiens were just another set of hunter-gatherers: they may have had a chin, a smaller face, and a more globular braincase, but they did not appear to behave in any drastically different way than other humans who lacked these features.
There are three factors that drive migration: curiosity, need, and ability. One may wonder "what's on the land beyond the sea", but one needs the ability to build a lasting boat to find out. One may have the ability to build a boat, but has no need to do so, if there is game-a-plenty around camp and a beautiful woman with a few beautiful babies in the shelter.
I think that 2-3 reasons contributed to the hiatus:
I don't know to what extent changes in the modern human lineage made the mental hardware of early Homo sapiens something like a transistor-based computer that had to compete against the older triode-based models that filled the planet. As we sample more ancient hominins, we may eventually find out whether our wiring was really much improved.
But, one does not really need the best of wiring to conquer a planet. Few would argue today, I suspect, that English mental hardware was superior to e.g., Bavarian mental hardware, but the English brought half the planet under their domination, partly because of their fortunate geographical position which gave them (and other West Europeans) access to the lands beyond the sea. Similarly, few would argue that the Mongols had an innate ability to conquer half of Eurasia, but they happened to have a combination of drive, leadership, organization, and military hardware that allowed them to do so.
This is what I suspect was the real cause for the success of modern humans: they may have had some genetic advantage over others, but their success was partly unintended (the consequence of the drying up of the Sahara-Arabia region that forced them out c. 70kya), and partly the result of them having some vital technological "edge" over other Homo populations.
There may have been interplay between the "need" and "ability" causes of the great human diaspora: as modern humans were pushed out by the advancing desert, they had to adapt to dwindling resources, the challenge of new environments, and the challenge of contact and competition with archaic hominins in both Eurasia and Africa. Adversity does not always breed success: it most often results in failure. But, while many long-forgotten peoples may have faced formidable challenges during the long aeons of geological time, at least one of them had the combination of luck and the "right stuff" to rise to the occasion, and we are their descendants.
PNAS doi: 10.1073/pnas.1209494109
Late Pleistocene climate change and the global expansion of anatomically modern humans
Anders Eriksson et al.
The extent to which past climate change has dictated the pattern and timing of the out-of-Africa expansion by anatomically modern humans is currently unclear [Stewart JR, Stringer CB (2012) Science 335:1317–1321]. In particular, the incompleteness of the fossil record makes it difficult to quantify the effect of climate. Here, we take a different approach to this problem; rather than relying on the appearance of fossils or archaeological evidence to determine arrival times in different parts of the world, we use patterns of genetic variation in modern human populations to determine the plausibility of past demographic parameters. We develop a spatially explicit model of the expansion of anatomically modern humans and use climate reconstructions over the past 120 ky based on the Hadley Centre global climate model HadCM3 to quantify the possible effects of climate on human demography. The combinations of demographic parameters compatible with the current genetic makeup of worldwide populations indicate a clear effect of climate on past population densities. Our estimates of this effect, based on population genetics, capture the observed relationship between current climate and population density in modern hunter–gatherers worldwide, providing supporting evidence for the realism of our approach. Furthermore, although we did not use any archaeological and anthropological data to inform the model, the arrival times in different continents predicted by our model are also broadly consistent with the fossil and archaeological records. Our framework provides the most accurate spatiotemporal reconstruction of human demographic history available at present and will allow for a greater integration of genetic and archaeological evidence.
Link
However, I disagree with the conclusions of this paper, for a variety of reasons. First, the climate history of Africa is consistent with older dispersal scenarios. The authors of the current paper follow other researchers in attributing the Skhul/Qafzeh hominins to an Out-of-Africa-that-failed, but that proposition is increasingly untenable.
The halving of the human autosomal mutation rate implies that Eurasians and Africans split before 100 thousand years ago, and African hunter-gatherers may have split as much as 300 thousand years ago. These dates are not set in stone, but can be downsized if one allows for substantial archaic African admixture. But, doing so weakens the case for a sub-Saharan origin of Homo sapiens.
Second, the Nubian Complex is a direct archaeological link between NE Africa and S Arabia pre-100ka. It becomes increasingly difficult to argue that the pre-100ka expansion fizzled when you have the triple evidence of Mt. Carmel, the Nubian Complex, and Jebel Faya, providing a combination of anthropological and archaeological evidence for African-Asian interaction prior to 100ka.
Some of the conclusions of this paper may be influenced by their choice of the dinucleotide mutation rate:
The dinucleotide stepwise mutation model mutation rate for these markers was estimated in the work by Dib et al. (46) to μdi = 1.52 × 10^-3 mutations per generation.
But, Sun et al. seem to report a much slower dinucleotide mutation rate, as well as a deviation from the stepwise symmetrical model that would bias age estimates downwards if that model is used. So, I am not very confident in the age estimates provided in this paper.
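To see why the choice of rate matters so much, here is a minimal sketch of the classic back-of-envelope relation for microsatellites under a symmetric single-step mutation model: the squared difference in mean repeat count between two populations is expected to grow as 2 × μ × t, so a date estimate scales as 1/μ. This is not the spatially explicit model used in the paper, and the function name, the slower rate, and the observed distance below are hypothetical values chosen only for illustration.

```python
def split_time_generations(delta_mu_sq: float, mu: float) -> float:
    """Estimate split time (in generations) from a squared mean-repeat distance.

    Assumes a symmetric single-step mutation model, under which the expected
    squared distance between two populations grows as 2 * mu * t.
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return delta_mu_sq / (2.0 * mu)


MU_QUOTED = 1.52e-3          # rate quoted in the paper (attributed to Dib et al.)
MU_SLOWER = MU_QUOTED / 2.0  # hypothetical slower rate, NOT a published value
DISTANCE = 6.08              # hypothetical observed squared distance

t_quoted = split_time_generations(DISTANCE, MU_QUOTED)
t_slower = split_time_generations(DISTANCE, MU_SLOWER)

# Halving the assumed rate doubles the inferred age of the split.
print(f"quoted rate: about {t_quoted:.0f} generations")
print(f"slower rate: about {t_slower:.0f} generations")
```

The point of the sketch is only the proportionality: any date obtained with a rate that is too fast will come out too young by the same factor.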
There is a good reason to favor a recent human expansion: if Out-of-Africa happened pre-100ka, as I have argued, then what were modern humans waiting for before conquering the rest of the planet (there is a hiatus of more than 50ky until they begin appearing all over Eurasia)? However, that problem can be solved if we acknowledge the fact that modern humans prior to 100ka may have been anatomically "like us", but behaviorally they were not much different from other hominins living on the planet at the time. These pre-100ka H. sapiens were just another set of hunter-gatherers: they may have had a chin, a smaller face, and a more globular braincase, but they did not appear to behave in any drastically different way than other humans who lacked these features.
There are three factors that drive migration: curiosity, need, and ability. One may wonder "what's on the land beyond the sea", but one needs the ability to build a lasting boat to find out. One may have the ability to build a boat, but has no need to do so, if there is game-a-plenty around camp and a beautiful woman with a few beautiful babies in the shelter.
I think that 2-3 reasons contributed to the hiatus:
- The going was good in Arabia prior to the climate crisis of ~70kya
- The way north was blocked by Neandertals
- Whether or not modern humans had the genetic capacity to outcompete the Neandertals, they did not yet have the behavioral expression needed to achieve this
I don't know to what extent changes in the modern human lineage made the mental hardware of early Homo sapiens something like a transistor-based computer that had to compete against the older triode-based models that filled the planet. As we sample more ancient hominins, we may eventually find out whether our wiring was really much improved.
But, one does not really need the best of wiring to conquer a planet. Few would argue today, I suspect, that English mental hardware was superior to e.g., Bavarian mental hardware, but the English brought half the planet under their domination, partly because of their fortunate geographical position which gave them (and other West Europeans) access to the lands beyond the sea. Similarly, few would argue that the Mongols had an innate ability to conquer half of Eurasia, but they happened to have a combination of drive, leadership, organization, and military hardware that allowed them to do so.
This is what I suspect was the real cause for the success of modern humans: they may have had some genetic advantage over others, but their success was partly unintended (the consequence of the drying up of the Sahara-Arabia region that forced them out c. 70kya), and partly the result of them having some vital technological "edge" over other Homo populations.
There may have been interplay between the "need" and "ability" causes of the great human diaspora: as modern humans were pushed out by the advancing desert, they had to adapt to dwindling resources, the challenge of new environments, and the challenge of contact and competition with archaic hominins in both Eurasia and Africa. Adversity does not always breed success: it most often results in failure. But, while many long-forgotten peoples may have faced formidable challenges during the long aeons of geological time, at least one of them had the combination of luck and the "right stuff" to rise to the occasion, and we are their descendants.
PNAS doi: 10.1073/pnas.1209494109
Late Pleistocene climate change and the global expansion of anatomically modern humans
Anders Eriksson et al.
The extent to which past climate change has dictated the pattern and timing of the out-of-Africa expansion by anatomically modern humans is currently unclear [Stewart JR, Stringer CB (2012) Science 335:1317–1321]. In particular, the incompleteness of the fossil record makes it difficult to quantify the effect of climate. Here, we take a different approach to this problem; rather than relying on the appearance of fossils or archaeological evidence to determine arrival times in different parts of the world, we use patterns of genetic variation in modern human populations to determine the plausibility of past demographic parameters. We develop a spatially explicit model of the expansion of anatomically modern humans and use climate reconstructions over the past 120 ky based on the Hadley Centre global climate model HadCM3 to quantify the possible effects of climate on human demography. The combinations of demographic parameters compatible with the current genetic makeup of worldwide populations indicate a clear effect of climate on past population densities. Our estimates of this effect, based on population genetics, capture the observed relationship between current climate and population density in modern hunter–gatherers worldwide, providing supporting evidence for the realism of our approach. Furthermore, although we did not use any archaeological and anthropological data to inform the model, the arrival times in different continents predicted by our model are also broadly consistent with the fossil and archaeological records. Our framework provides the most accurate spatiotemporal reconstruction of human demographic history available at present and will allow for a greater integration of genetic and archaeological evidence.
Link
August 27, 2012
3-population test and east Eurasian-like admixture in Europe or The Isle of Refuge
The 3-population test (Reich et al. 2009) allows one to detect the presence of admixture in a population X from two other populations A and B. The value
f3(X; A, B)
is negative when X does not appear to form a simple tree with A and B but appears to be a mixture of A and B.
In a previous entry, I noted that continental European populations, and especially northern Europeans, appear to have East Eurasian-like admixture on the basis of the 4-population test. The results of that test are more difficult to interpret, because the quantity f4(X, Y; A, B) can take significant negative or positive values depending on the relationships of populations X, Y with A, B. When A, B are East Eurasian and African populations respectively, and X, Y are West Eurasian ones, East Eurasian-like admixture in a northern European population will affect the f4 quantity in the same way as African-like admixture in a southern Caucasoid one. This is not a problem with the f3 test, although caution is needed: a negative value indicates deviation from "treeness" and admixture, but a positive one does not reject admixture.
The f3 statistics were calculated with the threepop program of TreeMix with -k 500 over a set of 598,467 SNPs.
I have used 3 Asian/American reference populations (Karitiana from South America, CHB Chinese, and Papuans) and calculated the following:
f3(West Eurasian 1; West Eurasian 2, Asian/American)
As noted above, negative values of this indicate that West Eurasian 1 can be seen as an admixed population of West Eurasian 2 + Asian/American. The set of 14 West Eurasian populations used is:
CEU, TSI, Tuscan, Orcadian, French, French_Basque, North_Italian, Bedouin, Palestinian, Druze, Mozabite, Adygei, Russian, Sardinian
I thus report 2*(14 choose 2)*3 = 546 values of f3. Hence, I did not privilege Sardinians as a reference point, but instead tried all pairs of West Eurasian populations, and 3 different American/Asian references. The results can be found in the spreadsheet.
Out of the 546 triples, 64 show a negative f3 score with Z less than or equal to -3, and are thus significant.
The following populations have such a score in at least one pairwise comparison, when they are set as West Eurasian 1, and thus appear to have east Eurasian-like admixture:
CEU, Russian, French, Adygei, TSI, Tuscan, Orcadian, North_Italian, Palestinian
Note that east Eurasian-like admixture cannot be rejected for the other populations, but it can be confirmed for the above. Moreover, the mean strength of the observed effect for the significant comparisons was Z=-5.5 for the Papuan reference, Z=-10.2 for CHB, and Z=-10.9 for Karitiana, again suggesting a northern origin of the east Eurasian-like admixture, albeit without so major a difference between Karitiana and CHB as in the 4-population test.
But, it is worth reading the raw data. For example, note above that of the Middle Eastern and North African populations, only Palestinians show a negative f3 score in any pairwise comparison. And actually they only do so for f3(Palestinian; Sardinian, Papuan) with a Z-score of -4.1. So, it appears that Palestinians have undergone admixture of a different sort than Europeans.
Significant differences were observed for Sardinians as West Eurasian 2 in 21 cases, for French Basque in 11 cases, for North_Italian and TSI in 6 cases, for CEU, Orcadian, French, and Tuscan in 4 cases. So, it appears that other populations appear east Eurasian-like admixed relative to Sardinians, and a couple of populations (Russian and Adygei) also appear so admixed relative to west Europeans.
Oetzi the Tyrolean Iceman
The fact that Europeans appear admixed with an east Eurasian-like element when compared with Sardinians does not mean that Sardinians may not also be admixed with this element. I used the genome of the Tyrolean Iceman (Keller et al. 2012) to test whether Sardinians appear east Eurasian-like admixed relative to the Iceman.
f3(Sardinian;Karitiana,Oetzi) = 5.36496e-06 (Z=0.00940612)
This might indicate no admixture, but f3 can only detect admixture; it cannot prove its absence. The f4 is suggestive:
f4(Sardinian,Oetzi;Karitiana,San) = -0.00221783 (Z=-3.06251)
You should probably not take my word for the above. It may appear that, contrary to expectation, Oetzi was more east Eurasian-like than modern Sardinians. Indeed, in my initial analysis of him with ADMIXTURE, I found that he was 2.8% East_Asian, which would point to an East Eurasian shift of Oetzi relative to Sardinians, and which might be consistent with the f4 result. On the other hand, the negative f4 score could be related to African-like gene flow. On balance I would say that Sardinians appear quite similar to Oetzi.
Gok4 and Ajv52
Furthermore, I carried out the same analysis on Neolithic samples from Sweden (Skoglund et al. 2012). The number of SNPs here is much smaller. Results are:
Gok4 (TRB farmer): f4(Sardinian,Gok4;Karitiana,San) = -0.00167365 (Z=-1.23616)
Ajv52 (PWC hunter-gatherer): f4(Sardinian,Ajv52;Karitiana,San) = -0.004676 (Z=-3.76048)
While I would not bet the farm on these results (because of the small number of SNPs and the fact that they're based on a single individual), they do seem to suggest that these Neolithic Swedes were east Eurasian shifted relative to Sardinians. For example, for my Swedish_D sample, I get f4(Sardinian, Swedish_D; Karitiana, San) = -0.00372751 (Z=-22.8715). The Z-score is stronger (probably because of the much larger number of SNPs), but the f4 value of Ajv52 is lower (more east-Eurasian like). Modern Swedish_D appears intermediate between Gok4 and Ajv52, so this may suggest that Mesolithic Europeans may be, at least in part, the source of this element.
(Comparison with Brana-1 Mesolithic Iberian indicates a negative non-significant f4 score, but with an even smaller number of SNPs).
In sum total, my experiments with ancient DNA samples from Europe suggest a little more east Eurasian-like shift relative to Sardinians (or conversely a little more African-like shift in Sardinians). Oetzi (who has the highest quality genome) appears to be so shifted, and Ajv52 (a Neolithic northern hunter-gatherer) appears to be so as well. I am sure that if we get more high quality ancient DNA from Europe, some clear pattern may emerge, but I would not speculate further on the basis of these initial results.
Isle of Refuge
The above set of experiments has revealed once more that "there's something about Sardinians." There is perhaps a reason for the fact that the arrival of population elements from continental Europe seems to have bypassed them to some degree, or, at least, affected them least. However it was that continental Europeans got their east Eurasian-like shift, the great tank of European genetic variation does not seem to have achieved equilibrium with the little cup of Sardinia. Something stood in the way.
Sardinia is the west-most of the large Mediterranean islands. It is more distant from mainland Europe/Asia than the other big islands (Cyprus, Crete, Sicily, and Corsica).
And, unlike islands much smaller than itself, its size has probably been instrumental in affording it a certain autonomy and continuity of population. Only Sicily is larger, but one can practically swim across the Strait of Messina to reach it from the Italian peninsula.
Hence, a combination of large size, western geographical location, and distance from the mainland have contributed to the continuity of its population. But, geography may not have been sufficient if other events had not taken place. Through a combination of favorable geography and historical contingency, the Sardinians made it to the present largely unscathed, and, among their other graces, can now help scientists figure out what happened to the rest of us.
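For readers who want to see what the f3 test used in this post actually measures, here is a minimal sketch. It is not the threepop implementation: it works directly on allele frequencies, skips the finite-sample bias correction that real estimators apply to allele counts, and uses simulated frequencies. The function names and the simulated populations are assumptions for illustration; the population labels are the ones listed in this post.

```python
import random
from itertools import permutations
from math import sqrt


def f3(x, a, b):
    """Average of (x - a) * (x - b) over SNPs, given allele frequencies."""
    if not (len(x) == len(a) == len(b)) or not x:
        raise ValueError("need three equally long, non-empty frequency lists")
    return sum((xi - ai) * (xi - bi) for xi, ai, bi in zip(x, a, b)) / len(x)


def f3_zscore(x, a, b, block_size):
    """Z-score from a delete-one block jackknife over contiguous SNP blocks."""
    if len(x) <= block_size:
        raise ValueError("need at least two blocks of SNPs")
    leave_one_out = []
    for s in range(0, len(x), block_size):
        e = s + block_size
        # Recompute f3 with one block of SNPs removed.
        leave_one_out.append(f3(x[:s] + x[e:], a[:s] + a[e:], b[:s] + b[e:]))
    g = len(leave_one_out)
    mean = sum(leave_one_out) / g
    variance = (g - 1) / g * sum((v - mean) ** 2 for v in leave_one_out)
    return f3(x, a, b) / sqrt(variance)


# Simulated frequencies (hypothetical data, fixed seed for repeatability).
rng = random.Random(0)
n_snps = 2000
freq_a = [rng.uniform(0.2, 0.8) for _ in range(n_snps)]
freq_b = [rng.uniform(0.2, 0.8) for _ in range(n_snps)]
# X as a 50/50 mixture of A and B: (x - a)(x - b) = -(a - b)^2 / 4 <= 0.
mixed = [(p + q) / 2 for p, q in zip(freq_a, freq_b)]
# X as a drifted sister of A, unrelated to B: expected f3 is positive.
sister = [p + rng.uniform(-0.1, 0.1) for p in freq_a]

print("mixture f3:", f3(mixed, freq_a, freq_b))
print("mixture Z :", f3_zscore(mixed, freq_a, freq_b, block_size=100))
print("sister f3 :", f3(sister, freq_a, freq_b))

# The 546 tests: every ordered pair of West Eurasians times 3 references.
WEST_EURASIANS = ["CEU", "TSI", "Tuscan", "Orcadian", "French", "French_Basque",
                  "North_Italian", "Bedouin", "Palestinian", "Druze", "Mozabite",
                  "Adygei", "Russian", "Sardinian"]
REFERENCES = ["Karitiana", "CHB", "Papuan"]
triples = [(w1, w2, ref)
           for w1, w2 in permutations(WEST_EURASIANS, 2)
           for ref in REFERENCES]
print("number of triples:", len(triples))
```

The mixture case shows why a negative value is informative, while the sister case shows why a positive value says little: drift specific to X adds a positive term that can mask real admixture.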
August 25, 2012
Genes and Geography (Wang et al. 2012)
Gene-geography correlations have been explored before at a regional level. More recently, they were also studied at the global level with the SPA method. A new open access paper shows gene-geography correlations across the world.
These correlations arise from the fact that humans tend to intermarry with their neighbors, so alleles have a decreasing probability of being transmitted from a person at location X to future generations, the further we go from X. But, the more interesting cases are those which show a violation of the overall pattern. These can usually arise because of genetic isolation or long-distance migration. An example is that of the African hunter-gatherer groups:
When hunter-gatherer populations (!Kung, San, Biaka Pygmy, and Mbuti Pygmy) and Mbororo Fulani were included in the analysis, they appeared as isolated clusters on the PCA plots and greatly reduced the similarity between PCA maps and geographic maps (Figure S3, Table S7). The similarity score decreased from 0.790 to 0.548 after including all five of these populations in the analysis. This value, however, is still statistically significant, with a P-value of [...]; further, if we disregard the hunter-gatherer populations and Mbororo Fulani in Figure S3B and only examine the relative locations of the original 23 populations, we can still find a clear resemblance between genetic and geographic coordinates. Compared to the other 23 populations, the four hunter-gatherer populations appear as isolated groups at the south, and Mbororo Fulani appears at the north. These observations are clearer in plots with only one among the five outlier populations included at a time (Figure S3C–S3G), each of which also produces significant similarity scores between genetic and geographic coordinates (Figure S4, Table S7).
Figure S3 is very informative:
Observe that in Figure S3C, the Mbororo Fulani appear in the Balkans (!) relative to Sub-Saharan Africans. That is, of course, due to their partial West Eurasian ancestry, but the magnitude of the difference is such that one suspects that it is not only due to this factor; if it were, then the Fulani would place somewhere between Europe and Central Africa.
The remaining figures (D-G) supply the explanation: the four hunter-gatherer groups appear well south of their actual locations; the Pygmy groups not in W/C Africa, but in S Africa; the Khoisan ones not in S Africa but in the Ocean well south of it.
Why does gene-geography correlation suffer such a violation in Africa? Figure S3 shows how different groups relate to W/C Africans. But, one could also use hunter-gatherers as an anchor point (i.e., place them where they actually live): in that case the W/C Africans would be the ones who would be pushed north towards the Mediterranean.
And, indeed, that is a good argument for the idea I've floated a few times, of substantial Eurasian back-migration into Africa: the genetic difference between African farmers and African hunter-gatherers dwarfs the geographic distance. This can easily be explained if we assume that back-migration from Eurasia affected the former much more than the latter. So, African farmers can be shown to be the outcome of mixture between two divergent elements: one Eurasian-like, one African hunter-gatherer-like. The latter could include both groups like existing African H-Gs but might also include other groups who had the misfortune of being completely absorbed before the Eye of Science set its sights on the African continent.
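Returning to the similarity score quoted above, here is a minimal sketch of how a gene-geography similarity between a PCA map and a geographic map could be computed. I am assuming a Procrustes-style statistic (the best fit of one 2D map onto the other after translation, scaling, rotation, and an optional mirror flip); check the paper for its exact definition. The function names and the toy coordinates are hypothetical.

```python
from math import sqrt


def _fit(genetic, geographic):
    """Similarity of two centered complex point sets under rotation + scaling."""
    ss_g = sum(abs(z) ** 2 for z in genetic)
    ss_q = sum(abs(z) ** 2 for z in geographic)
    if ss_g == 0 or ss_q == 0:
        raise ValueError("all points coincide in one of the maps")
    # The best complex scale factor is cross / ss_g; the fit reduces to this.
    cross = sum(z.conjugate() * w for z, w in zip(genetic, geographic))
    return abs(cross) / sqrt(ss_g * ss_q)


def procrustes_similarity(genetic, geographic):
    """Return a score in [0, 1]; 1 means the two 2D maps match exactly
    after translation, scaling, rotation, and optional reflection."""
    if len(genetic) != len(geographic) or len(genetic) < 2:
        raise ValueError("need two equally long lists of at least 2 points")
    g = [complex(x, y) for x, y in genetic]
    q = [complex(x, y) for x, y in geographic]
    g_mean, q_mean = sum(g) / len(g), sum(q) / len(q)
    g = [z - g_mean for z in g]
    q = [z - q_mean for z in q]
    # PC axes have arbitrary sign, so also try the mirror image of the map.
    mirrored = [z.conjugate() for z in g]
    return max(_fit(g, q), _fit(mirrored, q))


# Toy example (hypothetical coordinates, not data from the paper).
geography = [(0, 0), (1, 0), (1, 1), (0, 1)]
pca_rotated = [(5, 5), (5, 7), (3, 7), (3, 5)]   # rotated, scaled, shifted
pca_outlier = [(0, 0), (1, 0), (1, 1), (0, 5)]   # one population far off

print("rotated map  :", procrustes_similarity(pca_rotated, geography))
print("with outlier :", procrustes_similarity(pca_outlier, geography))
```

A single population placed far from where geography predicts, like the hunter-gatherers or the Fulani in Figure S3, is enough to pull such a score well below 1 even when the remaining populations line up well.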
PLoS Genet 8(8): e1002886. doi:10.1371/journal.pgen.1002886
A Quantitative Comparison of the Similarity between Genes and Geography in Worldwide Human Populations
Chaolong Wang et al.
The spatial pattern of human genetic variation provides a basis for investigating the history of human migrations. Statistical techniques such as principal components analysis (PCA) and multidimensional scaling (MDS) have been used to summarize spatial patterns of genetic variation, typically by placing individuals on a two-dimensional map in such a way that pairwise Euclidean distances between individuals on the map approximately reflect corresponding genetic relationships. Although similarity between these statistical maps of genetic variation and the geographic maps of sampling locations is often observed, it has not been assessed systematically across different parts of the world. In this study, we combine genome-wide SNP data from more than 100 populations worldwide to perform a formal comparison between genes and geography in different regions. By examining a worldwide sample and samples from Europe, Sub-Saharan Africa, Asia, East Asia, and Central/South Asia, we find that significant similarity between genes and geography exists in general in different geographic regions and at different geographic levels. Surprisingly, the highest similarity is found in Asia, even though the geographic barrier of the Himalaya Mountains has created a discontinuity on the PCA map of genetic variation.
Link