Showing posts with label L3. Show all posts

August 13, 2015

Rethinking the dispersal of Homo sapiens out of Africa


An excellent review which -among its other graces- demolishes the view that mtDNA haplogroup L3 provides a terminus post quem of 70 thousand years for the Out-of-Africa expansion, a question I've discussed in this blog before.

I think the evidence is overwhelming at this point that there were modern humans outside Africa before 100,000 years ago. The argument that they were a failed expansion is shoddy and is based, as far as I can tell, on things like the age of L3, the assumption that Y-chromosome haplogroup E is native to Africa and not derived from back-to-Africa migrants, the assumption that Out-of-Africa coincided with the Upper Paleolithic cultural efflorescence (disproven by the earlier dating of Neandertal admixture), or the failed hypothesis of a coastal route Out of Africa 60 thousand years ago that seems to be repeated in inverse proportion to the evidence for it. The halving of the human autosomal mutation rate relative to what was inferred before has certainly not helped either.

Evolutionary Anthropology: Issues, News, and Reviews Volume 24, Issue 4, pages 149–164, July/August 2015

Rethinking the dispersal of Homo sapiens out of Africa

Huw S. Groucutt, Michael D. Petraglia, Geoff Bailey, Eleanor M. L. Scerri, Ash Parton, Laine Clark-Balzan, Richard P. Jennings, Laura Lewis, James Blinkhorn, Nick A. Drake, Paul S. Breeze, Robyn H. Inglis, Maud H. Devès, Matthew Meredith-Williams, Nicole Boivin, Mark G. Thomas and Aylwyn Scally

Current fossil, genetic, and archeological data indicate that Homo sapiens originated in Africa in the late Middle Pleistocene. By the end of the Late Pleistocene, our species was distributed across every continent except Antarctica, setting the foundations for the subsequent demographic and cultural changes of the Holocene. The intervening processes remain intensely debated and a key theme in hominin evolutionary studies. We review archeological, fossil, environmental, and genetic data to evaluate the current state of knowledge on the dispersal of Homo sapiens out of Africa. The emerging picture of the dispersal process suggests dynamic behavioral variability, complex interactions between populations, and an intricate genetic and cultural legacy. This evolutionary and historical complexity challenges simple narratives and suggests that hybrid models and the testing of explicit hypotheses are required to understand the expansion of Homo sapiens into Eurasia.

Link and here

August 09, 2014

New estimates of human mtDNA node dates and substitution rates (Rieux et al. 2014)

This is a quite useful paper as it compares different methods of obtaining mutation rate estimates, either using "archaeological calibration" based on known migration events or ancient mtDNA genomes (with known archaeological dates). The authors write:
Our estimate of 143 Kya [112-180 95% HPD] for the TMRCA of all modern human mtDNA is slightly younger but highly consistent with the 157 Kya [120-197 95% HPD] value obtained by Fu et al. (2013b). We estimate the coalescence of the L3 haplogroup (the lineage from which all non-African mtDNA haplogroups descend), often used to date the “out-of-Africa” event, to 72 Kya [54-93 95%HPD], a value also consistent with Fu et al. (2013b) estimation of 78 Kya [62-95 95%HPD]. This estimation rather places a conservative upper bound of 93 kya for the time of the last major gene exchange between non-African and sub-Saharan African populations. As pointed out by Fu et al. (2013b), it is important to recognize that this divergence time may merely represent the most recent gene exchanges between the ancestors of non-Africans and the most closely related sub-Saharan Africans and thus may reflect only the most recent population split in a long, drawn-out process of population separation (Scally and Durbin 2012).
The 72kya date would agree quite well with my postulated Out-of-Arabia event circa 70 thousand years ago.

It should be fairly easy to pick out the common ancestor of Eurasian mtDNA (the common ancestor of M+N). I am reasonably sure that the two African red dots to the right of event "8" in the figure are African L3's, and this would place them within the Eurasian variation, and in particular as a relative of Eurasian M.

A similar observation could be found in Supplementary Figure 14 of the Lippold et al. (2014) preprint, with African L3 lineages clearly related to Eurasian M (and nested within the Eurasian phylogeny).

In any case, I don't see any evidence at all from this phylogeny that the date of L3 corresponds to an Out-of-Africa event. Unfortunately I couldn't see an estimate for the split of L3 from the rest of the phylogeny; my eyeball estimate from the figure is that it's about 20ky earlier. Hopefully, someone sooner or later will deal with the question of L3 phylogeny, because the "conventional wisdom" that Eurasian M, N are nested within African L3 variation does not appear to be quite right.

Mol Biol Evol (2014) doi: 10.1093/molbev/msu222

Improved calibration of the human mitochondrial clock using ancient genomes

Adrien Rieux et al.

Reliable estimates of the rate at which DNA accumulates mutations (the substitution rate) are crucial for our understanding of the evolution and past demography of virtually any species. In humans, there are considerable uncertainties around these rates, with substantial variation among recent published estimates. Substitution rates have traditionally been estimated by associating dated events to the root (e.g. the divergence between humans and chimpanzees) or to internal nodes in a phylogenetic tree (e.g. first entry into the Americas). The recent availability of ancient mtDNA sequences allows for a more direct calibration by assigning the age of the sequenced samples to the tips within the human phylogenetic tree. But studies also vary greatly in the methodology employed and in the sequence panels analysed, making it difficult to tease apart the causes for the differences between previous estimates. To clarify this issue, we compiled a comprehensive dataset of 350 ancient and modern human complete mtDNA genomes, among which 146 were generated for the purpose of this study, and estimated substitution rates using calibrations based both on dated nodes and tips. Our results demonstrate that, for the same dataset, estimates based on individual dated tips are far more consistent with each other than those based on nodes and should thus be considered as more reliable.
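
To make the idea of tip calibration concrete, here is a minimal sketch of my own. It is not the method used in the paper. It uses a plain root-to-tip regression, a much cruder technique: older samples had less time to accumulate substitutions, so their distance from the root is smaller, and the slope of distance against sample age gives the rate. The function name and all the numbers are made up for illustration.

```python
import numpy as np

def root_to_tip_rate(sample_ages, root_to_tip_dist):
    """Estimate a substitution rate from dated tips by linear regression.

    sample_ages      -- age of each sample in years before present (0 = modern)
    root_to_tip_dist -- distance from the tree root to each tip, in
                        substitutions per site
    Returns (rate per site per year, implied root age in years before present).
    """
    ages = np.asarray(sample_ages, dtype=float)
    dist = np.asarray(root_to_tip_dist, dtype=float)
    # Model: dist = rate * (root_age - age), so the slope against age is -rate
    slope, intercept = np.polyfit(ages, dist, 1)
    rate = -slope
    root_age = intercept / rate  # age at which the distance drops to zero
    return rate, root_age

# Hypothetical, noise-free toy data (not values from the paper)
true_rate = 2e-8          # substitutions per site per year, assumed
true_root_age = 150000.0  # years before present, assumed
ages = np.array([0.0, 0.0, 10000.0, 40000.0, 45000.0])
dist = true_rate * (true_root_age - ages)

rate, root_age = root_to_tip_rate(ages, dist)
print(rate, root_age)
```

With real data the distances are noisy and the tips are not independent, which is one reason proper analyses use a full phylogenetic model rather than a regression like this.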

Link

September 06, 2013

ASHG 2013 abstracts

Feel free to point me to more interesting abstracts than the ones I noticed during my "first pass".

Morphometric and ancient DNA study of human skeletal remnants in Indian Subcontinent.
N. Rai et al.
Recovery and sequencing of mtDNA from ancient human remnants is a daunting task but provides valuable information about human migrations and evolution. Our present study is the first to recover, amplify and sequence (HVR and coding regions of mtDNA) inadequately preserved and highly degraded (1.5 Ky to ≤1.0 Ky ago) hominids mitochondrial DNA of three most intriguing and indigenous ancient population of South and South-East Asia (Myanmar=20 Buried individuals, Nicobar Islands=15 and Andaman Island=6). Following all parameters and to avoid the chance of contamination we independently extracted and sequenced the DNA in two different labs and measured the cranial variability in all hominid skulls using 128 cranial landmarks, compiled 3D morphometrics, genetic data of ancient DNA samples and analyzed the admixture and genetic affinities of above three populations. Results showed the predominant frequency of F1a1 and complete absence of 9bp deletion in ancient Nicobarese. Unlike in previous reports on modern Nicobarese, the high frequency of F1a1 haplogroup in ancient Nicobarese show the probable migration of Nicobarese from South East Asia and the complete absence of 9bp deletion suggests the different events of settlement. This study failed to detect genetic affinities of Burmese with Nicobarese even though their phenotype and language appears to be same. We first time report any kind of population study on Burmese populations and with the genetic affinity of Burmese with East Asian, East Indian (Including Gadhwal region of Himalaya) and Bangladeshi populations, we found significant admixture with West Eurasians. Our study strongly supports the West Eurasian and East Asian route of migration and settlement of early Burmese population.
The three populations in the present study are quite different in their genetic structure but 3D morphometric study using huge number of landmarks explains a close homology among these populations and this can be explained by the role of climatic signature on these populations.
Y chromosomes of ancient Hunnu people and its implication on the phylogeny of East Asian linguistic families.
LL. Kang et al.
The Hunnu (Xiongnu) people, also called Huns in Europe, were the largest ethnic group to the north of Han Chinese until the 5th century. The ethno-linguistic affiliation of the Hunnu is controversial among Yeniseian, Altaic, Uralic, and Indo-European. Ancient DNA analyses on the remains of the Hunnu people had shown some clues to this problem. Y chromosome haplogroups of Hunnu remains included Q-M242, N-Tat, C-M130, and R1a1. Recently, we analyzed three samples of Hunnu from Barköl, Xinjiang, China, and determined Q-M3 haplogroup. Therefore, most Y chromosomes of the Hunnu samples examined by multiple studies are belonging to the Q haplogroup. Q-M3 is mostly found in Yeniseian and American Indian peoples, suggesting that Hunnu should be in the Yeniseian family. The Y chromosome diversity is well associated with linguistic families in East Asia. According to the similarity in the Y chromosome profiles, there are four pairs of congenetic families, i.e., Austronesian and Tai-Kadai, Mon-Khmer and Hmong-Mien, Sino-Tibetan and Uralic, Yeniseian and Palaesiberian. Between 4,000-2,000 years before present, Tai-Kadai, Hmong-Mien, Sino-Tibetan, and Yeniseian languages transformed into toned analytic languages, becoming quite different from the rest four. Since Hunnu was in the Yeniseian family, all these four toned families were distributed in the inland of China during the transformations. There must be some social or biological factors induced the transformations at that time, which is worth doing more linguistic and genetic researches.
Genomic scans for haplotypes of Denisova and Neanderthal ancestry in modern human populations.
F. L. Mendez, M. F. Hammer University of Arizona, Tucson, AZ., USA.
Evidence of archaic introgression into modern humans has accumulated in recent years. While most efforts to characterize the introgression process have relied on genome averages, only a small number of introgressive haplotypes have been shown to have an archaic origin after rejection of the alternative hypothesis of incomplete lineage sorting. Accurate identification of introgressive haplotypes is crucial both to characterize potentially functional consequences of archaic admixture and to quantify more precisely the genomic impact of archaic introgression. We perform two independent genomic scans for haplotypes of Denisova and of Neanderthal origin in a geographically diverse sample of complete genome sequences. These scans are based on the local sharing of polymorphisms and linkage disequilibrium, respectively. The analysis of concordance between the methods is then used to estimate the power and to compare demographic inference when performed using either all the data or just the genomic regions with no evidence of introgression. Moreover, we evaluate the extent to which Denisova haplotypes are observed in non-Melanesian populations, and investigate whether the presence of such haplotypes is better explained by their persistence in the population since introgression or by more recent gene flow from Melanesians.
Admixture Estimation in a Founder Population. 
Y. Banda et al.
Admixture between previously diverged populations yields patterns of genetic variation that can aid in understanding migrations and natural selection. An understanding of individual admixture (IA) is also important when conducting association studies in admixed populations. However, genetic drift, in combination with shallow allele frequency differences between ancestral populations, can make admixture estimation by the usual methods challenging. We have, therefore, developed a simple but robust method for ancestry estimation using a linear model to estimate allele frequencies in the admixed individual or sample as a function of ancestral allele frequencies. The model works well because it allows for random fluctuation in the observed allele frequencies from the expected frequencies based on the admixture estimation. We present results involving 3,366 Ashkenazi Jews (AJ) who are part of the Kaiser Permanente Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort and genotyped at 674,000 SNPs, and compare them to the results of identical analyses for 2,768 GERA African Americans (AA). For the analysis of the AJ, we included surrogate Middle Eastern, Italian, French, Russian, and Caucasus subgroups to represent the ancestral populations. For the African Americans, we used surrogate Africans and Northern Europeans as ancestors. For the AJ, we estimated mean ancestral proportions of 0.380, 0.305, 0.113, 0.041 and 0.148 for Middle Eastern, Italian, French, Russian and Caucasus ancestry, respectively. For the African Americans, we obtained estimated means of 0.745 and 0.248 for African and European ancestry, respectively. We also noted considerably less variation in the individual admixture proportions for the AJ (s.d. = .02 to .05) compared to the AA (s.d.= .15), consistent with an older age of admixture for the former. From the linear model regression analysis on the entire population, we also obtain estimates of goodness of fit by r2. 
For the analysis of AJ, the r2 was 0.977; for the analysis of the AA, the r2 was 0.994, suggesting that genetic drift has played a more prominent role in determining the AJ allele frequencies. This was confirmed by examination of the distribution of differences for the observed versus predicted allele frequencies. As compared to the African Americans, the AJ differences were significantly larger, and presented some outliers which may have been the target of selection (e.g. in the HLA region on chromosome 6p).
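
The abstract does not spell out the model, so what follows is only my own minimal sketch of the general idea: regress the observed allele frequencies of the admixed sample on the allele frequencies of the surrogate ancestral populations, read the coefficients as ancestry proportions, and use r2 as the goodness of fit. I assume ordinary unconstrained least squares with no intercept; the function name and the simulated data are hypothetical and not from the study.

```python
import numpy as np

def estimate_admixture(ancestral_freqs, observed_freqs):
    """Least-squares ancestry proportions from allele frequencies.

    ancestral_freqs -- array of shape (n_snps, n_populations)
    observed_freqs  -- array of shape (n_snps,) for the admixed sample
    Returns (coefficients, r2). Assumption: plain unconstrained least
    squares, so coefficients are not forced to be positive or sum to 1.
    """
    P = np.asarray(ancestral_freqs, dtype=float)
    y = np.asarray(observed_freqs, dtype=float)
    coef, *_ = np.linalg.lstsq(P, y, rcond=None)
    fitted = P @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

# Simulated toy data: two source populations, 5,000 SNPs (all made up)
rng = np.random.default_rng(0)
P = rng.uniform(0.05, 0.95, size=(5000, 2))
true_props = np.array([0.75, 0.25])
# Random fluctuation around the expected frequencies stands in for drift
y = P @ true_props + rng.normal(0.0, 0.02, size=5000)

coef, r2 = estimate_admixture(P, y)
print(coef, r2)
```

Raising the noise term in the simulation lowers r2 while leaving the estimated proportions roughly unbiased. That is the sense in which a lower r2 can be read as a sign of more drift.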
Admixture in the Pre-Columbian Caribbean. 
J. C. Martinez-Cruzado et al.
The biological origin of the Caribbean aborigines that greeted Columbus is one of the most controversial issues regarding the population history of this region. Genome studies suggest an Equatorial-Tucanoan origin, consistent with the Arawakan language spoken by most natives of the region. However, the archaeological evidence suggests an early arrival from Mesoamerica, and their admixture with the more recent Arawak-speaking group stemming from the Amazon remains a possibility. The lineages comprehending most Puerto Rican samples belonging to haplogroups B1 and C1, which in turn encompass 44% of all Native American mtDNAs in the island, have an unambiguous South American origin. However, none of those belonging to haplogroup A2, encompassing 52% of all Native American mtDNAs, have been related to South America or any other continental region. To augment the scarce data from Mesoamerican countries other than Mexico, we present the complete mtDNA sequence of 6 Honduran samples belonging to distinct control region lineages in addition to 3 from the Dominican Republic and 3 from Puerto Rico. Interestingly, maximum likelihood phylogenetic reconstruction including 40 published haplogroup A2 sequence haplotypes from Mesoamerica, Central America and South America clusters 8 out of 10 Mesoamerican and Andean haplotypes in a deep rooted group, separate from, and excluding all Costa Rican, Panamian and Brasilian haplotypes, suggesting a relatively recent origin for Chibchan-Paezan and Amazonian groups. Furthermore, 4 of the 5 Greater Antillean A2 haplotypes are included in the deeply rooted Mesoamerican-Andean cluster. Moreover, the only Cuban haplotype in the literature and the remaining A2 haplotype from the Dominican Republic form even more deeply rooted private branches. 
Similarly, the only haplogroup C1d sample sequenced from the Dominican Republic forms a private branch with the deepest root in a maximum likelihood tree containing 19 additional C1d haplotypes from Mexico to Brasil plus the CRS. In conclusion, our preliminary results suggest that a substantial proportion of the Native American mtDNA lineages from the Greater Antilles do not share an Amazonian origin with the language their people spoke in 1492. Furthermore, the position of two Dominican lineages at the earliest split in both their respective trees suggests an early origin that could be explained by extensive lineage extinctions in Mesoamerica and the Andes or an origin in North America.
The possible role of social selection in the distribution of the "Proto-Mongolian" haplotype in Kazakhs, Kyrgyz, Mongols and other Eurasian populations.
M. Zhabagin et al.
Social factors may be important contributors to reproductive success and determination of the selective survival of individuals. Therefore, social selection and other social factors are important for understanding population structure and its formation. The role of social selection on the distribution and formation of Y-chromosomal gene pool has been studied. There is a strong connection between social selection and birth rate of the descendants, whose fathers had achieved high social status during the expansion of the Mongol Empire and associated historical events. A total of 783 haplotypes, including 687 newly obtained and 96 retrieved from the literature were assigned to the haplogroup C3*-M217 (xM48) based on genotyping 17 Y-chromosomal STR markers. These haplotypes represent 11 populations of Eurasia: Kazakhs, Mongols, Kyrgyz, Telengits, Circassians, Balkar, Temirgoys, Karachai, Evenki, Kizhi and the Pashtuns. As the result, a major haplotype 13-16-25-15-16-18-14-10-22-11-10-11-13-10-21 (DYS389a-DYS389b-DYS390-DYS456-DYS19-DYS458-DYS437-DYS438-DYS448-GATA4-DYS391-DYS392-DYS393-DYS439-DYS635, N=94) was found to have 12.00% frequency within haplogroup C3*. This haplotype includes and extends the previously described “star-cluster” haplotype. Noteworthy, the frequency of this major haplotype within haplogroup C3* was 16.80% in Kazakhs, 10.13% in Mongols and 2.63% in Kirgiz who are not considered as direct descendants of Genghis Khan. 35.10% of the major haplotype was represented by Kazakh tribe Ashamayly-Kerey, 17.02% by the Khalkh Mongols and 7.44% by the Barguts. Therefore, we suppose this major ancestral haplotype to be the "proto-Mongolian haplotype", inherited by Genghis Khan and his descendants. It is important to mention that Temujin belongs to Kiyat-Borjigin tribe that in turn is a branch of the bigger Borjigin tribe, part of the Khalkh Mongols. Thus, Genghis Khan might be considered as a carrier rather than founder of the star-cluster haplotype. 
He and his descendants are the ones who contributed to a positive effect of social selection in the distribution of this haplotype. Other examples are the Barguts, who had Genghis Khan’s credit and were granted with a number of privileges, or the Kerey, based on the fact that Temujin had been brought up at the court of the Togrul Khan, belonging to the Kerey tribe.
Y-chromosomal variation in native South Americans: bright dots on a gray canvas.
M. Nothnagel et al.
While human populations in Europe and Asia have often been reported to reveal a concordance between their extant genetic structure and the prevailing regional pattern of geography and language, such evidence is lacking for native South Americans. In the largest study of South American natives to date, we examined the relationship between Y-chromosomal genotype on the one hand, and male geographic origin and linguistic affiliation on the other. We observed virtually no structure for the extant Y-chromosomal genetic variation of South American males that could sensibly be related to their inter-tribal geographic and linguistic relationships, augmented by locally confined Y-STR autocorrelation. Analysis of repeatedly taken random subsamples from Europe adhering to the same sampling scheme excluded the possibility that this finding was due to our specific scheme. Furthermore, for the first time, we identified a distinct geographical cluster of Y-SNP lineages C-M217 (C3*) in South America, which are virtually absent from North and Central America, but occur at high frequency in Asia. Our data suggest a late introduction of C3* into South America no more than 6,000 years ago and low levels of migration between the ancestor populations of C3* carrier and non-carriers. Our findings are consistent with a rapid peopling of the continent, followed by long periods of isolation in small groups, and highlight the fact that a pronounced correlation between genetic and geographic/cultural structure can only be expected under very specific conditions.
The timing and history of Neandertal gene flow into modern humans. 
S. Sankararaman et al.
   Previous analyses of modern human variation in conjunction with the Neandertal genome have revealed that Neandertals contributed 1-4% of the genes of non-Africans with the time of last gene flow dated to 37,000-86,000 years before present. Nevertheless, many aspects of the joint demographic history of modern humans and Neandertals are unclear. We present multiple analyses that reveal details of the early history of modern humans since their dispersal out of Africa.
1. We analyze the difference between two allele frequency spectra in non-Africans: the spectrum conditioned on Neandertals carrying a derived allele while Denisovans carry the ancestral allele and the spectrum conditioned on Denisovans carrying a derived allele while Neandertals carry the ancestral allele. This difference spectrum allows us to study the drift since Neandertal gene flow under a simple model of neutral evolution in a panmictic population even when other details of the history before gene flow are unknown. Applying this procedure to the genotypes called in the 1000 Genomes Project data, we estimate the drift since admixture in Europeans of about 0.065 and about 0.105 in East Asians. These estimates are quite close to those in the European and East Asian populations since they diverged, implying that the Neandertal gene flow occurred close to the time of split of the ancestral populations.
2. Assuming only one Neandertal gene flow event in the common ancestry of Europeans and East Asians, we estimate the drift since gene flow in the common ancestral population. We show that an upper bound on this shared drift is 0.018. Because this is far less than the drift associated with the out-of-Africa bottleneck of all non-African populations, this shows that the Neandertal gene flow occurred after the out-of-Africa bottleneck.
3. We use the genetic drift shared between Europeans and East Asians, in conjunction with the observation of large regions deficient in Neandertal ancestry obtained from a map of Neandertal ancestry in Eurasians, to estimate the number of generations and effective population size in the period immediately after gene flow. These analyses suggest that only a few dozen Neandertals may have contributed to the majority of Neandertal ancestry in non-Africans today.
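
To get a feel for what a drift value like 0.018 means, here is a small sketch of my own under the textbook assumption of a constant-size, randomly mating diploid population, where the drift accumulated over t generations is 1 - (1 - 1/(2N))^t. This is not the authors' model, and the population size plugged in below is purely hypothetical.

```python
import math

def drift_from_history(generations, effective_size):
    """Expected drift after `generations` at constant diploid size N."""
    return 1.0 - (1.0 - 1.0 / (2.0 * effective_size)) ** generations

def generations_for_drift(drift, effective_size):
    """Invert the relation above: generations needed to reach `drift`."""
    return math.log(1.0 - drift) / math.log(1.0 - 1.0 / (2.0 * effective_size))

# 0.018 is the upper bound quoted in the abstract; N = 1000 is an assumption
shared_drift = 0.018
assumed_size = 1000
gens = generations_for_drift(shared_drift, assumed_size)
print(gens)
```

The drift value alone only pins down the ratio of generations to population size. Doubling the assumed size roughly doubles the implied number of generations, which is presumably why the abstract brings in a second source of information (the regions deficient in Neandertal ancestry) to separate the two.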
Genetic characterisation of two Greek population isolates. 
K. Hatzikotoulas et al.
Genetic association studies of low-frequency and rare variants can be empowered by focusing on isolated populations. It is important to genetically characterize population isolates for substructure and recent admixture events as these may give rise to spurious associations. Under the auspices of the HELlenic Isolated Cohorts study (HELIC; www.helic.org) we have collected >3,000 samples from two isolated populations in Greece: the Pomak villages (HELIC Pomak), a set of religiously-isolated mountainous villages in the North of Greece; and Anogia and surrounding mountainous villages on Crete (HELIC MANOLIS). All samples have information on anthropometric, cardiometabolic, biochemical, haematological and diet-related traits. 1,500 individuals from each population isolate have been typed on the Illumina OmniExpress and Human Exome Beadchip platforms. Multidimensional scaling analysis with the 1000 Genomes Project data shows similarities of the two population isolates with Mediterranean populations such as the Tuscans from Italy and Iberians from Spain. We also observe evidence for structure within the isolates, with the Kentavros village in the Pomak strand demonstrating high levels of differentiation. To characterise the degree of isolatedness in these populations we estimated the proportion of individuals with at least one “surrogate parent” (using only the subset of samples with pairwise pi-hat<0.2), compared to an outbred Greek population from the TEENAGE study, which comprises 707 unrelated adolescents from random regions of the Attica district. We find that the proportion of individuals with at least one surrogate parent in the MANOLIS isolate is >60% and in the Pomak isolate is >65% compared to ~1% in the outbred Greek population.
Our results establish these populations as isolates and provide some insights into the genomic architecture of Greek populations, which have not been previously characterised.
Efficient and Accurate Whole-Genome Human Phasing.
T. Blauwkamp et al.
   High throughput DNA sequencing allows whole human genomes to be resequenced rapidly and inexpensively producing a comprehensive list of variants relative to the reference genome. However, short read sequencing technologies are limited in their ability to determine phasing information, thus resulting in heterozygous calls being represented as the average of the maternal and paternal chromosomes. Phasing information is of critical importance to personal medicine as it provides a better linkage between genotype and phenotype, permitting new advances in our understanding of compound heterozygote linked diseases, pharmacogenomics, HLA typing, and prenatal genome sequencing. Here, we describe a new sample prep method that enables whole human genome haplotyping at high accuracy using only 30Gb of sequence data. Genomic DNA was fragmented into ~10Kb fragments, end repaired, and ligated to adapters. Hundreds of aliquots with approximately 50MB of DNA in each were amplified, fragmented and converted into individual shotgun libraries. The pooled libraries were sequenced in a single lane of a HiSeq2500 at 2x100bp to generate ~30Gb of sequence. The resulting sequence information was analyzed to obtain a set of long blocks of ~10Kb, covering multiple heterozygous SNPs, allowing phasing of these SNPs relative to each other. An HMM-based phasing algorithm was used to compute the most likely phase and confidence intervals based on the observed coverage and sequencer quality scores. Phasing of those blocks relative to each other was done by another HMM-based algorithm which uses a panel of previously phased genomes. Comparing our results with phase information inferred by transmission from the parents, we found that over 98% of heterozygous SNPs were phased within long blocks (N50=500kb) at a switch error rate below 1 switch per megabase of phased sequence. We present results obtained from multiple cell lines and human samples. 
This new library prep method and data analysis pipeline enables whole human genome phasing with only 30Gb of raw sequence, which represents only ~30% more sequencing than current 30x baseline run for human sequencing. Compared to other published reports, this method is capable of phasing a greater fraction of SNPS with ~75% less sequencing. Coupling our higher percentage of SNPs phased with high accuracy and the lowest sequencing requirement, this new technology is the most affordable approach to generating completely phased whole human genomes.
Inference of Natural Selection and Demographic History for African Pygmy Hunter-Gatherers.
P. H. Hsieh et al.
African Pygmies are hunter-gatherers primarily inhabiting the Central African rainforests, where they are exposed to high temperatures, high humidity, and a pathogen and parasite-enriched woody habitat. These factors undoubtedly influenced their evolutionary history as they adapted to this environment. Many Pygmy populations have historically been in socio-economic contact with neighboring Niger-Kordofanian speaking farmer populations, particularly since the agriculture expansion in sub-Saharan Africa that began five thousand years ago (kya). To look for the true signatures of adaptation to the rainforest habitat of pygmies we must control for this complex demographic history. We sequenced and combined 40x whole genome sequence data from 3 Baka pygmies from Cameroon, 4 Biaka pygmies from the Central African Republic, and 9 Niger-Kordofanian speaking Yoruba farmers from Nigeria. We used ∂a∂i, a model-based demographic inference tool, to infer the history of these populations. Our best-fit model suggests that the ancestors of the farmer and pygmy populations diverged 150 kya and remained isolated from each other until 40 kya. This divergence is more ancient than estimated by previous studies that included fewer loci, but is consistent with a PSMC analysis, a separate inference tool that uses different aspects of the genomic data than ∂a∂i. Interestingly, our analysis shows that models with bi-directional asymmetric gene flow between farmers and pygmies are statistically better supported than previously suggested models with a single wave of uni-directional migration from farmers to pygmies. To identify possible targets of positive selection, we conducted a genomic scan using complementary methods, including the frequency-spectrum based G2D test, the population differentiation based XP-CLR test, and the haplotype based iHS test.
We performed 10,000 simulations based on the above best-fit demographic model in order to assign statistical significance to each reported target of natural selection. Our results reveal that genes involved in cell adhesion, cellular signaling, olfactory perception, and immunity were likely targeted by natural selection in the pygmies or their recent ancestors. Our analysis also shows that genes involved in the function of lipid binding are enriched in highly differentiated non-synonymous mutations, suggesting that this function may have acted differently on the Pygmies and farmers after their divergence from their common ancestor.
Population demography and maternal history of Oceania.
A. T. Duggan et al.
   We present a large-scale study of mtDNA diversity across Near and Remote Oceania with whole-genome mtDNA sequencing and a sample collection of more than 1,300 individuals spanning from the Bismarck Archipelago in the west to the Cook Islands in the east. As the location of at least two major migration events (initial colonization over 40,000 years ago, followed by an expansion of Austronesian-speaking migrants around 3,500 years ago), Oceania provides a unique opportunity to study the effects of population admixture. Our results support the idea of sex-biased admixture between the resident populations and the migrants of the Austronesian expansion. We find that haplogroups of putative Asian origin which are thought to have spread with the Austronesian expansion are found at high frequency in all but two populations and, in general, we see little evidence of distinction between Papuan and Austronesian speaking populations. Santa Cruz, which is part of the Solomon Islands but geographically distinct from the main island chain and considered part of Remote Oceania, has long been considered a linguistic oddity and is now accepted to represent a very deep branch in the Oceanic language family. We find that it is also a genetic outlier, with potential direct connections to the Bismarck Archipelago not evident in the main Solomon Islands chain. In this expanded dataset, we find additional evidence of instability and increased heteroplasmy at the ‘Polynesian motif’ position 16247, further confirming previous findings restricted to the Solomon Islands. 

 Reconstructing Austronesian population history. 
M. Lipson et al.
   Present-day populations that speak Austronesian languages are spread across half the globe, from Easter Island in the Pacific Ocean to Madagascar in the Indian Ocean. Evidence from linguistics and archaeology suggests that the "Austronesian expansion," a vast cultural and linguistic dispersal that began 4--5 thousand years ago, had its origin in Taiwan. However, genetic studies of Austronesian ancestry have been inconclusive, with some finding affinities with aboriginal Taiwanese, others advancing an autochthonous origin within Island Southeast Asia, and others proposing a model involving multiple waves of migration from Asia. Here, we analyze genome-wide data from a diverse set of 31 Austronesian-speaking and 25 other groups typed at 18,412 overlapping single nucleotide polymorphisms (SNPs) to trace the genetic origins of Austronesians. We use a recently developed computational tool for building phylogenetic models of population relationships incorporating the possibility of admixture, which allows us to infer ancestry proportions and sources of genetic material for 26 admixed Austronesian-speaking populations. Our analysis provides strong confirmation of widespread ancestry of Taiwanese origin: at least a quarter of the genetic material in all Austronesian-speaking populations that we studied---including all of the Asian ancestry in populations from eastern Indonesia and Oceania---is more closely related to aboriginal Taiwanese than to any populations we sampled from the mainland. Surprisingly, we also show that western Austronesian-speaking populations have inherited substantial proportions of their Asian ancestry from a source that falls within the variation of present-day Austro-Asiatic populations in Southeast Asia. No Austro-Asiatic languages are spoken in Island Southeast Asia today, although there are some linguistic and archaeological suggestions of an early connection between mainland and island populations. 
The most plausible explanation for these findings, in light of the historical evidence, is that western Island Southeast Asia was settled by Austronesian groups who had previously mixed with Austro-Asiatic speakers on the mainland.
 No significant differences in the accumulation of deleterious mutations across diverse human populations. 
R. Do et al.
   Differences in demographic history across populations are expected to cause differences in the accumulation of deleterious mutations because natural selection works less efficiently when population sizes are small. Surprisingly, however, the relative burden of deleterious mutations has never been directly measured across human populations on a per-haploid genome basis, despite the fact that this is what matters biologically in the absence of dominance and epistasis. Here we empirically measure the relative accumulation of deleterious mutations in 13 diverse populations (Yoruba, Mandenka, San, Mbuti, Dinka, Australian, French, Sardinian, Han, Dai, Mixe, Karitiana and Papuan) along with one archaic population (Denisova). All the present-day populations have statistically indistinguishable accumulations of coding mutations. We highlight two examples. First, we find no evidence for a lower mutational load in West Africans than in Europeans despite the approximately 30% higher genetic diversity in West Africans: the accumulation of nonsynonymous mutations in West Africans is 1.01±0.02 times that in Europeans, and for “probably damaging” mutations, the ratio is 1.03±0.04. Second, we find no evidence for a lower mutational load in populations that have experienced agriculture-related expansions over the last 10,000 years than in those that have not: the ratio in Chinese to Karitiana hunter gatherers from Brazil is 0.99±0.07. We determined that these null results are not an artifact of insensitivity of our method to differences in demographic history. As a positive control, we also analyzed archaic Denisovans who are known to have had a small population size for hundreds of thousands of years since separation from modern humans. We show that the Denisovan lineage has accumulated “probably damaging” mutations 1.33±0.06 times more rapidly than modern humans since they split. 
These analyses are important because of the new constraints they place on the distribution of selection coefficients in humans. Given the currently estimated demographic histories of West Africans and Europeans, combined with the fact that we do not detect a lower accumulation of deleterious mutations in West Africans than Europeans, we can conclude that only a small proportion of nonsynonymous mutations have selection coefficients in the range s=-0.01 to -0.001, which is the range of selection coefficients that would be expected to show a lower accumulation in West Africans than in Europeans.
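As a rough reading aid for the ratios quoted in the abstract above, one can ask how many standard errors each ratio lies from 1. The sketch below assumes that the ± values are standard errors (the abstract does not say so) and uses a hypothetical helper named `z_from_one`; it is an illustration, not the authors' method.

```python
def z_from_one(ratio, se):
    """Number of standard errors separating a ratio from 1 (no difference)."""
    return (ratio - 1.0) / se


# Ratios and +/- values as quoted in the abstract. Treating the +/- values
# as standard errors is an assumption made for this illustration.
ratios = {
    "West African / European, nonsynonymous": (1.01, 0.02),
    "West African / European, probably damaging": (1.03, 0.04),
    "Chinese / Karitiana": (0.99, 0.07),
    "Denisovan / modern human, probably damaging": (1.33, 0.06),
}

for label, (ratio, se) in ratios.items():
    print(f"{label}: z = {z_from_one(ratio, se):.2f}")
```

Under that assumption, the three present-day comparisons sit within one standard error of 1, while the Denisovan positive control is more than five standard errors away.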
Deep coverage Bedouin genomes reveal Bedouin haplotypes shared among worldwide populations in the 1000 Genomes Project. 
J. L. Rodriguez-Flores et al.
   The 1000 Genomes Project (1000G) has sampled and sequenced over 2500 genomes that are representative of the genetic diversity in populations worldwide. The Arabian Peninsula has not been previously included in 1000G, hence the connections between genetic variation in the indigenous Bedouin people and worldwide populations are unknown. We have sampled genomes from Bedouin individuals in the nation of Qatar as a window into the genetic variation in this understudied region. Our goal was to use this sample to assess the hypothesis that there is detectable shared ancestry between Bedouin and Southern European populations resulting from the history of empires that spanned both the Mediterranean and Arabian regions and the hypothesis that there is shared ancestry between Bedouin and contemporary Latin American populations, since the majority of European settlers in Latin America from the past half millennium are primarily from Southern European countries. We selected 60 Qataris with over 95% Bedouin ancestry and at least 3 generations of ancestry in Qatar for deep coverage genome sequencing. Genomes were sequenced by the Illumina Genome Network using TruSeq DNA PCR-free sample preparation, generating over 120 gigabases of paired-end 100 base pair reads per genome on a HiSeq 2500, yielding over 30x depth and genotypes for >96% of the genome using both the ELAND/CASAVA and BWA/GATK pipelines. Using these genotypes, we inferred haplotypes using SHAPEIT for Bedouin Qataris and for 1000G populations on a set of sites polymorphic in both 1000G and Bedouins. We used admixture analysis to assess shared ancestry between our Bedouin sample and 1000G populations using the ancestry deconvolution method SUPPORTMIX. Given the lack of appropriate ancestral populations, we conducted a leave-one-out approach, where for each population (1000G + Bedouin = n), we removed the population and used the remaining n-1 populations as an ancestral reference panel. 
Using this approach, we observed up to 15% Bedouin ancestry in European, South Asian, and American populations. Likewise, we observed ancestry from Europe, South Asia, and America in the Bedouin. For individuals from the Americas, the analysis identified a considerable number of segments shared with Bedouins previously classified as European ancestry. 
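The leave-one-out design described in this abstract can be sketched in a few lines. This is only an illustration of how the n reference panels are formed: the function name `leave_one_out_panels` and the short population list are assumptions, and the ancestry deconvolution step itself (SUPPORTMIX in the abstract) is deliberately not modelled.

```python
def leave_one_out_panels(populations):
    """Yield (target, panel) pairs, where panel holds the other n-1 populations."""
    for i, target in enumerate(populations):
        yield target, populations[:i] + populations[i + 1:]


# Illustrative subset only: a few 1000 Genomes population labels plus the
# Bedouin sample. The real analysis used every 1000G population.
populations = ["CEU", "TSI", "YRI", "GIH", "MXL", "Bedouin"]

for target, panel in leave_one_out_panels(populations):
    print(f"target={target:8s} reference panel={panel}")
```

Each population is analysed once as the target, against a panel that never contains the target itself.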
Using a haplotype-based model to infer Native American colonization history.
C. Lewis et al.
   We apply a powerful haplotype-based model (described in Lawson et al. 2012) to infer the population history of 410 individuals from ~50 Native American groups, using data interrogated at >470,000 genome-wide autosomal Single-Nucleotide-Polymorphisms (SNPs). The model matches haplotype patterns among individuals' chromosomes to infer which individuals share recent common ancestry at each location of the genome, an approach that has previously been demonstrated to increase power substantially over widely-used alternative approaches that consider SNPs independently. We apply this methodology to 1861 samples described in Reich et al. (2012), incorporating 263 additional samples from 32 relevant world-wide regions collated from other publicly available resources and currently unavailable data. We utilize this methodology and these data in two ways. First, we infer intermixing (i.e. "admixture") events among different Native American groups by identifying the groups that share the most haplotype segments. Using additional unpublished techniques, we determine the dates of these intermixing events, the proportions of DNA contributed, and the precise genetic make-up of the groups involved. These unique characteristics set this methodology apart from all presently available software, allowing us to place these mixing events into a clear historical context and thus identify the factors (e.g. the rise or fall of various Native American empires) that have contributed most to the genetic architecture of present-day Native American groups. Second, we match DNA patterns from each Native American group to a set of over 30 populations from Siberia and East Asia, describing each Native American group as a mixture of DNA from these regions. This enables us to shed light on the widely debated number of distinct migrations into the Americas during the initial colonization across the Bering Strait, comparing our results to previous inference from the literature. 
Our application demonstrates the power gained by using rich haplotype information relative to approaches that ignore this information.
Using Ancient Genomes to Detect Positive Selection on the Human Lineage. 
K. Prüfer et al.
   At least two distinct groups of archaic hominins inhabited Eurasia before the arrival of modern humans: Neandertals and Denisovans. The analysis of the genomes of these archaic humans revealed that they are more closely related to one another than they are to modern humans. However, since modern and archaic humans are so closely related, only about 10% of the archaic DNA sequences fall outside the present-day human variation whereas for 90% of the genome, Neandertal or Denisova DNA sequences are more closely related to some humans than to others. The fact that the archaic sequence often falls within the diversity of modern humans can be used to detect selective sweeps that affected all modern humans after their split from archaic humans since such sweeps will result in genomic regions where both the Neandertal and Denisova genomes fall outside the modern human variation. The genetic lengths of such external regions are proportional to the strength of selection, since stronger selection will lead to faster sweeps allowing less time for recombination to decrease their size. We have implemented a test for such external regions as a hidden Markov model. At each polymorphic position the model emits ancestral or derived based on whether the tested archaic genome carries the ancestral or derived variant of SNPs observed in present-day humans. The model was applied to 185 African genomes from the 1000 Genomes phase 1 data. We identified thousands of external regions using the Neandertal and Denisova genomes, separately. Approximately one third of the regions are overlapping between the two genomes. These regions are significantly longer than regions identified in only one of the archaic genomes. Based on this excess of overlap for long regions, we devise a measure to identify a set of regions that are candidates for selective sweeps on the human lineage since the split from Neandertal and Denisova.
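To make the hidden Markov model idea in this abstract concrete, here is a minimal two-state sketch decoded with the Viterbi algorithm. It is not the authors' implementation: the state names, every probability, and the toy observation string are assumptions chosen for illustration. "A" means the archaic genome carries the ancestral allele at a human-polymorphic site and "D" the derived allele. An external region is expected to show almost only "A", because variants that arose within the human tree are absent from a lineage that falls outside it.

```python
import math

STATES = ("internal", "external")

# All probabilities below are hypothetical, chosen only for illustration.
START = {"internal": 0.9, "external": 0.1}
TRANS = {
    "internal": {"internal": 0.99, "external": 0.01},
    "external": {"internal": 0.05, "external": 0.95},
}
EMIT = {
    "internal": {"A": 0.70, "D": 0.30},
    "external": {"A": 0.99, "D": 0.01},
}


def viterbi(observations):
    """Return the most probable state path for a string of 'A'/'D' calls."""
    # Work in log space to avoid underflow on long sequences.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            best_prev = max(
                STATES, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy data: mixed calls, then a long run of ancestral calls, then mixed again.
calls = "DADDA" + "A" * 40 + "DDADD"
path = viterbi(calls)
print("".join("E" if s == "external" else "i" for s in path))
```

With these toy parameters the long run of ancestral calls is decoded as an external region, flanked by internal sequence. In the real analysis the length of such regions is what matters, since longer external regions are taken to point to stronger selection.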
Pulling out the 1%: Whole-Genome In-Solution (WISC) capture for the targeted enrichment of ancient DNA sequencing libraries. 
C. D. Bustamante et al.
   The very low levels of endogenous DNA remaining in most ancient specimens have precluded the shotgun sequencing of many interesting samples due to cost. For example, ancient DNA (aDNA) libraries derived from bones and teeth often contain <1% endogenous DNA, meaning that the majority of sequencing capacity is taken up by environmental DNA. We will present a method for the targeted enrichment of the endogenous component of human aDNA sequencing libraries. Using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to significantly enrich for human-derived DNA fragments. This approach, which we call whole-genome in-solution capture (WISC), allows us to obtain genome-wide ancestral information from ancient samples with very low endogenous DNA contents. We demonstrate WISC on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased dramatically, with up to 59% of reads mapped to human and fold enrichment ranging from 5X to 139X. Furthermore, we maintained coverage of the majority of fragments present in the pre-capture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062-147,243) for the post-capture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217-73,266) for the pre-capture libraries, increasing resolution in population genetic analyses. We will also present the results of performing WISC on other aDNA libraries from both archaic human and non-human samples, including ancient domestic dog samples. 
Our capture approach is flexible and cost-effective, allowing researchers to access aDNA from many specimens that were previously unsuitable for sequencing. Furthermore, this method has applications in other contexts, such as the enrichment of target human DNA in forensic samples.
Insights into population history from a high coverage Neandertal genome. 
D. Reich, for the Neandertal Genome Consortium
   We have sequenced to about 50-fold coverage a genome sequence from about 40 mg of a bone found in Denisova Cave in Southern Siberia. The genome of this female is much more closely related to the low-coverage Neandertal genomes from Croatia, Spain, Germany and the Caucasus than to the genome of archaic Denisovans, a sister group of Neandertals, and provides unambiguous evidence that both Neandertals and Denisovans inhabited the Altai Mountains in Siberia. The high-coverage Neandertal genome, combined with our earlier sequencing of a high quality Denisova genome, allows novel insights about the population history of archaic humans:
    • We document recent inbreeding in this Altai Neandertal. The inbreeding coefficient of about 1/8 corresponds to about the homozygosity that would be expected from a mating of half siblings. 
    • The Altai Neandertal genome shares almost seven percent more derived alleles with present-day Africans than does the Denisova genome. This means that the Denisovans derived a proportion of their ancestry from a very archaic human lineage, and the amount of this ancestry they inherit is larger than in Neandertals. 
    • The Denisovan genome is affected by major recent gene flow from an Altai-related Neandertal. 
    • To further characterize the variation among Neandertals we sequenced the genome of a Neandertal from the Caucasus to about 0.5-fold coverage. Comparisons to present-day genomes show that the Neandertals who contributed genes to present-day non-Africans were more closely related to this Caucasian Neandertal than to the Neandertals we sequenced from the Altai. 
    • We built a map of Neandertal ancestry in modern humans, using data from all non-Africans in the 1000 Genomes Project. We show that the average Neandertal ancestry on chromosome X of present-day non-Africans is about a fifth of the genome average. It is known that hybrid incompatibility loci concentrate on chromosome X. Thus, this observation is consistent with a model of hybrid incompatibility in which Neandertal variants that introgressed into modern humans were rapidly selected away due to epistatic interactions with the modern human genetic background.
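The "about 1/8" figure in the first bullet of this abstract follows from Wright's path-counting formula for the inbreeding coefficient. A short worked version, assuming the parent shared by the two half siblings is not itself inbred (F_A = 0):

```latex
F = \sum_{\text{paths}} \left(\tfrac{1}{2}\right)^{n} \left(1 + F_A\right)
\quad\Rightarrow\quad
F_{\text{half-sib mating}} = \left(\tfrac{1}{2}\right)^{3} (1 + 0) = \tfrac{1}{8}
```

Here n = 3 counts the individuals on the single path that runs from one parent, through the shared grandparent, to the other parent. Other pairings, such as double first cousins, give the same value, which is presumably why the abstract says the homozygosity "corresponds to about" that of a half-sibling mating rather than identifying the relationship outright.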
Inferring complex demographies from PSMC coalescent rate estimates: African substructure and the Out-of-Africa event.
S. Gopalakrishnan et al.
   Human population history is an intriguing and complex story with many events like population growth, bottlenecks, time-dependent/non-homogeneous migration, population splits and mixtures. Estimating complete demographies with population sizes, rates of gene flow and population split times has proven to be a challenging endeavor. We propose a framework for jointly estimating the demography parameters, especially gene-flow rates and split times, for a large number of populations. We use coalescent rate estimates obtained from Pairwise Sequentially Markovian Coalescent (PSMC) as the starting point for our analysis. Since PSMC works on only two chromosomes at a time, we apply PSMC to all pairs of individuals to obtain the pairwise coalescent rates for lineages from every pair of sampled populations. Using a mathematical model for calculating coalescent probabilities given population parameters, we estimate demography using the parameters that best fit the observed coalescent rates.
   In this study, we focus on two aspects of African population genetics, 1. the nature of population structure in Africa going back in time and 2. the timing of the Out-of-Africa event. To address these questions, we assembled a dataset with whole genome sequences from 162 individuals using both in-house sequencing and publicly available sources. These samples span 22 populations worldwide. These include eleven African populations which we use to dissect the population substructure in Africa. In addition, we also have 2 Middle Eastern, 5 European and 4 East/Central Asian populations which inform the population split time estimates for the Out-of-Africa event and the European-Asian split.
   We find extensive population structure in Africa extending back to before the Out-of-Africa event. The Ethiopian populations, Amhara and Oromo, show evidence of mixing beyond 15 kya. The Maasai and Luhya merge with the Ethiopian populations to form a panmictic East African population ~40kya. We find evidence for extensive mixing between east and west African populations before 50kya. Among the pygmy populations, we see recent gene flow between the Batwa and Mbuti. All African populations except the San merge into a single population around 110 kya. The San exchange migrants with the other African populations beginning ~120 kya. We estimate the Out-of-Africa event to have occurred ~75kya and the European-Asian split to ~25kya.
Out of Africa, which way? 
L. Pagani et al.
While the African origin of all modern human populations is well-established, the dynamics of the diaspora that led anatomically modern humans to colonize the lands outside Africa are still under debate. Understanding the demographic parameters as well as the route (or routes) followed by the ancestors of all non-Africans could help to refine our understanding of the selection processes that occurred subsequently, as well as shedding light on a landmark process in our evolutionary history. Of the three possible gateways out of Africa (via Morocco across the Gibraltar strait, via Egypt through the Suez isthmus or via the Horn of Africa across Bab el Mandeb strait) only the latter two are supported by paleoclimatic and archaeological evidence. Furthermore, recent studies (Pagani et al. 2012) showed that, although the modern Ethiopian populations might be good candidates for the descendants of the source population of such a migration, modern Egyptians could be an even better candidate. Unfortunately, however, only a few Egyptian samples have been genotyped and, as yet, none have been fully sequenced. Here, we have generated 125 Ethiopian and 100 Egyptian whole genome sequences (Illumina HiSeq, 8x average depth). The genomes were partitioned using PCAdmix (Brisbin et al. 2012) to account for the confounding effects of recent introgression from neighboring non-African populations. To explore the genetic legacy of migration routes out of Africa, and in particular to test whether the observed genetic data support one route over another, the African components of Egyptians and Ethiopians were then compared to a panel of available non-African populations from the 1000 Genomes Project (1000 Genomes Project Consortium, 2012). The high resolution provided by whole genome sequencing allows us to shed new light on the paths followed by our ancestors as they left Africa, as well as refining the current knowledge of the demographic history of the populations analyzed.
The Saudi Arabian Genome Reveals a Two Step Out-of-Africa Migration. 
J. J. Farrell et al.
   Here we present the first high-coverage whole genome sequences from a Middle Eastern population consisting of 14 Eastern Province Saudi Arabians. Genomes from this region are of interest to further answer questions regarding “Out-of-Africa” human migration. Applying a pairwise sequentially Markovian coalescent model (PSMC), we inferred the history of population sizes between 10,000 years and 1,000,000 years before present (YBP) for the Saudi genomes and an additional 11 high-coverage whole genome sequences from Africa, Asia and Europe.
   The model estimated the initial separation from Africans at approximately 110,000 YBP. This intermediate population then underwent a long period of decreasing population size culminating in a bottleneck 50,000 YBP followed by an expansion into Asia and Europe. The split and subsequent bottleneck were thus two distinct events separated by a long intermediate period of genetic drift in the Middle East. The two most frequent mitochondrial haplogroups (30% each) were the Middle Eastern U7a and the African L. The presence of the L haplogroup common in Africa was unexpected given the clustering of the Saudis with Europeans in the phylogenetic tree and suggests some recent African admixture. To examine this further, we performed formal tests for a history of admixture and found no evidence of African admixture in the Saudis after the split. Taken together, these analyses suggest that the L3 haplogroup found in the Saudis was present before the bottleneck 50,000 YBP. Given the TMRCA estimates for the L3 haplogroup of approximately 70,000 YBP and the timing of the Out-of-Africa split, these analyses suggest that the L3 haplogroup arose in the Middle East with a subsequent back migration and expansion into Africa over the Horn-of-Africa during the lower sea levels found during the glacial period bottleneck.
    These results are consistent with the hypothesis that modern humans populated the Middle East before a split 110,000 YBP, underwent genetic drift for 60,000 years before expanding to Asia and Europe as well as back-migration into Africa. Examination of genetic variants discovered by Saudi whole genome sequencing in ancestral African populations and European/Asian populations will contribute to the understanding of human migration patterns and the origin of genetic variation in modern humans.
 Geographic Population Structure (GPS) of worldwide human populations infers biogeographical origin down to home village
E. Elhaik et al.
The search for a method that utilizes biological information to predict human’s place of origin has occupied scientists for millennia. Modern biogeography methods are accurate to 700 km in Europe but are highly inaccurate elsewhere, particularly in Southeast Asia and Oceania. The accuracy of these methods is bound by the choice of genotyping arrays, the size and quality of the reference dataset, and principal component (PC)-based algorithms. To overcome the first two obstacles, we designed GenoChip, a dedicated genotyping array for genetic anthropology with an unprecedented number of ~12,000 Y-chromosomal and ~3,300 mtDNA SNPs and over 130,000 autosomal and X-chromosomal SNPs carefully chosen to study ancestry without any known health, medical, or phenotypic relevance. We also 615 individuals from 54 worldwide populations collected as part of the Genographic Project and the 1000 Genomes Project. To overcome the last impediment, we developed an admixture-based Geographic Population Structure (GPS) method that infers the biogeography of worldwide individuals down to their village of origin. GPS’s accuracy was demonstrated on three data sets: worldwide populations, Southeast Asians and Oceanians, and Sardinians (Italy) using 40,000-130,000 GenoChip markers. GPS correctly placed 80% of worldwide individuals within their country of origin with an accuracy of 87% for Asians and Oceanians. Applied to over 200 Sardinian villagers of both sexes, GPS placed a quarter of them within their villages and most of the remaining within 50 km of their villages, allowing us to identify the demographic processes that shaped the Sardinian society. These findings are significantly more accurate than PCA-based approaches. We further demonstrate two GPS applications in tracing the poorly understood biogeographical origin of the Druze and North American (CEU) populations. Our findings demonstrate the potential of the GenoChip array for genetic anthropology. 
Moreover, the accuracy and power of GPS underscore the promise of admixture-based methods to biogeography and have important ramifications for genetic ancestry testing, forensic and medical sciences, and genetic privacy.

June 11, 2013

~60-50 thousand coastal migration to Asia from Africa affirmed

My own opinions on this matter have been repeated ad nauseam in this blog, so I will only briefly touch on a couple of points in this new article.

Contrary to the authors' claim that: "The size of the mtDNA database is very substantial: currently there are almost 13,000 complete non-African mtDNA genomes available, not one of which is pre-L3." there are plenty of pre-L3 mtDNA in Eurasia. Some (or indeed most?) of these might represent more recent African admixture, and one could argue why they believe that to be the case, but no one has ever studied pre-L3 Eurasian mtDNA to conclude that none of it had an ancient presence in Eurasia. The native non-existence of pre-L3 in Eurasia is a viewpoint, not a fact.

Second, the authors present the following table:


Note, however, that they have chosen to date the common ancestor of all African sublineages (to ~70.2ky using ML method), but they have not done the same for the common ancestor of all non-African sublineages (i.e., M+N). A recent study estimates M+N to be 77KBP and L3 to be 78.3KBP, with other possibilities depending on method and portion of the molecule examined.

At present, I see no good reason to think that haplogroup L3 originated either in Africa or Eurasia; one could argue for an African origin, but temporal priority for African L3 is not established. The only thing that seems to be established is that there are "more" L3 subclades in Africa, which is phylogenetically meaningless until the bifurcating structure of the L3 subtree is resolved. And, we should also not forget that the Toba eruption did not cause a volcanic winter in Africa, nor was Africa affected by the drying up of the Sahara-Arabia belt c. 70kya, both of which may have suppressed Eurasian mtDNA variation within L3, irrespective of its ultimate origins. And, as I've argued before, both Toba and the post-70kya ecological crisis are excellent candidates for a Eurasian bottleneck caused by pre-existing populations surviving in refugia such as this.

Finally, the authors dismiss the possibility of an overestimated autosomal mutation rate and its implications for human history: "Recent reestimates of the autosomal mutation rate from whole-genome pedigree data suggest a European–Asian split time of 40–80 ka, although they do not, as has been suggested, lend any support to a dispersal from Africa before 80 ka (36) (Genetics)." 

Actually, the most recent estimate might be consistent with a ~96ky split of Africans from non-Africans, and the mutation rate issue is material to the timing of the African/non-African split. It's not clear where the needle will settle when the issue is resolved, but there's plenty of room for both pre- and post-Toba Out-of-Africa as things stand.


PNAS doi: 10.1073/pnas.1306043110

Genetic and archaeological perspectives on the initial modern human colonization of southern Asia

Paul Mellars et al.

It has been argued recently that the initial dispersal of anatomically modern humans from Africa to southern Asia occurred before the volcanic “supereruption” of the Mount Toba volcano (Sumatra) at ∼74,000 y before present (B.P.)—possibly as early as 120,000 y B.P. We show here that this “pre-Toba” dispersal model is in serious conflict with both the most recent genetic evidence from both Africa and Asia and the archaeological evidence from South Asian sites. We present an alternative model based on a combination of genetic analyses and recent archaeological evidence from South Asia and Africa. These data support a coastally oriented dispersal of modern humans from eastern Africa to southern Asia ∼60–50 thousand years ago (ka). This was associated with distinctively African microlithic and “backed-segment” technologies analogous to the African “Howiesons Poort” and related technologies, together with a range of distinctively “modern” cultural and symbolic features (highly shaped bone tools, personal ornaments, abstract artistic motifs, microblade technology, etc.), similar to those that accompanied the replacement of “archaic” Neanderthal by anatomically modern human populations in other regions of western Eurasia at a broadly similar date.


August 20, 2012

Human population size over time (Theunert et al. 2012)

The authors of the current paper estimate an ancestral size of 13,601 and a current size of 22,915 for Yoruba, with an expansion beginning at 20.15ky. For Europeans, a more complex model provides a better fit, with an ancestral size of 10,065 going through a bottleneck to 3,300 at 39.5ky and recovering at 32.5ky to a current size of 18,300.

I note that the 2.5x10^-8 mutation rate was used to derive these estimates. The model used in this paper infers population size changes on the basis of both allele frequency and linkage disequilibrium, so it is not clear what effect the slower, directly measured mutation rate (the one that led to the re-dating of the human-chimp divergence) will have on them.

Since the direct rate is about half of the widely used 2.5x10^-8, it might be the case that the European bottleneck will approach the ~70ky turning point which is associated with the Toba eruption and the drying up of the Sahara-Arabia belt. That sounds like a most opportune time for a population living in Eurasia to go through a 3-fold reduction in size! Moreover, the Yoruba expansion may be re-dated to ~40ky, which would correspond to the Middle to Later Stone Age transition in Africa. Again, an excellent time for a population whose size was pretty constant to experience growth!

Indeed, according to a recent paper on climate history, the middle regions of Africa were more shielded from climate change than North Africa/Levant, so less extreme variations in size in tropical Africa vs. a drastic bottleneck in North Africa/Levant where the ancestors of Europeans were at the time, makes excellent sense to me.

In another recent paper that used a 2.35x10^-8 mutation rate (Lukic et al. 2012), the founding of the Eurasian population was estimated at 52ky. But, if the direct rate is used, this estimate will probably land near the crucial ~100ky mark that coincides with the first appearance of anatomically modern humans in the Levant at Skhul/Qafzeh, associated with a Mousterian lithic technology, as well as the appearance of the Nubian Complex in South Arabia. So, there is no need to speculate about the early Out-of-Africa-that-failed with respect to these processes and postulate unwarranted coastal migrations.
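The re-dating argument above is simple arithmetic, and a small sketch makes it explicit. It assumes that inferred dates scale inversely with the assumed mutation rate, which, as noted earlier in this post, may not hold exactly for a method that also uses linkage disequilibrium. The direct rate of 1.25x10^-8 is an illustrative value (half of 2.5x10^-8), and `rescale_date` is a hypothetical helper.

```python
def rescale_date(date_ky, old_rate, new_rate):
    """Rescale a date, assuming time is inversely proportional to mutation rate."""
    return date_ky * old_rate / new_rate


# Illustrative direct rate: half of the widely used 2.5e-8 (an assumption).
DIRECT_RATE = 1.25e-8

# (published date in ky, mutation rate used by the study)
estimates = {
    "Yoruba expansion (Theunert et al.)": (20.15, 2.5e-8),
    "European bottleneck start (Theunert et al.)": (39.5, 2.5e-8),
    "European recovery (Theunert et al.)": (32.5, 2.5e-8),
    "Eurasian founding (Lukic et al.)": (52.0, 2.35e-8),
}

for label, (date_ky, rate) in estimates.items():
    new_date = rescale_date(date_ky, rate, DIRECT_RATE)
    print(f"{label}: {date_ky}ky -> {new_date:.1f}ky")
```

Under these assumptions the Yoruba expansion moves to about 40ky, the European bottleneck to roughly 65-79ky, and the Eurasian founding to just under 100ky. Choosing a direct rate nearer 1x10^-8 or 1.3x10^-8 shifts every date proportionally.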

A new calibration of the mtDNA phylogeny is probably also needed. Time-dependent evolution in mtDNA when the new human-chimp calibration is used will result in a new saturation curve, where mtDNA evolution will occur at recent times with something close to the genealogical rate, and at long (pre-AMH) intervals at the long-term evolutionary rate conditioned on the Pan-Homo divergence. Someone ought to "do the dates" again, but I don't think it's inconceivable that the L3 clade corresponding to the Out-of-Africa event will be updated from 70ky to something like 100ky as well, with the origins of the Eurasian M and N lineages approaching 70ky themselves. So, the puzzle of L3 may in fact be resolved: it may be of ~100ky African origin, with later Eurasian clades N and M appearing at the same time as the ~70ky bottleneck.

It seems to me that the directly measured rate of 1-1.3x10^-8 is a better fit than the widely used one of 2.5x10^-8. Many of the papers currently appearing still use the old rate, so their conclusions need to be updated. 

It seems that the major Out-of-Africa event happened pre-100ky, the Eurasian bottleneck occurred at ~70ky and coincided with well-known adverse geo-climatic developments; some of the survivors went north Out-of-Arabia at this time, started interbreeding with Neandertals, and finally the MP/UP transition occurred post-50ka, probably first in the Levant, and followed very quickly in Russia, Central Europe, and the Mediterranean.

Mol Biol Evol (2012) doi: 10.1093/molbev/mss175

Inferring the History of Population Size Change from Genome-Wide SNP Data

Christoph Theunert, Kun Tang, Michael Lachmann, Sile Hu and Mark Stoneking

Dense, genome-wide single-nucleotide polymorphism (SNP) data can be used to reconstruct the demographic history of human populations. However, demographic inferences from such data are complicated by recombination and ascertainment bias. We introduce two new statistics, allele frequency-identity by descent (AF-IBD) and allele frequency-identity by state (AF-IBS), that make use of linkage disequilibrium information and show defined relationships to the time of coalescence. These statistics, when conditioned on the derived allele frequency, are able to infer complex population size changes. Moreover, the AF-IBS statistic, which is based on genome-wide SNP data, is robust to varying ascertainment conditions. We constructed an efficient approximate Bayesian computation (ABC) pipeline based on AF-IBD and AF-IBS that can accurately estimate demographic parameters, even for fairly complex models. Finally, we applied this ABC approach to genome-wide SNP data and inferred the demographic histories of two human populations, Yoruba and French. Our results suggest a rather stable ancestral population size with a mild recent expansion for Yoruba, whereas the French seemingly experienced a long-lasting severe bottleneck followed by a drastic population growth. This approach should prove useful for new insights into populations, especially those with complex demographic histories.

Link

November 19, 2011

The "Upper Paleolithic" of South Arabia

I came across this interesting book chapter on The "Upper Paleolithic" of South Arabia by Jeffrey Rose and Vitaly Usik. I first became aware of Dr. Rose's work in Southern Arabia when I watched the "Incredible Human Journey" (see Related links below) a couple of years ago. The conclusions of the chapter seem to mesh quite well with some of my recent thoughts about a possible Out-of-Arabia expansion of modern humans, posterior to the earlier Out-of-Africa.

The following figure is instructive:

Notice the super-aridity of MIS 4, circa 70ka BP. This would certainly be an awful time for anyone to move into Arabia. Conversely, if there were anatomically modern people living there prior to MIS 4, the onset of the super-arid phase during MIS 4 would be a great time to get out.

As I mention in my previous post on mtDNA haplogroup L3, I think that the major human expansion associated with haplogroup L3 and its M/N subclades originated in Arabia, and the super-arid MIS 4 phase looks about right for a bottleneck out of which the descendants of only a single woman, the L3 ur-mother, would survive.

From the book chapter:
So, we are able to make a few general observations regarding the Upper Paleolithic found in the southern portions of the peninsula: (1) there are multiple phases of human occupation in South Arabia throughout the latter half of the Upper Pleistocene, (2) there are elements loosely related to the Levantine sequence, however, the South Arabian Upper Paleolithic probably belongs to a unique and locally-derived lithic tradition, (3) there do not appear to be any links with East Africa (with the exception of the Hargeisan) from MIS 4-onward, and (4) assemblages from southern and south-western Arabia are dominated by different laminar-based technologies between 75 and 8 ka.
The Hargeisan is interesting, because it is a possible link of an expansion from Arabia to Africa:
One potentially additional piece of evidence for this hypothesized Near Eastern/Arabian-derived human expansion is the anomalous Hargeisan Industry found in the Horn of Africa. Known from a small number of findspots around Hargeisa (Clark, 1954), Boosasso (Graziosi, 1954) and Midhishi Cave in the Golis Mountains of northern Somalia (Gresham, 1984; Brandt, 1986), the Hargeisan has been found overlying MSA material and beneath LSA occupation layers.
Of course, the political situation in Somalia may suggest that scientists won't be studying the Hargeisan anytime soon.

More from the book chapter:
From an archaeological perspective, Straus and Bar-Yosef (2001: 2) entertain the same possibility: “there is, however, no reason a priori to exclude the possibility that intercontinental contacts occurred on a two-way street, especially at Suez, via Sinai, or across the shallow Bab al Mandab, so close to that corridor to sub-Saharan Africa, the Nile.” Marks (2005) and Otte et al. (2007) envisage similar scenarios during the MP/UP transitions in the Near East and Zagros regions. Both scholars argue that the archaeological evidence from Eastern Europe and Western Asia indicate the expansion of European UP technologies radiated from these areas, rather than Africa, during early MIS 3. Echoing this proposition from a biological perspective, Schillaci (2008) proposes the spread of Levantine-derived peoples into Australasia between 60 and 40 ka based on fossil evidence and phylogenetic relationships between populations.
and:
We maintain that the evidence from Arabia indicates the post-MIS 4 human expansion did not originate in sub-Saharan Africa; rather, early modern humans have emerged from a geographic range encompassing areas of northeast Africa, Western Asia, Arabia, and South Asia. These populations would have been forced to contract into environmentally stable refugia around Arabia such as the Ur-Schatt River Valley, coastal oases, Yemeni Highlands, and/or the Dhofar Mountains during climatic downturns. As such, the fluctuating dynamic between landscape carrying capacity and population density may have been a critical mechanism driving early human dispersals from the region. Episodes of climate change caused large portions of the Arabian peninsula to become uninhabitable due to such calamities as the inundation of the emerged continental shelf and desertification throughout the interior. Given the potential importance of these once favorable, now uninhabitable zones, future investigations in and around Arabia should endeavor to explore the heart of the desert and bottom of the sea.

Related:

November 18, 2011

Age of mtDNA haplogroup L3: about 70 thousand years


There are two aspects to this paper: first, it appears to be a solid attempt at inferring the age of mtDNA haplogroup L3. This haplogroup contains several subclades, including M and N, the two macrohaplogroups of the vast majority of Eurasians.

I am usually skeptical of very tight age estimates, but there appear to be no obvious flaws in the paper, and alternative mutation rates are used to derive the 70ka bound. Moreover, the 70ka age is consistent with what appears to be no longer in doubt, namely the arrival of fully anatomically and behaviorally modern humans all over the Old World, starting from the 50-40ka period.

The second aspect of this paper is its claim that pre-70ka dispersals are irrelevant to modern human origins. Indeed, if the early anatomically modern humans from the Levant (Qafzeh/Skhul) or the pre-Toba layers in Asia were ascribed to Out-of-Africa humans, then we would expect their genetic differentiation with East African mtDNA to trace back to Marine Isotope Stage 5 (~130-75ka), and indeed to its early stages, to account for the Mount Carmel hominins.

So, have we solved the Out-of-Africa riddle? Did the Out-of-Africa expansion take place after 70ka? I don't think so, not because there is anything wrong with the mtDNA age, but because the competing hypothesis, which is rarely, if ever, discussed, is that there was an Into-Africa event post-70ka.

mtDNA furnishes the best evidence that humans trace their ultimate origins to Africa, since L3, of which M and N are subclades, is a young twig of the mtDNA phylogeny. As the authors of the current paper note:
Although the tree is highly starlike at shallower time depths, suggesting numerous episodes of rapid growth in the human population in the more recent past, it is only at a third of the time depth of the entire tree with the emergence of the L3 haplogroup that the first multifurcating [...] them all the ancient diversity observed outside Africa) (Behar et al. 2008; Torroni et al. 2006; Watson et al. 1997).
Whatever humans were doing between ~200ka (when the first anatomically modern specimen is found in Ethiopia, and when the mtDNA phylogeny coalesces) and ~70ka (when the L3 node does), they were certainly not yet in the overdrive mode we find them c. 50ka when they begin making their grand entrance all over the surface of the planet.

So, while the ultimate roots of modern mankind are in Africa, there is no clear picture -yet- whether the post-70ka major expansion of humans originated in Africa. Certainly, it cannot have originated too far from it, because non M and N mtDNA is virtually absent throughout most of the world. But, it is not possible, yet, to exclude a Near Eastern post-70ka expansion that would make the ~100ka Levantine hominins ancestral to most modern humans, rather than irrelevant sidebranches.

There are several reasons why this may be the case:
  1. East African L3 subclades are found in Arabia, where one finds a rich assortment of basal N subclades, as well as a not insignificant amount of M. These are often dismissed as the result of recent introgression, but they could in fact, and in part, be remnants of an older population, perhaps associated with the Persian Gulf Oasis hypothesis, and certainly absorbed by J1-bearing Arabian ancestors from further north.
  2. The Y-chromosome phylogeny has no clear signal of Out-of-Africa ~70ka. On the contrary, Eurasia possesses DE*, D and E haplogroups, as well as CF, the major human lineage, with C being totally Asian. While Africa possesses the oldest Y-chromosome lineages (basal to CT), the evidence tilts towards Asia being the homeland of CT, which has the closest parallels to a post-70ka event.
  3. Finally, Africa, including East Africa, shows, at present, no sign of the presence of fully modern humans at the crucial time period. We do have, of course, Omo ~195ka, crucial anatomically modern humans in Ethiopia, but no clear sign of a bubbling volcano of a population c.70ka ready to erupt onto the Eurasian landmass.
At present, I consider a Near Eastern origin of the recent post-70ka expansion of modern humans a possibility that cannot be dismissed. The evidence seems ambiguous, at present, since Eurasia may have a better case for such an expansion in Y-chromosomes, while Africa may have a better case in mtDNA (since it has more basal L3 clades than Eurasia).

A better characterization of Near Eastern mtDNA, especially from Arabia, as well as increased archaeological/palaeoanthropological investigations in East Africa/the Near East/South Asia are needed to finally uncover the material counterpart of the major human expansion that is written in our genes.

A third aspect of the paper is its claim that the human expansion was linked to climate and not to the emergence of symbolic behavior. I have my own reservations on the whole concept of "symbolic behavior". We do see early evidence of such behavior in Africa, such as at Blombos Cave in South Africa and in North Africa. The authors of the current paper write:
There is an intriguing possible rider to this conclusion. North Africa has been entirely depopulated and repopulated, at least with respect to mtDNA variation (Pereira et al. 2010), since the time of the Aterian industry, where modern symbolic behavior is attested very early, similar to Southern Africa, and in contrast to Eastern Africa (Barton et al. 2009). We might therefore contemplate a possible North Africa ancestry for L3, with its rapid radiation corresponding to an early range expansion into Eastern Africa. However, any potential dispersal between the Mediterranean and the Horn of Africa around the time of the MIS4/3 transition would face severe environmental difficulties, unlike the “green Sahara” conditions of MIS5 and the early Holocene (Drake et al. 2010). We therefore conclude that an indigenous origin for L3 in Eastern Africa remains by far the most likely scenario.

As Mellars (2006) has argued, the early evidence for symbolically mediated behavior in both North and Southern Africa rules out any simple direct link for the expansion of L3 to (Ambrose 1998; Watson et al. 1997). Evidence of engraved ochre now extends back to at least 100 ka (Henshilwood et al. 2009), Nassarius marine shell beads were evidently present across the range of early modern humans from Southern Africa to North Africa and the Levant before 80 ka – possibly tens of thousands of years earlier (Barton et al. 2009; Bouzouggar et al. 2007; d'Errico et al. 2009; Mellars 2006; Vanhaeren et al. 2006) – and evidence for burial ritual is found in early modern humans in the Levant dating to 90–110 ka (Mellars 2006; Shea 2008). Thus, as suggested by Basell (2008) the demographic expansions that led to the first successful dispersal out of Africa seem better explained by the play of palaeoenvironmental forces than by recourse to the advantages of “modernity”.
The absence of markers of behavioral modernity in East Africa at the crucial time seems puzzling. Climate may have caused Out-of-East-Africa, but why would Out-of-East-Africans without clear signs of behavioral modernity be able to outcompete the "behaviorally modern" people of North/South Africa and the Levant? This observation, coupled with the absence of any clear identifiable palaeoanthropological population in East Africa at the time in question raises my unease about this scenario.

Moreover, while we can definitely ascribe symbolic thinking to the cases mentioned in the quoted text, these may represent precursors, and not the full "package" of behaviors that allowed (or even prompted) our ancestors to spread around the planet around the middle of the last 100,000 years.

Mol Biol Evol (2011) doi: 10.1093/molbev/msr245

The expansion of mtDNA haplogroup L3 within and out of Africa

Pedro Soares et al.

Although fossil remains show that anatomically modern humans dispersed out of Africa into the Near East ∼100–130 ka, genetic evidence from extant populations has suggested that non-Africans descend primarily from a single successful later migration. Within the human mtDNA tree, haplogroup L3 encompasses not only many sub-Saharan Africans but also all ancient non-African lineages, and its age therefore provides an upper bound for the dispersal out of Africa. An analysis of 369 complete African L3 sequences places this maximum at ∼70 ka, virtually ruling out a successful exit before 74 ka, the date of the Toba volcanic super-eruption in Sumatra. The similarity of the age of L3 to its two non-African daughter haplogroups, M and N, suggests that the same process was likely responsible for both the L3 expansion in Eastern Africa and the dispersal of a small group of modern humans out of Africa to settle the rest of the world. The timing of the expansion of L3 suggests a link to improved climatic conditions after ∼70 ka in Eastern and Central Africa, rather than to symbolically mediated behavior, which evidently arose considerably earlier. The L3 mtDNA pool within Africa suggests a migration from Eastern Africa to Central Africa ∼60–35 ka, and major migrations in the immediate postglacial, again linked to climate. The largest population size increase seen in the L3 data is 3–4 ka in Central Africa, corresponding to Bantu expansions, leading diverse L3 lineages to spread into Eastern and Southern Africa in the last 3–2 ka.

Link