bioRxiv doi: 10.1101/002238
Fast Principal Component Analysis of Large-Scale Genome-Wide Data
Gad Abraham, Michael Inouye
ABSTRACT
Principal component analysis (PCA) is routinely used to analyze genome-wide single-nucleotide polymorphism (SNP) data, for detecting population structure and potential outliers. However, the size of SNP datasets has increased immensely in recent years and PCA of large datasets has become a time consuming task. We have developed flashpca, a highly efficient PCA implementation based on randomized algorithms, which delivers identical accuracy compared with existing tools in substantially less time. We demonstrate the utility of flashpca on both HapMap3 and on a large Immunochip dataset. For the latter, flashpca performed PCA of 15,000 individuals up to 125 times faster than existing tools, with identical results, and PCA of 150,000 individuals using flashpca completed in 4 hours. The increasing size of SNP datasets will make tools such as flashpca essential as traditional approaches will not adequately scale. This approach will also help to scale other applications that leverage PCA or eigen-decomposition to substantially larger datasets.
Link
January 31, 2014
January 30, 2014
Resurrecting Neandertal Lineages (Vernot and Akey 2014)
Coinciding with the publication of Sankararaman et al. (2014) in Nature, another paper on Neandertal ancestry in modern humans appeared in Science. From the paper:
On average, we found 23 Mb of introgressed sequence per individual (Fig. 1F), with East Asian individuals inheriting 21% more Neandertal sequence than Europeans. Within subpopulations, we found small but statistically significant variation in the amount of introgressed sequence among Europeans (Kruskal-Wallis rank sum test; p-value = 4.2 × 10^−12), but not among East Asians (p-value = 0.43).
and:
Consistent with recent inferences (5, 9), observed patterns of introgression were incompatible with a one pulse model (Fig. 3B), suggesting that gene flow between Neandertals and humans occurred multiple times. Although we varied many parameters of each model (10) (fig. S14), only the ratio of ancestral effective population size between East Asians and Europeans (NeASN/NeEUR) and the relative amount of introgression between the second and first pulse (m2/m1) had appreciable effects on model fit (Fig. 3B). We estimate that NeASN/NeEUR is 1.29 (95% CI of 1.15-1.57), and that East Asians received an additional 20.2% (95% CI of 13.4%-27.1%) more Neandertal sequence in the second pulse (10).
Science DOI: 10.1126/science.1245938
Resurrecting Surviving Neandertal Lineages from Modern Human Genomes
Benjamin Vernot, Joshua M. Akey
Anatomically modern humans overlapped and mated with Neandertals such that non-African humans inherit ~1-3% of their genomes from Neandertal ancestors. We identified Neandertal lineages that persist in the DNA of modern humans, in whole-genome sequences from 379 European and 286 East Asian individuals, recovering over 15 Gb of introgressed sequence that spans ~20% of the Neandertal genome (FDR = 5%). Analyses of surviving archaic lineages suggests that there were fitness costs to hybridization, admixture occurred both before and subsequent to divergence of non-African modern humans, and Neandertals were a source of adaptive variation for loci involved in skin phenotypes. Our results provide a new avenue for paleogenomics studies, allowing substantial amounts of population-level DNA sequence information to be obtained from extinct groups even in the absence of fossilized remains.
Link
January 29, 2014
Neandertal admixture in modern humans: some of it adaptive, some selected-against (Sankararaman et al. 2014)
From the paper, a passage that should interest those who argue over whether they are 0.1 or 0.2% more or less Neandertal than others based on commercial testing results:
Fourth, the standard deviation in Neanderthal ancestry among individuals from within the same population is 0.06–0.10%, in line with theoretical expectation (Supplementary Information section 3), showing that Neanderthal ancestry calculators that estimate differences on the order of a per cent18 are largely inferring statistical noise.
Also of interest is a result showing that, while overall Neandertal ancestry in Eurasians is low (1+%), this average includes regions where it is much higher, and indeed the majority:
The Neanderthal introgression map reveals locations where Neanderthal ancestry is inferred to be as high as 62% in east-Asian and 64% in European populations (Fig. 1b and Extended Data Fig. 2).
Finally:
We have shown that interbreeding of Neanderthals and modern humans introduced alleles onto the modern human genetic background that were not tolerated, which probably resulted in part from their contributing to male hybrid sterility. The resulting reduction in Neanderthal ancestry was quantitatively large: in the fifth of the genome with highest B, Neanderthal ancestry is 1.54 ± 0.15 times the genomewide average (Extended Data Table 4 and Supplementary Information section 9)22. If we assume that this subset of the genome was unaffected by selection, this implies that the proportion of Neanderthal ancestry shortly after introgression must have been >3% rather than the approximately 2% seen today.
One of the lingering questions about Neandertal admixture is why there are no Neandertal Y-chromosomes or mtDNA in modern Eurasians. The disappearance of Neandertal mtDNA seems unlikely according to one study, but might be explained if negative selection was at play.
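The ">3%" figure in the quoted passage is a simple scaling, sketched below in Python. The assumptions are mine, not the paper's: that the present-day average is exactly 2%, that the highest-B fifth of the genome preserves the pre-selection proportion, and that the ± 0.15 can be treated as a rough range. The function name is hypothetical.

```python
def implied_initial_ancestry(current_average, enrichment):
    """Scale today's genome-wide average by the enrichment observed in the
    fifth of the genome with highest B (assumed unaffected by selection)."""
    return current_average * enrichment


CURRENT_AVERAGE = 0.02  # "approximately 2% seen today", assumed exact here
ENRICHMENT = 1.54       # quoted ratio relative to the genome-wide average
UNCERTAINTY = 0.15      # quoted +/- value, treated as a rough range

central = implied_initial_ancestry(CURRENT_AVERAGE, ENRICHMENT)
low = implied_initial_ancestry(CURRENT_AVERAGE, ENRICHMENT - UNCERTAINTY)
high = implied_initial_ancestry(CURRENT_AVERAGE, ENRICHMENT + UNCERTAINTY)

print(f"central: {central:.2%}, rough range: {low:.2%} to {high:.2%}")
```

Under these assumptions the central value comes out at about 3.1%, which is where the ">3%" comes from; the lower end of the rough range falls slightly below 3%, so the figure is best read as an approximate estimate.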
A different question is whether hybrid sterility was actually noticed by modern humans/Neandertals during the period of admixture. Modern societies have historically frowned upon mixture between diverged sapiens populations, even though there is no evidence that the offspring of, say, an African and a European are biologically disadvantaged. But, in the case of sapiens-Neandertal crossings, the offspring would have been biologically disadvantaged, a fact that may have been noticed over the span of a few generations.
Regardless of the historical dynamics of the admixture process, some of the Neandertal genome proved itself useful in its new sapiens hosts, and while the process may have been painful for the people involved, evolution found a way to use at least some of the material introduced to our species by our Neandertal cousins.
Nature (2014) doi:10.1038/nature12961
The genomic landscape of Neanderthal ancestry in present-day humans
Sriram Sankararaman
Genomic studies have shown that Neanderthals interbred with modern humans, and that non-Africans today are the products of this mixture1, 2. The antiquity of Neanderthal gene flow into modern humans means that genomic regions that derive from Neanderthals in any one human today are usually less than a hundred kilobases in size. However, Neanderthal haplotypes are also distinctive enough that several studies have been able to detect Neanderthal ancestry at specific loci1, 3, 4, 5, 6, 7, 8. We systematically infer Neanderthal haplotypes in the genomes of 1,004 present-day humans9. Regions that harbour a high frequency of Neanderthal alleles are enriched for genes affecting keratin filaments, suggesting that Neanderthal alleles may have helped modern humans to adapt to non-African environments. We identify multiple Neanderthal-derived alleles that confer risk for disease, suggesting that Neanderthal alleles continue to shape human biology. An unexpected finding is that regions with reduced Neanderthal ancestry are enriched in genes, implying selection to remove genetic material derived from Neanderthals. Genes that are more highly expressed in testes than in any other tissue are especially reduced in Neanderthal ancestry, and there is an approximately fivefold reduction of Neanderthal ancestry on the X chromosome, which is known from studies of diverse species to be especially dense in male hybrid sterility genes10, 11, 12. These results suggest that part of the explanation for genomic regions of reduced Neanderthal ancestry is Neanderthal alleles that caused decreased fertility in males when moved to a modern human genetic background.
Link
Austronesian (~1/3) Bantu (~2/3) admixture in Madagascar
Illumina data from this study can be found here.
PNAS doi: 10.1073/pnas.1321860111
Genome-wide evidence of Austronesian–Bantu admixture and cultural reversion in a hunter-gatherer group of Madagascar
Denis Pierron et al.
Linguistic and cultural evidence suggest that Madagascar was the final point of two major dispersals of Austronesian- and Bantu-speaking populations. Today, the Mikea are described as the last-known Malagasy population reported to be still practicing a hunter-gatherer lifestyle. It is unclear, however, whether the Mikea descend from a remnant population that existed before the arrival of Austronesian and Bantu agriculturalists or whether it is only their lifestyle that separates them from the other contemporary populations of South Madagascar. To address these questions we have performed a genome-wide analysis of >700,000 SNP markers on 21 Mikea, 24 Vezo, and 24 Temoro individuals, together with 50 individuals from Bajo and Lebbo populations from Indonesia. Our analyses of these data in the context of data available from other Southeast Asian and African populations reveal that all three Malagasy populations are derived from the same admixture event involving Austronesian and Bantu sources. In contrast to the fact that most of the vocabulary of the Malagasy speakers is derived from the Barito group of the Austronesian language family, we observe that only one-third of their genetic ancestry is related to the populations of the Java-Kalimantan-Sulawesi area. Because no additional ancestry components distinctive for the Mikea were found, it is likely that they have adopted their hunter-gatherer way of life through cultural reversion, and selection signals suggest a genetic adaptation to their new lifestyle.
Link
January 26, 2014
Brown-skinned, blue-eyed, Y-haplogroup C-bearing European hunter-gatherer from Spain (Olalde et al. 2014)
There is nothing like a little ancient DNA weirdness to start off 2014, which promises to be as exciting as 2013 was.
The new study identifies La Brana 1 as ancestral at the SLC24A5 locus, at which virtually all Europeans are derived. This comes on the heels of the Loschbour preprint, which identified that sample from Luxembourg as also being ancestral. Taken together, it's now clear that hunter-gatherers from Mesolithic Western Europe were brown.
Curiously, it now seems that both Europe and India were (in part) inhabited by brown people and became lighter by a process of admixture + selection. The process went "all the way" in Europe, but a cline of pigmentation was sustained in India.
The other finding (not mentioned in the abstract) is that La Brana 1 belonged to Y-haplogroup C6! This is a low-frequency European clade of haplogroup C. So now, we have evidence that haplogroup C is not eastern Eurasian (as the presence of its subclades in Australia, India, East Asia, and the Americas might suggest), but a pan-Eurasian entity. It remains to be seen whether this C-in-Europe can be pushed further back in time, but finding it in Mesolithic Iberia reduces the chance that it's some random eastern Eurasian who made it to the outskirts of Europe recently.
Finally, La Brana 1 has derived alleles at loci associated with pathogen resistance. This might be important, because a common hypothesis is that Europeans developed this type of resistance during the Neolithic as they started interacting with the pathogens of domesticated species and started living in less-hygienic higher-density settlements.
Nature (2014) doi:10.1038/nature12960
Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European
Iñigo Olalde et al.
Ancient genomic sequences have started to reveal the origin and the demographic impact of farmers from the Neolithic period spreading into Europe1, 2, 3. The adoption of farming, stock breeding and sedentary societies during the Neolithic may have resulted in adaptive changes in genes associated with immunity and diet4. However, the limited data available from earlier hunter-gatherers preclude an understanding of the selective processes associated with this crucial transition to agriculture in recent human evolution. Here we sequence an approximately 7,000-year-old Mesolithic skeleton discovered at the La Braña-Arintero site in León, Spain, to retrieve a complete pre-agricultural European human genome. Analysis of this genome in the context of other ancient samples suggests the existence of a common ancient genomic signature across western and central Eurasia from the Upper Paleolithic to the Mesolithic. The La Braña individual carries ancestral alleles in several skin pigmentation genes, suggesting that the light skin of modern Europeans was not yet ubiquitous in Mesolithic times. Moreover, we provide evidence that a significant number of derived, putatively adaptive variants associated with pathogen resistance in modern Europeans were already present in this hunter-gatherer.
Link
January 22, 2014
Lactase persistence and natural selection (Sverrisdóttir et al. 2014)
The big question is: did the present-day high allele frequency in Europeans happen because of natural selection or because of admixture with a population that was already lactase persistent?
For example, the lactase persistence allele occurs at a non-trivial frequency in present-day inhabitants of the Americas, whereas it was zero there a few thousand years ago, with the culprit being post-1492 European colonization. The frequency change in the Americas didn't happen because of natural selection, but because a new population (Europeans) moved in.
If admixture with a lactase persistent population L is at play, then the question remains how L became lactase persistent in the first place. However, this transforms the problem from (a) seeking something in the European cultural or natural environment acting as an agent of selection, into (b) seeking something in the cultural/natural environment of population L. I don't know what L might be, but seeking a population with lots of cows is a good place to start...
Mol Biol Evol (2014) doi: 10.1093/molbev/msu049
Direct estimates of natural selection in Iberia indicate calcium absorption was not the only driver of lactase persistence in Europe
Oddný Ósk Sverrisdóttir et al.
Lactase persistence (LP) is a genetically determined trait whereby the enzyme lactase is expressed throughout adult life. Lactase is necessary for the digestion of lactose – the main carbohydrate in milk – and its production is down-regulated after the weaning period in most humans and all other mammals studied. Several sources of evidence indicate that LP has evolved independently, in different parts of the world over the last 10,000 years, and has been subject to strong natural selection in dairying populations. In Europeans LP is strongly associated with, and probably caused by, a single C to T mutation 13,910bp upstream of the lactase (LCT) gene (-13,910*T). Despite a considerable body of research, the reasons why LP should provide such a strong selective advantage remains poorly understood. In this study we examine one of the most widely cited hypotheses for selection on LP – that fresh milk consumption supplements the poor vitamin D and calcium status of northern Europe's early farmers (the calcium assimilation hypothesis). We do this by testing for natural selection on -13,910*T using ancient DNA data from the skeletal remains of eight late Neolithic Iberian individuals, whom we would not expect to have poor vitamin D and calcium status because of relatively high incident UVB-light levels. None of the 8 samples successfully typed in the study had the derived T-allele. In addition, we reanalyse published data from French Neolithic remains to both test for population continuity and further examine the evolution of LP in the region. Using simulations that accommodate genetic drift, natural selection, uncertainty in calibrated radiocarbon dates, and sampling error, we find that natural selection is still required to explain the observed increase in allele frequency. We conclude that the calcium assimilation hypothesis is insufficient to explain the spread of lactase persistence in Europe.
Link
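The abstract mentions simulations that accommodate genetic drift and natural selection. As a rough illustration of what such a simulation involves (not the authors' method), here is a minimal Wright-Fisher sketch in Python. The genic selection model, the function names, and all parameter values are assumptions chosen for illustration; the paper's simulations also handle radiocarbon date uncertainty and sampling error, which this sketch ignores.

```python
import random


def next_expected_freq(p, s):
    """Allele frequency after selection, before drift.

    Assumes genic selection: the allele has relative fitness 1 + s,
    the alternative allele has fitness 1.
    """
    return p * (1 + s) / (1 + p * s)


def simulate(p0, s, pop_size, generations, rng):
    """Return the final allele frequency of one Wright-Fisher replicate."""
    n_copies = 2 * pop_size  # diploid population
    p = p0
    for _ in range(generations):
        p_sel = next_expected_freq(p, s)
        # Drift: binomial sampling of the next generation's gene copies.
        count = sum(1 for _ in range(n_copies) if rng.random() < p_sel)
        p = count / n_copies
    return p


def mean_final_freq(p0, s, pop_size, generations, replicates, seed):
    """Average final frequency over independent replicates."""
    rng = random.Random(seed)
    runs = [simulate(p0, s, pop_size, generations, rng) for _ in range(replicates)]
    return sum(runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical values: a rare allele, with and without selection.
    drift_only = mean_final_freq(0.05, 0.0, 100, 100, 20, seed=1)
    with_selection = mean_final_freq(0.05, 0.1, 100, 100, 20, seed=1)
    print("drift only:", drift_only)
    print("with selection:", with_selection)
```

Comparing many replicates with s = 0 against replicates with s > 0 shows the kind of question such simulations answer: whether drift alone could plausibly carry an allele from near zero to its observed frequency in the time available.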
A00 is ~208ky old (Elhaik et al. 2014)
A new paper on Y chromosome haplogroup A00 brings its split time to around the time of the emergence of anatomical modernity (~208ky) rather than the much earlier date inferred in the original paper. The low mutation rate (used to derive the old date) was also criticized by Wilson Sayres in an arXiv preprint, while Scozzari et al. recently argued for an old Y chromosome phylogeny (and correspondingly low mutation rate).
I suspect that (i) a good fix on the Y chromosome rate by direct methods, and (ii) ancient DNA work might help resolve this controversy fully. (i) will help us estimate times more accurately, and (ii) might document the presence/absence of lineages at particular time points.
In any case, for the time being, we should doubt that A00 represents a non-sapiens introgression event, although the occurrence of the most basal Y chromosome lineage in a West African farmer population still remains a very interesting finding. It's still possible that a finer sieve might yet detect archaic (or late pre-modern) introgressing lineages in modern humans, but A00 doesn't appear to be one of them.
Interesting (and a first?), a Youtube clip by the lead author on the paper:
European Journal of Human Genetics advance online publication 22 January 2014; doi: 10.1038/ejhg.2013.303
The ‘extremely ancient’ chromosome that isn’t: a forensic bioinformatic investigation of Albert Perry’s X-degenerate portion of the Y chromosome
Eran Elhaik et al.
Mendez and colleagues reported the identification of a Y chromosome haplotype (the A00 lineage) that lies at the basal position of the Y chromosome phylogenetic tree. Incorporating this haplotype, the authors estimated the time to the most recent common ancestor (TMRCA) for the Y tree to be 338 000 years ago (95% CI=237 000–581 000). Such an extraordinarily early estimate contradicts all previous estimates in the literature and is over a 100 000 years older than the earliest fossils of anatomically modern humans. This estimate raises two astonishing possibilities, either the novel Y chromosome was inherited after ancestral humans interbred with another species, or anatomically modern Homo sapiens emerged earlier than previously estimated and quickly became subdivided into genetically differentiated subpopulations. We demonstrate that the TMRCA estimate was reached through inadequate statistical and analytical methods, each of which contributed to its inflation. We show that the authors ignored previously inferred Y-specific rates of substitution, incorrectly derived the Y-specific substitution rate from autosomal mutation rates, and compared unequal lengths of the novel Y chromosome with the previously recognized basal lineage. Our analysis indicates that the A00 lineage was derived from all the other lineages 208 300 (95% CI=163 900–260 200) years ago.
Link
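To see why the assumed substitution rate matters so much here, note that under a strict molecular clock a time estimate scales inversely with the rate. The Python sketch below shows that scaling and the combined factor separating the two published estimates. The function name and the example rates are hypothetical, and the abstract attributes the difference to several factors (including unequal sequence lengths), not to the rate alone.

```python
def rescale_tmrca(tmrca_years, old_rate, new_rate):
    """Rescale a TMRCA under a strict clock: time = divergence / rate,
    so the estimate changes by the factor old_rate / new_rate."""
    if old_rate <= 0 or new_rate <= 0:
        raise ValueError("rates must be positive")
    return tmrca_years * old_rate / new_rate


MENDEZ_TMRCA = 338_000  # years, original estimate quoted in the abstract
ELHAIK_TMRCA = 208_300  # years, revised estimate quoted in the abstract

# Combined effect of all the corrections, expressed as a single factor.
combined_factor = MENDEZ_TMRCA / ELHAIK_TMRCA
print(f"combined factor: {combined_factor:.2f}")

# Hypothetical illustration: a rate 1.5 times faster (in arbitrary units)
# shrinks the older estimate by the same factor.
print(rescale_tmrca(MENDEZ_TMRCA, old_rate=1.0, new_rate=1.5))
```

The combined factor is about 1.62, so the two estimates differ by roughly as much as a 60% change in the assumed rate would produce on its own.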
January 13, 2014
Paternal and maternal demographic histories (Lippold et al. 2014)
A new preprint on bioRxiv on the differing male and female demographic histories of humans.
Red=female, blue=male.
This is probably related to the new paper on selection on the Y chromosome which interprets reduced diversity as evidence for selection.
doi: 10.1101/001792
Human paternal and maternal demographic histories: insights from high-resolution Y chromosome and mtDNA sequences
Sebastian Lippold et al.
To investigate in detail the paternal and maternal demographic histories of humans, we obtained ~500 kb of non-recombining Y chromosome (NRY) sequences and complete mtDNA genome sequences from 623 males from 51 populations in the CEPH Human Genome Diversity Panel (HGDP). Our results: confirm the controversial assertion that genetic differences between human populations on a global scale are bigger for the NRY than for mtDNA; suggest very small ancestral effective population sizes (less than 100) for the out-of-Africa migration as well as for many human populations; and indicate that the ratio of female effective population size to male effective population size (Nf/Nm) has been greater than one throughout the history of modern humans, and has recently increased due to faster growth in Nf. However, we also find substantial differences in patterns of mtDNA vs. NRY variation in different regional groups; thus, global patterns of variation are not necessarily representative of specific geographic regions.
Link
January 10, 2014
Natural selection on human Y chromosomes
I am not quite sure what the analyses of this paper actually show. Neither migration nor admixture is mentioned in the text, and, in my opinion, these processes have shaped modern human Y chromosomal variation.
Migration may result in the expansion of a successful set of Y chromosome lineages, while admixture between divergent populations may inflate estimates of diversity in a population. The Complete Genomics data is dominated by two haplogroups representing very recent expansions in Europe and West Africa. Is the expansion of R-M269 in Europe an example of selection or large-scale replacement? What does it mean to model "Europeans" or "Africans" as single evolving populations, when both of these were likely formed over the last few thousand years by admixture of divergent pre-existing populations?
PLoS Genetics doi: 10.1371/journal.pgen.1004064
Natural Selection Reduced Diversity on Human Y Chromosomes
Melissa A. Wilson Sayres et al.
The human Y chromosome exhibits surprisingly low levels of genetic diversity. This could result from neutral processes if the effective population size of males is reduced relative to females due to a higher variance in the number of offspring from males than from females. Alternatively, selection acting on new mutations, and affecting linked neutral sites, could reduce variability on the Y chromosome. Here, using genome-wide analyses of X, Y, autosomal and mitochondrial DNA, in combination with extensive population genetic simulations, we show that low observed Y chromosome variability is not consistent with a purely neutral model. Instead, we show that models of purifying selection are consistent with observed Y diversity. Further, the number of sites estimated to be under purifying selection greatly exceeds the number of Y-linked coding sites, suggesting the importance of the highly repetitive ampliconic regions. While we show that purifying selection removing deleterious mutations can explain the low diversity on the Y chromosome, we cannot exclude the possibility that positive selection acting on beneficial mutations could have also reduced diversity in linked neutral regions, and may have contributed to lowering human Y chromosome diversity. Because the functional significance of the ampliconic regions is poorly understood, our findings should motivate future research in this area.
Link
January 09, 2014
SLC24A5 light skin pigmentation allele origin
From the paper:
Adjustment for undercounting is substantial, increasing the estimated age for the combined samples to 12.4 (95% confidence interval 7.6−19.2) kya. If mutation rates in recent humans are lower than predicted from the human-chimpanzee divergence (Scally and Durbin 2012), true ages will be even older. Our adjusted dates overlap those previously reported (Beleza et al. 2012) and are also consistent with the lower limit for the origin of A111T set by the finding that the Alpine “iceman” dated to 5.3 kya was homozygous for this variant (Keller et al. 2012).
Related:
- Europeans and South Asians share by descent SLC24A5 light skin allele
- When Eurasians got lighter skin
Taking the 12.4ky estimate and multiplying by two (for the slower autosomal mutation rate) yields an estimate of 25ky, so it seems that this allele did not accompany the earliest modern human colonists of West Eurasia but emerged in some region and spread from there. It will be interesting to see (through ancient DNA) by what processes of migration, admixture, and selection this transpired.
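The doubling above is simple enough to check in a few lines. This is only a sketch of the arithmetic: the factor of two is the rough adjustment used in the text, and applying the same factor to the confidence bounds is my own assumption for illustration, not something reported in the paper.

```python
# Adjusted age for the combined samples (kya) and its 95% CI, as quoted from the paper.
age_kya = 12.4
ci_kya = (7.6, 19.2)

# Rough factor for the slower autosomal mutation rate (the assumption made in the text).
RATE_FACTOR = 2.0


def rescale(value_kya, factor=RATE_FACTOR):
    """Scale an age estimate linearly by the given factor."""
    return value_kya * factor


scaled_age = rescale(age_kya)
# Assumption: the interval bounds scale by the same factor as the point estimate.
scaled_ci = tuple(rescale(bound) for bound in ci_kya)

print(f"{scaled_age:.1f} kya ({scaled_ci[0]:.1f}-{scaled_ci[1]:.1f})")
```

The point estimate comes out at 24.8 kya, which rounds to the 25ky quoted above.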
G3 doi: 10.1534/g3.113.007484
Molecular Phylogeography of a Human Autosomal Skin Color Locus Under Natural Selection
Victor A. Canfield et al.
Divergent natural selection caused by differences in solar exposure has resulted in distinctive variations in skin color between human populations. The derived light skin color allele of the SLC24A5 gene, A111T, predominates in populations of Western Eurasian ancestry. To gain insight into when and where this mutation arose, we defined common haplotypes in the genomic region around SLC24A5 across diverse human populations and deduced phylogenetic relationships between them. Virtually all chromosomes carrying the A111T allele share a single 78-kb haplotype that we call C11, indicating that all instances of this mutation in human populations share a common origin. The C11 haplotype was most likely created by a crossover between two haplotypes, followed by the A111T mutation. The two parental precursor haplotypes are found from East Asia to the Americas but are nearly absent in Africa. The distributions of C11 and its parental haplotypes make it most likely that these two last steps occurred between the Middle East and the Indian subcontinent, with the A111T mutation occurring after the split between the ancestors of Europeans and East Asians.
Link
January 08, 2014
6,500-year old tin bronze from Serbia
Antiquity Volume: 87 Number: 338 Page: 1030–1045
Tainted ores and the rise of tin bronzes in Eurasia, c. 6500 years ago
Miljana Radivojević et al.
The earliest tin bronze artefacts in Eurasia are generally believed to have appeared in the Near East in the early third millennium BC. Here we present tin bronze artefacts that occur far from the Near East, and in a significantly earlier period. Excavations at Pločnik, a Vinča culture site in Serbia, recovered a piece of tin bronze foil from an occupation layer dated to the mid fifth millennium BC. The discovery prompted a reassessment of 14 insufficiently contextualised early tin bronze artefacts from the Balkans. They too were found to derive from the smelting of copper-tin ores. These tin bronzes extend the record of bronze making by c. 1500 years, and challenge the conventional narrative of Eurasian metallurgical development.
Link
New chronology of Y chromosome phylogeny (Scozzari et al. 2013)
The authors infer a Y chromosome mutation rate of 0.64 x 10^-9 via the autosomal mutation rate. This lies between two other recently published rates and is lower than the only known direct measurement of this quantity.
The issue of mutation rate calibration also popped up in the A00 paper where two different ages for the A00 clade (209 vs. 338ky) were inferred using either the direct or autosomally-adjusted rate.
The Y chromosome mutation rate needs to be studied further; in any case, the relative age estimates should be useful regardless of this issue.
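To make the last point concrete, here is a minimal sketch assuming that an age estimate is simply sequence divergence divided by the mutation rate, so that ages scale inversely with whatever rate is chosen. The two A00 ages are the ones quoted above; the node names and node ages are hypothetical and serve only to show that relative ages do not change when the rate does.

```python
# A00 clade ages quoted above (thousands of years).
AGE_DIRECT_KY = 209.0      # inferred with the direct rate
AGE_AUTOSOMAL_KY = 338.0   # inferred with the autosomally-adjusted rate


def rescale_age(age_ky, rate_used, rate_new):
    """Re-express an age under a different rate, assuming age = divergence / rate."""
    return age_ky * rate_used / rate_new


# Ratio of the two rates (direct / autosomally-adjusted) implied by the two ages.
implied_ratio = AGE_AUTOSOMAL_KY / AGE_DIRECT_KY
print(f"implied rate ratio: {implied_ratio:.2f}")

# Hypothetical node ages (ky), chosen only to illustrate the invariance.
nodes = {"older_node": 200.0, "younger_node": 100.0}
rate_a, rate_b = 1.0, implied_ratio  # arbitrary units; only the ratio matters
rescaled = {name: rescale_age(age, rate_a, rate_b) for name, age in nodes.items()}

relative_before = nodes["younger_node"] / nodes["older_node"]
relative_after = rescaled["younger_node"] / rescaled["older_node"]
print(f"relative age before: {relative_before:.3f}, after: {relative_after:.3f}")
```

Under this assumption every node age is multiplied by the same constant, so the ratio between any two nodes is untouched; only the absolute calendar ages depend on the calibration.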
Genome Res. 2014 Jan 6. [Epub ahead of print]
An unbiased resource of novel SNP markers provides a new chronology for the human Y chromosome and reveals a deep phylogenetic structure in Africa.
Scozzari R et al.
Abstract
The phylogeography of the paternally-inherited MSY has been the subject of intense research. However, sequence diversity and the ages of the deepest nodes of the phylogeny remain largely unexplored due to the severely biased collection of SNPs available for study. We characterized 68 worldwide Y chromosomes by high-coverage next generation sequencing, including 18 deep-rooting ones, and identified 2,386 SNPs, 80% of which were novel. Many aspects of this pool of variants resembled the pattern observed among genome-wide de novo events, suggesting that in the MSY a large proportion of newly arisen alleles have survived in the phylogeny. Some degree of purifying selection emerged in the form of an excess of private missense variants. Our MSY tree recapitulated the previously known topology, but the relative lengths of major branches were drastically modified and the associated node ages were remarkably older. We found significantly different branch lengths when comparing the rare deep-rooted A1b African lineage with the rest of the tree. Our dating results and phylogeography led to the following main conclusions: 1) patrilineal lineages with ages approaching those of early AMH fossils survive today only in central-western Africa; 2) only a few evolutionarily successful MSY lineages survived between 160 and 115 kya; 3) an early exit out of Africa (before 70 kya), which fits recent western Asian archaeological evidence, should be considered. Our experimental design produced an unbiased resource of new MSY markers informative for the initial formation of the anatomically modern human gene pool, i.e. a period of our evolution which had been previously considered to be poorly accessible with paternally-inherited markers.
Link
January 05, 2014
Population size and the rate of evolution (Lanfear et al. 2014)
A useful treatment of a very general subject. From the paper:
For mutations on which natural selection can act (i.e., those with s ≠ 0, Box 2), the NeRR depends on the fitness effects of mutations (s, Figure 1). As Ne increases, natural selection becomes more effective at fixing advantageous mutations and removing deleterious mutations, but larger populations also produce more of both types of mutation. Theory suggests that as Ne increases the power of natural selection increases faster than the production of new mutations (see [5] for a recent review). This results in lower deleterious substitution rates as Ne increases (a negative NeRR, Figure 1B,D), and higher advantageous substitution rates as Ne increases (a positive NeRR, Figure 1A,C). However, these predictions can sometimes be altered when the simplifying assumptions of the underlying theory are not met.
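The direction of both effects can be reproduced with a small sketch based on Kimura's textbook diffusion approximation for the fixation probability of a new mutation. This is not necessarily the model behind the paper's Figure 1, and it rests on assumptions of my own: a diploid population whose effective size equals its census size n, genic selection with a constant coefficient s, and illustrative values of s = ±0.001.

```python
import math


def fixation_probability(s, n):
    """Kimura's approximation for a new mutation starting at frequency 1/(2n)."""
    if s == 0:
        return 1.0 / (2 * n)  # neutral limit
    # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for numerical accuracy.
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)


def relative_substitution_rate(s, n):
    """Substitution rate per unit mutation rate: 2n new mutations times P(fix)."""
    return 2 * n * fixation_probability(s, n)


sizes = [100, 1_000, 10_000]
advantageous = [relative_substitution_rate(0.001, n) for n in sizes]
deleterious = [relative_substitution_rate(-0.001, n) for n in sizes]

for n, adv, dele in zip(sizes, advantageous, deleterious):
    print(f"n={n:>6}  advantageous={adv:.3g}  deleterious={dele:.3g}")
```

With these values the advantageous rate climbs with population size while the deleterious rate falls towards zero, and a neutral mutation (s = 0) gives a relative rate of exactly one regardless of n.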
Trends in Ecology & Evolution Volume 29, Issue 1, January 2014, Pages 33–41
Population size and the rate of evolution
Robert Lanfear et al.
Does evolution proceed faster in larger or smaller populations? The relationship between effective population size (Ne) and the rate of evolution has consequences for our ability to understand and interpret genomic variation, and is central to many aspects of evolution and ecology. Many factors affect the relationship between Ne and the rate of evolution, and recent theoretical and empirical studies have shown some surprising and sometimes counterintuitive results. Some mechanisms tend to make the relationship positive, others negative, and they can act simultaneously. The relationship also depends on whether one is interested in the rate of neutral, adaptive, or deleterious evolution. Here, we synthesize theoretical and empirical approaches to understanding the relationship and highlight areas that remain poorly understood.
Link
January 01, 2014
Happy New Year 2014
What's on your wish list for the new year in the world of anthropology and human genetics?
Here's my #1 item:
Any ancient African DNA.
The study of prehistoric Eurasians has revealed that modern populations are not simply descended from the people who lived in the same areas even a few thousand years ago.
The people best preserving the genetic legacy of central European Neolithic farmers can be found on the island of Sardinia; of west European hunter-gatherers on the shores of the Baltic; of Upper Paleolithic Siberians in the jungles of the Amazon; of Middle Paleolithic Siberians in Papua and Australia.
And yet, the model for Africa largely remains one of continuity across two hundred thousand years, since the emergence of anatomically modern humans in eastern Africa.
There have been hints that this isn't the case; the study of modern populations has revealed evidence for both archaic African, and -more recently and surprisingly- even a little archaic Eurasian ancestry in virtually all Sub-Saharan Africans. Populations from one of the presumed cradles of H. sapiens (Eastern Africa) are now conclusively known to be recent mixtures involving West Eurasians, and even the Bushmen of southern Africa, the subject of so many TV documentaries as an exemplum of the ur-Humans, did not escape this admixture.
Paleoanthropology also hints that some of the people who lived in sub-Saharan Africa well into the Lower Stone Age may have been quite divergent, and so do modern human Y-chromosomes.
So the €1,000,000 question is: who lived in Africa 5 or 10 or 50 or 100 thousand years ago?