Dienekes’ Anthropology Blog: Evolution


September 04, 2014

Everything you ever wanted to know about mutation rate in humans

Annual Review of Genomics and Human Genetics Vol. 15: 47-70 (Volume publication date August 2014)

Determinants of Mutation Rate Variation in the Human Germline

Laure Ségurel, Minyoung J. Wyman, and Molly Przeworski

Because germline mutations are the source of all evolutionary adaptations and heritable diseases, characterizing their properties and the rate at which they arise across individuals is of fundamental importance for human genetics. After decades during which estimates were based on indirect approaches, notably on inferences from evolutionary patterns, it is now feasible to count de novo mutations in transmissions from parents to offspring. Surprisingly, this direct approach yields a mutation rate that is twofold lower than previous estimates, calling into question our understanding of the chronology of human evolution and raising the possibility that mutation rates have evolved relatively rapidly. Here, we bring together insights from studies of human genetics and molecular evolution, focusing on where they conflict and what the discrepancies tell us about important open questions. We begin by outlining various methods for studying the properties of mutations in humans. We review what we have learned from their applications about genomic factors that influence mutation rates and the effects of sex, age, and other sources of interindividual variation. We then consider the mutation rate as a product of evolution and discuss how and why it may have changed over time in primates.

Link

June 15, 2014

Chimp mutation rate is equal to human mutation rate but driven more by males

This is important because (a) it shows evidence for the "slow" mutation rate in a species related to humans, (b) it shows that chimp and human mutation rates are equal and so using the human mutation rate in studies of divergence with chimps is justified, and (c) it is driven differently by males/females than in humans.

Science 13 June 2014: Vol. 344 no. 6189 pp. 1272-1275
DOI: 10.1126/science.344.6189.1272

Strong male bias drives germline mutation in chimpanzees

Oliver Venn et al.

ABSTRACT

Germline mutation determines rates of molecular evolution, genetic diversity, and fitness load. In humans, the average point mutation rate is 1.2 × 10^-8 per base pair per generation, with every additional year of father’s age contributing two mutations across the genome and males contributing three to four times as many mutations as females. To assess whether such patterns are shared with our closest living relatives, we sequenced the genomes of a nine-member pedigree of Western chimpanzees, Pan troglodytes verus. Our results indicate a mutation rate of 1.2 × 10^-8 per base pair per generation, but a male contribution seven to eight times that of females and a paternal age effect of three mutations per year of father’s age. Thus, mutation rates and patterns differ between closely related species.
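To make the male-bias figures in the abstract easier to compare, here is a small sketch that converts a male-to-female mutation ratio into the share of new mutations that come from the father. The function name is hypothetical, and the only assumption is that every de novo mutation is attributed to exactly one parent.

```python
def paternal_fraction(male_to_female_ratio):
    """Share of de novo mutations of paternal origin, given the
    ratio of male to female contributions (an illustrative helper)."""
    if male_to_female_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return male_to_female_ratio / (1.0 + male_to_female_ratio)


# Ratios quoted in the abstract: humans 3-4x, chimpanzees 7-8x.
for label, ratio in [("human low", 3), ("human high", 4),
                     ("chimp low", 7), ("chimp high", 8)]:
    print(f"{label}: {paternal_fraction(ratio):.3f}")
```

Under this reading, a three- to fourfold bias corresponds to 75-80% paternal origin, and a seven- to eightfold bias to roughly 88-89%.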

Link

February 21, 2014

Evolution equally efficient in removing deleterious variants in Europeans and West Africans

...but apparently not in Denisovans who accumulated deleterious mutations at a higher rate than modern humans. This may account for the fact that we haven't been able to find many Denisovans in the archaeological record as they may simply be a population that "failed" -- although apparently some distant relatives of the single Denisovan genome did admix into Australasians.

arXiv:1402.4896 [q-bio.PE]

No evidence that natural selection has been less effective at removing deleterious mutations in Europeans than in West Africans

Ron Do et al.

Non-African populations have experienced major bottlenecks in the time since their split from West Africans, which has led to the hypothesis that natural selection to remove weakly deleterious mutations may have been less effective in non-Africans. To directly test this hypothesis, we measure the per-genome accumulation of deleterious mutations across diverse humans. We fail to detect any significant differences, but find that archaic Denisovans accumulated non-synonymous mutations at a higher rate than modern humans, consistent with the longer separation time of modern and archaic humans. We also revisit the empirical patterns that have been interpreted as evidence for less effective removal of deleterious mutations in non-Africans than in West Africans, and show they are not driven by differences in selection after population separation, but by neutral evolution.

Link

January 05, 2014

Population size and the rate of evolution (Lanfear et al. 2014)

A useful treatment of a very general subject. From the paper:

For mutations on which natural selection can act (i.e., those with s ≠ 0, Box 2), the NeRR depends on the fitness effects of mutations (s, Figure 1). As Ne increases, natural selection becomes more effective at fixing advantageous mutations and removing deleterious mutations, but larger populations also produce more of both types of mutation. Theory suggests that as Ne increases the power of natural selection increases faster than the production of new mutations (see [5] for a recent review). This results in lower deleterious substitution rates as Ne increases (a negative NeRR, Figure 1B,D), and higher advantageous substitution rates as Ne increases (a positive NeRR, Figure 1A,C). However, these predictions can sometimes be altered when the simplifying assumptions of the underlying theory are not met.
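
The qualitative pattern in this passage can be sketched with Kimura's diffusion approximation for the fixation probability of a new mutation. This is only an illustration, not the calculation used in the paper: it assumes a diploid population whose census size equals Ne, a new mutation starting at frequency 1/(2Ne), and genic selection with coefficient s. The function names are hypothetical.

```python
import math


def fixation_probability(s, n_e):
    """Kimura's approximation for a new mutation at frequency 1/(2*Ne),
    assuming census size equals Ne (an assumption of this sketch)."""
    if s == 0:
        return 1.0 / (2 * n_e)
    # expm1 keeps precision when s is very small.
    return math.expm1(-2 * s) / math.expm1(-4 * n_e * s)


def relative_substitution_rate(s, n_e):
    """Substitution rate divided by the neutral rate: 2*Ne * P_fix(s)."""
    return 2 * n_e * fixation_probability(s, n_e)


for n_e in (1_000, 10_000, 50_000):
    deleterious = relative_substitution_rate(-1e-4, n_e)
    advantageous = relative_substitution_rate(1e-4, n_e)
    print(n_e, f"{deleterious:.4f}", f"{advantageous:.2f}")
```

With these inputs the deleterious rate falls and the advantageous rate rises as Ne grows, matching the negative and positive NeRR described in the quote. Very large values of Ne times |s| would overflow the exponential, so this sketch is meant for modest parameter ranges only.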

Trends in Ecology & Evolution Volume 29, Issue 1, January 2014, Pages 33–41

Population size and the rate of evolution

Robert Lanfear et al.

Does evolution proceed faster in larger or smaller populations? The relationship between effective population size (Ne) and the rate of evolution has consequences for our ability to understand and interpret genomic variation, and is central to many aspects of evolution and ecology. Many factors affect the relationship between Ne and the rate of evolution, and recent theoretical and empirical studies have shown some surprising and sometimes counterintuitive results. Some mechanisms tend to make the relationship positive, others negative, and they can act simultaneously. The relationship also depends on whether one is interested in the rate of neutral, adaptive, or deleterious evolution. Here, we synthesize theoretical and empirical approaches to understanding the relationship and highlight areas that remain poorly understood.

Link

December 27, 2013

Site frequency spectrum from reads is unbiased (from genotype calls, biased at low coverage)

Mol Biol Evol (2013) doi: 10.1093/molbev/mst229

Characterizing Bias in Population Genetic Inferences from Low-Coverage Sequencing Data

Eunjung Han et al.

The site frequency spectrum (SFS) is of primary interest in population genetic studies, because the SFS compresses variation data into a simple summary from which many population genetic inferences can proceed. However, inferring the SFS from sequencing data is challenging because genotype calls from sequencing data are often inaccurate due to high error rates and if not accounted for, this genotype uncertainty can lead to serious bias in downstream analysis based on the inferred SFS. Here, we compare two approaches to estimate the SFS from sequencing data: one approach infers individual genotypes from aligned sequencing reads and then estimates the SFS based on the inferred genotypes (call-based approach) and the other approach directly estimates the SFS from aligned sequencing reads by maximum likelihood (direct estimation approach). We find that the SFS estimated by the direct estimation approach is unbiased even at low coverage, whereas the SFS by the call-based approach becomes biased as coverage decreases. The direction of the bias in the call-based approach depends on the pipeline to infer genotypes. Estimating genotypes by pooling individuals in a sample (multisample calling) results in underestimation of the number of rare variants, whereas estimating genotypes in each individual and merging them later (single-sample calling) leads to overestimation of rare variants. We characterize the impact of these biases on downstream analyses, such as demographic parameter estimation and genome-wide selection scans. Our work highlights that depending on the pipeline used to infer the SFS, one can reach different conclusions in population genetic inference with the same data set. Thus, careful attention to the analysis pipeline and SFS estimation procedures is vital for population genetic inferences.
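
For readers unfamiliar with the SFS, here is a minimal sketch of the call-based approach described in the abstract: genotypes are taken as given, and the spectrum is simply a histogram of derived-allele counts across sites. The input layout and function name are assumptions for illustration. The direct estimation approach (maximum likelihood from reads) is not shown.

```python
from collections import Counter


def sfs_from_genotypes(genotypes):
    """Call-based (unfolded) site frequency spectrum.

    genotypes: list of sites; each site is a list with one entry per
    diploid individual giving its derived-allele count (0, 1 or 2).
    Returns a list where entry k is the number of sites carrying
    exactly k derived alleles, for k = 0 .. 2n.
    """
    if not genotypes:
        return []
    n = len(genotypes[0])
    counts = Counter(sum(site) for site in genotypes)
    return [counts.get(k, 0) for k in range(2 * n + 1)]


# Toy data: 4 sites, 3 individuals.
calls = [[0, 1, 0], [1, 1, 0], [2, 2, 2], [0, 0, 1]]
print(sfs_from_genotypes(calls))
```

Because every genotype call is treated as certain here, any calling error at low coverage feeds straight into the histogram, which is the kind of bias the paper characterizes.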

Link

November 06, 2013

MEGA6 evolutionary genetics software released

Mol Biol Evol (2013) doi: 10.1093/molbev/mst197

MEGA6: Molecular Evolutionary Genetics Analysis version 6.0

Koichiro Tamura et al.

We announce the release of an advanced version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which currently contains facilities for building sequence alignments, inferring phylogenetic histories, and conducting molecular evolutionary analysis. In version 6.0, MEGA now enables the inference of timetrees, as it implements the RelTime method for estimating divergence times for all branching points in a phylogeny. A new Timetree Wizard in MEGA6 facilitates this timetree inference by providing a graphical user interface (GUI) to specify the phylogeny and calibration constraints step-by-step. This version also contains enhanced algorithms to search for the optimal trees under evolutionary criteria and implements a more advanced memory management that can double the size of sequence data sets to which MEGA can be applied. Both GUI and command-line versions of MEGA6 can be downloaded from www.megasoftware.net free of charge.

Link

June 26, 2013

Throwing erectus

Nature 498, 483–486 (27 June 2013) doi:10.1038/nature12267

Elastic energy storage in the shoulder and the evolution of high-speed throwing in Homo

Neil T. Roach et al.

Some primates, including chimpanzees, throw objects occasionally [1, 2], but only humans regularly throw projectiles with high speed and accuracy. Darwin noted that the unique throwing abilities of humans, which were made possible when bipedalism emancipated the arms, enabled foragers to hunt effectively using projectiles [3]. However, there has been little consideration of the evolution of throwing in the years since Darwin made his observations, in part because of a lack of evidence of when, how and why hominins evolved the ability to generate high-speed throws [4, 5, 6, 7, 8]. Here we use experimental studies of humans throwing projectiles to show that our throwing capabilities largely result from several derived anatomical features that enable elastic energy storage and release at the shoulder. These features first appear together approximately 2 million years ago in the species Homo erectus. Taking into consideration archaeological evidence suggesting that hunting activity intensified around this time [9], we conclude that selection for throwing as a means to hunt probably had an important role in the evolution of the genus Homo.

Link

November 17, 2012

Population histories with a diffusion process formulation

On the left you can see the best topology on a diffusion time scale. It might be interesting that CEU appear closer to Africans than JPT do, and that YRI appear closer to Eurasians than BIA (Biaka Pygmies) do.

Mol Biol Evol (2012) doi: 10.1093/molbev/mss257

Inferring population histories using genome-wide allele frequency data

Mathieu Gautier and Renaud Vitalis

The recent development of high throughput genotyping technologies has revolutionized the collection of data in a wide range of both model and non-model species. These data generally contain huge amounts of information about the past demographic history of populations.

In this study we introduce a new method to estimate divergence times on a diffusion time-scale from large SNP datasets, conditionally on a population history which is represented as a tree. We further assume that all the observed polymorphisms originate from the most ancestral (root) population, i.e. we neglect mutations that occur after the split of the most ancestral population. This method relies on a hierarchical-Bayesian model, based on Kimura's time-dependent diffusion approximation of genetic drift. We implemented a Metropolis–Hastings within Gibbs sampler to estimate the posterior distribution of the parameters of interest in this model, which we refer to as the Kimura model. Evaluating the Kimura model on simulated population histories, we found that it provides accurate estimates of divergence time. Assessing model fit using the deviance information criterion (DIC) proved efficient for retrieving the correct tree topology among a set of competing histories. We show that this procedure is robust to low-to-moderate gene flow, as well as to ascertainment bias, providing that the most distantly related populations are represented in the discovery panel. As an illustrative example, we finally analyzed published human data consisting in genotypes for 452,198 SNPs from individuals belonging to four populations worldwide.

Our results suggest that the Kimura model may be helpful to characterize the demographic history of differentiated populations, using genome-wide allele frequency data.
Link

September 20, 2012

Selection at FADS gene cluster in Africa

A couple of quick comments. First, from the paper:

Given 207 fixed differences between chimpanzee and human in this region, we estimate a TMRCA of 1.49 (SEM = 0.23) million years for the human haplotypes. Similarly, only considering the number of mutations within the haplotype group D1, the TMRCA was 85,000±84,000 years, thus suggesting that selection in Africa occurred approximately 85 kya.

...

Jointly, these two sets of data support the hypothesis that advantageous mutations within the FADS gene cluster occurred prior to human migration out of Africa (~85 kya), and swept to fixation within African but not European or Asian populations.

I don't know how 85+/-84 can be used to infer that this selection occurred prior to the human migration Out-of-Africa. It seems to me that 85+/-84 is compatible with a wide variety of events. On top of this, the date depends on the assumed human-chimp split (=6.5Ma), for which estimates of 7-13Ma, but also 3.7-6.6Ma, have recently been advanced. So, I would say that the uncertainty about when this selection took place leaves little room for pronouncements that it took place either before or after Out-of-Africa.
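
To put rough numbers on this, here is a minimal sketch of how the 85 kya point estimate would shift under the alternative split dates mentioned above. It assumes the date scales linearly with the human-chimp calibration (a strict-clock simplification that is my assumption, not something stated in the paper), and the function name is hypothetical.

```python
def rescale_date(date_kya, assumed_split_ma, new_split_ma):
    """Rescale a date that was calibrated on one human-chimp split time
    to another split time, assuming simple proportionality."""
    return date_kya * new_split_ma / assumed_split_ma


PAPER_DATE_KYA = 85.0   # point estimate quoted from the paper
PAPER_SPLIT_MA = 6.5    # calibration used in the paper

for split_ma in (3.7, 6.6, 7.0, 13.0):
    shifted = rescale_date(PAPER_DATE_KYA, PAPER_SPLIT_MA, split_ma)
    print(f"split {split_ma} Ma -> {shifted:.0f} kya")

# The quoted +/-84 ky error alone spans this range at the 6.5 Ma calibration:
print(PAPER_DATE_KYA - 84, "to", PAPER_DATE_KYA + 84, "kya")
```

Under proportional scaling, the point estimate alone moves between roughly 48 and 170 kya across these calibrations, before the quoted error is even considered.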

Look at the following frequency map:

I sometimes think that intellectual commitment to recent Out-of-Africa-and-never-back is so strong, that the obvious explanation is overlooked. When Africans are polymorphic and Eurasians are not, this is explained as the result of the OoA bottleneck. When Africans are monomorphic and Eurasians are not -as in this paper- this is explained as the result of selection-to-fixation in Africans.

Now, I don't doubt that there was selection in Africans at the FADS gene cluster. But, rather than imagine that African humans were "tethered to marine sources for LC-PUFAs in isolated geographic regions" throughout the ecologically diverse and geographically huge continent of Africa, I can simply imagine that there was a spread of the adaptive haplotype into Africa, followed by selection, as modern humans, who originated, perhaps, in North Africa, had to drastically shift their diet as they entered Sub-Saharan Africa. I am not convinced that this is what happened, but it is certainly worthy of consideration.

PLoS ONE 7(9): e44926. doi:10.1371/journal.pone.0044926

Adaptive Evolution of the FADS Gene Cluster within Africa

Rasika A. Mathias et al.

Long chain polyunsaturated fatty acids (LC-PUFAs) are essential for brain structure, development, and function, and adequate dietary quantities of LC-PUFAs are thought to have been necessary for both brain expansion and the increase in brain complexity observed during modern human evolution. Previous studies conducted in largely European populations suggest that humans have limited capacity to synthesize brain LC-PUFAs such as docosahexaenoic acid (DHA) from plant-based medium chain (MC) PUFAs due to limited desaturase activity. Population-based differences in LC-PUFA levels and their product-to-substrate ratios can, in part, be explained by polymorphisms in the fatty acid desaturase (FADS) gene cluster, which have been associated with increased conversion of MC-PUFAs to LC-PUFAs. Here, we show evidence that these high efficiency converter alleles in the FADS gene cluster were likely driven to near fixation in African populations by positive selection ~85 kya. We hypothesize that selection at FADS variants, which increase LC-PUFA synthesis from plant-based MC-PUFAs, played an important role in allowing African populations obligatorily tethered to marine sources for LC-PUFAs in isolated geographic regions, to rapidly expand throughout the African continent 60–80 kya.

Link

September 12, 2012

Why We Prevailed: Evolution and the Battle for Dominance

I haven't watched it yet, but the list of participants is interesting. From the description:

We once shared the planet with Neanderthals and other human species. Some of our relatives may have had tools, language and culture. Why did we thrive while they perished? Join evolutionary biologists, geneticists and anthropologists as they share profound insights about the origin of man and retrace our singular journey from fledgling prototype to the most dominant species on Earth.

I'll update this entry if I notice anything novel said in the program.

September 06, 2012

ASHG 2012 abstracts are online!

There is so much good stuff there. This year I decided against posting the full abstracts, so I'll just link to a few, adding a few sentences on why they strike me as interesting. And, since there are so many interesting ones, I'll keep updating this entry.

On the Sardinian ancestry of the Tyrolean Iceman confirms that modern Sardinians are most similar to both the Tyrolean Iceman and the Swedish Neolithic TRB individual (presumably Gok4). You can find my analysis of both in the archives of the blog. But, look here:

Strikingly, an analysis including novel ancient DNA data from an early Iron Age individual from Bulgaria also shows the strongest affinity of this individual with modern-day Sardinians. Our results show that the Tyrolean Iceman was not a recent migrant from Sardinia, but rather that among contemporary Europeans, Sardinians represent the population most closely related to populations present in the Southern Alpine region around 5000 years ago. The genetic affinity of ancient DNA samples from distant parts of Europe with Sardinians also suggests that this genetic signature was much more widespread across Europe during the Bronze Age.

As you may have guessed, I can't wait to get my hands on that Iron Age Thracian. His similarity with Sardinians is striking, because by the Iron Age, I would have thought that something akin to the modern genetic landscape would have begun to crystallize in Europe.

Y Chromosome J Haplogroups trace post glacial period expansion from Turkey and Caucasus into the Middle East confirms what I have argued about, i.e., that the West Asian highlands are responsible for the spread of haplogroup J, including, it seems into the Middle East itself. The chronology presented probably assumes the evolutionary mutation rate; also, the lack of haplogroup J in Europe pre-5ka argues for a late expansion. I am fairly convinced that out of this West Asian highlander population came the two dominant groups of West Eurasian prehistory, the Indo-Europeans and the Semites, their spread associated with a "metallurgical edge" in technology and social complexity during the Late Neolithic and Bronze Age. The latter probably picked their language from a T- or E-bearing population of the southern Levant (Ghassulians?), as these two haplogroups might link the Proto-Semites with their African Afroasiatic brethren.

Analytical inference of human demographic history using multiple individual genome sequences:

We estimate that Eurasian populations split from ancient Africans at 58,000-120,800 years ago, and the divergence time of Europeans and Asians occurred at 35,750-70,500 years ago.

This sounds reasonable, and the wide confidence intervals probably reflect current uncertainties about the mutation rate. The European/Asian split time intersects the UP and postdates the ~70ka turning point (Toba + Drying up of Arabia/Sahara). The African/European split time intersects the ~106ka Nubian complex in Arabia.

A genomewide map of Neandertal ancestry in modern humans:

We identify around 35,000 Neandertal-derived alleles in Europeans and 21,000 in East Asians.

This might seem superficially at odds with the recent finding of greater Neandertal ancestry in East Asians than Europeans, but remember that levels of Neandertal admixture depend on allele frequencies of introgressed variants, and East Asians are generally less polymorphic than Europeans.

Analysis of contributions of archaic genome and their functions in modern non-Africans

Totally, we identified 410,683 archaic segments in 909 non-African individuals with averaged segment length 83,460bp. In the genealogy of each archaic segment with Neanderthal, Denisovan, African and chimpanzee segments, 77~81% archaic segment coalesced first with Neanderthal, 4~8% coalesced first with Denisovan, and 14% coalesced first with neither, validating the algorithm. Interestingly, a large proportion of all the archaic segments identified shared 88.9% similarity with Neanderthal, suggesting a single major admixture with Neanderthal at 82~121kya, right after the Africa exodus of the ancestors of modern humans.

It will be interesting to see why these authors get a different date than Sankararaman et al. The mutation rate can't be at fault because these dates are mostly dependent on the recombination rate. My initial guess is that the lower rate of S. et al. may be due to limiting the analysis to alleles with MAF less than 0.1. As I said before, it is unclear whether LD-based signals of admixture with Neandertals can account for the totality of the D-statistics of Non-Africans vs. Africans.

Sequencing of an extended pedigree in Western chimpanzees is interesting for a variety of reasons, but for me the primary one is the inevitable use of this pedigree to fix the chimpanzee autosomal mutation rate, which has so far been assumed to be similar to the human one. On a similar topic, Estimating human mutation rate using autozygosity in a founder population comes up with 1.21 × 10^-8/bp/generation for humans, which is practically the same as that inferred for Iceland, and belongs to the class of slow mutation rates that have been inferred lately and which may reshape our understanding of events ranging from human-chimp speciation to the date of Out-of-Africa.

The genetic structure of Western Balkan populations based on autosomal and haploid markers

Comparison of the variation within autosomal and haploid data sets of studied Western Balkan populations revealed their genetic closeness regardless of a genetic system inspected, in particular among the Slavic speakers. Hence, culturally diverse Western Balkan populations are genetically very similar to each other. Only the Kosovars show slight differences both in the variance of autosomal and uniparentally inherited markers from the other populations of the region, possibly also due to their historically strict patrilineality. In a more general perspective, our results reveal clear genetic continuity between the Near Eastern and European populations, lending further credence to extensive, likely multiple and possibly bidirectional ancient gene flows between the Near East and Europe, cutting through the Balkans.

Asian Expansion of Modern Human out of Africa is not very eloquent, and I think may be missing a zero in one of its numbers, but the point being made (that the major Y-haplogroup E found in Africans is descended from Asian back-migrants) is something which I also think very likely, for reasons explained here.

Paleolithic human migrations in East Eurasia by sequencing Y chromosomes:

Paleolithic human migrations in East Eurasia remains largely unknown due to the lack of sufficient markers derived from the mutations that occurred during that time frame. To tackle this problem, using the sequence capturing, barcoding technology and next-generation sequencing, we identified more than 4,000 new SNPs encompassing most single copy non-recombining region of human Y chromosome. New clades for haplogroups O, C, N, D, and Q could be geographically located. Especially, a few star-like expansions were unveiled, showing strong population growth. The phylogeny of Haplogroup N was radically rearranged, and all the N individuals could now be categorized into either a northern clade N1 or southern clade N2, revealing a Paleolithic migratory routes of the ancestors of Uralic speaking populations. Haplogroup C, especially the East Eurasia-dominant clade C3, could also be separated into at least two ancient clades, suggesting Paleolithic migrations in East Asia. Three major clades under O, M117+, M134xM117, and 002611+, each could be now further classified into several subclades. With these new findings, we proposed the modified the routes and dates for human populations’ migration, especially those in Paleolithic time. A few Y-chromosomal expansions could now be linked to certain prehistoric cultures or ancestors of language families.

Inferring and sequencing the founding bottleneck of Ashkenazim

Applying this methodology to data from self-identified AJ samples, we show 85-90% of them belong to a genetic isolate related to other Mid-Eastern populations. This group has experienced an extreme bottleneck 30-35 generations ago, with subsequent expansion greatly exceeding the growth rate across all humans. Data are consistent with bottleneck size of merely 400 founders.

The remarkable landscape of polymorphism sharing in modern humans

Once again, I used the Complete Genomics data to study the problem of polymorphism sharing in different human populations. There are different sample sizes for the various populations, so I took random samples of 4 individuals/population. I extracted individuals from CEU, TSI, CHB, JPT, and YRI for chromosome 1, removing all polymorphisms if there were any missing alleles and with the --remove-indels flag of vcftools.

Here is the Venn diagram, showing how many sites were polymorphic in all populations, in only one of them, or in any combination thereof:

The total numbers of polymorphic sites in the different populations are:

CEU 346,417
TSI 352,039
CHB 326,398
JPT 325,071
YRI 498,536

This confirms the oft-mentioned observation that Africans are more genetically diverse than other populations. This is also confirmed by the existence of 222,255 polymorphisms that are unique to the YRI population.
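
The counting behind such a Venn diagram can be sketched as follows. This is not the script I used; it is a minimal illustration that assumes each population's polymorphic sites have already been extracted into a set of site identifiers (for instance, positions on chromosome 1). The function name and toy data are hypothetical.

```python
from collections import Counter


def sharing_pattern_counts(polymorphic_sites):
    """Count sites by the exact set of populations in which they vary.

    polymorphic_sites: dict mapping population name -> set of site ids
    that are polymorphic in that population's sample.
    Returns a Counter keyed by frozenset of population names.
    """
    membership = {}
    for population, sites in polymorphic_sites.items():
        for site in sites:
            membership.setdefault(site, set()).add(population)
    return Counter(frozenset(pops) for pops in membership.values())


# Toy example with made-up site ids.
toy = {"CEU": {1, 2, 3}, "CHB": {3}, "YRI": {2, 3, 4, 5}}
for pops, count in sharing_pattern_counts(toy).items():
    print(sorted(pops), count)
```

Each key corresponds to one region of the Venn diagram; the sites unique to a population are those whose key contains only that population.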

A few years ago, the pattern of polymorphism sharing would be interpreted in some way similar to this:

In the above tree model, the story of Homo sapiens proceeds in isolation, from our earliest ancestors, through a series of splits (the first of which is "Out-of-Africa"), all the way to modern populations. Such a story is pretty much necessary to explain the super-homogeneity of Eurasians vs. the super-diversity of Africans. If the human story is self-contained, then the only plausible explanation is the Great Out-of-Africa Bottleneck.

But, the human story is not self-contained. Multiple lines of evidence suggest that several demes within the species Homo heidelbergensis, and perhaps, even H. erectus, co-existed during the Middle Paleolithic in the Old World with people who were genetically and anatomically most similar to living humans. A new possibility emerges: that the heterogeneity of modern humans was created in part by admixture between such divergent populations.

It is a worthwhile exercise to study the behavior of SNP subsets vis a vis the two archaic hominins (Neandertal and Denisova) whose genomes we possess. In a recent experiment, I did just that. The finding that population-specific polymorphisms (e.g., variable sites that are specific to Europeans, East Asians, or Africans) show increased evidence of archaic admixture seems to suggest that the roots of regional populations of Homo sapiens may lie in very divergent regional Middle Paleolithic Homo populations interacting with a successful deme within anatomical Homo sapiens that emerged prior to 100 thousand years ago and concluded its conquest of the Old World about 50-60 thousand years later.

If this idea is correct, then it may turn out that the remarkable landscape of polymorphism sharing (and non-sharing) between modern human populations was not only shaped by what Homo sapiens did as he followed his own story. Thanks to technologies invented by some of his descendants, we are beginning to piece together the remarkable panorama of the prehistory of our species. But the picture will not be complete until we find a place in it for those mysterious Others, fragments of whose DNA persist in our cells.

September 05, 2012

Our baby-eating ancestors

Much of modern cannibalism revolves around the ritual consumption of one's adult enemies, but apparently this was not what was practiced by H. antecessor. The case for Neandertal cannibalism was advanced by White and Toth, but it's my impression that it's controversial. Since the Upper Paleolithic, modern humans have certainly devised numerous new ways of being cruel to one another, but it seems that cannibalism is not one of them. Certainly, our more recent ancestors have imbued the process with meaning and ritual, but it does seem that the act could very well be performed by a symbol-free culture.

Hominid Hunting has more.

J Hum Evol. 2012 Sep 1. [Epub ahead of print]

Intergroup cannibalism in the European Early Pleistocene: The range expansion and imbalance of power hypotheses.

Saladié P, Huguet R, Rodríguez-Hidalgo A, Cáceres I, Esteban-Nadal M, Arsuaga JL, Bermúdez de Castro JM, Carbonell E.

Abstract

In this paper, we compare cannibalism in chimpanzees, modern humans, and in archaeological cases with cannibalism inferred from evidence from the Early Pleistocene assemblage of level TD6 of Gran Dolina (Sierra de Atapuerca, Spain). The cannibalism documented in level TD6 mainly involves the consumption of infants and other immature individuals. The human induced modifications on Homo antecessor and deer remains suggest that butchering processes were similar for both taxa, and the remains were discarded on the living floor in the same way. This finding implies that a group of hominins that used the Gran Dolina cave periodically hunted and consumed individuals from another group. However, the age distribution of the cannibalized hominins in the TD6 assemblage is not consistent with that from other cases of exo-cannibalism by human/hominin groups. Instead, it is similar to the age profiles seen in cannibalism associated with intergroup aggression in chimpanzees. For this reason, we use an analogy with chimpanzees to propose that the TD6 hominins mounted low-risk attacks on members of other groups to defend access to resources within their own territories and to try and expand their territories at the expense of neighboring groups.

Link

September 03, 2012

Ancient DNA age/Mutation rate per annum, and signal of archaic admixture

UPDATE (8 Sep 2012): The following discussion in small type is now obsolete. See f-statistics are robust to differences in sample age for details.

The D-statistic takes the form:

D(H1, H2, Vindija, Chimp) = (sum of ABBA - sum of BABA) / (sum ABBA + sum BABA)

If A: chimp allele, and Vindija (the Neandertal source of the "Neandertal genome") has the derived B allele, then in sites where two individuals H1 and H2 differ, there are two possible patterns:

ABBA = H2 matches Neandertal, but H1 does not

BABA = H1 matches Neandertal, but H2 does not

If Neandertal did not contribute DNA more to H1 or to H2, then the rates at which ABBA and BABA occur are equal, and the D-statistic has an expected value of 0.
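As a minimal sketch of the definition above, the statistic can be computed by counting the two site patterns over four aligned sequences. The function name `d_statistic` and the toy sequences are hypothetical, chosen only for illustration; real analyses work on genome-wide genotype data and also estimate standard errors, which is not attempted here.

```python
def d_statistic(h1, h2, archaic, outgroup):
    """Return (D, n_abba, n_baba) for four aligned allele sequences.

    A is the outgroup (chimp) allele and B the derived allele carried
    by the archaic sample. Only sites where H1 and H2 differ and the
    archaic sample differs from the outgroup are informative.
    """
    abba = baba = 0
    for a1, a2, a3, a4 in zip(h1, h2, archaic, outgroup):
        if a3 == a4 or a1 == a2:
            continue  # uninformative site
        if a1 == a4 and a2 == a3:
            abba += 1  # H2 matches the archaic sample, H1 does not
        elif a1 == a3 and a2 == a4:
            baba += 1  # H1 matches the archaic sample, H2 does not
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / total, abba, baba


# Toy data (made up): six aligned sites.
h1 = "ACGTAC"
h2 = "GCTTAA"
vindija = "GCTAAC"
chimp = "ACGAGA"

d, n_abba, n_baba = d_statistic(h1, h2, vindija, chimp)
print(n_abba, n_baba, round(d, 3))  # 2 1 0.333
```

With equal ABBA and BABA counts the function returns 0, the expected value when the archaic sample is equally related to H1 and H2.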

Now, consider that H1 is a living human, and H2 is one that lived X years ago. ABBA and BABA are then no longer expected to be equal. Suppose that modern humans and Neandertals diverged Y years ago, and that Vindija is V years old. Then, H1 (the living human) is separated from Vindija by 2*Y-V years of evolution, but H2 (the ancient human) is separated by 2*Y-V-X years. It is now expected that H2 will match Neandertal more often than H1 does at any site, and, consequently, there will be an excess of ABBA over BABA, and a non-zero statistic.

Ancient genomes may thus appear to be archaic-admixed even if they are not, and the older they are, the more archaic-admixed they will appear to be.

There is a different complication that may arise from the fact that the mutation rate per annum may not be the same in different human populations. If H1 and H2 are both modern humans, but the mutation rate per annum in the ancestry of H2 is less than the mutation rate per annum in the ancestry of H1, then H2 will be effectively closer to an archaic hominin (such as Vindija) than H1, and will appear to be archaic-admixed relative to H1.

It is not clear whether the mutation rate per annum has been the same in the ancestry of individuals who inhabit different climate zones, tend to have different body sizes, or have different generation lengths. Table S15 of Meyer et al. (2012) may suggest that it is not:

It appears that the San- and Yoruba-specific branches are a little longer compared to Eurasian-specific branches. This may contribute to a signal of archaic admixture in Eurasians.

In both described cases, it remains to be seen how much of the signal of admixture might be explainable on the basis of these effects, and how much will remain intact.

September 02, 2012

Population-specific SNPs and archaic admixture in Homo sapiens

A population-specific SNP is one for which both alleles occur in a population X and only one in the rest of mankind. The existence of such SNPs can be explained in three different ways:

(1) Both alleles belong to the ancestral gene pool of H. sapiens but were lost (by selection or drift) in most populations except one.
(2) Ancestral humans were monomorphic, but a new allele appeared by mutation in one population, and there has not been enough time or opportunity for it to spread to the others.
(3) An additional allele introgressed into a population by admixture with a regional population of archaic humans; this is equivalent to (2), with the new allele appearing through admixture, rather than new mutation.

In order to identify alleles that are specific to the major human groups, I carried out an ADMIXTURE analysis of the Harvard HGDP set with K=5. The results can be seen below, and as is expected, five major clusters emerge:

ADMIXTURE outputs a *.P file of allele frequencies in the inferred components. I used this file to identify population-specific alleles. Specifically, I identified, for each of the 5 components, SNPs where they were polymorphic, but all the other 4 components were fixed. This is the harvest of such component-specific alleles:

Asian: 2,321
West_Eurasian: 4,516
African: 50,835
Australasian: 1,726
Amerindian: 62
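The selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the script actually used: it assumes the *.P file has one row per SNP and one whitespace-separated frequency per component, that "fixed" means a frequency within a small tolerance `eps` of 0 or 1 (the tolerance value is a guess), and that the other four components must be fixed for the same allele, in line with the definition of a population-specific SNP. The names `read_p_file` and `component_specific_snps` are hypothetical.

```python
def read_p_file(path):
    """Read an allele-frequency matrix: one SNP per row, one column per component."""
    with open(path) as fh:
        return [[float(x) for x in line.split()] for line in fh if line.strip()]


def component_specific_snps(p_rows, eps=1e-4):
    """Map each component index to the SNP indices that are polymorphic
    only in that component, with all other components fixed for the
    same allele (frequency <= eps, or >= 1 - eps)."""
    specific = {}
    for snp, freqs in enumerate(p_rows):
        low = [f <= eps for f in freqs]
        high = [f >= 1 - eps for f in freqs]
        poly = [k for k in range(len(freqs)) if not low[k] and not high[k]]
        if len(poly) != 1:
            continue  # zero or several polymorphic components
        k = poly[0]
        others = [j for j in range(len(freqs)) if j != k]
        if all(low[j] for j in others) or all(high[j] for j in others):
            specific.setdefault(k, []).append(snp)
    return specific


# Toy matrix (made up): 5 SNPs x 5 components.
rows = [
    [0.30, 0.00001, 0.00001, 0.00001, 0.00001],  # specific to component 0
    [0.99999, 0.99999, 0.60, 0.99999, 0.99999],  # specific to component 2
    [0.20, 0.40, 0.00001, 0.00001, 0.00001],     # polymorphic in two components
    [0.50, 0.00001, 0.99999, 0.00001, 0.00001],  # others fixed for different alleles
    [0.00001] * 5,                               # monomorphic everywhere
]
print(component_specific_snps(rows))  # {0: [0], 2: [1]}
```

Counting the lengths of the returned lists gives per-component tallies of the kind listed above; on real data one would call `component_specific_snps(read_p_file(...))` with the actual output path.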

Note that these numbers do not reflect the relative number of population-specific SNPs across the genome. Nonetheless, they do appear concordant with what we know about the diminution of genetic diversity away from Africa.

Note also that these SNPs were identified only on the basis of comparisons between modern populations.

In all the following experiments, I will calculate D-statistics of the form D(Pop1, Pop2, Neandertal, Chimp) and D(Pop1,Pop2, Denisova, Chimp) to assess whether Pop1 matches Neandertal/Denisova more than Pop2 does.

Using Asian-specific SNPs

Han matches Neandertal 69% more than Sardinian does
Japanese matches Neandertal 71% more than Orcadian does
Dai matches Neandertal 69% more than Mandenka does
Cambodian matches Neandertal 52% more than San does
She matches Neandertal 5% more than Miao does

These results are consistent with a large number of Asian-specific alleles having been inherited from Neandertals or a Neandertal-like population.

Using West Eurasian-specific SNPs

Sardinian matches Neandertal 63% more than Han does
Orcadian matches Neandertal 61% more than Japanese does
Italian matches Neandertal 55% more than Mandenka does
French matches Neandertal 57% more than San does
Tuscan matches Neandertal 11% more than Basque does

Again, these results are consistent with many West Eurasian-specific alleles having been inherited from Neandertals or a Neandertal-like population.

Using African-specific SNPs

Dai matches Neandertal 55% more than Mandenka does
Cambodian matches Neandertal 41% more than San does
Italian matches Neandertal 55% more than Mandenka does
French matches Neandertal 41% more than San does
BantuKenya matches Neandertal 11% more than MbutiPygmy does

It thus appears that a substantial number of African-specific SNPs make Africans appear less Neandertal-like than Eurasians. This is unexpected if African-specific SNPs are common human SNPs that were retained in Africa but lost Out-of-Africa due to a bottleneck, or if they are SNPs that appeared recently by mutation in the African population.

These results are, however, consistent with the following model:

This model is consistent with the evidence: Eurasian-specific alleles tend to match Neandertals, consistent with Neandertal introgression into the population of Eurasians. But, African-specific alleles tend not to match Neandertals.

This can be explained by the presence of archaic African admixture that stems from before the common ancestor of modern humans and Neandertals.

Denisova admixture in Australasian-specific SNPs

Papuan matches Denisova 84% more than Mandenka does
Melanesian matches Denisova 85% more than San does
Papuan matches Denisova 84% more than Sardinian does
Melanesian matches Denisova 85% more than Orcadian does
Papuan matches Denisova 84% more than Japanese does
Melanesian matches Denisova 85% more than Dai does
Papuan matches Denisova 24% more than Melanesian does

It will thus appear that in many Australasian-specific SNPs, Papuans/Melanesians match the Denisovan allele much more often than other populations.

(As a sanity check for my calculations, I note that Reich et al. 2011 (Table 2) inferred that Bougainville Melanesians have 82% of the Denisova ancestry that Papuans do on the basis of genotype data, and hence, Papuans have (100-82)/82 = 22% more, which closely matches my 24% figure)
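The arithmetic of this sanity check, written out as a tiny sketch (the function name `papuan_excess` is hypothetical): if Melanesians carry a fraction s of the Papuan amount, Papuans carry 1/s times the Melanesian amount, i.e. (1 - s)/s more.

```python
def papuan_excess(melanesian_share):
    """Relative excess of Papuans over Melanesians, given the Melanesian
    amount as a fraction of the Papuan amount."""
    return (1 - melanesian_share) / melanesian_share


print(f"{papuan_excess(0.82):.1%}")  # 22.0%
```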

African diversity in perspective

It is often said that non-Africans harbor a subset of African genetic variation, but that is not really generally true in the strict mathematical sense of "subset". For the 50,835 African-specific SNPs it is true that Africans are polymorphic whereas all other populations are monomorphic. But, for the 2,321 Asian-specific, and 4,516 West Eurasian-specific SNPs it is Asians and West Eurasians respectively that are polymorphic, whereas Africans and all remaining populations are monomorphic.

By examining these special sets of SNPs, where a regional human population is polymorphic and all the rest of mankind is not, we have been able to show that relationships with archaic humans are amplified. The implication is direct: regional-specific variation in humans is in part the heritage of regional continuity in both Africa and Eurasia. However it was that H. sapiens came to dominate our planet, it was not by extinction of archaic humans.

Human population differentiation: tree-like divergence or admixture between divergent demes?

What can account for differences between continental human populations? There are two explanations: (1) Tree-Like Divergence and (2) Incomplete Admixture between divergent demes.

(1) Until quite recently, it was near-universally thought that modern humans are the descendants of a single recent African population. Differences between human groups were ascribed to the operation of genetic drift, natural selection, and new mutation, as modern humans left their primordial Eden and expanded to populate the rest of the globe. According to this model: humans became more different from each other over time.

(2) But, there is a different idea, the reverse of the previous one: that modern humans are descended from many regional groups of earlier hominins that were very highly differentiated from each other. These earlier groups did of course diverge from common ancestors, but in the remote past; perhaps they represent long branches stemming from Homo heidelbergensis. Gene flow between them intensified as humans became more numerous and more mobile. According to this idea, humans around the world became more similar to each other over time.

We are now in a position, through the power of ancient DNA, to answer this question empirically. I am pretty sure that we will answer it over the next decade. And, the way to answer it is simple: take an UP West Eurasian, an UP East Asian, and a LSA African. Under the model of tree-like divergence, these ought to be genetically closer than a living European, a living East Asian, and a living African are to each other, because ~30-40ky of drift and selection acted independently on the three branches:

I suspect that we will be quite surprised when we look at the data, for a number of reasons:

(1) Low genetic diversity of Denisova hominin, consistent with a model in which human diversity is generated by admixture between populations with low intra-group (as in Denisova), but high inter-group diversity (see point #2).
(2) Higher genetic divergence between Denisova and Vindija (across ~6Mm of distance) than between any two living humans from the entire globe.
(3) Diminution of human cranial variability over time, and disappearance of archaic forms.
(4) Possibility of greater Neandertal admixture in UP Europeans than in recent ones.

In any case, if the UP/LSA folks look like close relatives, we will know for sure that tree-like divergence did indeed shape our differentiation: we started very similar, and ended up different. But if they're not, then the door will open for a greater appreciation of how substantial archaic admixture, tempered by gene flow, has been a major cause of differences between living populations of mankind.

Looking forward to finding out...

August 27, 2012

When Eurasians got lighter skin

My default position is to doubt all molecular dates until I understand how they were derived. Nonetheless, these results seem broadly consistent with the idea that Eurasian modern humans got lighter as their ancestors moved into more northern latitudes of the Old World and replaced Neandertals and other earlier Eurasian occupants, and then they got really lighter post-LGM, and then some got really really lighter with mutations in genes such as SLC24A4 (not studied here).

I suppose we will really find out who got what mutation when only through ancient DNA.

Mol Biol Evol (2012) doi: 10.1093/molbev/mss207

The timing of pigmentation lightening in Europeans

Sandra Beleza et al.

The inverse correlation between skin pigmentation and latitude observed in human populations is thought to have been shaped by selective pressures favoring lighter skin in order to facilitate vitamin D synthesis in regions far from the equator. Several candidate genes for skin pigmentation have been shown to exhibit patterns of polymorphism that overlap the geospatial variation in skin color. However, little work has focused on estimating the timeframe over which skin pigmentation has changed and on the intensity of selection acting on different pigmentation genes. To provide a temporal framework for the evolution of lighter pigmentation, we used forward Monte Carlo simulations coupled with a rejection sampling algorithm to estimate the time of onset of selective sweeps and selection coefficients at four genes associated with this trait in Europeans: KITLG, TYRP1, SLC24A5, and SLC45A2. Using compound haplotype systems consisting of rapidly evolving microsatellites linked to one SNP in each gene, we estimate that the onset of the sweep shared by Europeans and East Asians at KITLG occurred about 30,000 years ago, after the out-of-Africa migration, while the selective sweeps for the European-specific alleles at TYRP1, SLC24A5, and SLC45A2 started much later, within the last 11,000-19,000 years, well after the first migrations of modern humans into Europe. We suggest that these patterns were influenced by recent increases in size of human populations, which favored the accumulation of advantageous variants at different loci.

Link

August 23, 2012

Or, maybe they speciated 3.7-6.6Ma ago? (Sun et al. 2012)

This has certainly been an eventful August in human origins research; if the Neandertal Wars weren't enough, a different issue that had simmered for a while now, the human autosomal sequence mutation rate, has now come to a full boil.

A couple of weeks ago, Langergraber et al. (2012) came out, and combined direct measurement of generation lengths in humans and other primates with the directly measured human autosomal sequence mutation rate to argue for an old 7-13Ma divergence between Pan and Homo.

Yesterday, Kong et al. (2012) independently derived a low direct mutation rate of 1.2x10^-8, and added the observation that older human fathers pass on more mutations to their offspring than younger ones. As I point out in my post on the topic, this has implications for the Homo-Pan divergence as well: if chimp dads are younger than human dads, they will tend to pass fewer mutations to their offspring. Thus, the chimp mutation rate (/generation) might be lower rather than equal to the human one, and this might push the speciation time even further back in time.

Today, a new paper has appeared in Nature Genetics which argues for an "intermediate" rate between the direct ~1-1.3x10^-8 rate and the widely used 2.5x10^-8 one: their rate estimate is: 1.4–2.3x10^-8 and the corresponding Human-Chimp speciation time is 3.7-6.6 million years ago. Kari Stefansson is a co-author of the new paper, as he is of the Kong et al. one, which estimated the mutation rate at 1.2x10^-8.

The new paper builds what appears to be a very exhaustive model of microsatellite mutation:

Microsatellites have been widely used to make inferences about evolutionary history. However, the accuracy of these inferences has been limited by a poor understanding of the mutation process. We developed a new model of microsatellite evolution (Supplementary Note). This model can estimate the time to the most recent common ancestor (TMRCA) of two samples at a microsatellite by taking into account (i) the dependence of the mutation rate on allele length and parental age (Fig. 2a,c); (ii) the step size of mutations (Fig. 2b); (iii) the size constraints on allele length (Fig. 2d and Supplementary Figs. 8 and 9); and (iv) the variation in generation interval over history. In contrast to the generalized stepwise mutation model (GSMM), which predicts a linear increase of average squared distance (ASD) between microsatellite alleles over time, the new model predicts a sublinear increase (Fig. 3) and saturation of the molecular clock, due to the constraints on allele length. We also extended the model to estimate the sequence mutation rate, using the per-nucleotide diversity flanking each microsatellite as an additional datum. To implement the model, we used a Bayesian hierarchical approach, first generating global parameters common to all loci, followed by locus-specific parameters and finally the microsatellite alleles at each locus (Online Methods). We used Markov chain Monte Carlo to infer TMRCA and sequence mutation rate.

I haven't delved deeply into the details of how the sequence mutation rate (per nucleotide/per generation) can be derived by exploiting the microsatellite rate. But, why would the rate estimated with the new method be different than the directly measured one? The authors propose some ideas:

We hypothesize that the lower mutation rate estimates from the whole-genome sequencing studies might be due to (i) the limited number of mutations detected in these studies, which explains why their confidence intervals overlap ours, (ii) possible underestimation of the false negative rate in the whole-genome sequencing studies or (iii) variability in the mutation rate across individuals, such that a few families cannot provide a reliable estimate of the population-wide rate.

Apparently, the team behind Sun et al. became aware of the new Kong et al. after the paper was accepted, so they attached the following note at the end of it, as well as a discussion in the supplement:

Note added in proof: After this paper was accepted, another study35 was published that independently estimates the human sequence mutation rate, using a direct measurement in contrast to the indirect measurement we report here. In spite of some key similarities between our results and those of Kong et al.35 (the male-to-female mutation rate ratio and the absence of an effect of mother's age), they estimate a considerably stronger effect of father's age and an overall sequence mutation rate below the range we infer. The discrepancies in the sequence mutation rate may be in part due to the fact that Kong et al. focus on a more intensively filtered subset of the human genome than we analyze here, but other factors are also likely to be at work (Supplementary Note). As an initial attempt to compare the two studies in terms of their implications for evolutionary history, we ran the same Bayesian inference procedure we developed in this paper (integrating over uncertainty in unknown parameters), now using the sequence-based estimates rather than the microsatellite-based estimates as input (Supplementary Note). Notably, the inferred dates based on the measurement of the sequence mutation rate are older and no longer in direct conflict with the inference that S. tchadensis is on the human lineage since the split from chimpanzees. The sequence- and microsatellite-based data sets are very different, and an important direction for future research will be to understand why the direct sequence–based mutation rate estimate is lower than the one inferred on the basis of microsatellites.

All this leaves me rather perplexed. I guess one take-home lesson from the debate would be to avoid making strong statements about the past that are dependent on a particular mutation rate. The following table from the supplementary material pretty much says it all:

Notice that one of the two estimates is approximately double the other. Personally, I tend to favor the older dates, since they might "match" better with key developments: Out-of-Africa will become pre-100ka and consistent with the appearance of the Nubian technocomplex in Arabia, which seems to be the only real solid evidence of Out-of-Africa in the archaeological record. It would also be consistent with the appearance of modern humans in the Levant c. 100ka at Mt. Carmel, the first clear evidence of Homo sapiens in Eurasia. Moreover, it would explain the early appearance of Neandertaloid features in the Atapuerca hominins at c. 600ka, long before the inferred split of modern humans from Neandertals when the slowest rate is used.

But, my confidence in these correspondences is low until the controversy is resolved one way or another. If the 1.8x10^-8 rate of this paper is closer to the truth, then my money would be on the false negative rate, i.e., full genome sequencing is systematically overlooking SNPs that exist in the genomes.

Apparently, now, we have three rates to contend with: (i) the Icelandic 1.2x10^-8 rate (and other similar rates, such as the 1.36x10^-8 one); (ii) the 2.5x10^-8 one that has been very widely used in the literature, and (iii) the "1.82x10^-8 mutations per base pair per generation (90% CI 1.40–2.28 × 10^-8; Table 2)" from this paper. This may be disheartening, but all setbacks represent opportunities to learn something new, and now that the issue is out in the open, I'm sure that many "top dogs" will try to figure out what is going on.
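To see why the choice of rate matters so much, here is a rough sketch that assumes a strict molecular clock: for a fixed amount of sequence divergence and a fixed generation time, an inferred date is inversely proportional to the mutation rate. This simple proportionality is an assumption made for illustration; it is not the Bayesian procedure of Sun et al. The labels and the function name `date_scaling` are hypothetical.

```python
REFERENCE_RATE = 2.5e-8  # the widely used rate, per base pair per generation

# The other rates mentioned above.
rates = {
    "Icelandic direct rate": 1.2e-8,
    "other direct rate": 1.36e-8,
    "microsatellite-based rate": 1.82e-8,
}


def date_scaling(rate, reference=REFERENCE_RATE):
    """Factor by which a date computed with `reference` grows when
    `rate` is used instead, assuming date is proportional to 1/rate."""
    return reference / rate


for label, rate in rates.items():
    print(f"{label}: dates x {date_scaling(rate):.2f}")
# Icelandic direct rate: dates x 2.08
# other direct rate: dates x 1.84
# microsatellite-based rate: dates x 1.37
```

Under this simplification, a date obtained with the 2.5x10^-8 rate roughly doubles with the Icelandic rate and grows by a bit over a third with the microsatellite-based one.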

Nature Genetics doi:10.1038/ng.2398

A direct characterization of human mutation based on microsatellites

James X Sun et al.

Mutations are the raw material of evolution but have been difficult to study directly. We report the largest study of new mutations to date, comprising 2,058 germline changes discovered by analyzing 85,289 Icelanders at 2,477 microsatellites. The paternal-to-maternal mutation rate ratio is 3.3, and the rate in fathers doubles from age 20 to 58, whereas there is no association with age in mothers. Longer microsatellite alleles are more mutagenic and tend to decrease in length, whereas the opposite is seen for shorter alleles. We use these empirical observations to build a model that we apply to individuals for whom we have both genome sequence and microsatellite data, allowing us to estimate key parameters of evolution without calibration to the fossil record. We infer that the sequence mutation rate is 1.4–2.3 × 10^-8 mutations per base pair per generation (90% credible interval) and that human–chimpanzee speciation occurred 3.7–6.6 million years ago.

Link

More mutations in children of older fathers, and how it relates to human origins

Most of the coverage of the new Kong et al. paper has focused on the rising risk for inheritable diseases such as autism and schizophrenia in the children of older fathers. And, indeed, that is the larger story, and, perhaps, the more useful one for society.

But, for those of us interested in the origins of our species, there is another story:

We show that in our samples, with an average father’s age of 29.7, the average de novo mutation rate is 1.20 × 10^-8 per nucleotide per generation.

This mutation rate is in line with other directly measured rates, and is about half the 2.5x10^-8 rate widely used in evolutionary studies. Application of the low rate has led to a much older Human-Chimp divergence than was previously thought. That, in turn, will make mitochondrial Eve much older, because the mtDNA clock is calibrated on the Human-Chimp divergence. Practically every study of the last 10 years that looked at human origins and used the 2.5x10^-8 rate needs to be dusted off and made up to date.

But there is yet another story. The beauty of the Langergraber et al. paper is that it inferred the Human-Chimp divergence on the basis of directly observed quantities: mutation rates and generation times. But, there was one quantity which they could not measure directly: the mutation rate in the apes. Thus, they used the mutation rate of humans for the apes as well; that is very reasonable, because presumably the same underlying chemical machinery affects the rate in humans and their simian friends. But, here's where things get complicated:

Mean human paternal ages are ~7 years older than chimp ones, and ~10 years older than gorilla ones. What this means is that, on average, younger chimp dads and younger gorilla dads have babies. But, from the new Kong et al. paper:

Most notably, the diversity in mutation rate of single nucleotide polymorphisms is dominated by the age of the father at conception of the child. The effect is an increase of about two mutations per year. An exponential model estimates paternal mutations doubling every 16.5 years.

A back-of-the-envelope calculation suggests that the higher age of human fathers may contribute ~30-50% more mutations in humans than in chimps/gorillas. Consequently, the mutation rate used for chimps should not be the human one: it should be even lower.
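One way to arrive at a figure in this range, shown here as a sketch rather than as the calculation actually performed, is to apply the exponential model quoted above (paternal mutations doubling every 16.5 years) to the ~7 and ~10 year gaps in paternal age. It covers paternal mutations only, so the effect on the total rate would be somewhat smaller. The function name `paternal_mutation_ratio` is hypothetical.

```python
DOUBLING_YEARS = 16.5  # from the exponential model in Kong et al.


def paternal_mutation_ratio(age_gap_years, doubling=DOUBLING_YEARS):
    """Ratio of paternal mutations transmitted by fathers who are
    `age_gap_years` older, under exponential growth with the given
    doubling time."""
    return 2 ** (age_gap_years / doubling)


for species, gap in [("chimp", 7), ("gorilla", 10)]:
    extra = paternal_mutation_ratio(gap) - 1
    print(f"human vs {species} fathers: {extra:.0%} more paternal mutations")
# human vs chimp fathers: 34% more paternal mutations
# human vs gorilla fathers: 52% more paternal mutations
```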

What are the implications of this?

The divergence of Humans from Chimps has been estimated by summing up mutations on two branches to their most recent common ancestor (MRCA). Younger chimp fathers = lower mutation rate / generation = Chimp-to-MRCA branch just got older.

In other words, just as we learned that humans diverged from chimps ~7-13 million years ago, it may be that they did so even earlier.

Nature 488, 471–475 (23 August 2012) doi:10.1038/nature11396

Rate of de novo mutations and the importance of father’s age to disease risk

Augustine Kong et al.

Mutations generate sequence diversity and provide a substrate for selection. The rate of de novo mutations is therefore of major importance to evolution. Here we conduct a study of genome-wide mutation rates by sequencing the entire genomes of 78 Icelandic parent–offspring trios at high coverage. We show that in our samples, with an average father’s age of 29.7, the average de novo mutation rate is 1.20 × 10^-8 per nucleotide per generation. Most notably, the diversity in mutation rate of single nucleotide polymorphisms is dominated by the age of the father at conception of the child. The effect is an increase of about two mutations per year. An exponential model estimates paternal mutations doubling every 16.5 years. After accounting for random Poisson variation, father’s age is estimated to explain nearly all of the remaining variation in the de novo mutation counts. These observations shed light on the importance of the father’s age on the risk of diseases such as schizophrenia and autism.

Link

August 19, 2012

Raising a peace banner in the Neandertal Wars

The two camps in the Second Neandertal Wars (*) have assumed maximalist positions on opposing sides of the argument: African structure explains it! vs. Neandertal admixture explains it!. Armed with the Vindija genome, that marvel of technological ingenuity, and a suite of impressive statistical models, the two sides have reached completely opposing conclusions.

In order to formulate my own position, I decided to do what I love best, i.e., to look at the data for myself. My main idea is that the signals of Neandertal and Denisova admixture as measured by these quantities (D-statistics) ...

D(Pop1, Yoruba, Neandertal, Chimp)

D(Pop1, Yoruba, Denisova, Chimp)

... will vary on different SNP ascertainment panels. SNPs ascertained in Africans may have a great number of Palaeoafrican alleles; SNPs in Neandertal-admixed populations will have a great number of Neandertal alleles; SNPs in Denisova-admixed populations will have a great number of Denisova alleles. If a population has admixture from hominin X, this admixture, as measured by the D-statistic, will tend to be inflated in panels possessing alleles that introgressed from X, and suppressed in panels that lack them.

The issue of ascertainment and archaic admixture was addressed by Skoglund and Jakobsson (2011); my aim is different: I am not so much interested in how ascertainment affects admixture estimates, but rather in exploiting the observation of the preceding paragraph (that Palaeoafrican, Neandertal, or Denisovan SNPs will lurk at different rates when ascertained in different individuals) to see what it tells us about human differences.

The signal of "archaic admixture" may be generated by genuine archaic admixture in one population (e.g., Eurasians), making it more similar to the archaic group (e.g., Neandertals), or by archaic admixture -of a different sort- in another population (e.g., Africans), making it less similar to that group. Both these processes may be at work, operating at different intensity in different populations and across different timelines.

I used the Harvard HGDP set, which contains 12 SNP panels, each of which has been ascertained in two chromosomes of a single individual. These panels are:

San, Yoruba, Mbuti, French, Sardinian, Han, Cambodian, Mongolian, Karitiana, Papuan1, Papuan2, Melanesian

A D-statistic was calculated relative to either Neandertal or Denisova for all HGDP populations, as well as the two archaic hominins. Subsequently, I used MCLUST to infer the number of different clusters on the basis of these statistics. In the optimal solution, MCLUST inferred 7 clusters, with each archaic hominin getting its own cluster, while the modern human populations were assigned to 5 clusters corresponding to five major human races recognized by traditional physical anthropology (Mongoloid, Negroid, Australoid, Capoid, and Caucasoid).

Note that these are not admixture proportions, but assignment probabilities! All populations fell into their expected clusters. The populations from Pakistan who are believed to be predominantly Caucasoid with varying degrees of minor admixture of an Ancestral South Indian element were assigned to the Caucasoid cluster. So were the Mozabite Berbers, a Caucasoid population with minority Negroid admixture. Finally, of the Central Asian populations, the Hazara of Pakistan showed mixed affiliations in the Caucasoid and Mongoloid clusters, while the Uygur were assigned to the Mongoloid cluster.

It is noteworthy that by exploiting patterns of relationship of modern to regional archaic humans, we have managed to recreate the major human groups. This is, perhaps, supportive of those who have argued that a degree of regional continuity across the Old World, and not only recent post-Out of Africa genetic divergence is responsible for present-day inter-population differences.

MCLUST also gave us the D-statistic means for the 7 inferred clusters. Remember that these are differences between a population Pop1 and Yoruba, relative to an archaic hominin (Neandertal or Denisova), and for 12 different ascertainment panels:

There are wonderful patterns to be discovered here; you can look at the data for yourselves; that's the open science thing to do.

All our ideas about human origins are conditioned on the availability of genomes from two archaic Eurasian hominins, and the lack of genomes of similar age from Africa.

But, remember:

You can fit Europe, China, India, and the US into Africa, with room to spare.
If Vindija and Denisova, two caves less than 5,000km apart were home to people more divergent from each other than any two humans are today, it's strange to think that only "modern humans" inhabited Africa at the same time.
The maximum genetic distance between living Africans is much higher than the maximum distance between living Eurasians: Africa is much more diverse than Eurasia. It's simpler to assume that the same relative pattern was true during the Middle Stone Age. The palaeoanthropology seems to support this, showing archaic forms present even during the terminal Pleistocene in Africa.
If modern humans did interbreed with 2/2 archaic humans whose sequences we possess, it's strange to think that they somehow shunned the African Others.

In view of the above, I humbly raise my peace banner in the Neandertal Wars, and declare that it isn't either-or: it's both!

(*) The First Neandertal Wars were fought decades ago by anthropologists working with calipers and magnifying lenses. Their outcome was to relegate Neandertals from the enviable position of our likely ancestors to that of an irrelevant sidekick, although a not-negligible minority continued an insurgency against the Out-of-Africa-only victors.

August 16, 2012

Our big human brains may depend on DUF1220 copy numbers

This is quite remarkable: notice how Neandertals, who were bigger-brained than living humans, had a higher DUF1220-domain copy number (as estimated from the Green et al. 2010 data).

Certainly wondering how many DUF1220-domain copies I have :) My running estimate using my own calculator is that I have 1,493gr worth of brain, which isn't Amud- or Turgenev-worthy, but quite respectable, and which ought to translate into plenty of DUF1220 copies; I guess I'll have to wait until full genome sequencing costs drop a little more before I can find out.

From the press release:

The human brain, with its unequaled cognitive capacity, evolved rapidly and dramatically.

"We wanted to know why," says James Sikela, PhD, who headed the international research team that included researchers from the University of Colorado School of Medicine, Baylor College of Medicine and the National Institutes of Mental Health. "The size and cognitive capacity of the human brain sets us apart. But how did that happen?"

"This research indicates that what drove the evolutionary expansion of the human brain may well be a specific unit within a protein – called a protein domain – that is far more numerous in humans than other species."

The protein domain at issue is DUF1220. Humans have more than 270 copies of DUF1220 encoded in the genome, far more than other species. The closer a species is to humans, the more copies of DUF1220 show up. Chimpanzees have the next highest number, 125. Gorillas have 99, marmosets 30 and mice just one. "The one over-riding theme that we saw repeatedly was that the more copies of DUF1220 in the genome, the bigger the brain. And this held true whether we looked at different species or within the human population."

From the paper:

Among primate lineages, there is a high correlation between DUF1220 copy number (the highest copy number, greater than 270, was found in Homo sapiens [human and Neanderthal]) and increased brain size ... as well as an increased number of cortical neurons ... Taken together, these observations support the view that DUF1220-domain copy number, i.e., DUF1220-domain dosage, functions as a general effector of evolutionary, pathological, and normal variation in brain size.

The American Journal of Human Genetics, 16 August 2012 doi:10.1016/j.ajhg.2012.07.016

DUF1220-Domain Copy Number Implicated in Human Brain-Size Pathology and Evolution

Laura J. Dumas

DUF1220 domains show the largest human-lineage-specific increase in copy number of any protein-coding region in the human genome and map primarily to 1q21, where deletions and reciprocal duplications have been associated with microcephaly and macrocephaly, respectively. Given these findings and the high correlation between DUF1220 copy number and brain size across primate lineages (R² = 0.98; p = 1.8 × 10⁻⁶), DUF1220 sequences represent plausible candidates for underlying 1q21-associated brain-size pathologies. To investigate this possibility, we used specialized bioinformatics tools developed for scoring highly duplicated DUF1220 sequences to implement targeted 1q21 array comparative genomic hybridization on individuals (n = 42) with 1q21-associated microcephaly and macrocephaly. We show that of all the 1q21 genes examined (n = 53), DUF1220 copy number shows the strongest association with brain size among individuals with 1q21-associated microcephaly, particularly with respect to the three evolutionarily conserved DUF1220 clades CON1 (p = 0.0079), CON2 (p = 0.0134), and CON3 (p = 0.0116). Interestingly, all 1q21 DUF1220-encoding genes belonging to the NBPF family show significant correlations with frontal-occipital-circumference Z scores in the deletion group. In a similar survey of a nondisease population, we show that DUF1220 copy number exhibits the strongest correlation with brain gray-matter volume (CON1, p = 0.0246; and CON2, p = 0.0334). Notably, only DUF1220 sequences are consistently significant in both disease and nondisease populations. Taken together, these data strongly implicate the loss of DUF1220 copy number in the etiology of 1q21-associated microcephaly and support the view that DUF1220 domains function as general effectors of evolutionary, pathological, and normal variation in brain size.

Link