January 29, 2012

Early Neandertals used red ochre

I have never quite understood the fascination of archaeologists with red ochre. As far as I can tell, this fascination stems from the fact that it is a pigment that survives over time, and was used by the earliest artists during the Upper Paleolithic. By extrapolation, its presence in earlier contexts has been interpreted as evidence of "art" or "symbolic behavior". So, a new paper that appeared in PNAS slightly demystifies the pigment by documenting its use more than 200 thousand years ago in Europe, probably by early Neandertals, and at the same time as its earliest traces in Africa.

In my opinion, this points to the conclusion that the use of pigments, and red ochre in particular, is not a modern human innovation that was adopted (late) by the sister Neandertal taxon, but rather something that humans used long before the advent of "modernity", dating, perhaps, to H. heidelbergensis, the common ancestor of modern humans and Neandertals.

Of interest in that regard is the following tidbit of information from an unrelated source:
Sicevo Gorge - a canyon cut into the Kunivica plateau in southeastern Serbia - contains a series of caves, at least one of which has yielded evidence of human presence during the Ice Age of present-day Europe. In 2008, anthropologists excavating in a small cave uncovered a partial human lower jaw with three teeth. 
"We were looking for Neanderthals," said Dr. Mirjana Roksandic, a participating palaeo-anthropologist with the University of Winnipeg (Canada) and a leading research team member. "But this is much better."
What they discovered was definitely a human that, at least in terms of morphology, predated the Neanderthal and may have had more in common physically with Homo erectus - thought by many scientists to be the precursor to both Neanderthals and modern humans. Recent tests conducted by Dr. Norbert Mercier at the University of Bordeaux (France) produced a date of "older than" 113,000 years BP - long before modern humans in present-day Europe - and the fossil could be substantially older.
So, I would not assume that the people living in Europe 200-250ky ago were definitely "early Neandertals".

John Hawks covers the paper in detail.

PNAS doi: 10.1073/pnas.1112261109

Use of red ochre by early Neandertals

Wil Roebroeks et al.

Abstract

The use of manganese and iron oxides by late Neandertals is well documented in Europe, especially for the period 60–40 kya. Such finds often have been interpreted as pigments even though their exact function is largely unknown. Here we report significantly older iron oxide finds that constitute the earliest documented use of red ochre by Neandertals. These finds were small concentrates of red material retrieved during excavations at Maastricht-Belvédère, The Netherlands. The excavations exposed a series of well-preserved flint artifact (and occasionally bone) scatters, formed in a river valley setting during a late Middle Pleistocene full interglacial period. Samples of the reddish material were submitted to various forms of analyses to study their physical properties. All analyses identified the red material as hematite. This is a nonlocal material that was imported to the site, possibly over dozens of kilometers. Identification of the Maastricht-Belvédère finds as hematite pushes the use of red ochre by (early) Neandertals back in time significantly, to minimally 200–250 kya (i.e., to the same time range as the early ochre use in the African record).

Link

January 28, 2012

Chris Stringer, "Rethinking Out of Africa"

Over at the Edge:
I'm thinking a lot about species concepts as applied to humans, about the "Out of Africa" model, and also looking back into Africa itself. I think the idea that modern humans originated in Africa is still a sound concept. Behaviorally and physically, we began our story there, but I've come around to thinking that it wasn't a simple origin. Twenty years ago, I would have argued that our species evolved in one place, maybe in East Africa or South Africa. There was a period of time in just one place where a small population of humans became modern, physically and behaviourally. Isolated and perhaps stressed by climate change, this drove a rapid and punctuational origin for our species. Now I don’t think it was that simple, either within or outside of Africa.
There is a 44' video at the site (which I haven't viewed yet).

UPDATE: at 10:45, he suggests that Broken Hill is much younger than "many of us think". This seems exceptionally important, since BH (or Kabwe) is thought to be the African branch of H. heidelbergensis and a precursor to the later H. sapiens. Given the current extent of dates proposed for the specimen, it seems almost certain that "much younger" means post-Omo, and hence: (i) one possible precursor for H. sapiens disappears from Africa, and (ii) one additional post-H. sapiens archaic hominin is added.

January 27, 2012

The Arabian cradle (Fernandes et al. 2012)

I have written about Out-of-Arabia before. It is important to remember, when discussing the prehistory of Arabia in terms of the modern inhabitants, that the peninsula undergoes periods of extreme aridity followed by periods of relative humidity. Hence, unlike other regions of the world where continuous occupation can be argued due to a fairly stable climate, this is not the case for Arabia.

This observation is important because when looking at modern populations we cannot a priori assume the survival of the most ancient inhabitants. Nonetheless, it can be well argued that Homo sapiens is an extremely adaptable species: not only did it spread throughout the world in a geological blink of an eye over the last 50 thousand years or so, but also persisted throughout most of the world, coming to occupy nearly every corner of the planet.

So, even though hyper-arid periods may have driven away most people from desert areas, perhaps they did not drive away everybody. There may yet be relics of ancient populations to be found. This is exactly what a new paper proposes: that Arabia possesses extremely old mtDNA lineages within the major macro-haplogroup N, dating to about 60,000 years ago. This is quite close to the estimated time depth of haplogroup L3, which unites many Africans with the Eurasians belonging to macrohaplogroups M and N.

The mainstream understanding of what happened, according to most geneticists, is that modern humans began spreading from Africa at around that time, about 60-70 thousand years ago. In contrast, archaeologists have found indisputable evidence (palaeoanthropological or archaeological) of modern humans in Asia from before 100 thousand years ago, stretching from the Levant to the southern parts of Arabia.

There are two possibilities:

  • The pre-70ka modern humans in Asia left absolutely no traces of mtDNA, and all of the extant mtDNA in Asia is derived from post-70ka Africans. Hence, the pre-70ka modern humans in Asia were the descendants of failed exodi.
  • The people who expanded post-70ka in Asia were descended from people who lived in Asia before 100ka, descendants of successful exodi perhaps associated with the Mount Carmel hominins or the recently discovered Nubian Complex.
I am rather in favor of the second hypothesis; the authors of the current paper favor the first. It seems unnatural that pre-70ka modern humans in Asia would just vanish: why would they? They, apparently, lived across a vast area, and were bearers of technologies that were no worse than contemporaneous African cultures. Moreover, there is simply no archaeological evidence about population movements originating in Africa at 70-60ka.

However, if the second hypothesis is true, there is a problem: haplogroup L3 is dated to 70ka, so if the expansion associated with it started in Asia, there must have been substantial back-migration of L3-related lineages to Africa. I don't see any major problem with that hypothesis, but it is true that many scientists are reluctant to incorporate extensive back-migration to Africa into their models. At present it has not been possible to determine to what extent genetic diversity in Africa is due to great antiquity vs. admixture of divergent human populations, which I have called Afrasian (related to Eurasians) and Palaeoafrican. If L3 did originate in Africa, then the conclusion of a recent African exodus is inescapable.

The major contribution of the current paper is that it fixes a major human expansion Out-of-Arabia at very close to 60ka. Whether this expansion originated from transient Out-of-Africans who had recently exited Africa, or from long settled populations of Asia (prior to 100ka) remains to be seen.

From the paper:
The presence of archaeological sites in the Gulf basin demonstrates a long tradition of human occupation.9 However, neither direct cultural influences from the Levant nor any African influence has been detected in the Upper Palaeolithic (Late Pleistocene) lithics observed in eastern Arabia, pointing to a local development of cultural techniques.9,47 Curiously, however, the fact that some of the branches studied here include deep lineages in eastern Africa (haplogroups I, N1a, and N1f) shows that migration back to Africa occurred a number of times between 15 and 40 ka ago. 
The hypothesized Gulf Oasis9 appears to be the most likely locus of the earliest branching of haplogroup N, including the three relict basal N(xR) haplogroups studied here, as well as the major Eurasian haplogroup R. Time estimates, frequencies, and genetic diversities reported here for these haplogroups are often similar between the Levant and Arabia, challenging the hypothesis of longterm isolation between these two regions. The other two refugia identified in the south and southwest of the Peninsula might have acted as a corridor for migrations west, back toward eastern Africa. Y chromosome microsatellite diversity in the Arabian Peninsula has suggested that Dubai and Oman share genetic affinities with other Near Eastern populations, whereas Saudi Arabia and Yemen show signs of greater isolation (although for fast-evolving microsatellites, these differences might reflect more recent events).4

The American Journal of Human Genetics, 26 January 2012 doi:10.1016/j.ajhg.2011.12.010

The Arabian Cradle: Mitochondrial Relicts of the First Steps along the Southern Route out of Africa

Verónica Fernandes et al.

A major unanswered question regarding the dispersal of modern humans around the world concerns the geographical site of the first human steps outside of Africa. The “southern coastal route” model predicts that the early stages of the dispersal took place when people crossed the Red Sea to southern Arabia, but genetic evidence has hitherto been tenuous. We have addressed this question by analyzing the three minor west-Eurasian haplogroups, N1, N2, and X. These lineages branch directly from the first non-African founder node, the root of haplogroup N, and coalesce to the time of the first successful movement of modern humans out of Africa, ∼60 thousand years (ka) ago. We sequenced complete mtDNA genomes from 85 Southwest Asian samples carrying these haplogroups and compared them with a database of 300 European examples. The results show that these minor haplogroups have a relict distribution that suggests an ancient ancestry within the Arabian Peninsula, and they most likely spread from the Gulf Oasis region toward the Near East and Europe during the pluvial period 55–24 ka ago. This pattern suggests that Arabia was indeed the first staging post in the spread of modern humans around the world.

Link

fineStructure paper (Lawson et al. 2012)

Related:

PLoS Genet 8(1): e1002453. doi:10.1371/journal.pgen.1002453

Inference of Population Structure using Dense Haplotype Data

Daniel John Lawson et al.

The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in unprecedented detail, but presents new statistical challenges. We propose a novel inference framework that aims to efficiently capture information on population structure provided by patterns of haplotype similarity. Each individual in a sample is considered in turn as a recipient, whose chromosomes are reconstructed using chunks of DNA donated by the other individuals. Results of this “chromosome painting” can be summarized as a “coancestry matrix,” which directly reveals key information about ancestral relationships among individuals. If markers are viewed as independent, we show that this matrix almost completely captures the information used by both standard Principal Components Analysis (PCA) and model-based approaches such as STRUCTURE in a unified manner. Furthermore, when markers are in linkage disequilibrium, the matrix combines information across successive markers to increase the ability to discern fine-scale population structure using PCA. In parallel, we have developed an efficient model-based approach to identify discrete populations using this matrix, which offers advantages over PCA in terms of interpretability and over existing clustering algorithms in terms of speed, number of separable populations, and sensitivity to subtle population structure. We analyse Human Genome Diversity Panel data for 938 individuals and 641,000 markers, and we identify 226 populations reflecting differences on continental, regional, local, and family scales. We present multiple lines of evidence that, while many methods capture similar information among strongly differentiated groups, more subtle population structure in human populations is consistently present at a much finer level than currently available geographic labels and is only captured by the haplotype-based approach. 
The software used for this article, ChromoPainter and fineSTRUCTURE, is available from http://www.paintmychromosomes.com/.

January 26, 2012

Y chromosomes of West African descendants (Torres et al. 2012)

PLoS ONE 7(1): e29687. doi:10.1371/journal.pone.0029687


Y Chromosome Lineages in Men of West African Descent

Jada Ben Torres et al.

The early African experience in the Americas is marked by the transatlantic slave trade from ~1619 to 1850 and the rise of the plantation system. The origins of enslaved Africans were largely dependent on European preferences as well as the availability of potential laborers within Africa. Rice production was a key industry of many colonial South Carolina low country plantations. Accordingly, rice plantations owners within South Carolina often requested enslaved Africans from the so-called “Grain Coast” of western Africa (Senegal to Sierra Leone). Studies on the African origins of the enslaved within other regions of the Americas have been limited. To address the issue of origins of people of African descent within the Americas and understand more about the genetic heterogeneity present within Africa and the African Diaspora, we typed Y chromosome specific markers in 1,319 men consisting of 508 west and central Africans (from 12 populations), 188 Caribbeans (from 2 islands), 532 African Americans (AAs from Washington, DC and Columbia, SC), and 91 European Americans. Principal component and admixture analyses provide support for significant Grain Coast ancestry among African American men in South Carolina. AA men from DC and the Caribbean showed a closer affinity to populations from the Bight of Biafra. Furthermore, 30–40% of the paternal lineages in African descent populations in the Americas are of European ancestry. Diverse west African ancestries and sex-biased gene flow from EAs has contributed greatly to the genetic heterogeneity of African populations throughout the Americas and has significant implications for gene mapping efforts in these populations.

Link

January 24, 2012

Paleolithic Siberian domestic dog

From the press release:
A 33,000-year-old dog skull unearthed in a Siberian mountain cave presents some of the oldest known evidence of dog domestication and, together with an equally ancient find in a cave in Belgium, indicates that modern dogs may be descended from multiple ancestors.
I've been following the dog domestication saga for a few years now; it seems that geneticists are in general agreement that domestic dogs share a fairly recent ancestry from East Asia, although there are some lingering controversies about the role of other dogs in the formation of modern breeds. On the other hand, there are now two cases of Upper Paleolithic domesticated dogs, from Belgium and Siberia. I can't wrap my head around the idea that dogs that were domesticated more than 30 thousand years ago, and would presumably have had plenty of time to adapt, would be totally replaced.

It would be great if we could get some Paleolithic dog DNA for comparison, as this would show whether some modern dog breeds are differentially affiliated to Paleolithic dogs, which would support a "multiregional evolution of domestic dogs".

PLoS ONE 6(7): e22821. doi:10.1371/journal.pone.0022821

A 33,000-Year-Old Incipient Dog from the Altai Mountains of Siberia: Evidence of the Earliest Domestication Disrupted by the Last Glacial Maximum

Nikolai D. Ovodov et al.


Abstract
Background
Virtually all well-documented remains of early domestic dog (Canis familiaris) come from the late Glacial and early Holocene periods (ca. 14,000–9000 calendar years ago, cal BP), with few putative dogs found prior to the Last Glacial Maximum (LGM, ca. 26,500–19,000 cal BP). The dearth of pre-LGM dog-like canids and incomplete state of their preservation has until now prevented an understanding of the morphological features of transitional forms between wild wolves and domesticated dogs in temporal perspective.

Methodology/Principal Finding
We describe the well-preserved remains of a dog-like canid from the Razboinichya Cave (Altai Mountains of southern Siberia). Because of the extraordinary preservation of the material, including skull, mandibles (both sides) and teeth, it was possible to conduct a complete morphological description and comparison with representative examples of pre-LGM wild wolves, modern wolves, prehistoric domesticated dogs, and early dog-like canids, using morphological criteria to distinguish between wolves and dogs. It was found that the Razboinichya Cave individual is most similar to fully domesticated dogs from Greenland (about 1000 years old), and unlike ancient and modern wolves, and putative dogs from Eliseevichi I site in central Russia. Direct AMS radiocarbon dating of the skull and mandible of the Razboinichya canid conducted in three independent laboratories resulted in highly compatible ages, with average value of ca. 33,000 cal BP.

Conclusions/Significance
The Razboinichya Cave specimen appears to be an incipient dog that did not give rise to late Glacial – early Holocene lineages and probably represents wolf domestication disrupted by the climatic and cultural changes associated with the LGM. The two earliest incipient dogs from Western Europe (Goyet, Belgium) and Siberia (Razboinichya), separated by thousands of kilometers, show that dog domestication was multiregional, and thus had no single place of origin (as some DNA data have suggested) and subsequent spread.

Link

January 20, 2012

Archaic DNA data mining for dummies

I have repeatedly stressed how full genome sequencing will allow us to detect archaic DNA in modern humans, so I thought of writing a simple post where I lay out the rationale behind my conviction.

The age of the microarray


Microarrays test for a few 10^5 variants in the human genome. Conceptually, we can view the difference between two individuals as follows:

A??????????T???????????C
A??????????C???????????T


As you can see, these two individuals differ in a couple of locations tested by the microarray and are the same in one.

The age of the full genome


What will happen when we use full genomes? All the unknown positions in the two sequences will be known.

This may end up looking like this (Possibility #1):

ACGAAATTCGATGTAATTAGGGC
ACGAAATTCGACGTAATTAGGGT


i.e., the sites that were polymorphic in the microarray were the only ones that were polymorphic, and the rest of the two sequences are carbon copies of each other.

Or, it may end up looking like this (Possibility #2):

ACGAAATTCGATGTAATTAGGGC
ACGAAGATCGACGTACTTGGGGT


i.e., there are additional differences between the two individuals that were not captured by the microarray.

In the second scenario, there are 6 mutations between the two sequences compared to only 2 in the first one. So, the two sequences share a much older common ancestor compared to the first scenario.
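
The counts above can be checked with a few lines of Python. This is only a minimal sketch of the idea; the function name `count_differences` is my own label for illustration, and the sequences are the toy ones from this post.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Toy sequences copied from the two possibilities shown above.
top = "ACGAAATTCGATGTAATTAGGGC"
possibility_1 = "ACGAAATTCGACGTAATTAGGGT"
possibility_2 = "ACGAAGATCGACGTACTTGGGGT"

print(count_differences(top, possibility_1))  # 2
print(count_differences(top, possibility_2))  # 6
```

If mutations accumulate at a roughly constant rate (an assumption), three times as many differences suggests a common ancestor roughly three times as old.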

By scanning stretches of DNA in full genomes, it is possible to identify regions where the number of mutations between two sequences is so large (expressed e.g., as a fraction of the number of differences between humans and chimps), that the common ancestor must have lived a very long time ago, even millions of years ago.
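
A rough sketch of such a scan might look like the following. Everything here is illustrative: the function name, the window size, the threshold, and the round figure of 1.2% per-site human-chimp divergence are assumptions of mine, not values taken from any particular study, and real analyses use far more careful statistics.

```python
def divergent_windows(seq_a, seq_b, window, step, human_chimp_rate, threshold):
    """Return (start, differences, ratio) for each window whose per-site
    divergence, as a fraction of human_chimp_rate, is at least threshold."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    for start in range(0, len(seq_a) - window + 1, step):
        chunk_a = seq_a[start:start + window]
        chunk_b = seq_b[start:start + window]
        diffs = sum(1 for a, b in zip(chunk_a, chunk_b) if a != b)
        ratio = (diffs / window) / human_chimp_rate
        if ratio >= threshold:
            hits.append((start, diffs, ratio))
    return hits


# Synthetic data: 3,000 identical sites, then 1 difference placed in the
# first kilobase and 8 differences placed in the second kilobase.
seq_a = "A" * 3000
sites = list(seq_a)
for pos in (10, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800):
    sites[pos] = "C"
seq_b = "".join(sites)

# Assumed values: 1.2% human-chimp divergence, flag windows above half of it.
hits = divergent_windows(seq_a, seq_b, window=1000, step=1000,
                         human_chimp_rate=0.012, threshold=0.5)
for start, diffs, ratio in hits:
    print(start, diffs, round(ratio, 2))
```

Only the second kilobase is flagged here: its 8 differences per 1,000 sites amount to about two thirds of the assumed human-chimp divergence, while the single difference in the first kilobase is unremarkable.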

In some cases, we will be able to directly compare these sequences to actual archaic hominins, which is how Mendez et al. were able to infer archaic introgression from a Denisova-like hominin into Melanesians. But, even in the absence of archaic DNA, a good enough case of archaic admixture can be made.

Balancing selection


Balancing selection is one mechanism whereby two very different sequences could be maintained for a very long time in the human population. The major histocompatibility complex is one part of the human genome where this is believed to take place.

Balancing selection occurs when heterozygotes have a selective advantage over homozygotes. In "regular" evolution, one allele drives another to extinction, either by simple chance (drift) or because of an advantage (directional selection). In balancing selection the two alleles are maintained because people who have both of them (heterozygotes) outbreed people who have only one or the other (homozygotes).
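
To make the contrast concrete, here is a small sketch using the standard textbook one-locus, two-allele selection recursion in an infinite population (so drift is ignored). The fitness values are made up for illustration, and the function names are mine.

```python
def next_frequency(p, w_AA, w_Aa, w_aa):
    """One generation of selection on the frequency p of allele A,
    assuming random mating and an infinitely large population."""
    q = 1.0 - p
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def iterate(p, generations, w_AA, w_Aa, w_aa):
    """Apply next_frequency repeatedly and return the final frequency."""
    for _ in range(generations):
        p = next_frequency(p, w_AA, w_Aa, w_aa)
    return p


# Directional selection (hypothetical fitnesses): A ends up near fixation.
directional = iterate(0.1, 2000, w_AA=1.0, w_Aa=0.95, w_aa=0.9)

# Heterozygote advantage (hypothetical fitnesses): both alleles persist,
# whether A starts out rare or common.
balanced_from_rare = iterate(0.1, 2000, w_AA=0.9, w_Aa=1.0, w_aa=0.8)
balanced_from_common = iterate(0.9, 2000, w_AA=0.9, w_Aa=1.0, w_aa=0.8)

print(round(directional, 4))
print(round(balanced_from_rare, 4), round(balanced_from_common, 4))
```

With heterozygote advantage, the frequency settles at t/(s+t), where s and t are the fitness costs of the two homozygotes relative to the heterozygote (0.1 and 0.2 in this example, giving 2/3), so neither allele is lost.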

It is, however, possible to distinguish between sequences maintained by balancing selection and those that are not. For example, one can examine the functional consequence of polymorphism, or survey the geographical distribution of the variant sequences.

Recombination


A different issue is that of recombination. Recombination slices up genome sequences and stitches up new sequences that are a combination of those inherited from one's father and mother. Going back to our previous example:


ACGAAATTCGATGTAATTAGGGC
ACGAAGATCGACGTACTTGGGGT



Now consider this:

ACGAAATTCGATGTAATTAGGGC
ACGAAGATCGACGTAATTAGGGT



You can see that now the two sequences appear more similar to each other. This could in fact be because a stretch of DNA (ATTA) from the top sequence has become stitched into the bottom one.
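
The stitching can be mimicked in code. This is a toy sketch, not a model of real recombination (which involves crossovers between chromosomes over many generations); the function name `recombine` and the 0-based positions 15 to 19 are my own choices, picked to reproduce the example above.

```python
def recombine(recipient: str, donor: str, start: int, end: int) -> str:
    """Return the recipient with positions start..end-1 copied from the donor."""
    return recipient[:start] + donor[start:end] + recipient[end:]


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


top = "ACGAAATTCGATGTAATTAGGGC"
bottom = "ACGAAGATCGACGTACTTGGGGT"

# Copy the ATTA stretch of the top sequence into the bottom one.
mosaic = recombine(bottom, top, 15, 19)

print(mosaic)                          # ACGAAGATCGACGTAATTAGGGT
print(count_differences(top, bottom))  # 6
print(count_differences(top, mosaic))  # 4
```

After the stitch, the two sequences differ at 4 sites instead of 6, so the pair looks younger than the unrecombined parts really are.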

If there has been archaic admixture in modern humans, we cannot expect to find very long stretches of archaic DNA. Rather, we expect to find a pastiche of archaic and modern sequence due to multiple generations of recombination. For really old admixture events recombination may obliterate all traces of admixture altogether!

This is why full genome sequencing is important, since it allows us to look at arbitrarily small lengths of DNA. Archaic sequences of various lengths may lurk in-between the test points covered by microarrays, and by comparing full genomes we have a chance of uncovering them.

It may not, however, be possible to detect archaic admixture in very small lengths, because of statistics: 10 mutations in a length of 100 and 100 mutations in a length of 1,000 both give the same age estimate, but the latter has a much tighter confidence interval.
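
A back-of-the-envelope illustration of this point follows, assuming that mutation counts are Poisson distributed and using a normal approximation for the 95% interval (which is crude for a count as small as 10). The function name is mine.

```python
import math


def divergence_interval(mutations, length, z=1.96):
    """Per-site divergence with an approximate 95% interval, treating the
    mutation count as Poisson (standard deviation = sqrt(count))."""
    rate = mutations / length
    half_width = z * math.sqrt(mutations) / length
    return rate, rate - half_width, rate + half_width


short = divergence_interval(10, 100)
long_ = divergence_interval(100, 1000)

for label, (rate, low, high) in (("short", short), ("long", long_)):
    print(label, round(rate, 3), round(low, 3), round(high, 3))
```

Both stretches give the same point estimate of 0.1 differences per site, but the interval for the short stretch is about three times wider (a factor of the square root of 10), which is why short archaic segments are hard to tell apart from ordinary ones.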

Conclusion


Full genome sequencing will allow us to detect archaic DNA in modern humans by identifying regions of DNA that have common ancestors that are much older than the genomewide average. Some of these regions may be explained by balancing selection, while traces of others may have been lost by recombination. Nonetheless, not all of the evidence will have disappeared (especially for events in the last 100-200 thousand years), so expect it to surface sooner or later.

Introgression of archaic haplotype at OAS1 in Melanesians (Mendez et al. 2012)

It seems that Michael Hammer made good on his promise for 2012: "This year, we should be able to confirm what we found and go way beyond that." In a new paper, conclusive evidence is presented about introgression of an archaic sequence into Melanesian populations. The argument is as follows:

  • Melanesians are more diverse in that region than Africans.
  • The common ancestor of the "archaic" and "African" haplotypes lived >3 million years ago.
  • The "archaic" haplotype matches the ancient DNA from the Denisova hominin.
  • Balancing selection (which can sometimes maintain extremely old polymorphism) is not reasonable in this case, because it would need to maintain both "archaic" and "African" haplotypes for a long time, but then (inexplicably) would continue to operate in Melanesia and cease to operate everywhere else.

Notice that once again, this is based on resequencing a small region of the genome. This is why I am all the more confident in my prediction that the advent of full genome sequencing will uncover more archaic admixture in humans. It may not always be possible to use all the above listed criteria to confirm this admixture (since we do not and cannot have ancient DNA from all the archaic hominins that once roamed the planet), but all the remaining ones will suffice to make a very good case for introgression.

What I find particularly interesting is that Mendez et al. reiterate a few times that genomewide averages admit of different explanations:

Full genome comparisons of the Neandertal and Denisova draft genomes with modern human sequences have revealed different amounts of shared ancestry between each of these archaic forms and anatomically modern human (AMH) populations from different geographic regions. For example, a higher proportion of SNPs was shared between non-African and Neandertal, and between Melanesian and the Denisova genomes, than between either Neandertal or Denisova and extant African genomes (Green et al. 2010; Reich et al. 2010). An intriguing possibility is that these patterns result from introgression of archaic genes into AMH populations in Eurasia. However, this SNP sharing pattern could also be explained by ancestral population structure in Africa (i.e., without the need to posit introgression). For example, if non-Africans and the ancestors of Neandertals descend from the same deme in a subdivided African population, and this structure persisted with low levels of gene flow among African residents until the ancestors of non-Africans migrated into Eurasia, then we would expect more SNP sharing between non-Africans and Neandertals (Durand et al. 2011). 
... 
While genome-wide comparisons detect more sequence agreement between non-African and Neandertal genomes, and between Melanesian and Denisova genomes, the specific loci exhibiting these signals have not yet been identified. Furthermore, current analyses do not elucidate the relative roles of recent introgression versus long-term population structure in Africa in explaining these patterns.

The current paper does a good job at showing how in one particular region archaic introgression into Melanesians is indeed the best explanation for the evidence. But, the fact that the authors seem to re-iterate the possibility of African population structure and repeatedly caution against using patterns of genomewide sharing between modern and archaic humans is a strong hint that there are more things to come on the topic.

We should remember that the widely-circulated estimates of Neandertal->Eurasian introgression are based on genomewide averages. It is true that Reich et al. (2010) identified 13 regions of potential Neandertal introgression, but these together make up a very small portion of the human genome. So, the jury is out on whether African population structure or Neandertal introgression is responsible for most of the genomewide pattern.

What you can be sure of is that many scientists are busy lining up full genomes from different human populations as we speak, and finding plenty of regions where haplotypes of extremely old divergence times co-exist in our species. We will probably learn more about such efforts during 2012.



Mol Biol Evol (2012) doi: 10.1093/molbev/msr301

Global genetic variation at OAS1 provides evidence of archaic admixture in Melanesian populations

Fernando L. Mendez, Joseph C. Watkins and Michael F. Hammer

Recent analysis of DNA extracted from two Eurasian forms of archaic human show that more genetic variants are shared with humans currently living in Eurasia than with anatomically modern humans in sub-Saharan Africa. While these genome-wide average measures of genetic similarity are consistent with the hypothesis of archaic admixture in Eurasia, analyses of individual loci exhibiting the signal of archaic introgression are needed to test alternative hypotheses and investigate the admixture process. Here, we provide a detailed sequence analysis of the innate immune gene, OAS1, a locus with a divergent Melanesian haplotype that is very similar to the Denisova sequence from the Altai region of Siberia. We re-sequenced a 7 kb region encompassing the OAS1 gene in 88 individuals from 6 Old World populations (San, Biaka, Mandenka, French Basque, Han Chinese, and Papua New Guineans) and discovered previously unknown and ancient genetic variation. The 5' region of this gene has unusual patterns of diversity, including 1) higher levels of nucleotide diversity in Papuans than in sub-Saharan Africans, 2) very deep ancestry with an estimated time to the most recent common ancestor of >3 million years, and 3) a basal branching pattern with Papuan individuals on either side of the rooted network. A global geographic survey of >1500 individuals showed that the divergent Papuan haplotype is nearly restricted to populations from eastern Indonesia and Melanesia. Polymorphic sites within this haplotype are shared with the draft Denisova genome over a span of ∼90 kb and are associated with an extended block of linkage disequilibrium, supporting the hypothesis that this haplotype introgressed from an archaic source that likely lived in Eurasia.

Link

January 19, 2012

Shortage of female math geniuses not due to "stereotype threat"

Men are over-represented at the high end of math performance: there are more male math geniuses than female ones.

A theory that was proposed to explain that fact is that of stereotype threat. According to this theory, there is a stereotype in society that "women are bad in math"; women internalize this stereotype and lose confidence about their math abilities, and so they tend to perform sub-optimally in math tests, hence rendering the idea of "women are bad in math" a self-fulfilling prophecy.

This new study demonstrates that much of the literature that has accumulated around the idea of a "stereotype threat" can be relegated to the trash bin, and those who hope that fighting the stereotype will lead to more females joining the mathematical elite have their work cut out for them.

Review of General Psychology, Jan 16, 2012, No Pagination Specified. doi: 10.1037/a0026617

Can stereotype threat explain the gender gap in mathematics performance and achievement?

Stoet, Gijsbert; Geary, David C.

Men and women score similarly in most areas of mathematics, but a gap favoring men is consistently found at the high end of performance. One explanation for this gap, stereotype threat, was first proposed by Spencer, Steele, and Quinn (1999) and has received much attention. We discuss merits and shortcomings of this study and review replication attempts. Only 55% of the articles with experimental designs that could have replicated the original results did so. But half of these were confounded by statistical adjustment of preexisting mathematics exam scores. Of the unconfounded experiments, only 30% replicated the original. A meta-analysis of these effects confirmed that only the group of studies with adjusted mathematics scores displayed the stereotype threat effect. We conclude that although stereotype threat may affect some women, the existing state of knowledge does not support the current level of enthusiasm for this as a mechanism underlying the gender gap in mathematics. We argue there are many reasons to close this gap, and that too much weight on the stereotype explanation may hamper research and implementation of effective interventions.

Link

January 18, 2012

Manifold Learning for Human Population Structure Studies (Siu et al. 2012)

Software implementing this should be available here.

PLoS ONE 7(1): e29901. doi:10.1371/journal.pone.0029901


Manifold Learning for Human Population Structure Studies

Hoicheong Siu et al.

The dimension of the population genetics data produced by next-generation sequencing platforms is extremely high. However, the “intrinsic dimensionality” of sequence data, which determines the structure of populations, is much lower. This motivates us to use locally linear embedding (LLE) which projects high dimensional genomic data into low dimensional, neighborhood preserving embedding, as a general framework for population structure and historical inference. To facilitate application of the LLE to population genetic analysis, we systematically investigate several important properties of the LLE and reveal the connection between the LLE and principal component analysis (PCA). Identifying a set of markers and genomic regions which could be used for population structure analysis will provide invaluable information for population genetics and association studies. In addition to identifying the LLE-correlated or PCA-correlated structure informative marker, we have developed a new statistic that integrates genomic information content in a genomic region for collectively studying its association with the population structure and LASSO algorithm to search such regions across the genomes. We applied the developed methodologies to a low coverage pilot dataset in the 1000 Genomes Project and a PHASE III Mexico dataset of the HapMap. We observed that 25.1%, 44.9% and 21.4% of the common variants and 89.2%, 92.4% and 75.1% of the rare variants were the LLE-correlated markers in CEU, YRI and ASI, respectively. This showed that rare variants, which are often private to specific populations, have much higher power to identify population substructure than common variants. The preliminary results demonstrated that next generation sequencing offers a rich resources and LLE provide a powerful tool for population structure analysis.

Link

January 17, 2012

Comparison of MCLUST with fineSTRUCTURE

Dan Lawson has written up a comparison of fineSTRUCTURE and MCLUST and a PDF with further details. Dan first talked to me about doing this comparison in December, and it's unfortunate that I didn't try my new fastIBD method in time, so it could also be included in the analysis.

There are two parts to this type of structure inference:

  • Deriving a matrix of relationships between individuals (using PLINK IBS, ChromoPainter, or fastIBD, or ...)
  • Clustering these relationships (using fineSTRUCTURE, MCLUST, or ...)

Assessing the quality of the inferred structure is tricky, since these linkage-based methods tend to infer clusters that are finer-scaled than the level of population labels. It's not easy to know what, for example, a couple of Sardinian clusters mean if one does not have finer-level details about the origin of different Sardinian individuals. I tend to take the pragmatic view that if clusters correspond to real-world phenomena (as the Iberian or Armenian ones do), then they are of value.
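To make the two-part pipeline concrete, here is a minimal sketch in plain Python. It is not how PLINK, ChromoPainter, fastIBD, MCLUST, or fineSTRUCTURE work internally: the relationship matrix is a naive identity-by-state (IBS) average over unlinked markers, the clustering is simple single-linkage with a fixed threshold, and the genotypes (coded 0/1/2) and the 0.8 threshold are made up for illustration.

```python
from itertools import combinations


def ibs_similarity(g1, g2):
    """Mean identity-by-state between two genotype vectors coded 0/1/2."""
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return shared / (2 * len(g1))


def relationship_matrix(genotypes):
    """Part 1: pairwise similarity matrix between individuals."""
    n = len(genotypes)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = ibs_similarity(genotypes[i], genotypes[j])
    return matrix


def cluster(matrix, threshold):
    """Part 2: single-linkage clustering via union-find.

    Two individuals end up in the same cluster if a chain of pairs with
    similarity >= threshold connects them.
    """
    parent = list(range(len(matrix)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(matrix)), 2):
        if matrix[i][j] >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(matrix)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical toy data: four individuals, six markers.
toy_genotypes = [
    [0, 0, 1, 2, 2, 0],
    [0, 0, 1, 2, 1, 0],
    [2, 2, 0, 0, 0, 2],
    [2, 1, 0, 0, 0, 2],
]
toy_clusters = cluster(relationship_matrix(toy_genotypes), threshold=0.8)
print(toy_clusters)
```

The point of the split is that either half can be swapped independently: a linkage-aware matrix can feed the same clustering step, and a different clustering method can consume the same matrix.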

The analysis of Lawson and Falush seems to identify the main issues quite well: MCLUST is much faster and just as good, but requires tuning for the number of dimensions; fineSTRUCTURE, on the other hand, does not require such tuning, but is slower and requires a prior (which is good or bad depending on whether you're a Bayesian or not). Both clustering algorithms perform better in the presence of linkage information than in its absence.

One additional strength of MCLUST seems to be its ability to detect clusters of varying shape, and hence to discover recently admixed populations that form such clusters in PCA/MDS space. The simulated data of Lawson & Falush assume a biological model of splits/expansions, so it is not clear how their approach would handle lateral gene flow that results in "stretched" clusters of individuals.

I would love to see many different methods evaluated on a standard real-world dataset. Running ChromoPainter/fineSTRUCTURE is computationally very expensive, but I will try my hand at the Stanford HGDP set and the No1stOr2ndDegreeRelatives subset thereof, which consists of 940 individuals. If anyone wants to try alternative methods on the same real-world set, drop me an e-mail or write a comment, and I'll link to your analysis.


PS: I also have to applaud the quick response of Lawson and Falush to my idea of comparing MCLUST and fineSTRUCTURE. It is exactly the type of "open science" that I am a strong advocate for.

January 16, 2012

Phased Omni haplotypes with ShapeIT

The working directory of the 1000 Genomes ftp site contains phased haplotypes for 2,123 individuals from the 1000 Genomes Project (US/Europe). The data were phased with ShapeIT, which I've recently played with and can recommend as fairly user-friendly, high-quality phasing software.

You can use vcftools to convert the data into PLINK format (but use --plink-tped), which appears to be quite efficient compared to doing it on the single file I previously linked to. So, it's also a way to get 1000 Genomes data into the more useful PLINK format, and it's pre-phased as a bonus.
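A minimal sketch of that conversion, wrapped in Python so it can be looped over files. The file name and output prefix below are hypothetical placeholders, and I am assuming the downloaded files are gzipped VCFs (hence --gzvcf; use --vcf for uncompressed input). With --plink-tped, vcftools writes a transposed fileset (prefix.tped and prefix.tfam).

```python
import subprocess


def vcftools_tped_command(vcf_gz_path, out_prefix):
    """Build the vcftools argument list for VCF -> PLINK transposed format."""
    return [
        "vcftools",
        "--gzvcf", vcf_gz_path,   # gzipped VCF input (assumed)
        "--plink-tped",           # write <out_prefix>.tped / <out_prefix>.tfam
        "--out", out_prefix,
    ]


def convert(vcf_gz_path, out_prefix, dry_run=True):
    """Print the command; only execute it when dry_run is False."""
    cmd = vcftools_tped_command(vcf_gz_path, out_prefix)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)  # requires vcftools on the PATH
    return cmd


# Hypothetical file names, shown as a dry run only.
example_cmd = convert("omni_phased.chr22.vcf.gz", "omni_chr22")
```

Calling convert(..., dry_run=False) once per chromosome file would give one tped/tfam pair per chromosome, which PLINK can then read with its --tfile option.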

January 13, 2012

Napoleon Bonaparte belonged to haplogroup E1b1b1c1* (E-M34*)

A previous paper found that his mtDNA belonged to haplogroup H. Another study found that Hitler also belonged to haplogroup E1b1b. So, expect plenty of war and mayhem if a new European leader emerges with a haplogroup E1b1b chromosome -- and, yes, I'm joking.

Journal of Molecular Biology Research Vol 1, No 1 (2011)

Haplogroup of the Y Chromosome of Napoléon the First

Gerard Lucotte, Thierry Thomasset, Peter Hrechdakian

Abstract
This paper describes the finding of the determination of the Y-haplogroup of French Emperor Napoléon I (Napoléon Bonaparte). DNA was extracted from two islands of follicular sheaths located at the basis of two of his beard hairs, conserved in the Vivant Denon reliquary. The Y-haplogroup of Napoléon I, determined by the study of 10 NRY-SNPs (non-recombinant Y-single nucleotide polymorphisms), is E1b1b1c1*. Charles Napoléon, the current collateral male descendant of Napoléon I, belongs to this same Y-haplogroup; his Y-STR profile was determined by using a set of 37 NRY-STRs (non-recombinant Y-microsatellites).

Link

Back to (North) Africa (Henn et al. 2012)

A great new paper has just appeared, presenting new data, new conclusions about African prehistory, and new methodologies. I'll have to read it before I comment on it, but since it's open access you can read it for yourselves.

UPDATE I:


The new data are publicly available here, with information about samples here.
The new PCADMIX software is also available.



PLoS Genet 8(1): e1002397. doi:10.1371/journal.pgen.1002397 

Genomic Ancestry of North Africans Supports Back-to-Africa Migrations 

Brenna Henn et al.

 North African populations are distinct from sub-Saharan Africans based on cultural, linguistic, and phenotypic attributes; however, the time and the extent of genetic divergence between populations north and south of the Sahara remain poorly understood. Here, we interrogate the multilayered history of North Africa by characterizing the effect of hypothesized migrations from the Near East, Europe, and sub-Saharan Africa on current genetic diversity. We present dense, genome-wide SNP genotyping array data (730,000 sites) from seven North African populations, spanning from Egypt to Morocco, and one Spanish population. We identify a gradient of likely autochthonous Maghrebi ancestry that increases from east to west across northern Africa; this ancestry is likely derived from “back-to-Africa” gene flow more than 12,000 years ago (ya), prior to the Holocene. The indigenous North African ancestry is more frequent in populations with historical Berber ethnicity. In most North African populations we also see substantial shared ancestry with the Near East, and to a lesser extent sub-Saharan Africa and Europe. To estimate the time of migration from sub-Saharan populations into North Africa, we implement a maximum likelihood dating method based on the distribution of migrant tracts. In order to first identify migrant tracts, we assign local ancestry to haplotypes using a novel, principal component-based analysis of three ancestral populations. We estimate that a migration of western African origin into Morocco began about 40 generations ago (approximately 1,200 ya); a migration of individuals with Nilotic ancestry into Egypt occurred about 25 generations ago (approximately 750 ya). Our genomic data reveal an extraordinarily complex history of migrations, involving at least five ancestral populations, into North Africa.
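The two date estimates in the abstract are quoted both in generations and in years, and the pairs (40 generations, 1,200 ya) and (25 generations, 750 ya) both imply a generation time of 30 years. The snippet below only reproduces that conversion; the 30-year figure is my inference from the abstract's numbers, not a parameter I have checked in the paper.

```python
# Generation length implied by the abstract: 1200 / 40 == 750 / 25 == 30.
YEARS_PER_GENERATION = 30


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert an admixture time in generations to years before present."""
    return generations * years_per_generation


for label, generations in [("West African into Morocco", 40),
                           ("Nilotic into Egypt", 25)]:
    print(label, generations_to_years(generations), "years ago")
```

A different assumed generation time shifts the dates proportionally: at 25 years per generation, the same 40 generations would correspond to 1,000 years.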

Link

January 11, 2012

How people get blue eyes

Genome-wide association studies can uncover links between genetic variants and phenotypes, even in the absence of any knowledge of how these links come about. All it takes is to make a statistical case linking genetic variation with the recorded phenotypic information.
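As an illustration of what "making a statistical case" means for a single variant, here is a sketch of the simplest association test: a Pearson chi-square on a 2x2 table of allele counts in two phenotype groups. The counts are invented for the example, and real genome-wide studies add much more (quality control, correction for population structure, and a stringent significance threshold to account for testing many variants).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 d.f., no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical allele counts: rows are phenotype groups, columns are alleles.
#                 allele 1  allele 2
# group with trait    80        20
# group without       40        60
chi2 = chi_square_2x2(80, 20, 40, 60)
p = p_value_1df(chi2)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

Note that nothing in the test knows or cares why the allele and the trait go together, which is exactly the limitation discussed next.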

This is somewhat unsatisfactory for a couple of reasons. First, we would like to know how cause and effect work, rather than simply observe that they do. Why do some people with certain genetic alleles have blue eyes?

Second, functional studies allow us to predict phenotypes from genotypes. A great number of genetic mutations may cause particular phenotypes, and we are only able to discover associations for the subset of them that happens to exist in a population. Developing knowledge about function, rather than just statistical association, may help us in the future to infer the phenotypes of individuals from the deep past, for whom all non-osteological traces of phenotype have vanished and whose phenotypes may have been affected by genetic variants that are now extinct.

Many human traits are governed by a great number of genes, either through additive effects or through complex interactions. Eye color is an example of a trait whose genetic underpinnings in Caucasoids (other races have eyes that are uniformly brown) have been known for a while. Now a new study shows precisely how genetic mutations disrupt the formation of pigment in melanocytes, resulting in light-pigmented irides.


Genome Res doi:10.1101/gr.128652.111

HERC2 rs12913832 modulates human pigmentation by attenuating chromatin loop formation between a long-range enhancer and the OCA2 promoter

Mijke Visser et al.

Pigmentation of skin, eye and hair reflects some of the most evident common phenotypes in humans. Several candidate genes for human pigmentation are identified, and the SNP rs12913832 has strong statistical association with human pigmentation. It is located within an intron of the non-pigment gene HERC2, 21 kb upstream of the pigment gene OCA2, and the region surrounding rs12913832 is highly conserved among animal species. However, the exact functional role of HERC2 rs12913832 in human pigmentation is unknown. Here we demonstrate that the HERC2 rs12913832 region functions as an enhancer regulating OCA2 transcription. In darkly pigmented human melanocytes carrying the rs12913832 T-allele, we detected binding of the transcription factors HLTF, LEF1 and MITF to the HERC2 rs12913832 enhancer, and a long-range chromatin loop between this enhancer and the OCA2 promoter which leads to elevated OCA2 expression. In contrast, in lightly pigmented melanocytes carrying the rs12913832 C-allele, chromatin-loop formation, transcription factor recruitment and OCA2 expression are all reduced. Hence, we demonstrate that allelic variation of a common non-coding SNP located in a distal regulatory element not only disrupts the regulatory potential of this element but also affects its interaction with the relevant promoter. We provide the key mechanistic insight that allele-dependent differences in chromatin-loop formation (i.e. structural differences in the folding of gene loci) results in differences in allelic gene expression that affects common phenotypic traits. This concept is highly relevant for future studies aiming to unveil the functional basis of genetically-determined phenotypes including diseases.

Link

Lactase persistence in Neolithic Iberia

This is an extremely important study, as it establishes the occurrence of lactase persistence in Neolithic Europe. This invalidates the idea, proposed by some, of a very late (post-Neolithic) introduction of lactase persistence into Europe by a pastoral population from the east, since we now have good evidence for the presence of this trait in a Neolithic sample from Atlantic Europe.

The frequency is higher than in the early Neolithic Linearbandkeramik (where it was absent in the tested samples), and lower than in present-day Basques, although levels of 27% are quite comparable to some modern south European populations. We are unlikely to detect the earliest occurrence of this trait (when it was limited to the original mutant and his descendants, before it conferred a substantial advantage in digesting milk), but the new findings represent a new non-zero data point in the time series, which will certainly fill up as more points in space and time are tested.
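How firmly a single data point pins down the frequency depends on the sample size. As a rough sketch, the snippet below computes a 95% Wilson score interval for a proportion of 27% in 26 individuals (the number typed according to the abstract). It treats the 26 as independent draws and the 27% as a proportion of individuals, both of which are simplifying assumptions on my part; if the figure is an allele frequency, the relevant n would be the number of chromosomes instead.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(0.27, 26)
print(f"27% of 26: 95% interval roughly {low:.0%} to {high:.0%}")
```

Under these assumptions the interval is wide, which is why additional points in space and time matter more than the exact value of any single one.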

European Journal of Human Genetics advance online publication 11 January 2012; doi: 10.1038/ejhg.2011.254

Low prevalence of lactase persistence in Neolithic South-West Europe

Theo S Plantinga et al.

The ability of humans to digest the milk component lactose after weaning requires persistent production of the lactose-converting enzyme lactase. Genetic variation in the promoter of the lactase gene (LCT) is known to be associated with lactase production and is therefore a genetic determinant for either lactase deficiency or lactase persistence during adulthood. Large differences in this genetic trait exist between populations in Africa and the Middle-East on the one hand, and European populations on the other; this is thought to be due to evolutionary pressures exerted by consumption of dairy products in Neolithic populations in Europe. In this study, we have investigated lactase persistence of 26 out of 46 individuals from Late Neolithic through analysis of ancient South-West European DNA samples, obtained from two burials in the Basque Country originating from 5000 to 4500 YBP. This investigation revealed that these populations had an average frequency of lactase persistence of 27%, much lower than in the modern Basque population, which is compatible with the concept that Neolithic and post-Neolithic evolutionary pressures by cattle domestication and consumption of dairy products led to high lactase persistence in Southern European populations. Given the heterogeneity in the frequency of the lactase persistence allele in ancient Europe, we suggest that in Southern Europe the selective advantage of lactose assimilation in adulthood most likely took place from standing population variation, after cattle domestication, at a post-Neolithic time when fresh milk consumption was already fully adopted as a consequence of a cultural influence.

Link