November 30, 2011

ChromoPainter and fineSTRUCTURE

The site gives pretty good information on this, although parts of it are still under construction. Link to paper and supporting information.

The authors use linkage information to discover fine-scale population structure. It should be noted, however, that while haplotype information (i.e., the co-inheritance of marker states at the local genomic level) does provide additional information, the inference of fine-scale population structure does not depend on it: such structure can be obtained without it.

For example, a year ago, I showed how it is possible to infer K=64 clusters from the HGDP panel without using any linkage information, while setting the maximum possible number of clusters at 70. More recently, I inferred 20 clusters in North-Central Europe alone, 42 clusters in Africa alone, and even 124 clusters in my most ambitious run yet (out of a 150 maximum considered).

It should be noted that in all these experiments, the maximum number of clusters considered plays a significant role, as does the number of PCA/MDS dimensions considered, since MCLUST only finds the optimal number of clusters within the given limits. One of these days, I will try a mega-Clusters Galore exercise, with e.g., 100 MDS dimensions and 250 maximum number of clusters. This may take a while to run, but it will show the limits of the Clusters Galore approach.
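To make the point about limits concrete, here is a toy sketch. It is not MCLUST (which is an R package working on many dimensions); it is a bare-bones one-dimensional Gaussian mixture fitted by EM, with the number of clusters picked by BIC. The data, the function names (fit_gmm_1d, select_k) and all parameter values are made up for illustration.

```python
import math
import random

def fit_gmm_1d(xs, k, iters=60, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM; return the log-likelihood."""
    n = len(xs)
    srt = sorted(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    varis = [max(var_all, var_floor)] * k
    weights = [1.0 / k] * k
    loglik = float("-inf")
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        loglik = 0.0
        for x in xs:
            logs = [math.log(weights[j])
                    - 0.5 * math.log(2 * math.pi * varis[j])
                    - (x - means[j]) ** 2 / (2 * varis[j])
                    for j in range(k)]
            top = max(logs)
            total = top + math.log(sum(math.exp(v - top) for v in logs))
            loglik += total
            resp.append([math.exp(v - total) for v in logs])
        # M-step: update weights, means and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            varis[j] = max(var_j, var_floor)
    return loglik

def select_k(xs, max_k):
    """Return the number of clusters in 1..max_k with the lowest BIC."""
    n = len(xs)
    best_k, best_bic = None, float("inf")
    for k in range(1, max_k + 1):
        n_params = 3 * k - 1  # k means, k variances, k-1 free weights
        bic = -2 * fit_gmm_1d(xs, k) + n_params * math.log(n)
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k

# Made-up data: four well-separated groups of 50 points each.
rng = random.Random(0)
data = [rng.gauss(center, 1.0) for center in (0, 10, 20, 30) for _ in range(50)]

capped = select_k(data, 2)  # the search is not allowed to go past 2
roomy = select_k(data, 8)   # the search may use up to 8 clusters
print("max 2 ->", capped, "| max 8 ->", roomy)
```

With the ceiling at 2 the answer can never exceed 2, however many groups are really there; only the roomier search has a chance of finding them. The same caveat applies to the number of dimensions fed to the clustering step.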

While I do think that using haplotype information may add some extra power for ancestry inference, it should be properly compared against MCLUST over PCA/MDS, i.e., a state-of-the-art clustering algorithm that has been shown to infer fine-scale structure without using any haplotype information. The claim that such structure "is only captured by the haplotype-based approach" is premature.

(UPDATE Jan 17, 2012) Lawson and Falush have compared MCLUST with fineSTRUCTURE here.

Inference of population structure using dense haplotype data

Daniel John Lawson et al.

The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in unprecedented detail, but presents new statistical challenges. We propose a novel inference framework that aims to efficiently capture information on population structure provided by patterns of haplotype similarity. Each individual in a sample is considered in turn as a recipient, whose chromosomes are reconstructed using chunks of DNA donated by the other individuals. Results of this ‘chromosome painting’ can be summarized as a ‘coancestry matrix’, which directly reveals key information about ancestral relationships among individuals. If markers are viewed as independent, we show that this matrix almost completely captures the information used by both standard Principal Components Analysis (PCA), and model-based approaches such as STRUCTURE, in a unified manner. Furthermore, when markers are in linkage disequilibrium, the matrix combines information across successive markers to increase the ability to discern fine-scale population structure using PCA. In parallel, we have developed an efficient model-based approach to identify discrete populations using this matrix, which offers advantages over PCA in terms of interpretability, and over existing clustering algorithms in terms of speed, number of separable populations, and sensitivity to subtle population structure. We analyse Human Genome Diversity Panel data for 938 individuals and 641,000 markers, and identify 226 populations reflecting differences on continental, regional, local and family scales. We present multiple lines of evidence that whilst many methods capture similar information among strongly differentiated groups, more subtle population structure in human populations is consistently present at a much finer level than currently available geographic labels, and is only captured by the haplotype-based approach. The software used for this article, ChromoPainter and fineSTRUCTURE are available from
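As a rough illustration of the "recipient copies chunks from donors" idea in the abstract, here is a toy sketch. It is emphatically not ChromoPainter's algorithm: I simply let each recipient greedily copy the longest exactly matching stretch from any other haplotype and tally how many sites came from each donor. The haplotypes, the function name paint, and the use of copied length as the summary are all assumptions made for illustration.

```python
def paint(haps):
    """Return {recipient: {donor: sites copied}} using greedy longest exact matches."""
    names = list(haps)
    length = len(haps[names[0]])
    coancestry = {r: {d: 0 for d in names if d != r} for r in names}
    for r in names:
        pos = 0
        while pos < length:
            best_donor, best_len = None, 0
            for d in names:
                if d == r:
                    continue
                # Length of the exact match between donor and recipient from pos.
                run = 0
                while pos + run < length and haps[d][pos + run] == haps[r][pos + run]:
                    run += 1
                if run > best_len:
                    best_donor, best_len = d, run
            if best_donor is None:
                pos += 1  # no donor matches this site; leave it unpainted
                continue
            coancestry[r][best_donor] += best_len
            pos += best_len
    return coancestry

# Made-up haplotypes: a1/a2 are near-identical, as are b1/b2.
haps = {
    "a1": "0000011111",
    "a2": "0000011110",
    "b1": "1111100000",
    "b2": "1111100001",
}
matrix = paint(haps)
for recipient, row in matrix.items():
    print(recipient, row)
```

Even in this crude version, each haplotype copies almost everything from its near-twin, so the resulting matrix groups a1 with a2 and b1 with b2. Treating every marker independently would discard the information carried by the length of those shared stretches.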

November 29, 2011

Aurignacian in Greece >40 thousand years ago

Antiquity Volume: 85 Number: 330 Page: 1131–1150

Franchthi Cave revisited: the age of the Aurignacian in south-eastern Europe

K. Douka et al.

The Aurignacian, traditionally regarded as marking the beginnings of Sapiens in Europe, is notoriously hard to date, being almost out of reach of radiocarbon. Here the authors return to the stratified sequence in the Franchthi Cave, chronicle its lithic and shell ornament industries and, by dating humanly-modified material, show that Franchthi was occupied either side of the Campagnian Ignimbrite super-eruption around 40000 years ago. Along with other results, this means that groups of Early Upper Palaeolithic people were active outside the Danube corridor and Western Europe, and probably in contact with each other over long distances.


Oldest North African rock art

Antiquity Volume: 85 Number: 330 Page: 1184–1193

First evidence of Pleistocene rock art in North Africa: securing the age of the Qurta petroglyphs (Egypt) through OSL dating

Dirk Huyge et al.

Long doubted, the existence of Pleistocene rock art in North Africa is here proven through the dating of petroglyph panels displaying aurochs and other animals at Qurta in the Upper Egyptian Nile Valley. The method used was optically stimulated luminescence (OSL) applied to deposits of wind-blown sediment covering the images. This gave a minimum age of ~15 000 calendar years making the rock engravings at Qurta the oldest so far found in North Africa.


November 28, 2011

Sephardic signature within mtDNA haplogroup T (?)

This paper proposes that "The haplotype of a suspected Sephardic origin has mutations 16114T-16126T-16153A-16192T-16294T-16519C in the first control region of mitochondrial DNA."

From the paper:
four avenues are pursued: (1) A search is conducted throughout multiple databases of the first control region of mitochondrial DNA for the T2e5 motif to ascertain the prevalence and geographic affiliation of the new haplotype. (2) One T2e5 sample is sequenced for polymorphisms along the entire mitochondrial DNA and compared with T2e sequences to identify any potential coding region mutations that are important for the Sephardic sequence and its relation to other branches. (3) A phylogenetic tree is built from T2e control sequences to provide further information on the relation among lineages including the Sephardic cluster. Although full genomic sequences are usually preferable to avoid misclassifications based on control region information alone, T2e is an ideal subhaplogroup to exploit the more abundant control region data because it is defined by mutations in the control regions alone. Time to the most recent common ancestor is estimated to address questions of when the lineage emerged as well as where. (4) The frequencies of T sub-haplogroups are compared across growing published literature of various populations including from Europe, the Americas, and the Near East. Although the geographic distribution of haplogroup T has been investigated, less is known about the different subhaplogroups, especially T2e.
With respect to (1), the author writes:
The combined databases do not appear to have any biases for Iberia, Mexico, or Sephardim.
This is a rather weak claim, since the incidence of a haplotype in a given dataset depends on the relative number of samples of the different populations, and Sephardic Jews are indeed over-represented in the database searches relative to their actual population numbers. In any case, no explicit test of bias was performed.

Stronger evidence for the Sephardic-ness of the haplotype in question could be arrived at by dating it to a period consistent with the origins of that population. However:
Time estimates to the most recent common ancestor of the Sephardic signature T2e5 ranged all the way from after the expulsion – clearly impossible – to 415 000 years before present (YBP) (Fast: 338 YBP, 95% confidence interval (95% CI)=present to 763 YBP; Intermediate: 688 YBP, 95% CI=present-3820 YBP; slow: 6811 YBP, CI1=present to 15 245). Given mutations rates that vary by two orders of magnitude,22 as well as other issues with mutation rates and the rho statistic,23,41 at present coalescence analysis cannot be used to distinguish between different plausible timelines for the proposed Sephardic cluster.
The third piece of evidence in favor of the hypothesis of this paper is the frequency of the parent haplogroup T2e relative to T2b. This is, however, irrelevant, since mtDNA haplogroup T2e has been found in prehistoric European hunter-gatherers, so its higher frequency in Saudi Arabia today does not indicate that its presence in Europe was effected in historical times, e.g., by Jews.

Moreover, higher frequency -in itself- does not indicate the direction of gene flow. Suppose that a particular haplogroup occurs at a frequency of 50% in a population A of 10 million that lives 2,000 miles away, and at a frequency of 10% in a population B of 500 million that lives 500 miles away. Clearly, population B is a much better source of the haplogroup than A, despite its lower frequency.
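The arithmetic behind this example can be spelled out in a few lines. This is only a sketch: the dictionary layout and the function name carriers are mine, and the absolute number of carriers is a rough proxy for how plausible a population is as a source, not a model of gene flow.

```python
# The two hypothetical populations from the example above.
populations = {
    "A": {"size": 10_000_000, "freq_pct": 50, "miles": 2000},
    "B": {"size": 500_000_000, "freq_pct": 10, "miles": 500},
}

def carriers(pop):
    """Absolute number of people carrying the haplogroup (integer arithmetic)."""
    return pop["size"] * pop["freq_pct"] // 100

counts = {name: carriers(pop) for name, pop in populations.items()}
for name, pop in populations.items():
    print(f"{name}: {counts[name]:,} carriers, {pop['miles']} miles away")
```

Population B has ten times as many carriers as A (50 million against 5 million) and lives at a quarter of the distance, which is why frequency alone says little about the direction of gene flow.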

The final piece of evidence produced by the author:
The small T2e5 cluster satisfies criteria for being a signature. Although it is premature to set specific thresholds of a signature, a sample of 25% known Sephardic and 50% suspicion of Sephardic origin is overwhelmingly above what would be expected for a general European haplogroup.
On the contrary, T2e5 is found in Latin America (including Brazil), Iberia, and among Sephardic Jews who trace their ancestry to Iberia. Hence, if there is anything "in common" between the current T2e5 population, it is the geographical background of Iberia.

Strong evidence for the specific Jewish origin of T2e5 would be provided if it turned up in a different Jewish population. In that case, it could be well argued that this was indeed a lineage of Jewish origin that happened (for whatever reason) to become more frequent in the Sephardic population. On the contrary, the absence of T2e5 in non-Sephardic Jews suggests that this is not necessarily a Jewish-origin lineage.

In conclusion: this paper represents a valiant attempt to identify a Sephardic signature, but I remain unconvinced that a strong enough case for T2e5 being such a signature has been made. The evidence appears to be consistent with that hypothesis, but not sufficient to reject alternatives, namely that this represents a European founder in the Sephardic population. Indeed, the author honestly admits that the origin of the "Sephardic signature" remains elusive:
These include Jewish settlers seeking asylum after destruction of temples in Jerusalem by Romans and Babylonians 2000–2500 years ago, slightly earlier Jewish settlers in Iberia,7,43 non-Jewish Muslims in the dispersal of Islam 1000+ years ago, non-Jewish Iberian peopling 2500+ years ago that predates all Jewish influx,44 and settlers in Iberia (or Italy) 45000 years ago that entirely predate the existence of Jewish groups. Thus, what is arguably the most contentious issue of whether there is genetic evidence of original Jewish DNA for the Sephardic line cannot be resolved.
Does it matter whether the line was originally Jewish or not? Not in the grand scheme of things, but it is certainly important for genealogists: if it was originally Jewish then e.g., Latin Americans who belong to it must seek Sephardic Jewish ancestors; if it was pre-Jewish Iberian, then they may or may not have such ancestors.

PS: A minor mistake in the paper is the identification of a Sephardic sample as coming from "Salonica, Turkey". Salonica has, of course, never been part of Turkey: it was part of the Ottoman Empire and is now part of Greece. Fortunately Salonica is prominent enough to avoid confusion, but it's always a good idea to use appropriate terminology when referring to placenames.

European Journal of Human Genetics advance online publication 23 November 2011; doi: 10.1038/ejhg.2011.200

Sephardic signature in haplogroup T mitochondrial DNA

Felice L Bedford

A rare combination of mutations within mitochondrial DNA subhaplogroup T2e is identified as affiliated with Sephardic Jews, a group that has received relatively little attention. Four investigations were pursued: Search of the motif in 250 000 control region records across 8 databases, comparison of frequencies of T subhaplogroups (T1, T2b, T2c, T2e, T4, T*) across 11 diverse populations, creation of a phylogenic median-joining network from public T2e control region entries, and analysis of one Sephardic mitochondrial full genomic sequence with the motif. It was found that the rare motif belonged only to Sephardic descendents (Turkey, Bulgaria), to inhabitants of North American regions known for secret Spanish–Jewish colonization, or were consistent with Sephardic ancestry. The incidence of subhaplogroup T2e decreased from the Western Arabian Peninsula to Italy to Spain and into Western Europe. The ratio of sister subhaplogroups T2e to T2b was found to vary 40-fold across populations from a low in the British Isles to a high in Saudi Arabia with the ratio in Sephardim more similar to Saudi Arabia, Egypt, and Italy than to hosts Spain and Portugal. Coding region mutations of 2308G and 14499T may locate the Sephardic signature within T2e, but additional samples and reworking of current T2e phylogenetic branch structure is needed. The Sephardic Turkish community has a less pronounced founder effect than some Ashkenazi groups considered singly (eg, Polish), but other comparisons of interest await comparable averaging. Registries of signatures will benefit the study of populations with a large number of smaller-size founders.


November 27, 2011

Ancient mtDNA from Cardial culture (early Neolithic Iberia)

Molecular Ecology DOI: 10.1111/j.1365-294X.2011.05361.x

Ancient DNA from an Early Neolithic Iberian population supports a pioneer colonization by first farmers

C. GAMBA et al.

The Neolithic transition has been widely debated particularly regarding the extent to which this revolution implied a demographic expansion from the Near East. We attempted to shed some light on this process in northeastern Iberia by combining ancient DNA (aDNA) data from Early Neolithic settlers and published DNA data from Middle Neolithic and modern samples from the same region. We successfully extracted and amplified mitochondrial DNA from 13 human specimens, found at three archaeological sites dated back to the Cardial culture in the Early Neolithic (Can Sadurní and Chaves) and to the Late Early Neolithic (Sant Pau del Camp). We found that haplogroups with a low frequency in modern populations—N* and X1—are found at higher frequencies in our Early Neolithic population (∼31%). Genetic differentiation between Early and Middle Neolithic populations was significant (FST∼0.13, P < 10^-5), suggesting that genetic drift played an important role at this time. To improve our understanding of the Neolithic demographic processes, we used a Bayesian coalescence-based simulation approach to identify the most likely of three demographic scenarios that might explain the genetic data. The three scenarios were chosen to reflect archaeological knowledge and previous genetic studies using similar inferential approaches. We found that models that ignore population structure, as previously used in aDNA studies, are unlikely to explain the data. Our results are compatible with a pioneer colonization of northeastern Iberia at the Early Neolithic characterized by the arrival of small genetically distinctive groups, showing cultural and genetic connections with the Near East.


Y-haplogroup O3 in the eastern Himalayas

From the paper:
The samples were typed through seven panels of 75 single nucleotide polymorphisms (SNPs), as listed in the latest Y-chromosome phylogenetic tree (Karafet et al., 2008). The panels were organized as follows: Panel 1 (within Haplogroup O), M175, M119, P203, M110, M268, P31, M95, M176, M122, M324, M121, P201, M7, M134, M117, 002611, P164, L127 (rs17269396), KL1 (rs17276338); Panel 2 (non-Haplogroup O), M130, P256, M1, M231, M168, M174, M45, M89, M272, M258, M242, M207, M9, M96, P125, M304, M201, M306; Panel 3 (Haplogroup C), M217; Panel 4 (Haplogroup D), P47, N1, P99, M15, M125, M55, M64.1, M116.1, M151, N2, 022457; Panel 5 (Haplogroup N), M214, LLY22g, M128, M46/Tat, P63, P119, P105, P43, M178; Panel 6 (Haplogroup R), M306, M173, M124, M420, SRY10831.2, M17, M64.2, M198, M343, V88, M458, M73, M434, P312, M269, U106/M405; Panel 7 (Haplogroup Q), P36.2.
Wikipedia article on Luoba and Deng.

Annals of Human Genetics DOI: 10.1111/j.1469-1809.2011.00690.x

Y-chromosome O3 Haplogroup Diversity in Sino-Tibetan Populations Reveals Two Migration Routes into the Eastern Himalayas

Longli Kang et al.

The eastern Himalayas are located near the southern entrance through which early modern humans expanded into East Asia. The genetic structure in this region is therefore of great importance in the study of East Asian origins. However, few genetic studies have been performed on the Sino-Tibetan populations (Luoba and Deng) in this region. Here, we analyzed the Y-chromosome diversity of the two populations. The Luoba possessed haplogroups D, N, O, J, Q, and R, indicating gene flow from Tibetans, as well as the western and northern Eurasians. The Deng exhibited haplogroups O, D, N, and C, similar to most Sino-Tibetan populations in the east. Short tandem repeat (STR) diversity within the dominant haplogroup O3 in Sino-Tibetan populations showed that the Luoba are genetically close to Tibetans and the Deng are close to the Qiang. The Qiang had the greatest diversity of Sino-Tibetan populations, supporting the view of this population being the oldest in the family. The lowest diversity occurred in the eastern Himalayas, suggesting that this area was an endpoint for the expansion of Sino-Tibetan people. Thus, we have shown that populations with haplogroup O3 moved into the eastern Himalayas through at least two routes.

November 26, 2011

mtDNA of Venezuelans

Am J Phys Anthropol DOI: 10.1002/ajpa.21629

A melting pot of multicontinental mtDNA lineages in admixed Venezuelans

Alberto Gómez-Carballa et al.

The arrival of Europeans in Colonial and post-Colonial times coupled with the forced introduction of sub-Saharan Africans have dramatically changed the genetic background of Venezuela. The main aim of the present study was to evaluate, through the study of mitochondrial DNA (mtDNA) variation, the extent of admixture and the characterization of the most likely continental ancestral sources of present-day urban Venezuelans. We analyzed two admixed populations that have experienced different demographic histories, namely, Caracas (n = 131) and Pueblo Llano (n = 219). The native American component of admixed Venezuelans accounted for 80% (46% haplogroup [hg] A2, 7% hg B2, 21% hg C1, and 6% hg D1) of all mtDNAs; while the sub-Saharan and European contributions made up ∼10% each, indicating that Trans-Atlantic immigrants have only partially erased the native American nature of Venezuelans. A Bayesian-based model allowed the different contributions of European countries to admixed Venezuelans to be disentangled (Spain: ∼38.4%, Portugal: ∼35.5%, Italy: ∼27.0%), in good agreement with the documented history. Seventeen entire mtDNA genomes were sequenced, which allowed five new native American branches to be discovered. B2j and B2k, are supported by two different haplotypes and control region data, and their coalescence ages are 3.9 k.y. (95% C.I. 0–7.8) and 2.6 k.y. (95% C.I. 0.1–5.2), respectively. The other clades were exclusively observed in Pueblo Llano and they show the fingerprint of strong recent genetic drift coupled with severe historical consanguinity episodes that might explain the high prevalence of certain Mendelian and complex multi-factorial diseases in this region.


November 25, 2011

42,000-year old fishermen from East Timor

From the New Scientist:
The new finds blow that record out of the water. Sue O'Connor at the Australian National University in Canberra and colleagues dug through deposits at the Jerimalai shelter in East Timor. They discovered 38,000 fish bones from 23 different taxa, including tuna and parrotfish that are found only in deep water. Radiocarbon dating revealed the earliest bones were 42,000 years old.

Amidst the fishy debris was a broken fish hook fashioned from shell, which the team dated to between 16,000 and 23,000 years. "This is the earliest known example of a fish hook," says O'Connor. Another hook, made around 11,000 years ago, was also found.

Sandra Bowdler at the University of Western Australia in Perth, who was not involved in the study, is convinced that those colonising East Timor 42,000 years ago had "fully formed" fishing skills. "By this time, modern humans are assumed to have the same mental capacities as today," she says.

"There is nothing like this anywhere else in the world," says Ian McNiven of Monash University in Melbourne, who was not a member of O'Connor's team. "Maybe this is the crucible for fishing."

"The fish hooks appear at 20,000 years ago, but we do have literally thousands of bones from tuna, large tuna 50cm or more in size, which is a species not readily caught from the shore," he said.

"Really you have to be out in boats ... We don't know how they were caught, these earliest tuna ... It could be that people were using fish hooks 42,000 years old, we just don't know."

The research paper draws on archaeological research from Southeast Asia and Oceania, which suggests "the parrotfish and unicornfish would likely have been caught by netting or spearing, whereas trevallies, triggerfish, snappers, emperors, and groupers are most commonly captured by angling using a baited hook".

While the single-piece baited hooks do not seem suitable for pelagic fishing, the study authors suggest other types of hooks would have developed at the same time.

Some of the fish bones were scarred with marks that could have been made by fine barbs for fish spears, or complex hooks used for trolling.

So far, no artefacts related to netting have been recovered, but the manufacture of strong fibre line is implied by the hooks.
Those Upper Paleolithic people never cease to amaze.

Science 25 November 2011:
Vol. 334 no. 6059 pp. 1117-1121
DOI: 10.1126/science.1207703

Pelagic Fishing at 42,000 Years Before the Present and the Maritime Skills of Modern Humans

Sue O’Connor, Rintaro Ono, Chris Clarkson


By 50,000 years ago, it is clear that modern humans were capable of long-distance sea travel as they colonized Australia. However, evidence for advanced maritime skills, and for fishing in particular, is rare before the terminal Pleistocene/early Holocene. Here we report remains of a variety of pelagic and other fish species dating to 42,000 years before the present from Jerimalai shelter in East Timor, as well as the earliest definite evidence for fishhook manufacture in the world. Capturing pelagic fish such as tuna requires high levels of planning and complex maritime technology. The evidence implies that the inhabitants were fishing in the deep sea.


November 23, 2011

Dogs domesticated in East Asia after all?

I'm not holding my breath that this will be the final chapter of the dog domestication saga. Previous installments:
Press release of current study:
Data on genetics, morphology and behaviour show clearly that dogs are descended from wolves, but there's never been scientific consensus on where in the world the domestication process began. "Our analysis of Y-chromosomal DNA now confirms that wolves were first domesticated in Asia south of Yangtze River -- we call it the ASY region -- in southern China or Southeast Asia", Savolainen says.

The Y data supports previous evidence from mitochondrial DNA. "Taken together, the two studies provide very strong evidence that dogs originated in the ASY region", Savolainen says.

Archaeological data and a genetic study recently published in Nature suggest that dogs originate from the Middle East. But Savolainen rejects that view. "Because none of these studies included samples from the ASY region, evidence from ASY has been overlooked," he says.

Peter Savolainen and PhD student Mattias Oskarsson worked with Chinese colleagues to analyse DNA from male dogs around the world. Their study was published in the scientific journal Heredity.
The paper is open access, so you can make up your own mind on whether or not this seals the case.

Heredity advance online publication 23 November 2011; doi: 10.1038/hdy.2011.114

Origins of domestic dog in Southern East Asia is supported by analysis of Y-chromosome DNA

Z-L Ding et al.

Global mitochondrial DNA (mtDNA) data indicates that the dog originates from domestication of wolf in Asia South of Yangtze River (ASY), with minor genetic contributions from dog–wolf hybridisation elsewhere. Archaeological data and autosomal single nucleotide polymorphism data have instead suggested that dogs originate from Europe and/or South West Asia but, because these datasets lack data from ASY, evidence pointing to ASY may have been overlooked. Analyses of additional markers for global datasets, including ASY, are therefore necessary to test if mtDNA phylogeography reflects the actual dog history and not merely stochastic events or selection. Here, we analyse 14 437 bp of Y-chromosome DNA sequence in 151 dogs sampled worldwide. We found 28 haplotypes distributed in five haplogroups. Two haplogroups were universally shared and included three haplotypes carried by 46% of all dogs, but two other haplogroups were primarily restricted to East Asia. Highest genetic diversity and virtually complete phylogenetic coverage was found within ASY. The 151 dogs were estimated to originate from 13–24 wolf founders, but there was no indication of post-domestication dog–wolf hybridisations. Thus, Y-chromosome and mtDNA data give strikingly similar pictures of dog phylogeography, most importantly that roughly 50% of the gene pools are shared universally but only ASY has nearly the full range of genetic diversity, such that the gene pools in all other regions may derive from ASY. This corroborates that ASY was the principal, and possibly sole region of wolf domestication, that a large number of wolves were domesticated, and that subsequent dog–wolf hybridisation contributed modestly to the dog gene pool.


November 19, 2011

The "Upper Paleolithic" of South Arabia

I came across this interesting book chapter on The "Upper Paleolithic" of South Arabia by Jeffrey Rose and Vitaly Usik. I first became aware of Dr. Rose's work in Southern Arabia when I watched the "Incredible Human Journey" (see Related links below) a couple of years ago. The conclusions of the chapter seem to mesh quite well with some of my recent thoughts about a possible Out-of-Arabia expansion of modern humans, posterior to the earlier Out-of-Africa.

The following figure is instructive:

Notice the super-aridity of MIS 4, circa 70ka BP. This would certainly be an awful time for anyone to move into Arabia. Conversely, if there were anatomically modern people living there prior to MIS 4, the onset of the super-arid phase during MIS 4 would be a great time to get out.

As I mention in my previous post on mtDNA haplogroup L3, I think that the major human expansion associated with haplogroup L3 and its M/N subclades originated in Arabia, and the super-arid MIS 4 phase looks about right for a bottleneck out of which the descendants of only a single woman, the L3 ur-mother, would survive.

From the book chapter:
So, we are able to make a few general observations regarding the Upper Paleolithic found in the southern portions of the peninsula: (1) there are multiple phases of human occupation in South Arabia throughout the latter half of the Upper Pleistocene, (2) there are elements loosely related to the Levantine sequence, however, the South Arabian Upper Paleolithic probably belongs to a unique and locally-derived lithic tradition, (3) there do not appear to be any links with East Africa (with the exception of the Hargeisan) from MIS 4-onward, and (4) assemblages from southern and south-western Arabia are dominated by different laminar-based technologies between 75 and 8 ka.
The Hargeisan is interesting, because it is a possible link of an expansion from Arabia to Africa:
One potentially additional piece of evidence for this hypothesized Near Eastern/Arabian-derived human expansion is the anomalous Hargeisan Industry found in the Horn of Africa. Known from a small number of findspots around Hargeisa (Clark, 1954), Boosasso (Graziosi, 1954) and Midhishi Cave in the Golis Mountains of northern Somalia (Gresham, 1984; Brandt, 1986), the Hargeisan has been found overlying MSA material and beneath LSA occupation layers.
Of course, the political situation in Somalia may suggest that scientists won't be studying the Hargeisan anytime soon.

More from the book chapter:
From an archaeological perspective, Straus and Bar-Yosef (2001: 2) entertain the same possibility: “there is, however, no reason a priori to exclude the possibility that intercontinental contacts occurred on a two-way street, especially at Suez, via Sinai, or across the shallow Bab al Mandab, so close to that corridor to sub-Saharan Africa, the Nile.” Marks (2005) and Otte et al. (2007) envisage similar scenarios during the MP/UP transitions in the Near East and Zagros regions. Both scholars argue that the archaeological evidence from Eastern Europe and Western Asia indicate the expansion of European UP technologies radiated from these areas, rather than Africa, during early MIS 3. Echoing this proposition from a biological perspective, Schillaci (2008) proposes the spread of Levantine-derived peoples into Australasia between 60 and 40 ka based on fossil evidence and phylogenetic relationships between populations.
We maintain that the evidence from Arabia indicates the post-MIS 4 human expansion did not originate in sub-Saharan Africa; rather, early modern humans have emerged from a geographic range encompassing areas of northeast Africa, Western Asia, Arabia, and South Asia. These populations would have been forced to contract into environmentally stable refugia around Arabia such as the Ur-Schatt River Valley, coastal oases, Yemeni Highlands, and/or the Dhofar Mountains during climatic downturns. As such, the fluctuating dynamic between landscape carrying capacity and population density may have been a critical mechanism driving early human dispersals from the region. Episodes of climate change caused large portions of the Arabian peninsula to become uninhabitable due to such calamities as the inundation of the emerged continental shelf and desertification throughout the interior. Given the potential importance of these once favorable, now uninhabitable zones, future investigations in and around Arabia should endeavor to explore the heart of the desert and bottom of the sea.


November 18, 2011

Age of mtDNA haplogroup L3: about 70 thousand years

There are two aspects to this paper: first, it appears to be a solid attempt at inferring the age of mtDNA haplogroup L3. This haplogroup contains several subclades, including M and N, the two macrohaplogroups of the vast majority of Eurasians.

I am usually skeptical of very tight age estimates, but there appear to be no obvious flaws in the paper, and alternative mutation rates are used to derive the 70ka bound. Moreover, the 70ka age is consistent with what appears to be no longer in doubt, namely the arrival of fully anatomically and behaviorally modern humans all over the Old World, starting from the 50-40ka period.

The second aspect of this paper is its claim that pre-70ka dispersals are irrelevant to modern human origins. Indeed, if the early anatomically modern humans from the Levant (Qafzeh/Skhul) or the pre-Toba layers in Asia were ascribed to Out-of-Africa humans, then we would expect their genetic differentiation with East African mtDNA to trace back to Marine Isotope Stage 5 (~130-75ka), and indeed to its early stages, to account for the Mount Carmel hominins.

So, have we solved the Out-of-Africa riddle? Did the Out-of-Africa expansion take place after 70ka? I don't think so, not because there is anything wrong with the mtDNA age, but because the competing hypothesis, which is rarely, if ever, discussed, is that there was an Into-Africa event post-70ka.

mtDNA furnishes the best evidence that humans trace their ultimate origins to Africa, since L3, of which M and N are subclades, is a young twig of the mtDNA phylogeny. As the authors of the current paper note:
Although the tree is highly starlike at shallower time depths, suggesting numerous episodes of rapid growth in the human population in the more recent past, it is only at a third of the time depth of the entire tree with the emergence of the L3 haplogroup that the first multifurcating them all the ancient diversity observed outside Africa) (Behar et al. 2008; Torroni et al. 2006; Watson et al. 1997).
Whatever humans were doing between ~200ka (when the first anatomically modern specimen is found in Ethiopia, and when the mtDNA phylogeny coalesces) and ~70ka (when the L3 node does), they were certainly not yet in the overdrive mode we find them in c. 50ka, when they begin making their grand entrance all over the surface of the planet.

So, while the ultimate roots of modern mankind are in Africa, there is no clear picture -yet- whether the post-70ka major expansion of humans originated in Africa. Certainly, it cannot have originated too far from it, because non M and N mtDNA is virtually absent throughout most of the world. But, it is not possible, yet, to exclude a Near Eastern post-70ka expansion that would make the ~100ka Levantine hominins ancestral to most modern humans, rather than irrelevant sidebranches.

There are several reasons why this may be the case:
  1. East African L3 subclades are found in Arabia, where one finds a rich assortment of basal N subclades, as well as a not insignificant amount of M. These are often dismissed as the result of recent introgression, but they could in fact, and in part, be remnants of an older population, perhaps associated with the Persian Gulf Oasis hypothesis, and certainly absorbed by J1-bearing Arabian ancestors from further north.
  2. The Y-chromosome phylogeny has no clear signal of Out-of-Africa ~70ka. On the contrary, Eurasia possesses DE*, D and E haplogroups, as well as CF, the major human lineage, with C being totally Asian. While Africa possesses the oldest Y-chromosome lineages (basal to CT), the evidence tilts towards Asia being the homeland of CT, which has the closest parallels to a post-70ka event.
  3. Finally, Africa, including East Africa, shows, at present, no sign of the presence of fully modern humans at the crucial time period. We do have, of course, Omo ~195ka, crucial anatomically modern humans in Ethiopia, but no clear sign of a bubbling volcano of a population c. 70ka ready to erupt onto the Eurasian landmass.
At present, I consider that the hypothesis that the recent post-70ka expansion of modern humans was initiated in the Near East cannot be dismissed. The evidence seems ambiguous, since Eurasia may have a better case for such an expansion in Y-chromosomes, while Africa may have a better case in mtDNA (since it has more basal L3 clades than Eurasia).

A better characterization of Near Eastern mtDNA, especially from Arabia, as well as increased archaeological/palaeoanthropological investigations in East Africa/the Near East/South Asia are needed to finally uncover the material counterpart of the major human expansion that is written in our genes.

A third aspect of the paper is its claim that the human expansion was linked to climate and not to the emergence of symbolic behavior. I have my own reservations about the whole concept of "symbolic behavior". We do see early evidence of such behavior in Africa, such as at Blombos Cave in South Africa and in North Africa. The authors of the current paper write:
There is an intriguing possible rider to this conclusion. North Africa has been entirely depopulated and repopulated, at least with respect to mtDNA variation (Pereira et al. 2010), since the time of the Aterian industry, where modern symbolic behavior is attested very early, similar to Southern Africa, and in contrast to Eastern Africa (Barton et al. 2009). We might therefore contemplate a possible North Africa ancestry for L3, with its rapid radiation corresponding to an early range expansion into Eastern Africa. However, any potential dispersal between the Mediterranean and the Horn of Africa around the time of the MIS4/3 transition would face severe environmental difficulties, unlike the “green Sahara” conditions of MIS5 and the early Holocene (Drake et al. 2010). We therefore conclude that an indigenous origin for L3 in Eastern Africa remains by far the most likely scenario.

As Mellars (2006) has argued, the early evidence for symbolically mediated behavior in both North and Southern Africa rules out any simple direct link for the expansion of L3 to (Ambrose 1998; Watson et al. 1997). Evidence of engraved ochre now extends back to at least 100 ka (Henshilwood et al. 2009), Nassarius marine shell beads were evidently present across the range of early modern humans from Southern Africa to North Africa and the Levant before 80 ka – possibly tens of thousands of years earlier (Barton et al. 2009; Bouzouggar et al. 2007; d'Errico et al. 2009; Mellars 2006; Vanhaeren et al. 2006) – and evidence for burial ritual is found in early modern humans in the Levant dating to 90–110 ka (Mellars 2006; Shea 2008). Thus, as suggested by Basell (2008) the demographic expansions that led to the first successful dispersal out of Africa seem better explained by the play of palaeoenvironmental forces than by recourse to the advantages of “modernity”.
The absence of markers of behavioral modernity in East Africa at the crucial time seems puzzling. Climate may have caused Out-of-East-Africa, but why would Out-of-East-Africans without clear signs of behavioral modernity be able to outcompete the "behaviorally modern" people of North/South Africa and the Levant? This observation, coupled with the absence of any clear identifiable palaeoanthropological population in East Africa at the time in question raises my unease about this scenario.

Moreover, while we can definitely ascribe symbolic thinking to the cases mentioned in the quoted text, these may represent precursors, and not the full "package" of behaviors that allowed (or even prompted) our ancestors to spread around the planet around the middle of the last 100,000 years.

Mol Biol Evol (2011) doi: 10.1093/molbev/msr245

The expansion of mtDNA haplogroup L3 within and out of Africa

Pedro Soares et al.

Although fossil remains show that anatomically modern humans dispersed out of Africa into the Near East ∼100–130 ka, genetic evidence from extant populations has suggested that non-Africans descend primarily from a single successful later migration. Within the human mtDNA tree, haplogroup L3 encompasses not only many sub-Saharan Africans but also all ancient non-African lineages, and its age therefore provides an upper bound for the dispersal out of Africa. An analysis of 369 complete African L3 sequences places this maximum at ∼70 ka, virtually ruling out a successful exit before 74 ka, the date of the Toba volcanic super-eruption in Sumatra. The similarity of the age of L3 to its two non-African daughter haplogroups, M and N, suggests that the same process was likely responsible for both the L3 expansion in Eastern Africa and the dispersal of a small group of modern humans out of Africa to settle the rest of the world. The timing of the expansion of L3 suggests a link to improved climatic conditions after ∼70 ka in Eastern and Central Africa, rather than to symbolically mediated behavior, which evidently arose considerably earlier. The L3 mtDNA pool within Africa suggests a migration from Eastern Africa to Central Africa ∼60–35 ka, and major migrations in the immediate postglacial, again linked to climate. The largest population size increase seen in the L3 data is 3–4 ka in Central Africa, corresponding to Bantu expansions, leading diverse L3 lineages to spread into Eastern and Southern Africa in the last 3–2 ka.


November 16, 2011

Armenian Y-chromosomes revisited (Herrera et al. 2011)

Armenian Y-chromosomes have been largely ignored since the publication of the classic Weale et al. (2001) paper a decade ago. The Armenian DNA Project has largely filled the void during the intervening years, but it is nice that the topic is revisited by academics.

Armenia is sandwiched between Anatolia, the Fertile Crescent, the Iranian plateau, the Caucasus, and the Black and Caspian seas, making the study of Armenian Y-chromosomes extremely interesting for the student of Eurasian prehistory.

Gene flow from the surrounding regions may have affected the Armenian population over historical time, but the remoteness of the Armenian highlands, coupled with the national church -- which distinguished Armenians from the Orthodoxy of the Roman Empire, the Zoroastrianism of the Persians, and, later, the Islam of Arabs and Ottomans -- may have prevented it.

My comments on the paper will follow below once I read it.

UPDATE I: The paper spends a lot of time on analysis of Y-STR variance; my opinion of Y-STRs as a tool for inferring past population movements is, to put it mildly, low. When Bahamian Y-STR variance is higher than the African one, and E-V13, one of the youngest European Y-haplogroups (in terms of Y-STR variance), turns up in Spain in one of the earliest ancient DNA samples, it goes without saying that the burden of proof is on those who wish to continue to talk about Neolithic or other population movements to make the assumptions of their models clearer. Nonetheless, there is still some utility in Y-STRs, so I reproduce some tree diagrams from the paper (top left), and link to the supplementary info that has a collection of haplotypes that may be useful to genealogists.

From the paper:
However, owing to the contentions associated with the current calibrations of the Y-STR mutation rates,32,34,35,41 as well as the limitations of the assumptions utilized by the methodologies for time estimations, the absolute dates generated in this study should only be taken as rough estimates of upper bounds.
Indeed. We are at the point where Y-STRs are at the end of their utility, but the replacement technology of extensive Y-chromosome sequencing has not quite arrived in an economical way yet.

I will have some additional thoughts on Y-chromosome distribution in the third update, but, for the time being, the two most important "nuggets" of information are: (i) the unusual haplogroup frequencies in Sasun (high R2 and T), which may be due to a founder effect, but it would be interesting if Armenian historians could find some explanation for their occurrence there, and (ii) the occurrence of R-M269*(xL23) in Ararat Valley. I invite more knowledgeable readers to comment on the issue; the haplotypes are in Table 2 of the supplement.

UPDATE III: The ubiquity of haplogroup G2a in Neolithic Europe, coupled with the absence of other prominent present-day European haplogroups, has important implications about European discontinuity.

But, it also has implications about West Asian discontinuity. The Neolithic in Europe arrived by all accounts from either of two principal areas: Anatolia or the Levant. Today, in Anatolia and the Levant, we see a set of haplogroups of which haplogroup J is the most important and ubiquitous one. Haplogroup R1b is also quite frequent in Armenia, the east Caucasus, Anatolia, and Iran, but its frequency drops dramatically to the east and south. And, there is a whole assortment of other haplogroups with varying frequency.

Why didn't all these non-G2a haplogroups participate in the early Neolithic colonization of Europe? It could very well be that a very small founder population crossed the Aegean into Europe, one that happened to be G2a-dominated. But, that is ultimately not very satisfying: if there was plenty of J and R1b in West Asia at the time of the Neolithic expansion, why are these haplogroups so conspicuous in their absence -at least so far- from Neolithic Europe?

The case of haplogroup J is particularly problematic. If we had to guess, by looking at present-day distribution, which lineage tracks population movements from the Near East to Europe, there is simply no better candidate: every map of this haplogroup, and especially of its J2a sublineage shows an unambiguous pattern of radiation, with a core area consisting of Southern Italy, Greece, Anatolia, West Asia, Mesopotamia and the northern parts of the Levant. All these regions are crucial to the story of the Neolithic, so the absence of J in Neolithic Europe is perplexing.

And, the story has other complications. From the current paper:
The relative expansion times for haplogroup J2-M172 (Table 4) generally correspond with those yielded for R1b-M343, with the exception of Greece and Crete, which, unlike haplogroup R1b-M343, are slightly older than the dates yielded for several of the Near Eastern groups as well as the four Armenian populations.
As mentioned above, I don't give much weight to Y-STR evidence, but observations such as the above certainly add to the feeling of unease that something is not quite right with the default picture of prehistory.

Another observation on the Armenian population is its very low frequency of haplogroup R1a1. Proponents of the Kurgan model of Indo-European dispersals sometimes associate this haplogroup with the Proto-Indo-European community, and it is strange that -if their ideas are right- Armenia is so lacking in this haplogroup, like its Caucasian neighbors. Why would these hypothetical migrants make such a huge impact in faraway India and barely a dent in nearby Armenia?

Finally, the occurrence of some I2, E-V13, and, perhaps, J2b in Armenia may point to Balkan contacts. But, when did these contacts occur? Are they traceable to the migration of Phrygians to Anatolia, according to the Herodotean account of Armenian origins, or can they be attributed to later contacts with Greeks or other Europeans?

The veil of mystery seems to be raised even higher by every new study: we may be less certain of what really happened today than in the days of happy ignorance, ten years ago. Ultimately it is new data, like the ones included in this paper, that will make every piece of evidence fit, and the grand puzzle of the history of Eurasia will be revealed in all its glory.

European Journal of Human Genetics , (16 November 2011) | doi:10.1038/ejhg.2011.192

Neolithic patrilineal signals indicate that the Armenian plateau was repopulated by agriculturalists

Kristian J Herrera, Robert K Lowery, Laura Hadden, Silvia Calderon, Carolina Chiou, Levon Yepiskoposyan, Maria Regueiro, Peter A Underhill and Rene J Herrera

Armenia, situated between the Black and Caspian Seas, lies at the junction of Turkey, Iran, Georgia, Azerbaijan and former Mesopotamia. This geographic position made it a potential contact zone between Eastern and Western civilizations. In this investigation, we assess Y-chromosomal diversity in four geographically distinct populations that represent the extent of historical Armenia. We find a striking prominence of haplogroups previously implicated with the Agricultural Revolution in the Near East, including the J2a-M410-, R1b1b1*-L23-, G2a-P15- and J1-M267-derived lineages. Given that the Last Glacial Maximum event in the Armenian plateau occurred a few millennia before the Neolithic era, we envision a scenario in which its repopulation was achieved mainly by the arrival of farmers from the Fertile Crescent temporally coincident with the initial inception of farming in Greece. However, we detect very restricted genetic affinities with Europe that suggest any later cultural diffusions from Armenia to Europe were not associated with substantial amounts of paternal gene flow, despite the presence of closely related Indo-European languages in both Armenia and Southeast Europe.


November 15, 2011

Does capitalism reduce fitness? (part II)

Rasmus Nielsen responds to my commentary on his students' ‘rEvolutionary Biologists say: Capitalism Reduces Fitness!’ sign and the blog post associated with it:
Dienekes was shocked by this travesty and decided to make a blog post about it. To my surprise his outrage was not about the social conditions in West Oakland but rather about the loose use of fitness employed in the blog. He took the statement by my students and postdocs literally and pointed out that if you include all the different components of fitness, and not just viability, there is in fact no good scientific evidence that the absolute fitness of individuals growing up in capitalist societies is reduced.
I largely ignore economics and politics in this blog, not because I have no interest in them, but because this is an anthropology blog. And, while I'm sure social conditions in West Oakland may be in great need of improvement, it's not my business to improve them; apparently there are plenty of kind souls working toward that goal already.

It is one thing to hold a political opinion as an individual and another to link that opinion to one's scientific discipline. An electrical engineer, a medical doctor, and a biologist may think that capitalism is a terrible or wonderful system, but if they attempted to link that opinion with electrical engineering, medicine, or biology, I would like to see the evidence for it.

The people in question explicitly linked their political opinions with evolutionary biology, both by ascribing those opinions to "rEvolutionary biologists" and by explicitly linking their dismissal of capitalism to fitness, a central concept in evolutionary biology.

Dr. Nielsen continues:
Most of the students and postdocs in my group are from Europe, and many have not been here for long. They have perhaps not quite gotten use to American political discourse and may not express themselves in a way that most Americans find convincing. But at least they haven’t quite lost their sense of empathy and care for other people. I figure that if I keep them here, in an American academic environment, for a couple of years more they will get cured of that problem and will be able to concentrate fully on their research careers without getting distracted by the economic and social problems they encounter in the neighborhoods around campus on their commute from and to work. If I push them hard, the may even eventually end up getting real jobs and move up in the East Oakland hills. They will then never have to worry about the problems in West Oakland again, and can spend all their time making sure they include all components of fitness when making blog posts.

Last time I checked, both East Oakland and West Oakland, and indeed the entire United States, have a capitalist economy. If someone cares why people in West Oakland have a different life expectancy than those in East Oakland, they must seek the explanation elsewhere, and not in their common economic system. If they wanted to examine the influence of capitalism on life expectancy, they could, perhaps, compare South vs. North Korea, two countries with similar populations that also happen to have a difference in life expectancy of about 10 years, with capitalist South Koreans outliving non-capitalist North Koreans.

November 14, 2011

Splits or Waves? Trees or Webs?

Tree models are used in both linguistics and genetics for inferring population history. The trouble with them is that human populations do not really evolve in a tree-like manner (either genetically or culturally, as in language), but rather exchange both genes and words.

Linguistic evolution has been mostly described in terms of tree models, but languages are not insulated from each other, and they interact after their initial differentiation. This interaction is facilitated by geographic proximity, and also by linguistic proximity.

Geographic proximity makes it possible for speakers of different languages to talk to each other, learn each other's languages, or develop hybrid languages or a lingua franca. Linguistic proximity facilitates communication: it is fairly easy, for example, for speakers of Germanic languages to interact, and much more difficult for those of, say, English and Chinese.

If speakers of a language become separated by distance or geographical barriers, then lateral exchange between different groups becomes minimal, and language evolution can be well-described by a tree model. If, on the other hand, there exists a language continuum across a wide area, effected by a common process (say, the spread of agriculture), then there is room for substantial cross-interaction of different emergent languages at the stage when they can still be thought of as dialects of the parent language.

While the current paper's focus is on Germanic languages, the endgame seems to be in the much harder and more vigorously contested field of Indo-European studies.

The author has put up a nice supplementary page online on a first attempt at using NeighborNet with an Indo-European dataset, pictured on the right. A publication on the topic is listed as being in preparation:
The utility of Germanic as a case-study is that it provides a (reasonably) known external history against which to assess our methodological approaches. On the strength of the findings here, a similar logic can now be extended to probing the unknown of how the early divergence history of Indo-European unfolded. In the full exploration in Heggarty (in preparation a), it transpires that even the data underlying figures 1 and 2 here suggest an early divergence along the lines of a dialect continuum. And for all the purported analytical elegance of binary branches, as a real-world demographic scenario it is this Indo-European continuum that offers the more straightforward and economical explanation. A splits-then-borrowing scenario has instead to invoke not just a complex series of divergent migrations, but then later movements to attenuate this by bringing certain groups back into contact again. This in turn entails consequences for which of the main rival hypotheses—the migratory Kurgan ‘horse culture’, or the progressive demic diffusion of agriculture—best fits as the driving force that shaped the pattern of the earliest Indo-European expansion.
Phil. Trans. R. Soc. B 12 December 2010 vol. 365 no. 1559 3829-3843

Splits or waves? Trees or webs? How divergence measures and network analysis can unravel language histories

Paul Heggarty et al.

Linguists have traditionally represented patterns of divergence within a language family in terms of either a ‘splits’ model, corresponding to a branching family tree structure, or the wave model, resulting in a (dialect) continuum. Recent phylogenetic analyses, however, have tended to assume the former as a viable idealization also for the latter. But the contrast matters, for it typically reflects different processes in the real world: speaker populations either separated by migrations, or expanding over continuous territory. Since history often leaves a complex of both patterns within the same language family, ideally we need a single model to capture both, and tease apart the respective contributions of each. The ‘network’ type of phylogenetic method offers this, so we review recent applications to language data. Most have used lexical data, encoded as binary or multi-state characters. We look instead at continuous distance measures of divergence in phonetics. Our output networks combine branch- and continuum-like signals in ways that correspond well to known histories (illustrated for Germanic, and particularly English). We thus challenge the traditional insistence on shared innovations, setting out a new, principled explanation for why complex language histories can emerge correctly from distance measures, despite shared retentions and parallel innovations.


November 13, 2011

Doing science right

I recently complained about the non-arrival of the Tyrolean Iceman's genome this October, despite the fact that at least parts of it have been available for almost a year.

More recently, I plugged the Roman DNA Project that has already exceeded its funding goal, and seems to be going strong. Kristina Killgrove promises:
Donors will also be able to follow my progress through Twitter and blog feeds not available to the general public, getting real-time updates and learning the DNA results along with me.
Personally, I would prefer that project progress be visible to the world at large, rather than only to project donors, in accordance with my default stance in favor of completely open science. Nonetheless, I appreciate that this limitation may serve as an incentive for donations. I hope that there will not be formal limitations barring donors from communicating news about the Project's progress to the community at large.

And, it seems that we won't have to wait too long for results. Kristina tweeted back to me that:
@dienekesp Funding will be released in mid-Dec. Samples shipped mid-Jan. First results expected 2-3 months after that.

@dienekesp Depends on # of samples, but before summer will probably have all results.
The success of this and other projects from the SciFund challenge would be a very positive sign that regular people with a passion for science are willing to fund experts to do the kind of research they want. The funding and success of such endeavors may be the best answer to those who think that digging deeper into the public purse is the only way to advance science and education.

November 12, 2011

Does capitalism reduce fitness?

I was much surprised to find this blog post on my feed today:

nielsen lab, occupy, and berkeley protests

I don't have much to say about the Occupy movement itself. People can differ in their opinions about the causes of the current economic malaise, and argue about them. But, I will comment on a point:

We’ve thus been taking part in multiple ‘Occupy’ activities, starting the first meeting of Occupy Oakland in mid-October, marches in SF, and the Nov 2 Oakland general strike. This was a great moment – young and old, blue and white collar workers walked and rejoiced together the whole day. Our sign said ‘rEvolutionary Biologists say: Capitalism Reduces Fitness!’

Well, it’s the objective truth. An flyer distributed in the demonstration was pointing out that an African-American boy born in West Oakland has 15 shorter life expectancy than someone born up in the hills. If that’s not reduction in fitness, what is?

A reduction of life expectancy by 15 years is not evidence of reduction of fitness. Fitness is measured in offspring; it is not measured in years lived. African countries, for example, have very high population growth rates, and very low life expectancies. That is fitness. Fitness is not a comfortable long life, but having lots of babies and living long enough to ensure their survival as independent entities.

It is strange that evolutionary biologists from a top institution would make such a simple mistake. Perhaps public spending is not the best way to advance science and education?

November 11, 2011

Falsification in action

I am an occasional critic of Anatole Klyosov's Y-STR based age estimation methodology on the GENEALOGY-DNA-L list. As I have mentioned before, I am boycotting Y-STRs because they are simply worthless for the student of prehistory due to their poor qualities as molecular clocks and lack of any clear correspondence with population movements.

Nonetheless, Klyosov's professional credentials and substantial "dna genealogy" paper production may lead some to give his work, characterized by very narrow confidence intervals and rather imaginative archaeological reconstructions, undue attention.

Klyosov resurfaced on GENEALOGY-DNA-L, taking a swipe at my criticism of his narrow confidence intervals:
Instead of walking in circles considering "bushy trees" all these years and complaining on "huge confidence intervals", one better take ACTUAL genealogy data, ACTUAL haplotype datasets, and compare actual dates with those resulted from DNA genealogy. This will show what ACTUAL margins of error looks like. With "bushy trees", they should be first subdivided on separate branches, and each branch should be analyzed individually.

Thankfully, the arrival of ancient DNA analysis can be used to falsify Klyosov's assertions. In December 2010 he discussed the possibility that some E1b1b1 subclades may have played a role in wiping out the "Bell Beakers":
However, E-V13 is already out, since it was formed around 2600 ybp (Lutak and Klyosov, Proceedings, 2009, April, pp. 639-669). E-V65 is out on the same reason (2625 ybp). E-V22 is a good candidate, with its common ancestor around 5075 ybp (ibid). E1b1b1a1-V12 also could be there, with its common ancestor of 4300+/-680 ybp. E3b1, as Adams et al (2008) called them (it is apparently E-81), has a common ancestor in Iberia around 4825 ybp (Klyosov, Proceedings, 2009, March, pp. 390-421), which nicely fit to the concept.
The recent publication of 7,000-year-old E-V13 from Neolithic Spain indicates that this haplogroup was in existence at least that long ago, and hence could not have been formed 2,600 years before present. Klyosov's error is at least 2.5x, consistent with my assertions that Y-STR based age estimates carry huge confidence intervals, and inconsistent with his self-assurance that they do not.
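
For readers who want to check the "at least 2.5x" figure, here is a back-of-the-envelope sketch in Python. The two dates are the ones quoted above; the variable names are my own and purely illustrative, and the ratio is only a lower bound, since the haplogroup must be at least as old as the oldest sample that carries it.

```python
# Dates come from the text above; the variable names are illustrative only.
klyosov_estimate_ybp = 2600  # Y-STR based formation age of E-V13 quoted above
ancient_dna_age_ybp = 7000   # approximate age of the Neolithic Spanish E-V13 sample

# A haplogroup cannot be younger than the oldest sample carrying it,
# so this ratio is a lower bound on how far off the estimate is.
min_error_factor = ancient_dna_age_ybp / klyosov_estimate_ybp

print(f"E-V13 is at least {min_error_factor:.2f}x older than the Y-STR estimate")
```

The ratio comes out a little above the 2.5x stated in the text, so "at least 2.5x" is, if anything, a conservative way of putting it.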

I see nothing wrong in advancing speculative hypotheses based on the available evidence. I've advanced some of my own ideas for the spread of E-V13 that appear to be less plausible in the light of the ancient DNA evidence, even though a historical, Greek-mediated spread of a subset of E-V13 as proposed by Di Gaetano et al. and King et al. is still possible.

What is certainly wrong is to have over-confidence in one's assertions and not to admit the limitations of Y-STR based age estimates when they are staring us in the face on both theoretical and empirical grounds.

November 09, 2011

To survive: be fat or be smart

The bottom line is that it makes sense for an animal to combine the "fat" and "smart" strategies to survive: a very fat but very dumb animal has all the energy reserves it will ever need, but at the expense of locomotion efficiency, avoidance of predators, etc. A very smart but very lean animal has all the brain power needed to survive, but has very little "in the tank" if it finds itself in a bad spot and has to go without food for a long time.

The versatile strategy is best, and humans are the one species that seems to have gone the "brain power" way, without completely sacrificing other traits needed for survival.

Nature (2011) doi:10.1038/nature10629

Energetics and the evolution of human brain size

Ana Navarrete et al.

The human brain stands out among mammals by being unusually large. The expensive-tissue hypothesis1 explains its evolution by proposing a trade-off between the size of the brain and that of the digestive tract, which is smaller than expected for a primate of our body size. Although this hypothesis is widely accepted, empirical support so far has been equivocal. Here we test it in a sample of 100 mammalian species, including 23 primates, by analysing brain size and organ mass data. We found that, controlling for fat-free body mass, brain size is not negatively correlated with the mass of the digestive tract or any other expensive organ, thus refuting the expensive-tissue hypothesis. Nonetheless, consistent with the existence of energy trade-offs with brain size, we find that the size of brains and adipose depots are negatively correlated in mammals, indicating that encephalization and fat storage are compensatory strategies to buffer against starvation. However, these two strategies can be combined if fat storage does not unduly hamper locomotor efficiency. We propose that human encephalization was made possible by a combination of stabilization of energy inputs and a redirection of energy from locomotion, growth and reproduction.


November 07, 2011

Cave painters painted spotted horses as they saw them.

I am innately skeptical of "symbolism" when it comes to most ancient art (see e.g., the destruction of the "mother goddess" theory). So, an article that shows that ancient artists painted horses as they saw them, and did not put dots on them for some strange symbolic reason, is very welcome.

From ScienceNOW:
About 25,000 years ago, humans began painting a curious creature on the walls of European caves. Among the rhinos, wild cattle, and other animals, they sketched a white horse with black spots. Although such horses are popular breeds today, scientists didn't think they existed before humans domesticated the species about 5000 years ago. Now, a new study of prehistoric horse DNA concludes that spotted horses did indeed roam ancient Europe, suggesting that early artists may have been reproducing what they saw rather than creating imaginary creatures.

Related: Ancient DNA for horse coat color

PNAS doi: 10.1073/pnas.1108982108

Genotypes of predomestic horses match phenotypes painted in Paleolithic works of cave art

Melanie Pruvost et al.

Archaeologists often argue whether Paleolithic works of art, cave paintings in particular, constitute reflections of the natural environment of humans at the time. They also debate the extent to which these paintings actually contain creative artistic expression, reflect the phenotypic variation of the surrounding environment, or focus on rare phenotypes. The famous paintings “The Dappled Horses of Pech-Merle,” depicting spotted horses on the walls of a cave in Pech-Merle, France, date back ∼25,000 y, but the coat pattern portrayed in these paintings is remarkably similar to a pattern known as “leopard” in modern horses. We have genotyped nine coat-color loci in 31 predomestic horses from Siberia, Eastern and Western Europe, and the Iberian Peninsula. Eighteen horses had bay coat color, seven were black, and six shared an allele associated with the leopard complex spotting (LP), representing the only spotted phenotype that has been discovered in wild, predomestic horses thus far. LP was detected in four Pleistocene and two Copper Age samples from Western and Eastern Europe, respectively. In contrast, this phenotype was absent from predomestic Siberian horses. Thus, all horse color phenotypes that seem to be distinguishable in cave paintings have now been found to exist in prehistoric horse populations, suggesting that cave paintings of this species represent remarkably realistic depictions of the animals shown. This finding lends support to hypotheses arguing that cave paintings might have contained less of a symbolic or transcendental connotation than often assumed.


November 06, 2011

Y-chromosomes of the Bahamas

I like the line about there being substantially more Y-STR variation in E1b1a7a-U174 and E1b1a8-U175 in the Bahamas than in any African collection. I have argued for years against the central assumption of phylogeography: the location of highest Y-STR diversity is not necessarily the point of origin of a haplogroup, since Y-STR diversity can be affected both by antiquity and by admixture. Nonetheless, I keep reading papers where tiny differences in Y-STR variation, even if we forget about the noisiness of Y-STRs themselves, are taken as evidence of ancient migrations. Thankfully, the time when Y-STRs were used to infer ancient migrations is over, and the huge collection of Y-STR haplotypes amassed by population geneticists, forensic specialists, and genealogists alike can be put to uses for which it is more amenable.
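
To make the admixture point concrete, here is a minimal sketch in Python. The repeat counts are invented for illustration and do not come from the Simms et al. data; the only claim is the arithmetic one, that pooling two source populations with different modal alleles yields more variance than either source has on its own.

```python
from statistics import pvariance

# Hypothetical repeat counts at a single Y-STR locus (made-up numbers,
# not taken from any published dataset).
source_a = [13, 13, 14, 14, 14, 15, 15]  # first source population, centered on 14 repeats
source_b = [16, 17, 17, 17, 18]          # second source population, centered on 17 repeats

# A recently founded population that drew men from both sources.
admixed = source_a + source_b

var_a = pvariance(source_a)
var_b = pvariance(source_b)
var_admixed = pvariance(admixed)

# The pooled variance equals the within-source variance plus a term for the
# gap between the source means, so it can exceed both sources even though
# the admixed population is the youngest of the three.
print(f"source A: {var_a:.3f}  source B: {var_b:.3f}  admixed: {var_admixed:.3f}")
```

In this toy case the admixed collection shows the highest variance simply because it mixes two differentiated sources, which is why high Y-STR variance alone cannot identify a point of origin.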

I can't say I know much about the history of the Bahamas, but this was something I had not heard of before:
Over the last 150 years, the Bahamas has been witness to a varied array of settlers, including Chinese immigrant workers, Greek spongers, Jewish business-men and individuals of Lebanese descent fleeing religious persecution. The extent to which each group has contributed genetically to the Bahamian paternal gene pool, however, is unknown. Our findings suggest that the Greeks, which exhibit relatively high frequencies of haplogroups E1b1b1a*-M78, J2a*-M410, and R1b1b1*-L23 (Semino et al., 2004; Myres et al., 2011), are a likely source of these lineages in the Bahamas, although the presence of M78 derived chromosomes may also signal gene flow from Lebanon (Zalloua et al., 2008). J1e-P58 lineages, on the other hand, which are characteristic of Jewish populations (Hammer et al., 2009) and Arab speaking groups (Chiaroni et al., 2010), may represent genetic signatures of Eastern European Jews and/or Lebanese migrants entering the Bahamas in the early twentieth century.
Another interesting tidbit:
Western European colonialism, although short-lived, appears to have left marked genetic imprints throughout the Bahamian archipelago, with Long Island receiving the strongest European genetic signals and Exuma, the weakest; a distribution pattern consistent with our earlier reports utilizing autosomal STR markers (Simms et al., 2008, 2011). The higher frequency of M269 derived individuals in the Long Island population (55.8%), when compared with the other five Bahamian islands surveyed (ranging from 8.5% to 18.3%), suggests higher gene flow from European males (Saunders, 2003b). According to the 1851 census, Long Island possessed one of the smallest European components (13.1%) yet, by 1953, almost 50% of this population was of “mixed” ancestry (Craton, 1998).
R-M269 seems quintessentially European today, and most living R-M269 men probably have West European ancestry. But the finding of a high frequency in a Bahamian spot ought to remind us that Y-chromosomes can achieve high frequencies in little time, given the right conditions. Indeed, we can very well draw a parallel between the prehistoric spread of R-M269 into Europe, an event that is still shrouded in mystery, and the late historical movement of the same haplogroup into the Americas. Taking the broad view, these two unrelated events represent two pulses of the same westward spread of a successful Y-chromosome lineage.

It is also nice that scientists are beginning to take notice of very basal Y-chromosomes, going back to Y-chromosome Adam.

Two samples that fell outside of haplogroups B-T (defined by M42) were observed in Abaco (1.5%) and New Providence (0.7%), two Bahamian islands separated by a total of 139.4 km, as well as in a single sample from Haiti (unpublished data). When tested for V171, which, according to Cruciani et al. (2011a) defines the A2-T lineage, all three samples exhibited the ancestral allele. Instead, each individual was derived for the paralogous V152 mutation that determines the A1b lineage. It should be noted that each of the three samples possessed an eight base pair long Poly-T stretch at the M91 locus, indicative of the monophyletic haplogroup A defined by Karafet et al. (2008). However, as a result of the rearrangement of the tree by Cruciani et al. (2011a), haplogroup A no longer represents a monophyletic group, as the A2 and A3 lineages are now united with all haplogroup A lineages other than A1 by their shared possession of V171.
Haplogroup A chromosomes have been collected as isolated examples in many genealogical projects and scientific studies. It's a great idea for someone to take the initiative and collect the most divergent ones, invest in genotyping them fully, and push the boundaries of what we know about the most ancient history of modern human patrilineages.

AJPA DOI: 10.1002/ajpa.21616

Paternal lineages signal distinct genetic contributions from British Loyalists and continental Africans among different Bahamian islands

Tanya M. Simms et al.

Over the past 500 years, the Bahamas has been influenced by a wide array of settlers, some of whom have left marked genetic imprints throughout the archipelago. To assess the extent of each group's genetic contributions, high-resolution Y-chromosome analyses were performed, for the first time, to delineate the patriarchal ancestry of six islands in the Northwest (Abaco and Grand Bahama) and Central (Eleuthera, Exuma, Long Island, and New Providence) Bahamas and their genetic relationships with previously published reference populations. Our results reveal genetic signals emanating primarily from African and European sources, with the predominantly sub-Saharan African and Western European haplogroups E1b1a-M2 and R1b1b1-M269, respectively, accounting for greater than 75% of all Bahamian patrilineages. Surprisingly, we observe notable discrepancies among the six Bahamian populations in their distribution of these lineages, with E1b1a-M2 predominating Y-chromosomes in the collections from Abaco, Exuma, Eleuthera, Grand Bahama, and New Providence, whereas R1b1b1-M269 is found at elevated levels in the Long Island population. Substantial Y-STR haplotype variation within sub-haplogroups E1b1a7a-U174 and E1b1a8-U175 (greater than any continental African collection) is also noted, possibly indicating genetic influences from a variety of West and Central African groups. Furthermore, differential European genetic contributions in each island (with the exception of Exuma) reflect settlement patterns of the British Loyalists subsequent to the American Revolution.


November 05, 2011

The advantage of being first

From press release:
"We find that families who are at the forefront of a range expansion into new territories had greater reproductive success. In other words, that they had more children, and more children who also had children," Labuda explained. "As a result, these families made a higher genetic contribution to the contemporary population than those who remained behind in what we call the range core, as opposed to the wave front."

The research confirms in humans a phenomenon that has already been observed in other species with much shorter generation spans. "We knew that the migration of species into new areas promoted the spread of rare mutations through a phenomenon known as 'gene surfing', but now we find that selection at the wave front could make this surfing much more efficient," Excoffier said. This evolutionary mechanism in combination with founder effects and social or cultural transmission of reproductive behavior could explain why some genetic diseases are found at an elevated frequency in the Charlevoix and Saguenay Lac Saint-Jean regions where the study was carried out, as rare mutations can also surf during a range expansion.
Science DOI: 10.1126/science.1212880

Deep Human Genealogies Reveal a Selective Advantage to Be on an Expanding Wave Front

Claudia Moreau et al.

Since their origin, human populations have colonized the whole planet, but the demographic processes governing range expansions are mostly unknown. We analyzed the genealogy of more than 1 million individuals resulting from a range expansion in Quebec between 1686 and 1960 and reconstructed the spatial dynamics of the expansion. We find that a majority of the present Saguenay Lac Saint-Jean population can be traced back to ancestors having lived directly on or close to the wave front. Ancestors located on the front contributed significantly more to the current gene pool than those from the range core, likely due to a 20% larger effective fertility of women on the wave front. This fitness component is heritable on the wave front and not in the core, implying that this life-history trait evolves during range expansions.


Darwinian linguistic controversies

There is an interesting essay titled Darwin's Tongues, which covers some of the controversies associated with the use of biological evolutionary methods on linguistic problems. A couple of the papers mentioned in the article are Dunn et al. (2011) and Atkinson (2011). This little bit sparked my interest:
Using this database, Cysouw’s team repeated Atkinson’s technique and found two separate geographic origins for language, one in East Africa and another in West Asia’s Caucasus region, with a large swath of the Middle East and South Africa also possible. Crucially, Cysouw’s analysis suggests that none of these regions contain phoneme-rich languages that stand out as having far more speech sounds than any of the others.
I've contacted Dr. Cysouw to see if anything on this has been published/is available, and I will update this blog entry if it has.

November 04, 2011

Sardinian snails got around

Interesting, in the context of my recent musings.

PLoS ONE 6(6): e20734. doi:10.1371/journal.pone.0020734

Phylogeography of a Land Snail Suggests Trans-Mediterranean Neolithic Transport

Ruth Jesse et al.

Fragmented distribution ranges of species with little active dispersal capacity raise the question about their place of origin and the processes and timing of either range fragmentation or dispersal. The peculiar distribution of the land snail Tudorella sulcata s. str. in Southern France, Sardinia and Algeria is such a challenging case.

Statistical phylogeographic analyses with mitochondrial COI and nuclear hsp70 haplotypes were used to answer the questions of the species' origin, sequence and timing of dispersal. The origin of the species was on Sardinia. Starting from there, a first expansion to Algeria and then to France took place. Abiotic and zoochorous dispersal could be excluded by considering the species' life style, leaving only anthropogenic translocation as parsimonious explanation. The geographic expansion could be dated to approximately 8,000 years before present with a 95% confidence interval of 10,000 to 3,000 years before present.

This period coincides with the Neolithic expansion in the Western Mediterranean, suggesting a role of these settlers as vectors. Our findings thus propose that non-domesticated animals and plants may give hints on the direction and timing of early human expansion routes.