December 10, 2011

First analysis of Metspalu et al. (2011) data (plus K12a admixture calculator)

Here are the results of my first analysis of the new Metspalu et al. (2011) data (populations with _M endings), together with a large number of other samples from various sources, including Chaubey et al. (2011) (_Ch endings), that I had not used before.




Spreadsheet of population averages; no outliers removed in source datasets. I'll defer all the technical and other details for when I release Dodecad v4, which will (most likely) be based on the same dataset.

Fst divergences:


MDS plot of first two dimensions based on above table.
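For readers curious how a two-dimensional plot can be derived from a table of pairwise divergences, here is a minimal sketch of classical (metric) MDS in plain Python. It is not the procedure used for the plot above: the function name `classical_mds`, the power-iteration eigen-solver, and the toy distance matrix are all assumptions made for illustration, and in practice a linear algebra library would do the eigendecomposition.

```python
import math

def classical_mds(dist, dims=2, iters=500):
    """Embed n items in `dims` dimensions from a symmetric distance matrix.

    Assumes the largest-magnitude eigenvalues of the centered matrix are
    positive, which holds when the distances are (close to) Euclidean.
    """
    n = len(dist)
    # Square the distances, then double-center: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration for the leading eigenvector of b,
        # starting from an arbitrary deterministic unit vector.
        v = [math.sin(i + 1.0) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the matching eigenvalue.
        eigval = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                     for i in range(n))
        if eigval <= 1e-12:
            break  # no further positive dimensions to extract
        scale = math.sqrt(eigval)
        for i in range(n):
            coords[i][d] = v[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        b = [[b[i][j] - eigval * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords

# Toy input: four hypothetical populations whose distances happen to be
# those of a 3-by-4 rectangle, so two dimensions reproduce them exactly.
toy = [[0, 3, 4, 5],
       [3, 0, 5, 4],
       [4, 5, 0, 3],
       [5, 4, 3, 0]]
for label, (x, y) in zip("ABCD", classical_mds(toy)):
    print(label, round(x, 3), round(y, 3))
```

With real Fst tables the distances are only approximately Euclidean, so the first two dimensions capture part of the structure rather than all of it.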

You can use DIYDodecad 2.1 with the 'K12a' calculator, which incorporates the K=12 inferred clusters of the above analysis.

Instructions: uncompress the contents of the K12a bundle to your working directory, and follow the instructions of the DIYDodecad 2.1 README file, substituting 'K12a' for 'dv3' in all those instructions. Terms of use: 'K12a', including all files in the downloaded RAR file, is free for non-commercial personal use. Commercial uses are forbidden. Contact me for non-personal uses of the calculator.
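To give a rough idea of what an admixture calculator of this kind does, the sketch below estimates one individual's ancestry proportions by EM while holding the cluster allele frequencies fixed. This is an assumption about the general technique, not DIYDodecad's actual code or file format: the function `estimate_admixture`, its arguments, and the toy frequencies are hypothetical.

```python
def estimate_admixture(genotypes, freqs, iters=200):
    """EM estimate of ancestry proportions for a single individual.

    genotypes: per-SNP count (0, 1 or 2) of a reference allele,
               or None for a missing call.
    freqs:     freqs[k][j] is the reference-allele frequency of SNP j
               in cluster k; these stay fixed (hypothetical input).
    Returns a list of proportions, one per cluster, summing to 1.
    """
    k = len(freqs)
    eps = 1e-6
    # Keep frequencies away from 0 and 1 to avoid division by zero.
    f = [[min(max(x, eps), 1.0 - eps) for x in row] for row in freqs]
    q = [1.0 / k] * k  # start from equal proportions
    for _ in range(iters):
        acc = [0.0] * k
        alleles = 0
        for j, g in enumerate(genotypes):
            if g is None:
                continue
            # Probability of drawing the reference allele under current q.
            p = sum(q[c] * f[c][j] for c in range(k))
            for c in range(k):
                # Expected number of this SNP's two allele copies
                # that originate from cluster c.
                acc[c] += g * q[c] * f[c][j] / p
                acc[c] += (2 - g) * q[c] * (1.0 - f[c][j]) / (1.0 - p)
            alleles += 2
        if alleles == 0:
            return q
        q = [a / alleles for a in acc]
    return q

# Toy example: two hypothetical clusters with opposite frequencies,
# and an individual who is heterozygous at every SNP.
toy_freqs = [[0.9] * 50, [0.1] * 50]
print(estimate_admixture([1] * 50, toy_freqs))
```

A real calculator works with many thousands of SNPs and K=12 clusters, but the update is the same in spirit: each allele copy is softly assigned to the clusters that best explain it, and the proportions are the averaged assignments.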

December 08, 2011

Population structure in South Asia (Metspalu et al. 2011)

I haven't read the paper fully yet (it's open access), but the abstract seems to agree with what I've written both here and over at the Dodecad blog, about South Asians being primarily a West Asian/South Asian variable mix. I will try to get and analyze the new data from the paper; it is strange that every time I am just about ready to release the new version of Dodecad v4, I discover a source of new data!

The American Journal of Human Genetics, Volume 89, Issue 6, 731-744, 9 December 2011

Shared and Unique Components of Human Population Structure and Genome-Wide Signals of Positive Selection in South Asia

Mait Metspalu et al.


South Asia harbors one of the highest levels genetic diversity in Eurasia, which could be interpreted as a result of its long-term large effective population size and of admixture during its complex demographic history. In contrast to Pakistani populations, populations of Indian origin have been underrepresented in previous genomic scans of positive selection and population structure. Here we report data for more than 600,000 SNP markers genotyped in 142 samples from 30 ethnic groups in India. Combining our results with other available genome-wide data, we show that Indian populations are characterized by two major ancestry components, one of which is spread at comparable frequency and haplotype diversity in populations of South and West Asia and the Caucasus. The second component is more restricted to South Asia and accounts for more than 50% of the ancestry in Indian populations. Haplotype diversity associated with these South Asian ancestry components is significantly higher than that of the components dominating the West Eurasian ancestry palette. Modeling of the observed haplotype diversities suggests that both Indian ancestry components are older than the purported Indo-Aryan invasion 3,500 YBP. Consistent with the results of pairwise genetic distances among world regions, Indians share more ancestry signals with West than with East Eurasians. However, compared to Pakistani populations, a higher proportion of their genes show regionally specific signals of high haplotype homozygosity. Among such candidates of positive selection in India are MSTN and DOK5, both of which have potential implications in lipid metabolism and the etiology of type 2 diabetes.

Link

Loss of air sacs and hominin speech

Journal of Human Evolution
doi:10.1016/j.jhevol.2011.07.007

Loss of air sacs improved hominin speech abilities
Bart de Boer

Abstract
In this paper, the acoustic-perceptual effects of air sacs are investigated. Using an adaptive hearing experiment, it is shown that air sacs reduce the perceptual effect of vowel-like articulations. Air sacs are a feature of the vocal tract of all great apes, except humans. Because the presence or absence of air sacs is correlated with the anatomy of the hyoid bone, a probable minimum and maximum date of the loss of air sacs can be estimated from fossil hyoid bones. Australopithecus afarensis still had air sacs about 3.3 Ma, while Homo heidelbergensis, some 600 000 years ago and Homo neanderthalensis some 60 000 years ago, did no longer. The reduced distinctiveness of articulations produced with an air sac is in line with the hypothesis that air sacs were selected against because of the evolution of complex vocal communication. This relation between complex vocal communication and fossil evidence may help to get a firmer estimate of when speech first evolved.

Link

December 06, 2011

Y-chromosome ties between Taiwan and Polynesia

Gene. 2011 Nov 3. [Epub ahead of print]

Increased Y-chromosome resolution of haplogroup O suggests genetic ties between the Ami aborigines from Taiwan and the Polynesian Islands of Samoa and Tonga.

Mirabal S, Herrera KJ, Gayden T, Regueiro M, Underhill PA, Garcia-Bertrand RL, Herrera RJ.

Abstract
The Austronesian expansion has left its fingerprint throughout two thirds of the circumference of the globe reaching the island of Madagascar in East Africa to the west and Easter Island, off the coast of Chile, to the east. To date, several theories exist to explain the current genetic distribution of Austronesian populations, with the "slow boat" model being the most widely accepted, though other conjectures (i.e., the "express train" and "entangled bank" hypotheses) have also been widely discussed. In the current study, 158 Y chromosomes from the Polynesian archipelagos of Samoa and Tonga were typed using high resolution binary markers and compared to populations across Mainland East Asia, Taiwan, Island Southeast Asia, Melanesia and Polynesia in order to establish their patrilineal genetic relationships. Y-STR haplotypes on the C2 (M38), C2a (M208), O1a (M119), O3 (M122) and O3a2 (P201) backgrounds were utilized in an attempt to identify the differing sources of the current Y-chromosomal haplogroups present throughout Polynesia (of Melanesian and/or Asian descent). Specifically, while haplogroups C2a, S and K3-P79 suggest a Melanesian component in 23%-42% of the Samoan and Tongan Y chromosomes, the prominence of sub-haplogroup O3a2c* (P164), which has previously been observed at only minimal levels in Mainland East Asians (2.0-4.5%), in both Polynesians (ranging from 19% in Manua to 54% in Tonga) and Ami aborigines from Taiwan (37%) provides, for the first time, evidence for a genetic connection between the Polynesian collections and the Ami.

Link

December 05, 2011

Signals of domestication of wild horses in mtDNA

BMC Evolutionary Biology 2011, 11:328 doi:10.1186/1471-2148-11-328

Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication

Sebastian Lippold et al.

Abstract (provisional)
Background
DNA target enrichment by micro-array capture combined with high throughput sequencing technologies provides the possibility to obtain large amounts of sequence data (e.g. whole mitochondrial DNA genomes) from multiple individuals at relatively low costs. Previously, whole mitochondrial genome data for domestic horses (Equus caballus) were limited to only a few specimens and only short parts of the mtDNA genome (especially the hypervariable region) were investigated for larger sample sets.

Results
In this study we investigated whole mitochondrial genomes of 59 domestic horses from 44 breeds and a single Przewalski horse (Equus przewalski) using a recently described multiplex micro-array capture approach. We found 473 variable positions within the domestic horses, 292 of which are parsimony-informative, providing a well resolved phylogenetic tree. Our divergence time estimate suggests that the mitochondrial genomes of modern horse breeds shared a common ancestor around 93,000 years ago and no later than 38,000 years ago. A Bayesian skyline plot reveals a significant population expansion beginning 6,000-8,000 years ago with an ongoing exponential growth until the present, similar to other domestic animal species. Our data further suggest that a large sample of wild horse diversity was incorporated into the domestic population; specifically, at least 46 of the mtDNA lineages observed in domestic horses (73%) already existed before the beginning of domestication about 5,000 years ago.

Conclusions
Our study provides a window into the maternal origins of extant domestic horses and confirms that modern domestic breeds present a wide sample of the mtDNA diversity found in ancestral, now extinct, wild horse populations. The data obtained allow us to detect a population expansion event coinciding with the beginning of domestication and to estimate both the minimum number of female horses incorporated into the domestic gene pool and the time depth of the domestic horse mtDNA gene pool.

Link

December 04, 2011

Old and recent clines in Brabant

This paper uses genealogical data to show that while some clines observed today stretch back to pre-industrial times, others do not. This is a nice result that shows that:
  • It's best to try to find test subjects with deep genealogies when one makes inferences about the past
  • Clines in modern-day populations may reflect very recent events, and not necessarily deep historical or even archaeological events
It should be mentioned that the paper does not contradict broad trends within the mentioned haplogroups that have been previously described. And, of course, this makes sense, since broad trends are more difficult to establish than those at the small-scale geographical level.

There is evidence for discontinuity at the European level across thousands of years, and it seems that we won't be able to escape the inevitable chore of figuring out "who went where" across all time scales, rather than relying on simplistic models of Paleolithic hunters receiving Neolithic farmers, and the two living happily ever after around the same hearths until today.


European Journal of Human Genetics , (30 November 2011) | doi:10.1038/ejhg.2011.218

Temporal differentiation across a West-European Y-chromosomal cline: genealogy as a tool in human population genetics

Maarten HD Larmuseau et al.

Abstract
The pattern of population genetic variation and allele frequencies within a species are unstable and are changing over time according to different evolutionary factors. For humans, it is possible to combine detailed patrilineal genealogical records with deep Y-chromosome (Y-chr) genotyping to disentangle signals of historical population genetic structures because of the exponential increase in genetic genealogical data. To test this approach, we studied the temporal pattern of the ‘autochthonous’ micro-geographical genetic structure in the region of Brabant in Belgium and the Netherlands (Northwest Europe). Genealogical data of 881 individuals from Northwest Europe were collected, from which 634 family trees showed a residence within Brabant for at least one generation. The Y-chr genetic variation of the 634 participants was investigated using 110 Y-SNPs and 38 Y-STRs and linked to particular locations within Brabant on specific time periods based on genealogical records. Significant temporal variation in the Y-chr distribution was detected through a north–south gradient in the frequencies distribution of sub-haplogroup R1b1b2a1 (R-U106), next to an opposite trend for R1b1b2a2g (R-U152). The gradient on R-U106 faded in time and even became totally invisible during the Industrial Revolution in the first half of the nineteenth century. Therefore, genealogical data for at least 200 years are required to study small-scale ‘autochthonous’ population structure in Western Europe.

Link

December 03, 2011

Selection for skin color: not so simple

Investigative Genetics 2011, 2:24 doi:10.1186/2041-2223-2-24

Contrasting signals of positive selection in genes involved in human skin color variation from tests based on SNP scans and resequencing

Johanna Maria de Gruijter et al.

Abstract (provisional)
Background
Numerous genome-wide scans conducted by genotyping previously-ascertained single nucleotide polymorphisms (SNPs) have provided candidate signatures of positive selection in various regions of the human genome, including in genes involved in pigmentation traits. However, it is unclear how well the signatures discovered by such haplotype-based test statistics can be reproduced in tests based on full resequence data. Four genes, OCA2, TYRP1, DCT and KITLG, implicated in human skin color variation, have shown evidence for positive selection in Europeans and East Asians in previous SNP-scan data. In the current study, we resequenced 4.7-6.7 kb of DNA from each of these genes in Africans, Europeans, East Asians and South Asians.

Results
Applying all commonly-used allele frequency distribution neutrality test statistics to the newly generated sequence data provided conflicting results in respect of evidence for positive selection. Previous haplotype-based findings could not be clearly confirmed. The application of Markov Chain Monte Carlo Approximate Bayesian Computation to these sequence data using a simple forward simulator revealed broad posterior distributions of the selective parameters for all four genes providing no support for positive selection. However, when we applied this approach to published sequence data on SLC45A2, another human pigmentation candidate gene, we could readily confirm evidence for positive selection as previously detected with sequence-based and some haplotype-based tests.

Conclusions
Overall, our data indicate that even genes that are strong biological candidates for positive selection and show reproducible signatures of positive selection in SNP scans do not always show the same replicability of selection signals in other tests, which should be considered in future studies on detecting positive selection in genetic data.

Link

December 02, 2011

Natural selection in African Americans pre- and post-admixture

I have mentioned before that African Americans should not be used to generalize about Africa, not only because of their ~20% European admixture, but also because they live in an environment completely different from the one their African ancestors adapted to: different climate, different set/intensity of pathogens, different social position, different physical requirements and workloads. It is nice to see a paper which attempts to quantify pre- and post-admixture signals of selection in this population; I think this may be a fertile area of future research, and it may also illuminate some of the specificities of the AA population.

Genome Research doi:10.1101/gr.124784.111

Genome-wide detection of natural selection in African Americans pre-and post-admixture

Wenfei Jin et al.

It is particularly meaningful to investigate natural selection in African Americans (AfA) due to the high mortality their African ancestry has experienced in history. In this study, we examined 491,526 autosomal SNPs genotyped in 5,210 individuals and conducted a genome-wide search for selection signals in 1,890 AfA. Several genomic regions showing excess of African or European ancestry, which were thought as the footprints of selection since population admixture, were detected based on a commonly used approach. However, we also developed a new strategy to detect natural selection both pre-and post-admixture by reconstructing an ancestral African population (AAF) from inferred African components of ancestry in AfA and comparing it with indigenous African populations (IAF). Interestingly, many selection-candidate genes identified by the new approach were associated with AfA specific high-risk diseases such as prostate cancer and hypertension, suggesting an important role these disease-related genes might have played in adapting to new environment. CD36 and HBB, whose mutations confer a degree of protection against malaria, were also located in the highly differentiated regions between AAF and IAF. Further analysis showed that the frequencies of alleles protecting against malaria in AAF were lower than that in IAF, which consists with the relaxed selection pressure of malaria in the New World. There is no overlap between the top candidate genes detected by the two approaches, indicating the different environmental pressures AfA experienced pre-and post-population-admixture. We suggest that the new approach is reasonably powerful and can also be applied to other admixed populations such as Latinos and Uyghurs.

Link

December 01, 2011

The Nubian Complex in southern Arabia, 106 thousand years ago

I covered Jeffrey Rose's work most recently here. Together with his work on the Gulf Oasis, Jebel Faya, and the Skhul/Qafzeh hominids from the Levant, it now appears that modern humans were widely dispersed >100 thousand years ago in the Near East, possessing distinct lithic technologies. It is becoming increasingly impossible to reconcile this evidence with scenaria of Out-of-Africa after 70ka.

The pre-100ka Near East was seemingly teeming with modern humans; it may have been possible to dismiss these as the Out-of-Africa that failed when the Skhul/Qafzeh hominids were the only players in the game, but populations stretching from the Levant to southern Arabia did not simply vanish, only to be replaced after 70ka.

This leads to a conundrum:
  • Either geneticists are in error when they date the L3/modern human expansion to 70 thousand years ago, or
  • They are in error when they place its origin in Africa.
As I have argued before, the archaeological and genetic evidence can be reconciled if the L3/modern human expansion occurred recently Out-of-Arabia, during the super-arid conditions of MIS 4, after modern humans had established themselves there in the good times that preceded it.

From the paper:
The Nubian Complex is a regionally distinct Middle Stone Age (MSA) technocomplex first reported from the northern Sudan in the late 1960s [1], [2]. Archaeological sites belonging to the Nubian Complex (Fig. 1) have since been found throughout the middle and lower Nile Valley [3]–[6], desert oases of the eastern Sahara [7], [8], and the Red Sea hills [9], [10]. Numerical ages from Nubian Complex sites (Table 1) are constrained within Marine Isotope Stage 5 (MIS 5), although temporal differences have been observed among assemblages; as such, it is divided into two phases, an early and a late Nubian Complex [5], [11].
I had previously speculated about the origin of modern humans in a wet Sahara, followed by their expansion into West Asia during MIS 5. I don't know how tenable this scenario is archaeologically, but it certainly appears to be tenable chronologically and genetically: modern mankind coming into its own in the wet Sahara, collapsing demographically as the desert reasserted itself; finding a secondary cradle in Arabia, and expanding as the Arabian desert reasserted itself. Is this the solution: a tale of two deserts, pumping humans from Africa to the Near East pre-100ka and from the Near East to the rest of the world post-70ka?

From the paper:
The taxonomic identity of the Nubian Complex toolmakers is unknown, as no skeletal evidence has been discovered in association with any such assemblage. Although some archaic forms may have persisted in other parts of Africa at that time [79], the distribution of early anatomically modern human (AMH) remains suggest this species is the most likely candidate to have occupied northeast Africa during the Late Pleistocene. Cranial fragments of Homo sapiens found in the Omo river valley, Ethiopia (Fig. 1), represent the first appearance of AMH in East Africa ~195 ka [80]. Remains from Herto [81], Singa [82], and Mumba [83] in East Africa date to between ~160 and ~100 ka. Skeletal remains from Jebel Irhoud in Morocco show that an early form of Homo sapiens had expanded into North Africa as early as ~160 ka [84], and a modern human child discovered at Grotte des Contrebandiers in Morocco verifies the presence of AMH in North Africa by ~110 ka [85]. At the site of Taramsa Hill 1 in the lower Nile Valley, an AMH child dated to ~55 ka was found in association with a lithic industry (Taramsan) that is thought to have developed out of the late Nubian Complex [21], [86]. Despite the lack of direct evidence, given that AMH are the only species to have been found in North Africa from the late Middle Pleistocene onward, it is warranted to speculate that the Nubian Complex toolmakers were modern humans.
I have classified both Jebel Irhoud and Singa, as well as Omo II, and the Qafzeh/Skhul hominids as H. sapiens in my recent analysis of the Mounier et al. (2011) data. So, while it would be desirable to have osteological remains from the sites described in this paper, I'd say the odds are greatly in favor of them being modern humans.

Interestingly:
To some degree, the discovery of late Nubian Complex assemblages in Dhofar upholds this model. The distribution of this technocomplex in the middle and lower Nile Valley, the Horn of Africa, Yemen, and now Dhofar provides a trail of diagnostic artifacts - stone breadcrumbs - spread across the southern dispersal route out of Africa. The close similarity between African and Arabian late Nubian Complex assemblages suggests that these sites are more or less contemporaneous; they were separated for an insufficient amount of time for independently derived technological traits to develop between regions. As the late Nubian Complex at Aybut Al Auwal is dated to MIS 5c, slightly earlier than the late Nubian Complex in Africa [11], we remain open to the possibility that the late Nubian Complex originated in Arabia, and subsequently spread back into northeast Africa. Given the coarse chronological resolution in both Africa and Arabia (Table 1), however, the question of directionality cannot be adequately addressed, suffice to say there is cultural exchange across the Red Sea during MIS 5c.
Certainly the osteological evidence of modern humans in Africa (Omo and Irhoud, especially) predates that for the Near East. Also, some kind of Out-of-Africa must have taken place, since the Eurasian Y-chromosome and mtDNA phylogeny can be securely seen as a subset of the African phylogeny. Nonetheless, we don't really know when the Out-of-Africa even took place, and it could very well be that there was a first Into-Africa during MIS 5c.


Importantly:
Although southern Arabia experienced successive periods of extreme aridity after MIS 5, terrestrial archives document another increase in precipitation across the interior of Arabia during early MIS 3 [59], [104], enabling north-south demographic exchange between ~60–50 ka. South Arabian populations may have spread to the north at this time, taking with them a Nubian-derived Levallois technology based on elongated point production struck from bidirectional Levallois cores, which is notably the hallmark of the Middle-Upper Palaeolithic transition in the Levant [105], [106]. Further survey in central Arabia is required to test whether the Nubian Complex extends north of Dhofar. Until then, the fate of the Nubian Complex in Arabia must remain in question.
The puzzle is slowly filling up.

The problem with many of the stone tools found on the Arabian Peninsula, says New York University archaeologist Christian Tryon, is that they don't have much personality. "Stone tools are often unexciting things," he says, "like Paleolithic screwdrivers and hammers." None, he says, displayed signs of craftsmanship that could identify them definitively as the handiwork of African H. sapiens versus other hominids, such as Neandertals. None, that is, until archaeologist Jeffrey Rose of the University of Birmingham in the United Kingdom uncovered tools in Oman in 2010 with clear African connections. "This is the first time I've been convinced," says Tryon, who was not involved in the new work.
If modern humans were living in southern Arabia 106,000 years ago, the important question for human history is what happened next. Did they die out in Oman—another "failed expansion," as archaeologists describe it—or migrate north, going on to populate the globe? If the latter, it would challenge current genetic data placing global human migration out of Africa perhaps 80,000 years ago. Instead of "out of Africa," says Rose, "we could be looking at 'out of Arabia.' "
Indeed.

From the press release:
According to the authors, the evidence from Oman provides a "trail of stone breadcrumbs" left by early humans migrating across the Red Sea on their journey out of Africa. "After a decade of searching in southern Arabia for some clue that might help us understand early human expansion, at long last we've found the smoking gun of their exit from Africa," says Rose. "What makes this so exciting," he adds, "is that the answer is a scenario almost never considered."

These new findings challenge long-held assumptions about the timing and route of early human expansion out of Africa. Using a technique called Optically Stimulated Luminescence (OSL) to date one of the sites in Oman, researchers have determined that Nubian MSA toolmakers had entered Arabia by 106,000 years ago, if not earlier. This date is considerably older than geneticists have put forth for the modern human exodus from Africa, who estimate the dispersal of our species occurred between 70,000 and 40,000 years ago.

Even more surprising, all of the Nubian MSA sites were found far inland, contrary to the currently accepted theory that envisions early human groups moving along the coast of southern Arabia. "Here we have an example of the disconnect between theoretical models versus real evidence on the ground," says co-author Professor Emeritus Anthony Marks of Southern Methodist University. "The coastal expansion hypothesis looks reasonable on paper, but there is simply no archaeological evidence to back it up. Genetics predict an expansion out of Africa after 70,000 years ago, yet we've seen three separate discoveries published this year with evidence for humans in Arabia thousands, if not tens of thousands of years prior to this date."
PLoS ONE 6(11): e28239. doi:10.1371/journal.pone.0028239

The Nubian Complex of Dhofar, Oman: An African Middle Stone Age Industry in Southern Arabia

Jeffrey I. Rose et al.

Despite the numerous studies proposing early human population expansions from Africa into Arabia during the Late Pleistocene, no archaeological sites have yet been discovered in Arabia that resemble a specific African industry, which would indicate demographic exchange across the Red Sea. Here we report the discovery of a buried site and more than 100 new surface scatters in the Dhofar region of Oman belonging to a regionally-specific African lithic industry - the late Nubian Complex - known previously only from the northeast and Horn of Africa during Marine Isotope Stage 5, ~128,000 to 74,000 years ago. Two optically stimulated luminescence age estimates from the open-air site of Aybut Al Auwal in Oman place the Arabian Nubian Complex at ~106,000 years ago, providing archaeological evidence for the presence of a distinct northeast African Middle Stone Age technocomplex in southern Arabia sometime in the first half of Marine Isotope Stage 5.

Link

November 30, 2011

ChromoPainter and fineSTRUCTURE

The paintmychromosomes.com site gives pretty good information on this, although parts of it are still under construction. Link to paper and supporting information.

The authors use linkage information to discover fine-scale population structure. But, it should be noted that, while haplotype information (i.e., the co-inheritance of marker states at the local genomic level) does indeed provide additional information, the inference of fine-scale population structure does not depend on the presence of such information, nor is it impossible to obtain such fine-scale structure without it.

For example, a year ago, I showed how it is possible to infer K=64 clusters from the HGDP panel without using any linkage information, but setting the maximum possible number of clusters at 70. More recently, I inferred 20 clusters in North-Central Europe alone, 42 clusters in Africa alone, or even 124 clusters in my most ambitious run yet (out of a 150 maximum considered).

It should be noted that in all these experiments, the maximum number of clusters considered plays a significant role, as does the number of PCA/MDS dimensions considered, since MCLUST only finds the optimal number of clusters within the given limits. One of these days, I will try a mega-Clusters Galore exercise, with e.g., 100 MDS dimensions and 250 maximum number of clusters. This may take a while to run, but it will show the limits of the Clusters Galore approach.

While I do think that using haplotype information may add some extra power for ancestry inference, it should be properly compared against MCLUST over PCA/MDS, i.e., a state of the art clustering algorithm that has been shown to infer fine-scale structure without using any haplotype information. The claim that such structure "is only captured by the haplotype-based approach" is premature.


(UPDATE Jan 17, 2012) Lawson and Falush have compared MCLUST with fineSTRUCTURE here.


Inference of population structure using dense haplotype data

Daniel John Lawson et al.

The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in unprecedented detail, but presents new statistical challenges. We propose a novel inference framework that aims to efficiently capture information on population structure provided by patterns of haplotype similarity. Each individual in a sample is considered in turn as a recipient, whose chromosomes are reconstructed using chunks of DNA donated by the other individuals. Results of this ‘chromosome painting’ can be summarized as a ‘coancestry matrix’, which directly reveals key information about ancestral relationships among individuals. If markers are viewed as independent, we show that this matrix almost completely captures the information used by both standard Principal Components Analysis (PCA), and model-based approaches such as STRUCTURE, in a unified manner. Furthermore, when markers are in linkage disequilibrium, the matrix combines information across successive markers to increase the ability to discern fine-scale population structure using PCA. In parallel, we have developed an efficient model-based approach to identify discrete populations using this matrix, which offers advantages over PCA in terms of interpretability, and over existing clustering algorithms in terms of speed, number of separable populations, and sensitivity to subtle population structure. We analyse Human Genome Diversity Panel data for 938 individuals and 641,000 markers, and identify 226 populations reflecting differences on continental, regional, local and family scales. We present multiple lines of evidence that whilst many methods capture similar information among strongly differentiated groups, more subtle population structure in human populations is consistently present at a much finer level than currently available geographic labels, and is only captured by the haplotype-based approach.
The software used for this article, ChromoPainter and fineSTRUCTURE are available from http://www.paintmychromosomes.com/

November 29, 2011

Aurignacian in Greece >40 thousand years ago

Related:
Antiquity Volume: 85 Number: 330 Page: 1131–1150

Franchthi Cave revisited: the age of the Aurignacian in south-eastern Europe

K. Douka et al.

The Aurignacian, traditionally regarded as marking the beginnings of Sapiens in Europe, is notoriously hard to date, being almost out of reach of radiocarbon. Here the authors return to the stratified sequence in the Franchthi Cave, chronicle its lithic and shell ornament industries and, by dating humanly-modified material, show that Franchthi was occupied either side of the Campanian Ignimbrite super-eruption around 40000 years ago. Along with other results, this means that groups of Early Upper Palaeolithic people were active outside the Danube corridor and Western Europe, and probably in contact with each other over long distances.

Link

Oldest North African rock art

Antiquity Volume: 85 Number: 330 Page: 1184–1193

First evidence of Pleistocene rock art in North Africa: securing the age of the Qurta petroglyphs (Egypt) through OSL dating

Dirk Huyge et al.

Long doubted, the existence of Pleistocene rock art in North Africa is here proven through the dating of petroglyph panels displaying aurochs and other animals at Qurta in the Upper Egyptian Nile Valley. The method used was optically stimulated luminescence (OSL) applied to deposits of wind-blown sediment covering the images. This gave a minimum age of ~15 000 calendar years making the rock engravings at Qurta the oldest so far found in North Africa.

Link

November 28, 2011

Sephardic signature within mtDNA haplogroup T (?)

This paper proposes that "The haplotype of a suspected Sephardic origin has mutations 16114T-16126T-16153A-16192T-16294T-16519C in the first control region of mitochondrial DNA."

From the paper:
four avenues are pursued: (1) A search is conducted throughout multiple databases of the first control region of mitochondrial DNA for the T2e5 motif to ascertain the prevalence and geographic affiliation of the new haplotype. (2) One T2e5 sample is sequenced for polymorphisms along the entire mitochondrial DNA and compared with T2e sequences to identify any potential coding region mutations that are important for the Sephardic sequence and its relation to other branches. (3) A phylogenetic tree is built from T2e control sequences to provide further information on the relation among lineages including the Sephardic cluster. Although full genomic sequences are usually preferable to avoid misclassifications based on control region information alone, T2e is an ideal subhaplogroup to exploit the more abundant control region data because it is defined by mutations in the control regions alone. Time to the most recent common ancestor is estimated to address questions of when the lineage emerged as well as where. (4) The frequencies of T sub-haplogroups are compared across growing published literature of various populations including from Europe, the Americas, and the Near East. Although the geographic distribution of haplogroup T has been investigated, less is known about the different subhaplogroups, especially T2e.
With respect to (1), the author writes:
The combined databases do not appear to have any biases for Iberia, Mexico, or Sephardim.
This is a rather weak claim, since the incidence of a haplotype in a given dataset depends on the relative number of samples of the different populations, and Sephardic Jews are indeed over-represented in the database searches relative to their actual population numbers. In any case, no explicit test of bias was performed.

Stronger evidence for the Sephardic-ness of the haplotype in question could be arrived at by dating it to a period consistent with the origins of that population. However:
Time estimates to the most recent common ancestor of the Sephardic signature T2e5 ranged all the way from after the expulsion – clearly impossible – to ~15 000 years before present (YBP) (Fast: 338 YBP, 95% confidence interval (95% CI)=present to 763 YBP; Intermediate: 688 YBP, 95% CI=present-3820 YBP; slow: 6811 YBP, CI1=present to 15 245). Given mutations rates that vary by two orders of magnitude,22 as well as other issues with mutation rates and the rho statistic,23,41 at present coalescence analysis cannot be used to distinguish between different plausible timelines for the proposed Sephardic cluster.
The third piece of evidence in favor of the hypothesis of this paper is the frequency of the parent haplogroup T2e relative to T2b. This is, however, irrelevant, since mtDNA haplogroup T2e has been found in prehistoric European hunter-gatherers, so its higher frequency in Saudi Arabia today does not indicate that its presence in Europe was effected in historical times, e.g., by Jews.

Moreover, higher frequency -in itself- does not indicate the direction of gene flow. Suppose that a particular haplogroup occurs at a frequency of 50% in a population A of 10 million that lives 2,000 miles away, and at a frequency of 10% in a population B of 500 million that lives 500 miles away. Population B has ten times as many carriers (50 million versus 5 million) and lives much closer, so it is clearly a much better source of the haplogroup than A, despite its lower frequency.
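
The arithmetic behind this hypothetical can be checked in a few lines of Python. The populations and figures are the made-up ones from the paragraph above, and the function name is mine; the sketch only counts carriers and does not model distance, which in this example favors population B anyway.

```python
def expected_carriers(population_size, frequency_percent):
    """Expected number of haplogroup carriers, using integer arithmetic."""
    return population_size * frequency_percent // 100


# Hypothetical populations from the text: (size, haplogroup frequency in %).
carriers_a = expected_carriers(10_000_000, 50)   # population A, 2,000 miles away
carriers_b = expected_carriers(500_000_000, 10)  # population B, 500 miles away

print("Carriers in A:", carriers_a)
print("Carriers in B:", carriers_b)
print("B has", carriers_b // carriers_a, "times as many carriers as A")
```

The point is that the pool of potential migrants carrying the haplogroup depends on frequency multiplied by population size, not on frequency alone.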

The final piece of evidence produced by the author:
The small T2e5 cluster satisfies criteria for being a signature. Although it is premature to set specific thresholds of a signature, a sample of 25% known Sephardic and 50% suspicion of Sephardic origin is overwhelmingly above what would be expected for a general European haplogroup.
On the contrary, T2e5 is found in Latin America (including Brazil), Iberia, and among Sephardic Jews who trace their ancestry to Iberia. Hence, if there is anything "in common" among the current carriers of T2e5, it is the geographical background of Iberia.

Strong evidence for the specific Jewish origin of T2e5 would be provided if it turned up in a different Jewish population. In that case, it could be well argued that this was indeed a lineage of Jewish origin that happened (for whatever reason) to become more frequent in the Sephardic population. On the contrary, the absence of T2e5 in non-Sephardic Jews suggests that this is not necessarily a Jewish-origin lineage.

In conclusion: this paper represents a valiant attempt to identify a Sephardic signature, but I remain unconvinced that a strong enough case for T2e5 being such a signature has been made. The evidence appears to be consistent with that hypothesis, but not sufficient to reject alternatives, namely that this represents a European founder in the Sephardic population. Indeed, the author honestly admits that the origin of the "Sephardic signature" remains elusive:
These include Jewish settlers seeking asylum after destruction of temples in Jerusalem by Romans and Babylonians 2000–2500 years ago, slightly earlier Jewish settlers in Iberia,7,43 non-Jewish Muslims in the dispersal of Islam 1000+ years ago, non-Jewish Iberian peopling 2500+ years ago that predates all Jewish influx,44 and settlers in Iberia (or Italy) 45000 years ago that entirely predate the existence of Jewish groups. Thus, what is arguably the most contentious issue of whether there is genetic evidence of original Jewish DNA for the Sephardic line cannot be resolved.
Does it matter whether the line was originally Jewish or not? Not in the grand scheme of things, but it is certainly important for genealogists: if it was originally Jewish then, e.g., Latin Americans who belong to it must seek Sephardic Jewish ancestors; if it was pre-Jewish Iberian, then they may or may not have such ancestors.

PS: A minor mistake in the paper is the identification of a Sephardic sample as coming from "Salonica, Turkey". Salonica has, of course, never been part of Turkey: it was part of the Ottoman Empire and is now part of Greece. Fortunately Salonica is prominent enough to avoid confusion, but it's always a good idea to use appropriate terminology when referring to placenames.


European Journal of Human Genetics advance online publication 23 November 2011; doi: 10.1038/ejhg.2011.200

Sephardic signature in haplogroup T mitochondrial DNA

Felice L Bedford

Abstract
A rare combination of mutations within mitochondrial DNA subhaplogroup T2e is identified as affiliated with Sephardic Jews, a group that has received relatively little attention. Four investigations were pursued: Search of the motif in 250 000 control region records across 8 databases, comparison of frequencies of T subhaplogroups (T1, T2b, T2c, T2e, T4, T*) across 11 diverse populations, creation of a phylogenic median-joining network from public T2e control region entries, and analysis of one Sephardic mitochondrial full genomic sequence with the motif. It was found that the rare motif belonged only to Sephardic descendents (Turkey, Bulgaria), to inhabitants of North American regions known for secret Spanish–Jewish colonization, or were consistent with Sephardic ancestry. The incidence of subhaplogroup T2e decreased from the Western Arabian Peninsula to Italy to Spain and into Western Europe. The ratio of sister subhaplogroups T2e to T2b was found to vary 40-fold across populations from a low in the British Isles to a high in Saudi Arabia with the ratio in Sephardim more similar to Saudi Arabia, Egypt, and Italy than to hosts Spain and Portugal. Coding region mutations of 2308G and 14499T may locate the Sephardic signature within T2e, but additional samples and reworking of current T2e phylogenetic branch structure is needed. The Sephardic Turkish community has a less pronounced founder effect than some Ashkenazi groups considered singly (eg, Polish), but other comparisons of interest await comparable averaging. Registries of signatures will benefit the study of populations with a large number of smaller-size founders.

Link

November 27, 2011

Ancient mtDNA from Cardial culture (early Neolithic Iberia)

Molecular Ecology DOI: 10.1111/j.1365-294X.2011.05361.x

Ancient DNA from an Early Neolithic Iberian population supports a pioneer colonization by first farmers

C. GAMBA et al.

The Neolithic transition has been widely debated particularly regarding the extent to which this revolution implied a demographic expansion from the Near East. We attempted to shed some light on this process in northeastern Iberia by combining ancient DNA (aDNA) data from Early Neolithic settlers and published DNA data from Middle Neolithic and modern samples from the same region. We successfully extracted and amplified mitochondrial DNA from 13 human specimens, found at three archaeological sites dated back to the Cardial culture in the Early Neolithic (Can Sadurní and Chaves) and to the Late Early Neolithic (Sant Pau del Camp). We found that haplogroups with a low frequency in modern populations—N* and X1—are found at higher frequencies in our Early Neolithic population (∼31%). Genetic differentiation between Early and Middle Neolithic populations was significant (FST∼0.13, P less than 10−5), suggesting that genetic drift played an important role at this time. To improve our understanding of the Neolithic demographic processes, we used a Bayesian coalescence-based simulation approach to identify the most likely of three demographic scenarios that might explain the genetic data. The three scenarios were chosen to reflect archaeological knowledge and previous genetic studies using similar inferential approaches. We found that models that ignore population structure, as previously used in aDNA studies, are unlikely to explain the data. Our results are compatible with a pioneer colonization of northeastern Iberia at the Early Neolithic characterized by the arrival of small genetically distinctive groups, showing cultural and genetic connections with the Near East.

Link

Y-haplogroup O3 in the eastern Himalayas

From the paper:
The samples were typed through seven panels of 75 single nucleotide polymorphisms (SNPs), as listed in the latest Y-chromosome phylogenetic tree (Karafet et al., 2008). The panels were organized as follows: Panel 1 (within Haplogroup O), M175, M119, P203, M110, M268, P31, M95, M176, M122, M324, M121, P201, M7, M134, M117, 002611, P164, L127 (rs17269396), KL1 (rs17276338); Panel 2 (non-Haplogroup O), M130, P256, M1, M231, M168, M174, M45, M89, M272, M258, M242, M207, M9, M96, P125, M304, M201, M306; Panel 3 (Haplogroup C), M217; Panel 4 (Haplogroup D), P47, N1, P99, M15, M125, M55, M64.1, M116.1, M151, N2, 022457; Panel 5 (Haplogroup N), M214, LLY22g, M128, M46/Tat, P63, P119, P105, P43, M178; Panel 6 (Haplogroup R), M306, M173, M124, M420, SRY10831.2, M17, M64.2, M198, M343, V88, M458, M73, M434, P312, M269, U106/M405; Panel 7 (Haplogroup Q), P36.2.
Wikipedia article on Luoba and Deng.

Annals of Human Genetics DOI: 10.1111/j.1469-1809.2011.00690.x

Y-chromosome O3 Haplogroup Diversity in Sino-Tibetan Populations Reveals Two Migration Routes into the Eastern Himalayas

Longli Kang et al.

The eastern Himalayas are located near the southern entrance through which early modern humans expanded into East Asia. The genetic structure in this region is therefore of great importance in the study of East Asian origins. However, few genetic studies have been performed on the Sino-Tibetan populations (Luoba and Deng) in this region. Here, we analyzed the Y-chromosome diversity of the two populations. The Luoba possessed haplogroups D, N, O, J, Q, and R, indicating gene flow from Tibetans, as well as the western and northern Eurasians. The Deng exhibited haplogroups O, D, N, and C, similar to most Sino-Tibetan populations in the east. Short tandem repeat (STR) diversity within the dominant haplogroup O3 in Sino-Tibetan populations showed that the Luoba are genetically close to Tibetans and the Deng are close to the Qiang. The Qiang had the greatest diversity of Sino-Tibetan populations, supporting the view of this population being the oldest in the family. The lowest diversity occurred in the eastern Himalayas, suggesting that this area was an endpoint for the expansion of Sino-Tibetan people. Thus, we have shown that populations with haplogroup O3 moved into the eastern Himalayas through at least two routes.

November 26, 2011

mtDNA of Venezuelans

Am J Phys Anthropol DOI: 10.1002/ajpa.21629

A melting pot of multicontinental mtDNA lineages in admixed Venezuelans

Alberto Gómez-Carballa et al.

The arrival of Europeans in Colonial and post-Colonial times coupled with the forced introduction of sub-Saharan Africans have dramatically changed the genetic background of Venezuela. The main aim of the present study was to evaluate, through the study of mitochondrial DNA (mtDNA) variation, the extent of admixture and the characterization of the most likely continental ancestral sources of present-day urban Venezuelans. We analyzed two admixed populations that have experienced different demographic histories, namely, Caracas (n = 131) and Pueblo Llano (n = 219). The native American component of admixed Venezuelans accounted for 80% (46% haplogroup [hg] A2, 7% hg B2, 21% hg C1, and 6% hg D1) of all mtDNAs; while the sub-Saharan and European contributions made up ∼10% each, indicating that Trans-Atlantic immigrants have only partially erased the native American nature of Venezuelans. A Bayesian-based model allowed the different contributions of European countries to admixed Venezuelans to be disentangled (Spain: ∼38.4%, Portugal: ∼35.5%, Italy: ∼27.0%), in good agreement with the documented history. Seventeen entire mtDNA genomes were sequenced, which allowed five new native American branches to be discovered. B2j and B2k, are supported by two different haplotypes and control region data, and their coalescence ages are 3.9 k.y. (95% C.I. 0–7.8) and 2.6 k.y. (95% C.I. 0.1–5.2), respectively. The other clades were exclusively observed in Pueblo Llano and they show the fingerprint of strong recent genetic drift coupled with severe historical consanguinity episodes that might explain the high prevalence of certain Mendelian and complex multi-factorial diseases in this region.

Link