Showing posts with label X2. Show all posts

January 23, 2015

Ancient mtDNA from collective burials in Germany

Journal of Archaeological Science Volume 51, November 2014, Pages 174–180

Collective burials among agro-pastoral societies in later Neolithic Germany: perspectives from ancient DNA

Esther J. Lee et al.

Ancient DNA research has focused on the genetic patterns of the earliest farmers during the European Neolithic, especially with regards to the demographic changes in the transition from hunting and gathering to agriculture. However, genetic data is relatively lacking after this earliest transition period, when societies had fully adapted to new agrarian lifestyles specific to their local environment. During the later central European Neolithic (ca. 3600–2800 cal BC), large-scale collective burials and monumental architecture appeared within the landscape of many agricultural societies. This phenomenon has been argued to represent the emergence of a “collective” identity. With the aim of exploring genetic-based relations among individuals collectively buried, we obtained human skeletal remains of nearly 200 individuals from four later Neolithic collective burial sites in Germany: Calden, Odagsen, Großenrode, and Panker. We successfully reproduced reliable mitochondrial DNA (mtDNA) haplotypes from eight Neolithic individuals, which were assigned to haplogroups H, HV0, and X2. Shared haplotypes observed among individuals within Calden and Odagsen suggest that genetic relations may have shaped the arrangement of the deceased within later Neolithic agricultural groups.

Link

October 17, 2012

Ancient mtDNA haplogroup X2 from Central Europe

Davidski reminds me of a paper by Lee et al. whose abstract I had posted without comment. He highlights the fact that mtDNA haplogroup X2 has been detected at this site (3.6-2.8ky cal BC) but not in earlier LBK Neolithic Europeans. Furthermore, he attributes the arrival of X2 in Europe to "Northwest Eurasians":
Reading the quotes below, I can’t help thinking that X2 lineages in Europe might be associated with the arrival of the so called Northwest Eurasians of North/Central/East Europe and the North Caucasus, while X1 with the earlier migrations of the Sardinian-like Southwest Eurasians of Mediterranean Europe, North Africa and the Near East.
However, mtDNA haplogroup X2 seems to have originated in the Near East:
Finally, phylogeography of the subclades of haplogroup X suggests that the Near East is the likely geographical source for the spread of subhaplogroup X2, and the associated population dispersal occurred around, or after, the LGM when the climate ameliorated. The presence of a daughter clade in northern Native Americans testifies to the range of this population expansion.
Moreover, it occurs at a higher frequency in Southern Europeans than Northern Europeans and is well-represented in the Caucasus, Near East, and even Africa. These twin facts are inconsistent with it being related to "Northwest Eurasians", however that hypothetical people is defined.

Of related interest, mtDNA haplogroup X2b has been detected in Iron Age "princely burials" from the same location and by the same group. Also from Reidla et al.:
The sister groups X2b and X2c (X1 and X2, respectively, in the work of Herrnstadt et al. 2002) encompass one-third of the European sequences (excluding the samples from the North Caucasus). It is of interest that some North African sequences (from Morocco and Algeria) belong to X2b as well. Subhaplogroup X2b shows a diversity that is consistent with a postglacial population expansion in both West Eurasia and North Africa.
Fernandes et al. (2012) consider X2b to be of European origin. X2 has been discovered in a Megalithic long mound from France (4.2ky cal BC), and in abundance at Treilles (c. 3,000 BC), in the latter case associated with a predominantly Y-haplogroup G2a (with some I-P37.2) population. In Jean Manco's excellent compendium, X2b is also listed as being present in Neolithic Portugal (3,400 years BC), and X2j in Neolithic Germany (4625-4250 BC); the latter is said to be "North African" by Fernandes et al. (2012).

Therefore, we can probably reject Davidski's speculation...
So, X2 has been located at multiple late Neolithic sites in Central Europe, including the Corded Ware burial ground at Eulau, Eastern Germany. Of course, that’s also where Y-chromosome haplogroup R1a was found (see here). I suspect this wasn’t a coincidence and it’s likely these markers entered Europe together from the east, probably between 4,000 and 3,000 B.C.
X2 shows no association with northern Europeans at present, and occurs in ancient DNA samples from Western Europe that show no indication of being related to Y-haplogroup R1a at all, and even precede the hypothetical 4-3ky BC entry window.

Also of interest is that no X2 was mentioned in recent published data from Ukraine and West Siberia, and none of it was detected in Mesolithic Europeans. So, it seems that X2 variants entered Europe during the Neolithic, and there is no indication that they did so with Davidski's hypothetical R1a-bearing "Northwest Eurasians".

September 11, 2012

Ancient mtDNA from late Neolithic collective burials in Germany

Journal of Archaeological Science doi:10.1016/j.jas.2012.08.037

Collective burials among agro-pastoral societies in later Neolithic Germany: Perspectives from ancient DNA

Esther J. Lee et al.

Abstract

Ancient DNA research has focused on the genetic patterns of the earliest farmers during the European Neolithic, especially with regards to the demographic changes in the transition from hunting and gathering to agriculture. However, genetic data is relatively lacking after this earliest transition period, when societies had fully adapted to new agrarian lifestyles specific to their local environment. During the later central European Neolithic (ca. 3600 - 2800 cal BC), large-scale collective burials and monumental architecture appeared within the landscape of many agricultural societies. This phenomenon has been argued to represent the emergence of a “collective” identity. With the aim of exploring genetic-based relations among individuals collectively buried, we obtained human skeletal remains of nearly 200 individuals from four later Neolithic collective burial sites in Germany: Calden, Odagsen, Großenrode, and Panker. We successfully reproduced reliable mitochondrial DNA (mtDNA) haplotypes from eight Neolithic individuals, which were assigned to haplogroups H, HV0, and X2. Shared haplotypes observed among individuals within Calden and Odagsen suggest genetic relations may have shaped the arrangement of the deceased within later Neolithic agricultural groups.

Link

January 30, 2012

AAPA 2012 abstracts (part 1)

Here are some interesting abstracts from the 81st Annual Meeting of the American Association of Physical Anthropologists.


Maternal marks of admixture in Cape Coloreds of South Africa.
KRISTINE G. BEATY, DELISA L. PHILLIPS, MACIEJ HENNEBERG and MICHAEL H. CRAWFORD.
Previous studies of genetic diversity have suggested that the Cape Coloureds of South Africa are a highly admixed population with genetic roots from indigenous African groups including Khoisans, and the later arrival of Bantu speaking Xhosa farmers. Further genetic contributions came during European colonization of South Africa, which added to the inclusion of largely male European markers to the gene pool. Slaves from Indonesia, Malaysia, Madagascar and India are also thought to have contributed to the genetic makeup of this ethnic group. This study examines the maternal contribution of each of these groups to the genetic diversity of the Cape Coloreds through sequencing of the hypervariable region I of the mitochondrial DNA and through restriction fragment length polymorphism.
A total of 123 individuals were examined for this study. High frequencies of haplogroups L1 and L2 were found at 81.3 percent in this group (100 of the 123 individuals), which indicates that this group has a large African contribution to its mitochondrial makeup. Restrictions of the major European haplogroups identified nine individuals, 7.3 percent of the sample, belonged to haplogroups I and J. Five individuals (4.1 percent of the sample) belonged to the superhaplogroup M, indicating that Asian slaves did contribute to the maternal gene pool. The majority of maternal lineages in this Cape Coloured sample are African in origin, with some European influence and a small contribution from Asian maternal lineages.
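The percentages quoted in this abstract can be checked directly from the stated counts. The following is only a quick sanity check, not anything from the authors; the dictionary labels are my own shorthand for the groups the abstract names.

```python
# Counts as stated in the abstract; the labels are shorthand, not the authors' terms.
total = 123
counts = {"L1/L2": 100, "I/J": 9, "M": 5}

# Percentage of the sample in each group, rounded to one decimal as in the abstract.
percentages = {hg: round(100 * n / total, 1) for hg, n in counts.items()}

# Individuals not covered by the three groups the abstract enumerates.
unassigned = total - sum(counts.values())

for hg, pct in percentages.items():
    print(f"{hg}: {pct}%")
print(f"not covered by the abstract's breakdown: {unassigned}")
```

The three figures agree with the abstract, and the enumerated groups account for 114 of the 123 individuals; the abstract does not say where the remaining nine fall.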

Ancient DNA reveals the population origin of the Eastern Xinjiang.
SHIZHU GAO, HONGJIE LI, CHUNXIANG LI and HUI ZHOU.
Connecting with the Turpan Basin, the Eurasia steppe and the Gansu Corridor, the Eastern region of Xinjiang has played a significant role in the history of human migration, cultural developments, and communications between the East and the West. The population origin, migration and integration of this region have attracted extensive interest among scientists.
In order to research the population origin and movement of the Eastern Xinjiang, genetic polymorphisms studies of the Hami population were conducted. The Hami site is located in the East of Tian-Mountain in Xinjiang, dating back to the Bronze-early Iron Age. Archaeological studies showed that the culture of the Hami site possessed features from both the East and the West. Ancient mtDNA analysis showed that A, C, D, F, G, Z and M7 of the Eastern maternal lines, and W, U2e, U4, and U5a of the Western maternal lines were identified. Tajima’s D test and mismatch distribution analysis show that the Hami population had experienced population expansion in recent time. The demographic analysis of haplogroups suggests that the populations of the Northwest China, Siberia and the Central Asia have contributed to the mtDNA gene pool of the Hami population.
Our study reveals the genetic structure of the early population in Eastern Xinjiang, and its relationships with other Eurasian populations. The results will provide valuable genetic information to further explore the population origin and migration of Xinjiang and Central Asia.


Analysis of Chuvash mtDNA points to Finno-Ugric origin.
ORION M. GRAF, STEPHEN M. JOHNSON, JOHN MITCHELL, STEPHEN WILCOX, GREGORY LIVSHITS and MICHAEL H. CRAWFORD.
A sample of 92 unrelated individuals from Chuvashia, Russia was sequenced for hypervariable region-I (HVR-I) of the mtDNA molecule. These data have been verified using RFLP analysis of the control region, revealing that the majority exhibit haplogroups H (31%), U (22%), and K (11%), which occur in high frequencies in western and northern Europe, but are virtually absent in Altaic or Mongolian populations. Multidimensional scaling (MDS) was used to examine distances between the Chuvash and reference populations from the literature. Neutrality tests (Tajima’s D (-1.43365) p<0.05, Fu’s FS (-25.50518) p<0.001) and mismatch analysis, which illustrates unimodal distribution, all suggest an expanding population.
The Chuvash speak a Turkic language that is not mutually intelligible to other extant Turkish groups, and their genetics are distinct from Turkic-speaking Altaic groups. Some scholars have suggested that they are remnants of the Golden Horde, while others have advocated that they are the products of admixture between Turkic and Finno-Ugric speakers who came into contact during the 13th century. Earlier genetic research using autosomal DNA markers indicated a Finno-Ugric origin for the Chuvash. This study examines uniparental mitochondrial DNA markers to better elucidate their origins. Results from this study maintain that the Chuvash are not related to Altaic or Mongolian populations along their maternal line, thus supporting the “Elite” hypothesis that their language was imposed by a conquering group —leaving Chuvash mtDNA largely of Eurasian origin. Their maternal markers appear to most closely resemble Finno-Ugric speakers rather than Turkic speakers.
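For readers unfamiliar with the neutrality test cited in this abstract, here is a minimal sketch of how Tajima's D is computed from aligned sequences, following the standard formula of Tajima (1989). It is not the authors' pipeline (they would have used dedicated population-genetics software), the function name `tajimas_d` is my own, and the toy sequences are hypothetical, chosen only to show the sign of the statistic.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (strings)."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) sites
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D compares pi with Watterson's estimate S / a1, scaled by its standard deviation
    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))


# Hypothetical toy alignments, not real HVR-I data.
excess_rare = ["AAAA", "TAAA", "ATAA", "AATA"]   # every variant is a singleton
balanced = ["AA", "AA", "TT", "TT"]              # variants at intermediate frequency

d_rare = tajimas_d(excess_rare)
d_balanced = tajimas_d(balanced)
```

An excess of rare variants, as in the first toy sample, gives a negative D, which is the pattern the abstract reads as a signal of population expansion; variants at intermediate frequency, as in the second, give a positive D.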


An ancient DNA perspective on the Iron Age “princely burials” from Baden-Württemberg, Germany.
ESTHER J. LEE, CHRISTOPH STEFFEN, MELANIE HARDER, BEN KRAUSE-KYORA, NICOLE VON WURMB-SCHWARK and ALMUT NEBEL.
During the Iron Age in Europe, fundamental social principles such as age, gender, status, and kinship were thought to have played an important role in the social structure of Late Hallstatt and Early La Tène societies. In order to address the question of kinship relations represented in the Iron Age “princely burials” that are characterized by their rich material culture, we carried out genetic analysis of individuals associated with the Late Hallstatt culture from Baden-Württemberg, Germany. Bone specimens of thirty-eight skeletal remains were collected from five sites including Asperg Grafenbühl, Mühlacker Heidenwäldle, Hirschlanden, Ludwigsburg, and Schodeingen. Specimens were subjected to DNA extraction and amplification under strict criteria for ancient DNA analysis. We successfully obtained mitochondrial DNA (mtDNA) control region sequences from seventeen individuals that showed different haplotypes, which were assigned to nine haplogroups including haplogroups H, I, K, U5, U7, W, and X2b. Despite the lack of information from nuclear DNA to infer familial relations, information from the mtDNA suggests an intriguing genetic composition of the Late Hallstatt burials. In particular, twelve distinct haplotypes from Asperg Grafenbühl suggest a heterogeneous composition of maternal lineages represented in the “princely burials”. The results from this study provide clues to the social structure reflected in the burial patterns of the Late Hallstatt culture and implications on the genetic landscape during the Iron Age in Europe.


Genetic snapshot from ancient nomads of Xinjiang.
HONGJIE LI, SHIZHU GAO, CHUNXIANG LI, YE ZHANG, WEN ZENG, DONG WEI and HUI ZHOU.
Nomads of the Eurasian steppes are known to have played an important role in the transfer of commodities and culture among East Asia, Central Asia, and Europe. However, the organization of nomadic societies and the initial population genetic composition of nomads are still poorly understood because archaeological materials and written history are scarce.
In this study, a genetic snapshot of nomads emerged from examining mitochondrial DNA and Y-chromosome DNA of 30 human remains from the Heigouliang (HGL) site in eastern Xinjiang, which is dated to 2000 years ago and associated with the nomadic culture by archaeological studies. Mitochondrial DNA analysis showed that the HGL population included both East Eurasian haplogroups (A, C, D, G, F and Z) and West Eurasian haplogroups (H, K, J, M5 and H). The Eastern component is dominant. The distribution frequency and Fst values of Eastern haplogroups indicated that the HGL population had close genetic affinity to modern populations of the nearby regions of Gansu and Qinghai, while those of Western haplogroups were similar to Mongolian and Siberian populations. The results implied that various maternal lineages were introduced into the HGL population. Regarding the Y-chromosomal DNA analysis, nearly all samples belonged to haplogroup Q, which is thought to be the mark of the Northern Asian nomads. We identified paternal kinship among three individuals in the same tomb by Y-STR markers.
Combined with archaeological and anthropological investigations, we inferred that the gene flow from the neighboring regions was possibly associated with the expansion of Xiongnu Empire.


Vikings, merchants and pirates at the top of the world: Y-chromosomal signatures of recent and ancient migrations in the Faroe Islands.
ALLISON E. MANN, EYDFINN MAGNUSSEN and CHRISTOPHER R. TILLQUIST.
The Faroe Islands are a small archipelago in the North Atlantic Ocean. With a current population of approximately 48,000 individuals and evidence of high levels of genetic drift, the Faroese are thought to have remained highly homogeneous since the islands were settled by Vikings around 900CE. Despite their geographic isolation, however, there is historical evidence that the Faroese experienced sporadic contact with other populations since the time of founding. Contact with Barbary pirates in the seventeenth century is documented in the Faroes; there is also the possibility of modern migrations to work in the highly productive fishery. This study set out to distinguish the signal of the original founders from later migrants. Eleven Y-chromosomal STR markers were scored for 139 Faroese males from three geographically dispersed islands. Haplotypes were analyzed using Athey's method to infer haplogroup. Median-joining networks within haplogroups were constructed to determine the phylogenetic relationships within the Faroese and between likely parental populations—Danish, Irish, and Norwegians. Dispersal patterns of individuals around Faroese haplogroups suggest different times of haplotype introduction to the islands. The most common haplogroup, R1a, consists of a large node with a tight network of neighbor haplotypes, such that 68% of individuals are one or two mutational steps away. This pattern may represent the early founder event of R1a in the Faroes. Other distributions, especially of non-Scandinavian haplotypes, document more recent introductions to the islands. The overall pattern is one of a strong founder effect followed by minor instances of later migrations.



Date estimates for major mitochondrial haplogroups in Yemen.
DEVEN N. VYAS, VIKTOR ČERNÝ, ALI AL-MEERI and CONNIE J. MULLIGAN.
Yemen occupies a key location as the first stop for anatomically modern humans on a theoretical southern migration route out of Africa. If modern humans did pass through Yemen during the first migrations out of Africa and if they left modern-day descendants, we would expect to see deep divergences in the Yemeni mitochondrial gene tree. Alternatively, if modern humans passed through Yemen but did not leave modern-day descendants or if Yemen was not on the path of these ancient migrations, we would expect more recent dates to be associated with Yemeni mitochondrial haplogroups.
Using 44 previously sequenced mitochondrial genomes as well as 24 newly sequenced mitochondrial genomes from samples collected throughout Yemen, several methods were used to estimate divergence dates of major Yemeni haplogroups including L2, M, R0a and HV. Specifically, phylogenetic trees were generated using MrBayes and maximum likelihood methods. Bayesian and ρ statistic based methods were used to estimate dates of Yemeni haplogroups and these dates were compared with each other, previously published dates for these haplogroups, approximate dates of climatic change that might be expected to correlate with population expansions, and estimates based on archaeological and paleontological evidence for the first migrations out of Africa. These comparisons are intended to cover the range of possible haplogroup divergence dates with respect to the history of early modern humans in southern Arabia.


September 30, 2011

"Comparing Ancient and Modern DNA Variability in Human Populations" abstracts

Excerpts from the conference site.

Temporal differentiation across a West-European Y-chromosomal cline - genealogy as a tool in human population genetics
Maarten H.D. Larmuseau et al.
The pattern of population genetic variation and allele frequencies within a species are unstable and are changing in time according to different evolutionary factors. For humans, it is possible to combine detailed patrilineal genealogical records with deep Y-chromosome genotyping to disentangle signals of historical population genetic structures due to the exponential increase of genetic genealogical data. To test this approach we studied the temporal pattern of the 'autochthonous' micro-geographical genetic structure in the region of Brabant in Belgium and The Netherlands (Northwest-Europe). Genealogical data of 881 individuals from Northwest-Europe were collected from which 634 family trees showed a residence within Brabant for at least one generation. The Y-chromosome genetic variation of the 634 participants was investigated using 110 Y-SNPs and 38 Y-STRs and linked to particular locations within Brabant on specific time periods based on genealogical records. Significant temporal variation in the Y-chromosome distribution was detected through a north-south gradient in the frequency distribution of subhaplogroup R1b1b2a1 (R-U106), next to an opposite trend for R1b1b2a2g (R-U152). The gradient on R-U106 faded in time and became even totally invisible during the Industrial revolution in the first half of the 19th century. Therefore, genealogical data for at least 200 years are required to study small-scale 'autochthonous' population structure in Western-Europe.

The Dutch medieval and post-medieval genetic landscapes
Eveline Altena et al.
Since 2005 many archeological human skeletons have been sampled for DNA research under forensic conditions in The Netherlands. This enables us to perform a large scale genetic survey on reliable genetic data from the prehistory until the present. The majority of the available archaeological DNA samples, though, originate from medieval and post-medieval sites. Here we present preliminary autosomal and Y-chromosomal data from more than 500 archaeological human skeletons, excavated at several medieval and post-medieval sites. We also compare these historical genetic data with data from more than 2000 modern Dutch males.

Comparing ancient and modern DNA variability in North Eastern Iberia: the Neolithic impact of first farmers
Cristina Gamba et al.
Archaeological, anthropological and demographic hypotheses can be tested by comparing ancient and modern DNA from human samples in a diachronical context. In this case, it was possible to evaluate genetic continuity or discontinuity between different periods, and/or to infer ancient human migrations in a set of Iberian samples. We evaluated the demographic impact associated to the spread of the Neolithic in North Eastern Iberia. We recovered mitochondrial DNA from 13 Early Neolithic specimens from three archaeological sites: Can Sadurní, Chaves and Sant Pau. A bayesian simulation approach was performed to compare the obtained results with Middle Neolithic and modern samples from the same region. We tested different scenarios to determine which among them better explained the analyzed data. By comparing simulated and observed FST values, we observed genetic differentiation between Early Neolithic and Middle Neolithic populations, which suggests that at the beginning of the Neolithic, genetic drift played an important role.
Genetic differentiation was also observed between Early Neolithic and modern-day populations. These data are compatible with the arrival of small genetically-distinctive groups at the beginning of the Neolithic, suggesting a pioneer colonization of North Eastern Iberia by first farmers.

The following abstract is interesting as it suggests we should not view the "Neolithic" as a singular event. X2 was also discovered in Megalithic France, in a likely immigrant population from the Near East and the Caucasus in the Tarim Basin, and in Bronze Age Eulau. From the paper by Reidla et al. (2003): Overall, it appears that the populations of the Near East, the Caucasus, and Mediterranean Europe harbor subhaplogroup X2 at higher frequencies than those of northern and northeastern Europe (P < .05) and that X2 is rare in Eastern European as well as Central Asian, Siberian, and Indian populations and is virtually absent in the Finno-Ugric and Turkic-speaking people of the Volga-Ural region.

Where are all the "WIX"? Rare European maternal lineages W, I, and X2 in the past and present
Esther J. Lee et al.
Studies utilizing ancient DNA to examine past populations in Europe have increased dramatically in recent years. Specifically, mitochondrial DNA (mtDNA) sequences for over 100 individuals in prehistoric Europe have been sequenced and published. Scholars have intensively focused on the so-called Neolithic transition in Europe, the transformation from hunter-gatherer lifestyle to agro-pastoralism, and continue to debate whether the process was a result of population movement or cultural dispersion. Both hypotheses continue to be tested and genetic analyses from past and present populations have suggested a complex movement of people and cultures across Eurasia. This work focuses on the mtDNA haplogroups identified in past European populations that are rare in the present, haplogroups W, I, and X2. New data will be presented from Neolithic Funnel Beaker collective burial sites, a late Neolithic Bell Beaker site, and an Iron Age Hallstatt site in Germany, in which the three maternal lineages are identified. Among the published European Neolithic data, haplogroup X2 appears in late Neolithic sites in Germany and France but not in the earlier LBK culture. Haplogroup X2 shows an intriguing phylogenetic landscape with a wide geographical distribution at an overall low frequency, but on the other hand, pockets of high diversity and frequency among certain modern western Eurasian populations have been described. The discussion focuses on whether the presence of the three haplogroups in the past is a result of ascertainment bias or some viable population movement.

The following seems to suggest Denisova admixture in the East Asian mainland, and not just the island groups, identified in the recent Reich et al. (2011) paper. The sentence about biased Neandertal similarity with increasing distance to Africa is also interesting; the data that is available so far shows non-significant differences in Neandertal similarity among Eurasians, although the published values do seem to show higher (and perplexing) averages in China vs. Europe.

Archaic human ancestry in East Asia
Pontus Skoglund & Mattias Jakobsson
Recent studies of ancient genomes have suggested that gene flow from archaic hominin groups to the ancestors of modern humans occurred on two separate occasions during the modern human expansion out of Africa. At the same time, decreasing levels of human genetic diversity have been found at increasing distance from Africa as a consequence of human expansion out of Africa. We re-analyzed the signal of archaic ancestry in modern human populations and we investigated how serial founder models of human expansion affect the signal of archaic ancestry using simulations. We show that genetic drift coupled with an ascertainment bias for common alleles can cause artificial, but largely predictable, differences in affinity to archaic genomes between descendants of an admixture event. In genotype data from non-African humans, this effect results in a biased genetic similarity to Neandertal with increasing distance from Africa. In addition to the two previously reported connections between non-Africans and Neandertals as well as between Oceanians and a Denisovan archaic human genome from Siberia, we found a significant affinity between East Asians (in particular Southeast Asians) and the Denisovan genome, a pattern that is not expected under a model of solely Neandertal-related admixture in the ancestry of East Asians. This observation could be explained either by substantial migration from Oceania into East Asia, or more common history between anatomically modern- and archaic populations than previously proposed.

August 18, 2010

Ancient Megalithic mtDNA from France

An extremely interesting paper, the first one on Megalithic remains, and a link between the Megalithic people and the early central European Neolithic Linearbandkeramik, where N1a was unexpectedly detected as a major component a few years ago. I'll probably have more to say on this after I read the paper.

UPDATE:

From the paper:
We reproducibly retrieved partial HVR-I sequences (nps 16,165 to 16,390) from three human remains (Prissé 1, 2, and 4, Table 1), one adult and two children deposited during different stages of use of the burial chamber. Corresponding sequences could be unambiguously assigned to haplogroups X2, U5b, and N1a (Table 2 and Supporting Online Information).
Haplogroup U5b subclusters are believed to have spread from central-southern Europe post-LGM. Haplogroup X2 is believed to have spread from the Near East and Mediterranean Europe; it is one of those mystery haplogroups that turn up in the Taklamakan desert as well as Native Americans. Together with the clearly invasive nature of N1a, these results are consistent with migrationism.

The authors write:
The widespread distribution of the N1a lineage in Early and Middle Neolithic northwestern Europe may indicate genetic continuity from Mesolithic populations. This scenario would support a Mesolithic contribution to the earliest Neolithic of Atlantic Europe. This would imply that the N1a lineage was already common in indigenous north European populations and that the spread of the Neolithic was principally the result of cultural diffusion. Although so far the N1a lineage has not been encountered among late European hunter-gatherers in central and north Europe (Bramanti et al., 2009; Malmström et al., 2009), it is worth noting that less than half of the hunter-gatherers’ paleogenetic data come indeed from the pre-Neolithic period (predating LBK expansion). Finally, no paleogenetic data currently exist for the Mesolithic period in Western Europe. This prevents any conclusion being drawn about N1a occurrence during the Mesolithic period in those regions.
Of course we won't know if N1a occurred in France prior to the Neolithic until we test pre-Neolithic French samples. However, if N1a was present in France prior to the Neolithic, then why wasn't it present in central-northern Europe where substantial sample sizes exist? This would require a partition of pre-Neolithic populations of Europe, and also existence of N1a in both the Linearbandkeramik (that spread on a south-north vector) and in Mesolithic French. So, while we wait for pre-Neolithic Western Europeans to come up N1a, I'm willing to wager that they will not, and that N1a spread into France with the Neolithic or the later spread of Megalithic cultures.

Related:

American Journal of Physical Anthropology DOI: 10.1002/ajpa.21376

News from the west: Ancient DNA from a French megalithic burial chamber

Marie-France Deguilloux et al.

Recent paleogenetic studies have confirmed that the spread of the Neolithic across Europe was neither genetically nor geographically uniform. To extend existing knowledge of the mitochondrial European Neolithic gene pool, we examined six samples of human skeletal material from a French megalithic long mound (c.4200 cal BC). We retrieved HVR-I sequences from three individuals and demonstrated that in the Neolithic period the mtDNA haplogroup N1a, previously only known in central Europe, was as widely distributed as western France. Alternative scenarios are discussed in seeking to explain this result, including Mesolithic ancestry, Neolithic demic diffusion, and long-distance matrimonial exchanges. In light of the limited Neolithic ancient DNA (aDNA) data currently available, we observe that all three scenarios appear equally consistent with paleogenetic and archaeological data. In consequence, we advocate caution in interpreting aDNA in the context of the Neolithic transition in Europe. Nevertheless, our results strengthen conclusions demonstrating genetic discontinuity between modern and ancient Europeans whether through migration, demographic or selection processes, or social practices.

Link

January 22, 2010

Caucasoid mtDNA U3 and X2 in Taklamakan Desert

Related:
American Journal of Physical Anthropology doi:10.1002/ajpa.21257

Early Eurasian migration traces in the Tarim Basin revealed by mtDNA polymorphisms

Yinqiu Cui et al.

Abstract

The mitochondrial DNA (mtDNA) polymorphisms of 58 samples from the Daheyan village located in the central Taklamakan Desert of the Tarim Basin were determined in this study. Among the 58 samples, 29 haplotypes belonging to 18 different haplogroups were analyzed. Almost all the mtDNAs belong to a subset of either the defined Western or Eastern Eurasian pool. Extensive Eastern Eurasian lineages exist in the Daheyan population in which Northern-prevalent haplogroups present higher frequencies. In the limited existing Western Eurasian lineages, two sub-haplogroups, U3 and X2, that are rare in Central Asia were found in this study, which may be indicative of the remnants of an early immigrant population from the Near East and Caucasus regions preserved only in the Tarim Basin. The presence of U3 in modern and archeological samples in the Tarim Basin suggests that the immigration took place earlier than 2,000 years ago and points to human continuity in this area, with at least one Western lineage originating from the Near East and Caucasus regions.

Link

September 30, 2009

Some mtDNA links between Europe and Asia

I was planning on writing up a more complete narrative for this post, but I don't think the evidence is -as of yet- strong enough to support very strong speculation. I will simply say that the recent results of Bramanti et al. for a U-dominated older mtDNA stratum in Central/North-eastern Europe can be reasonably extended to cover both North-western Europe and northern Eurasia up to Lake Baikal, the prehistoric limit between Caucasoids and Mongoloids.

This boreal zone of U dominance contrasts with that of the Neolithic and Bronze Age inhabitants, where the familiar mix of ten or so main Caucasoid haplogroups makes its appearance, in various proportions and in various degrees of admixture at the eastern end of its expansion. The eastern Caucasoids were probably derived both (i) from West Asia, via the spread of the Neolithic economy to the east wherever it could be ecologically supported, and (ii) in the more northern parts, from migrations across the steppe from Central and Eastern Europe.

More ancient DNA research is needed to establish (i) how complete was the U dominance in the pre-Neolithic northern zone, and (ii) when, and where did the other Caucasoid haplogroups break into it.

Anyway, here is the post as it stands:

Ricaut et al. (2004) discovered the presence of mtDNA haplogroup N1a (16147A, 16172C, 16223T, 16248T, and 16355T) in an Iron Age Scytho-Siberian skeleton from the Altai, reporting the presence of haplogroup N1a among Iranians and upper caste Havik Brahmins from India.

The same sequence was detected in a Neolithic Central European (DER1) of the Linearbandkeramik (LBK) culture, with reported modern matches in Egypt and Armenia. The following haplogroups were detected in the Neolithic LBK gene pool: H*, N1a, K, HV, T2, V, J, W, U3.

A later study by Gokcumen et al. (2008) discovered the presence of N1a in modern Kazakhs from the Altai:
The haplotypic variation within the seven N1a samples was relatively high (Table 2), with these haplotypes belonging to both the European and Central Asian branches of this haplogroup, as recently defined by Haak et al. (2005). Thus, the source of N1a haplotypes in Altaian Kazakhs was unclear, although they seemed to have originated west of this part of Central Asia (Gokcumen et al., 2007).
Haplogroup N1a was found to be a genuine signature of the Central European Neolithic by contrasting its high representation in the LBK with the overwhelming presence of haplogroup U (and especially U5 and U4) mtDNA among the Paleolithic and Mesolithic populations of the region.

A separate Neolithic Funnel Beaker (TRB) sample from Scandinavia (Malmström et al. 2009) included only three individuals belonging to haplogroups H, J, and T. Obviously, a sample of 3 is insufficient, but the absence of haplogroup U in it parallels that of the LBK. By contrast, the contemporaneous Mesolithic Pitted Ware culture, represented by 19 samples, had single instances of J and T (which may be due to admixture with the TRB), a single instance of haplogroup V, one of the few haplogroups thought to be European in origin, and a gene pool that was apparently dominated by haplogroups U4 and U5. The picture emerging from the northernmost European hunter-gatherers is one of a restricted set of haplogroups where U subclades were dominant (about 3/4).

N1a was also detected in medieval high-status Hungarians:
Commoners show a predominance of mtDNA haplotypes and haplogroups (H, R, T), common in west Eurasia, while high-status individuals, presumably conquering Hungarians, show a more heterogeneous haplogroup distribution, with haplogroups (N1a, X) which are present at very low frequencies in modern worldwide populations and are absent in recent Hungarian and Sekler populations.
While, as we saw, N1a was frequent among Neolithic Central Europeans, its absence in Hungarian commoners suggests that it was re-introduced -in the high status individuals- from Asia.

Interestingly, there has been European and Asian mtDNA evidence that allows us to have a good idea of the mtDNA landscape on which N1a-bearing people migrated from west to east:

The pre-farming foragers of Europe were dominated by mtDNA haplogroup U. The easternmost sample in the aforementioned study was from Samara, in European Russia and consisted of a U5a, and a U5a1 sample. How far to the west and east did the U-dominated population of pre-Neolithic northern Caucasoids extend?

Neolithic Siberians from Lake Baikal, the easternmost anthropologically attested limit of prehistoric Caucasoid populations, had only U5a as a Western Caucasoid element in a population dominated by Eastern Eurasian mtDNA. Similarly, the Lokomotiv Siberian burials from Lake Baikal only had U5a in an otherwise Mongoloid mtDNA gene pool. Yu Hong, a Sogdian in China (1,400 years ago), also belonged to haplogroup U5.

U5a was not limited to the territory from Central Europe to China in ancient times. It was the haplogroup of Cheddar Man, a Paleolithic Briton, and U5a1 or U5a1a has also been detected in a Mycenaean from Bronze Age Greece. Interestingly, U5a1 seems to have decreased in frequency in Britain from the 4th c. to the present.

Is it possible that negative selection is affecting mtDNA frequencies in Europe? Haplogroup U turns up in many ancient DNA samples, but the discovery that it was absent (or undetectable) in Neolithic farmers raises the possibility that its reduced frequency may be due to demography, i.e., the overwhelming of Paleolithic foragers by Neolithic (and later) intruders.

We know that in the Bronze and subsequent ages, Siberians from Krasnoyarsk belonged to a rich assortment of Caucasoid haplogroups. It seems that newcomers from the West joined the U-dominated earliest settlers:
Twenty samples were found to belong to west Eurasian haplogroups (U2, U4, U5a1, T1, T3, T4, H5a, H6, HV, K, and I), whereas the 6 remaining samples were attributed to east Eurasian haplogroups (Z, G2a, C, F1b and N9a).
At the other end of the Eurasiatic steppe, in the Bronze Age site of Eulau in Germany, the gene pool was also quite different from that of the Paleolithic inhabitants, with haplogroups K1b, U5b, I, H, X2, K1a2 detected.

Haplogroup X2 represents another link between the west and Siberia according to Reidla et al. (2003):
Overall, it appears that the populations of the Near East, the Caucasus, and Mediterranean Europe harbor subhaplogroup X2 at higher frequencies than those of northern and northeastern Europe (P < .05) and that X2 is rare in Eastern European as well as Central Asian, Siberian, and Indian populations and is virtually absent in the Finno-Ugric and Turkic-speaking people of the Volga-Ural region. [...] the few Altaian (Derenko et al. 2001) and Siberian haplogroup X lineages are not related to the Native American cluster, and they are more likely explained by recent gene flow from Europe or from West Asia.
The Tubalar, Altaic speakers from the northeastern Altai showed a mixed Caucasoid-Mongoloid mtDNA gene pool, with the western component consisting of haplogroups H8, U4b, U5a1, and X2e:
Specifically, northeastern Altai appears to be a good candidate for the ancestral homeland of the haplogroup U4b, which is apparently ancient European. For some haplogroups, such as X2e, the relatively recent arrival to the Altai region is more likely.
Derenko et al. (2002) discovered a rich assortment of Caucasoid haplogroups in several populations from the Altai, including all aforementioned ones (H, HV1, J*, J1, J1b1, T1, T4, U1a, U2, U3, U4, U5a1, I, X and N1a):
The applied approach permitted identification of 60% of mtDNA types the majority of which had southern Caucasoid origin. Less than 10% of mtDNA types were of eastern European origin.
Derenko et al. (2003) also studied several populations from South Siberia where the Caucasoid component was much diminished (17%) with the following haplogroups present: H, U, J, T, I, N1a, X.

September 15, 2009

Variable genetic ancestry in Brazilians

Braz J Med Biol Res. 2009 Sep 11. pii: S0100-879X2009005000026.

DNA tests probe the genomic ancestry of Brazilians.

Pena SD, Bastos-Rodrigues L, Pimenta JR, Bydlowski SP.

We review studies from our laboratories using different molecular tools to characterize the ancestry of Brazilians in reference to their Amerindian, European and African roots. Initially we used uniparental DNA markers to investigate the contribution of distinct Y chromosome and mitochondrial DNA lineages to present-day populations. High levels of genetic admixture and strong directional mating between European males and Amerindian and African females were unraveled. We next analyzed different types of biparental autosomal polymorphisms. Especially useful was a set of 40 insertion-deletion polymorphisms (indels) that when studied worldwide proved exquisitely sensitive in discriminating between Amerindians, Europeans and Sub-Saharan Africans. When applied to the study of Brazilians these markers confirmed extensive genomic admixture, but also demonstrated a strong imprint of the massive European immigration wave in the 19th and 20th centuries. The high individual ancestral variability observed suggests that each Brazilian has a singular proportion of Amerindian, European and African ancestries in his mosaic genome. In Brazil, one cannot predict the color of persons from their genomic ancestry nor the opposite. Brazilians should be assessed on a personal basis, as 190 million human beings, and not as members of color groups.

Link

September 04, 2009

ASHG 2009 abstracts

It's that time of year again. Here is a list of abstracts from ASHG 2009 that caught my attention in three broad areas. It will be very interesting to see these when they become full papers, but if you are one of the lucky ones who are going to Hawaii this October and want to drop me a line about any of them, feel free to do so!

Population Genetics

Haplogroup H of mitochondrial DNA, a far echo of the West in the heart of Central Asia
Through the millennia, Inner Asia played a pivotal role in shaping the history that greatly added to the cultural, ethnic, and genetic diversity observed throughout present Eurasia. Perhaps the two most significant phenomena witnessed in this part of the world were the ambitious expansion strategy employed by Mongolia’s most prominent personality, Genghis Khan and the complex network known as the Silk Road that for nearly 3,000 years contributed to the exchange of goods and the transmission of philosophy, art, and science that laid the foundation for the great civilizations of China, India, Egypt, Persia, Arabia, and Rome, and in several respects to the modern world. Over the last few years, through an international collaborative effort, researchers at the Sorenson Molecular Genealogy Foundation were able to collect 2,727 DNA samples, informed consents, and genealogical data in Mongolia, Kyrgyzstan, and Kazakhstan. All the samples were sequenced for the three hypervariable segments of the mitochondrial DNA (mtDNA) control region to assess the genetic composition of the modern population of these countries. We identified ~600 different haplotypes that could be ascribed to more than 30 haplogroups and sub-haplogroups. As expected, most haplogroups are typical of modern East Asian populations, but intriguingly, many different Western Eurasian clades were also identified, with a particular high incidence of H (~8.0%), the most common haplogroup in Europe. This feature cannot be attributed to genetic drift since different H sub-lineages have also been identified, each of them represented by several different haplotypes. The mtDNA distribution profile in the heart of Central Asia suggests a direct link between this area and Western Eurasia that could be explained by ancient migrations or by more recent historical events, such as Genghis Khan’s conquering efforts and trade or cultural exchanges along the Silk Route. 
To discriminate between these two possible scenarios, we are now analyzing a subset of these samples at the highest possible level of resolution - that of complete mtDNA sequences - focusing particularly on those H mtDNAs that seem to be the most informative considering their control-region haplotypes. Our preliminary data seems to be in favor of rather ancient genetic inputs from the West in shaping the peculiar mtDNA gene pool of Inner Asia’s present-day populations.
The following study seems to do precisely what I recently asked for:
However, as the PCA analysis shows, Ashkenazi Jews are distinct from both Europeans and non-Jewish Middle Eastern populations and cannot be viewed as a simple mix of the two; their distinctiveness must be -in part- due to the specific features of the small founder population of that community after it became effectively reproductively semi-isolated from gentiles after Roman times. It would be interesting to see different Jewish communities studied in the context of a broad variety of European and Middle Eastern populations, to determine whether Ashkenazi distinctiveness is specifically Ashkenazi or more generally Jewish distinctiveness; I would bet on a combination of the two.

Abraham's children in the genome era: Major Jewish Diaspora populations comprise distinct genetic clusters with shared Middle Eastern ancestry
Despite residence all over the world, Jewish populations have maintained continuous genetic, cultural, and religious tradition over 4,000 years. The unique ethnic makeup and social practices provide an invaluable opportunity to understand their genetic origins and migrations and to elucidate the genetic basis of complex disorders. To generate a comprehensive HapMap of ethnically diverse, healthy Jewish populations, we used the Affymetrix array 6.0 to genotype 381 samples recruited from 7 Jewish communities with different geographic origins: Eastern European Ashkenazim; Italian, Greek and Turkish Sephardim; Iranian, Iraqi, and Syrian Mizrahim (Middle Easterners). Here, we present population structure results from compiled datasets after merging with the Human Genome Diversity Project and the Population Reference Sample studies, which consisted of 146 non-Jewish Middle Easterners (Druze, Bedouin and Palestinian), 30 northern Africans (Mozabite from Algeria), 1547 Europeans, and 653 individuals from other African, Asian, Latin American, and Oceanian populations. Both principal component analyses and multi-dimensional scaling analysis of pairwise Fst distance show that Jewish populations form a cluster clearly distinct from all major continental populations. The results also reveal a finer population substructure in which each of 7 Jewish populations studied here form distinctive clusters - in each instance within group Fst was smaller than between group, although some groups (Iranian, Iraqi) demonstrated greater within group diversity and even sub-clusters, based on village of origin. By pairwise Fst analysis, the Jewish groups are closest to Southern Europeans (i.e. Tuscan Italians) and to Druze, Bedouins, Palestinians. Interestingly, the distance to the closest Southern European population follows the order from proximal to distal: Ashkenazi, Sephardic, Syrian, Iraqi, and Iranian, which reflects historical admixture with local communities. 
STRUCTURE results show that the Jewish Diaspora groups all demonstrated Middle Eastern ancestry, but varied significantly in the extent of European admixture. There is almost no European ancestry in Iranian and Iraqi Jews, whereas Syrian, Sephardic, and Ashkenazi Jews have European admixture ranging from 30%~60%. Analysis of identity-by-descent provides further insight on recent and distinct history of such populations. These results demonstrate the shared and distinctive genetic heritage of Jewish Diaspora groups.
So, it seems that there will soon be real genomic data on the source and extent of admixture in Jews. The absence of Greek and Anatolian samples may be problematic in finding the sources of such admixture, but the presence of Tuscans, who are reasonably close to them in a pan-European context, should serve well as a substitute. In a recent study (in which Anatolians were not included), the closest populations to Ashkenazi Jews were Italians of mostly southern provenance (Fst=0.0040) and Greeks (Fst=0.0042), and they were also fairly close to Tuscans (Fst=0.0066).


The following study seems to demonstrate my recent suggestion of archaic admixture in Africa itself:
It does not, however, tell us that this is because of archaic introgression in Europeans. The culprit could equally well be long-term population structure in Africa, i.e., the presence of "modern" and "archaic" populations in Africa itself.
Deep population structure in sub-Saharan African populations
We analyzed ~500 Kb of resequencing data from 91 different intergenic regions in samples from three sub-Saharan African populations: Mandenka from Senegal, Biaka pygmies from the Central African Republic and San from Namibia. We employed novel methodology to estimate the split times and migration rates between populations. We found strong evidence for split times that predate the exodus of modern humans out of Africa (e.g., > 100 Kya). In addition, we also found evidence of ancient admixture (with unknown ‘archaic’ human groups) in the recent history of both the Biaka and the San.
Analysis of Genomic Admixture in Costa Rica Population
Costa Rica (CR) population is a unique population representing a typical admixture of major continental ancestral populations. 1,301 samples collected from participants in a population-based study conducted in the Guanacaste region of CR were genotyped on a custom Illumina iSelect chip harboring 27,635 SNPs. The SNPs on the chip were selected based on multi-ethnic tagging strategy for three HapMap populations: CEU, YRI and JPT+CHB and cover 1,000 candidate genes/regions for a range of cancers. This data set was sufficiently large for the investigation of population substructure in our CR study and the examination of linkage disequilibrium (LD) patterns. Three HapMap major continental populations and a Native American population from the Illumina iControl DB were used as the reference populations for these analyses. Our preliminary results indicate that the Guanacaste CR population was formed mainly by a three-way admixture with 42.5%, 38.3% and 15.2% Native Indian, European, and African respectively. In addition, 4.0% residual genetic component derived from Asians was observed in our CR samples. Both model based STRUCTURE program and Principal Component Analysis (PCA) revealed consistent substructure pattern for the CR population. The magnitude of LD in the CR population seems to be smaller than all the reference populations except YRI. A more detailed knowledge of the underlying genetic structure of the CR population would be informative to assess its population genetic history and to assist in the interpretation of investigations of complex diseases in the CR or a comparably admixed population.
Analysis of Genetic Substructure of Han Chinese Using Genome-Wide SNP Arrays: Implication for Association Studies.
China will start this year a $30 million effort of genome-wide association studies (GWAS) of common diseases in Chinese populations which have been largely underrepresented in the similar effort worldwide. A general concern is population stratification (ancestry differences) among subpopulations which can cause false positive associations. Han Chinese is the largest ethnic group in the world, however, its population substructures are often expected and yet well characterized. In this study, we examined population substructures in a diverse set of >1,700 Han Chinese samples collected from 26 regions, each genotyped with at least 160K single nucleotide polymorphisms (SNPs). Our results showed that: (a) Han Chinese population is complicatedly substructured, with the main observed clusters roughly corresponding to northern Han, central Han and southern Han; (b) Han Chinese samples collected from large cities, such as Shanghai, Beijing and Guangzhou, show diverse source of ancestries including three aforementioned clusters; (c) HapMap samples (CHB & CHD) and HGDP samples (Han & Han-NChina) deliver a limited representation of Han Chinese people. Building on the above insights, we investigated false positive rates and statistical power in various study designs using both empirical and simulated data. We further explored sample collection strategies and public data usage for future association studies.
It will be interesting to see if the authors of the following study estimated gene flow in non-southern European populations as controls, to see what is the excess of Sub-Saharan admixture detected in the three southern European samples, and exactly what "methods that can infer admixture proportions in the absence of accurate ancestral populations" they used. Hopefully they will also extend their linkage disequilibrium analysis for the other populations besides Spaniards.

Characterizing the history of sub-Saharan African gene flow into southern Europe
Recent analyses of whole-genome SNP data sets have suggested a history of sub-Saharan African ancestral contribution into southern Europe but not in northern Europe, consistent with previous analyses based on the Y chromosome and mitochondrial DNA. However, there has been no characterization of the proportion of African admixture in southern Europe, or of its date. Here we analyze data from ~450,000 autosomal SNPs in the Population Reference Sample, ~650,000 SNPs from the Human Genome Diversity Panel, and ~1.5 million SNPs from the HapMap Phase 3 Project, and studied patterns of correlation in allele frequencies across populations to confirm the evidence of African ancestry in many southern European populations but not in northern Europeans. Using methods that can infer admixture proportions in the absence of accurate ancestral populations, we estimated that the proportion of sub-Saharan African ancestry in Spain is 2.4 +/- 0.3%, in Tuscany 1.5 +/- 0.3%, and in Greece 1.9 +/- 0.7% (1 standard error). We also studied the decay of admixture linkage disequilibrium with genetic distance, which provided a preliminary estimate of the date of African gene flow into Spain of roughly 60 generations ago, or about 1,700 years ago assuming 28 years per generation. This date is consistent with the historically known movement of individuals of North African ancestry into Spain, although it is possible that this estimate also reflects a wider range of mixture times.
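As a quick check of the date arithmetic in the abstract above, here is a minimal sketch in Python. The function name is hypothetical, and the 25- and 30-year generation intervals in the loop are my own illustrative assumptions; only the 60 generations and 28 years per generation come from the abstract.

```python
def generations_to_years(generations, years_per_generation):
    """Convert an admixture date given in generations to years before present."""
    return generations * years_per_generation


# Values quoted in the abstract: roughly 60 generations at 28 years each.
years_ago = generations_to_years(60, 28)
print(years_ago)  # 1680, which the abstract rounds to "about 1,700"

# Illustrative sensitivity check. The 25- and 30-year intervals are
# assumptions for comparison only and do not appear in the abstract.
for interval in (25, 28, 30):
    print(interval, generations_to_years(60, interval))
```

The point of the loop is only that the calendar date moves by a few centuries depending on the assumed generation interval, so the "about 1,700 years" figure should be read with that in mind.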
Genome-wide patterns of population structure and admixture among Hispanic/Latino populations
In order to document genome-wide patterns of variation in Hispanics/Latinos (HL’s) we genotyped individuals from five distinct populations recruited in the US: Mexico, Colombia, Ecuador, Dominican Republic and Puerto Rico. We present population structure results from an extensive genome-wide SNP dataset compiled by merging Affymetrix 500K and Illumina 650K data from these populations together with the Human Genome Diversity Panel, HapMap, Mao et al (2005), and POPRES studies. We apply Principal Component Analysis (PCA) and a clustering method, frappe, to infer admixture and genetic relationships of 262 HL individuals with 467 Africans, 715 Europeans, and 210 Native Americans comprising a total of 88 populations. We observe substructure within Native Americans, and, as expected, find that the admixed HL populations show Native American ancestry derived from local Native American populations. We find striking differences in estimated population-wide mean African, European and Native American ancestry proportions which are consistent with historical admixture and proximity to slave trade routes. The Dominican Republic and Puerto Rico, located on islands along slave trade routes, show high levels of African Ancestry (means 41.7% and 23.6% respectively) with less Native American Ancestry (11.5% and 18.9%). Colombians show a wide range of both African and Native American ancestry, though they have an overall mean of slightly higher Native American ancestry (36.3%) and lower African ancestry (11.7%) than the highly-African Dominicans and Puerto Ricans. Ecuadorians show the highest Native American mean ancestry (54.0%) with low estimated mean African Ancestry (7.3%). Mexico shows the largest range of Native American ancestry (11.0% - 79.0%) with an overall mean of 50.1% Native American ancestry and the lowest African ancestry (5.6%).
Our study shows a broad range in admixture proportions across different HL individuals as well as different admixture patterns across populations. We also compare this genotype data with mtDNA and Y chromosome genotypes and use simulations to estimate ancient male and female sex ratios in each HL population. Lastly, we discuss implications of population structure for genome-wide association studies in admixed populations such as HL’s, especially when recruited in the United States.
A new statistical method to infer population admixture events using genetic variation data
We present a novel statistical method that uses densely-spaced Single-Nucleotide-Polymorphism (SNP) data to identify the major admixture events occurring throughout a population’s history. The model has several advantages over leading available analytical approaches in this area, such as principal-components-analysis and STRUCTURE. In particular it can simultaneously (i) take advantage of the information inherent in patterns of linkage disequilibrium, i.e. non-random associations amongst neighbouring SNPs along a chromosome, (ii) efficiently analyse hundreds of individuals at hundreds of thousands of SNPs genome-wide, and (iii) allow for relatively straight-forward interpretation and direct inference of key historical parameters, such as the proportions and times of major admixture events. Using simulated data matched to currently available human datasets, we show that our model can identify and accurately date admixture events that have occurred between 7 and 150 generations ago. As our technique exploits the rich information in genetic data to infer details of a population’s admixture history, it marks a powerful complement to anthropological research and can help to resolve a number of existing controversies. We present results from applications of our model to two datasets: (1) SNP data from 22 distinct genetic regions for individuals from three chimpanzee populations in Africa; (2) genome-wide 650K SNP data for individuals from 53 world-wide populations of the Human Genome Diversity Panel (Science 319, 1100-1104). We highlight a number of intriguing new insights from these analyses. For example, the chimpanzee analysis showcases the model’s ability to infer the relative divergence among populations. The human analysis identifies several important admixture events, some of which are historically well-established (e.g. identification of recent European genetic influx into the Maya Native American population), others that can be placed into a clear historical context (e.g. an East Asian genetic influx into several Central and South Asian populations dated precisely to the era of the Mongol empire), and some that are to our knowledge novel (e.g. admixture in the Cambodian population between a Central/South Asian source and an East Asian source dated to around the period of the Cambodian Empire).
Bayesian methods of estimating ancestry using whole-genome SNP data
Estimation of the genetic ancestry of an individual is useful for association studies, disease risk prediction, population genetic analyses and is of inherent interest for the individual themselves. We have investigated methods of estimating ancestry using whole-genome SNP data on each individual. We focus on the scenario where the goal is to determine ancestry in relation to a set of genotype or haplotype data that is available from a set of distinct source populations, for example, the HapMap 2, HapMap 3 or 1000 Genomes datasets. Inference in this setting can focus either on the estimation of global ancestry, in which an overall estimate of the proportion of ancestry from the source populations is needed, or local ancestry, which aims to partition an individual genome into distinct segments of ancestry from the source populations. We have compared 2 models based on the estimated allele frequencies in the source populations at a set of unlinked SNPs. Model 1 only models global admixture, whereas Model 2 models both global and local admixture. Using simulated individuals with differing proportions of CEU and YRI admixture (based on HapMap3 data) we find that there is a relatively small difference in the mean square error of the estimates of global admixture from the 2 methods (1.16 × 10⁻⁴ and 8.88 × 10⁻⁵ respectively). Since Model 1 is much faster to fit than Model 2 these results suggest that Model 1 can be used to estimate the level of global ancestry, or at the very least will be useful as an initial estimate for use in Model 2. Further investigation is required to see how these results hold for more genetically similar source populations. In contrast, the mean square error for the estimates of local admixture from the 2 methods is 0.298 and 0.0861 respectively, suggesting that an explicit model of local ancestry is needed to carry out this level of inference.
We are also investigating the utility and practicality of using linked SNP data to estimate global and local admixture.
A detailed phylogeography of mtDNA haplogroup C1d: another piece in the Native American puzzle
Recent studies based on complete mitochondrial DNA (mtDNA) sequences revealed that two almost concomitant paths of migration from Beringia led to the dispersal of the first Americans (Paleo-Indians) approximately 15-17 thousand years ago (kya). This first expansion was followed by later more restricted diffusion events from the same dynamically changing Beringian source. Thus, five pan-American (A2, B2, C1, D1, and D4h3a) and four geographically confined (D2, D3, X2a, and C4c) mtDNA haplogroups represent the current female legacy of the ancient migratory events that gave rise to the native populations of the double continent. Regarding haplogroup C1, all its members appear to belong to one of three branches: C1b (characterized by the control-region transition at np 493), C1c, and C1d (with the control-region transition at np 16051). These three sub-haplogroups are found throughout the Americas, thus supporting the scenario that they most likely differentiated at the early stages of the Paleo-Indian southward migration. If considered as three separate founders, C1b, C1c, and C1d would bring the currently known number of native pan-American lineages to seven. As a whole, the C1 haplogroup has an estimated age of 17.0-19.6 ky, while the three individual branches are dated 16.5-17.0 ky, 17.2-17.6 ky, and 7.6-9.7 ky, respectively. The extremely young age estimate of C1d has been attributed, at least for the moment, to a major underrepresentation of C1d mtDNAs (only nine complete sequences published to date) in the current Native American mtDNA phylogeny. We have addressed this issue in the current study by completely sequencing more than 60 novel mtDNAs belonging to haplogroup C1d, which were carefully selected on the basis of both control-region variation and geographic/ethnic origin.
Phylogeographic analyses have provided not only an accurate evaluation of the expansion time of C1d in the Americas, but also a detailed picture of its current distribution in both general mixed and indigenous populations.

Genetic diversity of European population isolates in the context of their geographic neighbors
Mapping traits in population isolates provides an opportunity to simplify the challenges of complex trait mapping because such populations likely have enhanced levels of linkage disequilibrium and reduced genetic heterogeneity for the underlying traits. Here we analyze high-throughput SNP genotyping data to compare genomic-scale patterns of variation in several European population isolates (Adygei, Basque, Orcadian, Roma from Slovakia, Sardinians, and Sorbs) and contrast their patterns of variation to geographically proximal populations. Our results reveal insights for the demographic history of each of these unique populations, suggest substantial variation among these population isolates in patterns of diversity, and highlight the importance of population selection in genome-wide association mapping.
Incompatibility of current Finnish mitochondrial diversity with simulations of assumed settlement history
Traditionally, geneticists studying Finnish population history have assumed a model where Northern and Eastern Finland were mostly uninhabited until the 16th Century A.D. and were then settled by small family groups from South-Western Finland. The reduced genetic diversity and the distinct Finnish disease heritage are seen as consequences of these founder effects. Y-chromosomal diversity is indeed reduced in the present population, especially in the eastern parts of the country. However, mitochondrial diversity is not heavily reduced compared to South-Western Finnish or other European populations. This discrepancy has been explained with the higher mitochondrial mutation rate having restored mitochondrial diversity in these populations since the founder effects.
In our view it seems unlikely that even with high mitochondrial mutation rates mtDNA diversity could be restored over a mere 17 generations after the alleged tight bottlenecks. Archaeological evidence also suggests a different settlement history, e.g. settlement beginning in South-Eastern instead of South-Western Finland.
In this study we use simuPOP, a state-of-the-art forward simulation tool, to simulate datasets corresponding to Finnish mitochondrial diversity under the traditional model and compare them with actual present-day Finnish data. We show that current mitochondrial variation is unlikely under this model, increasing the credibility of alternative hypotheses.
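The kind of forward simulation described above can be illustrated with a minimal sketch. This is not simuPOP and not the authors' model: it is a plain haploid Wright-Fisher process with infinite-alleles mutation, and the founder composition, population size, and mutation rate are hypothetical values chosen only for illustration. It tracks how haplotype diversity behaves over 17 generations after a small founding event.

```python
import random

def haplotype_diversity(pop):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(pop)
    counts = {}
    for h in pop:
        counts[h] = counts.get(h, 0) + 1
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

def simulate(founders, size, generations, mu, rng):
    """Haploid Wright-Fisher model; each mutation creates a brand-new haplotype."""
    pop = list(founders)
    next_label = max(pop) + 1
    for _ in range(generations):
        new_pop = []
        for _ in range(size):
            h = rng.choice(pop)          # each child picks one mother at random
            if rng.random() < mu:        # infinite-alleles mutation
                h = next_label
                next_label += 1
            new_pop.append(h)
        pop = new_pop
    return pop

rng = random.Random(1)
founders = [0, 0, 0, 1, 1, 2]            # hypothetical founding group, 3 haplotypes
pop = simulate(founders, size=500, generations=17, mu=1e-3, rng=rng)
print(round(haplotype_diversity(pop), 3))
```

Repeating such runs many times and comparing the simulated diversity distribution with the observed value is the general logic of the comparison; a realistic analysis would need sequence-level mutation and a demographic model such as the one the abstract describes.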
On the borderline between the east and the west: the maternal genetic background of Karelians
Introduction: The frontier between Finland and Russia represents one of the most conspicuous socioeconomic gaps in the world. Based on the mean gross national product, there is a ten-fold difference between the Russian Karelian Republic and Finnish Karelia. Otherwise these populations share the same geophysical environment. For these reasons, Karelia has been a very interesting field of research for multifactorial disease studies. However, this area has undergone many demographic incidents, such as wars and famine, which may cause local differences in the gene pool. In this study, we wanted to elucidate the maternal genetic background of Karelians. Materials: Blood samples were collected from healthy unrelated individuals without known foreign background from four Karelian districts: Aunus (n=218), Viena (n=87), Tver (n=61) and Finnish Karelia (n=70). The sample collection was performed according to the Basic Principles of the Declaration of Helsinki. Methods: The entire mitochondrial DNA was sequenced in 32 reactions per sample with the BigDye® Terminator v3.1 Cycle Sequencing Kit in the Applied Biosystems 3730 Genetic Analyzer sequencing machine. Sequence alignments were made with the SeqScape® Software, Version 2.5 (Applied Biosystems). Results: Haplogroup H was very common in all populations. However, H1a is almost absent in Finnish Karelia. U and its subhaplogroups were also common. In particular, U5b1b1 reached over 16% in Viena Karelians. U4 was most common among Tver Karelians. Conclusions: The maternal genetic background seems to be complex in this area. There are clear regional differences. There is also solid evidence of gene flow from various sources. Representation of the clearly Asian haplogroups is strikingly low.
Genetic Landscape of Eurasia Viewed from Large Allele Frequency Differences.
The diversification leading to modern human populations in Eurasia is one of the most important topics in the study of human expansions after leaving Africa. Most studies of Eurasian populations have used either limited markers or involved insufficient population coverage. We chose 68 markers based on large allele frequency differences among a few Eurasian populations and then typed them on 1766 individuals from 34 populations representing all subdivisions of Eurasia. Analyses using the STRUCTURE program showed a clinal east-west division when K=2, with a median border dividing Central Asia along the Ob River, the Kazakh highland, the western side of Pamir Mountains, and the southwestern side of the Himalayas. We fit curves to the STRUCTURE loadings using distances of the population coordinates from the median border. The genetic structure changed dramatically only within 2000 km on each side of the border. At higher values of K the western populations of East Asia are the first to be distinguished (at K=3): Mongols, Tibetans, Qiang, and Baima are most distinct from the more eastern populations. At K=4 Southwest and South Asians are distinguished from the Europeans; at K=5 Southeast Asians and at K=6 Central Asians are successively distinguished from eastern East Asians. Several more isolated populations such as Samaritans, Atayals, or Micronesians were distinguished in different independent runs when K=7, providing no clear anthropological information. South Asians were always clustered with Southwest Asians with pronounced similarity to Central Asians. The failure to distinguish South Asians may be due to the selection of the markers with large allele frequency differences specifically between Europeans and East Asians. We also tested for statistical differences in the allele frequencies for all pairs of clusters when K=6. 
The results showed significant borders (P less than 0.0001) including those between western East Asians and eastern East Asians or Central Asians; however, the borders between Southwest Asians and Southeast Asians or western East Asians were not significant, nor was the border between Central Asians and eastern East Asians. This indicates substantial gene flow in North Asia between eastern East Asians and Central Asians, and in South Asia between South Asians and Southeast Asians. Using increased population and marker coverage, this study helps to understand the details of genetic diversity and landscape of Eurasians.

Anthropometry

Dairy intake associates with the IGF2 rs680 polymorphism to height variation in Greek children. The GENDAI study
Objective: Height is a classic polygenic trait with a number of genes underlying its variation. We evaluated the prospect of gene to diet interactions in a children cohort for the IGF2 rs680 polymorphism and height variation. Methods: We screened 795 peri-adolescent children (424 females) aged 10-11 years old from the (Gene and Diet Attica Investigation; GENDAI) paediatric cohort for the IGF2 rs680 polymorphism. Results: Children homozygous for common allele (GG) were taller (148.9 ± 7.9 cm) comparing to those with the A allele (148.1 ± 7.9 cm), after adjusting for age, sex, and dairy intake (β±SE: 2.1 ± 0.95, p=0.026). A trend for interaction for the IGF2 rs680 x dairy intake is also revealed (p=0.09). Stratification by IGF2 rs680 genotype revealed a positive association between dairy products intake and height only in A allele carriers, adjusted for the same confounders (standardized β=0.111, p=0.014). When dairy intake was classified, based on the median value, into two equal groups of low (1.9 ± 0.7 servings/day) and high dairy products intake (4.4 ± 1.5 servings/day), it was found that in A allele children high dairy eaters were significantly taller (p=0.05) compared with low dairy eaters (148.8 ± 7.9 cm vs 147.4 ± 7.7 cm respectively, adjusted for age and sex). Conclusion: A higher consumption of dairy products associated with increased height depending on the rs680 IGF2 genotype. Thus, exploring height variants and elucidating possible interactions with environmental factors like diet could help us to design
A Non-synonymous HNF4A Variant is Associated with Glycemia During Pregnancy and Offspring Head Circumference in Populations of European Ancestry in the HAPO Study
The Hyperglycemia and Adverse Pregnancy Outcome (HAPO) study is a multicenter, international study, which examined the association of maternal glucose levels with fetal growth and outcome in 25,000 pregnant women from multiple ethnic groups to demonstrate a continuous relationship between maternal glucose measures and birth size throughout the range of glucose concentrations. We hypothesize genetic factors contribute to these phenotypes, and examined 1536 fetal and maternal SNPs in 79 candidate loci previously implicated in insulin secretion or sensitivity to determine associations with maternal glycemia and insulin secretion (fasting glucose and C-peptide and 1-hr glucose from the OGTT) at ~28 weeks gestation and/or offspring size at birth (birth weight, length, head circumference, and sum of skinfolds) for HAPO mothers of European (Belfast and Manchester, UK, and Brisbane and Newcastle, Australia; N=3828) and Asian (Bangkok, Thailand; N=1813) ancestry and their offspring. Associations were assessed through linear regressions with the single trait/outcome under an additive genetic model adjusting for known confounders. Among our strongest signals was rs1800961G>A, which encodes a Thr>Ile amino acid change in exon 4 of HNF4A, recently identified in a GWAS meta-analysis as a variant associated with decreased HDL levels. In the HAPO study, this SNP was strongly associated with increased fetal head circumference (0.5cm [95%CI: 0.3-0.7] per maternal minor allele; P=1.2x10^-7) in those of European descent. The maternal minor allele was also weakly associated with 1-hour glucose (4.3mg/dL [95%CI: 0.5-7.9]; P=0.03), birth length (0.7cm [95%CI: 0.2-1.1]; P=0.003), birth weight (52.6g [95%CI: -8.0-113.3]; P=0.09), and sum of skinfolds (0.3cm [95%CI: -0.1-0.6]; P=0.13). This same minor allele in the fetal genome was weakly associated with cord C-peptide (0.1ug/dL [95%CI: 0.01-0.22]; P=0.03), and head circumference (0.2cm [95%CI: -0.1-0.4]; P=0.08). 
The same trends were observed among the Thai, although not significantly, probably due to a reduction in power from the low risk allele frequency (<2%).
Selection

In a recent study, Heyer used germline mutation rates to estimate time depth, so I am more inclined to take her dates at face value than those in papers which used "evolutionary" rates. It will be interesting to see which Y-chromosome types the authors associate with both the older and the recent expansions.

Super Y-chromosomes in Eurasia and the impact of social selection and Neolithic transition
Some Y-chromosomal haplotypes have been found at unusually high frequencies in Asian and European human populations. The massive spread of these lineages has been explained by the impact of social selection, i.e. the high reproductive success of some males and their relatives/descendants due to their high social status. The most well-known examples are the “Khan haplotype” and the “Manchou haplotype” in Asia, and the Uí Néill haplotype in Ireland. But are these frequent haplotypes always associated with recent events of social selection, or could they be linked to much older processes? To address this question, we have surveyed ~3500 males in 97 populations from Turkey to Japan. We have focused on the 12 most frequently represented haplotypes in Eurasia and tested whether their expansions are linked to a specific factor such as language or subsistence methods. Our results show that both recent and ancient processes are responsible for the expansions of these lineages. The recent expansions (2000-3000 years) likely to be linked to social selection are prevalent in Altaic-speaking and pastoral populations. This might indicate a recent cultural change in the social organization of these populations. The ancient expansions (8000-10000 years) are over-represented in Indo-European speaking and sedentary farmer populations, and are likely to be the result of the Neolithic transition.

Lactase Persistence; Multiple causal mutations in sub-Saharan pastoralists
Background Milk is the primary source of nutrition for newborn mammals, including humans. The majority of human adults, estimated at approximately 65%, are unable to digest lactose (the main carbohydrate in milk) effectively since lactase expression is down-regulated after weaning, as it is in other mammals. In some humans however, lactase expression persists into adulthood (lactase persistence, LP) allowing adult consumption of milk from other species, and the frequencies of this trait vary throughout the world. A C-T SNP -13910 bases upstream from the lactase gene (LCT) is associated with LP in Europe. The -13910*T is rare in milk drinking groups in Africa although two other variants (-13915*G, -14010*C) have been shown previously to be significantly associated with LP and in an accompanying abstract (Ingram et al) we confirm a third locus (-13907*G) and present a fourth candidate SNP. However some LP individuals have also been identified who carry none of these alleles. Aims To examine the distribution across Africa of these and other allelic variants; to examine other regulatory regions in population groups in which enhancer alleles are lacking. Results The geographic and ethnic distribution of -13907*G, -13910*T, -13915*G, -14009*G, and -14010*C in 10 different countries and 15 distinct ethnic groups across Africa (n=1221 individuals) is presented here. Several other variants in this enhancer region are also described here for the first time. These tightly clustered enhancer variants are more frequent in pastoralist milk drinking groups than agriculturalist populations and are associated with several different LCT core haplotypes. Two further candidate regulatory regions have been sequenced in the same populations including a 1000bp region immediately upstream from LCT where novel variants have been found. 
Conclusions The data support the notion that many different mutations do have a functional role in LP, and that the trait has arisen independently several times, being subject to the positive selection conferred by the increased ability to digest milk lactose by people in pastoralist societies.
Extreme Evolutionary Disparities Seen in Positive Selection Across Seven Complex Diseases
Genome-wide association studies (GWASs) have successfully illuminated disease-associated variation. But whether human evolution is heading towards or away from disease susceptibility remains an open question. We analyzed the seven diseases studied by the Wellcome Trust Case Control Consortium (WTCCC), to calculate the relative selective pressure at every significant locus. Results reveal striking differences between the seven studied diseases. We find evidence of recent positive selection in favor of alleles increasing the risk of Type 1 Diabetes (T1D), Crohn’s Disease (CD), Hypertension (HT), Rheumatoid Arthritis (RA), and Bipolar Disorder (BD). Risk-associated alleles (defined as the allele most strongly associated with disease among associated SNPs) for Type 2 Diabetes (T2D) fall largely within the random neutral region, and Coronary Artery Disease (CAD) shows less positive selection than expected by random. When only protective alleles are considered (defined as the allele least strongly associated with disease among associated SNPs), we find that only SNPs associated with T1D, CD, and RA appear to exhibit significant signatures of positive selection. There is significant asymmetry in the 96 SNPs strongly associated with T1D (p-value ≤0.005) showing strong signs of positive selection, with 79 SNPs selecting for the risky allele, and only 17 SNPs selecting for the protective allele. Furthermore, selection patterns of Coronary Artery Disease (CAD) fall far below the expected levels of random, implying stable allele frequencies. Results reveal the evolutionary trajectories of T1D and CD favor risk alleles, possibly due to their simultaneous role in protection from infectious diseases. These results inform on current understanding of disease etiology, thus aiding efforts to discover novel approaches to disease treatment and prevention.
Detecting Natural Selection in the Human Genome from Pilot1 Data in the 1000 Genomes Project
Identifying signatures of natural selection in the human genome has fundamental implications for the study of population evolution and for biomedical research. The distribution of selection in the genome will provide important functional information. Natural selection modifies the level of variability within and between populations and shapes the pattern of genetic variation in the genome. Genetic variation in the genome is the raw data for detection of natural selection. The 1000 Genomes Project produces whole genome sequencing data and offers a unique and great opportunity to scan the genome for signatures of natural selection. Five statistics: Tajima’s D, Fu and Li’s F, Achaz’s Y, Fay and Wu’s H and Zeng et al.’s E (based on comparing the site frequency spectrum within population) and the Fst statistic (based on the measure of population subdivision) were applied to Pilot 1 data in the 1000 Genomes Project to scan the entire genome for detection of selection, where 344 chromosomes from ASI, CEU and YRI were sequenced. A total of more than 20 million variant sites, 4.8 million common in three populations, were identified. We calculated seven statistics in 10 kb and 100 kb windows across the genome for each population and obtained their empirical distributions. Results show that the two kinds of window analyses lead to similar distributions. The proportional rank of the test statistic in a particular window compared with the overall empirical genomic distribution was taken as the empirical P-value for that window. We identified 3,046 candidate selection regions in the ASI population, 2,015 selection regions in CEU, and 2,204 selection regions in YRI at the 5% empirical significance level in 10 kb windows by five statistics based on differences in frequency spectrum. 
Among 457 candidate genes of selection reported from PubMed, we detected 102 selection genes in ASI, 53 selection genes in CEU, and 101 selection genes in YRI and 11 selection genes common in three populations by the familiar Tajima's D test. By comparison we obtained 3.9 million SNPs and the whole genome’s fixation index of about 0.10~0.11. By comparison with the empirical genome-wide distribution of FST, we identified 5,278 candidate selection regions at an empirical significance level of 2.5% from each of the 22 autosomal chromosomes. Among 581 selection regions identified by FST which were reported in the literature, we found that 294 selection regions overlap our results.
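The rank-based empirical P-value mentioned in this abstract can be sketched as follows. This is an illustration of the general idea only, not the authors' pipeline: the function name, the tail convention, and the example values are assumptions. Each window's P-value is the fraction of all windows whose statistic is at least as extreme as its own.

```python
from bisect import bisect_left, bisect_right

def empirical_p_values(stats, lower_tail=True):
    """Proportional rank of each window's statistic within the genome-wide distribution."""
    ordered = sorted(stats)
    n = len(ordered)
    pvals = []
    for s in stats:
        if lower_tail:
            rank = bisect_right(ordered, s)      # windows with value <= s
        else:
            rank = n - bisect_left(ordered, s)   # windows with value >= s
        pvals.append(rank / n)
    return pvals

# Hypothetical per-window Tajima's D values; strongly negative D is the tail of interest.
window_d = [-2.1, 0.3, 0.5, 1.0]
print(empirical_p_values(window_d))
```

Windows whose empirical P-value falls below a chosen cutoff (5% in the abstract) would then be reported as candidate regions. Because the threshold is a quantile of the observed distribution, a fixed fraction of windows is flagged regardless of how much selection actually occurred.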
Genomic Landscape of Positive Natural Selection in North European Populations
Analysing genetic variation of human populations to detect loci that have been affected by positive natural selection is important for understanding adaptive history and phenotypic variation in humans. In this study, we analysed recent positive selection in Northern Europe from genome-wide datasets of 250 000 and 500 000 single nucleotide polymorphisms in a total of over 1000 individuals from Great Britain, Northern Germany, Eastern and Western Finland, and Sweden. Coalescent simulations were used to demonstrate that the integrated haplotype score (iHS) and long-range haplotype (LRH) statistics have sufficient power in genome-wide datasets of different sample sizes and SNP densities. Furthermore, the behavior of the FST statistic in closely related populations was characterized by allele frequency simulations. In the analysis of the North European dataset, dozens of regions in the genome showed strong signs of recent positive selection. Most of these regions have not been discovered in previous scans, and many contain genes with interesting functions (e.g. RAB38, INFG, NOS1AP, and APOE). In the putatively selected regions, we observed a statistically significant overrepresentation of genetic association to complex disease, which emphasizes the importance of the analysis of positive selection in understanding the evolution of human disease. Altogether, this study demonstrates the potential of genome-wide datasets to discover loci that lie behind evolutionary adaptation in different human populations.
Evidence of Indigenous American specific selection in skin pigmentation genes
Recent studies of selection in human pigmentation genes have focused on Old World populations, neglecting the evolutionary changes that have occurred in Indigenous American populations since their migration into the Americas. Previous research shows correlations between Indigenous American ancestry and skin pigmentation variation, suggesting a genetic role in the determination of skin pigmentation among these populations. However, few genes contributing to these differences have been described. To identify genes that may have undergone Indigenous American specific changes, this work examines signatures of selection in 82 pigmentation candidate genes by genotyping 88 indigenous individuals from Central and South America using the Affymetrix Genomewide Human SNP Array 6.0. The resulting 906,600 single nucleotide polymorphisms (SNPs) were surveyed for signatures of selection in the Indigenous American populations compared to the HapMap Phase I populations. Evidence of selection was identified using four measures selected for the complementarity of their approaches, including the reduction in heterozygosity (lnRH), Locus-Specific Branch Length (LSBL), Tajima’s D, and by examination of the haplotype block structure. When computing lnRH and LSBL as well as when examining changes in haplotype frequency, the East Asian and European HapMap populations were included because they are the most closely related populations available. These analyses differentiate the selective changes that appear to be shared among East Asian and Indigenous American populations from those that are unique to the Indigenous American populations. For each test, the top 5% of the empirical distribution of results was examined and pigmentation genes falling in this tail of the distribution were considered to show statistically significant evidence of selection. 
Based on these analyses, 12 genes - ADAM17, POMC, AP3B1, OPRM1, SILV, OCA2/HERC, PLDN, MYO5A, RAB27A, CYP1A2, ATRN, and ASIP - show evidence of selection unique to the Indigenous American populations. Many of these genes have known functional roles in melanogenesis and suggest potential pathways responsible for the observed differences in skin pigmentation between Indigenous American and Old World populations.
Patterns of correlation between genetic ancestry and facial features suggest selection on females is driving differentiation.
Human facial features show extensive variation within and among populations. By investigating the relationship between dimorphism in facial features and genetic ancestry in different populations, we can explore the roles of sexual and natural selection on the human face. We measured sexual dimorphism in facial traits while controlling for the effects of overall size differences and then tested for interactions between sex and genetic ancestry. The study sample consists of 254 subjects (n=170 females, n=84 males), ages 18-35, showing West African and European genetic ancestry sampled in the United States and Brazil. Maximum likelihood genetic ancestry estimates were determined from 176 ancestry informative markers (AIMs), which allowed for the proportional estimation of genetic ancestry from four parental populations (West African, European, East Asian, and Native American). Three-dimensional photographs of faces were acquired using the 3dMDface imaging system (Atlanta, GA). 22 standard anthropometric landmarks were placed on each image and XYZ coordinates were collected. All 231 possible pairwise inter-landmark distances were calculated and then log transformed. Using the pairwise distances, we tested whether some distances were larger in one sex than the other, having taken size into account, in a) African Americans sampled in the United States, b) Brazilians sampled in Brazil, and c) the combined African American and Brazilian sample. We found that several pairwise distances differed between the sexes. For example, the distance from the brow to nasal bridge was found to be more than 5% larger in females than males. We then tested for an interaction between sex and genetic ancestry by testing for differences in the slopes of the ancestry association between males and females. Although the pattern differed slightly between samples, after Bonferroni correction many correlations were found to be the same in both sexes. 
However, females in all three samples had many additional significant correlations that were not seen in males, while males had very few correlations that were not found in females. The results of these analyses suggest that selection on females is driving the differentiation in facial features among populations.
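The figure of 231 distances follows from choosing 2 landmarks out of 22 (22 × 21 / 2 = 231). A minimal sketch of that step, assuming each face is given as a list of (x, y, z) tuples; the function name and the placeholder coordinates are hypothetical, and the study's size correction is not reproduced here.

```python
import math
from itertools import combinations

def log_pairwise_distances(landmarks):
    """Log-transformed Euclidean distance for every unordered pair of landmarks."""
    return {
        (i, j): math.log(math.dist(landmarks[i], landmarks[j]))
        for i, j in combinations(range(len(landmarks)), 2)
    }

# Placeholder face: 22 landmarks spaced one unit apart along the x axis.
face = [(float(i), 0.0, 0.0) for i in range(22)]
distances = log_pairwise_distances(face)
print(len(distances))
```

Log transformation makes differences between distances proportional rather than absolute, which suits statements such as "more than 5% larger in females than males".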
Effect of natural selection on North Asian mitochondrial haplogroup variation
The human mtDNA exhibits striking, region-specific sequence variation. The regional distribution of mtDNA haplogroups has been attributed either to genetic drift assisted by purifying selection (Elson et al., 2004; Kivisild et al., 2006; Ingman, Gyllensten, 2007) or to an adaptation to different climates (Mishmar et al., 2003; Ruiz-Pesini et al., 2004). In an attempt to study the mode of selection in mtDNA variation in human populations we sequenced and analyzed 211 complete mtDNA sequences belonging to haplogroups A, C and D accounting in total for 49.3% of mtDNA lineages in North Asia. The North Asian haplogroups A, C and D showed a highly significant deviation from the standard neutral model as well as a bell-shaped distribution of pairwise differences consistent with rapid population expansion. To determine the overall importance of selection in shaping human mtDNA variation we calculated the Ka/Ks ratio both for aggregated mtDNAs and for 13 protein-encoding genes within particular haplogroups (A, C and D). We have found a prevalence of Ks over Ka within haplogroups A, C and D indicating the influence of negative selection on mtDNA during evolution. Consistent with some previous reports we have found the Ka/Ks ratio for the ATP6 gene to be the highest among the North Asian sequences, suggesting thereby that this gene has been subject to positive selection. We have also observed a set of genes with a somewhat higher Ka/Ks ratio relative to other mitochondrial genes - CO2 for haplogroup A, ND3 and ND4 for haplogroup C. Meanwhile the other approach, taking into account the difference in NS/S ratios between the haplogroup-associated and private substitutions (Elson et al., 2004), shows significant departures from neutrality only for haplogroup D and its subhaplogroup D4. Furthermore single gene analysis reveals the relatively strong influence of negative selection only in the CYTb gene within haplogroup D (p=0.011, NI=14.1). 
In general, our results indicate that there is evidence for both gene-specific and lineage-specific variation in selection acting on North Asian mtDNAs.
Selection for blue eyes in Europe and light skin pigmentation in East Asia at OCA2/HERC2
OCA2 and HERC2 are two genes on chromosome 15 separated by less than 10 kb. Mutations in this region have been shown to have an effect on pigmentation including causing oculocutaneous albinism type 2. In Europeans, a three SNP haplotype (rs4778138, rs4778241, rs7495174) and three individual SNPs (rs12913832, rs916977, rs1667394) have been associated with blue eyes. We have labeled the three SNP haplotype BEH1. We found that the first individual SNP, rs12913832, was in near complete LD with another SNP (rs1129038). We treat these two SNPs together as a haplotype, BEH2. We also found that the other two individual SNPs were actually in near complete LD with each other and decided to label them BEH3. In East Asians, a SNP (rs1800414) has been identified that is associated with a light skin pigmentation phenotype. We typed these eight SNPs in 64-70 population samples. We then examined worldwide distribution of the four pigmentation alleles. We saw that the light skin allele was at its highest frequency in eastern East Asia, at midrange frequencies in Southeast Asia, and at lower frequencies in western East Asia. It is virtually absent from the rest of the world. BEH1 and BEH3 show very similar global patterns, low frequencies to midrange frequencies in Africa and East Asia, midrange frequencies in India and Eastern Siberia, and midrange to high frequencies in Southwest Asia, Europe, Western Siberia, the Pacific Islands, and the Americas. BEH2 shows a different pattern from the other two. It shows low frequencies in East Africa, India, Eastern Siberia, and the Americas, midrange frequencies in Southwest Asians and Southern Europeans, and high frequencies in Eastern and Northwestern Europe and Western Siberia. We then typed additional SNPs and tested each pigmentation allele for selection using the Relative Extended Haplotype Homozygosity (REHH) test. 
We found that the light skin allele of rs1800414 is under selection in East Asia and that the blue eye allele of BEH2 is under selection in Europe and Southwest Asia. We show light skin pigmentation has been selected for in East Asia. This is likely due to lower UV exposure at the higher latitudes (compared to equatorial Africa) and the need for lighter skin for vitamin D production. We also show that blue eyes are selected for in Europe. This is most likely due to sexual selection, though another unknown effect of this particular allele could be selected for and the blue eyes are a side effect.
Ancestry variation along the genome in Latin American populations and implications for recent natural selection
Latin American populations stem from the admixture starting about 500 years ago of Europeans, Africans and Native Americans. Extreme deviation in ancestry estimates at certain genome locations (relative to the genome-wide average) could reflect the action of recent natural selection. We evaluated the distribution of ancestry estimates along the genome using 678 microsatellite markers in 249 individuals sampled from 13 admixed populations across Latin America. We found a significant deviation in ancestry at two genomic locations, each more than four standard deviations from the genome-wide mean: an excess of European ancestry at 14q32 (Z-score = 4.14), and an excess of African ancestry at 6p22 (Z-score = 4.71). These deviations in ancestry were observed in the analysis of the combined dataset as well as in most of the individual populations examined. We showed that our findings are robust to the Native American ancestry populations used. We discussed the implications for recent natural selection in the context of the unique history of the New World, as well as the possibility of artifacts.
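The Z-scores quoted above express how far the ancestry estimate at one locus lies from the genome-wide mean, in units of the standard deviation across loci. A minimal sketch of that calculation follows; the function names and the per-locus values are invented for illustration, and the use of the population standard deviation is an assumption rather than something stated in the abstract.

```python
from statistics import mean, pstdev

def ancestry_z_scores(local_estimates):
    """Z-score of each locus's ancestry estimate relative to the genome-wide mean."""
    mu = mean(local_estimates)
    sd = pstdev(local_estimates)   # assumption: population SD across loci
    return [(x - mu) / sd for x in local_estimates]

def outlier_loci(local_estimates, threshold=4.0):
    """Indices of loci deviating by more than `threshold` standard deviations."""
    z = ancestry_z_scores(local_estimates)
    return [i for i, value in enumerate(z) if abs(value) > threshold]

# Hypothetical European-ancestry proportions at 100 loci, one of them inflated.
estimates = [0.5] * 99 + [0.9]
print(outlier_loci(estimates))
```

With 678 markers tested, a cutoff of four standard deviations is a conservative threshold, though, as the abstract notes, such outliers can also arise from artifacts.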