Showing posts with label Huns. Show all posts

September 06, 2013

ASHG 2013 abstracts

Feel free to point me to more interesting abstracts than the ones I noticed during my "first pass".

Morphometric and ancient DNA study of human skeletal remnants in Indian Subcontinent.
N. Rai et al.
Recovery and sequencing of mtDNA from ancient human remnants is a daunting task but provides valuable information about human migrations and evolution. Our present study is the first to recover, amplify and sequence (HVR and coding regions of mtDNA) inadequately preserved and highly degraded (1.5 Ky to ≤1.0 Ky ago) hominid mitochondrial DNA of three most intriguing and indigenous ancient populations of South and South-East Asia (Myanmar=20 Buried individuals, Nicobar Islands=15 and Andaman Island=6). Following all parameters and to avoid the chance of contamination we independently extracted and sequenced the DNA in two different labs and measured the cranial variability in all hominid skulls using 128 cranial landmarks, compiled 3D morphometrics, genetic data of ancient DNA samples and analyzed the admixture and genetic affinities of above three populations. Results showed the predominant frequency of F1a1 and complete absence of 9bp deletion in ancient Nicobarese. Unlike in previous reports on modern Nicobarese, the high frequency of F1a1 haplogroup in ancient Nicobarese shows the probable migration of Nicobarese from South East Asia and the complete absence of 9bp deletion suggests the different events of settlement. This study failed to detect genetic affinities of Burmese with Nicobarese even though their phenotype and language appear to be the same. We report for the first time a population study on Burmese populations and with the genetic affinity of Burmese with East Asian, East Indian (Including Gadhwal region of Himalaya) and Bangladeshi populations, we found significant admixture with West Eurasians. Our study strongly supports the West Eurasian and East Asian route of migration and settlement of early Burmese population.
The three populations in the present study are quite different in their genetic structure but 3D morphometric study using huge number of landmarks explains a close homology among these populations and this can be explained by the role of climatic signature on these populations.
Y chromosomes of ancient Hunnu people and its implication on the phylogeny of East Asian linguistic families.
LL. Kang et al.
The Hunnu (Xiongnu) people, also called Huns in Europe, were the largest ethnic group to the north of Han Chinese until the 5th century. The ethno-linguistic affiliation of the Hunnu is controversial among Yeniseian, Altaic, Uralic, and Indo-European. Ancient DNA analyses on the remains of the Hunnu people had shown some clues to this problem. Y chromosome haplogroups of Hunnu remains included Q-M242, N-Tat, C-M130, and R1a1. Recently, we analyzed three samples of Hunnu from Barköl, Xinjiang, China, and determined Q-M3 haplogroup. Therefore, most Y chromosomes of the Hunnu samples examined by multiple studies are belonging to the Q haplogroup. Q-M3 is mostly found in Yeniseian and American Indian peoples, suggesting that Hunnu should be in the Yeniseian family. The Y chromosome diversity is well associated with linguistic families in East Asia. According to the similarity in the Y chromosome profiles, there are four pairs of congenetic families, i.e., Austronesian and Tai-Kadai, Mon-Khmer and Hmong-Mien, Sino-Tibetan and Uralic, Yeniseian and Palaesiberian. Between 4,000-2,000 years before present, Tai-Kadai, Hmong-Mien, Sino-Tibetan, and Yeniseian languages transformed into toned analytic languages, becoming quite different from the rest four. Since Hunnu was in the Yeniseian family, all these four toned families were distributed in the inland of China during the transformations. There must be some social or biological factors induced the transformations at that time, which is worth doing more linguistic and genetic researches.
Genomic scans for haplotypes of Denisova and Neanderthal ancestry in modern human populations.
F. L. Mendez, M. F. Hammer University of Arizona, Tucson, AZ., USA.
Evidence of archaic introgression into modern humans has accumulated in recent years. While most efforts to characterize the introgression process have relied on genome averages, only a small number of introgressive haplotypes have been shown to have an archaic origin after rejection of the alternative hypothesis of incomplete lineage sorting. Accurate identification of introgressive haplotypes is crucial both to characterize potentially functional consequences of archaic admixture and to quantify more precisely the genomic impact of archaic introgression. We perform two independent genomic scans for haplotypes of Denisova and of Neanderthal origin in a geographically diverse sample of complete genome sequences. These scans are based on the local sharing of polymorphisms and linkage disequilibrium, respectively. The analysis of concordance between the methods is then used to estimate the power and to compare demographic inference when performed using either all the data or just the genomic regions with no evidence of introgression. Moreover, we evaluate the extent to which Denisova haplotypes are observed in non-Melanesian populations, and investigate whether the presence of such haplotypes is better explained by their persistence in the population since introgression or by more recent gene flow from Melanesians.
Admixture Estimation in a Founder Population. 
Y. Banda et al.
Admixture between previously diverged populations yields patterns of genetic variation that can aid in understanding migrations and natural selection. An understanding of individual admixture (IA) is also important when conducting association studies in admixed populations. However, genetic drift, in combination with shallow allele frequency differences between ancestral populations, can make admixture estimation by the usual methods challenging. We have, therefore, developed a simple but robust method for ancestry estimation using a linear model to estimate allele frequencies in the admixed individual or sample as a function of ancestral allele frequencies. The model works well because it allows for random fluctuation in the observed allele frequencies from the expected frequencies based on the admixture estimation. We present results involving 3,366 Ashkenazi Jews (AJ) who are part of the Kaiser Permanente Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort and genotyped at 674,000 SNPs, and compare them to the results of identical analyses for 2,768 GERA African Americans (AA). For the analysis of the AJ, we included surrogate Middle Eastern, Italian, French, Russian, and Caucasus subgroups to represent the ancestral populations. For the African Americans, we used surrogate Africans and Northern Europeans as ancestors. For the AJ, we estimated mean ancestral proportions of 0.380, 0.305, 0.113, 0.041 and 0.148 for Middle Eastern, Italian, French, Russian and Caucasus ancestry, respectively. For the African Americans, we obtained estimated means of 0.745 and 0.248 for African and European ancestry, respectively. We also noted considerably less variation in the individual admixture proportions for the AJ (s.d. = .02 to .05) compared to the AA (s.d.= .15), consistent with an older age of admixture for the former. From the linear model regression analysis on the entire population, we also obtain estimates of goodness of fit by r2. 
For the analysis of AJ, the r2 was 0.977; for the analysis of the AA, the r2 was 0.994, suggesting that genetic drift has played a more prominent role in determining the AJ allele frequencies. This was confirmed by examination of the distribution of differences for the observed versus predicted allele frequencies. As compared to the African Americans, the AJ differences were significantly larger, and presented some outliers which may have been the target of selection (e.g. in the HLA region on chromosome 6p).
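To make the linear-model idea in the abstract above concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a plain least-squares fit with no intercept and no constraint that the proportions sum to one. The function names and the synthetic two-source data are hypothetical, chosen only for illustration.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def estimate_admixture(admixed, ancestral):
    """Regress admixed allele frequencies on ancestral allele frequencies.

    admixed:   list of n allele frequencies in the admixed sample
    ancestral: list of n rows, each with k ancestral allele frequencies
    Returns (proportions, r2). Assumption: ordinary least squares, no intercept.
    """
    k = len(ancestral[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in ancestral) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(ancestral, admixed))
           for i in range(k)]
    props = solve_linear_system(xtx, xty)
    preds = [sum(p * f for p, f in zip(props, row)) for row in ancestral]
    mean_y = sum(admixed) / len(admixed)
    ss_res = sum((y - p) ** 2 for y, p in zip(admixed, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in admixed)
    return props, 1.0 - ss_res / ss_tot

# Synthetic data: 2,000 SNPs, two sources mixed 75/25, plus drift-like noise.
random.seed(0)
ancestral = [[random.uniform(0.05, 0.95), random.uniform(0.05, 0.95)]
             for _ in range(2000)]
admixed = [0.75 * a + 0.25 * b + random.gauss(0.0, 0.01) for a, b in ancestral]
props, r2 = estimate_admixture(admixed, ancestral)
print(props, r2)
```

In this toy setup, increasing the noise term lowers r2, which is the sense in which the abstract reads a lower r2 as a sign of more drift.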
Admixture in the Pre-Columbian Caribbean. 
J. C. Martinez-Cruzado et al.
The biological origin of the Caribbean aborigines that greeted Columbus is one of the most controversial issues regarding the population history of this region. Genome studies suggest an Equatorial-Tucanoan origin, consistent with the Arawakan language spoken by most natives of the region. However, the archaeological evidence suggests an early arrival from Mesoamerica, and their admixture with the more recent Arawak-speaking group stemming from the Amazon remains a possibility. The lineages comprehending most Puerto Rican samples belonging to haplogroups B1 and C1, which in turn encompass 44% of all Native American mtDNAs in the island, have an unambiguous South American origin. However, none of those belonging to haplogroup A2, encompassing 52% of all Native American mtDNAs, have been related to South America or any other continental region. To augment the scarce data from Mesoamerican countries other than Mexico, we present the complete mtDNA sequence of 6 Honduran samples belonging to distinct control region lineages in addition to 3 from the Dominican Republic and 3 from Puerto Rico. Interestingly, maximum likelihood phylogenetic reconstruction including 40 published haplogroup A2 sequence haplotypes from Mesoamerica, Central America and South America clusters 8 out of 10 Mesoamerican and Andean haplotypes in a deep rooted group, separate from, and excluding all Costa Rican, Panamian and Brasilian haplotypes, suggesting a relatively recent origin for Chibchan-Paezan and Amazonian groups. Furthermore, 4 of the 5 Greater Antillean A2 haplotypes are included in the deeply rooted Mesoamerican-Andean cluster. Moreover, the only Cuban haplotype in the literature and the remaining A2 haplotype from the Dominican Republic form even more deeply rooted private branches. 
Similarly, the only haplogroup C1d sample sequenced from the Dominican Republic forms a private branch with the deepest root in a maximum likelihood tree containing 19 additional C1d haplotypes from Mexico to Brasil plus the CRS. In conclusion, our preliminary results suggest that a substantial proportion of the Native American mtDNA lineages from the Greater Antilles do not share an Amazonian origin with the language their people spoke in 1492. Furthermore, the position of two Dominican lineages at the earliest split in both their respective trees suggests an early origin that could be explained by extensive lineage extinctions in Mesoamerica and the Andes or an origin in North America.
The possible role of social selection in the distribution of the "Proto-Mongolian" haplotype in Kazakhs, Kyrgyz, Mongols and other Eurasian populations.
M. Zhabagin et al.
Social factors may be important contributors to reproductive success and determination of the selective survival of individuals. Therefore, social selection and other social factors are important for understanding population structure and its formation. The role of social selection on the distribution and formation of Y-chromosomal gene pool has been studied. There is a strong connection between social selection and birth rate of the descendants, whose fathers had achieved high social status during the expansion of the Mongol Empire and associated historical events. A total of 783 haplotypes, including 687 newly obtained and 96 retrieved from the literature were assigned to the haplogroup C3*-M217 (xM48) based on genotyping 17 Y-chromosomal STR markers. These haplotypes represent 11 populations of Eurasia: Kazakhs, Mongols, Kyrgyz, Telengits, Circassians, Balkar, Temirgoys, Karachai, Evenki, Kizhi and the Pashtuns. As the result, a major haplotype 13-16-25-15-16-18-14-10-22-11-10-11-13-10-21 (DYS389a-DYS389b-DYS390-DYS456-DYS19-DYS458-DYS437-DYS438-DYS448-GATA4-DYS391-DYS392-DYS393-DYS439-DYS635, N=94) was found to have 12.00% frequency within haplogroup C3*. This haplotype includes and extends the previously described “star-cluster” haplotype. Noteworthy, the frequency of this major haplotype within haplogroup C3* was 16.80% in Kazakhs, 10.13% in Mongols and 2.63% in Kirgiz who are not considered as direct descendants of Genghis Khan. 35.10% of the major haplotype was represented by Kazakh tribe Ashamayly-Kerey, 17.02% by the Khalkh Mongols and 7.44% by the Barguts. Therefore, we suppose this major ancestral haplotype to be the "proto-Mongolian haplotype", inherited by Genghis Khan and his descendants. It is important to mention that Temujin belongs to Kiyat-Borjigin tribe that in turn is a branch of the bigger Borjigin tribe, part of the Khalkh Mongols. Thus, Genghis Khan might be considered as a carrier rather than founder of the star-cluster haplotype. 
He and his descendants are the ones who contributed to a positive effect of social selection in the distribution of this haplotype. Other examples are the Barguts, who had Genghis Khan’s credit and were granted with a number of privileges, or the Kerey, based on the fact that Temujin had been brought up at the court of the Togrul Khan, belonging to the Kerey tribe.
Y-chromosomal variation in native South Americans: bright dots on a gray canvas.
M. Nothnagel et al.
While human populations in Europe and Asia have often been reported to reveal a concordance between their extant genetic structure and the prevailing regional pattern of geography and language, such evidence is lacking for native South Americans. In the largest study of South American natives to date, we examined the relationship between Y-chromosomal genotype on the one hand, and male geographic origin and linguistic affiliation on the other. We observed virtually no structure for the extant Y-chromosomal genetic variation of South American males that could sensibly be related to their inter-tribal geographic and linguistic relationships, augmented by locally confined Y-STR autocorrelation. Analysis of repeatedly taken random subsamples from Europe adhering to the same sampling scheme excluded the possibility that this finding was due to our specific scheme. Furthermore, for the first time, we identified a distinct geographical cluster of Y-SNP lineages C-M217 (C3*) in South America, which are virtually absent from North and Central America, but occur at high frequency in Asia. Our data suggest a late introduction of C3* into South America no more than 6,000 years ago and low levels of migration between the ancestor populations of C3* carrier and non-carriers. Our findings are consistent with a rapid peopling of the continent, followed by long periods of isolation in small groups, and highlight the fact that a pronounced correlation between genetic and geographic/cultural structure can only be expected under very specific conditions.
The timing and history of Neandertal gene flow into modern humans. 
S. Sankararaman et al.
   Previous analyses of modern human variation in conjunction with the Neandertal genome have revealed that Neandertals contributed 1-4% of the genes of non-Africans with the time of last gene flow dated to 37,000-86,000 years before present. Nevertheless, many aspects of the joint demographic history of modern humans and Neandertals are unclear. We present multiple analyses that reveal details of the early history of modern humans since their dispersal out of Africa.
   1. We analyze the difference between two allele frequency spectra in non-Africans: the spectrum conditioned on Neandertals carrying a derived allele while Denisovans carry the ancestral allele and the spectrum conditioned on Denisovans carrying a derived allele while Neandertals carry the ancestral allele. This difference spectrum allows us to study the drift since Neandertal gene flow under a simple model of neutral evolution in a panmictic population even when other details of the history before gene flow are unknown. Applying this procedure to the genotypes called in the 1000 Genomes Project data, we estimate the drift since admixture in Europeans of about 0.065 and about 0.105 in East Asians. These estimates are quite close to those in the European and East Asian populations since they diverged, implying that the Neandertal gene flow occurred close to the time of split of the ancestral populations.
   2. Assuming only one Neandertal gene flow event in the common ancestry of Europeans and East Asians, we estimate the drift since gene flow in the common ancestral population. We show that an upper bound on this shared drift is 0.018. Because this is far less than the drift associated with the out-of-Africa bottleneck of all non-African populations, this shows that the Neandertal gene flow occurred after the out-of-Africa bottleneck.
   3. We use the genetic drift shared between Europeans and East Asians, in conjunction with the observation of large regions deficient in Neandertal ancestry obtained from a map of Neandertal ancestry in Eurasians, to estimate the number of generations and effective population size in the period immediately after gene flow. These analyses suggest that only a few dozen Neandertals may have contributed to the majority of Neandertal ancestry in non-Africans today.
Genetic characterisation of two Greek population isolates. 
K. Hatzikotoulas et al.
   Genetic association studies of low-frequency and rare variants can be empowered by focusing on isolated populations. It is important to genetically characterize population isolates for substructure and recent admixture events as these may give rise to spurious associations. Under the auspices of the HELlenic Isolated Cohorts study (HELIC; www.helic.org) we have collected >3,000 samples from two isolated populations in Greece: the Pomak villages (HELIC Pomak), a set of religiously-isolated mountainous villages in the North of Greece; and Anogia and surrounding mountainous villages on Crete (HELIC MANOLIS). All samples have information on anthropometric, cardiometabolic, biochemical, haematological and diet-related traits. 1,500 individuals from each population isolate have been typed on the Illumina OmniExpress and Human Exome Beadchip platforms. Multidimensional scaling analysis with the 1000 Genomes Project data shows similarities of the two population isolates with Mediterranean populations such as the Tuscans from Italy and Iberians from Spain. We also observe evidence for structure within the isolates, with the Kentavros village in the Pomak strand demonstrating high levels of differentiation. To characterise the degree of isolatedness in these populations we estimated the proportion of individuals with at least one “surrogate parent” (using only the subset of samples with pairwise pi-hat<0.2) for random regions of the genome, and compared this to an outbred Greek population, which comprises 707 unrelated adolescents from the TEENAGE study in the Attica district. We find that the proportion of individuals with at least one surrogate parent in the MANOLIS isolate is >60% and in the Pomak isolate is >65% compared to ~1% in the outbred Greek population.
Our results establish these populations as isolates and provide some insights into the genomic architecture of Greek populations, which have not been previously characterised.
Efficient and Accurate Whole-Genome Human Phasing.
T. Blauwkamp et al.
   High throughput DNA sequencing allows whole human genomes to be resequenced rapidly and inexpensively producing a comprehensive list of variants relative to the reference genome. However, short read sequencing technologies are limited in their ability to determine phasing information, thus resulting in heterozygous calls being represented as the average of the maternal and paternal chromosomes. Phasing information is of critical importance to personal medicine as it provides a better linkage between genotype and phenotype, permitting new advances in our understanding of compound heterozygote linked diseases, pharmacogenomics, HLA typing, and prenatal genome sequencing. Here, we describe a new sample prep method that enables whole human genome haplotyping at high accuracy using only 30Gb of sequence data. Genomic DNA was fragmented into ~10Kb fragments, end repaired, and ligated to adapters. Hundreds of aliquots with approximately 50MB of DNA in each were amplified, fragmented and converted into individual shotgun libraries. The pooled libraries were sequenced in a single lane of a HiSeq2500 at 2x100bp to generate ~30Gb of sequence. The resulting sequence information was analyzed to obtain a set of long blocks of ~10Kb, covering multiple heterozygous SNPs, allowing phasing of these SNPs relative to each other. An HMM-based phasing algorithm was used to compute the most likely phase and confidence intervals based on the observed coverage and sequencer quality scores. Phasing of those blocks relative to each other was done by another HMM-based algorithm which uses a panel of previously phased genomes. Comparing our results with phase information inferred by transmission from the parents, we found that over 98% of heterozygous SNPs were phased within long blocks (N50=500kb) at a switch error rate below 1 switch per megabase of phased sequence. We present results obtained from multiple cell lines and human samples. 
This new library prep method and data analysis pipeline enables whole human genome phasing with only 30Gb of raw sequence, which represents only ~30% more sequencing than current 30x baseline run for human sequencing. Compared to other published reports, this method is capable of phasing a greater fraction of SNPS with ~75% less sequencing. Coupling our higher percentage of SNPs phased with high accuracy and the lowest sequencing requirement, this new technology is the most affordable approach to generating completely phased whole human genomes.
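The two quality measures quoted in the abstract above, block N50 and switch errors, are easy to misread, so here is a small sketch of how they are commonly computed. This is illustrative only and not the authors' pipeline; the function names and the 0/1 string encoding of haplotypes are assumptions of this example.

```python
def n50(block_lengths):
    """Length L such that blocks of length >= L cover at least half the total."""
    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("block_lengths must not be empty")

def count_switch_errors(inferred, truth):
    """Count phase switches between an inferred and a reference haplotype.

    Both arguments are strings of '0'/'1' giving the allele assigned to
    haplotype 1 at each heterozygous site within one phased block.
    A switch is counted whenever agreement with the reference flips
    between consecutive sites, so a fully complemented block has none.
    """
    switches = 0
    previous_agrees = None
    for a, b in zip(inferred, truth):
        agrees = (a == b)
        if previous_agrees is not None and agrees != previous_agrees:
            switches += 1
        previous_agrees = agrees
    return switches

print(n50([100, 200, 300, 400]))
print(count_switch_errors("0011", "0000"))
```

Dividing the switch count by the phased length in megabases gives a rate in the same units as the abstract's "1 switch per megabase".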
Inference of Natural Selection and Demographic History for African Pygmy Hunter-Gatherers.
P. H. Hsieh et al.
   African Pygmies are hunter-gatherers primarily inhabiting the Central African rainforests, where they are exposed to high temperatures, high humidity, and a pathogen and parasite-enriched woody habitat. These factors undoubtedly influenced their evolutionary history as they adapted to this environment. Many Pygmy populations have historically been in socio-economic contact with neighboring Niger-Kordofanian speaking farmer populations, particularly since the agriculture expansion in sub-Saharan Africa that began five thousand years ago (kya). To look for the true signatures of adaptation to the rainforest habitat of pygmies we must control for this complex demographic history. We sequenced and combined 40x whole genome sequence data from 3 Baka pygmies from Cameroon, 4 Biaka pygmies from the Central African Republic, and 9 Niger-Kordofanian speaking Yoruba farmers from Nigeria. We used ∂a∂i, a model-based demographic inference tool, to infer the history of these populations. Our best-fit model suggests that the ancestors of the farmer and pygmy populations diverged 150 kya and remained isolated from each other until 40 kya. This divergence is more ancient than estimated by previous studies that included fewer loci, but is consistent with a PSMC analysis, a separate inference tool that uses different aspects of the genomic data than ∂a∂i. Interestingly, our analysis shows that models with bi-directional asymmetric gene flow between farmers and pygmies are statistically better supported than previously suggested models with a single wave of uni-directional migration from farmers to pygmies. To identify possible targets of positive selection, we conducted a genomic scan using complementary methods, including the frequency-spectrum based G2D test, the population differentiation based XP-CLR test, and the haplotype based iHS test.
We performed 10,000 simulations based on the above best-fit demographic model in order to assign statistical significance to each reported target of natural selection. Our results reveal that genes involved in cell adhesion, cellular signaling, olfactory perception, and immunity were likely targeted by natural selection in the pygmies or their recent ancestors. Our analysis also shows that genes involved in the function of lipid binding are enriched in highly differentiated non-synonymous mutations, suggesting that this function may have acted differently on the Pygmies and farmers after their divergence from their common ancestor.
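The abstract does not say exactly how the 10,000 simulations are turned into significance values. One common approach is an empirical p-value: the fraction of neutral simulations whose test statistic is at least as extreme as the observed one. The sketch below assumes that approach, along with an add-one correction so the estimate is never exactly zero; both are assumptions of this example, as is the function name.

```python
def empirical_p_value(observed, simulated):
    """One-sided empirical p-value of an observed statistic.

    simulated: statistics computed on data simulated under the null
    (here, the neutral best-fit demographic model).
    The +1 in numerator and denominator is an assumed convention that
    counts the observed value as one draw from the null distribution.
    """
    at_least_as_extreme = sum(1 for s in simulated if s >= observed)
    return (at_least_as_extreme + 1) / (len(simulated) + 1)

# With 10,000 simulations and none exceeding the observed value,
# the smallest reportable p-value is 1/10001.
print(empirical_p_value(5.0, [0.0] * 10000))
```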
Population demography and maternal history of Oceania.
A. T. Duggan et al.
   We present a large-scale study of mtDNA diversity across Near and Remote Oceania with whole-genome mtDNA sequencing and a sample collection of more than 1,300 individuals spanning from the Bismarck Archipelago in the west to the Cook Islands in the east. As the location of at least two major migration events (initial colonization over 40,000 years ago, followed by an expansion of Austronesian-speaking migrants around 3,500 years ago), Oceania provides a unique opportunity to study the effects of population admixture. Our results support the idea of sex-biased admixture between the resident populations and the migrants of the Austronesian expansion. We find that haplogroups of putative Asian origin which are thought to have spread with the Austronesian expansion are found at high frequency in all but two populations and, in general, we see little evidence of distinction between Papuan and Austronesian speaking populations. Santa Cruz, which is part of the Solomon Islands but geographically distinct from the main island chain and considered part of Remote Oceania, has long been considered a linguistic oddity and is now accepted to represent a very deep branch in the Oceanic language family. We find that it is also a genetic outlier, with potential direct connections to the Bismarck Archipelago not evident in the main Solomon Islands chain. In this expanded dataset, we find additional evidence of instability and increased heteroplasmy at the ‘Polynesian motif’ position 16247, further confirming previous findings restricted to the Solomon Islands. 

Reconstructing Austronesian population history.
M. Lipson et al.
   Present-day populations that speak Austronesian languages are spread across half the globe, from Easter Island in the Pacific Ocean to Madagascar in the Indian Ocean. Evidence from linguistics and archaeology suggests that the "Austronesian expansion," a vast cultural and linguistic dispersal that began 4-5 thousand years ago, had its origin in Taiwan. However, genetic studies of Austronesian ancestry have been inconclusive, with some finding affinities with aboriginal Taiwanese, others advancing an autochthonous origin within Island Southeast Asia, and others proposing a model involving multiple waves of migration from Asia. Here, we analyze genome-wide data from a diverse set of 31 Austronesian-speaking and 25 other groups typed at 18,412 overlapping single nucleotide polymorphisms (SNPs) to trace the genetic origins of Austronesians. We use a recently developed computational tool for building phylogenetic models of population relationships incorporating the possibility of admixture, which allows us to infer ancestry proportions and sources of genetic material for 26 admixed Austronesian-speaking populations. Our analysis provides strong confirmation of widespread ancestry of Taiwanese origin: at least a quarter of the genetic material in all Austronesian-speaking populations that we studied (including all of the Asian ancestry in populations from eastern Indonesia and Oceania) is more closely related to aboriginal Taiwanese than to any populations we sampled from the mainland. Surprisingly, we also show that western Austronesian-speaking populations have inherited substantial proportions of their Asian ancestry from a source that falls within the variation of present-day Austro-Asiatic populations in Southeast Asia. No Austro-Asiatic languages are spoken in Island Southeast Asia today, although there are some linguistic and archaeological suggestions of an early connection between mainland and island populations.
The most plausible explanation for these findings, in light of the historical evidence, is that western Island Southeast Asia was settled by Austronesian groups who had previously mixed with Austro-Asiatic speakers on the mainland.
No significant differences in the accumulation of deleterious mutations across diverse human populations.
R. Do et al.
   Differences in demographic history across populations are expected to cause differences in the accumulation of deleterious mutations because natural selection works less efficiently when population sizes are small. Surprisingly, however, the relative burden of deleterious mutations has never been directly measured across human populations on a per-haploid genome basis, despite the fact that this is what matters biologically in the absence of dominance and epistasis. Here we empirically measure the relative accumulation of deleterious mutations in 13 diverse populations (Yoruba, Mandenka, San, Mbuti, Dinka, Australian, French, Sardinian, Han, Dai, Mixe, Karitiana and Papuan) along with one archaic population (Denisova). All the present-day populations have statistically indistinguishable accumulations of coding mutations. We highlight two examples. First, we find no evidence for a lower mutational load in West Africans than in Europeans despite the approximately 30% higher genetic diversity in West Africans: the accumulation of nonsynonymous mutations in West Africans is 1.01±0.02 times that in Europeans, and for “probably damaging” mutations, the ratio is 1.03±0.04. Second, we find no evidence for a lower mutational load in populations that have experienced agriculture-related expansions over the last 10,000 years and those that have not: the ratio in Chinese to Karitiana hunter gatherers from Brazil is 0.99±0.07. We determined that these null results are not an artifact of insensitivity of our method to differences in demographic history. As a positive control, we also analyzed archaic Denisovans who are known to have had a small population size for hundreds of thousands of years since separation from modern humans. We show that the Denisovan lineage has accumulated “probably damaging” mutations 1.33±0.06 times more rapidly than modern humans since they split. 
These analyses are important because of the new constraints they place on the distribution of selection coefficients in humans. Given the currently estimated demographic histories of West Africans and Europeans, combined with the fact that we do not detect a lower accumulation of deleterious mutations in West Africans than Europeans, we can conclude that only a small proportion of nonsynonymous mutations have selection coefficients in the range s=-0.01 to -0.001, which is the range of selection coefficients which would be expected to show a lower accumulation in West Africans than in Europeans.
Deep coverage Bedouin genomes reveal Bedouin haplotypes shared among worldwide populations in the 1000 Genomes Project. 
J. L. Rodriguez-Flores et al.
   The 1000 Genomes Project (1000G) has sampled and sequenced over 2500 genomes that are representative of the genetic diversity in populations worldwide. The Arabian Peninsula has not been previously included in 1000G, hence the connections between genetic variation in the indigenous Bedouin people and worldwide populations is unknown. We have sampled genomes from Bedouin individuals in the nation of Qatar as a window into the genetic variation in this understudied region. Our goal was to use this sample to assess the hypothesis that there is detectable shared ancestry between Bedouin and Southern European populations resulting from the history of empires that spanned both the Mediterranean and Arabian regions and the hypothesis that there is shared ancestry between Bedouin and contemporary Latin American populations, since the majority of European settlers in Latin America from the past half millennia are primarily from Southern European countries. We selected 60 Qataris with over 95% Bedouin ancestry and at least 3 generations of ancestry in Qatar for deep coverage genome sequencing. Genomes were sequenced by the Illumina Genome Network using TruSeq DNA PCR-free sample preparation, generating over 120 gigabases of paired-end 100 base pair reads per genome on a HiSeq 2500, yielding over 30x depth and genotypes for >96% of the genome using both the ELAND/CASAVA and BWA/GATK pipelines. Using these genotypes, we inferred haplotypes using SHAPEIT for Bedouin Qataris and for 1000G populations on a set of sites polymorphic in both 1000G and Bedouins. We used admixture analysis to assess shared ancestry between our Bedouin sample and 1000G populations using the ancestry deconvolution method SUPPORTMIX. Given the lack of appropriate ancestral populations, we conducted a leave-one-out approach, where for each population (1000G + Bedouin = n), we removed the population and used the remaining n-1 populations as an ancestral reference panel. 
Using this approach, we observed up to 15% Bedouin ancestry in European, South Asian, and American populations. Likewise, we observed ancestry from Europe, South Asia, and America in the Bedouin. For individuals from the Americas, the analysis identified a considerable number of segments shared with Bedouins previously classified as European ancestry. 
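The leave-one-out design described in the abstract above can be pictured with a small sketch. This is only an illustration of how the n reference panels might be assembled; the population labels are placeholders (the abstract does not list which 1000G populations were used), and the actual SUPPORTMIX run is not shown.

```python
def leave_one_out_panels(populations):
    """Yield (target, reference_panel) pairs, one per population.

    For each target population, the reference panel is every other
    population in the list (n - 1 of them).
    """
    for target in populations:
        reference = [p for p in populations if p != target]
        yield target, reference


# Illustrative labels only; not the abstract's actual population list.
populations = ["Bedouin", "CEU", "YRI", "CHB"]

for target, reference in leave_one_out_panels(populations):
    print(f"deconvolve {target} using reference panel {reference}")
```

Each population is therefore analyzed once as the target, which is presumably how Bedouin ancestry can be reported in 1000G populations and vice versa.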
Using a haplotype-based model to infer Native American colonization history.
C. Lewis et al.
   We apply a powerful haplotype-based model (described in Lawson et al. 2012) to infer the population history of 410 individuals from ~50 Native American groups, using data interrogated at >470,000 genome-wide autosomal Single-Nucleotide-Polymorphisms (SNPs). The model matches haplotype patterns among individuals' chromosomes to infer which individuals share recent common ancestry at each location of the genome, an approach that has previously been demonstrated to increase power substantially over widely-used alternative approaches that consider SNPs independently. We apply this methodology to 1861 samples described in Reich et al. (2012), incorporating 263 additional samples from 32 relevant world-wide regions collated from other publicly available resources and currently unavailable data. We utilize this methodology and these data in two ways. First, we infer intermixing (i.e. "admixture") events among different Native American groups by identifying the groups that share the most haplotype segments. Using additional unpublished techniques, we determine the dates of these intermixing events, the proportions of DNA contributed, and the precise genetic make-up of the groups involved. These unique characteristics set this methodology apart from all presently available software, allowing us to place these mixing events into a clear historical context and thus identify the factors (e.g. the rise or fall of various Native American empires) that have contributed most to the genetic architecture of present-day Native American groups. Second, we match DNA patterns from each Native American group to a set of over 30 populations from Siberia and East Asia, describing each Native American group as a mixture of DNA from these regions. This enables us to shed light on the widely debated number of distinct migrations into the Americas during the initial colonization across the Bering Strait, comparing our results to previous inference from the literature. 
Our application demonstrates the power gained by using rich haplotype information relative to approaches that ignore this information.
Using Ancient Genomes to Detect Positive Selection on the Human Lineage. 
K. Prüfer et al.
   At least two distinct groups of archaic hominins inhabited Eurasia before the arrival of modern humans: Neandertals and Denisovans. The analysis of the genomes of these archaic humans revealed that they are more closely related to one another than they are to modern humans. However, since modern and archaic humans are so closely related, only about 10% of the archaic DNA sequences fall outside the present-day human variation whereas for 90% of the genome, Neandertal or Denisova DNA sequences are more closely related to some humans than to others. The fact that the archaic sequence often falls within the diversity of modern humans can be used to detect selective sweeps that affected all modern humans after their split from archaic humans since such sweeps will result in genomic regions where both the Neandertal and Denisova genomes fall outside the modern human variation. The genetic lengths of such external regions are proportional to the strength of selection, since stronger selection will lead to faster sweeps allowing less time for recombination to decrease their size. We have implemented a test for such external regions as a hidden Markov model. At each polymorphic position the model emits ancestral or derived based on whether the tested archaic genome carries the ancestral or derived variant of SNPs observed in present-day humans. The model was applied to 185 African genomes from the 1000 genomes phase 1 data. We identified thousands of external regions using the Neandertal and Denisova genomes, separately. Approximately one third of the regions are overlapping between the two genomes. These regions are significantly longer than regions identified in only one of the archaic genomes. Based on this excess of overlap for long regions, we devise a measure to identify a set of regions that are candidates for selective sweeps on the human lineage since the split from Neandertal and Denisova.
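To make the hidden Markov model idea in the abstract above a little more concrete, here is a toy two-state sketch decoded with the Viterbi algorithm. It is not the authors' model: the state names, the transition probability and both emission probabilities are assumptions chosen purely for illustration. The only thing taken from the abstract is that each polymorphic site emits "ancestral" (A) or "derived" (D) according to the allele carried by the archaic genome.

```python
import math

STATES = ("internal", "external")


def viterbi(observations, p_stay=0.99,
            p_derived_internal=0.3, p_derived_external=0.01):
    """Most likely state path for a string of 'A'/'D' observations.

    All probabilities are illustrative assumptions: in an 'external'
    region the archaic genome is assumed to carry the ancestral allele
    at almost every site, while in 'internal' regions it carries the
    derived allele fairly often.
    """
    emit = {
        "internal": {"D": p_derived_internal, "A": 1 - p_derived_internal},
        "external": {"D": p_derived_external, "A": 1 - p_derived_external},
    }
    trans = {s: {t: (p_stay if s == t else 1 - p_stay) for t in STATES}
             for s in STATES}

    # Log-scores with a uniform prior over the starting state.
    score = {s: math.log(0.5) + math.log(emit[s][observations[0]])
             for s in STATES}
    back = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for t in STATES:
            best_prev = max(STATES,
                            key=lambda s: score[s] + math.log(trans[s][t]))
            new_score[t] = (score[best_prev] + math.log(trans[best_prev][t])
                            + math.log(emit[t][obs]))
            pointers[t] = best_prev
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# A long run of ancestral calls flanked by derived calls.
sites = "DADDA" + "A" * 40 + "DDADD"
path = viterbi(sites)
print(path[0], path[25], path[-1])
```

With these made-up parameters a short run of ancestral calls stays "internal", and only a long uninterrupted run is labeled "external"; the length of such runs is what the abstract ties to the strength of selection.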
Pulling out the 1%: Whole-Genome In-Solution (WISC) capture for the targeted enrichment of ancient DNA sequencing libraries. 
C. D. Bustamante et al.
   The very low levels of endogenous DNA remaining in most ancient specimens have precluded the shotgun sequencing of many interesting samples due to cost. For example, ancient DNA (aDNA) libraries derived from bones and teeth often contain <1% endogenous DNA, meaning that the majority of sequencing capacity is taken up by environmental DNA. We will present a method for the targeted enrichment of the endogenous component of human aDNA sequencing libraries. Using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to significantly enrich for human-derived DNA fragments. This approach, which we call whole-genome in-solution capture (WISC), allows us to obtain genome-wide ancestral information from ancient samples with very low endogenous DNA contents. We demonstrate WISC on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased dramatically, with up to 59% of reads mapped to human and fold enrichment ranging from 5X to 139X. Furthermore, we maintained coverage of the majority of fragments present in the pre-capture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062-147,243) for the post-capture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217-73,266) for the pre-capture libraries, increasing resolution in population genetic analyses. We will also present the results of performing WISC on other aDNA libraries from both archaic human and non-human samples, including ancient domestic dog samples. 
Our capture approach is flexible and cost-effective, allowing researchers to access aDNA from many specimens that were previously unsuitable for sequencing. Furthermore, this method has applications in other contexts, such as the enrichment of target human DNA in forensic samples.
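For readers who want the arithmetic behind "fold enrichment", a minimal sketch follows. It assumes the usual definition (post-capture fraction of human-mapping reads divided by the pre-capture fraction), which the abstract does not spell out. The example pairs the abstract's average pre-capture figure (1.2%) with its maximum post-capture figure (59%), so the result is illustrative and does not correspond to any particular library.

```python
def fold_enrichment(pre_fraction, post_fraction):
    """Ratio of human-mapping read fractions after vs. before capture."""
    if pre_fraction <= 0:
        raise ValueError("pre-capture fraction must be positive")
    return post_fraction / pre_fraction


# Hypothetical library: 1.2% endogenous before capture, 59% after.
print(round(fold_enrichment(0.012, 0.59), 1))
```

Because the ratio depends on the starting fraction, a library that begins with very little endogenous DNA can show a large fold enrichment while still ending up with a modest post-capture percentage.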
Insights into population history from a high coverage Neandertal genome. 
D. Reich, for the Neandertal Genome Consortium
   We have sequenced to about 50-fold coverage a genome sequence from about 40 mg of a bone found in Denisova Cave in Southern Siberia. The genome of this female is much more closely related to the low-coverage Neandertal genomes from Croatia, Spain, Germany and the Caucasus than to the genome of archaic Denisovans, a sister group of Neandertals, and provides unambiguous evidence that both Neandertals and Denisovans inhabited the Altai Mountains in Siberia. The high-coverage Neandertal genome, combined with our earlier sequencing of a high quality Denisova genome, allows novel insights about the population history of archaic humans:
    • We document recent inbreeding in this Altai Neandertal. The inbreeding coefficient of about 1/8 corresponds to about the homozygosity that would be expected from a mating of half siblings. 
    • The Altai Neandertal genome shares almost seven percent more derived alleles with present-day Africans than does the Denisova genome. This means that the Denisovans derived a proportion of their ancestry from a very archaic human lineage, and the amount of this ancestry they inherit is larger than in Neandertals. 
    • The Denisovan genome is affected by major recent gene flow from an Altai-related Neandertal. 
    • To further characterize the variation among Neandertals we sequenced the genome of a Neandertal from the Caucasus to about 0.5-fold coverage. Comparisons to present-day genomes show that the Neandertals who contributed genes to present-day non-Africans were more closely related to this Caucasian Neandertal than to the Neandertals we sequenced from the Altai. 
    • We built a map of Neandertal ancestry in modern humans, using data from all non-Africans in the 1000 Genomes Project. We show that the average Neandertal ancestry on chromosome X of present-day non-Africans is about a fifth of the genome average. It is known that hybrid incompatibility loci concentrate on chromosome X. Thus, this observation is consistent with a model of hybrid incompatibility in which Neandertal variants that introgressed into modern humans were rapidly selected away due to epistatic interactions with the modern human genetic background.
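The 1/8 figure quoted for the Altai Neandertal in the list above follows from Wright's path-counting formula, F = Σ (1/2)^(n1 + n2 + 1) × (1 + F_A), summed over each common ancestor of the two parents. The sketch below is a generic illustration of that formula and is not taken from the abstract; the function name and the pedigree encoding are assumptions made for the example.

```python
from fractions import Fraction


def inbreeding_from_paths(paths):
    """Wright's path-counting inbreeding coefficient.

    `paths` is a list of (n1, n2, f_ancestor) tuples, one per common
    ancestor: n1 and n2 are the numbers of generations from each parent
    back to that ancestor, and f_ancestor is the ancestor's own
    inbreeding coefficient.
    """
    total = Fraction(0)
    for n1, n2, f_ancestor in paths:
        total += Fraction(1, 2) ** (n1 + n2 + 1) * (1 + Fraction(f_ancestor))
    return total


# Half siblings share one parent: a single common ancestor, one
# generation back on each side.
print(inbreeding_from_paths([(1, 1, 0)]))             # 1/8
# Full siblings share both parents: two such paths.
print(inbreeding_from_paths([(1, 1, 0), (1, 1, 0)]))  # 1/4
```

Other pedigrees give the same value (an uncle-niece mating, for instance, contributes two paths of (1/2)^4 each), so a coefficient of about 1/8 is compatible with a half-sibling mating without uniquely identifying it.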
Inferring complex demographies from PSMC coalescent rate estimates: African substructure and the Out-of-Africa event.
S. Gopalakrishnan et al.
   Human population history is an intriguing and complex story with many events like population growth, bottlenecks, time-dependent/non-homogeneous migration, population splits and mixtures. Estimating complete demographies with population sizes, rates of gene flow and population split times has proven to be a challenging endeavor. We propose a framework for jointly estimating the demography parameters, especially gene-flow rates and split times, for a large number of populations. We use coalescent rate estimates obtained from Pairwise Sequentially Markovian Coalescent (PSMC) as the starting point for our analysis. Since PSMC works on only two chromosomes at a time, we apply PSMC to all pairs of individuals to obtain the pairwise coalescent rates for lineages from every pair of sampled populations. Using a mathematical model for calculating coalescent probabilities given population parameters, we estimate demography using the parameters that best fit the observed coalescent rates.
   In this study, we focus on two aspects of African population genetics, 1. the nature of population structure in Africa going back in time and 2. the timing of the Out-of-Africa event. To address these questions, we assembled a dataset with whole genome sequences from 162 individuals using both in-house sequencing and publicly available sources. These samples span 22 populations worldwide. These include eleven African populations which we use to dissect the population substructure in Africa. In addition, we also have 2 Middle Eastern, 5 European and 4 East/Central Asian populations which inform the population split time estimates for the Out-of-Africa event and the European-Asian split.
   We find extensive population structure in Africa extending back to before the Out-of-Africa event. The Ethiopian populations, Amhara and Oromo, show evidence of mixing beyond 15 kya. The Maasai and Luhya merge with the Ethiopian populations to form a panmictic East African population ~40 kya. We find evidence for extensive mixing between east and west African populations before 50 kya. Among the pygmy populations, we see recent gene flow between the Batwa and Mbuti. All African populations except the San merge into a single population around 110 kya. The San exchange migrants with the other African populations beginning ~120 kya. We estimate the Out-of-Africa event to have occurred ~75 kya and the European-Asian split ~25 kya.
Out of Africa, which way? 
L. Pagani et al.
While the African origin of all modern human populations is well-established, the dynamics of the diaspora that led anatomically modern humans to colonize the lands outside Africa are still under debate. Understanding the demographic parameters as well as the route (or routes) followed by the ancestors of all non-Africans could help to refine our understanding of the selection processes that occurred subsequently, as well as shedding light on a landmark process in our evolutionary history. Of the three possible gateways out of Africa (via Morocco across the Gibraltar strait, via Egypt through the Suez isthmus or via the Horn of Africa across Bab el Mandeb strait) only the latter two are supported by paleoclimatic and archaeological evidence. Furthermore, recent studies (Pagani et al. 2012) showed that, although the modern Ethiopian populations might be good candidates for the descendants of the source population of such a migration, modern Egyptians could be an even better candidate. Unfortunately, however, only a few Egyptian samples have been genotyped and, as yet, none have been fully sequenced. Here, we have generated 125 Ethiopian and 100 Egyptian whole genome sequences (Illumina HiSeq, 8x average depth). The genomes were partitioned using PCAdmix (Brisbin et al. 2012) to account for the confounding effects of recent introgression from neighboring non-African populations. To explore the genetic legacy of migration routes out of Africa, and in particular to test whether the observed genetic data support one route over another, the African components of Egyptians and Ethiopians were then compared to a panel of available non-African populations from the 1000 Genomes Project (1000 Genomes Project Consortium, 2012). The high resolution provided by whole genome sequencing allows us to shed new light on the paths followed by our ancestors as they left Africa, as well as refining the current knowledge of the demographic history of the populations analyzed.
The Saudi Arabian Genome Reveals a Two Step Out-of-Africa Migration. 
J. J. Farrell et al.
   Here we present the first high-coverage whole genome sequences from a Middle Eastern population consisting of 14 Eastern Province Saudi Arabians. Genomes from this region are of interest to further answer questions regarding “Out-of-Africa” human migration. Applying a pairwise sequentially Markovian coalescent model (PSMC), we inferred the history of population sizes between 10,000 years and 1,000,000 years before present (YBP) for the Saudi genomes and an additional 11 high-coverage whole genome sequences from Africa, Asia and Europe.
   The model estimated the initial separation from Africans at approximately 110,000 YBP. This intermediate population then underwent a long period of decreasing population size culminating in a bottleneck 50,000 YBP followed by an expansion into Asia and Europe. The split and subsequent bottleneck were thus two distinct events separated by a long intermediate period of genetic drift in the Middle East. The two most frequent mitochondrial haplogroups (30% each) were the Middle Eastern U7a and the African L. The presence of the L haplogroup common in Africa was unexpected given the clustering of the Saudis with Europeans in the phylogenetic tree and suggests some recent African admixture. To examine this further, we performed formal tests for a history of admixture and found no evidence of African admixture in the Saudis after the split. Taken together, these analyses suggest that the L3 haplogroup found in the Saudis was present before the bottleneck 50,000 YBP. Given the TMRCA estimates for the L3 haplogroup of approximately 70,000 YBP and the timing of the Out-of-Africa split, these analyses suggest that the L3 haplogroup arose in the Middle East with a subsequent back migration and expansion into Africa over the Horn-of-Africa during the lower sea levels found during the glacial period bottleneck.
    These results are consistent with the hypothesis that modern humans populated the Middle East before a split 110,000 YBP and underwent genetic drift for 60,000 years before expanding to Asia and Europe as well as migrating back into Africa. Examination of genetic variants discovered by Saudi whole genome sequencing in ancestral African populations and European/Asian populations will contribute to the understanding of human migration patterns and the origin of genetic variation in modern humans.
 Geographic Population Structure (GPS) of worldwide human populations infers biogeographical origin down to home village
E. Elhaik et al.
The search for a method that utilizes biological information to predict human’s place of origin has occupied scientists for millennia. Modern biogeography methods are accurate to 700 km in Europe but are highly inaccurate elsewhere, particularly in Southeast Asia and Oceania. The accuracy of these methods is bound by the choice of genotyping arrays, the size and quality of the reference dataset, and principal component (PC)-based algorithms. To overcome the first two obstacles, we designed GenoChip, a dedicated genotyping array for genetic anthropology with an unprecedented number of ~12,000 Y-chromosomal and ~3,300 mtDNA SNPs and over 130,000 autosomal and X-chromosomal SNPs carefully chosen to study ancestry without any known health, medical, or phenotypic relevance. We also genotyped 615 individuals from 54 worldwide populations collected as part of the Genographic Project and the 1000 Genomes Project. To overcome the last impediment, we developed an admixture-based Geographic Population Structure (GPS) method that infers the biogeography of worldwide individuals down to their village of origin. GPS’s accuracy was demonstrated on three data sets: worldwide populations, Southeast Asians and Oceanians, and Sardinians (Italy) using 40,000-130,000 GenoChip markers. GPS correctly placed 80% of worldwide individuals within their country of origin with an accuracy of 87% for Asians and Oceanians. Applied to over 200 Sardinian villagers of both sexes, GPS placed a quarter of them within their villages and most of the remainder within 50 km of their villages, allowing us to identify the demographic processes that shaped the Sardinian society. These findings are significantly more accurate than PCA-based approaches. We further demonstrate two GPS applications in tracing the poorly understood biogeographical origin of the Druze and North American (CEU) populations. Our findings demonstrate the potential of the GenoChip array for genetic anthropology. 
Moreover, the accuracy and power of GPS underscore the promise of admixture-based methods for biogeography and have important ramifications for genetic ancestry testing, forensic and medical sciences, and genetic privacy.

September 05, 2012

East to West across Eurasia

A couple more interesting abstracts from DNA in Forensics 2012.


Genetic journey of the N1c haplogroup
Pamjav H, Nemeth E, Feher T, Volgyi A
Binary and Y-STR polymorphisms associated with the NRY region of the human Y chromosome preserve the paternal genetic legacy that has persisted to the present, permitting inference of human evolution, population migration and demographic history. The NRY region of the Y chromosome acts much like mtDNA to reveal the structure among human populations and possibly to infer the order and timing of their descents. In the present study, we have investigated the origin of haplogroup N1c-Tat phylogeographic structure and the genetic relationship of Eurasian populations by examining STR variation in a large number of individuals. We have identified 54 samples as the haplogroup N1c-Tat from 5 population groups (N=632). To place the results into a wider geographic context, we included 209 samples from published sources and 296 samples from the FTDNA public database into the phylogenetic analysis. According to previous studies haplogroup N-M231 is of East Asian ancestry. Our results suggest that N1c-Tat mutation probably originated in South Siberia 8-9 thousand years ago and had spread through the Urals into the European part of present-day Russia. Its distribution is not fully correlated with the spread of Uralic languages. Turkic-speaking ethnic groups in South Siberia have high N1c-Tat presence and STR variance, while the N1c-L550 subgroup largely occurs among non-Uralic-speaking European populations. Only the European N1c-Tat (xL550) subgroup can be linked to the spread of Finno-Ugric languages from the Kama-Urals area ~6,000 years ago. The subgroup N1c-L550 cannot be considered Finno-Ugric origin and its carriers might have been assimilated by Indo-European groups, resulting in their spread across Europe in historical times with Vikings and Balto-Slavs. 
Based on the present study, Buryats were dominated by a young, about 800-year-old N1c-Tat cluster, which suggests that this ethnic group could be a relatively recent admixture of Mongolian conquerors with a Paleo-Siberian population group.
Of course these ages should be taken with a grain of salt because it is unclear how they were derived (i.e., whether the "evolutionary mutation rate" was used). Hopefully, someone will treat the  subject of N1c ages with Y-SNPs that do not have the problem of saturation that affects microsatellites. This is an interesting test case, because a ~3-fold change in ages will have important consequences for our understanding of the spread of Finno-Ugric languages into Europe: an earlier date would associate them with the Comb Ceramic, while a later, Bronze Age date would associate them with the Seima-Turbino phenomenon.


Huns in Bavaria? Genetic analyses of an artificially deformed skull from an early medieval cemetery in Burgweinting (Regensburg, Germany)

Schleuder R, Wilde S, Burger J, Grupe G, Forster P, Harbeck M
The morphological examination of an early medieval burial site in Burgweinting, which is dated to the end of the 5th century, revealed one female with an artificially, circularly deformed skull, a practice that is thought to be associated with the arrival of Nomads of the Eurasian steppe, particularly the Huns.    

Individuals with such artificial cranial deformations also can be found in other Late Roman and Early Medieval cemeteries in Europe, mostly in the Carpathian basin but only as a few isolated cases in Western Europe, where mostly women show such deformations.  
Regarding the artificial cranial deformations it is unclear whether a foreign custom was taken over by Germanic tribes or whether the individuals were members or descendants of Eurasian nomads.  
With the help of the find of Burgweinting, we exemplarily investigated this question. To identify the possible foreign origin of this female with alleged “Asian” skull deformation we sequenced the HVRI and HVRII region of the mitochondrial DNA.  
Our results show that the ancestry of a woman with artificially deformed skull can be linked to an at least partly Asian origin. So this indicates that at least some of the few individuals with skull deformation had not adopted the custom but can be seen as former members or descendants of the Hunnish tribal community.   
It will be worthwhile if geneticists can co-operate with physical anthropologists and/or archaeologists more broadly in cases where morphology or burial customs indicate that a possibly heterogeneous population exists at that site. The above is a good example of that synergy in action.

March 17, 2010

Abstracts from AAPA 2010

Some abstracts from the upcoming (April 14-17) meeting of the American Association of Physical Anthropologists.

Why are pygmies small? An anthropometrical and anthropogenetical question
NOEMIE BECKER et al.
Pygmy populations from central Africa have the shortest stature worldwide. The name “pygmy” indeed comes from the Greek “pugmaios” that is a measure of length. This reduced stature has been the subject of numerous endocrinological studies and many evolutionary hypotheses have suggested that this phenotype was an adaptation to the rainforest (hot, humid and dense environment), to alimentation or due to life history trade-offs (high mortality). We have anthropometrical data for a sample of more than 1000 individuals from 7 pygmy populations and 3 neighbouring farmer populations from Gabon, Cameroon and Central African Republic. DNA samples are also available for a large number of individuals. The analysis of anthropometrical data shows that all pygmy groups have a male mean stature under 160 cm (this was used in the definition settled by Cavalli-Sforza in 1986) and that a high variability exists between various pygmy populations. Verdu et al. (2009) published a genetic analysis based on neutral microsatellites on the same populations and found that pygmies present a variable admixture proportion with nonpygmies. Comparing this data with our anthropometrical data at the individual level we find a strong correlation between level of admixture and stature, thus strongly supporting the existence of a genetic component in pygmy short stature. We developed a candidate-gene approach to search for such genetic factors and will present current results on various genes located in the GH-IGF1 axis.
New evidence on headshaping from the Early Byzantine Maroneia in Thrace, Greece.
PARASKEVI TRITSAROLI

The first case of headshaping from Early Byzantine Greece was identified in 2006 at the cemetery of Maroneia (5th-6th c. A.D.). Biocultural evidence suggested the presence of a female individual culturally linked to Hunnic traditions. This paper analyzes the second case of headshaping on a female skeleton uncovered in 2009 and allows for the wider discussion of the presence of a larger group related to the Huns in the city of Maroneia. The skull was examined by combining macroscopic observation and x-ray. Points of pressure are recorded in the frontal, post-coronal and occipital regions resulting in an undulation of diploic bone. Possible bilateral pressure on the frontal bone has produced an artificially narrowed frontal. The skull extends posterosuperiorly. These features suggest the application of bandaging producing circular modification. Both headshaped skulls exhibit the same type of modification. Similarly, both women were buried in a supine position, without offerings, just like the remaining 36 deceased individuals in the cemetery of Maroneia. Headshaping was unknown among Byzantine customs. On the contrary, the Huns who attacked the Balkans twice and who unsuccessfully threatened Maroneia in 411 practiced a pronounced form of circular headshaping. Consequently, biocultural evidence strongly supports the hypothesis that a group linked to the Huns was installed at the city and was assimilated into this Early Byzantine society. Future biogeochemical analysis needs to be undertaken in order to investigate migration patterns. However, headshaping reflects the cosmopolitan character of Maroneia, an important urban center in a province of the Byzantine Empire.


The genetic legacy of indigenous Caribbean peoples: Evidence from autosomal and mitochondrial data.
JADA BENN TORRES et al.

Archeological evidence suggests that autochthonous peoples began to migrate into the eastern island chain in the Caribbean, known as the Lesser Antilles, as early as 7200 years BP. Upon the arrival of Europeans, an estimated 2-4 million people lived on these islands. Within 32 years of contact, the native populations had virtually disappeared from the region due to European-introduced disease, abuse, and genocide. This led many scholars to conclude that indigenous Caribbean people had become extinct. However, small pockets of indigenous communities have survived and are present today on several Lesser Antillean islands. Furthermore, ethnohistoric data suggests that gene flow occurred between autochthonous peoples and enslaved Africans beginning in the colonial period. In this study, we examine the genetic legacy of autochthonous Caribbean peoples from the Lesser Antilles in contemporary African-Caribbean populations as evidenced from mitochondrial data and novel autosomal data. A total of 516 individuals from eight Caribbean islands were typed for 109 ancestry informative markers and a subset of individuals were also typed for their mitochondrial haplogroup. Mitochondrial haplogroups indicate that 5% of the sample has indigenous ancestry while admixture estimates from autosomal markers show 4% indigenous ancestry. Both lines of data suggest that despite the dramatic postcontact decline in population size, indigenous Caribbean people have made notable genetic contributions to contemporary African-Caribbean populations. Furthermore, these genetic contributions vary according to the genetic system typed and across the islands.
Chuvash origins: Evidence from mtDNA Markers.
ORION M. GRAF et al.

A sample of 96 unrelated individuals from Chuvashia, Russia was sequenced for hypervariable region-I (HVR-I) of the mtDNA molecule. The Chuvash speak a Turkic language that is not mutually intelligible to other extant Turkish groups, and their genetics are distinct from Turkic-speaking Altaic groups. Some scholars have suggested that they are remnants of the Golden Horde, while others have advocated that they are the products of admixture between Turkic and Finno-Ugric speakers who came into contact during the 13th century. Earlier genetic research using autosomal DNA markers suggested a Finno-Ugric origin for the Chuvash. This study examines non-recombining DNA markers to better elucidate their origins. The majority of individuals in this sample exhibit haplogroups H (31%), U (22%), and K (11%), all representative of western and northern Europeans, but absent in Altaic or Mongolian populations. Multidimensional scaling (MDS) was used to examine distances between the Chuvash and 8 reference populations compiled from the literature. Mismatch analysis showed a unimodal distribution. Along with neutrality tests (Tajima’s D (-1.43365) p less than 0.05, Fu’s FS (-25.50518) p less than 0.001), the mismatch distribution is suggestive of an expanding population. These tests suggest that the Chuvash are not related to the Altai and Mongolia along their maternal line but support the “Elite” hypothesis that their language was imposed by a conquering group, leaving Chuvash mtDNA largely of Eurasian origin with a small amount of Central Asian gene flow. Their maternal markers appear to most closely resemble Finno-Ugric speakers rather than fellow Turkic speakers.
Population history and substructure of Anatolia and Turkey as evidenced by craniofacial diversity.
NORIKO SEGUCHI et al.

Anatolia, the Asian segment of Turkey, is an area of evolutionary importance for human groups who used this corridor as a bridge for migration between the Caucasus, Western Asia and Europe since Lower Paleolithic times. Historically, Anatolia has been occupied by diverse civilizations, including the Byzantine and Ottoman Empires. This study is an attempt to understand Turkish population substructure and history by examining craniofacial diversity through several temporal periods framed within a population genetic model. If the region of Anatolia has been used as a migratory corridor for peoples spanning disparate geographic areas (Balkans, Central Asia, and East Asia), then gradual craniofacial change is expected due to these migrations coupled with extensive admixture. Studies using mtDNA indicate a pre-Neolithic expansion resulting in extensive migration, while Y chromosome studies reveal haplogroup clustering and gene flow from the Caucasus with less admixture from Central and East Asia. Overall, our results indicate minimal Turkish population substructure. When crania were separated into sex, our results are consistent with uniparental marker population history. Female crania show a distinctness with modern groups and are actually more similar to Neolithic European and Near Eastern populations. This would indicate a relatively stable female population in Anatolia since Neolithic times. Male crania are more heterogeneous and cluster within a larger geographic zone of Eurasia and the Near East consistent with greater male migration. There is little support for admixture from Central or East Asian groups. These results support the hypothesis for a Turkic language displacement with insignificant genetic exchange.
Genetic analyses reveal a history of serial founder effects, admixture between long-separated founding populations in Oceania, and interbreeding with archaic humans.
SARAH JOYCE, KEITH HUNLEY

Genetic anthropologists continue to debate whether human neutral genetic variation primarily reflects a continuum of demes connected by local gene flow or colonization and serial founder effects. A second unresolved issue concerns the genetic contribution of archaic species to the modern human gene pool. Some studies suggest that this contribution was substantial and that it played an important role in human adaptation. These issues remain unresolved because of inadequacies and biases in datasets, problems in statistical methodology, and the failure to recognize that different evolutionary processes may produce similar outcomes. This study redresses these limitations by analyzing gene identity within and between populations in a dataset comprised of 614 STRs assayed in 1,983 people from 99 widespread populations. Our strategy is to fit hierarchical models to these data and examine residual deviations from the models. Each model involves nesting smaller units such as populations into larger units such as continental regions. It is possible to restate many of these models as either expansions or reductions of each other and thereby identify aspects of population structure that have had a major impact on the overall pattern of diversity. The strong fit of a model estimated using the Neighbor Joining algorithm indicates that human genetic diversity primarily reflects a history of successive founder effects associated with our exodus from Africa, not a continuum of demes connected by gene flow. Residual deviations from the model suggest: 1) the genomes of Oceanic peoples are the product of two independent waves of migration to the region and admixture, and 2) genetic exchange occurred between archaic and modern humans after their initial divergence.
Correlations between genetic ancestry and superficial traits indicate substantial admixture stratification in Brazil.
LAUREL N. PEARSON et al.

Brazil is one of the most admixed countries in the world. How this admixture affected the distribution of genetic ancestry across Brazilian ethnic (“Color”) groups is a fundamental question which to date has only received minimal attention. In an effort to systematically study variation in genetic ancestry in Brazil, we collected DNA and various phenotypic measures from 596 volunteers in Brasilia, Brazil. Participants were asked to provide their self-described “Color” as defined by the Brazilian census (Preta/Black, Parda/Brown, Branca/White, Indigena/Indigenous, Amarela/Yellow). Phenotype data was collected from each subject including hair texture, high-resolution eye photographs, skin and hair color by reflectometry, and three-dimensional facial photographs. To estimate genomic ancestry, DNA from each participant was genotyped using 176 ancestry informative markers (AIMs), autosomal SNPs with large frequency differences between parental populations known to contribute to Brazilian admixture (West African, East Asian, European and Indigenous American). Although genomic ancestry shows significant overlap across “Color” groups, there are highly significant differences in average proportional ancestry. Additionally, analyses comparing trait values and genetic ancestry show significant correlations consistent with expectations of populations stratified with respect to genetic ancestry. Ethnographic research indicates that designations of “Color” are fluid and largely based on physical traits as opposed to known ancestry. This likely contributes to the observed ancestry overlap between ethnic groups and the strong association between phenotype and group. This study emphasizes the importance of genetic marker based estimates of ancestry as well as objective assessment of superficial traits in understanding the admixture process.
Geographic structure of genetic variation in North America: Population fissions and European admixture.
KARI BRITT SCHROEDER et al.

A satisfactory understanding of how modern Native North American populations are biologically related to each other requires increased sampling of populations and/or genetic markers and testing of the fit of different models of population structure. To this end, we combine new autosomal microsatellite data from Native North American populations with previously published data. Using J.C. Long’s Generalized Hierarchical Modeling software, we evaluate the fit of different trees to the data. Although we observe a correlation between population pairwise genetic and geographic distances, as expected with a long-term process of isolation by distance, we show that this correlation likely results from geographically-structured population fissions. This pattern could result from the initial peopling of North America or from a later process. The magnitude of European ancestry in the sampled populations, as estimated with the software structure, varies drastically among geographic regions, and may limit our ability to use modern genetic variation to investigate Native North American prehistory. This study was funded by the Wenner-Gren Foundation for Anthropological Research, grant number 7580 to K.B. Schroeder and D.G. Smith, and by the National Science Foundation, grant BCS-0422144 to R.S. Malhi, B.M. Kemp, and D.G. Smith.
Coalescent modeling of Yakut origins points to small founding population based on mtDNA variation.
MARK ZLOJUTRO et al.

Based on archaeological and ethnohistorical evidence, the Yakut people of northeastern Siberia are considered to be descendants of ancient Turkic-speaking populations once living in the distant Altai-Sayan region on the Russian-Mongolian border. The results of phylogeographic studies on Siberian mtDNA variation have been generally concordant with a southern Yakut origin, although the timing of the northern migration, the size of the founder group and the degree of genetic admixture with non-Turkic Siberian populations are less apparent. In an effort to better understand Yakut origins, we modeled 25 demographic scenarios, including parameters such as effective population size, growth rate and gene flow, and tested by coalescent simulation whether any are consistent with the patterns of mtDNA diversity observed in present-day Yakuts. The models consist of either two simulated demes that represent Yakuts and a South Siberian ancestral population, or three demes that also include a regional Northeast Siberian population that served as a source of localized gene flow into the Yakut deme. The model that produced the best fit to the observed data defined a founder group with an effective female population size of only 150 individuals, migrating northwards approximately 1,000 years BP and undergoing significant admixture with neighboring populations in Northeastern Siberia. These simulation results indicate a pronounced founder effect that was primarily kin-structured and reconcile reported discrepancies between Yakut mtDNA and Y chromosome diversity levels.
The role of selection-nominated candidate genes in determining Indigenous American skin pigmentation.
ELLEN E QUILLEN et al.
World-wide variability in skin pigmentation has been a subject of anthropological inquiry from the beginning of our discipline. Recent genomic studies indicate that skin pigmentation is one of the most rapidly evolving phenotypes in many human populations and that genes underlying skin pigmentation have been subject to some of the most extreme selective pressures of any genes in the human genome. Unlike previous research, this study both identifies pigmentation genes that have undergone selection in Indigenous American populations and tests the influences of these genes on skin color in admixed individuals. 906,600 single nucleotide polymorphisms (SNPs) were surveyed for signatures of selection in indigenous populations from Central and South America. Evidence of selection was identified by comparison to HapMap Phase I populations using reduction in heterozygosity (lnRH), Locus-Specific Branch Length (LSBL), Tajima’s D, and haplotype block structure. In the 12 pigmentation candidate genes that show the strongest evidence of selection (ADAM17, POMC, AP3B1, OPRM1, SILV, OCA2/HERC, PLDN, MYO5A, RAB27A, CYP1A2, ATRN, and ASIP), 48 SNPs selected to represent the overall variation in the selection nominated candidate genes were genotyped in individuals of admixed Indigenous American and European ancestry. These SNPs show substantial allele frequency differences between the parental populations. Using admixture based regression model analyses, genes contributing to darker skin pigmentation in Indigenous Americans were found. This study not only identified skin pigmentation genes contributing to skin color variation in previously understudied Indigenous American populations, it validated the usefulness of using population genetic tests of selection to identify functional genes. This study was generously funded by the National Science Foundation Dissertation Improvement Grant 0925976.

September 20, 2009

History of the people of the Hungarian plain in the 1st millennium

Hum Biol. 2008 Dec;80(6):655-67

History of the peoples of the Great Hungarian Plain in the first millennium: a craniometric point of view

Holló G, Szathmáry L, Marcsik A, Barta Z.

We carried out an examination relying on six dimensions of 1,573 crania coming from the Great Hungarian Plain. The crania represent seven archeological periods: Sarmatian age (1-4th century), the period of transition (about 400-420), Hun and Gepidic epochs (about 420-455 and 455-567, respectively), early Avar age (about 568-670), late Avar period (about 670-895), the epoch of the Hungarian conquest and settlement (about 895-1000), and the Arpadian age (about 1000-1301). We were curious about the anatomical background behind cultural changes of the various populations that inhabited this area. After having noticed some discontinuities between the populations, as revealed by univariate analysis of single dimensions, we performed a principal-components analysis to see whether or not the diverse components showed eventual breaks in the sequence of the populations. Knowing that all the dominant populations had Asian roots, except for the Gepids of Germanic origin, we expected a considerable difference between the Gepidic population and all the other inhabitants. We also assumed that a conquest itself with a large-scale assimilation was unlikely to leave breaklike traits in anatomical patterns, except for aggressive conquests. We found that the second principal component (which correlated with cranial breadth and partly with height) showed a remarkable hiatus in both sexes between Gepids and early Avars. Having done a statistical proof (simultaneous tests for general linear hypotheses) of the observed phenomenon, we found that the gap referring to subsequent populations was significant only in males. A possible reason for this result is that the Avar conquest was much more radical than has been thought. In addition, considering that men were more likely to die in wars, women survived and were assimilated into the conquerors' populations with higher probability, so it is not surprising that the results of multicomparison tests are significant only in men.

Link

November 07, 2006

Mongoloid components in Eastern Europe

Slavs and other eastern Europeans are typically Caucasoid, although one not infrequently finds among them individuals with certain attenuated Mongoloid influences. The extent of Mongoloid admixture in eastern Europe will eventually be determined by autosomal admixture studies which sample the relevant source populations of the Mongoloid component in Europe, namely the Uralic and Altaic speakers of Siberia and Central Asia.

At present, the only genomic study of a Slavic sample of Russians (Science 20 December 2002: Vol. 298. no. 5602, pp. 2381 - 2385) determined a 93% membership coefficient in the main Caucasoid cluster, with a 3% membership in the main (East Asian) Mongoloid cluster. Unfortunately, Central Asian Turkic and Finno-Ugrian populations from Europe and Asia were not sampled.

The presence of Mongoloid mtDNA types in East Europe is well established, but it should be remembered that movements from the east did not usually involve large numbers of women (*). Therefore, one expects that inference from mtDNA will underestimate the total number of immigrants.

Moreover, as I have pointed out before, Turkic speakers of Central Asia were likely to possess majority components of Caucasoid Y chromosomes associated with Mongoloid mtDNA components. Today, haplogroup C chromosomes make up a large component of Y-chromosome variation in Central Asia (including the famous "Genghis Khan" line), but these were probably added from the east late in history, since the Mongol expansion came at the end of the great period of Altaic migrations to the west (Huns, Seljuks, Ottomans, Bulgars, etc.).

As a result a proportion of the eastern immigrants into Europe would be undetectable with Y-chromosome markers, namely the substantial fraction with Caucasoid Y chromosomes and Mongoloid mtDNA. The male immigrants of this type would impart their Y chromosomes in the regions they invaded, but not their maternal mtDNA. In the Altai-Kizhi group, for example, 71% Caucasoid Y chromosomes are associated with 76% non-Caucasoid mtDNA.

Consider a population with 3/4 Caucasoid Y chromosomes and 3/4 Mongoloid mtDNA. Consider that the migrant group consists of 3/4 men and 1/4 women. Under such circumstances we would expect approximately the same rate of Mongoloid mtDNA and Y chromosomes in the recipient population. Moreover, the inferred admixture proportion from the frequencies of Mongoloid mtDNA and Y chromosomes would underestimate the true rate of Mongoloid admixture by a factor of 2.
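The first part of this arithmetic can be checked with a short sketch. The function name `introduced_lineages` is hypothetical, and the sketch assumes that only men pass on Y chromosomes and only women pass on mtDNA. It covers only the claim that the two Mongoloid lineage rates come out equal; the factor-of-2 statement depends on further assumptions about the autosomal makeup of the source population, which are not modeled here.

```python
from fractions import Fraction


def introduced_lineages(frac_men, source_mongoloid_y, source_mongoloid_mt):
    """Per-migrant rate of transmissible Mongoloid Y and mtDNA lineages.

    frac_men            -- fraction of the migrant group that is male
    source_mongoloid_y  -- Mongoloid Y-chromosome frequency in the source
    source_mongoloid_mt -- Mongoloid mtDNA frequency in the source
    """
    frac_women = 1 - frac_men
    # Only men carry (and pass on) Y chromosomes
    y_rate = frac_men * source_mongoloid_y
    # Only women pass their mtDNA to the next generation
    mt_rate = frac_women * source_mongoloid_mt
    return y_rate, mt_rate


# Numbers from the example: 3/4 Caucasoid Y (so 1/4 Mongoloid Y),
# 3/4 Mongoloid mtDNA, and a migrant group that is 3/4 men.
y_rate, mt_rate = introduced_lineages(
    Fraction(3, 4), Fraction(1, 4), Fraction(3, 4)
)
```

With these inputs both rates are 3/16 per migrant: the many men mostly carry Caucasoid Y chromosomes, while the few women mostly carry Mongoloid mtDNA, so the two effects cancel out.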

Unfortunately, the presence of Mongoloid Y chromosomes has not been properly studied until now. In the recent Y chromosome study of the Czech Republic for example, the main Mongoloid Y-haplogroups (C, Q, O) likely to have accompanied the women bearing the 3% Mongoloid mtDNA were not examined and could be part of the Y*, P*, and K* paragroups. Similarly, none of these haplogroups were studied in a recent study on Poland and Germany.

In conclusion:
  • The mtDNA evidence suggests a very low-level introgression of Mongoloid components into Eastern Europe.
  • The extent of this admixture is likely to be underestimated because of the genetic profile of the source population and the excess of male migrants.
  • The best estimate of the admixture rate will come from autosomal studies that sample relevant Uralic-Altaic source populations, but it is unlikely to amount to more than a few percentage points.
(*) Except in folk migrations such as those of the Kalmyks.

November 06, 2006

ASHG 2006 abstracts

The meeting of the American Society of Human Genetics took place this October and the abstracts of the meeting are online in a big pdf file. A few items of interest:

The genetic variation and population history in the Baltic Sea region
Sharp genetic borders within a geographically restricted region are known to exist among the populations around the northern Baltic Sea on the northern edge of Europe. We studied the population history of this area in greater detail from paternal and maternal perspectives with Y chromosomal and mitochondrial DNA markers. Over 1700 DNA samples from Finland, Karelia, Estonia, Latvia, Lithuania and Sweden were genotyped for 18 Y-chromosomal biallelic polymorphisms and 8 microsatellite loci, together with 18 polymorphisms from the coding area of mtDNA and sequencing of the HVR1. Y chromosomal haplogroups from the biallelic data indicate both various phases of gene flow and existence of genetic barriers within the Baltic region. Haplogroup N3, being abundant on the eastern side of the Baltic, differentiates between eastern and western sides of the Baltic Sea, just like R1b that has a reverse frequency pattern to N3. The typically Scandinavian haplogroup Ia1 has a high frequency of up to 40%, separating not only Sweden but also Western Finland from the other populations. The frequency of haplogroup R1a1, most characteristic to Slavic peoples, varied substantially across the populations. In addition to biallelic markers, Y-chromosomal microsatellite loci were analyzed for a more detailed approach to the history of the paternal lineages in the region. We also analyzed mtDNA markers with special interest for sub-haplogroups of H and U, that among other haplogroups, show substantial variation between the populations (e.g. haplogroups H1, H2, T and J1). In conclusion, our current Y-chromosomal and mtDNA data suggest various incidents of gene flow from different sources, each reaching partly different areas of the Baltic region, which can be thus seen as a meeting point of a not only culturally but also genetically diverse set of populations.
Asian Nomads traces in the mitochondrial gene pool of Slavs.
Mitochondrial DNA (mtDNA) variability was studied in a sample of 179 individuals representing the Czech population from west Bohemia. MtDNA analysis revealed that the majority of Czech mtDNAs belong to the common West Eurasian mitochondrial haplogroups. However, about 3 per cent of Czech mtDNAs encompass East Eurasian lineages (A, N9a, D4, M*). Comparative analysis of published data has shown that different Slavonic populations contain a small but marked amount of East Eurasian mtDNAs (e.g. 1.3 per cent in Eastern Slavs, 1.8 per cent in Western Slavs, and 1.2 per cent in Southern Slavs). It is noteworthy that Baltic populations (Latvians, Lithuanians and Estonians) have avoided a marked influence of maternal lineages of East Eurasian origin (0.3-0.6 per cent). The two East Eurasian mtDNA haplogroups, Z1 and D5, are present in gene pools of North European Finnic populations (Saami, Finns, and Karelians). Unlike them, Slavonic populations in general are characterized by heterogeneous mtDNA structure, defined, in addition to Z1 and D5, by haplogroups A, C, D4, G2a, M*, N9a, F and Y. Therefore, different scenarios of female-mediated East Eurasian genetic influence on Northern and Eastern Europeans should be highlighted: (1) the most ancient influx of Asian tribes, probably originating in the early Holocene, which brought a few selected East Asian mtDNA haplotypes (like Z and D5) to Fennoscandia (Tambets et al. 2004), and (2) gradual gene flows of historic times, which occurred mostly in the Middle Ages due to migrations of nomadic peoples (such as the Huns, Avars, Bulgars, Mongols) to Eastern and Central European territories inhabited mainly by Slavonic tribes. We suggest that the presence of East Eurasian mtDNA haplotypes is not an original feature of the gene pool of the proto-Slavs, but is mostly a consequence of admixture with Central Asian nomadic tribes, who migrated into Central and Eastern Europe in the early Middle Ages.
Use of Forensic Markers in the Assessment of Population Stratification.
Assignment of individuals to population groups is important to genetic case control association studies, admixture mapping, medical risk assessment, genealogy, and forensic studies. Polymorphic sequences can be used to infer ancestry but their utility for such an application is related to the number of alleles and relative frequency differences of these alleles between the population groups under study. Multiple study designs differing in numbers and types of polymorphic markers with differing levels of informativeness make comparison of studies difficult. The use of commercially-available highly-informative markers that are used internationally in forensic applications could provide a universal first tier analysis for assignment of individuals to population groups prior to inclusion in association and admixture studies. We evaluated the utility of the PowerPlex kit of 16 markers from Promega for this purpose. Multiple population groups including African, Bengalis, Chinese, Japanese, Koreans, Crypto Jews, Sephardic Jews, and Dutch were genotyped using the PowerPlex kit. The data were analyzed with STRUCTURE (Pritchard et al.) using an admixture model, correlated alleles and 3 clusters. Africans, Asians (Bengalis, Koreans, Chinese and Japanese), and Caucasians (Dutch, Sephardic Jews, and Crypto Jews) were clearly delineated. Individuals showing admixture were detectable and their removal resulted in more discrete clustering. An independently collected and genotyped set of Dutch individuals was indistinguishable from the original Dutch group providing reproducibility across data sets. The sensitivity conferred by the number of markers used in the analysis was assessed by removing markers. Delineation of population groups was apparent when 14 markers were used, although clusters were noisier; however it was not possible to delineate population groups when only 8 markers were used. 
The use of forensic markers is a promising strategy for clustering individuals into population groups and will be an inevitable outcome of their forensic use.
Evaluation of Ancestry and Linkage Disequilibrium Sharing in Admixed Population in Mexico
National Institute of Genomic Medicine, Mexico. More than 80% of the Mexican population is considered Mestizo, resulting from the admixture of ethnic groups with Spaniards. To generate an initial estimate of ancestral contribution (AC) of populations from Europe, Africa and Asia to the Mexican Mestizos, we genotyped 104 samples from the states of Sonora (n=20), Yucatan (n=17), Guerrero (n=21), Zacatecas (n=19), Veracruz (n=18) and Guanajuato (n=8) using the 100K Affymetrix SNP array, and used data from the International HapMap Project as the parental population information. From 3,055 ancestry informative SNPs reported by Smith et al. and Choudhry et al., we identified 105 present in the 100K array and used them to calculate AC from each population to our sample. To infer AC we used Structure software under the admixture model. Based on this analysis, the average AC in our samples is 58.96% European, 10.03% African and 31.05% Asian. Sonora shows the highest European contribution (70.63%) and Guerrero the lowest (51.98%) where we also observe the highest Asian contribution (37.17%). African contribution ranges from 7.8% in Sonora to 11.13% in Veracruz. Based on these data, we grouped our population according to European AC (<50%, >70%). We used the Carlson algorithm to derive European tagSNPs from the 100K marker set. To explore Linkage Disequilibrium Sharing (LDS) between Mestizos and Europeans, we calculated the proportion of tagSNP-marker pairs that maintained an r² ≥ 0.8 in each evaluated population. In general, comparison of LDS between European and Asian population is ~73%, whereas comparison with African population is ~40%. Mestizos from Guerrero show the lowest LDS (74%), whereas those from Sonora show the highest (77%). Similar results are seen in the group of lower (<50%)>70%) European ancestry.
Our results suggest that the Mexican Mestizo population shows ancestry-based stratification that will require the appropriate corrections to avoid spurious results in association studies. Our results show that admixed populations have unique patterns of LD depending on levels of ancestral contribution.
European mitochondrial haplogroups exhibit differential risk of developing presbycusis.
The genetic basis of human presbycusis (age-related hearing loss) is unknown. This common disorder is characterized by difficulty understanding conversation, particularly in noisy backgrounds. Audiograms of presbycusics show sloping hearing loss, with greatest deficiencies at the highest frequencies, and over time an individual’s hearing loss progresses into the lower frequencies that are more important for understanding speech. We investigated the hypothesis that the mitochondrial (mt) genome plays a role in presbycusis. Subjects of European ancestry, all over age 58, were tested using both classical and advanced audiometric measures and then genotyped to determine mt haplogroups. We found that subjects belonging to haplogroup H (N=93) had better hearing than other Europeans (N=80), with the greatest differences observed in the right ear at 3 kHz (p=0.017) and 10-14 kHz (p=0.016). The difference at 3 kHz correlates with the common noise notch location, and thus may indicate a difference in susceptibility to noise damage. Distortion product otoacoustic emissions also indicated better hair cell health in haplogroup H subjects, at higher frequencies and in the right ear (average DPOAE for 4-6 kHz, p=0.010). These results support the hypothesis that a mitochondrial factor influences susceptibility to the development of presbycusis. We are currently investigating the mt genome for causative mutations linked to the haplogroups.

Estimating the split time of Human and Neanderthal populations
Previous genetic studies of Neanderthal ancestry have used mtDNA and thus have been limited in their conclusions on the relationship of humans and Neanderthals. We present here the first use of Neanderthal genomic DNA to assess the joint history of human and Neanderthal populations. Our data consist of 37kb of short fragments of genomic DNA sequenced in Neanderthal. By studying the degree to which modern human diversity is shared with Neanderthal we can assess the time at which the human and Neanderthal populations split. We use a flexible simulation based approach that demonstrates the power of using human variation data in such analyses. We find that the two populations split ~400,000 years ago, predating the emergence of modern humans. Our best fitting model predicts that the Neanderthal lineage will be outgroup to the human population ~52% of the time.
The Genetic Structure of Human Populations in Africa.
Africa contains the greatest levels of human genetic variation and is the source of the worldwide range expansion of all modern humans. Knowledge of the genetic population boundaries within Africa has important implications for the design and implementation of genetic epidemiologic studies of Africans and African Americans, and for reconstructing modern human origins. A dataset consisting of ~3.7 million genotypes has been generated from the Marshfield panel of 773 microsatellites and 392 in-del polymorphic genetic markers. These markers were genotyped in ~3,200 individuals from >100 diverse ethnic populations across Africa as well as in 118 African Americans and in the CEPH Human Genome Diversity Panel, consisting of 1048 individuals from 51 globally diverse populations. Preliminary analysis of population structure using the program STRUCTURE [1] indicates considerably more substructure amongst global populations (estimate for the number of genetic clusters, K, is 12) and amongst African populations (K = 9) than had previously been recognized [2]. Population clusters are correlated with self-described ethnicity and shared cultural and/or linguistic properties (e.g. Pygmies, Khoisan-speakers, Bantu-speakers, etc). African Americans have predominantly West African Bantu (~80%) and European (~17%) ancestry, although individual admixture levels vary considerably. These results justify the need to include a broad range of geographically and ethnically diverse African populations in studies of human genetic variation. [1] Pritchard JK, et al. Genetics 155:945-59 (2000). [2] Rosenberg NA, et al. Science 298:2381-5 (2002).
Patterns of admixture in Latino populations
We examined the diversity of 13 Latino populations from seven countries (Mexico, Guatemala, Costa Rica, Colombia, Chile, Argentina and Brazil) typing 745 autosomal microsatellite markers in 250 individuals. Estimates of genetic ancestry for these populations varied substantially. Native American ancestry varied between 19.6% and 70.3%, European ancestry between 26.9% and 70.6%, and African ancestry between 1.1% and 9.8%. Genetic structure analysis provides evidence of a genetic continuity between pre- and post-Columbian populations for specific geographic regions. For instance, a Chibchan-Paezan ancestry is detectable in Latinos from lower Central America and northwest South America. Individual admixture estimates vary considerably between populations. Some Latinos (e.g. Mexico City) show marked variation in individual admixture, whereas others (e.g. Antioquia and Costa Rica) show little variation. This variation is likely to reflect the history of admixture of each geographic region examined: some Latino populations are still undergoing substantial admixture whereas others underwent admixture mostly in early colonial times. These results have important implications for admixture mapping and association mapping studies in Latino populations.


Genomic diversity and population structure of Native Americans
We examined 745 autosomal microsatellite markers in 432 individuals sampled from 24 indigenous populations in the Americas. These data were analyzed jointly with similar data available in 54 other indigenous populations from across the world (including an additional 5 Native American groups). The populations from the Americas show lower diversity and more differentiation than populations from other continental regions (global Fst=0.08). Signals of long-range linkage disequilibrium are detectable to a greater extent in Native Americans than in other populations, as are signals of recent bottlenecks followed by population growth. A negative correlation is observed between population diversity and geographic distance from the Bering Strait, an observation consistent with the north-to-south dispersal of humans upon initial entry into the continent. A higher diversity is observed in western vs. eastern South American populations, potentially reflecting differences in long-term effective population size or in colonization routes within South America. Phylogenetic trees relating Native American populations show a marked differentiation between Canadian and other Native populations. Canadian natives also show a detectable shared ancestry with contemporary Siberian populations, which is less visible for more southerly Americans. A substantial agreement is observed between phylogenetic relatedness and population affiliation according to the linguistic classification of Greenberg.

The rare nonsynonymous SCN5A-S1103Y variant in Caucasians is due to recent African Admixture as revealed by 100k SNP genotyping.
The SCN5A-S1103Y variant is an established and confirmed risk factor conferring an odds ratio up to 8.5 for cardiac ventricular arrhythmias and sudden cardiac death (Splawski et al, Science, 2002, Burke et al., Circulation, 2005, Plant et al., J. Clin. Invest. 2006). In Africans it is a common nonsynonymous SNP (MAF=8%), but it is rarely observed in Caucasians (Chen et al, J. Med. Genet. 2002). In a Bavarian family appearing of entirely Caucasian descent and affected with long QT Syndrome we have detected this variant in heterozygote state as the only causal nonsynonymous variation upon diagnostic ion channel resequencing. To resolve the question, whether in the family the variant was (a) of ancient African descent, (b) due to recent African admixture or (c) a de novo mutation, we analyzed the genetic segment it resided on. Dense SNP genotyping in admixed individuals allows to infer the ethnicity of chromosomal regions if allele frequencies are known in the original populations. Ethnicity inference for any given locus can be carried out by applying the product rule to a sliding window of neighboring SNPs or via modeling ancestry by hidden Markov Chain Monte Carlo Methods (Tang et al. Am. J. Hum. Genet, 2006). By 100k SNP genotyping of the Bavarian family, we demonstrate that the S1103Y variant is due to recent African admixture (b) and could rule out possibilities (a) and (c). This application demonstrates that inferring ethnicity of chromosomal regions by high density SNP genotyping is a powerful approach with prospects also to admixture mapping of disease loci and population stratification correction of genomewide association mapping of complex disease loci.

Allele frequency estimates from DNA pools for 317,000 SNPs for multiple European and worldwide populations and discovery of Ancestry Informative Markers for Europe.
The identification of Ancestry Informative Markers (AIMs) and inference of individual genetic history is useful in many applications, including studies of geography and evolution of human populations, forensic sciences, pharmacogenomics, admixture mapping and association studies of complex diseases. While many AIMs have been reported that define strong genetic differences between major continents, it is more difficult to identify markers that reflect subtle, within-continent diversity, such as the heterogeneous ancestry of European Americans contributed by different populations within Europe. We have analyzed DNA pools, each for a different population, on Illumina HumanHap300 BeadArrays to estimate allele frequencies for ~317,000 Single Nucleotide Polymorphisms for 9 European, 6 African, and 2 Amerindian populations in the Human Genome Diversity Project collection. We have also evaluated the performance of this method by analyzing three HapMap pools (YRI, CHB, and JPT), for which the true allele frequencies are already known from the International HapMap Project. We found that the allele frequency estimates differed between replicate chips by less than +/-5% for 95% of the SNPs, and that the estimated frequencies and the true frequencies differed by +/-5-10% for 90% of the SNPs. The data for nine European populations, from western Caucasus, Scotland, Tuscany, Sardinia, France, Iberia, Russia, Northern Italy, and a Basque region, showed a clear excess of SNPs having large allele frequency differences (e.g. >30%) between most pairs of populations, compared to what would be expected given the sample sizes. These results provide a valuable resource of European AIMs for monitoring within-continent stratification in association studies. We are currently validating the most informative SNPs by individually genotyping samples that formed the pools as well as those from additional European populations.
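
To give a feel for what "expected given the sample sizes" means in the abstract above, here is a rough sketch of how often sampling alone would produce a large allele frequency difference between two samples drawn from the same underlying population. It is an illustration under stated assumptions, not the authors' method: it uses a normal approximation to binomial sampling, ignores pooling and array measurement error, and the sample sizes in the test values are hypothetical (the abstract does not give them).

```python
import math

def prob_large_difference(p, n1, n2, threshold=0.30):
    """Approximate P(|f1 - f2| > threshold) under pure sampling noise.

    p: shared true allele frequency (0 < p < 1).
    n1, n2: number of diploid individuals in each sample (2n chromosomes).
    Uses a normal approximation to the difference of two binomial proportions.
    """
    variance = p * (1.0 - p) * (1.0 / (2 * n1) + 1.0 / (2 * n2))
    sd = math.sqrt(variance)
    z = threshold / sd
    # Two-sided tail of the standard normal: P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))
```

If the observed fraction of SNPs exceeding the threshold is much larger than this probability, sampling noise alone is an unlikely explanation, which is the sense in which an "excess" of large differences points to real differentiation.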


Mitochondrial haplogroups are associated with asthma and total serum IgE levels
Maternal history of asthma and/or atopy is a major risk factor for the subsequent development of asthma and allergy in childhood. Although mitochondrial mutations have been implicated in several maternally inherited monogenic disorders, no studies of mitochondrial polymorphisms and asthma have been reported. We evaluated whether common mitochondrial haplogroups are associated with asthma and total serum IgE levels. 8 common mitochondrial single nucleotide polymorphisms (mtSNP) were genotyped in two cohorts of European ancestry: 512 adult women with incident asthma and 517 matching controls participating in the Nurses’ Health Study (NHS) and 654 children ages 5-12 years with mild to moderate asthma participating in the Childhood Asthma Management Program (CAMP). Genotyping was performed using TaqMan® probe hybridization assays. 93 random NHS samples were run in duplicate for all assays and demonstrated 100% concordance. In the CAMP Study, genotype data from probands’ mothers was also 100% concordant across all assays. Completion rates in both cohorts were > 95% for all markers. mtSNP 9055 was seen at higher frequency in NHS asthma cases (frequency 11.1%) than controls (8.0%, p = 0.02). Association analysis using haplo.score identified two haplogroups associated with asthma: one haplogroup at a frequency of 3.83% among cases compared to 1.27% among controls (p=0.0002) and another at a frequency of 9.97% among cases and 11.3% among controls (p=0.04). The CAMP Study is a case-only (family-based) cohort, thus precluding evaluation of mitochondrial SNP associations with asthma status. However, quantitative analysis of mitochondrial haplogroups identified two haplogroups of 11.0% and 1.87% frequency that were associated with log-transformed total serum IgE levels, an important intermediate phenotype in asthma and atopy (p=0.006 and 0.01, respectively). These data suggest that common mitochondrial haplogroups influence asthma diathesis.