arXiv:1306.4021 [q-bio.PE]
Reconstructing Native American Migrations from Whole-genome and Whole-exome Data
Simon Gravel et al.
There is great scientific and popular interest in understanding the genetic history of populations in the Americas. We wish to understand when different regions of the continent were inhabited, where settlers came from, and how current inhabitants relate genetically to earlier populations. Recent studies unraveled parts of the genetic history of the continent using genotyping arrays and uniparental markers. The 1000 Genomes Project provides a unique opportunity for improving our understanding of population genetic history by providing over a hundred sequenced low coverage genomes and exomes from Colombian (CLM), Mexican-American (MXL), and Puerto Rican (PUR) populations. Here, we explore the genomic contributions of African, European, and especially Native American ancestry to these populations. Estimated Native American ancestry is 48% in MXL, 25% in CLM, and 13% in PUR. Native American ancestry in PUR appears most closely related to Equatorial-Tucanoan-speaking populations, supporting a Southern America ancestry of the Taino people of the Caribbean. We present new methods to estimate the allele frequencies in the Native American fraction of the populations, and model their distribution using a three-population demographic model. The ancestral populations to the three groups likely split in close succession: the most likely scenario, based on a peopling of the Americas 16 thousand years ago (kya), supports that the MXL Ancestors split 12.2kya, with a subsequent split of the ancestors to CLM and PUR 11.7kya. The model also features a Mexican population of 62,000, a Colombian population of 8,700, and a Puerto Rican population of 1,900. Modeling Identity-by-descent (IBD) and ancestry tract length, we show that post-contact populations also differ markedly in their effective sizes and migration patterns, with Puerto Rico showing the smallest size and the earlier migration from Europe.
Link
June 19, 2013
July 31, 2012
On the age of Y-chromosome haplogroup R1b-M343
Continuing my exploration of the human Y-chromosome, on the basis of 1000 Genomes data, I turned my attention to Y-haplogroup R1b-M343, one of the most populous lineages in extant Europeans. A total of 109 Y-chromosomes possessed the mutation diagnostic of this haplogroup.
First, I calculated the histogram of pairwise TMRCA (time to most recent common ancestor) between all these 109 Y-chromosomes:
Most times are around 6-7 thousand years ago, but there is an outlier bump at around 15 thousand years ago. To further investigate this bump, I carried out multidimensional scaling of the collection of Y-chromosomes:
It is clear that the group of high pairwise TMRCAs corresponds to the individual on the left of the figure, who emerges as a clear outlier vis-à-vis the rest. The ID of that individual is HG00640 (from the PUR population). One possibility is that this individual is M343+ due to sequencing error and belongs to a different lineage altogether. However, HG00640 is also R1-S1+ and R1b1-L278+ but R1b1a-P297-.
It would appear, therefore, that the HG00640 Puerto Rican belongs to the R1b1-L278 clade, but not to the R1b1a-P297 subclade. He thus represents an earlier split from the tree than R1b1a2-M269 (frequent in West Eurasia), as well as R1b1a1-M73 (frequent in Central Asia). It seems that I have chanced upon a real relic Y-chromosome!
The estimate of the age difference between HG00640 and the remaining M343+ chromosomes that cluster on the right is: 15,426 years. We now have direct evidence that haplogroup R1b1 is quite old, and R1b-M343 itself must have emerged sometime between 23,657 years (the TMRCA of R1a vs. R1b) and 15,426 years.
This little exercise reinforces my belief, first expressed in the outliers article, that there are real relic Y chromosomes in the world today, and we neglect them at our own peril.
Most European and European-derived men from the 1000 Genomes Project who belong to the R1b-M343 clade share patrilineal descent within the last 7,000 years or so. But not all of them do, and outliers like HG00640 can only be caught with very large worldwide sample sizes and full genome sequencing.
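The outlier search described in this post can be sketched in a few lines. This is only an illustration, not the procedure actually used here: it swaps the MDS plot for a simpler per-sample median of pairwise TMRCAs, and the sample names, TMRCA values, and the `factor` threshold are all made up for the example.

```python
import statistics

def build_matrix(pairs):
    """Turn {(a, b): years} into a symmetric nested dict of pairwise TMRCAs."""
    tmrca = {}
    for (a, b), years in pairs.items():
        tmrca.setdefault(a, {})[b] = years
        tmrca.setdefault(b, {})[a] = years
    return tmrca

def find_outliers(tmrca, factor=1.5):
    """Flag samples whose median TMRCA to all others is unusually high.

    `factor` is an arbitrary, hypothetical threshold: a sample is flagged
    when its median exceeds `factor` times the median of all sample medians.
    """
    per_sample = {s: statistics.median(row.values()) for s, row in tmrca.items()}
    typical = statistics.median(per_sample.values())
    return sorted(s for s, m in per_sample.items() if m > factor * typical)

# Toy data (invented): four chromosomes coalescing around 6-7 thousand years
# ago, plus one hypothetical relic lineage that is ~15 thousand years from all.
pairs = {
    ("S1", "S2"): 6000, ("S1", "S3"): 6500, ("S1", "S4"): 7000,
    ("S2", "S3"): 6200, ("S2", "S4"): 6800, ("S3", "S4"): 6400,
    ("S1", "OUT"): 15000, ("S2", "OUT"): 15000,
    ("S3", "OUT"): 15000, ("S4", "OUT"): 15000,
}
outliers = find_outliers(build_matrix(pairs))
print(outliers)
```

A single divergent chromosome produces a whole row of high pairwise values, which is why it shows up as a separate bump in the histogram even though it is only one individual.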
* * *
Addendum: There appears to be an R1b1(xP297) DNA Project. It has quite a rich collection of men with SNP results similar to HG00640, including R1b1c-V88+ individuals (as suggested by Roy King in the comments), but also V88- individuals. I see great utility in such projects, because if one can detect very aberrant Y-STR haplotypes (which can be done with a simple histogram or MDS plot, as in this post), then one can identify candidate Y chromosomes for full sequencing.
* * *
November 22, 2009
Ancestry-related assortative mating in latino populations (Risch et al. 2009)
When different races admix, then in the first few generations there is a spectrum of ancestry proportions, ranging from pure individuals of the constituent races to admixed individuals with varying proportions of ancestry.
If there is random mating, then over several generations all individuals tend to have similar ancestry proportions, determined by the number of founders from the two constituent races. Mix 30,000 Europeans with 70,000 Africans, randomly mate them for 10-20 generations, and pretty soon almost everyone will have 30:70 European/African ancestral proportions with a little variation.
However, if there is assortative mating, then this process takes much longer to complete, as matings of individuals with very different ancestry proportions are rare, and the spectrum of varying individual ancestry is maintained. In the above-mentioned example, if there is perfect assortative mating, then after 10-20 generations you will still have 30% of the population having 100% European genes, and 70% of them having 100% African ones.
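The two extremes above can be checked with a toy simulation. This is a minimal sketch under strong simplifying assumptions that are mine, not from any paper: each child's ancestry is exactly the average of its parents' (no segregation noise), every couple has two children, population size is constant (scaled down to 300 + 700 founders), and "perfect" assortative mating is modeled by pairing individuals after sorting them by ancestry.

```python
import random
import statistics

def next_generation(ancestry, assortative, rng):
    """Pair up individuals and return their children's ancestry proportions."""
    if assortative:
        parents = sorted(ancestry)                      # like mates with like
    else:
        parents = rng.sample(ancestry, len(ancestry))   # random pairing
    children = []
    for i in range(0, len(parents) - 1, 2):
        child = (parents[i] + parents[i + 1]) / 2       # average of the two parents
        children.extend([child, child])                 # two children per couple
    return children

def simulate(n_european=300, n_african=700, generations=10,
             assortative=False, seed=1):
    """Return European ancestry proportions after the given number of generations."""
    rng = random.Random(seed)
    ancestry = [1.0] * n_european + [0.0] * n_african
    for _ in range(generations):
        ancestry = next_generation(ancestry, assortative, rng)
    return ancestry

random_pop = simulate(assortative=False)
sorted_pop = simulate(assortative=True)
for label, pop in (("random", random_pop), ("assortative", sorted_pop)):
    print(label, statistics.mean(pop), statistics.pvariance(pop))
```

In both runs the average European ancestry stays at 30%, but under random pairing the spread around that average collapses within a few generations, while under perfect sorting the population remains split into two unmixed groups.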
Previously, I had argued that the fact that Latin Americans, unlike Central Asian Turkic populations (such as the Uyghurs), have such a wide spectrum of ancestry proportions is due to the more recent admixture in the Americas than in Central Asia (less time for homogenization to take place), and the continued importation of Europeans.
Assortative mating is a third factor that may be behind this phenomenon. A stronger parallel may be found in South Asia, where the two constituents have been co-existing for a much longer time, but under a rigid, formalized regime of assortative mating (the caste system), homogenization has not taken place.
Genome Biology doi:10.1186/gb-2009-10-11-r132
Ancestry-related assortative mating in latino populations
Neil Risch et al.
Abstract
Background
While spouse correlations have been documented for numerous traits, no prior studies have assessed assortative mating for genetic ancestry in admixed populations.
Results
Using 104 ancestry informative markers, we examined spouse correlations in genetic ancestry for Mexican spouse pairs recruited from Mexico City and the San Francisco Bay Area, and Puerto Rican spouse pairs recruited from Puerto Rico and New York City. In the Mexican pairs, we found strong spouse correlations for European and Native American ancestry, but no correlation in African ancestry. In the Puerto Rican pairs, we found significant spouse correlations for African ancestry and European ancestry but not Native American ancestry. Correlations were not attributable to variation in socioeconomic status or geographic heterogeneity. Past evidence of spouse correlation was also seen in the strong evidence of linkage disequilibrium between unlinked markers, which was accounted for in regression analysis by ancestral allele frequency difference at the pair of markers (European versus Native American for Mexicans, European versus African for Puerto Ricans). We also observed an excess of homozygosity at individual markers within the spouses, but this provided weaker evidence, as expected, of spouse correlation. Ancestry variance is predicted to decline in each generation, but less so under assortative mating. We used the current observed variances of ancestry to infer even stronger patterns of spouse ancestry correlation in previous generations.
Conclusions
Assortative mating related to genetic ancestry persists in Latino populations to the current day, and has impacted on the genomic structure in these populations.
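The abstract's point that ancestry variance declines each generation, but less so under assortative mating, can be made concrete with a back-of-the-envelope calculation. This is my own simplified sketch, not the model used in the paper: it assumes a child's ancestry proportion is the average of its parents' proportions $a_m$ and $a_f$, that both parents are drawn from a population with the same variance $\sigma_t^2$, that the spouse correlation in ancestry is $r$, and it ignores the extra variance introduced by segregation.

```latex
\begin{align*}
a_{\text{child}} &= \tfrac{1}{2}\,(a_m + a_f) \\
\sigma_{t+1}^2 = \operatorname{Var}(a_{\text{child}})
  &= \tfrac{1}{4}\left[\operatorname{Var}(a_m) + \operatorname{Var}(a_f)
     + 2\operatorname{Cov}(a_m, a_f)\right] \\
  &= \tfrac{1}{4}\left[\sigma_t^2 + \sigma_t^2 + 2r\,\sigma_t^2\right]
   = \frac{1 + r}{2}\,\sigma_t^2
\end{align*}
```

With $r = 0$ (random mating) the variance halves every generation; with $r = 1$ (perfect assortative mating) it does not decline at all; intermediate spouse correlations slow the decline without stopping it.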
Link
September 09, 2009
Genetic Ancestry, Social Classification, and Racial Inequalities in Blood Pressure in Southeastern Puerto Rico (Gravlee et al. 2009)
I posted about this study when it was presented at AAPA 2008, and now the full paper has been published.
Figure 1 shows the relationship between "color" and genetic ancestry. As can be seen, the "color" categories overlap in terms of genetic ancestry, even though their averages are in the right order:

PLoS ONE 4(9): e6821. doi:10.1371/journal.pone.0006821
Genetic Ancestry, Social Classification, and Racial Inequalities in Blood Pressure in Southeastern Puerto Rico
Clarence C. Gravlee et al.
Abstract
Background
The role of race in human genetics and biomedical research is among the most contested issues in science. Much debate centers on the relative importance of genetic versus sociocultural factors in explaining racial inequalities in health. However, few studies integrate genetic and sociocultural data to test competing explanations directly.
Methodology/Principal Findings
We draw on ethnographic, epidemiologic, and genetic data collected in southeastern Puerto Rico to isolate two distinct variables for which race is often used as a proxy: genetic ancestry versus social classification. We show that color, an aspect of social classification based on the culturally defined meaning of race in Puerto Rico, better predicts blood pressure than does a genetic-based estimate of continental ancestry. We also find that incorporating sociocultural variables reveals a new and significant association between a candidate gene polymorphism for hypertension (α2C adrenergic receptor deletion) and blood pressure.
Conclusions/Significance
This study addresses the recognized need to measure both genetic and sociocultural factors in research on racial inequalities in health. Our preliminary results provide the most direct evidence to date that previously reported associations between genetic ancestry and health may be attributable to sociocultural factors related to race and racism, rather than to functional genetic differences between racially defined groups. Our results also imply that including sociocultural variables in future research may improve our ability to detect significant allele-phenotype associations. Thus, measuring sociocultural factors related to race may both empower future genetic association studies and help to clarify the biological consequences of social inequalities.
Link