This is an open access paper.
AJHG http://dx.doi.org/10.1016/j.ajhg.2015.03.012
The Kalash Genetic Isolate: Ancient Divergence, Drift, and Selection
Qasim Ayub et al.
The Kalash represent an enigmatic isolated population of Indo-European speakers who have been living for centuries in the Hindu Kush mountain ranges of present-day Pakistan. Previous Y chromosome and mitochondrial DNA markers provided no support for their claimed Greek descent following Alexander III of Macedon's invasion of this region, and analysis of autosomal loci provided evidence of a strong genetic bottleneck. To understand their origins and demography further, we genotyped 23 unrelated Kalash samples on the Illumina HumanOmni2.5M-8 BeadChip and sequenced one male individual at high coverage on an Illumina HiSeq 2000. Comparison with published data from ancient hunter-gatherers and European farmers showed that the Kalash share genetic drift with the Paleolithic Siberian hunter-gatherers and might represent an extremely drifted ancient northern Eurasian population that also contributed to European and Near Eastern ancestry. Since the split from other South Asian populations, the Kalash have maintained a low long-term effective population size (2,319–2,603) and experienced no detectable gene flow from their geographic neighbors in Pakistan or from other extant Eurasian populations. The mean time of divergence between the Kalash and other populations currently residing in this region was estimated to be 11,800 (95% confidence interval = 10,600−12,600) years ago, and thus they represent present-day descendants of some of the earliest migrants into the Indian sub-continent from West Asia.
Link
April 30, 2015
April 21, 2015
PCA and natural selection
arXiv:1504.04543 [q-bio.PE]
Detecting genomic signatures of natural selection with principal component analysis: application to the 1000 Genomes data
Nicolas Duforet-Frebourg et al.
(Submitted on 8 Apr 2015)
Large-scale genomic data offers the perspective to decipher the genetic architecture of natural selection. To characterize natural selection, various analytical methods for detecting candidate genomic regions have been developed. We propose to perform genome-wide scans of natural selection using principal component analysis. We show that the common Fst index of genetic differentiation between populations can be viewed as a proportion of variance explained by the principal components. Looking at the correlations between genetic variants and each principal component provides a conceptual framework to detect genetic variants involved in local adaptation without any prior definition of populations. To validate the PCA-based approach, we consider the 1000 Genomes data (phase 1) after removal of recently admixed individuals resulting in 850 individuals coming from Africa, Asia, and Europe. The number of genetic variants is of the order of 36 millions obtained with a low-coverage sequencing depth (3X). The correlations between genetic variation and each principal component provide well-known targets for positive selection (EDAR, SLC24A5, SLC45A2, DARC), and also new candidate genes (APPBPP2, TP1A1, RTTN, KCNMA, MYO5C) and non-coding RNAs. In addition to identifying genes involved in biological adaptation, we identify two biological pathways involved in polygenic adaptation that are related to the innate immune system (beta defensins) and to lipid metabolism (fatty acid omega oxidation). PCA-based statistics retrieve well-known signals of human adaptation, which is encouraging for future whole-genome sequencing project, especially in non-model species for which defining populations can be difficult. Genome scan based on PCA is implemented in the open-source and freely available PCAdapt software.
Link
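To make the idea in the abstract concrete, here is a minimal sketch of a PCA-based scan: standardize the genotypes, extract the top PCs, and rank SNPs by their correlation with each PC. It is not the PCAdapt implementation or its test statistic. The function names, the simulated two-population data, and the planted "selected" SNP at index 0 are all assumptions of mine for illustration. The squared correlation is the share of a SNP's variance explained by a given PC, which is the sense in which the abstract links PCA to Fst.

```python
import numpy as np


def snp_pc_correlations(genotypes, n_pcs=2):
    """Pearson correlation of every SNP with each of the top PCs.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts.
    Returns an (n_snps, n_pcs) array. n_pcs should not exceed the
    rank of the centred genotype matrix.
    """
    g = np.asarray(genotypes, dtype=float)
    n = g.shape[0]
    g = g - g.mean(axis=0)                  # centre each SNP
    sd = g.std(axis=0)
    z = np.zeros_like(g)
    polymorphic = sd > 0                    # monomorphic SNPs stay at zero
    z[:, polymorphic] = g[:, polymorphic] / sd[polymorphic]
    # Left singular vectors are the PC scores of individuals (unit norm).
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_pcs]
    # Each column of z has norm sqrt(n); each score vector has norm 1 and
    # mean 0, so the dot product divided by sqrt(n) is a correlation.
    return (z.T @ scores) / np.sqrt(n)


def simulate_two_populations(n_per_pop=50, n_snps=500, seed=0):
    """Toy data (hypothetical): two populations with mild drift at every
    SNP and one strongly differentiated SNP planted at index 0."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.3, 0.7, size=n_snps)
    freq_a, freq_b = base + 0.1, base - 0.1
    freq_a[0], freq_b[0] = 0.95, 0.05       # the planted outlier
    pop_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
    return np.vstack([pop_a, pop_b])


if __name__ == "__main__":
    genotypes = simulate_two_populations()
    corr = snp_pc_correlations(genotypes, n_pcs=2)
    top = np.argsort(-np.abs(corr[:, 0]))[:5]
    print("SNPs most correlated with PC1:", top)
```

Note that no population labels are passed to `snp_pc_correlations`; the structure is recovered from the genotypes alone, which is the appeal of the approach for species where defining populations is difficult.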
bioRxiv http://dx.doi.org/10.1101/018143
Fast principal components analysis reveals independent evolution of ADH1B gene in Europe and East Asia
Kevin J Galinsky et al.
Principal components analysis (PCA) is a widely used tool for inferring population structure and correcting confounding in genetic data. We introduce a new algorithm, FastPCA, that leverages recent advances in random matrix theory to accurately approximate top PCs while reducing time and memory cost from quadratic to linear in the number of individuals, a computational improvement of many orders of magnitude. We apply FastPCA to a cohort of 54,734 European Americans, identifying 5 distinct subpopulations spanning the top 4 PCs. Using a new test for natural selection based on population differentiation along these PCs, we replicate previously known selected loci and identify three new signals of selection, including selection in Europeans at the ADH1B gene. The coding variant rs1229984 has previously been associated to alcoholism and shown to be under selection in East Asians; we show that it is a rare example of independent evolution on two continents.
Link
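For readers wondering how top PCs can be approximated at a cost linear in the number of individuals, the following is a sketch of a generic randomized subspace iteration. It is not the FastPCA code, and the function name and the parameters (`oversample`, `n_iter`) are my own illustrative choices. The point is that every step multiplies the genotype matrix by a thin matrix with a handful of columns, so the individuals-by-individuals relationship matrix is never formed.

```python
import numpy as np


def randomized_pca(z, n_pcs=2, oversample=10, n_iter=10, seed=0):
    """Approximate the top PCs of an (n_individuals, n_snps) matrix.

    Returns (pcs, singular_values): pcs is (n_individuals, n_pcs).
    Each pass costs O(n_individuals * n_snps * k) with k small, instead
    of building and decomposing an n_individuals x n_individuals matrix.
    """
    z = np.asarray(z, dtype=float)
    z = z - z.mean(axis=0)                      # centre each SNP
    n, m = z.shape
    k = min(n_pcs + oversample, n, m)
    rng = np.random.default_rng(seed)

    # Project onto k random directions in SNP space, then orthonormalize.
    q, _ = np.linalg.qr(z @ rng.standard_normal((m, k)))

    # Power iterations sharpen the estimate of the leading subspace;
    # re-orthonormalizing at each step keeps the arithmetic stable.
    for _ in range(n_iter):
        w, _ = np.linalg.qr(z.T @ q)
        q, _ = np.linalg.qr(z @ w)

    # Exact SVD of the small (k x n_snps) projected matrix.
    u_small, s, _ = np.linalg.svd(q.T @ z, full_matrices=False)
    pcs = q @ u_small
    return pcs[:, :n_pcs], s[:n_pcs]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical data: two strong axes of structure plus noise.
    a, _ = np.linalg.qr(rng.standard_normal((60, 2)))
    b, _ = np.linalg.qr(rng.standard_normal((200, 2)))
    data = a @ np.diag([100.0, 60.0]) @ b.T + rng.standard_normal((60, 200))
    pcs, sv = randomized_pca(data, n_pcs=2)
    print("approximate top singular values:", sv)
```

The approximation is good when the leading singular values are well separated from the rest, which is the usual situation for the first few axes of population structure.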
April 19, 2015
mtDNA of Alaskan Eskimos
AJPA DOI: 10.1002/ajpa.22750
Mitochondrial diversity of Iñupiat people from the Alaskan North Slope provides evidence for the origins of the Paleo- and Neo-Eskimo peoples
Jennifer A. Raff et al.
ABSTRACT
Objectives:
All modern Iñupiaq speakers share a common origin, the result of a recent (∼800 YBP) and rapid trans-Arctic migration by the Neo-Eskimo Thule, who replaced the previous Paleo-Eskimo inhabitants of the region. Reduced mitochondrial haplogroup diversity in the eastern Arctic supports the archaeological hypothesis that the migration occurred in an eastward direction. We tested the hypothesis that the Alaskan North Slope served as the origin of the Neo- and Paleo-Eskimo populations further east.
Materials and Methods:
We sequenced HVR I and HVR II of the mitochondrial D-loop from 151 individuals in eight Alaska North Slope communities, and compared genetic diversity and phylogenetic relationships between the North Slope Inupiat and other Arctic populations from Siberia, the Aleutian Islands, Canada, and Greenland.
Results:
Mitochondrial lineages from the North Slope villages had a low frequency (2%) of non-Arctic maternal admixture, and all haplogroups (A2, A2a, A2b, D2a, and D4b1a–formerly known as D3) found in previously sequenced Neo- and Paleo-Eskimos and living Inuit and Eskimo peoples from across the North American Arctic. Lineages basal for each haplogroup were present in the North Slope. We also found the first occurrence of two haplogroups in contemporary North American Arctic populations: D2a, previously identified only in Aleuts and Paleo-Eskimos, and the pan-American C4.
Discussion:
Our results yield insight into the maternal population history of the Alaskan North Slope and support the hypothesis that this region served as an ancestral pool for eastward movements to Canada and Greenland, for both the Paleo-Eskimo and Neo-Eskimo populations.
Link
April 13, 2015
Haplogroup G1, Y-chromosome mutation rate and migrations of Iranic speakers
The origin of Iranian speakers is a big puzzle, because in ancient times there were two quite different groups of them: nomadic steppe peoples such as the Scythians and settled farmers such as the Persians and Medes.
I am guessing that the story of Iranian origins will only be solved in conjunction with that of their Indo-Aryan brethren and their more distant Indo-European relations.
Clearly, G1 cannot be Proto-Indo-European as it has a rather limited distribution in Eurasia, but it could very well have been a marker of a subset of Indo-Europeans. If it was present in ancestral Iranians, then this would geographically constrain the places where ancestral Iranians were formed.
PLoS ONE 10(4): e0122968. doi:10.1371/journal.pone.0122968
Deep Phylogenetic Analysis of Haplogroup G1 Provides Estimates of SNP and STR Mutation Rates on the Human Y-Chromosome and Reveals Migrations of Iranic Speakers
Oleg Balanovsky et al.
Y-chromosomal haplogroup G1 is a minor component of the overall gene pool of South-West and Central Asia but reaches up to 80% frequency in some populations scattered within this area. We have genotyped the G1-defining marker M285 in 27 Eurasian populations (n = 5,346), analyzed 367 M285-positive samples using 17 Y-STRs, and sequenced ~11 Mb of the Y-chromosome in 20 of these samples to an average coverage of 67X. This allowed detailed phylogenetic reconstruction. We identified five branches, all with high geographical specificity: G1-L1323 in Kazakhs, the closely related G1-GG1 in Mongols, G1-GG265 in Armenians and its distant brother clade G1-GG162 in Bashkirs, and G1-GG362 in West Indians. The haplotype diversity, which decreased from West Iran to Central Asia, allows us to hypothesize that this rare haplogroup could have been carried by the expansion of Iranic speakers northwards to the Eurasian steppe and via founder effects became a predominant genetic component of some populations, including the Argyn tribe of the Kazakhs. The remarkable agreement between genetic and genealogical trees of Argyns allowed us to calibrate the molecular clock using a historical date (1405 AD) of the most recent common genealogical ancestor. The mutation rate for Y-chromosomal sequence data obtained was 0.78×10⁻⁹ per bp per year, falling within the range of published rates. The mutation rate for Y-chromosomal STRs was 0.0022 per locus per generation, very close to the so-called genealogical rate. The “clan-based” approach to estimating the mutation rate provides a third, middle way between direct father-to-son comparisons and using archeologically known migrations, whose dates are subject to revision and of uncertain relationship to genetic events.
Link
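To get a feel for what these rates mean in practice, here is some back-of-the-envelope arithmetic using only the figures quoted in the abstract. The helper names are mine, "~11 Mb" is treated as exactly 11 million bp, and all 17 STRs are assumed to mutate at the reported average rate. Both are simplifying assumptions, so the outputs are rough orders of magnitude rather than results from the paper.

```python
# Figures quoted in the abstract.
SNP_RATE_PER_BP_PER_YEAR = 0.78e-9
STR_RATE_PER_LOCUS_PER_GENERATION = 0.0022
N_STR_LOCI = 17

# Assumption: "~11 Mb" taken as exactly 11 million bp.
SEQUENCED_BP = 11e6


def waiting_time(rate_per_unit, n_units):
    """Expected time between mutations along one lineage, in the time
    unit of the rate (years for SNPs, generations for STRs)."""
    return 1.0 / (rate_per_unit * n_units)


def age_from_mutations(n_mutations, rate_per_bp_per_year, n_bp):
    """Rough age in years of a node, given the average number of SNPs
    accumulated per lineage since that node (hypothetical helper)."""
    return n_mutations / (rate_per_bp_per_year * n_bp)


if __name__ == "__main__":
    years = waiting_time(SNP_RATE_PER_BP_PER_YEAR, SEQUENCED_BP)
    gens = waiting_time(STR_RATE_PER_LOCUS_PER_GENERATION, N_STR_LOCI)
    print(f"about one SNP every {years:.0f} years across the sequenced region")
    print(f"about one STR mutation every {gens:.0f} generations on a 17-locus haplotype")
```

At these rates a lineage picks up a new SNP in the sequenced region roughly once a century, which is why sequence data can resolve branches of a clan genealogy only a few centuries deep.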
Neandertal flutes debunked
Royal Society Open Science DOI: 10.1098/rsos.140022
‘Neanderthal bone flutes’: simply products of Ice Age spotted hyena scavenging activities on cave bear cubs in European cave bear dens
Cajus G. Diedrich
Punctured extinct cave bear femora were misidentified in southeastern Europe (Hungary/Slovenia) as ‘Palaeolithic bone flutes’ and the ‘oldest Neanderthal instruments’. These are not instruments, nor human made, but products of the most important cave bear scavengers of Europe, hyenas. Late Middle to Late Pleistocene (Mousterian to Gravettian) Ice Age spotted hyenas of Europe occupied mainly cave entrances as dens (communal/cub raising den types), but went deeper for scavenging into cave bear dens, or used in a few cases branches/diagonal shafts (i.e. prey storage den type). In most of those dens, about 20% of adult to 80% of bear cub remains have large carnivore damage. Hyenas left bones in repeating similar tooth mark and crush damage stages, demonstrating a butchering/bone cracking strategy. The femora of subadult cave bears are intermediate in damage patterns, compared to the adult ones, which were fully crushed to pieces. Hyenas produced round–oval puncture marks in cub femora only by the bone-crushing premolar teeth of both upper and lower jaw. The punctures/tooth impact marks are often present on both sides of the shaft of cave bear cub femora and are simply a result of non-breakage of the slightly calcified shaft compacta. All stages of femur puncturing to crushing are demonstrated herein, especially on a large cave bear population from a German cave bear den.
Link
April 12, 2015
April 05, 2015
Biology of Genomes titles
The Biology of Genomes titles have been announced. A sample of interest is below:
- Population structure in African-Americans
- Contrasting patterns in the high-resolution variation of uniparental markers in European populations highlight very recent male-specific expansions
- Is Sanger sequencing still a gold standard?
- The time and place of European gene flow into Ashkenazi Jews
- 65,222 whole genome haplotypes from the Haplotype Reference Consortium and efficient algorithms to use them
- The expansion of human populations out of Africa might have led to the progressive build-up of a recessive mutation load
- An early modern human with a recent Neandertal ancestor
- Great ape Y chromosome diversity reflects social structure and sex-biased behaviours
- Theoretical analysis indicates human genome is not a blueprint but a storage of genes, and human oocytes have an instruction
- Modeling population size changes leads to accurate inference of sex-biased demographic events
- Exploring population structure through large pedigrees
- Better, faster, stronger—Mixed models and PCA in the year 2015
- Denisovan ancestry in East Eurasian and Native American populations
- Measuring the rate and heritability of aging in Sardinians using pattern recognition
- Dog diversity is shaped by a Central Asian origin followed by geographical isolation and admixture
- Comparative analysis of the Y chromosome genomes of greater apes
- Genomic analysis of ‘Paleoamerican relicts’ reveals close ancestry with Native Americans
- Analysis of genetic history of Siberian and Northeastern European populations
April 04, 2015
In search of the source of Denisovan ancestry
bioRxiv http://dx.doi.org/10.1101/017475
Denisovan Ancestry in East Eurasian and Native American Populations.
Pengfei Qin, Mark Stoneking
Although initial studies suggested that Denisovan ancestry was found only in modern human populations from island Southeast Asia and Oceania, more recent studies have suggested that Denisovan ancestry may be more widespread. However, the geographic extent of Denisovan ancestry has not been determined, and moreover the relationship between the Denisovan ancestry in Oceania and that elsewhere has not been studied. Here we analyze genome-wide SNP data from 2493 individuals from 221 worldwide populations, and show that there is a widespread signal of a very low level of Denisovan ancestry across Eastern Eurasian and Native American (EE/NA) populations. We also verify a higher level of Denisovan ancestry in Oceania than that in EE/NA; the Denisovan ancestry in Oceania is correlated with the amount of New Guinea ancestry, but not the amount of Australian ancestry, indicating that recent gene flow from New Guinea likely accounts for signals of Denisovan ancestry across Oceania. However, Denisovan ancestry in EE/NA populations is equally correlated with their New Guinea or their Australian ancestry, suggesting a common source for the Denisovan ancestry in EE/NA and Oceanian populations. Our results suggest that Denisovan ancestry in EE/NA is derived either from common ancestry with, or gene flow from, the common ancestor of New Guineans and Australians, indicating a more complex history involving East Eurasians and Oceanians than previously suspected.
Link