September 30, 2008

Disease genes are ancient genes

A very interesting paper, which shows that genes that cause human disease tend to be those which appeared early in the evolution of life on Earth, rather than in the mammalian lineage. Perhaps ancient genes are very important for the proper functioning of an organism (they managed to survive the longest, so they must be doing something important), and hence dysfunctions in them would have a major negative effect.

Molecular Biology and Evolution, doi:10.1093/molbev/msn214

An ancient evolutionary origin of genes associated with human genetic diseases

Tomislav Domazet-Lošo and Diethard Tautz


Several thousand genes in the human genome have been linked to a heritable genetic disease. The majority of these appear to be non-essential genes (i.e. are not embryonically lethal when inactivated) and one could therefore speculate that they are late additions in the evolutionary lineage towards humans. Contrary to this expectation, we find that they are in fact significantly over-represented among the genes that have emerged during the early evolution of the metazoa. Using a phylostratigraphic approach, we have studied the evolutionary emergence of such genes at 19 phylogenetic levels. The majority of disease genes was already present in the eukaryotic ancestor and the second largest number has arisen around the time of evolution of multicellularity. Conversely, genes specific to the mammalian lineage are highly underrepresented. Hence, genes involved in genetic diseases are not simply a random subset of all genes in the genome, but are biased towards ancient genes.


Female life expectancy vs. Female Birth Rate

Evolution and Human Behavior doi:10.1016/j.evolhumbehav.2008.08.002

Sex difference in life span affected by female birth rate in modern humans

Alexei A. Maklakov


Sex differences in life span are common in different taxa, including primates, but not well understood. Theory and comparative evidence suggest that differential costs of reproduction between the sexes may explain the differences in sex-biased mortality across large taxonomic groups. The level of sex-specific reproductive effort may thus affect the difference in life span across populations. Modern humans (Homo sapiens) generally show the typical mammalian pattern of male-biased mortality. Here, I asked whether the differences in female birth rates between countries affect the sex difference in life span. I used the data on male and female life span and female birth rate in different countries from publicly available databases, while controlling for geographic and economic factors. The analysis suggests that female birth rate explains 17% of the variation in relative sex differences in life span across countries. Low female birth rate results in females living relatively longer than males. These data suggest that a simple biological factor—female birth rate—may explain a significant part of the variation in sex differences in life span across human populations.


September 29, 2008

Y chromosomes of the Ruling Dynasty of the Nso' in Cameroon

From the paper:
The groups are (1) the won nto', descendants of a fon down to the third or fourth generation; (2) the duy, descendants of a fon who ruled more than three or four generations ago together with, according to Chem-Langhëë and Fanso (1997), some members of commoner lineages whose heads are descendants of princesses and members of associated patriclans or clan segments, allegedly founded by immigrant royals, that provide state counselors; (3) the nshiylav, subjects born or recruited into palace service (patrilineally inherited); and (4) the mtaar, commoners (patrilineally inherited). Although the majority of the Nso' are self-identifying Christians of the Roman Catholic denomination, the fon has, through the generations, maintained a polygynous household, which in 2005 numbered over 70 women.


The most common NRY haplogroup in the won nto' was Y*(xBR,A3b2), with a frequency of 55.6% (table 1). This haplogroup was also found at a frequency of 17.6% in the duy. Furthermore, all Y*(xBR,A3b2) chromosomes had the same microsatellite haplotype (14-12-20-11-14-14; ...). For convenience only we refer to Y*(xBR,A3b2) and the associated microsatellite haplotype as the won nto' modal haplotype (WMH); it had ten representatives, while the next most frequent haplotype in the won nto' had only two. The modal NRY haplogroup in the non–won nto' social classes was E3a, with a diverse range of NRY types at the microsatellite haplotype level.


A principal coordinates analysis plot (fig. 2) based on a pairwise FST distance matrix calculated using NRY haplogroup frequencies (see table F1 for genetic distances and associated P values) clearly distanced the won nto' from both the other Nso' social classes and other ethnic groups, demonstrating that high frequencies of Y*(xBR,A3b2) are not typical of Grassfields and Tikar Plain NRY profiles. Accordingly, because Y*(xBR,A3b2) is typical of a hunter-gatherer population and the WMH is the most likely candidate to be the NRY type of the father of the first Nso' fon, the NRY data favor the oral tradition that the princess married an indigenous Visale, from whom all subsequent fons descend.

Current Anthropology doi: 10.1086/590119

Sex-Specific Genetic Data Support One of Two Alternative Versions of the Foundation of the Ruling Dynasty of the Nso' in Cameroon

Krishna R. Veeramah et al.


Sex-specific genetic data favor a specific variant of the oral history of the kingdom of Nso' (a Grassfields city-state in Cameroon) in which the royal family traces its descent from a founding ancestress who married into an autochthonous hunter-gatherer group. The distributions of Y chromosome and mitochondrial DNA variation in the Nso' in general and in the ruling dynasty in particular are consistent with specific Nso' marriage practices, suggesting strict conservation of the royal social class along agnatic lines. This study demonstrates the efficacy of using genetics to augment other sources of information (e.g., oral histories, archaeology, and linguistics) when seeking to recover the histories of African peoples.


September 28, 2008

Integrated detection of SNPs and Copy number variation

While SNPs are single-letter changes in the genetic code, copy number variation (CNV) involves the multiplication (or deletion) of entire chunks of DNA. While in a SNP, the allele is a single letter (e.g., C or T), in CNVs, the allele is an integer number of how many copies of the particular chunk of DNA an individual has. What this paper shows is that most human CNVs don't appear to be "fresh" changes but rather old "frozen" changes that are linked to specific SNPs or combinations of SNPs. Practically, this means that a CNV allele can be inferred fairly accurately by looking at SNPs in the region of the chromosome where it occurs.
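
To make the idea of inferring a CNV allele from nearby SNPs a little more concrete, here is a minimal Python sketch. It computes r², a standard measure of linkage disequilibrium, between a SNP and a two-allele copy number polymorphism from phased haplotypes. The function name `ld_r2` and the toy haplotypes are made up for illustration; they are not taken from the paper, and this is not the authors' method.

```python
def ld_r2(haplotypes):
    """r^2 between two diallelic loci, given phased haplotypes.

    Each haplotype is a (snp_allele, cnp_allele) pair coded 0/1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of SNP allele 1
    p_b = sum(b for _, b in haplotypes) / n   # frequency of CNP allele 1
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical data: SNP allele 1 almost always travels with CNP allele 1
# (say, a deletion), so the SNP is a good proxy for the copy number allele.
toy = [(1, 1)] * 9 + [(1, 0)] * 1 + [(0, 0)] * 10
r2 = ld_r2(toy)
print(round(r2, 3))  # high r^2: the SNP "tags" the CNP
```

An r² close to 1 means that knowing the SNP allele on a chromosome tells you almost everything about the copy number allele on the same chromosome, which is the practical point made above.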

Nature Genetics 40, 1166 - 1174 (2008)

Integrated detection and population-genetic analysis of SNPs and copy number variation

Steven A McCarroll et al.


Dissecting the genetic basis of disease risk requires measuring all forms of genetic variation, including SNPs and copy number variants (CNVs), and is enabled by accurate maps of their locations, frequencies and population-genetic properties. We designed a hybrid genotyping array (Affymetrix SNP 6.0) to simultaneously measure 906,600 SNPs and copy number at 1.8 million genomic locations. By characterizing 270 HapMap samples, we developed a map of human CNV (at 2-kb breakpoint resolution) informed by integer genotypes for 1,320 copy number polymorphisms (CNPs) that segregate at an allele frequency >1%. More than 80% of the sequence in previously reported CNV regions fell outside our estimated CNV boundaries, indicating that large (>100 kb) CNVs affect much less of the genome than initially reported. Approximately 80% of observed copy number differences between pairs of individuals were due to common CNPs with an allele frequency >5%, and more than 99% derived from inheritance rather than new mutation. Most common, diallelic CNPs were in strong linkage disequilibrium with SNPs, and most low-frequency CNVs segregated on specific SNP haplotypes.


September 27, 2008

More ASHG 2008 abstracts

The previous batch is here.

Analysis of East Asia Genetic Substructure: Population Differentiation and PCA Clusters Correlate with Geographic Distribution
Accounting for genetic substructure within European populations has been important in reducing type 1 errors in genetic studies of complex disease. As efforts to understand complex genetic disease are expanded to other continental populations, an understanding of genetic substructure within these continents will be useful in the design and execution of association tests. In this study, population differentiation (Fst) and Principal Components Analyses (PCA) are examined using >200K genotypes from multiple populations of East Asian ancestry (total 298 subjects). The population groups included those from the Human Genome Diversity Panel [Cambodian (CAMB), Yi, Daur, Mongolian (MGL), Lahu, Dai, Hezhen, Miaozu, Naxi, Oroqen, She, Tu, Tujia, Naxi, and Xibo], HapMap (CHB and JPT), and East Asian or East Asian American subjects of Vietnamese (VIET), Korean (KOR), Filipino (FIL) and Chinese ancestry. Paired Fst (Weir and Cockerham) showed close relationships between CHB and several large East Asian population groups (CHB/KOR, 0.0019; CHB/JPT, 0.00651; CHB/VIET, 0.0065) with larger separation with FIL (CHB/FIL, 0.014). Low levels of differentiation were also observed between DAI and VIET (0.0045) and between VIET and CAMB (0.0062). Similarly, small Fsts were observed among different presumed Han Chinese populations originating in different regions of mainland China and Taiwan. For PCA, the first two PCs showed a pattern of relationships that closely followed the geographic distribution of the different East Asian populations. For example, the four corner groups were JPT, FIL, CAMB and MGL with the CHB forming the center group, and KOR was between CHB and JPT. Other small ethnic groups were also in rough geographic correlation with their putative origins. 
These studies have also enabled the selection of a subset of East Asian substructure ancestry informative markers(EASTASAIMS) that may be useful for future genetic association studies in reducing type 1 errors and in identifying homogeneous groups.
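
As a rough illustration of what a pairwise Fst like the ones quoted above measures, the following Python sketch computes a simple Wright-style Fst, (H_T - H_S) / H_T, for a single two-allele SNP from the allele frequencies in two populations. This is deliberately not the estimator named in the abstract, which also corrects for sample size and combines many loci; the function name and the frequencies are hypothetical.

```python
def wright_fst(p1, p2):
    """Simple two-population Fst for one biallelic locus, equal weights."""
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if the two populations were pooled
    h_t = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within each population
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        return 0.0  # locus fixed for the same allele in both populations
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies
print(wright_fst(0.50, 0.55))  # similar frequencies: Fst of about 0.0025
print(wright_fst(0.10, 0.90))  # very different frequencies: Fst of about 0.64
```

Values in the 0.002 to 0.014 range, like those reported between East Asian groups, correspond to quite small average allele frequency differences per SNP; they become detectable because hundreds of thousands of SNPs are examined.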

Worldwide Population Structure using SNP Microarray Genotyping
We genotyped 348 individuals sampled from 24 populations world-wide using the Affymetrix 250k NspI microarray chip. For context, we added matching genotypes from 210 HapMap individuals for a total of 250,823 loci genotyped in 543 individuals from 28 populations. We included populations from India and Daghestan to provide detail between the genetic poles of Western Europe, East Asia, and sub-Sahara Africa. With so many markers, principal components analyses reveal genetic differentiation between almost all identified populations in our sample. Northern and southern European populations (FST = 0.004, p <0.01) are statistically distinguishable, as are upper and lower caste groups in India (FST = 0.005, p <0.01). All individuals are accurately classified into continental groups, and even between closely-related populations, genetic- and self-classifications conflict for only a minority of individuals (e.g. ~2% between upper and lower Indian castes; k-means clustering.) As expected, the HapMap CHB+JPT, CEU, and YRI samples are most similar to our east Asian, west European, and African samples, respectively. The HapMap CEU samples and our northern European ancestry samples were both collected from Utah. Although individual samples cannot be reliably classified into their collection of origin, the groups are statistically distinguishable despite their high similarity (FST = 0.0005, n.s.). Our Japanese group is also statistically distinguishable from the HapMap JPT group (FST = 0.006, p <0.01), and in this comparison, most samples can be correctly classified. With such large numbers of genotypes, significant differences can be found even between very similar population samplings. Our results provide guidelines for researchers in selecting suitable control populations for case-control studies.

Frequency distribution and selection in 4 pigmentation genes in Europe
Pigmentation is one of the more obvious forms of variation in humans, particularly in Europeans where one sees more within group variation in hair and eye pigmentation than in the rest of the world. We studied 4 genes (SLC24A5, SLC45A2, OCA2 and MC1R) that are believed to contribute to the pigment phenotypes in Europeans. SLC24A5 has a single functional variant that leads to lighter skin pigmentation. Data on 83 populations worldwide (including 55 from our lab) show the variant (at rs1426654) has almost reached fixation in Europe, Southwest Asia, and North Africa, has moderate to high frequencies (.2-.9) throughout Central Asia, and has frequencies of .1-.3 in East and South Africa. The variant is essentially absent elsewhere. SLC45A2 also has a single functional variant (at rs16891982) associated with light skin pigmentation in Europe. Data on 84 populations worldwide show the light skin allele is nearly fixed in Northern Europe but has lower frequencies in Southern Europe, the Middle East and Northern Africa. In Central Asia the frequency of the SLC45A2 variant declines more quickly than the SLC24A5 variant. It is absent in both East and South Africa. In OCA2 we typed 4 SNPs (rs4778138, rs4778241, rs7495174, rs12913832) with a haplotype associated with blue eyes in Europeans. This haplotype shows a Southeastern to Northwestern pattern in Europe with frequencies of .25 (.05 homozygous) in the Adygei to .85 (.75 homozygous) in the Danes. In MC1R we typed 5 SNPs (rs3212345, rs3212357, rs3212363, C_25958294_10, rs7191944) that cover the entire MC1R gene and found a predominantly European haplotype that ranges in frequency from .35 to .65 in Europe, reaching its highest levels in Southwest Asia and Northwestern Europe. Extended Haplotype Homozygosity (EHH) and normalized Haplosimilarity (nHS) show evidence of selection at SLC24A5 in not only our European and Southwest Asian populations but also our East African populations. 
Neither SLC45A2 nor OCA2 showed evidence of selection in either test. MC1R did not show evidence of selection for our European specific haplotype but we did see some evidence both upstream and downstream in our nHS test in Europe.

Using principal components analysis to identify candidate genes for natural selection.
Genetic markers that differentiate populations are excellent candidates for natural selection due to local adaptation, and may shed light into physiological pathways that underlie disorders with varying frequencies around the world. Principal Components Analysis (PCA) has emerged as a powerful tool for the characterization and analysis of the structure of genomewide datasets. In prior work, we described an algorithm that can be used to select small subsets of genetic markers (SNPs) that correlate well with population structure, as captured by PCA. Our method can be used to detect SNPs that differentiate individuals from different geographic regions, or even neighboring subpopulations. We set out to explore the nature and properties of the genes where population-differentiating SNPs reside, by analyzing the publicly available Human Genome Diversity Panel dataset (650,000 SNPs for 1,043 individuals, 51 populations). Applying our SNP selection algorithms, we chose small subsets of SNPs that almost perfectly reproduce worldwide population structure as identified by PCA. We determined SNP panels both for population differentiation within seven geographic regions, as well as around the globe. We then explored the hypothesis that the selected SNPs attained their current worldwide allele frequency patterns as a response to the pressure of natural selection. Comparing our lists to recently published reports, we found a significant overlap with other genomewide scans for selection, thus validating our hypothesis. For example, EDAR (involved in the development of hair follicles) harbors the most differentiating SNPs in our world-wide panels. SNPs located in genes that are involved in skin and eye pigmentation (OCA2, MYO5C, HERC1, HERC2) are also among the top population differentiating markers. 
In East Asia, SNPs residing at the ADH cluster appear among the most important SNPs for population structure, while, in Europe, the same is true for genes that are involved in immune response to pathogens (CR1, DUOX2, TLR, and HLA). Finally, a comprehensive gene ontology analysis is presented.

September 26, 2008

Central Asian patrilineal populations

On the same topic as the preceding post. I would add that a major cause for the higher informativeness of the human Y-chromosome is the fact that most of the presently dominant lineages in the world are fairly recent, and hence there has been less time for random diffusion of patrilineages on the map to wipe out pre-established patterns of Y-chromosomes associated with archaeologically or historically dominant patriarchal groups.

In the standard model, males are more static, and females more mobile, because they may move fairly long distances to settle in their husband's residence. This factor is counterbalanced, I think, by the excess organized migration of surplus males in societies where there is socio-economic/reproductive inequality.

Thus, the constant individualistic short-range migration of women, coupled with their greater reproductive equality, over long periods of time, evens out the distribution of mtDNA lineages, with the resulting distribution further obscured by (climate-related) selective factors acting on human mtDNA. On the contrary, the Y-chromosome landscape is established by patrilocal males staying by and defending their hearths, but is occasionally punctuated by long-range collective migration of patrilineally related males.

PLoS Genetics doi:10.1371/journal.pgen.1000200

Sex-Specific Genetic Structure and Social Organization in Central Asia: Insights from a Multi-Locus Study

Laure Ségurel et al.


In the last two decades, mitochondrial DNA (mtDNA) and the non-recombining portion of the Y chromosome (NRY) have been extensively used in order to measure the maternally and paternally inherited genetic structure of human populations, and to infer sex-specific demography and history. Most studies converge towards the notion that among populations, women are genetically less structured than men. This has been mainly explained by a higher migration rate of women, due to patrilocality, a tendency for men to stay in their birthplace while women move to their husband's house. Yet, since population differentiation depends upon the product of the effective number of individuals within each deme and the migration rate among demes, differences in male and female effective numbers and sex-biased dispersal have confounding effects on the comparison of genetic structure as measured by uniparentally inherited markers. In this study, we develop a new multi-locus approach to analyze jointly autosomal and X-linked markers in order to aid the understanding of sex-specific contributions to population differentiation. We show that in patrilineal herder groups of Central Asia, in contrast to bilineal agriculturalists, the effective number of women is higher than that of men. We interpret this result, which could not be obtained by the analysis of mtDNA and NRY alone, as the consequence of the social organization of patrilineal populations, in which genetically related men (but not women) tend to cluster together. This study suggests that differences in sex-specific migration rates may not be the only cause of contrasting male and female differentiation in humans, and that differences in effective numbers do matter.
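
The abstract's point that differentiation depends on the product of effective number and migration rate can be illustrated with Wright's textbook island-model approximation, Fst ≈ 1 / (1 + 4Nm), for autosomal loci at equilibrium. The Python sketch below is only that approximation, not the multi-locus method developed in the paper, and the deme sizes and migration rates are hypothetical.

```python
def island_fst(n_e, m):
    """Equilibrium Fst under Wright's island model (autosomal, diploid).

    n_e: effective number of individuals per deme
    m:   fraction of each deme replaced by migrants per generation
    """
    return 1.0 / (1.0 + 4.0 * n_e * m)


# Two hypothetical scenarios with very different deme sizes and migration
# rates, but the same product N*m = 5, give the same expected Fst.
large_low_migration = island_fst(n_e=1000, m=0.005)
small_high_migration = island_fst(n_e=100, m=0.05)
print(large_low_migration, small_high_migration)
```

Because only the product N*m enters the formula, a higher Fst for one sex's markers could reflect less migration, a smaller effective number, or both, which is the confounding the authors set out to untangle.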


Polygyny in human evolution

This paper suggests that polygyny has been a feature of our species for most of its history. The authors arrive at this conclusion by comparing genetic variation in autosomal DNA and X chromosomes.

Autosomal DNA spends an equal amount of time in male and female bodies, while X chromosomes spend twice as long in female as in male bodies. In a polygynous society, many males don't have offspring while most women do. Hence, genetic variation in X chromosomes has a higher chance to arise (more bodies=>more mutations) and to be maintained (more bodies=>less drift).
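
The argument above can be put into numbers with the standard textbook formulas for effective population size when the numbers of breeding males (Nm) and females (Nf) differ: Ne = 4·Nm·Nf / (Nm + Nf) for autosomes and Ne = 9·Nm·Nf / (4·Nm + 2·Nf) for the X chromosome. The Python sketch below assumes that the breeding sex ratio is the only thing that differs between scenarios; the function names and breeding numbers are hypothetical and do not come from the paper.

```python
def ne_autosomes(n_m, n_f):
    """Effective size for autosomal loci with n_m breeding males, n_f females."""
    return 4 * n_m * n_f / (n_m + n_f)


def ne_x(n_m, n_f):
    """Effective size for X-linked loci (two thirds of X copies sit in females)."""
    return 9 * n_m * n_f / (4 * n_m + 2 * n_f)


def x_to_autosome_ratio(n_m, n_f):
    return ne_x(n_m, n_f) / ne_autosomes(n_m, n_f)


# Equal numbers of breeding males and females: the classic expectation of 3/4
print(x_to_autosome_ratio(500, 500))
# Hypothetical polygynous case: one breeding male for every three females
print(x_to_autosome_ratio(250, 750))
```

With equal numbers of breeding males and females the X chromosome is expected to carry three quarters of the autosomal diversity; as fewer males breed, the ratio rises above 0.75, which is the kind of excess X-linked variation the paper reports.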

This ties in quite nicely with my recent suggestion on reproductive inequality for human Y-chromosomes.

Related story in the New Scientist.
Hammer's team discovered more genetic differences in the X chromosome than would be expected if equal numbers of males and females tended to mate, over human history. The only explanation for this pattern is widespread, long-lasting polygyny, he says.

His team's analysis reflects all of human history, and modern monogamy has not even left a blip in our genomes. "I don't know how long monogamy has been with us," Hammer says. "It seems it hasn't been around long, evolutionarily."

PLoS Genetics doi:10.1371/journal.pgen.1000202

Sex-Biased Evolutionary Forces Shape Genomic Patterns of Human Diversity

Michael F. Hammer et al.


Comparisons of levels of variability on the autosomes and X chromosome can be used to test hypotheses about factors influencing patterns of genomic variation. While a tremendous amount of nucleotide sequence data from across the genome is now available for multiple human populations, there has been no systematic effort to examine relative levels of neutral polymorphism on the X chromosome versus autosomes. We analyzed ~210 kb of DNA sequencing data representing 40 independent noncoding regions on the autosomes and X chromosome from each of 90 humans from six geographically diverse populations. We correct for differences in mutation rates between males and females by considering the ratio of within-human diversity to human-orangutan divergence. We find that relative levels of genetic variation are higher than expected on the X chromosome in all six human populations. We test a number of alternative hypotheses to explain the excess polymorphism on the X chromosome, including models of background selection, changes in population size, and sex-specific migration in a structured population. While each of these processes may have a small effect on the relative ratio of X-linked to autosomal diversity, our results point to a systematic difference between the sexes in the variance in reproductive success; namely, the widespread effects of polygyny in human populations. We conclude that factors leading to a lower male versus female effective population size must be considered as important demographic variables in efforts to construct models of human demographic history and for understanding the forces shaping patterns of human genomic variability.


September 25, 2008

ASHG 2008 abstracts

Just a sample of abstracts that I found interesting from the upcoming meeting of the American Society of Human Genetics.

Strong linkage disequilibrium for the frequent GJB2 35delG mutation in the Greek population.
Up to forty percent of autosomal recessive, congenital, severe to profound hearing impairment cases result from mutations in the GJB2 gene. The 35delG mutation accounts for the majority of mutations detected in Caucasian populations and represents one of the most frequent disease mutations identified so far. Some previous studies have assumed that the high frequency of the 35delG mutation reflects the presence of a mutational hot spot, whilst other studies support the theory of a common founder. Greece is amongst the countries presenting the highest frequency of the 35delG mutation (3.5%), and a recent study raised the hypothesis of the origin of this mutation in ancient Greece. We genotyped 60 Greek deafness patients homozygous for the 35delG mutation for six single nucleotide polymorphisms (SNPs) and two microsatellite markers, mapping within or flanking the GJB2 gene, as compared to 60 Greek hearing controls. A strong linkage disequilibrium was found between the 35delG mutation and the DNA markers at distances of 34 kb on the centromeric and 90 kb on the telomeric side of the gene, respectively. A comparison of the present findings with those of a previous study from Belgium, UK and USA, demonstrated a common haplotype reflecting the common founder. Our study supports the hypothesis of a founder effect and we further propose that ethnic groups of Greek ancestry could have propagated the 35delG mutation, as evidenced by historical data beginning from the 15th century BC.

Detection of population substructure among Jews and a north/south gradient within Ashkenazi Jews using 32 STR markers.
Understanding and detecting population substructure are critical issues. Using 32 autosomal STR markers and the program STRUCTURE we demonstrated differentiation between Ashkenazi (AJ) (N=135) and Sephardic (SJ) (N=226) Jewish populations in the form of Northern and Southern European genetic components (AJ north 73%, south 22%, SJ north 32%, south 61%) and a significant relationship between latitude of grandparental country of origin (GCO) and percent north/south genetic component in AJ. Notably, we revealed substructure among Jews (and among European Americans (EA)) using a small STR panel, only when additional samples representing major continental populations (African American, EA, Asian) were included in analyses. Further, negative RIS (-0.035) indicates recent admixture in individuals with both SJ and AJ parents (N=38). RIS is a measure of inbreeding adapted from FIS for STR markers. Negative RIS indicates allelic variation within individuals greater than expected under random mating, i.e., excess heterozygosity due to outbreeding. Although geographic patterns are seen in the average north/south percent assignment values between groups as defined by AJ or SJ, grandparental world region of origin, or GCO, within each group there is high variability among individual assignment values. Thus, even based on data from a small marker set, AJ is not a homogeneous population. The north/south gradient in AJ may be a reflection of the pre-existing north/south gradient in European host populations (recently shown in other studies using large numbers of SNPs) with which Jews admixed slowly. We also demonstrate the utility of including purported parental populations when attempting to detect population substructure within closely related populations.
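
The abstract describes RIS as an adaptation of FIS for STR markers. As a sketch of the underlying idea only (this is plain FIS, not the RIS estimator itself), FIS at one locus can be written as 1 - Ho/He, where Ho is the observed and He the expected heterozygosity. The genotypes below are hypothetical STR repeat counts.

```python
from collections import Counter


def f_is(genotypes):
    """F_IS = 1 - Ho/He for one locus; genotypes are (allele, allele) pairs."""
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals with two different alleles
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under random mating, from allele frequencies
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1 - sum((c / total) ** 2 for c in counts.values())
    if h_exp == 0:
        raise ValueError("locus is monomorphic")
    return 1 - h_obs / h_exp


# Hypothetical sample with more heterozygotes than random mating predicts
outbred = [(10, 12)] * 8 + [(10, 10)] * 1 + [(12, 12)] * 1
print(f_is(outbred))  # negative: excess heterozygosity
```

A negative value, like the RIS of -0.035 reported for individuals with one Sephardic and one Ashkenazi parent, means more heterozygotes than random mating would produce.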

Mutation meltdown of mitochondrial DNA and Neanderthal extinction.
There is emerging evidence that mitochondrial DNA (mtDNA) plays an integral role in the evolution of the human species. Although contentious, recent phylogenetic studies of modern humans implicate genetic variation of mtDNA as a major factor underpinning the climatic adaptation of humans across the globe. Greater sequence diversity in the MTATP6 gene in arctic populations led to the idea that specific mtDNA polymorphisms cause subtle uncoupling of the respiratory chain, with the subsequent generation of additional heat being adaptive in northern climes. Our knowledge of mtDNA and its effect on adaptability may help us to understand how modern humans have survived their early ancestors. Here, we characterise the mtDNA of one of these extinct hominids. Neanderthals are the closest hominid relatives of modern humans, with whom they coexisted in Europe and western Asia until 30,000 years ago. Recently, over 1 Mb of DNA was successfully extracted and characterised from the Vi-80 Neanderthal fossil. We reanalysed 2,705 base pairs of mtDNA in order to examine the hypothesis that mitochondrial dysfunction contributed to the Neanderthals' demise. We identified thirty-two nucleotide differences from the modern human mtDNA reference sequence. Treating the Vi-80 as a diagnostic sample leads us to the conclusion that there are sequence variants that are highly likely to be artifacts, and that a large proportion of the remaining mutations could be due to nuclear pseudogene amplification. We did identify a potentially deleterious variation; however, more study may be needed to ascertain the effect of mitochondrial dysfunction on Neanderthal survival.

Early Siberian Maternal Lineages in the Tubalar of Northeastern Altai Inferred from High-Resolution Mitochondrial DNA Analysis
At the height of the last glaciation (~18 kya) Siberians were confined to the southern strongholds, which were areas of continuous occupation, and where immediate ancestors of the Uralic, Kettic and Altaian language groups differentiated. To better understand the evolutionary relationships between the earlier and contemporary Siberians, we focused on the northern Altaic prehistory preserved in the mtDNA diversity of the Tubalar, until recently representing a typical hunting-gathering population. The present study includes 139 Tubalar. All mtDNAs were subjected to high-resolution SNP analysis, followed by complete sequencing of selected mtDNA samples. We showed that the core of the Tubalar genetic makeup is a mixture of west (H8, U4b, U5a1, and X2e) and east Eurasian (A and B1) haplogroups derived from macrohaplogroup N, and Siberian derivatives of the macrohaplogroup M identifiable by subhaplogroup-specific mutations. For example, among the 36 Tubalar mtDNA samples that belong to haplogroup D, 10 (28%) harbored diagnostic markers of the subhaplogroup D3a2a shared with the Chukchi and Eskimos. We attribute this finding, verified at the complete sequence level, to an ancient link between early Siberians, who underwent pronounced differentiation in the Altai-Sayan region, and some of the Eskimo tribes. A comparison of the mtDNA data generated through the course of this study with published complete sequences has contributed substantially to a parsimonious phylogenetic structure of mtDNA evolution in west Siberia. Specifically, northeastern Altai appears to be a good candidate for the ancestral homeland of the haplogroup U4b, which is apparently ancient European. For some haplogroups, such as X2e, a relatively recent arrival in the Altai region is more likely.

Sex-specific gene flow between Pygmy and non-Pygmy populations
Cultural traditions and preferences may drive sex-specific gene flow among human populations. We have examined sex-specific gene flow between Mbuti Pygmies, a hunter-gatherer population, and surrounding agriculturist groups, the Alur, Hema, and Nande, which all reside in Central Africa. We used 18 lineage-defining Y chromosome SNPs and HVS1 mitochondrial DNA sequence information to examine patterns of gene flow among these groups. Mbuti Pygmy males have more diverse Y chromosome lineages (Mbuti Pygmy [n = 28]: 0.229; Alur [n = 10]: 0.193; Hema [n = 18]: 0.178; Nande [n = 15]: 0.090) and slightly less mtDNA diversity than neighboring groups (0.020, 0.023, 0.025, 0.022 in Mbuti Pygmy, Alur, Hema, and Nande groups, respectively). The majority of Mbuti Pygmy males have a Y haplotype characteristic of Mbuti Pygmies (B2b); however, more than 30% of Pygmy males exhibit Y haplotypes associated with Bantu-speaking agricultural populations (E3a lineage). Conversely, no agriculturist males exhibit Y haplogroups associated with Mbuti Pygmy populations but instead have derived Y haplogroups characteristic of Bantu agriculturalists (E2, E3a). Pairwise FST was calculated among all populations using Y haplogroup frequency and HVS1 mtDNA sequence data. YDNA and mtDNA FST values between Mbuti Pygmy and non-Pygmy groups (Alur, Hema, and Nande) were 0.278, 0.355, and 0.217 (for YDNA) and 0.088, 0.239 and 0.217 (for mtDNA), respectively. A Mantel test between pairwise FST matrices showed no significant correlation (r = 0.27; p 0.35), which indicates that patterns of genetic differentiation differ between Y chromosome SNPs and mtDNA sequence patterns. These results also suggest no emigration of Mbuti Pygmy Y chromosomes into surrounding groups but immigration of non-Mbuti Pygmy Y chromosomes into the Mbuti Pygmy population.

Population Structure in Mongolia from a Mitochondrial DNA Perspective.
Mongolia has experienced a complex series of demographic movements over the past 10-20 millennia that have shaped the patterns of its modern human genetic variation. However, modern populations in Mongolia have not been extensively studied for DNA diversity, nor has the genetic contribution of Mongolians to the gene pools of contemporary populations in Southeast Asia and Oceania been fully resolved. Archaeological evidence from as early as the late Neolithic suggests the presence of both West and East Eurasian cultures in this region. Later demographic movements involving the emergence of the Mongolian and later Manchu Empires have further convoluted Mongolia's population structure. To clarify the complex population history of Mongolia, we analyzed variation in the mtDNAs of 190 individuals from several Mongolian ethnic groups, including the Uriankhai, Zakhchin, Derbet, Khoton and Khalkha. We screened all samples for phylogenetically informative coding region SNPs and sequenced HVSI to assess control region variation in them. Our data suggest that the mtDNA diversity present in our population is consistent with the general pattern of variation observed in East Asia, with the most frequent haplogroups being C, D and G. Haplogroup variation in Mongolian ethnic groups reveals considerable maternal diversity with a predominance of basal M types. Interestingly, the Mongolians also possessed West Eurasian haplogroups, such as H, J and K, which are not commonly observed in East Asia, even at low frequencies. The main ethnic group in Mongolia, the Khalkha, was highly variable with respect to mtDNA haplotypes in comparison with the other ethnic groups, and clearly distinct from the Khoton and Zakhchin, as evidenced by distance measures. Overall, these data provide insights into the origins and affinities of these populations, their relationships with East Asian groups and neighboring Turkic speaking groups, including indigenous Altaians, and their possible role in the peopling of the Americas.

Allocation of YSTR Microvariant Alleles to Y-Chromosome Binary Haplogroups.
Y-chromosome short tandem repeat (YSTR) loci are used extensively in studies of population substructure, temporality of population dynamics, and forensic identification. The occurrence of non-consensus YSTR alleles, such as unusually short alleles or partial insertion/deletion events (microvariants), have been used successfully as indicators of common ancestry among YSTR haplotypes, exposing further levels of phylogenetic substructure with restricted geographic distributions. However, the high variability of STR loci can potentially lead to false associations due to homoplasy (ie, recurrent mutation). Thus, YSTR haplotypes are best interpreted within the context of the binary marker defined Y-chromosome phylogeny. To identify YSTR microvariant alleles potentially useful for elucidating further phylogenetic substructure within binary haplogroups, we have assessed the haplogroup affiliation of microvariant alleles found at informative frequencies in public YSTR databases for the following YSTR loci: DYS385, DYS392, DYS441, DYS446, DYS447, DYS449 and DYS464. We report haplogroup affiliations for each variant allele and geographic origins of representative samples.

L1c2a, the (African) Haplogroup With The Longest Mitochondrial Genome!
Haplotypes derived from the maternally-inherited mitochondrial DNA (mtDNA) control region are often employed as a first step in determining phylogenetic-relevant samples that could be selected for additional coding region testing. Using the currently defined world mtDNA haplogroup tree, researchers can assign these haplotypes to specific branches, paying particular attention to novel mutations that could assist in identifying new subclades. During a recent survey of the nearly 58000 mtDNA control region haplotypes currently present in the publicly accessible Sorenson Molecular Genealogy Foundation database, we observed a small number of mtDNAs (n=16) characterized by the presence of unusually long insertions of up to 200 bases. A small subset of these particularly long mtDNA haplotypes shared an identical insertion of 15 bases. Genealogical analysis combined with haplogroup prediction confirmed that these haplotypes shared a common African origin. Additionally, based on the pedigree data gathered, we determine the donors were not closely related. Moreover, through the analysis of complete mtDNA sequences, we conclude that the newly defined haplogroup is most likely of recent origin. As reported in this study, insertions of more than 10 bps are quite rare in the general population and in the published literature, thus providing an interesting case work in population and possibly future disease studies.

Mitochondrial DNA footprints in modern Mongolia.
Although Mongolia is one of the most sparsely populated countries in the world, it is located at a pivotal crossroad between the four corners of Asia (including the well-known Silk Road) and has been characterized throughout history by events that greatly added to its current cultural and ethnic diversity. Among these, perhaps one of the most significant was the ambitious expansion strategy employed by Mongolia's most prominent personality, Genghis Khan, whose empire eventually stretched across all of modern-day China, a portion of modern Russia, Southern Asia, Eastern Europe and the Middle East. In 2007, through a well-planned collection effort, researchers at the Sorenson Molecular Genealogy Foundation and the National University of Mongolia were able to gather over 3,000 DNA samples, informed consents, and genealogical data throughout the country of Mongolia, including samples from 21 distinct tribal or ethnic populations. All the samples were sequenced for the three hypervariable segments of the mitochondrial DNA (mtDNA) control region to assess the genetic composition of modern Mongolia. The most common mtDNA haplotypes are typical of haplogroup C, which is frequent throughout Eastern Asia. However, nearly 40% of the observed mtDNA lineages are of Western Eurasian origin, including a significant frequency (~7%) of haplogroup H - the most common in Europe. The high prevalence of Western Eurasian lineages could be a remnant from Genghis Khan's conquering efforts, trade and cultural exchanges along the Silk Route. To assess the extent of recent gene flow that could account for the elevated levels of Eurasian haplogroups within Mongolian populations, we have examined genealogical data of samples representative of Western Eurasian haplogroups.

Y chromosome microsatellite haplotypes in the Hutterite founders.
The current population of >12,000 Schmiedeleut Hutterites are descendants of 38 male founders who were born between 1700 and 1830 in Europe. Only 12 of these founders, each with a unique surname, have living male descendants related through male-only lineages. DNA samples were available in our laboratory for 75 male descendants of 11 of the 12 founders, accounting for 673 independent paternal meioses. We genotyped 9 microsatellite loci, which included a mean of 6.8 (range 2-23) males per lineage to evaluate potential relationships between the founders. Fourteen different haplotypes were identified, with an average of 3.5 (range 1-8) pairwise differences between haplotypes. All descendants within each of 9 lineages had identical Y haplotypes. Descendants of two of these lineages, 2 and 10, had the same haplotype despite different surnames, suggesting possible relatedness between the founders of these two lineages. Descendants of two lineages, 6 and 11, each carried three distinct haplotypes. Within each of these lineages the haplotypes differed from the ancestral haplotype by one repeat size at two loci. Additional male descendants in lineages 6 and 11 were then genotyped for the discrepant microsatellites, confirming the presence of three Y haplotypes each in lineages 6 and 11. One mutation arose at each of four loci: DYS388, DYS389II, DYS390, DYS393. Three mutations were gains of one repeat; it was not possible to determine if the fourth mutation was a gain or loss of one repeat. The ancestral haplotypes in these two lineages are identical at four microsatellite loci; the alleles at the other five loci differ by one repeat size. The average mutation rate at these 9 loci was 0.00066 (95% CI 0.00015-0.0013), similar to other estimates. These data suggest that the founders of lineages 2 and 10 may have been related through paternal lines and that surnames do not strictly correspond to unique Y chromosomes. Moreover, certain ancestral haplotypes (i.e., those in lineages 6 and 11) may be more prone to mutation. Supported by NIH grants HD21244 and HL085197.
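
The reported average mutation rate can be checked from the figures in the abstract. The sketch below assumes the rate is a simple count of mutations divided by the number of locus transmissions (meioses times loci); the abstract does not say how the authors computed it, and the function name is made up for illustration.

```python
# Figures taken from the abstract above.
meioses = 673    # independent paternal meioses
loci = 9         # microsatellite loci genotyped
mutations = 4    # one each at DYS388, DYS389II, DYS390, DYS393


def mutation_rate(n_mutations, n_meioses, n_loci):
    """Mutations per locus per meiosis (simple count-based estimate)."""
    return n_mutations / (n_meioses * n_loci)


rate = mutation_rate(mutations, meioses, loci)
print(f"{rate:.5f}")  # prints 0.00066
```

Four mutations in 673 x 9 = 6057 locus transmissions gives about 0.00066, which matches the quoted point estimate.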

Genetic History of human populations of East Africa inferred from mtDNA and Y chromosome analyses.

Evidence from genetic, paleobiological, and archaeological studies suggest that Africa, especially East Africa, is most likely to be the cradle of the modern human species. Despite this fact, very little is currently known about genetic diversity in African populations in general, and East African populations in particular. Genetic data demonstrate that the patterns of genetic variation in East African populations are complex. All four major language families spoken in Africa (Afro-Asiatic, Nilo-Saharan, Niger-Kordofanian, and Khoisan) are found in the region. As part of a large study of population genetic diversity of East and Northeast Africa, we examined Y chromosome genetic diversity (to ascertain paternal lineages) as well as mitochondrial genetic diversity (to ascertain maternal lineages) in 1200 - 1500 individuals from ~ 40 Tanzanian, Sudanese, and Kenyan populations. For the Y chromosome analysis, we genotyped 60 UEPs (analyzed in a hierarchical manner to construct haplotypes) in a total of ~1500 male individuals. In order to infer ages of lineages and migration patterns, we further genotyped the individuals for 16 Y chromosome microsatellites. For the mtDNA analysis, we sequenced the mitochondrial D-loop in a total of 1200 individuals from the same populations, and for 200 individuals, we did complete mitochondrial genome sequencing. We compare our results with published results of studies from other parts of Africa and the Middle East. Our results indicate that East African populations have some of the most ancestral Y chromosome and mtDNA lineages in Africa, suggesting that they may have been an ancient source of dispersion throughout Africa. Additionally, we find evidence for ancient geneflow between East Africa and the Middle East. We also ascertained the effect of the Bantu-expansion and signature of recent migration of Cushitic-speaking groups originating from Ethiopia on peopling of East Africa.

Analysis of mtDNA and Y-chromosome haplogroups in Mexican Mestizos and Amerindian groups.
The Mexican population is composed mainly of Mestizos, individuals with a genetic background consisting of Amerindian, European and African contributions. Genetic heterogeneity in Mexicans results from a complex demographic history that started with the peopling of North and Central America about 15,000 yrs ago, including the settlement of at least 60 different indigenous groups in Mexico, regional differences in admixture dynamics after colonization by Spaniards in the XVI century, epidemics and migration. Y chromosome-specific and mitochondrial (mt) DNA polymorphisms are useful to help understand the genetic structure and history of human populations, due to their uniparental inheritance and lack of recombination. In order to refine the portrait of genetic variability derived from the Mexican Genome Diversity Project, we are characterizing maternal and paternal lineages participating in admixture. For this we included genotypic data from 163 mt SNPs and 123 Y chromosome SNPs present in the Illumina Human1M chip of 450 individuals, 300 mestizos from six states located in different regions: Northern, Central and Southern; and 150 individuals from different Amerindian groups (Tepehuanes, Zapotecos and Mayas). With this information, we are measuring genetic diversity using Fst and AMOVA analysis. Admixture analysis includes average and individual ancestral contribution estimates using autosomal SNPs. Initial results show that in our Mestizo sample, 88% of the mt haplogroups are Amerindian (A, B, C or D), and the rest includes European and African lineages. We have identified differences in proportions of each haplogroup in both Mestizos and Amerindians. Knowledge about the distribution of mt and Y-chromosome haplogroups in Mexican Mestizos and Amerindian groups will generate valuable information to better understand genetic relationships between Mexicans and other Latin American populations. In addition, it may help strengthen analysis in association studies of common complex diseases.

The origin of Native Americans from a mitochondrial DNA viewpoint.
America, the last continent to be colonized by modern humans, is characterized by an extraordinary linguistic and cultural diversity. Until recently, it was generally believed that starting around 13,500 years ago, the first Paleo-Indians arrived from Beringia, passing through an interior ice-free corridor in western North America, and spread rapidly all the way to Tierra del Fuego. Today, we realize that the peopling of the Americas involved a much more complex process. As for the maternally transmitted mitochondrial DNA (mtDNA), it has been clear since the early nineties that Native Americans could be traced back to four major maternal lineages (haplogroups) of Asian affinity. These were initially named A, B, C and D, and are now termed A2, B2, C1 and D1. More than 95% of living Native Americans belong to these four haplogroups, which can be considered pan-American, because they are shared by North, Central and South American populations. Later, five additional maternal lineages were discovered and named X2a, D2, D3, C4c, and D4h3. These less common or rare haplogroups are restricted only to some Native American populations or geographic areas and bring the overall number of Native American mtDNA lineages to nine. Our comprehensive overview of the four pan-American branches of the mtDNA tree suggests a scenario with a human entry and spread into the Americas from Beringia about 20,000 years ago, and preliminary data raise the possibility that the uncommon five Native American haplogroups might have marked additional migratory events from Asia or Beringia. Overall, through a combined analysis of modern and ancient Native American mtDNA, we are making an effort for reconstructing the complex pre-Columbian history at both macro- and micro-geographic levels.

Identifying genes affecting normal variation in human facial features using admixed populations.
Seven selection-nominated candidate genes (COL11A1, LMNA, FGFR1, FGFR2, TRPS, BRAF, FLNA) known to be involved in Mendelian craniofacial dysmorphologies and to have high allele frequency differences between West African and European populations were tested for admixture linkage to normal facial feature traits. The sample consists of 254 subjects (n=131 African Americans, n=123 Brazilians) of West African and European genetic ancestry. Each individual was genotyped at 176 ancestry informative markers (AIMs), which allowed for proportional estimation of genetic ancestry from four parental populations and adjustments for admixture stratification.
3D images of faces were acquired using the 3dMDface imaging system. 3D coordinate data were collected from 22 landmarks placed on each image using the 3dMDPatient software. The 231 possible pairwise landmark distances were scaled to the geometric mean and then analyzed using Euclidean Distance Matrix Analysis.
We used both ANOVA and ADMIXMAP to control for admixture stratification and to test for associations between the 231 pairwise landmark distances and 183 AIMs, using sex, height and BMI as covariates. We used a four-population model (West African, European, East Asian, and Native American).
There is a strong concordance between the ANOVA and ADMIXMAP results. Many landmark distances, particularly on the mouth and nose, were significantly associated with genetic ancestry. Additionally, three of the candidate genes show no effects on pairwise landmark distances while four show distinct patterns of association. For example, FGFR2 is associated primarily with the length of the face. These results represent the first identification of genes affecting normal variation in facial features.
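
The figure of 231 distances follows from 22 landmarks: there are 22 x 21 / 2 = 231 unordered pairs. The sketch below illustrates that count and one plausible reading of "scaled to the geometric mean", namely dividing each distance by the geometric mean of all distances for that face. That reading is an assumption, the landmark coordinates are made up, and this is not the 3dMDPatient or Euclidean Distance Matrix Analysis software.

```python
import math
from itertools import combinations


def pairwise_distances(landmarks):
    """Euclidean distance for every unordered pair of 3D landmarks."""
    return [math.dist(a, b) for a, b in combinations(landmarks, 2)]


def scale_by_geometric_mean(distances):
    """Divide each distance by the geometric mean of all distances."""
    log_mean = sum(math.log(d) for d in distances) / len(distances)
    geometric_mean = math.exp(log_mean)
    return [d / geometric_mean for d in distances]


# Hypothetical data: 22 made-up, distinct 3D points (not real face landmarks).
landmarks = [(float(i), float(i * i % 7), float(i % 3)) for i in range(22)]

distances = pairwise_distances(landmarks)
scaled = scale_by_geometric_mean(distances)
print(len(distances))  # prints 231
```

After this scaling the distances are size-adjusted: their geometric mean is 1 for every face, so faces of different overall size can be compared on shape.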

Ethnicity-Confirmed Genetic Structure in New Hampshire.
Genetic population structure is known to result from shared ancestry. Though there have been several studies of genetic structure within and among different geographic regions and ethnic groups, little is known of the genetic structure of highly admixed US populations or whether the structure is concordant with self-reported ancestry. In this study, 1529 single nucleotide polymorphisms (SNPs) from 864 healthy control individuals from New Hampshire were measured as part of a bladder cancer epidemiology study. The SNPs were from approximately 500 cancer susceptibility genes scattered throughout the genome. Of these, 960 Tag SNPs were used to cluster individuals using the Structure algorithm for between 2 and 5 subpopulations. Subtle genetic structure was found, suggesting the appropriate number of subpopulations to be either 4 or 5 (FSTs 4 populations: 0.0377, 0.0399, 0.0363, 0.0340; 5 populations: 0.0452, 0.0536, 0.0585, 0.0534, 0.0521). We coded the individuals' self-reported ancestries in a genotype fashion (i.e. 0 = not reporting that ancestry, 1 = reporting partly that ancestry, 2 = reporting only that ancestry) and conducted a Spearman's rank correlation between each ancestry and the structure q value, which represents the proportion of an individual that originated from a certain genetic subpopulation. Those of Russian, Polish and Lithuanian ancestry most consistently clustered together. The ancestry results support either 4 or 5 subpopulations. In order to investigate linkage disequilibrium (LD), the complete set of SNPs from the 7 most densely genotyped genes were used to make haploview plots between the different groups. The results vary by gene, though for one gene in particular, GHR, the results are very different for 4 subpopulations. These results suggest that despite New Hampshire's admixture and presumed homogeneity, there are 4 or 5 distinct genetic subgroups within the population that can be linked to self-reported ancestry and display differences in patterns of LD.
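
The 0/1/2 ancestry coding and the rank correlation described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the individuals, their reported ancestries and the q values are all made up, and the helper names are hypothetical.

```python
import math


def code_ancestry(reported, ancestry):
    """0 = ancestry not reported, 1 = reported among others, 2 = the only one reported."""
    if ancestry not in reported:
        return 0
    return 2 if len(reported) == 1 else 1


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical individuals: self-reported ancestries and a made-up q value
# (membership proportion in one genetic cluster) for each.
reported = [{"Polish"}, {"Polish", "Irish"}, {"Irish"}, {"Polish"}, {"English"}]
q_values = [0.9, 0.5, 0.1, 0.8, 0.2]

codes = [code_ancestry(r, "Polish") for r in reported]
rho = spearman(codes, q_values)
print(codes, round(rho, 3))
```

Because the coded variable has only three levels, ties are unavoidable, which is why the ranks are averaged within tied groups before correlating.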

Inference of human demographic parameters using haplotype patterns from genome-wide SNP data.
Accurate inference of human demographic history from genetic data is essential for identification of single nucleotide polymorphism (SNP) association with disease and for inference of natural selection. Haplotype diversity and haplotype sharing carry additional demographic information to that obtainable from SNP frequency spectra, and so we propose a novel method using haplotype summary statistics to fit demographic models to genome-wide SNP data. We divide the genome into 0.25 cM windows and for each we tabulate the number of distinct haplotypes and the frequency of the most common haplotype. We summarize the data by the genome-wide joint distribution of these two statistics. Coalescent simulations are then used to evaluate whether different demographic models are compatible with the observed data. Application of our method to simulated data shows that our method can reliably infer parameters from complex demographic models (such as bottlenecks) and is relatively robust to the levels of SNP ascertainment bias found in many genome-wide datasets. We have applied our method to data collected by the International HapMap Consortium and find that a bottleneck model best fits the CEU population. We have also analyzed a large dataset consisting of Affymetrix 500k data from ~2,900 individuals with ancestry from Taiwan, Japan, India, Mexico and many European countries. Since this dataset includes ~2,300 European individuals, we are able to study haplotype patterns at a fine scale within Europe. Interestingly, we find that within Europe there is a south-to-north gradient with decreasing levels of haplotype diversity moving north, consistent with south to north migrations. We also find that the southwestern European sample has higher haplotype diversity than the southeastern European sample. Additionally, a higher proportion of haplotypes are shared between the southwestern European sample and the Yoruba sample than between the southeastern European sample and the Yoruba sample. These two patterns are consistent with recent admixture across the Mediterranean from Northern Africa.
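
The two per-window summary statistics named in the abstract (number of distinct haplotypes, frequency of the most common haplotype) and their joint distribution can be sketched as below. This is only an illustration with made-up haplotype strings and hypothetical function names; the 0.25 cM windowing and the coalescent simulations are not reproduced.

```python
from collections import Counter


def window_stats(haplotypes):
    """Return (number of distinct haplotypes, frequency of the most common one)."""
    counts = Counter(haplotypes)
    top_count = counts.most_common(1)[0][1]
    return len(counts), top_count / len(haplotypes)


def joint_distribution(windows):
    """Count how often each (distinct count, top frequency) pair occurs across windows."""
    return Counter(window_stats(w) for w in windows)


# Hypothetical data: three windows, four phased haplotypes each,
# every haplotype written as a string of 0/1 alleles.
windows = [
    ["0101", "0101", "0110", "1001"],
    ["0000", "0000", "0000", "1111"],
    ["0101", "0101", "0110", "1001"],
]

joint = joint_distribution(windows)
print(joint[(3, 0.5)], joint[(2, 0.75)])  # prints 2 1
```

In a model-fitting setting, the same table would be computed for simulated data under each candidate demographic model and compared with the observed one.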

Genome wide analysis and heritability estimation of intelligence in the International Multi-centre ADHD Genetics (IMAGE) study.
Attention-Deficit/Hyperactivity Disorder (ADHD) is a neurodevelopmental disorder characterised by symptoms of inattention, hyperactivity and impulsivity. There is growing evidence of heterogeneity in its etiology, pathophysiology and clinical expression. One approach to resolving heterogeneity involves the identification of endophenotypes, intervening variables that might mediate pathways between specific genes and clinical phenotype. IQ is a candidate endophenotype for ADHD. Genome-wide linkage analyses of full scale IQ and IQ subscales were performed in the International Multi-centre ADHD Genetics (IMAGE) study including 1094 families with 1094 DSM-IV combined type ADHD probands and their 1441 siblings (unselected for ADHD status). IQ was measured using five subscales of the WISC-IIIR scale. The full scale prorated IQ score and the five subscales were used as quantitative traits for linkage analysis. 5,407 autosomal SNPs were used to run multipoint regression-based linkage analyses using MERLIN. The h2 estimates from the IQ subscales and the full IQ score ranged from 31% to 100%. Three suggestive linkage signals were found (LOD scores 2, p values 0.001) on chromosomes 7, 9 and 14 for three different subscales. Previously, two regions on chromosomes 7 and 14 were reported as being associated or linked to IQ. Our results, though only suggestive, suggest the presence of additional genetic variants contributing to the variance of IQ in ADHD.

Population structure in Japan with 140k SNPs

After the many recent studies on fine-scale genetic ancestry in Europe, a new paper investigates population structure in Japan using 140k SNPs. From the paper:
Our present study has clearly shown, on the basis of analysis of genome-wide SNP genotypes that most Japanese individuals fall into two main clusters: the Hondo cluster and the Ryukyu cluster. Our results also show that local regions in Honshu Island (the largest island of Japan) are still genetically differentiated, even though human migration within Japan has become rather frequent in the past 100 years or so. Our finding that the individuals from Tohoku were less related to Han-Chinese individuals than were the individuals from Kinki and Kyushu suggests that the individuals in Tohoku were less affected by immigrants from the Asian continent than were the individuals in Kinki. The immigrants who came to Japan from the Asian continent through the Korean Peninsula may have entered Japan from northern Kyushu, the Japan Sea side of Kinki or Chugoku.

American Journal of Human Genetics doi:10.1016/j.ajhg.2008.08.019

Japanese Population Structure, Based on SNP Genotypes from 7003 Individuals Compared to Other Ethnic Groups: Effects on Population-Based Association Studies

Yumi Yamaguchi-Kabata et al.


Because population stratification can cause spurious associations in case-control studies, understanding the population structure is important. Here, we examined Japanese population structure by “Eigenanalysis,” using the genotypes for 140,387 SNPs in 7003 Japanese individuals, along with 60 European, 60 African, and 90 East-Asian individuals, in the HapMap project. Most Japanese individuals fell into two main clusters, Hondo and Ryukyu; the Hondo cluster includes most of the individuals from the main islands in Japan, and the Ryukyu cluster includes most of the individuals from Okinawa. The SNPs with the greatest frequency differences between the Hondo and Ryukyu clusters were found in the HLA region in chromosome 6. The nonsynonymous SNPs with the greatest frequency differences between the Hondo and Ryukyu clusters were the Val/Ala polymorphism (rs3827760) in the EDAR gene, associated with hair thickness, and the Gly/Ala polymorphism (rs17822931) in the ABCC11 gene, associated with ear-wax type. Genetic differentiation was observed, even among different regions in Honshu Island, the largest island of Japan. Simulation studies showed that the inclusion of different proportions of individuals from different regions of Japan in case and control groups can lead to an inflated rate of false-positive results when the sample sizes are large.


Varki et al. (2008) on Human Uniqueness in Nature Reviews Genetics

From the paper:
Remarkable similarities of known human and chimpanzee protein sequences initially led to the suggestion that significant differences might be primarily in gene and protein expression, rather than protein structure6. Further analysis of alignable non-coding sequences affirmed this ~1% difference. However, the subsequent identification of non-alignable sequences that were due to small- and large-scale segmental deletions and duplications21–23 showed that the overall difference between the two genomes is actually ~4%.


Why are coding-sequence changes in brain genes under a larger degree of purifying selection than in other tissues? The reason for this is not immediately clear as a wide range of brain function supports life to reproductive age in humans.


But this notion, which is based on single nucleotide changes in protein-coding sequence, has to be reconciled with the CNV data, because CNVs in humans seem to be enriched among genes involved in neurodevelopmental processes.


However, connecting such genes involved in disorders of human cognition to the specific phenotypes undergoing selection poses significant challenges. A salient example involves two genes, abnormal spindle homologue microcephaly associated (ASPM) and microcephalin (MCPH1); the adaptive evolution of these genes in humans was claimed to be related to normal variation in brain size, on the basis of the fact that Mendelian mutations in each results in microcephaly in humans152,153. However, not all investigators have found evidence for the adaptive evolution of ASPM or MCPH1 (ref. 154). Also, neither gene is likely to contribute significantly to normal variation in human brain size155. This case illustrates the challenges of interpreting genetic data in the face of complex phenotypes, especially those that are poorly understood.

Nature Reviews Genetics doi:10.1038/nrg2428

Human uniqueness: genome interactions with environment, behaviour and culture

Ajit Varki et al.


What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, 'anthropogeny' (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any 'genes versus environment' dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture — perhaps relaxing allowable thresholds for large-scale genomic diversity.


Deshpande et al. (2008) on Out of Africa

This is an important new paper which adds some complexity to the Out of Africa theory. Much existing work has focused on a "tree-like" story of the emergence of modern humans, with an African source population at the root, and other populations being less diverse the further they are (geographically) from the source.

This new model is not limited to colonization, i.e., the movement of a subset of a territory's population into a new uninhabited territory, but also includes "lateral" gene exchange between pre-established populations.

From the paper:
Unlike previous models, ours separated colonization events from the continued exchange of people between occupied territories. Our estimates of the exchange rate between neighbouring populations were very low (below 0.01), with carrying capacities ranging from approximately 600 to 1200. Assuming that the census size is three times this effective population size, we derive a census size of approximately 1800–3600 people in each deme. Since each deme has dimensions of 125x125 km, this corresponds to a population density of approximately 0.11–0.23 persons km-2, well within the range for hunter–gatherers referred to by Liu et al. (2006).
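
The arithmetic in this passage is easy to verify, and doing so shows that the density is per square kilometre. The sketch below uses only the figures quoted above; the variable and function names are made up for illustration.

```python
# Figures quoted from the paper.
effective_sizes = (600, 1200)   # carrying capacity (effective population size) per deme
census_ratio = 3                # census size assumed to be 3x the effective size
deme_side_km = 125              # each deme is 125 x 125 km


def density_per_km2(effective_size, ratio, side_km):
    """Census population divided by deme area in square kilometres."""
    return effective_size * ratio / (side_km * side_km)


low, high = (density_per_km2(n, census_ratio, deme_side_km) for n in effective_sizes)
print(f"{low:.4f} {high:.4f}")  # prints 0.1152 0.2304
```

So 1800 to 3600 people spread over 15,625 square kilometres gives roughly 0.12 to 0.23 persons per square kilometre, in line with the quoted range.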

Related: Geographic and genetic distance in human populations, A Geographically Explicit Genetic Model of Worldwide Human-Settlement History

Proceedings of the Royal Society B doi: 10.1098/rspb.2008.0750

A serial founder effect model for human settlement out of Africa

Omkar Deshpande, Serafim Batzoglou, Marcus W. Feldman, L. Luca Cavalli-Sforza


The increasing abundance of human genetic data has shown that the geographical patterns of worldwide genetic diversity are best explained by human expansion out of Africa. This expansion is modelled well by prolonged migration from a single origin in Africa with multiple subsequent serial founding events. We discuss a new simulation model for the serial founder effect out of Africa and compare it with results from previous studies. Unlike previous models, we distinguish colonization events from the continued exchange of people between occupied territories as a result of mating. We conduct a search through parameter space to estimate the range of parameter values that best explain key statistics from published data on worldwide variation in microsatellites. The range of parameters we use is chosen to be compatible with an out-of-Africa migration at 50–60Kyr ago and archaeo–ethno–demographic information. In addition to a colonization rate of 0.09–0.18, for an acceptable fit to the published microsatellite data, incorporation into existing models of exchange between neighbouring populations is essential, but at a very low rate. A linear decay of genetic diversity with geographical distance from the origin of expansion could apply to any species, especially if it moved recently into new geographical niches.
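
To see why serial founding events produce a steady decay of diversity with distance, a textbook approximation helps: one generation at a diploid population size of N reduces expected heterozygosity by a factor of (1 - 1/(2N)). The sketch below chains such events. It is not the authors' simulation model (which also includes exchange between neighbouring demes), and the starting heterozygosity, founder size and number of events are made-up values.

```python
def heterozygosity_after_founding(h0, founders, events):
    """Expected heterozygosity after 0..events founder bottlenecks.

    Each founding event is treated as a single generation at diploid
    size `founders`, which multiplies heterozygosity by (1 - 1/(2N)).
    """
    factor = 1 - 1 / (2 * founders)
    return [h0 * factor ** k for k in range(events + 1)]


# Hypothetical values, chosen only for illustration.
series = heterozygosity_after_founding(h0=0.8, founders=50, events=10)
for step, h in enumerate(series):
    print(step, round(h, 4))
```

The decay is geometric, but when 1/(2N) is small it is close to a straight line over a modest number of steps, which is consistent with the roughly linear decline described in the abstract.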


September 24, 2008

The Byzantine origin of clinical geriatrics

Wien Med Wochenschr. 2008;158(17-18):471-80.

[Why should Byzantium be considered as a cradle of clinical geriatrics?]

Lapin A.


Generally, roots of today's medical ethics are thought to have sprouted from antiquity and from classical Hebraic consciousness, while the origin of hospital medicine and institutional nursing of the elderly was assumed to lie in the Middle Ages and in modern times, respectively. But even between these two periods, notably in Byzantium (324-1453) there were many famous physicians working with surprising skills in many disciplines such as surgery and ophthalmology. The most important achievement of that time, however, was in public health care. Following the Christian ideal of philanthropy, numerous hospitals (nosokomeia), hospices (xenodocheia) and asylums for the elderly (gerokomeia) of a remarkable organisation and professionalism were founded in many cities of the Byzantine Empire. Concerning the elderly patients, interesting findings were obtained concerning the ageing process (eschatogeria), geriatric symptoms, multimorbidity, marasm and typically occurring diseases. Interesting approaches were realized with regard to the nursing care, diet and recommended life style for the elderly. By the end of the Byzantine Empire in 1453 and due to the different cultural development in the West, which was sometimes marked by conflicts between church and science and by the regulations of medicine, the knowledge about the Byzantine health care was almost lost. It survived, however, only in hospitals of occidental monastic orders, which brought their experience from the East-Mediterranean area. Their hospitals were then a base for modern health care and for geriatrics.


mtDNA haplogroups and Parkinson's

J Neural Transm. 2008 Sep 23. [Epub ahead of print]

Mitochondrial DNA haplogroups and subhaplogroups are associated with Parkinson's disease risk in a Polish PD cohort.

Gaweda-Walerych K, Maruszak A, Safranow K, Bialecka M, Klodowska-Duda G, Czyzewski K, Slawek J, Rudzinska M, Styczynska M, Opala G, Drozdzik M, Canter JA, Barcikowska M, Zekanowski C.

mtDNA common variation is inconsistently reported to modify the risk of Parkinson's disease (PD). We evaluated the impact of the mitochondrial haplogroups, subhaplogroups, coding and non-coding single-nucleotide polymorphisms on PD risk in 241 PD patients and 277 control subjects. After stratification by gender, we found that haplogroup J (OR 0.19; 95% CI 0.069-0.53; P = 0.0014) was associated with a lower PD risk in males. Unexpectedly, subhaplogroup analysis based on the control region (CR) polymorphisms demonstrated that subcluster K1a was more prevalent in healthy controls, while K1c was more frequent in PD patients (P = 0.025 and P = 0.011, respectively; two-tailed Fisher's exact test). Additionally, we confirmed the hypothesis that sublineages (U4 + U5a1 + K + J1c + J2), previously proposed to partially uncouple oxidative phosphorylation (OXPHOS), decrease PD risk (P = 0.027, χ² with Yates' correction). The putative protective effect of uncoupling mtDNAs against PD might result from decreased production of reactive oxygen species. We propose that stratification into subhaplogroups or by gender could be necessary to reveal the involvement of specific mtDNA sublineages in PD pathogenesis.


Nutrition behind Flynn effect?

The Flynn effect is the improvement of IQ scores over time, with people born more recently tending to score higher than the average of previous generations. This paper suggests that better nutrition of pregnant women and infants is behind this phenomenon.

Intelligence doi:10.1016/j.intell.2008.07.008

What has caused the Flynn effect? Secular increases in the Development Quotients of infants

Richard Lynn


Results of five studies show that during the second half of the twentieth century there were increases in the Development Quotients (DQs) of infants in the first two years of life. These gains were obtained for the Bayley Scales in the United States and Australia, and for the Griffiths Test in Britain. The average of 19 data points is a DQ gain of approximately 3.7 DQ points per decade. Similar gains of approximately 3.9 IQ points per decade have been present among preschool children aged 4–6 years. These gains are about the same as the IQ gains of school age students and adults on the Wechsler and Binet tests. This suggests that the same factor has been responsible for all these secular gains. This rules out improvements in education, greater test sophistication, etc. and most of the other factors that have been proposed to explain the Flynn effect. It is proposed that the most probable factor has been improvements in pre-natal and early post-natal nutrition.


September 23, 2008

Facial masculinity and testosterone levels correlated (after a win)

Proceedings of the Royal Society B doi: 10.1098/rspb.2008.0990

Testosterone responses to competition in men are related to facial masculinity

Nicholas Pound et al.


Relationships between androgens and the size of sexually dimorphic male traits have been demonstrated in several non-human species. It is often assumed that a similar relationship exists for human male faces, but clear evidence of an association between circulating testosterone levels and the size of masculine facial traits in adulthood is absent. Here we demonstrate that, after experimentally determined success in a competitive task, men with a more masculine facial structure show higher levels of circulating testosterone than men with less masculine faces. In participants randomly allocated to a ‘winning’ condition, testosterone was elevated relative to pre-task levels at 5 and 20 min post-task. In a control group of participants allocated to a ‘losing’ condition there were no significant differences between pre- and post-task testosterone. An index of facial masculinity based on the measurement of sexually dimorphic facial traits was not associated with pre-task (baseline) testosterone levels, but was associated with testosterone levels 5 and 20 min after success in the competitive task. These findings indicate that a man's facial structure may afford important information about the functioning of his endocrine system.


Neanderthals' trips to the sea in search of food

Another data point for Neanderthal behavioral complexity; this paper shows that Neanderthals made forays to the sea to exploit marine food resources. UPDATE: John Hawks comments.

PNAS doi: 10.1073/pnas.0805474105

Neanderthal exploitation of marine mammals in Gibraltar

C. B. Stringer et al.


Two coastal sites in Gibraltar, Vanguard and Gorham's Caves, located at Governor's Beach on the eastern side of the Rock, are especially relevant to the study of Neanderthals. Vanguard Cave provides evidence of marine food supply (mollusks, seal, dolphin, and fish). Further evidence of marine mammal remains was also found in the occupation levels at Gorham's Cave associated with Upper Paleolithic and Mousterian technologies [Finlayson C, et al. (2006) Nature 443:850–853]. The stratigraphic sequence of Gibraltar sites allows us to compare behaviors and subsistence strategies of Neanderthals during the Middle Paleolithic observed at Vanguard and Gorham's Cave sites. This evidence suggests that such use of marine resources was not a rare behavior and represents focused visits to the coast and estuaries.


September 22, 2008

Stonehenge was built in 2,300BC and may have been a healing center

There have been quite a few stories on Stonehenge lately. Now, the BBC reports on the results of new carbon dating:
Archaeologists have pinpointed the construction of Stonehenge to 2300 BC - a key step to discovering how and why the mysterious edifice was built.

The radiocarbon date is said to be the most accurate yet and means the ring's original bluestones were put up 300 years later than previously thought.
and new interpretations about its function:
Professors Darvill and Wainwright believe that Stonehenge was a centre of healing - a "Neolithic Lourdes", to which the sick and injured travelled from far and wide, to be healed by the powers of the bluestones.

They note that "an abnormal number" of the corpses found in tombs nearby Stonehenge display signs of serious physical injury and disease.

And analysis of teeth recovered from graves show that "around half" of the corpses were from people who were "not native to the Stonehenge area".

Such a prominent edifice need not have a single function. Healing cults tend to form around important religious sites irrespective of their original purpose.

There is a BBC Timewatch documentary on this which will air on Sep 27; a couple of video clips are on the BBC site. Apparently, the scientists suggest that the bluestones were put up ~2,300BC, while the trilithons were put up ~2,100BC. The monument started to enter its phase of decline and neglect ~1,900BC.

Interestingly, as pointed out in the clip, the new date for the erection of Stonehenge coincides with the burial date for the Amesbury archer.

Comparison of different methods for estimating admixture

Yann Klimentidis links to this new paper.

American Journal of Epidemiology doi:10.1093/aje/kwn224

Comparison of Statistical Methods for Estimating Genetic Admixture in a Lung Cancer Study of African Americans and Latinos

Melinda C. Aldrich et al.


A variety of methods are available for estimating genetic admixture proportions in populations; however, few investigators have conducted detailed comparisons using empirical data. The authors characterized admixture proportions among self-identified African Americans (n = 535) and Latinos (n = 412) living in the San Francisco Bay Area who participated in a lung cancer case-control study (1998–2003). Individual estimates of genetic ancestry based on 184 informative markers were obtained from a Bayesian approach and 2 maximum likelihood approaches and were compared using descriptive statistics, Pearson correlation coefficients, and Bland-Altman plots. Case-control differences in individual admixture proportions were assessed using 2-sample t tests and logistic regression analysis. Results indicated that Bayesian and frequentist approaches to estimating admixture provide similar estimates and inferences. No difference was observed in admixture proportions between African-American cases and controls, but Latino cases and controls significantly differed according to Amerindian and European genetic ancestry. Differences in admixture proportions between Latino cases and controls were not unexpected, since cases were more likely to have been born in the United States. Genetic admixture proportions provide a quantitative measure of ancestry differences among Latinos that can be used in analyses of genetic risk factors.


September 20, 2008

John Hawks stars in the "Neanderthal Code"!

Ok, "stars" may be a bit too much, but judging from the videos on the National Geographic site, he seems to have quite a big role in the documentary:
I feel like the defense attorney for the Neanderthals sometimes. I am trying to see the ways that they overlapped with us, and trying to add complexity to the story, because any story that involves things happening over a continent over thousands of years, it's got to be complicated.
I don't have a very strong opinion on Neanderthal-sapiens relations, but I must acknowledge that in Prof. Hawks, everyone's favorite Paleolithic mystery men (and women) have found one of their most eloquent defenders.

Ian Tattersall also appears (probably on the Out of Africa corner of the ring).

Most anthropologists today seem to be somewhere between the replacement and assimilation model of human origins, with Wolpoff's multi-regional model still in the running, and Coon's "candelabra" model mostly abandoned.

This is one debate that has raged for decades, and depends on the interpretation of a handful of old skeletons, and of the new DNA evidence about Neanderthals.

I will probably be watching the Neanderthal Code and making further comments in the coming week.

UPDATE Here is the 10-page article from the October 2008 issue of National Geographic.

September 19, 2008

Carl Zimmer article on Intelligence (and some thoughts on nature/nurture and IQ)

Carl Zimmer blogs about his Scientific American article on Intelligence. From the article:
It was with great delight that Plomin got his hands on microarrays that could detect 500,000 genetic markers--hundreds of times more than he had previously used. He and his colleagues got cheek swabs from 7,000 children, isolated their DNA, and ran it through the microarrays. And once more the results were disappointing.

“I’m not willing to say that we have found genes for intelligence,” Plomin declares, “because there have been so many false positives. They’re such small effects that you’re going to have to replicate them in many studies to feel very confident about them.”
I had blogged about this study when it came out. I repeat my comments from 2006 which are still valid today:
It appears that the hunt for genes affecting intelligence is not going well. I can't say that I'm surprised, because I have always maintained that intelligence is an emergent property of a set of co-operating genes during development in a particular environment and I don't anticipate that the geno-centric approach will take us closer to understanding it.

Intelligence, and -I believe- other complex traits are like complex dishes with many ingredients. The ingredients themselves (e.g., salt, lettuce, or chicken) are themselves unremarkable, but it is the way that they are put together and turned on and off by internal and external stimuli (the pot, the temperature, time, etc.) that makes a good dish.
I have expressed the same view in the recent entry on genome-wide association studies:
This Lego-block paradigm is based on the notion that most of our alleles are commodity "building blocks"; if they are brought together harmoniously, they produce positive results. The occasional allele may have a large effect, and some alleles fit better together than others. Yet, most of the success or failure of a construction depends on how the components fit together, and not what they are.
From the Carl Zimmer article:
Researchers have made images of their developing brains once a year, and Shaw has focused much of his attention on what the pictures reveal about the growth of the cortex, the outer rind of the brain where the most sophisticated information processing takes place.


In all children the cortex gets thicker as new neurons grow and produce new branches. Then the cortex thins out as branches are pruned. But in some parts of the cortex, Shaw found, development took a different course in children with different levels of intelligence. “The superclever kids started off very thin,” Shaw says. “They got really relatively thicker, but in adolescence they got thinner again very quickly.”

I had blogged about this study in 2006; check out that blog entry to see the thickness curves of cortex in development.

At the dawn of the genetics era, physical anthropologists' ideas that intelligence was correlated with the brain's observable properties were often ridiculed. And yet neuroanatomical correlates are pretty much the only game in town when it comes to giving a prediction (admittedly a very coarse one) of a person's IQ.

That doesn't mean that genes don't play a role in intelligence; they do, and it's a sizeable one. But that role is hidden in a gene-gene and gene-environment interaction web of thousands of factors, where the individual components aren't really important, but the way they are put together is.

This realization also leads one to question genetic fetishists' conclusions about environmental influences on IQ.

It is true that scientists have looked at a lot of possible environmental influences on IQ and have come up short on significant environmental factors that can boost a person's IQ. There is simply very limited evidence that any particular environment can achieve this, short of really bad influences such as malnutrition or some infectious diseases in childhood. And yet we know that part of the variation of IQ is due to environmental influences. What gives?

What scientists have looked at are recognizable, "obvious", environmental influences (parenting style, schooling, etc.), which are analogous to the "common variants" in genetics.

Just as a microarray-based genome-wide association study has no clue about the rare family-level gene complexes and disease factors, so studies of environmental influences have no clue about the rare family/school/peer group micro-environments affecting a person's development.

Thus, the failure to find strong environmental influences on IQ doesn't strengthen the nature side of the nature-nurture divide, just as the failure to find strong genetic influences on IQ doesn't strengthen the nurture side.

The truth is that intelligence is an emergent property of a complex web of genetic and non-genetic interactions.

A human being is like a black box with zillions of inputs, some of them genetic, others environmental. We know that the box's output, e.g. its IQ score on a test is related to its inputs; but the relationship isn't linear and tidy: you can try different inputs from here to eternity, but you won't be able to figure out what the output is.

As I wrote in my post on height and body mass index, real progress will come about only when we finally look into the box:
Real progress will only come about with more developmental and functional studies, i.e. studies that actually look at what genes do in the body.

Figuring out how humans "work" is easier said than done. But, I believe, there is no shortcut.
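
The point made above, that individual components can look unimportant while the way they combine matters a great deal, can be illustrated with a toy sketch. Everything in it is hypothetical: two binary inputs and a made-up trait that depends only on their interaction (it is "on" when exactly one input is on). It is not a model of any real gene or environment.

```python
from itertools import product

def trait(a, b):
    # Hypothetical, purely interactive trait: 1 only when exactly one input is on.
    return a ^ b

# Every combination of the two binary inputs, each equally common.
combos = list(product([0, 1], repeat=2))

def mean_trait_given(index, value):
    """Average trait over all combinations where input `index` equals `value`."""
    rows = [c for c in combos if c[index] == value]
    return sum(trait(*c) for c in rows) / len(rows)

# "Association study" view: effect of each input taken on its own.
marginal_effect_a = mean_trait_given(0, 1) - mean_trait_given(0, 0)
marginal_effect_b = mean_trait_given(1, 1) - mean_trait_given(1, 0)

print("marginal effect of input A:", marginal_effect_a)
print("marginal effect of input B:", marginal_effect_b)
print("trait by combination:", {c: trait(*c) for c in combos})
```

Looked at one at a time, neither input shows any effect on the trait, even though the two together determine it completely. Real traits are of course far messier than this two-input caricature.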

Google trumps MitoMap for identifying mtDNA mutations

Hum Mutat. 2008 Sep 17. [Epub ahead of print]

Exaggerated status of "novel" and "pathogenic" mtDNA sequence variants due to inadequate database searches.

Bandelt HJ, Salas A, Taylor RW, Yao YG.

Given its relative ease, screening the entire mitochondrial DNA (mtDNA) for heteroplasmic or novel homoplasmic mutations has become part of the routine diagnostic workup for the molecular geneticist confronted with a disease case exhibiting clinical and biochemical features of mitochondrial dysfunction. "Novelty" of a given mtDNA variant is most often equated with nonregistration in the extensive MITOMAP database. This practice has led to a number of spurious findings and wrong conclusions concerning the pathogenic status of specific mtDNA mutations, especially in the absence of proper evaluation and pathogenicity scoring. We demonstrate by way of real cases targeting the mt-tRNA(Cys) (MT-TC) gene and a stretch within the MT-ND3 gene, that a straightforward Google search can identify twice as many previously observed mutations as any MITOMAP query could achieve. Further, we reassess the recent rediscovery of m.15287T>C by listing all known occurrences and, where possible, providing the haplogroup context, shedding new light on the potential pathogenicity status of m.15287T>C.


September 18, 2008

Political orientation and physiological response

I have often noticed when I turn on the TV and there is a political discussion going on with speakers I don't recognize, that it's often possible to guess (better than chance) the participants' side. Whether it's appearance, clothing, or mannerisms, there may be subtle clues that our minds have come to associate with particular political attitudes. For that to be possible, however, political orientation should be made manifest in some way. In this paper, it is shown that people who are startled more easily tend to be more right-wing in the American political spectrum. It would be interesting to repeat this experiment in other countries. (I'll post the abstract when I see it -- posted)

UPDATE (Sep 19): John Hawks posts a long and skeptical commentary on the study, which should be read by anyone interested in the subject.

Political attitudes are predicted by physiological traits
HOUSTON -- (Sept. 16, 2008) -- Is America's red-blue divide based on voters' physiology? A new paper in the journal Science, titled "Political Attitudes Are Predicted by Physiological Traits," explores the link.

Rice University's John Alford, associate professor of political science, co-authored the paper in the Sept. 19 issue of Science.

Alford and his colleagues studied a group of 46 adult participants with strong political beliefs. Those individuals with "measurably lower physical sensitivities to sudden noises and threatening visual images were more likely to support foreign aid, liberal immigration policies, pacifism and gun control, whereas individuals displaying measurably higher physiological reactions to those same stimuli were more likely to favor defense spending, capital punishment, patriotism and the Iraq War," the authors wrote.

Science Vol. 321. no. 5896, pp. 1667 - 1670
DOI: 10.1126/science.1157627

Political Attitudes Vary with Physiological Traits

Douglas R. Oxley et al.

Although political views have been thought to arise largely from individuals' experiences, recent research suggests that they may have a biological basis. We present evidence that variations in political attitudes correlate with physiological traits. In a group of 46 adult participants with strong political beliefs, individuals with measurably lower physical sensitivities to sudden noises and threatening visual images were more likely to support foreign aid, liberal immigration policies, pacifism, and gun control, whereas individuals displaying measurably higher physiological reactions to those same stimuli were more likely to favor defense spending, capital punishment, patriotism, and the Iraq War. Thus, the degree to which individuals are physiologically responsive to threat appears to indicate the degree to which they advocate policies that protect the existing social structure from both external (outgroup) and internal (norm-violator) threats.


Y chromosomes from the Pyrenees

Once again, this paper uses the inappropriate 0.00069/locus/generation mutation rate, hence all its age estimates are wrong. I wonder who the first scientist will be to say that the Emperor has no clothes; the practice of uncritically using a mutation rate derived under totally inapplicable demographic assumptions will eventually be noticed.

From the paper:
However comparing the average STR variances of the R1b1b2c (0.243), R1b1b2d (0.207) and I2a2 (0.278) lineages considered in this study and given the replicated estimates pointing to a Mesolithic time frame for the origin, diversification and diffusion of the I2a2 clade (Rootsi et al. 2004), the temporal interpretation here provided for R1b1b2c seems reliable.
Reliable indeed. Even with the wrong mutation rate these lineages can't be pushed to the Paleolithic. Better estimates for them are: R1b1b2c: ~1,350BC; R1b1b2d: ~850BC; I2a2: ~1,800BC.

From the paper:
However, the time to the most-recent common ancestor (TMRCA) of the Pyrenean R1b1b2d lineages was here estimated at 7383 ± 1477 years ago, which is consistent with an early dispersion of R1b1b2d all over the Pyrenees and subsequent dissemination outside the mountain range from the Neolithic era onwards. The much younger age estimated by Hurles et al. (1999) for the SRY2627 mutation can, nevertheless, be explained by the mutation rate used (2.1×10−3, for microsatellites), which does not take into account evolutionary considerations (see Zhivotovsky et al. 2006).
Hurles was right; the authors should follow their own advice and see Zhivotovsky et al. 2006. They will realize that their 0.00069/locus/generation is derived for a demographic scenario in which a lineage originating 7383 years ago has only ~150 living descendants, an underestimation of several orders of magnitude.
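
To see how strongly such age estimates depend on the assumed mutation rate, here is a rough sketch using the simple variance estimator (generations ≈ STR variance / mutation rate per locus per generation). The variances and the two rates are the ones quoted above; the 25-year generation length, the function name, and the choice of estimator are assumptions made here for illustration, and the paper may have computed its TMRCA differently.

```python
def tmrca_years(str_variance, mutation_rate, years_per_generation=25.0):
    """Rough age estimate: variance / rate gives generations, times years per generation.

    years_per_generation=25 is an assumed value, not taken from the paper.
    """
    generations = str_variance / mutation_rate
    return generations * years_per_generation

# Average STR variances quoted from the paper.
variances = {"R1b1b2c": 0.243, "R1b1b2d": 0.207, "I2a2": 0.278}

EVOLUTIONARY_RATE = 0.00069  # per locus per generation, the rate used in the paper
GERMLINE_RATE = 0.0021       # per locus per generation, the rate attributed to Hurles et al.

for lineage, variance in variances.items():
    slow = tmrca_years(variance, EVOLUTIONARY_RATE)
    fast = tmrca_years(variance, GERMLINE_RATE)
    print(f"{lineage}: {slow:.0f} vs {fast:.0f} years ago (ratio {slow / fast:.2f})")
```

Whatever generation length is assumed, the two rates differ by a factor of about three, so the choice of rate alone shifts every estimate by that same factor.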

From the paper:
The Y lineages representative of what might have been a pre-Neolithic male genetic composition in Iberia, were those bearing the Palaeolithic mutations M269, including its Mesolithic derived branches R1b1b2c-M153 and R1b1b2d-SRY2627, plus those falling in the I clade defined by the Mesolithic M170.
It's as if time has frozen and scientists are doomed to forever repeat what other scientists have said before them.

Annals of Human Genetics doi: 10.1111/j.1469-1809.2008.00478.x

In search of the Pre- and Post-Neolithic Genetic Substrates in Iberia: Evidence from Y-Chromosome in Pyrenean Populations

A. M. López-Parra et al.


The male-mediated genetic legacy of the Pyrenean population was assessed through the analysis of 12 Y-STR and 27 Y-SNP loci in a sample of 169 males from 5 main geographical areas in the Spanish Pyrenees: Cinco Villas (Western Pyrenees), Jacetania and Valle de Arán (Central Pyrenees) and Alto Urgel and Cerdaña (Eastern Pyrenees). In the Iberian context, the Pyrenean samples present some specificities, being characterized by a high proportion of chromosomes R1b1b2-M269 (including the usually uncommon R1b1b2d-SRY2627 and R1b1b2c-M153 types) or I2a2-M26 and low proportions of other haplogroups. Our results indicate that an old pre-Neolithic substrate is preponderant in populations of the whole Pyrenean fringe. However, AMOVA revealed a high level of substructure within Pyrenean populations, partially explained by drift effects as well as by the signature of an ancient genetic differentiation between Western and Eastern Pyrenees.