August 31, 2005

Racial analysis calculator (female version)

I have created a female version of the Racial Analysis Calculator which can be found here. This works by first converting raw measurements into "male-equivalents" and then checking these against the previously identified metric types. The accuracy will depend on (i) the goodness of the conversion process which was based on regression on male-female population data that was available to me, and also of course (ii) the accuracy of the raw measurements. As before, this works only for Caucasoids. If you are male, then you can still access the male version here.

I'm always interested in hearing comments from people who've used the various calculator tools which can be found at the Anthropological Research Page, so leave a comment here or post at the Dodona forum. In many cases errors in measurement technique render the results meaningless, but if you persevere and obtain correct measurements, then (unless you have a particularly oddly shaped head) you should get some interesting results.

August 29, 2005

Haplogroup frequency correlations in Southeastern Europe

I have decided to investigate the correlations between haplogroup frequencies in southeastern Europe and some neighboring populations. So far, I have collected frequency data for the main haplogroups found in the region (E3b, J2, I, R1a, R1b) for 16 populations. Most 3-letter codes should be recognizable, but KAL=Kosovo Albanians, SMA=Slav Macedonians, CAL=Calabrians. I should also note that the frequency of haplogroup I in Bulgarians is interpolated from frequencies in Romanians, Greeks, Slav Macedonians and Serbians, as it was missing in the original article. Conclusions about Bulgarians are therefore especially weak, both for this reason and because of the small original sample (N=24).

I began by calculating the correlation matrix in my sample.

A few features strike the eye:

  • The negative correlation between haplogroup R1a and haplogroups E3b, J2, and R1b
  • The negative correlation between haplogroup I and haplogroups J2 and R1b
  • The positive correlation between haplogroup J2 and haplogroup R1b
  • The absence of a substantial correlation between "Neolithic" haplogroups J2 and E3b
As the next analysis will make clear, variation is explained by the presence of two main groupings: a "continental" group comprising Slavic speakers and a "coastal" group comprising all others.

The absence of a correlation between J2 and E3b is significant, because it hints that these haplogroups did not diffuse as a result of a single process. Both the easternmost populations of our sample and the two Italian populations show a higher J2/E3b ratio than the "continental" populations.
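As a rough illustration of how such a correlation matrix can be computed, here is a minimal Python sketch. The population labels and frequencies are invented placeholders, not the data used in this post, and the choice of the Pearson coefficient is an assumption, since the post does not say which coefficient was used.

```python
from math import sqrt

# Invented haplogroup frequencies (percent) for four made-up populations.
# These are placeholders for illustration only, NOT the data behind the post.
HAPLOGROUPS = ["E3b", "J2", "I", "R1a", "R1b"]
FREQS = {
    "POP1": [24.0, 20.0, 14.0, 10.0, 14.0],
    "POP2": [20.0, 6.0, 36.0, 16.0, 10.0],
    "POP3": [10.0, 4.0, 30.0, 30.0, 8.0],
    "POP4": [14.0, 22.0, 8.0, 4.0, 30.0],
}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(freqs):
    """Correlate haplogroup columns across populations (one row per population)."""
    columns = list(zip(*freqs.values()))  # one tuple of frequencies per haplogroup
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


matrix = correlation_matrix(FREQS)
for name, row in zip(HAPLOGROUPS, matrix):
    print(name.ljust(4), " ".join(f"{r:+.2f}" for r in row))
```

Each cell answers the question "across populations, do these two haplogroups rise and fall together?", which is the sense in which the observations above are made.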

The second analysis is a dendrogram using Euclidean distance of the normalized haplogroup frequencies. As is apparent, this way of representing the frequency data results in a separation of the two main clusters.
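For readers who want to try something similar, the following Python sketch builds the merge sequence behind a dendrogram. Several things here are assumptions: the frequencies are invented, "normalized" is interpreted as z-scoring each haplogroup column, and average linkage is used because the post does not name a linkage rule.

```python
from math import sqrt

# Invented frequencies (percent) in the order E3b, J2, I, R1a, R1b.
# Placeholders for illustration only, NOT the data behind the post.
FREQS = {
    "POP1": [24.0, 20.0, 14.0, 10.0, 14.0],
    "POP2": [20.0, 6.0, 36.0, 16.0, 10.0],
    "POP3": [10.0, 4.0, 30.0, 30.0, 8.0],
    "POP4": [14.0, 22.0, 8.0, 4.0, 30.0],
}


def zscore_columns(rows):
    """Standardize each column to mean 0 and (population) standard deviation 1."""
    out_cols = []
    for col in zip(*rows):
        m = sum(col) / len(col)
        sd = sqrt(sum((v - m) ** 2 for v in col) / len(col))
        out_cols.append([(v - m) / sd for v in col])
    return [list(r) for r in zip(*out_cols)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(names, rows):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = {(n,): [r] for n, r in zip(names, rows)}
    merges = []
    while len(clusters) > 1:
        keys = list(clusters)
        best = None
        for i in range(len(keys)):
            for j in range(i + 1, len(keys)):
                a, b = keys[i], keys[j]
                # Average of all pairwise distances between the two clusters.
                d = sum(euclidean(p, q) for p in clusters[a] for q in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a + b] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
    return merges


names = list(FREQS)
merges = average_linkage(names, zscore_columns(list(FREQS.values())))
for a, b, d in merges:
    print(f"{'+'.join(a)} | {'+'.join(b)}  distance={d:.2f}")
```

The printed merges, read top to bottom, correspond to the joins of a dendrogram from the leaves up to the root.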

Finally, a principal components analysis is shown in the following plot. The first two components summarize about 77% of the variance.

We observe the two main "contrasts" in the data between "coastal" J2/R1b and "continental" I1b and between "Neolithic" E3b and "Slavic" R1a (*)
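The "share of variance summarized" by the leading components can be sketched in plain Python as below. This is only an illustration under stated assumptions: the frequencies are invented, the analysis is run on the covariance matrix of mean-centered frequencies, and the eigenvectors are found by power iteration with deflation rather than by whatever statistics package produced the plot.

```python
from math import sqrt

# Invented frequencies (percent) in the order E3b, J2, I, R1a, R1b.
# Placeholders for illustration only, NOT the data behind the post.
FREQS = {
    "POP1": [24.0, 20.0, 14.0, 10.0, 14.0],
    "POP2": [20.0, 6.0, 36.0, 16.0, 10.0],
    "POP3": [10.0, 4.0, 30.0, 30.0, 8.0],
    "POP4": [14.0, 22.0, 8.0, 4.0, 30.0],
}


def centered(rows):
    """Subtract each column mean from its column."""
    means = [sum(c) / len(c) for c in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]


def matvec(mat, v):
    return [sum(a * b for a, b in zip(row, v)) for row in mat]


def top_eigenpair(mat, iters=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    p = len(mat)
    v = [float(i + 1) for i in range(p)]  # arbitrary fixed starting vector
    norm = sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = matvec(mat, v)
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(mat, v)))
    return eigenvalue, v


def principal_components(rows, k=2):
    """Return [(explained_variance_fraction, unit_vector)] for the first k components."""
    cov = covariance(centered(rows))
    total = sum(cov[i][i] for i in range(len(cov)))
    components = []
    for _ in range(k):
        lam, v = top_eigenpair(cov)
        components.append((lam / total, v))
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    return components


components = principal_components(list(FREQS.values()))
for idx, (fraction, vector) in enumerate(components, start=1):
    print(f"PC{idx}: {fraction:.1%} of variance, loadings",
          [round(x, 2) for x in vector])
```

The loadings show which haplogroups pull in opposite directions along each component, which is how "contrasts" such as those described above are read off a PCA.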

Several conclusions can be drawn.

  • The spread of the Neolithic economy into continental Europe involved E3b bearers in a riverine expansion whose northern expression is associated with the Linearbandkeramik. This does not mean that E3b was the only haplogroup associated with these early European farmers, only that it definitely seems to correlate better with this movement compared to the other Neolithic haplogroup (J2).
  • The early diffusion of E3b occurred over a haplogroup I Paleolithic background. It is likely that as groups moved northward the frequency of haplogroup E3b abated, and this is in fact shown in the frequency distribution. This movement is probably associated with the narrow-faced Danubian Mediterranean racial types.
  • This native European population later received an influx of R1a bearers; the frequency of R1a is correlated with latitude. This led to a decrease of the native component in favor of the foreign R1a component (*)
  • The frequency of haplogroup J2 was established by three movements: (i) the initial arrival of J2 from Asia Minor; this did not significantly penetrate into the Western Balkans; (ii) the initial dispersal of J2 into Italy and further west, and around the Black Sea in pre-Greek times, which may be associated with the arrival of gracile Mediterranean racial types into the Ukraine; (iii) the later dispersal of additional J2 as a result of Greek colonization.
It is imperative that the fine-level phylogeography of haplogroup J2 be resolved. The high frequency of this haplogroup around the Black Sea compared to the western Balkans is highly suggestive of Greek colonization, as it is well known that Greek colonization of the Black Sea was much more intensive than Greek activity in the Adriatic. However, archaeological evidence also shows the northward diffusion of agriculturalists in Thrace to Romania, culminating in the Tripoljie culture and its steppe offshoots. We must be able to distinguish between this earlier movement and the later maritime arrival of the Greeks.

The critical question would be: what fraction of J2 lineages in the Ukraine can be explained as the result of ancient and recent Greek settlement in the Crimea, and what fraction predates the Greeks?

(*) We should note that these are rough correspondences. If the theory of riverine diffusion of haplogroup E3b into Central and Northern Europe is correct, then it is likely that E3b existed at a low frequency in Proto-Slavs; conversely, R1a had already diffused after the LGM, before its most recent diffusion, which is perhaps associated with Slavic languages.

Update: A reader alerts me to a different study which listed the Hungarian R1a frequency as substantially lower than the one used here (Semino et al. 2000). Unfortunately, that study did not list frequencies of all haplogroups needed for comparison, so it could not be used directly. If the frequency of R1a=20.4% is used, then a slightly different clustering is obtained.


August 27, 2005

Affinities of Early Upper Paleolithic Europeans

Some interesting information about the affinities of Early Upper Paleolithic (EUP) European crania. EUP Europeans were generalized Caucasoids, and are remarkably similar to modern Europeans despite the tens of thousands of years that have intervened and the substantial replacement of the original EUP population by later migrations evidenced in European mtDNA.
Some of the discordance Van Vark et al. see between genetic and morphometric results may be attributable to their methodological choices. It is clear that the affiliation expressed by a given skull is not independent of the number of measurements taken from it. From their Table 3, it is evident that those skulls expressing Norse affinity are the most complete and have the highest number of measurements (mean = 50.8), while those expressing affinity to African populations (Bushman or Zulu) are the most incomplete, averaging just 16.8 measurements per skull. Use of highly incomplete or reconstructed crania may not yield a good estimate of their morphometric affinities. When one considers only those crania with 40 or more measurements, a majority express European affinity.

To examine this idea further, we use the eight Upper Paleolithic crania available from the test series of Howells ([1995]), all of which are complete. Our analysis of these eight, based on 55 measurements, is presented in Table 1. Using raw measurements, 6 of 8 express an affinity to Norse, and with the shape variables of Darroch and Mosimann ([1985]), 5 of 8 express a similarity to Norse. Using shape variables reduces the Mahalanobis distance, substantially in some cases. Typicality probabilities (Wilson, [1981]), particularly for the shape variables, show the crania to be fairly typical of recent populations. The results presented in Table 1 are consistent with the idea that Upper Paleolithic crania are, for the most part, larger and more generalized versions of recent Europeans. Howells ([1995]) reached a similar conclusion with respect to European Mesolithic crania.



Next, let us examine the issue of whether the EUP situation can be regarded as parallel to the Native American one. There are some obvious differences, principal among them the time frame. The European crania used by Van Vark et al. span 26,000 years, as against our North American sample that spans about 2,400 years. Their EUP series dates from 37,000 BP to about 9,000 BP, as against a maximum time frame for our North American sample of 9,400-7,000 BP (Jantz and Owsley, [2001]). The Upper Paleolithic time span is significantly older and more than 10 times longer than the American one, yet the EUP crania are not correspondingly further removed from the contemporary population. Given that European fossil crania are separated from their supposed descendants by greater temporal distance than is the case in America, one could easily accept that European fossil crania might be more loosely connected to the modern population. Yet, we observe just the opposite. The data in Van Vark et al. demonstrate a higher degree of affiliation with the supposed descendent modern population (16/35 = 46%) than we found in the American situation (1/11 = 9%).
American Journal of Physical Anthropology
Volume 121, Issue 2, Pages 185-188

Reply to Van Vark et al.: Is European Upper Paleolithic cranial morphology a useful analogy for early Americans?

Richard L. Jantz, Douglas W. Owsley

No abstract


August 26, 2005

The Indian Genome Variation database

Indian scientists have announced in Human Genetics the launch of a new program to understand genetic variation within the diverse and fragmented Indian population. This exciting new project should help us understand the origin of about 1/6 of modern humanity, in addition to important medical applications.

The following two maps of the distribution of morphological types and linguistic groups in India are a broad reflection of the cultural and biological diversity of the country.


The Indian Genome Variation database (IGVdb): a project overview

The Indian Genome Variation Consortium


Indian population, comprising of more than a billion people, consists of 4693 communities with several thousands of endogamous groups, 325 functioning languages and 25 scripts. To address the questions related to ethnic diversity, migrations, founder populations, predisposition to complex disorders or pharmacogenomics, one needs to understand the diversity and relatedness at the genetic level in such a diverse population. In this backdrop, six constituent laboratories of the Council of Scientific and Industrial Research (CSIR), with funding from the Government of India, initiated a network program on predictive medicine using repeats and single nucleotide polymorphisms. The Indian Genome Variation (IGV) consortium aims to provide data on validated SNPs and repeats, both novel and reported, along with gene duplications, in over a thousand genes, in 15,000 individuals drawn from Indian subpopulations. These genes have been selected on the basis of their relevance as functional and positional candidates in many common diseases including genes relevant to pharmacogenomics. This is the first large-scale comprehensive study of the structure of the Indian population with wide-reaching implications. A comprehensive platform for Indian Genome Variation (IGV) data management, analysis and creation of IGVdb portal has also been developed. The samples are being collected following ethical guidelines of Indian Council of Medical Research (ICMR) and Department of Biotechnology (DBT), India. This paper reveals the structure of the IGV project highlighting its various aspects like genesis, objectives, strategies for selection of genes, identification of the Indian subpopulations, collection of samples and discovery and validation of genetic markers, data analysis and monitoring as well as the project’s data release policy.


August 25, 2005

Sex differences in progressive matrices

A new article by Paul Irwing and Richard Lynn suggests that men score approximately 4.6 IQ points higher than women in the Progressive Matrices test. From the article:
It has frequently been asserted that there is no sex difference in average general intelligence but that the variance is greater in males. In this paper we examine these two propositions by a meta-analysis of studies of sex differences on the Progressive Matrices among university students. We find that both are incorrect.


There are five points of interest in the results. First, the present meta-analysis of sex differences on the Progressive Matrices among university students showing that men obtain significantly higher means than females confirms the results of our meta-analysis of sex differences on this test among general population samples (Lynn & Irwing, 2004). The magnitude of the male advantage found in the present study lies between 3.3 and 5 IQ points, depending on various assumptions. Arguably the best estimate of the advantage of men to be derived from the present study is .31d, based on all the studies and shown in the first row of Table 2. This is the equivalent of 4.6 IQ points and is closely similar to the 5 IQ points found in the meta-analyses of general population samples previously reported.


Second, the Progressive Matrices is widely regarded as one of the best tests of Spearman’s g, the general factor underlying all cognitive abilities... Now that we have established that men obtain higher means than women on the Progressive Matrices, it follows that men have higher general intelligence or g.


Third, the finding that males have a higher mean reasoning ability than females raises the question of how this can be explained... Hence, the larger average brain size of men may theoretically give men an advantage in intelligence arising from a larger average brain size of 0.78 multiplied by 0.40, giving a theoretical male advantage of .31d = 4.7 IQ points. This is a close fit to the sex difference obtained empirically in our previous meta-analysis of the sex difference of 5 IQ points on the Progressive Matrices in general population samples, and of 4.6 IQ points on the Progressive Matrices, in the present meta-analysis of the sex difference in college student samples.


Fourth, a number of those who have asserted that there is no sex difference in intelligence have qualified their position by writing that there is no sex difference ‘worth speaking of’ (Mackintosh, 1996, p. 567), ‘only a very small advantage of boys and men’ (Geary, 1998, p. 310), ‘no practical differences in the scores obtained by males and females’ (Halpern, 2000, p. 90), ‘no meaningful sex differences’ (Lippa, 2002), and ‘negligible differences’ (Jorm et al., 2004, p. 7)... These different proportions of men and women with high IQs are clearly ‘worth speaking of’ and may go some way to explaining the greater numbers of men achieving distinctions of various kinds for which a high IQ is required, such as chess grandmasters, Fields medallists for mathematics, Nobel prize winners and the like.


Fifth, the finding in this meta-analysis that there is no sex difference in variance on the Advanced Progressive Matrices and that females show greater variance on the Standard Progressive Matrices is also contrary to the frequently made contention, documented in the introduction, that the variance of intelligence is greater among males. This result should be generalizable to the general population of normal intelligence. The greater male variance theory may, however, be correct for general population samples that include the mentally retarded... The issue of whether there is greater male variance for intelligence in general population samples needs to be addressed by meta-analysis.
British Journal of Psychology (preprint)

Sex differences in means and variability on the progressive matrices in university students: A meta-analysis

Paul Irwing and Richard Lynn

(no abstract)
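The arithmetic quoted in the excerpt above can be checked with a few lines of Python. The conversion assumes the conventional IQ scale with a standard deviation of 15, which the excerpt does not state explicitly; the factors 0.78 and 0.40 are taken as given from the quoted passage, and the function name is made up for this sketch.

```python
IQ_SD = 15.0  # assumption: conventional IQ scale with standard deviation 15


def d_to_iq_points(d, sd=IQ_SD):
    """Convert a standardized mean difference (Cohen's d) to IQ points."""
    return d * sd


# Empirical estimate quoted from the meta-analysis: d = 0.31.
observed_points = d_to_iq_points(0.31)

# "Theoretical" estimate quoted in the excerpt: 0.78 multiplied by 0.40.
theoretical_d = 0.78 * 0.40
theoretical_points = d_to_iq_points(theoretical_d)

print(f"observed:    d=0.31   -> {observed_points:.2f} IQ points")
print(f"theoretical: d={theoretical_d:.3f} -> {theoretical_points:.2f} IQ points")
```

Under that assumption the two products come to 4.65 and 4.68 points, which is presumably why the excerpt reports the same .31d as 4.6 in one place and 4.7 in another.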


More on Ancient Alaskan mtDNA

I had previously posted about ancient DNA extracted from a prehistoric Alaskan sample (On Your Knees Cave Man, or OYKCM). In this news article, Brian Kemp, the scientist responsible for this research, elaborates on the work, which has been submitted to Nature for publication. Interestingly, by comparing the ancient DNA with that of modern humans with the same haplotype, and taking into account the 10,300 years separating the two, he was able to measure the mutation rate, finding it to be faster than was previously thought:
In the late 1990s, scientists used DNA studies to propose that people first advanced upon the continent from Asia as much as 40,000 years ago. But data from numerous archaeological sites across the Americas have placed the migration at closer to 10,000 to 12,000 years ago.

Kemp has used OYKCM as a measuring stick to come up with dates much closer to the archaeological record. "Because we know that this guy represents the oldest known example of this lineage, that places a minimum date on the emergence of the lineage," he explains.

In other words, OYKCM represents one end of the measuring stick. At the other end are the 47 people who belong to his haplotype. According to the rules of the molecular clock, this makes it possible to measure the genetic changes between OYKCM and the modern samples and calculate the time it would have required for those changes to occur.

"My calibration shows that the changes were occurring two to four times faster than previously thought," Kemp says. "It means some people have overestimated the time. It wasn't so long ago."

Of course, ancient DNA is susceptible to so-called phantom mutations occurring after the subject's death. So, even if OYKCM's mtDNA was originally identical to that of living humans, damage in the intervening 10,300 years may have caused it to appear different. Thus, the mutation rate may be overestimated due to this problem. It will be interesting to see whether this problem is addressed in the published paper, since there are ways to distinguish between genuine and post-mortem mutations in ancient DNA.
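To make the logic of such a calibration concrete, here is a toy Python sketch. Only the 10,300-year figure comes from the article; the sequences are invented, the function names are made up, and the simple "differences per site per year" formula is an assumption about the general approach, not a description of Kemp's actual method.

```python
YEARS_ELAPSED = 10_300  # age of the ancient sample, as given in the article

# Invented toy sequences for illustration only.
ANCIENT = "ACGTACGTAC"
MODERN = ["ACGTACGTAC", "ACGTACGTAT", "ACATACGTAC", "ACGTACGTAC"]


def differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def calibrated_rate(ancient, modern, years):
    """Mean substitutions per site per year between the ancient and modern samples."""
    mean_diff = sum(differences(ancient, m) for m in modern) / len(modern)
    return mean_diff / (len(ancient) * years)


rate = calibrated_rate(ANCIENT, MODERN, YEARS_ELAPSED)

# A single post-mortem change in the ancient sequence (first base altered)
# adds a spurious difference against every modern sample.
DAMAGED = "T" + ANCIENT[1:]
inflated_rate = calibrated_rate(DAMAGED, MODERN, YEARS_ELAPSED)

print(f"rate from intact sequence:  {rate:.3e} per site per year")
print(f"rate from damaged sequence: {inflated_rate:.3e} per site per year")
```

In this toy case one damaged site triples the estimated rate, which illustrates why post-mortem damage has to be ruled out before a faster clock is accepted.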

August 24, 2005

The Neolithic and the evolution of fear

The following article argues that certain unexplained fear symptoms evolved during the Neolithic. Here is how the authors describe Neolithic warfare:
Paleo-anthropological research has documented a specific pattern of prehistoric inter-group warfare in the Neolithic. In contrast to warfare in historical times, Neolithic inter-group warfare almost exclusively involved attacks against non-combatants in unsuspecting settlements by raiding parties of mateless young, post-pubertal males in search of material and especially reproductive resources. Neolithic combat occurred exclusively between young males, with females and children serving as objects of competition. This has been clearly documented by research on prehistoric human remains. It has been estimated that the victors killed 15%–50% of post-pubertal males and most infants and toddlers, and took females and most weaned pre-pubertal individuals captive (Lambert, 1997, Larsen, 1999, LeBlanc and Register, 2003 and Maschner and Reedy-Maschner, 1998).

The authors argue that the higher prevalence of these unexplained fear symptoms in women than in men, and in younger persons than in older ones, is due to the special nature of Neolithic warfare. Before the Neolithic, humans were most in danger from non-human predators; a fear response played no role against such predators. However, during the Neolithic, other humans replaced non-humans as the greatest danger. Pseudo-neurological problems evolved as a way to signal to the attacking males that one was incapacitated and hence did not pose a danger. This strategy worked especially well for young females, who were desirable for their mating potential; pre-Neolithic non-human predators made no such distinctions, as humans of both sexes and all ages were viewed as a food source.

Journal of Affective Disorders (Article in Press)

Evolution of the human fear-circuitry and acute sociogenic pseudoneurological symptoms: The Neolithic balanced-polymorphism hypothesis

H. Stefan Bracha et al.


In light of the increasing threat of large-scale massacres such as terrorism against non-combatants (civilians), more attention is warranted not only to posttraumatic stress disorder (PTSD) but also to acute sociogenic pseudoneurological (“conversion”) symptoms, especially epidemic sociogenic symptoms. We posit that conversion disorders are etiologically related to specific evolutionary pressures (inescapable threats to life) in the late stage of the human environment of evolutionary adaptedness (EEA). Bracha et al. have recently argued that from the neuroevolutionary perspective, medically unexplained efferent vasovagal syncope and medically unexplained craniofacial musculoskeletal pain in young otherwise healthy individuals, may be taxonomized as stress and fear-circuitry disorders. In the present article, we extend neuroevolutionary perspectives to acute pseudoneurological sociogenic (“conversive”) symptoms: psychogenic non-epileptic attacks (“pseudoseizures”), epidemic sociogenic disorders (DSM-IV-TR Epidemic “Hysteria”), conversive motor deficits (pseudo-paralysis and pseudo-cerebellar symptoms), and psychogenic blindness. We hypothesize that these perplexing pseudoneurological stress-triggered symptoms, which constitute psychopathology in extant humans, are traceable to allele-variant polymorphisms which spread during the Neolithic EEA. During Neolithic warfare, conversive symptoms may have increased the survival odds for some non-combatants by visually (i.e., “non-verbally”) signaling to predatory conspecifics that one does not present a danger. This is consistent with the age and sex pattern of conversive disorders. 
Testable and falsifiable predictions are presented; e.g., at the genome–transcriptome interface, one of the major oligogenic loci involved in conversive spectrum disorders may carry a developmentally sensitive allele in a stable polymorphism (balanced polymorphism) in which the gene expression mechanism is gradually suppressed by pleiotropic androgens especially dehydroepiandrosterone sulfate (DHEA-S). Taxonomic implications for the much-needed rapprochement between the forthcoming Diagnostic and Statistical Manual for Mental Disorders, Fifth Edition (DSM-V) and the International Classification of Diseases (ICD) are discussed.


August 21, 2005

Asymmetrical men beware

New research suggests that women who are ovulating are attracted to their regular partners less, and to other men more if their regular partners are asymmetrical.

Proceedings: Biological Sciences (FirstCite)

Women's sexual interests across the ovulatory cycle depend on primary partner developmental instability

Steven W. Gangestad et al.


Normally ovulating women have been found to report greater sexual attraction to men other than their own partners when near ovulation relative to the luteal phase. One interpretation is that women possess adaptations to be attracted to men possessing (ancestral) markers of genetic fitness when near ovulation, which implies that women's interests should depend on qualities of her partner. In a sample of 54 couples, we found that women whose partners had high developmental instability (high fluctuating asymmetry) had greater attraction to men other than their partners, and less attraction to their own partners, when fertile.


August 20, 2005

ESHG abstracts

I had previously posted some titles from this year's European Society of Human Genetics conference. There is now a PDF volume from the ESHG which contains all the abstracts of the conference. Some of them have already been published, and doubtless more of them will be published next year. Below I discuss some of the more intriguing entries:

F. Cruciani et al., Molecular dissection of the Y chromosome haplogroups A, E and R1b
The male-specific region of the human Y chromosome (MSY) is characterized by a low amount of sequence diversity compared to the mtDNA, the autosomes and the X chromosome. Recently, the use of DHPLC and direct sequencing of DNA has permitted to identify more than 300 new single nucleotide polymorphisms (SNPs) on the MSY. The analysis of the geographic distribution of the haplogroups identified by these markers has provided new insights in the history of human populations, at the same time, it came out that undetected Y chromosome SNPs still contain useful information. In this study we have analyzed the sequence variation of 60 kb of the TBL1Y gene. While previous studies have analyzed the sequence variation of the Y chromosome in a random sample of individuals, we here focus on 22 chromosomes belonging to three specific haplogroups (A, R1b and E), whose geographic distribution is relevant for the human evolutionary history of Africa and/or western Eurasia. We discovered 32 new SNPs, and placed them in the known Y chromosome phylogenetic tree: about half of the new mutations identify new branches of the tree. The geographic distribution of five new E-M78 sub-haplogroups, analyzed in more than 6,000 subjects from Eurasia and Africa, has led to the identification of interesting evolutionary patterns.
The discovery of new subclades, especially for E-M78 and R1b, will be welcome for those interested in finer distinctions within these widely prevalent haplogroups. R1b, for example, occurs throughout the Caucasoid world, and so far very few meaningful sub-haplogroups of it have been known. E-M78 is the main sublineage of haplogroup E3b, and until now there was evidence of haplotype clusters that differentiated E-M78 chromosomes; the new sub-haplogroups will probably reflect, to some degree, these previously known haplotype clusters.

People interested in their own personal anthropology may be advised to wait until the publication of the R1b and E-M78 sub-haplogroups and their incorporation into commercial "fine-resolution" SNP tests, if they are considering undertaking such a test.

I. Kutuev et al., Phylogeographic analysis of mtDNA and Y chromosome lineages in Caucasus populations
The Greater Caucasus marks a traditional boundary between Europe and Asia. Linguistically, it is one of the most diverse areas of the continental Eurasia, while genetics of the people living there is poorly understood. Mitochondrial DNA and NRY variability was studied in 23 Caucasus populations speaking Caucasus, Turkic, and Indo-European languages. Total sample comprised more than 1700 individuals on Y chromosome and more than 2100 individuals on mtDNA. Genetic outliers among the studied populations are relatively recently arrived Turkic speaking Nogays. The indigenous Caucasus populations possess generally less than 5% of eastern Eurasian mtDNA and Y-chromosomal haplotypes - in a profound contrast to the Turkic-speaking people at the other side of the Caspian, but not so dissimilar compared to the Volga-Turkic Tatars and Chuvashis or to the Anatolian Turks. Haplogroup frequency variation within the Caucasus populations, in some instances significant, appears to be caused primarily by specific aspects of the demographic history of populations. Phylogeographically, a particularly intriguing finding is the presence, though at low frequencies, of a predominantly northeastern African haplogroup M1 in many North Caucasus populations, though they lack sub-Saharan L lineages, relatively frequent in the Arab-speaking Levant. Results obtained help to place the Caucasus populations into the scenario of the peopling of Eurasia with anatomically modern humans. Possible migration routes, peopling of steppe and mountain parts of the Caucasus and causes of high linguistic diversity presence in this region is analyzed in this study.
The finding of M1 lineages in the Caucasus not associated with Sub-Saharan L lineages is important, because it can be explained in only one of two ways:
  1. M1 originated in Asia, so its presence in east Africa can be explained by back-migration from Asia. We know that macrohaplogroup M originated in Asia, but it is not clear whether M1 itself originated in Asia or Africa; the "trail" of M lineages between South Asia and Eastern Africa is still flimsy, so we cannot draw any conclusions on this matter yet.
  2. M1 originated in eastern Africa, but during a time when there was a much smaller level of penetration of sub-Saharan L lineages into the region.
V. Stepanov et al., Genetic diversity and differentiation of Y-chromosomal lineages in North Eurasia
Composition and frequency of Y-chromosomal haplogroups, defined by the genotyping of 36 biallelic loci in non-recombining part of Y chromosome, was revealed for native population of Siberia, Central Asia and Eastern Europe. Slavonic ethnic groups, which geographically represent Eastern Europe, are characterized by the high frequency of R1a1, I*, I1b, and N3a clades and by the presence of R1b3, J2, E, and G. Most frequent haplogroup is R1a1, which comprises 44-51% of Y-chromosomes. The distinguishing peculiarity of Central Asian Caucasoids is the high frequency of Caucasoid clades R1a1, J*, J2, and the presence of R1b3 and G. Twenty-five haplogroups were found in gene pool of native Siberian populations. Only 7 of them have the frequency higher than 3%. In sum these 7 clades comprise 86% of Siberian samples. In populations of Southern Siberia the most frequent haplogroup is R1a1. The high frequency of N3a is characteristic for Eastern Siberians, and in Yakuts its frequency is almost 90%. Koryaks, Buryats and Nivkhs have the highest frequency of C3* lineage among investigated populations. Haplogroup O* revealed with variable frequency in most of Siberian. Highest frequency of Q* was found in Kets and Northern Altayans (85% and 32%, respectively). The high level of genetic differentiation of North Eurasian population on Y-chromosomal lineages was revealed. The proportion of inter-population differences in the total genetic variability of region’s population according to the analysis of molecular variance is 19.04%. Genetic differences between territorial groups took 6.9% of total genetic variability, whereas 12.8% is the inter-population differences within groups.
This study seems to confirm what we already knew about the distribution of haplogroups in northern Eurasia, but it seems like a comprehensive survey of the area, which will be very useful when it appears in print.

S. Sengupta et al., Genescape of India, as Reconstructed from Polymorphic DNA Variation in the Y chromosome
The contemporary male gene pool of ethnic India largely comprises haplogroups that originated indigenously, in southeast Asia, and in west and central Asia. The indigenous haplogroup is predominant among the tribal group. The southeast Asian influence is largely on the male gene pools of Tibeto-Burman speaking tribals and Austro-Asiatic and Dravidian. The west and central Asian influence is primarily on caste groups - both Indo-European and Dravidian. The haplogroup diversity within the various tribal groups is lower than that within the caste groups. Analyses of molecular variance showed higher genetic variability among populations within linguistic clusters of tribals compared to castes. Moreover, the between group variability in the Indo-European caste cluster is higher than that in the Dravidian caste cluster. This may be a reflection of diverse ancestries, antiquities and isolation of the tribals, coupled with subsequent cultural (linguistic) homogenization. Lesser between group genetic variability in caste groups may be a reflection of their recent founding history. The complete congruence of the patterns of Y-chromosomal and mitochondrial DNA differentiation may be indicative of inflow of both male and female genes from similar source populations. The rank order of FST values showed that tribes and castes are most differentiated, followed by upper and middle caste, upper and lower caste and middle and lower caste.
Again, this study seems to confirm the indigenous component in Indians, and the higher prevalence of western and central Asian Caucasoid haplogroups in castes compared to tribals. Also of interest is the finding that the main difference in the Indian population is between castes and tribals: within the castes, differentiation decreases towards the lower castes, the most differentiated ones being the upper castes.

E. Bogácsi-Szabó et al., Maternal and paternal lineages in ancient and modern Hungarians

Hungarian language represents the westernmost group of the Finno-Ugric language phylum, surrounded entirely by Indo-European speaking populations. Their linguistic isolation in the Carpathian basin suggests the possibility that they might also show a significant genetic isolation. According to historical data at the end of the 9th century Hungarian conquerors from the west side of the Ural Mountains settled down into the Carpathian Basin and took the hegemony. To determine the genetic background of Hungarians we examined mitochondrial and Y chromosomal DNA from ancient `conquerors` from Hungary, originated from the 10th century and from modern Hungarian-speaking adults from today's Hungary and Transylvanian Seklers (Romania). DNA was extracted from 35 excavated ancient bones and hair samples of 125 and 80 modern Hungarians and Seklers, respectively. Mitochondrial haplogroups were determined with HVS I sequencing and RFLP typing. The mtDNA HVS I sequences were compared with 2615 samples from 34 Eurasian populations retrieved from published data. ARLEQUIN 2.001 Software was used to estimate genetic distances between populations. The resulting matrix was summarized in two dimensions by use of Multidimensional Scaling. The M46 biallelic Y chromosomal marker (TAT, often called Uralic migration marker) was also investigated from 2 ancient, 34 modern Hungarian and 60 Sekler samples. Our results suggest that the modern Hungarian gene pool is very similar to other central European ones concerning the mitochondrial and Y chromosomal markers, while the ancient population contains more Asian type elements.
This is a very exciting study comparing ancient Magyar mtDNA and Y chromosomes (at least the Tat-C marker) with those of modern Hungarian speakers. Physical anthropologists have long identified a Mongoloid and mixed Mongoloid component in the Magyars, and this is now confirmed with the finding of Tat-C and Mongoloid mtDNA in the ancient Magyars at a higher frequency than in the modern population. Today, Hungarians are predominantly Caucasoid, and this is supported by the molecular data and reflects the assimilation of the indigenous Caucasoid population by the more "Asian" original Magyar population.

F. di Giacomo et al., Y chromosomal variation in the Czech Republic
In order to analyse the contribution of the Czech Republic to the genetic landscape of Europe, we typed 257 male subjects from 5 locations for 17 Unique Event Polymorphisms of the Y chromosome. Sixteen haplogroups or sub-haplogroups were identified, with only 5 chromosomes uncharacterized. Overall, the degree of population structuring was low. The three commonest haplogroups were R1a (0.344), P*(xR1a) (0.281) and I (0.184). M157, M56 and M87 showed no variation within haplogroup R1a. Haplogroup I was mostly represented as I1b* and I1b2 was also detected in this population. Thus, the majority of the Czech male gene pool is accounted for by the three main haplogroups found in western and central Europe, the Balkans and the Carpathians. Haplogroup J was found at low frequency, in agreement with a low gene flow with the Mediterranean. In order to draw inferences on the dynamics of the Czech population, we typed 141 carriers of the 3 most common haplogroups for 10 microsatellites, and applied coalescent analyses. While the age of the I clade agreed with that reported in the vast study of Rootsi et al (2004), the ages of its sub-haplogroups differed considerably, showing that the I chromosomes sampled in the Czech Republic are a subset of those found throughout Europe. Haplogroup R1a turned out to be the youngest with an estimated age well after the Last Glacial Maximum. For all three major haplogroups the results indicate a fast population growth, beginning at approximately 60-80 generations ago.
The young age of R1a1 in Czechs, combined with its high frequency make it a likely candidate as reflecting historical or recent prehistorical events, and less likely to reflect the post-LGM recolonization of Europe.

August 19, 2005

Y-haplogroup testing options (c. Aug 19, 2005)

I thought it would be useful to list all the companies that you can use to test your Y-chromosome haplogroup as of now. In choosing a testing company, one has to consider:
  • The type of genetic information that they can obtain
  • The price
  • The quality of the company: accompanying information, turnaround times, years in business, number of customers, customer satisfaction, etc.
I will not go into the third factor, which is quite subjective, and I will limit myself to price and genetic information obtained. I will also list pricing for individual customers, and not pricing for those interested in joining a "surname" project, or other promotional offers that various companies may offer.

For information purposes only. You are responsible if you decide to use any of these testing companies. Companies are listed alphabetically.

DNA Heritage

DNA Heritage offer a $99 fine-resolution global SNP test which determines not only the "main" haplogroup (e.g., R1b), but also sub-haplogroups (e.g., R1b3). For a list of markers, click here.


Ethnoancestry

Ethnoancestry offers a world-wide SNP test for determining your haplogroup (e.g., R1b) for $195. Finer-resolution tests are also offered or planned.

Family Tree DNA

Family Tree DNA offers a $159 12-marker Y-STR test. In some cases, this may be used to infer the haplogroup (e.g., R1b) with high probability. A confirmatory SNP test which tests perhaps multiple SNPs until the haplogroup is confirmed costs $65.

Genographic Project

The Genographic Project is offering a $99.95 test (plus $7.55 shipping & handling for US customers, or $26.50 for international customers). The same STR markers as Family Tree DNA are typed, and a SNP test is performed (at no extra cost) if haplogroup attribution is ambiguous.


GeoGene

GeoGene is offering a global SNP test for determining your haplogroup, e.g., R1b, plus 6 Y-STR markers for $185 with a 100% money-back guarantee.

Peripheral neuropathy and mtDNA haplogroup T

AIDS. 2005 Sep 2;19(13):1341-1349.

Mitochondrial haplogroups and peripheral neuropathy during antiretroviral therapy: an adult AIDS clinical trials group study.

Hulgan T et al.

OBJECTIVE: HIV nucleoside reverse transcriptase inhibitors (NRTI) can cause peripheral neuropathy that is a result of mitochondrial injury. Polymorphisms in the mitochondrial genome define haplogroups that may have functional implications. The objective of this study was to determine if NRTI-associated peripheral neuropathy is associated with European mitochondrial haplogroups. DESIGN: Case-control study of Adult AIDS Clinical Trials Group (ACTG) study 384 and ACTG Human DNA Repository participants. METHODS: ACTG study 384 was a treatment strategy trial of antiretroviral therapy with didanosine (ddI) plus stavudine (d4T) or zidovudine plus lamivudine given with efavirenz, nelfinavir, or both. Subjects were followed for up to 3 years. Peripheral neuropathy was ascertained based on signs and symptoms. For this analysis, polymorphisms that define European mitochondrial haplogroups were characterized in a majority of ACTG 384 participants, and associations with peripheral neuropathy were assessed using logistic regression. RESULTS: A total of 509 subjects were included in this analysis of whom 250 (49%) were self-identified as white, non-Hispanic. Mitochondrial haplogroup T was more frequent in subjects who developed peripheral neuropathy. Among 137 white subjects randomized to receive ddI plus d4T, 20.8% of those who developed peripheral neuropathy belonged to mitochondrial haplogroup T compared to 4.5% of control subjects (odds ratio, 5.4; 95% confidence interval, 1.4-25.1; P = 0.009). Independent predictors of peripheral neuropathy were randomization to receive ddI plus d4T, older age, and mitochondrial haplogroup T. CONCLUSIONS: A common European mitochondrial haplogroup may predict NRTI-associated peripheral neuropathy. Future studies should validate this relationship, and evaluate non-European mitochondrial haplogroups and other NRTI toxicities.

August 18, 2005

Earliest shoe wearers

A new article by Erik Trinkaus sheds some light on the antiquity of shoe wearing. Shoes are usually made from perishable materials, so there is little direct evidence for them, except for quite recent times. For older times, there are sometimes footprints which show evidence of having been made by covered feet.

Dr. Trinkaus studied the morphology of toes from prehistoric humans. Roughly speaking, toes in shoes do not need to "grab" onto the ground, because this function is performed by the shoe. This is especially true for the lesser toes, i.e., toes other than the big toe. So, the lesser toes should shrink when humans began to use shoes regularly.

Indeed, there was evidence that lesser toes did shrink in Paleolithic Western Eurasians. Early modern humans ("Cro-Magnons") but also late Neanderthals may have started using shoes as protection against the cold and for protection from the ground. Moreover, both southern European and more northern individuals seem to have been affected, and this may indicate that this was a general cultural practice that was adopted, rather than a specific measure to combat the cold in the northernmost parts of Eurasia.

Journal of Archaeological Science Volume 32 Issue 10, 1515-1526

Anatomical evidence for the antiquity of human footwear use

Erik Trinkaus


Archeological evidence suggests that footwear was in use by at least the middle Upper Paleolithic (Gravettian) in portions of Europe, but the frequency of use and the mechanical protection provided are unclear from these data. A comparative biomechanical analysis of the proximal pedal phalanges of western Eurasian Middle Paleolithic and middle Upper Paleolithic humans, in the context of those of variably shod recent humans, indicates that supportive footwear was rare in the Middle Paleolithic, but that it became frequent by the middle Upper Paleolithic. This interpretation is based principally on the marked reduction in the robusticity of the lesser toes in the context of little or no reduction in overall lower limb locomotor robusticity by the time of the middle Upper Paleolithic.


August 17, 2005

Individualism and collectivism in 20 countries

Here is the breakdown of the four types in the twenty countries.

[Image: breakdown of the four types in the twenty countries]

Journal of Cross-Cultural Psychology, Vol. 36, No. 3, 321-339 (2005)

Variation of Individualism and Collectivism within and between 20 Countries
A Typological Analysis

Eva G. T. Green et al.

With data from a 20-nation study (N = 2,533), the authors investigated how individual patterns of endorsement of individualist and collectivist attitudes are distributed within and across national contexts. A cluster analysis performed on individual scores of self-reliance (individualist dimension), group-oriented interdependence (collectivist dimension), and competitiveness (individualist or collectivist dimension) yielded a typology of four constrained combinations of these dimensions. Despite the prevalence of a typology group within a given country, variability was observed in all countries. Self-reliant non-competitors and interdependent non-competitors were prevalent among participants from Western nations, whereas self-reliant competitors and interdependent competitors were more common in non-Western countries. These findings emphasize the benefits for cross-cultural research of a typological approach based on combinations of individualist and collectivist dimensions.


Earliest examples of the four major racial types

I thought it would be a good idea to list the earliest known representatives of the four major racial types, i.e., Australoids, Caucasoids, Mongoloids, and Negroids. These dates represent the latest possible time after which the existence of these types is guaranteed; of course, by definition, they must have existed for some time before.

The first known Caucasoid is Mladec 1 from the Czech Republic, dated to 31,000BP.

[Image: Mladec 1]

The first known Australoid is Moh Khiew Cave from Thailand, dated to 25,800BP.

[Image: Moh Khiew Cave]

The first known Mongoloids appear in early Neolithic sites in China, such as Baoji and Huaxian around 7,100BP.

[Picture pending]

The earliest known Negroids date from the ~14,500-12,500BP site of Jebel Sahaba in Lower Nubia. Pictured below is Jebel Sahaba 117-10.

[Image: Jebel Sahaba 117-10]

August 16, 2005

Croatian Y chromosomes and mtDNA

Croat Med J. 2005 Aug;46(4):502-13.

Review of Croatian genetic heritage as revealed by mitochondrial DNA and Y chromosomal lineages.

Pericic M et al.

The aim of this review is to summarize the existing data collected in high-resolution phylogenetic studies of mitochondrial DNA and Y chromosome variation in mainland and insular Croatian populations. Mitochondrial DNA polymorphisms were explored in 721 individuals by sequencing mtDNA HVS-1 region and screening a selection of 24 restriction fragment length polymorphisms (RFLPs), diagnostic for main Eurasian mtDNA haplogroups. Whereas Y chromosome variation was analyzed in 451 men by using 19 single nucleotide polymorphism (SNP)/indel and 8 short tandem repeat (STR) loci. The phylogeography of mtDNA and Y chromosome variants of Croatians can be adequately explained within typical European maternal and paternal genetic landscape, with the exception of mtDNA haplogroup F and Y-chromosomal haplogroup P* which indicate a connection to Asian populations. Similar to other European and Near Eastern populations, the most frequent mtDNA haplogroups in Croatians were H (41.1%), U5 (10.3%), and J (9.7%). The most frequent Y chromosomal haplogroups in Croatians, I-P37 (41.7%) and R1a-SRY1532 (25%), as well as the observed structuring of Y chromosomal variance reveal a clearly evident Slavic component in the paternal gene pool of contemporary Croatian men. Even though each population and groups of populations are well characterized by maternal and paternal haplogroup distribution, it is important to keep in mind that linking phylogeography of various haplogroups with known historic and prehistoric scenarios should be cautiously performed.

August 14, 2005

Population genetics of Indus Valley populations

A new article confirms that the genetic composition of the population of the Indo-Gangetic plain (Pakistan and NW India) consists of West Eurasian (Caucasoid) and indigenous South Asian elements. The contribution of other elements such as Sub-Saharan African in the Makrani "Negroids", the significant contribution of indigenous female South Asian elements to the Parsis (who are of Iranian origin, but live in India), and the contribution of Mongoloid elements in groups such as the Hunza, and the Hazara is also confirmed. The mtDNA distribution is shown below:

[Image: mtDNA distribution]

In terms of their Y-chromosome, the population of the region is of western Eurasian (Caucasoid) origin, also including a variant which has developed in the region and is found at lesser frequency elsewhere:
In striking contrast to the mtDNA data, there is no strong evidence in Pakistani populations of Y-chromosome signatures of the early inhabitants of the region following the African exodus (Qamar et al. 2002, Zerjal et al. 2002), with their Y-chromosomes largely replaced by subsequent migrations or gene flow. The Y-chromosome gene pool of Pakistani populations is mainly attributable to western Eurasian lineages, particularly from the Middle East (Qamar et al. 2002). Conversely, few traces of East Asian haplogroups are observed in the Indus Valley populations. One Y-chromosome haplogroup (L-M20) has a high mean frequency of 14% in Pakistan and so differs from all other haplogroups in its frequency distribution. L-M20 is also observed, although at lower frequencies, in neighbouring countries, such as India, Tajikistan, Uzbekistan and Russia. Both the frequency distribution and estimated expansion time (~7,000 YBP) of this lineage suggest that its spread in the Indus Valley may be associated with the expansion of local farming groups during the Neolithic period (Qamar et al. 2002).


For easier access, here is a break-down of Indian Y-chromosomal distribution taken from a recent comprehensive study (pdf).

And, a similar study on Y-chromosomal distribution from Pakistan (pdf):

[Image: Y-chromosomal distribution in Pakistan]

Annals of Human Biology Mar-Apr 32(2):154-62.

A population genetics perspective of the Indus Valley through uniparentally-inherited markers.

McElreavey K and Quintana-Murci L.

Analysis of mtDNA and Y-chromosome variation in the Indo-Gangetic plains shows that it was a region where genetic components of different geographical origins (from west, east and south) met. The genetic architecture of the populations now living in the area comprise genetic components dating back to different time-periods during the Palaeolithic and the Neolithic. mtDNA data analysis has demonstrated a number of deep-rooting lineages of Pleistocene origin that may be witness to the arrival of the first settlers of South and Southwest Asia after humans left Africa around 60,000 YBP. In addition, comparisons of Y-chromosome and mtDNA data have indicated a number of recent and sexually asymmetrical demographic events, such as the migrations of the Parsis from Iran to India, and the maternal traces of the East African slave trade.


August 13, 2005

Ancestry of African Americans

A comprehensive new study of African American mtDNA in comparison to African mtDNA indicates that African Americans are primarily descended from West, West-Central, and South-Western Africa, with negligible contributions from Northern, East, Southern, and Southeastern Africa. As common sense would suggest, African Americans are descended from the primary Negroid area of Africa:
Africa is the most genetically diverse continent. A fine subdivision of African mtDNA lineages provides a powerful source of phylogeographic information: major regions of the continent display markedly different frequencies of the continent-specific mtDNA clades, or haplogroups (fig. 1a). However, the first point to make from this enhanced data set is the obvious similarity of the haplogroup frequency profiles of West Africa, west-central Africa, and southwestern Africa in comparison with the other major regions of the continent.

Am. J. Hum. Genet (Early View)

Charting the Ancestry of African Americans

Antonio Salas et al.

The Atlantic slave trade promoted by West European empires (15th–19th centuries) forcibly moved at least 11 million people from Africa, including about one-third from west-central Africa, to European and American destinations. The mitochondrial DNA (mtDNA) genome has retained an imprint of this process, but previous analyses lacked west-central African data. Here, we make use of an African database of 4,860 mtDNAs, which include 948 mtDNA sequences from west-central Africa and a further 154 from the southwest, and compare these for the first time with a publicly available database of 1,148 African Americans from the United States that contains 1,053 mtDNAs of sub-Saharan ancestry. We show that >55% of the U.S. lineages have a West African ancestry, with <41% coming from west-central or southwestern Africa. These results are remarkably similar to the most up-to-date analyses of the historical record.


August 12, 2005

Psychological masculinization of tall women

Personality and Individual Differences (Article in Press)

Height in women predicts maternal tendencies and career orientation

Denis K. Deady and Miriam J. Law Smith


Previous research has shown that variation in sex-specific personality traits in women can be predicted by measures of physical masculinisation (second to fourth digit ratio and circulating testosterone). This study aimed to test the hypothesis that certain sex-specific traits in women (maternal tendencies and career orientation) could be predicted by one index of masculinisation, height. Data was collected via online questionnaires. In pre-reproductive women (aged 20–29, n = 679), increasing height related to decreasing maternal personality (lower importance of having children, lower maternal/broodiness) and decreasing reproductive ambition (fewer ideal number of children, older ideal own age to have first child). Increasing height also related to increasing career orientation (higher importance of having a career, and higher career competitiveness). In post-reproductive women (aged over 45, n = 541), increasing height related to decreased reproductive events (fewer children, had first child at older age) and increased career orientation. Results provide further support for previous studies that show physical masculinisation is associated with psychological masculinisation.


August 11, 2005

Basque Y chromosomes

Very interesting new paper on Y chromosomal diversity disproving the idea that the Basques are an especially ancient population or somehow representative of the ancestral European gene pool. The following sentence should probably be framed by scientists trying to correlate the findings of modern population genetics with the historical-archaeological record:
An ad hoc mining of the historical record can lead to a spurious association of any finding in human population genetics with any historical episode that could potentially explain it.
European Journal of Human Genetics (advance online publication)

The place of the Basques in the European Y-chromosome diversity landscape

Santos Alonso et al.


There is a trend to consider the gene pool of the Basques as a 'living fossil' of the earliest modern humans that colonized Europe. To investigate this assumption, we have typed 45 binary markers and five short tandem repeat loci of the Y chromosome in a set of 168 male Basques. Results on these combined haplotypes were analyzed in the context of matching data belonging to approximately 3000 individuals from over 20 European, Near East and North African populations, which were compiled from the literature. Our results place the low Y-chromosome diversity of Basques within the European diversity landscape. This low diversity seems to be the result of a lower effective population size maintained through generations. At least some lineages of Y chromosome in modern Basques originated and have been evolving since pre-Neolithic times. However, the strong genetic drift experienced by the Basques does not allow us to consider Basques either the only or the best representatives of the ancestral European gene pool. Contrary to previous suggestions, we do not observe any particular link between Basques and Celtic populations beyond that provided by the Paleolithic ancestry common to European populations, nor we find evidence supporting Basques as the focus of major population expansions.


August 10, 2005

The Upper Paleolithic population of Europe

Journal of Archaeological Science (Article in Press)

Estimates of Upper Palaeolithic meta-population size in Europe from archaeological data

Jean-Pierre Bocquet-Appel et al.


Three databases (2961 georeferenced archeological sites, simulated climatic variables simulating a typical “warm” phase of the isotopic stage 3 (IOS3 project), and ethnographic of hunter–gatherers (HG)) were used to estimate the size, growth rate and kinetics of the metapopulation of HG during four periods of the Upper Paleolithic in Europe. The size of the metapopulation was obtained by multiplying a demographic density (per 100 km2) by the size of the population territory of HG. Demographic density for each period was calculated by successively backprojecting a reference density obtained for the Late Glacial with inter-period growth rate of the archeological sites. From the Aurignacian to the Glacial Maximum, the metapopulation remained in a positive quasi-stationary state, with about 4400–5900 inhabitants (95% confidence interval (CI95%): 1700–37,700 inhabitants). During the Glacial Maximum, the metapopulation responded to the cold: (i) by moving the northern limits of its maximum expansion zone towards the low latitudes by 150–500 km from west to east, (ii) by concentrating in few refuge zones (mainly Périgord, Cantabria and the Ibérian coasts), (iii) by becoming perhaps distributed in smaller groups than during the pre and post Glacial Maximum. The metapopulation reached 28,800 inhabitants (CI95%: 11,300–72,600) during the mid-Late Glacial recolonisation.


August 08, 2005

Biogeographical Ancestry testing

John Hawks replies to some criticism of his earlier post by a blogger at Majority Rights. The subject is the meaning and usefulness of the DNA Print admixture test (AncestryByDNA).

First, we should distinguish between admixture testing in general, and the DNA Print family of tests in particular. Problems with the latter, e.g., "Native American" affiliation in Greeks, "South Asian" affiliation in Iberians, or "Middle Eastern" affiliation in the Irish do not invalidate the utility of admixture testing in general.

Next, let's see how AncestryByDNA-type tests work:

Frequency data of alleles are obtained for several reference populations, i.e., European, East Asian, Sub-Saharan African, and Native American. For example, if a locus X has three alleles, A, B, C then it is recorded that Europeans may have 50% of A, 30% of B, and 20% of C, while Sub-Saharan Africans may have 20% of A, 40% of B, and 40% of C. This type of frequency information is recorded for all alleles in all reference populations.

Next, individual genotypes are recorded for each customer. For example, for locus X, the customer may have allele C. This is the "hard" mechanistic part of the test, which is almost completely accurate.

Finally, the maximum likelihood estimate of admixture proportions is calculated. Suppose, for example that our hypothetical individual is 100% European and 0% Sub-Saharan African; the probability of observing a C is then simply 0.2, i.e., the frequency of C in the European population. If he is 100% Sub-Saharan African and 0% European, then this probability is 0.4. If he is 50% European and 50% Sub-Saharan African, then the probability is 0.5 x 0.2 + 0.5 x 0.4 = 0.3. All such admixture proportions are tested in a systematic, algorithmic way.

So, we see that the most likely ancestral composition of this individual is 100% Sub-Saharan African and 0% European, based on a single locus. The same kind of calculation can be used for multiple loci. As more loci are tested, the confidence in the admixture proportions increases; for example, the hypothetical individual presented above could quite easily be a European: after all, 60% of Africans do not have the C allele, whereas 20% of Europeans do. However, if we systematically observed that this individual had such common African alleles in multiple loci, then the probability that he had European admixture would be diminished.
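The calculation above can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm that any testing company actually uses: the frequencies are the hypothetical ones for locus X given above, the function names are my own, and for simplicity every observed allele is assumed to come from an independent locus with the same frequency table.

```python
# Hypothetical reference frequencies for locus X, taken from the example in the text.
FREQ = {
    "European": {"A": 0.5, "B": 0.3, "C": 0.2},
    "SubSaharanAfrican": {"A": 0.2, "B": 0.4, "C": 0.4},
}


def allele_probability(allele, european_fraction):
    """Probability of observing `allele` for a given fraction of European ancestry."""
    m = european_fraction
    return m * FREQ["European"][allele] + (1.0 - m) * FREQ["SubSaharanAfrican"][allele]


def likelihood(alleles, european_fraction):
    """Likelihood of several observed alleles, treated as independent (a simplification)."""
    result = 1.0
    for allele in alleles:
        result *= allele_probability(allele, european_fraction)
    return result


def best_european_fraction(alleles, steps=100):
    """Try every proportion on a grid and keep the one with the highest likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda m: likelihood(alleles, m))


print(allele_probability("C", 1.0))   # 100% European
print(allele_probability("C", 0.0))   # 100% Sub-Saharan African
print(allele_probability("C", 0.5))   # 50% / 50%
print(best_european_fraction(["C"]))  # single locus: the estimate is 0% European
```

With only the single C allele the grid search settles on 0% European, as in the text; feeding it a longer list of alleles shows how the estimate moves as more loci are added.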

So, admixture proportions depend on the following factors:
  1. Individual genotype
  2. Parental reference populations
What we should note here is that the test does not measure exact admixture proportions from races which were once thought to be pure. Rather, if the individual can be modelled as deriving his ancestry from the reference populations, then his most likely ancestral proportions are reported.

In some cases the "if" is justified. For example, the inhabitants of Brazil can be reasonably seen as the product of admixture between Europeans, Native Americans and Sub-Saharan Africans, because these groups are known to have settled that country.

In other cases, this assumption is not justified. For example, South Asians are not the result of admixture between the four groups listed in the Ancestry By DNA test; this is established by the phylogeny of haploid markers such as mtDNA and the Y chromosome, which show that South Asians have a high proportion of markers that are specific to themselves, e.g., South Asian-specific subclades of mtDNA macrohaplogroup M. So, South Asians are not reasonably modelled as the product of admixture between the four groups, because these four groups do not include a significant component of their ancestry.

It is important to see what the problem is here: any genotype will be assigned admixture proportions by the maximum likelihood estimation algorithms. Even a chimp's genotype would be assigned some proportions that add up to 100%. These proportions make sense only if the individual can be reasonably expected to be derived from the differentiated populations used as references.
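To make this point concrete, here is a minimal sketch in which a completely arbitrary genotype is fed to a maximum likelihood estimator. Everything in it is made up for illustration: the reference frequencies are random numbers, the genotype is random and unrelated to any of the four populations, each locus is reduced to "allele present or absent", and the coarse grid search merely stands in for whatever estimation algorithm a real test uses.

```python
import itertools
import math
import random

POPULATIONS = ["European", "East Asian", "Sub-Saharan African", "Native American"]


def simplex_grid(n_parts, steps):
    """Yield every tuple of n_parts proportions, in multiples of 1/steps, summing to 1."""
    for combo in itertools.product(range(steps + 1), repeat=n_parts):
        if sum(combo) == steps:
            yield tuple(c / steps for c in combo)


def log_likelihood(genotype, freqs, proportions):
    """genotype[i] is 1 if the allele is seen at locus i, else 0;
    freqs[i][k] is that allele's frequency in reference population k."""
    total = 0.0
    for seen, locus_freqs in zip(genotype, freqs):
        p = sum(w * f for w, f in zip(proportions, locus_freqs))
        total += math.log(p if seen == 1 else 1.0 - p)
    return total


def estimate_admixture(genotype, freqs, steps=10):
    """Return the grid point with the highest likelihood."""
    return max(
        simplex_grid(len(freqs[0]), steps),
        key=lambda props: log_likelihood(genotype, freqs, props),
    )


rng = random.Random(0)
n_loci = 176
# Made-up reference frequencies and a genotype that belongs to none of the groups.
freqs = [[rng.uniform(0.05, 0.95) for _ in POPULATIONS] for _ in range(n_loci)]
arbitrary_genotype = [rng.randint(0, 1) for _ in range(n_loci)]

estimate = estimate_admixture(arbitrary_genotype, freqs)
for name, share in zip(POPULATIONS, estimate):
    print(f"{name}: {share:.0%}")
```

Whatever genotype goes in, the estimator can only choose among combinations of the four reference groups, so the output always adds up to 100%, whether or not the model is appropriate for the individual.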

This brings us to the second problem: which reference populations? For example, it is true that Europeans settled both Brazil and the United States, but not the same kind of Europeans. So, the frequency data from a pan-European or English sample do not represent the European component in Brazilians.

In conclusion, admixture testing works best when the parental populations are well-defined, highly differentiated and known to have historically admixed in a given territory. It does not work well when these conditions are not met.


John suggests an alternative way of presenting the results of the test:
Compare to this hypothetical result, based on alleles only without any reference to Linnaean taxonomy. The person is told he has 89 alleles that are common worldwide, 35 that are common in Europe but rare elsewhere, 4 that are very common in East Africa and moderately common in the Near East, 10 that are very common in China and Thailand, moderately common in India and Pakistan, and present but less common in the Near East, and 2 alleles that are very high frequency in Native Americans, but also present in Siberia, Caucasus, the Near East, and Greece.

Certainly, it would be nice to have this type of information accompanying the haplotype results. However, this type of presentation can be deceptive. For example, many alleles have slight frequency differences in different human populations. For example, an allele may have a frequency of 50% in Africans and 40% in Europeans. We can certainly not be sure whether it is derived from a European or African ancestor: it is one of the alleles that are "common worldwide" in the quoted paragraph. However, the co-occurrence of many alleles of this kind carries information, and if an individual e.g., has 10 such alleles that are slightly more frequent in Africans than in Europeans and 3 that have the opposite pattern, then we can still conclude that African ancestry is highly probable.
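The way such weak signals add up can be illustrated with a simple likelihood ratio. The sketch below is my own simplification: it assumes that every allele has exactly the 50%/40% frequencies used above (or the mirror image), that the loci are independent, that only the two unadmixed hypotheses are compared, and that both are equally likely beforehand.

```python
def likelihood_ratio(n_african_leaning, n_european_leaning, high=0.5, low=0.4):
    """Likelihood of the observed alleles under African ancestry divided by
    their likelihood under European ancestry.

    Each African-leaning allele has frequency `high` in Africans and `low` in
    Europeans; each European-leaning allele has the opposite pattern.
    """
    return (high / low) ** n_african_leaning * (low / high) ** n_european_leaning


def posterior_african(ratio, prior_african=0.5):
    """Turn a likelihood ratio into a posterior probability, given a prior."""
    prior_odds = prior_african / (1.0 - prior_african)
    posterior_odds = prior_odds * ratio
    return posterior_odds / (1.0 + posterior_odds)


ratio = likelihood_ratio(10, 3)
print(ratio)
print(posterior_african(ratio))
```

With these made-up figures the ratio comes to about 4.8 in favour of African ancestry, even though no single allele is informative on its own; more loci, or larger frequency differences, sharpen the conclusion quickly.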


There is a discussion in Majority Rights blog which raises some objections to my comments. Let's address them one at a time:
2. Dienekes’ labelling of certain DNAP results as “problems” is, in my opinion, not justified without further evidence. Given that “Middle Eastern” in the Irish does NOT imply any sort of direct admixture of Middle Easterners (at least, as they exist today) into the Irish genepool, how it is known a priori that this is a problem?
It is certainly a problem to claim that the Irish have more MIDEAS affiliation than the Turks. It goes against geography, history, physical anthropology and common sense. Until a satisfactory explanation for this unexpected result is offered, we are justified to view it as a bug of the test.
I have discovered that Dienekes is incorrect about the chimp comment. From we see:-

Number of failed Loci

Chimpanzee - 157 out of 176
Gorilla - 151 out of 176
Orangutan - 137 out of 176

This is completely irrelevant to my point. My point was that the genotype of a chimpanzee would have to be assigned to four numbers adding to 100%. That has nothing to do with the procedure used to obtain the genotype in the first place. Of course, we expect to have failed loci when we try to read a SNP in a different species, because primers developed for humans will not generally work in a different species; however, if we could read the letters in the 176 loci (or as many loci as are shared in the human and chimp sequence), and plugged the resulting genotype into the estimation algorithm, we would still get four numbers that add up to 100%.

In any case, one can repeat the example using an Australian instead of a chimp. An Australian's genotype would be assigned to the four groups with numbers adding to 100%, even though Australians cannot be viewed as being a mix of the four groups in question.
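The point can be sketched in code. The following is not the algorithm used by the test, whose details I do not have; it is a generic EM estimate of mixing proportions over four fixed reference groups, with randomly generated (hypothetical) allele frequencies. The genotype fed to it comes from a fifth frequency profile unrelated to any of the four groups, yet the output is still four numbers adding to 100%.

```python
import random

def estimate_proportions(alleles, group_freqs, iterations=200):
    """EM estimate of mixing proportions over fixed reference groups.

    alleles: list of 0/1 observations, one per locus.
    group_freqs: one list per group with the frequency of allele 1 per locus.
    """
    k = len(group_freqs)
    q = [1.0 / k] * k
    for _ in range(iterations):
        totals = [0.0] * k
        for locus, allele in enumerate(alleles):
            # Probability of the observed allele under each group, weighted by q.
            weights = [
                q[g] * (group_freqs[g][locus] if allele else 1 - group_freqs[g][locus])
                for g in range(k)
            ]
            norm = sum(weights)
            for g in range(k):
                totals[g] += weights[g] / norm
        # Each locus contributes exactly 1 in total, so q always sums to 1.
        q = [t / len(alleles) for t in totals]
    return q

rng = random.Random(0)
n_loci = 176
# Hypothetical frequencies for four reference groups and one unrelated outsider.
groups = [[rng.uniform(0.05, 0.95) for _ in range(n_loci)] for _ in range(4)]
outsider = [rng.uniform(0.05, 0.95) for _ in range(n_loci)]
genotype = [1 if rng.random() < f else 0 for f in outsider]

proportions = estimate_proportions(genotype, groups)
print([f"{100 * p:.1f}%" for p in proportions], "sum:", round(100 * sum(proportions), 6))
```

The constraint is built into the estimator: it can only distribute 100% among the groups it was given, whatever the source of the genotype.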
4. Dienekes’ comments about South Asians disregard the Euro 1.0 test.
The EURO-DNA test measures the "South Asian" component of the "European" component. However, I was not referring to the "South Asian" component of the "European" component, but to the aboriginal South Asian component which is _not_ related to the Western Eurasian (Caucasoid) component, and is evidenced primarily by the predominance of extremely ancient South Asian specific clades of mtDNA macrohaplogroup M. In short, with the exception of some Mongoloid tribes, South Asians are not descended from East Asians (Mongoloids). They are descended from very old indigenous South Asian populations as well as more recent Central Asian (Caucasoid) populations. Whatever similarity they have with East Asians is due to common ancestry _before_ the emergence of Mongoloids.

In other words, a population X may be genetically affiliated to East Asians either due to the joint possession of shared ancestral alleles, or due to the introgression of East Asian alleles into X. Someone labelled as e.g., 50% European + 50% East Asian according to DNAP may be for example (i) a first generation Japanese-Briton, or (ii) a Central Asian Turk of ancient Caucasoid-Mongoloid ancestry, or (iii) a South Asian. In cases (i,ii) he is the progeny of the admixture between Caucasoid and Mongoloid ancestors, whereas in case (iii) he is the progeny of Caucasoid and Proto-South Asian ancestors without any significant Mongoloid ancestry.

August 03, 2005

Forensic informativeness of mtDNA haplogroup H

mtDNA haplogroup H occurs very frequently in Caucasoids. This means that even if additional HVRI/HVRII information is known, there is still a high probability that two H samples with the same mutations in the hypervariable regions are not in fact closely related. Testing for the sub-haplogroups of H is one way to obtain additional information about haplogroup H samples. The authors find that this is quite useful and that the distribution of H sub-haplogroups is significantly correlated with geography. From the article:
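A small sketch of why sub-typing helps: the chance that two random samples share the same type is the sum of squared type frequencies, and splitting a common control-region haplotype by sub-haplogroup lowers that sum. The sample below is entirely made up for illustration (the `hvr-A` style labels are placeholders, not real haplotypes), and the function name is my own.

```python
from collections import Counter

def match_probability(types):
    """Chance that two samples drawn at random (with replacement) share a type."""
    n = len(types)
    return sum((count / n) ** 2 for count in Counter(types).values())

# Hypothetical sample: (control-region haplotype, coding-region sub-haplogroup).
samples = (
    [("hvr-A", "H1")] * 3
    + [("hvr-A", "H3")] * 2
    + [("hvr-A", "H*")]
    + [("hvr-B", "H5a")] * 2
    + [("hvr-C", "H6")] * 2
)

control_only = match_probability([hvr for hvr, _ in samples])
with_subtyping = match_probability(samples)

print(f"control region only: {control_only:.2f}")
print(f"with sub-haplogroups: {with_subtyping:.2f}")
```

In this toy sample the six identical `hvr-A` haplotypes are resolved into three sub-haplogroups, which halves the random match probability.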
Most of the European populations sampled display an overall haplogroup H frequency of ~40–50%, with frequencies decreasing towards the south-east, reaching ~20% in the Near East and Caucasus, and [...] ~65% of haplogroup H lineages in Iberia, ~46% in the north-west, ~27% in central and eastern Europeans, and ~5–15% in the Near East/Caucasus, and are absent from the Gulf. The frequency of H5a appears to be highest on the central European plain; it occurs at low levels across Europe but is absent from the Caucasus and the Near East. H2 and H6 are both common in Eastern Europe and the Caucasus, and are not found in the Near Eastern sample. The less common sub-clades H4, H7 and H13 occur in both Europe and the Near East; H13 also present in the Caucasus. The paraphyletic cluster H* predominates in the Near East, its range at least partly the mirror-image of that for H1 and H3, but it is most common in East–Central Europe and the Balkans, and is frequent in Atlantic Europe as well.

Forensic Science International (Article in Press)

Evaluating the forensic informativeness of mtDNA haplogroup H sub-typing on a Eurasian scale

Luísa Pereira et al.


The impact of phylogeographic information on mtDNA forensics has been limited to the quality control of published sequences and databases. In this work we use the information already available on Eurasian mtDNA phylogeography to guide the choice of coding-region SNPs for haplogroup H. This sub-typing is particularly important in forensics since, even when sequencing both HVRI and HVRII, the discriminating power is low in some Eurasian populations. We show that a small set (eight) of coding-region SNPs resolves a substantial proportion of the identical haplotypes, as defined by control-region variation alone. Moreover, this SNP set, while substantially increasing the discriminating efficiency in most Eurasian populations by roughly equal amounts, discloses population-specific profiles.


August 01, 2005

Y-haplogroup Q3 in prehistoric Alaskan

Via the PalAnth forum, an interesting abstract recently presented at the AAAS meeting:
Analysis of Ancient DNA from an Individual from Prince of Wales Island: Implications for the Peopling of the New World

BRIAN M. KEMP et al.

Ancient mitochondrial and Y-chromosome DNA were successfully extracted from the teeth of an individual dated to 9,880 ± 50 (CAMS-32038) and 9,730 ± 60 (CAMS-29873) excavated from On Your Knees Cave (site 49-PET-408) on Prince of Wales Island in Southeast Alaska. d13C values of 12.1 and 12.5% suggest a diet of marine foods, so the date should be adjusted to c. 9,200 14C ybp. The mitochondrial DNA (mtDNA) of this individual belongs to haplogroup D and the Y-chromosome to haplogroup Q-M3*, which also confirms that the sex of the individual was male. This individual's mitochondrial haplotype (based on his hypervariable region I+II sequences) does not represent the basal haplogroup D lineage, demonstrating that multiple founder lineages of this haplogroup reached the New World. This haplotype matches or closely matches published sequences of Native American mtDNA found in populations of both North and South America, being found in approximately 1% of living Native Americans. The known date associated with this sample allows for the calibration of the molecular clock and can be used to assess the accuracy of earlier estimates of the timing of the peopling of the New World based on molecular diversity. This sample also establishes a minimum date for the emergence of the M3 Y-chromosome mutation, which is believed to have occurred early during the settlement of the New World.