April 15, 2008

Haplogroup correlations in Russia

I have looked at the data of the recent article on Russian Y chromosomes with the purpose of detecting some patterns in the occurrence of haplogroups among the 14 geographically distributed sub-populations studied there.

First, I looked at the correlation between the frequencies of occurrence of different haplogroups, limiting myself to haplogroups E3b1, I1a, I1b, J2, N2, N3, R1a, R1b3. For each haplogroup, I report the correlations which are at least 0.5 in absolute value (either positive or negative).

I1a: -0.56 with R1a
I1b: -0.76 with N3, -0.66 with N2
N2: -0.66 with I1b
N3: -0.76 with I1b, -0.66 with R1a
R1a: -0.66 with N3, -0.54 with I1a

The main points of interest are:
  • I1a vs. R1a
  • I1b vs. N3/N2
  • R1a vs. N3

Next, I calculated correlations between haplogroup frequencies and latitude/longitude. I list correlations greater than or equal to 0.5 in absolute value.

Longitude (west-to-east): R1a (-0.51), N2 (+0.56)
Latitude (south-to-north): I1b (-0.87), R1a (-0.57), N2 (+0.61), N3 (+0.79)
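The geographic step works the same way, with the coordinates of each sub-population taking the place of a second haplogroup. A rough sketch, again assuming a Pearson coefficient; the coordinates, haplogroup names and frequencies below are invented for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def geo_correlations(freqs, lat, lon):
    """Map each haplogroup to (r with longitude, r with latitude)."""
    return {
        hg: (round(pearson(lon, f), 2), round(pearson(lat, f), 2))
        for hg, f in freqs.items()
    }


# Hypothetical coordinates (degrees) of four made-up sampling sites.
lat = [50.0, 55.0, 60.0, 65.0]
lon = [30.0, 40.0, 35.0, 45.0]

# Hypothetical frequencies at those sites (NOT the article's data).
freqs = {
    "hgNorth": [0.05, 0.10, 0.20, 0.30],
    "hgSouth": [0.30, 0.20, 0.10, 0.05],
}

for hg, (r_lon, r_lat) in geo_correlations(freqs, lat, lon).items():
    print(f"{hg}: longitude {r_lon:+.2f}, latitude {r_lat:+.2f}")
```

A positive latitude value means the haplogroup becomes more frequent toward the north, and a positive longitude value means more frequent toward the east, matching the sign convention of the two lines above.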

These observations suggest that the most striking feature of the Russian Y-DNA landscape is the contrast between southern haplogroup I1b and southwestern haplogroup R1a on the one hand, and northern haplogroup N3 and northeastern haplogroup N2 on the other. Thus, as a first-order approximation, the title of the paper, "Two Sources of the Russian Patrilineal Heritage in Their Eurasian Context", is justified: one source is the R1a/I1b-dominated SW/S group, the other the N2/N3-dominated NE/N group.

Within the SW/S group, there is some concordance between the frequencies of haplogroups R1a and I1b, with a correlation of +0.40. Similarly, for the NE/N group, the correlation between N3 and N2 is +0.47. This suggests the pre-existence of these two combinations, while leaving open the possibility that a two-source solution is not the whole story.


  1. I think by now we've all learned how unreliable the frequencies of Y-DNA haplogroups can be as indicators of ancestral relationships.

    Genetic drift and bottlenecks can do extraordinary things to what populations look like in terms of these markers.

    I think the only sensible thing to do is to correlate any conclusions based on Y-DNA (and mtDNA) with at least a few hundred reliable genome-wide SNPs.

  2. "Genetic drift and bottlenecks can do extraordinary things to what populations look like in terms of these markers."

    Genetic drift does not result in a geographical gradient.

    Also, Y-DNA is the only system that maintains the signal of a population's patrilineages, which is of prime importance when discussing the origins of ethnic groups, most of which were patriarchal.

  3. And it seems "genetic drift" and "bottlenecks" are being used to explain so many anomalies they've reached the stage of being as informative as saying, "God did it".


