The story so far
In my previous post I showed how the "evolutionary rate" of Zhivotovsky, Underhill, and Feldman (2006) is inappropriate for TMRCA calculations, because:
- It is not calculated from the time depth of the MRCA, but from that of an earlier "Patriarch"; and, more importantly:
- It is an average over many simulated haplogroups of small size, and not the kinds of haplogroups one is usually interested in dating in population studies
How big are the haplogroups in Z.U.F.-type simulations?
Z.U.F. consider several different demographic models, differing in their choice of m, the population growth constant. The population size increases (stochastically) by 100(m-1)% per generation on average.
I ran N=10,000 simulations for each reported number. The figures below give the average and maximum number of descendants over these N simulations.
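The simulation itself is simple to sketch. Below is a minimal reimplementation, assuming, as an illustration and not necessarily Z.U.F.'s exact setup, that each man fathers a Poisson-distributed number of sons with mean m:

```python
import numpy as np

def haplogroup_size(m, generations, rng):
    """Male-line descendants of a single founder after `generations`
    generations, where each man fathers Poisson(m) sons."""
    size = 1
    for _ in range(generations):
        if size == 0:
            break  # the lineage has gone extinct
        size = int(rng.poisson(m, size).sum())
    return size

rng = np.random.default_rng(42)
N = 10_000
sizes = [haplogroup_size(1.0, 320, rng) for _ in range(N)]
survivors = [s for s in sizes if s > 0]
print(f"extinct: {N - len(survivors)} of {N}")
print(f"mean size of survivors: {np.mean(survivors):.0f}, max: {max(sizes)}")
```

Setting m to 1.01 instead of 1.0 gives the expanding-population case discussed below.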
Constant population size (m=1)
Under this assumption, haplogroup size grows purely due to randomness of the fathering process; there is no overall population growth. This is an important case, because the 3.6x slower evolutionary rate has been derived from it.
[Table: Number of Descendants — average and maximum, constant population size (m=1)]
It is clear that this type of simulation produces very small haplogroups. Even over 320 generations (roughly the early Neolithic for Greece), the very largest haplogroup produced had 1,310 descendants, while the average one (among lineages that survived) had the theoretically predicted ~160.
Small haplogroups => more drift => loss of variance => lower "effective" mutation rate.
So, as I mentioned in my previous post, to calculate the 3.6x slower rate, not only do we average over haplogroups of all sizes, small and large alike, but we are actually missing the relevant observations. But more on this in the next section.
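The drift-variance link can be illustrated directly. The sketch below is my own, using a symmetric single-step mutation model (Z.U.F.'s exact mutation model may differ); it tracks one STR locus through each surviving haplogroup and compares the mean within-haplogroup variance to the star-genealogy expectation of μG:

```python
import numpy as np

def str_variance(m, generations, mu, rng):
    """Repeat-count variance of one STR locus within a haplogroup
    founded by one man; returns None if the lineage goes extinct."""
    repeats = np.zeros(1, dtype=np.int64)  # founder's allele, centered at 0
    for _ in range(generations):
        if repeats.size == 0:
            return None
        sons = rng.poisson(m, repeats.size)
        repeats = np.repeat(repeats, sons)         # sons inherit father's allele
        mutated = rng.random(repeats.size) < mu    # each son mutates w.p. mu
        steps = rng.choice((-1, 1), repeats.size)  # symmetric single-step model
        repeats = repeats + np.where(mutated, steps, 0)
    return float(repeats.var()) if repeats.size else None

rng = np.random.default_rng(0)
mu, G = 0.002, 320
variances = [v for v in (str_variance(1.0, G, mu, rng) for _ in range(10_000))
             if v is not None]
ratio = np.mean(variances) / (mu * G)  # 1.0 would mean no loss to drift
print(f"{len(variances)} survivors, variance ratio: {ratio:.2f}")
```

The ratio comes out well below 1: within these small drifting haplogroups, variance accumulates at only a fraction of the germline rate, which is exactly the "effective rate" phenomenon.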
Expanding population (m=1.01)
[Table: Number of Descendants — average and maximum, expanding population (m=1.01)]
Predictably, haplogroups end up bigger in an expanding population, but still far short of the sizes of commonly dated real-world haplogroups. The case of m=1.01 is important because it yields the maximum effective mutation rate considered by Z.U.F. for haplogroups that start with a single individual.
Thus, even the highest effective mutation rate considered by Z.U.F. (about 0.55μ over 400 generations) is derived by averaging over unrealistically small haplogroups. In the real world, Y-STR variance accumulates at a higher rate.
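The scale of the problem is easy to check: since the lineage multiplies by m on average each generation, the expected number of male-line descendants after g generations is simply m ** g.

```python
# Expected male-line descendants of one founder after g generations
m, g = 1.01, 400
print(f"expected descendants: {m ** g:.0f}")  # ~54 men
```

Even after 400 generations of 1% growth, the average lineage numbers only a few dozen men, nothing like the haplogroups of millions that population studies actually date.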
Why are Z.U.F.-style simulated haplogroups so small?
At first it seems surprising that these simulated haplogroups end up so small, looking nothing like commonly studied haplogroups, even in an expanding population.
The apparent mystery is resolved once we realize that m is nothing more than the average number of sons a man has. The reason why we see haplogroups so much bigger than the simulated ones is that for individual men, m may be much more, or much less, than its population average. In other words, there is reproductive inequality, which could be due either to social advantage or to natural selection.
So, rather than assuming a uniform m for all men, we can allow m to vary across lineages. A man A may have mA < m if he is impoverished or carries a deleterious Y-chromosome gene, and mA > m if he is a ruler or carries an advantageous Y-chromosome gene.
The advantage could be slight but long-standing (a small fitness improvement) or brief but intense (a conquest or the foundation of a dynasty). Its effect on the lucky lineage is an increase in its number of descendants; its effect on Y-STR variance is a rate of increase approaching the germline rate.
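The quantitative effect of even a modest lineage-specific advantage is dramatic, because it compounds over generations. A hypothetical illustration (the specific mA values below are my own, not from Z.U.F.):

```python
# Expected male-line descendants of one founder after 320 generations,
# for lineages with different average numbers of sons per man (mA)
for m_A in (1.00, 1.01, 1.02, 1.05):
    print(f"mA = {m_A}: {m_A ** 320:,.0f} expected descendants")
```

A lineage sustaining just 5% more sons per man than average reaches millions of descendants over Neolithic time depths, while the average lineage stays in the tens.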
It is clear by now that realistic haplogroup sizes can occur only when there is reproductive inequality. They are not the result of genetic drift, but of natural or social selection. And effective mutation rates should be calculated over successful haplogroups under conditions of reproductive inequality, not over all haplogroups under conditions of reproductive equality.(*)
A note on sampling
Consider a lineage of 1,000 men (i.e. roughly the maximum size produced under reproductive equality) in a population of 1,000,000 men. Its frequency is thus 0.1%.
We take a sample of 1,000 men from this population; this is a much larger sample than is typically used in population studies, and from a smaller population. We expect on average to find just 1 man from the lineage in question in our sample. You can't do a variance-based age estimate with one man!
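The arithmetic can be made exact. With a lineage at frequency p = 0.001 and a sample of n = 1,000, the number of sampled lineage members is Binomial(n, p):

```python
from math import comb

n, p = 1000, 0.001
expected = n * p
# probability of sampling at least 2 lineage members (the minimum
# needed to even speak of within-lineage variance)
p_two_or_more = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in (0, 1))
print(f"expected members in sample: {expected:.1f}")
print(f"chance of sampling even 2: {p_two_or_more:.1%}")
```

More often than not the sample contains zero or one member of the lineage, so no within-lineage variance can be computed at all.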
Thus, it becomes clear why haplogroups produced by Z.U.F.-style simulations are uninteresting. You just never encounter enough representatives from them in a real population study. You are typically interested in the much larger haplogroups, which could only have proliferated under conditions of reproductive inequality, and which are the only ones that can yield enough representatives in a sample to allow for a variance calculation.
In the previous post I showed that Z.U.F. calculate their effective rate over all simulated observations, but the rate is applied in the literature to a very specific set of observations, i.e. large haplogroups.
In this post, I showed that Z.U.F.-style simulations just don't produce realistic haplogroup sizes. Drift alone can't explain why millions of men share patrilineal ancestry. Large haplogroup sizes require an assumption of reproductive inequality, and Y-STR variance within them accumulates near the germline rate.
(*) Of course, if one studies numerically small populations, a slower effective rate may be appropriate. My concern is with large human populations (e.g. Greeks or Indians), where real haplogroup sizes greatly exceed those produced by simulations with reproductive equality.
UPDATE (August 8): Continued in On the effective mutation rate for Y-STR variance