Dienekes' Anthropology Blog

July 31, 2008

Expansion of E-V13 explained

E-V13 is the main European clade of haplogroup E. It has been variously interpreted as a signature of the early Balkan Bronze Age or the Mesolithic, of the Greek colonization of Southern Italy, of Greek ancestry in some Pakistanis, or of Roman soldiers of Balkan origin in Britain. A proper understanding of its age would help resolve the problem of its origins.

Age, of course, depends on a proper choice of mutation rate, and as I have argued (part I and part II), the proper effective mutation rate is near the germline rate and not 3.6x slower as argued by Zhivotovsky, Underhill, and Feldman (2006). This is especially true for a relatively young haplogroup (very low STR variance compared to other lineages), which is also quite frequent in its area of origin, while much reduced away from it, giving a definite impression of a sudden and relatively recent expansion.

In my previous post, I estimated a Late Bronze Age date for E-V13 in Greece and areas affected by historical Greek colonization. I have now used Ken Nordtvedt's Generations2 program to obtain estimates of the age of E-V13 in three different datasets: the King set, 12-marker data from the E-M35 Phylogeny Project (Haplozone), as well as E-M78 data -most of which should be E-V13- from Bosch et al. (2006). In the latter set, I used two marker sets: all 12 markers common between Generations2 and Bosch, as well as 8 markers common between them, excluding markers after DYS392 (in the Generations2/FTDNA order).

Population              N     Generations   Age (25y/gen)   Age (30y/gen)
Nea Nikomedeia          8     149           1725 BC         2470 BC
Sesklo/Dimini           20    71            225 AD          130 BC
Lerna Franchthi         20    120           1000 BC         1600 BC
Crete                   13    68            300 AD          40 BC
Haplozone               103   134           1350 BC         2020 BC
Aromuns (12)            32    71            225 AD          130 BC
Aromuns (8)             32    73            175 AD          190 BC
Slavomacedonians (12)   13    51            725 AD          470 AD
Slavomacedonians (8)    13    59            525 AD          230 AD
Albanians (12)          9     70            250 AD          100 BC
Albanians (8)           9     59            525 AD          230 AD
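The two date columns follow from the generation counts by simple arithmetic. Here is a minimal sketch of that conversion, assuming the ages are counted back from the year 2000 (my assumption, chosen because it reproduces the printed dates; the function name is only illustrative):

```python
def generations_to_date(generations, years_per_generation, reference_year=2000):
    """Convert an age in generations into a calendar-year label.

    reference_year=2000 is an assumption: it is the value that
    reproduces the dates printed in the table.
    """
    year = reference_year - generations * years_per_generation
    return f"{year} AD" if year > 0 else f"{-year} BC"


# Generation counts taken from the table
for label, gens in [("Nea Nikomedeia", 149), ("Haplozone", 134), ("Slavomacedonians (12)", 51)]:
    print(label, generations_to_date(gens, 25), generations_to_date(gens, 30))
```

The same function applied to any other row should return that row's two dates.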

Both the King et al. E-V13 data and the diverse, mostly European Haplozone E-V13 data agree in placing the expansion of this haplogroup squarely in the Aegean Bronze Age.

Aromuns (Vlachs) coalesce to the Roman era, consistent with the idea that they are Balkan natives who became Latinized linguistically at around that era.

Albanians also coalesce to Roman/Late Antique times, consistent with the idea that their high frequency of haplogroup E-V13 (which reaches very high numbers in e.g. Kosovars) is not associated with high diversity. Founder effects in that time frame are the reason for the high frequency of E-V13 in them.

Finally, Slavomacedonians from the former Yugoslav Republic of Macedonia coalesce well into AD times, at around the time of the first Slavic arrivals in the Balkans. This suggests that E-V13 in them is the result of local founders at around that time who adopted the Slavic language. However, Pericic et al. (2005) (see below) report high (but unspecified) diversity of E-M78α in "Macedonia", so it is possible that a larger number of earlier inhabitants were absorbed.

Pericic et al. (2005) give a 7.3kya estimate for the expansion of E-M78α (almost perfectly equivalent to E-V13) for Southeastern European populations north of Greece. Due to their use of the 3.6x slower mutation rate, this figure needs to be converted to equivalent years. The Nea Nikomedeia time depth was estimated as 9.2kya by King et al. Therefore, the equivalent age for the Pericic et al. (2005) expansion is (7.3/9.2) * 149 generations or 118 generations (1,540-950BC). They note that STR variance is higher in Greece, Macedonia, and Apulia, all areas with well-known historical Greek connections.
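The conversion above can be written out step by step. This is only a sketch of the arithmetic in the paragraph; the function name is illustrative, and the reference year 2000 is my assumption for turning generations into calendar dates:

```python
def rescale_to_generations(age_kya, reference_age_kya, reference_generations):
    """Rescale an age computed with the slower rate, using a reference
    estimate known in both units (Nea Nikomedeia: 9.2 kya = 149 generations)."""
    return age_kya / reference_age_kya * reference_generations


generations = rescale_to_generations(7.3, 9.2, 149)  # about 118.2
rounded = round(generations)                          # 118

# Calendar years counted back from 2000 (assumed reference year);
# negative values are years BC.
date_30 = 2000 - rounded * 30
date_25 = 2000 - rounded * 25
print(rounded, date_30, date_25)
```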

Cruciani et al. (2007) propose that E-V13 arrived in Europe from West Asia and underwent an expansion in Europe at 4-4.7 kya. This age is calculated using effective mutation rates that are 2.4 or 2.8 times slower than the germline rate, which seems to suggest a Late Bronze Age or even later expansion with a rate closer to the germline one.

In the Balkans, it is fairly clear that E-V13 is mostly concentrated south of the Jirecek Line which separated native Greek from Latin speakers. In Italy, the highest frequencies are found in the south, the areas of historical Greek colonization. High frequencies are also attained in Cyprus. Cyprus also has high STR diversity, consistent with an early arrival, suggestive of both early Mycenaean and later colonizations from the Aegean.

Conclusion

The age and distribution of E-V13 chromosomes suggest that expansions of the Greek world in the Bronze and later ages were the major causes of its diffusion.

Who was the E-V13 patriarch in Greece? He was perhaps one of the legendary figures of Greek mythology some of whom are said to have come from abroad. For whatever reason, his progeny grew, and were around to participate in the expansion of the Mycenaean world and the subsequent Greek colonization.

UPDATE (Aug. 1):

An additional piece of evidence is Y-chromosome distribution in Calabria, a Southern Italian region with well-known Greek connections. According to Semino et al. (2004) [Am. J. Hum. Genet. 74:1023–1034, 2004], the Calabrian sample has an E-M78 frequency of 16.3%, whereas "Calabria 2" representing the "Albanian community of the Cosenza province" has only 5.9%. This is consistent with the idea that E-V13 in modern Albanians is to a great degree due to Greek founders (Epirotes or ancient colonists).

Antikythera mechanism and the timing of the Olympiads


Complex clock combines calendars:
The Antikythera Mechanism, a clockwork device made in Greece around 150–100 BC, astounded the world two years ago when scientists deduced how this machine was used to make complex astronomical time-reckonings. Now they say that the instrument, discovered in 1901 in a Mediterranean shipwreck, did much more than that.

...

Researchers have been trying to decode the mechanism's inscriptions and functions for several years. Their latest findings reveal that it links the technical calendars used by astronomers to the everyday calendars that regulated ancient Greek society — most strikingly, the calendar that set the timing of the Olympic Games.

“The mechanism is full of surprises,” says Alexander Jones of the Institute for the Study of the Ancient World in New York, who is one of the decoding team. “The latest revelations establish its cultural origin for the first time.”

...

In 2006, Freeth was part of a team that used this and other techniques to figure out much of the mechanism's function, showing it to be an instrument of unparalleled sophistication in antiquity, more or less unrivalled until the clockwork mechanisms of the later Middle Ages.

Now they say that the device was even more sophisticated than that — it unites abstruse astronomical determinations of time with the calendar of civic society. Another ancient Greek calendar cycle, called the Metonic cycle, was established to cope with the incommensurability of the lunar cycle and the solar year — the period of Earth's rotation around the Sun, as determined, say, by the time between successive summer solstices. One Metonic period is equal to 235 lunar months, which is almost exactly 19 solar years. The Metonic cycle, thought previously to be used only by astronomers, is represented on a dial on the Antikythera Mechanism. But this dial now turns out to be inscribed with the names of months in a regional calendar used in Corinthian colonies in northwest Greece — providing evidence that the device was used for mundane reckonings, and giving a surprising clue to its origin.

...

But Freeth and his team now think that the instrument may have come from Syracuse in Sicily, the Corinthian colony where Archimedes devised a planetarium in the third century BC. “Archimedes died at the siege of Syracuse in 212 BC, so we are confident that he did not make the mechanism,” says Freeth. “But it is possible that it came from a heritage of instrument-making that originated with him in Syracuse. It is an attractive idea, but purely speculative at present.”
Nature 454, 614-617 (31 July 2008) | doi:10.1038/nature07130

Calendars with Olympiad display and eclipse prediction on the Antikythera Mechanism

Tony Freeth, Alexander Jones, John M. Steele & Yanis Bitsakis

Previous research on the Antikythera Mechanism established a highly complex ancient Greek geared mechanism with front and back output dials. The upper back dial is a 19-year calendar, based on the Metonic cycle, arranged as a five-turn spiral. The lower back dial is a Saros eclipse-prediction dial, arranged as a four-turn spiral of 223 lunar months, with glyphs indicating eclipse predictions. Here we add surprising findings concerning these back dials. Though no month names on the Metonic calendar were previously known, we have now identified all 12 months, which are unexpectedly of Corinthian origin. The Corinthian colonies of northwestern Greece or Syracuse in Sicily are leading contenders, the latter suggesting a heritage going back to Archimedes. Calendars with excluded days to regulate month lengths, described in a first century BC source, have hitherto been dismissed as implausible. We demonstrate their existence in the Antikythera calendar, and in the process establish why the Metonic dial has five turns. The upper subsidiary dial is not a 76-year Callippic dial as previously thought, but follows the four-year cycle of the Olympiad and its associated Panhellenic Games. Newly identified index letters in each glyph on the Saros dial show that a previous reconstruction needs modification. We explore models for generating the unusual glyph distribution, and show how the eclipse times appear to be contradictory. We explain the four turns of the Saros dial in terms of the full moon cycle and the Exeligmos dial as indicating a necessary correction to the predicted eclipse times. The new results on the Metonic calendar, Olympiad dial and eclipse prediction link the cycles of human institutions with the celestial cycles embedded in the Mechanism's gearwork.
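The Metonic relation quoted above (235 lunar months being almost exactly 19 solar years) is easy to check. The sketch below uses modern mean values for the synodic month and the tropical year; these constants are my addition and do not come from the article:

```python
SYNODIC_MONTH_DAYS = 29.530589   # mean lunar (synodic) month, modern value
TROPICAL_YEAR_DAYS = 365.24219   # mean solar (tropical) year, modern value

metonic_months_days = 235 * SYNODIC_MONTH_DAYS
metonic_years_days = 19 * TROPICAL_YEAR_DAYS
difference_hours = (metonic_months_days - metonic_years_days) * 24

print(round(metonic_months_days, 2), round(metonic_years_days, 2), round(difference_hours, 1))
```

With these values the two spans differ by roughly two hours over 19 years.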

Link

July 29, 2008

Haplogroup sizes and observation selection effects (continued)

This is a continuation of my comments on How Y-STR variance accumulates.

The story so far

In my previous post I showed how the "evolutionary rate" of Zhivotovsky, Underhill, and Feldman (2006) is inappropriate for TMRCA calculations, because:
  • It is not calculated from the time depth of the MRCA, but of an earlier "Patriarch"; more importantly:
  • It is an average over many simulated haplogroups of small size, and not the kinds of haplogroups one is usually interested in dating in population studies

How big are the haplogroups in Z.U.F.-type simulations?

Z.U.F. consider several different demographic models, differing in their choice of m, the population growth constant. The population size increases (stochastically) on average by 100(m-1)% every generation.

I ran N=10,000 simulations for each reported number. The tables give the average and maximum number of descendants over these N simulations.

Constant population size (m=1)

Under this assumption, haplogroup size grows purely due to randomness of the fathering process; there is no overall population growth. This is an important case, because the 3.6x slower evolutionary rate has been derived from it.




Number of Descendants

g      Average   Maximum
10     5.9       56
20     11        106
40     21.1      176
80     40.9      366
160    81.2      801
320    159.8     1310

It is clear that this type of simulation produces very small haplogroup sizes. Even for 320 generations (early Neolithic for Greece) the very largest haplogroup produced had 1,310 descendants, while the average one had the theoretically predicted ~160.

Small haplogroups => more drift => loss of variance => lower "effective" mutation rate.

So, as I mentioned in my previous post, to calculate the 3.6x slower rate, not only do we average over haplogroups of all sizes, small and large alike, but we are actually missing the relevant observations. But more on this, in the next section.
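For readers who want to try this themselves, here is a minimal sketch of the kind of simulation described in this section. It is not the code used for the table: it assumes each man has a Poisson-distributed number of sons with mean m, keeps only founders with living descendants after g generations, and uses a small run (g=10, 2,000 surviving founders) to stay fast. All function names are illustrative.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson variate (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def descendants_after(g, m, rng):
    """Number of male-line descendants of one founder after g generations."""
    size = 1
    for _ in range(g):
        size = sum(poisson(m, rng) for _ in range(size))
        if size == 0:
            break  # the patriline died out
    return size


def surviving_sizes(g, m, n_survivors, rng):
    """Simulate founders until n_survivors of them have living descendants."""
    sizes = []
    while len(sizes) < n_survivors:
        size = descendants_after(g, m, rng)
        if size > 0:
            sizes.append(size)
    return sizes


rng = random.Random(0)  # fixed seed so the run is repeatable
sizes = surviving_sizes(g=10, m=1.0, n_survivors=2000, rng=rng)
print("average:", sum(sizes) / len(sizes), "maximum:", max(sizes))
```

The average should land near the g=10 row of the table, though not exactly on it, since the run is smaller and details such as how generations are counted may differ. Larger g values work the same way but take longer, because most founders go extinct and must be re-drawn.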

Expanding population (m=1.01)




Number of Descendants

g      Average   Maximum
10     6.2       54
20     12.2      112
40     25.5      232
80     62.2      559
160    194.1     1722
320    1170.5    12214

Predictably, haplogroups end up bigger in an expanding population, but still far short of the sizes of commonly dated real-world haplogroups. The case of m=1.01 is important, because it is the one which yields the maximum effective mutation rate considered by Z.U.F. assuming haplogroups start with one individual.

Thus, even the highest mutation rate considered by Z.U.F. (about 0.55μ over 400 generations) is derived by averaging over haplogroups that are unrealistic (too small). Y-STR variance accumulates at a higher rate in the real world.
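The "Average" columns can also be cross-checked without simulation. The sketch below relies on standard branching-process results rather than anything in Z.U.F.: with Poisson(m) sons per man, the probability q of extinction by generation n obeys q_{n+1} = exp(m(q_n - 1)) starting from q_0 = 0, and the unconditional expected lineage size is m^g. The average over surviving lineages is then m^g / (1 - q_g). The function name is illustrative.

```python
import math


def mean_descendants_given_survival(g, m):
    """Expected patriline size after g generations, given it is not extinct.

    Assumes Poisson(m) sons per man. The extinction probability by
    generation n obeys q_{n+1} = exp(m * (q_n - 1)) with q_0 = 0, and the
    unconditional expected size is m**g.
    """
    q = 0.0
    for _ in range(g):
        q = math.exp(m * (q - 1.0))
    return m ** g / (1.0 - q)


for m in (1.0, 1.01):
    row = [round(mean_descendants_given_survival(g, m), 1) for g in (10, 20, 40, 80, 160, 320)]
    print(m, row)
```

The results should come out close to the simulated averages; small differences may reflect sampling noise or how generations are counted in the simulation.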

Why are Z.U.F.-style simulated haplogroups so small?

It is surprising that these simulated haplogroups end up so small, looking nothing like commonly studied haplogroups even for an expanding population.

The apparent mystery is resolved, once we realize that m is nothing more than the average number of sons a man has. The reason why we see haplogroups so much bigger than the simulated ones is because for individual men, m may be much more, or much less than its population average. In other words, there is reproductive inequality, which could be due either to social advantage or to natural selection.

So, rather than having a uniform m for all men, we can allow m to vary in individual lineages. A man A may have m_A<m if he is impoverished or has a faulty Y-chromosome gene, and he may have m_A>m if he is a ruler or has an advantageous gene in his Y-chromosome.

The advantage could be slight but long-standing (a small fitness improvement) or brief and intense (a conquest or foundation of a dynasty). Its effect on the lucky lineage is an increase in the number of descendants. Its effect on Y-STR variance is a rate of increase approaching the germline rate.

It is clear, by now, that realistic haplogroup sizes can occur only when there is reproductive inequality. They are not the result of genetic drift, but of natural or social selection. And, effective mutation rates should be calculated over successful haplogroups under conditions of reproductive inequality, and not over all haplogroups under conditions of reproductive equality.(*)


A note on sampling

Consider a lineage of 1,000 men (i.e. ~ the maximum produced with reproductive equality) in a population of 1,000,000 men. Its frequency is thus 0.1%.

We take a sample of 1,000 men from this population; this is a much larger sample than is typically used in population studies, drawn from a smaller population than most real ones. We expect on average to find just 1 man from the lineage in question in our sample. You can't do a variance-based age estimate with one man!

Thus, it becomes clear why haplogroups produced by Z.U.F.-style simulations are uninteresting. You just never encounter enough representatives from them in a real population study. You are typically interested in the much larger haplogroups, which could only have proliferated under conditions of reproductive inequality, and which are the only ones that can yield enough representatives in a sample to allow for a variance calculation.
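The sampling argument can be made concrete with a short calculation. This is a sketch under a binomial model (sampling with replacement, a reasonable approximation when the sample is small relative to the population); the threshold of 10 men is a hypothetical figure for "enough to compute a variance", not a number from the literature.

```python
from math import comb


def prob_at_least(k, sample_size, frequency):
    """P(at least k lineage members in the sample), binomial model."""
    below = sum(
        comb(sample_size, i) * frequency**i * (1 - frequency) ** (sample_size - i)
        for i in range(k)
    )
    return 1 - below


lineage, population, sample = 1_000, 1_000_000, 1_000
frequency = lineage / population          # 0.1%
expected = sample * frequency             # expected lineage members in the sample

p_one = prob_at_least(1, sample, frequency)    # chance of seeing the lineage at all
p_ten = prob_at_least(10, sample, frequency)   # chance of seeing 10 or more (hypothetical threshold)
print(expected, p_one, p_ten)
```

Under these assumptions the lineage is missed entirely in more than a third of samples, and a sample containing ten or more of its members is vanishingly unlikely.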

Summary

In the previous post I showed that Z.U.F. calculate their effective rate over all simulated observations, but the rate is applied in the literature over a very specific set of observations, i.e. large haplogroups.

In this post, I showed that Z.U.F.-style simulations just don't produce realistic haplogroup sizes. Drift alone can't explain why millions of men share patrilineal ancestry. Large haplogroup sizes require an assumption of reproductive inequality, and Y-STR variance within them accumulates near the germline rate.

(*) Of course, if one studies numerically small populations, it is possible that a slower effective rate may be desired. My concern is with the large human populations (e.g. Greeks or Indians) where real haplogroup sizes exceed greatly those produced by simulations with reproductive equality.

UPDATE (August 8): Continued in On the effective mutation rate for Y-STR variance

July 28, 2008

SLC24A5 in Greeks

Since two subjects were heterozygous for the Thr(111) allele, its overall frequency in Greeks is 99.4%, within the range of 98.7% and 100% reported for European Americans.
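The 99.4% figure is a simple allele count. A sketch, assuming all 158 genotyped subjects reported in the abstract are counted and the remaining 156 are Thr(111) homozygotes:

```python
subjects = 158          # sample size reported in the abstract
heterozygotes = 2       # Thr(111)/Ala(111) carriers
alleles = 2 * subjects  # each subject carries two copies of the gene

thr_alleles = alleles - heterozygotes   # each heterozygote carries one Ala allele
thr_frequency = thr_alleles / alleles

print(f"{100 * thr_frequency:.1f}%")
```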

Exp Dermatol. 2008 Jul 7.

A study of a single variant allele (rs1426654) of the pigmentation-related gene SLC24A5 in Greek subjects.

Dimisianos G, Stefanaki I, Nicolaou V, Sypsa V, Antoniou C, Poulou M, Papadopoulos O, Gogas H, Kanavakis E, Nicolaidou E, Katsambas AD, Stratigos AJ.

Department of Medical Genetics, University of Athens, Agia Sophia Children's Hospital, Athens, Greece.

The SLC24A5 gene, the human orthologue of the zebrafish golden gene, has been shown to play a key role in human pigmentation. In this study, we investigate the prevalence of the variant allele rs1426654 in a selected sample of Greek subjects. Allele-specific polymerase chain reaction was performed in peripheral blood samples from 158 attendants of a dermatology outpatient service. The results were correlated with pigmentary traits and MC1R genotype. The vast majority of subjects (99%) were homozygous for the Thr(111) allele. Only two subjects from the control group (1.26%) were heterozygous for the alanine and threonine allele. Both of these Thr(111)/Ala(111) heterozygotes carried a single polymorphism of MC1R (one with the V92M variant and another with the V60L variant). Following reports of the rs1426654 polymorphism reaching fixation in the European population, our study of Greek subjects showed a prevalence of the Thr(111) allele, even among subjects with darker skin pigmentation or phototype.

Link

Ancient mtDNA from Inner Mongolia

Three individuals with mixed Caucasoid-Mongoloid affinities were an adult female (haplogroup C), 25yo male (haplogroup M), and 25-30yo male (haplogroup A). From the paper:
All haplogroups were Asian-specific, the haplotypes of 10 individuals are shared by modern Han Chinese, and the one-step neighbors to another 7 individuals also mainly distribute in modern Han Chinese (Yao et al., 2002). The phylogenetic analysis of the ancient population and extant Eurasian populations showed that the ancient population most closely related to the Han Chinese, especially the northern Han.
American Journal of Physical Anthropology doi: 10.1002/ajpa.20894

Ancient DNA analysis of human remains from the upper capital city of Kublai Khan

Yuqin Fu et al.

Abstract

Analysis of DNA from human archaeological remains is a powerful tool for reconstructing ancient events in human history. To help understand the origin of the inhabitants of Kublai Khan's Upper Capital in Inner Mongolia, we analyzed mitochondrial DNA (mtDNA) polymorphisms in 21 ancient individuals buried in the Zhenzishan cemetery of the Upper Capital. MtDNA coding and noncoding region polymorphisms identified in the ancient individuals were characteristic of the Asian mtDNA haplogroups A, B, N9a, C, D, Z, M7b, and M. Phylogenetic analysis of the ancient mtDNA sequences, and comparison with extant reference populations, revealed that the maternal lineages of the population buried in the Zhenzishan cemetery are of Asian origin and typical of present-day Han Chinese, despite the presence of typical European morphological features in several of the skeletons.

Link

July 25, 2008

German origin of Transylvanian Saxons

Using Athey's haplogroup predictor, with equal priors and a threshold of 50 and probability of 90%, the following haplogroups were predicted in the 59 males:

5 E1b1b
1 G1
2 G2a
2 H
4 I1
3 I2a(xI2a2)
1 I2a2
1 I2b1
1 J2b
1 N
2 R1a
22 R1b

Rom J Leg Med 12 (4) 247 – 255 (2004)

A study on Y-STR haplotypes in the Saxon population from Transylvania (Siebenbürger Sachsen): is there an evidence for a German origin?

Ligia Barbarii et al.

ABSTRACT: A study on Y-STR haplotypes in the Saxon population from Transylvania (Siebenbürger Sachsen): is there an evidence for a German origin? Y chromosome markers are increasingly used to investigate human population histories, being considered to be sensitive systems for detecting the population movements. In this study we present Y-STR data for a male population of Transylvanian Saxons in comparison with Y-haplotypes from Romanians and other European populations. The Transylvanian Saxons, called like that since medieval times, are representing a western population with unknown origin, settled in the Arch of Romanian Carpathian Mountains in the earliest of the 12th century. Historical and dialectal studies strongly suggest that they do not originate from Saxony, but more probably from the Mosel riversides (Rhine affluent) and also from the Eifel Mountains Valley (present territory of Luxembourg). Living protected by fortified cities in compact communities, they still represent a quite distinct population in Transylvania. For this study, 59 male samples were collected from the Siebenburgen area, subjects being selected by their Saxon surnames and paternal grandfather birthplace. A set of nine STR polymorphic systems mapping on the male-specific region of the human Y chromosome (DYS19, DYS385, DYS389 I/II, DYS390, DYS391, DYS392, DYS393) were typed by means of one or two multiplex PCR reactions and capillary electrophoresis. The typing results reflect high Saxon population haplotype diversity. Furthermore, we present data on the haplotype sharing of the Saxon population with other European populations, especially with Germans as well as with the Romanians and the Transylvanian Szekely.

Link (pdf)

Data on 17 Y-STRs

Int J Legal Med. 2008 Jul 24

Population and segregation data on 17 Y-STRs: results of a GEP-ISFG collaborative study.

Sánchez-Diz P, Alves C, Carvalho E, Carvalho M, Espinheira R, García O, Pinheiro MF, Pontes L, Porto MJ, Santapa O, Silva C, Sumita D, Valente S, Whittle M, Yurrebaso I, Carracedo A, Amorim A, Gusmão L; GEP-ISFG (The Spanish and Portuguese Working Group of the International Society for Forensic Genetics).

A collaborative work was carried out by the Spanish and Portuguese International Society for Forensic Genetics Working Group in order to extend the existing data on Y-short tandem repeat (STR) mutations at the 17 Y chromosome STR loci included in the AmpFlSTR YFiler kit (Applied Biosystems): DYS19, DYS385, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, and GATA H4.1. In a sample of 701 father/son pairs, 26 mutations were observed among 11,917 allele transfers across the 17 loci. After summing previously reported mutation data with our sample, mutation rates varied between 4.25 x 10^-4 (95% CI 0.05 x 10^-3 to 1.53 x 10^-3) at DYS438 and 6.36 x 10^-3 (95% CI 2.75 x 10^-3 to 12.49 x 10^-3) at DYS458. All mutations were single step, and mutations in the same father/son pair were found twice.
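The counts in the abstract imply a pooled germline rate for this sample alone, before any previously published data are added. A quick sketch of that arithmetic (pooling all 17 loci into one figure is my simplification; the abstract reports per-locus rates):

```python
father_son_pairs = 701
loci = 17
mutations = 26

allele_transfers = father_son_pairs * loci   # should match the 11,917 reported
pooled_rate = mutations / allele_transfers   # mutations per locus per generation

print(allele_transfers, f"{pooled_rate:.2e}")
```

This comes to a little over 2 mutations per 1,000 allele transfers, averaged across loci.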

Link

Ancient Thracian mtDNA

The presentation of the results isn't very clear. From a cursory comparison of the results listed in the text with the Genographic project list of motifs, at least the following seem represented in the ancient Thracian individuals:
  • 1 individual seems to be 16129A 16223T
  • 1 individual seems to be 16145A
  • 1 individual seems to be 16186T 16190C (however, this looks like 16189C in Fig. 4; 16186T and 16189C are found in haplogroup T1)
  • 1 individual seems to be 16193T 16283C (16193T is found in J2, which also carries 16069T (beyond the region sequenced) and 16126C (in the region sequenced but not found))
  • 1 individual seems to be 16311C
  • 2 individuals seem to be 16362C, which in West Eurasia seems to be found in R0a and R6
Anyway, feel free to comment if you can make better sense of these results.

Rom J Leg Med 12 (4) 239 – 246 (2004)

Paleo-mtDNA analysis and population genetic aspects of old Thracian populations from South-East of Romania

Cardos G. et al.

ABSTRACT: Paleo-mtDNA analysis and population genetic aspects of old Thracian populations from South-East of Romania. We have performed a study of mtDNA polymorphisms (HVR I and HVR II sequences) on the skeletal remains of some old Thracian populations from SE of Romania, dating from the Bronze and Iron Age in order to show their contribution to the foundation of the modern Romanian genetic pool and the degree of their genetic kinships with other old and modern human European populations. For this purpose we have applied and adapted three DNA extraction methods: the phenol/chloroform, the guanidine isotiocianat and silica particles and thirdly the Invisorb Forensic Kit (Invitek)-based DNA extraction method. We amplified by PCR short fragments of HVR I and HVR II and sequenced them by the Sanger method. So far, we have obtained mtDNA from 13 Thracian individuals, which we have compared with several modern mtDNA sequences from 5 European present-day populations. Our results reflect an evident genetic similarity between the old Thracian individuals and the modern populations from SE of Europe.

Link (pdf)

July 24, 2008

Cuban mtDNA and Y chromosomes

A message from this study is that Y chromosome diversity within an already settled territory can indeed be wiped out. Introduction of new pathogens or a technological differential between colonists and natives are just two possible ways to achieve this.

Many technological innovations (e.g. farming, Bronze, Iron) originated in a very small part of the Old World and spread far and wide. I would not be very surprised if this coincided with a massive replacement of Y chromosomes. The legacy of the earlier inhabitants may, of course, endure, via mtDNA, or autosomal DNA.

BMC Evol Biol. 2008 Jul 21;8(1):213. [Epub ahead of print]

Genetic origin, admixture, and asymmetry in maternal and paternal human lineages in Cuba.

Mendizabal I, Sandoval K, Berniell-Lee G, Calafell F, Salas A, Martinez-Fuentes A, Comas D.

ABSTRACT: BACKGROUND: Before the arrival of Europeans to Cuba, the island was inhabited by two Native American groups, the Tainos and the Ciboneys. Most of the present archaeological, linguistic and ancient DNA evidence indicates a South American origin for these populations. In colonial times, Cuban Native American people were replaced by European settlers and slaves from Africa. It is still unknown however, to what extent their genetic pool intermingled with and was 'diluted' by the arrival of newcomers. In order to investigate the demographic processes that gave rise to the current Cuban population, we analyzed the hypervariable region I (HVS-I) and five single nucleotide polymorphisms (SNPs) in the mitochondrial DNA (mtDNA) coding region in 245 individuals, and 40 Y-chromosome SNPs in 132 male individuals. RESULTS: The Native American contribution to present-day Cubans accounted for 33% of the maternal lineages, whereas Africa and Eurasia contributed 45% and 22% of the lineages, respectively. This Native American substrate in Cuba cannot be traced back to a single origin within the American continent, as previously suggested by ancient DNA analyses. Strikingly, no Native American lineages were found for the Y-chromosome, for which the Eurasian and African contributions were around 80% and 20%, respectively. CONCLUSIONS: While the ancestral Native American substrate is still appreciable in the maternal lineages, the extensive process of population admixture in Cuba has left no trace of the paternal Native American lineages, mirroring the strong sexual bias in the admixture processes taking place during colonial times.

Link

Hercules movie in development

Berg to direct 'Hercules':
"Hancock" director Peter Berg is spearheading a fresh take on Hercules for Universal.

Berg will produce and will develop to direct "Hercules: The Thracian Wars," a co-production of Spyglass Entertainment, Berg’s Film 44 and Radical Pictures. Spyglass and Universal will co-finance the film.

Ryan Condal will write the script, based on a five-issue comicbook series by Steve Moore that debuted in May through Radical Publishing.

July 21, 2008

How Y-STR variance accumulates: a comment on Zhivotovsky, Underhill and Feldman (2006)

An important erratum for this post.

Additions to this entry at the bottom (last update July 29)


In recent years, in most population genetics papers, an evolutionary mutation rate for Y chromosome microsatellites (STRs) of 0.00069/locus/generation has been used. This rate was proposed by Zhivotovsky et al. (2004) (pdf), and defended in Zhivotovsky et al. (2005), and especially Zhivotovsky, Underhill and Feldman (2006) (henceforth Z.U.F.)

This mutation rate is smaller than the observed germline mutation rate by a factor of 3-4. The germline mutation rate is observed by counting mutations directly, e.g., in father-son pairs, or in known pedigrees. Zhivotovsky et al. have provided two pieces of evidence in favor of their evolutionary rate:
  • Study of accumulation of STR variation in populations with known founding events, namely Bulgarian Roma and Maori, in their 2004 paper.
  • Simulations indicating a 3.6x discrepancy between the two rates in their 2006 paper, which is due to multiple bottlenecks in a haplogroup's history.
I was always apprehensive about what the "right" mutation rate should be:
We need to obtain good estimates of the mutation rate in order to pinpoint in time the common ancestor of a set of Y chromosomes. A factor of 3, especially for relatively recent events may correspond to a difference between early historical and late Paleolithic events.
Thus, I decided to look into the matter myself to be convinced -one way or another- of what the evolutionary mutation rate must be.

Methodology

The following assumptions, following Z.U.F., are made:
  • A man has 0, 1, 2, ... sons according to a Poisson process with mean m=1.
  • A step mutation (increase or decrease by 1 repeat) occurs with a mutation rate of µ=0.0025.
  • STR variance of the man's descendants is measured after g generations.
Results are averaged over N men who have descendants after g generations. I will call such men "Patriarchs". Thus, I generate random family trees for men until I have harvested N=10,000 of them who have living descendants today.
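The procedure just described can be sketched in a few lines. This is an illustration of the stated assumptions, not the code behind the numbers in this post: it tracks a single STR locus as an offset from the Patriarch's repeat count, uses the population variance (dividing by the number of descendants), and runs a small case (g=20, 1,000 Patriarchs) so that it finishes quickly. The function names and these choices are mine.

```python
import math
import random
import statistics


def poisson(mean, rng):
    """Draw a Poisson variate (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def descendant_alleles(g, m, mu, rng):
    """STR repeat offsets (relative to the founder) of the founder's
    male-line descendants after g generations; empty if the line died out."""
    alleles = [0]
    for _generation in range(g):
        sons = []
        for allele in alleles:
            for _son in range(poisson(m, rng)):
                draw = rng.random()
                if draw < mu / 2:
                    sons.append(allele + 1)   # gain of one repeat
                elif draw < mu:
                    sons.append(allele - 1)   # loss of one repeat
                else:
                    sons.append(allele)       # no mutation
        alleles = sons
        if not alleles:
            break
    return alleles


def average_variance(g, n_patriarchs, m=1.0, mu=0.0025, seed=0):
    """Average STR variance over founders with living descendants.

    mu is the germline step-mutation rate per generation (about 0.0025).
    """
    rng = random.Random(seed)
    variances = []
    while len(variances) < n_patriarchs:
        alleles = descendant_alleles(g, m, mu, rng)
        if alleles:
            variances.append(statistics.pvariance(alleles))
    return sum(variances) / len(variances)


print(average_variance(g=20, n_patriarchs=1000))
```

Because most Patriarchs in such a run have only a handful of descendants, the average variance should come out well below µ times g, which is the point developed in the sections that follow.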

Patriarch vs. MRCA

A consequence of the time-forward methodology of simulation is that a Patriarch may not be the Most Recent Common Ancestor (MRCA) of his descendants g generations into the future. Trivially, if a Patriarch has only one son, then that son, not the Patriarch, is the MRCA of his descendants. But, even if the Patriarch has many sons, and his group of descendants grows, it is possible (due to randomness of the fathering process) that at some generation only 1 descendant will survive.

Suppose that the Patriarch has lived in generation 0, and the MRCA lived in generation i. Thus, STR variance in the descendants at generation g (today) has accumulated over a time span of g-i generations, since, of course, at the generation i (of the MRCA), STR variance is zero.

Now, if we use a time-forward methodology from known foundation events (e.g. the arrival of the Roma in Bulgaria, or the Maori in New Zealand), it is perfectly right to see how STR variance accumulates from the known foundational event. We would then divide the accumulated STR variance by the known time span to determine an effective evolutionary mutation rate, similar to Zhivotovsky et al. (2004).

But, when the foundational event is unknown, when we are trying to estimate its age, then we can only go as far back as the MRCA, since at his time variance is zero. Therefore, by dividing accumulated variance by the evolutionary mutation rate of Z.U.F., we are over-estimating the time to the MRCA.

For example, with g=100, the average STR variance for the descendants of N=10,000 Patriarchs is 0.0755. But, if we average only those Patriarchs who are also the MRCA of their descendants, we obtain a value of 0.0824, or about 9% higher.
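To see what this difference means for an age estimate, recall that age is estimated as variance divided by rate. A sketch using only the two averages quoted above (the variable names are mine):

```python
g = 100
variance_all_patriarchs = 0.0755   # average over all N=10,000 Patriarchs
variance_mrca_only = 0.0824        # Patriarchs who are also the MRCA

excess = variance_mrca_only / variance_all_patriarchs - 1

# Effective rate calibrated on all Patriarchs (variance per generation)
rate_all = variance_all_patriarchs / g

# Applying that rate to a haplogroup whose MRCA truly lived g generations ago
estimated_age = variance_mrca_only / rate_all

print(f"{100 * excess:.1f}% higher variance; estimated age {estimated_age:.0f} generations")
```

In this example a true MRCA age of 100 generations is estimated at about 109, the same ~9% over-estimate.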

In general, the over-estimate (as a percentage) decreases as g increases: as g increases, the average number of descendants of a Patriarch increases, making them much less susceptible to a variance-reset type of bottleneck described here.

Thus, while the age difference between the MRCA and the Patriarch is real, its effect in the age estimate is not very pronounced. There is, however, a second, and much more serious problem, with the Z.U.F. rates when applied to evolutionary studies.

Prolific vs. Non-Prolific Patriarchs: an Observation Selection effect

Patriarchs starting at generation 0 will have a very variable number of descendants at generation g. By averaging over all of them, we are estimating the average STR variance in the descendants of men who lived g generations ago.

Now, consider how this average changes if we average only over the k most "prolific" men (with the most descendants) out of all the N=10,000 Patriarchs:


k        Average Variance
100      0.1721
1000     0.1407
2500     0.1219
5000     0.1033
10000    0.0755


It is clear that the STR variance in the descendants of the most "prolific" Patriarchs is much higher than in the descendants of the least "prolific" ones. In fact, for the most prolific Patriarchs, variance accumulates near the germline mutation rate, and not at the lower evolutionary effective rate.
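
To make the comparison with the germline rate concrete, the averages in the table above can be divided by µ·g. The short sketch below assumes g=100 (the value used for the 0.0755 average earlier) and µ=0.0025; the function name is illustrative.

```python
MU = 0.0025   # germline rate used in the simulations
G = 100       # generations in this experiment

# k most prolific Patriarchs -> average variance (values from the table above)
avg_variance = {100: 0.1721, 1000: 0.1407, 2500: 0.1219, 5000: 0.1033, 10000: 0.0755}

def rate_as_fraction_of_mu(var, g=G, mu=MU):
    """Per-generation rate of variance accumulation, as a multiple of mu."""
    return var / (g * mu)

fractions = {k: rate_as_fraction_of_mu(v) for k, v in avg_variance.items()}
for k, f in fractions.items():
    print(f"{k:>6} most prolific: {f:.2f} mu")
```

The 100 most prolific Patriarchs come out at about 0.69µ, while the average over all 10,000 is about 0.30µ.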

Below is the cumulative percentage of the descendants of the k most prolific Patriarchs, with k from 1 to N.

It can be seen that e.g., from the most prolific half of the Patriarchs stems 84% of the descendants. And this, assuming no social inequality in the number of progeny, i.e. each man having the exact same average probability (m=1) of fathering a son. Thus, in reality, the more prolific Patriarchs may have an even larger fraction of the descendants.

Why is this important? Because, in population studies, scientists are likely to observe (in the finite samples they collect) multiple descendants only of the most prolific of the Patriarchs. Thus, for the vast majority of the Patriarchs with few descendants, we are likely to sample none, or few, of their descendants.

This means that there is an inherent observation selection effect in the types of Patriarchs we are likely to study: they are much more likely to be among the prolific ones. Coupling this observation with the knowledge that STR variance in the descendants of prolific Patriarchs accumulates near the germline mutation rate (0.69µ for the 100 most prolific ones in my experiment), we, once again, conclude that the STR variance in haplogroups likely to be made the object of scientific study accumulates near the germline mutation rate, and at the very least, faster than the evolutionary rate of Z.U.F.

Closing Remarks

Z.U.F. have also proposed two additional demographic scenarios under which a higher effective mutation rate would be observed:
  • A sudden jump in the size of the haplogroup after it appears
  • An expanding population (m>1)
Both factors seem reasonable for post-Holocene human populations. It is well known that -whatever temporary setbacks there were- mankind has overall experienced a substantial population growth in recent millennia. Thus, an expanding population seems like a fair assumption.

Moreover, it is reasonable to assume that in stratified human societies, a few males, (leaders, or conquerors), or groups of closely related males may have generated a disproportionate number of descendants in the short-term.

In summary:
  • The age difference between the Patriarch and the MRCA indicates that Variance/0.00069 overestimates the age of the MRCA somewhat (but not very much).
  • A prolific Patriarch's descendants are more likely to be sampled by scientists, and tend to have a higher STR variance. Hence, Variance/0.00069 overestimates the age of the MRCA, perhaps substantially.
  • Demographic factors, such as population growth, or short-term success by related males, indicate that Variance/0.00069 overestimates the age of the MRCA.
In view of the above, and keeping in mind both the stochastic factors that cause STR variance to fluctuate around its expected value, as well as uncertainties in demographic history, I do believe that ages calculated with the evolutionary mutation rate of 0.00069/locus/generation are significantly overestimated.

1 Z.U.F. used a germline mutation rate of µ=0.001. For the purposes of simulation, this is not an important difference, as they themselves note. I choose the rate of 0.0025 because it is closer to the actual human germline mutation rate for STRs.
2 Z.U.F. generated 50,000 men and then averaged over the men who had descendants. I, on the other hand, generate as many men as it takes to harvest at least N men with descendants, to ensure that I average a substantially large number of such men.

Editorial change (Jul 22): erroneously written "exceeds", in paragraph 2, changed to "is smaller than".

Update (July 23):

To further elucidate how the observation selection effect may make lineages seem older than they really are, I carried out another small experiment (g=110, N=10,000, m=1).

The age of each group is inferred by dividing the accumulated variance by the evolutionary rate of 0.0006944 (=μ/3.6).

The average variance over all N in this experiment is 0.0867, thus, the average inferred age is 125 generations, close to the truth (110 generations), allowing for the correction in age between the Patriarch and the TMRCA.

However, if we calculated the average variance over ten groups of 1,000 lineages (out of all N=10,000) according to the number of descendants, we see, as described above, that more "prolific" lineages have accumulated more variance, whereas less "prolific" ones have accumulated less variance than the overall average of 0.0867.

Thus, over the 10% most populous lineages (right of the figure), the average inferred age is 209 generations, or a 90% overestimate of the true age!
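
The arithmetic behind these inferred ages is short enough to write out. The sketch below uses only numbers quoted in this update (variance 0.0867, true age 110 generations, inferred age 209 generations for the top decile); the function names are illustrative.

```python
MU = 0.0025
ZUF_RATE = MU / 3.6   # about 0.0006944 per locus per generation
TRUE_G = 110          # generations actually simulated

def inferred_generations(variance, rate=ZUF_RATE):
    """Age obtained by dividing accumulated variance by a mutation rate."""
    return variance / rate

def overestimate_pct(inferred, true_g=TRUE_G):
    return 100 * (inferred / true_g - 1)

all_lineages = inferred_generations(0.0867)
print(f"all lineages: {all_lineages:.0f} generations")
print(f"top 10%: {overestimate_pct(209):.0f}% above the true age")
```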

But, as I mentioned, it is precisely these populous lineages (which don't just have "some" descendants today, but thousands and millions of them) that are likely to be studied, because they are the only ones that have enough representatives in a sample of 100-1,000 men, typically seen in a population study, to allow for an age estimate via a variance calculation.

Update (July 24): Haplogroup sizes

The number of a Patriarch's descendants after g generations is a random variable which depends on the parameters m (the population growth constant), and g, the number of generations.

Scientists typically look at haplogroups with thousands or millions of existing members. Are such haplogroups produced in the types of simulations performed by Z.U.F.?

I estimate the average size of the haplogroups produced by Z.U.F. for different g=10,20,...,700 and m=1.

It is evident that this number increases linearly with g at a rate estimated to be 0.5/generation [This was also noted by Z.U.F. who state: "the average size of the surviving haplogroups increased each generation by a value rapidly approaching 0.5"] However, this means, that the average haplogroup at 700 generations has a size of ~350 men.
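
The 0.5/generation slope can also be checked without simulation, under the assumption (not stated in this update) that each man's number of sons is Poisson-distributed with mean m. In that case the probability q that a line is extinct by a given generation follows the recursion q ← exp(m(q − 1)) starting from q = 0, and the mean size of surviving lines after g generations is m^g / (1 − q). A minimal sketch with an illustrative function name:

```python
import math

def mean_surviving_size(g, m=1.0):
    """Mean number of living descendants after g generations, averaged only
    over founders whose line has not died out (Poisson offspring assumed)."""
    q = 0.0  # probability the line is extinct by the current generation
    for _ in range(g):
        q = math.exp(m * (q - 1.0))
    return m ** g / (1.0 - q)

for g in (100, 400, 700):
    print(g, round(mean_surviving_size(g)))
```

For m=1 the result grows by very nearly 0.5 per generation, in line with the simulated haplogroup sizes described above.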

Thus, not only is the average variance estimated by Z.U.F. inappropriate because of an observation selection effect (averaging over small and large haplogroups alike), but it seems to miss the relevant observations altogether, i.e. the really large haplogroups numbering in the hundreds of thousands or millions. Yet it is precisely for such large haplogroups that it has often been used in the literature.

How can we produce "realistic" haplogroup sizes, close to those likely to become an object of scientific study in contemporary human populations? We can either:
  • increase the number of initial representatives, i.e. start with many related men with identical Y chromosomes rather than just 1, or we can
  • increase the population growth constant m to something higher than 1, i.e. a growing population.
Yet, both these changes have the same effect, namely the accumulation of variance at a higher rate than the Z.U.F. rate.

Indeed, Z.U.F. produce some such large haplogroups in some of their simulations (Fig. 1 asterisks, Fig. 2 squares/diamonds), all of which show -predictably- a higher effective rate than their 3.6x slower rate.

They caution against such large haplogroup sizes ["population size exceeds 1 million by generation 1000, which is not realistic for many local tribes."]. Granted, -- if one looks at local tribes never growing to large numbers.

And yet, some or all of the co-authors of Z.U.F. did not limit their use of the 3.6x slower rate to local tribes: Cinnioglu et al. 2004 (pdf), Sengupta et al. (2006), King et al. (2008) all apply the 0.00069 rate for populations (and haplogroups) that have grown to much more than 1 million in less time, thus overestimating severely their age.

Update (July 24): Variance of a large haplogroup

Following the previous observations, naturally, I wanted to see for myself what the STR variance of an ancient lineage with a large number of modern descendants actually looks like. My target size is 1,000,000, which is about 20% of modern Greek males.

I consider two cases:
  • Expansion commencing in the Late Bronze Age (g=120 or 1,600BC with a generation length of 30)
  • Expansion commencing in the early Neolithic (g=300 or 7,000BC)

I harvest N=1,000 haplogroups for each of these cases. I set the growth constant at m=1.100694 for the Bronze Age, and m=1.039122 for the Neolithic. This ensures that enough "large" haplogroups will be generated during simulation. Naturally, the overall population grows at a smaller rate, but the successful lineages will grow much faster than the population average.

Note that I harvest only haplogroups whose MRCA lived in the specified time span. Also, I harvest haplogroups whose final size is between 750,000 and 1,250,000 to match my target size of 1,000,000. Indeed, the average size of the harvested haplogroups is 964,327 for the Bronze Age, and 979,693 for the Neolithic.
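
As a quick arithmetic check on the two growth constants: in any branching process where each man has m sons on average, the expected number of living male-line descendants per founder after g generations (counting extinct lines as zero) is m^g. The sketch below simply evaluates that for the two cases; how the constants were actually chosen is not stated, so this is only an observation on the numbers.

```python
scenarios = {
    "Bronze Age": (1.100694, 120),
    "Neolithic": (1.039122, 300),
}

def expected_descendants(m, g):
    """Mean living male-line descendants per founder, extinct lines included."""
    return m ** g

for name, (m, g) in scenarios.items():
    print(f"{name}: about {expected_descendants(m, g):,.0f} per founder")
```

Both constants give roughly 100,000 per founder. Since that mean includes the many lines that die out, the surviving lineages are necessarily larger on average, which fits the remark that successful lineages grow much faster than the population as a whole.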

Here are the results:
  • ~1 million descendants of a Bronze Age (120 generations ago) ancestor have an STR variance of 0.269 +/- 0.087
  • ~1 million descendants of a Neolithic (300 generations ago) ancestor have an STR variance of 0.629 +/- 0.156
If we used the germline mutation rate (μ=0.0025) we would estimate the ages of these haplogroups as:
  • Bronze Age: 107.6 generations, or a 10% underestimate
  • Neolithic: 251.6 generations, or a 16% underestimate
On the other hand, if we used the evolutionary rate of 0.00069 of Z.U.F., our estimates would be:
  • Bronze Age: 389.9 generations, or a 225% overestimate
  • Neolithic: 911.6 generations, or a 203% overestimate
It is clear that the Z.U.F. rate of 0.00069 substantially overestimates the ages of large recent haplogroups, whereas the germline rate underestimates them by a little.
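
The four estimates above all come from the same division of variance by a rate. A short sketch that reproduces them from the simulated variances (function names are illustrative):

```python
MU = 0.0025          # germline rate
ZUF_RATE = 0.00069   # evolutionary rate of Z.U.F.

# scenario -> (true age in generations, simulated mean variance)
cases = {"Bronze Age": (120, 0.269), "Neolithic": (300, 0.629)}

def estimate(variance, rate):
    return variance / rate

def pct_error(estimated, true_g):
    """Positive = overestimate, negative = underestimate."""
    return 100 * (estimated - true_g) / true_g

for name, (g, var) in cases.items():
    for label, rate in (("germline", MU), ("Z.U.F.", ZUF_RATE)):
        est = estimate(var, rate)
        print(f"{name:10s} {label:8s} {est:6.1f} gen  {pct_error(est, g):+.0f}%")
```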

Let's look at some concrete examples of age estimates in the literature, where I compare my own (first) estimates with the published ones. Here is how my estimates are derived:

For a Bronze Age ancestor (g=120) it is: 0.269 =(approx) 0.9 μg

For a Neolithic ancestor (g=300) it is: 0.629 =(approx) 0.84 μg

Thus, the correction multiplier, if the variance is between 0.269 and 0.629 is between 0.84 and 0.9; I will use the midpoint 0.87. If the variance is less than 0.269, then I use 0.9. If the variance is more than 0.629 then I use 0.84. Of course, the correction factor could be expressed more accurately as a function of the variance.

Note that the generation length preferred by these authors is 25, by me it is 30. All ages are ky BC.
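
The procedure just described can be sketched as a few lines of Python. Two details are assumptions made for illustration: "today" is taken as 2008, the year of this post, and a variance exactly equal to 0.269 or 0.629 falls in the middle band. Rounding in the tables that follow may therefore differ slightly from what this sketch prints.

```python
MU = 0.0025
GENERATION_YEARS = 30
REFERENCE_YEAR = 2008   # assumption: "today" is the year of the post

def correction(variance):
    """Multiplier c such that variance is roughly c * mu * g."""
    if variance < 0.269:
        return 0.90
    if variance > 0.629:
        return 0.84
    return 0.87

def age_generations(variance, mu=MU):
    return variance / (correction(variance) * mu)

def calendar_year(variance):
    """Negative values are years BC, positive values years AD
    (no year-zero adjustment)."""
    return REFERENCE_YEAR - age_generations(variance) * GENERATION_YEARS

# Example: a variance of 0.18
print(round(age_generations(0.18)), "generations")
print(round(calendar_year(0.18)), "(negative = BC)")
```

For a variance of 0.18 this gives 80 generations, i.e. 2,400 years, or about 0.4 ky BC.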

Cinnioglu et al. (2004)

In this paper, an evolutionary rate of 0.0007 is used.



Haplogroup   Variance   Cinnioglu   Dienekes
E-M78        0.18       4.4         0.4
G-P15        0.35       10.5        2.9
I-P37        0.23       6.2         1.1
J-M12        0.24       6.6         1.2
J-M67        0.33       9.8         2.6
R-M269       0.33       9.8         2.6

E-M78 is dated to 400BC, only a couple of centuries after the historical Greek colonization. E-M78 reaches its maximum in the Peloponnese, a major source of Greek colonists.

I-P37 and J-M12 are dated to 1,100BC and 1,200BC, at around the time that e.g. the Phrygians from the Balkans are believed to have migrated to Asia Minor. I-P37 and J-M12 reach their maxima in areas north of Greece where the Phrygians are said to have originated.

Sengupta et al. (2006)



Haplogroup            Variance   Sengupta   Dienekes
J2-M410               0.38       11.7       3.3
R-M17                 0.39       12         3.4
R-M17 (upper caste)   0.26       7.3        1.5
G-P15                 0.29       8.5        2
J-M241                0.38       11.8       3.3

Thus, all the exogenous West Asian lineages in India have post-Neolithic ages, with R-M17 having a suggestive age of 1,500BC coinciding with the suggested date for the Indo-Aryans.

King et al. (2008)



Haplogroup (site)         Variance   King   Dienekes
J-M12 (Nea Nikomedeia)    0.18       4.7    0.4
E-V13 (Sesklo/Dimini)     0.24       6.6    1.2
E-V13 (Lerna Franchthi)   0.25       7.2    1.3
J-M92 (Crete)             0.14       3.1    0.1 AD
J-M319 (Crete)            0.14       3.1    0.1 AD
E-V13 (Crete)             0.09       1.1    0.8 AD

These are very localized samples, so they should not be interpreted as reflecting expansion times in Greece itself. They do, however, suggest a Bronze Age expansion of E-V13 and a much later arrival of E-V13 in Crete.

Note that for Crete, the 1,000,000-haplogroup size assumption is a substantial overestimate, so my age estimates are also substantial underestimates.

Update (July 25): R-M17 in South Siberia

Derenko et al. (2006) "Contrasting patterns of Y-chromosome variation in South Siberian populations from Baikal and Altai-Sayan regions" calculate the variance of R-M17 chromosomes in South Siberia, using the Z.U.F. rate, arriving at an age of 11.3kya corresponding to a value of 0.31. This corresponds to 2,300BC according to my estimate (see previous update).

Recently Bouakaze et al. (Int J Legal Med (2007) 121:493–499) reported the presence of R-M17 chromosomes in ancient inhabitants of South Siberia and the Andronovo culture (2,500BC-1,500BC).

The Andronovo culture is widely believed to be of Eastern European ultimate origin, reflecting the eastward movement of the Kurgan culture, and is associated by some with the ancestors of the Indo-Iranians.

In the Balkans, again in Z.U.F. years, the age of R-M17 is 15.8kya, corresponding to a variance of 0.44, or ~4,000BC according to my estimate.

Update (July 25): Baltic Y chromosomes

Lappalainen et al. (2008) use the Z.U.F. rate to estimate the antiquity of lineages in the Baltic region. Dates are ky BC.



Haplogroup   Lappalainen   Dienekes
I1a          5.7           1
N3           6.8           1.5
R1a1         8.7           1.9

1,000BC for I1a in the Baltic region is within the time frame of the emergence of the Germanic people who did experience a strong demographic growth.
1,500BC for N3 shows a rather late time for Finno-Ugrians. However, it must be noted that smaller demographic sizes would impose more drift, and hence a slower accumulation of variance. Therefore, this time is probably underestimated.
1,900BC for R1a1 is consistent with the northern edge of the expansion of R1a1. Once again, reduced variance may also be influenced by smaller population numbers, making this a possible underestimate.

Update (July 25): Southeastern Europe (the Balkans)

Pericic et al. (2005) use the Z.U.F. rate to estimate ages of Y-chromosome lineages in the Balkans. Dates are ky BC.



Haplogroup                  Pericic   Dienekes
I1b* (xM26)                 8.1       2
E3b1α                       5.3       0.9
R-M17                       13.8      3.8
R-M269                      9.6       2.3
J-M241 (without Kosovars)   1         0.8 AD

Thus, Balkan haplogroup I seems related to a Bronze Age origin, with R-M17 being substantially older, and deriving perhaps from northern Balkan Neolithic or alternatively intrusive Kurgan populations. J-M241 seems to be quite young, similar to J-M12 in Nea Nikomedeia (see discussion of King et al. (2008) above).

The young ages of J-M12 and J-M241 also explain the striking inverse correlation between this clade and J-M410, which makes sense if it expanded later. A fairly late expansion also explains its under-representation in Southern Italy and Anatolia: it appears to be a rather young and "Epirotic" clade that was too late in coming to significantly participate in the historical Greek colonization.

Update (July 26): E3b in Cyprus and Southern Italy

Capelli et al. (2005) [Population Structure in the Mediterranean Basin: A Y Chromosome Perspective] study Y-chromosome variation in many Mediterranean populations including Cyprus. I use a mutation rate of 0.0018 for the six markers used in this study (Quintana-Murci et al. AJHG 68(2) pp. 537 - 542 ). Ages are in ky BC.

I come up with an age of 1.4ky BC for E3b in Cyprus, which is consistent with Mycenaean and later Greek settlements on the island.

I also looked at Southern Italian Y chromosomes. I removed those with values other than (13,12) in (DYS19, DYS388), since these values are universal in Greek E-V13, in order to remove possible contamination from non-E-V13 chromosomes. The resulting age is 900BC, once again very close to the historical Greek colonization of Magna Graecia.

Update (July 26): A more elaborate population growth model

Z.U.F. also propose (Fig. 2 triangles) a more elaborate population growth with:
  • m=1.002 before 400 generations
  • m=1.012 from 400 to 14 generations ago
  • m=1.12 from 14 to 8 generations ago
  • m=1.25 from 8 generations ago to current time

I ran a simulation (g=1000, N=10,000) with this population growth model. The average size of the descent groups of the MRCAs is 692,982 men. Averaged over all of them, variance is 1.37.
  • With the germline mutation rate, an estimate of 549 generations (45% underestimate)
  • With the Z.U.F. evolutionary rate, an estimate of 1,988 generations (99% overestimate)
If we limit ourselves only to the 10, 1000, 5000 most prolific MRCAs (out of the N=10,000), we obtain ages (respectively):
  • With the germline mutation rate: 776, 747, 668 generations
  • With the Z.U.F. evolutionary rate: 2,810, 2,707, 2,419 generations

Thus, one can estimate that STR variance since the time of the MRCA accumulates at a rate of ~0.75μ / generation.
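
Since each germline-rate age estimate is just variance/µ, dividing it by the true age (g=1000) gives the effective accumulation rate as a fraction of µ. A minimal sketch using the figures quoted above (names are illustrative):

```python
TRUE_G = 1000

# age estimates (generations) obtained with the germline rate, from the text above
germline_estimates = {"all MRCAs": 549, "top 5000": 668, "top 1000": 747, "top 10": 776}

def rate_fraction(estimated_generations, true_g=TRUE_G):
    """Effective rate of variance accumulation as a multiple of mu."""
    return estimated_generations / true_g

for label, est in germline_estimates.items():
    print(f"{label:10s} {rate_fraction(est):.2f} mu")
```

The most prolific MRCAs land between roughly 0.67µ and 0.78µ, which is where the ~0.75µ figure comes from.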

And, yet, the 0.00069 rate has been used to date Paleolithic events, e.g., by Semino et al. (2004) [Am. J. Hum. Genet. 74:1023–1034, 2004], leading to general age overestimates.

Update (July 29)

My discussion is continued in Haplogroup sizes and observation selection effects (continued)

Y chromosomes and mtDNA of Daghestan groups

This is a free paper which establishes the difference between highland Northeast Caucasian speakers and lowland Altaic speakers in Daghestan. The lowland groups show evidence of Mongoloid haplogroups in both Y chromosomes and mtDNA, while the highland groups are dominated by haplogroup J:
The highland Avar, Dargin, and Kubachi exhibit high frequencies of haplogroup J (0.56, 1.00, and 0.67, respectively)
According to Table 2, the Avars possess 0.33 of J2, so, consistent with previous observations, the Northeast Caucasian groups are J1 (or at least J*(xJ2)) exclusive.

Interestingly, haplogroup G occurs in the Avars (0.06) but not in the other highland groups. Haplogroup G is common in the Southern Caucasus. The mountain groups also have little R1*(xR1a1) (0.06 in Avars, 0.08 in Kubachi) and no I, R1a1 or E.

It certainly seems to be the case that the highland Northeast Caucasian speakers are descended from a J1-dominated ancient Near Eastern population which was preserved due to patrilocal endogamy. The relationship -that I wrote about earlier- of these Caucasian J1's to the Arabian J1's, the second major region of J1 dominance remains to be seen.

BMC Genetics 2008, 9:47 doi:10.1186/1471-2156-9-47

Culture creates genetic structure in the Caucasus: Autosomal, mitochondrial, and Y-chromosomal variation in Daghestan

Elizabeth E Marchani, W Scott Watkins, Kazima Bulayeva, Henry C Harpending, Lynn B Jorde

Abstract

Background

Near the junction of three major continents, the Caucasus region has been an important thoroughfare for human migration. While the Caucasus Mountains have diverted human traffic to the few lowland regions that provide a gateway from north to south between the Caspian and Black Seas, highland populations have been isolated by their remote geographic location and their practice of patrilocal endogamy. We investigate how these cultural and historical differences between highland and lowland populations have affected patterns of genetic diversity. We test 1) whether the highland practice of patrilocal endogamy has generated sex-specific population relationships, and 2) whether the history of migration and military conquest associated with the lowland populations has left Central Asian genes in the Caucasus, by comparing genetic diversity and pairwise population relationships between Daghestani populations and reference populations throughout Europe and Asia for autosomal, mitochondrial, and Y-chromosomal markers.

Results

We found that the highland Daghestani populations had contrasting histories for the mitochondrial DNA and Y-chromosome data sets. Y-chromosomal haplogroup diversity was reduced among highland Daghestani populations when compared to other populations and to highland Daghestani mitochondrial DNA haplogroup diversity. Lowland Daghestani populations showed Turkish and Central Asian affinities for both mitochondrial and Y-chromosomal data sets. Autosomal population histories are strongly correlated to the pattern observed for the mitochondrial DNA data set, while the correlation between the mitochondrial DNA and Y-chromosome distance matrices was weak and not significant.

Conclusions

The reduced Y-chromosomal diversity exhibited by highland Daghestani populations is consistent with genetic drift caused by patrilocal endogamy. Mitochondrial and Y-chromosomal phylogeographic comparisons indicate a common Near Eastern origin of highland populations. Lowland Daghestani populations show varying influence from Near Eastern and Central Asian populations.

Link (pdf)

July 18, 2008

'Ten Commandments' of race and genetics

Via the New Scientist:
Even with the human genome in hand, geneticists are split about how to deal with issues of race, genetics and medicine.

Some favor using genetic markers to sort humans into groups based on ancestral origin – groups that may show meaningful health differences. Others argue that genetic variations across the human species are too gradual to support such divisions and that any categorisation based on genetic differences is arbitrary.

These issues have been discussed in depth by a multidisciplinary group – ranging from geneticists and psychologists to historians and philosophers – led by Sandra Soo-Jin Lee of Stanford University, California.

Now the group has released a set of 10 guiding principles for the scientific community, published as an open letter in this week's Genome Biology.

Here is my commentary on each of the "commandments":
1. All races are created equal

No genetic data has ever shown that one group of people is inherently superior to another. Equality is a moral value central to the idea of human rights; discrimination against any group should never be tolerated.
This is a vague statement that is false for two reasons: (i) for any particular single trait, there is a wealth of evidence that one race may be genetically better than another, e.g., Caucasoids are inherently more likely to get skin cancer than Negroids. (ii) there is no set-in-stone way to rank two groups across many different traits. But, this is an obvious statement: if someone is beautiful and dumb and another one is ugly and intelligent, then you can't say that one is better than another: it depends on what importance you assign to different traits.
2. An Argentinian and an Australian are more likely to have differences in their DNA than two Argentinians

Groups of human beings have moved around throughout history. Those that share the same culture, language or location tend to have different genetic variations than other groups. This is becoming less true, though, as populations mix.
Correct, although populations are hardly mixing at a very high rate even in our interconnected world, but definitely more so than in pre-Columbian times.
3. A person's history isn't written only in his or her genes

Everyone's genetic material carries a useful, though incomplete, map of his or her ancestors' travels. Studies looking for health disparities between individuals shouldn't rely solely on this identity. They should also consider a person's cultural background.
Essentially correct, since groups and individuals differ both because of genes and because of culture.
4. Members of the same race may have different underlying genetics

Social definitions of what it means to be "Hispanic" or "black" have changed over time. People who claim the same race may actually have very different genetic histories.
Correct in the sense that there is variation within races. Also, in the sense that socially-defined races such as "black" and "Hispanic" do not correspond perfectly to biological races. "Blacks", at least in the United States are usually thought of as partial Negroids, and "Hispanics" are usually thought as Spanish speakers who tend to have a variable amount of Caucasoid and American Mongoloid ancestry.
5. Both nature and nurture play important parts in our behaviors and abilities

Trying to use genetic differences between groups to show differences in intelligence, violent behaviors or the ability to throw a ball is an oversimplification of much more complicated interactions between genetics and environment.
Essentially correct. However, this statement is often used to "ease the blow" of the fact that races may indeed have genetic differences that affect outcomes irrespective of environments, or at least in the range of environments that people tend to find themselves in in the 21st century.
6. Researchers should be careful about using racial groups when designing experiments

When scientists decide to divide their subjects into groups based on ethnicity, they need to be clear about why and how these divisions are made to avoid contributing to stereotypes.
No disagreement here.
7. Medicine should focus on the individual, not the race

Although some diseases are connected to genetic markers, these markers tend to be found in many different racial groups. Overemphasising genetics may promote racist views or focus attention on a group when it should be on the individual.

Focusing on the individual is a noble goal for the future. Doctors don't have infinite time and resources to study the individual in all its particulars, so they work by placing him and his condition in a few relevant categories, e.g., "old white male". The category "white" may be of little relevance if one has a broken limb but of greater relevance if one has a skin pathology.

Individuals are real, but we don't really perceive individuals: we perceive a cloud of categories and attributes about individuals, as time, knowledge, and interest allow, and one of these categories -and not an insignificant one- is their race.
8. The study of genetics requires cooperation between experts in many different fields

Human disease is the product of a mishmash of factors: genetic, cultural, economic and behavioral. Interdisciplinary efforts that involve the social sciences are more likely to be successful.

Certainly.

9. Oversimplified science feeds popular misconceptions

Policy makers should be careful about simplifying and politicising scientific data. When presenting science to the public, the media should address the limitations of race-related research.

Scientists should try to make scientific results accessible to the public without fueling misconceptions. A big part of this is being honest about race-related research, something which many scientists holding a politically correct "races don't exist/races are social constructs" view seem unwilling to do.
10. Genetics 101 should include a history of racism

Any high school or college student learning about genetics should also learn about misguided attempts in the past to use science to justify racism. New textbooks should be developed for this purpose.
Genetics 101 should focus on the science of genetics, nothing more and nothing less. It should impart to the student correct notions about the science, and about differences between groups.

UPDATE (July 19):

The Genome Biology open letter on which the New Scientist article is based.

Better mental health of African Americans is not explained by social relationships

Personal Relationships doi: 10.1111/j.1475-6811.2008.00195.x

Race, social relationships, and mental health

K. JILL KIECOLT et al.

ABSTRACT

Researchers often assume that the extent, quality, and effectiveness of personal relationships explain why African Americans have relatively good mental health despite experiencing high levels of stress. This study tests this assumption using data from the 1990–1992 National Comorbidity Survey. Few racial differences emerge in patterns of social relationships, and the nature and quality of social relationships do not explain African Americans' resiliency on mental health. Several aspects of social relationships benefit African Americans' mental health more than Whites', but these moderating effects are insubstantial. Hence, the data do not support the assumption. If social relationships help explain the lack of racial differences in mental health, their nature and effects must be more adequately conceptualized.

Link

Nasal passage differences between Caucasoids and Negroids

American Journal of Physical Anthropology

Ecogeographic variation in human nasal passages

Todd R. Yokley

Abstract

Theoretically, individuals whose ancestors evolved in cold and/or dry climates should have greater nasal mucosal surface area relative to air volume of the nasal passages than individuals whose ancestors evolved in warm, humid climates. A high surface-area-to-volume (SA/V) ratio allows relatively more air to come in contact with the mucosa and facilitates more efficient heat and moisture exchange during inspiration and expiration, which would be adaptive in a cold, dry environment. Conversely, a low SA/V ratio is not as efficient at recapturing heat and moisture during expiration and allows for better heat dissipation, which would be adaptive in a warm, humid environment. To test this hypothesis, cross-sectional measurements of the nasal passages that reflect surface area and volume were collected from a sample of CT scans of patients of European and African ancestry. Results indicate that individuals of European descent do have higher SA/V ratios than individuals of African descent, but only when decongested. Otherwise, the two groups show little difference. This pattern of variation may be due to selection for different SA/V configurations during times of physical exertion, which has been shown to elicit decongestion. Relationships between linear measurements of the skeletal nasal aperture and cavity and cross-sectional dimensions were also examined. Contrary to predictions, the nasal index, the ratio of nasal breadth to nasal height, is not strongly correlated with internal dimensions. However, differences between the nasal indices of the two groups are highly significant. These results may be indicative of different adaptive solutions to the same problem.

Link

July 17, 2008

Beauty map of London

This is the kind of quantitative study that I really like. There is so much anecdotal talk and debate about whether people from this region/country/continent/class/religion etc. are more beautiful/attractive/intelligent/etc. but with the exception of IQ and personality traits, I have seen very little quantitative evidence for these assertions.

Like g, where an individual's correlated performance across multiple test items allows us to extract a common underlying intelligence factor, correlated measures of attractiveness across many observers could in principle allow us to extract an individual's BQ (beauty quotient) in a controlled social science experiment.

Personality and Individual Differences doi:10.1016/j.paid.2008.05.005

A beauty-map of London: Ratings of the physical attractiveness of women and men in London’s boroughs

Viren Swami and Eliana G. Hernandez

Abstract

In 1908, Francis Galton discussed anecdotal data he had collected for the compilation of a ‘beauty-map of the British Isles’. Based on his discussion, the present study attempted to compile a more empirical beauty-map of London. A community sample of 461 Londoners completed a questionnaire in which they rated the physical attractiveness of women and men in London’s 33 boroughs, as well as their familiarity with those boroughs. Results showed a significant interaction between borough and rated sex, with women being rated as more attractive across boroughs, and three boroughs in particular (the City of London, the City of Westminster, and Kensington and Chelsea) being rated high in physical attractiveness. Overall, ratings of attractiveness were significantly positively correlated with familiarity of boroughs, as well as objective measures of borough affluence (specifically, annual gross pay and average house prices) but not of borough health (life expectancy). These results are discussed in relation to the association between wealth and attractiveness, as well as Galton’s original beauty-map.

Link

Y chromosomes of Sudanese

American Journal of Physical Anthropology

Y-chromosome variation among Sudanese: Restricted gene flow, concordance with language, geography, and history

Hisham Y. Hassan et al.

Abstract

We study the major levels of Y-chromosome haplogroup variation in 15 Sudanese populations by typing major Y-haplogroups in 445 unrelated males representing the three linguistic families in Sudan. Our analysis shows Sudanese populations fall into haplogroups A, B, E, F, I, J, K, and R in frequencies of 16.9, 7.9, 34.4, 3.1, 1.3, 22.5, 0.9, and 13% respectively. Haplogroups A, B, and E occur mainly in Nilo-Saharan speaking groups including Nilotics, Fur, Borgu, and Masalit; whereas haplogroups F, I, J, K, and R are more frequent among Afro-Asiatic speaking groups including Arabs, Beja, Copts, and Hausa, and Niger-Congo speakers from the Fulani ethnic group. Mantel tests reveal a strong correlation between genetic and linguistic structures (r = 0.31, P = 0.007), and a similar correlation between genetic and geographic distances (r = 0.29, P = 0.025) that appears after removing nomadic pastoralists of no known geographic locality from the analysis. The bulk of genetic diversity appears to be a consequence of recent migrations and demographic events mainly from Asia and Europe, evident in a higher migration rate for speakers of Afro-Asiatic as compared with the Nilo-Saharan family of languages, and a generally higher effective population size for the former. The data provide insights not only into the history of the Nile Valley, but also in part to the history of Africa and the area of the Sahel.
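
For readers unfamiliar with Mantel tests, the following is a minimal sketch of the general idea: correlate the off-diagonal entries of two distance matrices, then permute the population labels of one matrix to see how often chance alone gives a correlation at least as large. The matrices are toy values and the function names are hypothetical; this is not the Sudanese data, and the paper's exact procedure may differ (this version is one-sided).

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def upper(matrix, order):
    """Upper-triangle entries of a distance matrix after relabeling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(d1, d2, permutations=999, seed=0):
    """One-sided permutation Mantel test; returns (r, p)."""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper(d1, identity)
    r_obs = pearson(x, upper(d2, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the populations of the second matrix
        if pearson(x, upper(d2, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Toy symmetric distance matrices for 5 hypothetical populations.
genetic = [[0, .1, .4, .5, .6], [.1, 0, .3, .5, .6], [.4, .3, 0, .2, .3],
           [.5, .5, .2, 0, .1], [.6, .6, .3, .1, 0]]
linguistic = [[0, 1, 2, 3, 3], [1, 0, 2, 3, 3], [2, 2, 0, 1, 2],
              [3, 3, 1, 0, 1], [3, 3, 2, 1, 0]]
r, p = mantel(genetic, linguistic)
```

Permuting whole rows and columns together, rather than individual cells, is what keeps the test valid: the entries of a distance matrix are not independent of each other.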

Link

July 16, 2008

Y chromosomes and Athapaskans

From the EurekAlert release:
The new findings reinforce the hypothesis that the Athapaskan migration involved a relatively small group that nonetheless was very successful at assimilating and intermixing with native groups already living in the southwest. The newcomers were so influential that the Athapaskan language family now dominates many parts of the Southwest. Now called Apacheans, the Navajo and Apache descendants of the early migrants are dispersed throughout the central Southwest and speak languages closely related to the Chipewyan, an Athapaskan language found in the subarctic.

...

Other patterns emerged from the Y chromosome analysis. One genetic signature associated with European males was detected in native males throughout North America, but was found at the highest frequency in groups living nearest to Hudson Bay, where trade between Europeans and the region's indigenous peoples was established in the early 17th century.

From the paper:
Gene map interpolations (Fig. 2A–C) indicate that the frequency of haplogroup Q is highest in Southwestern North America/Mesoamerica. The frequency of haplogroup C is highest in Northwestern North America and the frequency of haplogroup R, the presence of which is attributed to European admixture, reaches its maximum in Northeastern North America. In total, 73% of the populations analyzed exhibited haplogroup R, which ranges in frequency from 4 to 88% (Table 1).

...

Y chromosome haplogroup C is observed at a moderate frequency in the Subarctic Athapaskan groups and at a low frequency in the Navajo and Apache, but is otherwise absent from the Southwest. Nearly all Navajo and Apache Y chromosomes within haplogroup C belong to a specific, well-defined subclade (Zegura et al., 2004). Hence, it is likely that ancestral Subarctic Athapaskan speakers provided the source for Y chromosome haplogroup C as well as the mtDNA A2a subclade in Apachean groups.
However, Apachean groups cluster with other Southwest and Mesoamerican groups in the principal coordinates analysis, rather than with Athapaskans from the Subarctic. This suggests that the majority of non-C Y chromosomes in the Navajo and Apache were contributed by non-Athapaskan populations in the Southwest, which mirrors the presence in the Apachean of mtDNA lineages belonging to haplogroups B and C.

Wikipedia on Athapaskan languages.

American Journal of Physical Anthropology

Distribution of Y chromosomes among native North Americans: A study of Athapaskan population history

Ripan Singh Malhi et al.

Abstract

In this study, 231 Y chromosomes from 12 populations were typed for four diagnostic single nucleotide polymorphisms (SNPs) to determine haplogroup membership and 43 Y chromosomes from three of these populations were typed for eight short tandem repeats (STRs) to determine haplotypes. These data were combined with previously published data, amounting to 724 Y chromosomes from 26 populations in North America, and analyzed to investigate the geographic distribution of Y chromosomes among native North Americans and to test the Southern Athapaskan migration hypothesis. The results suggest that European admixture has significantly altered the distribution of Y chromosomes in North America and because of this caution should be taken when inferring prehistoric population events in North America using Y chromosome data alone. However, consistent with studies of other genetic systems, we are still able to identify close relationships among Y chromosomes in Athapaskans from the Subarctic and the Southwest, suggesting that a small number of proto-Apachean migrants from the Subarctic founded the Southwest Athapaskan populations.

Link

Individualists and egalitarians are more optimistic

Personality and Individual Differences doi:10.1016/j.paid.2008.05.008

Is optimism universal? A meta-analytical investigation of optimism levels across 22 nations

Ronald Fischer et al.

Abstract

A meta-analysis of dispositional optimism levels as measured by the life orientation test (LOT, Scheier & Carver, 1985) across 22 countries (k = 213; N = 89,138) is reported. Using mixed effect modeling, overall culture differences were small. Greater individualism (Hofstede, 1980) was associated with greater optimism. Greater egalitarianism (versus hierarchy, Schwartz, 1994) was consistently associated with higher optimism. Claims of fundamental cultural differences were not supported. Implications for cross-cultural research and applications are discussed.

Link

28,000 year-old Cro-Magnon mtDNA from Italy

The researchers could verify that the sequence of the Cro-Magnon (which was the Cambridge Reference Sequence, common among modern Caucasoids) was genuine, since it was different from that of all possible contaminating individuals who handled the find since 2003.

This raises an interesting methodological problem. Should researchers with common mtDNA sequences be handling ancient remains? It seems like blind good luck that none of the individuals who handled the find had the quite common CRS. In the recent mtDNA paper on the Mycenaeans, for example, the authors were able to identify contaminant sequences by the fact that the author and experimenter had a particular mutation; if she had been plain CRS, they wouldn't have been able to disprove possible contamination for the CRS individual from Mycenae.

Ascertaining authenticity is challenging. In an ideal situation the sample is handled by only a single individual, and one who is unlikely to possess the same mtDNA type as the sample. An obvious solution to this would be to recruit a person of remote geographic origin to do the lab work, e.g. a Japanese person to work on European samples and vice versa.
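
The logic of the authenticity check can be sketched in a few lines. This is only an illustration under stated assumptions: each HVR1 sequence is reduced to its set of differences from the CRS (so the CRS itself is the empty set), and the handler names, their motifs, and the function `matching_handlers` are all hypothetical, not data from the paper.

```python
def matching_handlers(sample_motif, handler_motifs):
    """Return the handlers whose HVR1 motif is identical to the sample's.
    Motifs are sets of differences from the CRS; the CRS is the empty set."""
    sample = frozenset(sample_motif)
    return sorted(name for name, motif in handler_motifs.items()
                  if frozenset(motif) == sample)

CRS = frozenset()

# Hypothetical handlers and motifs, for illustration only.
handlers = {
    "excavator": {"16126C", "16294T"},
    "curator": {"16311C"},
    "lab_technician": {"16189C", "16270T"},
}

# No handler carries the CRS, so a CRS result cannot be attributed to them.
clean = matching_handlers(CRS, handlers)

# Add one handler who happens to be plain CRS: the result becomes ambiguous.
handlers_with_crs = dict(handlers, geneticist=set())
ambiguous = matching_handlers(CRS, handlers_with_crs)
```

An empty result excludes the known handlers as a source; a non-empty one means contamination can no longer be ruled out on sequence grounds alone.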

PLoS ONE 3(7): e2700. doi:10.1371/journal.pone.0002700

A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

David Caramelli et al.

Abstract

Background

DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans.

Methodology/Principal Findings

We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences.

Conclusions/Significance:

The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

Link

July 15, 2008

Narcissistic people aren't really more beautiful

Personality and Individual Differences doi:10.1016/j.paid.2008.05.018

Narcissistic men and women think they are so hot – But they are not

April Bleske-Rechek et al.

Abstract

Narcissists think they are more knowledgeable, better leaders, and more attractive than others are. Higher narcissism scores in celebrities than in non-celebrities (Young & Pinsky, 2006) raise the question of whether narcissistic individuals actually are, to some degree, more knowledgeable or attractive than other individuals are. Because little research has investigated the degree to which narcissists’ ratings of their attractiveness are inflated relative to others’ ratings of their attractiveness, we asked men and women to evaluate their own attractiveness, and then we asked two separate panels of judges to view and rate facial shots of these men and women. More narcissistic men and women rated themselves as more attractive than less narcissistic individuals did, but outside judges did not rate more and less narcissistic individuals as any different in attractiveness.

Link

Craniometry of the Ainu

American Journal of Physical Anthropology

Craniometric variation of the Ainu: An assessment of differential gene flow from Northeast Asia into Northern Japan, Hokkaido

Tsunehiko Hanihara et al.

Abstract

In and after the latest Neolithic period in Japan (B.P. 2,300 years), there were two distinct waves of migration from eastern Asia. One is well known as successive episodes in which indigenous inhabitants of main-island Japan were intruded on by new arrivals with advanced technology, and of a different genetic stock. Another migration of people and culture, identified as the Okhotsk culture, reached the northeastern part of Hokkaido. As opposed to main-island Japan, the morphological continuity from the Neolithic to recent inhabitants in Hokkaido (Ainu) is notable, so that the evidence of admixture easily could have escaped notice. In this study, the effects of gene flow from an outside source on the pattern of among-group variation of Hokkaido Ainu are examined by means of two models. One is the R-matrix model comparing observed and expected craniometric variation for estimating differential external gene flow into a region. The other is a simple simulation model that estimates admixture in a population with two parental populations. The two approaches give similar results. The results suggest the possibility of admixture between the migrants from Northeast Asia, the Okhotsk culture people, and the indigenous inhabitants in Hokkaido during the 5th to 12th centuries A.D., at least in northeastern Hokkaido. Such gene flow may have a certain degree of effect on the genetic structure of recent Ainu. The findings further suggest morphological heterogeneity in Northeast Asia during the Holocene that has relevance for understanding the morphological heterogeneity seen through time in the New World.

Link

July 14, 2008

Ancient mtDNA from Southeast Asia

American Journal of Physical Anthropology

Genetic history of Southeast Asian populations as revealed by ancient and modern human mitochondrial DNA analysis

Patcharee Lertrit et al.

Abstract

The 360 base-pair fragment in HVS-1 of the mitochondrial genome were determined from ancient human remains excavated at Noen U-loke and Ban Lum-Khao, two Bronze and Iron Age archaeological sites in Northeastern Thailand, radio-carbon dated to circa 3,500-1,500 years BP and 3,200-2,400 years BP, respectively. These two neighboring populations were parts of early agricultural communities prevailing in northeastern Thailand from the fourth millennium BP onwards. The nucleotide sequences of these ancient samples were compared with the sequences of modern samples from various ethnic populations of East and Southeast Asia, encompassing four major linguistic affiliations (Altaic, Sino-Tibetan, Tai-Kadai, and Austroasiatic), to investigate the genetic relationships and history among them. The two ancient samples were most closely related to each other, and next most closely related to the Chao-Bon, an Austroasiatic-speaking group living near the archaeological sites, suggesting that the genetic continuum may have persisted since prehistoric times in situ among the native, perhaps Austroasiatic-speaking population. Tai-Kadai groups formed close affinities among themselves, with a tendency to be more closely related to other Southeast Asian populations than to populations from further north. The Tai-Kadai groups were relatively distant from all groups that have presumably been in Southeast Asia for longer-that is, the two ancient groups and the Austroasiatic-speaking groups, with the exception of the Khmer group. This finding is compatible with the known history of the Thais: their late arrival in Southeast Asia from southern China after the 10th-11th century AD, followed by a period of subjugation under the Khmers.

Link

July 11, 2008

mtDNA haplogroup H1 and ischemic stroke protection

BMC Med Genet. 2008 Jul 1;9(1):57. [Epub ahead of print]

Mitochondrial haplogroup H1 is protective for ischemic stroke in Portuguese patients.

Rosa A, Fonseca BV, Krug T, Manso H, Gouveia L, Albergaria I, Gaspar G, Correia M, Viana-Baptista M, Moiron Simoes R, Nogueira Pinto A, Taipa R, Ferreira C, Ramalho Fontes J, Rui Silva M, Gabriel JP, Matos I, Lopes G, Ferro JM, Vicente AM, Oliveira SA.

ABSTRACT: BACKGROUND: The genetic contribution to stroke is well established but it has proven difficult to identify the genes and the disease-associated alleles mediating this effect, possibly because only nuclear genes have been intensely investigated so far. Mitochondrial DNA (mtDNA) has been implicated in several disorders having stroke as one of its clinical manifestations. The aim of this case-control study was to assess the contribution of mtDNA polymorphisms and haplogroups to ischemic stroke risk. METHODS: We genotyped 19 mtDNA single nucleotide polymorphisms (SNPs) defining the major European haplogroups in 534 ischemic stroke patients and 499 controls collected in Portugal, and tested their allelic and haplogroup association with ischemic stroke risk. RESULTS: Haplogroup H1 was found to be significantly less frequent in stroke patients than in controls (OR=0.61, 95% CI=0.45-0.83, p=0.001), when comparing each clade against all other haplogroups pooled together. Conversely, the pre-HV/HV and U mtDNA lineages emerge as potential genetic factors conferring risk for stroke (OR=3.14, 95% CI=1.41-7.01, p=0.003, and OR=2.87, 95% CI=1.13-7.28, p=0.021, respectively). SNPs m.3010G>A, m.7028C>T and m.11719G>A strongly influence ischemic stroke risk, their allelic state in haplogroup H1 corroborating its protective effect. CONCLUSION: Our data suggests that mitochondrial haplogroup H1 has an impact on ischemic stroke risk in a Portuguese sample.
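
As a reminder of how figures such as "OR=0.61, 95% CI=0.45-0.83" are typically obtained, here is a minimal sketch of an odds ratio with a Wald (Woolf) confidence interval from a 2x2 table. The counts are invented, since the abstract does not give the underlying table, and it is an assumption that this interval method resembles what the authors used.

```python
import math

def odds_ratio_ci(case_with, case_without, ctrl_with, ctrl_without, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.
    z=1.96 corresponds to a 95% interval."""
    odds_ratio = (case_with * ctrl_without) / (case_without * ctrl_with)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(1 / case_with + 1 / case_without
                       + 1 / ctrl_with + 1 / ctrl_without)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Hypothetical counts, NOT the paper's data:
# 30 of 100 patients and 50 of 100 controls carry the haplogroup.
or_h1, low, high = odds_ratio_ci(30, 70, 50, 50)
```

An interval that lies entirely below 1 is what justifies calling a haplogroup "protective"; one that straddles 1 would not.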

Link

July 10, 2008

Campbell & Tishkoff review paper on African genetic diversity

From the paper:
Several studies of nucleotide and haplotype variation have indicated that ancestral African populations were geographically structured prior to the migration of modern humans out of Africa (70, 71, 79, 157, 197, 237). Additionally, a recent study of 800 short tandem repeat polymorphisms (STRPs) and 400 INDELs genotyped in more than 3000 geographically and ethnically diverse Africans indicates the presence of at least 13 genetically distinct ancestral populations in Africa and high levels of population admixture in many regions (F.A. Reed & S.A. Tishkoff, unpublished data). Population clusters are correlated with self-described ethnicity and shared cultural and/or linguistic properties (e.g., Pygmies, Khoisan-speaking hunter-gatherers, Bantu speakers, Cushitic speakers). This study reveals extensive admixture between inferred ancestral populations in most African populations. One exception is among West African Niger-Kordofanian (i.e., Bantu) speakers who are more genetically homogeneous compared with other African populations, likely reflecting the recent and rapid spread of Bantu speakers from a common origin in Cameroon/Nigeria (although fine-scale genetic structure can be detected amongst these populations). Thus, the pattern of genetic diversity in Africa indicates that African populations have maintained a large and subdivided population structure throughout much of their evolutionary history (Figure 2).
As I have argued before, the great genetic diversity of Sub-Saharan Africans is due to the fact that they are composed of several long-differentiated populations admixed with each other. As Figure 2, mentioned above, indicates, NE Africans are related to Eurasians more closely than other Africans, although there has been subsequent gene flow into NE Africans from other Sub-Saharan Africans.

Annual Review of Genomics and Human Genetics
Vol. 9 (Volume publication date September 2008)
(doi:10.1146/annurev.genom.9.081307.164258)

African Genetic Diversity: Implications for Human Demographic History, Modern Human Origins, and Complex Disease Mapping

Michael C. Campbell, Sarah A. Tishkoff

Comparative studies of ethnically diverse human populations, particularly in Africa, are important for reconstructing human evolutionary history and for understanding the genetic basis of phenotypic adaptation and complex disease. African populations are characterized by greater levels of genetic diversity, extensive population substructure, and less linkage disequilibrium (LD) among loci compared to non-African populations. Africans also possess a number of genetic adaptations that have evolved in response to diverse climates and diets, as well as exposure to infectious disease. This review summarizes patterns and the evolutionary origins of genetic diversity present in African populations, as well as their implications for the mapping of complex traits, including disease susceptibility.

Link

July 09, 2008

mtDNA macro-haplogroup R0

R0 is ancestral to the very widespread HV, V, and H, which are frequent in Europe, as well as to R0a, which is frequent in the Middle East.

BMC Evol Biol. 2008 Jul 4;8(1):191. [Epub ahead of print]

Timing and deciphering mitochondrial DNA macro-haplogroup R0 variability in Central Europe and Middle East.

Brandstaetter A, Zimmermann B, Wagner J, Goebel T, Roeck AW, Salas A, Carracedo A, Parson W.

ABSTRACT: BACKGROUND: Nearly half of the West Eurasian assemblage of human mitochondrial DNA (mtDNA) is fractioned into numerous sub-lineages of the predominant haplogroup (hg) R0. Several hypotheses have been proposed on the origin and the expansion times of some R0 sub-lineages, which were partially inconsistent with each other. Here we describe the phylogenetic structure and genetic variety of hg R0 in five European populations and one population from the Middle East. RESULTS: Our analysis of 1,350 mtDNA haplotypes belonging to R0, including entire control region sequences and 45 single nucleotide polymorphisms from the coding region, revealed significant differences in the distribution of different sub-hgs even between geographically closely located regions. Estimates of coalescence times that were derived using diverse algorithmic approaches consistently affirmed that the major expansions of the different R0 hgs occurred in the terminal Pleistocene and early Holocene. CONCLUSIONS: Given an estimated coalescence time of the distinct lineages of 10 - 18 kya, the differences in the distributions could hint to either limited maternal gene flow after the last glacial maximum due to the alpine nature of the regions involved or to a stochastic loss of diversity due to environmental events and/or disease episodes occurred at different times and in distinctive regions. Our comparison of two different ways of obtaining the timing of the most recent common ancestor confirms that the time of a sudden expansion can be adequately recovered from control region data with valid confidence intervals. For reliable estimates, both procedures should be applied in order to cross-check the results for validity and soundness.

Link

July 08, 2008

PCA-informative markers for European American substructure

The importance of this work is that while it takes many thousands of markers to identify population structure in closely related groups, a much smaller subset of these markers captures almost all the information in the larger marker set.

Thus, from an economic standpoint, discovery of substructure in an "unexamined" group requires a considerable initial investment of genotyping a large representative sample for a large number of markers. But, subsequent ancestry analysis can profit from the identified smaller subset to economically test new individuals.

Once I look at the details of this paper, I will try to update EURO-DNA-CALC to use this new marker panel.

See the earlier paper by this group on PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations.

PLoS Genet 4(7): e1000114. doi:10.1371/journal.pgen.1000114

Tracing Sub-Structure in the European American Population with PCA-Informative Markers

Peristera Paschou et al.

Abstract

Genetic structure in the European American population reflects waves of migration and recent gene flow among different populations. This complex structure can introduce bias in genetic association studies. Using Principal Components Analysis (PCA), we analyze the structure of two independent European American datasets (1,521 individuals–307,315 autosomal SNPs). Individual variation lies across a continuum with some individuals showing high degrees of admixture with non-European populations, as demonstrated through joint analysis with HapMap data. The CEPH Europeans only represent a small fraction of the variation encountered in the larger European American datasets we studied. We interpret the first eigenvector of this data as correlated with ancestry, and we apply an algorithm that we have previously described to select PCA-informative markers (PCAIMs) that can reproduce this structure. Importantly, we develop a novel method that can remove redundancy from the selected SNP panels and show that we can effectively remove correlated markers, thus increasing genotyping savings. Only 150–200 PCAIMs suffice to accurately predict fine structure in European American datasets, as identified by PCA. Simulating association studies, we couple our method with a PCA-based stratification correction tool and demonstrate that a small number of PCAIMs can efficiently remove false correlations with almost no loss in power. The structure informative SNPs that we propose are an important resource for genetic association studies of European Americans. Furthermore, our redundancy removal algorithm can be applied on sets of ancestry informative markers selected with any method in order to select the most uncorrelated SNPs, and significantly decreases genotyping costs.
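
To illustrate why removing redundant markers saves genotyping effort, here is a minimal greedy sketch: walk down a list of markers ranked by informativeness and keep a marker only if it is not strongly correlated with any marker already kept. This is a swapped-in, simplified technique and not the redundancy-removal algorithm of the paper; the marker names, genotypes, ranking, and the `max_r2` threshold are all hypothetical.

```python
import math

def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding).
    Returns 0.0 if either marker shows no variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy / math.sqrt(sxx * syy)) ** 2

def greedy_prune(genotypes, ranked, max_r2=0.8):
    """Keep markers in rank order, skipping any marker whose r^2 with an
    already-kept marker reaches max_r2."""
    kept = []
    for marker in ranked:
        if all(r_squared(genotypes[marker], genotypes[k]) < max_r2 for k in kept):
            kept.append(marker)
    return kept

# Toy genotypes for 6 individuals; snpB duplicates snpA exactly.
genotypes = {
    "snpA": [0, 1, 2, 0, 1, 2],
    "snpB": [0, 1, 2, 0, 1, 2],
    "snpC": [2, 0, 1, 1, 2, 0],
}
panel = greedy_prune(genotypes, ranked=["snpA", "snpB", "snpC"])
```

Here the duplicate marker is dropped because it adds no information beyond the first, which is the kind of saving that lets a small panel stand in for a large one.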

Link

July 07, 2008

Diet in Bronze Age Lerna

Journal of Archaeological Science
Article in Press doi:10.1016/j.jas.2008.06.018

Isotopic dietary reconstruction of humans from Middle Bronze Age Lerna, Argolid, Greece

S. Triantaphyllou, M.P. Richards, C. Zerner and S. Voutsaki

Abstract

This study presents the results of a carbon and nitrogen stable isotope analysis of thirty-nine human bone and eight animal samples from Middle Bronze Age (or Middle Helladic, MH, ca. 2100-1700 BC) Lerna, Greece. The isotopic data indicate that the humans had a C3 terrestrial diet while certain individuals appear to have significant amounts of animal protein in their diet. With regard to weaning age, the isotopic values and the estimated age of early enamel disruptions suggest that solid foods were starting to be used as a substitute for breast milk at or before the ages of 2.5 and 3 years old.

Link

July 04, 2008

Clicks not a feature of early human language

Annual Review of Anthropology
Vol. 37 (Volume publication date October 2008)
(doi:10.1146/annurev.anthro.37.081407.085109)

A Historical Appraisal of Clicks: A Linguistic and Genetic Population Perspective

Tom Güldemann, Mark Stoneking

Clicks are often considered an exotic feature of languages, and the fact that certain African "Khoisan" groups share the use of clicks as consonants and exhibit deep genetic divergences has been argued to indicate that clicks trace back to an early common ancestral language (Knight et al. 2003). Here, we review the linguistic evidence concerning the use of click sounds in languages and the genetic evidence concerning the relationships of African click-speaking groups. The linguistic evidence suggests that genealogical inheritance and contact-induced transmission are equally relevant for the distribution of clicks in African languages. The genetic evidence indicates that there has been substantial genetic drift in some groups, obscuring their genetic relationships. Overall, the presence of clicks in human languages may in fact not trace back to the dawn of human language, but instead reflect a much later episode in the diversification of human speech.

Link

July 03, 2008

Linguistic diversity in the Caucasus

Annual Review of Anthropology
Vol. 37 (Volume publication date October 2008)
(doi:10.1146/annurev.anthro.35.081705.123248)

Linguistic Diversity in the Caucasus

Bernard Comrie

The Caucasus is characterized by a relatively high level of linguistic diversity, whether measured in terms of number of languages, number of language families, or structural properties. This is in stark contrast to low levels of linguistic diversity in neighboring areas (Europe, the Middle East), although the Caucasus does not reach such high levels of linguistic diversity as are found in New Guinea. There is even a variation between greater diversity in the North Caucasus and less diversity in the South Caucasus. Illustrative structural properties show not only idiosyncratic properties of individual languages and families but also features that have spread across the boundaries separating languages and families, sometimes with variation across languages with regard to finer points of detail, although few features characterize the Caucasus as a single linguistic area. Social factors have probably played at least as important a role as has geography in the development of linguistic diversity in the Caucasus.

Link

July 02, 2008

Waist-to-hip ratio of Miss Koreas

Aesthetic Plast Surg. 2008 Jun 28. [Epub ahead of print]

Anthropometric Analysis of Waist-to-Hip Ratio in Asian Women.

Hong YJ, Park HS, Lee ES, Suh YJ.

BACKGROUND: The universally accepted attractive female figure has a waist-to-hip ratio (WHR) of 0.7 or 0.68 (WHR of the Venus de Milo). Using WHR and other parameters, the authors attempted to investigate chronologic changes in perceptions of the attractive female figure in Korean society, differences between Asian and Western societies in this respect, and changes in attractiveness with respect to body mass index (BMI) and age in the general female Korean population. METHODS: The authors analyzed the anthropometric measurements of 227 Miss Korea winners between 1971 and 2007, 60 candidates of the 2007 Miss Korea contest, 36 candidates of the 2007 Miss France contest, and 1785 normal women in the general population. RESULTS: In the Miss Korea winners' group, the WHR tended toward 0.7. The WHR of the 2007 Miss Korea candidates was statistically smaller than the WHR of the 2007 Miss France candidates. The WHR of normal women was statistically larger than WHR of the 2000s Miss Korea winners. In all age groups of normal women, subjects with a low BMI were not significantly different from the 2000s Miss Koreas in terms of waist circumference, but they had a relatively larger hip circumference. Moreover, subjects with a normal BMI had waist circumferences that were similar to those of the 2000s Miss Koreas but relatively larger hip circumferences, and subjects with high BMI had larger waist and hip circumferences than the 2000s Miss Koreas. CONCLUSION: The perceived attractive female figure in Asia has moved toward the universally accepted ideal WHR. However, there were still some differences between Asian and Western societies in the concept of ideal body figure. Also, a significant difference in body contour was observed between normal women and the ideal figure. This is because hip volume decreases and waist volume increases with age, although waist and hip volumes increase with BMI.

Link
Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history. Feel free to send e-mail to Dienekes Pontikos, or to visit my other three sites: Anthropological Research Page, Γενετική των Ελλήνων, and d-politiki. You can also follow dienekesp on Twitter.

Creative Commons License This work is licensed under a Creative Commons License. You may cite, quote, or reproduce articles on this site for non-commercial purposes, provided that you attribute them to Dienekes Pontikos and provide a link either to the main page of this blog or to the individual blog entry you are referring to.