European Journal of Human Genetics advance online publication 1 July 2015; doi: 10.1038/ejhg.2015.138
Shared language, diverging genetic histories: high-resolution analysis of Y-chromosome variability in Calabrian and Sicilian Arbereshe
Stefania Sarno et al.
The relationship between genetic and linguistic diversification in human populations has been often explored to interpret some specific issues in human history. The Albanian-speaking minorities of Sicily and Southern Italy (Arbereshe) constitute an important portion of the ethnolinguistic variability of Italy. Their linguistic isolation from neighboring Italian populations and their documented migration history, make such minorities particularly effective for investigating the interplay between cultural, geographic and historical factors. Nevertheless, the extent of Arbereshe genetic relationships with the Balkan homeland and the Italian recipient populations has been only partially investigated. In the present study we address the genetic history of Arbereshe people by combining highly resolved analyses of Y-chromosome lineages and extensive computer simulations. A large set of slow- and fast-evolving molecular markers was typed in different Arbereshe communities from Sicily and Southern Italy (Calabria), as well as in both the putative Balkan source and Italian sink populations. Our results revealed that the considered Arbereshe groups, despite speaking closely related languages and sharing common cultural features, actually experienced diverging genetic histories. The estimated proportions of genetic admixture confirm the tight relationship of Calabrian Arbereshe with modern Albanian populations, in accordance with linguistic hypotheses. On the other hand, population stratification and/or an increased permeability of linguistic and geographic barriers may be hypothesized for Sicilian groups, to account for their partial similarity with Greek populations and their higher levels of local admixture. These processes ultimately resulted in the differential acquisition or preservation of specific paternal lineages by the present-day Arbereshe communities.
Link
July 12, 2015
July 03, 2010
Y chromosomes of Arbereshe from Calabria
From the paper:

Ann Hum Biol. 2010 Jun 22. [Epub ahead of print]
Background: The Arbereshe are an Albanian-speaking ethno-linguistic minority who settled in Calabria (southern Italy) about five centuries ago. Aim: This study aims to clarify the genetic relationships between Italy and the Balkans through analysis of Y-chromosome variability in a peculiar case study, the Arbereshe. Subject and methods: Founder surnames were used as a means to identify a sample of individuals that might trace back to the Albanians at the time of their establishment in Italy. These results were compared with data of more than 1000 individuals from Italy and the Balkans. Results: The distributions of haplogroups (defined using 31 UEPs) and haplotypes (12 STRs) show that the Italian and Balkan populations are clearly divergent from each other. Within this genetic landscape, the Arbereshe are characterized by two peculiarities: (a) they are a clear outlier in the Italian genetic background, showing a strong genetic affinity with southern Balkans populations; and (b) they retain a high degree of genetic diversity. Conclusion: These results support the hypothesis that the surname-chosen Arbereshe are representative of the Y-chromosome genetic variability of the Albanian founder population. Accordingly, the Arbereshe genetic structure can contribute to the interpretation of the recent biological history of the southern Balkans. Intra-haplogroup analyses suggest that this area may have experienced important changes in the last five centuries, resulting in a marked increase in the frequency of haplogroups I2a and J2.
Link
The Arbereshe are one of the largest linguistic minorities in Italy. They are the result of complicated movements of Albanians around the end of the 15th and beginning of the 16th century, often linked to the invasion of the Balkans by the Ottoman Empire. Despite that, it is generally agreed that most of the immigrants started moving from the south of Albania (Toskeria), with, very often, intermediate steps in Greece, particularly in the Peloponnese (Zangari 1941). Further evidence is provided by linguistic research, according to which Arberisht, the language spoken by Arbereshe, is part of the Tosk dialect group of Albanian, a language originally spoken in Toskeria (Babiniotis 1998).
On the sample:
The Arbereshe Y-chromosome variation was investigated by sampling individuals from different villages of the Pollino area (Calabria) who bear one of the founding surnames of the population. The genotyping was performed using 12 microsatellites (STRs) and 31 unique event polymorphisms (UEPs), defining, respectively, haplotypes and haplogroups. The Italian and Balkan genetic backgrounds were explored using the large amount of data provided by recent Y-chromosome studies in the two peninsulas and by literature data on STRs from forensic research.
Comparison of Y-haplogroup frequency and diversity between Albanians from Tirana and Arbereshe from Calabria (from Table III):

The presence of F*(xG,I,J,K) in Albanians is interesting, as this occurs in Romania and Bosnia Herzegovina (all groups) and in South Apulia. It could potentially be haplogroup H and may reflect a Gypsy element that was not present when the Arbereshe moved to Italy from the Balkans.
Haplogroup I shows similar frequencies, but:
I-M170 is the most common Balkan haplogroup (Pericic et al. 2005a,b) and the second most frequent Arbereshe clade. Nevertheless, analysis of its network reveals unexpected results: most of the Arbereshe I-M170 haplotypes are not included in the Balkan cluster (Figure 3), but are located in the long branches containing mainly Italian chromosomes. Comparisons with literature data (Semino et al. 2000; Barac et al. 2003, Rootsi et al. 2004) show that the core haplotype of the Balkan cluster (16-14-15-13-31-24-11-11-13; locus order as above) is consistent with the almost Balkan exclusive I2a (formerly I1b) clade. The proposed interpretation of the Arbereshe as a proxy of the founder Albanian population leads us to hypothesize that the I2a clade was less common in the southern Balkans 500 years ago than nowadays. The very tight shape of the I2a cluster in the network suggests a very recent expansion of this haplogroup in the southern Balkans. Furthermore, I2a is still rare in mountain populations such as the Albanians of Kosovo (Pericic et al. 2005a,b) and in a randomly selected Arbereshe sample from Rootsi et al. (2004).
This is an interesting finding in the light of recent evidence for selection in Y-haplogroup I.
The situation with J2 is also quite interesting as this is rarer in Arbereshe (3%) than Albanians (17%):
The scarcity of J2 chromosomes in the Arbereshe sample (1/40) is very difficult to explain, given that they are very common in both the Italian peninsula and the southern Balkans. Literature data on J2 indicate that most of the haplotypes included in the Balkan (B) cluster of the network (Figure 3) have an STR configuration consistent with the J2-M12 sub-clade (Di Giacomo et al. 2004; Semino et al. 2004; Cruciani et al. 2007). In contrast, most of the haplotypes in the other clusters agree with the STR configuration given for the J2-M67 clade, with its sub-clade J2-M92 (Di Giacomo et al. 2004). It is unconvincing to attribute the rarity of J2 in the Arbereshe to random sampling or to the effect of genetic drift. Furthermore, the Arbereshe sample analysed by Semino et al. (2004) also completely lacks the typically Balkan J2-M12 chromosomes. If we interpret our Arbereshe sample as representative of the founding Albanian population, we may hypothesize that the J2 haplogroup was considerably less diffuse in the southern Balkans five centuries ago than today.
What we can conclude from this study is that the founding Albanian population was J2- and I2a-lite compared to modern Albanians. The source of the I2a seems to be the Albanization of people from the West Balkans and/or selection, although it would be difficult to see a massive increase in frequency in only five centuries. The I2a-deficiency of the Arbereshe also gives support to the theory that the Albanians are relatively recent arrivals from the northeast; this theory has been upheld in the past on the basis of (i) their historical obscurity until the last millennium, and (ii) the paucity of native sea terms and Greek loanwords in Albanian, which is difficult to explain if Albanians always occupied their current location on the Adriatic.
The source of J2 is less clear, and could be either the Albanization of Greeks (the only Balkan population with a sizeable J2 frequency) or remnants of Muslim Anatolians from Ottoman times. However, modern Albanians belong mainly to clade J2b, while Anatolians belong to J2a. Thus, I tend to dismiss the Anatolian connection.
The low frequency of R1*(xR1a1) in the Arbereshe, together with the high E1b1b1a frequency, is quite convincing evidence of the Balkan origins of this population.
Linking Italy and the Balkans. A Y-chromosome perspective from the Arbereshe of Calabria.
Boattini A, Luiselli D, Sazzini M, Useli A, Tagarelli G, Pettener D.
Link
January 05, 2009
Viaggio nella Calabria Greca - Ταξίδι στην Ελληνική Καλαβρία
An interesting documentary by alexspil on the Greeks of Calabria. Description:
"Viaggio nella Calabria Greca...insieme a un ministro!" ("A journey through Greek Calabria... together with a minister!")
This film documents the visit of the Greek deputy foreign minister to the Greek-speaking area of Calabria, as well as various interesting cultural aspects of this minority, such as its language, history, art, testimonies, etc. The documentary runs 70 minutes in total and is divided into 8 parts.
Part 1/8:
The complete YouTube playlist.
December 04, 2008
Age of Italian R1b Y-chromosomes (amended Jan. 5, 2009)
AMENDED Dec 5
Capelli et al. Mol Phylogenet Evol. 2007 Jul;44(1):228-39 have provided ASD values for R1*(xR1a1) in several Italian locations. In the Italian context, this most likely represents haplogroup R1b and indeed R-M269.
The authors write:
Microsatellite variation was investigated by the analysis of the following 10 microsatellites: DYS 388, 393, 392, 19, 390, 391, 389 I and II and 385, which is a double allele locus.
...
Y-STRs were used to estimate intra-haplogroup diversity. Locus DYS385 has a duplicated allele pattern that can not be resolved assigning each allele to the corresponding locus. We thus decided to exclude DYS385 from STR variance estimation. Similarly, to avoid double estimation of locus variation, repeat number at locus DYS389 II was calculated by subtracting the number of repeats at DYS389 I.
The average mutation rate over these loci is:
(0.00057+0.00075+0.00061+0.00168+0.00227+0.00351+0.00188+0.00226)/8 = 0.00169 using the Chandler (pdf) rates, or
(0.00057+0.00079+0.00045+0.00245+0.00237+0.00283+0.00237+0.00343)/8 = 0.00191 using the YHRD rates, and the Chandler rate for DYS 388.
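The two averages can be checked with a few lines of Python. This is only a sketch of the arithmetic: the per-locus values are copied from the two sums in the order given, and the names `chandler_rates`, `yhrd_rates` and `mean_rate` are mine, not from any of the cited sources.

```python
# Per-locus germline mutation rates, copied in the order they appear in the sums.
chandler_rates = [0.00057, 0.00075, 0.00061, 0.00168,
                  0.00227, 0.00351, 0.00188, 0.00226]
# YHRD rates; the first entry is the Chandler rate for DYS 388, as stated in the text.
yhrd_rates = [0.00057, 0.00079, 0.00045, 0.00245,
              0.00237, 0.00283, 0.00237, 0.00343]


def mean_rate(rates):
    """Arithmetic mean of a list of per-locus mutation rates."""
    return sum(rates) / len(rates)


print(f"{mean_rate(chandler_rates):.5f}")  # 0.00169
print(f"{mean_rate(yhrd_rates):.5f}")      # 0.00191
```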
In the following table, I list the estimated age using the above germline mutation rate. Note that both variance and ASD accumulate at near the germline mutation rate, and are associated with substantial confidence intervals.
Thus, I list age estimates both with a "standard" model (25 years/generation, germline rate), and my own preferred model (31.5 years/generation, 0.87*germline rate *). All ages are in thousands of years.
[begin amend] These authors use an average squared distance between all pairs of alleles as implemented in Microsat, rather than between alleles and a putative ancestral allele. Therefore the ages given below have been divided by 2, compared to the initial version of this post.
| Sample | ASD | Chandler rates, Standard | Chandler rates, Dienekes | YHRD rates, Standard | YHRD rates, Dienekes |
| VLB | 0.467 | 3.5 | 5 | 3.1 | 4.4 |
| CTU | 0.486 | 3.6 | 5.2 | 3.2 | 4.6 |
| CMA | 0.298 | 2.2 | 3.2 | 2 | 2.8 |
| ELB | 0.544 | 4 | 5.8 | 3.6 | 5.2 |
| AMA | 0.623 | 4.6 | 6.7 | 4.1 | 5.9 |
| TLB | 0.701 | 5.2 | 7.5 | 4.6 | 6.6 |
| NEL | 0.529 | 3.9 | 5.7 | 3.5 | 5 |
| SLA | 0.414 | 3.1 | 4.4 | 2.7 | 3.9 |
| NWA | 0.297 | 2.2 | 3.2 | 1.9 | 2.8 |
| WCP | 0.527 | 3.9 | 5.6 | 3.4 | 5 |
| SAP | 0.533 | 3.9 | 5.7 | 3.5 | 5.1 |
| WCL | 0.5 | 3.7 | 5.4 | 3.3 | 4.7 |
(Sample codes: AMA, Apennine Marche; CMA, Central Marche; CTU, Central Tuscany; ELB, Elba Island (Tuscany); NEL, North–East Latium; NWA, North–West Apulia; SAP, South Apulia; SLA, South Latium; TLB, Tuscany–Latium border; VLB, Val Badia (Alto Adige); WCL, West Calabria; WCP, West Campania.)
[end amend]
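For readers who want to reproduce the table, here is a minimal Python sketch of the calculation as I read it from the description: age in generations is ASD divided by the mutation rate, halved when ASD is taken between all pairs of alleles, then multiplied by the generation length. The function name `age_kya` and its parameters are illustrative, and the formula is an inference that happens to match the tabulated values, not code used for the original post.

```python
CHANDLER_RATE = 0.00169  # average germline rate per locus per generation (Chandler)
YHRD_RATE = 0.00191      # average germline rate per locus per generation (YHRD)


def age_kya(asd, rate, years_per_gen=25.0, rate_factor=1.0, pairwise=True):
    """Estimate an age in thousands of years from an ASD value.

    pairwise=True halves the result, for ASD computed between all pairs of
    alleles rather than from a putative ancestral allele.
    rate_factor scales the germline rate (0.87 in the "Dienekes" model).
    """
    divisor = 2.0 if pairwise else 1.0
    generations = asd / (divisor * rate_factor * rate)
    return generations * years_per_gen / 1000.0


# VLB row of the table (ASD = 0.467)
standard = age_kya(0.467, CHANDLER_RATE)
dienekes = age_kya(0.467, CHANDLER_RATE, years_per_gen=31.5, rate_factor=0.87)
print(round(standard, 1), round(dienekes, 1))  # 3.5 5.0
```

The same call with `YHRD_RATE` gives the two right-hand columns of the row.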
Furthermore, I calculated ASD from an ancestral haplotype (either modal or median: they coincide) for Anatolian data from Cinnioglu et al. Hum Genet (2004) 114 : 127–148. The ASD value over the same set of markers is 0.408 corresponding to an age estimate of 5.3-9.1kya using any set of assumptions.
Notice further that the value of 0.408 is inflated by the inclusion of R1*(xR1a1) chromosomes that may not be native to Anatolia, such as several R-M73 examples that are often found in Central Asia (Underhill et al. Nature Genetics 26 (2000)). Removal of only examples #442-443 which are conspicuous for having a rare DYS390=19 allele, 5 repeat units away from the modal, reduces the ASD even further to 0.327, and a corresponding age estimate to 4.3-7.3ky.
UPDATE (Dec 5)
[In the following, ASD is calculated from an ancestral haplotype]
The Cinnioglu et al. data for R1b1b2-M269 chromosomes give an ASD of 0.31, corresponding to an age of 4.1-6.6ky.
The data published in Bosch et al. (Annals of Human Genetics (2005) 69,1–30) for haplogroup R1b-P25 from the Balkans, give an ASD of 0.35 or an age of 4.6-7.7ky.
UPDATE II (Dec 5)
The data from Di Gaetano et al. European Journal of Human Genetics (2008) for Sicily have a variance of 0.36 using a set of markers that does not include DYS 388 but includes DYS439, which has a Chandler rate of 0.00530 and a YHRD rate of 0.00635. This leads to an average rate of 0.00228 (according to Chandler) or 0.00264 (according to YHRD), or an age range of 3.4-4.9ky.
Update (Jan 5, 2009)
[In the following, ASD is calculated from an ancestral haplotype]
Zalloua et al. have a sample of R1b Cypriot males with an ASD of 0.4 over the same markers as Capelli et al. This is virtually identical to the R1*(xR1a1) data for Anatolia. The age corresponds to 5.2-8.6ky.
Discussion
These results suggest that R-M269 diversity in Italy, the Balkans, Anatolia, and Cyprus is similar, making it difficult to trace the origin of this haplogroup on this basis; clearly more data is needed.
Tentatively, there are several reasons why a European rather than West Asian origin seems reasonable:
- R-M269 is more frequent in Europe than in Asia
- Both forms of R-M269, haplotypes Ht15 and Ht35, are present in Europe; the little Ht15 found in West Asia can be easily explained historically.
- The sister clade R1a is found at high frequency in Europe, and may have spread from here to the Eurasiatic steppes.
- Small-scale introduction of R1b in West Asia is more parsimonious than large-scale replacement of European Y-chromosomes by R1b chromosomes, unaccompanied by other typically West Asian haplogroups such as J2, and presenting a cline with its maximum in the Atlantic.
(*) As explained here
July 31, 2008
Expansion of E-V13 explained
E-V13 is the main European clade of haplogroup E. It has been variously interpreted as a signature of early Balkan Bronze Age, or Mesolithic, the Greek colonization of Southern Italy, Greek ancestry in some Pakistanis, or Roman soldiers of Balkan origin in Britain. A proper understanding of its age would help resolve the problem of its origins.
Age, of course, depends on a proper choice of mutation rate, and as I have argued (part I and part II), the proper effective mutation rate is near the germline rate and not 3.6x slower as argued by Zhivotovsky, Underhill, and Feldman (2006). This is especially true for a relatively young haplogroup (very low STR variance compared to other lineages), which is also quite frequent in its area of origin, while much reduced away from it, giving a definite impression of a sudden and relatively recent expansion.
In my previous post, I estimated a Late Bronze Age for E-V13 in Greece and areas affected by historical Greek colonization. I now used Ken Nordtvedt's Generations2 program to obtain estimates of the age of E-V13 in three different datasets: the King set, 12-marker data from the E-M35 Phylogeny Project (Haplozone), as well as E-M78 data -most of which should be E-V13- from Bosch et al. (2006). In the latter set, I used two marker sets: all 12 markers common between Generations2 and Bosch, as well as 8 markers common between them, but excluding markers after DYS392 (in the Generations2/FTDNA order).
| Dataset | N | Generations | Age (25y/gen) | Age (30y/gen) |
| Nea Nikomedeia | 8 | 149 | 1725 BC | 2470 BC |
| Sesklo/Dimini | 20 | 71 | 225 AD | 130 BC |
| Lerna Franchthi | 20 | 120 | 1000 BC | 1600 BC |
| Crete | 13 | 68 | 300 AD | 40 BC |
| Haplozone | 103 | 134 | 1350 BC | 2020 BC |
| Aromuns (12) | 32 | 71 | 225 AD | 130 BC |
| Aromuns (8) | 32 | 73 | 175 AD | 190 BC |
| Slavomacedonians (12) | 13 | 51 | 725 AD | 470 AD |
| Slavomacedonians (8) | 13 | 59 | 525 AD | 230 AD |
| Albanians (12) | 9 | 70 | 250 AD | 100 BC |
| Albanians (8) | 9 | 59 | 525 AD | 230 AD |
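The calendar dates in the table follow from the generation counts by simple subtraction. The sketch below assumes a reference year of 2000, which is my inference from the table arithmetic (149 generations at 25 years gives 3,725 years, i.e. 1725 BC), not something stated in the Generations2 output; the function name `calendar_date` is hypothetical.

```python
def calendar_date(generations, years_per_gen, reference_year=2000):
    """Convert a generation count to a rough calendar date string.

    Plain subtraction from an assumed reference year; no year-zero
    correction is applied, which matches the rounding in the table.
    """
    year = reference_year - generations * years_per_gen
    if year > 0:
        return f"{year} AD"
    return f"{-year} BC"


print(calendar_date(149, 25))  # 1725 BC (Nea Nikomedeia)
print(calendar_date(149, 30))  # 2470 BC
print(calendar_date(71, 25))   # 225 AD (Sesklo/Dimini)
print(calendar_date(71, 30))   # 130 BC
```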
Both the King et al. E-V13 data, as well as the diverse, mostly European Haplozone E-V13 agree in placing the expansion of this haplogroup squarely in the Aegean Bronze Age.
Aromuns (Vlachs) coalesce to the Roman era, consistent with the idea that they are Balkan natives who became Latinized linguistically at around that era.
Albanians also coalesce to Roman/Late Antique times, consistent with the idea that their high frequency of haplogroup E-V13 (which reaches very high numbers in e.g. Kosovars) is not associated with high diversity. Founder effects in that time frame are the reason for the high frequency of E-V13 in them.
Finally, Slavomacedonians from the former Yugoslav Republic of Macedonia coalesce well into AD times, at around the time of the first Slavic arrivals in the Balkans. This suggests that E-V13 in them is the result of local founders at around that time who adopted the Slavic language. However, Pericic et al. (2005) (see below) report high (but unspecified) diversity of E-M78α in "Macedonia", so it is possible that a larger number of earlier inhabitants were absorbed.
Pericic et al. (2005) give a 7.3kya estimate for the expansion of E-M78α (almost perfectly equivalent to E-V13) for Southeastern European populations north of Greece. Due to their use of the 3.6x slower mutation rate, this figure needs to be converted to equivalent years. The Nea Nikomedeia time depth was estimated as 9.2kya by King et al. Therefore, the equivalent age for the Pericic et al. (2005) expansion is (7.3/9.2) * 149 generations or 118 generations (1,540-950BC). They note that STR variance is higher in Greece, Macedonia, and Apulia, all areas with well-known historical Greek connections.
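The conversion in the previous paragraph is a proportional rescaling: the Pericic et al. age is expressed as a fraction of the King et al. age for Nea Nikomedeia and applied to the 149 generations obtained for that dataset. A small Python sketch of the arithmetic follows; the function name is illustrative and the reference year of 2000 is my assumption, chosen because it reproduces the quoted date range.

```python
def rescale_generations(age_kya, reference_age_kya, reference_generations):
    """Rescale an age quoted under a slow 'evolutionary' rate into generations,
    using a reference dataset dated under both approaches."""
    return age_kya / reference_age_kya * reference_generations


gens = round(rescale_generations(7.3, 9.2, 149))
print(gens)  # 118

REFERENCE_YEAR = 2000  # assumed
older_bc = gens * 30 - REFERENCE_YEAR    # 30 years/generation
younger_bc = gens * 25 - REFERENCE_YEAR  # 25 years/generation
print(older_bc, younger_bc)  # 1540 950
```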
Cruciani et al. (2007) propose that E-V13 arrived in Europe from West Asia and underwent an expansion in Europe at 4-4.7 kya. This age is calculated using effective mutation rates that are 2.4 or 2.8 times slower than the germline rate, which seems to suggest a Late Bronze Age or even later expansion with a rate closer to the germline one.
In the Balkans, it is fairly clear that E-V13 is mostly concentrated south of the Jirecek Line, which separated native Greek from Latin speakers. In Italy, the highest frequencies are found in the south, the areas of historical Greek colonization. High frequencies are also attained in Cyprus. Cyprus also has high STR diversity, consistent with an early arrival and suggestive of both early Mycenaean and later colonizations from the Aegean.
Conclusion
The age and distribution of E-V13 chromosomes suggest that expansions of the Greek world in the Bronze and later ages were the major causes of its diffusion.
Who was the E-V13 patriarch in Greece? He was perhaps one of the legendary figures of Greek mythology some of whom are said to have come from abroad. For whatever reason, his progeny grew, and were around to participate in the expansion of the Mycenaean world and the subsequent Greek colonization.
UPDATE (Aug. 1):
An additional piece of evidence is Y-chromosome distribution in Calabria, a Southern Italian region with well-known Greek connections. According to Semino et al. (2004) [Am. J. Hum. Genet. 74:1023–1034, 2004], the Calabrian sample has an E-M78 frequency of 16.3%, whereas "Calabria 2" representing the "Albanian community of the Cosenza province" has only 5.9%. This is consistent with the idea that E-V13 in modern Albanians is to a great degree due to Greek founders (Epirotes or ancient colonists).
May 30, 2007
Prehistoric European human sacrifice
A fascinating new paper from the June issue of Current Anthropology explores ancient multiple graves and raises the possibility that hunter gatherers in what is now Europe may have practiced ritual human sacrifice. This practice – well-known in large, stratified societies – supports data emerging from different lines of research that the level of social complexity reached in the distant past by groups of hunter gatherers was well beyond that of many more recent small bands of modern foragers.
Due to their number, state of preservation, richness, and variety of associated grave goods, burials from the Upper Paleolithic (26,000-8,000 BC) represent an important source of information on ideological beliefs that may have influenced funerary behavior. In an analysis of the European record, Vincenzo Formicola (University of Pisa, Italy) points to a high frequency of multiple burials, commonly attributed to simultaneous death due to natural disaster or disease.
However, a look at grave composition reveals that some of the multiple burials may have been selective. Not only do the skeletons in these graves vary by sex and age, but the most spectacular sites also include a severely deformed individual with a pathological condition that would have been apparent since birth, for example, dwarfism or congenital bowing of the bones.
These multiple graves are also richly ornamented and in choice locales. For example, the remains of an adolescent dwarf in Romito Cave (Calabria, Italy) lie next to a female skeleton under an elaborate engraving of a bull. In the Sunghir double burial (Russia), the skeletons of a pre-teen boy and girl are surrounded by ivory objects including about 5,000 beads, each of which may have taken an hour to make.
"These findings point to the possibility that human sacrifices were part of the ritual activity of these populations and provide clues on the complexity and symbolism pervading Upper Paleolithic societies as well as on the perception of "diversity" and its links to magical-religious beliefs," Formicola writes. "These individuals may have been feared, hated, or revered . . . we do not know whether this adolescent received special burial treatment in spite of being a dwarf or precisely because he was a dwarf."
Eurekalert
February 17, 2007
Sub-Saharan African mtDNA admixture in several West Eurasian (Caucasoid) populations
The recent article on Etruscan mtDNA contains a useful overview table of mtDNA haplogroups in several West Eurasian (Caucasoid) populations, collected from both this study and the literature. Extracted from it is the following table of mtDNA L (Sub-Saharan African) sequences in the listed populations.
| POPULATION | Sample Size | % L sequences |
| Palestinians | 117 | 13.68 |
| Jordan | 494 | 12.55 |
| Portugal-South | 203 | 10.84 |
| Iraq | 116 | 9.48 |
| Syria | 328 | 9.15 |
| Portugal-Center | 203 | 6.4 |
| Spain-North-West | 216 | 3.7 |
| Portugal-North | 188 | 3.19 |
| Latium | 138 | 2.9 |
| Non-Europeans | 4739 | 2.85 |
| Lebanon | 176 | 2.84 |
| Volterra | 114 | 2.63 |
| Kurds | 82 | 2.44 |
| Sicily | 105 | 1.9 |
| Turks | 340 | 1.76 |
| Andalusia | 114 | 1.75 |
| Spain-North-East | 179 | 1.68 |
| Casentino | 122 | 1.64 |
| Murlo | 86 | 1.16 |
| Crete | 202 | 0.99 |
| Marche | 813 | 0.98 |
| Tuscans | 322 | 0.93 |
| Finland | 121 | 0.83 |
| Europe (w/o Tuscans) | 10589 | 0.79 |
| Bulgaria | 141 | 0.71 |
| Bosnia | 144 | 0.69 |
| Spain-Center | 148 | 0.68 |
| Basque | 156 | 0.64 |
| England | 335 | 0.6 |
| Sardinia | 370 | 0.54 |
| Switzerland | 228 | 0.44 |
| Campania | 313 | 0.32 |
| France | 332 | 0.3 |
| Germany | 335 | 0.3 |
| Iran | 436 | 0.23 |
| Poland | 542 | 0.18 |
| Caucasus-North-West | 1179 | 0.17 |
| Apulia-Calabria | 226 | 0 |
| Armenia | 191 | 0 |
| Austria | 99 | 0 |
| Azerbaijan | 48 | 0 |
| Bavaria | 249 | 0 |
| Caucasus-North-East | 820 | 0 |
| Czech-Republic | 83 | 0 |
| Estonia | 558 | 0 |
| Georgia | 412 | 0 |
| Greece | 155 | 0 |
| Ireland | 300 | 0 |
| Latvia | 299 | 0 |
| Lemnos | 60 | 0 |
| Lombardy | 177 | 0 |
| Norway | 556 | 0 |
| Piedmont | 169 | 0 |
| Rhodes | 42 | 0 |
| Romania | 94 | 0 |
| Russia | 397 | 0 |
| Scotland | 1199 | 0 |
| Slovakia | 129 | 0 |
| Slovenia | 104 | 0 |
| Sweden-Denmark | 75 | 0 |
| Wales | 92 | 0 |
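Many of the entries above are based on modest samples, so a 0% entry does not mean that L sequences are absent from the population. As a rough illustration, the sketch below recovers the approximate number of L sequences from a row and computes a 95% Wilson score interval for the underlying frequency. The helper names are my own and the interval is a standard binomial approximation, not something taken from the source article.

```python
import math


def l_count(sample_size, percent):
    """Approximate number of L sequences implied by a table row."""
    return round(sample_size * percent / 100)


def wilson_interval(k, n, z=1.96):
    """95% Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Rows copied from the table: (population, sample size, % L sequences)
    rows = [("Palestinians", 117, 13.68), ("Greece", 155, 0.0), ("Rhodes", 42, 0.0)]
    for name, n, pct in rows:
        k = l_count(n, pct)
        lo, hi = wilson_interval(k, n)
        print(f"{name}: {k}/{n}, 95% interval {100 * lo:.2f}% to {100 * hi:.2f}%")
```

The smaller the sample, the wider the range of frequencies that is still compatible with observing no L sequences at all.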
January 03, 2006
On Genetic Palimpsests
Most of the genetic markers used in human phylogeographic studies have been dated to the prehistoric period, and the majority of them are of Upper Paleolithic origin.
Lately, subclades identified within some human lineages on the Y-chromosome have crossed the Neolithic barrier, and in even rarer cases they carry "signatures" of historical events, such as the dominance of the Mongols, the Manchu, or the Uí Néill.
As a result, most markers are suitable for examining events of human prehistory, and not of historical ethnic groups.
Of course, scientists have tried to apply genetic information to historical processes, e.g., in the case of Jewish origins, but the "Jewish gene" or Cohen Modal Haplotype turns out to be much older and not particularly Jewish after all.
Even with old markers, it is still possible to reason about historical events. For example, the theories of white nationalist Arthur Kemp about the widespread prevalence of black African slavery in the classical world have been squarely defeated by the near-complete absence of Sub-Saharan African markers in the Italian and Balkan peninsulas. Theories propagated by Gustaf Kossinna and the Aryan-Nordic camp about the Northern European origin of the Indo-Europeans of India have similarly been defeated, since Indians completely lack haplogroup I chromosomes that are frequent in European Nordic populations.
So, even though the markers in question are very old (I is of Upper Paleolithic age), we can still reason historically with them.
Often, this historical reasoning can be shaky. For example, Spencer Wells has made tall claims about the Phoenicians, the Sea Peoples, and the Carthaginians in a National Geographic article which were based on the analysis of haplogroup J and E distribution in the Levant and North Africa.
For example, he found that there was little impact of Phoenicians on Carthage, but his conclusions are based on the paucity of haplogroup J in modern North African populations, who are a much broader group than the socially and geographically constrained ancient Carthaginians. Similar claims were made regarding the non-impact of the Sea Peoples in the Levant, but again, this is based on the similarity between coastal and non-coastal populations.
But, for all we know subdivisions of haplogroup J and other Near Eastern markers may differ between coastal and non-coastal populations, or perhaps, the Sea Peoples did initially affect the coastal peoples, but later their genes diffused into non-coastal populations, removing the distinctiveness of the two.
Let us take a further example of Sicily. The island of Sicily was colonized initially by farmers, and later by Greeks and Phoenicians. All three groups are believed to have contained some "Neolithic" markers, such as haplogroups J, E3b, and G, so any inferences about the relative contributions of the three groups are on very shaky ground.
For example, Semino et al. proposed that only 7% of Calabrian Y-chromosomes are of Greek origin, on the assumption that J2a and E3b represent Anatolian and Greek lineages respectively. But the frequency of E3b in modern Peloponnesians is not necessarily representative of its frequency in the very specific ancient city states and medieval Greek populations that colonized Southern Italy, and J2a may have arrived in Calabria either from Anatolia, e.g., during the Neolithic, or from Greece, during the age of colonization.
Things become even more complex when we turn to the Balkans or to Anatolia. For example, I playfully recounted some random facts about Phrygo-Armenians, but these hardly scratch the surface of the problem. Hittites, themselves either native or intrusive, were unseated by Phrygians, who were conquered by Persians, who were conquered by Macedonian Greeks, who were conquered by Romans, who were conquered by Turks. Not to mention the Galatians of Ancyra, or the ubiquitous Armenians of the Byzantine Empire, or even the Jews of both ancient and more recent origin, and of course the Turks themselves as well as imported Muslims from former provinces or vassals of the Ottoman Empire. And, of course, we should not forget that present-day Anatolians are only a subset of very recent Anatolians, several million of whom were liquidated or deported following World War I.
These remarks underscore the near hopelessness of untangling historical patterns on the basis of phylogeography. Is there a way out?
Part of the solution will consist of performing huge studies with large sample sizes and very recently derived genetic markers, augmented by separate genome-wide autosomal clustering methods that may unmask latent genetic components that may be correlated with historical groups. Such studies will be very costly, even though the price of DNA testing is likely to go down, because ultimately the hard work of sample collection has to be done and paid for.
The ultimate solution would be some significant progress in ancient DNA extraction. At present, mtDNA is the only game in town, and inferences from mtDNA are always up for grabs, due to the potential for contamination, uncertainties about selection, and of course the simple fact that ancient civilizations were largely patriarchal.
An even more exciting development would be the discovery, in modern human populations, of the genetics underlying common human variation in metric and morphological traits. Then, by examining ancient skeletal remains, we will be able to estimate the genetic identity of populations even if DNA cannot be directly observed.
The technical challenges are enormous, but -in my opinion- are not the main challenges at all. As hinted in Genetic vs. Mythical Origins, the study of the past forces us to question our ideas of descent and ethnicity. In the end, will it lead to an erosion of ethnic identity, or to its reinforcement along genetic and hence "objective" lines?
January 02, 2006
Some aspects of J2 distribution
Haplogroup J2 consists exclusively of two separate subclades: J2a-M410 and J2b-M12.
Crete, occupying the southernmost part of the Greek world, has an M12/M172 ratio of 2.2% [1]. This ratio is 20% [1] or 42.2% [2], a weighted average of 26%. In Northern Greece (Macedonia) it is 43.2% [2].
In Albania, the same ratio is 100% in the small sample of [1] and 54.6% as reported by [2], a weighted average of 55%.
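For readers who want to reproduce such weighted averages, here is a minimal sketch. It assumes that each study is weighted by the number of J2 (M172) chromosomes it sampled, which amounts to pooling the raw counts. The counts in the example are purely illustrative and are not taken from [1] or [2], and the function name `pooled_ratio` is my own.

```python
def pooled_ratio(samples):
    """Pooled M12/M172 ratio from a list of (m12_count, j2_count) pairs.

    Pooling raw counts is the same as averaging the per-study ratios
    weighted by each study's J2 count.
    """
    total_m12 = sum(m12 for m12, _ in samples)
    total_j2 = sum(j2 for _, j2 in samples)
    return total_m12 / total_j2


if __name__ == "__main__":
    # Hypothetical counts: a tiny sample with 2 of 2 J2 chromosomes in M12,
    # and a larger one with 6 of 11.
    illustrative = [(2, 2), (6, 11)]
    print(f"pooled ratio: {100 * pooled_ratio(illustrative):.1f}%")
```

A small sample with an extreme ratio barely moves the pooled figure, which is why the 100% from the small sample of [1] has so little effect on the Albanian average.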
In Bulgaria, the ratio is 28.6% [1] and in Romania, the ratio is 0% in the good sample of [1]. In the Ukraine it is 32.9% [2].
According to [3], the ratio is high in Serbs (66.3%). The few Croatians and Herzegovinians belonging to haplogroup J2 all fall in the M12 clade, giving a ratio of 100% [2,3]. Similarly in Poland (100%) [2], and Czech Republic/Slovakia (50%) [1].
The distinction between the Western and Eastern Balkans that I have spoken of before is clear in this regard. The M12 clade comprises the majority of J2 in the West and the minority in the East. Moreover, Slavic speakers of continental Europe belong more to the M12 clade, whereas those bordering the Black Sea are more inclined to have a low frequency of M12, including the non-Slavic Romanians who lack M12 altogether. In historical times, the Balkans were inhabited by several Indo-European peoples who could be classified in the macro-groups of Illyrians (west) and Thracians (east). Greek trade and settlement occurred in both the Adriatic and the Black Sea, but the Greek presence was probably heavier and more long-lasting (until recent times) in the latter region.
Italy resembles the Greek-Black Sea area. Southern Italy has a ratio of 12.4%, while Northern Italy has a ratio of 25% [1]. Other samples give 35.7% for North-Central Italy, 1% for two Calabrian samples, and 0% for Sicily. The latter two locations were Greek speaking for the major part of their recorded history.
Turkey resembles the Greek-Black Sea-South Italian area with an overall ratio of 7.1% [4]. Turkey was primarily Greek, Armenian and Kurdish speaking before the arrival of the Altaic-speaking Turks. Before that, it was also home to a variety of languages, including several extinct languages of the Indo-European family such as Hittite, Luvian, Palaic, Lydian, Lycian, Phrygian, and Celtic.
[1] Di Giacomo (2004)
[2] Semino (2004)
[3] Pericic (2006)
[4] Cinnioglu (2004)
September 07, 2005
Calabrians as Greek descendants
Most of the Greeks of Calabria are now Italianized, but given the mostly rural conditions of the region, the absence of significant foreign settlements and the late survival of Greek, it is very likely that they are largely descended from the medieval Greeks of the region, and even before that, the Greeks of mainland Greece. Moreover, since Greek settlement in Calabria largely pre-dates the descents of Slavs and Albanians in Greece, we may be able to (roughly) determine the extent of the impact of these elements in the modern Greek population.
Two papers in the literature [1, 2] report on the frequency of Y-chromosome haplogroups in the population of Calabria. [1] reports data labeled as "Calabrians", and [2] reports data on the population of Reggio and Paola. The cumulative sample has a size of N=87. Frequency data are shown below, with Greek frequency data from [3] also shown for comparison.
Of course, frequencies may be modified by random genetic drift, and Calabrians are not descended from all Greek regions, but we can still make some general observations about their commonalities and differences.
In both Calabrians and Greeks, haplogroup J2 appears to be very frequent, and haplogroup E3b is also very frequent. It appears very likely that these two haplogroups were represented in ancient populations.
Calabrians have a higher frequency of haplogroup R1b. This haplogroup originated in Asia, but its most recent expansions mark the movements of people from Iberia and Anatolia after the Last Glacial Maximum. Italians have a generally higher frequency of this haplogroup, and hence it appears likely that R1b in Calabrians may partially represent the contribution of native Italians to their gene pool.
Calabrians also have a higher frequency of haplogroup J1. This haplogroup originated in the southern part of the Fertile Crescent, and is often (but not exclusively) found in modern Semitic speakers such as Jews and Arabs. This may represent remnants of Near Eastern people during post-Roman times, even though its earlier arrival cannot be entirely excluded.
Finally, a striking feature of the frequency table is the paucity of R1a and I lineages in Calabrians. R1a originated in the Ukraine and spread after the Last Glacial Maximum, but more recently with Slavic speakers. I1b originated in the Balkans and spread during late Paleolithic and early Neolithic and subsequent times.
It is fairly interesting that in a study which included a Cypriot sample [4], only 2% of Cypriots carried haplogroup R1a chromosomes. Cypriots are also a population which separated from mainland Greeks before the medieval period. The frequency of haplogroup I chromosomes is not available for Cypriots.
Also, of interest is the fact that in regions of Anatolia [5] inhabited by Greek speakers until recently, and in which the native population may be assumed to be descended partially from Islamized Greeks, the frequencies of haplogroups R1a and I are also low. In the Aegean region (8) they are 3.3% and 6.7%, and in the eastern Black Sea region (3) where Muslim Greek speakers still exist, they are 4.8% and 2.4%. Moreover in Anatolia R1a1 frequency is correlated with longitude, declining towards Greece. R1a frequency also decreases from north to south in the Balkans [6].
In conclusion, this small survey provides some evidence against the notion that Y-haplogroups I and especially R1a were substantially represented in ancient Greeks. The relative absence of these haplogroups in populations thought to be partially descended from Greeks, in addition to the decrease in frequency of R1a both north-to-south in the Balkans and east-to-west in Anatolia are the main reasons for this observation.
Naturally, I doubt that we can statistically exclude the presence of either haplogroup -at some low frequency- in ancient Greeks using these relatively small samples, but at least we have some indication that they probably did not form a substantial part of their patrilineal descent.
Update
In a larger sample of Calabrians, the haplogroup I frequency is 5.4%, and that in Sicilians is 8.8% [7]. Haplogroup I lineages in the Balkans and Italy are divided mainly into I1a, I1b, and I*(xI1a, I1b).
Update 2
I also came across this interesting paper (Coll Antropol. 2001 Jun;25(1):189-93.) which further substantiates the idea of the genetic isolation of Reggio Calabria, listed as REG above:
Surnames of grandparents were collected from children in the primary schools of the Albanian-Italian, Croat-Italian, and Greek-Italian villages. The coefficients of relationships by isonymy show almost no relationship with ethnicity. Ethnolinguistic minorities of Southern Italy and Sicily are geographically subdivided in two main clusters: the first cluster comprises the Albanian, Croat, and Greek communities of the Adriatic area; and the second cluster comprises the Albanian communities of the Ionian, Thirrenian and Sicilian area. The Greeks of Reggio Calabria Province are completely separated from the other communities.
It would be extremely interesting to see a study that focused only on Greek speakers of Reggio Calabria.
References
[1] O. Semino et al., "The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective", Science, 290(5494): 1155-1159.
[2] F. Di Giacomo et al., "Clinal patterns of human Y chromosomal diversity in continental Italy and Greece are dominated by drift and founder effects." Molecular Phylogenetics and Evolution, 28(3): 387-395.
[3] C. Flores et al., "Isolates in a corridor of migrations: a high-resolution analysis of Y-chromosome variation in Jordan", Journal of Human Genetics (in press).
[4] Z. Rosser et al., "Y-Chromosomal Diversity in Europe Is Clinal and Influenced Primarily by Geography, Rather than by Language", American Journal of Human Genetics, 67(6): 1526-1543.
[5] C. Cinnioglu et al., "Excavating Y-chromosome haplotype strata in Anatolia", Human Genetics 114(2): 127–148.
[6] M. Pericic et al., "High-Resolution Phylogenetic Analysis of Southeastern Europe (SEE) Traces Major Episodes of Paternal Gene Flow Among Slavic Populations", Molecular Biology and Evolution (in press).
[7] S. Rootsi et al., "Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in Europe", American Journal of Human Genetics 75(1): 128-137.
August 29, 2005
Haplogroup frequency correlations in Southeastern Europe
I have decided to investigate the correlations between haplogroup frequencies in southeastern Europe and some neighboring populations. Currently, I have collected frequency data for the main haplogroups found in the region (E3b, J2, I, R1a, R1b) for 16 populations. Most 3-letter codes should be recognizable, but KAL=Kosovo Albanians, SMA=Slav Macedonians, CAL=Calabrians. I should also note that the frequency of haplogroup I in Bulgarians is interpolated from frequencies in Romanians, Greeks, Slav Macedonians and Serbians, as it was missing in the original article. Conclusions about Bulgarians are especially weak, due to this reason, and also the small original sample (N=24).
I began by calculating the correlation matrix in my sample.
A few features strike the eye:
- The negative correlation between haplogroup R1a and haplogroups E3b, J2, and R1b
- The negative correlation between haplogroup I and haplogroups J2 and R1b
- The positive correlation between haplogroup J2 and haplogroup R1b
- The absence of a substantial correlation between "Neolithic" haplogroups J2 and E3b
The absence of a correlation between J2 and E3b is significant, because it hints that these haplogroups did not diffuse as a result of a single process. The eastern-most populations of our sample, but also the two Italian populations show a higher J2/E3b ratio compared to the "continental" populations.
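For anyone wishing to repeat this kind of calculation, the following is a minimal sketch of a haplogroup correlation matrix computed with NumPy. The frequencies are made-up placeholders for six unnamed populations, not the 16-population data set used here, so the printed values say nothing about the real populations.

```python
import numpy as np

HAPLOGROUPS = ["E3b", "J2", "I", "R1a", "R1b"]

# Hypothetical frequencies (%) for illustration only: one row per population,
# one column per haplogroup, in the order given in HAPLOGROUPS.
freqs = np.array([
    [24.0, 20.0, 14.0, 10.0, 14.0],
    [20.0,  8.0, 30.0, 16.0, 10.0],
    [ 8.0,  4.0, 38.0, 30.0, 12.0],
    [14.0, 24.0,  6.0,  4.0, 30.0],
    [10.0, 22.0,  5.0,  2.0, 36.0],
    [18.0, 12.0, 28.0, 22.0,  8.0],
])

# Pearson correlation between haplogroup columns across populations.
corr = np.corrcoef(freqs, rowvar=False)

if __name__ == "__main__":
    print("      " + "  ".join(f"{h:>5}" for h in HAPLOGROUPS))
    for name, row in zip(HAPLOGROUPS, corr):
        print(f"{name:>5} " + "  ".join(f"{value:5.2f}" for value in row))
```

With only a handful of populations, individual coefficients are noisy, so only the strongest patterns deserve interpretation.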
The second analysis is a dendrogram using Euclidean distance of the normalized haplogroup frequencies. As is apparent, this way of representing the frequency data results in a separation of the two main clusters.
Finally, a principal components analysis is shown in the following plot. The first two components summarize about 77% of the variance.
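The distance and principal components steps can be sketched in the same way. The example below assumes that "normalized" means z-scoring each haplogroup column, which is my guess rather than a documented choice, and it again uses made-up placeholder frequencies. It computes the pairwise Euclidean distances that a dendrogram would be built from, and a PCA via singular value decomposition.

```python
import numpy as np

# Hypothetical frequencies (%) for illustration only.
# Rows are populations; columns are E3b, J2, I, R1a, R1b.
freqs = np.array([
    [24.0, 20.0, 14.0, 10.0, 14.0],
    [20.0,  8.0, 30.0, 16.0, 10.0],
    [ 8.0,  4.0, 38.0, 30.0, 12.0],
    [14.0, 24.0,  6.0,  4.0, 30.0],
    [10.0, 22.0,  5.0,  2.0, 36.0],
    [18.0, 12.0, 28.0, 22.0,  8.0],
])

# Assumed normalization: z-score each haplogroup column.
z = (freqs - freqs.mean(axis=0)) / freqs.std(axis=0, ddof=1)

# Pairwise Euclidean distances between populations (input to clustering).
diff = z[:, None, :] - z[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))

# PCA via SVD of the centered, scaled matrix.
u, s, vt = np.linalg.svd(z, full_matrices=False)
explained = s ** 2 / (s ** 2).sum()   # share of variance per component
scores = u * s                        # population coordinates on the components

if __name__ == "__main__":
    print("variance explained by the first two components:",
          round(float(explained[:2].sum()), 3))
    print("first two component scores per population:")
    print(np.round(scores[:, :2], 2))
```

The distance matrix can then be passed, in condensed form, to a hierarchical clustering routine such as `scipy.cluster.hierarchy.linkage` to draw the tree. The choice of linkage method affects the shape of the dendrogram, so it is worth trying more than one.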
We observe the two main "contrasts" in the data between "coastal" J2/R1b and "continental" I1b and between "Neolithic" E3b and "Slavic" R1a (*)
Several conclusions can be drawn.
- The spread of the Neolithic economy into continental Europe involved E3b bearers in a riverine expansion whose northern expression is associated with the Linearbandkeramik. This does not mean that E3b was the only haplogroup associated with these early European farmers, only that it appears to correlate better with this movement than the other Neolithic haplogroup (J2).
- The early diffusion of E3b occurred over a haplogroup I Paleolithic background. It is likely that as groups moved northward the frequency of haplogroup E3b abated, and this is in fact shown in the frequency distribution. This movement is probably associated with the narrow-faced Danubian Mediterranean racial types.
- This native European population later received an influx of R1a bearers; the frequency of R1a is correlated with latitude. This led to a decrease of the native component in favor of the foreign R1a component (*).
- The frequency of haplogroup J2 was established by three movements: (i) the initial arrival of J2 from Asia Minor, which did not significantly penetrate into the Western Balkans; (ii) the initial dispersal of J2 into Italy and further west, and around the Black Sea in pre-Greek times, which may be associated with the arrival of gracile Mediterranean racial types into the Ukraine; (iii) the later dispersal of additional J2 as a result of Greek colonization.