Showing posts with label Surnames. Show all posts

July 28, 2012

Complex Y chromosome structure in East Tyrol (and more IE thoughts)

Cultural diversity can disappear in a few generations, but genetic diversity -barring major genocides or disasters- usually persists.

The East Tyrol region in Austria has been Germanic-speaking since the Middle Ages, but historical evidence documents the presence of Romance, Germanic, and Slavic groups in its territory. How can we untangle the origin of the different groups when they are all jumbled up together now, and all Germanic-speaking? Previous research has shown that patrilineal groups can be isolated on the basis of surnames, but in the case of East Tyrol, the wide adoption of surnames happened after the region had become linguistically Germanic.

The authors of the new paper exploited the structure of local toponyms, in particular pasture names. The figure on the left shows the concentrations of Slavic (panel A), Romance (panel B), and Germanic (panel C) pasture names. While Germanic pasture names are evenly distributed, there is a contrast between those of Slavic and Romance origin. From the paper:
From the 853 analyzed pasture names in East Tyrol 71% were derived from Germanic (Bavarian) etymons, 17% from Slavic etymons, and 12% from Romance etymons. While pasture names with Germanic etymons were evenly distributed in high density within the whole study area the names with Slavic etymons were spatially focused in the east and north of East Tyrol (Fig. 2). Geographically, these are the lower Drau, Isel, Kals, Virgen and the Defereggen valleys (Fig. 1). No names with Slavic etymons were found in the southwestern Puster valley (Fig. 2). The pasture names with Romance etymons focus mainly in the southern part of East Tyrol (Gail, Puster, and Villgraten valley, Fig. 2). The slight northeastward trend observed in the distribution of Romance etymons is solely caused by the Kals valley, a medieval Romance linguistic enclave, which was separated from the Romance main territory in the 10th century [36]. On the basis of these results, East Tyrol was divided into two regions of former Romance (Puster, Gail, and Villgraten valley; region A) and Slavic (Isel, lower Drau, Defereggen, Virgen, and Kals valley; region B) main settlement (Fig. 2).
The authors dissected the occurrence of different haplogroups in the two contrasting regions (A: Romance, and B: Slavic) in considerable detail:

Splitting the East Tyrolean population sample into regions A and B resulted in a partitioning of haplogroups E-M78, R-M17, R-M412/S167*, and R-S116*. E-M78, R-M17 and R-S116* Y chromosomes were exclusively found in region B whereas samples assigned to R-M412/S167*, R-U106/S21, and R-U152/S28 reached higher frequencies in region A (Fig. 3, Table S7). When attributing the samples to the fathers' and grandfathers' places of birth/residence, as reported by the participants, practically identical patterns were obtained for most of the haplogroups (Fig. 3). 
Y chromosomes belonging to haplogroups G-P15, I-M253, and J-M304 showed much lower regionalization in their frequencies (Fig. 3) at all three generation levels.

The non-localization of G-P15, I-M253, and J-M304 seems reasonable, as these may represent what is common to these populations (and one could indeed speculate, on the basis of current ancient DNA knowledge, that they correspond to Neolithic, Paleolithic, and Bronze Age processes, respectively).

Two of the most interesting findings are:
Haplogroup R-M412/S167* was found at low frequencies in the combined East Tyrolean sample. However, the R-M412/S167* chromosomes were sorted by the subdivision of the study area and reached in region A levels of ~14% whereas their frequency in region B was well below the 5% threshold. At the probands and fathers level of analysis region A featured approximately fourfold higher frequencies of these chromosomes than region B. This ratio changed to about nine when placing the samples at the grandfathers' places of birth/residence. These contrasts remained statistically significant after correcting for multiple comparisons [22] at the fathers and grandfathers analysis level.
and:
The western border of the geographic expansion of haplogroup R-M17 Y chromosomes is to be found in Central Europe and largely follows the political border separating present-day Poland (57%) and Germany (East: ~30%, South: ~14%, West: ~10%) [42]. Frequencies of about 15% and 10% were also found for Austria [18] and North-East Italy [48], respectively. In South Italy and in West Europe R-M17 chromosomes are not present at informative frequencies. 
In this study, the proportion of Y chromosomes carrying the derived M17 allele was 14.1%, a value that nearly perfectly matched those reported for West Austria (North Tyrol, 15.4%) and South Germany (Munich; 14.3%) [18], [42]. However, haplogroup R-M17 was completely absent in the East Tyrolean sub-sample from region A, but made up to 16% in region B. This result remained practically unchanged when assigning the probands to their respective fathers' or grandfathers' places of birth/residence (Fig. 3).
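To get a feel for how strong a contrast of 0% versus about 16% is, here is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of carriers and non-carriers. The per-region sample sizes below are hypothetical placeholders (the excerpts above do not give them), and this is not necessarily the test the authors used; only the two frequencies come from the paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# HYPOTHETICAL counts: the per-region sample sizes are not given in the post.
# Only the 0% (region A) and ~16% (region B) R-M17 frequencies come from it.
region_a = {"carriers": 0, "others": 60}
region_b = {"carriers": 32, "others": 168}

p_value = fisher_exact_two_sided(
    region_a["carriers"], region_a["others"],
    region_b["carriers"], region_b["others"],
)
print(f"two-sided p = {p_value:.2g}")
```

With real counts from the paper's supplementary tables, the same function could be run once per haplogroup, followed by a correction for multiple comparisons as the authors describe.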
The new study reinforces my belief that R-M17 was not originally present in the Italo-Celtic branch of Indo-European. Together with the paucity of the same lineage in Albanians (~5%) and Armenians (less than 5%), and its quite uneven distribution in Greeks, it is becoming increasingly clear that R-M17 may represent a late entrant that minimally affected southern and western Europe.

Its spread probably started from a trans-Caspian (?) Central Asian staging point and followed a counter-clockwise route into Europe, spawning the northern (Germanic and Balto-Slavic) groups of Europe as well as the Indo-Iranians, who remained longer in their BMAC homeland, finally breaking down during the 2nd millennium BC. This would also harmonize with the increasing evidence for complementary R-M17 distributions in Europe and Asia, associated with the Z93 marker.


It appears that Z93+ chromosomes may track the later expansion of the Indo-Iranian world. I have observed before that R-M17 seems distributed in a long arc north and east of the Caspian, and it is perhaps at different points along this arc that the dominant European (NW) and Asian (SE) types emerged out of the early Neolithic population.

Combining this insight with an analysis of Y chromosome variation within the Graeco-Armeno-Aryan group, it appears that Graeco-Armenian is characterized predominantly by J2+R1b related lineages, while Indo-Iranian by J2+R1a related lineages. The evidence for Tocharian would involve J2+R1b related lineages.  Overall, it would appear that the earliest J2 core of PIE affected two different groups of populations living on complementary sides of the Caspian:

  • The trans-Caspian R-M17 population followed an early (3rd, or late 4th millennium BC?) north-west trajectory into Europe (associated with northern European groups such as Balto-Slavic and Germanic) as well as a later expansion (2nd millennium BC? associated with climatic deterioration in BMAC) that brought Iranian speakers to the steppe, as well as to Iran, and Indo-Aryans to South Asia.
  • The cis-Caspian, trans-Caucasian R-M269 population followed an early (late 4th millennium, early 3rd millennium?) expansion into Europe, probably together with J2 in the Balkans (Graeco-Phrygian, perhaps Thracian), and arriving in the form of Bell Beakers in Western Europe (Italo-Celtic), as well as a later (2nd-1st millennium BC?) expansion to the east (Tocharians).
This long excursus was necessary as a preamble to an explanation of what happened in Europe itself, which brings us back to the topic of the current paper:

  • The lack of structure between regions A and B with respect to haplogroup J, together with the great difference in levels of this haplogroup between Italy and the Celtic world, suggests that Italian J-related lineages may have been inflated in proto-historical and historical times. There are candidates aplenty: Greeks, Etruscans, and Trojans, to name but three. The excess of J in Italy, relative to the Celtic world, clearly relates to the abundant traditions of eastern origins for the historical groups of Italy.
  • It would appear that during proto-history, most of Europe was dominated by three sets of IE people: R-M269 in the west, who had transmitted Proto-Celto-Italic; R-M17 in the northeast, of Proto-Balto-Slavic speech; and Proto-Germanic in between, participating in both worlds and, appropriately, often linked linguistically with either Italo-Celtic or Balto-Slavic.
  • There were other (now-extinct) groups as well: the Illyrians vs. Thracians in the Balkans with complementary Y chromosome distributions, the former including an extra chunk of aboriginal legacy (haplogroup I), no doubt due to the much more difficult terrain of the western Balkans. These are contrasted with our final group, the Greeks, who straddled three worlds (the Paleo-Mediterranean world of the first farmers, the Thraco-Phrygian world linked to the Indo-Iranians at a deeper level, and the Anatolian world).
The boundaries between these various groups were a little blurred in the course of history. But, apparently, they were still a little clearer during the Middle Ages, and probably much clearer before the Völkerwanderung of the Germans, and the expansion of the Slavs.


Geneticists are executing a remarkable pincer movement, zeroing in on the period of European ethnogenesis from both the remote past and the present: through the study of ancient DNA from the dawn of history, they are beginning to understand how Europe was peopled, layer after layer of settlement; and through the study of surnames and toponyms, they are drilling ever deeper into the pre-genealogical past. Together with much-anticipated technological progress in full genome sequencing and ancient DNA extraction, it will not be long before the history of Europe is laid bare in remarkable detail.

PLoS ONE 7(7): e41885. doi:10.1371/journal.pone.0041885

Pasture Names with Romance and Slavic Roots Facilitate Dissection of Y Chromosome Variation in an Exclusively German-Speaking Alpine Region

Harald Niederstätter et al.

The small alpine district of East Tyrol (Austria) has an exceptional demographic history. It was contemporaneously inhabited by members of the Romance, the Slavic and the Germanic language groups for centuries. Since the Late Middle Ages, however, the population of the principally agrarian-oriented area is solely Germanic speaking. Historic facts about East Tyrol's colonization are rare, but spatial density-distribution analysis based on the etymology of place-names has facilitated accurate spatial mapping of the various language groups' former settlement regions. To test for present-day Y chromosome population substructure, molecular genetic data were compared to the information attained by the linguistic analysis of pasture names. The linguistic data were used for subdividing East Tyrol into two regions of former Romance (A) and Slavic (B) settlement. Samples from 270 East Tyrolean men were genotyped for 17 Y-chromosomal microsatellites (Y-STRs) and 27 single nucleotide polymorphisms (Y-SNPs). Analysis of the probands' surnames revealed no evidence for spatial genetic structuring. Also, spatial autocorrelation analysis did not indicate significant correlation between genetic (Y-STR haplotypes) and geographic distance. Haplogroup R-M17 chromosomes, however, were absent in region A, but constituted one of the most frequent haplogroups in region B. The R-M343 (R1b) clade showed a marked and complementary frequency distribution pattern in these two regions. To further test East Tyrol's modern Y-chromosomal landscape for geographic patterning attributable to the early history of settlement in this alpine area, principal coordinates analysis was performed. The Y-STR haplotypes from region A clearly clustered with those of Romance reference populations and the samples from region B matched best with Germanic speaking reference populations. 
The combined use of onomastic and molecular genetic data revealed and mapped the marked structuring of the distribution of Y chromosomes in an alpine region that has been culturally homogeneous for centuries.

Link

August 10, 2011

First "People of the British Isles" paper

The data can be found here. Description of region codes:
Cornwall, Devon and Pembrokeshire were pooled to represent the South/West (SW) and the area that could be considered the closest surrogate to the Ancient British. Kent, Norfolk and Lincolnshire were pooled to represent the East (E) and the area most directly influenced by the Anglo-Saxon invasions. Cumbria, Yorkshire and the North East were pooled broadly to represent the North of England (N); Oxfordshire and the Forest of Dean were combined to represent the Central region of England (CN); and Orkney was kept separate from the others, largely because of the known substantial Norse Viking influence in Orkney.
Of interest are the NRY haplogroup data:


The SW seems to have the lowest R1a1 frequency (1.3%). This is in agreement with the Dodecad v3 results for Cornwall (1.8% East European). The E seems to have an intermediate frequency (3.5%), again in agreement with Dodecad v3 for Kent (3.7%). Orkney has the highest R1a1 frequency (34.2%) and also the highest East European component in Dodecad v3 (11%). So, it does seem that R1a1 frequency tracks an eastern population element in the British Isles.
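As a quick illustration of that comparison, the sketch below pairs the three R1a1 frequencies with the three Dodecad v3 East European values quoted above and computes a plain Pearson correlation. The region-to-sample pairing (SW with Cornwall, E with Kent) simply follows the text, and with only three points this is a descriptive check, not a statistical test.

```python
from math import sqrt

# Values quoted in the post: (R1a1 frequency %, Dodecad v3 East European %).
regions = {
    "SW / Cornwall": (1.3, 1.8),
    "E / Kent": (3.5, 3.7),
    "Orkney": (34.2, 11.0),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance = sum(a * b for a, b in zip(dx, dy))
    return covariance / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


r1a1 = [pair[0] for pair in regions.values()]
east_european = [pair[1] for pair in regions.values()]
r = pearson(r1a1, east_european)

# Both measures rank the three regions in the same order.
same_order = sorted(regions, key=lambda k: regions[k][0]) == sorted(
    regions, key=lambda k: regions[k][1]
)
print(f"Pearson r = {r:.3f}, same ranking: {same_order}")
```

Three data points cannot establish a relationship on their own; the sketch only makes the agreement described in the paragraph explicit.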

R1xR1a1 has its lowest values in E and OR; this is probably consistent with the idea that Germanic invaders from the east possessed less of this haplogroup than the pre-Germanic population.

The minor alleles in MC1R are associated with red hair, and their frequency seems to be highest in Orkney.


The admixture estimates are also quite interesting, showing a dependence on the use of local (L) vs. non-local (N) surnames; the results do suggest non-trivial shifts in genetic composition since the adoption of surnames.


European Journal of Human Genetics advance online publication 10 August 2011; doi: 10.1038/ejhg.2011.127

People of the British Isles: preliminary analysis of genotypes and surnames in a UK-control population

Bruce Winney et al.

There is a great deal of interest in a fine-scale population structure in the UK, both as a signature of historical immigration events and because of the effect population structure may have on disease association studies. Although population structure appears to have a minor impact on the current generation of genome-wide association studies, it is likely to have a significant part in the next generation of studies designed to search for rare variants. A powerful way of detecting such structure is to control and document carefully the provenance of the samples involved. In this study, we describe the collection of a cohort of rural UK samples (The People of the British Isles), aimed at providing a well-characterised UK-control population that can be used as a resource by the research community, as well as providing a fine-scale genetic information on the British population. So far, some 4000 samples have been collected, the majority of which fit the criteria of coming from a rural area and having all four grandparents from approximately the same area. Analysis of the first 3865 samples that have been geocoded indicates that 75% have a mean distance between grandparental places of birth of 37.3 km, and that about 70% of grandparental places of birth can be classed as rural. Preliminary genotyping of 1057 samples demonstrates the value of these samples for investigating a fine-scale population structure within the UK, and shows how this can be enhanced by the use of surnames.

Link