April 01, 2011

Genetic structure in North-Central Europe with the Galore approach (revisited)

This is an update of a previous post, but with a much larger sample: 416 individuals from 26 populations.

The sources of the data are:
  • FIN and GBR from the 1000 Genomes Project
  • Populations with names ending in _D from the Dodecad Project
  • Populations with names ending in _H from the HGDP
  • Populations with names ending in _B from Behar et al. (2010)
20 clusters were inferred with 14 MDS dimensions retained. Below is the number of individuals assigned from each population to each cluster:
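A population-by-cluster count table of this kind can be tallied directly from two parallel label lists. The following is only a minimal sketch: the `crosstab` helper, the population names, and the cluster numbers are hypothetical illustrations, not the project's data or code.

```python
from collections import Counter


def crosstab(populations, clusters):
    """Count individuals for every (population, cluster) pair."""
    counts = Counter(zip(populations, clusters))
    pops = sorted(set(populations))
    clus = sorted(set(clusters))
    # Counter returns 0 for pairs that never occur, so empty cells are filled in.
    return {p: {c: counts[(p, c)] for c in clus} for p in pops}


# Hypothetical toy labels: one entry per individual.
populations = ["PopA", "PopA", "PopA", "PopB", "PopB", "PopC"]
clusters = [1, 1, 2, 2, 2, 2]

table = crosstab(populations, clusters)
for pop, row in table.items():
    print(pop, row)
```

Each row gives one population and each column one cluster, which is the layout the cluster descriptions in this post refer to.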

Some details on the clusters:
  • #1-3 are dominated by all 100 Finns plus 2 Swedes
  • #4 is clearly Balto-Slavic
  • #5 is clearly Russian
  • #6 is Norwegian-Swedish
  • #8 is British Isles
  • #9 is also British Isles but also encompasses all 3 Danes and one Dutch individual
  • #10 is French dominated
  • #11 is Central European (German-Hungarian)
  • #13-14 are British-Orcadian
Some points of interest:
  1. A single Estonian groups with Balto-Slavs
  2. A single Austrian groups with a Hungarian, French, and British
By definition, a cluster can be inferred only if 2 or more individuals belong to it. Singleton project participants are therefore likely to be grouped with some broader group irrespective of their closeness to it. So we should not conclude that, e.g., the Estonian is indistinguishable from Balto-Slavs, but rather that any genetic distinctiveness of the Estonian population must await more Estonian samples.
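The singleton effect can be illustrated with a toy example. This sketch uses plain nearest-centroid assignment as a stand-in for the actual clustering method, and the 2-D coordinates, cluster names, and helper functions are all made up for illustration.

```python
import math


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def nearest_cluster(point, clusters):
    """Return (name, distance) of the cluster whose centroid is closest."""
    best = min(clusters, key=lambda name: math.dist(point, centroid(clusters[name])))
    return best, math.dist(point, centroid(clusters[best]))


# Hypothetical clusters; every member sits 1.0 away from its own centroid.
clusters = {
    "A": [(0.0, 0.0), (0.0, 2.0)],
    "B": [(10.0, 0.0), (10.0, 2.0)],
}

# A lone individual lying between the two groups.
singleton = (4.0, 1.0)
name, dist = nearest_cluster(singleton, clusters)
print(name, dist)
```

The lone individual is assigned to cluster "A" even though it lies four times farther from that centroid than any actual member does, so membership alone says little about how close a singleton really is to the group it lands in.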

It is also interesting that the hitherto distinctive Finnish and British Isles populations have split into several clusters. This is the power of numbers, and I anticipate this to occur for other population groups with large sample sizes.

On the flip side, the inclusion of a wide array of Balto-Slavic populations has tended to make them all fall into a single cluster. Belonging to a single cluster does not mean that there is no population differentiation, but rather that this does not take the form of separate "blobs" of individuals that an algorithm working on unlabeled individuals can uncover.

This also suggests an explanation for the mega-British Isles/American White cluster discovered in the most recent analysis of Project participants: the inclusion of multiple admixed individuals probably served to fill in the gaps within the general population of that origin. The current analysis, by contrast, included only individuals of a single origin, along with what is presumably a good geographical sampling of the GBR population, and this has made population structure more visible.

UPDATE (Apr 4): It has come to my attention that the single "Hungarian" and the single "Austrian" joined the project under false pretenses and are the relatives of other Project members. In retrospect it is not surprising that they failed to join the German and Hungarian clusters respectively.

22 comments:

  1. The split of the Irish and British populations into Clusters 8 and 9 is interesting. 76% of the Irish are in Cluster 8, whereas for British_D the split is 35% in Cluster 8 and 50% in Cluster 9. Given that Cluster 9 includes Dutch and Danes, could we be seeing a split between an "Insular Celtic" population and "Germanic" incomers?

    Is GBR a British group as well?

  2. >> Is GBR a british group as well?

    Yes, it is comprised of English and Scots.

  3. We need to get some specifically Welsh and Scottish people to join Dodecad; it would be interesting to try to break out the variation across the island of Britain. I know that in past studies Scots have often overlapped with both Irish and English.

    In the GBR group, which obviously excludes the Welsh, the level of Cluster 8 drops from the 35% in British_D to about 21%.

    -Paul
    (pduffy81)

  4. We would expect at least two subgroups in Finns - true Finns and Swede-Finns. With three groups, one would suspect a strongly Saami admixed Finn, a not strong Saami admixed Finn and Swede-Finn mix, or alternately, a Finn, a Slavic admixed, and a Swede-Finn category.

  5. Great Britain also includes Wales.

    GBR does not.

  6. As I have pointed out a number of times, in pretty much all autosomal studies Germans and Hungarians rank very close.

    Germans are still represented a bit sparsely, here - so there are missing matches with Swedes, Danes, or Dutch that would probably occur in a larger population sample. It is still interesting to see that eight of the ten Germans group with (18 out of 21) Hungarians. What a strong "Danubian" connection!

  7. Regarding the GBR set is there any data regarding the breakdown of how many of the 90 were in Scotland or in England?

    About 1/10th of the total population of Britain (6 million) has at least one Irish grandparent; if you cover all immigration since the famine, the number goes up to about 1/4 of the population with some Irish ancestry/admixture.

  8. This may be a result of the small sample sizes but I find it interesting that there is so little overlap between the British and German samples.

  9. I understood that the British data in the 1000 Genomes Project was taken from the People of the British Isles Project:

    http://www.peopleofthebritishisles.org/press/nl4.pdf

    This project includes participants from Wales and Northern Ireland. Do you have a breakdown somewhere of the geographical origins of the samples they supplied?

  10. >> I understood that the British data in the 1000 Genomes Project was taken from the People of the British Isles Project:

    "British from England and Scotland (GBR)"

    http://www.1000genomes.org/about

  11. I know where you can get more Dutch data! ;)

  12. The cluster overlap with the Danes will be connected with the group of Vikings that came that way, and Danish rule of Northern England.

    And yes, the lack of significant German overlap is striking. But this agrees with haplogroup data indicating that the impact of groups like the Saxons was minimal.

    IMO the similarities between the Germans and British Isles populations are related to a much older underlying structure combining a southern-origin component (related to the Sardinians) and a northern structure (related to Lithuanians).

    I don't like this cluster-only data, as I think it can be misinterpreted. Can we have the graphic also? It really adds to the reality of the situation.

  13. Thinking about it further the Danish/Dutch cluster looks to just illustrate the similarity between these people and the British. If it were Danish rule/Viking input it would not be so dominant a cluster. Neither made much inroad further south in England.

    Fits more in with the Doggerland idea than anything.

  14. >> This may be a result of the small sample sizes but I find it interesting that there is so little overlap between the British and German samples.

    Average Joe - actually, 4 of the 10 German samples do overlap: 2 (including Norway in the group), and another 2 (including France and Hungary in the group).

    But, yes, as I said before, I suspect undersampling of the North - otherwise, there would be more matches with Netherlands, Denmark, and Sweden, and with that, also British.

  15. Looking at the sample map from the pdf that Debbie provided, it looks like they have quite a poor sampling rate for most of Scotland. Most Scots samples are in the east and south, as well as Orcadian. Not a lot of samples from the west, in the traditional Gàidhlig-speaking areas.

    In comparison they got quite a good sample rate from Wales, Cornwall and Northern Ireland.

  16. >> http://www.1000genomes.org/about

    Thanks for the link. Perhaps the intention is to add the samples from Wales and Northern Ireland at a later date.

  17. I notice that one of the French samples has members in cluster 8 which seems largely exclusive to the Irish and British populations. Does this French sample include Bretons?

  18. >> Given that Cluster 9 includes Dutch and Danes we could be seeing a split between "Insular Celtic" population and "Germanic" incomers?

    In addition to Irish and British samples, cluster 9 includes Danish, Dutch and even French samples but no German or Scandinavian ones. I think that this makes it unlikely for cluster 9 to be Germanic. I think that cluster 9 is a Celtic one that was germanized in Denmark and the Netherlands and latinized in France. From what I understand Y-chromosome R1b - which is more common in Celtic populations than in Germanic ones - is more common in Denmark and the Netherlands than it is in Germany and Scandinavia which may be evidence of a significant Celtic survival in the Dutch and Danish populations.

  19. How come the Swiss aren't included? Switzerland is a Central European country, nearly all of its population is north of the Alps, and its southern part is north of much of France.

    I know there was at least one German Swiss and I'm sure they'd cluster with northern/central.

  20. @princenuadha
    There is actually a Swiss in the clusters galore and he is in the North-Italian cluster.

  21. @Rafael
    Thanks for the info, but was the guy Italian-Swiss (a small minority)? The Italian Swiss have already been shown to be quite different from the German and French Swiss in that one 2008 study. In fact they did cluster with northern Italians.

    The one German Swiss I was thinking of, swissgirl, clustered with northern Europeans and I am almost sure most Swiss would also.

