June 27, 2011

Basques (?) in 1000 Genomes IBS (Iberian Spanish) sample

I am looking at the population portraits of the Dodecad v3 results (all of which will be provided as a zip once I finish calculating averages), and I discovered an interesting presence of presumably Basque individuals in the 1000 Genomes IBS sample.

First, here are the Dodecad Spanish:
And, the Behar et al. (2010) Spaniards:
And, the HGDP French Basque:
Notice that they are composed almost entirely of "West European" and "Mediterranean" components.

Here is IBS:
Notice a few individuals that resemble Basques. I haven't found a description of the origin of the IBS individuals, but I would wager that a few Basque individuals are included, that resemble their French co-ethnics.

32 comments:

Onur Dincer said...

6 of the 46 IBS samples genetically resemble the HGDP French Basques. Spain has a population of 46 million, of which about 2.3 million are Basques. If IBS samples were collected based on demography, probably only 2 of the 6 Basque-resembling individuals of IBS are actually Basques and the remaining 4 individuals are from the Romance-speaking ethnic groups of Spain.

Maju said...

You'd have to test in another way: one in which non-Basque Spaniards and Basques are clearly different. In the examples shown, neither is clearly defined: they just look neither this nor that, 50-50 of something other than themselves (undefined).

Just because they have some or no third-party admixture (? affinity is a better word) does not mean they are Basques or of Basque ancestry; they might well be another group of mostly unmixed people from somewhere else in Iberia.

We know from Bauchet 2007 (and maybe others) that Basques and Iberians cluster in two different components of their own, so what are you waiting to check that 1 vs 1 (K=2 should be enough) before making odd claims? Do you fear to find something you do not like?

The third party affinities most obvious among Spaniards, are NW African (Berber), SW Asian (Arab) and West Asian, these only indicate transmediterranean admixture (?) not automatic Basqueness. They may be related but they are not any safe bet. You should make direct 1 vs 1 comparisons before jumping to any conclusions.

Onur Dincer said...

6 of the 46 IBS samples genetically resemble the HGDP French Basques. Spain has a population of 46 million, of which about 2.3 million are Basques. If IBS samples were collected based on demography, probably only 2 of the 6 Basque-resembling individuals of IBS are actually Basques and the remaining 4 individuals are from the Romance-speaking ethnic groups of Spain.

Addendum: If we remove the 5-6 million foreign immigrant population in Spain from 46 million, then probably 3 of the 6 Basque-resembling IBS individuals are Basques.
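The arithmetic behind these two estimates can be written out as a short sketch. It assumes, as the comment does, that the IBS samples were drawn in proportion to population; the function name `expected_count` is only illustrative, and the population figures are the ones quoted in the comment.

```python
def expected_count(sample_size, group_pop, total_pop):
    """Expected number of group members in a sample drawn
    proportionally to population (an assumption, not a known
    property of the IBS sampling)."""
    return sample_size * group_pop / total_pop


ibs_samples = 46          # IBS individuals in the post
basques_millions = 2.3    # figure quoted in the comment
spain_millions = 46.0     # figure quoted in the comment

# Whole population of Spain as the denominator
print(expected_count(ibs_samples, basques_millions, spain_millions))

# Addendum: remove 5-6 million foreign immigrants from the denominator
for immigrants in (5.0, 6.0):
    print(expected_count(ibs_samples, basques_millions,
                         spain_millions - immigrants))
```

The first figure comes out at about 2.3 and the adjusted ones at roughly 2.6, which is where "probably 2" and "probably 3" of the 6 come from.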

᧞eandertalerin said...

Dienekes, could you reveal where the other individuals in the sample come from? Are individuals 10 to 16 in the IBS sample Basques too? Does the 'IBS' include Iberians from all regions?

eurologist said...

@Maju:
"they might well be another group of mostly unmixed people from somewhere else in Iberia"

I agree - I believe it is very likely that before Roman times, much of Iberia was "unmixed" as the Basque look today.

dalouh said...

What exactly is Neo_African?

Onur Dincer said...

Are individuals 10 to 16 in the IBS sample Basques too?

The 10th individual doesn't genetically resemble Basques, well, at least the genetically known Basques.

Isidro said...

Well, technically the Basques living south of the Pyrenees belong to the Iberian Peninsula classification like the rest of the IBS group showing.
Indeed, 10 to 16 resemble the French Basque; no mystery there, they are the same ethnic group. If anything, it shows isolation emanating from the Pyrenees to the rest of Iberia, with up to 10% of a conglomerate of the so-called outsiders to the area showing in non-Basques.
In my case, the Dodecad K12 shows exactly that. I was born in Aragon, adjacent to and east of the Basque area, and my 3 main components are 32% Basque (with no Basque ancestry), 30% northern European and 19% Sardinian.
Ideally if we created a map with the results of IBS from Basque with the rest of France and Spain, it would show clearly that there is no mystery there.

Kepler said...

Unmixed? What about Greeks along the Mediterranean?
Alicante? Ampurias? Rosas? Málaga?
Phoenicians with Cartagena, Cádiz?
My J2 haplogroup (about >10% along the coast, if I remember well) could come from that, if it was not from Romans or Jews later on.
That is hardly seen in the Basque country

Maju said...

Kepler: just for the record, there were no archaeologically confirmed Greek colonies south of Emporion in Iberia. There was Greek trade from Marseilles (and dependencies) but no Greek colonies.

Phoenician colonies were limited to a few spots on the Andalusian coast, notably Cádiz but also Málaga (which you misattribute to Greeks) and several other locations. Only in the late period (the Barcid empire, between the two Punic wars) was there some Phoenician penetration inland.

Overall these proto-historical anecdotes are pretty much irrelevant. Surely Phoenicians had an impact in the specific sites they settled but that was surely it.

"My J2 haplogroup (about >10% along the coast, if I remember well) could come from that, if it was not from Romans or Jews later on".

J2 is widely spread in about all Iberia and other parts of Europe. Lebanese have equivalent apportions of J2 and J1 (much like Jews, who may descend from ancient Phoenicians).

Similarly, would it be Roman admixture, we'd expect similar apportions to those found in Italy (or specific parts of it) and we do not. So what you say makes little sense.

It is much more likely that J2 in West Europe corresponds to a Neolithic founder effect or something like that, older than what you say.

jes-r said...

Interesting find, you could try a PCA analysis and see if these putative Basque pull towards the HGDP Basque?

Kepler said...

Maju,

Málaga: my fault;

Now: Are you implying J2 distribution in Spain is uniform?
If not: do you think this is still reflection of Neolithic times?
J2 is definitely more represented along the Mediterranean, even if it is to be found in many other places, which is not strange given the important migrations that have taken place throughout Spain's history.

"Similarly, would it be Roman admixture, we'd expect similar apportions to those found in Italy (or specific parts of it) and we do not. So what you say makes little sense. "
No, we would not, simply because Roman populations did not completely replace the original population and we hardly know what the genetic composition of that original one was. J2 is more common in the South of Italy, from where most settlers came.
You won't expect an exact copy of the patterns in those regions, only so much.

Andrew Oh-Willeke said...

Is HGDP data amenable to extraction of any known phenotypes? E.g., lactose persistence, malarial resistance, FOXP2, BRCA, blood type, immune complexes, any of the novelty seeking associated genotypes, eye/hair coloration, etc.

Normally, in autosomal studies the point is to use the largest available data set to do cluster and admixture analysis with the assumption that the bulk of the data points are selectively neutral, in order to analyze ancestry. But, a few phenotypes associated with particular clusters might make it possible to make more informed non-statistical inferences about which of multiple possible historical moments at which admixture could have occurred make the most sense.

For example, I remain captivated by the recent ancient DNA study that showed an absence of lactose tolerance in some remains in a mid-to-late megalithic frontier village (in line with low levels of lactose tolerance in modern Gascons of Southern France), as contrasted with exceedingly high levels of lactose tolerance in the nearby Basque who, while having the autosomal distinctiveness noted in this post and a few unusual aspects of the Y-DNA and mtDNA mix are overall typically Iberian genetically.

I'm also curious if there has been much demographic modeling on the impact of RH- factors in Basque genetic distinctiveness as this could limit random admixture.

What potential source populations for Basque are the best matches for the lactose tolerance and RH- factors?

The high levels of lactose tolerance in Basque which have high levels of Y-DNA R1b1b2, that is absent in close neighbors to the Basque with high levels of R1b1b2, is particularly vexing because it makes scenarios suggested by the absence of a West Asian autosomal component such as Basque being representative of an early Neolithic background and other Europeans having an additional trace of a later wave of IE or proto-IE migration in the autosomal mix harder to grok.

In an Old European wave (including any admixture in of Paleolithic layers) plus non-Basque IE wave model, you'd expect the non-IE people to be lactose intolerant and the IE people to be lactose tolerant. Instead, we see the reverse, which requires a more complex scenario. We need waves of Old European lactose intolerants, Basque lactose tolerants, and then an IE wave, in that order. Further, the mtDNA, at least, seems to suggest that there is more Basque admixture in of pre-Neolithic populations (e.g. represented in U5b and V mtDNA hgs) than some of their IE neighbors, which is particularly hard to fathom given the sequencing. One needs to have pre-Neolithic retreating to refugium for one reason in the face of Old European Neolithics, followed by Basque in migrations moving in for some other reason (other ecological niches are full, or they are kicked out of somewhere else and muscle in). Dating this from archaeology isn't easy either because you have continuous human presence in the region from the Paleolithic and it is hard to make calls on cultural continuity v. discontinuity and to distinguish meme transfer from demic transfer.

Alternately, we could have had a scenario where selective effects would have put the Basque at far more selective pressure/founder effect for lactose tolerance than their neighbors, but it is hard to come up with selective pressures that would be so much more intense for the Basque than for Gascons or other Iberians.

Maju said...

@Kepler: not strictly uniform but quite so - see this and my version, including a J2-only regional map.

"If not: do you think this is still reflection of Neolithic times?"

Difficult question. It's almost homogeneous: there are some regional differences but J2 is one of the haplogroups for which such regional differences are lowest in Iberia.

"J2 is definitely more represented along the Mediterranean"...

In Iberia you mean? Not especially so. The densest areas are in fact 3 Atlantic regions (Asturias, South Portugal and West Andalusia), 2 inland regions (Extremadura, Aragon) and only one Mediterranean (East Andalusia). It'd seem to have a center in SW Iberia if anywhere.

"No, we would not, simply because Roman populations did not completely replace the original population"...

If J2 would all be Roman then we would expect it to come with the other haplogroups of Italy/Latium in similar frequencies.

What you say only makes sense if some J2 existed in Iberia before any Roman colonization. But then all J2 is not Roman, Q.E.D.

"J2 is more common in the South of Italy, from where most settlers came".

But the South of Italy is also high in J1 and Spain almost lacks that haplogroup. Big problem for your Roman hypothesis.

"You won't expect an exact copy of the patterns in those regions, only so much".

You must expect an almost exact copy. If the origin is 50% A and 50% B, then the colonist will have roughly that apportion and therefore:

1) A destination that has 25% A, 20% B and 55% other (native lineages) suggests a colonization of 45% intensity from the origin.

but...

2) A putative destination with 30% A, 5% B and 65% other implies that either (a) there was no colonization at all (false starting hypothesis) or that (b) there was a very strong founder effect (implying very small populations involved by all sides) or that (c) the origin is another place where A is 80% and B is just 20% (or even less).

You MUST consider this basic logic, otherwise it's just sloppy pseudoscience. Method and internal consistency are everything.
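The proportion argument in the two numbered cases above can be sketched as a small consistency check. This is only an illustration under a strong simplifying assumption that is not stated in the comment: the native pool carried none of the source haplogroups, so everything labelled A or B in the destination came from the colonists. The function name `implied_admixture` and the dictionary layout are hypothetical.

```python
def implied_admixture(source, destination):
    """Compare a destination's haplogroup frequencies with what simple
    admixture from `source` would predict.

    source:      haplogroup -> frequency in the origin population
                 (frequencies should sum to about 1)
    destination: haplogroup -> frequency in the destination, for the
                 same haplogroups (the remainder is 'native other')

    Assumes the native pool had none of the source haplogroups.
    Returns (admixture fraction, expected frequencies, largest gap).
    """
    # Total share of colonist lineages in the destination
    m = sum(destination[h] for h in source)
    # Frequencies expected if colonists carried the source proportions
    expected = {h: m * f for h, f in source.items()}
    # Largest mismatch between observed and expected frequencies
    max_gap = max(abs(destination[h] - expected[h]) for h in source)
    return m, expected, max_gap


origin = {"A": 0.5, "B": 0.5}

# Case 1: 25% A, 20% B, 55% other
print(implied_admixture(origin, {"A": 0.25, "B": 0.20}))

# Case 2: 30% A, 5% B, 65% other
print(implied_admixture(origin, {"A": 0.30, "B": 0.05}))

# Option (c): a different origin with 80% A and 20% B fits case 2 better
print(implied_admixture({"A": 0.8, "B": 0.2}, {"A": 0.30, "B": 0.05}))
```

In case 1 the implied admixture is 45% and the observed frequencies sit close to the expected ones; in case 2 the gap is several times larger unless the origin is changed as in option (c). The sketch does not model founder effects or drift, which is option (b).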

Maju said...

@Andrew:

"What potential source populations for Basque are the best matches for the lactose tolerance and RH- factors?"

I'd say that Basques themselves are the best candidate source (many reasons), while awaiting proper sampling and research in France. Other West Europeans from South Iberia to Scandinavia are consistent anyhow in both factors (and other stuff) and are therefore close relatives.

"you'd expect the non-IE people to be lactose intolerant and the IE people to be lactose tolerant. Instead, we see the reverse, which requires a more complex scenario. We need waves of Old European lactose intolerants, Basque lactose tolerants, and then an IE wave, in that order".

I think that (some?) Paleolithic West Europeans were lactose tolerant just because (random allele). That simplifies things a lot.

Rafael said...

Actually there is no clear cut between Basques and non-Basques; it's something gradual. The surrounding areas (Cantabria, Northern Aragon, Rioja, etc.) would certainly look closer to, or intermediate between, Basques and the rest of Spaniards. In the clusters as well. Shame we don't have samples from these areas.

Average Joe said...

Maju:

Could it be possible that Basques inherited both R1b and lactose tolerance from surrounding Iberian populations due to gene flow?

Maju said...

@Joe: I think no, at least not from modern populations, which have both markers at lower frequencies than Basques.

Gene flow is not any magic wand, it must follow a logic and if at the origin pool you have salt water (water but also salt), you don't get freshwater in the destination pool (water and no salt). The opposite is possible instead.

In order to get freshwater from saltwater you need to postulate some special mechanism, for example an ancient founder effect. But that is not mere regular "gene flow".

It is quite obvious, right?

Isidro said...

Rafael said:
Anyways, I don't like these Iberian samples. It's stupid. Too ethnically diverse. The Portuguese and Spaniards are mixed together. The Portuguese have much more North-African than Spaniards. At the V3 they have double that of Spaniards.

I am not sure what you are talking about, the Spanish and Portuguese cluster together neatly, the difference in the North African is not that much waaaay below 10% on both populations, the rest is a mirror image.

Rafael said...

Isidro, at Dodecad the Spaniards have only 3% of North-African while the Portuguese show an average of 7%. The lowest North-African in a Portuguese is the highest in Spaniards.

Anonymous said...

Basques have by far the lowest lactose intolerance (~0.5%) in the known world and the highest R1b1b2 in Europe (and possibly the world). Occam's razor says this is significant.

Ricardo said...

Rafael, you should test Galicians and other western Spaniards and see how much North African they have. So if you want a split, you should do it on western Spain / eastern Spain. I bet that way you would feel much better about your North African score.

princenuadha said...

@magu

"If J2 would all be Roman then we would expect it to come with the other haplogroups of Italy/Latium in similar frequencies."


That is some simplification. We don't know the haplogroup frequencies of ancient Italy, ancient Italy was not homogeneous, and there are the effects of drift and founder effect to consider.

The math you're talking about is based on bad assumptions.

As an aside I hope you realize that Spain could get j2 and g2a from Italy but also get g2a from another population.

"But then all J2 is not Roman, Q.E.D."

Who said that?

I'm with Kepler in that a significant amount of the j2 in Spain came from the combined migrants of phoenicians, Greeks, and Romans. The fact that j2 is more represented in southern Spain correlates with Phoenician colonization.

Maju said...

"We don't know the haplogroup frequencies of ancient Italy"....

By default, we must assume that they were roughly the same as today. Otherwise you must postulate a major demographic change based on the introduction of some specific haplogroups (J1 for instance) and a mechanism for such important and monochromatic demographic change happening.

Are you proposing any such important change?

(1) Yes? Please detail, so we can discuss it.

(2) No? We go back to the step before you happily questioned that today's frequencies are a good proxy for Ancient Italy.

"... based on bad assumptions".

It is you who is making bad claims without any clarity nor evidence.

"I hope you realize that Spain could get j2 and g2a from Italy"...

But (unless important founder effect(s) happened, which requires very specific conditions) they could not without other Italian haplogroups, like said J1 but also the adequate apportions of R1b subclades.

"... but also get g2a from another population".

Focus: which population, in which circumstances... just using vagueness as some sort of smoke bomb is not a recipe for scientific knowledge. Postulate a hypothesis, demonstrate it consistent with reality... otherwise it's like the kind of discourse of biblical creationists: pseudoscience.

"Who said that?"

I did.

"a significant amount of the j2 in Spain came from the combined migrants of phoenicians, Greeks, and Romans".

All of which should have also significant amounts of J1, to be found almost nowhere in Iberia.

This issue requires a very marked founder effect (either in Spain or Italy/alternate source population... or both), founder effect that is impossible to imagine after the Neolithic demographic changes.

So we must accept that J2, G2a, etc. are all Neolithic clades, because of the lack of J1 and other such "oriental" clades like R1b-L23(xR1b-S116)...

Anything else would require such a complicated model (that I have seen described nowhere anyhow) that it's extremely not parsimonious.

Occam's razor slices here.

Bolinaga said...

Greetings,

With regards to the Neolithic farmers in Iberia, I am G2a3a and my family is from the Iberian peninsula.

To be exact, my family is from Eskoriatza in the Valley of Leniz in Gipuzkoa in Euskadi in the north of Spain .

But I do not think we came in the Neolithic migrations.

I have a TMRCA (time to most recent common ancestor) with Armenians and a Turk of around 2,500 years BP based on 67 markers.

My surname is documented in Gipuzkoa since the late 1100s. We fought at las Navas de Tolosa (1212) and at the taking of Baeza (1227) and were Gamboinos in the War of the Bands (c 1350-c 1470) by way of example.

There are men from the Iberian peninsula (Galicia and Northern and Central Portugal) and Colombia who apparently share a unique marker with me and who have a TMRCA of less than 700 years BP.

I doubt we were part of the Neolithic Farmer Migration based on the TMRCA with the Armenians and the Turk. There are many others from Anatolia/Armenia/the Levant with whom I have a TMRCA more recent than the Neolithic.

There are several other Basque families from Gipuzkoa who have similar histories and are G2a or some flavor of J.

I am 7HEPW at Ysearch.

My Turkish cousin is URENY at Ysearch.

The closest Armenian cousin is EZFUZ.

You may run the TMRCA calculations yourself.

In the DNA groups for the Basque country, Spain and Portugal at FTDNA there are various Anatolian/Levantine Y Haplogroups from Iberia. Some of them are very likely Neolithic and some very likely not.

My gateway ancestor to Spain was very likely a Byzantine, probably a soldier, who accompanied Eudokia Komnene, niece of Manuel I Komnenos, when she came to marry the King of Aragon in the late 1170’s. The marriage did not happen but Eudokia married Raymond the Count of Toulouse (their daughter did marry a later King of Aragon) and many of her retinue stayed in the west. Andronikos I Komnenos would not have been nice to them had they gone home to Constantinople.

But then again, I could be the descendant of a Roman soldier guarding the salt mines at Leintz Gatzaga or something entirely unthought-of but not a Neolithic Farmer. As God wills and science progresses, I may someday know.

Regards,

Andres Bolinaga or,

as in Medieval times,

Andres Martinez de Borinaga

princenuadha said...

"By default, we must assume that they were roughly the same as today."

No. Someone came up with a theory that you said was wrong based on an ASSUMPTION that isn't necessarily true. I pointed that out.

Maju said...

You are not proposing a mechanism by which Italians would have changed so much in their genetic makeup in the last 2000 years. There doesn't seem to be any such mechanism or reason.

So when you just accuse me of what seems to read "assuming too much", you're acting like the typical creationist who demands a fossil for each generation or almost and then proposes a "counter theory" based only on faith: literal interpretation of the Genesis (in this case your blind faith on certain versions of the "molecular clock" or otherwise wishful thinking, right?)

By demanding impossibles from me and providing no evidence whatsoever for your conjectures you only make yourself look as a bible seller but not as any scientifically-minded person.

I cannot debate with that, mostly just ignore but, wait, he may mislead some naive accidental reader, so I have to expose the salesman's tricks you are using and demand that you provide a coherent hypothesis that has at least some empirical support.

princenuadha said...

>You are not proposing a mechanism by which Italians would have changed so much in their genetic makeup in the last 2000 years.

I don't need to. Pay attention to the argument.

"So when you just accuse me of what seems to read "assuming too much", you're acting like the typical creationist who demands a fossil for each generation or almost and then proposes a "counter theory" based only on faith:"

Did you seriously just compare leaving the door open for hap group frequency changes to arguing for creation theory... (Here's one "minor" difference. I've seen proof of haplogroup frequencies changing over time in a given location but I haven't seen proof of god creating a species.)

Wow magu, you really don't like being backed into a corner.

No, we do not default to saying there are no changes in a population's hap frequencies over a period of 2000 years. It is an assumption which would be wrong sometimes... obviously! Not only can you be incorrect in that assumption but it is not scientific or honest to suggest someone is definitively wrong when you base it on an assumption that hasn't been shown correct.

Also, don't lose track of the issue at hand. To do your math you not only assume that the Italian frequencies haven't changed but also that the Roman colonists in Spain had the same frequencies as Italians in general, and that there wasn't drift in Spain.

You can add this to your list of assumptions that may be wrong.

Maju said...

You are not leaving the door open, you are positing an alternative hypothesis, an extraordinary claim that demands extraordinary evidence. Just like creationism.

You don't even bother positing how that radical demographic change could have happened at all: you just attempt to cast doubt on the baseline scenario.

And why exactly, mind you, would Italians have lost all that R1b-S116 (from 50% to residual) and gained all that J1 (from anecdotal to very important in some regions), etc.? Was it the Lombard invasion from what is now Hungary? Or maybe the orgies of Pope Alexander VI?

Instead of just speculating that maybe somewhere, somehow in some undefined 2000 years massive demographic change may have happened... why don't you address the specific case of Italy since the 2nd Punic War, a very well known historical process?

You do not dare to face reality and that's a shame. Shame on you, Duadha, for not being able to assume the responsibility of your own claims or having the dignity of accepting that you are wrong.

Because that is what happens: that you are as wrong as a biblical creationist and, like those, you hide behind a smoke curtain of vagueness and generic doubt casting.

Maybe? No, we know well the history of Italy and Iberia for the last 2000 years, we do not need to defer to uncertainty: we have a degree of certainty that we'd wish for most other circumstances: it's all quite densely written history.

Maju said...

PS- And drift becomes nearly irrelevant after the Neolithic: the greater the population the tinier the drift (and vice versa). The populations of the regions involved in your hypothesis were in the millions, so the room for drift was very low: not zero maybe, but close enough.

Let's recapitulate, Duadha: you claim that a number of Romans or Italians caused a major demic impact, maybe as much as 10%, ONLY in the Y-DNA haplogroup J2, all the rest being unaffected: it doesn't matter what the gene pool of Italians was then, because there is a shadow of doubt that can be ritually cast on anything at convenience, right?

Not convincing at all, really. Funny, witty if you wish... but meaningless from a scientific point of view.
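The claim in the PS that drift shrinks as population grows can be illustrated with the textbook neutral Wright-Fisher result for a haploid marker such as Y-DNA: after t generations in a population of constant effective size N, the variance of a haplogroup's frequency is p(1-p)[1-(1-1/N)^t]. The sketch below uses that formula; the effective sizes, the 10% starting frequency and the 25-year generation time are illustrative assumptions, not figures from the thread, and effective size is usually much smaller than census size.

```python
def drift_sd(p, effective_size, generations):
    """Standard deviation of a haplogroup's frequency after neutral
    Wright-Fisher drift in a haploid population of constant
    effective size (no selection, migration or growth assumed)."""
    variance = p * (1 - p) * (1 - (1 - 1 / effective_size) ** generations)
    return variance ** 0.5


start_freq = 0.10           # hypothetical starting frequency
generations = 2000 // 25    # about 2000 years at an assumed 25 years per generation

# Hypothetical effective numbers of males, from village scale to millions
for n in (1_000, 100_000, 1_000_000):
    print(n, drift_sd(start_freq, n, generations))
```

Under these assumptions the expected spread is several percentage points for an effective size of a thousand and a fraction of a percentage point for a million, so how much room drift has depends heavily on which effective size one considers realistic.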

princenuadha said...

>You are not leaving the door open, you are positing an alternative hypothesis

Hypothesis (or guess in my case) is the key word! Ergo I'm not claiming to know* the answer.

None of that changes that you are making assumptions and acting as if your theory is correct.

Breogan said...

I'm #11 under the Dodecad Spanish samples and I'm 100% Galician.