May 06, 2010

Population structure in Hispanics (Bryc et al. 2010)

From the press release:
The study, published in the May 3 online issue of the Proceedings of the National Academy of Sciences, tested the genetic makeup of 100 individuals of Hispanic/Latino background in the New York tri-state area, including Dominicans, Colombians and Ecuadorians, as well as Mexicans and Puerto Ricans, the two largest Hispanic/Latino ethnic groups in the United States. Currently, Hispanic/Latino Americans comprise 15.4% of the total United States population, or 46.9 million people, and account for the largest ethnic minority in the United States.

"It is important to quantify the relative contributions of ancestry in relation to disease outcome in the Hispanic/Latino population," says study co-author Christopher Velez, a medical student at NYU School of Medicine. "This ethnically appropriate genetic research will be critical to the understanding of disease onset and severity in the United States and in Latin America. It will allow for the development of appropriate genetic tests for this population."

Through their analysis of the entire genome, the researchers found evidence of a significant sex bias consistent with the disproportionate contribution of European male and Native American female ancestry to present day populations. The scientists also found that the patterns of genes in the Hispanic/Latino populations were impacted by proximity to the African slave trade. In fact, Puerto Ricans, Dominicans and Colombians from the Caribbean coast had higher proportions of African ancestry, while Mexicans and Ecuadorians showed the lowest level of African ancestry and the highest Native American ancestry.

European migrant contributors were mostly from the Iberian Peninsula and Southern Europe. Evidence was also found for Middle Eastern and North African ancestry, reflecting the Moorish and Jewish (as well as European) origins of the Iberian populations at the time of colonization of the New World. The Native Americans that most influenced the Hispanic/Latino populations were primarily from local indigenous populations.
The paper has plentiful supplementary material online. Here is the result of the frappe analysis in a broader context:


As we can see, Hispanic individuals are a variable mix of Caucasoid (red/orange), Amerindian/Mongoloid (blue, teal) and Sub-Saharan (green) components. The orange Caucasoid component seems centered on Sardinia, while the red one is centered in NE Europe.

PNAS
doi:10.1073/pnas.0914618107

Genome-wide patterns of population structure and admixture among Hispanic/Latino populations

Katarzyna Bryc et al.

Abstract

Hispanic/Latino populations possess a complex genetic structure that reflects recent admixture among and potentially ancient substructure within Native American, European, and West African source populations. Here, we quantify genome-wide patterns of SNP and haplotype variation among 100 individuals with ancestry from Ecuador, Colombia, Puerto Rico, and the Dominican Republic genotyped on the Illumina 610-Quad arrays and 112 Mexicans genotyped on Affymetrix 500K platform. Intersecting these data with previously collected high-density SNP data from 4,305 individuals, we use principal component analysis and clustering methods FRAPPE and STRUCTURE to investigate genome-wide patterns of African, European, and Native American population structure within and among Hispanic/Latino populations. Comparing autosomal, X and Y chromosome, and mtDNA variation, we find evidence of a significant sex bias in admixture proportions consistent with disproportionate contribution of European male and Native American female ancestry to present-day populations. We also find that patterns of linkage-disequilibria in admixed Hispanic/Latino populations are largely affected by the admixture dynamics of the populations, with faster decay of LD in populations of higher African ancestry. Finally, using the locus-specific ancestry inference method LAMP, we reconstruct fine-scale chromosomal patterns of admixture. We document moderate power to differentiate among potential subcontinental source populations within the Native American, European, and African segments of the admixed Hispanic/Latino genomes. Our results suggest future genome-wide association scans in Hispanic/Latino populations may require correction for local genomic ancestry at a subcontinental scale when associating differences in the genome with disease risk, progression, and drug efficacy, as well as for admixture mapping.

Link

57 comments:

Gioiello said...

"The Y chromosomal results also demonstrate the insufficiency
of the paradigm of European males and Native American/African
females to capture the complexity within the Latin American
populations. For example, we find Y chromosomal haplotypes in
Hispanic/Latinos with presumed origins in the Middle East and
Northern Africa. Given that historical documentation suggests
that most of the non-African and non–Native American contribution
to admixed Hispanic/Latino populations is from Southwest
Europe, this suggests that the contemporary populations
inherited these Y chromosomes from Europeans who, in turn,
were descended from Middle Eastern or North African men.
Several historical events could have led to the acquisition by
Europeans of non-European haplotypes, perhaps during the
period of the Roman Empire when the Mediterranean Sea
behaved as a conduit (not a physical barrier) between Europe, the
Middle East, and North Africa or by Sephardic Jews or Moorish
Muslims during the European Middle Ages/Islamic Golden Age.
Alternatively, the presence of non-European Y chromosomal
haplotypes originating from the Middle East and North Africa
could represent the result of Iberian Jews and Muslims (themselves
admixed) fleeing the peninsula for New World territories
in response to discriminatory policies that strongly pressured
both communities at the termination of the Reconquista. Essentially,
the diversity of haplotypes in the Y chromosomes in Latinos
reflects not only population dynamics from the 15th century
onward, but also the historical trends of population movement
occurring across the Atlantic during centuries prior."

Apart from the grossness of thinking that hg. J is "only" from the Middle East etc. (we have documented that there have been J1 and J2 in Europe for thousands of years), these results are very interesting: the R-M269 ones, above all from Iberia, once more lack the most ancient haplotypes, as I have said many times. There is in the sample only one R-P25, from Puerto Rico, clade: 16,12-14,14,14,30,23,11,13,13,12,11,15,15.

Gioiello said...

This R-P25 matches a Puertorican and is close to an Hispano-American from Illinois (US) who had DYS385=11-14. Very interesting.

Gioiello said...

Put on Ysearch from SMGF Del Toro (PP26T). He matches all the other R1b1*-A2. They have the same ancestor and are rooted in Puerto Rico.

Maju said...

I'd use K=6 in this case, really, because the orange component in Europeans only appears at K=7 and we don't know how it behaves at deeper K levels.

For most purposes, K=3 is equally valid in this particular case.

In other studies, when Europeans are clustered in K=2, the real clustering doesn't yet show up. You'd need at least five or more clusters (K depth), otherwise they tend to look like a random mixture of Finnish and Greek, so to say (but it's just an illusion dispelled at greater depths).

I would also have favored a greater representation of Iberians (intently disproportionate) in order to be able to recreate such Iberian-specific cluster (that we know exists from other studies) and try to discern the differential impact of Iberian and non-Iberian West Eurasians in Latin America.

I see no point in sampling so many different Europeans for a Latin American study: an Iberian sample, a Basque sample, an Italian sample and the HapMap CEU one would have been more than enough. However I would have added North African and Sephardic samples out of curiosity.

One thing that intrigues me, anyhow, is the second Native American component and the apparent consistent presence of both NA components among Europeans since K=3. Is it Siberian legacy or some sort of random noise?

Dienekes said...

One thing that intrigues me, anyhow, is the second Native American component and the apparent consistent presence of both NA components among Europeans since K=3. Is it Siberian legacy or some sort of random noise?

Both. Amerindians are not a good representative for the East Eurasian component in some Europeans, so the fit is not good.

It's similar to what happened when a certain DNA company used NW Europeans as representatives for Caucasoids, the result was that people like Larry David came out as significantly "Native American".

Compare with a paper using a similar number of markers but a wider sampling of global populations.

Onur Dincer said...

One thing that intrigues me, anyhow, is the second Native American component and the apparent consistent presence of both NA components among Europeans since K=3. Is it Siberian legacy or some sort of random noise?

No, Maju, it just shows how incompetent this group of scientists are in completely separating Europeans from Native Americans (also from sub-Saharan Africans looking at the sub-Saharan African components among Europeans). Of all the inter-continental STRUCTURE analyses I've seen so far that includes Europeans, this is the worst in this respect.

Maju said...

"Compare with a paper using a similar number of markers but a wider sampling of global populations".

Thanks.

Still the Russian apportion of Oriental genes is almost equally split in that graph between the East Asian (orange) and Native American (purple) components - and even has some South Asian blue one. Native Americans are surely most closely related not to East Asians specifically but Siberian peoples, what leaves me with the doubt.

...

"... it just shows how incompetent this group of scientists are in completely separating Europeans from Native Americans (also from sub-Saharan Africans looking at the sub-Saharan African components among Europeans)".

You don't seem to understand what K-means analysis is about. It's not meant to neatly differentiate people by continental or "racial" origins (there are much simpler "chips" for that) but to discern genetic clusters (not people) as well as possible.

If people are mixed, that's not the problem of K-means analysis; only if it could be demonstrated that they are not mixed in such a way and that the algorithm is producing wrong results would it be such a problem.

But maybe it's just that bloods are mixed and have been so "all the time". I have no problem with that, just curiosity.

aargiedude said...

The genetic cluster charts were made using "frappe". What's that? Same as STRUCTURE, different... ?

They found an R1b1* sample from Puerto Rico which makes an interesting revelation. There are several of them in ysearch, they all have the same haplotype as this one, but I had thought the ysearch entries were simply one guy who had entered his own sample many times, which happens. This study shows there's a remarkable case of genetic drift or more likely founder effect going on here. The study tested 26 Puerto Ricans and found 1 of these samples, or 4%. In ysearch there are 8 out of 195 samples, or 4%. It fits. Curiously, there's a non-Puerto Rican origin member of this group, from Venezuela. So, we have a remarkable case here, given that R1b1* in Iberia is extremely rare. Funny enough, this would make Puerto Rico the region of the world with the highest frequency of the most basal clade of R1b (V88-).
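
The "it fits" arithmetic above can be checked with a short sketch. The counts (1 of 26 and 8 of 195) come from the comment; the use of a Wilson score interval is an assumption added here only to illustrate how wide the uncertainty is for a single carrier in 26 samples, not something the commenter or the paper computed.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Counts as quoted in the comment.
counts = {
    "study": (1, 26),      # R1b1* carriers among the Puerto Ricans tested
    "ysearch": (8, 195),   # R1b1* entries among Puerto Rican ysearch samples
}

for name, (k, n) in counts.items():
    lo, hi = wilson_interval(k, n)
    print(f"{name}: {k}/{n} = {k / n:.1%}, interval {lo:.1%} to {hi:.1%}")
```

Both point estimates round to 4%, and each falls inside the other's interval, so the two sources are at least compatible. The interval for the study sample is wide, so one carrier in 26 cannot confirm the frequency by itself.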

Onur Dincer said...

AN = NA

Maju said...

Funny, sure. But it's most likely an R1b1a founder effect from Sardinia, North Africa or even Central Africa.

Puerto Rico was never a major attractor but rather a peripheral area within the Castilian colonial empire so any colonists or slaves or whatever had a good chance to leave their mark.

Maju said...

And yah, frappe and structure are like the same thing but different for whatever algorithmic reason. It's just a program that does the hard work for you very fast and provides a neat colored chart.

aargiedude said...

So frappe is the method, and STRUCTURE is the software?

The R1b1* lineage belongs to V88-, which is found in Europe, Anatolia and the Middle East. North Africa and probably Sardinia only have V88+.

Average Joe said...

The analysis seems to show that Northwestern Europeans and Northeastern Europeans have similar levels of red and orange once again showing that Northern Europeans are more closely related to each other than they are to Southern Europeans. In Europe, genes seem to have flowed more easily from east to west than from south to north.

Onur Dincer said...

In Europe, genes seem to have flowed more easily from east to west than from south to north.

That seems to be the general rule everywhere in the world (and not only restricted to humans but seems to be the general rule among almost all species in the world).

Maju said...

I think frappé is another software, same method (at least in the essentials) as structure.

Maju said...

"The R1b1* lineage belongs to V88-"

So you mean true R1b(xR1b1a, R1b1b)? Curious.

Onur Dincer said...

recent European admixture? - this may also explain, at least partially, the very visible existence of the NA components among Europeans

Btw, I am not saying here that Europeans have NA admixture, I am merely saying that the existence of European admixture (most probably completely or almost completely from the last 500 years) among the analyzed supposedly pure NA groups (at least among some of them) may have distorted the componental distributions of the analyzed European groups in favor of ostensibly more NA containing componental distributions than their real componental distributions. So we may be face to face with a statistical illusion.

aargiedude said...

R1b1 clusters

Marnie said...

frappe is a maximum likelihood statistical method.

I'm still trying to get to the bottom of the algorithm of STRUCTURE.

UUUh. Y' know, it ain't such a good idea to boldly accept the results of a software program without understanding the algorithm and assumptions within the program.

Polak said...

Programs like Structure don't actually show admixture. What they show is a probability of membership in certain clusters.

So the fact that Europeans show a small but clear probability of membership in the Native American based clusters doesn't necessarily mean admixture.

The fact that this probability is even across Europe, but rises slowly towards the Northeast, shows it to be the result of ancient links between Eurasians, and the gradual cline in diversity that was always present due to the earliest migrations.

The only group that doesn't fit the pattern of that gradual cline in that Frappe analysis are the "Russians", who are actually the HGDP North Russians from Vologda, and from other studies we know that they do actually show admixture from Siberia and Central Asia. Finns would show similar behavior on this plot, but only one Finn was sampled here, and he's in the Europe NNE set.

Basically, that's what admixture is; it's when a group is out of whack with the old clines of genetic diversity around the world due to more recent gene flow.
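
The point about membership probability versus admixture can be illustrated with a toy calculation. This is a minimal sketch assuming two clusters with known allele frequencies, independent loci and Hardy-Weinberg proportions; the frequencies and the genotype are invented, and this is not what frappe or STRUCTURE actually compute internally.

```python
import math

# Hypothetical frequencies of allele "1" at five SNPs in two clusters
# (values invented for illustration only).
cluster_freqs = {
    "A": [0.9, 0.8, 0.7, 0.9, 0.6],
    "B": [0.6, 0.5, 0.4, 0.7, 0.3],
}

def membership_probabilities(genotype, freqs, prior=None):
    """Posterior P(cluster | genotype) for diploid genotypes coded 0, 1 or 2,
    assuming Hardy-Weinberg proportions and independent loci."""
    names = list(freqs)
    prior = prior or {k: 1 / len(names) for k in names}
    log_post = {}
    for k in names:
        ll = math.log(prior[k])
        for g, p in zip(genotype, freqs[k]):
            # Binomial probability of g copies of allele "1" out of 2.
            ll += math.log(math.comb(2, g) * p**g * (1 - p)**(2 - g))
        log_post[k] = ll
    top = max(log_post.values())            # subtract the max for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_post.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

individual = [2, 2, 1, 2, 1]   # copies of allele "1" at each SNP
print(membership_probabilities(individual, cluster_freqs))
```

With these made-up numbers the individual gets roughly a 90% probability for cluster A and 10% for cluster B, even though nothing in the setup involves mixing. A non-zero share in a second cluster can simply reflect allele frequencies that the clusters have in common.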

Onur Dincer said...

The only group that doesn't fit the pattern of that gradual cline in that Frappe analysis are the "Russians", who are actually the HGDP North Russians from Vologda, and from other studies we know that they do actually show admixture from Siberia and Central Asia.

What about the Adygei (who are a NW Caucasian speaking people from northwest Caucasus)? They also appear as clearly different from the rest of the analyzed European populations in their proportion of non-European component distribution.

Furthermore, according to the genetic studies I have read about Adygei people they also show some signs of admixture from Siberia and Central Asia (also from South Asia).

Marnie said...

Thanks Polak. Also appreciated your comments on Persians.

Maju said...

Thanks Argiedude. Do cluster names A1, A2, B1, B2 mean haplotype clusters within R1b1*?

"What they show is a probability of membership in certain clusters".

Based on the affinity of the genome with such cluster. In the end it's about the same, though I reckon that the clinal approach (as opposed to the cluster approach) has its interest.

"What about the Adygei"...

If you follow the link Dienekes posted above you can see that they show high presence of the "South Asian" component (also high but not so much in West Asia) but only quite low of the East Asian/Amerindian component, lower than Vologda Russians certainly.

Marnie said...

For those of us still wondering about STRUCTURE:

download and top description
http://pritch.bsd.uchicago.edu/structure.html

paper
http://pritch.bsd.uchicago.edu/publications/structure.pdf

STRUCTURE uses a Markov Chain Monte Carlo method with Gibbs sampling. It appears to use different algorithms for populations without and with admixture. In this method, K is the number of populations or clusters that the algorithm is permitted.

A quote from the paper:

"The problem of inferring the number of clusters, K, present in a data set is notoriously difficult." The authors suggest approximating K in an ad hoc and computationally convenient way.

Newer versions of STRUCTURE appear to employ more sophisticated methods to estimate genotype and resolve genotype ambiguity.

[http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo]

aargiedude said...

Thanks Argiedude. Do cluster names A1, A2, B1, B2 mean haplotype clusters within R1b1*?

Yeah, the clusters are A1, A2, and B1 to B5. I could just as well have written "A2 (Puerto Rico)" and "A2 (Jewish)" as A2a and A2b.

A is V88-, B is V88+.

Cuah123 said...

Here is another study in Mexico:
http://www.ncbi.nlm.nih.gov/pubmed/18161845?ordinalpos=1&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_SingleItemSupl.Pubmed_Discovery_RA&linkpos=1&log$=relatedarticles

Up to 60-68% European fathers in the western states of Jalisco and Chihuahua, with almost the inverse towards the center of the country, where Native fathers are the predominant group.

In my town in Mexico the R1b Y haplogroup appears to be predominant. I myself am R1b1b2 with a Frisian prediction (final analysis on the way) with 500 years of presence. Also I1 ud is my mother's father's and J2a4b1 my mother's mother's father's. Mothers are mtDNA D.

The one issue I have with this study is that historically the different "colonist" waves to the New World are not best represented in New York. New York was not a port of passage for the first Spanish-speaking Europeans. (Unless this is one of many studies).

Cuah123 said...

[http://www3.interscience.wiley.com/journal/117873902/abstract?CRETRY=1&SRETRY=0]

Sorry this is a better link.

Anonymous said...

Any study that mentions haplogroups as indicators of recent ancestry, as in the last 3,000 years is just lowering the academic tone to dirt level. There are many studies, contradictory and collaborative, which contend that certain haplogroups found in Europeans and other peoples based on present day frequencies indicate that those haplogroups had exactly the same frequencies, genetic diversity and origins going back to the Paleolithic Age. Going by that logic, because Christmas is celebrated every year and Santa Claus is always shown, Santa must have existed. That folks is simplistic, even downright moronic and what is more cannot be proven to be the case.

It is about time these scientists got serious and stop using data, modern frequencies of haplogroups, in a most pseudo scientific manner, and got back to real science and real scientific methods.

Interesting that the non European elements found in the frappe results in Europeans, the Amerindian and the Pygmy African did not receive elaboration or explanation yet the perennial Jewish or Moorish bull does.

I also think it is about time the cast of fools, the samples of various ethnic groups, was changed. Who is really interested in Basques, Tuscans, North Russians and Mormons from Utah? I am not. Get some other ethnic groups and real Europeans, not Americans of uncertain pedigree like those Mormons.

Marnie said...

Ancient DNA shows interbreeding between Homo sapiens and Neanderthal

http://www.washingtonpost.com/wp-dyn/content/article/2010/05/06/AR2010050604423.html

eurologist will be happy.

Gioiello said...

Very good work, Argiedude. Not having more excel on my PC, I am seeing your spreadsheet only now at my school.
It seems that Puertorican and Jewish A2 have suffered a founder effect and they are worth each for 1.
The ancestor of R1b1b2 is probably among the European YCA 18-23 or 18-22 (the Italian one).

eurologist said...

eurologist will be happy

OT - you can comment more on that in that particular thread - but what in particular have I stated in the past that would make you think so? Your memory is probably better than mine... ;) Just curious, because I tend to "just throw out there" a lot of thoughts and ideas...

I do have some specific ideas, but they are certainly not of the "multiregional" variety.

Gioiello said...

Having Italy YCA=18-22 and 23-23, I think we can surely affirm that the Iranian and the Eastern A2 clade are derived from the European one by a RecLOH. If the Italian one is derived from the North-West European or vice versa we can only hypothesize. But I bet yet on Italy.

Onur Dincer said...

ONUR DINCER: "What about the Adygei (who are a NW Caucasian speaking people from northwest Caucasus)? They also appear as clearly different from the rest of the analyzed European populations in their proportion of non-European component distribution.

Furthermore, according to the genetic studies I have read about Adygei people they also show some signs of admixture from Siberia and Central Asia (also from South Asia)."

LUIS ALDAMIZ: "If you follow the link Dienekes posted above you can see that they show high presence of the "South Asian" component (also high but not so much in West Asia) but only quite low of the East Asian/Amerindian component, lower than Vologda Russians certainly."


Maju, I didn't say Adygei are NA admixed, I only said they show signs of admixture from Siberia, CA and SA without referring to the proportions of each of these three places of admixture. I already know that the main non-European component in Adygeis is SA, I think this points to their relationship to the Iranian corridor, though cluster analyses also on Iranian populations is needed to confirm this. Btw, the small EA component in Adygeis may be a remnant of some Turkic admixture.

Gioiello said...

Argiedude, as I think I have demonstrated in the past on this forum with regard to R1b1b2(L23-), also in these R1b1*s there are probably many crypto-Italians, above all among the South Americans. But also the Puertoricans could be of Italian descent, in spite of their Spanish surnames. Ask Milesius, who wrote on "forums-dna", and who is an expert of Puertorican ancestry.

That the modal of R1b1b2(L23-) could be 18-23 and not 19-23 could be demonstrated by the fact that Italians (the friend Joe Merante and an Italian I put on Ysearch from SMGF: Ferrero, then Italy from South to North) are the unique who have 17-23 and in Italy there are also R1b1b2(L23-)s with 18-23.

Onur Dincer said...

I think this points to their relationship to the Iranian corridor

In view of the fact that the SA component of the analyzed Adygeis is much more prominent than the analyzed Arabic-speaking West Asian populations of course.

Onur Dincer said...

though cluster analyses also on Iranian populations is needed to confirm this

We also need to make cluster analyses of the West Siberian populations and Turkic and Iranic speaking CA populations to test other alternatives.

Onur Dincer said...

Uygurs are a Turkic speaking CA population, but they live too much in the east to have any relevance to Adygeis, I think geographically western Turkic CA populations (like Turkmens, Uzbeks, Karakalpaks, western Kazaks) are more relevant in this regard.

Onur Dincer said...

Correction: "In view of the fact that the SA component of the analyzed Adygeis is much more prominent than the analyzed Arabic-speaking West Asian populations of course."

It would be thus:

"In view of the fact that the SA component of the analyzed Adygeis is much more prominent than that of the analyzed Arabic-speaking West Asian populations of course."

Kepler said...

Did they get paid to carry out this study? I don't see the added value. I have read quite a lot of studies carried out already over the years on population genetics in Venezuela, Argentina, Mexico, Dominican Republic. What's new?

And they all have shown over and over again what any Latin American who has had contact with other Latin Americans know: that coast Colombians tend to be more sub-Saharan, as Dominicanos, that Mexicans tend to be more Indian, etc. and that the European part predominates big time on the paternal side with native American as first component on the maternal side, plus a couple of other details. Nothing new here.

By the way, as I mentioned already, I have a J2 as male haplogroup, the pattern looks like matching one of http://en.wikipedia.org/wiki/Y-chromosomal_Aaron and I have no knowledge of Jewishness whatsoever in my family. My closest matches are from Italy, Germany, some Spain and one Lebanon and Saudi Arabia...all over the "Old World" map...but then we know how Palestinians, Tunisians, etc, are underrepresented in the whole samples.

Did they want to prove we were all marranos that escaped to America?
What else?

I think this study lacked focus.

Marnie said...

Hey Kepler,

To my eye, this paper simply quantifies what even the most casual observer will intuitively understand when observing Latino populations. I guess it is helpful to understand a little about "disease risk, drug efficacy, etc, etc." but with the current economic and healthcare climate in the US, Mexico and Central America, it's hard to see how this data could be used to implement an effective drug and disease risk policy. I'm sure the health insurance companies will be combing over the possibilities.

Marnie said...

Regarding this paper, populations in disequilibrium, and statistical methods, I came across this abstract:

http://content.karger.com/produktedb/produkte.asp?typ=fulltext&file=000119107

"Review and Evaluation of Methods Correcting for Population Stratification with a Focus on Underlying Statistical Principles"

Hemant Tiwari, et al.

Haven't paid for and downloaded the full text and I don't happen to subscribe to the International Journal of Human and Medical Genetics.

Cuah123 said...

@ Kepler, I too have a big issue with the [pick a religion] focus of these tests. Another study showed that Southwest Spanish Americans were statistically no different than Iberians (even so, the wish persists among certain people to be descendants of conversos/marranos/crypto-Jews etc...). I've been a big fan of this site and the excellent commentary that has brought a spotlight to this issue. For whatever reason, I have never seen a study showing the effects of the major empires of the Mediterranean on Iberia. In other words, why don't these DNA sites have tests for "Greek" or "Roman" or Anatolian ancestry? (Or is there money to be made otherwise, so that studies are skewed?) On one particular Mexican ancestry site there appears to be an overwhelming wish to be Jewish, even after evidence to the contrary is shown (DNA, patronym usage, birth certificates, even proving certain families were part of the Order of St James...nothing works). One potential author on this site claims he's using ancient scrolls and a book called the "green book" to distinguish converso Jewish Mexican families (a book which has no merit now or then), or the ancient scrolls which go all the way back to Noah. I called it a fantasy and was banned from the site.

Now you'll even see a site or two claiming tortillas de harina (flour tortillas) as being Jewish! Even if they are made with lard!

Andrew Oh-Willeke said...

@ Kepler:

Studies that have the same general conclusion constitute both (1) replication, which is a core part of the scientific process, and (2) expand the sample size and thereby the accuracy of prior studies via meta-analysis.

Also, aDNA studies like this one, which are more rare because they used to be expensive, resolve distinctions between admixture in individuals v. admixture of populations. Y-DNA and mtDNA studies can distinguish between the two.

Finally, aDNA studies like this one provide more resolution of the underlying subgroup population structures beyond mtDNA and Y-DNA. This is useful, for example, in getting a sense of what kind of population structure may have existed in indigenous American populations in a historical era like ours when there are few, if any, pure blooded indigenous Americans left. In Latin America, where the paternal line contribution was much greater than the maternal line contribution, aDNA studies like this one allow us to infer, for example, information about indigenous paternal line contributions that could be determined in no other way.

aargiedude said...

Gioiello, that was very interesting! Indeed, there's definitely a cluster of M269* with YCAII = 17/23, and it is very curious that R1b1*-A2 has YCAII = 18/23. There's one problem, though. R1b1b1 has YCA = 19/25, suggesting that a jump from YCAIIa=18 to YCAIIa=19 would have occurred already in R1b1b (P297), and then M269* would have appeared, and one of its lineages would have suffered a freak mutation from YCAII = 19/23 to YCAII = 17/23.

Or maybe, R1b1b inherited 18/23 from R1b1*, and so did R1b1b1 and R1b1b2, and then R1b1b1 and R1b1b2 independently mutated to 19/23, while this cluster retained the ancestral value 18/23. It is very curious how of all R1b1b2 lineages the only one that has modal 17/23 comes from the most basal (and very rare) of existing R1b1b2 clades.

The cluster is confirmed thanks to the following values that clearly distinguish it from other M269* samples:

385a = 12
448 = 20
YCAII = 17/23
444 = 13
446 = 14

426=11 helps distinguish this cluster from non-M269* R1b1b2 samples.
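
Screening a list of haplotypes against the values listed above could be sketched as follows. The signature values are the five listed in the comment; the function name, the `max_mismatches` tolerance and the two sample haplotypes are hypothetical and invented for illustration.

```python
# Signature values as listed in the comment, using its marker shorthand.
CLUSTER_SIGNATURE = {
    "385a": 12,
    "448": 20,
    "YCAII": (17, 23),
    "444": 13,
    "446": 14,
}

def matches_signature(haplotype, signature=CLUSTER_SIGNATURE, max_mismatches=0):
    """Return True if the haplotype differs from the signature at no more than
    max_mismatches markers. Markers missing from the haplotype count as mismatches."""
    mismatches = sum(
        1 for marker, value in signature.items() if haplotype.get(marker) != value
    )
    return mismatches <= max_mismatches

# Hypothetical haplotypes, not real ysearch or SMGF entries.
samples = {
    "sample_1": {"385a": 12, "448": 20, "YCAII": (17, 23), "444": 13, "446": 14, "426": 11},
    "sample_2": {"385a": 11, "448": 19, "YCAII": (19, 23), "444": 12, "446": 13, "426": 12},
}

for name, hap in samples.items():
    print(name, matches_signature(hap))
```

Allowing `max_mismatches=1` would also catch borderline samples that miss a single marker, at the cost of more false positives.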

The cluster is centered in Italy. I think the percentages might be 0,5% in the south and Sicily, and 0,2% in the north. But there's a ysearch sample, 8334H, who isn't Italian, probably North European, and who seems to belong, though maybe not (I think he does). And there's an smgf sample from Lebanon, Jlelati, who is unquestionably a member of the cluster, having every single STR value I outlined previously.

I don't know what to say, I think there might be a chance this cluster could be the most basal grouping of R1b1b2* (M269*). All because of that 17/23 oddity. 446=14 is another coincidence with R1b1*-A2, but there were a lot of misses, too; that 446 coincidence could easily be an inevitable coincidence that we would find in any other cluster just due to sheer luck. It's the 17/23 value that doesn't look like sheer luck.

To get a grasp of how rare these values are in R1b1b2, YCAIIa = 18 is 1% of R1b1b2-ht15 samples, and YCAIIa = 17 is 0,1%. There doesn't seem to be any cluster of R1b1b2-ht15 that has YCAIIa = 18. So yes, this is very intriguing. A legacy from R1b1* ?? We need a walk-the-y test on these samples.

Kepler said...

Andrew,

I think at this stage they were trying to find about everything on a general study of people coming from all over Latin America. I don't think they are doing much about expanding the size, we are talking about 2 hundred something individuals from 5 countries...and with that they expect to find out about indigenous structures, Middle Eastern background and all the rest?

To me, what they are doing is basically adding some genetic samples to the Ecuadorian sample, to the Mexican sample, to the Dominican sample, etc.

Do you want to do some study about native American substrata among non native Latinos?

I found this way more interesting:

http://webcache.googleusercontent.com/search?q=cache:6BEcPdmkSKQJ:www.scielo.org.co/pdf/abc/v14n1/v14n1a12.pdf+diversidad+in+Venezuela+haplogrupos&cd=1&hl=en&ct=clnk&gl=be

It is in Spanish, by Venezuelan researchers. They first selected 4 locations in Northwestern Venezuela and from there individuals with all grandparents coming from the same area. They determined the maternal haplogroups of native American origin and then they tried to see if there was some pattern that may relate to what we know from history about the populations that existed in those areas.
Those areas have been mixed, and people there have spoken only Spanish for several centuries, and yet we know some may have had Carib populations and some Arawak ones.
The study was not as detailed as I would have wanted, but it brought things a little bit forward.

It would have been interesting had they performed aDNA studies as well, but at least they focused on something.

Only 1% of Venezuela's population is "purely native American" and they live very far from the studied regions. Still, within that 1% we have about 29 languages from about 7-8 language families.

And the NY blokes wanted to find native American patterns by studying "220 Latinos from five different countries in a NY area"?

I still think at this stage they should have had better focus.

Gioiello said...

Argiedude, 8334H (Theile) probably has a German surname and perhaps comes from the Rhaetian region, like Tarnuzzer (UY5NN), who comes from the Grisons (Switzerland).
The other Italian close to Merante is Prowting, who is of Italian extraction and is very close to Merante.

Merante, who has been my friend for a long time and has always said that I am the greatest expert in genetics and genealogy in the whole world (?) (I have always got his ancestry right, like his mitochondrial), did 23andMe. I have invited him many times to post his data to "Adriano's spreadsheet", but he hasn't done so yet. He is living in Russia now, though perhaps he is more Vincent's friend than mine. Many people are playing with his data.

Gioiello said...

What do you think, Argiedude? If we compare Merante (QXGKN), Prowting (JUDZ8), Donato (796ME) and Ferrero (FMTPA), the first three from Calabria and the fourth from Piedmont, they have an MRCA between 1,000 and 1,200 YBP.
If we add Jlelati (G5CCP) from Lebanon, they have an MRCA between 1,350 and 1,620 YBP.
If we add Thiel (8334H), probably from Switzerland (Rhaetia), we have an MRCA between 1,550 and 1,860 YBP.
Do you think it is more likely that this cluster is from Italy or from the Middle East?

Andrew Oh-Willeke said...

"[W]ithin that 1% we have about 29 languages from about 7-8 language families."

Maybe. Most S. American linguists are very reluctant to lump as opposed to split language families. They set a very high burden of proof to establish one, and have very thin data. The only literary languages in the New World were in Central America (on up into Mexico) and Europeans who did any documentation of languages (mostly Catholic missionaries) came considerably more recently than 500 years ago, so there are no historic versions of the current languages to work with.

Limited to the current versions of non-literary languages with the extent of outsider knowledge of them being pretty thin, it is safe to assume that further data can only lead to more lumping, rather than more splitting. I wouldn't be at all surprised to see the number of language families fall to three or fewer eventually.

Gioiello said...

I am seeing that there is another Italian who has YCAII=17-23: Urso from Mesoraca, Calabria (Ysearch VU772). Unfortunately he hasn't updated his results.

Gioiello said...

Comparing Urso with the other Calabrians (Merante, Prowting and Donato) and the Piedmontese Ferrero, who are closely related, they have an MRCA of about 2,800 YBP. This argues, I think, for an Italian origin of this clade of R1b1b2*.

Gareth said...

Interesting, according to this study, about 30% of both Ecuadorians and Colombians have a Y-chromosome indicating Middle Eastern / North African ancestry. I have always thought that many Colombians have very Middle Eastern looking features.

C.L. Malloy said...

Maju: With respect to the R1b1*, I know someone who belongs to this Haplogroup and has an oral family history, and a last name to back it up, of being descended from the "Black Conquistador" Juan Valiente.
See http://en.wikipedia.org/wiki/Juan_Valiente
May be spurious, but interesting nonetheless.

Gioiello said...

Malloy, we know that R1b1*/V88+ and some of its subclades are widespread in Africa (see the paper by Cruciani), so an African R1b1* is likely. The problem is where this haplogroup came from: from Asia via the Middle East, or from Europe (Italy or Spain)?