April 02, 2010

Little genetic differentiation in different language groups of the Cross River region of Nigeria

BMC Evolutionary Biology 2010, 10:92 doi:10.1186/1471-2148-10-92

Little genetic differentiation as assessed by uniparental markers in the presence of substantial language variation in peoples of the Cross River region of Nigeria

Krishna R Veeramah et al.

Abstract (provisional)

The Cross River region in Nigeria is an extremely diverse area linguistically with over 60 distinct languages still spoken today. It is also a region of great historical importance, being a) adjacent to the likely homeland from which Bantu-speaking people migrated across most of sub-Saharan Africa 3000-5000 years ago and b) the location of Calabar, one of the largest centres during the Atlantic slave trade. Over 1000 DNA samples from 24 clans representing speakers of the six most prominent languages in the region were collected and typed for Y-chromosome (SNPs and microsatellites) and mtDNA markers (Hypervariable Segment 1) in order to examine whether there has been substantial gene flow between groups speaking different languages in the region. In addition the Cross River region was analysed in the context of a larger geographical scale by comparison to bordering Igbo speaking groups as well as neighbouring Cameroon populations and more distant Ghanaian communities.

The Cross River region was shown to be extremely homogenous for both Y-chromosome and mtDNA markers with language spoken having no noticeable effect on the genetic structure of the region, consistent with estimates of inter-language gene flow of 10% per generation based on sociological data. However the groups in the region could clearly be differentiated from others in Cameroon and Ghana (and to a lesser extent Igbo populations). Significant correlations between genetic distance and both geographic and linguistic distance were observed at this larger scale.

Previous studies have found significant correlations between genetic variation and language in Africa over large geographic distances, often across language families. However the broad sampling strategies of these datasets have limited their utility for understanding the relationship within language families. This is the first study to show that at very fine geographic/linguistic scales language differences can be maintained in the presence of substantial gene flow over an extended period of time and demonstrates the value of dense sampling strategies and having DNA of known and detailed provenance, a practice that is generally rare when investigating sub-Saharan African demographic processes using genetic data.



Marnie said...

West African Highlife:


aargiedude said...

Rare deep-rooting Y chromosome lineages in humans: lessons for phylogeography (Weale, 2003)
From page 2 of the PDF:

"The new haplogroup, labeled DE* according to the nomenclature of the Y Chromosome Consortium (2002), has been found in 5 Nigerians (from different villages, languages, ethnic backgrounds, and paternal birthplaces) from a data set of 8000 men worldwide, including 1247 Nigerians."

This new 2010 study tested 1184 Nigerians, suspiciously close to the number of samples cited by Weale above, which come from a previous study that he never references! I've asked before whether anyone knows anything about that study, but got no answer. Curiously, the samples in this 2010 study were tested for a battery of SNPs and STRs typical of those used in the early 2000s, notably O2b, an oddball haplogroup that was tested in several older studies, when we didn't yet have a clear grasp of things. It looks like these 2010 samples are really the mysterious pre-2003 samples mentioned by Weale, which had never been published, and that they were re-tested for E1b1a8, a new haplogroup found in ~2007. On the other hand, if the samples had been reused from a previous study the authors would presumably have said so; instead, the way they describe them suggests they were collected recently for this study.

Ebizur said...


Haplogroup DE-YAP(xE-SRY4064) has been found in six individuals from southeastern Nigeria (including 1/101 = 1.0% Oron, 2/209 = 1.0% Igbo, and 3/516 = 0.6% Ibibio), zero individuals from northwestern Cameroon (including 0/34 Tikar, 0/117 Bamun, and 0/118 Aghem), and zero individuals from Ghana (including 0/155 Akan and 0/88 Ewe) in the present study. The authors have not reported testing for any marker of haplogroup D.
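With counts this small, the percentages carry a lot of sampling uncertainty. The following is a minimal sketch, not part of the study, that recomputes the three frequencies quoted above and attaches an approximate 95% Wilson score interval to each. The function names and the choice of the Wilson interval are assumptions made for illustration; the counts are the ones given in this comment.

```python
from math import sqrt

# Counts quoted in the comment above: (DE-YAP(xE-SRY4064) carriers, sample size)
COUNTS = {
    "Oron": (1, 101),
    "Igbo": (2, 209),
    "Ibibio": (3, 516),
}


def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def summarise(counts):
    """Return {population: (percent, lower percent, upper percent)}."""
    rows = {}
    for name, (k, n) in counts.items():
        lo, hi = wilson_interval(k, n)
        rows[name] = (100 * k / n, 100 * lo, 100 * hi)
    return rows


if __name__ == "__main__":
    for name, (pct, lo, hi) in summarise(COUNTS).items():
        print(f"{name}: {pct:.1f}% (approx. 95% interval {lo:.1f}% to {hi:.1f}%)")
```

The intervals are wide relative to the point estimates, so the differences between the three populations should not be over-read.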

Since the authors of the present study have found DE-YAP(xE-SRY4064) in 6/1222 males from southeastern Nigeria, and Weale et al. previously have found haplogroup DE* in only 5/1247 males from Nigeria, the authors of the present study must have tested at least some new Nigerian samples, or else there has been an error either in the present study (Veeramah et al. 2010) or in the previous study (Weale et al. 2003).

Ebizur said...

I forgot to mention a third possibility: if the Nigerian samples studied by Weale et al. 2003 included one haplogroup D individual, that would equalize the number of haplogroup DE-YAP(xE-SRY4064) individuals found in the two studies' Nigerian samples. Did Weale et al. 2003 specifically mention not finding any instance of haplogroup D in their African samples?

Ebizur said...

I suppose I also should add that the area in which the Tikar, Bamun, and Aghem peoples reside is geographically rather "western Cameroon" or "central-western Cameroon," though the Bamun and Aghem samples have been obtained from the so-called Northwest Region and the Tikar sample has been obtained from the southwestern part of the Adamawa Region of Cameroon. This part of Cameroon is actually adjacent to Taraba State of Nigeria, which is just slightly northeast of the region of southeastern Nigeria from which the Nigerian samples of the present study have been obtained. The three Cameroonian populations tested by the authors of the present study are all speakers of Bantoid languages, meaning that they are considered to be close linguistic relatives of the Bantu peoples who have spread across most of Central and Southern Africa over the course of the past several millennia. The Ejagham of southeastern Nigeria also speak a language that has been classified as Bantoid. The Ibibio, Oron, Efik, and Annang peoples of southeastern Nigeria speak languages/dialects that are classified in the Cross River family, so they are not technically Bantoid, but still closely related to Ejagham, Bamun, etc. These populations' neighbors to the west are the Igbo, who speak a language that is rather closely related to that of the Yoruba, and the Ijaw, whose dialects are considered to comprise one of the most divergent extant branches of the Niger-Congo language phylum. I would like to see some data on the Y-DNA of the Ijaw; I wonder if the minor presence of haplogroup DE-YAP(xE-SRY4064) Y-DNA in neighboring Cross River- and Igbo-speaking populations might be due to Ijaw influence.

Andrew Oh-Willeke said...

From the haplogroup DE (Y-DNA) entry in Wikipedia:

"More recently, one example of DE* was found amongst the Nalu in Guinea Bissau (1/17). The DE* sequence of this individual differs by one mutation from the DE* sequence of the Nigerian individuals. This indicates common ancestry, though the phylogenetic relationship between the two lineages was not determined in this particular study.[7] A 2008 study detected DE* in two individuals from Tibet (2/594).[8]"

It certainly isn't surprising to find an isolated instance of a Nigerian haplotype in Guinea-Bissau, which is also in West Africa. But the two Tibetan cases, in one of the places where other D haplotypes are found, are notable, as they give DE* a geographic distribution similar to that of D generally.

aargiedude said...

I found an R1b1b2-ht35 haplotype from Togo in West Africa:

ht35 in West Africa (image)

It has the only 2 values known to be different in R1b1b2-ht35 (southeast Europe, Anatolia) than in R1b1b2-ht15 (West Europe): 393=12 and 461=11. These values, together, occur very rarely in West European R1b1b2, less than 1 in 100 samples.

There are 4 more R1b1b2 samples from West Africa, and 1 of them (from Ghana) also has 393=12. Three of the 4, including the one with 393=12, come from this recent study. 393=12 occurs in just 3% of British R1b1b2.

It was only possible for me to locate this sample in smgf.org because of its non-West European haplotype. If there are R1b1b2-ht15 samples in West Africa, there's no way I can find them, because smgf.org doesn't allow searching by country, and there must be tens of thousands of R1b1b2-ht15 samples in their database. But from what I've already seen in y-dna studies of West Africa, I would guess there might be another 2 to 4 R1b1b2 samples in their database, and they would all be R1b1b2-ht15.

So, putting all this together, we would have 8 R1b1b2 samples, of which 1 has 393=12 and 1 other has both 393=12 and 461=11. The odds of finding 2 samples like these among just 8 R1b1b2 samples of British or northwest European origin are low, about 1 in 100. That really stands out.
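The "about 1 in 100" figure can be checked with a rough calculation. The sketch below is an illustration, not the commenter's own working. It assumes the 8 samples are independent draws from a northwest European R1b1b2 pool, takes the 3% British frequency of 393=12 as given, and treats "less than 1 in 100" for 393=12 together with 461=11 as exactly 1%. The function name and parameters are made up for this example.

```python
from math import comb


def prob_observation(n=8, p_both=0.01, p_393=0.03):
    """Probability that, among n samples, at least one has both 393=12 and
    461=11 and at least one further sample has 393=12 (with or without 461=11).

    p_both : assumed frequency of 393=12 together with 461=11
    p_393  : assumed frequency of 393=12 overall (includes p_both)
    """
    p_only = p_393 - p_both    # 393=12 without 461=11
    p_neither = 1.0 - p_393    # no 393=12 at all
    total = 0.0
    for a in range(1, n + 1):            # a = samples with both values
        for b in range(0, n - a + 1):    # b = samples with 393=12 only
            if a + b < 2:
                continue                 # need two 393=12 samples in total
            c = n - a - b
            ways = comb(n, a) * comb(n - a, b)
            total += ways * p_both**a * p_only**b * p_neither**c
    return total


if __name__ == "__main__":
    p = prob_observation()
    print(f"probability = {p:.4f} (about 1 in {1 / p:.0f})")
```

Under these assumptions the probability comes out at a little over 1%, the same order as the estimate above. Since 1% is an upper bound on the combined frequency, the true figure would be lower still.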

North Africa has the same proportions of ht15 vs. ht35 as these West African samples, but there is virtually no E-M81 and absolutely no J1 in West Africa. Those two haplogroups make up 70% of North African y-dna, whereas R1b1b2 accounts for only about 3% of it.

Ebizur said...

Data from Wood et al. 2005 unless otherwise noted:

1/11 = 9.1% Nama (Typically Khoisan-looking Khoe speakers from Namibia)
2/24 = 8.3% Herero (Bantu speakers from Namibia)
1/18 = 5.6% Dama (Typically Bantu-looking Khoe speakers from Namibia)
1/32 = 3.1% Fante (Kwa > Akan speakers from Ghana)
1/92 = 1.1% Egyptian

10/28 = 35.7% Tunisian
13/44 = 29.5% Mali (Underhill et al. 2000)
2/34 = 5.9% Wolof (Senegal-Guinea speakers from Gambia/Senegal)
2/40 = 5.0% Sudan (Underhill et al. 2000)
4/92 = 4.3% Egyptian
1/24 = 4.2% Mid-east (Underhill et al. 2000)
1/28 = 3.6% Morocco (Underhill et al. 2000)
1/39 = 2.6% Mandinka (Mande speakers from Gambia/Senegal)
1/45 = 2.2% Basque (Underhill et al. 2000)

It looks like your finding of R1b1b2 in populations resident in the vicinity of Ghana should not be entirely unexpected.

Have you seen any R1b1b2 haplotypes from non-whites in Namibia? If you have, do the haplotypes suggest recent Southwest European (e.g. Portuguese) or Northwest European (e.g. Dutch) admixture, or is the provenance of the haplotypes unclear? Namibia does have the second-greatest population of white people among all countries in Sub-Saharan Africa, so recent European admixture may be a plausible explanation for the presence of R1b1b2-M269 in non-white ethnic groups of Namibia.

On a tangentially related note, Berniell-Lee et al. 2009 have found a greater TMRCA for their collection of R1b1* haplotypes from Bantus and pygmies of Gabon and southern Cameroon in western Central Africa (7,000 years (SD 8,100)) than for their collection of E1b1a haplotypes from the same populations (5,800 years (SD 7,200)). They also found that the subclades of E1b1a are indistinguishable based on 18 Y-STR haplotypes. Unfortunately, the authors have not provided a TMRCA estimate for their collection of E2 haplotypes from the same populations.
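To put the two TMRCA figures in perspective, here is a crude comparison of the difference with its uncertainty. It is only a sketch: it assumes the two estimates are independent and roughly normally distributed, which TMRCA estimates often are not, and the function name is hypothetical. The means and SDs are the ones quoted above.

```python
from math import sqrt


def difference_z(mean_a, sd_a, mean_b, sd_b):
    """z-score of (mean_a - mean_b), assuming independent, roughly normal estimates."""
    return (mean_a - mean_b) / sqrt(sd_a**2 + sd_b**2)


# (TMRCA in years, SD in years) as quoted from Berniell-Lee et al. 2009
R1B1_TMRCA = (7000, 8100)
E1B1A_TMRCA = (5800, 7200)

if __name__ == "__main__":
    z = difference_z(*R1B1_TMRCA, *E1B1A_TMRCA)
    diff = R1B1_TMRCA[0] - E1B1A_TMRCA[0]
    print(f"difference = {diff} years, z = {z:.2f}")
```

The z-score is close to zero, so the two estimates overlap almost entirely and the ordering of the two TMRCAs should be read cautiously.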