September 30, 2010

Collective intelligence in groups

The press release has more info. I am not sure why a higher proportion of women resulted in higher "collective intelligence". The authors suggest that it is because of women's higher "social sensitivity". Personally, I think it may be because men (and women) tend to try to impress members of the opposite sex for obvious evolutionary reasons.

Science DOI: 10.1126/science.1193147

Evidence for a Collective Intelligence Factor in the Performance of Human Groups

Anita Williams Woolley et al.

Psychologists have repeatedly shown that a single statistical factor—often called "general intelligence"— emerges from the correlations among people's performance on a wide variety of cognitive tasks. But no one has systematically examined whether a similar kind of "collective intelligence" exists for groups of people. In two studies with 699 individuals, working in groups of two to five, we find converging evidence of a general collective intelligence factor that explains a group's performance on a wide variety of tasks. This "c factor" is not strongly correlated with the average or maximum individual intelligence of group members but is correlated with the average social sensitivity of group members, the equality in distribution of conversational turn-taking, and the proportion of females in the group.


More ADMIXTURE estimates in Eurasia

This time, I removed SNPs with more than 1% genotyping no-call from the Xing et al. (2010) dataset, and ran ADMIXTURE on the following populations (left-to-right): Slovenian, Kyrgyzstani, Buryat, and HapMap Chinese. For K=2:

The results are as expected, with Slovenians and Chinese forming opposite poles, and Kyrgyzstanis and Buryat showing a preponderance of Mongoloid ancestry, but with variable Caucasoid admixture. Notice a single Slovenian showing eastern influence.

For K=3:

The Buryat get their own cluster (blue). Some Chinese are seen as having "Buryat" influence, which makes sense as there have been incursions of Mongols into China in historical times. Some Buryat, too, seem to be "Chinese"-influenced.

Kyrgyzstanis show mixed affiliations. The presence of both a Buryat and a "Chinese" cluster is interesting. The Kyrgyz live at a lower latitude than the Buryat, so this may be a reason behind the "Chinese" cluster, while the Buryat are a more purely northern Mongoloid population.

Notice too, how the lone Slovenian becomes "blue" indicating Mongol rather than Chinese origins. This also makes sense as the Chinese people did not migrate to Europe, while Mongoloids of the steppe and forest zones did.

Also interesting is the emergence of 2-3 Buryat with some "European" admixture. These may not stem from the centuries-old mix between Sakas and Mongols, but may represent a more recent (e.g., Slavic) European element.

Y-chromosomes of Filipino Negritos and non-Negritos

European Journal of Human Genetics (29 September 2010) | doi:10.1038/ejhg.2010.162

The Y-chromosome landscape of the Philippines: extensive heterogeneity and varying genetic affinities of Negrito and non-Negrito groups

Frederick Delfin et al.

The Philippines exhibits a rich diversity of people, languages, and culture, including so-called ‘Negrito’ groups that have for long fascinated anthropologists, yet little is known about their genetic diversity. We report here, a survey of Y-chromosome variation in 390 individuals from 16 Filipino ethnolinguistic groups, including six Negrito groups, from across the archipelago. We find extreme diversity in the Y-chromosome lineages of Filipino groups with heterogeneity seen in both Negrito and non-Negrito groups, which does not support a simple dichotomy of Filipino groups as Negrito vs non-Negrito. Filipino non-recombining region of the human Y chromosome lineages reflect a chronology that extends from after the initial colonization of the Asia-Pacific region, to the time frame of the Austronesian expansion. Filipino groups appear to have diverse genetic affinities with different populations in the Asia-Pacific region. In particular, some Negrito groups are associated with indigenous Australians, with a potential time for the association ranging from the initial colonization of the region to more recent (after colonization) times. Overall, our results indicate extensive heterogeneity contributing to a complex genetic history for Filipino groups, with varying roles for migrations from outside the Philippines, genetic drift, and admixture among neighboring groups.


Preferred vs. actual mate body shape

PLoS ONE 5(9): e13010. doi:10.1371/journal.pone.0013010

From Preferred to Actual Mate Characteristics: The Case of Human Body Shape

Alexandre Courtiol et al.

The way individuals pair to produce reproductive units is a major factor determining evolution. This process is complex because it is determined not only by individual mating preferences, but also by numerous other factors such as competition between mates. Consequently, preferred and actual characteristics of mates obtained should differ, but this has rarely been addressed. We simultaneously measured mating preferences for stature, body mass, and body mass index, and recorded corresponding actual partner's characteristics for 116 human couples from France. Results show that preferred and actual partner's characteristics differ for male judges, but not for females. In addition, while the correlation between all preferred and actual partner's characteristics appeared to be weak for female judges, it was strong for males: while men prefer women slimmer than their actual partner, those who prefer the slimmest women also have partners who are slimmer than average. This study therefore suggests that the influences of preferences on pair formation can be sex-specific. It also illustrates that this process can lead to unexpected results on the real influences of mating preferences: traits considered as highly influencing attractiveness do not necessarily have a strong influence on the actual pairing, the reverse being also possible.


September 29, 2010

Hundreds of variants influence human height

From the press release:
An international team of researchers, including a number from the University of North Carolina at Chapel Hill schools of medicine and public health, have discovered hundreds of genes that influence human height.

Their findings confirm that the combination of a large number of genes in any given individual, rather than a simple "tall" gene or "short" gene, helps to determine a person's stature. It also points the way to future studies exploring how these genes combine into biological pathways to impact human growth.

"While we haven't explained all of the heritability of height with this study, we have confidence that these genes play a role in height and now can begin to learn about the pathways in which these genes play a role," said study coauthor Karen L. Mohlke, PhD, associate professor of genetics in the UNC School of Medicine.
"These investigators had once been competing with each other to find height genes, but then realized that the next step was to combine their samples and see what else could be found," said Mohlke. "The competitors became collaborators to achieve a common scientific goal."
Large-scale collaborations like this are awesome: people get less credit in a paper with hundreds of co-authors, but they are part of something worthwhile. Plus, it's nice to see a table of authors' different contributions listed in the supplementary material :)

Nature doi:10.1038/nature09410

Hundreds of variants clustered in genomic loci and biological pathways affect human height

Hana Lango Allen et al.

Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits [1], but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait [2, 3]. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P less than 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes. Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation).
Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.

Bronze Age Mediterraneans may have visited Stonehenge

A DNA test on the "Boy with the Amber necklace" would be interesting.

Bronze Age Mediterraneans may have visited Stonehenge
The links between the Stonehenge area and the Mediterranean have been debated for years. Recent research by the British Geological Survey (BGS) suggests people came from both the snow of the Alps and the heat of the Mediterranean to visit Stonehenge.
However, scientific studies show that some of the people buried in the area during the Bronze Age were not local.

The analysis of the teeth from two males provides new evidence that one, dubbed ‘the Boy with the Amber necklace’, had come from the Mediterranean area, whilst the previously known ‘Amesbury Archer’ had come from the Alps.


The new evidence shows that ‘the Boy with the Amber necklace’ spent his childhood in a warm climate typical of Iberia or the Mediterranean. Such warm oxygen values are theoretically possible in the British Isles but are only found on the extreme west coast of South West England, western Ireland and the Outer Hebrides. These areas can be excluded as likely childhood origins of his on the basis of the strontium isotope composition of his teeth.


‘The Boy with the Amber necklace’, whose grave was found on Boscombe Down, about 3 km south-east of Stonehenge, is from a more recent time — the end of the Early Bronze Age. His skeleton has been radiocarbon dated to around 1550 BC (dated by Wessex Archaeology). Aged 14–15 years when he died, he was buried wearing a necklace of around 90 amber beads.

September 28, 2010

Some ADMIXTURE estimates in Eurasia

(Last Update: Sep 29)

Continuing my exploration of ADMIXTURE, I turned to the HGDP data, which has 660,918 SNPs for a wide assortment of worldwide populations. After pruning 12,086 SNPs with more than 1% missing genotypes, I was still left with ~650k SNPs.
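
The pruning step described above can be sketched in a few lines. This is only an illustration, not the procedure actually used for this post: the function name, the toy data, and the use of NaN to mark a missing genotype are all assumptions made for the example.

```python
import numpy as np

def filter_snps_by_missingness(genotypes, max_missing=0.01):
    """Keep SNP columns whose fraction of missing calls is at most max_missing.

    genotypes: 2-D array, rows = individuals, columns = SNPs, with np.nan
               marking a missing genotype (an assumed encoding).
    Returns the filtered array and the boolean mask of kept columns.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    # Per-SNP missing rate: share of individuals without a call.
    missing_rate = np.isnan(genotypes).mean(axis=0)
    keep = missing_rate <= max_missing
    return genotypes[:, keep], keep

# Toy data (hypothetical): 200 individuals x 3 SNPs, genotypes coded 0/1/2.
toy = np.ones((200, 3))
toy[:1, 1] = np.nan   # SNP 1: 1/200 = 0.5% missing, kept
toy[:5, 2] = np.nan   # SNP 2: 5/200 = 2.5% missing, removed
filtered, kept = filter_snps_by_missingness(toy)
print(kept.tolist())   # [True, True, False]
print(filtered.shape)  # (200, 2)
```

The threshold is a per-SNP rate, so a marker is dropped only when more than 1% of the individuals lack a call for it.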

Here are some experiments on this dataset. First, a clustering with K=2 of Han Chinese, Russians, and Orcadians (left to right):

The emergence of 2 clusters (red=Mongoloid, blue=Caucasoid) is as expected, with Russians showing a small participation in the red cluster (7.2%). These northern Russians are believed to have a substantial Finno-Ugric genetic origin, so this is in line with a recent estimate for the eastern component in the westernmost Finno-Ugric speakers being less than 10% (but see below).

Notice a couple of Chinese individuals with a small Caucasoid component: as I've mentioned before, Mongolians, and presumably northern Han, have a small Caucasoid component from early movements of Iranian speakers from the west. That's an advantage of doing your own admixture analysis: you can look at the data in fine detail, and not rely on the published figures.

Next, a clustering of Orcadians, Uygur, and Han Chinese:
The variable admixture in Uygurs is evident (47.2-63.7%, mean: 54.2%).

Next, a clustering of Druze, Bedouin, and Bantu from Kenya.

Druze appear completely Caucasoid (red), Bantu completely Negroid (save for a couple of individuals), while Bedouins show a quite variable minor Negroid component. This variable African contribution (0-17.6%) makes an elongated cluster out of Bedouins in a recent analysis, pulling them away from other Middle Eastern populations in a Sub-Saharan direction.

Finally, I clustered European populations together with Mandenka and Han Chinese:

The populations are in the following order: Han, Mandenka, Orcadian, French Basque, French, North Italian, Tuscan, Sardinian, Russian.

Here are the admixture proportions:

Notice how the eastern component in Russians is now estimated as 10.9%. This probably reflects the inclusion of French Basque and Sardinians, i.e., populations which historically had no opportunity for eastern Eurasian admixture, rather than only Orcadians. This underscores the importance of having appropriate poles in inter-continental admixture estimates (see Appendix I).

Note also that the 100% value for the Han Chinese is not incompatible with the presence of the two aforementioned Caucasoid-admixed individuals, who are present here with an estimated 1.9% and 0.5% such admixture. However, this contributes little to the sample average of 40+ individuals.
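
To see how little two such individuals move a population average, here is a small arithmetic sketch. It assumes a sample of exactly 40 individuals (the text only says "40+") and treats everyone else as having 0% of the component; both are simplifications for illustration.

```python
def sample_mean_admixture(nonzero_values, n_individuals):
    """Mean admixture (in %) when every other individual is treated as 0%."""
    if n_individuals < len(nonzero_values):
        raise ValueError("sample size is smaller than the number of values given")
    return sum(nonzero_values) / n_individuals

# Two admixed individuals (1.9% and 0.5%) in an assumed sample of 40.
mean_pct = sample_mean_admixture([1.9, 0.5], 40)
print(f"{mean_pct:.2f}%")  # 0.06%
```

A larger sample would push the average even lower.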

The minor (0.1%) Sub-Saharan admixture in Tuscans and Sardinians is also interesting. As you can guess from the figure, this stems from a handful of individuals (green specks) with less than 1% admixture, which is, however, more than the numerical low of 0.001% inferred for most Europeans by the software.

UPDATE I: Eurasian Cline

Below is a run for the following populations (left-to-right): French Basque, Russians, Uygur, Mongolians, Daur, Han Chinese. Notice that the Mongolic speakers (Mongolian and Daur from HGDP) have a small Caucasoid admixture, as I have mentioned before.

APPENDIX I: The importance of choosing poles

The choice of appropriate poles in the estimation of inter-continental admixture is extremely important.

If there is a racial admixture continuum between two major races, such as we observe in Eurasia, then we can express each intermediate population as a weighted sum of populations that live to the east and west of it.

For example, I will use a variable in the interval [0, 1] to represent the position in the continuum, with 0: pure western, and 1: pure eastern.

A population at 0.4 can be expressed as the following weighted sum:

0.4 = 0.6*0 + 0.4*1

i.e., as an admixture of 60% western, and 40% eastern.

But, it can also be expressed as e.g.,

0.4 ≈ 0.612*0.02 + 0.388*1

Notice that the choice of a slightly eastward-tilted "western pole" (at position 0.02 in the continuum) has resulted in a reduction of the inferred eastern component (from 40% to 38.8%).

This is exactly what happened in our example: the Russian eastern admixture estimate was lower when we used Orcadians, rather than French Basque, as the western pole.

Note also, that this is all done automatically: no one told ADMIXTURE to identify these two poles: it was the presence of unlabeled individuals from different ends of the spectrum that influenced the admixture estimates for the rest.
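
The weighted-sum arithmetic of this appendix can be reproduced with a short sketch. The function name and its defaults are my own, introduced only for illustration; ADMIXTURE itself does nothing this simple, so this only restates the one-dimensional toy model above.

```python
def admixture_weights(position, west_pole=0.0, east_pole=1.0):
    """Solve position = w_west * west_pole + w_east * east_pole, w_west + w_east = 1.

    All arguments are positions on the [0, 1] west-to-east continuum.
    Returns (w_west, w_east).
    """
    if west_pole >= east_pole:
        raise ValueError("west_pole must lie west of east_pole")
    if not west_pole <= position <= east_pole:
        raise ValueError("position must lie between the two poles")
    w_east = (position - west_pole) / (east_pole - west_pole)
    return 1.0 - w_east, w_east

# The population at 0.4, first with a pure western pole, then with a
# slightly eastward-tilted one at 0.02.
for west in (0.0, 0.02):
    w_west, w_east = admixture_weights(0.4, west_pole=west)
    print(f"western pole at {west:.2f}: {w_west:.1%} western, {w_east:.1%} eastern")
# western pole at 0.00: 60.0% western, 40.0% eastern
# western pole at 0.02: 61.2% western, 38.8% eastern
```

Moving the western pole eastward shrinks the distance between the pole and the population, which is why the inferred eastern share drops.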

APPENDIX II: Latent populations

Another important point that needs to be remembered has to do with the possible existence of latent ancestral populations.

For example, it is true that Eurasia (minus South Asia) is economically described as a continuum from the Caucasoids of the Atlantic coast to the Mongoloids of the Pacific, with a transition zone in Central Asia and Siberia, and spillovers on either side. But, we cannot exclude the prehistoric existence of other races in the Eurasian landmass that do not exist today in a relatively unadmixed form.

In Eurasia, the Proto-Uralic race was postulated as such a "third race" with features of its own and not reducible to simple Caucasoid-Mongoloid admixture. It is difficult to see whether these features are ancestral peculiarities (prior to admixture with Caucasoids and Mongoloids), or if they have arisen in a mixed Caucasoid-Mongoloid population.

It is also important to understand how such latent populations affect genetic continua:

First, if the latent population is equidistant from the two major races, then its admixture has no effect on an individual's position in the continuum between the two races. However, it is possible that the latent population was more related to one of the two major races. In that case, admixture with it will move a population towards that race.

So while the jury is still out about the existence of a Proto-Uralic race in Eurasia, its effects on admixed populations indicate that, if it existed, it was genetically closer to Mongoloids than to Caucasoids.

Y chromosome study of Serbian Roma

The haplogroups are available as supplementary material. I wonder whether different populations of Roma underwent different levels of admixture, or whether the Roma are themselves originally unrelated groups of wanderers which came to be identified by others as "Gypsies" and eventually believed it.

Both "massive admixture" and the scenario I am entertaining have their problems. In the former: why did one group of Roma admix so heavily while another not at all? In the latter: how did groups of unrelated origin come to share common cultural-linguistic traits? Balkan ethnology is not easy.

Here is an interesting tidbit from the paper which complements my recent enumeration of the genealogical mutation rate's superiority:
For the majority of the populations, time estimates based on Zhivotovsky et al., (2004) and NETWORK using the evolutionary mutation rate are comparable.
On the other hand, time estimates using the genealogical mutation rate (Goedbloed et al., 2009) seem to fit better with historical data of the Romani diaspora.
American Journal of Physical Anthropology DOI: 10.1002/ajpa.21372

Divergent patrilineal signals in three Roma populations

Maria Regueiro et al.


Previous studies have revealed that the European Roma share close genetic, linguistic and cultural similarities with Indian populations despite their disparate geographical locations and divergent demographic histories. In this study, we report for the first time Y-chromosome distributions in three Roma collections residing in Belgrade, Vojvodina and Kosovo. Eighty-eight Y-chromosomes were typed for 14 SNPs and 17 STRs. The data were subsequently utilized for phylogenetic comparisons to pertinent reference collections available from the literature. Our results illustrate that the most notable difference among the three Roma populations is in their opposing distributions of haplogroups H and E. Although the Kosovo and Belgrade samples exhibit elevated levels of the Indian-specific haplogroup H-M69, the Vojvodina collection is characterized almost exclusively by haplogroup E-M35 derivatives, most likely the result of subsequent admixture events with surrounding European populations. Overall, the available data from Romani groups points to different levels of gene flow from local populations.

September 25, 2010

How to use EURO-DNA-CALC with Family Finder (FTDNA) autosomal data

Thanks to the people who offered their help!

You can use the existing EURO-DNA-CALC with Family Finder data. There are only 63 SNPs in common between the dataset used in EURO-DNA-CALC and the chip used by Family Finder, so it will be interesting to see what kinds of results will turn up (*)

Here are the instructions on how to use Family Finder data.
  1. Download the zip file of EURO-DNA-CALC and extract its contents into a directory. There is a Readme file which you will use, but first you must convert your data.
  2. Your autosomal Family Finder data has a csv.gz extension, i.e., it is a comma-delimited GZIP-compressed file. You should use a suitable program (e.g., Winrar or Winzip) to extract the csv file into the same directory as in step #1.
  3. Open the csv file in any text editor (Word or Wordpad should work fine).
  4. Remove the header (RSID,CHROMOSOME,POSITION,RESULT) at the top of the file.
  5. Replace all quotes (") with nothing.
  6. Replace all commas (,) with tabs.
  7. Replace all missing value characters (-) with the character m.
  8. Save the file in the same directory as 23andme.txt. You've just converted your Family Finder data into a format that mimics that of 23andme.
  9. Follow the instructions in the Readme file exactly as if you had 23andme data.
  10. Feel free to e-mail me with your results, as I enjoy hearing from people who've used this tool.
(*) There are 192 and 163 SNPs in common between the results of 23andMe and deCODEme and the Price et al. dataset used by EURO-DNA-CALC, so it is expected that the accuracy of the estimate with Family Finder will be reduced. All three companies test multi-100K SNPs, but only a limited number of them coincide with the 300 AIMs selected in the Price et al. publication on which EURO-DNA-CALC is based.
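
For those who would rather script steps 2 to 7 than do them by hand, here is a minimal sketch. It is not part of EURO-DNA-CALC: the function names, the file paths, and the rsIDs in the sample lines are hypothetical, and it assumes the file looks exactly as described in the steps above.

```python
import gzip

def convert_line(line):
    """Steps 5-7 for one line: drop quotes, commas -> tabs, '-' -> 'm'."""
    return line.replace('"', "").replace(",", "\t").replace("-", "m")

def convert_family_finder(src_path, dst_path):
    """Read a Family Finder csv.gz file and write a tab-delimited text file."""
    with gzip.open(src_path, "rt", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for index, raw in enumerate(src):
            line = convert_line(raw.rstrip("\r\n"))
            # Step 4: skip the RSID,CHROMOSOME,POSITION,RESULT header.
            if index == 0 and line.upper().startswith("RSID"):
                continue
            dst.write(line + "\n")

# Sample lines (made-up rsIDs) showing what the conversion does.
print(repr(convert_line('"rs0000001","1","1000","AG"')))  # 'rs0000001\t1\t1000\tAG'
print(repr(convert_line('"rs0000002","2","2000","--"')))  # 'rs0000002\t2\t2000\tmm'
```

Calling convert_family_finder with the path of your csv.gz file and the desired output path writes the converted file; after that, continue from step 9.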

Family Finder call for data

Family Tree DNA (no endorsement) is offering an autosomal test in the form of Family Finder. Naturally, it would make sense to tweak EURO-DNA-CALC to include people who have results from FF. Unfortunately I haven't been able to track down a sample file on the FTDNA site, to see what the data format actually is.

So, if anyone wants this feature added, feel free to e-mail me. DO NOT send your autosomal results right away, as I don't want to be flooded with data files. Just let me know you'd be willing to help and I'll let you know when to send me the data.

Assuming the data format is reasonable enough, it should not be too much work to add the feature. If you've worked with FF data before and have any tips on any quirks it might have, or any differences from the way e.g., 23andMe reports the data, feel free to write to me as well.

My e-mail address is at the bottom of the blog.

September 24, 2010

ISBA4 abstracts

Here are some interesting abstracts from the recent 4th International Symposium on Biomolecular Archaeology.

Naglaa Abu-Mandil & Terry Brown
Kinship analysis and sex identification of skeletons from two archaeological sites in Greece
Ancient DNA offers unprecedented opportunities for anthropologists and bioarchaeologists to assess the biological relationships of ancient populations. This study is designed to assess the family relationship among skeletons from two different archaeological sites in Greece which can help in shedding light on the ritual practice in Aegean prehistory. Another aim is to identify the sex of these skeletons genetically to confirm the conventional sexing methods. These sites are called Kouphovouno and Bostani. Kouphovouno is an important Neolithic and Bronze Age archaeological site near Sparta in Lakonia, while Bostani is dated back to the Early Helladic period in Ancient Greek history. In both cases the sites are recently excavated and DNA samples from all people who have handled the skeletons are available. Both mtDNA and nuclear DNA markers are being studied in order to identify maternal relationships and to reveal the sex of the skeletons.

Morten Rasmussen et al.
The nuclear genome of an ancient human
We have sequenced the complete genome from an ancient human. It was obtained from 4,000-year-old permafrost-preserved hair; the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20, we recover 79% of the diploid genome, and identify 353,151 high-confidence single-nucleotide polymorphisms (SNPs). Comparisons with SNP data from contemporary populations allow us to explore the migrations and kinship of this extinct culture. Analyses provide evidence of a migration from Siberia into the New World, independent of that giving rise to the modern Native Americans and Inuit. The migration was dated to approximately 5,500 years BP and the closest living relatives are found in North-East Siberia showing no signs of admixture with modern Native Americans or Inuit. We use functional SNP assessment to assign possible phenotypic characteristics of the individual.

This is extremely exciting as it speaks of Y-chromosome results from Central European Neolithic sites, which is a first. It seems to me that migrationism is due for a big comeback. If anyone has attended the symposium and/or has more information on this feel free to leave a comment/send me an e-mail.

Wolfgang Haak et al.
Ancient DNA from Early Neolithic Farmers suggests a major genetic input from the Near East
The Neolithic transition (approx. 8000-4000 BC) is considered one of the most important demographic events in Europe’s past since the initial peopling of anatomically modern humans in the Upper Paleolithic (40,000 BC). Whether this transition has been cultural or driven by large-scale population movements is subject of a long-standing scientific debate in archaeology, anthropology and human population genetics. So far, inferences about the genetic make-up of past populations have been drawn from studies of modern-day Eurasian populations, but ancient DNA studies now provide direct snapshots of specific time frames in the past.

We present new mitochondrial and Y-chromosomal data from Neolithic individuals from a Central European early farming site, Derenburg (Germany), which significantly extends the genetic dataset of the Linearbandkeramik (LBK; n=42), and provides the first detailed genetic picture of the earliest Neolithic culture in Central Europe (5500-5000 cal BC). Comprehensive population-genetic analyses utilizing a large database of modern-day Western Eurasian populations (n=23,394) reveal unique genetic features of the LBK population and a clearly distinct mitochondrial haplogroup frequency distribution. Importantly however, the LBK population shows an affinity to populations in the modern-day Near East, suggesting a major genetic input from this region at the time of the advent of farming in Europe.
I wish we knew what the haplogroups of the unambiguously defined samples were...

Silja Dillenberger et al.
Parallel tagged amplicon sequencing of highly degraded Y-chromosomal DNA from archaeological skeletons
The intent of the study was to develop a Y-SNP multiplex-PCR suitable for genetic analysis of ancient human remains. Therefore 37 SNPs characterizing Eurasian haplogroups, with a focus on Europe and Central Asia, were selected in order to get a high phylogeographic resolution. The 37 SNPs, using amplicon lengths between 64 and 107bp, were co-amplified within 2 multiplex PCRs followed by parallel tagged sequencing on the 454 platform. After testing on 3 recent male and 2 recent female individuals it was applied to 8 male prehistoric samples from Central Asia and Europe. One sample was too poorly preserved for haplogroup identification. Another individual could be narrowed down to Q or R*. The haplogroups of the remaining 6 samples could unambiguously be defined. This shows that this approach is adequate for Y-chromosomal typing of highly degraded ancient human remains.
Five years ago I estimated an 11% contribution of Central Asian Turks to modern Anatolians, which seems quite in line with the following estimate. I will simply note that the number of 1.5 million is probably inflated, as the invaders did not have the same reproductive success as the local population; this means that the "original Turks" were fewer than 11%. Moreover, not all of the Anatolian population of the 11th century became present-day Anatolians; the current Muslims of Anatolia are a part of the 11th century population mixed with the invaders, and hence the number of invaders must've been even smaller.

Inci Togan et al.
An Anatolian Trilogy: Arrival of nomadic Turks with their sheep and shepherd dogs
Because of its geographical location, Anatolia was subject to migrations from multiple different regions throughout time. The last, well-known migration was the movement of Turkic speaking, pastoral nomadic group from Central Asia. They invaded Anatolia and then the language of the region was gradually replaced by the Turkic language. Central Asian genetic contribution to Anatolia with respect to the Balkans was estimated as 13% by an admixture analysis implemented in LEA. This estimate was obtained by employing nuclear genetic markers. MtDNA and Y-chromosome estimates confirmed this admixture proportion. Based on the population size estimation for Anatolia in 12th century, it can be calculated that at least 1.5 million nomads might have arrived to Anatolia. History tells us that they have arrived to Central and Eastern Anatolia first and only 150 years later they invaded Western Anatolia. Distributions of genetic diversity of domestic sheep and shepherd dogs in Turkey support that as well the language spoken in Anatolia these nomads have changed the genetic landscape of these two domestic species within Turkey. These observations have implications on conservation strategies of domestic sheep in Anatolia which is known to be the cradle of sheep domestication. Results must be confirmed by ancient DNA studies.

Alicia K Wilbur et al.
Ancient tuberculosis before and after the Age of Exploration
The Age of Exploration resulted in contact between human populations that were previously isolated from each other, initiating exchange of ideas, cultigens, and diseases. The modern biogeography of tuberculosis (TB) strains appears to reflect this with, for example, the presence of European type strains in the Americas and elsewhere. Until recently, it was thought that TB originated in the Old World in the last 10,000 years and the presence of TB in the Americas prior to contact was debated. Current estimates of TB’s origins, however, range from 3-6 million years ago. In our research, we attempt to characterize ancient mycobacterial strains from cases of disseminated bone TB in order to understand the phylogenetic relationships between strains of tuberculosis prior to and after the Age of Exploration. DNA was extracted from over 115 samples exhibiting classic tuberculosis lesions obtained from both the New and Old Worlds and ranging in age from 5800 BCE to A.D. 1800. Then, four quantitative PCR assays were used to gauge the preservation of host and pathogen DNA. Human nuclear and mitochondrial, and mycobacterial repetitive (IS6110) and single copy (rpoB) loci were analyzed. These results show that while approximately one third of the samples contain human nuclear and/or mitochondrial DNA, only 10% were positive for mycobacterial DNA. Mycobacterial DNA was usually recovered in the presence of human DNA (75%). In addition, our results suggest that TB strains in the Americas dating prior to European contact did not contain the IS6110 repeat element. From the samples that tested positive for host and mycobacterial DNA, we first selected two from Peru and one from Canada, for subsequent analyses using high-throughput pyrosequencing. Our analyses indicate that both slow-growing (pathogenic) and fast-growing (environmental) species of mycobacteria are present in the samples. However, our analyses also indicate that new methods for targeting specific sequences of interest are necessary to obtain sufficient genome coverage for evolutionary analyses. We will discuss ways of doing this and our current progress in this effort.

New Scientist reports that the following study found a couple of Africans among Columbus's crew.

Vera Tiesler et al.
Age at death, biological ancestry and provenience of Christopher Columbus’ crew at La Isabela, Santo Domingo, (1493-1498). Histological and biomolecular approaches
The site of La Isabela, in the Dominican Republic, was the first colonial town in the Americas. It was settled by Christopher Columbus and his crew at the beginning of AD 1494, and initially housed some 1,500 individuals from a wide array of social, economic and probably ethnic backgrounds. Its graveyard quickly accumulated the mortal remains of those who succumbed to the harsh conditions of the Atlantic crossing and life in the colony. In this study we present the preliminary results of a series of histological and molecular (isotopic and DNA) studies that expand on the macroscopic skeletal information in combination with detailed historical records on the lives of the deceased. Considered jointly, the data sets provide deepened insights into age at death, disease, nutrition, biological ancestry and geographic origins of 49 individuals unearthed between 1983 and 1991 and currently stored at the Museo del Hombre Dominicano in Santo Domingo, Dominican Republic. The analyses were largely funded by the Universidad Autónoma de Yucatan, Merida, Mexico, and National Geographic Society, Washington D.C., US, and received logistical support from the Museo del Hombre Dominicano, Dominican Republic.

Tracey Pierre et al.
American Southwest prehistory through ancient DNA
The American Southwest is one of the best archaeologically known areas of the world. It is also one of the most ethnically and linguistically diverse regions inhabited by contemporary Native American groups in North America. To what extent are the early and late prehistoric Southwest occupants associated with the Mesa Verde, Chaco Canyon, Mimbres and Basketmaker cultures related to today’s Athapaskan, Puebloan and Uto-Aztecan speakers? Is there genetic evidence for an earlier migration into the Southwest by populations ancestral to today’s Southern Athapaskans? Can the spread of farming into the Southwest region by Uto-Aztecan speakers from Mexico be detected in the gene pools of these earlier cultures? How are the former occupants of Chaco Canyon related to other prehistoric and modern inhabitants of this region? Does the current regional diversity reflect the geographical distribution of Southwest cultures prior to European contact? Previous ancient DNA research from the greater Southwest has demonstrated both regional continuity and discontinuity through the study of short-read mtDNA sequences. With the advent of second generation sequencing technology it is now possible to address in finer resolution these microcontinental migrations questions associated with the spread of language families into the American Southwest.

I don't want to comment too much on the following abstract, but I'm always favorably inclined to the prosaic rather than the ornate interpretations of ancient artifacts, unless there is clear evidence to the contrary.

Lucija Šoberl et al.
On the Beaker trail: Investigating the function of British Beakers through organic residue analysis
Beaker pottery is traditionally regarded as a material symbol of social, material and ideological changes that began in the latest Neolithic – these included the appearance of new ceramic technologies, modes of dress and adornment, the introduction of metallurgy and single burial. As far as the pottery goes, meticulous and numerous typological schemes have been produced in the past, but the function of Beakers has never been established on a larger scale from a scientific point of view.

British Beakers are most commonly found with inhumation burials, laid in pits or cists, and often in association with other objects. It has often been supposed that Beakers were produced specifically for grave deposition, since they differ in terms of fabric quality and decoration from those produced for non-funerary use. Due to their elaborate decoration and innovative fine fabric, Beakers have been considered as prestige items. As a consequence of Sherratt’s interpretation of Beakers as drinking cups, used to consume alcoholic beverages or narcotic substances at ritual gatherings, these vessels have gained almost a legendary status as prestige drinking equipment that has not been scientifically contested. The porous fabric of prehistoric pottery has been known to represent a favourable environment for the long term preservation of organic molecules, such as lipids. Beaker potsherds from funerary and non-funerary contexts have been analysed using solvent extraction, followed by gas chromatography, mass spectrometry and isotope ratio mass spectrometry to provide structure identification, biomolecular fingerprints and compound specific δ13C values.

Through analyses of absorbed lipids we can directly address the function and contents of ceramic vessels. Here we present preliminary results of our research project aimed at addressing the function of Beaker pottery through organic residue analyses. Surprisingly, no support is found for the interpretation of Beakers as vessels used in alcohol consumption, and their very status as prestige items might even be questioned.

Melanie Pruvost et al.
Nuclear ancient DNA draws picture of wild and early domesticated horses
Domesticated horses played key roles in the history of mankind providing nutrition and offering unprecedented modes of transportation. If the reasons related to the beginning of horse domestication are still unknown, horses were crucial to the life of nomadic pastoralists on the Eurasian steppe and have always had a particular position among domestic animals (warfare capabilities, symbol of social status, human nutrition). For these reasons, deciphering the spatial and temporal origin of domestic horses is of key importance for understanding the origin of modern human societies. Due to the high variability of mtDNA among modern and ancient horse populations, the genetic analysis failed to reveal either time or place of horse domestication. In this case, the failure has pushed us to look for other genetic markers and to adapt new sequencing methods to ancient DNA. Thus, we were able for the first time to address the question of horse domestication by analyzing nuclear trait markers directly linked to early breeding practice. Coat color is an easily detectable phenotypic trait, which was likely a major goal of animal breeders since the beginning of domestication. Fortunately, single mutations are often responsible for color variants, which makes these mutations very valuable for the analysis of SNPs via pyrosequencing. We successfully typed for a dozen nuclear markers in more than 90 horse samples from the Pleistocene to medieval times. Through this example, we will present the advantages and limits of our methodological approach. By comparing mtDNA data and the data for coat color selection of horses, we will open the discussion about the perspective of the analysis of nuclear markers in palaeogenetics.

Linus Girdland Flink et al.
The Mediterranean route: analysing early domestic pigs in Southeast Neolithic France by combining Mitochondrial and Nuclear DNA with Geometric Morphometrics
The Neolithisation of Europe followed two main routes of expansion – the northern so called Danubian or Balkanic route and the southern Mediterranean route. Previous research has shown that the earliest domestic pigs in Europe were of Near Eastern descent, and specifically, that the spatiotemporal occurrence of haplotype Y1-6A is well correlated with the Danubian expansion. Whether domestic pigs along the southern route carried the same or divergent haplotypes remains unknown. A current hypothesis argues that early domestic pigs in the northern Mediterranean basin carried a different haplotype but has up to date lacked sufficient data to test it.
Here we report the results of our analysis of an 80bp d-loop fragment, a MC1R SNP that’s causative of dominant black coat colouring, and 2D geometric morphometric (GMM) data from sus remains in early to middle Neolithic layers in southeast France. Our results support the current hypothesis that divergent mitochondrial lineages accompanied the different routes of expansion as we find high prevalence of the Near eastern haplotype Y2-5A, but not a single Y1-6A. By applying GMM shape analysis we can show that individuals that carried a European d-loop signature (Aside haplotype) were significantly differentiated from individuals that carried the Y2-5A haplotype. This could imply a diverse origin that might represent local wild boar and imported domestic pigs. However, at least one individual that belonged to a European mitochondrial lineage also carried a derived allele at the 0301 locus in the MC1R gene – an allele that is assumed to have originated in domestic stock. Combined with previously published data, these results indicate that by 4000 BC, introgression with wild boar was widespread in Europe. For future analyses we aim to apply the integrated use of DNA and GMM to archaeological wild and domestic pig remains from locations across Europe and the Near East. As we demonstrated here, different analytical techniques can be used to answer a variety of questions and their combined use will make small case studies like this one more easily incorporated into a larger framework.

Ben Krause-Kyora et al.
The flying pig, migration or transfer of ideas in prehistory. Molecular genetic and archaeological investigations of Mesolithic and Neolithic pigs (Sus scrofa).
This study shows the reflection of population dynamics, like mobility and migration, in archaeological evidence from pigs. How did the domestication of pigs take place in Northern Europe? Were domestic pigs of Near Eastern ancestry definitely introduced into Europe during the Neolithic, or were local European wild boar also domesticated by this time?
The first goal of this study was the development and establishment of extraction methods suited for extraction of DNA from historical samples, the selection of suitable genetic markers, and the establishment of sensitive, reliable and reproducible detection methods. PCRs were established to amplify pig-specific DNA with high sensitivity down to single molecules. Different primer pairs were used to amplify and sequence highly variable regions of the mitochondrial DNA like the dloop, cytb, XXX to determine specific mtDNA haplotypes. Furthermore, specific nuclear DNA was analysed to determine the sex and the paternal haplotypes. The sequences were finally aligned and compared to those already deposited in databases. A SNP analysis was established to determine the coat colour.
The results of over 300 individuals from 25 Neolithic sites show that around 4800-4000 BC domestic pigs are introduced in the archaeological sites in northern Germany. The study points out that the oldest domestic pig in the sample (4600 BC) has a “Near East” haplotype. All other domestic and wild boars show the same “European” haplotype. The conclusion leads to the opinion that the domestic pigs with a maternal “Near East” ancestor were introduced into central Europe with the linear pottery (LBK) culture. After a short period the domestic pigs with “European” haplotypes coexist with the “Near East” haplotypes in the LBK and the Chaseen culture. An explanation could be that the people of the Ertebølle culture adopted the idea of domestication and applied it to the indigenous wild boar population. With the established methods it is possible to determine the sex and the coat colour of ancient individuals. Furthermore, the study shows the importance of the coat colour as a marker for the domestication.

Josef Caruana & Terry Brown
The Maltese through time: A comparison of prehistoric, Roman and modern Maltese mitochondrial DNA haplotypes
The Maltese islands are a small archipelago situated in the middle of the Mediterranean Sea. Throughout history these islands have been dominated by the Mediterranean power of the era due to their strategic importance in controlling the shipping lanes between the eastern and western Mediterranean Sea. This study compares ancient DNA amplifications from a prehistoric site situated on the island of Gozo, two Roman burial sites in Malta, one of which is found in an urban context whilst the other in a rural context, and a sample group from the modern Maltese population. By analysing mitochondrial DNA Hypervariable Region 1, due to its higher copy number and survivability, this project aimed to study if any changes to the population of the islands can be observed through time. Another aim of the study was to see if any unique haplotypes might have survived these colonisations, and might still be present in the modern population. The modern Maltese population was also compared with other modern populations in the region in order to ascertain who it is most closely related to, and thus, which neighbouring influence most closely affected the matrilineal line of the Maltese population.

September 22, 2010

Using ADMIXTURE on the Xing et al. (2010) dataset

This is the result of running ADMIXTURE on 246,554 SNPs / 850 individuals / 40 populations from the Xing et al. (2010) dataset.

Below are the admixture proportions for the 40 populations:

The clusters can be loosely labeled as:

A: South Asian
B: Altaic
C: Irula (S Indian tribals)
D: !Kung (Khoisan speakers from Africa)
E: Sub-Saharan
F: Polynesian
G: Southeast Asian
H: Pygmy
I: West Asian
J: European
K: Amerindian
L: Northeast Asian
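
Population-level proportions of this kind are commonly obtained by averaging the per-individual ancestry fractions within each population. The sketch below shows that averaging step only. The function name and the in-memory input format (one list of K fractions per individual, plus a parallel list of population labels) are assumptions made for illustration, and it is not necessarily how the figures in this post were produced.

```python
from collections import defaultdict


def population_means(q_rows, pop_labels):
    """Average individual ancestry fractions within each population.

    q_rows:     one list of K ancestry fractions per individual
                (hypothetical in-memory form of an admixture result).
    pop_labels: population name of each individual, in the same order.
    """
    sums = {}
    counts = defaultdict(int)
    for row, pop in zip(q_rows, pop_labels):
        if pop not in sums:
            sums[pop] = [0.0] * len(row)
        for k, value in enumerate(row):
            sums[pop][k] += value
        counts[pop] += 1
    # Divide each accumulated column by the number of individuals.
    return {pop: [s / counts[pop] for s in sums[pop]] for pop in sums}


if __name__ == "__main__":
    toy_q = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    toy_labels = ["A", "A", "B"]
    print(population_means(toy_q, toy_labels))
```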

September 21, 2010

Optimal committee size

This is a very interesting paper in view of my recent post on democracy. The authors address the issue of how many evaluators or judges a committee ought to have. Adding more judges of similar competence improves accuracy of evaluation, but comes at a cost. Moreover, the optimal committee size depends on the quality of the judges, with the counterintuitive finding that it decreases as the judges become worse.

The moral: it pays to gather many experts, but it doesn't pay to gather many fools.

Democracy can be viewed as a form of decision-making system by a very large committee of mediocre evaluators. From a cost-benefit perspective it's a rather big waste.

PLoS ONE 5(9): e12642. doi:10.1371/journal.pone.0012642

The Calculus of Committee Composition

Eric Libby, Leon Glass

Modern institutions face the recurring dilemma of designing accurate evaluation procedures in settings as diverse as academic selection committees, social policies, elections, and figure skating competitions. In particular, it is essential to determine both the number of evaluators and the method for combining their judgments. Previous work has focused on the latter issue, uncovering paradoxes that underscore the inherent difficulties. Yet the number of judges is an important consideration that is intimately connected with the methodology and the success of the evaluation. We address the question of the number of judges through a cost analysis that incorporates the accuracy of the evaluation method, the cost per judge, and the cost of an error in decision. We associate the optimal number of judges with the lowest cost and determine the optimal number of judges in several different scenarios. Through analytical and numerical studies, we show how the optimal number depends on the evaluation rule, the accuracy of the judges, the (cost per judge)/(cost per error) ratio. Paradoxically, we find that for a panel of judges of equal accuracy, the optimal panel size may be greater for judges with higher accuracy than for judges with lower accuracy. The development of any evaluation procedure requires knowledge about the accuracy of evaluation methods, the costs of judges, and the costs of errors. By determining the optimal number of judges, we highlight important connections between these quantities and uncover a paradox that we show to be a general feature of evaluation procedures. Ultimately, our work provides policy-makers with a simple and novel method to optimize evaluation procedures.
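
The trade-off described in the abstract can be illustrated with a toy model. The sketch below assumes simple majority voting among independent judges of equal accuracy, and measures cost in units of one error, so that `cost_ratio` is (cost per judge)/(cost per error). The function names and the specific numbers are illustrative assumptions, not the authors' actual model or parameters.

```python
from math import comb


def majority_error(n, p):
    """Probability that a majority of n independent judges is wrong.

    n is assumed odd (no ties); each judge is correct with probability p.
    """
    correct = sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )
    return 1.0 - correct


def expected_cost(n, p, cost_ratio):
    """Cost of paying n judges plus the expected cost of a wrong decision."""
    return n * cost_ratio + majority_error(n, p)


def optimal_size(p, cost_ratio, max_n=51):
    """Odd panel size (1..max_n) with the lowest expected cost."""
    return min(range(1, max_n + 1, 2),
               key=lambda n: expected_cost(n, p, cost_ratio))


if __name__ == "__main__":
    for accuracy in (0.55, 0.9):
        print(accuracy, optimal_size(accuracy, cost_ratio=0.02))
```

With a cost ratio of 0.02, this toy model picks a panel of three for judges who are right 90% of the time, but a single judge when accuracy is only 55%: adding mediocre judges reduces the error rate too slowly to pay for them.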


September 19, 2010

Playing with ADMIXTURE

(Last Update: 22 Sep)

I've been trying out ADMIXTURE recently. It's lightning fast compared to both frappe and STRUCTURE, its main competitors in the admixture estimate field, simple to use, and well-documented.

My main goal was to analyze the data in the recent Xing et al. (2010) paper. It's unfortunate that many recent papers do not have their data online, or they hide them behind various institutional controls, but the data in that paper (a total of 40 populations typed for a quarter million markers) is available online.

Ultimately, I want to update the EURO-DNA-CALC, making it more powerful and extending it with non-European populations. A few constraints are particularly important:
  1. You can't assume that people will have the computing power and know-how to go through various steps to run ADMIXTURE themselves.
  2. The alternative of having people send me their genotype data is impossible because of legitimate privacy concerns and the obvious impossibility of accommodating a large number of requests.
The beauty of ADMIXTURE is that it provides allele frequency estimates for its inferred K ancestral populations. End users can thus side-step the task of running the full analysis (850 individuals x 250k markers): with the frequencies held fixed, only one person's ancestry proportions need to be estimated, which should make it possible to run the next version of EURO-DNA-CALC on modest machines.
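
To make the idea concrete, here is a minimal sketch of how one individual's ancestry proportions could be estimated when the ancestral allele frequencies are held fixed. It uses the standard EM update for this kind of mixture model. The function name, the input layout, and the toy data are assumptions for illustration; this is not the EURO-DNA-CALC code, nor ADMIXTURE's own optimizer.

```python
def estimate_admixture(genotypes, freqs, iterations=200):
    """EM estimate of one individual's ancestry proportions.

    genotypes: list of 0/1/2 counts of the reference allele, one per SNP.
    freqs:     freqs[k][j] is the reference-allele frequency of SNP j in
               ancestral population k. These are held fixed and assumed
               to lie strictly between 0 and 1.
    """
    k_pops = len(freqs)
    n_snps = len(genotypes)
    q = [1.0 / k_pops] * k_pops  # start from equal proportions
    for _ in range(iterations):
        new_q = [0.0] * k_pops
        for j, g in enumerate(genotypes):
            # Probability of drawing each allele under the current q.
            p_ref = sum(q[k] * freqs[k][j] for k in range(k_pops))
            p_alt = sum(q[k] * (1.0 - freqs[k][j]) for k in range(k_pops))
            for k in range(k_pops):
                # Expected number of this SNP's two allele copies that
                # came from population k.
                new_q[k] += g * q[k] * freqs[k][j] / p_ref
                new_q[k] += (2 - g) * q[k] * (1.0 - freqs[k][j]) / p_alt
        q = [value / (2.0 * n_snps) for value in new_q]
    return q


if __name__ == "__main__":
    # Toy data: two ancestral populations with very different frequencies.
    toy_freqs = [[0.9] * 50, [0.1] * 50]
    toy_genotypes = [2] * 50
    print(estimate_admixture(toy_genotypes, toy_freqs))
```

Each iteration costs on the order of (number of SNPs) x K operations for a single person, which is why this step is so much lighter than the full analysis.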

Here is a 10k SNP/K=7 run of ADMIXTURE on the aforementioned data, which had a running time of a few minutes on my machine. As you can see, 10k SNPs are already quite good at separating different groups of individuals. I will probably use more SNPs in the final version.

Feel free to leave comments on what features you'd like to see in the new version. I can't promise a timetable, but I will try to incorporate as many suggestions as I can.

UPDATE I (Sep 21):

Here is a run with all 246,554 SNPs for the 850 individuals. It looks like the figure published in the Xing et al. paper, although I've kept the individuals in the order they appear in the genotype file, while the published version has rearranged them so that the different clusters appear contiguously. This run took several minutes. I estimate that the full run for K=12, i.e., the one needed to generate the other figure from the paper, will take about half a day, so I will probably leave it running overnight one of these days and post it as well.

UPDATE II (Sep 22):

Here are the results for K=12 and 246,554 SNPs, which took (as I had estimated) about 10.5 hours to compute.

September 15, 2010

Major study of Central Asian populations (Martinez-Cruz et al. 2010)

I have often commented on the fact that Central Asians were mainly formed by the pendulum of Western Eurasians (Caucasoids) moving east with Indo-Iranian languages during prehistory and the later historical westward movement of Turkic speakers. There were other movements besides these, e.g., the Tocharians represent a non-Indo-Iranian eastward Caucasoid movement, while the Mongols represent a non-Turkic westward Mongoloid movement.

Central Asians are therefore today a variable mix of Caucasoids and Mongoloids, formed over the last few millennia, although the constituent elements are still present and recognizable. The Turkicization of the region was, to a large extent, the result of language shift among Iranian populations (Sakas-Scythians), but not without some genetic contribution from the original Turks who were a Mongoloid people akin to their linguistic Altaic cousins, the Mongols. This Mongoloid component is attenuated westward, reaching its minimum among Anatolian and Balkan Turkish speakers.

With respect to the admixture proportions (figure top left) presented in the paper, I have a couple of quick comments:
  • Using modern south Asians as representatives of a source population of Central Asia is problematic, as modern south Asians are admixed, comprised of Caucasoids and indigenous South Asians. While South Asia may have been a population source during remote periods of the Paleolithic, in the more recent post-Neolithic times when Central Asian populations were formed, South Asia was a population sink.
  • The use of only a few autosomal markers does give a broad overview of the east-west components in these populations, but it should be noted that the use of few markers tends to overestimate minority ancestral components.

Even with such a small number of markers, it is evident that the separation of groups at the population level is possible, as the correspondence analysis indicates: green/European, red/East Asian, blue/Indo-Iranian from Central Asia, orange/Turkic from Central Asia.


The paper includes STRUCTURE results for K=2 to K=6. Below is the STRUCTURE run for K=6:

While less distinct than what we would get with more markers, the emergence of several clusters of individuals is apparent (from left to right: East Asian, Turkic, Central Asian Iranian, South Asian, West Eurasian, Sub-Saharan). Notice how Hazaras and Uyghurs are islands of the Turkic component in the Central/South Asian cluster, and how some Uzbeks are Iranian-like while others are Turkic-like. I am reminded of an older study which found how mythology was used among some Uzbek groups to create a common ancestry for groups of unrelated origin.

European Journal of Human Genetics (8 September 2010) | doi:10.1038/ejhg.2010.153

In the heartland of Eurasia: the multilocus genetic landscape of Central Asian populations

Begoña Martínez-Cruz et al.

Located in the Eurasian heartland, Central Asia has played a major role in both the early spread of modern humans out of Africa and the more recent settlements of differentiated populations across Eurasia. A detailed knowledge of the peopling in this vast region would therefore greatly improve our understanding of range expansions, colonizations and recurrent migrations, including the impact of the historical expansion of eastern nomadic groups that occurred in Central Asia. However, despite its presumable importance, little is known about the level and the distribution of genetic variation in this region. We genotyped 26 Indo-Iranian- and Turkic-speaking populations, belonging to six different ethnic groups, at 27 autosomal microsatellite loci. The analysis of genetic variation reveals that Central Asian diversity is mainly shaped by linguistic affiliation, with Turkic-speaking populations forming a cluster more closely related to East-Asian populations and Indo-Iranian speakers forming a cluster closer to Western Eurasians. The scattered position of Uzbeks across Turkic- and Indo-Iranian-speaking populations may reflect their origins from the union of different tribes. We propose that the complex genetic landscape of Central Asian populations results from the movements of eastern, Turkic-speaking groups during historical times, into a long-lasting group of settled populations, which may be represented nowadays by Tajiks and Turkmen. Contrary to what is generally thought, our results suggest that the recurrent expansions of eastern nomadic groups did not result in the complete replacement of local populations, but rather into partial admixture.


September 14, 2010

Post-glacial migrations of humans into East Asia

I'll comment on this paper after I read it.

Mol Biol Evol (2010)
doi: 10.1093/molbev/msq247

Extended Y-chromosome investigation suggests post-Glacial migrations of modern humans into East Asia via the northern route

Hua Zhong et al.

Genetic diversity data, from Y chromosome and mitochondrial DNA as well as recent genome-wide autosomal SNPs, suggested that mainland Southeast Asia was the major geographic source of East Asian populations. However, these studies also detected Central-South Asia- and/or West Eurasia-related genetic components in East Asia, implying either recent population admixture or ancient migrations via the proposed northern route. To trace the time period and geographic source of these Central-South Asia- and West Eurasia-related genetic components, we sampled 3,826 males (116 populations from China and one population from South Korea) and performed high-resolution genotyping according to the well-resolved Y-chromosome phylogeny. Our data, in combination with the published East Asian Y-haplogroup data, show that there are four dominant haplogroups (accounting for 92.87% of the East Asian Y chromosomes), O-M175, D-M174, C-M130 (not including C5-M356) and N-M231, in both southern and northern East Asian populations, which is consistent with the proposed southern route of modern human origin in East Asia. However, there are other haplogroups (6.79% in total) (E-SRY4064, C5-M356, G-M201, H-M69, I-M170, J-P209, L-M20, Q-M242, R-M207 and T-M70) detected primarily in northern East Asian populations, and were identified as Central-South Asian and/or West Eurasian origin based on the phylogeographic analysis. In particular, evidence of geographic distribution and Y-STR diversity indicate that haplogroup Q-M242 (the ancestral haplogroup of the native American-specific haplogroup Q1a3a-M3) and R-M207 probably migrated into East Asia via the northern route. The age estimation of Y-STR variation within haplogroups suggests the existence of post-Glacial (∼18 thousand years ago, kya) migrations via the northern route as well as recent (∼3 kya) population admixture. 
We propose that although the Paleolithic migrations via the southern route played a major role in modern human settlement in East Asia, there are ancient contributions, though limited, from western Eurasia which partly explain the genetic divergence between current southern and northern East Asian populations.


Magnus Carlsen vs. the World or the shortcomings of representative democracy

I participated in the G-STAR RAW World Chess Challenge last Friday, in which top-rated Norwegian chess player Magnus Carlsen played against the "World", eventually winning the game after 44 moves.

The game started going downhill for the "World" early on, and, as I was waiting to see how the inevitable 1-0 would play out, it dawned on me how closely the whole experience paralleled what goes on in a modern representative democracy.

Carlsen (C): a "challenge" facing the public; the crisis, or problem, that needs to be addressed
Lagrave, Nakamura, Polgar (LNP): the "politicians", suggesting what needs to be done to address C, one move at a time
The public (P): the "electorate", choosing from the moves suggested by LNP
Kasparov & Ashley (KA): the "media", providing commentary on C, LNP, and P

The game itself was good evidence in favor of two assertions about democracy:

1. The sum of mediocre minds does not create a genius
2. Crowds do not plan long-term and are not consistent

1) The sum of mediocre minds does not create a genius

In many situations, combining mediocre elements produces a superior result. This is true when tasks can be broken down to components that are easy to handle. A regular person would die of boredom before he could sum up a thousand numbers, but a hundred people could achieve it with a handful of additions each.

It is also true when mediocre participants are unbiased and independent, and there is a simple way to combine their output. For example, I can't guess another person's height with any great accuracy, but if twenty people take a guess, the average of their guesses may be quite close to the truth.
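
The height-guessing example can be checked with a small simulation. The sketch below assumes that guesses are unbiased and independent, with normally distributed errors. The true height, the noise level, and the function name are made-up values for illustration.

```python
import random
import statistics


def crowd_vs_individual(true_value=175.0, noise_sd=10.0,
                        crowd_size=20, trials=2000, seed=0):
    """Compare one guesser's typical error with the crowd average's error.

    Each guess is the true value plus independent, unbiased Gaussian noise.
    Returns (mean absolute error of a single guess,
             mean absolute error of the crowd's average guess).
    """
    rng = random.Random(seed)
    individual_errors = []
    crowd_errors = []
    for _ in range(trials):
        guesses = [rng.gauss(true_value, noise_sd) for _ in range(crowd_size)]
        individual_errors.append(abs(guesses[0] - true_value))
        crowd_errors.append(abs(statistics.fmean(guesses) - true_value))
    return statistics.fmean(individual_errors), statistics.fmean(crowd_errors)


if __name__ == "__main__":
    single, crowd = crowd_vs_individual()
    print(f"single guess error: {single:.2f}, crowd average error: {crowd:.2f}")
```

Under these assumptions the error of the average shrinks roughly with the square root of the crowd size. The benefit disappears if the guesses share a common bias or are not independent, which is the situation described next.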

In the Carlsen vs. the World match, neither of these two conditions held: different individuals did not think about different aspects of the game. In short, there was no co-operation, no strength in numbers. The result was that the World did not play as a massively parallel machine, but rather as one amateur player picking between alternatives suggested by three really strong ones. Nor did the participants combine their thoughts in any interesting way; they simply used the basic rule of majority voting.

2) Crowds do not plan long-term

Long-term planning with many participants is difficult to achieve. This is not only due to the unpredictability of the future, or people's inability to think ahead except in the vaguest of terms, but also due to the difficulty of maintaining a consensus.

Consider a group of people trying to reach the end of a maze. Each person has an idea of how to achieve this goal, but there are different ways in which the group can decide on a route:

In a democracy there is a vote at each intersection, in monarchy the crowd follows a leader, in anarchy everyone goes their own way, while in aristocracy (in the original sense of the word, which is "rule by the best") the crowd follows leader(s) chosen for their ability in maze-navigation.

In the Carlsen vs. the World match it could be argued that there were aristocratic elements at play (as LNP, who led the people, are definitely chess experts) but also a democratic principle, as the leaders did not decide, but the people did.

If someone (say one of LNP) came up with a good plan, he could not carry it through, because midway through the plan's execution, the public, who did not understand the plan, and the other two leaders, who may not have guessed it, or may have preferred a different one, changed course.

Moves that were suggested early on ended up being played much later, when their "bite" was gone. The crowd flocked alternately to displays of bravado and early counter-attack, followed quickly by overdefensive moves that created an uncomfortably cramped position that was impossible to defend.

The parallels to democracy

The World's failure is impressive, because it had so many good things going for it: the "leaders" were part of the chess elite, and both they and the public had a clearly defined common interest: to win the game.

In a real democracy, not only do politicians and citizens have different and conflicting agendas (e.g., raise or lower taxes, increase or decrease immigration, regulate or liberalize markets, etc.) but the quality of politicians is generally low: LNP were selected largely because of merit, putting their plans to democratic scrutiny; in a real representative democracy, both leaders and plans are subject to a vote.

Not to mention the media, composed to a large extent of opinionated ignorami who have to sink to the level of the lowest common denominator of the populace in order to achieve circulation or viewership targets, rather than inform the public about the issues at hand, and the different political parties' plans.

It could be argued that Carlsen was too smart, so we should not make too much out of the World's failure to defeat him; yet, it can also be argued that the real-world challenges that face societies are even more complex, whether they are climate change, the organization of financial markets, the decision to go to war, domestic and foreign enemies, and so on.


The main advantage of democracy, compared to other political systems, is its adaptiveness. In a monarchy, you either get a good ruler or a bad one, for long periods of time. You have the advantage of long-term planning, but the disadvantage that the long-term plan may lead to ruin. In a typical representative democracy, you get a series of average rulers. Constancy and long-term planning go out the window, but course correction is built into the system.

Experiences like the Carlsen vs. the World match should, however, bring to our attention how suboptimal representative democracy really is as a governing system. It's difficult to see how democracy would evolve in a more efficient direction, but it is definitely worth thinking about.

September 11, 2010

Natural selection for high altitude living

The frappe results for Andean and Tibetan populations are shown below (Figures S2 and S3):

PLoS Genet 6(9): e1001116. doi:10.1371/journal.pgen.1001116

Identifying Signatures of Natural Selection in Tibetan and Andean Populations Using Dense Genome Scan Data

Abigail Bigham et al.

High-altitude hypoxia (reduced inspired oxygen tension due to decreased barometric pressure) exerts severe physiological stress on the human body. Two high-altitude regions where humans have lived for millennia are the Andean Altiplano and the Tibetan Plateau. Populations living in these regions exhibit unique circulatory, respiratory, and hematological adaptations to life at high altitude. Although these responses have been well characterized physiologically, their underlying genetic basis remains unknown. We performed a genome scan to identify genes showing evidence of adaptation to hypoxia. We looked across each chromosome to identify genomic regions with previously unknown function with respect to altitude phenotypes. In addition, groups of genes functioning in oxygen metabolism and sensing were examined to test the hypothesis that particular pathways have been involved in genetic adaptation to altitude. Applying four population genetic statistics commonly used for detecting signatures of natural selection, we identified selection-nominated candidate genes and gene regions in these two populations (Andeans and Tibetans) separately. The Tibetan and Andean patterns of genetic adaptation are largely distinct from one another, with both populations showing evidence of positive natural selection in different genes or gene regions. Interestingly, one gene previously known to be important in cellular oxygen sensing, EGLN1 (also known as PHD2), shows evidence of positive selection in both Tibetans and Andeans. However, the pattern of variation for this gene differs between the two populations. Our results indicate that several key HIF-regulatory and targeted genes are responsible for adaptation to high altitude in Andeans and Tibetans, and several different chromosomal regions are implicated in the putative response to selection. 
These data suggest a genetic role in high-altitude adaption and provide a basis for future genotype/phenotype association studies necessary to confirm the role of selection-nominated candidate genes and gene regions in adaptation to altitude.


September 10, 2010

Genetic differences between five European populations (Moskvina et al. 2010)

Notice (on the left) that the number of SNPs with significant (p=0.05) differences between populations varies from a low for the Scotland-Ireland pair to a high for the Sweden-Portugal one.

The supplementary material is also interesting. In Supp. Fig. 2 you can see the occurrence of 4 distinct clusters corresponding to the four corners of Europe, and also a barely perceptible tilting of Scotland toward Sweden relative to Ireland, within the NW cluster. As always, we should not interpret this as a lack of distinctiveness of the two populations, as such distinctiveness may hide in higher-order dimensions, or may require more markers/individuals to discern. We could, however, say, as common sense would also dictate, that these two populations are very close to each other in the European context.

Human Heredity Vol. 70, No. 2, 2010

Genetic Differences between Five European Populations

Valentina Moskvina et al.


Aims: We sought to examine the magnitude of the differences in SNP allele frequencies between five European populations (Scotland, Ireland, Sweden, Bulgaria and Portugal) and to identify the loci with the greatest differences. Methods: We performed a population-based genome-wide association analysis with Affymetrix 6.0 and 5.0 arrays. We used a 4 degrees of freedom χ² test to determine the magnitude of stratification for each SNP. We then examined the genes within the most stratified regions, using a highly conservative cutoff of p < 10^-45. Results: We found 40,593 SNPs which are genome-wide significantly (p ≤ 10^-8) stratified between these populations. The largest differences clustered in gene ontology categories for immunity and pigmentation. Some of the top loci span genes that have already been reported as highly stratified: genes for hair color and pigmentation (HERC2, EXOC2, IRF4), the LCT gene, genes involved in NAD metabolism, and in immunity (HLA and the Toll-like receptor genes TLR10, TLR1, TLR6). However, several genes have not previously been reported as stratified within European populations, indicating that they might also have provided selective advantages: several zinc finger genes, two genes involved in glutathione synthesis or function, and most intriguingly, FOXP2, implicated in speech development. Conclusion: Our analysis demonstrates that many SNPs show genome-wide significant differences within European populations and the magnitude of the differences correlate with the geographical distance. At least some of these differences are due to the selective advantage of polymorphisms within these loci.
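For readers curious about what a 4 degrees of freedom χ² test looks like for a single SNP, here is a minimal sketch. It assumes the test is run on a 5 x 2 table of allele counts (five populations, two alleles), which is what gives (5-1) x (2-1) = 4 degrees of freedom; the abstract does not spell out the exact table, and the function names and allele counts below are hypothetical, not taken from the paper.

```python
import math


def chi2_allele_test(counts):
    """Pearson chi-square on a populations x 2 table of allele counts.

    counts: list of (allele_A, allele_B) tuples, one per population.
    Returns (statistic, degrees_of_freedom).
    Assumes both alleles are observed at least once overall, so that
    no expected count is zero.
    """
    row_totals = [a + b for a, b in counts]
    col_totals = [sum(a for a, _ in counts), sum(b for _, b in counts)]
    n = sum(row_totals)
    stat = 0.0
    for (a, b), r in zip(counts, row_totals):
        for observed, c in ((a, col_totals[0]), (b, col_totals[1])):
            expected = r * c / n
            stat += (observed - expected) ** 2 / expected
    return stat, len(counts) - 1


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For even df there is a closed form:
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
    """
    if df % 2 != 0:
        raise ValueError("closed form only valid for even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


# Hypothetical allele counts: four populations at 50% frequency, one at 75%.
toy_counts = [(100, 100)] * 4 + [(150, 50)]
stat, df = chi2_allele_test(toy_counts)
p_value = chi2_sf_even_df(stat, df)
```

With real array data one would loop this over every SNP and then apply a genome-wide threshold; this sketch only shows the per-SNP arithmetic.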


September 08, 2010

ASHG 2010 abstracts

The 2010 meeting of the American Society of Human Genetics is in November. Here are some interesting abstracts that caught my eye:

It's nice to finally see a genomic study on the Greek population.
P. Paschou et al. Evaluation of the HapMap dataset as reference for the Greek population.
The HapMap project has provided a unique tool for the analysis of human genetic variation, providing reference information for allele frequency and genotype distributions as well as linkage disequilibrium patterns of Single Nucleotide Polymorphisms (SNPs) across the entire genome. The latest release of HapMap phase 3 data provides genotypes for millions of SNPs in 11 populations from around the world, with Europe being represented by the CEU (originating from Northwestern Europe) and the TSI populations (Tuscan Italians from Southern Europe). Although initial studies support the fact that the CEU can be used as reference for the selection of tagging SNPs in other European populations, a critical step in the design of genetic association studies, this hypothesis has not been extensively studied across Europe and in particular in Southern Europe. We set out to explore the extent to which the HapMap populations can be used as reference for a previously unstudied population of South-Eastern Europe, the Greek population. To do so we studied genomic variation in 1,813 SNPs, genotyped by our group in 56 individuals of Greek origin, and compared them to the CEU and TSI genotypes (1,813 SNPs from the CEU HapMap dataset and 1,205 from the TSI dataset). The studied SNPs are spread over 13 autosomal chromosomes and 26 regions, ranging in size from 120Kb to more than 4Mb. Genotype, allele frequency, and pairwise LD measures were compared across all three populations. PCA was used in order to identify those markers that are responsible for the observed inter-sample variance. Tagging SNPs were selected in the CEU and TSI samples and their transferability to the Greek population was tested, using both the r2 metric as well as the efficiency of genotype imputation of the non-selected SNPs. 
Our results demonstrate that, although the CEU population can to some extent be used as reference for the Greek population, it is preferable to use as reference a European population of closer genetic ancestry, like the TSI. These results are applicable in medical genetics, in order to inform the design of genetic association studies, as well as in studies of evolutionary relationships of Southern European populations.
One of the great problems of Eurasian anthropology is whether the Uralic populations are simply variable admixtures of Caucasoids and Mongoloids or whether they contain a tertium quid in the form of a Proto-Uralic element. The latter need not be distinct from the other two, as it can also be an old or stabilized blend of the two major Eurasian races that later admixed with more recent groups on either side. The abstract does not seem promising in this respect, i.e., in identifying a common core of ancestry among Uralic speakers in addition to their variable east-west admixture, but it would be nice to see if anything like that exists in the paper.

K. Tambets et al. Haploid and autosomal variation within a linguistic continuum of the Uralic-speaking people of Eurasia.
For about last two decades the examination of uniparentally inherited genetic marker systems revealing the variation embedded in mtDNA and Y chromosome has been the main tool in the studies of human genetic origins. Within few recent years the analysis of the genome-wide SNP data of individuals from different populations has started to give promising new insights in the field of human population genetics. The uniparentally inherited markers have shown slightly different demographic scenarios for the maternal and paternal lineages of North Eurasian, particularly of European Uralic-speaking populations. The geographical location of a population has evidently been the most important component that dictates the proportion of western and eastern mtDNA types in the gene pool of Uralic-speakers. Thus, the palette of maternal lineages of the Uralic-speakers resembles that of their geographically close European or Western Siberian Indo-European and/or Altaic-speaking neighbours, respectively. At the same time, the most frequent North Eurasian Y chromosome type N1c, that is also a common link between almost all Uralic-speakers, is with few exceptions rare, if present at all, among Indo-European-speakers of Western and Southern Europe. Here we combine genome-wide high density SNP data (650 000 SNPs, Illumina) with uniparentally inherited mtDNA and Y-chromosome variation of 16 Uralic-speaking populations to assess their place on the genetic landscape of North Eurasia. By the use of principal component and structure-like analysis on the autosomal data we show that the proportions of western and eastern ancestry components among the Uralic-speakers are determined mostly by geographical factors. The westernmost populations from Europe, both Uralic- and Indo-European speakers, are similar in their pattern of ancestry components and show low levels (less than 10%) of the eastern component. 
Conversely, the eastern ancestry component is dominant (60-70%) in the gene pool of the Siberian Uralic-speakers. In general, the genome-wide analyses corroborate the results of mtDNA analysis and do not reflect the common genetic characteristics between western and eastern Uralic-speakers at the level seen in case of N1c. Interestingly, among Saami from North Europe, who are often considered as „outliers“ in genetic studies, the dominant western component is accompanied by 30% of eastern component making them more similar to Volga-Uralic populations than to their closest neighbours.

This seems to validate my thoughts on relics and their importance in age estimation.

U. A. Perego et al. The Initial Peopling Of The Americas: An Ever-Growing Number Of Founding Mitochondrial Genomes From Beringia
Genetic evidence based on mitochondrial DNA (mtDNA) has recently revealed the existence of additional founding lineages that have contributed to the first peopling of America’s double-continent in addition to the more popular five Native American haplogroups (A2, B2, C1, D1 and X2a), and has demonstrated as well the need for additional sampling and analysis to be performed for some of the already known but poorly characterized lineages. One paradigmatic example is represented by the pan-American haplogroup C1. Two of its sub-branches (C1b and C1c) harbor ages and geographical distributions that are indicative of an early arrival from Beringia about 15-17,000 years ago, concomitantly with the other currently accepted Paleo-Indian founders. However, the estimated age of C1d - the third Native American subset of C1 - is only 8-10,000 years, which is suggestive of a much later entry and spread in the Americas. In this study, we shed light on the origin of this enigmatic Native American branch of C1 by completely sequencing a large number of C1d mitochondrial genomes from a wide range of geographically diverse, mixed and indigenous American populations. The revised phylogeny shows that the age previously reported for C1d was heavily underestimated and indicate that C1d is ancient enough to be among the founding Paleo-Indian mtDNA lineages. Moreover, our results reveal that there were two C1d founder genomes for Paleo-Indians that most likely arose early (~16kya), either in the dynamic Beringian gene pool, or at a very initial stage of the Paleo-Indian southward migration. This brings the recognized maternal founding lineages of Native Americans to the unexpected number of 15, and indicates that the overall number of Beringian or Asian founder mitochondrial genomes will probably continue to increase as more Native American haplogroups reach the same level of phylogenetic resolution as we obtained here for C1d. 
Additionally, we have confirmed a nearly identical geographic distribution pattern for haplogroup C1d when comparing samples collected in the general mixed population with those from native tribal groups, as it was also reported previously for haplogroups X2a and D4h3. This substantiates the validity of searching large public mtDNA databases (such as the one available through the Sorenson Molecular Genealogy Foundation) for novel founder candidates able to reveal unknown details concerning the ancient human history of the Americas.

Another interesting abstract. I've written before about the association of Y-chromosome haplogroups with the spread of Semitic speakers and the agreement with language phylogenetics.

N. Al-Zahery et al. The male gene pool of the contemporary Mesopotamia marsh population supports their Semitic origin.
The origin of the modern Mesopotamia marsh people, which are locally called “Ma’dan” or “Marsh’s Arabs”, is a question of great interest. Based on their life-style (living in reed houses, grazing of water buffalo and other aspects) and local archaeological sites, many historians and archaeologists believe they may have Sumerian ancestry. Although little is known about the origin of Sumerians themselves, two main hypotheses have been advanced in this regard. According to the first, Sumerians were a group of populations which migrated from the “South East” following a seashore route through the Arabian Gulf, and settled down in the southern marshes of Iraq. According to the second, the advancement of the Sumerian civilization is the result of migration from the mountainous area of Anatolia to the southern marshes of Iraq where they settled, adsorbing previous populations. In order to shed some light on the genetic origin of the Mesopotamia marsh population, we investigated the male gene pool of 145 DNA samples of modern Mesopotamia people, still living in marshes in the south of Iraq. The analyses of Single Nucleotide Polymorphisms (SNPs) and Short Tandem Repeats (STRs) of the paternally transmitted Male Specific region of the Y chromosome (MSY) revealed that more than 80% of marsh Y chromosomes belong to (Hg) J1-M267, the autochthonous haplogroup of Middle Eastern/Semitic speakers with possible recent expansion and/or founder effect reflected by the reduced STRs variability. In particular, 90% of them were assigned to the J1e-M267-PAGE08 sub-haplogroup, which is the predominant Y chromosome lineage among Middle Eastern Arab populations (Yemen, Qatar, UAE, and Levant). Thus, these findings testify, at least from the paternal side, a strong Semitic Arabian component in the contemporary Mesopotamia marshes population, whereas no clear Anatolian and/or South Asian genetic evidence has been detected.
The finding of haplogroup I in China is surprising, as I is not generally found that far away from Europe. It would be interesting to see what the actual haplotypes are.
Y. Lu et al. Western Eurasian Y chromosomes found in the Chinese Salar ethnic group
Salar is a small Western-Turkish-speaking population living mostly in Qinghai province of China. The most similar languages to Salar are all far in Turkmenistan. Historical records suggested that they may be descendants of the Turkic nomadic tribes in Central Asia. In this study, 141 Salar Y chromosomes were analyzed for 39 SNP and 14 STR markers to investigate the potential imprints of their western ancestors. The most frequent haplogroup (hg) in this population sample is Hg R, comprising 40% of all Y chromosomes. Most of these Hg R samples belong to R1a1 (M17), which distributes in a wide geographic region including South Asia, East Europe, Central Asia, and South Siberia. Other four Western Eurasian haplogroups (G-2%, H-5%, I-3%, J-3%) were also found in Salar Y chromosome gene pool. These paternal lineages of Salar are absent in their East Asian neighbors but frequent in Central Asia. Y-STR-based analyses also grouped Salar to Central Asians. On the other side, Salar also has low frequencies of the East Asian specific Hg D and Hg O, suggesting possible gene flow from their neighboring populations. This Y chromosome study demonstrated that Salar well keeps the Western Eurasian paternal lineages of their Central Asian ancestors although they may have migrated to Central China for about 800 years.

I wish that more "people pairs" would be studied this way, as it would give us some good insight of how migration affects gene pools (allele frequency changes, founder effects, possible social selection etc.)

M. Davis et al. Ancient and recent demographic events influence mitochondrial DNA diversity in an immigrant Basque population
The Basques are an ancient people, considered by many anthropologists to represent the oldest extant European population. Because of this, they have been the subject of numerous sociological and biological investigations. The Basque Diaspora, a relatively recent demographic expansion of the Basque population, has until now been overlooked in genetic studies. Samples were taken from 53 individuals with Basque ancestry in Boise, Idaho, and the mitochondrial DNA (mtDNA) sequence variation of the first and second hypervariable regions were determined. Thirty-six mtDNA haplotypes were detected in the sample. Comparing the genetic diversity in the Idaho sample with other Basque populations, signatures of founder effects were observed, consistent with both the recent and ancient history of Basque mitochondrial lineages. There has been a marked alteration of haplogroup frequency and diversity, and there is a slight reduction in other measures of diversity in the NW Basque population compared to the native Basque population. We have found a relatively high percentage of the Cambridge Reference Sequence (rCRS) haplotype for hypervariable regions I and II, which is absent in previous studies of Basque mtDNA, and rare in other Spanish populations. The amount of nucleotide diversity is consistent with a sample that is predominantly haplogroup H, which is especially common in the Basque regions of Europe, due to ancient migrations and expansions out of glacial refugia. This is the first report of mtDNA diversity in an immigrant Basque population, and we find that the diversity in NW Basques can be explained by the recent history of migration, as well as the phylogeography and diversity of the major European haplogroups.

W. S. Watkins et al. Admixture in New World populations: an analysis of Y-chromosome, mtDNA, and genome-wide microarray data
The first major interaction between Native Americans and Europeans is documented historically and occurred less than 550 years ago. This recent time frame provides an excellent opportunity to investigate the effects of admixture between two populations that were previously separated for hundreds of generations. To characterize European admixture in Native American populations, we sampled and analyzed a group of isolated Totonac agriculturists from tropical Mexico near Veracruz and a group of native Bolivians predominantly from the mountainous region near La Paz, Bolivia. Mitochondrial sequencing of HVS1 showed that all samples had pre-Columbian mtDNA haplogroups (A, B, C, and D). Using a panel of 48 STRs or 12 Y-chromosome SNPs, Totonac Y-chromosomes lineages were all assigned to the pre-Columbian haplogroup Q1a3a, and Bolivian Y-chromosome lineages were assigned to haplogroups Q1a3a, R1, and J2. Haplogroups R1 and J2 are common in European populations. Principal components analysis (PCA) using >800K autosomal SNPs typed in 24 Totonacs and 23 Bolivians showed that all Totonacs and 14 Bolivians clustered distinctly from Eurasian individuals. Nine Bolivians, however, were positioned between the New World and European PCA clusters. Admixture analysis showed that these nine samples had 21 - 33% European admixture using a European reference population. All three observed Y-chromosome haplogroups, including the well-studied pre-Columbian haplogroup Q1a3a, occurred in the admixed individuals. Two of the nine admixed individuals had pre-Columbian mtDNA and Y-chromosome haplogroups but 21-23% European ancestry. This result demonstrates that Y-chromosome and mtDNA haplogroups are only partial indicators of an individual’s complete ancestry.

Readers of the blog know that I don't agree with the scenario presented in the following abstract. The serial founder effect idea is used by geneticists to explain the overall reduced genetic diversity of our species (that we appear to be young, in evolutionary terms). Personally, I don't see how a smart, expanding species that all of a sudden had access to the resources of the landmass of Eurasia went through these extreme bottlenecks.
I think that the alternative of a larger human population, genetic diversity reduced across the species by ongoing climate- and culture-mediated selection, and admixture within Africa itself (where a particular expanding H. sapiens group must have co-existed with pre-existing hominids, anatomically modern or not) has merit.
J. Long et al. Evidence for archaic admixture in contemporary non-African human populations
Analyses of large-scale genetic data sets show evidence for a series of founder effects that occurred as modern humans left Africa and settled the rest of the world. Nonetheless, research on modern humans has not ruled out the possibility that other processes, such as local gene flow, or mixing between archaic and modern humans, have also contributed to modern human diversity. Recent analyses of the Neanderthal genome make archaic admixture a salient issue because they show evidence for mixing between Neanderthals and out-of-Africa migrants. The present study examines evidence for archaic admixture in genotypes for 619 microsatellite loci collected from over 2,000 individuals from 100 human populations. We obtained these data from the Marshfield Clinic collection. The populations analyzed represent all inhabited continents of the world. In our analysis, we formulate the serial founder effects (SFE) model as a special case of a phylogenetic model promoted by Cavalli-Sforza and his associates. In this light, the SFE process makes four predictions: 1) A tree of descent according to the pattern of fissions. 2) The root of the tree lies in Africa. 3) The length of each branch is proportional to ratio of evolutionary time to effective population size. 4) The gene identity between all pairs of populations that share the same most recent common ancestor is equal in expectation. Using hypothesis tests based on generalized hierarchical statistical models, we find good agreement between the SFE predictions and diversity within and between African populations, and we find good agreement between the SFE predictions and diversity between non-African populations. However, there is more diversity within the non-African populations than the SFE model can account for. This makes for greater genetic distance between Africans and non-Africans than otherwise expected. How and where did the non-Africans obtain this diversity? 
A simple explanation for the finding is that the earliest migrants out-of-Africa mixed with an archaic population such as Neanderthals prior to their expansion throughout Europe and Asia. Coalescent based computer simulations of the SFE model with mixing support our interpretation. The time and place that we detect mixing coincides perfectly with that detected in a recent examination of Neanderthal genome sequences. Our study shows that genomic diversity in modern humans still reflects ancient events and processes.

C. Flores et al. Using EuroAIMs to measure admixture proportions in atypical European populations: the case of Canary Islanders
Using ancestry informative markers (AIMs) allows reducing the number of markers needed for population stratification adjustments in association studies. As few as 100 AIMs are sufficient to adjust for the largest European axis of differentiation (i.e. EuroAIMs). However, their use for ancestry inference and adjustment in association studies in atypical European populations such as the Canary Islanders, a recently African-admixed population from Spain, needs to be addressed. We aimed to explore whether EuroAIMs were suitable both for the inference of Spanish and Northwest African admixture proportions and for ancestry adjustments in association studies including samples from Canary Islanders. We analyzed samples from Canary Islanders, mainland Spanish (IBE) and Northwest Africans (NWA) for 93 EuroAIMs and compared the data with CEU and YRI from HapMap, Basques and Mozabite from HGDP, as well as from previously analyzed European samples. The major genetic difference was observed between NWA and all European populations, preserving the northwest-to-southeast differentiation of European populations in the second axis. Analyses revealed that Canary Islanders were intermediate between IBE and NWA, and that direct sub-Saharan African influences were negligible. Assessment of individual admixtures without prior population information clearly identified two subpopulations corresponding to NWA and IBE, while Canary Islanders were admixed with an average of 17.4% Northwest African contribution varying largely among individuals (range 0-95.7%). As few as 23 EuroAIMs correctly estimated population membership to IBE and NWA, while 69 EuroAIMs were required to accurately estimate individual admixture proportions in Canary Islanders. Ancestry estimates based on a subset of 69 EuroAIMs also controlled significant allele frequency differences between IBE and Canary Islanders. 
These data suggest that a handful of EuroAIMs would be useful to control false-positives in association studies performed in Spanish populations. Supported by FUNCIS 23/07 and grants from the Spanish Ministry of Science and Innovation PI081383 and EMER07/001 to CF.
As I have mentioned before, the Maasai (and many other east Africans in various degrees) are intermediate between Negroids and Caucasoids, and hence admixture estimates considering Yoruba Nigerians would tend to underestimate the African element. It's important to remember that extant Africans are not uniform, ranging from Caucasoids to Negroids, Pygmies, and Khoi-San, with multiple identifiable clusters within the major Negroid group itself, and all sorts of between-group gene flow on a regional basis. It is always useful (as is the case e.g., with African Americans) both to use historical knowledge about population sources and to validate historical narratives with the genetic evidence.
R. L. Raaum et al. Autosomal African admixture in Yemeni populations.
Approximately 30% of mtDNA lineages in South Arabian samples are African L haplotypes, whose origin has usually been attributed to migration and assimilation of African females into the Arabian population over approximately the last 2,500 years. In contrast, few Y chromosome lineages of clear recent sub-Saharan African origin have been found in Southern Arabian populations. This bias in maternal and paternal lineages is in accord with historical accounts of the female bias in the Middle Eastern slave trade. In order to evaluate autosomal African ancestry, we collected high-resolution SNP genotype data from a geographically representative set of 62 Yemenis selected from a collection of 552 samples acquired in the Spring of 2007. The ancestry of chromosomal segments in the Yemeni population was estimated using a haplotype-based local ancestry estimation method, HAPMIX. The HAPMIX method is based on a two way admixture model that requires two phased reference populations; we used the HapMap Yoruba in Ibadan, Nigeria (YRI), Luhya in Webuye, Kenya (LWK), Maasai in Kinyawa, Kenya (MKK), and CEPH US residents with ancestry from northern and western Europe (CEU) samples. The three African reference populations include two Bantu-speaking groups (YRI and LWK) and one Nilotic-speaking group (MKK). We estimated local ancestry in the Yemeni sample with all three European-African reference population combinations (CEU-YRI, CEU-LWK, CEU-MKK). The correlations among African ancestry calculated using all three reference population combinations are high (r > 0.98 in all pairwise correlations). Furthermore, there is no significant difference between the average proportion of African ancestry in Yemenis calculated using either of the two Bantu-speaking reference populations: CEU-YRI (mean 0.062, sd 0.044) and CEU-LWK (mean 0.076, sd 0.049) (p=0.13, two-tailed Welch two sample t-test). 
However, the average African ancestry calculated using the Maasai reference population (CEU-MKK, mean 0.148, sd 0.060) is significantly greater than that calculated using either the Yoruba or Luhya reference populations (p < 0.0001 in both comparisons, two-tailed Welch two sample t-test). These data suggest that the source population for the African ancestry of the Yemeni population is more similar to the contemporary Maasai population than either the Luhya or Yoruba.
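As a rough illustration of the Welch two-sample t-test mentioned in the abstract, here is a minimal sketch that works from summary statistics alone. It assumes n = 62 for every comparison (the number of Yemenis genotyped) and uses the rounded means and standard deviations quoted above, so it should not be expected to reproduce the published p-values exactly. The function name is hypothetical, and the p-value uses a normal approximation to the t distribution, which is my simplification rather than the authors' procedure.

```python
import math
from statistics import NormalDist


def welch_from_summary(m1, s1, n1, m2, s2, n2):
    """Welch's unequal-variance t-test from means, SDs and sample sizes.

    Returns (t, df, p_approx). The degrees of freedom come from the
    Welch-Satterthwaite formula. p_approx is a two-tailed p-value from a
    normal approximation, reasonable only when df is large.
    """
    v1 = s1 ** 2 / n1  # squared standard error of the first mean
    v2 = s2 ** 2 / n2  # squared standard error of the second mean
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


N = 62  # assumption: all 62 Yemenis contribute to each estimate

# CEU-LWK vs CEU-YRI, then CEU-MKK vs CEU-YRI (figures from the abstract).
t_bantu, df_bantu, p_bantu = welch_from_summary(0.076, 0.049, N, 0.062, 0.044, N)
t_maasai, df_maasai, p_maasai = welch_from_summary(0.148, 0.060, N, 0.062, 0.044, N)
```

Under these assumptions the Luhya-Yoruba comparison gives a t statistic of roughly 1.7, and the Maasai-Yoruba one of roughly 9, which matches the direction of the reported result. Since the same 62 individuals are analyzed under each reference pair, a paired test would arguably also be a natural choice, but the abstract reports Welch's test.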
The next abstract seems fun; it's always nice to see something that isn't like everything that came before it.
T. Rzeszutek et al. Music as a novel marker in the study of prehistoric human migrations.
The study of prehistoric human population history is often fraught with controversy owing to incongruent evidence among various markers of present-day genetic and cultural diversity. While archaeological evidence can be used to calibrate the conclusions drawn from present-day diversity, the fickle nature of the fossil record leaves some migration histories unresolved. Our work analyzes the potential of music - in particular, vocal music - to serve as novel migration marker, bolstering established migration work and shedding light on regions of the world whose settlement history is contested. One such migration is the recent expansion of Austronesian-speaking peoples across the Pacific within the last 6000 years. The dominant hypothesis posits a recent origin in Taiwan, with a rapid movement southwards and eastwards to populate Polynesia during the following 3500 years. While this model is strongly supported by both archaeological evidence and the present-day distribution of linguistic diversity, our goal was to analyze whether music could serve as a novel line of evidence in the study of Pacific prehistory. A critical concern regarding any migration marker is its time depth. In order to examine this for music, we analyzed correlations between musical diversity and mitochondrial-DNA diversity in 9 Taiwanese aboriginal tribes for which both types of data were available. A sample of 226 choral songs was analyzed using 39 binary characters representing significant structural features of music (e.g., rhythm, interval size, melodic contour, etc.). The musical samples were restricted to ritual musics, which constitute the most conservative (i.e., slowly changing) component of a culture’s repertoire. Mantel tests showed a significant correlation between musical distance and genetic distance among these 9 tribes, suggesting that music may have a time depth comparable to widely-used genetic markers like mitochondrial DNA. 
This work demonstrates that music has the potential to enrich the conclusions drawn from other markers, and establishes methods for employing it as a tool in the study of prehistoric human movements throughout the world. At the same time, we want to capitalize on music’s own unique dynamics of change over time and place, particularly its capacity for admixture. In other words, music might not only be able to support the narratives told by other migration markers but shed new light on the histories of population movement and cultural contact.
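
For readers unfamiliar with the Mantel test mentioned in the abstract, here is a minimal sketch of how such a test is commonly run: correlate the off-diagonal entries of two distance matrices, then permute the rows and columns of one matrix together to build a null distribution. This is a generic one-sided permutation version, not the authors' actual procedure; the function name `mantel_test`, the permutation count, and the synthetic 9-group data in the usage example are all assumptions made for illustration.

```python
import numpy as np


def mantel_test(dist_a, dist_b, permutations=9999, seed=0):
    """One-sided Mantel test (hypothetical helper, not from the abstract).

    Returns the Pearson correlation between the upper triangles of two
    square distance matrices, and a permutation p-value for the
    hypothesis that the correlation is positive.
    """
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("inputs must be square matrices of equal size")

    n = a.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of groups counted once
    a_vec = a[upper]
    observed = np.corrcoef(a_vec, b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(permutations):
        perm = rng.permutation(n)
        # Relabel the groups of one matrix: rows and columns move together.
        b_perm = b[np.ix_(perm, perm)]
        if np.corrcoef(a_vec, b_perm[upper])[0, 1] >= observed:
            hits += 1

    # Count the observed arrangement itself so the p-value is never zero.
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Synthetic stand-in for 9 groups; NOT the Taiwanese data.
    rng = np.random.default_rng(1)
    points = rng.normal(size=(9, 2))
    genetic = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    noise = rng.normal(scale=0.1, size=genetic.shape)
    musical = np.abs(genetic + (noise + noise.T) / 2)
    np.fill_diagonal(musical, 0.0)
    print(mantel_test(genetic, musical))
```

The permutation step is the reason a Mantel test is used instead of an ordinary correlation test: the pairwise distances are not independent observations, so significance has to be judged against relabelings of the groups.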

The bolded part in the following abstract makes sense, as it indicates (i) the distinctiveness of Ashkenazi Jews compared to CEU Europeans, and (ii) fairly recent and widespread admixture (within the last couple of generations), which has produced individuals who are genomically 1/4, 1/2, and 3/4 AJ.

V. Vacic et al., Admixture in Ashkenazi Jewish cohorts and implications for association studies.
Studies of complex genetic disorders may benefit from focusing on population isolates, such as Ashkenazi Jews (AJ). However, in order to truly exploit the advantages of reduced genetic diversity the self-declared AJ ancestry of study participants should be independently confirmed with available genetic data. We investigate whether the AJ cohorts display genetic heterogeneity, such as different rates of admixing in cases and controls, which could potentially confound disease association studies. We applied principal component analysis (PCA) to AJ cohorts ascertained in Israel and the US East Coast with the goal of characterizing population structure. As described previously, when compared to the HapMap samples with CEU, YRI and CHB/JPT ancestry, virtually all AJ samples cluster with the CEU. Similar analysis done on CEU and Jewish HapMap samples from Ashkenazi, Sephardic and Middle Eastern Jewish communities revealed that 97.8% of AJ samples cluster along the AJ-CEU axis, with modes at AJ and CEU cluster centers and at approximately quartile distances between them. We postulate that these groups correspond to 100-0, 75-25, 50-50, 25-75, and 0-100% AJ-CEU admixtures. Notably, only 91.7% of self-reported AJ individuals fall into the reference JHapMap panel AJ cluster, with 1.6, 3.3, 0.5 and 0.7% in the admixed modes ordered by decreasing fraction of AJ ancestry. We also observe admixing with the non-AJ Jewish communities: 0.7% of samples fall within the non-AJ clusters and 1.4% at a subgroup approximately halfway between the AJ and non-AJ cluster centers. In our dataset we found that when compared to the sample as a whole or only to controls, individuals with Crohn’s disease (CD) show significantly more admixing: 78.1, 3.1, 8.5, 2.0 and 0.9% in the 100, 75, 50, 25 and 0% AJ subgroups respectively. Also, CD samples show more admixing with non-AJ groups (2.8 and 1.0% in the 50-50 and 0-100 AJ-non-AJ subgroups).
Isolates typically exhibit a greater amount of cryptic relatedness compared to outbred populations, which motivates an orthogonal method for verifying AJ ancestry based on identity-by-descent (IBD). The high background level of IBD within the Ashkenazi Jewish community can be used to estimate degree of AJ ancestry by averaging the IBD between a sample under study and the AJ individuals in the JHapMap panel. Our preliminary results show that this method recapitulates the high-level results from the PCA analysis and provides better resolution.
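
To make the "quartile distances along the AJ-CEU axis" idea concrete, here is a rough sketch of one way to turn PCA coordinates into an admixture estimate: project each sample onto the line joining the two reference cluster centers, then snap the resulting fraction to the nearest of the five postulated modes. The abstract does not say how the authors assigned samples to modes, so this is only an assumed illustration; the function names, the toy coordinates, and the nearest-mode rule are all hypothetical.

```python
import numpy as np


def fraction_along_axis(samples, center_a, center_b):
    """Position of each sample along the line from center_b to center_a.

    Returns 1.0 at center_a (e.g. the AJ reference centroid) and 0.0 at
    center_b (e.g. the CEU centroid). Hypothetical helper for
    illustration; inputs are coordinates in principal-component space.
    """
    samples = np.asarray(samples, dtype=float)
    a = np.asarray(center_a, dtype=float)
    b = np.asarray(center_b, dtype=float)
    axis = a - b
    length_sq = axis @ axis
    if length_sq == 0:
        raise ValueError("cluster centers must be distinct")
    # Scalar projection of (sample - b) onto the axis, scaled to [0, 1]
    # for points lying between the two centers.
    return (samples - b) @ axis / length_sq


def nearest_mode(fractions, modes=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Snap each fraction to the closest postulated admixture mode."""
    fractions = np.asarray(fractions, dtype=float)
    modes_arr = np.asarray(modes, dtype=float)
    idx = np.abs(fractions[:, None] - modes_arr[None, :]).argmin(axis=1)
    return modes_arr[idx]


if __name__ == "__main__":
    # Toy two-dimensional PC coordinates; NOT real data.
    aj_center, ceu_center = (1.0, 0.0), (0.0, 0.0)
    pcs = [(1.0, 0.1), (0.52, -0.05), (0.0, 0.2), (0.76, 0.0)]
    frac = fraction_along_axis(pcs, aj_center, ceu_center)
    print(nearest_mode(frac))
```

A real analysis would also have to handle samples that fall well off the axis (the abstract notes that 97.8% of AJ samples lie along it, so some do not), which this sketch ignores.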