February 22, 2012

ChromoPainter/fineSTRUCTURE analysis of select South Asian/West Eurasian populations

This is the final result of the analysis mentioned in this previous post on the Kalash, using all 22 chromosomes.

Because ChromoPainter's running time is quadratic in the number of individuals, I took a random sample of 15 individuals from every included population with more than 15 individuals. The final set included 392 individuals. It appears that a set of ~400 individuals/~260k SNPs can be processed in about 2 weeks on a single thread.
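
A minimal sketch of this kind of per-population downsampling, in Python. It is not the script used for this analysis: the function name, the `labels` mapping, the fixed seed, and the toy population labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def subsample_by_population(labels, cap=15, seed=0):
    """Keep at most `cap` randomly chosen individuals per population.

    labels: dict mapping individual ID -> population label (hypothetical input).
    Populations with `cap` or fewer individuals are kept whole.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_pop = defaultdict(list)
    for individual, population in labels.items():
        by_pop[population].append(individual)

    kept = []
    for population in sorted(by_pop):
        members = sorted(by_pop[population])
        if len(members) > cap:
            members = rng.sample(members, cap)  # sample without replacement
        kept.extend(members)
    return kept


# Toy data: one population above the cap, one below it.
labels = {f"A{i}": "PopA" for i in range(40)}
labels.update({f"B{i}": "PopB" for i in range(8)})

kept = subsample_by_population(labels)
print(len(kept))  # 15 from PopA plus all 8 from PopB
```

Capping each population limits the total number of individuals, which is what matters when the cost grows with the number of pairs of individuals.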

The raw chunkcounts between all individuals can be obtained from here.

The heatmap can be seen below:


The principal components analysis shows the familiar West-to-South Asia cline:

More information can be found in the spreadsheet, including:

  • How many individuals from each population were assigned to each of 51 clusters
  • Individual assignments of all 392 individuals
  • Raw chunkcounts between all 33 different populations
  • Z scores of the above (by row)
  • Z scores of the above (by column)

How to read the Z scores:

  • by row: scan each line to see which populations (columns) are the bigger donors for each row.
  • by column: scan each column to see which populations (rows) are the bigger recipients for each column.
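
The two standardizations can be sketched in Python as follows. This is an illustration only: it assumes rows are recipients and columns are donors (as in the reading guide above), it uses the population standard deviation (the spreadsheet may use the sample version), and the toy matrix is made up rather than taken from the results.

```python
from statistics import mean, pstdev


def zscores_by_row(matrix):
    """Standardize each row: subtract the row mean, divide by the row SD."""
    result = []
    for row in matrix:
        mu, sd = mean(row), pstdev(row)
        # A constant row has SD 0; report zeros instead of dividing by zero.
        result.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    return result


def zscores_by_column(matrix):
    """Standardize each column by transposing, reusing the row version,
    and transposing back."""
    transposed = [list(col) for col in zip(*matrix)]
    z_transposed = zscores_by_row(transposed)
    return [list(row) for row in zip(*z_transposed)]


# Made-up chunkcounts: rows = recipient populations, columns = donor populations.
counts = [
    [10.0, 20.0, 30.0],
    [5.0, 5.0, 20.0],
    [1.0, 2.0, 3.0],
]

z_row = zscores_by_row(counts)
z_col = zscores_by_column(counts)

# For the first recipient (row 0), the biggest donor is the column with the highest Z.
top_donor = max(range(len(z_row[0])), key=lambda j: z_row[0][j])
print(top_donor)
```

Standardizing by row puts every recipient on the same scale, so the donors of a small population can be compared with those of a large one; standardizing by column does the same for each donor's recipients.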

Finally, in the RAR file you can find some plots of Z scores (by row) for the different populations.

For example, here is a list of donors for the Kalash population; the order is slightly different compared to the teaser, but the overall pattern is the same.


Compare with an outbred population, such as the Armenians:

4 comments:

Daro said...

"More information can be found in the spreadsheet, including..."

Sorry, but I cannot find the spreadsheet.

Dienekes said...

I've added the link

valeryz2001 said...

Dienekes, it's a matter of taste, but the plot literally cries out to be transformed non-specially, to fit the geography of samples :)

n/a said...

How long are the "chunks"?