March 13, 2012

Auxiliary TreeMix scripts

Joe Pickrell has put up a useful Python script for TreeMix on the software page. It takes PLINK-formatted input and converts it into TreeMix-formatted input. You only need to specify a PLINK cluster file to indicate the correspondence between individuals and populations.

I was planning to release my own R script that does the same, and which I've used to produce this, but it is now redundant.

But since some people might want to try TreeMix with ADMIXTURE components, I am releasing my script that takes ADMIXTURE output and produces TreeMix-formatted input.

To use it, type in R:

ADMIXTUREtoTreeMix(PFILE='data.5.P', QFILE='data.5.Q', NAMES='components.txt', outfile='plink.treemix', GZIP=T)

data.5.P and data.5.Q are produced by an ADMIXTURE run.

This will output a plink.treemix.gz file that is ready for input into TreeMix.

If you specify GZIP=F, the file won't be gzipped, so you can inspect it visually and gzip it later yourself, since TreeMix takes *.gz input.
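If you went the GZIP=F route and want to compress the file afterwards without leaving a scripting environment, something like the following Python sketch should do. The helper name gzip_file is made up for illustration and is not part of the R script or of TreeMix.

```python
import gzip
import shutil


def gzip_file(src, dst=None):
    """Compress src into dst (default: src + '.gz') and return dst.

    Hypothetical helper for illustration only. The original file is
    left in place, so it can still be inspected as plain text.
    """
    dst = dst or src + ".gz"
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, gzip_file('plink.treemix') would write plink.treemix.gz next to the uncompressed file.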

The 'components.txt' file is a list of names for the components of ADMIXTURE, one per row.
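For readers who want a feel for what such a conversion involves, here is a rough Python sketch. It is not the R script itself, whose source is not reproduced in this post. It assumes that the .P file has one row per SNP and one column per component (allele frequencies), that the .Q file has one row per individual and one column per component (ancestry fractions), and that TreeMix expects a header line of population names followed by one line per SNP with a pair of comma-separated allele counts per population. The step that turns frequencies into counts (twice the column sum of Q as the number of allele copies per component) is my own guess at a reasonable choice, and the function and argument names are hypothetical.

```python
import gzip


def read_matrix(path):
    """Read a whitespace-delimited numeric matrix, one row per line."""
    with open(path) as fh:
        return [[float(x) for x in line.split()] for line in fh if line.strip()]


def admixture_to_treemix(p_file, q_file, names_file, outfile, gzip_output=True):
    """Write a TreeMix-style count file from ADMIXTURE P and Q files.

    Illustrative sketch only; returns the path that was written.
    """
    P = read_matrix(p_file)  # SNPs x K allele frequencies
    Q = read_matrix(q_file)  # individuals x K ancestry fractions
    with open(names_file) as fh:
        names = [line.strip() for line in fh if line.strip()]

    k = len(names)
    if any(len(row) != k for row in P) or any(len(row) != k for row in Q):
        raise ValueError("P, Q and the names file disagree on the number of components")

    # ASSUMPTION: treat each component as a population whose number of
    # allele copies is 2 * (sum of that component's fractions over all
    # individuals), with a floor of 1 so no component ends up empty.
    n = [max(1, round(2 * sum(row[j] for row in Q))) for j in range(k)]

    lines = [" ".join(names)]
    for row in P:
        cells = []
        for j, freq in enumerate(row):
            a = round(freq * n[j])  # copies of one allele
            cells.append(f"{a},{n[j] - a}")  # remaining copies are the other allele
        lines.append(" ".join(cells))

    path = outfile + ".gz" if gzip_output else outfile
    opener = gzip.open if gzip_output else open
    with opener(path, "wt") as fh:
        fh.write("\n".join(lines) + "\n")
    return path
```

Called as admixture_to_treemix('data.5.P', 'data.5.Q', 'components.txt', 'plink.treemix'), this would write plink.treemix.gz with the names from components.txt as the header line.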
