With the aid of PhyML (default parameters and 100 bootstrap replicates), here is what I get:
Note that the above was done by isolating the Y-SNPs of 28 unrelated males in the data. I also threw out all SNPs that had no-calls. I tried to infer terminal classifications for the different individuals based on current ISOGG nomenclature, although it's possible that there are downstream mutations that I missed. NA18940, which is cut off in the figure, is D2a-M116.1, and NA19649 is R1b1a2a1a1b2a1a-L20. I couldn't quite figure out NA19670.
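For anyone who wants to repeat the no-call filtering, here is a minimal Python sketch of that step. It is not the script used for the tree above: the sample names, the allele strings, and the set of no-call symbols are all made up for illustration, and it assumes the Y-SNP calls are already stored as one string per male with the SNPs in the same order. The helper `to_phylip` (also hypothetical) writes the filtered alignment as PHYLIP-style text, which is a format PhyML reads.

```python
# Assumed no-call symbols; adjust to whatever your genotype files use.
NO_CALL = {"N", "-", "0", "?"}


def drop_no_call_sites(calls):
    """Keep only the SNP columns where every sample has a called allele."""
    samples = list(calls)
    n_sites = len(calls[samples[0]])
    if any(len(calls[s]) != n_sites for s in samples):
        raise ValueError("all samples must cover the same SNPs")
    keep = [
        i for i in range(n_sites)
        if all(calls[s][i] not in NO_CALL for s in samples)
    ]
    return {s: "".join(calls[s][i] for i in keep) for s in samples}


def to_phylip(calls):
    """Format the alignment as sequential PHYLIP-style text."""
    n_sites = len(next(iter(calls.values())))
    lines = [f"{len(calls)} {n_sites}"]
    lines += [f"{name} {seq}" for name, seq in calls.items()]
    return "\n".join(lines) + "\n"


# Toy data (hypothetical): three samples, five SNPs, two with a no-call.
toy = {
    "sample_A": "ACGTN",
    "sample_B": "GCGTA",
    "sample_C": "AC-CA",
}
filtered = drop_no_call_sites(toy)
print(to_phylip(filtered))
```

Dropping a whole SNP when any one male has a no-call is the strictest option; it shrinks the alignment but leaves no missing data in it.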
Here is the tree code for anyone who wants to play with it:
(NA21732_MKK_E1b1b1a1c-V22:0.00295296,NA21737_MKK_E1b1b1a1c-V22:0.00270971,(NA20510_TSI_E1b1b1a1b-V13:0.02092798,((NA19239_YRI_E1a2-P110:0.08297556,(NA18940_JPT_D2a-M116.1:0.12168757,((NA12891_Utah_I1a4-Z63:0.01039303,(NA06994_CEU_I1a3a1b-Z73:0.00945249,NA20511_TSI_I1a3a1a-Z140:0.01697537)54:0.00049670)100:0.07005125,(NA19670_MXL_?:0.08181470,(NA18558_CHB_NO-M214:0.10352315,(NA19735_MXL_Q1a3a1-M3:0.05725892,(NA20845_GIH_R2a1a-L294:0.05384443,((NA20846_GIH_R1a1a1b2-Z93:0.01116683,NA20850_GIH_R1a1a1b2-Z93:0.00810758)100:0.02945897,(((NA12889_Utah_R1b1a2a1a1b5-DF19:0.01118764,HG00731_PUR_R1b1a2a1a1b1-DF27:0.00955759)24:0.00020607,NA07357_CEU_R1b1a2a1a1b2-U152:0.00963759)21:0.00012284,(NA19649_MXL_R1b1a2a1a1b2a1a-L20:0.05342631,(NA20509_TSI_R1b1a2a1a1b2c3-Z146:0.00943290,NA10851_CEU_R1b1a2a1a1b3-L21:0.01307073)15:0.00019824)5:0.00000002)100:0.03211886)100:0.00848072)100:0.00921281)100:0.02274049)100:0.00299324)54:0.00000005)100:0.03274296)100:0.01917298)100:0.00999561,((NA18504_YRI_E1b1a1a1f1a1-U174:0.01490192,(NA19026_LWK_E1b1a1a1f1a1-U174:0.01033155,NA19834_ASW_E1b1a1a1f1a1-U174:0.00961706)75:0.00031406)100:0.00787632,(NA18501_YRI_E1b1a-V38:0.00966164,((NA19020_LWK_E1b1a-V38:0.01193590,NA19025_LWK_E1b1a-V38:0.00793484)100:0.00317626,(NA19700_ASW_E1b1a-V38:0.01130782,NA19703_ASW_E1b1a-V38:0.01054823)77:0.00022885)100:0.00122837)100:0.00457000)100:0.04581828)100:0.04898306)100:0.01922685);
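If you would rather poke at the tree in code than in a viewer, the Newick string can be read with a few lines of Python. What follows is a minimal sketch rather than a full Newick parser: it assumes unquoted labels with no comments, treats the number after each closing parenthesis as the bootstrap support, and strips whitespace first so that a copy of the tree that was wrapped across lines still parses. The function names and the dictionary layout are my own choices for illustration. To keep it short, the example uses only the I1 clade copied from the tree above.

```python
import re

LABEL = re.compile(r"[^,:;()]*")
NUMBER = re.compile(r"[0-9.eE+-]+")


def parse_newick(text):
    """Parse a Newick string into nested dicts (label, length, children)."""
    s = re.sub(r"\s+", "", text)  # undo line wrapping inside the string
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else ""

    def node():
        nonlocal pos
        children = []
        if peek() == "(":
            pos += 1  # consume "("
            children.append(node())
            while peek() == ",":
                pos += 1
                children.append(node())
            if peek() != ")":
                raise ValueError(f"expected ')' at offset {pos}")
            pos += 1
        # Leaf name, or bootstrap support on an internal node.
        m = LABEL.match(s, pos)
        label, pos = m.group(0), m.end()
        length = None
        if peek() == ":":
            m = NUMBER.match(s, pos + 1)
            if m is None:
                raise ValueError(f"expected branch length at offset {pos}")
            length, pos = float(m.group(0)), m.end()
        return {"label": label, "length": length, "children": children}

    root = node()
    if s[pos:] != ";":
        raise ValueError("tree must end with ';'")
    return root


def leaves(n):
    """Yield the leaf nodes below n, left to right."""
    if not n["children"]:
        yield n
    for child in n["children"]:
        yield from leaves(child)


# The I1 clade, copied from the tree above.
I1_CLADE = (
    "(NA12891_Utah_I1a4-Z63:0.01039303,"
    "(NA06994_CEU_I1a3a1b-Z73:0.00945249,"
    "NA20511_TSI_I1a3a1a-Z140:0.01697537)54:0.00049670)100:0.07005125;"
)

tree = parse_newick(I1_CLADE)
for leaf in leaves(tree):
    # Labels follow the sample_population_haplogroup pattern used above.
    sample, population, haplogroup = leaf["label"].split("_", 2)
    print(sample, population, haplogroup, leaf["length"])
```

The same call should work on the full tree string. Keep in mind that the branch lengths are PhyML's estimates in substitutions per site, not SNP counts.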
Nice tree, Dienekes! But you should add the number of Y SNPs for each branch, so that you and others can estimate dates for various lineages.
Am I missing something, or does it claim that R1b is more closely related to the Asian R1a than to the European R1a?
There is no European R1a here.
"There is no European R1a here."
Ah! All right. I missed that it's an R2, not an R1.
When I googled L294, I got a list of individuals that included Czechs and Slovaks, so I was fooled into thinking this must be the European R1a. X-D
In case it helps, our collaborative spreadsheet at https://docs.google.com/spreadsheet/ccc?key=0Agq_ez43qXCjdFlxemtlUnZ1Qk01cVhMRVBFcm5WX3c&authkey=CIOag_UD#gid=12 has NA19670 as G2a3b1a2 (L497+). (I didn't personally contribute to this particular categorization.)
@GregRM,
Thanks! I had read somewhere that within F, haplogroup G branches off early, and this seems consistent with that.
I think the deep phylogeny will be near perfectly resolved soon, based on the papers that are coming out.