Comments on Dienekes’ Anthropology Blog: Classifier for 23andMe/deCODEme genotype data

2008-06-13T21:05:00.000+03:00
Thanks! I'll try playing around ...
cacio

2008-06-13T14:11:00.000+03:00
The probability that a person from, say, the NW European group will have a particular genotype is the product of the probabilities that they have a particular genotype at each individual SNP (assuming independence; I didn't look into the marker selection process in the paper, but I don't think they would have picked tightly linked markers).

The probability that they have the reference allele for a SNP is the frequency of that allele, which can be read off the authors' table. One minus that frequency is the probability that they don't have it.

So, if in a particular group the ref alleles are:

ACGT

and their frequencies are:
0.2, 0.3, 0.1, 0.4

and the individual has a genotype:
ACCT

Then we calculate:
0.2*0.3*(1-0.1)*0.4

for him, and the same for all three populations. Finally you have to normalize the three results so that they add up to 1.

Another important consideration is to use Log(P) and sum the logs rather than multiply the probabilities, because if you multiply hundreds of probabilities you may end up with 0, i.e., the machine can't represent such a small number.
Dienekes

2008-06-13T08:07:00.000+03:00
Intriguing. Are you weighting the marker values using the weights provided (frequencies) and then normalizing the three values to 1? Or are you using something more sophisticated?

I may try something similar if I have time. Thanks for the post.

cacio
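
The calculation described in Dienekes' reply above (a per-SNP product, summed as logs, then normalized across populations) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code behind the original classifier: the function names, the `EPS` clamp, and the frequencies for "Group B" and "Group C" are hypothetical, and only the ACGT / 0.2, 0.3, 0.1, 0.4 / ACCT example comes from the comment. Like that example, the sketch treats each SNP as a single allele call, and normalizing the raw likelihoods implicitly gives every population equal prior weight.

```python
import math

# Assumption: clamp frequencies away from 0 and 1 so that log() is always defined.
EPS = 1e-12


def log_likelihood(genotype, ref_alleles, ref_freqs):
    """Sum of per-SNP log-probabilities, assuming the SNPs are independent."""
    total = 0.0
    for allele, ref, freq in zip(genotype, ref_alleles, ref_freqs):
        # Reference allele: its frequency; anything else: one minus that frequency.
        p = freq if allele == ref else 1.0 - freq
        p = min(max(p, EPS), 1.0 - EPS)
        total += math.log(p)
    return total


def classify(genotype, populations):
    """Return normalized probabilities for each population.

    `populations` maps a name to (ref_alleles, ref_freqs).
    """
    logs = {
        name: log_likelihood(genotype, refs, freqs)
        for name, (refs, freqs) in populations.items()
    }
    # Subtract the largest log before exponentiating, so the best-fitting
    # population gets weight 1 and the others cannot all underflow to 0.
    best = max(logs.values())
    weights = {name: math.exp(value - best) for name, value in logs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Worked example from the comment: ref alleles ACGT, genotype ACCT.
populations = {
    "NW European": ("ACGT", [0.2, 0.3, 0.1, 0.4]),
    "Group B": ("ACGT", [0.5, 0.5, 0.5, 0.5]),  # hypothetical frequencies
    "Group C": ("ACGT", [0.1, 0.1, 0.8, 0.2]),  # hypothetical frequencies
}
result = classify("ACCT", populations)
for name, prob in result.items():
    print(f"{name}: {prob:.4f}")
```

For the "NW European" row, `math.exp(log_likelihood(...))` recovers approximately 0.0216, which is the product 0.2*0.3*(1-0.1)*0.4 from the comment. With hundreds or thousands of markers the plain product would underflow to 0, while the summed logs stay representable.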