November 06, 2013

Dealing with false positive IBD segments

False positive IBD segments are a real problem for those who wish to use genotype data to establish family connections with distant relatives. Traditionally, this involves finding shared IBD segments and then comparing genealogies to find potential common ancestors from whom these segments could have been inherited. IBD is also used in population genetics (e.g., Ralph & Coop 2013). There is an obvious tradeoff: sloppy IBD detection may allow more genealogical links to be proposed, but it adds to the burden of validating those links (the infamous "ignoring contact requests from potential genetic cousins" issue). It would be nice if this technology found its way to the end users who stand to benefit most from it.

arXiv:1311.1120 [q-bio.PE]

Reducing pervasive false positive identical-by-descent segments detected by large-scale pedigree analysis

Eric Y. Durand, Nicholas Eriksson, Cory Y. McLean

(Submitted on 5 Nov 2013)

Analysis of genomic segments shared identical-by-descent (IBD) between individuals is fundamental to many genetic applications, but IBD detection accuracy in non-simulated data is largely unknown. Using 25,432 genotyped European individuals, and exploiting known familial relationships in 2,952 father-mother-child trios contained therein, we identify a false positive rate over 67% for short (2-4 centiMorgan) segments. We introduce a novel, computationally-efficient, haplotype-based metric that enables accurate IBD detection on population-scale datasets.
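
The abstract does not spell out how the trios are used, but the general idea can be sketched: a segment that a child truly shares IBD with some other individual should also show up, at roughly the same position, in at least one of the child's parents. The Python below is only an illustration under that assumption. It is not the authors' haplotype-based metric, and the names Segment, covered_fraction, false_positive_rate and the 0.8 coverage threshold are all hypothetical.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    other: str       # ID of the individual the segment is shared with
    chrom: int
    start_cm: float  # genetic position in centiMorgans
    end_cm: float


def covered_fraction(seg, parent_segs):
    """Fraction of seg overlapped by one parent's segments with the same individual."""
    intervals = sorted(
        (max(p.start_cm, seg.start_cm), min(p.end_cm, seg.end_cm))
        for p in parent_segs
        if p.other == seg.other and p.chrom == seg.chrom
        and p.start_cm < seg.end_cm and p.end_cm > seg.start_cm
    )
    covered, cursor = 0.0, seg.start_cm
    for lo, hi in intervals:
        lo = max(lo, cursor)  # do not count overlapping parent segments twice
        if hi > lo:
            covered += hi - lo
            cursor = hi
    return covered / (seg.end_cm - seg.start_cm)


def false_positive_rate(child_segs, father_segs, mother_segs, min_cover=0.8):
    """Share of child segments that neither parent supports (assumed threshold)."""
    if not child_segs:
        return 0.0
    unsupported = sum(
        1 for s in child_segs
        if max(covered_fraction(s, father_segs),
               covered_fraction(s, mother_segs)) < min_cover
    )
    return unsupported / len(child_segs)


# Toy data: four short child segments, only the first is backed by a parent.
child = [Segment("x", 1, 10, 13), Segment("y", 1, 20, 23),
         Segment("z", 2, 5, 8), Segment("w", 3, 0, 4)]
father = [Segment("x", 1, 9, 14)]
mother = [Segment("y", 1, 22, 23)]  # covers only a third of the child's segment

print(false_positive_rate(child, father, mother))  # 0.75
```

A real analysis would also have to account for false negatives in the parents, which this sketch ignores.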



Charles Nydorf said...

I am surprised that the true positive rate for 2-4 cM IBD segments even approaches 33%.

Mrs. Watanabe said...

I believe that the 67% false positive rate is per 2-4 cM segment, not per person.