Genetics doi: 10.1534/genetics.112.145599
The Effect of Genomic Inversions on Estimation of Population Genetic Parameters from SNP Data
Nafisa-Katrin Seich al Basatena et al.
In recent years it has emerged that structural variants have a substantial impact on genomic variation. Inversion polymorphisms represent a significant class of structural variant, and despite the challenges in their detection, data on inversions in the human genome are increasing rapidly. Statistical methods for inferring parameters such as the recombination rate and selection coefficient have generally been developed without accounting for the presence of inversions. Here we exploit new software for simulating inversions in population genetic data, invertFREGENE, to assess the potential impact of inversions on such methods. Using data simulated by invertFREGENE, as well as real data from several sources, we test whether large inversions have a disruptive effect on widely applied population genetics methods for inferring recombination rates, for detecting selection, and for controlling for population structure in genome-wide association studies (GWAS). We find that recombination rates estimated by LDhat are biased downward at inversion loci relative to the true contemporary recombination rates at the loci but that recombination hotspots are not falsely inferred at inversion breakpoints as may have been expected. We find that the iHS method for detecting selection appears robust to the presence of inversions. Finally, we observe a strong bias on the genome-wide results of principle components analysis (PCA), used to control for population structure in GWAS, in the presence of even a single large inversion, confirming the necessity to thin SNPs by LD at large physical distances in order to obtain unbiased results.