Bioinformatics | 2019

Kinpute: using identity by descent to improve genotype imputation

 
 

Abstract


MOTIVATION
Genotype imputation, though generally accurate, often leaves many genotypes poorly imputed, particularly in studies where the individuals are not well represented by standard reference panels. When individuals in the study share regions of the genome identical by descent (IBD), this information can be combined with a study-specific reference panel (SSRP) to improve the imputation results. Kinpute uses IBD information, due to either recent familial relatedness or distant unknown ancestors, in conjunction with the output from linkage disequilibrium (LD)-based imputation methods to compute more accurate genotype probabilities. Kinpute uses a novel method for IBD imputation, which works even in the absence of a pedigree and results in substantially improved imputation quality.

RESULTS
Given initial estimates of average IBD between subjects in the study sample, Kinpute uses a novel algorithm to select an optimal set of individuals to sequence and use as an SSRP. Kinpute takes as input both this SSRP and the genotype probabilities output by other LD-based imputation software, and uses a new method to combine the LD-imputed genotype probabilities with IBD configurations to substantially improve imputation. We tested Kinpute on a human population isolate in which 98 individuals have been sequenced. In half of this sample, whose sequence data were masked, we used Impute2 to perform LD-based imputation and then used Kinpute to obtain more accurate genotype probabilities. Measures of imputation accuracy improved significantly, particularly for those genotypes that Impute2 imputed with low certainty.

AVAILABILITY
Kinpute is an open-source and freely available C++ software package that can be downloaded from.

SUPPLEMENTARY INFORMATION
Supplementary information is available at Bioinformatics online.
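To make the idea of combining LD-imputed genotype probabilities with IBD information concrete, here is a minimal Python sketch. It is not Kinpute's algorithm, which the abstract does not specify. It is a textbook simplification under assumptions introduced only for illustration: a single sequenced relative, known pairwise IBD probabilities (k0, k1, k2), a known alternate-allele frequency `p`, no inbreeding, Hardy-Weinberg proportions, and the two sources of evidence treated as independent. All function names are hypothetical.

```python
# Illustrative sketch only: NOT Kinpute's method. All names are hypothetical.
# Genotypes are coded as the number of alternate alleles: 0, 1, or 2.

def hwe_probs(p):
    """Population genotype probabilities under Hardy-Weinberg proportions."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]


def ibd1_probs(rel_genotype, p):
    """Target genotype probabilities when exactly one allele is shared IBD
    with a relative of known genotype; the other allele is drawn from the
    population with alternate-allele frequency p."""
    q = 1.0 - p
    if rel_genotype == 0:      # shared allele must be the reference allele
        return [q, p, 0.0]
    if rel_genotype == 2:      # shared allele must be the alternate allele
        return [0.0, q, p]
    # Heterozygous relative: the shared allele is ref or alt with prob. 1/2.
    return [0.5 * q, 0.5, 0.5 * p]


def ibd_genotype_probs(rel_genotype, ibd, p):
    """Mix the IBD0, IBD1, and IBD2 cases weighted by (k0, k1, k2)."""
    k0, k1, k2 = ibd
    pop = hwe_probs(p)
    one = ibd1_probs(rel_genotype, p)
    same = [1.0 if g == rel_genotype else 0.0 for g in range(3)]
    return [k0 * pop[g] + k1 * one[g] + k2 * same[g] for g in range(3)]


def combine(ld_probs, ibd_probs):
    """Combine two genotype probability vectors by a normalized product
    (assumes the two sources are independent, a simplification)."""
    joint = [a * b for a, b in zip(ld_probs, ibd_probs)]
    total = sum(joint)
    if total == 0.0:           # sources contradict each other: keep LD result
        return list(ld_probs)
    return [x / total for x in joint]


if __name__ == "__main__":
    # An uncertain LD-based call, sharpened by a relative who is homozygous
    # for the alternate allele and shares one allele IBD at this site.
    ld = [0.5, 0.4, 0.1]
    ibd = ibd_genotype_probs(rel_genotype=2, ibd=(0.0, 1.0, 0.0), p=0.1)
    print(combine(ld, ibd))
```

In this toy case the IBD evidence rules out the homozygous-reference genotype, so the probability mass that the LD-based call placed there is redistributed across the remaining genotypes. This is the kind of gain the abstract describes for genotypes imputed with low certainty.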

DOI 10.1093/bioinformatics/btz221
Language English
Journal Bioinformatics

Full Text