Journal of Computational Biology | 2019

Mendelian Inconsistent Signatures from 1314 Ancestrally Diverse Family Trios Distinguish Biological Variation from Sequencing Error


Abstract


Next-generation sequencing enables advances in the clinical application of genomics by providing high-throughput detection of genomic variation. However, next-generation sequencing technologies, especially whole-genome sequencing (WGS), are often associated with a high false-positive rate. Trio-based WGS can contribute significantly towards improved quality control methods. Mendelian-inconsistent calls (MIC) in parent–child trios are commonly attributed to erroneous sequencing calls, as the true de novo mutation rate is extremely low compared with MIC incidence. Here, we analyzed WGS data from 1314 mother, father, and child trios across ethnically diverse populations with the goal of characterizing MIC. Genotype calls in a trio can be used to assign different signatures to MIC. MIC occur more frequently within repeats but show varying distribution and error mechanisms across repeat types. MIC are enriched within poly-A/T runs in short interspersed nuclear elements. Alignability scores, allele balance, and relative parental read depth vary among MIC signatures and these differences should be considered when designing filters for MIC reduction. MIC cluster in germline deletions and these MIC also segregate with population. Our results provide a basis for making decisions on how each MIC type should be evaluated before discarding them as errors or including them in alternative applications. With the reduction of sequencing cost, family trio whole-genome and exome analyses are being performed more routinely in clinical practice. We provide a reference that can be used for annotating MIC with their frequencies in a larger population to aid in the filtering of candidate de novo mutations.
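To illustrate how trio genotype calls can flag a Mendelian-inconsistent call and yield a signature, the following is a minimal sketch and not the procedure used in the paper. It assumes a diploid autosomal site with genotypes given as pairs of allele indices (0 for reference, 1 for alternate). The function names and the signature string format are hypothetical and chosen only for illustration.

```python
from itertools import product


def is_mendelian_consistent(mother, father, child):
    """Return True if the child genotype can be formed by taking one
    allele from the mother and one from the father.

    Assumption: diploid autosomal site; each genotype is a pair of
    allele indices such as (0, 1).
    """
    possible = {tuple(sorted(pair)) for pair in product(mother, father)}
    return tuple(sorted(child)) in possible


def format_genotype(genotype):
    """Render a genotype as an unphased string, e.g. (1, 0) -> '0/1'."""
    return "/".join(str(allele) for allele in sorted(genotype))


def mic_signature(mother, father, child):
    """Hypothetical signature label: the trio's genotype combination,
    written as mother x father -> child."""
    return (
        f"{format_genotype(mother)}x{format_genotype(father)}"
        f"->{format_genotype(child)}"
    )


if __name__ == "__main__":
    trio = ((0, 0), (0, 0), (0, 1))  # both parents homozygous reference
    if not is_mendelian_consistent(*trio):
        print("MIC with signature", mic_signature(*trio))
```

In this sketch, a heterozygous child of two homozygous-reference parents is reported as inconsistent. Such a call is either a candidate de novo mutation or a sequencing error, which is the distinction the abstract describes.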

Volume 26
Pages 405 - 419
DOI 10.1089/cmb.2018.0253
Language English
Journal Journal of Computational Biology
