bioRxiv | 2021

Reconstructing SNP Allele and Genotype Frequencies from GWAS Summary Statistics

 
 
 

Abstract


The emergence of genome-wide association studies (GWAS) has led to the creation of large repositories of human genetic variation, creating enormous opportunities for genetic research and worldwide collaboration. Methods based on GWAS summary statistics seek to leverage such records, overcoming the barriers that often restrict access to individual-level data while also offering significant computational savings. Here, we propose a novel framework that reconstructs allelic and genotypic counts and frequencies for each SNP from case-control GWAS summary statistics. The framework is simple and efficient and does not require complicated underlying assumptions. To illustrate its potential, we also propose three summary-statistics-based applications, implemented in a new software package (ReACt): GWAS meta-analysis (with and without sample overlap), case-case GWAS, and, for the first time, group polygenic risk score (PRS) estimation. We evaluate our methods against the current state of the art on both synthetic data and real genotype data and show high performance in power and error control. Our group PRS method based on summary statistics could not be achieved prior to the proposed framework, and we demonstrate its potential applications and advantages. Our work further highlights the potential of summary-statistics-based methodologies for elucidating the genetic background of complex disease and opens up new avenues for research.

DOI 10.1101/2021.04.02.438281
Language English
Journal bioRxiv

Full Text