Publication


Featured research published by Ryan Poplin.


Nature Genetics | 2011

A framework for variation discovery and genotyping using next-generation DNA sequencing data

Mark A. DePristo; Eric Banks; Ryan Poplin; Kiran Garimella; Jared Maguire; Christopher Hartl; Anthony A. Philippakis; Guillermo Del Angel; Manuel A. Rivas; Matt Hanna; Aaron McKenna; Timothy Fennell; Andrew Kernytsky; Andrey Sivachenko; Kristian Cibulskis; Stacey B. Gabriel; David Altshuler; Mark J. Daly

Recent advances in sequencing technology make it possible to comprehensively catalog genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious, and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (i) initial read mapping; (ii) local realignment around indels; (iii) base quality score recalibration; (iv) SNP discovery and genotyping to find all potential variants; and (v) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We here discuss the application of these tools, instantiated in the Genome Analysis Toolkit, to deep whole-genome, whole-exome capture and multi-sample low-pass (∼4×) 1000 Genomes Project datasets.
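Step (iii) above, base quality score recalibration, can be illustrated with a small sketch. It is not the Genome Analysis Toolkit implementation: it only shows the general idea of comparing reported Phred qualities against empirically observed mismatch rates. The function names, the input format (pairs of reported quality and a mismatch flag) and the add-one smoothing are assumptions made for illustration.

```python
import math
from collections import defaultdict


def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def recalibrate(observations):
    """Build a table mapping reported quality -> empirical quality.

    observations: iterable of (reported_quality, is_mismatch) pairs,
    a hypothetical input format; real tools also condition on other
    covariates, which this sketch ignores.
    """
    counts = defaultdict(lambda: [0, 0])  # quality -> [mismatches, total]
    for q, is_mismatch in observations:
        counts[q][0] += int(is_mismatch)
        counts[q][1] += 1

    table = {}
    for q, (errors, total) in counts.items():
        # Add-one smoothing (an assumption) keeps bins with zero observed
        # errors from mapping to an infinite quality.
        p = (errors + 1) / (total + 2)
        table[q] = round(error_prob_to_phred(p))
    return table
```

Bases reported at a given quality that mismatch the reference more often than that quality implies are assigned a lower empirical quality by the table.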


Nature | 2016

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek; Konrad J. Karczewski; Eric Vallabh Minikel; Kaitlin E. Samocha; Eric Banks; Timothy Fennell; Anne H. O’Donnell-Luria; James S. Ware; Andrew Hill; Beryl B. Cummings; Taru Tukiainen; Daniel P. Birnbaum; Jack A. Kosmicki; Laramie Duncan; Karol Estrada; Fengmei Zhao; James Zou; Emma Pierce-Hoffman; Joanne Berghout; David Neil Cooper; Nicole Deflaux; Mark A. DePristo; Ron Do; Jason Flannick; Menachem Fromer; Laura Gauthier; Jackie Goldstein; Namrata Gupta; Daniel P. Howrigan; Adam Kiezun

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human ‘knockout’ variants in protein-coding genes.


Nature | 2012

Patterns and rates of exonic de novo mutations in autism spectrum disorders

Benjamin M. Neale; Yan Kou; Li Liu; Avi Ma'ayan; Kaitlin E. Samocha; Aniko Sabo; Chiao-Feng Lin; Christine Stevens; Li-San Wang; Vladimir Makarov; Pazi Penchas Polak; Seungtai Yoon; Jared Maguire; Emily L. Crawford; Nicholas G. Campbell; Evan T. Geller; Otto Valladares; Chad Shafer; Han Liu; Tuo Zhao; Guiqing Cai; Jayon Lihm; Ruth Dannenfelser; Omar Jabado; Zuleyma Peralta; Uma Nagaswamy; Donna M. Muzny; Jeffrey G. Reid; Irene Newsham; Yuanqing Wu

Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, here we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant, and the overall rate of mutation is only modestly higher than the expected rate. In contrast, the proteins encoded by genes that harboured de novo missense or nonsense mutations showed a higher degree of connectivity among themselves and to previous ASD genes as indexed by protein-protein interaction screens. The small increase in the rate of de novo events, when taken together with the protein interaction results, are consistent with an important but limited role for de novo point mutations in ASD, similar to that documented for de novo copy number variants. Genetic models incorporating these data indicate that most of the observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (that is, not necessarily sufficient for disease). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increases risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case–control study provide strong evidence in favour of CHD8 and KATNAL2 as genuine autism risk factors.


Current Protocols in Bioinformatics | 2013

From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline.

Geraldine A. Van der Auwera; Mauricio O. Carneiro; Christopher Hartl; Ryan Poplin; Guillermo Del Angel; Ami Levy-Moonshine; Tadeusz Jordan; Khalid Shakir; David Roazen; Joel Thibault; Eric Banks; Kiran Garimella; David Altshuler; Stacey Gabriel; Mark A. DePristo

This unit describes how to use BWA and the Genome Analysis Toolkit (GATK) to map genome sequencing data to a reference and produce high‐quality variant calls that can be used in downstream analyses. The complete workflow includes the core NGS data‐processing steps that are necessary to make the raw data suitable for analysis by the GATK, as well as the key methods involved in variant discovery using the GATK. Curr. Protoc. Bioinform. 43:11.10.1‐11.10.33.


Science | 2015

Effect of predicted protein-truncating genetic variants on the human transcriptome

Manuel A. Rivas; Matti Pirinen; Donald F. Conrad; Monkol Lek; Emily K. Tsang; Konrad J. Karczewski; Julian Maller; Kimberly R. Kukurba; David S. DeLuca; Menachem Fromer; Pedro G. Ferreira; Kevin S. Smith; Rui Zhang; Fengmei Zhao; Eric Banks; Ryan Poplin; Douglas M. Ruderfer; Shaun Purcell; Taru Tukiainen; Eric Vallabh Minikel; Peter D. Stenson; David Neil Cooper; Katharine H. Huang; Timothy J. Sullivan; Jared L. Nedzel; Carlos Bustamante; Jin Billy Li; Mark J. Daly; Roderic Guigó; Peter Donnelly

Expression, genetic variation, and tissues

Human genomes show extensive genetic variation across individuals, but we have only just started documenting the effects of this variation on the regulation of gene expression. Furthermore, only a few tissues have been examined per genetic variant. In order to examine how genetic expression varies among tissues within individuals, the Genotype-Tissue Expression (GTEx) Consortium collected 1641 postmortem samples covering 54 body sites from 175 individuals. They identified quantitative genetic traits that affect gene expression and determined which of these exhibit tissue-specific expression patterns. Melé et al. measured how transcription varies among tissues, and Rivas et al. looked at how truncated protein variants affect expression across tissues. Science, this issue p. 648, p. 660, p. 666; see also p. 640

Protein-truncated variants impact gene expression levels and splicing across human tissues. [Also see Perspective by Gibson]

Accurate prediction of the functional effect of genetic variation is critical for clinical genome interpretation. We systematically characterized the transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects. We quantitated tissue-specific and positional effects on nonsense-mediated transcript decay and present an improved predictive model for this decay. We directly measured the effect of variants both proximal and distal to splice junctions. Furthermore, we found that robustness to heterozygous gene inactivation is not due to dosage compensation. Our results illustrate the value of transcriptome data in the functional interpretation of genetic variants.


PLOS Genetics | 2013

Analysis of Rare, Exonic Variation amongst Subjects with Autism Spectrum Disorders and Population Controls

Li Liu; Aniko Sabo; Benjamin M. Neale; Uma Nagaswamy; Christine Stevens; Elaine T. Lim; Corneliu A. Bodea; Donna M. Muzny; Jeffrey G. Reid; Eric Banks; Hillary Coon; Mark A. DePristo; Huyen Dinh; Tim Fennel; Jason Flannick; Stacey Gabriel; Kiran Garimella; Shannon Gross; Alicia Hawes; Lora Lewis; Vladimir Makarov; Jared Maguire; Irene Newsham; Ryan Poplin; Stephan Ripke; Khalid Shakir; Kaitlin E. Samocha; Yuanqing Wu; Eric Boerwinkle; Joseph D. Buxbaum

We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analyses of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD.


bioRxiv | 2017

Scaling accurate genetic variant discovery to tens of thousands of samples

Ryan Poplin; Valentin Ruano-Rubio; Mark A. DePristo; Timothy Fennell; Mauricio O. Carneiro; Geraldine A. Van der Auwera; David E. Kling; Laura Gauthier; Ami Levy-Moonshine; David Roazen; Khalid Shakir; Joel Thibault; Sheila Chandran; Chris Whelan; Monkol Lek; Stacey Gabriel; Mark J. Daly; Benjamin M. Neale; Daniel G. MacArthur; Eric Banks

Comprehensive disease gene discovery in both common and rare diseases will require the efficient and accurate detection of all classes of genetic variation across tens to hundreds of thousands of human samples. We describe here a novel assembly-based approach to variant calling, the GATK HaplotypeCaller (HC) and Reference Confidence Model (RCM), that determines genotype likelihoods independently per-sample but performs joint calling across all samples within a project simultaneously. We show by calling over 90,000 samples from the Exome Aggregation Consortium (ExAC) that, in contrast to other algorithms, the HC-RCM scales efficiently to very large sample sizes without loss in accuracy; and that the accuracy of indel variant calling is superior in comparison to other algorithms. More importantly, the HC-RCM produces a fully squared-off matrix of genotypes across all samples at every genomic position being investigated. The HC-RCM is a novel, scalable, assembly-based algorithm with abundant applications for population genetics and clinical studies.


BMC Genomics | 2015

The distribution and mutagenesis of short coding INDELs from 1,128 whole exomes

Danny Challis; Lilian Antunes; Erik Garrison; Eric Banks; Uday S. Evani; Donna M. Muzny; Ryan Poplin; Richard A. Gibbs; Gabor T. Marth; Fuli Yu

Background: Identifying insertion/deletion polymorphisms (INDELs) with high confidence has been intrinsically challenging in short-read sequencing data. Here we report our approach for improving INDEL calling accuracy by using a machine learning algorithm to combine call sets generated with three independent methods, and by leveraging the strengths of each individual pipeline. Utilizing this approach, we generated a consensus exome INDEL call set from a large dataset generated by the 1000 Genomes Project (1000G), maximizing both the sensitivity and the specificity of the calls.

Results: This consensus exome INDEL call set features 7,210 INDELs, from 1,128 individuals across 13 populations included in the 1000 Genomes Phase 1 dataset, with a false discovery rate (FDR) of about 7.0%.

Conclusions: In our study we further characterize the patterns and distributions of these exonic INDELs with respect to density, allele length, and site frequency spectrum, as well as the potential mutagenic mechanisms of coding INDELs in humans.
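The idea of merging call sets from independent methods can be sketched with a simple support-count rule. This is a deliberately simpler stand-in, not the machine learning approach the abstract describes: a variant is kept when at least `min_support` call sets contain it. The variant tuple layout `(chrom, pos, ref, alt)`, the function name and the threshold are all hypothetical.

```python
from collections import Counter


def consensus_calls(call_sets, min_support=2):
    """Return variants reported by at least `min_support` call sets.

    call_sets: list of iterables of hashable variant keys, here assumed
    to be (chrom, pos, ref, alt) tuples.
    """
    support = Counter()
    for calls in call_sets:
        # set() ensures a caller that lists a variant twice counts once.
        support.update(set(calls))
    return sorted(v for v, n in support.items() if n >= min_support)
```

A learned combiner could weigh each method differently by context, which a fixed vote threshold cannot do; the sketch only shows the set-merging step.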


Nature Biotechnology | 2018

A universal SNP and small-indel variant caller using deep neural networks

Ryan Poplin; Pi-Chuan Chang; David Alexander; Scott Schwartz; Thomas Colthurst; Alexander Ku; Dan Newburger; Jojo Dijamco; Nam Nguyen; Pegah T Afshar; Sam S. Gross; Lizzie Dorfman; Cory Y. McLean; Mark A. DePristo

Despite rapid advances in sequencing technologies, accurately calling genetic variants present in an individual genome from billions of short, errorful sequence reads remains challenging. Here we show that a deep convolutional neural network can call genetic variation in aligned next-generation sequencing read data by learning statistical relationships between images of read pileups around putative variant and true genotype calls. The approach, called DeepVariant, outperforms existing state-of-the-art tools. The learned model generalizes across genome builds and mammalian species, allowing nonhuman sequencing projects to benefit from the wealth of human ground-truth data. We further show that DeepVariant can learn to call variants in a variety of sequencing technologies and experimental designs, including deep whole genomes from 10X Genomics and Ion Ampliseq exomes, highlighting the benefits of using more automated and generalizable techniques for variant calling.
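To make the notion of an "image of a read pileup" concrete, the sketch below lays aligned reads out as rows of a fixed-width window around a candidate site, with one integer code per base. It is a toy encoding under stated assumptions, not DeepVariant's actual input format: the base codes, the `(start_position, sequence)` read representation and the function name are hypothetical, and gaps, base qualities and strand are ignored.

```python
# Hypothetical integer codes; 0 marks "no base" (outside the read or unknown).
BASE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode_pileup(reads, window_start, window_size):
    """Encode reads as a 2D grid: one row per read, one column per position.

    reads: list of (start_position, sequence) pairs, assumed to be
    gap-free alignments on the same coordinate system as window_start.
    """
    rows = []
    for start, seq in reads:
        row = [0] * window_size
        for offset, base in enumerate(seq):
            col = start + offset - window_start
            if 0 <= col < window_size:
                row[col] = BASE_CODES.get(base, 0)
        rows.append(row)
    return rows
```

A grid like this is the kind of array a convolutional network can take as input, with the true genotype at the window's center serving as the training label.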


American Journal of Human Genetics | 2014

Simulation of Finnish Population History, Guided by Empirical Genetic Data, to Assess Power of Rare-Variant Tests in Finland

Sophie R. Wang; Vineeta Agarwala; Jason Flannick; Charleston W. K. Chiang; David Altshuler; Alisa Manning; Christopher Hartl; Pierre Fontanillas; Todd Green; Eric Banks; Mark A. DePristo; Ryan Poplin; Khalid Shakir; Timothy Fennell; Jacquelyn Murphy; Noël P. Burtt; Stacey Gabriel; Christian Fuchsberger; Hyun Min Kang; Xueling Sim; Clement Ma; Adam E. Locke; Thomas W. Blackwell; Anne U. Jackson; Tanya M. Teslovich; Heather M. Stringham; Peter S. Chines; Phoenix Kwan; Jeroen R. Huyghe; Adrian Tan

Collaboration


Dive into Ryan Poplin's collaborations.

Top Co-Authors


Donna M. Muzny

Baylor College of Medicine
