Daniel R. Lavage
Geisinger Health System
Publication
Featured research published by Daniel R. Lavage.
Science | 2016
Frederick E. Dewey; Michael F. Murray; John D. Overton; Lukas Habegger; Joseph B. Leader; Samantha N. Fetterolf; Colm O’Dushlaine; Cristopher V. Van Hout; Jeffrey Staples; Claudia Gonzaga-Jauregui; Raghu Metpally; Sarah A. Pendergrass; Monica A. Giovanni; H. Lester Kirchner; Suganthi Balasubramanian; Noura S. Abul-Husn; Dustin N. Hartzel; Daniel R. Lavage; Korey A. Kost; Jonathan S. Packer; Alexander E. Lopez; John Penn; Semanti Mukherjee; Nehal Gosalia; Manoj Kanagaraj; Alexander H. Li; Lyndon J. Mitnaul; Lance J. Adams; Thomas N. Person; Kavita Praveen
Unleashing the power of precision medicine Precision medicine promises the ability to identify risks and treat patients on the basis of pathogenic genetic variation. Two studies combined exome sequencing results for over 50,000 people with their electronic health records. Dewey et al. found that ∼3.5% of individuals in their cohort had clinically actionable genetic variants. Many of these variants affected blood lipid levels that could influence cardiovascular health. Abul-Husn et al. extended these findings to investigate the genetics and treatment of familial hypercholesterolemia, a risk factor for cardiovascular disease, within their patient pool. Genetic screening helped identify at-risk patients who could benefit from increased treatment. Science, this issue p. 10.1126/science.aaf6814, p. 10.1126/science.aaf7000 More than 50,000 exomes, coupled with electronic health records, inform on medically relevant genetic variants. INTRODUCTION Large-scale genetic studies of integrated health care populations, with phenotypic data captured natively in the documentation of clinical care, have the potential to unveil genetic associations that point the way to new biology and therapeutic targets. This setting also represents an ideal test bed for the implementation of genomics in routine clinical care in service of precision medicine. RATIONALE The DiscovEHR collaboration between the Regeneron Genetics Center and Geisinger Health System aims to catalyze genomic discovery and precision medicine by coupling high-throughput exome sequencing to longitudinal electronic health records (EHRs) of participants in Geisinger’s MyCode Community Health Initiative. Here, we describe initial insights from whole-exome sequencing of 50,726 adult participants of predominantly European ancestry using clinical phenotypes derived from EHRs. 
RESULTS The median duration of EHR data associated with sequenced participants was 14 years, with a median of 87 clinical encounters, 687 laboratory tests, and seven procedures per participant. Forty-eight percent of sequenced individuals had one or more first- or second-degree relatives in the sample, and genome-wide autozygosity was similar to other outbred European populations. We found ~4.2 million single-nucleotide variants and insertion/deletion events, of which ~176,000 are predicted to result in loss of gene function (LoF). The overwhelming majority of these genetic variants occurred at a minor allele frequency of ≤1%, and more than half were singletons. Each participant harbored a median of 21 rare predicted LoFs. At this sample size, ~92% of sequenced genes, including genes that encode existing drug targets or confer risk for highly penetrant genetic diseases, harbor rare heterozygous predicted LoF variants. About 7% of sequenced genes contained rare homozygous predicted LoF variants in at least one individual. Linking these data to EHR-derived laboratory phenotypes revealed consequences of partial or complete LoF in humans. Among these were previously unidentified associations between predicted LoFs in CSF2RB and basophil and eosinophil counts, and EGLN1-associated erythrocytosis segregating in genetically identified family networks. Using predicted LoFs as a model for drug target antagonism, we found associations supporting the majority of therapeutic targets for lipid lowering. To highlight the opportunity for genotype-phenotype association discovery, we performed exome-wide association analyses of EHR-derived lipid values, newly implicating rare predicted LoFs, and deleterious missense variants in G6PC in association with triglyceride levels. In a survey of 76 clinically actionable disease-associated genes, we estimated that 3.5% of individuals harbor pathogenic or likely pathogenic variants that meet criteria for clinical action. 
Review of the EHR uncovered findings associated with the monogenic condition in ~65% of pathogenic variant carriers’ medical records. CONCLUSION The findings reported here demonstrate the value of large-scale sequencing in an integrated health system population, add to the knowledge base regarding the phenotypic consequences of human genetic variation, and illustrate the challenges and promise of genomic medicine implementation. DiscovEHR provides a blueprint for large-scale precision medicine initiatives and genomics-guided therapeutic target discovery. Therapeutic target validation and genomic medicine in DiscovEHR. (A) Associations between predicted LoF variants in lipid drug target genes and lipid levels. Boxes correspond to effect size, given as the absolute value of effect, in SD units; whiskers denote 95% confidence intervals for effect. The size of the box is proportional to the logarithm (base 10) of predicted LoF carriers. (B and C) Prevalence and expressivity of clinically actionable genetic variants in 76 disease genes, according to EHR data. G76, Geisinger-76. The DiscovEHR collaboration between the Regeneron Genetics Center and Geisinger Health System couples high-throughput sequencing to an integrated health care system using longitudinal electronic health records (EHRs). We sequenced the exomes of 50,726 adult participants in the DiscovEHR study to identify ~4.2 million rare single-nucleotide variants and insertion/deletion events, of which ~176,000 are predicted to result in a loss of gene function. Linking these data to EHR-derived clinical phenotypes, we find clinical associations supporting therapeutic targets, including genes encoding drug targets for lipid lowering, and identify previously unidentified rare alleles associated with lipid levels and other blood level traits. About 3.5% of individuals harbor deleterious variants in 76 clinically actionable genes. 
The DiscovEHR data set provides a blueprint for large-scale precision medicine initiatives and genomics-guided therapeutic discovery.
JAMA | 2017
Amit Khera; Hong-Hee Won; Gina M. Peloso; Colm O'Dushlaine; Dajiang J. Liu; Nathan O. Stitziel; Pradeep Natarajan; Akihiro Nomura; Connor A. Emdin; Namrata Gupta; Ingrid B. Borecki; Rosanna Asselta; Stefano Duga; Piera Angelica Merlini; Adolfo Correa; Thorsten Kessler; James G. Wilson; Matthew J. Bown; Alistair S. Hall; Peter S. Braund; David J. Carey; Michael F. Murray; H. Lester Kirchner; Joseph B. Leader; Daniel R. Lavage; J. Neil Manus; Dustin N. Hartzel; Nilesh J. Samani; Heribert Schunkert; Jaume Marrugat
Importance The activity of lipoprotein lipase (LPL) is the rate-determining step in clearing triglyceride-rich lipoproteins from the circulation. Mutations that damage the LPL gene (LPL) lead to lifelong deficiency in enzymatic activity and can provide insight into the relationship of LPL to human disease. Objective To determine whether rare and/or common variants in LPL are associated with early-onset coronary artery disease (CAD). Design, Setting, and Participants In a cross-sectional study, LPL was sequenced in 10 CAD case-control cohorts of the multinational Myocardial Infarction Genetics Consortium and a nested CAD case-control cohort of the Geisinger Health System DiscovEHR cohort between 2010 and 2015. Common variants were genotyped in up to 305 699 individuals of the Global Lipids Genetics Consortium and up to 120 600 individuals of the CARDIoGRAM Exome Consortium between 2012 and 2014. Study-specific estimates were pooled via meta-analysis. Exposures Rare damaging mutations in LPL included loss-of-function variants and missense variants annotated as pathogenic in a human genetics database or predicted to be damaging by computer prediction algorithms trained to identify mutations that impair protein function. Common variants in the LPL gene region included those independently associated with circulating triglyceride levels. Main Outcomes and Measures Circulating lipid levels and CAD. Results Among 46 891 individuals with LPL gene sequencing data available, the mean (SD) age was 50 (12.6) years and 51% were female. A total of 188 participants (0.40%; 95% CI, 0.35%-0.46%) carried a damaging mutation in LPL, including 105 of 32 646 control participants (0.32%) and 83 of 14 245 participants with early-onset CAD (0.58%). Compared with 46 703 noncarriers, the 188 heterozygous carriers of an LPL damaging mutation displayed higher plasma triglyceride levels (19.6 mg/dL; 95% CI, 4.6-34.6 mg/dL) and higher odds of CAD (odds ratio = 1.84; 95% CI, 1.35-2.51; P < .001). 
An analysis of 6 common LPL variants resulted in an odds ratio for CAD of 1.51 (95% CI, 1.39-1.64; P = 1.1 × 10−22) per 1-SD increase in triglycerides. Conclusions and Relevance The presence of rare damaging mutations in LPL was significantly associated with higher triglyceride levels and presence of coronary artery disease. However, further research is needed to assess whether there are causal mechanisms by which heterozygous lipoprotein lipase deficiency could lead to coronary artery disease.
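As a rough check on the carrier counts reported above, the following minimal sketch computes a crude (unadjusted) odds ratio from the pooled 2×2 table, with a Woolf-type 95% confidence interval. This is an illustration only: the published odds ratio of 1.84 comes from a meta-analysis of study-specific estimates, so the crude pooled figure is not expected to match it exactly. The function name `crude_odds_ratio` is hypothetical.

```python
import math

def crude_odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Unadjusted odds ratio and Woolf 95% CI from a single 2x2 table."""
    a = case_carriers                      # carriers among cases
    b = case_total - case_carriers         # noncarriers among cases
    c = control_carriers                   # carriers among controls
    d = control_total - control_carriers   # noncarriers among controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Counts taken from the abstract: 83 of 14 245 early-onset CAD cases and
# 105 of 32 646 controls carried a damaging LPL mutation.
or_, lower, upper = crude_odds_ratio(83, 14245, 105, 32646)
carrier_frequency = 188 / 46891  # overall carrier frequency, about 0.40%
print(f"crude OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
print(f"carrier frequency = {carrier_frequency:.2%}")
```

The crude estimate comes out near 1.8, in the same direction and of similar size as the pooled result, which is what one would hope for if the individual cohorts are broadly consistent.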
Circulation Research | 2017
Akihiro Nomura; Hong-Hee Won; Amit Khera; Fumihiko Takeuchi; Kaoru Ito; Shane McCarthy; Connor A. Emdin; Derek Klarin; Pradeep Natarajan; Seyedeh M. Zekavat; Namrata Gupta; Gina M. Peloso; Ingrid B. Borecki; Tanya M. Teslovich; Rosanna Asselta; Stefano Duga; Piera Angelica Merlini; Adolfo Correa; Thorsten Kessler; James G. Wilson; Matthew J. Bown; Alistair S. Hall; Peter S. Braund; David J. Carey; Michael F. Murray; H. Lester Kirchner; Joseph B. Leader; Daniel R. Lavage; J. Neil Manus; Dustin N. Hartze
Rationale: Therapies that inhibit CETP (cholesteryl ester transfer protein) have failed to demonstrate a reduction in risk for coronary heart disease (CHD). Human DNA sequence variants that truncate the CETP gene may provide insight into the efficacy of CETP inhibition. Objective: To test whether protein-truncating variants (PTVs) at the CETP gene were associated with plasma lipid levels and CHD. Methods and Results: We sequenced the exons of the CETP gene in 58 469 participants from 12 case–control studies (18 817 CHD cases, 39 652 CHD-free controls). We defined PTV as those that lead to a premature stop, disrupt canonical splice sites, or lead to insertions/deletions that shift frame. We also genotyped 1 Japanese-specific PTV in 27 561 participants from 3 case–control studies (14 286 CHD cases, 13 275 CHD-free controls). We tested association of CETP PTV carrier status with both plasma lipids and CHD. Among 58 469 participants with CETP gene-sequencing data available, average age was 51.5 years and 43% were women; 1 in 975 participants carried a PTV at the CETP gene. Compared with noncarriers, carriers of PTV at CETP had higher high-density lipoprotein cholesterol (effect size, 22.6 mg/dL; 95% confidence interval, 18–27; P<1.0×10−4), lower low-density lipoprotein cholesterol (−12.2 mg/dL; 95% confidence interval, −23 to −0.98; P=0.033), and lower triglycerides (−6.3%; 95% confidence interval, −12 to −0.22; P=0.043). CETP PTV carrier status was associated with reduced risk for CHD (summary odds ratio, 0.70; 95% confidence interval, 0.54–0.90; P=5.1×10−3). Conclusions: Compared with noncarriers, carriers of PTV at CETP displayed higher high-density lipoprotein cholesterol, lower low-density lipoprotein cholesterol, lower triglycerides, and lower risk for CHD.
Pacific Symposium on Biocomputing | 2016
Anurag Verma; Joseph B. Leader; Shefali S. Verma; Alex T. Frase; John R. Wallace; Scott M. Dudek; Daniel R. Lavage; Cristopher V. Van Hout; Frederick E. Dewey; John Penn; Alexander E. Lopez; John D. Overton; David J. Carey; David H. Ledbetter; H. Lester Kirchner; Marylyn D. Ritchie; Sarah A. Pendergrass
Electronic health records (EHR) provide a comprehensive resource for discovery, allowing unprecedented exploration of the impact of genetic architecture on health and disease. The data of EHRs also allow for exploration of the complex interactions between health measures across health and disease. The discoveries arising from EHR-based research provide important information for the identification of genetic variation for clinical decision-making. Due to the breadth of information collected within the EHR, a challenge for discovery using EHR-based data is the development of high-throughput tools that expose important areas of further research, from genetic variants to phenotypes. Phenome-Wide Association studies (PheWAS) provide a way to explore the association between genetic variants and comprehensive phenotypic measurements, generating new hypotheses and also exposing the complex relationships between genetic architecture and outcomes, including pleiotropy. EHR-based PheWAS have mainly evaluated associations with case/control status from International Classification of Diseases, Ninth Revision (ICD-9) codes. While these studies have highlighted discovery through PheWAS, the rich resource of clinical lab measures collected within the EHR can be better utilized for high-throughput PheWAS analyses and discovery. To better use these resources and enrich PheWAS association results we have developed a sound methodology for extracting a wide range of clinical lab measures from EHR data. We have extracted a first set of 21 clinical lab measures from the de-identified EHR of participants of the Geisinger MyCode™ biorepository, and calculated the median of these lab measures for 12,039 subjects. Next we evaluated the association between these 21 clinical lab median values and 635,525 genetic variants, performing a genome-wide association study (GWAS) for each of 21 clinical lab measures. 
We then calculated the association between SNPs from these GWAS passing our Bonferroni-defined p-value cutoff and 165 ICD-9 codes. Through the GWAS we found a series of results replicating known associations, and also some potentially novel associations with less studied clinical lab measures. We found the majority of the PheWAS ICD-9 diagnoses highly related to the clinical lab measures associated with the same SNPs. Moving forward, we will be evaluating further phenotypes and expanding the methodology for successful extraction of clinical lab measurements for research and PheWAS use. These developments are important for expanding the PheWAS approach for improved EHR-based discovery.
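The two mechanical steps in this abstract, reducing repeated lab results to one median per subject and deriving a Bonferroni cutoff, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout, the function names, the 0.05 family-wise alpha, and the choice to correct across all variant-by-lab tests are assumptions, since the abstract does not state how the cutoff was defined.

```python
from collections import defaultdict
from statistics import median

def median_lab_values(records):
    """Collapse repeated lab results to one median per (subject, lab) pair.

    `records` is an iterable of (subject_id, lab_name, value) tuples;
    this layout is hypothetical.
    """
    grouped = defaultdict(list)
    for subject_id, lab_name, value in records:
        grouped[(subject_id, lab_name)].append(value)
    return {key: median(values) for key, values in grouped.items()}

def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that controls the family-wise error at alpha."""
    return alpha / n_tests

# Toy records (made-up values, for illustration only)
records = [
    ("s1", "glucose", 90.0), ("s1", "glucose", 110.0), ("s1", "glucose", 95.0),
    ("s2", "glucose", 130.0), ("s2", "glucose", 150.0),
]
medians = median_lab_values(records)

# Assumption: correct across 635,525 variants x 21 lab measures at alpha = 0.05
threshold = bonferroni_threshold(0.05, 635_525 * 21)
print(medians)
print(f"Bonferroni cutoff: {threshold:.3e}")
```

Using the median rather than the mean makes the per-subject summary less sensitive to isolated extreme lab results, which is one plausible reason for the choice described above.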
Scientific Reports | 2018
Shefali Setia Verma; Navya Josyula; Anurag Verma; Xinyuan Zhang; Yogasudha Veturi; Frederick E. Dewey; Dustin N. Hartzel; Daniel R. Lavage; Joe Leader; Marylyn D. Ritchie; Sarah A. Pendergrass
The DrugBank database consists of ~800 genes that are well-characterized drug targets. This list of genes is a useful resource for association testing. For example, loss of function (LOF) genetic variation has the potential to mimic the effect of drugs, and high impact variation in these genes can impact downstream traits. Identifying novel associations between genetic variation in these genes and a range of diseases can also uncover new uses for the drugs that target these genes. Phenome Wide Association Studies (PheWAS) have been successful in identifying genetic associations across hundreds of thousands of diseases. We have conducted a novel gene-based PheWAS to test the effect of rare variants in DrugBank genes, evaluating associations between these genes and more than 500 quantitative and dichotomous phenotypes. We used whole exome sequencing data from 38,568 samples in the Geisinger MyCode Community Health Initiative. We evaluated the results of this study when binning rare variants using various filters based on potential functional impact. We identified multiple novel associations, and the majority of the significant associations were driven by functionally annotated variation. Overall, this study provides a sweeping exploration of rare variant associations within functionally relevant genes across a wide range of diagnoses.
Proceedings of the Pacific Symposium | 2017
Christopher R. Bauer; Daniel R. Lavage; John W. Snyder; Joseph B. Leader; J. Matthew Mahoney; Sarah A. Pendergrass
The past decade has seen exponential growth in the numbers of sequenced and genotyped individuals and a corresponding increase in our ability to collect and catalogue phenotypic data for use in the clinic. We now face the challenge of integrating these diverse data in new ways that can provide useful diagnostics and precise medical interventions for individual patients. One of the first steps in this process is to accurately map the phenotypic consequences of the genetic variation in human populations. The most common approach for this is the genome wide association study (GWAS). While this technique is relatively simple to implement for a given phenotype, the choice of how to define a phenotype is critical. It is becoming increasingly common for each individual in a GWAS cohort to have a large profile of quantitative measures. The standard approach is to test for associations with one measure at a time; however, there are many justifiable ways to define a set of phenotypes, and the genetic associations that are revealed will vary based on these definitions. Some phenotypes may only show a significant genetic association signal when considered together, such as through principal components analysis (PCA). Combining correlated measures may increase the power to detect association by reducing the noise present in individual variables and reduce the multiple hypothesis testing burden. Here we show that PCA and k-means clustering are two complementary methods for identifying novel genotype-phenotype relationships within a set of quantitative human traits derived from the Geisinger Health System electronic health record (EHR). Using a diverse set of approaches for defining phenotype may yield more insights into the genetic architecture of complex traits and the findings presented here highlight a clear need for further investigation into other methods for defining the most relevant phenotypes in a set of variables. 
As the data of EHR continue to grow, addressing these issues will become increasingly important in our efforts to use genomic data effectively in medicine.
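To make the contrast between the two phenotype definitions concrete, here is a small, dependency-free sketch: PCA yields a continuous composite score per individual, while k-means yields a discrete group label. It is not the authors' code. The toy values, the function names, the power-iteration shortcut for the first component, and the caller-supplied initial centers are all assumptions made for illustration.

```python
import math

def standardize(rows):
    """Z-score each column (sample standard deviation)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the covariance matrix, by power iteration."""
    n, p = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-uniform start
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def kmeans(rows, initial_centers, iterations=50):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centers)),
                      key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(r, centers[c])))
                  for r in rows]
        for c in range(len(centers)):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Two correlated, made-up quantitative traits for six individuals
traits = [(1.0, 1.1), (1.2, 1.0), (0.9, 1.2), (3.0, 3.2), (3.1, 2.9), (2.8, 3.1)]
z = standardize(traits)
pc1 = first_principal_component(z)
pc1_scores = [sum(a * b for a, b in zip(row, pc1)) for row in z]  # continuous phenotype
cluster_labels = kmeans(z, initial_centers=[z[0], z[-1]])          # discrete phenotype
print(pc1, pc1_scores, cluster_labels)
```

Either derived phenotype (the PC1 score or the cluster label) could then be tested for genetic association in place of the individual measures, which is the sense in which the two approaches offer different views of the same data.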
Scientific Reports | 2018
Shefali Setia Verma; Navya Josyula; Anurag Verma; Xinyuan Zhang; Yogasudha Veturi; Frederick E. Dewey; Dustin N. Hartzel; Daniel R. Lavage; Joe Leader; Marylyn D. Ritchie; Sarah A. Pendergrass
A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has not been fixed in the paper.
JAMA Network Open | 2018
Kandamurugu Manickam; Adam H. Buchanan; Marci Schwartz; Miranda L. G. Hallquist; Janet Williams; Alanna Kulchak Rahm; Heather Rocha; Juliann M. Savatt; Alyson E. Evans; Loren Butry; Amanda Lazzeri; D’Andra M. Lindbuchler; Carroll N. Flansburg; Rosemary Leeming; Victor G. Vogel; Matthew S. Lebo; Heather Mason-Suares; Derick C. Hoskinson; Noura S. Abul-Husn; Frederick E. Dewey; John D. Overton; Jeffrey G. Reid; Aris Baras; Huntington F. Willard; Cara Z. McCormick; Sarath Krishnamurthy; Dustin N. Hartzel; Korey A. Kost; Daniel R. Lavage; Amy C. Sturm
Key Points Question Can population-level genomic screening identify those at risk for disease? Findings In this cross-sectional study of an unselected population cohort of 50 726 adults who underwent exome sequencing, pathogenic and likely pathogenic BRCA1 and BRCA2 variants were found in a higher proportion of patients than was previously reported. Meaning Current methods to identify BRCA1/2 variant carriers may not be sufficient as a screening tool; population genomic screening for hereditary breast and ovarian cancer may better identify patients at high risk and provide an intervention opportunity to reduce mortality and morbidity.
bioRxiv | 2017
Brett K. Beaulieu-Jones; Daniel R. Lavage; John W. Snyder; Jason H. Moore; Sarah A. Pendergrass; Christopher R. Bauer
Missing data is a challenge for all studies; however, this is especially true for electronic health record (EHR) based analyses. Failure to appropriately consider missing data can lead to biased results. Here, we provide detailed procedures for when and how to conduct imputation of EHR data. We demonstrate how the mechanism of missingness can be assessed, evaluate the performance of a variety of imputation methods, and describe some of the most frequent problems that can be encountered. We analyzed clinical lab measures from 602,366 patients in the Geisinger Health System EHR. Using these data, we constructed a representative set of complete cases and assessed the performance of 12 different imputation methods for missing data that was simulated based on 4 mechanisms of missingness. Our results show that several methods including variations of Multivariate Imputation by Chained Equations (MICE) and softImpute consistently imputed missing values with low error; however, only a subset of the MICE methods were suitable for multiple imputation. The analyses described provide an outline of considerations for dealing with missing EHR data, steps that researchers can perform to characterize missingness within their own data, and an evaluation of methods that can be applied to impute clinical data. While the performance of methods may vary between datasets, the process we describe can be generalized to the majority of structured data types that exist in EHRs and all of our methods and code are publicly available.
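The evaluation design described above (start from complete cases, hide known values, impute, and score the error) can be sketched in a few lines. This is a deliberately simplified stand-in: it simulates only one mechanism (missing completely at random) and uses column-mean imputation rather than MICE or softImpute, and all names and values are hypothetical. The authors state that their own methods and code are publicly available; this block is not derived from them.

```python
import math
import random

def mask_completely_at_random(rows, fraction, seed=0):
    """Hide a random fraction of cells; return the masked copy and hidden cells."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    hidden = set(rng.sample(cells, int(round(fraction * len(cells)))))
    masked = [[None if (i, j) in hidden else v for j, v in enumerate(row)]
              for i, row in enumerate(rows)]
    return masked, hidden

def impute_column_means(masked):
    """Replace each missing cell with the mean of the observed values in its column."""
    means = []
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in masked]

def rmse_on_hidden(truth, imputed, hidden):
    """Root-mean-square error over only the cells that were hidden."""
    squared = sum((truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden)
    return math.sqrt(squared / len(hidden))

# Made-up complete cases: two lab measures for eight patients
complete = [[5.1, 140.0], [4.8, 138.0], [5.5, 142.0], [6.0, 139.0],
            [5.2, 141.0], [4.9, 137.0], [5.7, 143.0], [5.3, 140.0]]
masked, hidden = mask_completely_at_random(complete, fraction=0.25)
imputed = impute_column_means(masked)
error = rmse_on_hidden(complete, imputed, hidden)
print(f"hidden cells: {len(hidden)}, RMSE: {error:.3f}")
```

Swapping in a different imputation function while keeping the masking and scoring steps fixed is what allows methods to be compared on equal footing.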
American Journal of Human Genetics | 2018
Anurag Verma; Anastasia Lucas; Shefali S. Verma; Yu Zhang; Navya Josyula; Anqa Khan; Dustin N. Hartzel; Daniel R. Lavage; Joseph B. Leader; Marylyn D. Ritchie; Sarah A. Pendergrass