Konrad J. Karczewski | Researchain

Publication

Featured research published by Konrad J. Karczewski.

Nature | 2016

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek; Konrad J. Karczewski; Eric Vallabh Minikel; Kaitlin E. Samocha; Eric Banks; Timothy Fennell; Anne H. O’Donnell-Luria; James S. Ware; Andrew Hill; Beryl B. Cummings; Taru Tukiainen; Daniel P. Birnbaum; Jack A. Kosmicki; Laramie Duncan; Karol Estrada; Fengmei Zhao; James Zou; Emma Pierce-Hoffman; Joanne Berghout; David Neil Cooper; Nicole Deflaux; Mark A. DePristo; Ron Do; Jason Flannick; Menachem Fromer; Laura Gauthier; Jackie Goldstein; Namrata Gupta; Daniel P. Howrigan; Adam Kiezun

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human ‘knockout’ variants in protein-coding genes.

Genome Research | 2012

Annotation of functional variation in personal genomes using RegulomeDB

Alan P. Boyle; Eurie L. Hong; Manoj Hariharan; Yong Cheng; Marc A. Schaub; Maya Kasowski; Konrad J. Karczewski; Julie Park; Benjamin C. Hitz; Shuai Weng; J. Michael Cherry; Michael Snyder

As the sequencing of healthy and disease genomes becomes more commonplace, detailed annotation provides interpretation for individual variation responsible for normal and disease phenotypes. Current approaches focus on direct changes in protein coding genes, particularly nonsynonymous mutations that directly affect the gene product. However, most individual variation occurs outside of genes and, indeed, most markers generated from genome-wide association studies (GWAS) identify variants outside of coding segments. Identification of potential regulatory changes that perturb these sites will lead to a better localization of truly functional variants and interpretation of their effects. We have developed a novel approach and database, RegulomeDB, which guides interpretation of regulatory variants in the human genome. RegulomeDB includes high-throughput, experimental data sets from ENCODE and other sources, as well as computational predictions and manual annotations to identify putative regulatory potential and identify functional variants. These data sources are combined into a powerful tool that scores variants to help separate functional variants from a large pool and provides a small set of putative sites with testable hypotheses as to their function. We demonstrate the applicability of this tool to the annotation of noncoding variants from 69 full sequenced genomes as well as that of a personal genome, where thousands of functionally associated variants were identified. Moreover, we demonstrate a GWAS where the database is able to quickly identify the known associated functional variant and provide a hypothesis as to its function. Overall, we expect this approach and resource to be valuable for the annotation of human genome sequences.

Cell | 2012

Personal Omics Profiling Reveals Dynamic Molecular and Medical Phenotypes

Rui Chen; George Mias; Jennifer Li-Pook-Than; Lihua Jiang; Hugo Y. K. Lam; Rong Chen; Elana Miriami; Konrad J. Karczewski; Manoj Hariharan; Frederick E. Dewey; Yong Cheng; Michael J. Clark; Hogune Im; Lukas Habegger; Suganthi Balasubramanian; Maeve O'Huallachain; Joel T. Dudley; Sara Hillenmeyer; Rajini Haraksingh; Donald Sharon; Ghia Euskirchen; Phil Lacroute; Keith Bettinger; Alan P. Boyle; Maya Kasowski; Fabian Grubert; Scott Seki; Marco Garcia; Michelle Whirl-Carrillo; Mercedes Gallardo

Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14 month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.

Science | 2010

Variation in Transcription Factor Binding Among Humans

Maya Kasowski; Fabian Grubert; Christopher Heffelfinger; Manoj Hariharan; Akwasi Asabere; Sebastian M. Waszak; Lukas Habegger; Joel Rozowsky; Minyi Shi; Alexander E. Urban; Miyoung Hong; Konrad J. Karczewski; Wolfgang Huber; Sherman M. Weissman; Mark Gerstein; Jan O. Korbel; Michael Snyder

Like Father, Like Mother, Like Child

Transcriptional regulation is mediated by chromatin structure, which may affect the binding of transcription factors, but the extent of how individual-to-individual genetic variation affects such regulation is not well understood. Kasowski et al. (p. 232, published online 18 March) investigated the binding of two transcription factors across the genomes of human individuals and one chimpanzee. Transcription factor binding was associated with genomic features such as nucleotide variation, insertions and deletions, and copy number variation. Thus, genomic sequence variation affects transcription factor binding and may explain expression difference among individuals. McDaniell et al. (p. 235, published online 18 March) provide a genome-wide catalog of variation in chromatin and transcription factor binding in two parent-child trios of European and African ancestry. Up to 10% of active chromatin binding sites were specific to a set of individuals and were often inherited. Furthermore, variation in active chromatin sites showed heritable allele-specific correlation with variation in gene expression.

Transcription factor binding sites vary among individuals and are correlated with differences in expression.

Differences in gene expression may play a major role in speciation and phenotypic diversity. We examined genome-wide differences in transcription factor (TF) binding in several humans and a single chimpanzee by using chromatin immunoprecipitation followed by sequencing. The binding sites of RNA polymerase II (PolII) and a key regulator of immune responses, nuclear factor κB (p65), were mapped in 10 lymphoblastoid cell lines, and 25 and 7.5% of the respective binding regions were found to differ between individuals. Binding differences were frequently associated with single-nucleotide polymorphisms and genomic structural variants, and these differences were often correlated with differences in gene expression, suggesting functional consequences of binding variation. Furthermore, comparing PolII binding between humans and chimpanzee suggests extensive divergence in TF binding. Our results indicate that many differences in individuals and species occur at the level of TF binding, and they provide insight into the genetic events responsible for these differences.

Nature Biotechnology | 2011

Performance comparison of exome DNA sequencing technologies

Michael J. Clark; Rui Chen; Hugo Y. K. Lam; Konrad J. Karczewski; Rong Chen; Ghia Euskirchen; Atul J. Butte; Michael Snyder

Whole exome sequencing by high-throughput sequencing of target-enriched genomic DNA (exome-seq) has become common in basic and translational research as a means of interrogating the interpretable part of the human genome at relatively low cost. We present a comparison of three major commercial exome sequencing platforms from Agilent, Illumina and Nimblegen applied to the same human blood sample. Our results suggest that the Nimblegen platform, which is the only one to use high-density overlapping baits, covers fewer genomic regions than the other platforms but requires the least amount of sequencing to sensitively detect small variants. Agilent and Illumina are able to detect a greater total number of variants with additional sequencing. Illumina captures untranslated regions, which are not targeted by the Nimblegen and Agilent platforms. We also compare exome sequencing and whole genome sequencing (WGS) of the same sample, demonstrating that exome sequencing can detect additional small variants missed by WGS.

European Heart Journal | 2015

Mendelian randomization of blood lipids for coronary heart disease

Michael V. Holmes; Folkert W. Asselbergs; Tom Palmer; Fotios Drenos; Matthew B. Lanktree; Christopher P. Nelson; Caroline Dale; Sandosh Padmanabhan; Chris Finan; Daniel I. Swerdlow; Vinicius Tragante; Erik P A Van Iperen; Suthesh Sivapalaratnam; Sonia Shah; Clara C. Elbers; Tina Shah; Jorgen Engmann; Claudia Giambartolomei; Jon White; Delilah Zabaneh; Reecha Sofat; Stela McLachlan; Pieter A. Doevendans; Anthony J. Balmforth; Alistair S. Hall; Kari E. North; Berta Almoguera; Ron C. Hoogeveen; Mary Cushman; Myriam Fornage

Aims: To investigate the causal role of high-density lipoprotein cholesterol (HDL-C) and triglycerides in coronary heart disease (CHD) using multiple instrumental variables for Mendelian randomization.

Methods and results: We developed weighted allele scores based on single nucleotide polymorphisms (SNPs) with established associations with HDL-C, triglycerides, and low-density lipoprotein cholesterol (LDL-C). For each trait, we constructed two scores. The first was unrestricted, including all independent SNPs associated with the lipid trait identified from a prior meta-analysis (threshold P < 2 × 10⁻⁶); and the second a restricted score, filtered to remove any SNPs also associated with either of the other two lipid traits at P ≤ 0.01. Mendelian randomization meta-analyses were conducted in 17 studies including 62,199 participants and 12,099 CHD events. Both the unrestricted and restricted allele scores for LDL-C (42 and 19 SNPs, respectively) associated with CHD. For HDL-C, the unrestricted allele score (48 SNPs) was associated with CHD (OR: 0.53; 95% CI: 0.40, 0.70), per 1 mmol/L higher HDL-C, but neither the restricted allele score (19 SNPs; OR: 0.91; 95% CI: 0.42, 1.98) nor the unrestricted HDL-C allele score adjusted for triglycerides, LDL-C, or statin use (OR: 0.81; 95% CI: 0.44, 1.46) showed a robust association. For triglycerides, the unrestricted allele score (67 SNPs) and the restricted allele score (27 SNPs) were both associated with CHD (OR: 1.62; 95% CI: 1.24, 2.11 and 1.61; 95% CI: 1.00, 2.59, respectively) per 1-log unit increment. However, the unrestricted triglyceride score adjusted for HDL-C, LDL-C, and statin use gave an OR for CHD of 1.01 (95% CI: 0.59, 1.75).

Conclusion: The genetic findings support a causal effect of triglycerides on CHD risk, but a causal role for HDL-C, though possible, remains less certain.
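The two allele scores described in this abstract can be illustrated with a minimal sketch. It assumes each SNP contributes its allele dosage (0, 1, or 2) multiplied by a per-allele effect weight, and that the restricted score simply drops SNPs whose p-value for either other lipid trait is at or below 0.01, as the abstract states. The function names, SNP identifiers, dosages, weights, and p-values are hypothetical and are not taken from the study.

```python
from typing import Dict, Iterable, List

P_EXCLUDE = 0.01  # exclusion threshold quoted in the abstract (P <= 0.01)


def restricted_snps(other_trait_p: Dict[str, List[float]],
                    threshold: float = P_EXCLUDE) -> List[str]:
    """Keep SNPs whose p-values for the other lipid traits all exceed the threshold."""
    return [snp for snp, pvals in other_trait_p.items()
            if all(p > threshold for p in pvals)]


def weighted_allele_score(dosages: Dict[str, float],
                          weights: Dict[str, float],
                          snps: Iterable[str]) -> float:
    """Sum of allele dosage times per-allele weight over the chosen SNPs."""
    return sum(dosages[s] * weights[s] for s in snps)


# Hypothetical data for one individual and three SNPs (illustrative only).
weights = {"snp_a": 0.5, "snp_b": 0.25, "snp_c": 0.125}
dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 1}
# Hypothetical p-values of each SNP for the two other lipid traits.
other_trait_p = {
    "snp_a": [0.40, 0.75],
    "snp_b": [0.005, 0.30],  # associated with another trait: dropped
    "snp_c": [0.20, 0.01],   # exactly at the threshold: also dropped
}

kept = restricted_snps(other_trait_p)
unrestricted_score = weighted_allele_score(dosages, weights, weights.keys())
restricted_score = weighted_allele_score(dosages, weights, kept)
print(kept, unrestricted_score, restricted_score)
```

The restricted score trades statistical power for specificity: fewer SNPs remain, but those that do are less likely to act through a different lipid trait.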

Bioinformatics | 2011

Bioinformatics challenges for personalized medicine

Guy Haskin Fernald; Emidio Capriotti; Roxana Daneshjou; Konrad J. Karczewski; Russ B. Altman

Motivation: Widespread availability of low-cost, full genome sequencing will introduce new challenges for bioinformatics. Results: This review outlines recent developments in sequencing technologies and genome analysis methods for application in personalized medicine. New methods are needed in four areas to realize the potential of personalized medicine: (i) processing large-scale robust genomic data; (ii) interpreting the functional effect and the impact of genomic variation; (iii) integrating systems data to relate complex genetic interactions with phenotypes; and (iv) translating these discoveries into medical practice. Supplementary information: Supplementary data are available at Bioinformatics online.

Science | 2015

De novo mutations in congenital heart disease with neurodevelopmental and other congenital anomalies

Jason Homsy; Samir Zaidi; Yufeng Shen; James S. Ware; Kaitlin E. Samocha; Konrad J. Karczewski; Steven R. DePalma; David M. McKean; Hiroko Wakimoto; Josh Gorham; Sheng Chih Jin; John Deanfield; Alessandro Giardini; George A. Porter; Richard Kim; Kaya Bilguvar; Francesc López-Giráldez; Irina Tikhonova; Shrikant Mane; Angela Romano-Adesman; Hongjian Qi; Badri N. Vardarajan; Lijiang Ma; Mark J. Daly; Amy E. Roberts; Mark W. Russell; Seema Mital; Jane W. Newburger; J. William Gaynor; Roger E. Breitbart

Putting both heart and brain at risk

For reasons that are unclear, newborns with congenital heart disease (CHD) have a high risk of neurodevelopmental disabilities. Homsy et al. performed exome sequence analysis of 1200 CHD patients and their parents to identify spontaneously arising (de novo) mutations. Patients with both CHD and neurodevelopmental disorders had a much higher burden of damaging de novo mutations, particularly in genes with likely roles in both heart and brain development. Thus, clinical genotyping of patients with CHD may help to identify those at greatest risk of neurodevelopmental disabilities, allowing surveillance and early intervention. Science, this issue p. 1262

Genotyping of children with congenital heart disease may identify those at high risk of neurodevelopmental disorders.

Congenital heart disease (CHD) patients have an increased prevalence of extracardiac congenital anomalies (CAs) and risk of neurodevelopmental disabilities (NDDs). Exome sequencing of 1213 CHD parent-offspring trios identified an excess of protein-damaging de novo mutations, especially in genes highly expressed in the developing heart and brain. These mutations accounted for 20% of patients with CHD, NDD, and CA but only 2% of patients with isolated CHD. Mutations altered genes involved in morphogenesis, chromatin modification, and transcriptional regulation, including multiple mutations in RBFOX2, a regulator of mRNA splicing. Genes mutated in other cohorts examined for NDD were enriched in CHD cases, particularly those with coexisting NDD. These findings reveal shared genetic contributions to CHD, NDD, and CA and provide opportunities for improved prognostic assessment and early therapeutic intervention in CHD patients.

Science Translational Medicine | 2016

Quantifying prion disease penetrance using large population control cohorts

Eric Vallabh Minikel; Sonia M. Vallabh; Monkol Lek; Karol Estrada; Kaitlin E. Samocha; J. Fah Sathirapongsasuti; Cory Y. McLean; Joyce Y. Tung; Linda P C Yu; Pierluigi Gambetti; Janis Blevins; Shulin Zhang; Yvonne Cohen; Wei Chen; Masahito Yamada; Tsuyoshi Hamaguchi; Nobuo Sanjo; Hidehiro Mizusawa; Yosikazu Nakamura; Tetsuyuki Kitamoto; Steven J. Collins; Alison Boyd; Robert G. Will; Richard Knight; Claudia Ponto; Inga Zerr; Theo F. J. Kraus; Sabina Eigenbrod; Armin Giese; Miguel Calero

Large genomic reference data sets reveal a spectrum of pathogenicity in the prion protein gene and provide genetic validation for a therapeutic strategy in prion disease.

Share trumps rare

No longer just buzz words, “patient empowerment” and “data sharing” are enabling breakthrough research on rare genetic diseases. Although more than 100,000 genetic variants are believed to drive disease in humans, little is known about penetrance—the probability that a mutation will actually cause disease in the carrier. This conundrum persists because small sample sizes breed imperfect alliance estimates between mutations and disease risk. Now, a patient-turned-scientist joined with a large bioinformatics team to analyze vast amounts of shared data—from the Exome Aggregation Consortium and the 23andMe database—to provide insights into genetic-variant penetrance and possible treatment approaches for a rare, fatal genetic prion disease.

More than 100,000 genetic variants are reported to cause Mendelian disease in humans, but the penetrance—the probability that a carrier of the purported disease-causing genotype will indeed develop the disease—is generally unknown. We assess the impact of variants in the prion protein gene (PRNP) on the risk of prion disease by analyzing 16,025 prion disease cases, 60,706 population control exomes, and 531,575 individuals genotyped by 23andMe Inc. We show that missense variants in PRNP previously reported to be pathogenic are at least 30 times more common in the population than expected on the basis of genetic prion disease prevalence. Although some of this excess can be attributed to benign variants falsely assigned as pathogenic, other variants have genuine effects on disease susceptibility but confer lifetime risks ranging from <0.1 to ~100%. We also show that truncating variants in PRNP have position-dependent effects, with true loss-of-function alleles found in healthy older individuals, a finding that supports the safety of therapeutic suppression of prion protein expression.
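The link between case frequency, population frequency, and penetrance in this abstract can be made concrete with Bayes' rule: P(disease | variant) = P(disease) × P(variant | disease) / P(variant). The sketch below is an illustration of that identity under the assumption that it is a reasonable stand-in for the kind of calculation involved; it is not taken from the paper. The function name and all input numbers are hypothetical.

```python
def lifetime_risk(baseline_risk: float,
                  freq_in_cases: float,
                  freq_in_population: float) -> float:
    """Penetrance estimate via Bayes' rule, capped at 1.0.

    baseline_risk      -- P(disease) in the general population
    freq_in_cases      -- P(variant | disease), carrier frequency among cases
    freq_in_population -- P(variant), carrier frequency among population controls
    """
    if not 0.0 < freq_in_population <= 1.0:
        raise ValueError("freq_in_population must be in (0, 1]")
    risk = baseline_risk * freq_in_cases / freq_in_population
    # Sampling noise in small counts can push the ratio above 1; cap it.
    return min(risk, 1.0)


# Hypothetical inputs (not values from the study).
rare_in_population = lifetime_risk(0.0002, 0.01, 0.0001)
common_in_population = lifetime_risk(0.0002, 0.01, 0.003)
print(rare_in_population, common_in_population)
```

With the case frequency held fixed, a variant that is 30 times more common in population controls yields an estimated lifetime risk 30 times lower, which is why large control cohorts can reveal that a reportedly pathogenic variant has low penetrance.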

Genome Research | 2013

The origin, evolution, and functional impact of short insertion–deletion variants identified in 179 human genomes

Stephen B. Montgomery; David L. Goode; Erika Kvikstad; Cornelis A. Albers; Zhengdong D. Zhang; Xinmeng Jasmine Mu; Guruprasad Ananda; Bryan Howie; Konrad J. Karczewski; Kevin S. Smith; Vanessa Anaya; Rhea Richardson; Joseph S. Davis; Daniel G. MacArthur; Arend Sidow; Laurent Duret; Mark Gerstein; Kateryna D. Makova; Jonathan Marchini; Gil McVean; Gerton Lunter

Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%-48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels.
