Günter Klambauer | Researchain

Publication

Featured research published by Günter Klambauer.

Nucleic Acids Research | 2012

cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate

Günter Klambauer; Karin Schwarzbauer; Andreas Mayr; Djork-Arné Clevert; Andreas Mitterecker; Ulrich Bodenhofer; Sepp Hochreiter

Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose ‘Copy Number estimation by a Mixture Of PoissonS’ (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark datasets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1–FDR) and recall for both gains and losses in all benchmark data sets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor.
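The decomposition described in the abstract above can be illustrated with a minimal sketch. It is not the cn.MOPS implementation: it assumes fixed, made-up mixture weights and a known expected read count `lam` for copy number 2, and it only computes the posterior over integer copy numbers for each sample at one genomic position. The scaling of the Poisson rate with copy number (`cn / 2`), the small background rate `eps` for copy number 0, and all function names are assumptions chosen for illustration.

```python
import math


def poisson_log_pmf(k, lam):
    """Log of the Poisson probability of observing k reads at rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def copy_number_posterior(count, lam, weights, eps=0.05):
    """Posterior over copy numbers 0..len(weights)-1 for one sample.

    lam is the expected read count at copy number 2 (assumed known here).
    Copy number cn is assumed to scale the rate by cn / 2; copy number 0
    uses a small background rate eps * lam so the Poisson stays defined.
    """
    log_post = []
    for cn, weight in enumerate(weights):
        rate = lam * (cn / 2 if cn > 0 else eps)
        log_post.append(math.log(weight) + poisson_log_pmf(count, rate))
    top = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def most_likely_copy_number(count, lam, weights):
    """Index of the mixture component with the highest posterior."""
    post = copy_number_posterior(count, lam, weights)
    return max(range(len(post)), key=post.__getitem__)


# Illustrative values only: five samples at one genomic position.
weights = [0.05, 0.05, 0.8, 0.05, 0.05]  # prior mass concentrated on copy number 2
counts = [98, 103, 51, 149, 100]
calls = [most_likely_copy_number(c, 100.0, weights) for c in counts]
```

Because the model compares samples at the same position, a position with uniformly low or high coverage does not by itself produce a call; only samples whose counts deviate from the others do.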

Drug Discovery Today | 2015

Using transcriptomics to guide lead optimization in drug discovery projects: Lessons learned from the QSTAR project.

Bie Verbist; Günter Klambauer; Liesbet Vervoort; Willem Talloen; Ziv Shkedy; Olivier Thas; Andreas Bender; Hinrich Göhlmann; Sepp Hochreiter

The pharmaceutical industry is faced with steadily declining R&D efficiency which results in fewer drugs reaching the market despite increased investment. A major cause for this low efficiency is the failure of drug candidates in late-stage development owing to safety issues or previously undiscovered side-effects. We analyzed to what extent gene expression data can help to de-risk drug development in early phases by detecting the biological effects of compounds across disease areas, targets and scaffolds. For eight drug discovery projects within a global pharmaceutical company, gene expression data were informative and able to support go/no-go decisions. Our studies show that gene expression profiling can detect adverse effects of compounds, and is a valuable tool in early-stage drug discovery decision making.

Nucleic Acids Research | 2013

DEXUS: identifying differential expression in RNA-Seq studies with unknown conditions

Günter Klambauer; Thomas Unterthiner; Sepp Hochreiter

Detection of differential expression in RNA-Seq data is currently limited to studies in which two or more sample conditions are known a priori. However, these biological conditions are typically unknown in cohort, cross-sectional and nonrandomized controlled studies such as the HapMap, the ENCODE or the 1000 Genomes project. We present DEXUS for detecting differential expression in RNA-Seq data for which the sample conditions are unknown. DEXUS models read counts as a finite mixture of negative binomial distributions in which each mixture component corresponds to a condition. A transcript is considered differentially expressed if modeling of its read counts requires more than one condition. DEXUS decomposes read count variation into variation due to noise and variation due to differential expression. Evidence of differential expression is measured by the informative/noninformative (I/NI) value, which allows differentially expressed transcripts to be extracted at a desired specificity (significance level) or sensitivity (power). DEXUS performed excellently in identifying differentially expressed transcripts in data with unknown conditions. On 2400 simulated data sets, I/NI value thresholds of 0.025, 0.05 and 0.1 yielded average specificities of 92, 97 and 99% at sensitivities of 76, 61 and 38%, respectively. On real-world data sets, DEXUS was able to detect differentially expressed transcripts related to sex, species, tissue, structural variants or quantitative trait loci. The DEXUS R package is publicly available from Bioconductor and the scripts for all experiments are available at http://www.bioinf.jku.at/software/dexus/.
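As a rough illustration of the mixture idea in the abstract above, the sketch below assigns read counts of one transcript to one of two hypothetical conditions, each modeled by a negative binomial distribution. It is not the DEXUS algorithm: the component means, the shared size (dispersion) parameter, and the mixture weights are fixed, invented values rather than estimates, and the I/NI value is not computed.

```python
import math


def nb_log_pmf(k, mean, size):
    """Log negative binomial probability, parameterized by mean and size."""
    return (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
            + size * math.log(size / (size + mean))
            + k * math.log(mean / (size + mean)))


def condition_responsibilities(count, means, size, weights):
    """Posterior probability of each condition for a single read count."""
    logs = [math.log(w) + nb_log_pmf(count, m, size)
            for m, w in zip(means, weights)]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def assign_condition(count, means, size, weights):
    """Index of the most probable condition for this count."""
    resp = condition_responsibilities(count, means, size, weights)
    return max(range(len(resp)), key=resp.__getitem__)


# Hypothetical parameters: two conditions with mean counts of 20 and 200.
means, size, weights = [20.0, 200.0], 10.0, [0.7, 0.3]
counts = [20, 25, 18, 210, 190, 22]
assignments = [assign_condition(c, means, size, weights) for c in counts]

# The transcript needs more than one condition to explain its counts.
needs_two_conditions = len(set(assignments)) > 1
```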

Bioinformatics | 2018

DeepSynergy: predicting anti-cancer drug synergy with Deep Learning

Kristina Preuer; Richard Lewis; Sepp Hochreiter; Andreas Bender; Krishna C. Bulusu; Günter Klambauer

Motivation: While drug combination therapies are a well-established concept in cancer treatment, identifying novel synergistic combinations is challenging due to the size of combinatorial space. However, computational approaches have emerged as a time- and cost-efficient way to prioritize combinations to test, based on recently available large-scale combination screening data. Recently, Deep Learning has had an impact in many research areas by achieving new state-of-the-art model performance. However, Deep Learning has not yet been applied to drug synergy prediction, which is the approach we present here, termed DeepSynergy. DeepSynergy uses chemical and genomic information as input information, a normalization strategy to account for input data heterogeneity, and conical layers to model drug synergies. Results: DeepSynergy was compared to other machine learning methods such as Gradient Boosting Machines, Random Forests, Support Vector Machines and Elastic Nets on the largest publicly available synergy dataset with respect to mean squared error. DeepSynergy significantly outperformed the other methods with an improvement of 7.2% over the second best method at the prediction of novel drug combinations within the space of explored drugs and cell lines. At this task, the mean Pearson correlation coefficient between the measured and the predicted values of DeepSynergy was 0.73. Applying DeepSynergy for classification of these novel drug combinations resulted in a high predictive performance of an AUC of 0.90. Furthermore, we found that all compared methods exhibit low predictive performance when extrapolating to unexplored drugs or cell lines, which we suggest is due to limitations in the size and diversity of the dataset. We envision that DeepSynergy could be a valuable tool for selecting novel synergistic drug combinations. Availability and implementation: DeepSynergy is available via www.bioinf.jku.at/software/DeepSynergy. Supplementary information: Supplementary data are available at Bioinformatics online.
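The two regression metrics named in the abstract above, mean squared error and the Pearson correlation coefficient between measured and predicted values, can be computed as in the following sketch. The synergy scores are invented for illustration and are not taken from the DeepSynergy dataset; the function names are likewise only illustrative.

```python
import math


def mean_squared_error(measured, predicted):
    """Average squared difference between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def pearson_correlation(measured, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


# Made-up synergy scores for four drug combinations.
measured = [10.0, -5.0, 30.0, 0.0]
predicted = [8.0, -1.0, 25.0, 2.0]

mse = mean_squared_error(measured, predicted)
r = pearson_correlation(measured, predicted)
```

Mean squared error penalizes the size of individual errors, whereas the Pearson coefficient only measures how well the predictions track the measured values linearly, so the two can rank models differently.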

Chemistry & Biology | 2018

Repurposing High-Throughput Image Assays Enables Biological Activity Prediction for Drug Discovery

Jaak Simm; Günter Klambauer; Adam Arany; Marvin Steijaert; Jörg Kurt Wegner; Emmanuel Gustin; Vladimir Chupakhin; Yolanda T. Chong; Jorge Vialard; Peter Jacobus Johannes Antonius Buijnsters; Ingrid Velter; Alexander Vapirev; Shantanu Singh; Anne E. Carpenter; Roel Wuyts; Sepp Hochreiter; Yves Moreau; Hugo Ceulemans

In both academia and the pharmaceutical industry, large-scale assays for drug discovery are expensive and often impractical, particularly for the increasingly important physiologically relevant model systems that require primary cells, organoids, whole organisms, or expensive or rare reagents. We hypothesized that data from a single high-throughput imaging assay can be repurposed to predict the biological activity of compounds in other assays, even those targeting alternate pathways or biological processes. Indeed, quantitative information extracted from a three-channel microscopy-based screen for glucocorticoid receptor translocation was able to predict assay-specific biological activity in two ongoing drug discovery projects. In these projects, repurposing increased hit rates by 50- to 250-fold over that of the initial project assays while increasing the chemical structure diversity of the hits. Our results suggest that data from high-content screens are a rich source of information that can be used to predict and replace customized biological assays.

Bioinformatics | 2015

Rchemcpp: a web service for structural analoging in ChEMBL, Drugbank and the Connectivity Map

Günter Klambauer; Martin Wischenbart; Michael Mahr; Thomas Unterthiner; Andreas Mayr; Sepp Hochreiter

We have developed Rchemcpp, a web service that identifies structurally similar compounds (structural analogs) in large-scale molecule databases. The service allows compounds to be queried in the widely used ChEMBL, DrugBank and the Connectivity Map databases. Rchemcpp utilizes the best performing similarity functions, i.e. molecule kernels, as measures for structural similarity. Molecule kernels have shown superior performance over other similarity measures and are currently excelling at machine learning challenges. To considerably reduce computational time, and thereby make it feasible as a web service, a novel efficient prefiltering strategy has been developed, which maintains the sensitivity of the method. By exploiting information contained in public databases, the web service facilitates many applications crucial for the drug development process, such as prioritizing compounds after screening or reducing adverse side effects during late phases. Rchemcpp was used in the DeepTox pipeline that won the Tox21 Data Challenge and is frequently used by researchers in pharmaceutical companies. Availability and implementation: The web service and the R package are freely available via http://shiny.bioinf.jku.at/Analoging/ and via Bioconductor. Supplementary information: Supplementary data are available at Bioinformatics online.

Human Mutation | 2017

panelcn.MOPS: Copy number detection in targeted NGS panel data for clinical diagnostics

Gundula Povysil; Antigoni Tzika; Julia Vogt; Verena Haunschmid; Ludwine Messiaen; Johannes Zschocke; Günter Klambauer; Sepp Hochreiter; Katharina Wimmer

Targeted next‐generation‐sequencing (NGS) panels have largely replaced Sanger sequencing in clinical diagnostics. They allow for the detection of copy‐number variations (CNVs) in addition to single‐nucleotide variants and small insertions/deletions. However, existing computational CNV detection methods have shortcomings regarding accuracy, quality control (QC), incidental findings, and user‐friendliness. We developed panelcn.MOPS, a novel pipeline for detecting CNVs in targeted NGS panel data. Using data from 180 samples, we compared panelcn.MOPS with five state‐of‐the‐art methods. With panelcn.MOPS leading the field, most methods achieved comparably high accuracy. panelcn.MOPS reliably detected CNVs ranging in size from part of a region of interest (ROI), to whole genes, which may comprise all ROIs investigated in a given sample. The latter is enabled by analyzing reads from all ROIs of the panel, but presenting results exclusively for user‐selected genes, thus avoiding incidental findings. Additionally, panelcn.MOPS offers QC criteria not only for samples, but also for individual ROIs within a sample, which increases the confidence in called CNVs. panelcn.MOPS is freely available both as R package and standalone software with graphical user interface that is easy to use for clinical geneticists without any programming experience. panelcn.MOPS combines high sensitivity and specificity with user‐friendliness rendering it highly suitable for routine clinical diagnostics.

Scientific Reports | 2017

The unfolded protein response impacts melanoma progression by enhancing FGF expression and can be antagonized by a chemical chaperone

Karin Eigner; Yüksel Filik; Florian Mark; Birgit Schütz; Günter Klambauer; Richard Moriggl; Markus Hengstschläger; Herbert Stangl; Mario Mikula; Clemens Röhrl

The mechanisms hallmarking melanoma progression are insufficiently understood. Here we studied the impact of the unfolded protein response (UPR) - a signalling cascade playing ambiguous roles in carcinogenesis - in melanoma malignancy. We identified isogenic patient-derived melanoma cell lines harboring BRAF V600E mutations as a model system to study the role of intrinsic UPR in melanoma progression. We show that the activity of the three effector pathways of the UPR (ATF6, PERK and IRE1) was increased in metastatic compared to non-metastatic cells. Increased UPR activity was associated with increased flexibility to cope with ER stress. The activity of the ATF6 and the PERK pathways, but not the IRE1 pathway, correlated with poor survival in melanoma patients. Using whole-genome expression analysis, we show that the UPR is an inducer of FGF1 and FGF2 expression and cell migration. Antagonization of the UPR using the chemical chaperone 4-phenylbutyric acid (4-PBA) reduced FGF expression and inhibited cell migration and viability. Consistently, FGF expression positively correlated with the activity of ATF6 and PERK in human melanomas. We conclude that chronic UPR stimulates the FGF/FGF-receptor signalling axis and promotes melanoma progression. Hence, the development of potent chemical chaperones to antagonize the UPR might be a therapeutic approach to target melanoma.

Systems Biomedicine | 2013

Increasing the discovery power of -omics studies

Djork-Arné Clevert; Andreas Mayr; Andreas Mitterecker; Günter Klambauer; Armand Valsesia; Karl Forner; Marianne Tuefferd; Willem Talloen; Jérôme Wojcik; Hinrich Göhlmann; Sepp Hochreiter

Motivation: Current clinical and biological studies apply different biotechnologies and subsequently combine the resulting -omics data to test biological hypotheses. The plethora of -omics data and their combination generates a large number of hypotheses and apparently increases the study power. Contrary to these expectations, the wealth of -omics data may even reduce the statistical power of a study because of a large correction factor for multiple testing. Typically, this loss of power in analyzing -omics data is caused by an increased false detection rate (FDR) in measurements, like falsely detected DNA copy number changes, or falsely identified differentially expressed genes. The false detections are random and, therefore, not related to the tested conditions. Thus, a high FDR considerably decreases the discovery power of studies, especially if different -omics data are involved. Results: On a HapMap data set, where known CNVs have to be re-detected, I/NI call filtering was much more efficient than variance-based filtering. In particular, the I/NI call filter outperforms variance-based filters on data with rare events like the CNVs in the HapMap data set. We assessed the efficiency of the I/NI call filter in reducing the FDR on two different cancer cell lines where it reduced the FDR 18- to 22-fold. Materials and Methods: A mitigation strategy for too high FDRs is to filter out putative false detections. We suggest using probabilistic latent variable models to identify putative false detections, which may be found via such models by high estimated noise or by model-based measurement inconsistencies across samples. To select such a model, a Bayesian approach starts with the maximum a priori model that assumes no detection and selects the maximum a posteriori model. Hence, a detection results in a deviation of the maximal posterior from the maximal prior model, measured by the information gain obtained from the data. If this information gain exceeds a threshold, then the selected model obtains an Informative/Non-Informative (I/NI) call that indicates a detection. I/NI call filtering has been successfully applied in different projects, but it has so far not been shown that correction for multiple testing after I/NI call filtering still controls the type-I error rate. We prove this important property of the I/NI call and show that it is independent of commonly used test statistics for null hypotheses. We apply the I/NI call to transcriptomics (gene expression), where the prior model corresponds to a constant gene expression level across compared samples, and to genomics, analyzing copy number variation (CNV) data, where the prior model corresponds to a constant DNA copy number of 2 across compared samples.

Journal of Chemical Information and Modeling | 2018

Fréchet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery

Kristina Preuer; Philipp Renz; Thomas Unterthiner; Sepp Hochreiter; Günter Klambauer

The new wave of successful generative models in machine learning has increased the interest in deep learning driven de novo drug design. However, method comparison is difficult because of various flaws of the currently employed evaluation metrics. We propose an evaluation metric for generative models called Fréchet ChemNet distance (FCD). The advantage of the FCD over previous metrics is that it can detect whether generated molecules are diverse and have similar chemical and biological properties as real molecules.
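The abstract above does not state the formula. As background, a Fréchet distance of this kind is commonly computed by fitting a Gaussian to the network activations of the generated molecules and another to those of the real molecules, then taking the Fréchet (2-Wasserstein) distance between the two Gaussians. The symbols below are notation assumed for illustration: $\mu_G, C_G$ are the mean and covariance of the activations for generated molecules, and $\mu_R, C_R$ those for real molecules.

```latex
d^2\bigl((\mu_G, C_G), (\mu_R, C_R)\bigr)
  = \lVert \mu_G - \mu_R \rVert_2^2
  + \operatorname{Tr}\!\left( C_G + C_R - 2\,(C_G C_R)^{1/2} \right)
```

The first term compares the average activations of the two sets, and the trace term compares their spread, which is why a set of generated molecules that lacks diversity yields a larger distance even when its mean matches that of the real molecules.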
