
Publications


Featured research published by Andreas Maunz.


Journal of Cheminformatics | 2010

Collaborative development of predictive toxicology applications

Barry Hardy; Nicki Douglas; Christoph Helma; Micha Rautenberg; Nina Jeliazkova; Vedrin Jeliazkov; Ivelina Nikolova; Romualdo Benigni; Olga Tcheremenskaia; Stefan Kramer; Tobias Girschick; Fabian Buchwald; Jörg Wicker; Andreas Karwath; Martin Gütlein; Andreas Maunz; Haralambos Sarimveis; Georgia Melagraki; Antreas Afantitis; Pantelis Sopasakis; David Gallagher; Vladimir Poroikov; Dmitry Filimonov; Alexey V. Zakharov; Alexey Lagunin; Tatyana A. Gloriozova; Sergey V. Novikov; Natalia Skvortsova; Dmitry Druzhilovsky; Sunil Chawla

OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework including the approach to data access, schema and management, use of controlled vocabularies and ontologies, architecture, web service and communications protocols, and selection and integration of algorithms for predictive modelling. OpenTox provides end-user oriented tools to non-computational specialists, risk assessors, and toxicological experts in addition to Application Programming Interfaces (APIs) for developers of new applications. OpenTox actively supports public standards for data representation, interfaces, vocabularies and ontologies, Open Source approaches to core platform components, and community-based collaboration approaches, so as to progress system interoperability goals.

The OpenTox Framework includes APIs and services for compounds, datasets, features, algorithms, models, ontologies, tasks, validation, and reporting which may be combined into multiple applications satisfying a variety of different user needs. OpenTox applications are based on a set of distributed, interoperable OpenTox API-compliant REST web services. The OpenTox approach to ontology allows for efficient mapping of complementary data coming from different datasets into a unifying structure having a shared terminology and representation.

Two initial OpenTox applications are presented as an illustration of the potential impact of OpenTox for high-quality and consistent structure-activity relationship modelling of REACH-relevant endpoints: ToxPredict which predicts and reports on toxicities for endpoints for an input chemical structure, and ToxCreate which builds and validates a predictive toxicity model based on an input toxicology dataset. Because of the extensible nature of the standardised Framework design, barriers of interoperability between applications and content are removed, as the user may combine data, models and validation from multiple sources in a dependable and time-effective way.
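
The Framework's services are plain REST endpoints, so a prediction can be requested with any HTTP client. A minimal sketch in Python: the service and compound URIs are hypothetical placeholders, and the compound_uri form parameter follows the general OpenTox API convention rather than any particular deployment.

```python
import requests

# Hypothetical OpenTox-style URIs; real deployments publish their own.
MODEL_URI = "https://opentox.example.org/model/42"
COMPOUND_URI = "https://opentox.example.org/compound/benzene"

# POST a compound URI to a model service. OpenTox services are URI-based:
# the response is a URI pointing to a dataset that holds the prediction.
response = requests.post(
    MODEL_URI,
    data={"compound_uri": COMPOUND_URI},
    headers={"Accept": "text/uri-list"},
)
response.raise_for_status()
prediction_dataset_uri = response.text.strip()
print("Prediction stored at:", prediction_dataset_uri)
```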


Frontiers in Pharmacology | 2013

lazar: a modular predictive toxicology framework

Andreas Maunz; Martin Gütlein; Micha Rautenberg; David Vorgrimmler; Denis Gebele; Christoph Helma

lazar (lazy structure–activity relationships) is a modular framework for predictive toxicology. Similar to the read-across procedure in toxicological risk assessment, lazar creates local QSAR (quantitative structure–activity relationship) models for each compound to be predicted. Model developers can choose between a large variety of algorithms for descriptor calculation and selection, chemical similarity indices, and model building. This paper presents a high-level description of the lazar framework and discusses the performance of example classification and regression models.
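
The core idea, a local model built per query compound from neighbours weighted by chemical similarity, fits in a few lines. The sketch below is not the lazar implementation itself (lazar offers exchangeable algorithms for every step); it is an illustrative similarity-weighted neighbour vote over binary fingerprints with the Tanimoto index.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two binary fingerprints given as sets of bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def predict_local(query_fp, training_data, threshold=0.3):
    """Similarity-weighted activity vote over neighbours above a threshold.

    training_data: list of (fingerprint_set, activity) pairs, activity in {0, 1}.
    Returns a weighted activity estimate, or None if no neighbour qualifies
    (i.e. the query lies outside the applicability domain).
    """
    neighbours = [(tanimoto(query_fp, fp), act) for fp, act in training_data]
    neighbours = [(sim, act) for sim, act in neighbours if sim >= threshold]
    if not neighbours:
        return None
    total = sum(sim for sim, _ in neighbours)
    return sum(sim * act for sim, act in neighbours) / total

# Toy usage: fingerprints as sets of fragment indices.
train = [({1, 2, 3}, 1), ({2, 3, 4}, 1), ({7, 8, 9}, 0)]
print(predict_local({1, 2, 4}, train))  # 1.0
```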


Molecular Pharmaceutics | 2011

Combinatorial QSAR modeling of human intestinal absorption.

Claudia Suenderhauf; Felix Hammann; Andreas Maunz; Christoph Helma; Jörg Huwyler

Intestinal drug absorption in humans is a central topic in drug discovery. In this study, we use a broad selection of machine learning and statistical methods for the classification and numerical prediction of this key endpoint. Our data set is based on a selection of 458 small druglike compounds with FDA approval. Using easily available tools, we calculated one- to three-dimensional physicochemical descriptors and used various methods of feature selection (best-first backward selection, correlation analysis, and decision tree analysis). We then used decision tree induction (DTI), fragment-based lazy learning (LAZAR), support vector machine classification, multilayer perceptrons, random forests, k-nearest neighbor, and naïve Bayes analysis to model absorption ratios and binary classification (well-absorbed and poorly absorbed compounds). Best performance for classification was seen with DTI using the chi-squared automatic interaction detector (CHAID) algorithm, yielding a corrected classification rate of 88% (Matthews correlation coefficient of 75%). In numeric predictions, the multilayer perceptron performed best, achieving a root mean squared error of 25.823 and a coefficient of determination of 0.6. The importance of descriptors such as lipophilic partition coefficients (log P) and hydrogen bonding is in line with current understanding. However, we are able to highlight the utility of gravitational indices and moments of inertia, reflecting the role of structural symmetry in oral absorption. Our models are based on a diverse data set of marketed drugs representing a broad chemical space. These models therefore contribute substantially to the molecular understanding of human intestinal drug absorption and qualify for generalized use in drug discovery and lead optimization.
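
The reported Matthews correlation coefficient is computed from the confusion matrix of the binary classification. A small sketch with made-up counts (the paper's actual confusion matrix is not reproduced here):

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts for a well/poorly absorbed split of 458 compounds.
print(matthews_corrcoef(tp=320, tn=83, fp=30, fn=25))  # ~0.67
```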


Current Drug Metabolism | 2009

Development of Decision Tree Models for Substrates, Inhibitors, and Inducers of P-Glycoprotein

Felix Hammann; Heike Gutmann; Ursula Jecklin; Andreas Maunz; Christoph Helma; Juergen Drewe

In silico classification of new compounds for certain properties is a useful tool to guide further experiments or compound selection. Interaction of new compounds with the efflux pump P-glycoprotein (P-gp) is an important drug property determining tissue distribution and the potential for drug-drug interactions. We present three datasets on substrate, inhibitor, and inducer activities for P-gp (n = 471) obtained from a literature search, which we compared to an existing evaluation of the Prestwick Chemical Library with the calcein-AM assay (retrieved from PubMed). Additionally, we present decision tree models of these activities with predictive accuracies of 77.7% (substrates), 86.9% (inhibitors), and 90.3% (inducers) using three algorithms (CHAID, CART, and C4.5). We also present decision tree models of the calcein-AM assay (79.9%). Apart from a comprehensive dataset of P-gp interacting compounds, our study provides evidence of the efficacy of logD descriptors and of two algorithms not commonly used in pharmacological QSAR studies (CART and CHAID).
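
Of the three algorithms, CART is the most widely available in open-source libraries; scikit-learn's DecisionTreeClassifier is a CART-style learner. A minimal sketch on placeholder data (the logD/molecular-weight descriptor pair is an illustrative assumption, not the paper's feature set):

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Placeholder descriptors [logD, molecular weight] with binary substrate
# labels; the real study uses a curated literature dataset (n = 471).
X = [[2.1, 320.0], [0.4, 180.0], [3.8, 455.0], [1.2, 260.0],
     [4.5, 510.0], [-0.3, 150.0], [2.9, 400.0], [0.8, 210.0]]
y = [1, 0, 1, 0, 1, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=3, random_state=0)  # CART-style
print("CV accuracy:", cross_val_score(tree, X, y, cv=4).mean())
```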


Knowledge Discovery and Data Mining | 2009

Large-scale graph mining using backbone refinement classes

Andreas Maunz; Christoph Helma; Stefan Kramer

We present a new approach to large-scale graph mining based on so-called backbone refinement classes. The method efficiently mines tree-shaped subgraph descriptors under minimum frequency and significance constraints, using classes of fragments to reduce feature set size and running times. The classes are defined in terms of fragments sharing a common backbone. The method is able to optimize structural inter-feature entropy as opposed to occurrences, which is characteristic for open or closed fragment mining. In the experiments, the proposed method reduces feature set sizes by >90% and >30% compared to complete tree mining and open tree mining, respectively. Evaluation using cross-validation runs shows that the classification accuracy of the resulting descriptors is similar to that of the complete set of trees but significantly better than that of open trees. Compared to open or closed fragment mining, a large part of the search space can be pruned due to an improved statistical constraint (dynamic upper bound adjustment), which the experiments confirm through lower running times compared to ordinary (static) upper bound pruning. Further analysis using large-scale datasets yields insight into important properties of the proposed descriptors, such as the dataset coverage and the class size represented by each descriptor. A final cross-validation run confirms that the novel descriptors render large training sets feasible which previously might have been intractable.
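
The statistical pruning builds on the fact that chi-squared over pattern refinements admits a convex upper bound (Morishita and Sese): a refinement occurs in a subset of a pattern's occurrences, and the statistic is maximised when only the positive or only the negative occurrences remain. A sketch of the bound test, as an illustration of the general technique rather than the paper's exact procedure:

```python
def chi2(a, b, n_pos, n_neg):
    """Chi-squared of the 2x2 table for a pattern occurring in a of n_pos
    positive and b of n_neg negative examples."""
    c, d = n_pos - a, n_neg - b
    n = n_pos + n_neg
    den = (a + b) * (c + d) * n_pos * n_neg
    return n * (a * d - b * c) ** 2 / den if den else 0.0

def refinement_upper_bound(a, b, n_pos, n_neg):
    """Convex bound on chi-squared over all refinements of a pattern."""
    return max(chi2(a, 0, n_pos, n_neg), chi2(0, b, n_pos, n_neg))

# Prune a branch when even the best refinement cannot reach the current
# threshold; dynamic adjustment raises the threshold as mining proceeds.
threshold = 3.84  # chi-squared, 1 degree of freedom, p = 0.05
if refinement_upper_bound(a=12, b=9, n_pos=50, n_neg=50) < threshold:
    print("prune branch")
else:
    print("keep refining")
```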


Machine Learning | 2011

Efficient mining for structurally diverse subgraph patterns in large molecular databases

Andreas Maunz; Christoph Helma; Stefan Kramer

We present a new approach to large-scale graph mining based on so-called backbone refinement classes. The method efficiently mines tree-shaped subgraph descriptors under minimum frequency and significance constraints, using classes of fragments to reduce feature set size and running times. The classes are defined in terms of fragments sharing a common backbone. The method is able to optimize structural inter-feature entropy as opposed to purely occurrence-based criteria, which is characteristic for open or closed fragment mining. We first give an intuitive explanation why backbone refinement class features lead to a set of relevant features that are suitable for classification, in particular in the area of structure-activity relationships (SARs). We then show that backbone refinement classes yield a high compression in the search space of rooted perfect binary trees. We conduct several experiments to evaluate our theoretical insights in practice: A visualization suggests low co-occurrence and high entropy of backbone refinement class features. By comparison to a class of patterns sampled from the maximal patterns previously introduced by Al Hasan et al., we find a favorable tradeoff between the structural similarity and the resources needed to compute the descriptors. Cross-validation shows that classification accuracy is similar to the complete set of trees but significantly better than that of open trees, while feature set size is reduced by >90% and >30% compared to complete tree mining and open tree mining, respectively. Furthermore, compared to open or closed pattern mining, a large part of the search space can be pruned due to an improved statistical constraint (dynamic upper bound adjustment). This is confirmed experimentally by running times reduced by more than 60% compared to ordinary (static) upper bound pruning. The application of our method to the largest datasets that have been used in correlated graph mining so far indicates robustness against the minimum frequency parameter, and a cross-validation run on this data confirms that the novel descriptors render large training sets feasible, which previously might have been intractable.

A C++ implementation of the mining algorithm is available at http://www.maunz.de/libfminer-doc. Animated figures, links to datasets, and further resources are available at http://www.maunz.de/mlj-res.
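
As an aside on the entropy criterion: one plausible reading (an assumption for illustration, not the paper's exact definition) scores a fragment by the Shannon entropy of its binary occurrence vector, so that fragments covering about half the compounds are preferred over near-constant ones.

```python
import math

def occurrence_entropy(occurrences):
    """Shannon entropy (bits) of a binary occurrence vector: maximal for
    fragments present in half the compounds, zero for constant fragments."""
    p = sum(occurrences) / len(occurrences)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

print(occurrence_entropy([1, 0, 1, 0, 1, 0]))  # 1.0 (maximally informative)
print(occurrence_entropy([1, 1, 1, 1, 1, 0]))  # ~0.65 (near-ubiquitous)
```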


Regulatory Toxicology and Pharmacology | 2014

Automated and reproducible read-across like models for predicting carcinogenic potency

Elena Lo Piparo; Andreas Maunz; Christoph Helma; David Vorgrimmler; Benoît Schilter

Several qualitative (hazard-based) models for chronic toxicity prediction are available through commercial and freely available software, but in the context of risk assessment a quantitative value is mandatory in order to apply a Margin of Exposure (predicted toxicity/exposure estimate) approach to interpret the data. Recently, quantitative models for the prediction of carcinogenic potency have been developed, raising hopes in this area, but this promising approach is currently limited by the fact that the proposed programs are neither publicly nor commercially available. In this article we describe how two models (one for mouse and one for rat) for carcinogenic potency (TD50) prediction were developed using lazar (lazy structure–activity relationships), a procedure similar to read-across, but automated and reproducible. The resulting models were compared with the recently published ones and show similar performance. We also aim to make the models freely available in the near future through a user-friendly web site.
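
The Margin of Exposure interpretation the authors refer to is a plain ratio of predicted potency to estimated exposure. A sketch with hypothetical numbers:

```python
def margin_of_exposure(predicted_td50, exposure):
    """Margin of Exposure: predicted toxic potency divided by the exposure
    estimate (same units, e.g. mg/kg bw/day). Larger means safer."""
    return predicted_td50 / exposure

# Hypothetical: predicted rat TD50 of 12 mg/kg bw/day versus an estimated
# human exposure of 0.002 mg/kg bw/day.
print(margin_of_exposure(12.0, 0.002))  # 6000.0
```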


Frontiers in Pharmacology | 2016

Innovative Strategies to Develop Chemical Categories Using a Combination of Structural and Toxicological Properties

Monika Batke; Martin Gütlein; Falko Partosch; Ursula Gundert-Remy; Christoph Helma; Stefan Kramer; Andreas Maunz; Madeleine Seeland; Annette Bitsch

Interest is increasing in the development of non-animal methods for toxicological evaluations. These methods are, however, particularly challenging for complex toxicological endpoints such as repeated dose toxicity. European legislation, e.g., the European Union's Cosmetics Directive and REACH, demands the use of alternative methods. Frameworks such as the Read-Across Assessment Framework or the Adverse Outcome Pathway Knowledge Base support the development of these methods. The aim of the project presented in this publication was to develop substance categories for read-across with complex toxicity endpoints based on existing databases. The basic conceptual approach was to combine structural similarity with shared mechanisms of action. Substances with similar chemical structure and toxicological profile form candidate categories suitable for read-across. We combined two databases on repeated dose toxicity, the RepDose database and the ELINCS database, into a common database for the identification of categories. The resulting database contained physicochemical, structural, and toxicological data, which were refined and curated for cluster analyses. We applied the Predictive Clustering Tree (PCT) approach for clustering chemicals based on structural and on toxicological information to detect groups of chemicals with similar toxic profiles and pathways/mechanisms of toxicity. As many of the experimental toxicity values were not available, they were imputed with a multi-label classification method prior to clustering. The clustering results were evaluated by assessing chemical and toxicological similarities with the aim of identifying clusters with a concordance between structural information and toxicity profiles/mechanisms. From the chosen clusters, seven were selected for quantitative read-across, based on a small ratio (<5) between the highest and the lowest NOAEL in the cluster. We discuss the limitations of the approach and, based on this analysis, propose improvements for a follow-up approach, such as incorporation of metabolic information and more detailed mechanistic information. The software enables the user to assign a substance to a cluster and to use this information for a possible read-across. The clustering tool is provided as a free web service, accessible at http://mlc-reach.informatik.uni-mainz.de.
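
The cluster-selection criterion, a ratio below 5 between the highest and the lowest NOAEL within a cluster, is simple to state in code. A sketch with hypothetical NOAEL values:

```python
def noael_ratio(noaels):
    """Ratio between the highest and lowest NOAEL in a cluster."""
    return max(noaels) / min(noaels)

def suitable_for_read_across(noaels, max_ratio=5.0):
    """A cluster qualifies for quantitative read-across when its members'
    NOAELs span less than the given ratio."""
    return noael_ratio(noaels) < max_ratio

# Hypothetical clusters, NOAELs in mg/kg bw/day.
print(suitable_for_read_across([30.0, 45.0, 100.0]))  # True (ratio ~3.3)
print(suitable_for_read_across([5.0, 80.0, 200.0]))   # False (ratio 40)
```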


European Conference on Machine Learning | 2010

Latent structure pattern mining

Andreas Maunz; Christoph Helma; Tobias Cramer; Stefan Kramer

Pattern mining methods for graph data have largely been restricted to ground features, such as frequent or correlated subgraphs. Kazius et al. have demonstrated the use of elaborate patterns in the biochemical domain, summarizing several ground features at once. Such patterns bear the potential to reveal latent information not present in any individual ground feature. However, those patterns were handcrafted by chemical experts. In this paper, we present a data-driven bottom-up method for pattern generation that takes advantage of the embedding relationships among individual ground features. The method works fully automatically and does not require data preprocessing (e.g., to introduce abstract node or edge labels). By controlling the process of generating ground features, it is possible to align them canonically and merge (stack) them, yielding a weighted edge graph. In a subsequent step, the subgraph features can be further reduced by singular value decomposition (SVD). Our experiments show that the resulting features enable substantial performance improvements on chemical datasets that have so far been problematic for graph mining approaches.
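
The stacking step can be pictured as summing canonically aligned adjacency matrices into one weighted edge graph, which is then compressed with SVD. A NumPy sketch on toy matrices; the canonical alignment, which the mining process provides in the paper, is assumed given here.

```python
import numpy as np

# Three aligned ground fragments over the same four nodes (0/1 adjacency).
fragments = [
    np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]),
    np.array([[0, 1, 0, 0], [1, 0, 1, 1], [0, 1, 0, 0], [0, 1, 0, 0]]),
    np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]),
]

# Stack (merge) the fragments: each edge weight counts how many fragments
# share that edge, yielding a weighted edge graph as the latent pattern.
stacked = sum(fragments).astype(float)

# Reduce the weighted graph with a truncated SVD, keeping one component.
u, s, vt = np.linalg.svd(stacked)
rank1 = s[0] * np.outer(u[:, 0], vt[0, :])
print(np.round(rank1, 2))
```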


ACM Symposium on Applied Computing | 2014

Extracting information from support vector machines for pattern-based classification

Madeleine Seeland; Andreas Maunz; Andreas Karwath; Stefan Kramer

Statistical machine learning algorithms building on patterns found by pattern mining algorithms have to cope with large solution sets and thus the high dimensionality of the feature space. Conversely, pattern mining algorithms are frequently applied to irrelevant instances, causing noise in the output. Solution sets of pattern mining algorithms also typically grow with the size of the input dataset. The paper proposes an approach to overcome these limitations. The approach extracts information from trained support vector machines, in particular their support vectors and their relevance according to their coefficients. It uses the support vectors along with their coefficients as input to pattern mining algorithms able to handle weighted instances. Our experiments in the domain of graph mining and molecular graphs show that the resulting models are not significantly less accurate than models trained on the full datasets, yet require only a fraction of the time and much smaller sets of patterns.
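
With scikit-learn, the support vectors and their dual coefficients are directly accessible after training, so the weighted-instance input for a pattern miner takes a few lines to assemble. A sketch on toy feature vectors (the paper works on molecular graphs; plain vectors stand in here for brevity):

```python
import numpy as np
from sklearn.svm import SVC

# Toy training data standing in for graph-derived instances.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="rbf", C=1.0).fit(X, y)

# Support vectors with the magnitude of their dual coefficients: instances
# with larger |alpha_i * y_i| matter more to the decision boundary.
weights = np.abs(clf.dual_coef_[0])
weighted_instances = list(zip(clf.support_, weights))

# These (instance index, weight) pairs would replace the full dataset as
# input to a pattern miner that handles weighted instances.
print(weighted_instances)
```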

Collaboration


Dive into Andreas Maunz's collaborations.

Top Co-Authors

Falko Partosch

University of Göttingen
