Mathias Wawer
University of Bonn
Publication
Featured research published by Mathias Wawer.
Journal of Medicinal Chemistry | 2010
Anne Mai Wassermann; Mathias Wawer; Jürgen Bajorath
The study of compound structure-activity relationships (SARs) is one of the central themes in medicinal chemistry. SAR information is analyzed in different contexts, from screening and hit-to-lead to lead optimization projects. For the exploration of SARs, the concept of an activity landscape, which integrates molecular similarity and potency information, is of high relevance. The computational study of activity landscapes is still an evolving field. Activity landscape models are designed to rationalize SAR features of compound data sets and select key compounds for chemical exploration. The choice of molecular representations and the way molecular similarity is assessed are critically important factors for landscape generation and analysis. Graphical representation of SAR features is a major focal point of landscape modeling. Although complex activity landscapes are generally difficult to analyze, much progress has recently been made in extracting SAR information from various landscape views. This Perspective aims to provide an overview of the state of the art in activity landscape analysis and a discussion of its potential for medicinal chemistry applications. Understanding how structural modifications affect the biological activity of compounds or deriving a pharmacophore hypothesis from diverse active chemical entities present challenges that can be tackled using medicinal chemistry experience and intuition and/or computational tools. By no means is SAR analysis a priori dependent on computational methods. Rather, SAR analysis is often carried out on paper or whiteboards, by comparing molecular graphs of active compounds, consistent with the way chemists are traditionally trained. It has been pointed out that judgments of medicinal chemists are naturally subjective and often inconsistent. This is of course not specific to medicinal chemistry but rather a consequence of how we as individuals subjectively access and evaluate data sets of any kind.
Likely inconsistencies in individual judgments about chemical and biological data might well be taken as an argument to promote the use of computational methods for SAR analysis. However, it would be rather careless to assume that computational analysis would per se be objective. In fact, computational objectivity does not exist. We typically apply models with underlying assumptions and inherent approximations that are often only useful within relatively narrow applicability domains and the results of which are generally difficult to evaluate. In this context, it is often overlooked that we cannot model phenomena whose physicochemical or biological foundations we do not understand. Of course, calculations that are carried out and reported should at least be reproducible (one would hope), but reproducibility does not mean objectivity. There is, however, a rather simple factor that generally favors computational approaches to SAR analysis, and that is data set size. As long as one investigates one compound series at a time, knowledge of chemical graphs and activity data might be readily sufficient to deduce and predict SAR behavior. However, as molecular data sets grow in size, we quickly approach our limits to access and compare structures and associated biological properties such that computational data processing and analysis often become essential. Many compound data sets that have accumulated in pharmaceutical settings go far beyond the capacity of medicinal chemistry-centric SAR analysis and require the application of specialized computational tools for data handling and also modeling. Again, given the model-based nature of computational SAR analysis schemes, this does not make SAR analysis necessarily more objective (than individual assessments), but it makes it feasible. Currently available computational approaches to SAR analysis are multifaceted and of rather different methodological complexity.
A general distinction can be made between methodologies that primarily help to access and visualize SAR data obtained from screening or chemical optimization campaigns and those that ultimately predict biological activities. Among predictive methods, there are, for example, approaches to model linear and nonlinear structure-activity relationships, in particular, those based on the classical QSAR paradigm, pharmacophore techniques, and various machine learning approaches. Activity landscape methods, as introduced in the following, add to this methodological spectrum a strong focus on data-driven, descriptive, and large-scale SAR analysis schemes.
Journal of Medicinal Chemistry | 2008
Mathias Wawer; Lisa Peltason; Nils Weskamp; Andreas Teckentrup; Jürgen Bajorath
The study of structure-activity relationships (SARs) of small molecules is of fundamental importance in medicinal chemistry and drug design. Here, we introduce an approach that combines the analysis of similarity-based molecular networks and SAR index distributions to identify multiple SAR components present within sets of active compounds. Different compound classes produce molecular networks of distinct topology. Subsets of compounds related by different local SARs are often organized in small communities in networks annotated with potency information. Many local SAR communities are not isolated but connected by chemical bridges, i.e., similar molecules occurring in different local SAR contexts. The analysis makes it possible to relate local and global SAR features to each other and identify key compounds that are major determinants of SAR characteristics. In many instances, such compounds represent start and end points of chemical optimization pathways and aid in the selection of other candidates from their communities.
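As a rough illustration of the similarity-based molecular networks described in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes compounds are represented by hypothetical feature sets, uses Tanimoto similarity with an arbitrarily chosen threshold of 0.5, and treats connected components as a crude stand-in for the compound communities; the SAR index and potency annotation mentioned in the abstract are not modeled. All names and data are invented for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_network(fingerprints, threshold):
    """Connect two compounds if their similarity reaches the threshold."""
    graph = {name: set() for name in fingerprints}
    for u, v in combinations(fingerprints, 2):
        if tanimoto(fingerprints[u], fingerprints[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def communities(graph):
    """Connected components, used here as a simple proxy for communities."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(graph[node] - comp)
        seen |= comp
        result.append(sorted(comp))
    return sorted(result)


# Hypothetical toy fingerprints (sets of feature identifiers).
fingerprints = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({1, 2, 3, 5}),
    "C": frozenset({1, 2, 6, 7}),
    "D": frozenset({8, 9, 10}),
    "E": frozenset({8, 9, 11}),
}
network = similarity_network(fingerprints, threshold=0.5)
groups = communities(network)
print(groups)
```

In this toy set, compound C shares features with A and B but falls below the chosen threshold, so it remains isolated; lowering the threshold would merge it into their group, which shows how strongly the network topology depends on that assumption.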
Journal of Chemical Information and Modeling | 2010
Eugen Lounkine; Mathias Wawer; Anne Mai Wassermann; Jürgen Bajorath
We introduce SARANEA, an open-source Java application for interactive exploration of structure-activity relationship (SAR) and structure-selectivity relationship (SSR) information in compound sets of any source. SARANEA integrates various SAR and SSR analysis functions and utilizes a network-like similarity graph data structure for visualization. The program enables the systematic detection of activity and selectivity cliffs and corresponding key compounds across multiple targets. Advanced SAR analysis functions implemented in SARANEA include, among others, layered chemical neighborhood graphs, cliff indices, selectivity trees, editing functions for molecular networks and pathways, bioactivity summaries of key compounds, and markers for bioactive compounds having potential side effects. We report the application of SARANEA to identify SAR and SSR determinants in different sets of serine protease inhibitors. It is found that key compounds can influence SARs and SSRs in rather different ways. Such compounds and their SAR/SSR characteristics can be systematically identified and explored using SARANEA. The program and source code are made freely available under the GNU General Public License.
Drug Discovery Today | 2010
Mathias Wawer; Eugen Lounkine; Anne Mai Wassermann; Jürgen Bajorath
Computational data mining and visualization techniques play a central part in the extraction of structure-activity relationship (SAR) information from compound sets including high-throughput screening data. Standard statistical and classification techniques can be used to organize data sets and evaluate the chemical neighborhood of potent hits; however, such methods are limited in their ability to extract complex SAR patterns from data sets and make them readily accessible to medicinal chemists. Therefore, new approaches and data structures are being developed that explicitly focus on molecular structure and its relationship to biological activity across multiple targets. Here, we review standard techniques for compound data analysis and describe new data structures and computational tools for SAR mining of large compound data sets.
Journal of Medicinal Chemistry | 2011
Mathias Wawer; Jürgen Bajorath
The systematic extraction of structure-activity relationship (SAR) information from large and diverse compound data sets depends on the application of computational analysis methods. Irrespective of the methodological details, the ultimate goal of large-scale SAR analysis is to identify the most informative compounds and rationalize structural changes that determine SAR behavior. Such insights provide a basis for further chemical exploration. Herein we introduce the first graphical SAR analysis method that globally organizes large compound data sets on the basis of local structural relationships, hence providing immediate access to important structural modifications and SAR determinants.
Journal of Chemical Information and Modeling | 2010
Mathias Wawer; Jürgen Bajorath
An intuitive and generally applicable analysis method, termed similarity-potency tree (SPT), is introduced to mine structure-activity relationship (SAR) information in compound data sets of any source. Only compound potency values and nearest-neighbor similarity relationships are considered. Rather than analyzing a data set as a whole, partly overlapping compound neighborhoods are systematically generated and represented as SPTs. This local analysis scheme simplifies the evaluation of SAR information, and SPTs of high SAR information content are easily identified. By inspecting only a limited number of compound neighborhoods, it is also straightforward to determine whether data sets contain only little or no interpretable SAR information. Interactive analysis of SPTs is facilitated by reading the trees in two directions, which makes it possible to extract SAR rules, if available, in a consistent manner. The simplicity and interpretability of the data structure and the ease of calculation are characteristic features of this approach. We apply the methodology to high-throughput screening and lead optimization data sets, compare the approach to standard clustering techniques, illustrate how SAR rules are derived, and provide some practical guidance on how best to utilize the methodology. The SPT program is made freely available to the scientific community.
Journal of Chemical Information and Modeling | 2011
Dilyana Dimova; Mathias Wawer; Anne Mai Wassermann; Jürgen Bajorath
An activity landscape model of a compound data set can be rationalized as a graphical representation that integrates molecular similarity and potency relationships. Activity landscape representations of different design are utilized to aid in the analysis of structure-activity relationships and the selection of informative compounds. Activity landscape models reported thus far focus on a single target (i.e., a single biological activity) or at most two targets, giving rise to selectivity landscapes. For compounds active against more than two targets, landscapes representing multitarget activities are difficult to conceptualize and have not yet been reported. Herein, we present a first activity landscape design that integrates compound potency relationships across multiple targets in a formally consistent manner. These multitarget activity landscapes are based on a general activity cliff classification scheme and are visualized in graph representations, where activity cliffs are represented as edges. Furthermore, the contributions of individual compounds to structure-activity relationship discontinuity across multiple targets are monitored. The methodology has been applied to derive multitarget activity landscapes for compound data sets active against different target families. The resulting landscapes identify single-, dual-, and triple-target activity cliffs and reveal the presence of hierarchical cliff distributions. From these multitarget activity landscapes, compounds forming complex activity cliffs can be readily selected.
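The idea of representing activity cliffs as graph edges labeled by the targets involved can be sketched as follows. This is a simplified illustration under stated assumptions, not the classification scheme of the paper: a cliff is assumed to exist for a target when two similar compounds differ by at least 2 log units in potency (an arbitrary cutoff), the list of similar pairs is taken as given, and all compound names, target names, and potency values are hypothetical.

```python
def cliff_targets(pot_a, pot_b, min_diff=2.0):
    """Targets on which two compounds differ by at least min_diff log units."""
    shared = pot_a.keys() & pot_b.keys()
    return sorted(t for t in shared if abs(pot_a[t] - pot_b[t]) >= min_diff)


def multitarget_cliffs(potencies, similar_pairs, min_diff=2.0):
    """Map each similar compound pair to the targets on which it forms a cliff."""
    edges = {}
    for u, v in similar_pairs:
        targets = cliff_targets(potencies[u], potencies[v], min_diff)
        if targets:  # pairs without any cliff produce no edge
            edges[(u, v)] = targets
    return edges


# Hypothetical potency values (e.g., pKi) against three targets.
potencies = {
    "c1": {"T1": 9.0, "T2": 8.5, "T3": 6.0},
    "c2": {"T1": 6.5, "T2": 6.0, "T3": 6.2},
    "c3": {"T1": 6.4, "T2": 6.1, "T3": 8.9},
    "c4": {"T1": 6.6, "T2": 6.0, "T3": 6.0},
}
# Pairs assumed to be structurally similar (similarity is not computed here).
similar_pairs = [("c1", "c2"), ("c2", "c3"), ("c2", "c4")]

edges = multitarget_cliffs(potencies, similar_pairs)
for pair, targets in edges.items():
    print(pair, f"{len(targets)}-target cliff on", targets)
```

The number of targets attached to an edge distinguishes single-, dual-, and triple-target cliffs; counting the edges incident to a compound would give a simple per-compound measure of its contribution to SAR discontinuity.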
ACS Medicinal Chemistry Letters | 2011
Mathias Wawer; Jürgen Bajorath
We combine two graphical SAR analysis methods, Network-like Similarity Graphs (NSGs) and Similarity-Potency Trees (SPTs), to search for SAR information in a large and heterogeneous compound data set containing more than 13,000 antimalarial screening hits that was recently released by GlaxoSmithKline (GSK). The NSG-SPT approach first identifies subsets of compounds inducing local SAR discontinuity in data sets and then extracts available SAR information from these subsets in a graphically intuitive manner. Applying the NSG-SPT analysis scheme, we have identified in the GSK collection compound subsets of high local SAR information content including both known and previously unknown antimalarial chemotypes, which yielded interpretable SAR patterns. This information should be helpful to prioritize and select antimalarial candidate compounds for further chemical exploration. Furthermore, the NSG-SPT tools are publicly available, and our study also shows how to practically apply these SAR analysis methods to study large compound data sets.
Journal of Medicinal Chemistry | 2009
Mathias Wawer; Lisa Peltason; Jürgen Bajorath
A computational molecular network analysis of various high-throughput screening (HTS) data sets including inhibition assays and cell-based screens organizes screening hits according to different local structure-activity relationships (SARs). The resulting network representations make it possible to focus on different local SAR environments in screening data. We have designed a simple scoring function accounting for similarity and potency relationships among hits that identifies SAR pathways leading from active compounds in different SAR contexts to key compounds forming activity cliffs. From these pathways, SAR information can be extracted and utilized to select hits for further analysis. In clusters of hits related by different local SARs, alternative pathways can be systematically explored and ranked according to SAR information content, which makes it possible to prioritize hits in a consistent manner.
ChemMedChem | 2009
Mathias Wawer; Jürgen Bajorath
A data mining approach is introduced that automatically extracts SAR information from high-throughput screening data sets and that helps to select active compounds for chemical exploration and hit-to-lead projects. SAR pathways consisting of sequences of similar active compounds with gradual increases in potency are systematically identified. Fully enumerated SAR pathway sets are subjected to pathway scoring, filtering, and mining, and pathways with the most significant SAR information content are prioritized. High-scoring SAR pathways often reveal activity cliffs contained in screening data. Subsets of SAR pathways are analyzed in SAR trees that make it possible to identify microenvironments of significant SAR discontinuity from which hits are preferentially selected. SAR trees of alternative pathways leading to activity cliffs identify key compounds and help to develop chemically intuitive SAR hypotheses.
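The pathway idea in this abstract, sequences of similar active compounds with gradually increasing potency, can be sketched as a walk over a similarity graph. The following is a minimal illustration under assumptions of its own: the similarity graph is given as adjacency sets, every step must strictly increase potency, and pathways are ranked simply by total potency gain. The actual scoring and filtering functions of the published method are not described in the abstract and are not reproduced here; all names and values are hypothetical.

```python
def sar_pathways(graph, potency):
    """Enumerate maximal similarity paths along which potency strictly increases."""

    def uphill(node):
        # Similar neighbors that are more potent than the current compound.
        return sorted(m for m in graph[node] if potency[m] > potency[node])

    def extend(path):
        steps = uphill(path[-1])
        if not steps:  # no more potent neighbor: the pathway ends here
            yield path
        for nxt in steps:
            yield from extend(path + [nxt])

    # Start only from compounds without a less potent neighbor,
    # so that every enumerated pathway is maximal.
    starts = [n for n in sorted(graph)
              if all(potency[m] >= potency[n] for m in graph[n])]
    paths = [p for s in starts for p in extend([s]) if len(p) > 1]
    # Placeholder ranking: total potency gain from first to last compound.
    return sorted(paths, key=lambda p: potency[p[-1]] - potency[p[0]],
                  reverse=True)


# Hypothetical similarity graph and potency values (log units).
graph = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "e"},
    "d": {"b"},
    "e": {"c"},
}
potency = {"a": 5.0, "b": 6.0, "c": 7.0, "d": 6.5, "e": 9.0}

paths = sar_pathways(graph, potency)
for p in paths:
    print(" -> ".join(p), "gain:", potency[p[-1]] - potency[p[0]])
```

Because potency must strictly increase at each step, no compound can be revisited, so the enumeration always terminates. The large final step from c to e in the top-ranked pathway is the kind of jump that would point to an activity cliff.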