Kathrin Heikamp
University of Bonn
Publication
Featured research published by Kathrin Heikamp.
Chemical Biology & Drug Design | 2013
Kathrin Heikamp; Jürgen Bajorath
We provide a future perspective of the virtual screening field. A number of challenges are highlighted that virtual screening will likely face as compound data continue to grow at or beyond current rates and as much more target information becomes available. These challenges go beyond computational efficiency issues (which will of course also play a critical role). For example, for structure‐based approaches, the accuracy of scoring functions and energy calculations will need to be improved. For ligand‐based approaches, the compound class‐dependence of similarity methods needs to be further explored and relationships between molecular similarity and activity similarity need to be established. We also comment on the current and future value of virtual screening. Opportunities for further development in a postgenome era are also discussed. It is hoped that some of the views and hypotheses we articulate might stimulate further discussion about the virtual screening field going forward.
Journal of Chemical Information and Modeling | 2011
Kathrin Heikamp; Jürgen Bajorath
A large-scale similarity search investigation has been carried out on 266 well-defined compound activity classes extracted from the ChEMBL database. The analysis was performed using two widely applied two-dimensional (2D) fingerprints that mark opposite ends of the current performance spectrum of these types of fingerprints, i.e., MACCS structural keys and the extended connectivity fingerprint with bond diameter four (ECFP4). For each fingerprint, three nearest neighbor search strategies were applied. On the basis of these search calculations, a similarity search profile of the ChEMBL database was generated. Overall, the fingerprint search campaign was surprisingly successful. In 203 of 266 test cases (∼76%), a compound recovery rate of at least 50% was observed with at least the better performing fingerprint and one search strategy. The similarity search profile also revealed several general trends. For example, fingerprint searching was often characterized by an early enrichment of active compounds in database selection sets. In addition, compound activity classes have been categorized according to different similarity search performance levels, which helps to put the results of benchmark calculations into perspective. Therefore, a compendium of activity classes falling into different search performance categories is provided. On the basis of our large-scale investigation, the performance range of state-of-the-art 2D fingerprinting has been delineated for compound data sets directed against a wide spectrum of pharmaceutical targets.
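The search setup described in this abstract can be illustrated with a minimal sketch. It assumes fingerprints are stored as sets of "on" bit indices, uses the Tanimoto coefficient as the similarity measure, and scores each database compound by its mean similarity to the k most similar reference compounds. The toy fingerprints, compound names, and the `recovery_rate` helper are hypothetical and are not taken from the study.

```python
from typing import Dict, FrozenSet, List, Set

Fingerprint = FrozenSet[int]  # indices of bits set to 1 (toy representation)


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient of two bit-set fingerprints."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_score(candidate: Fingerprint, references: List[Fingerprint], k: int) -> float:
    """Mean similarity of a candidate to its k most similar reference compounds."""
    sims = sorted((tanimoto(candidate, r) for r in references), reverse=True)
    top = sims[:k]
    return sum(top) / len(top)


def recovery_rate(
    database: Dict[str, Fingerprint],
    actives: Set[str],
    references: List[Fingerprint],
    k: int,
    selection_size: int,
) -> float:
    """Fraction of hidden actives found among the top-ranked database compounds."""
    ranked = sorted(
        database,
        key=lambda name: knn_score(database[name], references, k),
        reverse=True,
    )
    selected = ranked[:selection_size]
    return sum(1 for name in selected if name in actives) / len(actives)


# Hypothetical toy data: two reference compounds, two hidden actives, two decoys.
references = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 4, 5})]
database = {
    "active_1": frozenset({1, 2, 3, 5}),
    "active_2": frozenset({2, 3, 4, 6}),
    "decoy_1": frozenset({7, 8, 9}),
    "decoy_2": frozenset({1, 8, 9, 10}),
}
hidden_actives = {"active_1", "active_2"}

rate = recovery_rate(database, hidden_actives, references, k=1, selection_size=2)
print(f"recovery rate: {rate:.2f}")
```

Setting `k=1` corresponds to a 1-nearest-neighbor search; larger values of `k` average over more reference compounds. Which three strategies the study actually compared is not stated in the abstract.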
Journal of Chemical Information and Modeling | 2012
Kathrin Heikamp; Xiaoying Hu; Aixia Yan; Jürgen Bajorath
Activity cliffs are formed by pairs of structurally similar compounds that act against the same target but display a significant difference in potency. Such activity cliffs are the most prominent features of activity landscapes of compound data sets and a primary focal point of structure-activity relationship (SAR) analysis. The search for activity cliffs in various compound sets has been the topic of a number of previous investigations. So far, activity cliff analysis has concentrated on data mining for activity cliffs and on their graphical representation and has thus been descriptive in nature. By contrast, approaches for activity cliff prediction are currently not available. We have derived support vector machine (SVM) models to successfully predict activity cliffs. A key aspect of the approach has been the design of new kernels to enable SVM classification on the basis of molecule pairs, rather than individual compounds. In test calculations on different data sets, activity cliffs have been accurately predicted using specifically designed structural representations and kernel functions.
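To make the idea of a kernel on molecule pairs concrete, here is a minimal sketch of one generic construction: a symmetrized product of Tanimoto kernels over the members of two pairs. This is an assumption chosen for illustration, not the kernel designed in the paper, whose definition is not given in the abstract. Fingerprints are toy sets of bit indices.

```python
from typing import FrozenSet, Tuple

Fingerprint = FrozenSet[int]
Pair = Tuple[Fingerprint, Fingerprint]


def tanimoto_kernel(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto kernel on bit-set fingerprints (1.0 for two empty fingerprints)."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def pair_kernel(p: Pair, q: Pair) -> float:
    """Symmetrized product kernel comparing two compound pairs.

    Both possible alignments of the pair members are summed, so the value
    does not depend on the order of the compounds inside a pair.
    """
    (a, b), (c, d) = p, q
    return (
        tanimoto_kernel(a, c) * tanimoto_kernel(b, d)
        + tanimoto_kernel(a, d) * tanimoto_kernel(b, c)
    )


# Hypothetical toy fingerprints.
m1 = frozenset({1, 2, 3})
m2 = frozenset({1, 2, 4})
m3 = frozenset({1, 2, 3, 5})
m4 = frozenset({6, 7})

p = (m1, m2)
q = (m3, m4)
print(pair_kernel(p, q), pair_kernel((m2, m1), q))
```

A Gram matrix filled with such values could be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`, with each training instance being a compound pair labeled as cliff or non-cliff.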
Journal of Chemical Information and Modeling | 2013
Kathrin Heikamp; Jürgen Bajorath
The choice of negative training data for machine learning is a little explored issue in chemoinformatics. In this study, the influence of alternative sets of negative training data and different background databases on support vector machine (SVM) modeling and virtual screening has been investigated. Target-directed SVM models have been derived on the basis of differently composed training sets containing confirmed inactive molecules or randomly selected database compounds as negative training instances. These models were then applied to search background databases consisting of biological screening data or randomly assembled compounds for available hits. Negative training data were found to systematically influence compound recall in virtual screening. In addition, different background databases had a strong influence on the search results. Our findings also indicated that typical benchmark settings lead to an overestimation of SVM-based virtual screening performance compared to search conditions that are more relevant for practical applications.
Expert Opinion on Drug Discovery | 2014
Kathrin Heikamp; Jürgen Bajorath
Introduction: Support vector machines (SVMs) are supervised machine learning algorithms for binary class label prediction and regression-based prediction of property values. In recent years, SVMs have become increasingly popular for drug discovery-relevant applications such as compound classification, the search for novel active compounds and property predictions. Areas covered: The authors provide the readers with a brief introduction of SVM theory and discuss the kernel functions designed for drug discovery applications. The authors also review the different types of SVM applications in drug discovery, looking at particular case studies. Furthermore, the authors discuss the recent hybrid methods developed that incorporate SVM modeling in different ways. Expert opinion: SVMs are currently among the best-performing approaches for chemical and biological property prediction and the computational identification of active compounds. It is anticipated that their use in drug discovery will further increase. Indeed, this will also include the development of SVM-based meta-classifiers that combine different approaches and exploit their individual strengths and complementarity.
Journal of Medicinal Chemistry | 2013
Dilyana Dimova; Kathrin Heikamp; Dagmar Stumpfe; Jürgen Bajorath
Activity cliffs are defined as pairs of structurally similar compounds with a significant difference in potency. These compound pairs have high SAR information content because they represent small structural changes leading to large potency alterations. Accordingly, activity cliffs are of prime interest for SAR exploration and compound optimization. It is currently unknown to what extent activity cliff information is utilized in practical medicinal chemistry. Therefore, we have assembled 56 compound data sets that evolved over time and searched for analogues of activity cliff-forming compounds with further increased potency. For ∼75% of all activity cliffs, there was no evidence for further chemical exploration. For ∼25% of all cliffs, potency progression was detected. In total, for ∼15% of all activity cliffs, positive cliff progression was observed that often involved multiple analogues. Given these findings, chemically unexplored activity cliffs should provide significant opportunities for further study in medicinal chemistry.
Journal of Chemical Information and Modeling | 2011
Kathrin Heikamp; Jürgen Bajorath
In independent studies it has previously been demonstrated that two-dimensional (2D) fingerprints have scaffold hopping ability in virtual screening, although these descriptors primarily emphasize structural and/or topological resemblance of reference and database compounds. However, the mechanism by which such fingerprints enrich structurally diverse molecules in database selection sets is currently little understood. In order to address this question, similarity search calculations on 120 compound activity classes of varying structural diversity were carried out using atom environment fingerprints. Two feature selection methods, Kullback-Leibler divergence and gain ratio analysis, were applied to systematically reduce these fingerprints and generate alternative versions for searching. Gain ratio is a feature selection method from information theory that has thus far not been considered in fingerprint analysis. However, it is shown here to be an effective fingerprint feature selection approach. Following comparative feature selection and similarity searching, the compound recall characteristics of original and reduced fingerprint versions were analyzed in detail. Small sets of fingerprint features were found to distinguish subsets of active compounds from other database molecules. The compound recall of fingerprint similarity searching often resulted from a cumulative detection of distinct compound subsets by different fingerprint features, which provided a rationale for the scaffold hopping potential of these 2D fingerprints.
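Feature selection by Kullback-Leibler divergence can be sketched as follows. Each fingerprint bit is treated as a Bernoulli variable, its frequency is estimated in the active compounds and in the background database, and bits are ranked by the divergence between the two distributions. The Laplace smoothing, the function names, and the toy data are assumptions for illustration; the abstract does not specify these details.

```python
import math
from typing import FrozenSet, List

Fingerprint = FrozenSet[int]


def smoothed_frequency(fps: List[Fingerprint], bit: int) -> float:
    """Laplace-smoothed fraction of fingerprints in which the bit is set."""
    count = sum(1 for fp in fps if bit in fp)
    return (count + 1) / (len(fps) + 2)


def bernoulli_kl(p: float, q: float) -> float:
    """KL divergence D(P || Q) between two Bernoulli distributions."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def rank_bits_by_kl(
    actives: List[Fingerprint], background: List[Fingerprint], n_bits: int
) -> List[int]:
    """Bit positions sorted by decreasing divergence between actives and background."""
    divergence = {
        bit: bernoulli_kl(
            smoothed_frequency(actives, bit), smoothed_frequency(background, bit)
        )
        for bit in range(n_bits)
    }
    return sorted(divergence, key=divergence.get, reverse=True)


def reduce_fingerprint(fp: Fingerprint, kept_bits: FrozenSet[int]) -> Fingerprint:
    """Keep only the selected bit positions of a fingerprint."""
    return fp & kept_bits


# Hypothetical toy data: bit 0 is set in every active and in no background compound.
actives = [frozenset({0, 1}), frozenset({0, 2}), frozenset({0, 1, 3})]
background = [
    frozenset({1, 4}),
    frozenset({2, 5}),
    frozenset({1, 2}),
    frozenset({3, 4}),
]

ranked = rank_bits_by_kl(actives, background, n_bits=6)
reduced = reduce_fingerprint(actives[2], frozenset(ranked[:1]))
print(ranked, sorted(reduced))
```

Searching with fingerprints reduced to the top-ranked bits, and then checking which actives each small feature set retrieves, is one way to probe the cumulative detection effect the abstract describes.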
Journal of Chemical Information and Modeling | 2013
Kathrin Heikamp; Jürgen Bajorath
Using support vector machine (SVM) ranking, a complex multi-class prediction task has been investigated involving sets of compounds that were active against related targets and represented all possible combinations of single-, dual-, and triple-target activities. Standard SVM models were not capable of differentiating compounds with overlapping yet distinct activity profiles. To address this problem, we designed differentially weighted SVM linear combinations that were found to preferentially detect compounds with desired activity profiles and deprioritize others. Hence, combining independently derived SVM models using negative and positive linear weighting factors balanced relative contributions from individual reference sets and successfully distinguished between compounds with overlapping activity profiles.
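The principle of combining independently derived models with positive and negative weights can be sketched in a few lines. The scores below stand in for per-target SVM ranking scores, and the weights of +1 and -1 are hypothetical; the abstract does not report the actual weighting factors or how they were chosen.

```python
from typing import Dict, List


def combine_scores(
    model_scores: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> Dict[str, float]:
    """Weighted linear combination of per-model scores for each compound."""
    compounds = next(iter(model_scores.values())).keys()
    return {
        cpd: sum(weights[model] * model_scores[model][cpd] for model in weights)
        for cpd in compounds
    }


def rank_compounds(combined: Dict[str, float]) -> List[str]:
    """Compound names sorted by decreasing combined score."""
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical scores from two single-target models (higher = more likely active).
model_scores = {
    "target_A": {"cpd_A_only": 0.9, "cpd_A_and_B": 0.8, "cpd_B_only": -0.6},
    "target_B": {"cpd_A_only": -0.5, "cpd_A_and_B": 0.9, "cpd_B_only": 0.7},
}

# Goal: prefer compounds active against A but not B, so model B is weighted negatively.
weights = {"target_A": 1.0, "target_B": -1.0}

combined = combine_scores(model_scores, weights)
ranking = rank_compounds(combined)
print(ranking)
```

With the model for target A alone, the single-target and dual-target compounds receive similar scores; the negative weight on model B pushes the dual-target compound down the ranking.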
Future Medicinal Chemistry | 2012
Kathrin Heikamp; Jürgen Bajorath
Fingerprints (FPs) are bit or integer string representations of molecular structure and properties, and are popular descriptors for chemical similarity searching. A major goal of similarity searching is the identification of novel active compounds on the basis of known reference molecules. In this review, recent FP design and engineering strategies are discussed. New types of FPs continue to be introduced, often applying different design principles. FP engineering techniques have recently been introduced to further improve search performance and computational efficiency and elucidate mechanisms by which FPs recognize active compounds. In addition, through feature selection and hybridization techniques, standard FPs have been transformed into compound class-specific versions with further increased search performance. Moreover, scaffold hopping mechanisms have been explored. FPs will continue to play an important role in the search for novel active compounds.
Journal of Chemical Information and Modeling | 2015
Andrew Anighoro; Dagmar Stumpfe; Kathrin Heikamp; Kristin Beebe; Leonard M. Neckers; Jürgen Bajorath; Giulio Rastelli
The design of a single drug molecule that is able to simultaneously and specifically interact with multiple biological targets is gaining major consideration in drug discovery. However, the rational design of drugs with a desired polypharmacology profile is still a challenging task, especially when these targets are distantly related or unrelated. In this work, we present a computational approach aimed at the identification of suitable target combinations for multitarget drug design within an ensemble of biologically relevant proteins. The target selection relies on the analysis of activity annotations present in molecular databases and on ligand-based virtual screening. A few target combinations were also inspected with structure-based methods to demonstrate that the identified dual-activity compounds are able to bind target combinations characterized by remote binding site similarities. Our approach was applied to the heat shock protein 90 (Hsp90) interactome, which contains several targets of key importance in cancer. Promising target combinations were identified, providing a basis for the computational design of compounds with dual activity. The approach may be used on any ensemble of proteins of interest for which known inhibitors are available.