Publication


Featured research published by Juergen Bajorath.


Journal of Chemical Information and Computer Sciences | 2001

Selected concepts and investigations in compound classification, molecular descriptor analysis, and virtual screening.

Juergen Bajorath

Compound classification and virtual screening methods are capable of exploring and exploiting molecular similarity beyond chemistry, in accordance with the similar property principle [1]. They can be used to analyze and predict biologically active compounds and correlate structural features and chemical properties of molecules with specific activities. This explains why such approaches are highly attractive tools in pharmaceutical research [2], although a number of the underlying scientific concepts have originally been developed for different purposes. Since it is increasingly recognized that simply synthesizing and screening more and more compounds does not necessarily provide a sufficiently large number of high-quality leads and, ultimately, clinical candidates, much effort is spent in developing and implementing computational concepts that help to identify and refine leads. Typical applications include the identification of compounds with desired activity by database searching, derivation of predictive models of activity for database mining, selection of representative subsets from large compound libraries, or analysis of druglike properties.

The aim of this contribution is to review and comment on some major developments in compound classification and molecular similarity research, reflect their diversity, and highlight some of the questions that remain unanswered. In a single contribution, it is difficult, if not impossible, to provide a complete account of, and give full credit to, all methods and developments relevant to compound classification and virtual screening. Therefore, some areas have been, rather subjectively, more emphasized than others or even omitted. For example, the discussion of virtual screening approaches is limited to those that focus on the small molecular level, as opposed to target structure-based design or docking methods, which have been reviewed elsewhere [3-5].


Journal of Chemical Information and Computer Sciences | 2001

Mini-fingerprints detect similar activity of receptor ligands previously recognized only by three-dimensional pharmacophore-based methods.

Ling Xue; Florence L. Stahura; Jeffrey W. Godden; Juergen Bajorath

Mini-fingerprints (MFPs) are short binary bit string representations of molecular structure and properties, composed of a few selected two-dimensional (2D) descriptors and a number of structural keys. MFPs were specifically designed to recognize compounds with similar activity. Here we report that MFPs are capable of detecting similar activities of some druglike molecules, including endothelin A antagonists and α1-adrenergic receptor ligands, the recognition of which was previously thought to depend on the use of multiple point three-dimensional (3D) pharmacophore methods. Thus, in these cases, MFPs and pharmacophore fingerprints produce similar results, although they define, in terms of their complexity, opposite ends of the spectrum of methods currently used to study molecular similarity or diversity. For each of the studied compound classes, comparison of MFP bit settings identified a consensus or signature pattern. Scaling factors can be applied to these bits in order to increase the probability of finding compounds with similar activity by virtual screening.
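The idea of a consensus bit pattern with scaling factors can be illustrated with a minimal Python sketch. The abstract does not say which similarity metric was used or how the scaling enters the calculation, so the Tanimoto coefficient, the multiplicative bit weights, the function names, and the toy six-bit fingerprints below are all assumptions made for illustration only.

```python
from typing import List, Sequence


def consensus_bits(fingerprints: Sequence[Sequence[int]],
                   min_fraction: float = 1.0) -> List[int]:
    """Return positions whose bit is set in at least min_fraction of the fingerprints."""
    n = len(fingerprints)
    length = len(fingerprints[0])
    return [i for i in range(length)
            if sum(fp[i] for fp in fingerprints) / n >= min_fraction]


def scaled_tanimoto(a: Sequence[int], b: Sequence[int],
                    scaled_positions: Sequence[int] = (),
                    factor: float = 1.0) -> float:
    """Tanimoto coefficient in which the selected bit positions count `factor` times.

    Assumption: scaling is modeled as a per-bit weight; with factor=1.0 this
    reduces to the ordinary Tanimoto coefficient.
    """
    scaled = set(scaled_positions)
    weights = [factor if i in scaled else 1.0 for i in range(len(a))]
    common = sum(w for w, x, y in zip(weights, a, b) if x and y)
    either = sum(w for w, x, y in zip(weights, a, b) if x or y)
    return common / either if either else 0.0


# Hypothetical toy fingerprints of three compounds from one activity class.
activity_class = [
    [1, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
]
signature = consensus_bits(activity_class)      # bits shared by every class member
query = activity_class[0]
candidate = [1, 1, 0, 0, 0, 1]                  # hypothetical database compound

unscaled = scaled_tanimoto(query, candidate)
scaled = scaled_tanimoto(query, candidate, signature, factor=3.0)
print(signature, unscaled, scaled)
```

In this toy case the candidate shares both signature bits with the query, so weighting those bits raises its similarity value and moves it up a ranked hit list.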


Journal of Chemical Information and Modeling | 2008

Similarity searching using fingerprints of molecular fragments involved in protein-ligand interactions.

Lu Tan; Eugen Lounkine; Juergen Bajorath

To incorporate protein-ligand interaction information into conventional two-dimensional (2D) fingerprint searching, interacting fragments of active compounds were extracted from X-ray structures of protein-ligand complexes and encoded as structural key-type fingerprints. Similarity search calculations with fingerprints derived from interacting fragments were compared to fingerprints of complete ligands and control fragments. In these calculations, fingerprints of interacting fragments produced significantly higher compound recall than other fingerprints. These results indicate that ligand fragments involved in protein-ligand interactions carry much activity-specific chemical information that can be exploited in similarity searching without explicitly accounting for interaction information.


Journal of Chemical Information and Modeling | 2008

Balancing the Influence of Molecular Complexity on Fingerprint Similarity Searching

Yuan Wang; Juergen Bajorath

Differences in molecular complexity and size are known to bias the evaluation of fingerprint similarity. For example, complex molecules tend to produce fingerprints with higher bit density than simple ones, which often leads to artificially high similarity values in search calculations. We introduce here a variant of the Tversky coefficient that makes it possible to modulate or eliminate molecular complexity effects when evaluating fingerprint similarity. This has enabled us to study in detail the role of molecular complexity in similarity searching and the relationship between reference and active database compounds. Balancing complexity effects leads to constant distributions of similarity values for reference and database molecules, independent of how compound contributions are weighted. When searching for active compounds with varying complexity, hit rates can be optimized by modulating complexity effects, rather than eliminating them, and adjusting relative compound weights. For reference molecules and active database compounds having different complexity, preferred parameter settings are identified.
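For orientation, the standard Tversky coefficient can be sketched in a few lines of Python. This is not the variant introduced in the paper, whose exact form the abstract does not give; the function name and the toy eight-bit fingerprints are hypothetical.

```python
from typing import Sequence


def tversky(ref: Sequence[int], db: Sequence[int],
            alpha: float, beta: float) -> float:
    """Standard Tversky coefficient for two binary fingerprints.

    Tv = c / (alpha * (a - c) + beta * (b - c) + c), where
    a = bits set in the reference, b = bits set in the database compound,
    c = bits set in both. alpha = beta = 1 gives the Tanimoto coefficient.
    """
    a = sum(ref)
    b = sum(db)
    c = sum(1 for x, y in zip(ref, db) if x and y)
    denominator = alpha * (a - c) + beta * (b - c) + c
    return c / denominator if denominator else 0.0


# Hypothetical fingerprints: a simple reference (2 bits set) and a
# more complex database compound (6 bits set) that contains both bits.
simple_ref = [1, 1, 0, 0, 0, 0, 0, 0]
complex_db = [1, 1, 1, 1, 1, 1, 0, 0]

symmetric = tversky(simple_ref, complex_db, alpha=1.0, beta=1.0)
ref_weighted = tversky(simple_ref, complex_db, alpha=1.0, beta=0.0)
print(symmetric, ref_weighted)
```

With all weight on the reference (beta = 0), the bits that only the complex compound sets are ignored, so it reaches the maximum score simply by covering the reference bits. This is the kind of complexity effect that the weighting parameters let one modulate.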


Journal of Chemical Information and Computer Sciences | 2002

Accurate partitioning of compounds belonging to diverse activity classes.

Ling Xue; Juergen Bajorath

Diverse sets of compounds were classified according to biological activity by use of a partitioning approach based on principal component analysis in conjunction with a genetic algorithm for molecular descriptor evaluation. Combinations of 236 molecular property and structural key descriptors were explored for their performance in classifying 317 molecules belonging to 21 distinct biological activity classes from various sources. Preferred descriptor combinations were further explored by complete factorial analysis. In these calculations, compounds having similar specific activity were predicted with greater than 80% accuracy.


Journal of Chemical Information and Computer Sciences | 2004

Molecular similarity analysis and virtual screening by mapping of consensus positions in binary-transformed chemical descriptor spaces with variable dimensionality.

Jeffrey W. Godden; John R. Furr; Ling Xue; Florence L. Stahura; Juergen Bajorath

A novel compound classification algorithm is described that operates in binary molecular descriptor spaces and groups active compounds together in a computationally highly efficient manner. The method involves the transformation of continuous descriptor value ranges into a binary format, subsequent definition of simplified descriptor spaces, identification of consensus positions of specific compound sets in these spaces, and iterative adjustments of the dimensionality of the descriptor spaces in order to discriminate compounds sharing similar activity from others. We term this approach Dynamic Mapping of Consensus positions (DMC) because the definition of reference spaces is tuned toward specific compound classes and their dimensionality is increased as the analysis proceeds. When applied to virtual screening, sets of bait compounds are added to a large screening database to identify hidden active molecules. In these calculations, molecules that map to consensus positions after elimination of most of the database compounds are considered hit candidates. In a benchmark study on five biological activity classes, hits for randomly assembled sets of bait molecules were correctly identified in 95% of virtual screening calculations in a source database containing more than 1.3 million molecules, thus providing a measure of the sensitivity of the DMC technique.
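The steps named in the abstract (binary transformation, consensus positions, stepwise increase of dimensionality) can be sketched as follows. This is a heavily simplified illustration, not the published DMC implementation: binarizing at the database median, requiring all baits to agree on a bit, the order in which dimensions are added, the stopping rule, and every name and value in the example are assumptions.

```python
from statistics import median
from typing import Dict, List, Sequence, Tuple


def binarize(matrix: Sequence[Sequence[float]]) -> List[List[int]]:
    """Map each descriptor column to bits: 1 if the value exceeds the column median."""
    medians = [median(col) for col in zip(*matrix)]
    return [[1 if v > m else 0 for v, m in zip(row, medians)] for row in matrix]


def consensus_dimensions(bait_bits: Sequence[Sequence[int]]) -> Dict[int, int]:
    """Return {dimension: bit} for dimensions on which every bait has the same bit."""
    return {d: col[0] for d, col in enumerate(zip(*bait_bits)) if len(set(col)) == 1}


def map_to_consensus(db_bits: Sequence[Sequence[int]], bait_indices: Sequence[int],
                     max_survivors: int) -> Tuple[List[int], List[int]]:
    """Add one consensus dimension at a time and keep only compounds that still match."""
    consensus = consensus_dimensions([db_bits[i] for i in bait_indices])
    survivors = list(range(len(db_bits)))
    used: List[int] = []
    for d, bit in consensus.items():
        survivors = [i for i in survivors if db_bits[i][d] == bit]
        used.append(d)
        if len(survivors) <= max_survivors:  # assumed stopping rule
            break
    return survivors, used


# Hypothetical database of 8 compounds x 3 descriptors; rows 0 and 1 are the baits.
database = [
    [9.0, 1.0, 5.0], [8.0, 2.0, 6.0], [8.5, 1.2, 5.5], [1.0, 9.0, 5.2],
    [2.0, 8.0, 1.0], [3.0, 1.1, 0.5], [8.2, 7.0, 0.8], [1.5, 6.0, 0.9],
]
baits = [0, 1]
survivors, used_dims = map_to_consensus(binarize(database), baits, max_survivors=3)
hit_candidates = [i for i in survivors if i not in baits]
print(survivors, used_dims, hit_candidates)
```

Because the baits are part of the database, they always survive the filtering. Any other compound left at the consensus position once most of the database has been eliminated is treated as a hit candidate.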


Journal of Chemical Information and Modeling | 2008

Distinguishing between Bioactive and Modeled Compound Conformations through Mining of Emerging Chemical Patterns

Jens Auer; Juergen Bajorath

To systematically compare bioactive and theoretically derived compound conformations, we have analyzed 18 different sets of active small molecules with experimentally determined binding conformations and modeled conformers using a pattern recognition approach. Compound class-specific descriptor value range patterns that accurately distinguish bioactive conformations from other low-energy conformers were identified for all 18 compound classes. Discriminatory patterns were often chemically intuitive and could be well rationalized on the basis of X-ray structures of the protein-ligand complexes. Target-specific descriptor patterns can be used as filters to screen conformational ensembles for bioactive conformations.


Journal of Chemical Information and Computer Sciences | 2002

Median Partitioning: a novel method for the selection of representative subsets from large compound pools.

Jeffrey W. Godden; Ling Xue; Douglas B. Kitchen; Florence L. Stahura; E. James Schermerhorn; Juergen Bajorath

A method termed Median Partitioning (MP) has been developed to select diverse sets of molecules from large compound pools. Unlike many other methods for subset selection, the MP approach does not depend on pairwise comparison of molecules and can therefore be applied to very large compound collections. The only time limiting step is the calculation of molecular descriptors for database compounds. MP employs arrays of property descriptors with little correlation to divide large compound pools into partitions from which representative molecules can be selected. In each of n subsequent steps, a population of molecules is divided into subpopulations above and below the median value of a property descriptor until a desired number of 2^n partitions are obtained. For descriptor evaluation and selection, an entropy formulation was embedded in a genetic algorithm. MP has been applied here to generate a subset of the Available Chemicals Directory, and the results have been compared with cell-based partitioning.
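The partitioning step described above can be sketched in Python. The sketch assumes that each median is computed once over the whole pool rather than within each subpopulation, and that values equal to the median fall on the "below" side; the abstract does not settle either point. The function names and the toy descriptor values are hypothetical, and the entropy-based descriptor selection is not shown.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Sequence, Tuple


def median_partition(descriptors: Sequence[Sequence[float]]) -> Dict[Tuple[int, ...], List[int]]:
    """Assign each molecule an n-bit partition code, one bit per descriptor.

    Bit k is 1 if the molecule's k-th descriptor value lies above the pool
    median of that descriptor, so n descriptors give at most 2**n partitions.
    No pairwise comparison of molecules is needed.
    """
    medians = [median(col) for col in zip(*descriptors)]
    partitions: Dict[Tuple[int, ...], List[int]] = defaultdict(list)
    for index, row in enumerate(descriptors):
        code = tuple(int(v > m) for v, m in zip(row, medians))
        partitions[code].append(index)
    return dict(partitions)


def pick_representatives(partitions: Dict[Tuple[int, ...], List[int]]) -> List[int]:
    """Pick one molecule per partition (here simply the first member)."""
    return sorted(members[0] for members in partitions.values())


# Hypothetical pool: 8 molecules described by 2 weakly correlated descriptors.
pool = [
    [1.0, 10.0], [2.0, 80.0], [3.0, 20.0], [4.0, 70.0],
    [5.0, 30.0], [6.0, 60.0], [7.0, 40.0], [8.0, 50.0],
]
parts = median_partition(pool)
subset = pick_representatives(parts)
print(parts, subset)
```

With weakly correlated descriptors the partitions come out roughly equally populated, as in this toy pool, where two descriptors give four partitions of two molecules each.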


Journal of Chemical Information and Modeling | 2008

Design and Exploration of Target-Selective Chemical Space Representations

Ingo Vogt; Juergen Bajorath

We report the design of target-selective chemical spaces using CA-DynaMAD, a mapping algorithm that generates and navigates flexible space representations for the identification of active or selective compounds. The algorithm iteratively increases the dimensionality of reference spaces in a controlled manner by evaluating a single descriptor per iteration. For seven sets of closely related biogenic amine G protein coupled receptor (GPCR) antagonists with different selectivity, target-selective reference spaces were designed and used to identify selective compounds by screening a biologically annotated database. Combinations of descriptors that constitute target-selective reference spaces identified with CA-DynaMAD can also be used to build other computational models for the prediction of compound selectivity.


Journal of Chemical Information and Modeling | 2008

Core trees and consensus fragment sequences for molecular representation and similarity analysis.

Eugen Lounkine; Juergen Bajorath

A new type of molecular representation is introduced that is based on activity class characteristic substructures extracted from random fragment populations. Mapping of characteristic substructures is used to determine atom match rates in active molecules. Comparison of match rates of bonded atoms defines a hierarchical molecular fragmentation scheme. Active compounds are encoded as fragmentation pathways isolated from core trees. These paths are amenable to biological sequence alignment methods in combination with substructure-based scoring functions. From multiple core path alignments, consensus fragment sequences are derived that represent compound activity classes. Consensus fragment sequences weighted by increasing structural specificity can also be used to map molecules and search databases for active compounds.

Collaboration


Dive into Juergen Bajorath's collaborations.

Top Co-Authors

Hanna Eckert

Center for Information Technology

Ingo Vogt

Center for Information Technology

John R. Furr

Albany Molecular Research
