Swarit Jasial
University of Bonn
Publications
Featured research published by Swarit Jasial.
PLOS ONE | 2016
Swarit Jasial; Ye Hu; Jürgen Bajorath
In the context of polypharmacology, an emerging concept in drug discovery, promiscuity is rationalized as the ability of compounds to specifically interact with multiple targets. Promiscuity of drugs and bioactive compounds has thus far been analyzed computationally on the basis of activity annotations, without taking assay frequencies or inactivity records into account. Most recent estimates have indicated that bioactive compounds interact on average with only one to two targets, whereas drugs interact with six or more. In this study, we have further extended promiscuity analysis by identifying the most extensively assayed public domain compounds and systematically determining their promiscuity. These compounds were tested in hundreds of assays against hundreds of targets. In our analysis, assay promiscuity was distinguished from target promiscuity and separately analyzed for primary and confirmatory assays. Differences between the degrees of assay and target promiscuity were surprisingly small. Average degrees of target promiscuity of 2.6 to 3.4 and a median degree of 2.0 were determined. Thus, target promiscuity remained at a low level even for the most extensively tested active compounds. These findings provide further evidence that bioactive compounds are less promiscuous than drugs and have implications for pharmaceutical research. In addition to a possible explanation that drugs are more extensively tested for additional targets, the results would also support a “promiscuity enrichment model” according to which promiscuous compounds might be preferentially selected for therapeutic efficacy during clinical evaluation to ultimately become drugs.
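The distinction between assay and target promiscuity drawn in this abstract can be illustrated with a minimal sketch. The record layout (compound, assay, target, activity flag) and the function name `promiscuity_degrees` are assumptions made for illustration and are not taken from the publication.

```python
from collections import defaultdict
from statistics import mean, median


def promiscuity_degrees(records):
    """Return {compound: (assay promiscuity, target promiscuity)}.

    records: iterable of (compound_id, assay_id, target_id, is_active).
    The tuple layout is a hypothetical one chosen for this sketch.
    Only active records contribute; inactive records are skipped.
    """
    assays = defaultdict(set)
    targets = defaultdict(set)
    for compound, assay, target, active in records:
        if active:
            assays[compound].add(assay)
            targets[compound].add(target)
    return {c: (len(assays[c]), len(targets[c])) for c in assays}


# Toy data: c1 is active in two assays against the same target,
# c2 is active in three assays against three different targets.
records = [
    ("c1", "a1", "t1", True),
    ("c1", "a2", "t1", True),
    ("c1", "a3", "t2", False),
    ("c2", "a1", "t1", True),
    ("c2", "a4", "t3", True),
    ("c2", "a5", "t4", True),
]
degrees = promiscuity_degrees(records)
target_degrees = [t for _, t in degrees.values()]
print(degrees)
print("mean:", mean(target_degrees), "median:", median(target_degrees))
```

Counting distinct assays and distinct targets separately shows how a compound can be active in several assays while still hitting a single target.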
Journal of Medicinal Chemistry | 2017
Swarit Jasial; Ye Hu; Jürgen Bajorath
Undetected pan-assay interference compounds (PAINS) with false-positive activities in assays often propagate through medicinal chemistry programs and compromise their outcomes. Although a large number of PAINS have been classified, often on the basis of individual studies or chemical experience, little has been done so far to systematically assess their activity profiles. Herein we report a large-scale analysis of the behavior of PAINS in biological screening assays. More than 23 000 extensively tested compounds containing PAINS substructures were detected, and their hit rates were determined. Many consistently inactive compounds were identified. The hit frequency was low overall, with median values of two to five hits for PAINS tested in hundreds of assays. Only confined subsets of PAINS produced abundant hits. The same PAINS substructure was often found in consistently inactive and frequently active compounds, indicating that the structural context in which PAINS occur modulates their effects.
F1000Research | 2016
Swarit Jasial; Ye Hu; Martin Vogt; Jürgen Bajorath
A largely unsolved problem in chemoinformatics is the issue of how calculated compound similarity relates to activity similarity, which is central to many applications. In general, activity relationships are predicted from calculated similarity values. However, there is no solid scientific foundation to bridge between calculated molecular and observed activity similarity. Accordingly, the success rate of identifying new active compounds by similarity searching is limited. Although various attempts have been made to establish relationships between calculated fingerprint similarity values and biological activities, none of these has yielded generally applicable rules for similarity searching. In this study, we have addressed the question of molecular versus activity similarity in a more fundamental way. First, we have evaluated if activity-relevant similarity value ranges could in principle be identified for standard fingerprints and distinguished from similarity resulting from random compound comparisons. Then, we have analyzed if activity-relevant similarity values could be used to guide typical similarity search calculations aiming to identify active compounds in databases. It was found that activity-relevant similarity values can be identified as a characteristic feature of fingerprints. However, it was also shown that such values cannot be reliably used as thresholds for practical similarity search calculations. In addition, the analysis presented herein helped to rationalize differences in fingerprint search performance.
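As a rough illustration of threshold-based similarity searching, the sketch below compares fingerprints represented as sets of on-bit indices. The use of the Tanimoto coefficient, the toy fingerprints, and the threshold value are assumptions made for this example; the abstract does not name a specific similarity metric or threshold.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(query_fp, database, threshold):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: -pair[1])


# Hypothetical fingerprints; real ones would come from a cheminformatics toolkit.
database = {
    "cpd_A": {1, 2, 3, 4},
    "cpd_B": {2, 3, 4, 5},
    "cpd_C": {7, 8, 9},
}
query = {1, 2, 3, 4}
hits = similarity_search(query, database, threshold=0.5)
print(hits)
```

The abstract's conclusion is that a fixed cut-off of this kind cannot be relied on in practice, so the threshold here should be read as a parameter to explore and not as a recommended value.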
F1000Research | 2015
Ye Hu; Swarit Jasial; Jürgen Bajorath
In the context of polypharmacology, compound promiscuity is rationalized as the ability of small molecules to specifically interact with multiple targets. To study promiscuity progression of bioactive compounds in detail, nearly 1 million compounds and more than 5.2 million activity records were analyzed. Compound sets were assembled by applying different data confidence criteria and selecting compounds with activity histories over many years. On the basis of release dates, compounds and activity records were organized on a time course, which ultimately enabled monitoring data growth and promiscuity progression over nearly 40 years, beginning in 1976. Surprisingly low degrees of promiscuity were consistently detected for all compound sets and there were only small increases in promiscuity over time. In fact, most compounds had a constant degree of promiscuity, including compounds with an activity history of 10 or 20 years. Moreover, during periods of massive data growth, beginning in 2007, promiscuity degrees also remained constant or displayed only minor increases, depending on the activity data confidence levels. Considering high-confidence data, bioactive compounds currently interact with 1.5 targets on average, regardless of their origins, and display essentially constant degrees of promiscuity over time. Taken together, our findings provide expectation values for promiscuity progression and magnitudes among bioactive compounds as activity data further grow.
AAPS Journal | 2017
Ye Hu; Swarit Jasial; Erik Gilberg; Jürgen Bajorath
Publicly available screening data were systematically searched for extensively assayed structural analogs with large differences in the number of targets they were active against. Screening compounds with potential chemical liabilities that may give rise to assay artifacts were identified and excluded from the analysis. “Promiscuity cliffs” were frequently identified, defined here as pairs of structural analogs with a difference of at least 20 target annotations across all assays they were tested in. New assay indices were introduced to prioritize cliffs formed by screening compounds that were extensively tested in comparably large numbers of assays including many shared assays. In these cases, large differences in promiscuity degrees were not attributable to differences in assay frequency and/or lack of assay overlap. Such analog pairs have high priority for further exploring molecular origins of multi-target activities. Therefore, these promiscuity cliffs and associated target annotations are made freely available. The corresponding analogs often represent equally puzzling and interesting examples of structure-promiscuity relationships.
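The cliff criterion stated in this abstract (a difference of at least 20 target annotations between two structural analogs) can be sketched as a simple filter. How analog pairs are identified is not described here, so the sketch takes a precomputed pair list as input; the function name and data layout are hypothetical, and the assay indices mentioned in the abstract are not modeled.

```python
def promiscuity_cliffs(analog_pairs, target_counts, min_difference=20):
    """Return (compound_a, compound_b, difference) for qualifying analog pairs.

    analog_pairs: iterable of (compound_a, compound_b), assumed precomputed.
    target_counts: dict mapping compound -> number of targets it is active against.
    """
    cliffs = []
    for a, b in analog_pairs:
        difference = abs(target_counts[a] - target_counts[b])
        if difference >= min_difference:
            cliffs.append((a, b, difference))
    return cliffs


# Toy data: only the first pair differs by 20 or more targets.
target_counts = {"x1": 25, "x2": 1, "x3": 10, "x4": 12}
analog_pairs = [("x1", "x2"), ("x3", "x4")]
cliffs = promiscuity_cliffs(analog_pairs, target_counts)
print(cliffs)
```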
ACS Omega | 2018
Martin Vogt; Swarit Jasial; Jürgen Bajorath
Compound profiling matrices record assay results for compound libraries tested against panels of targets. In addition to their relevance for exploring structure–activity relationships, such matrices are of considerable interest for chemoinformatic and chemogenomic applications. For example, profiling matrices provide a valuable data resource for the development and evaluation of machine learning approaches for multitask activity prediction. However, experimental compound profiling matrices are rare in the public domain. Although they are generated in pharmaceutical settings, they are typically not disclosed. Herein, we present an algorithm for the generation of large profiling matrices, for example, containing more than 100 000 compounds exhaustively tested against 50 to 100 targets. The new methodology is a variant of biclustering algorithms originally introduced for large-scale analysis of genomics data. Our approach is applied here to assays from the PubChem BioAssay database and generates profiling matrices of increasing assay or compound coverage by iterative removal of entities that limit coverage. Weight settings control final matrix size by preferentially retaining assays or compounds. In addition, the methodology can also be applied to generate matrices enriched with active entries representing above-average assay hit rates.
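The idea of reaching a fully tested matrix by iteratively removing entities that limit coverage can be sketched with a simple greedy loop. This is a simplified stand-in and not the published algorithm: the removal rule, the single `assay_weight` parameter, and the input layout are all assumptions made for illustration.

```python
def complete_submatrix(tested, assay_weight=1.0):
    """Greedily shrink a sparse compound/assay matrix until it is complete.

    tested: dict mapping compound -> set of assays in which it was tested.
    assay_weight: hypothetical weight; values above 1 favor retaining assays,
    values below 1 favor retaining compounds.
    Returns (compounds, assays) where every retained compound was tested
    in every retained assay.
    """
    compounds = set(tested)
    assays = set().union(*tested.values()) if tested else set()
    while compounds and assays:
        # Coverage of each compound over the retained assays, and vice versa.
        comp_cov = {c: len(tested[c] & assays) / len(assays) for c in compounds}
        assay_cov = {
            a: sum(a in tested[c] for c in compounds) / len(compounds)
            for a in assays
        }
        worst_c = min(sorted(compounds), key=comp_cov.get)
        worst_a = min(sorted(assays), key=assay_cov.get)
        if comp_cov[worst_c] == 1.0:
            break  # every remaining cell is tested
        # Remove whichever entity limits coverage more, after weighting.
        if comp_cov[worst_c] <= assay_cov[worst_a] * assay_weight:
            compounds.remove(worst_c)
        else:
            assays.remove(worst_a)
    return compounds, assays


tested = {
    "c1": {"a1", "a2", "a3"},
    "c2": {"a1", "a2", "a3"},
    "c3": {"a1", "a2"},
    "c4": {"a3"},
}
keep_assays = complete_submatrix(tested, assay_weight=1.0)
keep_compounds = complete_submatrix(tested, assay_weight=0.5)
print(keep_assays)
print(keep_compounds)
```

With the default weight the toy example keeps all three assays and two compounds; lowering the weight keeps three compounds and drops one assay, which mirrors the trade-off between assay and compound coverage described above.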
ACS Omega | 2018
Raquel Rodríguez-Pérez; Tomoyuki Miyao; Swarit Jasial; Martin Vogt; Jürgen Bajorath
Screening of compound libraries against panels of targets yields profiling matrices. Such matrices typically contain structurally diverse screening compounds, large numbers of inactives, and small numbers of hits per assay. As such, they represent interesting and challenging test cases for computational screening and activity predictions. In this work, we attempted to model large compound profiling matrices that were extracted from publicly available screening data. Different machine learning methods including deep learning were compared and different prediction strategies were explored. Prediction accuracy varied for assays with different numbers of active compounds, and alternative machine learning approaches often produced comparable results. Deep learning did not further increase the prediction accuracy of standard methods such as random forests or support vector machines. Target-based random forest models were prioritized and yielded successful predictions of active compounds for many assays.
Future Science OA | 2018
Martin Vogt; Swarit Jasial; Jürgen Bajorath
Aim: Screening of compounds against panels of targets yields profiling matrices. Such matrices are excellent test cases for the analysis and prediction of ligand–target interactions. We made three matrices freely available that were extracted from public screening data. Methodology: A new algorithm was used to derive complete profiling matrices from assay data. Data: Two profiling matrices were derived from confirmatory assays containing 53 different targets and 109,925 and 143,310 distinct compounds, respectively. A third matrix was extracted from primary screening assays covering 171 different targets and 224,251 compounds. Next steps: Profiling matrices can be used to test computational chemogenomics methods for their ability to predict ligand–target pairs. Additional matrices will be generated for individual target families.
Molecular Informatics | 2015
Swarit Jasial; Jenny Balfer; Martin Vogt; Jürgen Bajorath
Support vector machines (SVMs) are among the most popular machine learning methods for compound classification and other chemoinformatics tasks such as the prediction of ligand‐target pairs or compound activity profiles. Depending on the specific applications, different SVM strategies can be used. For example, in the context of potency‐directed virtual screening, linear combinations of multiple SVM models have been shown to enrich database selection sets with potent compounds compared to individual models. An open question concerning the use of SVM linear combinations (SVM‐LCs) is how to best weight the models on a relative scale. Typically, linear weights are subjectively set. Herein, preferred weighting factors for SVM‐LC were systematically determined. To this end, weights were treated as meta‐parameters and optimized by machine learning to enrich data set rankings with highly active compounds. The meta‐parameter approach has been applied to 10 screening data sets and found to further improve SVM performance over other SVM‐LCs and support vector regression (SVR) models. The results show that optimal weights depend on data set characteristics and chosen molecular representations. In addition, individual models often do not contribute to the performance of SVM‐LCs. Taken together, these findings emphasize the need for systematic meta‐parameter estimation.
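Treating linear-combination weights as tunable meta-parameters can be illustrated with a small sketch. It substitutes an exhaustive grid search for the machine learning optimization mentioned in the abstract, and it works on precomputed per-model scores instead of trained SVMs; the function names, the weight grid, and the top-k objective are all assumptions made for illustration.

```python
from itertools import product


def combined_scores(model_scores, weights):
    """Weighted sum of per-model scores for each compound."""
    return [sum(w * s for w, s in zip(weights, row)) for row in model_scores]


def potent_in_top_k(scores, is_potent, k):
    """Count potent compounds among the k highest-scoring compounds."""
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return sum(is_potent[i] for i in ranked[:k])


def grid_search_weights(model_scores, is_potent, k, grid=(0.0, 0.5, 1.0)):
    """Pick the weight tuple from a small grid that best enriches the top k."""
    n_models = len(model_scores[0])
    return max(
        product(grid, repeat=n_models),
        key=lambda w: potent_in_top_k(combined_scores(model_scores, w), is_potent, k),
    )


# Toy scores from two hypothetical models for four compounds.
model_scores = [(0.9, 0.1), (0.2, 0.8), (0.1, 0.9), (0.8, 0.2)]
is_potent = [False, True, True, False]
best = grid_search_weights(model_scores, is_potent, k=2)
print(best, potent_in_top_k(combined_scores(model_scores, best), is_potent, 2))
```

In this toy case the first model receives zero weight, which echoes the abstract's observation that individual models often do not contribute to the combination.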
Journal of Medicinal Chemistry | 2016
Erik Gilberg; Swarit Jasial; Dagmar Stumpfe; Dilyana Dimova; Jürgen Bajorath