Jörg K. Wegner
Tibotec
Publication
Featured research published by Jörg K. Wegner.
MedChemComm | 2011
Gerard J. P. van Westen; Jörg K. Wegner; Adriaan P. IJzerman; Herman W. T. van Vlijmen; Andreas Bender
‘Proteochemometric modeling’ is a bioactivity modeling technique founded on the description of both small molecules (the ligands) and proteins (the targets). By combining these two elements of a ligand-target interaction, proteochemometric techniques model the interaction complex or the full ligand-target interaction space, and they are able to quantify the similarity between both ligands and targets simultaneously. Consequently, proteochemometric models, or complex-based models, can be considered an extension of QSAR models, which are ligand-based. Because proteochemometric models are able to incorporate target information, they outperform conventional QSAR models when extrapolating from the activities of known ligands on known targets to novel targets. Vice versa, proteochemometrics can be used to virtually screen for selective compounds that are solely active on a single member of a subfamily of targets, as well as to select compounds with a desired bioactivity profile, a topic particularly relevant with concepts such as ‘ligand polypharmacology’ in mind. Here we illustrate the concept of proteochemometrics and provide a review of relevant methodological publications in the field. We give an overview of the target families proteochemometric modeling has previously been applied to, and introduce some novel application areas of the modeling technique. We conclude that proteochemometrics is a promising technique in preclinical drug research that allows merging data sets that were previously considered separately, with the potential to extrapolate more reliably in both ligand and target space.
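The core idea above, describing each data point by both its ligand and its target, can be illustrated with a minimal sketch. The function name `pcm_features`, the toy descriptor values, and the optional ligand-target cross-terms are assumptions made for illustration; the abstract does not state which descriptors or feature construction the authors used.

```python
from itertools import product


def pcm_features(ligand_desc, target_desc, cross_terms=True):
    """Build one feature vector for a single ligand-target pair.

    The vector concatenates the ligand descriptors and the target
    descriptors. If cross_terms is True, every ligand*target product is
    appended as well (an assumption; not taken from the abstract).
    """
    features = list(ligand_desc) + list(target_desc)
    if cross_terms:
        features += [l * t for l, t in product(ligand_desc, target_desc)]
    return features


# Toy, made-up descriptors: two ligands and two targets.
ligands = {"ligand_a": [1.0, 0.0], "ligand_b": [0.0, 1.0]}
targets = {"target_1": [0.5, 2.0, -1.0], "target_2": [0.4, 1.5, -0.5]}

# One row per ligand-target pair; a single model is then trained on all rows.
rows = {
    (lig, tgt): pcm_features(l_desc, t_desc)
    for lig, l_desc in ligands.items()
    for tgt, t_desc in targets.items()
}
```

Because every row carries target information, a model fitted on such rows can in principle be queried for a target it has not seen, provided that target's descriptors can be computed.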
PLOS ONE | 2011
Gerard J. P. van Westen; Jörg K. Wegner; Peggy Geluykens; Leen Kwanten; Inge Vereycken; Anik Peeters; Adriaan P. IJzerman; Herman W. T. van Vlijmen; Andreas Bender
In quite a few diseases, drug resistance due to target variability poses a serious problem in pharmacotherapy. This is certainly true for HIV, and hence it is often unknown which drug is best to use or to develop against an individual HIV strain. In this work we applied ‘proteochemometric’ modeling of HIV non-nucleoside reverse transcriptase inhibitors (NNRTIs) to support preclinical development by predicting compound performance on multiple mutants in the lead selection stage. Proteochemometric models are based on both small molecule and target properties and can thus capture multi-target activity relationships simultaneously, the targets in this case being a set of 14 HIV reverse transcriptase (RT) mutants. We validated our model by experimentally confirming model predictions for 317 untested compound-mutant pairs, with a prediction error comparable with assay variability (RMSE 0.62). Furthermore, depending on the similarity of a new mutant to the training set, we could predict with high accuracy which compound will be most effective on a sequence with a previously unknown genotype. Hence, our models allow the evaluation of compound performance on untested sequences and the selection of the most promising leads for further preclinical research. The modeling concept is likely to be applicable to other target families with genetic variability, such as other viruses or bacteria, or with similar orthologs, such as GPCRs.
Journal of Medicinal Chemistry | 2012
Gerard J. P. van Westen; Olaf O. van den Hoven; Rianne van der Pijl; Thea Mulder-Krieger; Henk de Vries; Jörg K. Wegner; Adriaan P. IJzerman; Herman W. T. van Vlijmen; Andreas Bender
The four subtypes of adenosine receptors form relevant drug targets in the treatment of, e.g., diabetes and Parkinson's disease. In the present study, we aimed at finding novel small molecule ligands for these receptors using virtual screening approaches based on proteochemometric (PCM) modeling. We combined bioactivity data from all human and rat receptors in order to widen the available chemical space. After training and validating a proteochemometric model on this combined data set (Q² of 0.73, RMSE of 0.61), we virtually screened a vendor database of 100,910 compounds. Of 54 compounds purchased, six novel high-affinity adenosine receptor ligands were confirmed experimentally, one of which displayed an affinity of 7 nM on the human adenosine A₁ receptor. We conclude that the combination of rat and human data performs better than human data only. Furthermore, we conclude that proteochemometric modeling is an efficient method to quickly screen for novel bioactive compounds.
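The two validation metrics quoted above can be computed as in the following minimal sketch. It assumes the common definitions RMSE = sqrt(mean squared error) and Q² = 1 − PRESS / SS_total over held-out predictions; the abstract does not say which variant of Q² the authors used, and the function names are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def q_squared(observed, predicted):
    """Q^2 = 1 - PRESS / SS_total, computed on held-out predictions.

    Assumption: SS_total uses the mean of the observed held-out values;
    some definitions use the training-set mean instead.
    """
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total
```

A model that always predicts the mean of the observed values scores a Q² of 0 under this definition, and a perfect model scores 1.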
PLOS Computational Biology | 2013
Gerard J. P. van Westen; Alwin Hendriks; Jörg K. Wegner; Adriaan P. IJzerman; Herman W. T. van Vlijmen; Andreas Bender
Infection with HIV cannot currently be cured; however, it can be controlled by combination treatment with multiple anti-retroviral drugs. Given different viral genotypes for virtually each individual patient, the question now arises which drug combination to use to achieve effective treatment. With the availability of viral genotypic data and clinical phenotypic data, it has become possible to create computational models able to predict an optimal treatment regimen for an individual patient. Current models are based only on sequence data derived from viral genotyping; chemical similarity of drugs is not considered. To explore the added value of including chemical similarity, we applied proteochemometric (PCM) models, combining chemical and protein target properties in a single bioactivity model. Our dataset was a large-scale clinical database of genotypic and phenotypic information (in total ca. 300,000 drug-mutant bioactivity data points; 4 (NNRTI), 8 (NRTI) or 9 (PI) drugs; and 10,700 (NNRTI), 10,500 (NRTI) or 27,000 (PI) mutants). Our models achieved a prediction error below 0.5 Log Fold Change. Moreover, when directly compared with previously published models derived from sequence data, PCM performed better in resistance classification and prediction of Log Fold Change (0.76 log units versus 0.91). Furthermore, we were able to confirm known, and identify previously unpublished, resistance-conferring mutations of HIV Reverse Transcriptase (e.g. K102Y, T216M) and HIV Protease (e.g. Q18N, N88G) from our dataset. Finally, we applied our models prospectively to the public HIV resistance database from Stanford University, obtaining a correct resistance prediction rate of 84% on the full set (compared to 80% in previous work on a high-quality subset). We conclude that proteochemometric models are able to accurately predict phenotypic resistance based on genotypic data, even for novel mutants and mixtures. Furthermore, we add an applicability domain to the prediction, informing the user about the reliability of predictions.
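As a rough illustration of the quantity being predicted, the sketch below computes a log fold change and turns it into a resistance call. It assumes the usual definition of fold change as the ratio of the mutant IC50 to the wild-type IC50; the cutoff of 1.0 log units and the function names are hypothetical and are not taken from the abstract.

```python
import math


def log_fold_change(ic50_mutant, ic50_wild_type):
    """log10 of the ratio between mutant and wild-type IC50 values.

    Both values must be positive and expressed in the same unit.
    """
    return math.log10(ic50_mutant / ic50_wild_type)


def is_resistant(lfc, cutoff=1.0):
    """Classify a mutant as resistant when its log fold change reaches
    the cutoff. The default cutoff is a hypothetical placeholder; real
    cutoffs are drug-specific.
    """
    return lfc >= cutoff


# Example: a mutant needing 100 nM where the wild type needs 10 nM.
example_lfc = log_fold_change(100.0, 10.0)
```

A regression model predicts the log fold change directly, while the classification results in the abstract correspond to thresholding that value per drug.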
Journal of Cheminformatics | 2013
Gerard J. P. van Westen; Remco F. Swier; Isidro Cortes-Ciriano; Jörg K. Wegner; John P. Overington; Adriaan P. IJzerman; Herman W. T. van Vlijmen; Andreas Bender
Background: While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability to establish bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, and a novel protein descriptor set termed ProtFP (4 variants); in addition, we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants.
Results: The amino acid descriptor sets compared here show similar performance (<0.1 log units RMSE difference and <0.1 difference in MCC), while errors for individual proteins were in some cases found to be larger than those resulting from descriptor set differences (>0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-scales (3) combined with an average Z-scale value for each target, while ProtFP (PCA8), ST-scales, and ProtFP (Feature) rank last.
Conclusions: While amino acid descriptor sets capture different aspects of amino acids, their ability to be used for bioactivity modeling is, on average, still surprisingly similar. Still, combining sets describing complementary information leads to a small but consistent improvement in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower). Finally, performance differences exist between the targets compared, underlining that choosing an appropriate descriptor set is of fundamental importance for bioactivity modeling, from both the ligand and the protein side.
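The classification results above are reported as MCC (Matthews correlation coefficient). A minimal sketch of the standard formula from confusion-matrix counts follows; returning 0.0 when the denominator is zero is a common convention that is assumed here, and the function name is illustrative.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than
    chance) to +1 (perfect prediction). Returns 0.0 when any marginal
    count is zero, which makes the formula undefined.
    """
    denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

On this scale, the reported 0.01 average gain from combining descriptor sets is small compared with the per-protein differences of more than 0.7.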
Journal of Cheminformatics | 2013
Gerard J. P. van Westen; Remco F. Swier; Jörg K. Wegner; Adriaan P. IJzerman; Herman W. T. van Vlijmen; Andreas Bender
Background: While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 different protein descriptor sets have been compared with respect to their behavior in perceiving similarities between amino acids. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI and BLOSUM, and a novel protein descriptor set termed ProtFP (4 variants). We investigate to what extent descriptor sets show collinear as well as orthogonal behavior via principal component analysis (PCA).
Results: In describing amino acid similarities, MS-WHIM, T-scales and ST-scales show related behavior, as do the VHSE, FASGAI, and ProtFP (PCA3) descriptor sets. Conversely, the ProtFP (PCA5), ProtFP (PCA8), Z-scales (Binned), and BLOSUM descriptor sets show behavior that is distinct from one another as well as from both of the clusters above. Generally, the use of more principal components (>3 per amino acid, per descriptor) leads to significant differences in the way amino acids are described, even though the later principal components capture less variation per component of the original input data.
Conclusion: In this work a comparison is provided of how similarly (and differently) currently available amino acid descriptor sets behave when converting structure to property space. The results obtained enable molecular modelers to select suitable amino acid descriptor sets for structure-activity analyses, e.g. those showing complementary behavior.
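To make the PCA step concrete, here is a minimal standard-library sketch that extracts the first principal component of a small descriptor matrix by power iteration and reports the fraction of variance it explains. Power iteration is used here only to keep the example dependency-free; it is not claimed to be the authors' implementation, and the function name and starting vector are illustrative choices.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, explained_variance_ratio) for the first PC.

    rows: list of equal-length numeric lists (one row per amino acid,
    one column per descriptor value).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Non-uniform start vector; it could still be orthogonal to the top
    # eigenvector for contrived inputs, which this sketch does not handle.
    vec = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue for the unit vector found.
    eigenvalue = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    ratio = eigenvalue / total_variance if total_variance else 0.0
    return vec, ratio
```

Descriptor sets whose amino acid coordinates line up along the same leading components would be called collinear in the sense used above; later components explain progressively less of the variance.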
Protein Science | 2010
Gerard J. P. van Westen; Jörg K. Wegner; Andreas Bender; Adriaan P. IJzerman; Herman W. T. van Vlijmen
In this work, we describe two novel approaches to utilize the dynamic structure information implicitly contained in large crystal structure data sets. The first approach visualizes both consistent as well as variable ligand‐induced changes in ligand‐bound compared with apo protein crystal structures. For this purpose, information was mined from B‐factors and ligand‐induced residue displacements in multiple crystal structures, minimizing experimental error and noise. With this approach, the mechanism of action of non‐nucleoside reverse transcriptase inhibitors (NNRTIs) as an inseparable combination of distortion of protein dynamics and conformational changes of HIV‐1 reverse transcriptase was corroborated (a combination of the previously proposed “molecular arthritis” and “distorted site” mechanisms). The second approach presented here uses “consensus structures” to map common binding features that are present in a set of structures of NNRTI‐bound HIV‐1 reverse transcriptase. Consensus structures are based on different levels of structural overlap of multiple crystal structures and are used to analyze protein–ligand interactions. The structures are shown to yield information about conserved hydrogen bonding interactions as well as binding‐pocket flexibility, shape, and volume. From the consensus structures, a common wild type NNRTI binding pocket emerges. Furthermore, we were able to identify a conserved backbone hydrogen bond acceptor at P236 and a novel hydrophobic subpocket, which are not yet utilized by current drugs. Our methods introduced here reinterpret the atom information and make use of the data variability by using multiple structures, complementing classical 3D structural information of single structures.
Journal of Cheminformatics | 2010
Gerard J. P. van Westen; Jörg K. Wegner; Adriaan P. IJzerman; Herman W. T. van Vlijmen; Andreas Bender
The early phases of drug discovery use in silico models to rationalize structure-activity relationships and to predict the activity of novel compounds. However, the performance of these models is not always acceptable, and the reliability of external predictions, both to novel compounds and to related protein targets, is often limited. Proteochemometric modeling [1] adds a target description, based on physicochemical properties of the binding site, to these models. Our proteochemometric models [2] are based on Scitegic circular fingerprints on the compound side and on a customized protein fingerprint on the target side. This protein fingerprint is based on a selection of physicochemical descriptors obtained from the AAindex database. Through PCA we selected a number of physicochemical properties, which are hashed into a fingerprint using the Scitegic hashing algorithm. We compared this fingerprint to a number of previously published protein descriptors, including the Z-scales, FASGAI and BLOSUM descriptors. Our fingerprint outperforms all of these. In addition, we show that proteochemometric models improve external prediction capabilities. In the case of classification, this leads to models with a higher specificity when compared to conventional QSAR. In the case of regression with a pIC50 output variable, our models show an RMSE that is on average 0.12 log units lower than that of conventional QSAR models of the same data set. Furthermore, our models enable target extrapolation. As a result, we can predict the activity of known and new compounds on new targets while retaining the same model quality as when performing external validation without target extrapolation.
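The idea of hashing per-residue physicochemical properties into a fingerprint can be sketched as follows. This is not the Scitegic hashing algorithm, whose details are not given here; SHA-256 from the standard library is swapped in purely for illustration, and the binning scheme, fingerprint length, and function name are all assumptions.

```python
import hashlib


def hashed_protein_fingerprint(residue_properties, n_bits=1024, n_bins=4):
    """Hash binned per-residue property values into a set of bit positions.

    residue_properties: one list per binding-site residue, holding that
    residue's property values scaled to the range [0, 1] (assumption).
    Returns the sorted list of bit indices that are switched on.
    """
    on_bits = set()
    for position, properties in enumerate(residue_properties):
        for prop_index, value in enumerate(properties):
            # Discretize the value; 1.0 falls into the last bin.
            bin_id = min(int(value * n_bins), n_bins - 1)
            token = f"{position}:{prop_index}:{bin_id}".encode()
            digest = hashlib.sha256(token).digest()
            on_bits.add(int.from_bytes(digest[:8], "big") % n_bits)
    return sorted(on_bits)


# Toy binding site with three residues and two made-up property values each.
site = [[0.10, 0.90], [0.55, 0.20], [1.00, 0.00]]
fingerprint = hashed_protein_fingerprint(site)
```

Two binding sites that share a residue position with the same binned property value set the same bit, so the overlap between fingerprints gives a crude measure of target similarity.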
Archive | 2007
Jörg K. Wegner; Herman Van Vlijmen; Carlo Willy Maurice Boutton
Virtual Screening: Principles, Challenges, and Practical Guidelines | 2011
Maxwell D. Cummings; Éric Arnoult; Christophe Buyck; Gary Tresadern; Ann Vos; Jörg K. Wegner