Publication


Featured research published by Sascha Urbaczek.


Journal of Cheminformatics | 2014

Protoss: a holistic approach to predict tautomers and protonation states in protein-ligand complexes

Stefan Bietz; Sascha Urbaczek; B. Schulz; Matthias Rarey

The calculation of hydrogen positions is a common preprocessing step when working with crystal structures of protein-ligand complexes. An explicit description of hydrogen atoms is generally needed in order to analyze the binding mode of particular ligands or to calculate the associated binding energies. Due to the large number of degrees of freedom resulting from different chemical moieties and the high degree of mutual dependence, this problem is anything but trivial. In addition to an efficient algorithm to take care of the complexity resulting from complicated hydrogen bonding networks, a robust chemical model is needed to describe effects such as tautomerism and ionization consistently. We present a novel method for the placement of hydrogen coordinates in protein-ligand complexes which takes tautomers and protonation states of both protein and ligand into account. Our method generates the most probable hydrogen positions on the basis of an optimal hydrogen bonding network using an empirical scoring function. The high quality of our results could be verified by comparison to the manually adjusted Astex diverse set and a remarkably low rate of undesirable hydrogen contacts compared to other tools.


Journal of Chemical Information and Modeling | 2014

Facing the challenges of structure-based target prediction by inverse virtual screening.

Karen T. Schomburg; Stefan Bietz; Hans Briem; Angela M. Henzler; Sascha Urbaczek; Matthias Rarey

Computational target prediction for bioactive compounds is a promising field in assessing off-target effects. Structure-based methods not only predict off-targets, but, simultaneously, binding modes, which are essential for understanding the mode of action and rationally designing selective compounds. Here, we highlight the current open challenges of computational target prediction methods based on protein structures and show why inverse screening rather than sequential pairwise protein-ligand docking methods are needed. A new inverse screening method based on triangle descriptors is introduced: iRAISE (inverse Rapid Index-based Screening Engine). A Scoring Cascade considering the reference ligand as well as the ligand and active site coverage is applied to overcome interprotein scoring noise of common protein-ligand scoring functions. Furthermore, a statistical evaluation of a score cutoff for each individual protein pocket is used. The ranking and binding mode prediction capabilities are evaluated on different datasets and compared to inverse docking and pharmacophore-based methods. On the Astex Diverse Set, iRAISE ranks more than 35% of the targets to the first position and predicts more than 80% of the binding modes with a root-mean-square deviation (RMSD) accuracy of <2.0 Å. With a median computing time of 5 s per protein, large amounts of protein structures can be screened rapidly. On a test set with 7915 protein structures and 117 query ligands, iRAISE predicts the first true positive in a ranked list among the top eight ranks (median), i.e., among 0.28% of the targets.
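The RMSD criterion quoted in the abstract above can be illustrated with a minimal sketch. It assumes two poses of the same ligand with atoms listed in the same order and already expressed in the same coordinate frame (no superposition and no symmetry correction). The function name `rmsd` and the example coordinates are hypothetical and are not taken from iRAISE.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two equally ordered coordinate lists (in Å)."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))

# Hypothetical three-atom poses; the second is the first shifted by 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(x + 1.0, y, z) for x, y, z in reference]

# A pose counts as correctly predicted here if it falls below the 2.0 Å cutoff
# quoted in the abstract.
is_success = rmsd(reference, predicted) < 2.0
```

A rigid shift of every atom by 1 Å gives an RMSD of exactly 1 Å, so this example pose would pass the cutoff.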


Journal of Chemical Information and Modeling | 2011

NAOMI: on the almost trivial task of reading molecules from different file formats.

Sascha Urbaczek; Adrian Kolodzik; J. Robert Fischer; Tobias Lippert; Stefan Heuser; Inken Groth; Tanja Schulz-Gasch; Matthias Rarey

In most cheminformatics workflows, chemical information is stored in files which provide the necessary data for subsequent calculations. The correct interpretation of the file formats is an important prerequisite to obtain meaningful results. Consistent reading of molecules from files, however, is not an easy task. Each file format implicitly represents an underlying chemical model, which has to be taken into consideration when the input data is processed. Additionally, many data sources contain invalid molecules. These have to be identified and either corrected or discarded. We present the chemical file format converter NAOMI, which provides efficient procedures for reliable handling of molecules from the common chemical file formats SDF, MOL2, and SMILES. These procedures are based on a consistent chemical model which has been designed for the appropriate representation of molecules relevant in the context of drug discovery. NAOMI's functionality is tested by round-robin file I/O exercises with public data sets, which we believe should become a standard test for every cheminformatics tool.


Journal of Chemical Information and Modeling | 2013

Fast Protein Binding Site Comparison via an Index-Based Screening Technology

Mathias M. von Behren; Andrea Volkamer; Angela M. Henzler; Karen T. Schomburg; Sascha Urbaczek; Matthias Rarey

We present TrixP, a new index-based method for fast protein binding site comparison and function prediction. TrixP determines binding site similarities based on the comparison of descriptors that encode pharmacophoric and spatial features. Therefore, it adopts the efficient core components of TrixX, a structure-based virtual screening technology for large compound libraries. TrixP expands this technology by new components in order to allow a screening of protein libraries. TrixP accounts for the inherent flexibility of proteins employing a partial shape matching routine. After the identification of structures with matching pharmacophoric features and geometric shape, TrixP superimposes the binding sites and, finally, assesses their similarity according to the fit of pharmacophoric properties. TrixP is able to find analogies between closely and distantly related binding sites. Recovery rates of 81.8% for similar binding site pairs, assisted by rejecting rates of 99.5% for dissimilar pairs on a test data set containing 1331 pairs, confirm this ability. TrixP exclusively identifies members of the same protein family on top ranking positions out of a library consisting of 9802 binding sites. Furthermore, 30 predicted kinase binding sites can almost perfectly be classified into their known subfamilies.


Journal of Cheminformatics | 2013

MONA – Interactive manipulation of molecule collections

Matthias Hilbig; Sascha Urbaczek; Inken Groth; Stefan Heuser; Matthias Rarey

Working with small-molecule datasets is a routine task for cheminformaticians and chemists. The analysis and comparison of vendor catalogues and the compilation of promising candidates as starting points for screening campaigns are but a few very common applications. The workflows applied for this purpose usually consist of multiple basic cheminformatics tasks such as checking for duplicates or filtering by physico-chemical properties. Pipelining tools allow to create and change such workflows without much effort, but usually do not support interventions once the pipeline has been started. In many contexts, however, the best suited workflow is not known in advance, thus making it necessary to take the results of the previous steps into consideration before proceeding. To support intuition-driven processing of compound collections, we developed MONA, an interactive tool that has been designed to prepare and visualize large small-molecule datasets. Using an SQL database common cheminformatics tasks such as analysis and filtering can be performed interactively with various methods for visual support. Great care was taken in creating a simple, intuitive user interface which can be instantly used without any setup steps. MONA combines the interactivity of molecule database systems with the simplicity of pipelining tools, thus enabling the case-to-case application of chemistry expert knowledge. The current version is available free of charge for academic use and can be downloaded at http://www.zbh.uni-hamburg.de/mona.


Journal of Chemical Information and Modeling | 2013

Reading PDB: perception of molecules from 3D atomic coordinates.

Sascha Urbaczek; Adrian Kolodzik; Inken Groth; Stefan Heuser; Matthias Rarey

The analysis of small molecule crystal structures is a common way to gather valuable information for drug development. The necessary structural data is usually provided in specific file formats containing only element identities and three-dimensional atomic coordinates as reliable chemical information. Consequently, the automated perception of molecular structures from atomic coordinates has become a standard task in cheminformatics. The molecules generated by such methods must be both chemically valid and reasonable to provide a reliable basis for subsequent calculations. This can be a difficult task since the provided coordinates may deviate from ideal molecular geometries due to experimental uncertainties or low resolution. Additionally, the quality of the input data often differs significantly thus making it difficult to distinguish between actual structural features and mere geometric distortions. We present a method for the generation of molecular structures from atomic coordinates based on the recently published NAOMI model. By making use of this consistent chemical description, our method is able to generate reliable results even with input data of low quality. Molecules from 363 Protein Data Bank (PDB) entries could be perceived with a success rate of 98%, a result which could not be achieved with previously described methods. The robustness of our approach has been assessed by processing all small molecules from the PDB and comparing them to reference structures. The complete data set can be processed in less than 3 min, thus showing that our approach is suitable for large scale applications.
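To make the perception task described above more concrete, here is a minimal sketch of the simplest common starting point: a distance-based connectivity heuristic. This is not the NAOMI-based method of the paper. The covalent radii are approximate literature values for a small illustrative subset of elements, and the 0.45 Å tolerance and the function name `perceive_bonds` are assumptions made for this example.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in Å (illustrative subset only).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

def perceive_bonds(atoms, tolerance=0.45):
    """Return index pairs of atoms closer than the sum of their radii plus a tolerance.

    atoms: list of (element_symbol, (x, y, z)) tuples with coordinates in Å.
    """
    bonds = []
    for (i, (elem_i, pos_i)), (j, (elem_j, pos_j)) in combinations(enumerate(atoms), 2):
        cutoff = COVALENT_RADII[elem_i] + COVALENT_RADII[elem_j] + tolerance
        if math.dist(pos_i, pos_j) <= cutoff:
            bonds.append((i, j))
    return bonds

# Hypothetical water geometry used purely as test input.
water = [
    ("O", (0.000, 0.000, 0.000)),
    ("H", (0.957, 0.000, 0.000)),
    ("H", (-0.240, 0.927, 0.000)),
]
water_bonds = perceive_bonds(water)
```

A heuristic like this yields connectivity only. It assigns no bond orders, charges, or aromaticity, and distorted or low-resolution coordinates can easily mislead a fixed cutoff, which is the difficulty the abstract points to.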


Journal of Chemical Information and Modeling | 2014

The Valence State Combination Model: A Generic Framework for Handling Tautomers and Protonation States

Sascha Urbaczek; Adrian Kolodzik; Matthias Rarey

The consistent handling of molecules is probably the most basic and important requirement in the field of cheminformatics. Reliable results can only be obtained if the underlying calculations are independent of the specific way molecules are represented in the input data. However, ensuring consistency is a complex task with many pitfalls, an important one being the fact that the same molecule can be represented by different valence bond structures. In order to achieve reliability, a cheminformatics system needs to solve two fundamental problems. First, different choices of valence bond structures must be identified as the same molecule. Second, for each molecule all valence bond structures relevant to the context must be taken into consideration. The latter is especially important with regard to tautomers and protonation states, as these have considerable influence on physicochemical properties of molecules. We present a comprehensive method for the rapid and consistent generation of reasonable tautomers and protonation states for molecules relevant in the context of drug design. This method is based on a generic scheme, the Valence State Combination Model, which has been designed for the enumeration and scoring of valence bond structures in large data sets. In order to ensure our method's consistency, we have developed procedures which can serve as a general validation scheme for similar approaches. The analysis of both the average number of generated structures and the associated runtimes shows that our method is perfectly suited for typical cheminformatics applications. By comparison with frequently used and curated public data sets, we can demonstrate that the tautomers and protonation states produced by our method are chemically reasonable.


Journal of Chemical Information and Modeling | 2012

Unique Ring Families: A Chemically Meaningful Description of Molecular Ring Topologies

Adrian Kolodzik; Sascha Urbaczek; Matthias Rarey

The perception of a set of rings forms the basis for a number of chemoinformatics applications, e.g. the systematic naming of compounds, the calculation of molecular descriptors, the matching of SMARTS expressions, and the generation of atomic coordinates. We introduce the concept of unique ring families (URFs) as an extension of the concept of relevant cycles (RCs). URFs are consistent for different atom orders and represent an intuitive description of the rings of a molecular graph. Furthermore, in contrast to RCs, URFs are polynomial in number. We provide an algorithm to efficiently calculate URFs in polynomial time and demonstrate their suitability for real-time applications by providing computing time benchmarks for the PubChem Database. URFs combine three important properties of chemical ring descriptions, for the first time, namely being unique, chemically meaningful, and efficient to compute. Therefore, URFs are a valuable alternative to the commonly used concept of the smallest set of smallest rings (SSSR) and would be suited to become the standard measure for ring topologies of small molecules.
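As background for the comparison with SSSR in the abstract above: the number of rings in any SSSR equals the cyclomatic number E − V + C of the molecular graph (bonds minus atoms plus connected components), a standard graph-theory result. The sketch below computes only this count. It does not compute URFs or relevant cycles, and the function name `cyclomatic_number` and the hand-written bond lists are illustrative assumptions.

```python
def cyclomatic_number(num_atoms, bonds):
    """Return E - V + C for an undirected graph.

    num_atoms: number of vertices, indexed 0..num_atoms-1.
    bonds: list of (a, b) index pairs, assumed to contain no duplicates.
    """
    parent = list(range(num_atoms))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_atoms
    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return len(bonds) - num_atoms + components

# Heavy-atom skeletons written out by hand for illustration.
benzene = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
cubane = [
    (0, 1), (1, 2), (2, 3), (3, 0),  # bottom face
    (4, 5), (5, 6), (6, 7), (7, 4),  # top face
    (0, 4), (1, 5), (2, 6), (3, 7),  # vertical edges
]
benzene_rings = cyclomatic_number(6, benzene)
cubane_rings = cyclomatic_number(8, cubane)
```

The cubane cage has six equivalent four-membered faces, yet its cyclomatic number is 5, so any SSSR has to leave one face out arbitrarily. This is the kind of non-uniqueness that motivates unique ring descriptions.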


Journal of Cheminformatics | 2012

A flexible-hydrogen interaction model for protein-ligand docking

Angela M. Henzler; Sascha Urbaczek; B. Schulz; Matthias Rarey

Although some docking methods accounting for protein flexibility exist, most large-scale virtual screening approaches work with rigid protein models. A first step towards flexibility integration is the consideration of degrees of freedom resulting from hydrogens, especially if they are involved in hydrogen bonding. To account for this type of flexibility, we present a flexible-hydrogen interaction model as part of a descriptor-based docking technique. The model discretizes interaction spheres of rigid and flexible hydrogen-bond donors and acceptors as interaction spots. A spot has an associated interaction direction which indicates hydrogen or lone pair orientation, and thus the potential location of a hydrogen-bond counterpart. This new flexible-hydrogen interaction model is combined with a novel approach to describe hydrophobic contacts. Both are introduced in our descriptor-based docking approach named TrixX [1]. TrixX handles ligand flexibility by applying a conformer ensemble approach [2]. The latter allows for the use of efficient indexing techniques upon virtual screening. The discretized, flexible-hydrogen model proposes potential hydrogen and lone pair positions. However, these proposals may still differ slightly from their actual locations, which can only be determined in the presence of pose and active site, i.e., after the docking stage. To allow a thorough assessment of hydrogen bonds, the predicted poses are therefore forwarded to an efficient post-optimization of the hydrogen-bond network. It optimally aligns hydrogens, identifies favorable tautomeric and protonation states, and evaluates the predicted pose [3]. Redocking of the Astex Diverse Set [4] shows that the described docking method produces results in good agreement with co-crystallized ligand structures. Several case studies using different levels of discretization and post-optimization illustrate the influence of our presented procedures on ligand placement and scoring. The studies highlight the impact of flexible hydrogens and lone pairs during docking and support the introduction of our flexible-hydrogen model.


Journal of Cheminformatics | 2011

Hydrogen placement in protein-ligand complexes under consideration of tautomerism

Stefan Bietz; Sascha Urbaczek; Matthias Rarey

Protein-ligand complexes are often consulted for the understanding of binding modes and mechanisms of action as well as the development of novel drugs. Unfortunately, the resolution of most X-ray structures is too low to resolve hydrogen atoms. However, hydrogen positions play a major role in the analysis of important interaction types such as hydrogen bonding or metal interactions. Therefore, it is important to predict the orientation of hydrogen-containing rotatable groups as well as sensible protonation and tautomeric states of both protein and ligand. While in most cases these degrees of freedom are still manageable for the protein and therefore incorporated in the models of the most common prediction tools for hydrogen placement [1-3], the consideration of different protonation states and tautomers of the ligand and their relative frequency can, due to chemical multiplicity and the physicochemical complexity of protonation and tautomeric equilibria, easily become a complicated problem. We present a new method for the prediction of hydrogen positions in protein-ligand complexes that considers tautomeric variability of the ligand in addition to common degrees of freedom. Beginning with a random tautomeric state, different reasonable tautomers of the ligand are enumerated and their relative stability is estimated on the basis of a heuristic scoring scheme. This tautomerism model is integrated in the hydrogen placement application Protoss [3]. Our approach permits an enhanced automatic prediction of hydrogen positions, especially for ligands that exhibit tautomers with similar stability but different interaction facilities. Furthermore, we were able to reproduce the ligand tautomers that were proposed in several studies of tautomerism preferences in protein-ligand complexes.

Collaboration


Dive into Sascha Urbaczek's collaborations.

Top Co-Authors

B. Schulz

University of Hamburg
