Publication


Featured research published by Alexander Tropsha.


Journal of Molecular Graphics & Modelling | 2002

Beware of q2!

Alexander Golbraikh; Alexander Tropsha

Validation is a crucial aspect of any quantitative structure-activity relationship (QSAR) modeling. This paper examines one of the most popular validation criteria, leave-one-out cross-validated R2 (LOO q2). Often, a high value of this statistical characteristic (q2 > 0.5) is considered proof of the high predictive ability of the model. In this paper, we show that this assumption is generally incorrect. In the case of 3D QSAR, the lack of correlation between a high LOO q2 and the high predictive ability of a QSAR model has been established earlier [Pharm. Acta Helv. 70 (1995) 149; J. Chemomet. 10 (1996) 95; J. Med. Chem. 41 (1998) 2553]. In this paper, we use two-dimensional (2D) molecular descriptors and the k nearest neighbors (kNN) QSAR method for the analysis of several datasets. No correlation between the values of q2 for the training set and predictive ability for the test set was found for any of the datasets. Thus, a high value of LOO q2 appears to be a necessary but not a sufficient condition for the model to have high predictive power. We argue that this is a general property of QSAR models developed using LOO cross-validation. We emphasize that external validation is the only way to establish a reliable QSAR model. We formulate a set of criteria for evaluating the predictive ability of QSAR models.
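
The statistic discussed in this abstract can be illustrated with a short sketch. It assumes the conventional definition q2 = 1 - PRESS / SS_tot, where each compound is predicted by a model refit without it; the abstract itself does not spell out the formula. The one-descriptor least-squares model, the function names `fit_line` and `loo_q2`, and the data are hypothetical stand-ins chosen for brevity, not the kNN method used in the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def loo_q2(xs, ys):
    """Leave-one-out cross-validated R2, assumed here as 1 - PRESS / SS_tot."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit the model on every compound except the i-th one.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        # Accumulate the squared error of predicting the held-out compound.
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Hypothetical descriptor values and activities, for illustration only.
    descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
    activity = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(loo_q2(descriptor, activity))
```

Because every held-out compound comes from the training set itself, a high value from `loo_q2` says nothing about compounds outside that set, which is why the abstract insists on a separate external test set.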


Molecular Informatics | 2010

Best Practices for QSAR Model Development, Validation, and Exploitation

Alexander Tropsha

After nearly five decades “in the making”, QSAR modeling has established itself as one of the major computational molecular modeling methodologies. Like any mature research discipline, QSAR modeling can be characterized by a collection of well-defined protocols and procedures that enable the expert application of the method for exploring and exploiting ever-growing collections of biologically active chemical compounds. This review examines the most critical QSAR modeling routines that we regard as best practices in the field. We discuss these procedures in the context of an integrative predictive QSAR modeling workflow that is focused on achieving models of the highest statistical rigor and external predictive power. Specific elements of the workflow consist of data preparation including chemical structure (and when possible, associated biological data) curation, outlier detection, dataset balancing, and model validation. We especially emphasize procedures used to validate models, both internally and externally, as well as the need to define model applicability domains that should be used when models are employed for the prediction of external compounds or compound libraries. Finally, we present several examples of successful applications of QSAR models for virtual screening to identify experimentally confirmed hits.


Journal of Computer-aided Molecular Design | 2003

Rational selection of training and test sets for the development of validated QSAR models.

Alexander Golbraikh; Min Shen; Zhiyan Xiao; Yun De Xiao; Kuo Hsiung Lee; Alexander Tropsha

Quantitative Structure–Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using the k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R2 (q2) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q2 for the training set and accuracy of prediction (R2) for the test set and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of the predictive power of QSAR models.


Journal of Medicinal Chemistry | 2014

QSAR Modeling: Where have you been? Where are you going to?

Artem Cherkasov; Eugene N. Muratov; Denis Fourches; Alexandre Varnek; I. I. Baskin; Mark T. D. Cronin; John C. Dearden; Paola Gramatica; Yvonne C. Martin; Roberto Todeschini; Viviana Consonni; Victor E. Kuz’min; Richard D. Cramer; Romualdo Benigni; Chihae Yang; James F. Rathman; Lothar Terfloth; Johann Gasteiger; Ann M. Richard; Alexander Tropsha

Quantitative structure-activity relationship modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its entire history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this paper, we discuss (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling. Throughout this discussion, we provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We hope that this Perspective will help communications between computational and experimental chemists toward collaborative development and use of QSAR models. We also believe that the guidelines presented here will help journal editors and reviewers apply more stringent scientific standards to manuscripts reporting new QSAR studies, as well as encourage the use of high quality, validated QSARs for regulatory decision making.


Journal of Chemical Information and Computer Sciences | 2000

Novel Variable Selection Quantitative Structure−Property Relationship Approach Based on the k-Nearest-Neighbor Principle

Weifan Zheng; Alexander Tropsha

A novel automated variable selection quantitative structure-activity relationship (QSAR) method, based on the k-nearest neighbor principle (kNN-QSAR), has been developed. The kNN-QSAR method explores formally the active analogue approach, which implies that similar compounds display similar profiles of pharmacological activities. The activity of each compound is predicted as the average activity of the k most chemically similar compounds from the data set. The robustness of a QSAR model is characterized by the value of cross-validated R2 (q2) using the leave-one-out cross-validation method. The chemical structures are characterized by multiple topological descriptors such as molecular connectivity indices or atom pairs. The chemical similarity is evaluated by Euclidean distances between compounds in multidimensional descriptor space, and the optimal subset of descriptors is selected using simulated annealing as a stochastic optimization algorithm. The application of the kNN-QSAR method to 58 estrogen receptor ligands as well as to several other groups of pharmacologically active compounds yielded QSAR models with q2 values of 0.6 or higher. Due to its relative simplicity, high degree of automation, nonlinear nature, and computational efficiency, this method could be applied routinely to a large variety of experimental data sets.
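
The prediction rule described in this abstract (average the activities of the nearest compounds by Euclidean distance in descriptor space, evaluated leave-one-out) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `knn_predict` and `loo_predictions` and the toy data are hypothetical, and the simulated-annealing descriptor selection that the method also performs is deliberately left out.

```python
import math


def knn_predict(query, descriptors, activities, k):
    """Predict activity as the mean activity of the k nearest compounds."""
    # Rank compounds by Euclidean distance to the query in descriptor space.
    ranked = sorted(
        range(len(descriptors)),
        key=lambda i: math.dist(query, descriptors[i]),
    )
    nearest = ranked[:k]
    return sum(activities[i] for i in nearest) / k


def loo_predictions(descriptors, activities, k):
    """Predict each compound from all the others (leave-one-out)."""
    predictions = []
    for i in range(len(descriptors)):
        other_descriptors = descriptors[:i] + descriptors[i + 1:]
        other_activities = activities[:i] + activities[i + 1:]
        predictions.append(
            knn_predict(descriptors[i], other_descriptors, other_activities, k)
        )
    return predictions


if __name__ == "__main__":
    # Hypothetical two-descriptor compounds and activities, for illustration.
    descriptors = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
    activities = [1.0, 2.0, 9.0, 10.0]
    print(knn_predict((0.1, 0.1), descriptors, activities, k=2))
    print(loo_predictions(descriptors, activities, k=1))
```

The leave-one-out predictions returned here are the values that would feed a q2 calculation; in the published method the descriptor subset, and possibly k, would be tuned to maximize that q2.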


Nature Medicine | 2004

Autoimmunity is triggered by cPR-3(105–201), a protein complementary to human autoantigen proteinase-3

William F. Pendergraft; Gloria A. Preston; Ruchir R. Shah; Alexander Tropsha; Charles W. Carter; J. Charles Jennette; Ronald J. Falk

It remains unclear how and why autoimmunity occurs. Here we show evidence for a previously unrecognized and possibly general mechanism of autoimmunity. This new finding was discovered serendipitously using material from patients with inflammatory vascular disease caused by antineutrophil cytoplasmic autoantibodies (ANCA) with specificity for proteinase-3 (PR-3). Such patients harbor not only antibodies to the autoantigen (PR-3), but also antibodies to a peptide translated from the antisense DNA strand of PR-3 (complementary PR-3, cPR-3) or to a mimic of this peptide. Immunization of mice with the middle region of cPR-3 resulted in production of antibodies not only to cPR-3, but also to the immunogen's sense peptide counterpart, PR-3. Both human and mouse antibodies to PR-3 and cPR-3 bound to each other, indicating idiotypic relationships. These findings indicate that autoimmunity can be initiated through an immune response against a peptide that is antisense or complementary to the autoantigen, which then induces anti-idiotypic antibodies (autoantibodies) that cross-react with the autoantigen.


Journal of Chemical Information and Modeling | 2010

Trust, But Verify: On the Importance of Chemical Structure Curation in Cheminformatics and QSAR Modeling Research

Denis Fourches; Eugene N. Muratov; Alexander Tropsha

Molecular modelers and cheminformaticians typically analyze experimental data generated by other scientists. Consequently, when it comes to data accuracy, cheminformaticians are always at the mercy of data providers who may inadvertently publish (partially) erroneous data. Thus, dataset curation is crucial for any cheminformatics analysis such as similarity searching, clustering, QSAR modeling, virtual screening, etc., especially now that the availability of chemical datasets in the public domain has skyrocketed. Despite the obvious importance of this preliminary step in the computational analysis of any dataset, there appears to be no commonly accepted guidance or set of procedures for chemical data curation. The main objective of this paper is to emphasize the need for a standardized chemical data curation strategy that should be followed at the onset of any molecular modeling investigation. Herein, we discuss several simple but important steps for cleaning chemical records in a database including the removal of a fraction of the data that cannot be appropriately handled by conventional cheminformatics techniques. Such steps include the removal of inorganic and organometallic compounds, counterions, salts and mixtures; structure validation; ring aromatization; normalization of specific chemotypes; curation of tautomeric forms; and the deletion of duplicates. To emphasize the importance of data curation as a mandatory step in data analysis, we discuss several case studies where chemical curation of the original “raw” database enabled a successful modeling study (specifically, QSAR analysis) or resulted in a significant improvement of model prediction accuracy. We also demonstrate that in some cases rigorously developed QSAR models could even be used to correct erroneous biological data associated with chemical compounds. We believe that good practices for curation of chemical records outlined in this paper will be of value to all scientists working in the fields of molecular modeling, cheminformatics, and QSAR studies.


Chemical Reviews | 2014

Chemical basis of interactions between engineered nanoparticles and biological systems.

Qingxin Mu; Guibin Jiang; Lingxin Chen; Hongyu Zhou; Denis Fourches; Alexander Tropsha; Bing Yan

As defined by the European Commission, a nanomaterial is a natural, incidental or manufactured material containing particles in an unbound state or as an aggregate or agglomerate in which ≥ 50% of the particles in the number size distribution have one or more external dimensions in the size range 1 to 100 nm. In specific cases and where warranted by concerns for the environment, health, safety or competition, the number size distribution threshold of 50% may be replaced with a threshold between 1 and 50% [1]. Engineered nanomaterials (ENMs) refer to man-made nanomaterials. Materials in the nanometer range often possess unique physical, optical, electronic, and biological properties compared with larger particles, such as the strength of graphene [2], the electronic properties of carbon nanotubes (CNTs) [3], the antibacterial activity of silver nanoparticles [4] and the optical properties of quantum dots (QDs) [5]. The unique and advanced properties of ENMs have led to a rapid increase in their application. These applications include aerospace and airplanes, energy, architecture, chemicals and coatings, catalysts, environmental protection, computer memory, biomedicine and consumer products. Driven by these demands, the worldwide ENM production volume in 2016 is conservatively estimated in a market report by Future Markets to be 44,267 tons or ≥ 5 billion [6].

As the production and applications of ENMs rapidly expand, their environmental impacts and effects on human health are becoming increasingly significant [7]. Due to their small sizes, ENMs are easily made airborne [8]. However, no accurate method to quantitatively measure their concentration in air currently exists. A recently reported incident of severe pulmonary fibrosis caused by inhaled polymer nanoparticles in seven female workers attracted much attention [9]. In addition to the release of ENM waste from industrial sites, a major release of ENMs to environmental water occurs due to home and personal use of appliances, cosmetics and personal products, such as shampoo and sunscreen [10]. Airborne and aqueous ENMs pose immediate danger to the human respiratory and gastrointestinal systems. ENMs may enter other human organs after they are absorbed into the bloodstream through the gastrointestinal or respiratory systems [11,12]. Furthermore, ENMs in cosmetics and personal care products, such as lotion, sunscreen and shampoo, may enter human circulation through skin penetration [13]. ENMs are very persistent in the environment and are slowly degraded. The dissolved metal ions from ENMs can also revert back to nanoparticles under natural conditions [14]. ENMs are stored in plants, microbes and animal organs and can be transferred and accumulated through the food chain [15,16]. In addition to the accidental entry of ENMs into human and biological systems, ENMs are also purposefully injected into or enter humans for medicinal and diagnostic purposes [17]. Therefore, interactions of ENMs with biological systems are inevitable.

In addition to engineered nanomaterials, there are also naturally existing nanomaterials such as proteins and DNA molecules, which are key components of biological systems. These materials, combined with lipids and organic and inorganic small molecules, form the basic units of living systems: cells [18]. To elucidate how nanomaterials affect organs and physiological functions, a thorough understanding of how nanomaterials perturb cells and biological molecules is required (Figure 1). Rapidly accumulating evidence indicates that ENMs interact with the basic components of biological systems, such as proteins, DNA molecules and cells [19-21]. The driving forces for such interactions are quite complex and include the size, shape and surface properties (e.g., hydrophobicity, hydrogen-bonding capability, pi-bonds and stereochemical interactions) of ENMs [22-25].

Figure 1. Interactions of nanoparticles with biological systems at different levels. Nanoparticles enter the human body through various pathways, reaching different organs and contacting tissues and cells. All of these interactions are based on nanoparticle-biomacromolecule ...

Evidence also indicates that chemical modifications on a nanoparticle’s surface alter its interactions with biological systems [26-28]. These observations not only support the hypothesis that basic nano-bio interactions are mainly physicochemical in nature but also provide a powerful approach to controlling the nature and strength of a nanoparticle’s interactions with biological systems. Practically, a thorough understanding of the fundamental chemical interactions between nanoparticles and biological systems has two direct impacts. First, this knowledge will encourage and assist experimental approaches to chemically modify nanoparticle surfaces for various industrial or medicinal applications. Second, a range of chemical information can be combined with computational methods to investigate nano-biological properties and predict desired nanoparticle properties to direct experiments [29-31]. The literature regarding nanoparticle-biological system interactions has increased exponentially in the past decade (Figure 2). However, a mechanistic understanding of the chemical basis for such complex interactions is still lacking. This review intends to explore such an understanding in the context of recent publications.

Figure 2. An analysis of literature statistics indicates growing concern for the topics that are the focus of this review. The number of publications and citations were obtained using the keywords “nanoparticles” and “biological systems” ...

A breakthrough technology cannot prosper without wide acceptance from the public and society; that is, it must pose minimal harm to human health and the environment. Nanotechnology is now facing such a critical challenge. We must elucidate the effects of ENMs on biological systems (such as biological molecules, human cells, organs and physiological systems). Accumulating experimental evidence suggests that nanoparticles interact with biological systems at nearly every level, often causing unwanted physiological consequences. Elucidating these interactions is the goal of this review. This endeavor will help regulate the proper application of ENMs in various products and their release into the environment. A more significant mission of this review is to direct the development of “safe-by-design” ENMs, as demand for them and their applications continue to increase.


ACS Nano | 2010

Quantitative Nanostructure−Activity Relationship Modeling

Denis Fourches; Dongqiuye Pu; Carlos Tassa; Ralph Weissleder; Stanley Y. Shaw; Russell J. Mumper; Alexander Tropsha

Evaluation of biological effects, both desired and undesired, caused by manufactured nanoparticles (MNPs) is of critical importance for nanotechnology. Experimental studies, especially toxicological, are time-consuming, costly, and often impractical, calling for the development of efficient computational approaches capable of predicting biological effects of MNPs. To this end, we have investigated the potential of cheminformatics methods such as quantitative structure-activity relationship (QSAR) modeling to establish statistically significant relationships between measured biological activity profiles of MNPs and their physical, chemical, and geometrical properties, either measured experimentally or computed from the structure of MNPs. To reflect the context of the study, we termed our approach quantitative nanostructure-activity relationship (QNAR) modeling. We have employed two representative sets of MNPs studied recently using in vitro cell-based assays: (i) 51 various MNPs with diverse metal cores (Proc. Natl. Acad. Sci. 2008, 105, 7387-7392) and (ii) 109 MNPs with similar core but diverse surface modifiers (Nat. Biotechnol. 2005, 23, 1418-1423). We have generated QNAR models using machine learning approaches such as support vector machine (SVM)-based classification and k nearest neighbors (kNN)-based regression; their external prediction power was shown to be as high as 73% for classification modeling, with an R2 of 0.72 for regression modeling. Our results suggest that QNAR models can be employed for: (i) predicting biological activity profiles of novel nanomaterials, and (ii) prioritizing the design and manufacturing of nanomaterials toward better and safer products.


Journal of Chemical Information and Modeling | 2008

Combinatorial QSAR Modeling of Chemical Toxicants Tested against Tetrahymena pyriformis

Hao Zhu; Alexander Tropsha; Denis Fourches; Alexandre Varnek; Ester Papa; Paola Gramatica; Tomas Öberg; Phuong Dao; Artem Cherkasov; Igor V. Tetko

Collaboration


Dive into Alexander Tropsha's collaborations.

Top Co-Authors

Denis Fourches
North Carolina State University

Alexander Golbraikh
University of North Carolina at Chapel Hill

Eugene N. Muratov
University of North Carolina at Chapel Hill

Alexander Sedykh
University of North Carolina at Chapel Hill

Weifan Zheng
University of North Carolina at Chapel Hill

Jun Huan
University of Kansas

Jan F. Prins
University of North Carolina at Chapel Hill

Stephen J. Capuzzi
University of North Carolina at Chapel Hill