Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Chihae Yang is active.

Publication


Featured research published by Chihae Yang.


Journal of Medicinal Chemistry | 2014

QSAR Modeling: Where have you been? Where are you going to?

Artem Cherkasov; Eugene N. Muratov; Denis Fourches; Alexandre Varnek; I. I. Baskin; Mark T. D. Cronin; John C. Dearden; Paola Gramatica; Yvonne C. Martin; Roberto Todeschini; Viviana Consonni; Victor E. Kuz’min; Richard D. Cramer; Romualdo Benigni; Chihae Yang; James F. Rathman; Lothar Terfloth; Johann Gasteiger; Ann M. Richard; Alexander Tropsha

Quantitative structure-activity relationship modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its entire history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this paper, we discuss (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling. Throughout this discussion, we provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We hope that this Perspective will help communications between computational and experimental chemists toward collaborative development and use of QSAR models. We also believe that the guidelines presented here will help journal editors and reviewers apply more stringent scientific standards to manuscripts reporting new QSAR studies, as well as encourage the use of high quality, validated QSARs for regulatory decision making.


Regulatory Toxicology and Pharmacology | 2009

Evaluation of high-throughput genotoxicity assays used in profiling the US EPA ToxCast™ chemicals

Andrew W. Knight; Stephen Little; Keith A. Houck; David J. Dix; Richard S. Judson; Ann M. Richard; Nancy McCarroll; Gregory S. Akerman; Chihae Yang; Louise Birrell; Richard M. Walmsley

Three high-throughput screening (HTS) genotoxicity assays, GreenScreen HC GADD45a-GFP (Gentronix Ltd.), CellCiphr p53 (Cellumen Inc.), and CellSensor p53RE-bla (Invitrogen Corp.), were used to analyze the collection of 320 predominantly pesticide active compounds being tested in Phase I of the U.S. Environmental Protection Agency's ToxCast research project. Between 9% and 12% of compounds were positive for genotoxicity in the assays. However, results of the various tests only partially overlapped, suggesting a strategy of combining data from a battery of assays. The HTS results were compared to mutagenicity (Ames) and animal tumorigenicity data. Overall, the HTS assays demonstrated low sensitivity for rodent tumorigens, likely due to screening at a low concentration, coverage of selected genotoxic mechanisms, lack of metabolic activation, and difficulty detecting non-genotoxic carcinogens. Conversely, HTS results demonstrated high specificity, >88%. Overall concordance of the HTS assays with tumorigenicity data was low, around 50% for all tumorigens, but increased to 74-78% (vs. 60% for Ames) for those compounds producing tumors in rodents at multiple sites and, thus, more likely genotoxic carcinogens. The aim of the present study was to evaluate the utility of HTS assays to identify potential genotoxicity hazard in the larger context of the ToxCast project, to aid prioritization of environmentally relevant chemicals for further testing and assessment of carcinogenicity risk to humans.
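The performance figures quoted in this abstract (sensitivity, specificity, concordance) all derive from a 2x2 confusion matrix. A minimal sketch of those calculations; the counts below are illustrative placeholders, not data from the ToxCast study:

```python
def classifier_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and overall concordance
    from the four cells of a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)                   # fraction of true positives detected
    specificity = tn / (tn + fp)                   # fraction of true negatives correctly cleared
    concordance = (tp + tn) / (tp + fn + tn + fp)  # overall agreement with the reference data
    return sensitivity, specificity, concordance

# Illustrative counts only, chosen to sum to 320 compounds:
sens, spec, conc = classifier_metrics(tp=12, fn=48, tn=230, fp=30)
```

Under these made-up counts the assay would show low sensitivity but high specificity, the qualitative pattern the study reports.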


Journal of Environmental Science and Health, Part C: Environmental Carcinogenesis & Ecotoxicology Reviews | 2009

Predictive models for carcinogenicity and mutagenicity: frameworks, state-of-the-art, and perspectives.

Emilio Benfenati; Romualdo Benigni; David M. DeMarini; C. Helma; D. Kirkland; Todd M. Martin; P. Mazzatorta; G. Ouédraogo-Arras; Ann M. Richard; B. Schilter; W. G. E. J. Schoonen; R. D. Snyder; Chihae Yang

Mutagenicity and carcinogenicity are endpoints of major environmental and regulatory concern. These endpoints are also important targets for development of alternative methods for screening and prediction due to the large number of chemicals of potential concern and the tremendous cost (in time, money, animals) of rodent carcinogenicity bioassays. Both mutagenicity and carcinogenicity involve complex, cellular processes that are only partially understood. Advances in technologies and generation of new data will permit a much deeper understanding. In silico methods for predicting mutagenicity and rodent carcinogenicity based on chemical structural features, along with current mutagenicity and carcinogenicity data sets, have performed well for local prediction (i.e., within specific chemical classes), but are less successful for global prediction (i.e., for a broad range of chemicals). The predictivity of in silico methods can be improved by improving the quality of the database and endpoints used for modeling. In particular, in vitro assays for clastogenicity need to be improved to reduce false positives (relative to rodent carcinogenicity) and to detect compounds that do not interact directly with DNA or have epigenetic activities. New assays emerging to complement or replace some of the standard assays include Vitotox™, GreenScreenGC, and RadarScreen. The needs of industry and regulators to assess thousands of compounds necessitate the development of high-throughput assays combined with innovative data-mining and in silico methods. Various initiatives in this regard have begun, including CAESAR, OSIRIS, CHEMOMENTUM, CHEMPREDICT, OpenTox, EPAA, and ToxCast™. In silico methods can be used for priority setting, mechanistic studies, and to estimate potency. Ultimately, such efforts should lead to improvements in application of in silico methods for predicting carcinogenicity to assist industry and regulators and to enhance protection of public health.


Toxicology Mechanisms and Methods | 2008

Combined Use of MC4PC, MDL-QSAR, BioEpisteme, Leadscope PDM, and Derek for Windows Software to Achieve High-Performance, High-Confidence, Mode of Action–Based Predictions of Chemical Carcinogenesis in Rodents

Edwin J. Matthews; Naomi L. Kruhlak; R. Daniel Benz; Joseph F. Contrera; Carol A. Marchant; Chihae Yang

This report describes a coordinated use of four quantitative structure-activity relationship (QSAR) programs and an expert knowledge base system to predict the occurrence and the mode of action of chemical carcinogenesis in rodents. QSAR models were based upon a weight-of-evidence paradigm of carcinogenic activity that was linked to chemical structures (n = 1,572). Identical training data sets were configured for four QSAR programs (MC4PC, MDL-QSAR, BioEpisteme, and Leadscope PDM), and QSAR models were constructed for the male rat, female rat, composite rat, male mouse, female mouse, composite mouse, and rodent composite endpoints. Model predictions were adjusted to favor high specificity (>80%). Performance was shown to be affected by the method used to score carcinogenicity study findings and the ratio of the number of active to inactive chemicals in the QSAR training data set. Results demonstrated that the four QSAR programs were complementary, each detecting different profiles of carcinogens. Accepting any positive prediction from two programs showed better overall performance than either of the single programs alone; specificity, sensitivity, and Chi-square values were 72.9%, 65.9%, and 223, respectively, compared to 84.5%, 45.8%, and 151. Accepting only consensus-positive predictions using any two programs had the best overall performance and higher confidence; specificity, sensitivity, and Chi-square values were 85.3%, 57.5%, and 287, respectively. Specific examples are provided to demonstrate that consensus-positive predictions of carcinogenicity by two QSAR programs identified both genotoxic and nongenotoxic carcinogens and that they detected 98.7% of the carcinogens linked in this study to Derek for Windows defined modes of action.
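The two decision rules compared in this abstract, accept any positive prediction from either of two programs versus accept only consensus positives, amount to a union versus an intersection of the per-program positive sets. A minimal sketch; the program outputs and chemical identifiers below are hypothetical placeholders, not the study's data:

```python
# Chemicals each (hypothetical) QSAR program flagged as rodent carcinogens:
program_a = {"chem1", "chem2", "chem3"}
program_b = {"chem2", "chem3", "chem4"}

def union_rule(pos_a, pos_b):
    """Accept any positive prediction from either program:
    higher sensitivity, lower specificity."""
    return pos_a | pos_b

def consensus_rule(pos_a, pos_b):
    """Accept only consensus-positive predictions (both programs agree):
    higher specificity and higher confidence."""
    return pos_a & pos_b

union_rule(program_a, program_b)      # {'chem1', 'chem2', 'chem3', 'chem4'}
consensus_rule(program_a, program_b)  # {'chem2', 'chem3'}
```

The trade-off the paper quantifies follows directly: the union can only grow the positive set (more detections, more false alarms), while the intersection can only shrink it.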


Toxicology Mechanisms and Methods | 2008

Toxicity Data Informatics: Supporting a New Paradigm for Toxicity Prediction

Ann M. Richard; Chihae Yang; Richard S. Judson

Chemical toxicity data at all levels of description, from treatment-level dose response data to a high-level summarized toxicity “endpoint,” effectively circumscribe, enable, and limit predictive toxicology approaches and capabilities. Several new and evolving public data initiatives focused on the world of chemical toxicity information—as represented here by ToxML (Toxicology XML standard), DSSTox (Distributed Structure-Searchable Toxicity Database Network), and ACToR (Aggregated Computational Toxicology Resource)—are contributing to the creation of a more unified, mineable, and modelable landscape of public toxicity data. These projects address different layers in the spectrum of toxicological data representation and detail and, additionally, span diverse domains of toxicology and chemistry in relation to industry and environmental regulatory concerns. For each of the three projects, data standards are the key to enabling “read-across” in relation to toxicity data and chemical-indexed information. In turn, “read-across” capability enables flexible data mining, as well as meaningful aggregation of lower levels of toxicity information to summarized, modelable endpoints spanning sufficient areas of chemical space for building predictive models. By means of shared data standards and transparent and flexible rules for data aggregation, these and related public data initiatives are effectively spanning the divides among experimental toxicologists, computational modelers, and the world of chemically indexed, publicly available toxicity information.


Toxicology Mechanisms and Methods | 2008

Understanding Genetic Toxicity Through Data Mining: The Process of Building Knowledge by Integrating Multiple Genetic Toxicity Databases

Chihae Yang; C. H. Hasselgren; S. Boyer; Kirk Arvidson; S. Aveston; P. Dierkes; Romualdo Benigni; R. D. Benz; Joseph F. Contrera; Naomi L. Kruhlak; Edwin J. Matthews; X. Han; J. Jaworska; R. A. Kemper; James F. Rathman; Ann M. Richard

Genetic toxicity data from various sources were integrated into a rigorously designed database using the ToxML schema. The public database sources include the U.S. Food and Drug Administration (FDA) submission data from approved new drug applications, food contact notifications, generally recognized as safe food ingredients, and chemicals from the NTP and CCRIS databases. The data from public sources were then combined with data from private industry according to ToxML criteria. The resulting “integrated” database, enriched in pharmaceuticals, was used for data mining analysis. Structural features describing the database were used to differentiate the chemical spaces of drugs/candidates, food ingredients, and industrial chemicals. In general, structures for drugs/candidates and food ingredients are associated with lower frequencies of mutagenicity and clastogenicity, whereas industrial chemicals as a group contain a much higher proportion of positives. Structural features were selected to analyze endpoint outcomes of the genetic toxicity studies. Although most of the well-known genotoxic carcinogenic alerts were identified, some discrepancies from the classic Ashby-Tennant alerts were observed. Using these influential features as the independent variables, the results of four types of genotoxicity studies were correlated. High Pearson correlations were found between the results of Salmonella mutagenicity and mouse lymphoma assay testing as well as those from in vitro chromosome aberration studies. This paper demonstrates the usefulness of representing a chemical by its structural features and the use of these features to profile a battery of tests rather than relying on a single toxicity test of a given chemical. This paper presents data mining/profiling methods applied in a weight-of-evidence approach to assess potential for genetic toxicity, and to guide the development of intelligent testing strategies.
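The cross-assay Pearson correlations described above can be computed directly from vectors of binary (positive/negative) assay calls. A stdlib-only sketch; the assay outcomes below are made up for illustration, not drawn from the integrated database:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical 0/1 (negative/positive) calls for eight chemicals in two assays:
ames      = [1, 1, 0, 0, 1, 0, 1, 0]
mouse_lym = [1, 1, 0, 1, 1, 0, 1, 0]
r = pearson(ames, mouse_lym)  # high positive correlation: calls mostly agree
```

For 0/1 endpoint vectors this is the phi coefficient, so "high Pearson correlation" here means the two assays tend to flag the same chemicals.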


Journal of Chemical Information and Modeling | 2015

New Publicly Available Chemical Query Language, CSRML, To Support Chemotype Representations for Application to Data Mining and Modeling

Chihae Yang; Aleksey Tarkhov; Jörg Marusczyk; Bruno Bienfait; Johann Gasteiger; Thomas Kleinoeder; Tomasz Magdziarz; Oliver Sacher; Christof H. Schwab; Johannes Schwoebel; Lothar Terfloth; Kirk Arvidson; Ann M. Richard; Andrew Worth; James F. Rathman

Chemotypes are a new approach for representing molecules, chemical substructures and patterns, reaction rules, and reactions. Chemotypes are capable of integrating types of information beyond what is possible using current representation methods (e.g., SMARTS patterns) or reaction transformations (e.g., SMIRKS, reaction SMILES). Chemotypes are expressed in the XML-based Chemical Subgraphs and Reactions Markup Language (CSRML), and can be encoded not only with connectivity and topology but also with properties of atoms, bonds, electronic systems, or molecules. CSRML has been developed in parallel with a public set of chemotypes, i.e., the ToxPrint chemotypes, which are designed to provide excellent coverage of environmental, regulatory, and commercial-use chemical space, as well as to represent chemical patterns and properties especially relevant to various toxicity concerns. A software application, ChemoTyper, has also been developed and made publicly available in order to enable chemotype searching and fingerprinting against a target structure set. The public ChemoTyper houses the ToxPrint chemotype CSRML dictionary, as well as a reference implementation, so that the query specifications may be adopted by other chemical structure knowledge systems. The full specifications of the XML-based CSRML standard used to express chemotypes are publicly available to facilitate and encourage the exchange of structural knowledge.
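Because CSRML is XML-based, a chemotype definition can be loaded with standard XML tooling. A minimal sketch of that idea; the element and attribute names below are invented placeholders, not the actual CSRML schema, whose full specification is published separately:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified chemotype-style fragment: an aromatic carbon
# bonded to a hydroxyl-type oxygen, with per-atom property attributes.
CHEMOTYPE_XML = """
<Chemotype id="example:phenol_like">
  <Atom id="a1" symbol="C" aromatic="true"/>
  <Atom id="a2" symbol="O" charge="0"/>
  <Bond from="a1" to="a2" order="1"/>
</Chemotype>
"""

root = ET.fromstring(CHEMOTYPE_XML)
# Recover the atom labels and the bond list from the markup:
atoms = {a.get("id"): a.get("symbol") for a in root.iter("Atom")}
bonds = [(b.get("from"), b.get("to")) for b in root.iter("Bond")]
```

The point of the XML encoding, as the abstract notes, is that atom- and bond-level properties travel with the pattern, which plain SMARTS strings cannot express as richly.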


Current Computer-Aided Drug Design | 2006

The Art of Data Mining the Minefields of Toxicity Databases to Link Chemistry to Biology

Chihae Yang; Ann M. Richard; Kevin P. Cross

Toxicity databases have a special role in predictive toxicology, providing ready access to historical information throughout the workflow of discovery, development, and product safety processes in drug development as well as in review by regulatory agencies. To provide accurate information within a hypothesis-building environment, the content of the databases needs to be rigorously modeled using standards and controlled vocabulary. The purposes of databases vary widely, ranging from a source of (Q)SAR datasets for modelers to a basis for read-across for regulators. Many tasks involved in the use of databases are closely tied to data mining, hence databases and data mining are an essential technology pair. To understand chemically induced toxicity, chemical structures must be integrated into the toxicity databases. Data mining these structure-integrated toxicity databases requires techniques for handling both chemical structures and textual toxicity information. Structure data mining is similar, with some modifications, to that conventionally employed for large chemical databases, while data mining of toxicity endpoints is not well developed. This review presents a general strategy to data mine structure-integrated toxicity databases to link chemical structures to biological endpoints. Iterative probing of the chemical domain with toxicity endpoint descriptors and the biological domain with chemical descriptors enables linking of the two domains. Data mining steps to elucidate the hidden relationships between the target organs and chemical classes are presented as an example. Work is in progress in the public domain toward the linking of chemistry to biology by providing databases that can be mined.


ALTEX-Alternatives to Animal Experimentation | 2012

Toxicology ontology perspectives.

Barry Hardy; Gordana Apic; Philip Carthew; Dominic Clark; David Cook; Ian Dix; Sylvia Escher; Janna Hastings; David J. Heard; Nina Jeliazkova; Philip Judson; Sherri Matis-Mitchell; Dragana Mitic; Glenn J. Myatt; Imran Shah; Ola Spjuth; Olga Tcheremenskaia; Luca Toldo; David Watson; Andrew White; Chihae Yang

The field of predictive toxicology requires the development of open, public, computable, standardized toxicology vocabularies and ontologies to support the applications required by in silico, in vitro, and in vivo toxicology methods and related analysis and reporting activities. In this article we review ontology developments based on a set of perspectives showing how ontologies are being used in predictive toxicology initiatives and applications. Perspectives on resources and initiatives reviewed include OpenTox, eTOX, Pistoia Alliance, ToxWiz, Virtual Liver, EU-ADR, BEL, ToxML, and Bioclipse. We also review existing ontology developments in neighboring fields that can contribute to establishing an ontological framework for predictive toxicology. A significant set of resources is already available to provide a foundation for an ontological framework for 21st-century, mechanism-based toxicology research. Ontologies such as ToxWiz provide a basis for application to toxicology investigations, whereas other ontologies under development in the biological, chemical, and biomedical communities could be incorporated in an extended future framework. OpenTox has provided a semantic web framework for the implementation of such ontologies into software applications and linked data resources. Bioclipse developers have shown the benefit of interoperability obtained through ontology by being able to link their workbench application with remote OpenTox web services. Although these developments are promising, an increased international coordination of efforts is greatly needed to develop a more unified, standardized, and open toxicology ontology framework.


Current Drug Discovery Technologies | 2004

Systematic Analysis of Large Screening Sets in Drug Discovery

Paul E. Blower; Kevin P. Cross; Michael A. Fligner; Glenn J. Myatt; Joseph S. Verducci; Chihae Yang

Each year large pharmaceutical companies produce massive amounts of primary screening data for lead discovery. To make better use of the vast amount of information in pharmaceutical databases, companies have begun to scrutinize the lead generation stage to ensure that more and better qualified lead series enter the downstream optimization and development stages. This article describes computational techniques for end-to-end analysis of large drug discovery screening sets. The analysis proceeds in three stages: In stage 1, the initial screening set is filtered to remove compounds that are unsuitable as lead compounds. In stage 2, local structural neighborhoods around active compound classes are identified, including similar but inactive compounds. In stage 3, the structure-activity relationships within local structural neighborhoods are analyzed. These processes are illustrated by analyzing two large, publicly available databases.
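The three stages described above can be sketched as a small pipeline. Everything below is a placeholder for the real cheminformatics steps: a precomputed lead-likeness flag stands in for property filters, and a scaffold key stands in for structural-similarity clustering:

```python
from collections import defaultdict

# Hypothetical screening records (toy data, not from the article):
compounds = [
    {"id": "c1", "scaffold": "indole",   "active": True,  "leadlike": True},
    {"id": "c2", "scaffold": "indole",   "active": False, "leadlike": True},
    {"id": "c3", "scaffold": "biphenyl", "active": True,  "leadlike": True},
    {"id": "c4", "scaffold": "biphenyl", "active": True,  "leadlike": False},
]

# Stage 1: filter out compounds unsuitable as leads.
survivors = [c for c in compounds if c["leadlike"]]

# Stage 2: group survivors into local structural neighborhoods,
# keeping actives and structurally similar inactives together.
neighborhoods = defaultdict(list)
for c in survivors:
    neighborhoods[c["scaffold"]].append(c)

# Stage 3: summarize structure-activity within each neighborhood
# (here, the fraction of actives per scaffold).
hit_rates = {k: sum(c["active"] for c in v) / len(v)
             for k, v in neighborhoods.items()}
```

Keeping the inactive neighbors in stage 2 is what makes the stage 3 hit rates meaningful: a neighborhood of all actives reads very differently from one where activity is rare.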

Collaboration


Dive into Chihae Yang's collaborations.

Top Co-Authors

Mark T. D. Cronin (Liverpool John Moores University)
Elena Fioravanzo (Liverpool John Moores University)
Andrew Worth (Liverpool John Moores University)
Ann M. Richard (United States Environmental Protection Agency)
Christof H. Schwab (University of Erlangen-Nuremberg)
Andrea-Nicole Richarz (Liverpool John Moores University)
Judith C. Madden (Liverpool John Moores University)
Kirk Arvidson (Center for Food Safety and Applied Nutrition)