Daniel P. Russo
Rutgers University
Publications
Featured research published by Daniel P. Russo.
ALTEX-Alternatives to Animal Experimentation | 2016
Nicholas Ball; Mark T. D. Cronin; Jie Shen; Karen Blackburn; Ewan D. Booth; Mounir Bouhifd; Elizabeth L.R. Donley; Laura A. Egnash; Charles Hastings; D.R. Juberg; Andre Kleensang; Nicole Kleinstreuer; E.D. Kroese; A.C. Lee; Thomas Luechtefeld; Alexandra Maertens; S. Marty; Jorge M. Naciff; Jessica A. Palmer; David Pamies; M. Penman; Andrea-Nicole Richarz; Daniel P. Russo; Sharon B. Stuard; G. Patlewicz; B. van Ravenzwaay; Shengde Wu; Hao Zhu; Thomas Hartung
Summary: Grouping of substances and utilizing read-across of data within those groups represents an important data gap filling technique for chemical safety assessments. Categories/analogue groups are typically developed based on structural similarity and, increasingly often, also on mechanistic (biological) similarity. While read-across can play a key role in complying with legislation such as the European REACH regulation, the lack of consensus regarding the extent and type of evidence necessary to support it often hampers its successful application and acceptance by regulatory authorities. Despite a potentially broad user community, expertise is still concentrated in a handful of organizations and individuals. In order to facilitate the effective use of read-across, this document presents the state of the art, summarizes insights learned from reviewing ECHA's published decisions regarding the relative successes/pitfalls surrounding read-across under REACH, and compiles the relevant activities and guidance documents. Special emphasis is given to the available existing tools and approaches, an analysis of ECHA's published final decisions associated with all levels of compliance checks and testing proposals, the consideration and expression of uncertainty, the use of biological support data, and the impact of the ECHA Read-Across Assessment Framework (RAAF) published in 2015.
ALTEX-Alternatives to Animal Experimentation | 2016
Thomas Luechtefeld; Alexandra Maertens; Daniel P. Russo; Costanza Rovida; Hao Zhu; Thomas Hartung
Summary: The public data on skin sensitization from REACH registrations already included 19,111 studies on skin sensitization in December 2014, making it the largest repository of such data so far (1,470 substances with mouse LLNA, 2,787 with GPMT, 762 with both in vivo and in vitro and 139 with only in vitro data). 21% were classified as sensitizers. The extracted skin sensitization data was analyzed to identify relationships in skin sensitization guidelines, visualize structural relationships of sensitizers, and build models to predict sensitization. A chemical with molecular weight > 500 Da is generally considered non-sensitizing owing to low bioavailability, but 49 sensitizing chemicals with a molecular weight > 500 Da were found. A chemical similarity map was produced using PubChem’s 2D Tanimoto similarity metric and Gephi force layout visualization. Nine clusters of chemicals were identified by Blondel’s module recognition algorithm revealing wide module-dependent variation. Approximately 31% of mapped chemicals are Michael acceptors, but this alone does not imply skin sensitization. A simple sensitization model using molecular weight and five ToxTree structural alerts showed a balanced accuracy of 65.8% (specificity 80.4%, sensitivity 51.4%), demonstrating that structural alerts have information value. A simple variant of k-nearest neighbors outperformed the ToxTree approach even at a 75% similarity threshold (82% balanced accuracy at 0.95 threshold). At higher thresholds, the balanced accuracy increased. Lower similarity thresholds decrease sensitivity faster than specificity. This analysis scopes the landscape of chemical skin sensitization, demonstrating the value of large public datasets for health hazard prediction.
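The similarity-thresholded nearest-neighbor idea mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: fingerprints are represented as sets of "on" bit positions, and the function names, toy bit positions, and labels are all hypothetical.

```python
from typing import Iterable, Optional, Tuple

def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

def nearest_neighbor_label(
    query: frozenset,
    labeled: Iterable[Tuple[frozenset, bool]],
    threshold: float,
) -> Optional[bool]:
    """Return the label of the most similar labeled chemical, or None
    when no neighbor reaches the similarity threshold."""
    best_sim, best_label = -1.0, None
    for fingerprint, label in labeled:
        sim = tanimoto(query, fingerprint)
        if sim >= threshold and sim > best_sim:
            best_sim, best_label = sim, label
    return best_label

# Toy data: hypothetical bit positions, not real PubChem fingerprints.
training = [
    (frozenset({1, 2, 3, 4}), True),    # labeled sensitizer
    (frozenset({10, 11, 12}), False),   # labeled non-sensitizer
]
query = frozenset({1, 2, 3, 5})

print(tanimoto(query, training[0][0]))
print(nearest_neighbor_label(query, training, threshold=0.5))
print(nearest_neighbor_label(query, training, threshold=0.75))
```

Raising the threshold leaves more query chemicals without a prediction, which is why coverage and accuracy have to be reported together for this kind of model.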
ALTEX-Alternatives to Animal Experimentation | 2016
Thomas Luechtefeld; Alexandra Maertens; Daniel P. Russo; Costanza Rovida; Hao Zhu; Thomas Hartung
Summary: Public data from ECHA online dossiers on 9,801 substances encompassing 326,749 experimental key studies and additional information on classification and labeling were made computable. Eye irritation hazard, for which the rabbit Draize eye test still represents the reference method, was analyzed. Dossiers contained 9,782 Draize eye studies on 3,420 unique substances, indicating frequent retesting of substances. This allowed assessment of the test’s reproducibility based on all substances tested more than once. There was a 10% chance of a non-irritant evaluation after a prior severe-irritant result according to UN GHS classification criteria. The most reproducible outcomes were negative results (94% reproducible) and severe eye irritant (73% reproducible). To evaluate whether other GHS categorizations predict eye irritation, we built a dataset of 5,629 substances (1,931 “irritant” and 3,698 “non-irritant”). The two best decision trees with up to three other GHS classifications resulted in balanced accuracies of 68% and 73%, i.e., in the rank order of the Draize rabbit eye test itself, but both use inhalation toxicity data (“May cause respiratory irritation”), which is not typically available. Next, a dataset of 929 substances with at least one Draize study was mapped to PubChem to compute chemical similarity using 2D conformational fingerprints and Tanimoto similarity. Using a minimum similarity of 0.7 and simple classification by the closest chemical neighbor resulted in balanced accuracy from 73% over 737 substances to 100% at a threshold of 0.975 over 41 substances. This strongly supports read-across and (Q)SAR approaches in this area.
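One way to estimate retest reproducibility of the kind reported here is to count how often each outcome follows a given prior outcome for the same substance. The sketch below assumes results are available as an ordered list per substance; the pairing of consecutive studies, the category names, and the toy data are illustrative assumptions rather than the procedure used in the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, List

def conditional_outcome_rates(
    results_by_substance: Dict[str, List[str]],
) -> Dict[str, Dict[str, float]]:
    """Estimate P(later outcome | prior outcome) from consecutive
    repeat studies on the same substance."""
    pair_counts: Dict[str, Counter] = defaultdict(Counter)
    for results in results_by_substance.values():
        # Pair each study with the next study on the same substance.
        for prior, later in zip(results, results[1:]):
            pair_counts[prior][later] += 1
    return {
        prior: {later: n / sum(counts.values()) for later, n in counts.items()}
        for prior, counts in pair_counts.items()
    }

# Hypothetical repeat-test records (not data from the dossiers).
studies = {
    "substance_A": ["severe", "severe", "non-irritant"],
    "substance_B": ["non-irritant", "non-irritant"],
    "substance_C": ["severe", "severe"],
}
rates = conditional_outcome_rates(studies)
print(rates["severe"])
print(rates["non-irritant"])
```

Substances tested only once contribute no pairs, so only retested substances inform the estimate.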
ALTEX-Alternatives to Animal Experimentation | 2016
Thomas Luechtefeld; Alexandra Maertens; Daniel P. Russo; Costanza Rovida; Hao Zhu; Thomas Hartung
Summary: The European Chemicals Agency (ECHA) warehouses the largest public dataset of in vivo and in vitro toxicity tests. In December 2014 this data was converted into a structured, machine-readable and searchable database using natural language processing. It contains data for 9,801 unique substances, 3,609 unique study descriptions and 816,048 study documents. This allows exploring toxicological data on a scale far larger than previously possible. Substance similarity analysis was used to determine clustering of substances for hazards by mapping to PubChem. Similarity was measured using PubChem 2D conformational substructure fingerprints, which were compared via the Tanimoto metric. Following K-Core filtration, the Blondel et al. (2008) module recognition algorithm was used to identify chemical modules showing clusters of substances in use within the chemical universe. The Globally Harmonized System of Classification and Labelling provides a valuable information source for hazard analysis. The most prevalent hazards are H317 “May cause an allergic skin reaction” with 20% and H318 “Causes serious eye damage” with 17% positive substances. Such prevalences obtained for all hazards here are key for the design of integrated testing strategies. The data allowed estimation of animal use. The database covers about 20% of substances in the high-throughput biological assay database Tox21 (1,737 substances) and has a 917 substance overlap with the Comparative Toxicogenomics Database (~7% of CTD). The biological data available in these datasets combined with ECHA in vivo endpoints have enormous modeling potential. A case is made that REACH should systematically open regulatory data for research purposes.
ALTEX-Alternatives to Animal Experimentation | 2016
Thomas Luechtefeld; Alexandra Maertens; Daniel P. Russo; Costanza Rovida; Hao Zhu; Thomas Hartung
Summary: The European Chemicals Agency, ECHA, made available a total of 13,832 oral toxicity studies for 8,568 substances up to December 2014. 75% of studies were from the retired OECD Test Guideline 401 (11% TG 420, 11% TG 423 and 1.5% TG 425). Concordance across guidelines, evaluated by comparing LD50 values ≥ 2,000 or < 2,000 mg/kg bodyweight from chemicals tested multiple times between different guidelines, was at least 75% and for their own repetition more than 90%. In 2009, Bulgheroni et al. created a simple model for predicting acute oral toxicity using no observed adverse effect levels (NOAEL) from 28-day repeated dose toxicity studies in rats. This was reproduced here for 1,625 substances. In 2014, Taylor et al. suggested no added value of the 90-day repeated dose oral toxicity test given the availability of a low 28-day study with some constraints. We confirm that the 28-day NOAEL is predictive (albeit imperfectly) of 90-day NOAELs; however, the suggested constraints did not affect predictivity. 1,059 substances with acute oral toxicity data (268 positives, 791 negatives, all Klimisch score 1) were used for modeling. The Chemistry Development Kit was used to generate 27 molecular descriptors, and a similarity-informed multilayer perceptron showed 71% sensitivity and 72% specificity. Additionally, the k-nearest neighbors (KNN) algorithm indicated that similarity-based approaches alone may be poor predictors of acute oral toxicity, but can be used to inform the multilayer perceptron model, where this was the feature with the highest information value.
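The concordance comparison described in this abstract, where LD50 values are binarized at 2,000 mg/kg bodyweight, can be illustrated with a short sketch. The helper names and the paired LD50 values below are hypothetical; only the 2,000 mg/kg cutoff comes from the abstract.

```python
from typing import Iterable, Tuple

LD50_CUTOFF_MG_PER_KG = 2000.0  # cutoff stated in the abstract

def at_or_above_cutoff(ld50_mg_per_kg: float) -> bool:
    """True when the LD50 is >= 2,000 mg/kg bodyweight."""
    return ld50_mg_per_kg >= LD50_CUTOFF_MG_PER_KG

def concordance(paired_ld50: Iterable[Tuple[float, float]]) -> float:
    """Fraction of paired studies that fall on the same side of the cutoff."""
    pairs = list(paired_ld50)
    if not pairs:
        raise ValueError("at least one pair of LD50 values is required")
    agreeing = sum(
        at_or_above_cutoff(first) == at_or_above_cutoff(second)
        for first, second in pairs
    )
    return agreeing / len(pairs)

# Hypothetical LD50 pairs (mg/kg) for the same chemical under two guidelines.
example_pairs = [(2500.0, 3000.0), (500.0, 1800.0), (1900.0, 2100.0), (300.0, 250.0)]
print(concordance(example_pairs))
```

Binarizing at a single cutoff means that two studies with very different LD50 values still count as concordant when both land on the same side of 2,000 mg/kg.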
Molecular Pharmaceutics | 2017
Alexandru Korotcov; Valery Tkachenko; Daniel P. Russo; Sean Ekins
Machine learning methods have been applied to many data sets in pharmaceutical research for several decades. The relative ease and availability of fingerprint type molecular descriptors paired with Bayesian methods resulted in the widespread use of this approach for a diverse array of end points relevant to drug discovery. Deep learning is the latest machine learning algorithm attracting attention for many pharmaceutical applications from docking to virtual screening. Deep learning is based on an artificial neural network with multiple hidden layers and has found considerable traction for many artificial intelligence applications. We have previously suggested the need for a comparison of different machine learning methods with deep learning across an array of varying data sets that is applicable to pharmaceutical research. End points relevant to pharmaceutical research include absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties, as well as activity against pathogens and drug discovery data sets. In this study, we have used data sets for solubility, probe-likeness, hERG, KCNQ1, bubonic plague, Chagas, tuberculosis, and malaria to compare different machine learning methods using FCFP6 fingerprints. These data sets represent whole cell screens, individual proteins, physicochemical properties as well as a data set with a complex end point. Our aim was to assess whether deep learning offered any improvement in testing when assessed using an array of metrics including AUC, F1 score, Cohen's kappa, Matthews correlation coefficient and others. Based on ranked normalized scores for the metrics or data sets, Deep Neural Networks (DNN) ranked higher than SVM, which in turn was ranked higher than all the other machine learning methods. Visualizing these properties for training and test sets using radar type plots indicates when models are inferior or perhaps overtrained. These results also suggest the need for assessing deep learning further using multiple metrics with much larger scale comparisons, prospective testing as well as assessment of different fingerprints and DNN architectures beyond those used.
ACS Nano | 2017
Wenyi Wang; Alexander Sedykh; Hainan Sun; Linlin Zhao; Daniel P. Russo; Hongyu Zhou; Bing Yan; Hao Zhu
The discovery of biocompatible or bioactive nanoparticles for medicinal applications is an expensive and time-consuming process that may be significantly facilitated by incorporating more rational approaches combining both experimental and computational methods. However, it is currently hindered by two limitations: (1) the lack of high-quality comprehensive data for computational modeling and (2) the lack of an effective modeling method for the complex nanomaterial structures. In this study, we tackled both issues by first synthesizing a large library of nanoparticles and obtaining comprehensive data on their characterizations and bioactivities. Meanwhile, we virtually simulated each individual nanoparticle in this library by calculating its nanostructural characteristics and built models that correlate nanostructure diversity to the corresponding biological activities. The resulting models were then used to predict and design nanoparticles with desired bioactivities. The experimental testing results of the designed nanoparticles were consistent with the model predictions. These findings demonstrate that rational design approaches combining high-quality nanoparticle libraries, big experimental data sets, and intelligent computational models can significantly reduce the efforts and costs of nanomaterial discovery.
Bioinformatics | 2016
Daniel P. Russo; Marlene T. Kim; Wenyi Wang; Daniel Pinolini; Sunil M. Shende; Judy Strickland; Thomas Hartung; Hao Zhu
Summary: We have developed a public Chemical In vitro‐In vivo Profiling (CIIPro) portal, which can automatically extract in vitro biological data from public resources (i.e. PubChem) for user‐supplied compounds. For compounds with in vivo target activity data (e.g. animal toxicity testing results), the integrated cheminformatics algorithm will optimize the extracted biological data using in vitro‐in vivo correlations. The resulting in vitro biological data for target compounds can be used for read‐across risk assessment of target compounds. Additionally, the CIIPro portal can identify the most similar compounds based on their optimized bioprofiles. The CIIPro portal provides new powerful assessment capabilities to the scientific community and can be easily integrated with other cheminformatics tools. Availability and Implementation: ciipro.rutgers.edu. Contact: [email protected] or [email protected]
ACS Combinatorial Science | 2016
Jinbao Xiang; Zhuoqi Zhang; Yan Mu; Xianxiu Xu; Sigen Guo; Yongjin Liu; Daniel P. Russo; Hao Zhu; Bing Yan; Xu Bai
An efficient discovery strategy combining diversity-oriented synthesis and converging cellular screening is described. By a three-round screening process, we identified novel tricyclic pyrido[2,3-b][1,4]benzothiazepines showing potent inhibitory activity against the paclitaxel-resistant cell line H460TaxR (EC50 < 1.0 μM) while exhibiting much less toxicity toward normal cells (EC50 > 100 μM against normal human fibroblasts). The most active hits also exhibited drug-like properties suitable for further preclinical research. This redeployment of antidepressing compounds for anticancer applications provides promising future prospects for treating drug-resistant tumors with fewer side effects.
Molecular Pharmaceutics | 2018
Thomas R. Lane; Daniel P. Russo; Kimberley M. Zorn; Alex M. Clark; Alexandru Korotcov; Valery Tkachenko; Robert C. Reynolds; Alexander L. Perryman; Joel S. Freundlich; Sean Ekins
Tuberculosis is a global health dilemma. In 2016, the WHO reported 10.4 million incident cases and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens and many thousands of new active compounds identified in vitro. However, with limited funding, efforts to discover new active molecules against Mtb need to be more efficient. Several computational machine learning approaches have been shown to have good enrichment and hit rates. We have curated small molecule Mtb data and developed new models with a total of 18,886 molecules with activity cutoffs of 10 μM, 1 μM, and 100 nM. These data sets were used to evaluate different machine learning methods (including deep learning) and metrics and to generate predictions for additional molecules published in 2017. One Mtb model, a combined in vitro and in vivo data Bayesian model at a 100 nM activity cutoff, yielded the following metrics for 5-fold cross validation: accuracy = 0.88, precision = 0.22, recall = 0.91, specificity = 0.88, kappa = 0.31, and MCC = 0.41. We have also curated an evaluation set (n = 153 compounds) published in 2017, and when used to test our model, it showed comparable statistics (accuracy = 0.83, precision = 0.27, recall = 1.00, specificity = 0.81, kappa = 0.36, and MCC = 0.47). We have also compared these models with additional machine learning algorithms, showing that Bayesian machine learning models constructed with literature Mtb data generated by different laboratories generally were equivalent to or outperformed deep neural networks with external test sets. Finally, we have also compared our training and test sets to show they were suitably diverse and different in order to represent useful evaluation sets. Such Mtb machine learning models could help prioritize compounds for testing in vitro and in vivo.
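For readers less familiar with the metrics quoted in this abstract, the sketch below shows how accuracy, precision, recall, specificity, Cohen's kappa, and the Matthews correlation coefficient follow from a binary confusion matrix. The counts are hypothetical and are not taken from the paper; the function name is likewise an illustrative choice.

```python
import math
from typing import Dict

def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts.
    Assumes both classes and both predicted labels occur at least once."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "kappa": kappa,
        "mcc": mcc,
    }

# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When actives are rare, precision can be low even though recall and specificity are high, because a small false-positive rate applied to many inactives can still outnumber the true positives; kappa and MCC are reported alongside accuracy partly for that reason.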