Publication


Featured research published by Richard L. Marchese Robinson.


Journal of the Royal Society Interface | 2012

Hierarchical virtual screening for the discovery of new molecular scaffolds in antibacterial hit identification

Pedro J. Ballester; Martina Mangold; Nigel I. Howard; Richard L. Marchese Robinson; Chris Abell; Jochen Blumberger; John B. O. Mitchell

One of the initial steps of modern drug discovery is the identification of small organic molecules able to inhibit a target macromolecule of therapeutic interest. A small proportion of these hits are further developed into lead compounds, which in turn may ultimately lead to a marketed drug. A screening protocol commonly used for this task is high-throughput screening (HTS). However, the performance of HTS against antibacterial targets has generally been unsatisfactory, with high costs and low rates of hit identification. Here, we present a novel computational methodology that is able to identify a high proportion of structurally diverse inhibitors by searching unusually large molecular databases in a time-, cost- and resource-efficient manner. This virtual screening methodology was tested prospectively on two versions of an antibacterial target (type II dehydroquinase from Mycobacterium tuberculosis and Streptomyces coelicolor), for which HTS has not provided satisfactory results and consequently practically all known inhibitors are derivatives of the same core scaffold. Overall, our protocols identified 100 new inhibitors, with calculated Ki ranging from 4 to 250 μM (confirmed hit rates are 60% and 62% against each version of the target). Most importantly, over 50 new active molecular scaffolds were discovered that underscore the benefits that a wide application of prospectively validated in silico screening tools is likely to bring to antibacterial hit identification.


Molecular Informatics | 2011

Development and Comparison of hERG Blocker Classifiers: Assessment on Different Datasets Yields Markedly Different Results

Richard L. Marchese Robinson; Robert C. Glen; John B. O. Mitchell

In recent years, considerable effort has been invested in the development of classification models for prospective hERG inhibitors, due to the implications of hERG blockade for cardiotoxicity and the low throughput of functional hERG assays. We present novel approaches for binary classification which seek to separate strong inhibitors (IC50 < 1 µM) from ‘non‐blockers’ exhibiting moderate (1–10 µM) or weak (IC50 ≥ 10 µM) inhibition, as required by the pharmaceutical industry. Our approaches are based on (discretized) 2D descriptors, selected using Winnow, with additional models generated using Random Forest (RF) and Support Vector Machines (SVMs). We compare our models to those previously developed by Thai and Ecker and by Dubus et al. The purpose of this paper is twofold: (1) to propose that our approaches (with Matthews Correlation Coefficients from 0.40 to 0.87 on truly external test sets, when extrapolation beyond the applicability domain was not evident and sufficient quantities of data were available for training) are competitive with those currently proposed in the literature; and (2) to highlight key issues associated with building and assessing truly predictive models, in particular the considerable variation in model performance when training and testing on different datasets.
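
The abstract above reports performance as Matthews Correlation Coefficients for a binary split at IC50 = 1 µM. As a rough illustration of how such a figure could be computed, the following minimal sketch labels compounds by that threshold and scores a set of predictions. The function names, the IC50 values and the predictions are all invented for illustration and are not taken from the paper.

```python
import math

def label_potent(ic50_um, threshold_um=1.0):
    """Return True for a strong inhibitor (IC50 below the threshold, in µM)."""
    return ic50_um < threshold_um

def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient for two equal-length boolean sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is taken as 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical measured IC50 values (µM) and hypothetical model predictions.
ic50_values = [0.2, 0.8, 3.0, 15.0, 0.5, 40.0]
y_true = [label_potent(v) for v in ic50_values]
y_pred = [True, False, False, False, True, True]

mcc = matthews_corrcoef(y_true, y_pred)
print(f"MCC = {mcc:.3f}")
```

A coefficient of 1 indicates perfect agreement and 0 indicates performance no better than chance, which is why the measure is often preferred to plain accuracy when the two classes are unbalanced.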


arXiv: Learning | 2014

Interpreting Random Forest Classification Models Using a Feature Contribution Method

Anna Palczewska; Jan Palczewski; Richard L. Marchese Robinson; Daniel Neagu

Model interpretation is one of the key aspects of the model evaluation process. The explanation of the relationship between model variables and outputs is relatively easy for statistical models, such as linear regressions, thanks to the availability of model parameters and their statistical significance. For “black box” models, such as random forest, this information is hidden inside the model structure. This work presents an approach for computing feature contributions for random forest classification models. It allows for the determination of the influence of each variable on the model prediction for an individual instance. By analysing feature contributions for a training dataset, the most significant variables can be determined and their typical contribution towards predictions made for individual classes, i.e., class-specific feature contribution “patterns”, are discovered. These patterns represent a standard behaviour of the model and allow for an additional assessment of the model reliability for new data. Interpretation of feature contributions for two UCI benchmark datasets shows the potential of the proposed methodology. The robustness of results is demonstrated through an extensive analysis of feature contributions calculated for a large number of generated random forest models.
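
To make the idea of per-instance feature contributions concrete, here is a minimal sketch. It assumes that the contribution of a feature is the change in the positive-class fraction between a node and the child an instance is routed to, summed along the decision path and averaged over trees. The abstract does not spell out these details, so this should be read as one plausible reading rather than the authors' implementation. The hand-built trees, the feature names `f1` and `f2`, and all numbers are hypothetical.

```python
def tree_contributions(tree, x):
    """Walk one tree for instance x; return (root prior, {feature: contribution})."""
    contributions = {}
    node = tree
    while "feature" in node:  # internal nodes carry a split, leaves do not
        feature = node["feature"]
        child = node["left"] if x[feature] <= node["threshold"] else node["right"]
        # Credit the split feature with the change in positive-class fraction.
        contributions[feature] = contributions.get(feature, 0.0) + child["p"] - node["p"]
        node = child
    return tree["p"], contributions

def forest_contributions(forest, x):
    """Average the root priors and the feature contributions over all trees."""
    prior_total, totals = 0.0, {}
    for tree in forest:
        prior, contributions = tree_contributions(tree, x)
        prior_total += prior
        for feature, value in contributions.items():
            totals[feature] = totals.get(feature, 0.0) + value
    n = len(forest)
    return prior_total / n, {f: v / n for f, v in totals.items()}

# Two toy trees; "p" is the fraction of positive training instances at a node.
tree_a = {"p": 0.5, "feature": "f1", "threshold": 3.0,
          "left": {"p": 0.2},
          "right": {"p": 0.7, "feature": "f2", "threshold": 400.0,
                    "left": {"p": 0.4}, "right": {"p": 1.0}}}
tree_b = {"p": 0.5, "feature": "f2", "threshold": 300.0,
          "left": {"p": 0.1}, "right": {"p": 0.6}}

instance = {"f1": 4.0, "f2": 450.0}
prior, contribs = forest_contributions([tree_a, tree_b], instance)
prediction = prior + sum(contribs.values())
print(prior, contribs, prediction)
```

Under this definition the prior plus the summed contributions reproduces the forest's averaged leaf value for the instance, so the prediction is decomposed exactly into one term per feature.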


Beilstein Journal of Nanotechnology | 2015

An ISA-TAB-Nano based data collection framework to support data-driven modelling of nanotoxicology

Richard L. Marchese Robinson; Mark T. D. Cronin; Andrea-Nicole Richarz; Robert Rallo

Analysis of trends in nanotoxicology data and the development of data driven models for nanotoxicity is facilitated by the reporting of data using a standardised electronic format. ISA-TAB-Nano has been proposed as such a format. However, in order to build useful datasets according to this format, a variety of issues has to be addressed. These issues include questions regarding exactly which (meta)data to report and how to report them. The current article discusses some of the challenges associated with the use of ISA-TAB-Nano and presents a set of resources designed to facilitate the manual creation of ISA-TAB-Nano datasets from the nanotoxicology literature. These resources were developed within the context of the NanoPUZZLES EU project and include data collection templates, corresponding business rules that extend the generic ISA-TAB-Nano specification as well as Python code to facilitate parsing and integration of these datasets within other nanoinformatics resources. The use of these resources is illustrated by a “Toy Dataset” presented in the Supporting Information. The strengths and weaknesses of the resources are discussed along with possible future developments.


Journal of Cheminformatics | 2012

Winnow based identification of potent hERG inhibitors in silico: comparative assessment on different datasets

Richard L. Marchese Robinson; Robert C. Glen; John B. O. Mitchell

Due to the potentially lethal effects of potent hERG inhibition, in silico approaches which can identify potent (IC50 < 1 µM) inhibitors are of considerable interest to the pharmaceutical industry [1]. We present recent work [2] in which in silico binary classifiers were trained to discriminate potent inhibitors from compounds exhibiting weaker (IC50 ≥ 1 µM) inhibition. Initial models were based on a version of the memory efficient Winnow algorithm [3]. These initial models were generated using various descriptor sets. The descriptor set yielding the best cross-validated initial Winnow model was used to build models using each of Winnow, Random Forest and Support Vector Machine. Analysis of the contributions of different substructural and physicochemical features in the final Winnow models indicates they may be interpreted, albeit with caution. All final models were externally validated, with no algorithm consistently outperforming the others. These approaches were directly compared, on various datasets, to those proposed by Thai and Ecker [4] and by Dubus et al. [5]. The results indicate that the Winnow models are competitive with earlier approaches proposed in the literature. The findings also emphasise a potential difficulty when seeking to estimate the predictive power of in silico models on small quantities of data: model performance may vary considerably, particularly when training and validating on different datasets.
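
For readers unfamiliar with Winnow, the sketch below shows the textbook form of the algorithm: a linear threshold classifier over binary features whose weights are updated multiplicatively, and only on mistakes. This is not necessarily the version used in the work above, which the abstract describes only as "a version of" Winnow. The sparse input format, the parameter values and the toy data are assumptions made for illustration.

```python
def predict(weights, threshold, active):
    """Predict True if the summed weight of the active features reaches the threshold."""
    return sum(weights[i] for i in active) >= threshold

def train_winnow(examples, n_features, alpha=2.0, epochs=10):
    """Train on (active_feature_indices, label) pairs using mistake-driven updates."""
    weights = [1.0] * n_features
    threshold = n_features / 2
    for _ in range(epochs):
        mistakes = 0
        for active, label in examples:
            if predict(weights, threshold, active) == label:
                continue  # weights change only when a mistake is made
            mistakes += 1
            # Promote active features on a missed positive, demote on a false alarm.
            factor = alpha if label else 1.0 / alpha
            for i in active:
                weights[i] *= factor
        if mistakes == 0:
            break  # a full pass without errors: stop early
    return weights, threshold

# Toy data: each compound is the set of indices of the substructure bits it
# contains; here a compound is labelled potent exactly when bit 0 is present.
examples = [
    (frozenset({0, 1}), True),
    (frozenset({1, 2}), False),
    (frozenset({0, 3}), True),
    (frozenset({2, 3}), False),
    (frozenset({0}), True),
    (frozenset({1, 3}), False),
]

weights, threshold = train_winnow(examples, n_features=4)
print(weights, threshold)
```

Because each example is stored as the set of its active features and only those weights are touched on an update, memory and time per example scale with the number of bits set rather than with the full descriptor length. The learned weights also give a rough indication of which features push a prediction towards the potent class, which is the kind of cautious interpretation the abstract refers to.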


Information Reuse and Integration | 2013

Interpreting random forest models using a feature contribution method

Anna Palczewska; Jan Palczewski; Richard L. Marchese Robinson; Daniel Neagu

Model interpretation is one of the key aspects of the model evaluation process. The explanation of the relationship between model variables and outputs is easy for statistical models, such as linear regressions, thanks to the availability of model parameters and their statistical significance. For “black box” models, such as random forest, this information is hidden inside the model structure. This work presents an approach for computing feature contributions for random forest classification models. It allows for the determination of the influence of each variable on the model prediction for an individual instance. Interpretation of feature contributions for two UCI benchmark datasets shows the potential of the proposed methodology. The robustness of results is demonstrated through an extensive analysis of feature contributions calculated for a large number of generated random forest models.


Journal of Chemical Information and Modeling | 2017

Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets

Richard L. Marchese Robinson; Anna Palczewska; Jan Palczewski; Nathan Kidley

The ability to interpret the predictions made by quantitative structure-activity relationships (QSARs) offers a number of advantages. While QSARs built using nonlinear modeling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modeling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting nonlinear QSAR models in general and Random Forest in particular. In the current work, we compare the performance of Random Forest to those of two widely used linear modeling approaches: linear Support Vector Machines (SVMs) (or Support Vector Regression (SVR)) and partial least-squares (PLS). We compare their performance in terms of their predictivity as well as the chemical interpretability of the predictions using novel scoring schemes for assessing heat map images of substructural contributions. We critically assess different approaches for interpreting Random Forest models as well as for obtaining predictions from the forest. We assess the models on a large number of widely employed public-domain benchmark data sets corresponding to regression and binary classification problems of relevance to hit identification and toxicology. We conclude that Random Forest typically yields comparable or possibly better predictive performance than the linear modeling approaches and that its predictions may also be interpreted in a chemically and biologically meaningful way. In contrast to earlier work looking at interpretation of nonlinear QSAR models, we directly compare two methodologically distinct approaches for interpreting Random Forest models. The approaches for interpreting Random Forest assessed in our article were implemented using open-source programs that we have made available to the community. 
These programs are the rfFC package (https://r-forge.r-project.org/R/?group_id=1725) for the R statistical programming language and the Python program HeatMapWrapper (https://doi.org/10.5281/zenodo.495163) for heat map generation.


NanoImpact | 2018

Integration among databases and data sets to support productive nanotechnology: Challenges and recommendations

Sandra C. Karcher; Egon Willighagen; John Rumble; Friederike Ehrhart; Chris T. Evelo; Martin Fritts; Sharon Gaheen; Stacey L. Harper; Mark D. Hoover; Nina Jeliazkova; Nastassja A. Lewinski; Richard L. Marchese Robinson; Karmann C. Mills; Axel P. Mustad; Dennis G. Thomas; Georgia Tsiliki; Christine Ogilvie Hendren

Many groups within the broad field of nanoinformatics are already developing data repositories and analytical tools driven by their individual organizational goals. Integrating these data resources across disciplines and with non-nanotechnology resources can support multiple objectives by enabling the reuse of the same information. Integration can also serve as the impetus for novel scientific discoveries by providing the framework to support deeper data analyses. This article discusses current data integration practices in nanoinformatics and in comparable mature fields, and nanotechnology-specific challenges impacting data integration. Based on results from a nanoinformatics-community-wide survey, recommendations for achieving integration of existing operational nanotechnology resources are presented. Nanotechnology-specific data integration challenges, if effectively resolved, can foster the application and validation of nanotechnology within and across disciplines. This paper is one of a series of articles by the Nanomaterial Data Curation Initiative that address data issues such as data curation workflows, data completeness and quality, curator responsibilities, and metadata.


Journal of Cheminformatics | 2018

The influence of solid state information and descriptor selection on statistical models of temperature dependent aqueous solubility

Richard L. Marchese Robinson; Kevin J. Roberts; Elaine Martin

Predicting the equilibrium solubility of organic, crystalline materials at all relevant temperatures is crucial to the digital design of manufacturing unit operations in the chemical industries. The work reported in our current publication builds upon the limited number of recently published quantitative structure–property relationship studies which modelled the temperature dependence of aqueous solubility. One set of models was built to directly predict temperature dependent solubility, including for materials with no solubility data at any temperature. We propose that a modified cross-validation protocol is required to evaluate these models. Another set of models was built to predict the related enthalpy of solution term, which can be used to estimate solubility at one temperature based upon solubility data for the same material at another temperature. We investigated whether various kinds of solid state descriptors improved the models obtained with a variety of molecular descriptor combinations: lattice energies or 3D descriptors calculated from crystal structures or melting point data. We found that none of these greatly improved the best direct predictions of temperature dependent solubility or the related enthalpy of solution endpoint. This finding is surprising because the importance of the solid state contribution to both endpoints is clear. We suggest our findings may, in part, reflect limitations in the descriptors calculated from crystal structures and, more generally, the limited availability of polymorph specific data. We present curated temperature dependent solubility and enthalpy of solution datasets, integrated with molecular and crystal structures, for future investigations.
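
The abstract notes that an enthalpy of solution can be used to carry a solubility measured at one temperature over to another. One common way to do this is the van 't Hoff relation, ln(x2/x1) = -(ΔH_sol/R)(1/T2 - 1/T1). The abstract does not state the exact equation used, so the sketch below is an assumption, and it further assumes that the enthalpy of solution is constant over the temperature range and that solubility is expressed as a mole fraction. The function name and the numerical inputs are invented for illustration.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)

def solubility_at(t2_k, x1, t1_k, dh_sol_j_per_mol):
    """Estimate mole-fraction solubility at t2_k from the value x1 known at t1_k.

    Uses the van 't Hoff relation with a temperature-independent enthalpy of
    solution (an assumption of this sketch).
    """
    exponent = -dh_sol_j_per_mol / R * (1.0 / t2_k - 1.0 / t1_k)
    return x1 * math.exp(exponent)

# Hypothetical compound: mole-fraction solubility 1e-3 at 298.15 K and an
# endothermic enthalpy of solution of 25 kJ/mol.
x_298 = 1.0e-3
x_313 = solubility_at(313.15, x_298, 298.15, 25_000.0)
print(f"Estimated solubility at 313.15 K: {x_313:.3e}")
```

For an endothermic dissolution (positive enthalpy of solution) the estimate rises with temperature, and applying the function in the reverse direction recovers the starting value, which gives a quick sanity check on the sign convention.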


Nanoscale | 2015

Erratum: Genotoxicity of metal oxide nanomaterials: Review of recent data and discussion of possible mechanisms (Nanoscale (2015) 7 (2154-2198))

Nazanin Golbamaki; Bakhtiyor Rasulev; Antonio Cassano; Richard L. Marchese Robinson; Emilio Benfenati; Jerzy Leszczynski; Mark T. D. Cronin


Collaboration


Top Co-Authors

Mark T. D. Cronin
Liverpool John Moores University

Emilio Benfenati
Mario Negri Institute for Pharmacological Research

Andrea-Nicole Richarz
Liverpool John Moores University

Antonio Cassano
Liverpool John Moores University