Publications


Featured research published by John W. Raymond.


Journal of Computer-aided Molecular Design | 2002

Maximum common subgraph isomorphism algorithms for the matching of chemical structures

John W. Raymond; Peter Willett

The maximum common subgraph (MCS) problem has become increasingly important in those aspects of chemoinformatics that involve the matching of 2D or 3D chemical structures. This paper provides a classification and a review of the many MCS algorithms, both exact and approximate, that have been described in the literature, and makes recommendations regarding their applicability to typical chemoinformatics tasks.
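One standard exact approach the review covers reduces MCS detection to maximum clique finding in a product graph. As a minimal sketch (the two toy labeled graphs and their atom labels are invented for illustration), the code below builds the modular product of two graphs and brute-forces its largest clique, which corresponds to a maximum common induced subgraph:

```python
from itertools import combinations

# Toy labeled graphs (invented for illustration): vertex -> atom label,
# edges as frozensets of endpoints.
g1_labels = {0: "C", 1: "C", 2: "O"}
g1_edges = {frozenset({0, 1}), frozenset({1, 2})}

g2_labels = {0: "C", 1: "O", 2: "C", 3: "N"}
g2_edges = {frozenset({0, 1}), frozenset({0, 2}), frozenset({2, 3})}

def compatible(p, q):
    """Pairs (i, a) and (j, b) can coexist in a mapping iff they use
    distinct vertices and preserve the edge/non-edge relation."""
    (i, a), (j, b) = p, q
    if i == j or a == b:
        return False
    return (frozenset({i, j}) in g1_edges) == (frozenset({a, b}) in g2_edges)

# Modular-product vertices: all label-matched (g1 vertex, g2 vertex) pairs.
pairs = [(i, a) for i in g1_labels for a in g2_labels
         if g1_labels[i] == g2_labels[a]]

# Brute-force maximum clique (fine at toy scale): every clique in the
# modular product is a common induced subgraph, and vice versa.
best = []
for size in range(len(pairs), 0, -1):
    for combo in combinations(pairs, size):
        if all(compatible(p, q) for p, q in combinations(combo, 2)):
            best = list(combo)
            break
    if best:
        break

print(best)  # a maximum common induced subgraph, as a vertex mapping
```

Real chemoinformatics implementations replace the brute-force loop with a bounded clique algorithm, since the product graph grows quadratically in the number of label-matched atom pairs.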


The Computer Journal | 2002

RASCAL: Calculation of Graph Similarity using Maximum Common Edge Subgraphs

John W. Raymond; Eleanor J. Gardiner; Peter Willett

A new graph similarity calculation procedure is introduced for comparing labeled graphs. Given a minimum similarity threshold, the procedure consists of an initial screening process to determine whether it is possible for the measure of similarity between the two graphs to exceed the minimum threshold, followed by a rigorous maximum common edge subgraph (MCES) detection algorithm to compute the exact degree and composition of similarity. The proposed MCES algorithm is based on a maximum clique formulation of the problem and is a significant improvement over other published algorithms. It presents new approaches to both lower and upper bounding as well as vertex selection.
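The screening idea can be sketched as a cheap upper bound: a common edge subgraph can never contain more copies of an edge type (endpoint labels plus bond order) than occur in either input graph. The function below is a simplified, edge-only stand-in for the published RASCAL bound, which also accounts for vertices; the edge encoding is invented for illustration.

```python
from collections import Counter

def edge_types(edges):
    """Multiset of edge types; an edge is (label_u, label_v, bond_order)."""
    return Counter(tuple(sorted((u, v))) + (bond,) for u, v, bond in edges)

def screen(edges1, edges2, threshold):
    """Cheap pre-test before any subgraph search: bound the achievable
    similarity by the multiset intersection of edge-type counts. Returns
    True when the pair survives screening. (Edge-only simplification of
    the published bound.)"""
    c1, c2 = edge_types(edges1), edge_types(edges2)
    common = sum((c1 & c2).values())  # multiset intersection size
    bound = common ** 2 / (sum(c1.values()) * sum(c2.values()))
    return bound >= threshold
```

For example, `screen([("C", "C", 1), ("C", "O", 1)], [("C", "O", 1)], 0.5)` passes (the bound is exactly 0.5), while a threshold of 0.6 rejects the pair without running the expensive MCES algorithm at all.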


Journal of Computer-aided Molecular Design | 2002

Effectiveness of graph-based and fingerprint-based similarity measures for virtual screening of 2D chemical structure databases

John W. Raymond; Peter Willett

This paper reports an evaluation of both graph-based and fingerprint-based measures of structural similarity, when used for virtual screening of sets of 2D molecules drawn from the MDDR and ID Alert databases. The graph-based measures employ a new maximum common edge subgraph isomorphism algorithm, called RASCAL, with several similarity coefficients described previously for quantifying the similarity between pairs of graphs. The effectiveness of these graph-based searches is compared with that resulting from similarity searches using BCI, Daylight and Unity 2D fingerprints. Our results suggest that graph-based approaches provide an effective complement to existing fingerprint-based approaches to virtual screening.
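For contrast with the graph-based measures, a fingerprint similarity search reduces each molecule to a set of features and scores pairs with a coefficient such as Tanimoto. A minimal sketch, treating fingerprints as Python sets of arbitrary feature identifiers:

```python
def tanimoto(fp1: set, fp2: set) -> float:
    """Tanimoto (Jaccard) coefficient: shared features over distinct features."""
    if not fp1 and not fp2:
        return 1.0
    return len(fp1 & fp2) / len(fp1 | fp2)

# Feature identifiers are arbitrary here; real 2D fingerprints such as
# BCI, Daylight, or Unity hash substructure fragments into bit positions.
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 2 shared of 4 distinct -> 0.5
```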


Journal of Molecular Graphics & Modelling | 2003

Comparison of chemical clustering methods using graph- and fingerprint-based similarity measures

John W. Raymond; C. John Blankley; Peter Willett

This paper compares several published methods for clustering chemical structures, using both graph- and fingerprint-based similarity measures. The clusterings from each method were compared to determine the degree of cluster overlap. Each method was also evaluated on how well it grouped structures into clusters possessing a non-trivial substructural commonality. The methods which employ adjustable parameters were tested to determine the stability of each parameter for datasets of varying size and composition. Our experiments suggest that both graph- and fingerprint-based similarity measures can be used effectively for generating chemical clusterings; it is also suggested that the CAST and Yin-Chen methods, proposed recently for the clustering of gene expression patterns, may prove effective for the clustering of 2D chemical structures.


Journal of Chemical Information and Modeling | 2009

Rationalizing Lead Optimization by Associating Quantitative Relevance with Molecular Structure Modification

John W. Raymond; Ian A. Watson; Abdelaziz Mahoui

Historically, one of the characteristic activities of the medicinal chemist has been the iterative improvement of lead compounds until a suitable therapeutic entity is achieved. Often referred to as lead optimization, this process typically takes the form of minor structural modifications to an existing lead in an attempt to ameliorate deleterious attributes while simultaneously trying to maintain or improve desirable properties. The cumulative effect of this exercise performed over the course of several decades of pharmaceutical research by thousands of trained researchers has resulted in large collections of pharmaceutically relevant chemical structures. As far as the authors are aware, this work represents the first attempt to use that data to define a framework to quantifiably catalogue and summate this information into a medicinal chemistry expert system. A method is proposed that first comprehensively mines a compendium of chemical structures compiling the structural modifications, abridges them to rectify artificially inflated support levels, and then performs an association rule mining experiment to ascribe relative confidences to each transformation. The result is a catalogue of statistically relevant structural modifications that can potentially be used in a number of pharmaceutical applications.
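The support/confidence machinery of association rule mining can be shown at toy scale. The transformation records below are invented for illustration; in the paper they would come from mining matched structure pairs across a large corpus.

```python
from collections import Counter

# Hypothetical log of observed structural modifications, recorded as
# (fragment_removed, fragment_added) pairs (invented data).
transforms = [
    ("H", "F"), ("H", "F"), ("H", "F"),
    ("H", "Cl"),
    ("OCH3", "OH"), ("OCH3", "OH"),
]

support = Counter(transforms)                    # how often each rule fired
left_side = Counter(lhs for lhs, _ in transforms)  # how often each LHS occurred

def confidence(lhs, rhs):
    """Association-rule confidence: of all modifications starting from
    `lhs`, what fraction replaced it with `rhs`?"""
    return support[(lhs, rhs)] / left_side[lhs]

print(confidence("H", "F"))  # 3 of the 4 H-starting modifications -> 0.75
```

Abridging duplicated structure pairs before counting, as the paper describes, would simply deduplicate `transforms` per source series so that one heavily resynthesized compound cannot inflate the support of its rule.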


Journal of Chemical Information and Computer Sciences | 2003

Similarity searching in databases of flexible 3D structures using smoothed bounded distance matrices

John W. Raymond; Peter Willett

This paper describes a method for calculating the similarity between pairs of chemical structures represented by 3D molecular graphs. The method is based on a graph matching procedure that accommodates conformational flexibility by using distance ranges between pairs of atoms, rather than fixing the atom pair distances. These distance ranges are generated using triangle and tetrangle bound smoothing techniques from distance geometry. The effectiveness of the proposed method in retrieving other compounds of like biological activity is evaluated, and the results are compared with those obtained from other, 2D-based methods for similarity searching.
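Triangle bound smoothing has a compact Floyd-Warshall-style form. The sketch below performs one smoothing pass over symmetric upper- and lower-bound matrices; full smoothing repeats the pass until no bound changes, and tetrangle smoothing (which the paper also uses) tightens bounds over atom quadruples and is omitted here.

```python
def triangle_smooth(upper, lower):
    """One pass of triangle-inequality bound smoothing over symmetric
    n x n distance-bound matrices. Updates both matrices in place."""
    n = len(upper)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                # Upper bound: two atoms can be no farther apart than any
                # two-leg path of upper bounds through atom k.
                if upper[i][j] > upper[i][k] + upper[k][j]:
                    upper[i][j] = upper[i][k] + upper[k][j]
                # Lower bound: two atoms must be at least as far apart as a
                # neighbour's lower bound minus the detour's upper bound.
                cand = max(lower[i][k] - upper[k][j],
                           lower[k][j] - upper[i][k])
                if lower[i][j] < cand:
                    lower[i][j] = cand
    return upper, lower

# Three atoms: a loose 0-2 range tightens from both sides via atom 1.
upper = [[0, 6, 10], [6, 0, 2], [10, 2, 0]]
lower = [[0, 5, 0], [5, 0, 1], [0, 1, 0]]
triangle_smooth(upper, lower)
print(upper[0][2], lower[0][2])  # 10 shrinks to 8; 0 rises to 3
```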


Journal of Chemical Information and Computer Sciences | 1999

Molecular Structure Disassembly Program (MOSDAP): A Chemical Information Model To Automate Structure-Based Physical Property Estimation

John W. Raymond; Tony N. Rogers

Chemical information theory and molecular structure searching have long been used as computational aids to researchers in the pharmaceutical field to estimate molecular structure-property relationships and to assist in drug design. Tailored to these and other specific applications, such endeavors have been expensive to develop and typically are very specialized. Often, they are not readily available and are not a part of the open literature. Because the number of chemicals in commercial use is growing daily (with over 18 million molecular species now catalogued by the Chemical Abstracts Service), there is a need among engineers in the chemical process industries for predictive structure-property algorithms. The most common and useful methods are those based on group contribution that require only the chemical structure of interest. Unfortunately, each group contribution method typically has its own fragment library and specialized rules, making such models difficult to automate for general use by the engineer...
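The group-contribution form that MOSDAP automates reduces to a base constant plus a sum of fragment increments. A minimal sketch; the increments below are illustrative placeholders in the style of Joback-type boiling-point tables, not values from any published method.

```python
# Hypothetical fragment increments (invented for illustration).
CONTRIB = {"CH3": 23.58, "CH2": 22.88, "OH": 92.88}

def estimate_property(fragments, base=198.0):
    """Group-contribution estimate: base constant plus one increment per
    fragment occurrence. `fragments` maps group name -> count."""
    return base + sum(CONTRIB[group] * n for group, n in fragments.items())

# An ethanol-like fragmentation, CH3-CH2-OH:
print(estimate_property({"CH3": 1, "CH2": 1, "OH": 1}))
```

The automation problem the paper addresses is the step before this sum: disassembling an arbitrary input structure into the fragment library of whichever group-contribution method is being applied.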


Journal of Chemical Information and Modeling | 2005

An automated method for exploring targeted substructural diversity within sets of chemical structures

John W. Raymond; Christopher E. Kibbey

Practicing medicinal chemists tend to treat a lead compound as an assemblage of its substructural parts. By iteratively confining their synthetic efforts in a localized fashion, they are able to systematically investigate how minor changes in certain portions of the molecule affect the properties of interest, in the logical expectation that the observed beneficial changes will be cumulative. One disadvantage to this approach arises when large amounts of structure data begin to accumulate, which is increasingly the case due to such developments as high-throughput screening, virtual screening, and combinatorial chemistry. How then does one interactively mine this diverse data consistent with the desired substructural template, so that desirable structural features can be discovered and interpreted, especially when they may not occur in the most active compounds due to structural deficiencies in other portions of the molecule? In this paper, we present an algorithm to automate this process, which has historically been performed in an ad hoc and manual fashion. Using the proposed method, significantly larger numbers of compounds can be analyzed, potentially discovering useful structural feature combinations that would not otherwise have been detected due to the sheer scale of modern structural and biological data collections.


WIT Transactions on Information and Communication Technologies | 2002

Use of graph theory for data mining in public health

Peter A. Bath; Cheryl Craigs; Ravi Maheswaran; John W. Raymond; Peter Willett

Data mining problems are common in public health, for example identifying disease clusters and multidimensional patterns within large databases, e.g. socioeconomic differentials in health. Although numerous data mining methods have been developed, currently available methods are not designed to handle complex pattern-searching queries, and no satisfactory methods are available for this purpose. The aim of the study reported here was to test graph-theoretical methods for data mining in public health databases, to identify areas of high deprivation that are surrounded by affluent areas and deprived areas surrounded by deprived areas. Graph theory (using the maximum common subgraph isomorphism (mcs) method) was used to search a database containing information on the 10920 enumeration districts (EDs) of the Trent Region of England. Each ED was allocated to a deprivation quintile based on the Townsend Deprivation Score. The mcs program was used to identify deprived EDs that are adjacent to deprived EDs and deprived EDs that are adjacent to affluent EDs. The mcs program identified 1528 deprived EDs adjacent to at least two deprived EDs, 1181 deprived EDs adjacent to at least three deprived EDs, 802 deprived EDs adjacent to at least four deprived EDs, and 505 deprived EDs adjacent to at least five deprived EDs. It also identified 147 deprived EDs adjacent to at least two affluent EDs, 54 deprived EDs adjacent to at least three affluent EDs, 14 deprived EDs adjacent to at least four affluent EDs, and six deprived EDs adjacent to at least five affluent EDs. The retrieved EDs were then used for hypothesis testing using statistical methods. The study demonstrates the potential of graph-theoretical techniques for data mining in public health databases.
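The adjacency queries reported in the study amount to counting neighbours by deprivation quintile in the ED adjacency graph. A minimal sketch on an invented four-ED graph (quintile 5 = most deprived, 1 = most affluent; all data below is made up for illustration):

```python
# Hypothetical ED adjacency list and deprivation quintiles.
adjacency = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["A", "C"],
}
quintile = {"A": 5, "B": 5, "C": 1, "D": 5}

def deprived_with_neighbours(kind, k):
    """Deprived EDs (quintile 5) adjacent to at least k neighbours of the
    given kind: 'deprived' (quintile 5) or 'affluent' (quintile 1)."""
    target = 5 if kind == "deprived" else 1
    return sorted(ed for ed, nbrs in adjacency.items()
                  if quintile[ed] == 5
                  and sum(quintile[n] == target for n in nbrs) >= k)

print(deprived_with_neighbours("deprived", 2))  # ['A']
print(deprived_with_neighbours("affluent", 1))  # ['A', 'B', 'D']
```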


Journal of Chemical Information and Computer Sciences | 2002

Heuristics for Similarity Searching of Chemical Graphs Using a Maximum Common Edge Subgraph Algorithm

John W. Raymond; Eleanor J. Gardiner; Peter Willett

Collaboration


Dive into John W. Raymond's collaboration.

Top Co-Authors

Tony N. Rogers

Michigan Technological University


Andrew Kline

Michigan Technological University
