Joe Townsend
University of Cambridge
Publication
Featured research published by Joe Townsend.
Organic and Biomolecular Chemistry | 2004
Joe Townsend; Sam Adams; Christopher A. Waudby; Vanessa K. de Souza; Jonathan M. Goodman; Peter Murray-Rust
Automatically extracting chemical information from documents is a challenging task, but an essential one for dealing with the vast quantity of data that is available. The task is least difficult for structured documents, such as chemistry department web pages or the output of computational chemistry programs, but requires increasingly sophisticated approaches for less structured documents, such as chemical papers. The identification of key units of information, such as chemical names, makes the extraction of useful information from unstructured documents possible.
Journal of Cheminformatics | 2012
Weerapong Phadungsukanan; Markus Kraft; Joe Townsend; Peter Murray-Rust
This paper introduces a subdomain chemistry format for storing computational chemistry data called CompChem. It has been developed based on the design, concepts and methodologies of Chemical Markup Language (CML) by adding computational chemistry semantics on top of the CML Schema. The format allows a wide range of ab initio quantum chemistry calculations of individual molecules to be stored. These calculations include, for example, single point energy calculation, molecular geometry optimization, and vibrational frequency analysis. The paper also describes the supporting infrastructure, such as processing software, dictionaries, validation tools and database repositories. In addition, some of the challenges and difficulties in developing common computational chemistry dictionaries are discussed. The uses of CompChem are illustrated by two practical applications.
Journal of Chemical Information and Modeling | 2010
Jim Downing; M. J. Harvey; Peter Morgan; Peter Murray-Rust; Henry S. Rzepa; Diana Stewart; Alan P. Tonge; Joe Townsend
The SPECTRa-T project has developed text-mining tools to extract named chemical entities (NCEs), such as chemical names and terms, and chemical objects (COs), e.g., experimental spectral assignments and physical chemistry properties, from electronic theses (e-theses). Although NCEs were readily identified within the two major document formats studied, only the use of structured documents enabled identification of chemical objects and their association with the relevant chemical entity (e.g., systematic chemical name). A corpus of theses was analyzed and it is shown that a high degree of semantic information can be extracted from structured documents. This integrated information has been deposited in a persistent Resource Description Framework (RDF) triple-store that allows users to conduct semantic searches. The strengths and weaknesses of several document formats are reviewed.
Journal of Cheminformatics | 2011
Brian Brooks; A. L. Thorn; Matthew E. Smith; Peter D. Matthews; Shaoming Chen; Ben O'Steen; Sam Adams; Joe Townsend; Peter Murray-Rust
The Ami project was a six-month Rapid Innovation project sponsored by JISC to explore the Virtual Research Environment space. The project brainstormed with chemists and decided to investigate ways to facilitate monitoring and collection of experimental data. A frequently encountered use-case was identified of how the chemist reaches the end of an experiment, but finds an unexpected result. The ability to replay events can significantly help make sense of how things progressed. The project therefore concentrated on collecting a variety of dimensions of ancillary data - data that would not normally be collected due to practicality constraints. There were three main areas of investigation: 1) Development of a monitoring tool using infrared and ultrasonic sensors; 2) Time-lapse motion video capture (for example, videoing 5 seconds in every 60); and 3) Activity-driven video monitoring of the fume cupboard environs. The Ami client application was developed to control these separate logging functions. The application builds up a timeline of the events in the experiment and around the fume cupboard. The videos and data logs can then be reviewed after the experiment in order to help the chemist determine the exact timings and conditions used. The project experimented with ways in which a Microsoft Kinect could be used in a laboratory setting. Investigations suggest that it would not be an ideal device for controlling a mouse, but it shows promise for usages such as manipulating virtual molecules.
Journal of Cheminformatics | 2011
Peter Murray-Rust; Sam Adams; Jim Downing; Joe Townsend; Yong Zhang
The World-Wide Molecular Matrix (WWMM) is a ten-year project to create a peer-to-peer (P2P) system for the publication and collection of chemical objects, including over 250,000 molecules. It has now been instantiated in a number of repositories which include data encoded in Chemical Markup Language (CML) and linked by URIs and RDF. The technical specification and implementation are now complete. We discuss the types of architecture required to implement nodes in the WWMM and consider the social issues involved in adoption.
Journal of Chemical Information and Modeling | 2012
Joe Townsend; Robert C. Glen; Hamse Y. Mussa
A plethora of articles on naive Bayes classifiers, where the chemical compounds to be classified are represented by binary-valued (absent or present type) descriptors, have appeared in the cheminformatics literature over the past decade. The principal goal of this paper is to describe how a naive Bayes classifier based on binary descriptors (NBCBBD) can be employed as a feature selector in an efficient manner suitable for cheminformatics. In the process, we point out a fact well documented in other disciplines that NBCBBD is a linear classifier and is therefore intrinsically suboptimal for classifying compounds that are nonlinearly separable in their binary descriptor space. We investigate the performance of the proposed algorithm on classifying a subset of the MDDR data set, a standard molecular benchmark data set, into active and inactive compounds.
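The linearity claim in the abstract above can be checked numerically. The following is a minimal sketch, not the paper's algorithm: it assumes a two-class Bernoulli naive Bayes model with Laplace smoothing, and the function names and toy data are hypothetical. It shows that the log-odds computed from the naive Bayes product form equal a weighted sum w . x + b over the binary descriptors.

```python
import math
from itertools import product

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and P(x_j = 1 | class) with Laplace smoothing."""
    n_features = len(X[0])
    params = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # Smoothing keeps every probability strictly between 0 and 1.
        p = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
             for j in range(n_features)]
        params[c] = (prior, p)
    return params

def log_odds_direct(params, x):
    """log P(class 1, x) - log P(class 0, x) from the naive Bayes product form."""
    def log_joint(c):
        prior, p = params[c]
        return math.log(prior) + sum(
            math.log(pj) if xj else math.log(1 - pj) for pj, xj in zip(p, x))
    return log_joint(1) - log_joint(0)

def linear_form(params):
    """Rewrite the same log-odds as a weight vector w and a bias b."""
    prior0, p0 = params[0]
    prior1, p1 = params[1]
    w = [math.log(q1 / (1 - q1)) - math.log(q0 / (1 - q0))
         for q1, q0 in zip(p1, p0)]
    b = math.log(prior1 / prior0) + sum(
        math.log((1 - q1) / (1 - q0)) for q1, q0 in zip(p1, p0))
    return w, b

# Hypothetical toy data: 3 binary descriptors, label 1 = active, 0 = inactive.
X = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
y = [1, 1, 0, 0]

params = fit_bernoulli_nb(X, y)
w, b = linear_form(params)

# The two forms agree on every possible binary descriptor vector.
max_gap = max(
    abs(log_odds_direct(params, x) - (sum(wj * xj for wj, xj in zip(w, x)) + b))
    for x in product((0, 1), repeat=3))
```

Because the decision rule is the sign of w . x + b, the decision boundary is a hyperplane in descriptor space, which is why such a classifier cannot separate classes that are only nonlinearly separable.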
Archive | 2016
Tom Adeyoola; Nick Brown; Nikki Trott; Edward Herbert; Duncan Robertson; Jim Downing; Nicholas E. Day; Robert Boland; Tom Boucher; Joe Townsend; Edward Clay; Tom Warren; Anoop Unadkat; Yu Chen
Archive | 2012
Weerapong Phadungsukanan; Markus Kraft; Joe Townsend; Peter Murray-Rust
Archive | 2015
Yu Chen; Robert Boland; Jim Downing; Ray Miller; Gareth Rogers; Joe Townsend
Archive | 2017
Yu Chen; Dongjoe Shin; Joe Townsend; Jim Downing; Duncan Robertson; Tom Adeyoola