Stuart Owen
University of Manchester
Publications
Featured research published by Stuart Owen.
Nucleic Acids Research | 2013
Katherine Wolstencroft; Robert Haines; Donal Fellows; Alan R. Williams; David Withers; Stuart Owen; Stian Soiland-Reyes; Ian Dunlop; Aleksandra Nenadic; Paul Fisher; Jiten Bhagat; Khalid Belhajjame; Finn Bacall; Alex Hardisty; Abraham Nieva de la Hidalga; Maria Paula Balcazar Vargas; Shoaib Sufi; Carole A. Goble
The Taverna workflow tool suite (http://www.taverna.org.uk) is designed to combine distributed Web Services and/or local tools into complex analysis pipelines. These pipelines can be executed on local desktop machines or through larger infrastructure (such as supercomputers, Grids or cloud environments), using the Taverna Server. In bioinformatics, Taverna workflows are typically used in the areas of high-throughput omics analyses (for example, proteomics or transcriptomics), or for evidence gathering methods involving text mining or data mining. Through Taverna, scientists have access to several thousand different tools and resources that are freely available from a large range of life science institutions. Once constructed, the workflows are reusable, executable bioinformatics protocols that can be shared, reused and repurposed. A repository of public workflows is available at http://www.myexperiment.org. This article provides an update to the Taverna tool suite, highlighting new features and developments in the workbench and the Taverna Server.
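To make the execution model above concrete, the following is a minimal sketch of submitting a workflow to a Taverna Server deployment over HTTP. The server URL, endpoint path, credentials and workflow filename are assumptions for illustration, not details taken from the article; consult the Taverna Server REST documentation for the actual interface.

```python
# Minimal sketch of creating a workflow run on a Taverna Server instance over HTTP.
# The server URL, endpoint path and credentials below are illustrative assumptions.
import requests

RUNS_ENDPOINT = "https://example.org/taverna-server/rest/runs"  # hypothetical deployment

def submit_workflow(t2flow_path, user, password):
    """POST a .t2flow workflow definition to the server to create a new run."""
    with open(t2flow_path, "rb") as fh:
        response = requests.post(
            RUNS_ENDPOINT,
            data=fh.read(),
            headers={"Content-Type": "application/vnd.taverna.t2flow+xml"},
            auth=(user, password),
        )
    response.raise_for_status()
    # Assumes the server returns the URI of the newly created run in the Location header.
    return response.headers.get("Location")

if __name__ == "__main__":
    run_uri = submit_workflow("proteomics_pipeline.t2flow", "alice", "secret")
    print("Created run:", run_uri)
```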
international conference on e-science | 2010
Sean Bechhofer; John Ainsworth; Jiten Bhagat; Iain Buchan; Philip A. Couch; Don Cruickshank; David De Roure; Mark Delderfield; Ian Dunlop; Matthew Gamble; Carole A. Goble; Danius T. Michaelides; Paolo Missier; Stuart Owen; David R. Newman; Shoaib Sufi
Scientific data stands to represent a significant portion of the linked open data cloud, and science itself stands to benefit from the data fusion capability that this will afford. However, simply publishing linked data into the cloud does not necessarily meet the requirements of reuse. Publication carries requirements of provenance, quality, credit, attribution and methods in order to provide the reproducibility that allows validation of results. In this paper we make the case for a scientific data publication model on top of linked data and introduce the notion of Research Objects as first-class citizens for sharing and publishing.
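As an illustration of the idea, the sketch below describes a Research Object as linked data, aggregating a dataset and a workflow together with attribution and provenance statements. The URIs and the choice of Dublin Core and OAI-ORE terms are assumptions made here for illustration, not the paper's concrete model.

```python
# Sketch of describing a Research Object as linked data: an aggregation of a dataset
# and a workflow, annotated with attribution and provenance. URIs are hypothetical.
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import DCTERMS, RDF

ORE = Namespace("http://www.openarchives.org/ore/terms/")

g = Graph()
ro = URIRef("http://example.org/ro/experiment-42")        # hypothetical RO identifier
dataset = URIRef("http://example.org/data/results.csv")   # hypothetical resources
workflow = URIRef("http://example.org/workflows/analysis")

g.add((ro, RDF.type, ORE.Aggregation))                    # the RO aggregates its parts
g.add((ro, ORE.aggregates, dataset))
g.add((ro, ORE.aggregates, workflow))
g.add((dataset, DCTERMS.creator, Literal("A. Researcher")))                  # attribution
g.add((dataset, DCTERMS.provenance, Literal("Derived from workflow run 2010-06-01")))

print(g.serialize(format="turtle"))
```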
BMC Bioinformatics | 2008
Peter Li; Juan I. Castrillo; Giles Velarde; I. Wassink; Stian Soiland-Reyes; Stuart Owen; David Withers; Tom Oinn; Matthew Pocock; Carole A. Goble; Stephen G. Oliver; Douglas B. Kell
Background: There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the computational tools employed to analyse such data. For example, tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic, since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools.
Results: Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially expressed genes from microarray data, followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially expressed genes, using the Taverna RShell processor, which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services, as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV-formatted data within the Taverna workbench.
Conclusion: Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists who want to use tools such as R to analyse their own data without having to learn the corresponding programming language.
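The "R as a service" pattern underpinning the workflow can be sketched outside Taverna as well: a client evaluates R code on a running Rserve instance, much as the RShell processor does within a pipeline. The host, port, variable names and toy expression below are illustrative assumptions, not the workflow's actual R script.

```python
# Sketch of evaluating R code on a running Rserve instance from a client, analogous
# to Taverna's RShell processor. Host, port and the toy data are assumptions.
import pyRserve  # client library for the Rserve protocol

conn = pyRserve.connect(host="localhost", port=6311)  # default Rserve port
try:
    # Push a small expression vector and group labels, then ask R for a t-test p-value.
    conn.r.vals = [1.2, 1.4, 1.1, 2.8, 3.0, 2.9]
    conn.r.grp = [0, 0, 0, 1, 1, 1]
    p_value = conn.eval("t.test(vals[grp == 0], vals[grp == 1])$p.value")
    print("p-value:", p_value)
finally:
    conn.close()
```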
BMC Bioinformatics | 2011
Simon Jupp; Matthew Horridge; Luigi Iannone; Julie Klein; Stuart Owen; Joost P. Schanstra; Katy Wolstencroft; Robert D. Stevens
Background: Ontologies are being developed for the life sciences to standardise the way we describe and interpret the wealth of data currently being generated. As more ontology-based applications begin to emerge, tools are required that enable domain experts to contribute their knowledge to the growing pool of ontologies. Many barriers prevent domain experts from engaging in the ontology development process, and novel tools are needed to break down these barriers and engage a wider community of scientists.
Results: We present Populous, a tool for gathering content with which to construct an ontology. Domain experts need to add content that is often repetitive in form, without having to tackle the underlying ontological representation. Populous presents users with a table-based form in which columns are constrained to take values from particular ontologies. Populated tables are mapped to patterns that can then be used to automatically generate the ontology's content. These forms can be exported as spreadsheets, providing an interface that is much more familiar to many biologists.
Conclusions: Populous's contribution is in the knowledge-gathering stage of ontology development; it separates knowledge gathering from conceptualisation and axiomatisation, as well as separating the user from the standard ontology authoring environments. Populous is by no means a replacement for standard ontology editing tools, but instead provides a useful platform for engaging a wider community of scientists in the mass production of ontology content.
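The row-to-pattern idea can be illustrated with a small sketch: each populated table row is expanded into a simple OWL axiom, here "CellType subClassOf (part_of some Tissue)". The row data, term URIs and the part_of property are hypothetical and stand in for whatever pattern a Populous template would apply.

```python
# Sketch of expanding populated spreadsheet rows into OWL axioms following a fixed
# pattern: CellType subClassOf (part_of some Tissue). Rows and URIs are hypothetical.
from rdflib import Graph, Namespace, BNode
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/onto#")
rows = [("Cardiomyocyte", "Heart"), ("Hepatocyte", "Liver")]  # populated table rows

g = Graph()
for cell, tissue in rows:
    cell_cls, tissue_cls = EX[cell], EX[tissue]
    g.add((cell_cls, RDF.type, OWL.Class))
    g.add((tissue_cls, RDF.type, OWL.Class))
    # Expand the pattern as an existential restriction on the part_of property.
    restriction = BNode()
    g.add((restriction, RDF.type, OWL.Restriction))
    g.add((restriction, OWL.onProperty, EX.part_of))
    g.add((restriction, OWL.someValuesFrom, tissue_cls))
    g.add((cell_cls, RDFS.subClassOf, restriction))

print(g.serialize(format="turtle"))
```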
international semantic web conference | 2013
Katherine Wolstencroft; Stuart Owen; Olga Krebs; Wolfgang Mueller; Quyen Nguyen; Jacky L. Snoep; Carole A. Goble
Research in Systems Biology involves integrating data and knowledge about the dynamic processes in biological systems in order to understand and model them. Semantic web technologies should be ideal for exploring the complex networks of genes, proteins and metabolites that interact, but much of this data is not natively available to the semantic web. Data is typically collected and stored with free-text annotations in spreadsheets, many of which do not conform to existing metadata standards and are often not publicly released. Along with initiatives to promote more data sharing, one of the main challenges is therefore to semantically annotate and extract this data so that it is available to the research community. Data annotation and curation are expensive and undervalued tasks that have enormous benefits to the discipline as a whole, but fewer benefits to the individual data producers. By embedding semantic annotation into spreadsheets, however, and automatically extracting this data into RDF at the time of repository submission, the process of producing standards-compliant data that is available for semantic web querying can be achieved without adding overheads to laboratory data management. This paper describes these strategies in the context of semantic data management in the SEEK. The SEEK is a web-based resource for sharing and exchanging Systems Biology data and models that is underpinned by the JERM ontology (Just Enough Results Model), which describes the relationships between data, models, protocols and experiments. The SEEK was originally developed for SysMO, a large European Systems Biology consortium studying micro-organisms, but it has since seen widespread adoption across European Systems Biology.
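The extract-at-submission step can be sketched as follows: annotations embedded alongside values in a spreadsheet are read out and turned into RDF triples. The workbook layout, column convention and JERM-style property URIs are assumptions made for illustration, not the actual SEEK or RightField storage format.

```python
# Sketch of extracting embedded semantic annotations from a spreadsheet into RDF at
# submission time. Workbook layout and the JERM-style namespace are hypothetical.
from openpyxl import load_workbook
from rdflib import Graph, Namespace, URIRef, Literal

JERM = Namespace("http://example.org/jerm#")   # hypothetical namespace
wb = load_workbook("annotated_assay.xlsx")     # hypothetical annotation-enriched file
sheet = wb.active

g = Graph()
# Assume column A holds the measured value and column B the ontology term URI chosen
# from the embedded, constrained vocabulary.
for value_cell, term_cell in sheet.iter_rows(min_row=2, max_col=2):
    if value_cell.value is None or term_cell.value is None:
        continue
    sample = URIRef(f"http://example.org/sample/{value_cell.row}")
    g.add((sample, JERM.measuredValue, Literal(value_cell.value)))
    g.add((sample, JERM.annotatedWith, URIRef(term_cell.value)))

print(g.serialize(format="turtle"))
```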
knowledge acquisition, modeling and management | 2012
Katy Wolstencroft; Stuart Owen; Matthew Horridge; Wolfgang Mueller; Finn Bacall; Jacky L. Snoep; Franco B. du Preez; Quyen Nguyen; Olga Krebs; Carole A. Goble
RightField is a Java application that provides a mechanism for embedding ontology annotation support for scientific data in Microsoft Excel or Open Office spreadsheets. The result is semantic annotation by stealth, with an annotation process that is less error-prone, more efficient, and more consistent with community standards. By automatically generating RDF statements for each cell, RightField provides a rich Linked Data querying environment that allows scientists to search their data and other Linked Data resources interchangeably, and caters for queries across heterogeneous spreadsheets. RightField was developed for Systems Biologists but has since been adopted more widely. It is open source (BSD license) and freely available from http://www.rightfield.org.uk.
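A hedged sketch of the querying side: once cell-level RDF has been extracted from several spreadsheets, it can be pooled and searched with a single SPARQL query. The Turtle files, the example GO term and the property names mirror the hypothetical vocabulary used in the extraction sketch above, not RightField's exact output.

```python
# Sketch of querying cell-level RDF extracted from several spreadsheets together.
# File names and property names are hypothetical, matching the earlier sketch.
from rdflib import Graph

g = Graph()
for path in ["assay1.ttl", "assay2.ttl"]:      # hypothetical extracted graphs
    g.parse(path, format="turtle")

query = """
PREFIX jerm: <http://example.org/jerm#>
SELECT ?sample ?value WHERE {
    ?sample jerm:annotatedWith <http://purl.obolibrary.org/obo/GO_0006096> ;
            jerm:measuredValue ?value .
}
"""
for sample, value in g.query(query):
    print(sample, value)
```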
international conference on e-science | 2012
Katherine Wolstencroft; Stuart Owen; Carole A. Goble; Quyen Nguyen; Olga Krebs; Wolfgang Müller
The interpretation and integration of experimental data depends on consistent metadata and uniform annotation. However, there are many barriers to the acquisition of this rich semantic metadata, not least the overhead and complexity of its collection by scientists. We present RightField, a lightweight spreadsheet-based annotation tool for lowering the barrier of manual metadata acquisition; and a data integration application for extracting and querying RDF data from these enriched spreadsheets. By hiding the complexities of semantic annotation, we can improve the collection of rich metadata, at source, by scientists. We illustrate the approach with results from the SysMO program, showing that RightField supports the whole workflow of semantic data collection, submission and RDF querying in Systems Biology. The RightField tool is freely available from http://www.rightfield.org.uk, and the code is open source under the BSD License.
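The general mechanism of annotation at source can be sketched in a few lines: a spreadsheet column is constrained so that users can only pick from a fixed list of ontology term labels. This is a simplification of what RightField embeds (which also records term URIs and source ontologies); the term list below is illustrative.

```python
# Sketch of constraining an annotation column to a fixed list of ontology term
# labels via a data-validation dropdown. A simplification of RightField's approach.
from openpyxl import Workbook
from openpyxl.worksheet.datavalidation import DataValidation

wb = Workbook()
ws = wb.active
ws["A1"] = "Organism part"

# Allowed labels, e.g. drawn from an anatomy ontology (illustrative values).
dv = DataValidation(type="list", formula1='"heart,liver,kidney"', allow_blank=True)
ws.add_data_validation(dv)
dv.add("A2:A100")          # apply the dropdown to the annotation column

wb.save("annotation_template.xlsx")
```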
european conference on parallel processing | 2013
Katherine Wolstencroft; Stuart Owen; Matthew Horridge; Simon Jupp; Olga Krebs; Jacky L. Snoep; Franco B. du Preez; Wolfgang Mueller; Robert Stevens; Carole A. Goble
The increase in volume and complexity of biological data has led to increased requirements to reuse that data. Consistent and accurate metadata is essential for this task, creating new challenges in semantic data annotation and in the construction of the terminologies and ontologies used for annotation. The BioSharing community are developing standards and terminologies for annotation, which have been adopted across bioinformatics, but the real challenge is to make these standards accessible to laboratory scientists. Widespread adoption requires the provision of tools to assist scientists whilst reducing the complexities of working with semantics. This paper describes unobtrusive ‘stealthy’ methods for collecting standards-compliant, semantically annotated data and for contributing to the ontologies used for those annotations. Spreadsheets are ubiquitous in laboratory data management. Our spreadsheet-based RightField tool enables scientists to structure information and select ontology terms for annotation within spreadsheets, producing high-quality, consistent data without changing common working practices. Furthermore, our Populous spreadsheet tool proves effective for gathering domain knowledge in the form of Web Ontology Language (OWL) ontologies. Such a corpus of structured and semantically enriched knowledge can be extracted as Resource Description Framework (RDF), providing further means for searching across the content and contributing to Open Linked Data (http://linkeddata.org/).
Future Generation Computer Systems | 2013
Sean Bechhofer; Iain Buchan; David De Roure; Paolo Missier; John Ainsworth; Jiten Bhagat; Philip A. Couch; Don Cruickshank; Mark Delderfield; Ian Dunlop; Matthew Gamble; Danius T. Michaelides; Stuart Owen; David R. Newman; Shoaib Sufi; Carole A. Goble
Proceedings of the 22nd International Conference on Scientific and Statistical Database Management (SSDBM'10) | 2010
Paolo Missier; Stian Soiland-Reyes; Stuart Owen; Wei Tan; Alexandra Nenadic; Ian Dunlop; Alan R. Williams; Tom Oinn; Carole A. Goble