Tom Oinn
European Bioinformatics Institute
Publication
Featured research published by Tom Oinn.
Bioinformatics | 2004
Tom Oinn; Matthew Addis; Justin Ferris; Darren Marvin; Martin Senger; R. Mark Greenwood; Tim Carver; Kevin Glover; Matthew Pocock; Anil Wipat; Peter Li
MOTIVATION: In silico experiments in bioinformatics involve the co-ordinated use of computational tools and information repositories. A growing number of these resources are being made available with programmatic access in the form of Web services. Bioinformatics scientists will need to orchestrate these Web services in workflows as part of their analyses. RESULTS: The Taverna project has developed a tool for the composition and enactment of bioinformatics workflows for the life sciences community. The tool includes a workbench application which provides a graphical user interface for the composition of workflows. These workflows are written in a new language called the simple conceptual unified flow language (Scufl), whereby each step within a workflow represents one atomic task. Two examples are used to illustrate the ease with which in silico experiments can be represented as Scufl workflows using the workbench application.
Nucleic Acids Research | 2006
Duncan Hull; Katy Wolstencroft; Robert Stevens; Carole A. Goble; Matthew Pocock; Peter Li; Tom Oinn
Taverna is an application that eases the use and integration of the growing number of molecular biology tools and databases available on the web, especially web services. It allows bioinformaticians to construct workflows or pipelines of services to perform a range of different analyses, such as sequence analysis and genome annotation. These high-level workflows can integrate many different resources into a single analysis. Taverna is available freely under the terms of the GNU Lesser General Public License (LGPL) from .
In: Workflows for e-Science, Scientific Workflows for Grids. Springer-Verlag London Ltd; 2006. | 2007
Tom Oinn; Peter Li; Douglas B. Kell; Carole A. Goble; Antoon Goderis; Mark Greenwood; Duncan Hull; Robert Stevens; Daniele Turi; Jun Zhao
Bioinformatics is a discipline that uses computational and mathematical techniques to store, manage, and analyze biological data in order to answer biological questions. Bioinformatics has over 850 databases [154] and numerous tools that work over those databases and local data to produce even more data themselves. In order to perform an analysis, a bioinformatician uses one or more of these resources to gather, filter, and transform data to answer a question. Thus, bioinformatics is an in silico science.
international conference on e science | 2007
Daniele Turi; Paolo Missier; Carole A. Goble; David De Roure; Tom Oinn
This paper presents the formal syntax and the operational semantics of Taverna, a workflow management system with a large user base among the e-Science community. Such a formal foundation, which has so far been lacking, opens the way to translation between Taverna workflows and other process models. In particular, the ability to automatically compile a simple domain-specific process description into Taverna facilitates its adoption by e-scientists who are not expert workflow developers. We demonstrate this potential through a practical use case.
cluster computing and the grid | 2003
Luc Moreau; Simon Miles; Carole A. Goble; R. Mark Greenwood; Vijay Dialani; Matthew Addis; M. Nedim Alpdemir; Rich Cawley; David De Roure; Justin Ferris; Robert J. Gaizauskas; Kevin Glover; Chris Greenhalgh; Peter Li; Xiaojian Liu; Phillip Lord; Michael Luck; Darren Marvin; Tom Oinn; Norman W. Paton; Steve Pettifer; Milena Radenkovic; Angus Roberts; Alan Robinson; Tom Rodden; Martin Senger; Nick Sharman; Robert Stevens; Brian Warboys; Anil Wipat
myGrid is an e-Science Grid project that aims to help biologists and bioinformaticians to perform workflow-based in silico experiments, and to help them automate the management of such workflows through personalisation, notification of change and publication of experiments. In this paper, we describe the architecture of myGrid and how it will be used by the scientist. We then show how myGrid can benefit from agent technologies. We have identified three key uses of agent technologies in myGrid: user agents, able to customize and personalise data; agent communication languages, offering a generic and portable communication medium; and negotiation, allowing multiple distributed entities to reach service level agreements.
BMC Bioinformatics | 2008
Peter Li; Juan I. Castrillo; Giles Velarde; I. Wassink; Stian Soiland-Reyes; Stuart Owen; David Withers; Tom Oinn; Matthew Pocock; Carole A. Goble; Stephen G. Oliver; Douglas B. Kell
Background: There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools. Results: Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench. Conclusion: Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data.
international world wide web conferences | 2004
Tom Oinn; Matthew Addis; Justin Ferris; Darren Marvin; R. Mark Greenwood; Carole A. Goble; Anil Wipat; Peter Li; Tim Carver
As web service technology matures there is growing interest in exploiting workflow techniques to coordinate web services. Bioinformaticians are a user community who combine web resources to perform in silico experiments. These users are scientists and not information technology experts; they require workflow solutions that have a low cost of entry for service users and providers. Problems satisfying these requirements with current techniques led to the development of the Simple conceptual unified flow language (Scufl). Scufl is supported by the Freefluo enactment engine [1] and the Taverna editing workbench [3]. The extensibility of Scufl, supported by these tools, means that workflows coordinating web services can be matched to how users view their problems. The Taverna workbench exploits the web to keep Scufl simple by retrieving detail from URIs when required, and by scavenging the web for services. Scufl and its tools are not bioinformatics specific. They can be exploited by other communities who require user-driven composition and execution of workflows coordinating web resources.
Bioinformatics | 2008
Peter Li; Tom Oinn; Stian Soiland; Douglas B. Kell
Many data manipulation processes involve the use of programming libraries. These processes may beneficially be automated due to their repeated use. A convenient type of automation is in the form of workflows that also allow such processes to be shared amongst the community. The Taverna workflow system has been extended to enable it to use and invoke Java classes and methods as tasks within Taverna workflows. These classes and methods are selected for use during workflow construction by a Java Doclet application called the API Consumer. This selection is stored as an XML file which enables Taverna to present the subset of the API for use in the composition of workflows. The ability of Taverna to invoke Java classes and methods is demonstrated by a workflow in which we use libSBML to map gene expression data onto a metabolic pathway represented as an SBML model. AVAILABILITY: Taverna and the API Consumer application can be freely downloaded from http://taverna.sourceforge.net
Bioinformatics | 2008
Anders Lanzén; Tom Oinn
Taverna is an application that eases the integration of tools and databases for life science research by the construction of workflows. The Taverna Interaction Service extends the functionality of Taverna by defining human interaction within a workflow and acting as a mediation layer between the automated workflow engine and one or more users. AVAILABILITY: Taverna, the Interaction Service plug-in and web application are available as open source and can be downloaded from http://taverna.sourceforge.net/
cluster computing and the grid | 2008
Khalid Belhajjame; Katy Wolstencroft; Oscar Corcho; Tom Oinn; Franck Tanoh; Alan William; Carole A. Goble
There seems to be a general consensus on the crucial role metadata can play in enhancing the functionalities of scientific workflow systems, e.g., workflow and service discovery, composition and provenance browsing, among others. However, in most cases their management is under-specified, if not left unaddressed altogether. As a step in this direction, the main contribution of the work presented in this paper is an overview of metadata and their management in the Taverna workflow system. In Taverna, we consider metadata to be a first-class citizen in the system, in the sense that we fully cover their life cycle from their creation, through their use and curation, until their eventual removal. We present the main steps of this cycle and the models used for metadata specification. In doing so, we distinguish two classes of metadata: metadata that describe workflow-related entities, such as services, workflows and sub-workflows, and metadata that describe workflow executions, also known as workflow provenance.