Alan R. Williams
University of Manchester
Publications
Featured research published by Alan R. Williams.
Nucleic Acids Research | 2013
Katherine Wolstencroft; Robert Haines; Donal Fellows; Alan R. Williams; David Withers; Stuart Owen; Stian Soiland-Reyes; Ian Dunlop; Aleksandra Nenadic; Paul Fisher; Jiten Bhagat; Khalid Belhajjame; Finn Bacall; Alex Hardisty; Abraham Nieva de la Hidalga; Maria Paula Balcazar Vargas; Shoaib Sufi; Carole A. Goble
The Taverna workflow tool suite (http://www.taverna.org.uk) is designed to combine distributed Web Services and/or local tools into complex analysis pipelines. These pipelines can be executed on local desktop machines or through larger infrastructure (such as supercomputers, Grids or cloud environments), using the Taverna Server. In bioinformatics, Taverna workflows are typically used in the areas of high-throughput omics analyses (for example, proteomics or transcriptomics), or for evidence gathering methods involving text mining or data mining. Through Taverna, scientists have access to several thousand different tools and resources that are freely available from a large range of life science institutions. Once constructed, the workflows are reusable, executable bioinformatics protocols that can be shared, reused and repurposed. A repository of public workflows is available at http://www.myexperiment.org. This article provides an update to the Taverna tool suite, highlighting new features and developments in the workbench and the Taverna Server.
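As an illustration of driving the Taverna Server mentioned above, the following minimal Python sketch submits a workflow and starts a run. The endpoint paths and media type follow the Taverna Server 2 REST interface, but the server URL and workflow file are placeholders; verify the details against your deployment's documentation.

import requests

SERVER = "https://example.org/taverna-server/rest"  # hypothetical deployment

# Create a run by POSTing a t2flow workflow document.
with open("pipeline.t2flow", "rb") as f:
    resp = requests.post(
        SERVER + "/runs",
        data=f.read(),
        headers={"Content-Type": "application/vnd.taverna.t2flow+xml"},
    )
resp.raise_for_status()
run_uri = resp.headers["Location"]  # URI of the newly created run

# Setting the run's status to Operating asks the server to start execution.
requests.put(run_uri + "/status", data="Operating",
             headers={"Content-Type": "text/plain"})
print("started run:", run_uri)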
BMC Bioinformatics | 2010
Steffen Möller; Hajo N. Krabbenhöft; Andreas Tille; David Paleino; Alan R. Williams; Katy Wolstencroft; Carole A. Goble; Richard Holland; Dominique Belhachemi; Charles Plessy
Background: The Open Source movement and its technologies are popular in the bioinformatics community because they provide freely available tools and resources for research. In order to feed the steady demand for updates on software and associated data, a service infrastructure is required for sharing and providing these tools to heterogeneous computing environments. Results: The Debian Med initiative provides ready and coherent software packages for medical informatics and bioinformatics. These packages can be used together in Taverna workflows via the UseCase plugin to manage execution on local or remote machines. If such packages are available in cloud computing environments, the underlying hardware and the analysis pipelines can be shared along with the software. Conclusions: Debian Med closes the gap between developers and users. It provides a simple method for offering new releases of software and data resources, thus provisioning a local infrastructure for computational biology. For geographically distributed teams it can ensure that they are working on the same versions of tools, under the same conditions. This contributes to the world-wide networking of researchers.
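One practical consequence described above, distributed teams working with identical tool versions, can be checked mechanically. A small sketch follows; the package names are illustrative examples of Debian-packaged bioinformatics tools, and dpkg-query is Debian's standard tool for querying installed package versions.

import subprocess

# Illustrative Debian package names; substitute the tools your team uses.
packages = ["ncbi-blast+", "emboss"]

for pkg in packages:
    # dpkg-query reports the installed version of a Debian package.
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Version}", pkg],
        capture_output=True, text=True,
    )
    version = result.stdout.strip() if result.returncode == 0 else "not installed"
    print(pkg + ": " + version)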
Biodiversity Data Journal | 2014
Cherian Mathew; Anton Güntsch; Matthias Obst; Saverio Vicario; Robert Haines; Alan R. Williams; Yde de Jong; Carole A. Goble
The compilation and cleaning of data needed for analyses and prediction of species distributions is a time-consuming process requiring a solid understanding of data formats and service APIs provided by biodiversity informatics infrastructures. We designed and implemented a Taverna-based Data Refinement Workflow which integrates taxonomic data retrieval, data cleaning, and data selection into a consistent, standards-based, and effective system hiding the complexity of underlying service infrastructures. The workflow can be freely used both locally and through a web-portal which does not require additional software installations by users.
international conference on digital mammography | 2006
Chris Rose; Daniele Turi; Alan R. Williams; Katy Wolstencroft; Christopher J. Taylor
The Digital Database for Screening Mammography (DDSM) is an invaluable resource for digital mammography research. However, there are two particular shortcomings that can pose a significant barrier to many of those who may want to use the resource: 1) the actual mammographic image data is encoded using a non-standard lossless variant of the JPEG image format; 2) although detailed metadata is provided, it is not in a form that permits it to be searched, manipulated or reasoned over by standard tools. This paper describes web services that will allow both humans and computers to query for, and obtain, mammograms from the DDSM in a standard and well-supported image file format. Further, this paper describes how these and other services can be used within grid-based workflows, allowing digital mammography researchers to make use of distributed computing facilities.
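The services described take a query and return mammograms in a standard, well-supported image format. Below is a hedged sketch of such a call; the service URL, parameter names and identifiers are hypothetical placeholders, not the paper's actual interface.

import requests

SERVICE = "https://example.org/ddsm/image"  # hypothetical service URL

# Request a mammogram decoded from DDSM's non-standard lossless JPEG
# variant and converted to a standard format (parameters are illustrative).
resp = requests.get(SERVICE, params={
    "case": "case0001",   # hypothetical case identifier
    "view": "LEFT_CC",    # hypothetical view name
    "format": "png",      # ask for a standard, well-supported format
})
resp.raise_for_status()
with open("case0001_left_cc.png", "wb") as out:
    out.write(resp.content)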
PLOS Biology | 2017
Julie McMurry; Nick Juty; Niklas Blomberg; Tony Burdett; Tom Conlin; Nathalie Conte; Mélanie Courtot; John Deck; Michel Dumontier; Donal Fellows; Alejandra Gonzalez-Beltran; Philipp Gormanns; Jeffrey S. Grethe; Janna Hastings; Jean-Karim Hériché; Henning Hermjakob; Jon Ison; Rafael C. Jimenez; Simon Jupp; John Kunze; Camille Laibe; Nicolas Le Novère; James Malone; María Martín; Johanna McEntyre; Chris Morris; Juha Muilu; Wolfgang Müller; Philippe Rocca-Serra; Susanna-Assunta Sansone
In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.
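One of the practices the paper emphasises, web-resolvability, is easy to test programmatically. The sketch below uses identifiers.org, a resolver for compact identifiers (not named in this abstract), with an illustrative accession.

import requests

# A compact identifier of the form prefix:accession; this one is illustrative.
compact_id = "uniprot:P12345"

# identifiers.org redirects a compact identifier to its landing page,
# so following redirects and checking the status tests resolvability.
resp = requests.get("https://identifiers.org/" + compact_id,
                    allow_redirects=True, timeout=30)
print(compact_id, "->", resp.url, resp.status_code)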
Computer-aided Design | 2001
Hilary J. Kahn; Nick Filer; Alan R. Williams; Nigel A. Whitaker
The ability to share information on an enterprise-wide basis is a key requirement for large manufacturing and processing industries such as automotive, aerospace and oil processing companies. The Standard for the Exchange of Product Model data (STEP-ISO 10303) addresses this through formats and programming interfaces derived directly from domain-related information models written in the EXPRESS information modelling language. However, these formats and programming interfaces are predetermined, and not always well suited to current information processing technologies. In this paper, a framework for manipulating EXPRESS models is described. The goal is to retain the STEP concept of the direct mapping of an information model to an implementation, but to do so in a way that enables alternative implementation strategies to be adopted. This system, called STEPWISE, allows the user to specify manipulations and model transformations in order to convert models from one form into another. This might, for example, involve conversion from a conceptual model to an implementation-level data model, creating a subset of a model, or even adding concepts to a low-level model that represents a legacy data format. The transformations are carried out on the models in such a way that the integrity of the referencing and the constraints can be maintained as far as possible. Various examples of model manipulation are described in order to illustrate the issues that must be considered when manipulating sophisticated conceptual models.
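To make one of the transformations mentioned concrete, here is a toy Python sketch of creating a subset of a model while keeping references resolvable. The data structures are invented for illustration; STEPWISE's actual transformation language is not shown in the abstract.

# A miniature EXPRESS-style schema:
# entity name -> list of (attribute, referenced entity or None)
schema = {
    "product":  [("id", None), ("shape", "geometry")],
    "geometry": [("points", "point")],
    "point":    [("x", None), ("y", None)],
    "approval": [("approver", "person")],
    "person":   [("name", None)],
}

def subset(schema, roots):
    """Keep the root entities plus everything they reference, transitively,
    so the subset model's references stay resolvable."""
    keep, frontier = set(), list(roots)
    while frontier:
        name = frontier.pop()
        if name in keep or name not in schema:
            continue
        keep.add(name)
        frontier.extend(ref for _, ref in schema[name] if ref)
    return {name: schema[name] for name in keep}

print(subset(schema, ["product"]))
# keeps product, geometry and point; drops approval and person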
Biodiversity Data Journal | 2014
Rutger A. Vos; Jordan Biserkov; Bachir Balech; Niall Beard; Matthew Blissett; Christian Y. A. Brenninkmeijer; Tom van Dooren; David Eades; George Gosline; Quentin Groom; Thomas Hamann; Hannes Hettling; Robert Hoehndorf; Ayco Holleman; Peter Hovenkamp; Patricia Kelbert; Don Kirkup; Youri Lammers; Thibaut DeMeulemeester; Daniel Mietchen; Jeremy Miller; Ross Mounce; Nicola Nicolson; Rod Page; Aleksandra Pawlik; Serrano Pereira; Lyubomir Penev; Kevin Richards; Guido Sautter; David P. Shorthouse
Background: Recent years have seen a surge in projects that produce large volumes of structured, machine-readable biodiversity data. To make these data amenable to processing by generic, open source “data enrichment” workflows, they are increasingly being represented in a variety of standards-compliant interchange formats. Here, we report on an initiative in which software developers and taxonomists came together to address the challenges and highlight the opportunities in the enrichment of such biodiversity data by engaging in intensive, collaborative software development: The Biodiversity Data Enrichment Hackathon. Results: The hackathon brought together 37 participants (including developers and taxonomists, i.e. scientific professionals who gather, identify, name and classify species) from 10 countries: Belgium, Bulgaria, Canada, Finland, Germany, Italy, the Netherlands, New Zealand, the UK, and the US. The participants brought expertise in processing structured data, text mining, development of ontologies, digital identification keys, geographic information systems, niche modeling, natural language processing, provenance annotation, semantic integration, taxonomic name resolution, web service interfaces, workflow tools and visualisation. Most use cases and exemplar data were provided by taxonomists. One goal of the meeting was to facilitate re-use and enhancement of biodiversity knowledge by a broad range of stakeholders, such as taxonomists, systematists, ecologists, niche modelers, informaticians and ontologists. The suggested use cases resulted in nine breakout groups addressing three main themes: i) mobilising heritage biodiversity knowledge; ii) formalising and linking concepts; and iii) addressing interoperability between service platforms. Another goal was to further foster a community of experts in biodiversity informatics and to build human links between research projects and institutions, in response to recent calls to further such integration in this research domain. Conclusions: Beyond deriving prototype solutions for each use case, areas of inadequacy were discussed and are being pursued further. It was striking how many possible applications for biodiversity data there were and how quickly solutions could be put together when the normal constraints to collaboration were broken down for a week. Conversely, mobilising biodiversity knowledge from its silos in heritage literature and natural history collections will continue to require formalisation of the concepts (and the links between them) that define the research domain, as well as increased interoperability between the software platforms that operate on these concepts.
Nucleic Acids Research | 2017
Katherine Wolstencroft; Olga Krebs; Jacky L. Snoep; Natalie Stanford; Finn Bacall; Martin Golebiewski; Rostyk Kuzyakiv; Quyen Nguyen; Stuart Owen; Stian Soiland-Reyes; Jakub Straszewski; David D. van Niekerk; Alan R. Williams; Lars Malmström; Bernd Rinn; Wolfgang Müller; Carole A. Goble
The FAIRDOMHub is a repository for publishing FAIR (Findable, Accessible, Interoperable and Reusable) Data, Operating procedures and Models (https://fairdomhub.org/) for the Systems Biology community. It is a web-accessible repository for storing and sharing systems biology research assets. It enables researchers to organize, share and publish data, models and protocols, interlink them in the context of the systems biology investigations that produced them, and to interrogate them via API interfaces. By using the FAIRDOMHub, researchers can achieve more effective exchange with geographically distributed collaborators during projects, ensure results are sustained and preserved and generate reproducible publications that adhere to the FAIR guiding principles of data stewardship.
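The API access mentioned above can be exercised with a few lines of Python. The endpoint and JSON:API media type below follow the conventions of the SEEK platform that FAIRDOMHub is built on; verify them against the current API documentation before relying on them.

import requests

# List investigations held in the FAIRDOMHub via its JSON:API interface.
resp = requests.get(
    "https://fairdomhub.org/investigations",
    headers={"Accept": "application/vnd.api+json"},
    timeout=30,
)
resp.raise_for_status()
for item in resp.json().get("data", []):
    print(item["id"], item["attributes"].get("title"))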
BMC Ecology | 2016
Alex Hardisty; Finn Bacall; Niall Beard; Maria-Paula Balcázar-Vargas; Bachir Balech; Zoltán Barcza; Sarah J. Bourlat; Renato De Giovanni; Yde de Jong; Francesca De Leo; Laura Dobor; Giacinto Donvito; Donal Fellows; Antonio Fernandez Guerra; Nuno Ferreira; Yuliya Fetyukova; Bruno Fosso; Jonathan Giddy; Carole A. Goble; Anton Güntsch; Robert Haines; Vera Hernández Ernst; Hannes Hettling; Dóra Hidy; Ferenc Horváth; Dóra Ittzés; Péter Ittzés; Andrew R. Jones; Renzo Kottmann; Robert Kulawik
Background: Making forecasts about biodiversity and giving support to policy relies increasingly on large collections of data held electronically, and on substantial computational capability and capacity to analyse, model, simulate and predict using such data. However, the physically distributed nature of data resources and of expertise in advanced analytical tools creates many challenges for the modern scientist. Across the wider biological sciences, presenting such capabilities on the Internet (as “Web services”) and using scientific workflow systems to compose them for particular tasks is a practical way to carry out robust “in silico” science. However, use of this approach in biodiversity science and ecology has thus far been quite limited. Results: BioVeL is a virtual laboratory for data analysis and modelling in biodiversity science and ecology, freely accessible via the Internet. BioVeL includes functions for accessing and analysing data through curated Web services; for performing complex in silico analysis through exposure of R programs, workflows, and batch processing functions; for on-line collaboration through sharing of workflows and workflow runs; for experiment documentation through reproducibility and repeatability; and for computational support via seamless connections to supporting computing infrastructures. We developed and improved more than 60 Web services with significant potential in many different kinds of data analysis and modelling tasks. We composed reusable workflows using these Web services, also incorporating R programs. Deploying these tools into an easy-to-use and accessible ‘virtual laboratory’, free via the Internet, we applied the workflows in several diverse case studies. We opened the virtual laboratory for public use and through a programme of external engagement we actively encouraged scientists and third party application and tool developers to try out the services and contribute to the activity. Conclusions: Our work shows we can deliver an operational, scalable and flexible Internet-based virtual laboratory to meet new demands for data processing and analysis in biodiversity science and ecology. In particular, we have successfully integrated existing and popular tools and practices from different scientific disciplines to be used in biodiversity and ecological research.
The Computer Journal | 1996
Howard Barringer; Graham Gough; Brian Monahan; Alan R. Williams
A process algebraic foundation has been developed for formal analysis of synchronous hardware designs represented through the commercially available hardware design language, ELLA. An underlying semantic foundation, based on input/output trace sets, is presented first through the use of state machines. Such a representation enables direct application of standard, fully automated trace equivalence checking tools. However, to overcome the computational limitations imposed by such analysis methods, the input/output trace semantics is represented through a synchronous process algebra, EPA. Primitive processes in EPA denote the behaviour of primitive hardware components, such as delays or multiplexers, with composition operators corresponding to the different ways in which behaviours may be built. Of particular significance is the parallel composition operator which captures the machinery for building networks from other components/networks. Actions in EPA are structured and signify the state of input and output signals. This structure, however, is abstracted by developing an algebra for the actions. In particular, parallel composition on processes neatly lifts to a special (synchronous) product operation on actions. The EPA representation forms a good basis for semi-automated high-level symbolic manipulation and reasoning tools. First, the original design structure can be maintained, thus easing the problems of user level feedback from tools. Secondly, the application of EPA to ELLA enables a deterministic finite automaton form for EPA terms. This provides a route to tractable symbolic verification and simulation, using a state evolution method to establish strong bisimulation properties. The method has been successfully applied to classes of unbounded state space systems.
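The lifting of parallel composition to a product on actions can be sketched as an inference rule in the style of synchronous process algebras such as SCCS; this is a hedged reconstruction, not EPA's exact definition.

% Synchronous product: two components step together, and their actions
% (recording input/output signal states) combine into one product action.
% SCCS-style; EPA's precise syntax and side conditions may differ.
\[
  \frac{P \xrightarrow{a} P' \qquad Q \xrightarrow{b} Q'}
       {P \times Q \xrightarrow{a \cdot b} P' \times Q'}
\]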