Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Katherine Wolstencroft is active.

Publication


Featured research published by Katherine Wolstencroft.


Scientific Data | 2016

The FAIR Guiding Principles for scientific data management and stewardship

Mark D. Wilkinson; Michel Dumontier; IJsbrand Jan Aalbersberg; Gabrielle Appleton; Myles Axton; Arie Baak; Niklas Blomberg; Jan Willem Boiten; Luiz Olavo Bonino da Silva Santos; Philip E. Bourne; Jildau Bouwman; Anthony J. Brookes; Timothy W.I. Clark; Mercè Crosas; Ingrid Dillo; Olivier Dumon; Scott C Edmunds; Chris T. Evelo; Richard Finkers; Alejandra Gonzalez-Beltran; Alasdair J. G. Gray; Paul T. Groth; Carole A. Goble; Jeffrey S. Grethe; Jaap Heringa; Peter A. C. 't Hoen; Rob W. W. Hooft; Tobias Kuhn; Ruben Kok; Joost N. Kok

There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them and some exemplar implementations in the community.
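
The principles' stress on machine-actionability is easiest to see with a concrete sketch: below, a dataset description is expressed as schema.org-style JSON-LD that a crawler can parse without human intervention. This is an illustration in the spirit of the paper, not a format it prescribes; the dataset, DOI and URLs are invented.

```python
import json

# A minimal, hypothetical illustration of machine-actionable metadata in the
# spirit of the FAIR principles: a globally unique, resolvable identifier (F1),
# rich metadata (F2), and an explicit licence and format (I1, R1).
# The dataset, DOI and URLs below are invented for illustration.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "@id": "https://doi.org/10.1234/example-dataset",   # persistent identifier
    "name": "Example proteomics measurements",
    "description": "Illustrative dataset description.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {"@type": "Person", "name": "A. Researcher"},
    "distribution": {
        "@type": "DataDownload",
        "contentUrl": "https://example.org/data.csv",
        "encodingFormat": "text/csv",                   # interoperable format
    },
}

print(json.dumps(dataset, indent=2))
```

Because every field is a typed, resolvable term rather than free text, a machine agent can discover, license-check and retrieve the data without a human reading the landing page.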


Nucleic Acids Research | 2013

The Taverna workflow suite: designing and executing workflows of Web Services on the desktop, web or in the cloud

Katherine Wolstencroft; Robert Haines; Donal Fellows; Alan R. Williams; David Withers; Stuart Owen; Stian Soiland-Reyes; Ian Dunlop; Aleksandra Nenadic; Paul Fisher; Jiten Bhagat; Khalid Belhajjame; Finn Bacall; Alex Hardisty; Abraham Nieva de la Hidalga; Maria Paula Balcazar Vargas; Shoaib Sufi; Carole A. Goble

The Taverna workflow tool suite (http://www.taverna.org.uk) is designed to combine distributed Web Services and/or local tools into complex analysis pipelines. These pipelines can be executed on local desktop machines or through larger infrastructure (such as supercomputers, Grids or cloud environments), using the Taverna Server. In bioinformatics, Taverna workflows are typically used in the areas of high-throughput omics analyses (for example, proteomics or transcriptomics), or for evidence gathering methods involving text mining or data mining. Through Taverna, scientists have access to several thousand different tools and resources that are freely available from a large range of life science institutions. Once constructed, the workflows are reusable, executable bioinformatics protocols that can be shared, reused and repurposed. A repository of public workflows is available at http://www.myexperiment.org. This article provides an update to the Taverna tool suite, highlighting new features and developments in the workbench and the Taverna Server.
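
Conceptually, a Taverna workflow wires the output of one service into the input of the next. The sketch below shows only that dataflow idea in plain Python with the requests library; the two endpoints are invented placeholders, and this is not Taverna's actual API or workflow language.

```python
import requests

# Two invented service endpoints, standing in for the distributed Web Services
# that a Taverna workflow would orchestrate. The URLs are placeholders.
BLAST_SERVICE = "https://example.org/blast"
ANNOTATION_SERVICE = "https://example.org/annotate"

def run_pipeline(sequence: str) -> dict:
    """Chain two services: the first service's output feeds the second."""
    hits = requests.post(BLAST_SERVICE, json={"query": sequence}, timeout=60)
    hits.raise_for_status()

    # Dataflow step: pass the intermediate result downstream, as a workflow
    # engine would do automatically along a wired connection.
    annotated = requests.post(
        ANNOTATION_SERVICE, json={"hits": hits.json()}, timeout=60
    )
    annotated.raise_for_status()
    return annotated.json()

if __name__ == "__main__":
    print(run_pipeline("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
```

A workflow engine adds what this sketch lacks: a declarative, shareable description of the wiring, plus provenance capture and execution on remote infrastructure.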


Nucleic Acids Research | 2007

A systematic strategy for large-scale analysis of genotype–phenotype correlations: identification of candidate genes involved in African trypanosomiasis.

Paul Fisher; Cornelia Hedeler; Katherine Wolstencroft; Helen Hulme; Harry Noyes; Stephen J. Kemp; Robert Stevens; Andy Brass

It is increasingly common to combine microarray and quantitative trait loci (QTL) data to aid the search for candidate genes responsible for phenotypic variation. Workflows provide a means of systematically processing these large datasets and also represent a framework for the reuse and the explicit declaration of experimental methods. In this article, we highlight the issues facing the manual analysis of microarray and QTL data for the discovery of candidate genes underlying complex phenotypes. We show how automated approaches provide a systematic means to investigate genotype–phenotype correlations. This methodology was applied to a use case of resistance to African trypanosomiasis in the mouse. Pathways represented in the results identified Daxx as one of the candidate genes within the Tir1 QTL region. Subsequent re-sequencing of Daxx in susceptible mouse strains identified the deletion of an amino acid in the Daxx–p53 protein-binding region. This supports recent experimental evidence that apoptosis could be playing a role in the trypanosomiasis resistance phenotype. Workflows developed in this investigation, including a guide to loading and executing them with example data, are available at http://workflows.mygrid.org.uk/repository/myGrid/PaulFisher/.
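
The heart of the strategy is an intersection at the pathway level: pathways annotated to genes under the QTL are compared with pathways annotated to differentially expressed genes, and QTL genes lying on shared pathways become candidates. A minimal sketch of that idea follows, with invented gene and pathway identifiers rather than the paper's actual workflows:

```python
# Hypothetical gene-to-pathway annotations, e.g. as retrieved from a pathway
# database such as KEGG. All identifiers here are invented for illustration.
gene_pathways = {
    "Daxx":  {"apoptosis", "p53_signalling"},
    "GeneB": {"glycolysis"},
    "GeneC": {"apoptosis"},
    "GeneD": {"mapk_signalling"},
}

qtl_genes = {"Daxx", "GeneB"}   # genes under the QTL interval
deg_genes = {"GeneC", "GeneD"}  # differentially expressed genes (microarray)

def pathways_of(genes):
    """Union of pathways annotated to any gene in the set."""
    return set().union(*(gene_pathways.get(g, set()) for g in genes))

# Pathways implicated by both lines of evidence.
shared = pathways_of(qtl_genes) & pathways_of(deg_genes)

# Candidate genes: QTL genes lying on a shared pathway.
candidates = {g for g in qtl_genes if gene_pathways.get(g, set()) & shared}
print(shared, candidates)   # {'apoptosis'} {'Daxx'}
```

Running every gene through the same annotation and intersection steps is what makes the approach systematic and unbiased, in contrast to hand-picking genes with known links to the phenotype.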


PLOS Biology | 2017

Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data

Julie McMurry; Nick Juty; Niklas Blomberg; Tony Burdett; Tom Conlin; Nathalie Conte; Mélanie Courtot; John Deck; Michel Dumontier; Donal Fellows; Alejandra Gonzalez-Beltran; Philipp Gormanns; Jeffrey S. Grethe; Janna Hastings; Jean-Karim Hériché; Henning Hermjakob; Jon Ison; Rafael C. Jimenez; Simon Jupp; John Kunze; Camille Laibe; Nicolas Le Novère; James Malone; María Martín; Johanna McEntyre; Chris Morris; Juha Muilu; Wolfgang Müller; Philippe Rocca-Serra; Susanna-Assunta Sansone

In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.
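
One recurring lesson in the paper is that identifiers should be pattern-checked and resolvable. The sketch below assumes the compact-identifier convention used by resolvers such as Identifiers.org (prefix:accession, resolved at https://identifiers.org/); the accession patterns shown are deliberately simplified stand-ins for a registry's authoritative rules.

```python
import re

# Illustrative, simplified accession patterns for two prefixes; real
# registries such as Identifiers.org maintain authoritative patterns
# per namespace.
PATTERNS = {
    "uniprot": re.compile(r"^[A-Z][0-9][A-Z0-9]{3}[0-9]$"),  # simplified
    "taxonomy": re.compile(r"^\d+$"),
}

def resolve(compact_id: str) -> str:
    """Validate a prefix:accession pair and build a resolver URL."""
    prefix, _, accession = compact_id.partition(":")
    pattern = PATTERNS.get(prefix)
    if pattern is None or not pattern.match(accession):
        raise ValueError(f"unrecognised or malformed identifier: {compact_id}")
    # Resolution is delegated to a central resolver, keeping the identifier
    # itself independent of any single database's URL layout.
    return f"https://identifiers.org/{prefix}:{accession}"

print(resolve("taxonomy:9606"))   # https://identifiers.org/taxonomy:9606
print(resolve("uniprot:P12345"))  # https://identifiers.org/uniprot:P12345
```

Delegating resolution to a resolver is one answer to the persistence problem the authors highlight: the compact identifier in a paper stays stable even if the providing database reorganises its URLs.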


Molecular Systems Biology | 2015

The evolution of standards and data management practices in systems biology

Natalie Stanford; Katherine Wolstencroft; Martin Golebiewski; Renate Kania; Nick Juty; Christopher Tomlinson; Stuart Owen; Sarah Butcher; Henning Hermjakob; Nicolas Le Novère; Wolfgang Mueller; Jacky L. Snoep; Carole A. Goble

A recent community survey conducted by Infrastructure for Systems Biology Europe (ISBE) informs requirements for developing an efficient infrastructure for systems biology standards, data and model management.


Scientific Reports | 2016

A comprehensive gene expression analysis at sequential stages of in vitro cardiac differentiation from isolated MESP1-expressing mesoderm progenitors.

Sabine C. Den Hartogh; Katherine Wolstencroft; Robert Passier

In vitro cardiac differentiation of human pluripotent stem cells (hPSCs) closely recapitulates in vivo embryonic heart development and therefore provides an excellent model for studying human cardiac development. We recently generated the dual cardiac fluorescent reporter line MESP1^mCherry/w NKX2-5^eGFP/w in human embryonic stem cells (hESCs), allowing the visualization of pre-cardiac MESP1+ mesoderm and its further commitment towards the cardiac lineage, marked by activation of the cardiac transcription factor NKX2-5. Here, we performed a comprehensive whole-genome transcriptome analysis of MESP1-mCherry-derived cardiac-committed cells. In addition to previously described cardiac-inducing signalling pathways, we identified novel transcriptional and signalling networks, indicated by transient activation and interactive network analysis. Furthermore, we found a highly dynamic regulation of extracellular matrix components, suggesting the importance of a versatile niche that adjusts to the various stages of cardiac differentiation. Finally, we identified cell surface markers for cardiac progenitors, such as the leucine-rich repeat-containing G-protein coupled receptor 4 (LGR4), which belongs to the same subfamily as LGR5 and LGR6, established tissue/cancer stem cell markers. We provide a comprehensive gene expression analysis of cardiac derivatives from pre-cardiac MESP1 progenitors that will contribute to a better understanding of the key regulators, pathways and markers involved in human cardiac differentiation and development.


Journal of Biomedical Semantics | 2014

Structuring research methods and data with the research object model: genomics workflows as a case study.

Kristina M. Hettne; Harish Dharuri; Jun Zhao; Katherine Wolstencroft; Khalid Belhajjame; Stian Soiland-Reyes; Eleni Mina; Mark Thompson; Don C. Cruickshank; L. Verdes-Montenegro; Julián Garrido; David De Roure; Oscar Corcho; Graham Klyne; Reinout van Schouwen; Peter A. C. 't Hoen; Sean Bechhofer; Carole A. Goble; Marco Roos

Background
One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide the necessary metadata for a scientist to understand and recreate the results of an experiment. To support this, we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study where we analysed human metabolite variation by workflows.

Results
We present the application of the workflow-centric RO model for our bioinformatics case study. Three workflows were produced following recently defined best practices for workflow design. By modelling the experiment as an RO, we were able to automatically query the experiment and answer questions such as “which particular data was input to a particular workflow to test a particular hypothesis?” and “which particular conclusions were drawn from a particular workflow?”.

Conclusions
Applying a workflow-centric RO model to aggregate and annotate the resources used in a bioinformatics experiment allowed us to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data. The RO model is an extendable reference model that can be used by other systems as well.

Availability
The Research Object is available at http://www.myexperiment.org/packs/428. The Wf4Ever Research Object Model is available at http://wf4ever.github.io/ro.
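
At its core, the RO model combines an aggregation with annotations: a manifest lists the aggregated resources, and annotations attach provenance or descriptive statements to them, which is what makes questions like the ones above answerable by query. A much-simplified sketch of that structure follows; the field names are invented for illustration, and the real model is defined by the Wf4Ever ontologies linked above.

```python
# A toy, in-memory Research Object: an aggregation of resources plus
# annotations linking them. Field names are illustrative only.
research_object = {
    "id": "ro:example-metabolite-study",
    "aggregates": [
        {"id": "workflow.t2flow", "type": "Workflow"},
        {"id": "input-genes.csv", "type": "Dataset"},
        {"id": "conclusions.txt", "type": "Conclusion"},
    ],
    "annotations": [
        {"about": "input-genes.csv",
         "body": {"wasInputTo": "workflow.t2flow"}},
        {"about": "conclusions.txt",
         "body": {"wasDerivedFrom": "workflow.t2flow"}},
    ],
}

def inputs_of(ro: dict, workflow: str) -> list:
    """Answer 'which data was input to this workflow?' from the annotations."""
    return [
        a["about"]
        for a in ro["annotations"]
        if a["body"].get("wasInputTo") == workflow
    ]

print(inputs_of(research_object, "workflow.t2flow"))  # ['input-genes.csv']
```

Because the links are explicit data rather than prose, the same structure supports the reverse query from a conclusion back to the workflow and inputs that produced it.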


Intelligent Systems in Molecular Biology | 2007

A systematic strategy for the discovery of candidate genes responsible for phenotypic variation.

Paul Fisher; Cornelia Hedeler; Katherine Wolstencroft; Helen Hulme; Harry Noyes; Stephen J. Kemp; Robert Stevens; Andy Brass

Introduction
Quantitative Trait Loci (QTL) data are increasingly used to aid the discovery of candidate genes involved in phenotypic variation. Tens to hundreds of genes, however, may lie within even a well-defined QTL. It is therefore vital that the identification, selection and functional testing of candidate Quantitative Trait genes (QTg) are carried out systematically and without bias [1]. With the advent of microarrays, researchers are able to directly examine the expression of all genes on a genome-wide scale, including those underlying QTL regions.

The scale of data being generated by such high-throughput experiments has led some investigators to follow a hypothesis-driven approach [2]. Although these techniques for candidate gene identification are valid, they run the risk of overlooking genes that have less obvious associations with the phenotype. By making selections based on prior assumptions about the processes that may be involved, the genes actually involved in the phenotype can be overlooked. A further complication is that ad hoc methods for candidate gene identification are inherently difficult to replicate, a problem compounded by poor documentation in the published literature of the methods used to generate and capture the data from such investigations.

With an ever-increasing number of institutes offering programmatic access to their resources in the form of web services, however, experiments previously conducted manually can now be replaced by automated experiments capable of processing a far greater volume of data. By reconstructing the original investigation methods in the form of workflows, we are able to pass data directly from one service to the next. This enables us to process the data in a much more systematic, unbiased and explicit manner.

Methods
We propose a data-driven methodology that identifies the known pathways that intersect a QTL and those derived from a set of differentially expressed genes from a microarray study. This methodology is implemented systematically through the use of web services and workflows. To implement this systematic pathway-driven approach, we chose the Taverna workbench [3].

Results and Discussion
Preliminary studies into the modes of resistance to African trypanosomiasis were carried out in the mouse model organism. These studies illustrated how large-scale analysis of microarray gene expression and QTL data, investigated at the level of biological pathways, enables links between genotype and phenotype to be successfully established [4]. This approach was implemented systematically through the use of explicitly defined workflows.


Nucleic Acids Research | 2017

FAIRDOMHub: a repository and collaboration environment for sharing systems biology research.

Katherine Wolstencroft; Olga Krebs; Jacky L. Snoep; Natalie Stanford; Finn Bacall; Martin Golebiewski; Rostyk Kuzyakiv; Quyen Nguyen; Stuart Owen; Stian Soiland-Reyes; Jakub Straszewski; David D. van Niekerk; Alan R. Williams; Lars Malmström; Bernd Rinn; Wolfgang Müller; Carole A. Goble

The FAIRDOMHub is a repository for publishing FAIR (Findable, Accessible, Interoperable and Reusable) Data, Operating procedures and Models (https://fairdomhub.org/) for the Systems Biology community. It is a web-accessible repository for storing and sharing systems biology research assets. It enables researchers to organize, share and publish data, models and protocols, interlink them in the context of the systems biology investigations that produced them, and to interrogate them via API interfaces. By using the FAIRDOMHub, researchers can achieve more effective exchange with geographically distributed collaborators during projects, ensure results are sustained and preserved and generate reproducible publications that adhere to the FAIR guiding principles of data stewardship.
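
The API access mentioned above can be sketched briefly. The snippet assumes the FAIRDOMHub exposes JSON listings of its assets over HTTP, as the underlying SEEK platform does; the exact endpoint path and response shape used here are assumptions, so consult the live API documentation before relying on them.

```python
import requests

# Assumed endpoint: a JSON listing of models on the FAIRDOMHub instance.
# The path and response structure are assumptions based on the SEEK
# platform's JSON API, not verified against the live service.
url = "https://fairdomhub.org/models"

response = requests.get(
    url, headers={"Accept": "application/json"}, timeout=30
)
response.raise_for_status()

# JSON:API-style listings typically put records under a top-level "data"
# array, each with an "id" and an "attributes" object.
for record in response.json().get("data", []):
    attributes = record.get("attributes", {})
    print(record.get("id"), attributes.get("title"))
```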


Interface Focus | 2016

A physiome interoperability roadmap for personalized drug development.

Simon Thomas; Katherine Wolstencroft; Bernard de Bono; Peter Hunter

The goal of developing therapies and dosage regimes for characterized subgroups of the general population can be facilitated by the use of simulation models able to incorporate information about inter-individual variability in drug disposition (pharmacokinetics), toxicity and response effect (pharmacodynamics). Such observed variability can have multiple causes at various scales, ranging from gross anatomical differences to differences in genome sequence. Relevant data for many of these aspects, particularly related to molecular assays (known as ‘-omics’), are available in online resources, but identification and assignment to appropriate model variables and parameters is a significant bottleneck in the model development process. Through its efforts to standardize annotation with consequent increase in data usability, the human physiome project has a vital role in improving productivity in model development and, thus, the development of personalized therapy regimes. Here, we review the current status of personalized medicine in clinical practice, outline some of the challenges that must be overcome in order to expand its applicability, and discuss the relevance of personalized medicine to the more widespread challenges being faced in drug discovery and development. We then review some of (i) the key data resources available for use in model development and (ii) the potential areas where advances made within the physiome modelling community could contribute to physiologically based pharmacokinetic and physiologically based pharmacokinetic/pharmacodynamic modelling in support of personalized drug development. We conclude by proposing a roadmap to further guide the physiome community in its on-going efforts to improve data usability, and integration with modelling efforts in the support of personalized medicine development.
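
As a minimal illustration of the kind of pharmacokinetic model the roadmap aims to parameterize per individual, the sketch below evaluates the standard one-compartment oral-dosing solution (the Bateman function) for two hypothetical individuals who differ in elimination rate. All parameter values are invented; a real physiologically based model would involve many annotated compartments.

```python
import math

def one_compartment_oral(t, dose_mg, F, ka, ke, V_l):
    """Plasma concentration (mg/L) at time t (h) for a one-compartment
    model with first-order absorption (Bateman function). Requires ka != ke."""
    return (F * dose_mg * ka) / (V_l * (ka - ke)) * (
        math.exp(-ke * t) - math.exp(-ka * t)
    )

# Invented elimination rates (1/h) for two 'individuals': the kind of
# inter-individual variability the physiome annotation effort targets.
individuals = {"fast_clearance": 0.20, "slow_clearance": 0.05}

for label, ke in individuals.items():
    # Scan the first 24 h in 0.1 h steps for the approximate peak.
    peak = max(
        one_compartment_oral(t / 10, dose_mg=100, F=0.8, ka=1.0, ke=ke, V_l=40)
        for t in range(1, 240)
    )
    print(f"{label}: approximate peak concentration {peak:.2f} mg/L")
```

Linking each parameter (clearance, volume, absorption rate) to standardized physiological and omics annotations is precisely the data-to-model assignment bottleneck the roadmap addresses.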

Collaboration


Dive into Katherine Wolstencroft's collaborations.

Top Co-Authors

Stuart Owen
University of Manchester

Martin Golebiewski
Heidelberg Institute for Theoretical Studies

Olga Krebs
German Cancer Research Center

Wolfgang Müller
Heidelberg Institute for Theoretical Studies