Publication


Featured research published by Guy Cochrane.


Nucleic Acids Research | 2004

The EMBL Nucleotide Sequence Database

Tamara Kulikova; Philippe Aldebert; Nicola Althorpe; Wendy Baker; Kirsty Bates; Paul Browne; Alexandra van den Broek; Guy Cochrane; Karyn Duggan; Ruth Y. Eberhardt; Nadeem Faruque; Maria Garcia-Pastor; Nicola Harte; Carola Kanz; Rasko Leinonen; Quan Lin; Vincent Lombard; Rodrigo Lopez; Renato Mancuso; Michelle McHale; Francesco Nardone; Ville Silventoinen; Peter Stoehr; Guenter Stoesser; Mary Ann Tuli; Katerina Tzouvara; Robert Vaughan; Dan Wu; Weimin Zhu; Rolf Apweiler

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl), maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK, is a comprehensive collection of nucleotide sequences and annotation from available public sources. The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged daily between the collaborating institutes to achieve swift synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party Annotation (TPA) and alignments. Automated procedures are provided for submissions from large-scale sequencing projects and data from the European Patent Office. New and updated data records are distributed daily and the whole EMBL Nucleotide Sequence Database is released four times a year. Access to the sequence data is provided via ftp and several WWW interfaces. With the web-based Sequence Retrieval System (SRS) it is also possible to link nucleotide data to other specialist molecular biology databases maintained at the EBI. Other tools are available for sequence similarity searching (e.g. FASTA and BLAST). Changes over the past year include the removal of the sequence length limit, the launch of the EMBLCDSs dataset, extension of the Sequence Version Archive functionality and the revision of quality rules for TPA data.


Omics A Journal of Integrative Biology | 2008

Toward an Online Repository of Standard Operating Procedures (SOPs) for (Meta)genomic Annotation

Samuel V. Angiuoli; Aaron Gussman; William Klimke; Guy Cochrane; Dawn Field; George M Garrity; Chinnappa D. Kodira; Nikos C. Kyrpides; Ramana Madupu; Victor Markowitz; Tatiana Tatusova; Nicholas R. Thomson; Owen White

The methodologies used to generate genome and metagenome annotations are diverse and vary between groups and laboratories. Descriptions of the annotation process are helpful in interpreting genome annotation data. Some groups have produced Standard Operating Procedures (SOPs) that describe the annotation process, but standards are lacking for structure and content of these descriptions. In addition, there is no central repository to store and disseminate procedures and protocols for genome annotation. We highlight the importance of SOPs for genome annotation and endorse an online repository of SOPs.


Nucleic Acids Research | 2011

The International Nucleotide Sequence Database Collaboration

Guy Cochrane; Ilene Karsch-Mizrachi; Yasukazu Nakamura

The members of the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) set out to capture, preserve and present globally comprehensive public domain nucleotide sequence information. The work of the long-standing collaboration includes the provision of data formats, annotation conventions and routine global data exchange. Among the many developments to INSDC resources in 2011 are the newly launched BioProject database and improved handling of assembly information. In this article, we outline INSDC services and update the reader on developments in 2011.


Nucleic Acids Research | 2011

The European Nucleotide Archive

Rasko Leinonen; Ruth Akhtar; Ewan Birney; Lawrence Bower; Ana Cerdeño-Tárraga; Ying Cheng; Iain Cleland; Nadeem Faruque; Neil Goodgame; Richard Gibson; Gemma Hoad; Mikyung Jang; Nima Pakseresht; Sheila Plaister; Rajesh Radhakrishnan; Kethi Reddy; Siamak Sobhany; Petra ten Hoopen; Robert Vaughan; Vadim Zalunin; Guy Cochrane

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe’s primary nucleotide-sequence repository. The ENA consists of three main databases: the Sequence Read Archive (SRA), the Trace Archive and EMBL-Bank. The objective of ENA is to support and promote the use of nucleotide sequencing as an experimental research platform by providing data submission, archive, search and download services. In this article, we outline these services and describe major changes and improvements introduced during 2010. These include extended EMBL-Bank and SRA-data submission services, extended ENA Browser functionality, support for submitting data to the European Genome-phenome Archive (EGA) through SRA, and the launch of a new sequence similarity search service.


Genome Research | 2011

Efficient storage of high throughput DNA sequencing data using reference-based compression

Markus Hsi-Yang Fritz; Rasko Leinonen; Guy Cochrane; Ewan Birney

Data storage costs have become an appreciable proportion of total cost in the creation and analysis of DNA sequence data. Of particular concern is that the rate of increase in DNA sequencing is significantly outstripping the rate of increase in disk storage capacity. In this paper we present a new reference-based compression method that efficiently compresses DNA sequences for storage. Our approach works for resequencing experiments that target well-studied genomes. We align new sequences to a reference genome and then encode the differences between the new sequence and the reference genome for storage. Our compression method is most efficient when we allow controlled loss of data in the saving of quality information and unaligned sequences. With this new compression method we observe exponential efficiency gains as read lengths increase, and the magnitude of this efficiency gain can be controlled by changing the amount of quality information stored. Our compression method is tunable: The storage of quality scores and unaligned sequences may be adjusted for different experiments to conserve information or to minimize storage costs, and provides one opportunity to address the threat that increasing DNA sequence volumes will overcome our ability to store the sequences.
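
The core idea in the abstract above, storing only a read's alignment position and its differences from the reference, can be illustrated with a toy Python sketch. It is not the authors' implementation: it assumes the read is already aligned, handles substitutions only (no insertions, deletions, quality scores or unaligned reads), and the names `encode_read` and `decode_read` are hypothetical.

```python
from typing import List, Tuple

Diff = Tuple[int, str]                 # (offset within the read, substituted base)
Encoded = Tuple[int, int, List[Diff]]  # (reference position, read length, diffs)


def encode_read(reference: str, pos: int, read: str) -> Encoded:
    """Represent an aligned read by its position and its mismatches only."""
    if pos < 0 or pos + len(read) > len(reference):
        raise ValueError("read does not fit inside the reference at this position")
    # Keep only the bases that differ from the reference at the aligned position.
    diffs = [(i, base) for i, base in enumerate(read) if reference[pos + i] != base]
    return pos, len(read), diffs


def decode_read(reference: str, encoded: Encoded) -> str:
    """Rebuild the read by copying the reference slice and applying the diffs."""
    pos, length, diffs = encoded
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGT"
    read = "ACGAACGT"  # differs from the reference slice at one position
    encoded = encode_read(reference, 4, read)
    print(encoded)
    assert decode_read(reference, encoded) == read
```

In this simplified model a read that matches the reference exactly costs only a position and a length, whatever its size, which gives some intuition for why longer reads against a well-studied genome compress better.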


Nucleic Acids Research | 2007

EMBL Nucleotide Sequence Database in 2006

Tamara Kulikova; Ruth Akhtar; Philippe Aldebert; Nicola Althorpe; Mikael Andersson; Alastair Baldwin; Kirsty Bates; Sumit Bhattacharyya; Lawrence Bower; Paul Browne; Matias Castro; Guy Cochrane; Karyn Duggan; Ruth Y. Eberhardt; Nadeem Faruque; Gemma Hoad; Carola Kanz; Charles Lee; Rasko Leinonen; Quan Lin; Vincent Lombard; Rodrigo Lopez; Dariusz Lorenc; Hamish McWilliam; Gaurab Mukherjee; Francesco Nardone; Maria Pilar Garcia Pastor; Sheila Plaister; Siamak Sobhany; Peter Stoehr

The EMBL Nucleotide Sequence Database at the EMBL European Bioinformatics Institute, UK, offers a large and freely accessible collection of nucleotide sequences and accompanying annotation. The database is maintained in collaboration with DDBJ and GenBank. Data are exchanged between the collaborating databases on a daily basis to achieve optimal synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party Annotation, alignments and bulk data. Automated procedures are provided for submissions from large-scale sequencing projects and data from the European Patent Office. In 2006, the volume of data has continued to grow exponentially. Access to the data is provided via SRS, ftp and a variety of other methods. Extensive external and internal cross-references enable users to search for related information across other databases and within the database. All available resources can be accessed via the EBI home page. Changes over the past year include changes to the file format, further development of the EMBLCDS dataset and developments to the XML format.


Nucleic Acids Research | 2009

Petabyte-scale innovations at the European Nucleotide Archive

Guy Cochrane; Ruth Akhtar; James K. Bonfield; Lawrence Bower; Fehmi Demiralp; Nadeem Faruque; Richard Gibson; Gemma Hoad; Tim Hubbard; Chris Hunter; Mikyung Jang; Szilveszter Juhos; Rasko Leinonen; Steven Leonard; Quan Lin; Rodrigo Lopez; Dariusz Lorenc; Hamish McWilliam; Gaurab Mukherjee; Sheila Plaister; Rajesh Radhakrishnan; Stephen Robinson; Siamak Sobhany; Petra ten Hoopen; Robert Vaughan; Vadim Zalunin; Ewan Birney

Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure, are challenged by the impact of these data volumes. The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/embl), comprising the EMBL Nucleotide Sequence Database and the Ensembl Trace Archive, has identified challenges in the storage, movement, analysis, interpretation and visualization of petabyte-scale data sets. We present here our new repository for next generation sequence data, a brief summary of contents of the ENA and provide details of major developments to submission pipelines, high-throughput rule-based validation infrastructure and data integration approaches.


Nucleic Acids Research | 2014

EBI metagenomics—a new resource for the analysis and archiving of metagenomic data

Sarah Hunter; Matthew Corbett; Hubert Denise; Matthew Fraser; Alejandra Gonzalez-Beltran; Chris Hunter; Philip Jones; Rasko Leinonen; Craig McAnulla; Eamonn Maguire; John Maslen; Alex L. Mitchell; Gift Nuka; Arnaud Oisel; Sebastien Pesseat; Rajesh Radhakrishnan; Philippe Rocca-Serra; Maxim Scheremetjew; Peter Sterk; Daniel Vaughan; Guy Cochrane; Dawn Field; Susanna-Assunta Sansone

Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.


Nucleic Acids Research | 2006

EMBL Nucleotide Sequence Database: developments in 2005

Guy Cochrane; Philippe Aldebert; Nicola Althorpe; Mikael Andersson; Wendy Baker; Alastair Baldwin; Kirsty Bates; Sumit Bhattacharyya; Paul Browne; Alexandra van den Broek; Matias Castro; Karyn Duggan; Ruth Y. Eberhardt; Nadeem Faruque; John Gamble; Carola Kanz; Tamara Kulikova; Charles Lee; Rasko Leinonen; Quan Lin; Vincent Lombard; Rodrigo Lopez; Michelle McHale; Hamish McWilliam; Gaurab Mukherjee; Francesco Nardone; Maria Pilar Garcia Pastor; Siamak Sobhany; Peter Stoehr; Katerina Tzouvara

The EMBL Nucleotide Sequence Database at the EMBL European Bioinformatics Institute, UK, offers a comprehensive set of publicly available nucleotide sequence and annotation, freely accessible to all. Maintained in collaboration with partners DDBJ and GenBank, coverage includes whole genome sequencing project data, directly submitted sequence, sequence recorded in support of patent applications and much more. The database continues to offer submission tools, data retrieval facilities and user support. In 2005, the volume of data offered has continued to grow exponentially. In addition to the newly presented data, the database encompasses a range of new data types generated by novel technologies, offers enhanced presentation and searchability of the data and has greater integration with other data resources offered at the EBI and elsewhere. In stride with these developing data types, the database has continued to develop submission and retrieval tools to maximise the information content of submitted data and to offer the simplest possible submission routes for data producers. New developments, the submission process, data retrieval and access to support are presented in this paper, along with links to sources of further information.


Database | 2011

Towards BioDBcore: a community-defined information specification for biological databases

Pascale Gaudet; Amos Marc Bairoch; Dawn Field; Susanna-Assunta Sansone; Chris Taylor; Teresa K. Attwood; Alex Bateman; Judith A. Blake; J. Michael Cherry; Rex L. Chisholm; Guy Cochrane; Charles E. Cook; Janan T. Eppig; Michael Y. Galperin; Robert Gentleman; Carole A. Goble; Takashi Gojobori; John M. Hancock; Douglas G. Howe; Tadashi Imanishi; Janet Kelso; David Landsman; Suzanna E. Lewis; Ilene Karsch Mizrachi; Sandra Orchard; B. F. Francis Ouellette; Shoba Ranganathan; Lorna Richardson; Philippe Rocca-Serra; Paul N. Schofield

The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources, and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.

Collaboration


Dive into Guy Cochrane's collaborations.

Top Co-Authors

Petra ten Hoopen

European Bioinformatics Institute

Rasko Leinonen

European Bioinformatics Institute

Peter Sterk

Wellcome Trust Sanger Institute

Richard Gibson

European Bioinformatics Institute

Ana Cerdeño-Tárraga

European Bioinformatics Institute

Clara Amid

European Bioinformatics Institute

Nadeem Faruque

European Bioinformatics Institute

Vadim Zalunin

European Bioinformatics Institute
