Publication


Featured research published by Sarah W. Burge.


Nucleic Acids Research | 2006

Quadruplex DNA: sequence, topology and structure

Sarah W. Burge; Gary N. Parkinson; Pascale Hazel; Alan K. Todd; Stephen Neidle

G-quadruplexes are higher-order DNA and RNA structures formed from G-rich sequences that are built around tetrads of hydrogen-bonded guanine bases. Potential quadruplex sequences have been identified in G-rich eukaryotic telomeres, and more recently in non-telomeric genomic DNA, e.g. in nuclease-hypersensitive promoter regions. The natural role and biological validation of these structures are starting to be explored, and there is particular interest in them as targets for therapeutic intervention. This survey focuses on the folding and structural features of quadruplexes formed from telomeric and non-telomeric DNA sequences, and examines fundamental aspects of topology and the emerging relationships with sequence. Emphasis is placed on information from the high-resolution methods of X-ray crystallography and NMR, and their scope and current limitations are discussed. Such information, together with biological insights, will be important for the discovery of drugs targeting quadruplexes from particular genes.
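As a rough illustration of how potential quadruplex sequences can be located in a G-rich sequence, the sketch below scans one strand with a regular expression. The motif used (four runs of at least three guanines separated by loops of one to seven nucleotides) is a commonly used heuristic and is an assumption here; the abstract does not specify a search rule. The function name `find_putative_quadruplexes` is hypothetical.

```python
import re

# Assumed heuristic motif (not taken from the abstract above):
# four G-runs of length >= 3, separated by three loops of 1-7 nucleotides.
PQS_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_putative_quadruplexes(sequence):
    """Return (start, end, matched_text) for non-overlapping motif hits.

    Only the given strand is scanned; a C-rich complement would have to be
    reverse-complemented by the caller before searching.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Four copies of the vertebrate telomeric repeat TTAGGG.
    for hit in find_putative_quadruplexes("TTAGGGTTAGGGTTAGGGTTAGGG"):
        print(hit)
```

A match only flags a candidate sequence. Whether it folds, and into which topology, is the kind of question the crystallographic and NMR data discussed in the survey address.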


Nucleic Acids Research | 2012

InterPro in 2011: new developments in the family and domain prediction database

Sarah Hunter; P. D. Jones; Alex L. Mitchell; Rolf Apweiler; Teresa K. Attwood; Alex Bateman; Thomas Bernard; David Binns; Peer Bork; Sarah W. Burge; Edouard de Castro; Penny Coggill; Matthew Corbett; Ujjwal Das; Louise Daugherty; Lauranne Duquenne; Robert D. Finn; Matthew Fraser; Julian Gough; Daniel H. Haft; Nicolas Hulo; Daniel Kahn; Elizabeth Kelly; Ivica Letunic; David M. Lonsdale; Rodrigo Lopez; John Maslen; Craig McAnulla; Jennifer McDowall; Conor McMenamin

InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.


Nucleic Acids Research | 2013

Rfam 11.0: 10 years of RNA families

Sarah W. Burge; Jennifer Daub; Ruth Y. Eberhardt; John G. Tate; Lars Barquist; Eric P. Nawrocki; Sean R. Eddy; Paul P. Gardner; Alex Bateman

The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.


Nucleic Acids Research | 2015

Rfam 12.0: updates to the RNA families database

Eric P. Nawrocki; Sarah W. Burge; Alex Bateman; Jennifer Daub; Ruth Y. Eberhardt; Sean R. Eddy; Evan W. Floden; Paul P. Gardner; Thomas A. Jones; John G. Tate; Robert D. Finn

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.


Nucleic Acids Research | 2014

RNAcentral: an international database of ncRNA sequences

Anton I. Petrov; Simon Kay; Richard Gibson; Eugene Kulesha; Dan Staines; Elspeth A. Bruford; Mathew W. Wright; Sarah W. Burge; Robert D. Finn; Paul J. Kersey; Guy Cochrane; Alex Bateman; Sam Griffiths-Jones; Jennifer Harrow; Patricia P. Chan; Todd M. Lowe; Christian Zwieb; Jacek Wower; Kelly P. Williams; Corey M. Hudson; Robin R. Gutell; Michael B. Clark; Marcel E. Dinger; Xiu Cheng Quek; Janusz M. Bujnicki; Nam-Hai Chua; Jun Liu; Huan Wang; Geir Skogerbø; Yi Zhao

The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.


Database | 2012

Manual GO annotation of predictive protein signatures: the InterPro approach to GO curation

Sarah W. Burge; Elizabeth Kelly; David M. Lonsdale; Prudence Mutowo-Muellenet; Craig McAnulla; Alex L. Mitchell; Amaia Sangrador-Vegas; Siew-Yit Yong; Nicola Mulder; Sarah Hunter

InterPro amalgamates predictive protein signatures from a number of well-known partner databases into a single resource. To aid with interpretation of results, InterPro entries are manually annotated with terms from the Gene Ontology (GO). The InterPro2GO mappings comprise the cross-references between these two resources and are the largest source of GO annotation predictions for proteins. Here, we describe the protocol by which InterPro curators integrate GO terms into the InterPro database. We discuss the unique challenges involved in integrating specific GO terms with entries that may describe a diverse set of proteins, and we illustrate, with examples, how InterPro hierarchies reflect GO terms of increasing specificity. We describe a revised protocol for GO mapping that enables us to assign GO terms to domains based on the function of the individual domain, rather than the function of the families in which the domain is found. We also discuss how taxonomic constraints are dealt with and those cases where we are unable to add any appropriate GO terms. Expert manual annotation of InterPro entries with GO terms enables users to infer function, process or subcellular information for uncharacterized sequences based on sequence matches to predictive models. Database URL: http://www.ebi.ac.uk/interpro. The complete InterPro2GO mappings are available at: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/external2go/interpro2go
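To make the idea of InterPro2GO cross-references concrete, the following minimal sketch reads mapping lines into a dictionary keyed by InterPro accession. It assumes the plain-text layout `InterPro:<accession> <entry name> > GO:<term name> ; GO:<term id>` with comment lines starting with `!`. That layout is an assumption and should be checked against the downloaded file. The function name `parse_interpro2go` and the sample lines are illustrative only.

```python
def parse_interpro2go(lines):
    """Map InterPro accessions to lists of (GO id, GO term name) tuples.

    Assumed line layout (verify against the real file):
        InterPro:IPR000003 Entry name > GO:term name ; GO:0000000
    Lines starting with '!' are treated as comments and skipped.
    """
    mappings = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        left, sep, right = line.partition(" > ")
        if not sep:
            continue  # not a mapping line in the assumed layout
        # First token on the left is 'InterPro:<accession>'.
        accession = left.split(None, 1)[0].split(":", 1)[-1]
        term_name, sep, go_id = right.rpartition(" ; ")
        if not sep:
            continue
        term_name = term_name.strip()
        if term_name.startswith("GO:"):
            term_name = term_name[3:]
        mappings.setdefault(accession, []).append((go_id.strip(), term_name))
    return mappings


if __name__ == "__main__":
    # Illustrative sample in the assumed layout, not copied from the live file.
    sample = [
        "!comment line",
        "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677",
    ]
    print(parse_interpro2go(sample))
```

Because one entry can carry several GO terms, each accession maps to a list rather than a single term.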


Database | 2012

Biocurators and Biocuration: surveying the 21st century challenges

Sarah W. Burge; Teresa K. Attwood; Alex Bateman; Tanya Z. Berardini; Michael J. Cherry; Claire O'Donovan; Ioannis Xenarios; Pascale Gaudet

Curated databases are an integral part of the tool set that researchers use on a daily basis for their work. For most users, however, how databases are maintained, and by whom, is rather obscure. The International Society for Biocuration (ISB) represents biocurators, software engineers, developers and researchers with an interest in biocuration. Its goals include fostering communication between biocurators, promoting and describing their work, and highlighting the added value of biocuration to the world. The ISB recently conducted a survey of biocurators to better understand their educational and scientific backgrounds, their motivations for choosing a curatorial job and their career goals. The results are reported here. From the responses received, it is evident that biocuration is performed by highly trained scientists and perceived to be a stimulating career, offering both intellectual challenges and the satisfaction of performing work essential to the modern scientific community. It is also apparent that the ISB has at least a dual role to play to facilitate biocurators’ work: (i) to promote biocuration as a career within the greater scientific community; (ii) to aid the development of resources for biomedical research through promotion of nomenclature and data-sharing standards that will allow interconnection of biological databases and better exploit the pivotal contributions that biocurators are making. Database URL: http://biocurator.org


PLOS ONE | 2015

Conservation and Losses of Non-Coding RNAs in Avian Genomes

Paul P. Gardner; Mario Fasold; Sarah W. Burge; Maria Ninova; Jana Hertel; Stephanie Kehr; Tammy E. Steeves; Sam Griffiths-Jones; Peter F. Stadler

Here we present the results of a large-scale bioinformatics annotation of non-coding RNA loci in 48 avian genomes. Our approach uses probabilistic models of hand-curated families from the Rfam database to infer conserved RNA families within each avian genome. We supplement these annotations with predictions from the tRNA annotation tool tRNAscan-SE and with microRNAs from miRBase. We identify 34 lncRNA-associated loci that are conserved between birds and mammals and validate 12 of these in chicken. We report several intriguing cases where a reported mammalian lncRNA, but not its function, is conserved. We also demonstrate extensive conservation of classical ncRNAs (e.g., tRNAs) and more recently discovered ncRNAs (e.g., snoRNAs and miRNAs) in birds. Furthermore, we describe numerous “losses” of several RNA families, and attribute these to either genuine loss, divergence or missing data. In particular, we show that many of these losses are due to the challenges associated with assembling avian microchromosomes. These combined results illustrate the utility of applying homology-based methods for annotating novel vertebrate genomes.


Current Protocols in Human Genetics | 2016

Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families

Lars Barquist; Sarah W. Burge; Paul P. Gardner

Emerging high‐throughput technologies have led to a deluge of putative non‐coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying “families” of homologous ncRNAs starting from single “seed” sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step‐by‐step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process.


RNA Biology | 2013

Special Focus: Bioinformatics

Eric P. Nawrocki; Sarah W. Burge

The development of RNA bioinformatic tools began more than 30 years ago with the description of the Nussinov and Zuker dynamic programming algorithms for single-sequence RNA secondary structure prediction. Since then, many tools have been developed for various RNA sequence analysis problems such as homology search, multiple sequence alignment, de novo RNA discovery, read-mapping, and many more. In this issue, we have collected a sampling of reviews and original research that demonstrate some of the many ways bioinformatics is integrated with current RNA biology research.
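For readers unfamiliar with the dynamic programming approach mentioned above, here is a minimal sketch of Nussinov-style base-pair maximization. It returns only the maximum number of nested pairs, not the structure itself. The allowed pairs (Watson-Crick plus G-U wobble) and the minimum hairpin loop length of three are assumptions chosen for illustration, and the function name `nussinov_max_pairs` is hypothetical.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq.

    min_loop is the assumed minimum number of unpaired bases enclosed
    by any pair (hairpin loop constraint).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

Maximizing pair count is a simplification. Energy-based methods in the Zuker tradition score stacking and loops instead, which this sketch does not attempt.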

Collaboration


Dive into Sarah W. Burge's collaboration.

Top Co-Authors

Alex Bateman

European Bioinformatics Institute

John G. Tate

European Bioinformatics Institute

Robert D. Finn

European Bioinformatics Institute

Eric P. Nawrocki

Howard Hughes Medical Institute