Beverley B. Matthews | Researchain

Publication

Featured research published by Beverley B. Matthews.

Genome Biology | 2002

Apollo: a sequence annotation editor

Suzanna E. Lewis; S. M. J. Searle; Nomi L. Harris; M. Gibson; Vivek Iyer; John Richter; C. Wiel; Leyla Bayraktaroglu; Ewan Birney; Madeline A. Crosby; Joshua S Kaminker; Beverley B. Matthews; Simon E Prochnik; Christopher D. Smith; Jonathan L Tupy; Gerald M. Rubin; Sima Misra; Christopher J. Mungall; Michele Clamp

The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool to allow biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them. FlyBase biologists successfully used Apollo to annotate the Drosophila melanogaster genome and it is increasingly being used as a starting point for the development of customized annotation editing tools for other genome projects.

Genome Biology | 2002

Annotation of the Drosophila melanogaster euchromatic genome: a systematic review

Sima Misra; Madeline A. Crosby; Christopher J. Mungall; Beverley B. Matthews; Kathryn S. Campbell; Pavel Hradecky; Yanmei Huang; Joshua S Kaminker; Gillian Millburn; Simon E Prochnik; Christopher D. Smith; Jonathan L Tupy; Eleanor J Whitfield; Leyla Bayraktaroglu; Benjamin P. Berman; Brian Bettencourt; Susan E. Celniker; Aubrey D.N.J. de Grey; Rachel Drysdale; Nomi L. Harris; John Richter; Susan Russo; Andrew J. Schroeder; ShengQiang Shu; Mark Stapleton; Chihiro Yamada; Michael Ashburner; William M. Gelbart; Gerald M. Rubin; Suzanna E. Lewis

Background: The recent completion of the Drosophila melanogaster genomic sequence to high quality and the availability of a greatly expanded set of Drosophila cDNA sequences, aligning to 78% of the predicted euchromatic genes, afforded FlyBase the opportunity to significantly improve genomic annotations. We made the annotation process more rigorous by inspecting each gene visually, utilizing a comprehensive set of curation rules, requiring traceable evidence for each gene model, and comparing each predicted peptide to SWISS-PROT and TrEMBL sequences.
Results: Although the number of predicted protein-coding genes in Drosophila remains essentially unchanged, the revised annotation significantly improves gene models, resulting in structural changes to 85% of the transcripts and 45% of the predicted proteins. We annotated transposable elements and non-protein-coding RNAs as new features, and extended the annotation of untranslated (UTR) sequences and alternative transcripts to include more than 70% and 20% of genes, respectively. Finally, cDNA sequence provided evidence for dicistronic transcripts, neighboring genes with overlapping UTRs on the same DNA sequence strand, alternatively spliced genes that encode distinct, non-overlapping peptides, and numerous nested genes.
Conclusions: Identification of so many unusual gene models not only suggests that some mechanisms for gene regulation are more prevalent than previously believed, but also underscores the complex challenges of eukaryotic gene prediction. At present, experimental data and human curation remain essential to generate high-quality genome annotations.

Nucleic Acids Research | 2017

FlyBase at 25: looking to the future

L. Sian Gramates; Steven J. Marygold; Gilberto dos Santos; Jose-Maria Urbano; Giulia Antonazzo; Beverley B. Matthews; Alix J. Rey; Christopher J. Tabone; Madeline A. Crosby; David B. Emmert; Kathleen Falls; Joshua L. Goodman; Yanhui Hu; Laura Ponting; Andrew J. Schroeder; Victor B. Strelets; Jim Thurmond; Pinglei Zhou

Since 1992, FlyBase (flybase.org) has been an essential online resource for the Drosophila research community. Concentrating on the most extensively studied species, Drosophila melanogaster, FlyBase includes information on genes (molecular and genetic), transgenic constructs, phenotypes, genetic and physical interactions, and reagents such as stocks and cDNAs. Access to data is provided through a number of tools, reports, and bulk-data downloads. Looking to the future, FlyBase is expanding its focus to serve a broader scientific community. In this update, we describe new features, datasets, reagent collections, and data presentations that address this goal, including enhanced orthology data, Human Disease Model Reports, protein domain search and visualization, concise gene summaries, a portal for external resources, video tutorials and the FlyBase Community Advisory Group.

BMC Bioinformatics | 2012

Automatic categorization of diverse experimental information in the bioscience literature

Ruihua Fang; Gary Schindelman; Kimberly Van Auken; Jolene S. Fernandes; Wen Chen; Xiaodong Wang; Paul Davis; Mary Ann Tuli; Steven J. Marygold; Gillian Millburn; Beverley B. Matthews; Haiyan Zhang; Nicholas H. Brown; William M. Gelbart; Paul W. Sternberg

Background: Curation of information from bioscience literature into biological knowledge databases is a crucial way of capturing experimental information in a computable form. During the biocuration process, a critical first step is to identify from all published literature the papers that contain results for a specific data type the curator is interested in annotating. This step normally requires curators to manually examine many papers to ascertain which few contain information of interest and thus, is usually time consuming. We developed an automatic method for identifying papers containing these curation data types among a large pool of published scientific papers based on the machine learning method Support Vector Machine (SVM). This classification system is completely automatic and can be readily applied to diverse experimental data types. It has been in use in production for automatic categorization of 10 different experimental datatypes in the biocuration process at WormBase for the past two years and it is in the process of being adopted in the biocuration process at FlyBase and the Saccharomyces Genome Database (SGD). We anticipate that this method can be readily adopted by various databases in the biocuration community and thereby greatly reducing time spent on an otherwise laborious and demanding task. We also developed a simple, readily automated procedure to utilize training papers of similar data types from different bodies of literature such as C. elegans and D. melanogaster to identify papers with any of these data types for a single database. This approach has great significance because for some data types, especially those of low occurrence, a single corpus often does not have enough training papers to achieve satisfactory performance.
Results: We successfully tested the method on ten data types from WormBase, fifteen data types from FlyBase and three data types from Mouse Genomics Informatics (MGI). It is being used in the curation work flow at WormBase for automatic association of newly published papers with ten data types including RNAi, antibody, phenotype, gene regulation, mutant allele sequence, gene expression, gene product interaction, overexpression phenotype, gene interaction, and gene structure correction.
Conclusions: Our methods are applicable to a variety of data types with training set containing several hundreds to a few thousand documents. It is completely automatic and, thus can be readily incorporated to different workflow at different literature-based databases. We believe that the work presented here can contribute greatly to the tremendous task of automating the important yet labor-intensive biocuration effort.

G3: Genes, Genomes, Genetics | 2015

Gene Model Annotations for Drosophila melanogaster: Impact of High-Throughput Data

Beverley B. Matthews; Gilberto dos Santos; Madeline A. Crosby; David B. Emmert; Susan E. St. Pierre; L. Sian Gramates; Pinglei Zhou; Andrew J. Schroeder; Kathleen Falls; Victor B. Strelets; Susan Russo; William M. Gelbart

We report the current status of the FlyBase annotated gene set for Drosophila melanogaster and highlight improvements based on high-throughput data. The FlyBase annotated gene set consists entirely of manually annotated gene models, with the exception of some classes of small non-coding RNAs. All gene models have been reviewed using evidence from high-throughput datasets, primarily from the modENCODE project. These datasets include RNA-Seq coverage data, RNA-Seq junction data, transcription start site profiles, and translation stop-codon read-through predictions. New annotation guidelines were developed to take into account the use of the high-throughput data. We describe how this flood of new data was incorporated into thousands of new and revised annotations. FlyBase has adopted a philosophy of excluding low-confidence and low-frequency data from gene model annotations; we also do not attempt to represent all possible permutations for complex and modularly organized genes. This has allowed us to produce a high-confidence, manageable gene annotation dataset that is available at FlyBase (http://flybase.org). Interesting aspects of new annotations include new genes (coding, non-coding, and antisense), many genes with alternative transcripts with very long 3′ UTRs (up to 15–18 kb), and a stunning mismatch in the number of male-specific genes (approximately 13% of all annotated gene models) vs. female-specific genes (less than 1%). The number of identified pseudogenes and mutations in the sequenced strain also increased significantly. We discuss remaining challenges, for instance, identification of functional small polypeptides and detection of alternative translation starts.

G3: Genes, Genomes, Genetics | 2015

Gene Model Annotations for Drosophila melanogaster: The Rule-Benders

Madeline A. Crosby; L. Sian Gramates; Gilberto dos Santos; Beverley B. Matthews; Susan E. St. Pierre; Pinglei Zhou; Andrew J. Schroeder; Kathleen Falls; David B. Emmert; Susan Russo; William M. Gelbart

In the context of the FlyBase annotated gene models in Drosophila melanogaster, we describe the many exceptional cases we have curated from the literature or identified in the course of FlyBase analysis. These range from atypical but common examples such as dicistronic and polycistronic transcripts, noncanonical splices, trans-spliced transcripts, noncanonical translation starts, and stop-codon readthroughs, to single exceptional cases such as ribosomal frameshifting and HAC1-type intron processing. In FlyBase, exceptional genes and transcripts are flagged with Sequence Ontology terms and/or standardized comments. Because some of the rule-benders create problems for handlers of high-throughput data, we discuss plans for flagging these cases in bulk data downloads.

Current protocols in human genetics | 2016

Exploring FlyBase Data Using QuickSearch

Steven J. Marygold; Giulia Antonazzo; Helen Attrill; Marta Costa; Madeline A. Crosby; Gilberto dos Santos; Joshua L. Goodman; L. Sian Gramates; Beverley B. Matthews; Alix J. Rey; Jim Thurmond

FlyBase (flybase.org) is the primary online database of genetic, genomic, and functional information about Drosophila species, with a major focus on the model organism Drosophila melanogaster. The long and rich history of Drosophila research, combined with recent surges in genomic‐scale and high‐throughput technologies, mean that FlyBase now houses a huge quantity of data. Researchers need to be able to rapidly and intuitively query these data, and the QuickSearch tool has been designed to meet these needs. This tool is conveniently located on the FlyBase homepage and is organized into a series of simple tabbed interfaces that cover the major data and annotation classes within the database. This unit describes the functionality of all aspects of the QuickSearch tool. With this knowledge, FlyBase users will be equipped to take full advantage of all QuickSearch features and thereby gain improved access to data relevant to their research.

Nucleic Acids Research | 2018

FlyBase 2.0: the next generation

Jim Thurmond; Joshua L. Goodman; Victor B. Strelets; Helen Attrill; L. Sian Gramates; Steven J. Marygold; Beverley B. Matthews; Gillian Millburn; Giulia Antonazzo; Vítor Trovisco; Thomas C. Kaufman; Brian R. Calvi; Norbert Perrimon; Susan Russo Gelbart; Julie Agapite; Kris Broll; Lynn Crosby; Gilberto dos Santos; David B. Emmert; Kathleen Falls; Victoria Jenkins; Beverley Matthews; Carol Sutherland; Christopher J. Tabone; Pinglei Zhou; Mark Zytkovicz; Nicholas H. Brown; Phani Garapati; Alex Holmes; Aoife Larkin

FlyBase (flybase.org) is a knowledge base that supports the community of researchers that use the fruit fly, Drosophila melanogaster, as a model organism. The FlyBase team curates and organizes a diverse array of genetic, molecular, genomic, and developmental information about Drosophila. At the beginning of 2018, ‘FlyBase 2.0’ was released with a significantly improved user interface and new tools. Among these important changes are a new organization of search results into interactive lists or tables (hitlists), enhanced reference lists, and new protein domain graphics. An important new data class called ‘experimental tools’ consolidates information on useful fly strains and other resources related to a specific gene, which significantly enhances the ability of the Drosophila researcher to design and carry out experiments. With the release of FlyBase 2.0, there has also been a restructuring of backend architecture and a continued development of application programming interfaces (APIs) for programmatic access to FlyBase data. In this review, we describe these major new features and functionalities of the FlyBase 2.0 site and how they support the use of Drosophila as a model organism for biological discovery and translational research.
