Susan Tweedie
University of Cambridge
Publication
Featured research published by Susan Tweedie.
Nucleic Acids Research | 2009
Susan Tweedie; Michael Ashburner; Kathleen Falls; Paul Leyland; Peter McQuilton; Steven J. Marygold; Gillian Millburn; David Osumi-Sutherland; Andrew Schroeder; Ruth Seal; Haiyan Zhang
FlyBase (http://flybase.org) is a database of Drosophila genetic and genomic information. Gene Ontology (GO) terms are used to describe three attributes of wild-type gene products: their molecular function, the biological processes in which they play a role, and their subcellular location. This article describes recent changes to the FlyBase GO annotation strategy that are improving the quality of the GO annotation data. Many of these changes stem from our participation in the GO Reference Genome Annotation Project--a multi-database collaboration producing comprehensive GO annotation sets for 12 diverse species.
Nucleic Acids Research | 2008
Midori A. Harris; Jennifer I. Deegan; Amelia Ireland; Jane Lomax; Michael Ashburner; Susan Tweedie; Seth Carbon; Suzanna E. Lewis; Christopher J. Mungall; John Richter; Karen Eilbeck; Judith A. Blake; Alexander D. Diehl; Mary E. Dolan; Harold Drabkin; Janan T. Eppig; David P. Hill; Ni Li; Martin Ringwald; Rama Balakrishnan; Gail Binkley; J. Michael Cherry; Karen R. Christie; Maria C. Costanzo; Qing Dong; Stacia R. Engel; Dianna G. Fisk; Jodi E. Hirschman; Benjamin C. Hitz; Eurie L. Hong
The Gene Ontology (GO) project (http://www.geneontology.org/) provides a set of structured, controlled vocabularies for community use in annotating genes, gene products and sequences (also see http://www.sequenceontology.org/). The ontologies have been extended and refined for several biological areas, and improvements to the structure of the ontologies have been implemented. To improve the quantity and quality of gene product annotations available from its public repository, the GO Consortium has launched a focused effort to provide comprehensive and detailed annotation of orthologous genes across a number of ‘reference’ genomes, including human and several key model organisms. Software developments include two releases of the ontology-editing tool OBO-Edit, and improvements to the AmiGO browser interface.
PLOS Computational Biology | 2009
Pascale Gaudet; Rex L. Chisholm; Tanya Z. Berardini; Emily Dimmer; Stacia R. Engel; Petra Fey; David P. Hill; Doug Howe; James C. Hu; Rachael P. Huntley; Varsha K. Khodiyar; Ranjana Kishore; Donghui Li; Ruth C. Lovering; Fiona M. McCarthy; Li Ni; Victoria Petri; Deborah A. Siegele; Susan Tweedie; Kimberly Van Auken; Valerie Wood; Siddhartha Basu; Seth Carbon; Mary E. Dolan; Christopher J. Mungall; Kara Dolinski; Paul D. Thomas; Michael Ashburner; Judith A. Blake; J. Michael Cherry
The Gene Ontology (GO) is a collaborative effort that provides structured vocabularies for annotating the molecular function, biological role, and cellular location of gene products in a highly systematic way and in a species-neutral manner with the aim of unifying the representation of gene function across different organisms. Each contributing member of the GO Consortium independently associates GO terms to gene products from the organism(s) they are annotating. Here we introduce the Reference Genome project, which brings together those independent efforts into a unified framework based on the evolutionary relationships between genes in these different organisms. The Reference Genome project has two primary goals: to increase the depth and breadth of annotations for genes in each of the organisms in the project, and to create data sets and tools that enable other genome annotation efforts to infer GO annotations for homologous genes in their organisms. In addition, the project has several important incidental benefits, such as increasing annotation consistency across genome databases, and providing important improvements to the GO's logical structure and biological content.
Database | 2014
Yuqing Mao; Kimberly Van Auken; Donghui Li; Cecilia N. Arighi; Peter McQuilton; G. Thomas Hayman; Susan Tweedie; Mary L. Schaeffer; Stanley J. F. Laulederkind; Shur Jen Wang; Julien Gobeill; Patrick Ruch; Anh Tuan Luu; Jung Jae Kim; Jung-Hsien Chiang; Yu De Chen; Chia Jung Yang; Hongfang Liu; Dongqing Zhu; Yanpeng Li; Hong Yu; Ehsan Emadzadeh; Graciela Gonzalez; Jian Ming Chen; Hong Jie Dai; Zhiyong Lu
Gene Ontology (GO) annotation is a common task among model organism databases (MODs) for capturing gene function data from journal articles. It is a time-consuming and labor-intensive task, and is thus often considered as one of the bottlenecks in literature curation. There is a growing need for semiautomated or fully automated GO curation techniques that will help database curators to rapidly and accurately identify gene function information in full-length articles. Despite multiple attempts in the past, few studies have proven to be useful with regard to assisting real-world GO curation. The shortage of sentence-level training data and opportunities for interaction between text-mining developers and GO curators has limited the advances in algorithm development and corresponding use in practical circumstances. To this end, we organized a text-mining challenge task for literature-based GO annotation in BioCreative IV. More specifically, we developed two subtasks: (i) to automatically locate text passages that contain GO-relevant information (a text retrieval task) and (ii) to automatically identify relevant GO terms for the genes in a given article (a concept-recognition task). With the support from five MODs, we provided teams with >4000 unique text passages that served as the basis for each GO annotation in our task data. Such evidence text information has long been recognized as critical for text-mining algorithm development but was never made available because of the high cost of curation. In total, seven teams participated in the challenge task. From the team results, we conclude that the state of the art in automatically mining GO terms from literature has improved over the past decade while much progress is still needed for computer-assisted GO curation. 
Future work should focus on addressing remaining technical challenges for improved performance of automatic GO concept recognition and incorporating practical benefits of text-mining tools into real-world GO annotation. Database URL: http://www.biocreative.org/tasks/biocreative-iv/track-4-GO/.
Developmental Biology | 2011
Varsha K. Khodiyar; David P. Hill; Doug Howe; Tanya Z. Berardini; Susan Tweedie; Philippa J. Talmud; Ross A. Breckenridge; Shoumo Bhattacharya; Paul R. Riley; Peter J. Scambler; Ruth C. Lovering
An understanding of heart development is critical in any systems biology approach to cardiovascular disease. The interpretation of data generated from high-throughput technologies (such as microarray and proteomics) is also essential to this approach. However, characterizing the role of genes in the processes underlying heart development and cardiovascular disease involves the non-trivial task of data analysis and integration of previous knowledge. The Gene Ontology (GO) Consortium provides structured controlled biological vocabularies that are used to summarize previous functional knowledge for gene products across all species. One aspect of GO describes biological processes, such as development and signaling. In order to support high-throughput cardiovascular research, we have initiated an effort to fully describe heart development in GO; expanding the number of GO terms describing heart development from 12 to over 280. This new ontology describes heart morphogenesis, the differentiation of specific cardiac cell types, and the involvement of signaling pathways in heart development. This work also aligns GO with the current views of the heart development research community and its representation in the literature. This extension of GO allows gene product annotators to comprehensively capture the genetic program leading to the developmental progression of the heart. This will enable users to integrate heart development data across species, resulting in the comprehensive retrieval of information about this subject. The revised GO structure, combined with gene product annotations, should improve the interpretation of data from high-throughput methods in a variety of cardiovascular research areas, including heart development, congenital cardiac disease, and cardiac stem cell research. Additionally, we invite the heart development community to contribute to the expansion of this important dataset for the benefit of future research in this area.
Disease Models & Mechanisms | 2016
Gillian Millburn; Madeline A. Crosby; L. Sian Gramates; Susan Tweedie
The use of Drosophila melanogaster as a model for studying human disease is well established, reflected by the steady increase in both the number and proportion of fly papers describing human disease models in recent years. In this article, we highlight recent efforts to improve the availability and accessibility of the disease model information in FlyBase (http://flybase.org), the model organism database for Drosophila. FlyBase has recently introduced Human Disease Model Reports, each of which presents background information on a specific disease, a tabulation of related disease subtypes, and summaries of experimental data and results using fruit flies. Integrated presentations of relevant data and reagents described in other sections of FlyBase are incorporated into these reports, which are specifically designed to be accessible to non-fly researchers in order to promote collaboration across model organism communities working in translational science. Another key component of disease model information in FlyBase is that data are collected in a consistent format – using the evolving Disease Ontology (an open-source standardized ontology for human-disease-associated biomedical data) – to allow robust and intuitive searches. To facilitate this, FlyBase has developed a dedicated tool for querying and navigating relevant data, which include mutations that model a disease and any associated interacting modifiers. In this article, we describe how data related to fly models of human disease are presented in individual Gene Reports and in the Human Disease Model Reports. Finally, we discuss search strategies and new query tools that are available to access the disease model data in FlyBase. Drosophila Collection: Drosophila melanogaster is well established as a model for studying human disease. Here, we highlight recent efforts to enhance the availability and accessibility of disease model data in FlyBase, the model organism database for Drosophila.
PLOS ONE | 2013
Robert Hoehndorf; Nigel Hardy; David Osumi-Sutherland; Susan Tweedie; Paul N. Schofield; Georgios V. Gkoutos
High-throughput phenotyping projects in model organisms have the potential to improve our understanding of gene functions and their role in living organisms. We have developed a computational, knowledge-based approach to automatically infer gene functions from phenotypic manifestations and applied this approach to yeast (Saccharomyces cerevisiae), nematode worm (Caenorhabditis elegans), zebrafish (Danio rerio), fruitfly (Drosophila melanogaster) and mouse (Mus musculus) phenotypes. Our approach is based on the assumption that, if a mutation in a gene G leads to a phenotypic abnormality in a process P, then G must have been involved in P, either directly or indirectly. We systematically analyze recorded phenotypes in animal models using the formal definitions created for phenotype ontologies. We evaluate the validity of the inferred functions manually and by demonstrating a significant improvement in predicting genetic interactions and protein-protein interactions based on functional similarity. Our knowledge-based approach is generally applicable to phenotypes recorded in model organism databases, including phenotypes from large-scale, high throughput community projects whose primary mode of dissemination is direct publication on-line rather than in the literature.
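The assumption stated in this abstract (a mutant phenotype that disrupts a process implies the mutated gene takes part in that process) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gene names, phenotype labels, the PHENOTYPE_TO_PROCESS table and the infer_functions helper are all hypothetical, and a plain dictionary lookup stands in for the formal phenotype-ontology definitions the abstract refers to.

```python
from collections import defaultdict

# Hypothetical mapping from a phenotype label to the process it disrupts.
# In the published approach this link comes from formal ontology definitions;
# a plain dictionary is used here purely for illustration.
PHENOTYPE_TO_PROCESS = {
    "abnormal heart looping": "heart looping",
    "abnormal wing vein pattern": "wing vein morphogenesis",
}

# Hypothetical mutant records: (gene, observed phenotype).
OBSERVATIONS = [
    ("geneA", "abnormal heart looping"),
    ("geneB", "abnormal wing vein pattern"),
    ("geneA", "abnormal wing vein pattern"),
    ("geneC", "reduced body size"),  # no mapped process, so nothing is inferred
]


def infer_functions(observations, phenotype_to_process):
    """Return {gene: set of processes} inferred from mutant phenotypes."""
    inferred = defaultdict(set)
    for gene, phenotype in observations:
        process = phenotype_to_process.get(phenotype)
        if process is not None:
            # A mutation in the gene disrupts the process, so the gene is
            # assumed to be involved in it, directly or indirectly.
            inferred[gene].add(process)
    return dict(inferred)


if __name__ == "__main__":
    for gene, processes in sorted(infer_functions(OBSERVATIONS, PHENOTYPE_TO_PROCESS).items()):
        print(gene, sorted(processes))
```

The sketch infers involvement only, not whether the role is direct or indirect, which matches the wording of the assumption.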
Database | 2014
Kimberly Van Auken; Mary L. Schaeffer; Peter McQuilton; Stanley J. F. Laulederkind; Donghui Li; Shur-Jen Wang; G. Thomas Hayman; Susan Tweedie; Cecilia N. Arighi; James Done; Hans-Michael Müller; Paul W. Sternberg; Yuqing Mao; Chih-Hsuan Wei; Zhiyong Lu
Gene function curation via Gene Ontology (GO) annotation is a common task among Model Organism Database groups. Owing to its manual nature, this task is considered one of the bottlenecks in literature curation. There have been many previous attempts at automatic identification of GO terms and supporting information from full text. However, few systems have delivered an accuracy that is comparable with humans. One recognized challenge in developing such systems is the lack of marked sentence-level evidence text that provides the basis for making GO annotations. We aim to create a corpus that includes the GO evidence text along with the three core elements of GO annotations: (i) a gene or gene product, (ii) a GO term and (iii) a GO evidence code. To ensure our results are consistent with real-life GO data, we recruited eight professional GO curators and asked them to follow their routine GO annotation protocols. Our annotators marked up more than 5000 text passages in 200 articles for 1356 distinct GO terms. For evidence sentence selection, the inter-annotator agreement (IAA) results are 9.3% (strict) and 42.7% (relaxed) in F1-measures. For GO term selection, the IAAs are 47% (strict) and 62.9% (hierarchical). Our corpus analysis further shows that abstracts contain ∼10% of relevant evidence sentences and 30% distinct GO terms, while the Results/Experiment section has nearly 60% relevant sentences and >70% GO terms. Further, of those evidence sentences found in abstracts, less than one-third contain enough experimental detail to fulfill the three core criteria of a GO annotation. This result demonstrates the need of using full-text articles for text mining GO annotations. Through its use at the BioCreative IV GO (BC4GO) task, we expect our corpus to become a valuable resource for the BioNLP research community. Database URL: http://www.biocreative.org/resources/corpora/bc-iv-go-task-corpus/.
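The inter-annotator agreement figures in this abstract are reported as F1-measures. As a rough illustration of how such a score can be computed for two annotators, the Python sketch below treats each annotator's selections as a set of identifiers and counts only exact matches. The f1_agreement helper and the sample identifiers are hypothetical, and the abstract does not spell out its strict, relaxed or hierarchical matching criteria, so exact set overlap is an assumption here.

```python
def f1_agreement(annotator_a, annotator_b):
    """F1 between two annotators' selections, treating annotator_a as reference.

    Items are compared by exact equality (an assumed, strict-style criterion).
    """
    a, b = set(annotator_a), set(annotator_b)
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(b)  # share of B's selections that A also chose
    recall = overlap / len(a)     # share of A's selections that B also chose
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical sentence identifiers chosen by two curators for one article.
    curator_1 = {"s1", "s2", "s3", "s4"}
    curator_2 = {"s3", "s4", "s5"}
    print(round(f1_agreement(curator_1, curator_2), 3))
```

Because F1 is symmetric in precision and recall, swapping the two annotators gives the same score, which makes it a convenient agreement measure when neither annotator is a gold standard.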
PLOS ONE | 2014
Yasmin Alam-Faruque; David P. Hill; Emily Dimmer; Midori A. Harris; Rebecca E. Foulger; Susan Tweedie; Helen Attrill; Douglas G. Howe; Stephen Randall Thomas; Duncan Davidson; Adrian S. Woolf; Judith A. Blake; Christopher J. Mungall; Claire O’Donovan; Rolf Apweiler; Rachael P. Huntley
Gene Ontology (GO) provides dynamic controlled vocabularies to aid in the description of the functional biological attributes and subcellular locations of gene products from all taxonomic groups (www.geneontology.org). Here we describe collaboration between the renal biomedical research community and the GO Consortium to improve the quality and quantity of GO terms describing renal development. In the associated annotation activity, the new and revised terms were associated with gene products involved in renal development and function. This project resulted in a total of 522 GO terms being added to the ontology and the creation of approximately 9,600 kidney-related GO term associations to 940 UniProt Knowledgebase (UniProtKB) entries, covering 66 taxonomic groups. We demonstrate the impact of these improvements on the interpretation of GO term analyses performed on genes differentially expressed in kidney glomeruli affected by diabetic nephropathy. In summary, we have produced a resource that can be utilized in the interpretation of data from small- and large-scale experiments investigating molecular mechanisms of kidney function and development and thereby help towards alleviating renal disease.
Circulation: Genomic and Precision Medicine, 11 (2), Article e001813 | 2018
Ruth C. Lovering; Paola Roncaglia; Douglas G. Howe; Stanley J. F. Laulederkind; Varsha K. Khodiyar; Tanya Z. Berardini; Susan Tweedie; Rebecca E. Foulger; David Osumi-Sutherland; Nancy H. Campbell; Rachael P. Huntley; Philippa J. Talmud; Judith A. Blake; Ross A. Breckenridge; Paul R. Riley; Pier D. Lambiase; Perry M. Elliott; Lucie H. Clapp; Andrew Tinker; David P. Hill
Background: A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. Methods and Results: In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. Conclusions: We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects.