Jane Loveland
Wellcome Trust Sanger Institute
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Jane Loveland.
Genome Research | 2009
Kim D. Pruitt; Jennifer Harrow; Rachel A. Harte; Craig Wallin; Mark Diekhans; Donna Maglott; Steve Searle; Catherine M. Farrell; Jane Loveland; Barbara J. Ruef; Elizabeth Hart; Marie-Marthe Suner; Melissa J. Landrum; Bronwen Aken; Sarah Ayling; Robert Baertsch; Julio Fernandez-Banet; Joshua L. Cherry; Val Curwen; Michael DiCuccio; Manolis Kellis; Jennifer M. Lee; Michael F. Lin; Michael Schuster; Andrew Shkeda; Clara Amid; Garth Brown; Oksana Dukhanina; Adam Frankish; Jennifer Hart
Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.
Nature | 2006
Michael C. Zody; Manuel Garber; David J. Adams; Ted Sharpe; Jennifer Harrow; James R. Lupski; Christine Nicholson; Steven M. Searle; Laurens Wilming; Sarah K. Young; Amr Abouelleil; Nicole R. Allen; Weimin Bi; Toby Bloom; Mark L. Borowsky; Boris Bugalter; Jonathan Butler; Jean L. Chang; Chao-Kung Chen; April Cook; Benjamin Corum; Christina A. Cuomo; Pieter J. de Jong; David DeCaprio; Ken Dewar; Michael Fitzgerald; James Gilbert; Richard Gibson; Sante Gnerre; Steven Goldstein
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome, mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome. It is also enriched in segmental duplications, ranking third in density among the autosomes. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution, the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.
Nucleic Acids Research | 2014
Catherine M. Farrell; Nuala A. O’Leary; Rachel A. Harte; Jane Loveland; Laurens Wilming; Craig Wallin; Mark Diekhans; Daniel Barrell; Stephen M. J. Searle; Bronwen Aken; Susan M. Hiatt; Adam Frankish; Marie-Marthe Suner; Bhanu Rajput; Charles A. Steward; Garth Brown; Ruth Bennett; Michael R. Murphy; Wendy Wu; Mike Kay; Jennifer Hart; Jeena Rajan; Janet Weber; Catherine Snow; Lillian D. Riddick; Toby Hunt; David Webb; Mark G. Thomas; Pamela Tamez; Sanjida H. Rangwala
The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.
BMC Genomics | 2013
Harry Dawson; Jane Loveland; Géraldine Pascal; James Gilbert; Hirohide Uenishi; Katherine Mann; Yongming Sang; Jie Zhang; Denise R. Carvalho-Silva; Toby Hunt; Matthew Hardy; Zhi-Liang Hu; Shuhong Zhao; Anna Anselmo; Hiroki Shinkai; Celine Chen; Bouabid Badaoui; Daniel Berman; Clara Amid; Mike Kay; David Lloyd; Catherine Snow; Takeya Morozumi; Ryan Pei-Yen Cheng; Megan Bystrom; Ronan Kapetanovic; John C. Schwartz; Ranjit Singh Kataria; Matthew Astley; Eric Fritz
BackgroundThe domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems.ResultsThe Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome.ConclusionsThis extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response.
Database | 2012
Rachel A. Harte; Catherine M. Farrell; Jane Loveland; Marie-Marthe Suner; Laurens Wilming; Bronwen Aken; Daniel Barrell; Adam Frankish; Craig Wallin; Steve Searle; Mark Diekhans; Jennifer Harrow; Kim D. Pruitt
The Consensus Coding Sequence (CCDS) collaboration involves curators at multiple centers with a goal of producing a conservative set of high quality, protein-coding region annotations for the human and mouse reference genome assemblies. The CCDS data set reflects a ‘gold standard’ definition of best supported protein annotations, and corresponding genes, which pass a standard series of quality assurance checks and are supported by manual curation. This data set supports use of genome annotation information by human and mouse researchers for effective experimental design, analysis and interpretation. The CCDS project consists of analysis of automated whole-genome annotation builds to identify identical CDS annotations, quality assurance testing and manual curation support. Identical CDS annotations are tracked with a CCDS identifier (ID) and any future change to the annotated CDS structure must be agreed upon by the collaborating members. CCDS curation guidelines were developed to address some aspects of curation in order to improve initial annotation consistency and to reduce time spent in discussing proposed annotation updates. Here, we present the current status of the CCDS database and details on our procedures to track and coordinate our efforts. We also present the relevant background and reasoning behind the curation standards that we have developed for CCDS database treatment of transcripts that are nonsense-mediated decay (NMD) candidates, for transcripts containing upstream open reading frames, for identifying the most likely translation start codons and for the annotation of readthrough transcripts. Examples are provided to illustrate the application of these guidelines. Database URL: http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi
American Journal of Human Genetics | 2005
Michael S. Jackson; K. Oliver; Jane Loveland; Sean Humphray; Ian Dunham; Mariano Rocchi; Luigi Viggiano; Jonathan P. Park; Mauro Santibanez-Koref
Approximately 5% of the human genome consists of segmental duplications that can cause genomic mutations and may play a role in gene innovation. Reticulate evolutionary processes, such as unequal crossing-over and gene conversion, are known to occur within specific duplicon families, but the broader contribution of these processes to the evolution of human duplications remains poorly characterized. Here, we use phylogenetic profiling to analyze multiple alignments of 24 human duplicon families that span >8 Mb of DNA. Our results indicate that none of them are evolving independently, with all alignments showing sharp discontinuities in phylogenetic signal consistent with reticulation. To analyze these results in more detail, we have developed a quartet method that estimates the relative contribution of nucleotide substitution and reticulate processes to sequence evolution. Our data indicate that most of the duplications show a highly significant excess of sites consistent with reticulate evolution, compared with the number expected by nucleotide substitution alone, with 15 of 30 alignments showing a >20-fold excess over that expected. Using permutation tests, we also show that at least 5% of the total sequence shares 100% sequence identity because of reticulation, a figure that includes 74 independent tracts of perfect identity >2 kb in length. Furthermore, analysis of a subset of alignments indicates that the density of reticulation events is as high as 1 every 4 kb. These results indicate that phylogenetic relationships within recently duplicated human DNA can be rapidly disrupted by reticulate evolution. This finding has important implications for efforts to finish the human genome sequence, complicates comparative sequence analysis of duplicon families, and could profoundly influence the tempo of gene-family evolution.
Briefings in Bioinformatics | 2013
Allegra Via; Thomas Blicher; Erik Bongcam-Rudloff; Michelle D. Brazas; Catherine Brooksbank; Aidan Budd; Javier De Las Rivas; Jacqueline Dreyer; Pedro L. Fernandes; Celia W. G. van Gelder; Joachim Jacob; Rafael C. Jimenez; Jane Loveland; Federico Morán; Nicola Mulder; Tommi Nyrönen; Kristian Rother; Maria Victoria Schneider; Teresa K. Attwood
The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.
Nucleic Acids Research | 2014
Jennifer Harrow; Charles A. Steward; Adam Frankish; James Gilbert; José Manuel Rodríguez González; Jane Loveland; Jonathan M. Mudge; Daniel Sheppard; Mark G. Thomas; Stephen J. Trevanion; Laurens Wilming
The Vertebrate Genome Annotation (VEGA) database (http://vega.sanger.ac.uk), initially designed as a community resource for browsing manual annotation of the human genome project, now contains five reference genomes (human, mouse, zebrafish, pig and rat). Its introduction pages have been redesigned to enable the user to easily navigate between whole genomes and smaller multi-species haplotypic regions of interest such as the major histocompatibility complex. The VEGA browser is unique in that annotation is updated via the Human And Vertebrate Analysis aNd Annotation (HAVANA) update track every 2 weeks, allowing single gene updates to be made publicly available to the research community quickly. The user can now access different haplotypic subregions more easily, such as those from the non-obese diabetic mouse, and display them in a more intuitive way using the comparative tools. We also highlight how the user can browse manually annotated updated patches from the Genome Reference Consortium (GRC).
Briefings in Bioinformatics | 2012
Maria Victoria Schneider; Peter Walter; Marie-Claude Blatter; James D. Watson; Michelle D. Brazas; Kristian Rother; Aidan Budd; Allegra Via; Celia W. G. van Gelder; Joachim Jacob; Pedro L. Fernandes; Tommi Nyrönen; Javier De Las Rivas; Thomas Blicher; Rafael C. Jimenez; Jane Loveland; Jennifer McDowall; P. D. Jones; Brendan W. Vaughan; Rodrigo Lopez; Teresa K. Attwood; Catherine Brooksbank
Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response to the development of ‘high-throughput biology’, the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many Institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials. Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review it, respectively, to similar initiatives and collections.
Genome Research | 2016
Benjamin M. Skinner; Carole A. Sargent; Carol Churcher; Toby Hunt; Javier Herrero; Jane Loveland; Matthew Dunn; Sandra Louzada; Beiyuan Fu; William Chow; James Gilbert; Siobhan Austin-Guest; Kathryn Beal; Denise R. Carvalho-Silva; William Cheng; Daria Gordon; Darren Grafham; Matt Hardy; Jo Harley; Heidi Hauser; Philip Howden; Kerstin Howe; Kim Lachani; Peter Ji Ellis; Daniel Kelly; Giselle Kerry; James Kerwin; Bee Ling Ng; Glen Threadgold; Thomas Wileman
We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes--both single copy and amplified--on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution.