Publications


Featured research published by Terence Murphy.


Nucleic Acids Research | 2014

RefSeq: an update on mammalian reference sequences

Kim D. Pruitt; Garth Brown; Susan M. Hiatt; Françoise Thibaud-Nissen; Alexander Astashyn; Olga Ermolaeva; Catherine M. Farrell; Jennifer Hart; Melissa J. Landrum; Kelly M. McGarvey; Michael R. Murphy; Nuala A. O’Leary; Shashikant Pujar; Bhanu Rajput; Sanjida H. Rangwala; Lillian D. Riddick; Andrei Shkeda; Hanzhen Sun; Pamela Tamez; Raymond E. Tully; Craig Wallin; David Webb; Janet Weber; Wendy Wu; Michael DiCuccio; Paul Kitts; Donna Maglott; Terence Murphy; James Ostell

The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of annotated genomic, transcript and protein sequence records derived from data in public sequence archives and from computation, curation and collaboration (http://www.ncbi.nlm.nih.gov/refseq/). We report here on growth of the mammalian and human subsets, changes to NCBI’s eukaryotic annotation pipeline and modifications affecting transcript and protein records. Recent changes to NCBI’s eukaryotic genome annotation pipeline provide higher throughput, and the addition of RNAseq data to the pipeline results in a significant expansion of the number of transcripts and novel exons annotated on mammalian RefSeq genomes. Recent annotation changes include reporting supporting evidence for transcript records, modification of exon feature annotation and the addition of a structured report of gene and sequence attributes of biological interest. We also describe a revised protein annotation policy for alternatively spliced transcripts with more divergent predicted proteins and we summarize the current status of the RefSeqGene project.


Nucleic Acids Research | 2016

Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation

Nuala A. O'Leary; Mathew W. Wright; J. Rodney Brister; Stacy Ciufo; Diana Haddad; Richard McVeigh; Bhanu Rajput; Barbara Robbertse; Brian Smith-White; Danso Ako-adjei; Alexander Astashyn; Azat Badretdin; Yiming Bao; Olga Blinkova; Vyacheslav Brover; Vyacheslav Chetvernin; Jinna Choi; Eric Cox; Olga Ermolaeva; Catherine M. Farrell; Tamara Goldfarb; Tripti Gupta; Daniel H. Haft; Eneida Hatcher; Wratko Hlavina; Vinita Joardar; Vamsi K. Kodali; Wenjun Li; Donna Maglott; Patrick Masterson

The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55 000 organisms (>4800 viruses, >40 000 prokaryotes and >10 000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.


Genome Research | 2009

The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes

Kim D. Pruitt; Jennifer Harrow; Rachel A. Harte; Craig Wallin; Mark Diekhans; Donna Maglott; Steve Searle; Catherine M. Farrell; Jane Loveland; Barbara J. Ruef; Elizabeth Hart; Marie-Marthe Suner; Melissa J. Landrum; Bronwen Aken; Sarah Ayling; Robert Baertsch; Julio Fernandez-Banet; Joshua L. Cherry; Val Curwen; Michael DiCuccio; Manolis Kellis; Jennifer M. Lee; Michael F. Lin; Michael Schuster; Andrew Shkeda; Clara Amid; Garth Brown; Oksana Dukhanina; Adam Frankish; Jennifer Hart

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.


BMC Genomics | 2014

Finding the missing honey bee genes: Lessons learned from a genome upgrade

Christine G. Elsik; Kim C. Worley; Anna K. Bennett; Martin Beye; Francisco Camara; Christopher P. Childers; Dirk C. de Graaf; Griet Debyser; Jixin Deng; Bart Devreese; Eran Elhaik; Jay D. Evans; Leonard J. Foster; Dan Graur; Roderic Guigó; Katharina Hoff; Michael Holder; Matthew E. Hudson; Greg J. Hunt; Huaiyang Jiang; Vandita Joshi; Radhika S. Khetani; Peter Kosarev; Christie Kovar; Jian Ma; Ryszard Maleszka; Robin F. A. Moritz; Monica Munoz-Torres; Terence Murphy; Donna M. Muzny

Background: The first generation of genome sequence assemblies and annotations have had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, the study of populations within and across species, and have informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes. Results: Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remaining were inferred based on new RNAseq and protein data. Conclusions: Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination.


Nucleic Acids Research | 2015

Gene: a gene-centered information resource at NCBI

Garth Brown; Vichet Hem; Kenneth S. Katz; Michael Ovetsky; Craig Wallin; Olga Ermolaeva; Igor Tolstoy; Tatiana Tatusova; Kim D. Pruitt; Donna Maglott; Terence Murphy

The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP.
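As a rough illustration of the programmatic access route mentioned in this abstract, the sketch below builds E-Utilities request URLs for the Gene database without sending them. The helper names, the example search term, and the choice of JSON output are assumptions made for illustration; they are not taken from the paper.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-Utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_esearch_url(term: str, retmax: int = 5) -> str:
    """Build an ESearch URL that looks up Gene IDs matching `term`.

    Hypothetical helper: it only constructs the URL and sends nothing.
    """
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def gene_esummary_url(gene_ids: list[int]) -> str:
    """Build an ESummary URL for one or more integer Gene IDs.

    Hypothetical helper: it only constructs the URL and sends nothing.
    """
    ids = ",".join(str(gene_id) for gene_id in gene_ids)
    params = {"db": "gene", "id": ids, "retmode": "json"}
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative query only: search by gene symbol and organism.
    print(gene_esearch_url("BRCA1[sym] AND human[orgn]"))
    # Gene records are keyed by tracked integers; 672 is human BRCA1.
    print(gene_esummary_url([672]))
```

The URLs can be fetched with any HTTP client. Keeping URL construction separate from the request makes the parameters easy to inspect before any traffic is sent to NCBI.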


Genome Biology | 2014

Genome of the house fly, Musca domestica L., a global vector of diseases with adaptations to a septic environment

Jeffrey G. Scott; Wesley C. Warren; Leo W. Beukeboom; Daniel Bopp; Andrew G. Clark; Sarah D. Giers; Monika Hediger; Andrew K. Jones; Shinji Kasai; Cheryl A. Leichter; Ming Li; Richard P. Meisel; Patrick Minx; Terence Murphy; David R. Nelson; William R. Reid; Frank D. Rinkevich; Hugh M. Robertson; Timothy B. Sackton; David B. Sattelle; Françoise Thibaud-Nissen; Chad Tomlinson; Louis Jacobus Mgn Van De Zande; Kimberly K. O. Walden; Richard Wilson; Nannan Liu

Background: Adult house flies, Musca domestica L., are mechanical vectors of more than 100 devastating diseases that have severe consequences for human and animal health. House fly larvae play a vital role as decomposers of animal wastes, and thus live in intimate association with many animal pathogens. Results: We have sequenced and analyzed the genome of the house fly using DNA from female flies. The sequenced genome is 691 Mb. Compared with Drosophila melanogaster, the genome contains a rich resource of shared and novel protein coding genes, a significantly higher amount of repetitive elements, and substantial increases in copy number and diversity of both the recognition and effector components of the immune system, consistent with life in a pathogen-rich environment. There are 146 P450 genes, plus 11 pseudogenes, in M. domestica, representing a significant increase relative to D. melanogaster and suggesting the presence of enhanced detoxification in house flies. Relative to D. melanogaster, M. domestica has also evolved an expanded repertoire of chemoreceptors and odorant binding proteins, many associated with gustation. Conclusions: This represents the first genome sequence of an insect that lives in intimate association with abundant animal pathogens. The house fly genome provides a rich resource for enabling work on innovative methods of insect control, for understanding the mechanisms of insecticide resistance, genetic adaptation to high pathogen loads, and for exploring the basic biology of this important pest. The genome of this species will also serve as a close out-group to Drosophila in comparative genomic studies.


Nucleic Acids Research | 2010

BeetleBase in 2010: revisions to provide comprehensive genomic information for Tribolium castaneum

Hee Shin Kim; Terence Murphy; Jing Xia; Doina Caragea; Yoonseong Park; Richard W. Beeman; Marcé D. Lorenzen; Stephen Butcher; J. Robert Manak; Susan J. Brown

BeetleBase (http://www.beetlebase.org) has been updated to provide more comprehensive genomic information for the red flour beetle Tribolium castaneum. The database contains genomic sequence scaffolds mapped to 10 linkage groups (genome assembly release Tcas_3.0), genetic linkage maps, the official gene set, Reference Sequences from NCBI (RefSeq), predicted gene models, ESTs and whole-genome tiling array data representing several developmental stages. The database was reconstructed using the upgraded Generic Model Organism Database (GMOD) modules. The genomic data is stored in a PostgreSQL relational database using the Chado schema and visualized as tracks in GBrowse. The updated genetic map is visualized using the comparative genetic map viewer CMAP. To enhance the database search capabilities, the BLAST and BLAT search tools have been integrated with the GMOD tools. BeetleBase serves as a long-term repository for Tribolium genomic data, and is compatible with other model organism databases.


Insect Molecular Biology | 2010

AphidBase: a centralized bioinformatic resource for annotation of the pea aphid genome

Fabrice Legeai; Shuji Shigenobu; Jean-Pierre Gauthier; John K. Colbourne; Claude Rispe; Olivier Collin; Stephen Richards; Alex C. C. Wilson; Terence Murphy; Denis Tagu

AphidBase is a centralized bioinformatic resource that was developed to facilitate community annotation of the pea aphid genome by the International Aphid Genomics Consortium (IAGC). The AphidBase Information System, designed to organize and distribute genomic data and annotations for a large international community, was constructed using open source software tools from the Generic Model Organism Database (GMOD). The system includes Apollo and GBrowse utilities as well as a wiki, BLAST search capabilities and a full text search engine. AphidBase strongly supported community cooperation and coordination in the curation of gene models during community annotation of the pea aphid genome. AphidBase can be accessed at http://www.aphidbase.com.


Genome Research | 2017

Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.

Valerie Schneider; Tina A. Graves-Lindsay; Kerstin Howe; Nathan Bouk; Hsiu-Chuan Chen; Paul Kitts; Terence Murphy; Kim D. Pruitt; Françoise Thibaud-Nissen; Derek Albracht; Robert S. Fulton; Milinn Kremitzki; Vincent Magrini; Chris Markovic; Sean McGrath; Karyn Meltz Steinberg; Kate Auger; William Chow; Joanna Collins; Glenn Harden; Tim Hubbard; Sarah Pelan; Jared T. Simpson; Glen Threadgold; James Torrance; Jonathan Wood; Laura Clarke; Sergey Koren; Matthew Boitano; Paul Peluso

The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.


Genome Biology | 2016

The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species

Alexie Papanicolaou; Marc F. Schetelig; Peter Arensburger; Peter W. Atkinson; Joshua B. Benoit; Kostas Bourtzis; Pedro Castañera; John P. Cavanaugh; Hsu Chao; Christopher Childers; Ingrid Curril; Huyen Dinh; HarshaVardhan Doddapaneni; Amanda Dolan; Shannon Dugan; Markus Friedrich; Giuliano Gasperi; Scott M. Geib; Georgios Georgakilas; Richard A. Gibbs; Sarah D. Giers; Ludvik M. Gomulski; Miguel González-Guzmán; Ana Guillem-Amat; Yi Han; Artemis G. Hatzigeorgiou; Pedro Hernández-Crespo; Daniel S.T. Hughes; Jeffery W. Jones; Dimitra Karagkouni

The Mediterranean fruit fly (medfly), Ceratitis capitata, is a major destructive insect pest due to its broad host range, which includes hundreds of fruits and vegetables. It exhibits a unique ability to invade and adapt to ecological niches throughout tropical and subtropical regions of the world, though medfly infestations have been prevented and controlled by the sterile insect technique (SIT) as part of integrated pest management programs (IPMs). The genetic analysis and manipulation of medfly has been subject to intensive study in an effort to improve SIT efficacy and other aspects of IPM control. The 479 Mb medfly genome is sequenced from adult flies from lines inbred for 20 generations. A high-quality assembly is achieved having a contig N50 of 45.7 kb and scaffold N50 of 4.06 Mb. In-depth curation of more than 1800 messenger RNAs shows specific gene expansions that can be related to invasiveness and host adaptation, including gene families for chemoreception, toxin and insecticide metabolism, cuticle proteins, opsins, and aquaporins. We identify genes relevant to IPM control, including those required to improve SIT. The medfly genome sequence provides critical insights into the biology of one of the most serious and widespread agricultural pests. This knowledge should significantly advance the means of controlling the size and invasive potential of medfly populations. Its close relationship to Drosophila, and other insect species important to agriculture and human health, will further comparative functional and structural studies of insect genomes that should broaden our understanding of gene family evolution.
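The contig and scaffold N50 values quoted in this abstract are contiguity statistics. As a minimal sketch of how such a figure is commonly computed (the function name and the toy lengths are illustrative assumptions, not data from the paper), N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are taken from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches at least
    half of the summed length of all sequences.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must contain at least one positive value")


if __name__ == "__main__":
    # Toy lengths for illustration only, not taken from the medfly assembly.
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_lengths))
```

A higher N50 means that more of the assembly sits in long sequences, which is why the scaffold N50 reported here is much larger than the contig N50.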

Collaboration


Dive into Terence Murphy's collaborations.

Top Co-Authors

Kim D. Pruitt, National Institutes of Health
Donna Maglott, National Institutes of Health
Tatiana Tatusova, National Institutes of Health
Craig Wallin, University of California
Garth Brown, National Institutes of Health
Mike Murphy, National Institutes of Health
Paul Kitts, National Institutes of Health
Michael DiCuccio, National Institutes of Health
Bhanu Rajput, National Institutes of Health