Publication


Featured research published by Craig Wallin.


Genome Research | 2009

The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes

Kim D. Pruitt; Jennifer Harrow; Rachel A. Harte; Craig Wallin; Mark Diekhans; Donna Maglott; Steve Searle; Catherine M. Farrell; Jane Loveland; Barbara J. Ruef; Elizabeth Hart; Marie-Marthe Suner; Melissa J. Landrum; Bronwen Aken; Sarah Ayling; Robert Baertsch; Julio Fernandez-Banet; Joshua L. Cherry; Val Curwen; Michael DiCuccio; Manolis Kellis; Jennifer M. Lee; Michael F. Lin; Michael Schuster; Andrew Shkeda; Clara Amid; Garth Brown; Oksana Dukhanina; Adam Frankish; Jennifer Hart

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.


Nucleic Acids Research | 2014

Current status and new features of the Consensus Coding Sequence database

Catherine M. Farrell; Nuala A. O’Leary; Rachel A. Harte; Jane Loveland; Laurens Wilming; Craig Wallin; Mark Diekhans; Daniel Barrell; Stephen M. J. Searle; Bronwen Aken; Susan M. Hiatt; Adam Frankish; Marie-Marthe Suner; Bhanu Rajput; Charles A. Steward; Garth Brown; Ruth Bennett; Michael R. Murphy; Wendy Wu; Mike Kay; Jennifer Hart; Jeena Rajan; Janet Weber; Catherine Snow; Lillian D. Riddick; Toby Hunt; David Webb; Mark G. Thomas; Pamela Tamez; Sanjida H. Rangwala

The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.


Database | 2012

Tracking and coordinating an international curation effort for the CCDS Project

Rachel A. Harte; Catherine M. Farrell; Jane Loveland; Marie-Marthe Suner; Laurens Wilming; Bronwen Aken; Daniel Barrell; Adam Frankish; Craig Wallin; Steve Searle; Mark Diekhans; Jennifer Harrow; Kim D. Pruitt

The Consensus Coding Sequence (CCDS) collaboration involves curators at multiple centers with a goal of producing a conservative set of high quality, protein-coding region annotations for the human and mouse reference genome assemblies. The CCDS data set reflects a ‘gold standard’ definition of best supported protein annotations, and corresponding genes, which pass a standard series of quality assurance checks and are supported by manual curation. This data set supports use of genome annotation information by human and mouse researchers for effective experimental design, analysis and interpretation. The CCDS project consists of analysis of automated whole-genome annotation builds to identify identical CDS annotations, quality assurance testing and manual curation support. Identical CDS annotations are tracked with a CCDS identifier (ID) and any future change to the annotated CDS structure must be agreed upon by the collaborating members. CCDS curation guidelines were developed to address some aspects of curation in order to improve initial annotation consistency and to reduce time spent in discussing proposed annotation updates. Here, we present the current status of the CCDS database and details on our procedures to track and coordinate our efforts. We also present the relevant background and reasoning behind the curation standards that we have developed for CCDS database treatment of transcripts that are nonsense-mediated decay (NMD) candidates, for transcripts containing upstream open reading frames, for identifying the most likely translation start codons and for the annotation of readthrough transcripts. Examples are provided to illustrate the application of these guidelines. Database URL: http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi


Archive | 2016

Table 4. [Filter sets (partial).].

Mike Murphy; Garth Brown; Craig Wallin; Tatiana Tatusova; Kim D. Pruitt; Terence Murphy; Donna Maglott


Archive | 2016

Figure 6. [Genomic context and Genomic regions,...].

Mike Murphy; Garth Brown; Craig Wallin; Tatiana Tatusova; Kim D. Pruitt; Terence Murphy; Donna Maglott


Archive | 2016

Table 3. [Other properties in Gene (excluding those related to genetype, rnatype, source, and srcdb refseq).].

Mike Murphy; Garth Brown; Craig Wallin; Tatiana Tatusova; Kim D. Pruitt; Terence Murphy; Donna Maglott


Archive | 2016

Figure 4. [Representative Related information section. The...].

Mike Murphy; Garth Brown; Craig Wallin; Tatiana Tatusova; Kim D. Pruitt; Terence Murphy; Donna Maglott


Archive | 2016

Table 2. [Access to Gene-specific sequence information from Gene.].

Mike Murphy; Garth Brown; Craig Wallin; Tatiana Tatusova; Kim D. Pruitt; Terence Murphy; Donna Maglott


Archive | 2016

Figure 11. [Advanced Search. Shown is an...].

Mike Murphy; Garth Brown; Craig Wallin; Tatiana Tatusova; Kim D. Pruitt; Terence Murphy; Donna Maglott


Archive | 2016

Figure 5. [Representative Title and Summary sections of a Full Report.].

Mike Murphy; Garth Brown; Craig Wallin; Tatiana Tatusova; Kim D. Pruitt; Terence Murphy; Donna Maglott

Collaboration


Dive into Craig Wallin's collaborations.

Top Co-Authors

Garth Brown

National Institutes of Health

Kim D. Pruitt

National Institutes of Health

Donna Maglott

National Institutes of Health

Mike Murphy

National Institutes of Health

Tatiana Tatusova

National Institutes of Health

Terence Murphy

National Institutes of Health

Catherine M. Farrell

National Institutes of Health

Mark Diekhans

University of California
