Christopher W. V. Hogue
Mount Sinai Hospital
Publication
Featured research published by Christopher W. V. Hogue.
Nature | 2002
Yuen Ho; Albrecht Gruhler; Adrian Heilbut; Gary D. Bader; Lynda Moore; Sally-Lin Adams; Anna Millar; Paul D. Taylor; Keiryn L. Bennett; Kelly Boutilier; Lingyun Yang; Cheryl Wolting; Ian M. Donaldson; Søren Schandorff; Juanita Shewnarane; Mai Vo; Joanne Taggart; Marilyn Goudreault; Brenda Muskat; Cris Alfarano; Danielle Dewar; Zhen Lin; Katerina Michalickova; Andrew Willems; Holly Sassi; Peter Aagaard Nielsen; Karina Juhl Rasmussen; Jens R. Andersen; Lene E. Johansen; Lykke H. Hansen
The recent abundance of genome sequence data has brought an urgent need for systematic proteomics to decipher the encoded protein networks that dictate cellular function. To date, generation of large-scale protein–protein interaction maps has relied on the yeast two-hybrid system, which detects binary interactions through activation of reporter gene expression. With the advent of ultrasensitive mass spectrometric protein identification methods, it is feasible to identify directly protein complexes on a proteome-wide scale. Here we report, using the budding yeast Saccharomyces cerevisiae as a test case, an example of this approach, which we term high-throughput mass spectrometric protein complex identification (HMS-PCI). Beginning with 10% of predicted yeast proteins as baits, we detected 3,617 associated proteins covering 25% of the yeast proteome. Numerous protein complexes were identified, including many new interactions in various signalling pathways and in the DNA damage response. Comparison of the HMS-PCI data set with interactions reported in the literature revealed an average threefold higher success rate in detection of known complexes compared with large-scale two-hybrid studies. Given the high degree of connectivity observed in this study, even partial HMS-PCI coverage of complex proteomes, including that of humans, should allow comprehensive identification of cellular networks.
Nucleic Acids Research | 2004
C. Alfarano; C. E. Andrade; K. Anthony; N. Bahroos; M. Bajec; K. Bantoft; Doron Betel; B. Bobechko; K. Boutilier; E. Burgess; K. Buzadzija; R. Cavero; C. D'Abreo; I. Donaldson; D. Dorairajoo; Michel Dumontier; M. R. Dumontier; V. Earles; R. Farrall; Howard J. Feldman; E. Garderman; Y. Gong; R. Gonzaga; V. Grytsan; E. Gryz; V. Gu; E. Haldorsen; A. Halupa; Robin Haw; A. Hrvojic
The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive machine-readable archive of computable information and provide users with methods to discover interactions and molecular mechanisms. BIND has worked to develop new methods for visualization that amplify the underlying annotation of genes and proteins to facilitate the study of molecular interaction networks. BIND has maintained an open database policy since its inception in 1999. Data growth has proceeded at a tremendous rate, approaching over 100 000 records. New services provided include a new BIND Query and Submission interface, a Simple Object Access Protocol (SOAP) service and the Small Molecule Interaction Database (http://smid.blueprint.org) that allows users to determine probable small molecule binding sites of new sequences and examine conserved binding residues.
BMC Bioinformatics | 2003
Ian M. Donaldson; Joel D. Martin; Berry de Bruijn; Cheryl Wolting; Vicki Lay; Brigitte Tuekam; Shudong Zhang; Berivan Baskin; Gary D. Bader; Katerina Michalickova; Tony Pawson; Christopher W. V. Hogue
Background: The majority of experimentally verified molecular interaction and biological pathway data are present in the unstructured text of biomedical journal articles, where they are inaccessible to computational methods. The Biomolecular Interaction Network Database (BIND) seeks to capture these data in a machine-readable format. We hypothesized that the formidable task-size of backfilling the database could be reduced by using Support Vector Machine technology to first locate interaction information in the literature. We present an information extraction system that was designed to locate protein-protein interaction data in the literature and present these data to curators and the public for review and entry into BIND. Results: Cross-validation estimated the support vector machine's test-set precision, accuracy and recall for classifying abstracts describing interaction information at 92%, 90% and 92%, respectively. We estimated that the system would be able to recall up to 60% of all non-high-throughput interactions present in another yeast-protein interaction database. Finally, this system was applied to a real-world curation problem and its use was found to reduce the task duration by 70%, thus saving 176 days. Conclusions: Machine learning methods are useful as tools to direct interaction and pathway database back-filling; however, this potential can only be realized if these techniques are coupled with human review and entry into a factual database such as BIND. The PreBIND system described here is available to the public at http://bind.ca. Current capabilities allow searching for human, mouse and yeast protein-interaction information.
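The three reported classifier metrics and the curation-time saving can be made concrete with a small sketch. The confusion-matrix counts below are hypothetical and are not taken from the paper. The baseline estimate assumes that the 70% reduction and the 176 days saved refer to the same original task duration, which the abstract implies but does not state.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (precision, accuracy, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp)                   # fraction of flagged abstracts that are relevant
    recall = tp / (tp + fn)                      # fraction of relevant abstracts that were flagged
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # fraction of all abstracts classified correctly
    return precision, accuracy, recall


def baseline_from_saving(days_saved, fraction_saved):
    """Estimate the original task duration from an absolute and a relative saving."""
    return days_saved / fraction_saved


# Hypothetical counts chosen only to illustrate the formulas.
precision, accuracy, recall = classification_metrics(tp=92, fp=8, tn=88, fn=8)
print(f"precision={precision:.2f} accuracy={accuracy:.2f} recall={recall:.2f}")

# Figures from the abstract: 70% reduction, 176 days saved.
baseline = baseline_from_saving(176, 0.70)
print(f"estimated baseline: {baseline:.0f} days, remaining: {baseline - 176:.0f} days")
```

Under that assumption, the unassisted task would have taken roughly 251 days, leaving about 75 days with the system in use.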
BMC Biology | 2007
Samuel Kerrien; Sandra Orchard; Luisa Montecchi-Palazzi; Bruno Aranda; Antony F. Quinn; Nisha Vinod; Gary D. Bader; Ioannis Xenarios; Jérôme Wojcik; David James Sherman; Mike Tyers; John J. Salama; Susan Moore; Arnaud Ceol; Andrew Chatr-aryamontri; Matthias Oesterheld; Volker Stümpflen; Lukasz Salwinski; Jason Nerothin; Ethan Cerami; Michael E. Cusick; Marc Vidal; Michael K. Gilson; John Armstrong; Peter Woollard; Christopher W. V. Hogue; David Eisenberg; Gianni Cesareni; Rolf Apweiler; Henning Hermjakob
Background: Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to these data very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but focused purely on protein-protein interactions. Results: The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format, MITAB2.5, has been developed for the benefit of users who require only minimal information in an easy-to-access configuration. Conclusion: The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sector, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.
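Because MITAB2.5 is tab-delimited, a few lines of code are enough to read it. The following is a minimal Python sketch rather than a reference parser. The 15 column labels are this sketch's own names for an assumed MITAB2.5 column order, and the sample record uses made-up identifiers with "-" for empty fields.

```python
import csv
import io

# Assumed MITAB2.5 column order; the labels are this sketch's own, not official names.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab(handle):
    """Yield one dict per interaction line, skipping blank and '#' header lines."""
    # QUOTE_NONE keeps any quote characters inside fields as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue
        yield dict(zip(MITAB25_COLUMNS, row))


# Hypothetical record: identifiers are invented for illustration only.
sample_line = "\t".join(
    ["uniprotkb:P12345", "uniprotkb:Q67890"] + ["-"] * 7
    + ["taxid:4932", "taxid:4932"] + ["-"] * 4
)
records = list(parse_mitab(io.StringIO("#header line\n" + sample_line + "\n")))
print(records[0]["id_a"], records[0]["id_b"], records[0]["taxid_a"])
```

Returning plain dictionaries keeps the sketch independent of any particular interaction database or toolkit.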
Proteins | 2000
Howard J. Feldman; Christopher W. V. Hogue
A fast computer program, FOLDTRAJ, to generate plausible random protein structures is reported. All‐atom proteins are made directly in continuous three‐dimensional space starting from primary sequence with an N to C directed build‐up method. The method uses a novel pipelined residue addition approach in which the leading edge of the protein is constructed three residues at a time for optimal protein geometry, including the placement of cis proline. Build‐up methods represent a classic N‐body problem, expected to scale as N^2. When proteins become more collapsed, build‐up methods are susceptible to backtracking problems which can scale exponentially with the number of residues required to back out of a trapped walk. We have provided solutions to both these problems, using a multiway binary tree that makes the N‐body problem of bump‐checking scale as N log N, and speeding up backtracking by varying the number of tries before backtracking based on available conformational space. FOLDTRAJ is independent of energy potentials, other than that implicit in the geometrical properties derived by statistical studies of known structures, and in atomic van der Waals radii. WHAT_CHECK shows that the program generates chirally and physically valid proteins with all bond lengths, angles and dihedrals within allowable tolerances. Random structures built using sequences from PDB files 1SEM, 2HPR, and 1RTP typically have 5–15% α‐helical content (according to DSSP) and on the order of 20% β‐strand/extended content. Ensembles of random structures are compared with polymer theory and with experimentally determined fluorescence resonance energy transfer distances. Reasonably sized structure ensembles do sample most of the conformational space available to proteins. The method is also capable of protein reconstruction using Cα–Cα direction vectors, and it compares favorably with methods that reconstruct protein backbones based on alpha‐carbon coordinates, having an average backbone and Cβ root mean square deviation of 0.63 Å for nine different protein folds. Proteins 2000;39:112–131.
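The idea of avoiding an all-pairs bump check can be illustrated with a spatial index. The sketch below is not the multiway binary tree described in the abstract. It swaps in a simpler uniform grid (cell list), and the class name, method names, and the 3.0 distance cutoff are all hypothetical. Each new atom is compared only against atoms in its own and neighbouring cells instead of against every atom placed so far.

```python
from collections import defaultdict
from itertools import product


class BumpGrid:
    """Uniform-grid index for checking whether a new atom clashes with placed atoms."""

    def __init__(self, min_dist):
        self.min_dist = min_dist          # atoms closer than this are treated as a bump
        self.cells = defaultdict(list)    # integer cell index -> list of atom positions

    def _cell(self, p):
        # Cell edge equals min_dist, so any clash partner lies in one of 27 nearby cells.
        return tuple(int(c // self.min_dist) for c in p)

    def clashes(self, p):
        """Return True if p is closer than min_dist to any stored atom."""
        cx, cy, cz = self._cell(p)
        limit = self.min_dist ** 2
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for q in self.cells.get((cx + dx, cy + dy, cz + dz), ()):
                if sum((a - b) ** 2 for a, b in zip(p, q)) < limit:
                    return True
        return False

    def add(self, p):
        """Store an accepted atom position."""
        self.cells[self._cell(p)].append(p)


grid = BumpGrid(min_dist=3.0)
grid.add((0.0, 0.0, 0.0))
near = grid.clashes((1.0, 1.0, 1.0))   # about 1.73 units away, inside the cutoff
far = grid.clashes((4.0, 0.0, 0.0))    # 4.0 units away, outside the cutoff
print(near, far)
```

A build-up routine would call `clashes` before accepting each candidate atom and `add` afterwards, backtracking when every candidate placement is rejected.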
FEBS Letters | 2005
Howard J. Feldman; Michel Dumontier; Susan Ling; Norbert Haider; Christopher W. V. Hogue
A novel chemical ontology based on chemical functional groups automatically, objectively assigned by a computer program, was developed to categorize small molecules. It has been applied to PubChem and the small molecule interaction database to demonstrate its utility as a basic pharmacophore search system. Molecules can be compared using a semantic similarity score based on functional group assignments rather than 3D shape, which succeeds in identifying small molecules known to bind a common binding site. This ontology will serve as a powerful tool for searching chemical databases and identifying key functional groups responsible for biological activities.
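Comparing molecules by their assigned functional groups can be sketched with a simple set overlap. The abstract does not define its semantic similarity score, so the Jaccard index below is only a stand-in. The functional-group labels and the function name are hypothetical.

```python
def group_similarity(groups_a, groups_b):
    """Jaccard index of two functional-group sets: shared groups over all groups."""
    a, b = set(groups_a), set(groups_b)
    if not a and not b:
        return 1.0  # two molecules with no assigned groups are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical functional-group assignments for two small molecules.
molecule_1 = {"carboxylic acid", "primary amine", "hydroxyl"}
molecule_2 = {"carboxylic acid", "primary amine"}
score = group_similarity(molecule_1, molecule_2)
print(f"similarity = {score:.2f}")
```

A score like this ignores 3D shape entirely, which matches the abstract's point that functional-group assignments alone can flag molecules likely to share a binding site.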
BMC Bioinformatics | 2002
Katerina Michalickova; Gary D. Bader; Michel Dumontier; Hao Lieu; Doron Betel; Ruth Isserlin; Christopher W. V. Hogue
Background: SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results: SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions: The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the SourceForge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.
European Biophysics Journal | 2005
Stéphanie Portet; Jack A. Tuszynski; Christopher W. V. Hogue; J.M. Dixon
Parameters characterizing elastic properties of microtubules, measured in several recent experiments, reflect an anisotropic character. We describe the microscopic dynamical properties of microtubules using a discrete model based on an appropriate lattice of dimers. Adopting a harmonic approximation for the dimer–dimer interactions and estimating the lattice elastic constants, we make predictions regarding vibration dispersion relations and vibration propagation velocities. Vibration frequencies and velocities are expressed as functions of the elastic constants and of the geometrical characteristics of the microtubules. We show that vibrations which propagate along the protofilament do so significantly faster than those along the helix.
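As a rough illustration of how vibration frequency and propagation velocity depend on an elastic constant and a lattice spacing, consider the textbook one-dimensional harmonic chain. This is a much simpler system than the paper's dimer lattice, and the symbols K (spring constant), m (mass per site), a (lattice spacing) and k (wavenumber) are assumptions of this sketch, not the paper's notation.

```latex
\omega(k) = 2\sqrt{\frac{K}{m}}\,\left|\sin\!\left(\frac{k a}{2}\right)\right|,
\qquad
v = \lim_{k \to 0} \frac{\omega(k)}{k} = a\sqrt{\frac{K}{m}}
```

In this simplified picture, a direction with a larger effective K or a larger spacing a carries vibrations faster, which is the kind of dependence that can make propagation along one lattice direction faster than along another.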
Proteins | 2002
Howard J. Feldman; Christopher W. V. Hogue
Protein structure prediction from sequence alone by “brute force” random methods is a computationally expensive problem. Estimates have suggested that it could take all the computers in the world longer than the age of the universe to compute the structure of a single 200‐residue protein. Here we investigate the use of a faster version of our FOLDTRAJ probabilistic all‐atom protein‐structure‐sampling algorithm. We have improved the method so that it is now over twenty times faster than originally reported, and capable of rapidly sampling conformational space without lattices. It uses geometrical constraints and a Lennard‐Jones type potential for self‐avoidance. We have also implemented a novel method to add secondary structure‐prediction information to make protein‐like amounts of secondary structure in sampled structures. In a set of 100,000 probabilistic conformers of 1VII, 1ENH, and 1PMC generated, the structures with smallest Cα RMSD from native are 3.95, 5.12, and 5.95 Å, respectively. Expanding this test to a set of 17 distinct protein folds, we find that all‐helical structures are “hit” by brute force more frequently than β or mixed structures. For small helical proteins or very small non‐helical ones, this approach should have a “hit” close enough to detect with a good scoring function in a pool of several million conformers. By fitting the distribution of RMSDs from the native state of each of the 17 sets of conformers to the extreme value distribution, we are able to estimate the size of conformational space for each. With a 0.5 Å RMSD cutoff, the number of conformers is roughly 2^N, where N is the number of residues in the protein. This is smaller than previous estimates, indicating an average of only two possible conformations per residue when sterics are accounted for. Our method reduces the effective number of conformations available at each residue by probabilistic bias, without requiring any particular discretization of residue conformational space, and is the fastest method of its kind. With computer speeds doubling every 18 months and parallel and distributed computing becoming more practical, the brute force approach to protein structure prediction may yet have some hope in the near future. Proteins 2002;46:8–23.
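The estimate of about two conformations per residue, i.e. 2 to the power N conformers for an N-residue protein, is easier to appreciate on a logarithmic scale. The short sketch below uses a hypothetical helper name and takes the 200-residue example length from the abstract's opening.

```python
import math


def conformer_count_log10(n_residues, states_per_residue=2.0):
    """Return log10 of states_per_residue ** n_residues without overflow."""
    return n_residues * math.log10(states_per_residue)


# A 200-residue protein with two conformations per residue.
exponent = conformer_count_log10(200)
print(f"about 10^{exponent:.1f} conformers")
```

For 200 residues this gives a count on the order of 10^60, which shows why exhaustive enumeration remains out of reach for larger proteins, while the pools of several million conformers mentioned in the abstract apply only to small proteins.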
FEBS Letters | 2006
Howard J. Feldman; Kevin A. Snyder; Amy Ticoll; Greg D. Pintilie; Christopher W. V. Hogue
A complete set of 6300 small molecule ligands was extracted from the Protein Data Bank, and deposited online in PubChem as data source ‘SMID’. This set's major improvement over prior methods is the inclusion of cyclic polypeptides and branched polysaccharides, including an unambiguous nomenclature, in addition to normal monomeric ligands. Only the best available example of each ligand structure is retained, and an additional dataset is maintained containing co‐ordinates for all examples of each structure. Attempts are made to correct ambiguous atomic elements and other common errors, and a perception algorithm is used to determine bond order and aromaticity when no other information is available.