Publication


Featured research published by Joel R. Bock.


Bioinformatics | 2001

Predicting protein–protein interactions from primary structure

Joel R. Bock; David A. Gough

MOTIVATION: An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. The expectation is that this will provide a fuller appreciation of cellular processes and networks at the protein level, ultimately leading to a better understanding of disease mechanisms and suggesting new means for intervention. This paper addresses the question: can protein-protein interactions be predicted directly from primary structure and associated data? Using a diverse database of known protein interactions, a Support Vector Machine (SVM) learning system was trained to recognize and predict interactions based solely on primary structure and associated physicochemical properties. RESULTS: Inductive accuracy of the trained system, defined here as the percentage of correct protein interaction predictions for previously unseen test sets, averaged 80% for the ensemble of statistical experiments. Future proteomics studies may benefit from this research by proceeding directly from the automated identification of a cell's gene products to prediction of protein interaction pairs.


Bioinformatics | 2003

Whole-proteome interaction mining

Joel R. Bock; David A. Gough

MOTIVATION: A major post-genomic scientific and technological pursuit is to describe the functions performed by the proteins encoded by the genome. One strategy is to first identify the protein-protein interactions in a proteome, then determine pathways and overall structure relating these interactions, and finally to statistically infer functional roles of individual proteins. Although huge amounts of genomic data are at hand, current experimental protein interaction assays must overcome technical problems to scale up for high-throughput analysis. In the meantime, bioinformatics approaches may help bridge the information gap required for inference of protein function. In this paper, a previously described data mining approach to prediction of protein-protein interactions (Bock and Gough, 2001, Bioinformatics, 17, 455-460) is extended to interaction mining on a proteome-wide scale. An algorithm (the phylogenetic bootstrap) is introduced, which suggests traversal of a phenogram, interleaving rounds of computation and experiment, to develop a knowledge base of protein interactions in genetically similar organisms. RESULTS: The interaction mining approach was demonstrated by building a learning system based on 1,039 experimentally validated protein-protein interactions in the human gastric bacterium Helicobacter pylori. An estimate of the generalization performance of the classifier was derived from 10-fold cross-validation, which indicated expected upper bounds on precision of 80% and sensitivity of 69% when applied to related organisms. One such organism is the enteric pathogen Campylobacter jejuni, in which comprehensive machine learning prediction of all possible pairwise protein-protein interactions was performed. The resulting network of interactions shares an average protein connectivity characteristic in common with previous investigations reported in the literature, offering strong evidence supporting the biological feasibility of the hypothesized map. For inferences about complete proteomes in which the number of pairwise non-interactions is expected to be much larger than the number of actual interactions, we anticipate that the sensitivity will remain the same but precision may decrease. We present specific biological examples of two subnetworks of protein-protein interactions in C. jejuni resulting from the application of this approach, including elements of a two-component signal transduction system for thermoregulation, and a ferritin uptake network.
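
The abstract's remark that sensitivity should hold while precision may drop follows from Bayes' rule: precision depends on how rare true interactions are, whereas sensitivity does not. The short Python sketch below illustrates this. The 69% sensitivity is taken from the abstract; the false positive rate is an assumption, chosen so that a balanced evaluation set (also an assumption) reproduces the quoted 80% precision. The prevalence values are purely illustrative.

```python
def precision(sensitivity, false_positive_rate, prevalence):
    """Precision (positive predictive value) via Bayes' rule.

    prevalence is the fraction of candidate pairs that truly interact.
    """
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


SENSITIVITY = 0.69  # quoted in the abstract
# Assumption: false positive rate chosen so that a balanced set
# (prevalence 0.5) yields the quoted 80% precision.
FALSE_POSITIVE_RATE = 0.1725

if __name__ == "__main__":
    # Illustrative prevalences only; sensitivity is held fixed throughout.
    for prevalence in (0.5, 0.1, 0.01, 0.001):
        p = precision(SENSITIVITY, FALSE_POSITIVE_RATE, prevalence)
        print(f"prevalence={prevalence:<6} precision={p:.3f}")
```

With the classifier's per-class error rates held fixed, precision falls monotonically as interacting pairs become a smaller share of all candidate pairs.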


IEEE Transactions on Biomedical Engineering | 1998

Toward prediction of physiological state signals in sleep apnea

Joel R. Bock; David A. Gough

A recurrent connectionist model is described to predict dynamic respiratory state in the apneic sleeping patient. The time-domain model of nonlinear time-lagged interactions between heart rate, respiration, and oxygen saturation was developed to implicitly embed the dynamics of the respiration and cardiovascular control systems. Multiple future time scales were enforced on the network during training to explore the limits of the prediction horizon and produce a global representation of dynamic state trajectory. Predicted apneic respiration state results are presented in terms of invariant geometric statistics (largest Lyapunov exponent $\lambda_L$ and correlation dimension $D_c$). The $\lambda_L$ prediction error was 13%, while the $D_c$ error was within 9% of the true time series value. The magnitude of these errors may fall within experimental noise levels. This methodology may eventually be useful in dynamic control of continuous positive airway pressure (CPAP) therapy devices, and may lead to increased patient compliance with this therapy.


Molecular & Cellular Proteomics | 2002

A New Method to Estimate Ligand-Receptor Energetics

Joel R. Bock; David A. Gough

In the discovery of new drugs, lead identification and optimization have assumed critical importance given the number of drug targets generated from genetic, genomic, and proteomic technologies. High-throughput experimental screening assays have been complemented recently by “virtual screening” approaches to identify and filter potential ligands when the characteristics of a target receptor structure of interest are known. Virtual screening mandates a reliable procedure for automatic ranking of structurally distinct ligands in compound library databases. Computing a rank score requires the accurate prediction of binding affinities between these ligands and the target. Many current scoring strategies require information about the target three-dimensional structure. In this study, a new method to estimate the free binding energy between a ligand and receptor is proposed. We extend a central idea previously reported (Bock, J. R., and Gough, D. A. (2001) Predicting protein-protein interactions from primary structure. Bioinformatics 17, 455–460; Bock, J. R., and Gough, D. A. (2002) Whole-proteome interaction mining. Bioinformatics, in press) that uses simple descriptors to represent biomolecules as input examples to train a support vector machine (Smola, A. J., and Schölkopf, B. (1998) A Tutorial on Support Vector Regression, NeuroCOLT Technical Report NC-TR-98-030, Royal Holloway College, University of London, UK) and the application of the trained system to previously unseen pairs, estimating their propensity for interaction. Here we seek to learn the function that maps features of a receptor-ligand pair onto their equilibrium free binding energy. These features do not comprise any direct information about the three-dimensional structures of ligand or target. In cross-validation experiments, it is demonstrated that objective measurements of prediction error rate and rank-ordering statistics are competitive with those of several other investigations, most of which depend on three-dimensional structural data. The size of the sample (n = 2,671) indicates that this approach is robust and may have widespread applicability beyond restricted families of receptor types. It is concluded that newly sequenced proteins, or those for which three-dimensional crystal structures are not easily obtained, can be rapidly analyzed for their binding potential against a library of ligands using this methodology.


Drug Discovery Today: Biosilico | 2004

In silico biological function attribution: a different perspective

Joel R. Bock; David A. Gough

In this review, we present a survey of the scientific literature on biological-function attribution by computer. The focus is on methods of predicting protein-based biological function in terms of intracellular protein–protein interactions. For each methodology, a critical evaluation of strengths and weaknesses is presented from the literature in the field. A conceptual classification scheme is proposed that separates computational methodologies into those based on biological hypotheses and those based on machine learning hypotheses. This represents a different perspective on in silico function attribution. The scope of the discussion here centers on various machine-learning approaches reported in the literature. Machine-generated hypotheses implicitly model biological function by learning from patterns inherent in data.


PLOS ONE | 2012

Hitting Is Contagious in Baseball: Evidence from Long Hitting Streaks

Joel R. Bock; Akhilesh Maewal; David A. Gough

Data analysis is used to test the hypothesis that “hitting is contagious”. A statistical model is described to study the effect of a hot hitter upon his teammates’ batting during a consecutive game hitting streak. Box score data for entire seasons comprising streaks of length games, including a total observations were compiled. Treatment and control sample groups () were constructed from core lineups of players on the streaking batter’s team. The percentile method bootstrap was used to calculate confidence intervals for statistics representing differences in the mean distributions of two batting statistics between groups. Batters in the treatment group (hot streak active) showed statistically significant improvements in hitting performance, as compared against the control. Mean for the treatment group was found to be to percentage points higher during hot streaks (mean difference increased points), while the batting heat index introduced here was observed to increase by points. For each performance statistic, the null hypothesis was rejected at the significance level. We conclude that the evidence suggests the potential existence of a “statistical contagion effect”. Psychological mechanisms essential to the empirical results are suggested, as several studies from the scientific literature lend credence to contagious phenomena in sports. Causal inference from these results is difficult, but we suggest and discuss several latent variables that may contribute to the observed results, and offer possible directions for future research.
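
The percentile-method bootstrap mentioned in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `percentile_bootstrap_ci`, the resample count, the seed, and the sample batting averages are all hypothetical and do not come from the paper.

```python
import random
from statistics import mean


def percentile_bootstrap_ci(treatment, control, n_resamples=2000,
                            alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(treatment) - mean(control).

    Each group is resampled independently, with replacement.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        t = rng.choices(treatment, k=len(treatment))
        c = rng.choices(control, k=len(control))
        diffs.append(mean(t) - mean(c))
    diffs.sort()
    # Take the alpha/2 and 1 - alpha/2 empirical percentiles.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return diffs[lo_idx], diffs[hi_idx]


if __name__ == "__main__":
    # Made-up batting averages, for illustration only.
    treatment = [0.300, 0.310, 0.295, 0.320, 0.305]
    control = [0.250, 0.260, 0.255, 0.245, 0.265]
    low, high = percentile_bootstrap_ci(treatment, control)
    print(f"95% CI for difference in means: [{low:.4f}, {high:.4f}]")
```

If the resulting interval excludes zero, the null hypothesis of no difference in means would be rejected at the chosen significance level.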


Archive | 2003

In Silico Proteomics

Joel R. Bock; David A. Gough

This Handbook of Proteomic Methods largely comprises current experimental technologies to identify, quantify, and characterize expressed proteins and their interactions within cells, tissues, and body fluids. These techniques have evolved rapidly with an impetus from the industrial biotechnology sector. Nevertheless, experimental elucidation of all proteomic constituents within an organism and the documentation of their interactions remain formidable tasks. This is further complicated by the broad diversity in protein expression guaranteed by alternative splicing of pre-mRNA or post-translational modifications. In one dramatic example, more than 38,000 different isoforms of Down syndrome cell adhesion molecule (DSCAM) were observed in Drosophila melanogaster (1). Obviously, the combinatorics required for comprehensively explicating all protein–protein interactions, especially for higher eukaryotes, are prohibitive, even with the use of advanced high-throughput approaches.


Journal of Chemical Information and Modeling | 2005

Virtual screen for ligands of orphan G protein-coupled receptors

Joel R. Bock; David A. Gough


Archive | 2001

Method for predicting protein binding from primary structure data

David A. Gough; Joel R. Bock


Archive | 2005

Method for predicting G-protein coupled receptor-ligand interactions

David A. Gough; Joel R. Bock

Collaboration


Dive into Joel R. Bock's collaborations.

Top Co-Authors

David A. Gough

University of California
