Chen Yanover | Researchain

Publication

Featured research published by Chen Yanover.

International Conference on Computer Vision | 2005

Globally optimal solutions for energy minimization in stereo vision using reweighted belief propagation

Talya Meltzer; Chen Yanover; Yair Weiss

A wide range of low level vision problems have been formulated in terms of finding the most probable assignment of a Markov random field (or equivalently the lowest energy configuration). Perhaps the most successful example is stereo vision. For the stereo problem, it has been shown that finding the global optimum is NP hard but good results have been obtained using a number of approximate optimization algorithms. In this paper, we show that for standard benchmark stereo pairs, the global optimum can be found in about 30 minutes using a variant of the belief propagation (BP) algorithm. We extend previous theoretical results on reweighted belief propagation to account for possible ties in the beliefs and using these results we obtain easily checkable conditions that guarantee that the BP disparities are the global optima. We verify experimentally that these conditions are typically met for the standard benchmark stereo pairs and discuss the implications of our results for further progress in stereo.
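To make the "lowest energy configuration" formulation concrete, here is a minimal sketch of min-sum message passing on a single chain of pixels (one scanline). It is not the reweighted BP variant analysed in the paper: on a chain, ordinary min-sum is already exact, which lets the result be checked against brute force. The function names, the toy costs, and the absolute-difference smoothness term are all assumptions made for illustration.

```python
import itertools

def min_sum_chain(unary, pairwise):
    """MAP labelling of a chain MRF by min-sum message passing.

    unary[i][a]    : cost of giving node i label a (e.g. a disparity)
    pairwise(a, b) : cost of neighbouring labels a and b (same on every edge)
    Returns (labels, total_energy).
    """
    n, k = len(unary), len(unary[0])
    # forward[i][b]: lowest cost of nodes 0..i given that node i takes label b
    forward = [list(unary[0])]
    back = []  # back[i-1][b]: best label of node i-1 when node i takes label b
    for i in range(1, n):
        row, ptr = [], []
        for b in range(k):
            costs = [forward[-1][a] + pairwise(a, b) for a in range(k)]
            best = min(range(k), key=costs.__getitem__)
            row.append(costs[best] + unary[i][b])
            ptr.append(best)
        forward.append(row)
        back.append(ptr)
    # Backtrack from the cheapest label of the last node.
    label = min(range(k), key=forward[-1].__getitem__)
    labels = [label]
    for ptr in reversed(back):
        label = ptr[label]
        labels.append(label)
    labels.reverse()
    return labels, min(forward[-1])

def energy(labels, unary, pairwise):
    """Total energy of a labelling: data terms plus smoothness terms."""
    data = sum(unary[i][a] for i, a in enumerate(labels))
    smooth = sum(pairwise(a, b) for a, b in zip(labels, labels[1:]))
    return data + smooth

def brute_force_min(unary, pairwise):
    """Exhaustive minimum, feasible only for tiny toy problems."""
    k = len(unary[0])
    return min(energy(ls, unary, pairwise)
               for ls in itertools.product(range(k), repeat=len(unary)))

# Toy scanline: 4 pixels, 3 candidate disparities (hypothetical costs).
toy_unary = [[0, 2, 3], [2, 0, 3], [3, 2, 0], [1, 0, 2]]
toy_smooth = lambda a, b: abs(a - b)
toy_labels, toy_cost = min_sum_chain(toy_unary, toy_smooth)
```

On a full image grid the graph has cycles, so this exactness guarantee no longer holds; that is the setting in which the abstract's checkable optimality conditions matter.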

Research in Computational Molecular Biology | 2007

Minimizing and learning energy functions for side-chain prediction

Chen Yanover; Ora Schueler-Furman; Yair Weiss

Side-chain prediction is an important subproblem of the general protein folding problem. Despite much progress in side-chain prediction, performance is far from satisfactory. As an example, the ROSETTA program, which uses simulated annealing to select the minimum energy conformations, correctly predicts the first two side-chain angles for approximately 72% of the buried residues in a standard data set. Is further improvement more likely to come from better search methods, or from better energy functions? Given that exact minimization of the energy is NP hard, it is difficult to get a systematic answer to this question.

In this paper, we present a novel search method and a novel method for learning energy functions from training data that are both based on Tree Reweighted Belief Propagation (TRBP). We find that TRBP can find the global optimum of the ROSETTA energy function in a few minutes of computation for approximately 85% of the proteins in a standard benchmark set. TRBP can also effectively bound the partition function which enables using the Conditional Random Fields (CRF) framework for learning.

Interestingly, finding the global minimum does not significantly improve side-chain prediction for an energy function based on ROSETTA's default energy terms (less than 0.1%), while learning new weights gives a significant boost from 72% to 78%. Using a recently modified ROSETTA energy function with a softer Lennard-Jones repulsive term, the global optimum does improve prediction accuracy from 77% to 78%. Here again, learning new weights improves side-chain modeling even further to 80%. Finally, the highest accuracy (82.6%) is obtained using an extended rotamer library and CRF learned weights. Our results suggest that combining machine learning with approximate inference can improve the state-of-the-art in side-chain prediction.

BMC Bioinformatics | 2006

PepDist: a new framework for protein-peptide binding prediction based on learning peptide distance functions.

Tomer Hertz; Chen Yanover

Background: Many different aspects of cellular signalling, trafficking and targeting mechanisms are mediated by interactions between proteins and peptides. Representative examples are MHC-peptide complexes in the immune system. Developing computational methods for protein-peptide binding prediction is therefore an important task with applications to vaccine and drug design.

Methods: Previous learning approaches address the binding prediction problem using traditional margin based binary classifiers. In this paper we propose PepDist: a novel approach for predicting binding affinity. Our approach is based on learning peptide-peptide distance functions. Moreover, we suggest learning a single peptide-peptide distance function over an entire family of proteins (e.g. MHC class I). This distance function can be used to compute the affinity of a novel peptide to any of the proteins in the given family. In order to learn these peptide-peptide distance functions, we formalize the problem as a semi-supervised learning problem with partial information in the form of equivalence constraints. Specifically, we propose to use DistBoost [1, 2], which is a semi-supervised distance learning algorithm.

Results: We compare our method to various state-of-the-art binding prediction algorithms on MHC class I and MHC class II datasets. In almost all cases, our method outperforms all of its competitors. One of the major advantages of our novel approach is that it can also learn an affinity function over proteins for which only small amounts of labeled peptides exist. In these cases, our method's performance gain, when compared to other computational methods, is even more pronounced. We have recently uploaded the PepDist webserver, which provides binding prediction of peptides to 35 different MHC class I alleles. The webserver, which can be found at http://www.pepdist.cs.huji.ac.il, is powered by a prediction engine which was trained using the framework presented in this paper.

Conclusion: The results obtained suggest that learning a single distance function over an entire family of proteins achieves higher prediction accuracy than learning a set of binary classifiers for each of the proteins separately. We also show the importance of obtaining information on experimentally determined non-binders. Learning with real non-binders generalizes better than learning with randomly generated peptides that are assumed to be non-binders. This suggests that information about non-binding peptides should also be published and made publicly available.

European Conference on Computational Biology | 2007

Identifying HLA supertypes by learning distance functions

Tomer Hertz; Chen Yanover

Motivation: The development of epitope-based vaccines crucially relies on the ability to classify Human Leukocyte Antigen (HLA) molecules into sets that have similar peptide binding specificities, termed supertypes. In their seminal work, Sette and Sidney defined nine HLA class I supertypes and claimed that these provide an almost perfect coverage of the entire repertoire of HLA class I molecules. HLA alleles are highly polymorphic and polygenic and therefore experimentally classifying each of these molecules to supertypes is at present an impossible task. Recently, a number of computational methods have been proposed for this task. These methods are based on defining protein similarity measures, derived from analysis of binding peptides or from analysis of the proteins themselves.

Results: In this paper we define both peptide derived and protein derived similarity measures, which are based on learning distance functions. The peptide derived measure is defined using a peptide-peptide distance function, which is learned using information about known binding and non-binding peptides. The protein derived similarity measure is defined using a protein-protein distance function, which is learned using information about alleles previously classified to supertypes by Sette and Sidney (1999). We compare the classification obtained by these two complementary methods to previously suggested classification methods. In general, our results are in excellent agreement with the classifications suggested by Sette and Sidney (1999) and with those reported by Buus et al. (2004). The main advantage of our proposed distance-based approach is that it makes use of two different and important immunological sources of information: HLA alleles, and peptides that are known to bind or not bind to these alleles. Since each of our distance measures is trained using a different source of information, their combination can provide a more confident classification of alleles to supertypes.

Journal of Computational Chemistry | 2007

Dead‐end elimination for multistate protein design

Chen Yanover; Menachem Fromer; Julia M. Shifman

Multistate protein design is the task of predicting the amino acid sequence that is best suited to selectively and stably fold to one state out of a set of competing structures. Computationally, it entails solving a challenging optimization problem. Therefore, notwithstanding the increased interest in multistate design, the only implementations reported are based on either genetic algorithms or Monte Carlo methods. The dead‐end elimination (DEE) theorem cannot be readily transferred to multistate design problems despite its successful application to single‐state protein design. In this article we propose a variant of the standard DEE, called type‐dependent DEE. Our method reduces the size of the conformational space of the multistate design problem, while provably preserving the minimal energy conformational assignment for any choice of amino acid sequence. Type‐dependent DEE can therefore be used as a preprocessing step in any computational multistate design scheme. We demonstrate the applicability of type‐dependent DEE on a set of multistate design problems and discuss its strength and limitations.
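As a rough illustration of what a DEE preprocessing step does, the sketch below applies the standard single-state Goldstein elimination criterion: rotamer r at position i is pruned when some competitor t at the same position is better than r against every possible choice at the other positions. The optional `aa_type` argument, which only lets a rotamer be eliminated by a competitor of the same amino acid, is an assumed reading of "type-dependent" and is not a definition taken from the abstract. All names and the toy energies are hypothetical.

```python
def goldstein_dee(self_e, pair_e, aa_type=None):
    """Prune rotamers with the Goldstein dead-end elimination criterion.

    self_e[i][r]       : self energy of rotamer r at position i
    pair_e[i][j][r][s] : pair energy of rotamer r at i with rotamer s at j
    aa_type[i][r]      : amino acid of rotamer r at i (optional, assumed
                         reading of "type-dependent": compare same type only)
    Returns a list with the set of surviving rotamer indices per position.
    """
    n = len(self_e)
    alive = [set(range(len(self_e[i]))) for i in range(n)]
    changed = True
    while changed:  # repeat sweeps until nothing more can be eliminated
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                for t in alive[i] - {r}:
                    if aa_type is not None and aa_type[i][r] != aa_type[i][t]:
                        continue
                    # Worst case (for r) energy gap between r and t.
                    gap = self_e[i][r] - self_e[i][t]
                    gap += sum(
                        min(pair_e[i][j][r][s] - pair_e[i][j][t][s]
                            for s in alive[j])
                        for j in range(n) if j != i
                    )
                    if gap > 0:  # t always beats r, so r is a dead end
                        alive[i].discard(r)
                        changed = True
                        break
    return alive

# Toy problem: position 0 has 3 rotamers, position 1 has 2 (made-up energies).
toy_self = [[0.0, 5.0, 1.0], [0.0, 0.5]]
p01 = [[0.0, 0.0], [0.0, 0.0], [-2.0, 1.0]]       # indexed [r at 0][s at 1]
p10 = [[p01[r][s] for r in range(3)] for s in range(2)]  # transpose
toy_pair = [[None, p01], [p10, None]]

survivors = goldstein_dee(toy_self, toy_pair)
typed = goldstein_dee(toy_self, toy_pair,
                      aa_type=[["A", "A", "B"], ["A", "A"]])
```

Restricting comparisons to the same amino acid type prunes less, but every sequence keeps its own best conformation, which is the property the abstract says the method preserves.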

Research in Computational Molecular Biology | 2005

Predicting protein-peptide binding affinity by learning peptide-peptide distance functions

Chen Yanover; Tomer Hertz

Many important cellular response mechanisms are activated when a peptide binds to an appropriate receptor. In the immune system, the recognition of pathogen peptides begins when they bind to cell membrane Major Histocompatibility Complexes (MHCs). MHC proteins then carry these peptides to the cell surface in order to allow the activation of cytotoxic T-cells. The MHC binding cleft is highly polymorphic and therefore protein-peptide binding is highly specific. Developing computational methods for predicting protein-peptide binding is important for vaccine design and treatment of diseases like cancer.

Previous learning approaches address the binding prediction problem using traditional margin based binary classifiers. In this paper we propose a novel approach for predicting binding affinity. Our approach is based on learning a peptide-peptide distance function. Moreover, we learn a single peptide-peptide distance function over an entire family of proteins (e.g. MHC class I). This distance function can be used to compute the affinity of a novel peptide to any of the proteins in the given family. In order to learn these peptide-peptide distance functions, we formalize the problem as a semi-supervised learning problem with partial information in the form of equivalence constraints. Specifically, we propose to use DistBoost [1, 2], which is a semi-supervised distance learning algorithm.

We compare our method to various state-of-the-art binding prediction algorithms on MHC class I and MHC class II datasets. In almost all cases, our method outperforms all of its competitors. One of the major advantages of our novel approach is that it can also learn an affinity function over proteins for which only small amounts of labeled peptides exist. In these cases, DistBoost's performance gain, when compared to other computational methods, is even more pronounced.

Intelligent Systems in Molecular Biology | 2008

A computational framework to empower probabilistic protein design

Menachem Fromer; Chen Yanover

Motivation: The task of engineering a protein to perform a target biological function is known as protein design. A commonly used paradigm casts this functional design problem as a structural one, assuming a fixed backbone. In probabilistic protein design, positional amino acid probabilities are used to create a random library of sequences to be simultaneously screened for biological activity. Clearly, certain choices of probability distributions will be more successful in yielding functional sequences. However, since the number of sequences is exponential in protein length, computational optimization of the distribution is difficult. Results: In this paper, we develop a computational framework for probabilistic protein design following the structural paradigm. We formulate the distribution of sequences for a structure using the Boltzmann distribution over their free energies. The corresponding probabilistic graphical model is constructed, and we apply belief propagation (BP) to calculate marginal amino acid probabilities. We test this method on a large structural dataset and demonstrate the superiority of BP over previous methods. Nevertheless, since the results obtained by BP are far from optimal, we thoroughly assess the paradigm using high-quality experimental data. We demonstrate that, for small scale sub-problems, BP attains identical results to those produced by exact inference on the paradigmatic model. However, quantitative analysis shows that the distributions predicted significantly differ from the experimental data. These findings, along with the excellent performance we observed using BP on the smaller problems, suggest potential shortcomings of the paradigm. We conclude with a discussion of how it may be improved in the future.
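The quantity being computed can be illustrated on a toy scale. The sketch below enumerates every sequence, weights each by the Boltzmann factor exp(-E/kT), and sums the normalised weights into per-position amino acid marginals. This is exact inference by brute force, which is only feasible for very short sequences (the exponential blow-up is the reason the abstract turns to BP). The energy function, the three-letter alphabet, and the `kT` value are placeholders assumed for illustration, not the free energies used in the paper.

```python
import itertools
import math

def exact_marginals(energy, alphabet, length, kT=1.0):
    """Per-position amino acid marginals under a Boltzmann distribution.

    energy(seq) : energy of a sequence given as a tuple of letters
    Enumerates all len(alphabet) ** length sequences, so toy sizes only.
    """
    weights = {seq: math.exp(-energy(seq) / kT)
               for seq in itertools.product(alphabet, repeat=length)}
    z = sum(weights.values())  # partition function
    marginals = [{a: 0.0 for a in alphabet} for _ in range(length)]
    for seq, w in weights.items():
        for pos, a in enumerate(seq):
            marginals[pos][a] += w / z
    return marginals

# Hypothetical energy: position 0 prefers "A", nothing else matters.
def toy_energy(seq):
    return -1.0 if seq[0] == "A" else 0.0

toy_marginals = exact_marginals(toy_energy, "ACD", length=2)
flat_marginals = exact_marginals(lambda seq: 0.0, "ACD", length=2)
```

The resulting per-position distributions are what a probabilistic design would use to build its sequence library; BP approximates the same marginals without enumerating sequences.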

Current Genetics | 1998

Computer analysis of the entire budding yeast genome for putative targets of the GCN4 transcription factor

Oren Schuldiner; Chen Yanover; Nissim Benvenisty

The completion of the yeast genome project enables an analysis of various phenomena for a whole eukaryotic genome. We aimed at characterizing a full spectrum of target genes for a transcription activator, and specifically characterized putative targets for GCN4 in the budding yeast. The results suggest that about 1% of the genes are regulated by GCN4 and that these genes code for proteins involved in amino-acid and nucleotide metabolism. Our analysis proposes that, when enough data about the binding nature of a transcription factor exists, it is possible to identify its putative targets and also to try and assign a physiological role for this transcription factor.

Journal of Machine Learning Research | 2006