Yuval Nov
University of Haifa
Publication
Featured research published by Yuval Nov.
Proceedings of the National Academy of Sciences of the United States of America | 2010
Inbal Budowski-Tal; Yuval Nov; Rachel Kolodny
Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag, an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a “bag-of-fragments” (a vector that counts the number of occurrences of each fragment) and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions only of parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state-of-the-art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
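The bag-of-fragments representation described in this abstract can be illustrated with a minimal Python sketch. It assumes each backbone segment has already been assigned to its closest library fragment, so a structure arrives as a list of fragment indices. The function names are hypothetical, and cosine similarity is an assumption chosen for illustration; the abstract does not say which vector similarity FragBag uses.

```python
import math
from collections import Counter


def bag_of_fragments(fragment_ids, library_size):
    """Count how often each library fragment occurs in one structure."""
    counts = Counter(fragment_ids)
    return [counts.get(i, 0) for i in range(library_size)]


def cosine_similarity(u, v):
    """Cosine similarity between two count vectors (an assumed choice)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_inverted_index(bags):
    """Map each fragment index to the set of structures that contain it."""
    index = {}
    for name, vector in bags.items():
        for fragment, count in enumerate(vector):
            if count > 0:
                index.setdefault(fragment, set()).add(name)
    return index
```

An inverted index of this kind lets a search engine look only at structures that share at least one fragment with the query, instead of scanning every vector.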
Applied and Environmental Microbiology | 2012
Yuval Nov
We developed new criteria for determining the library size in a saturation mutagenesis experiment. When the number of all possible distinct variants is large, any of the top-performing variants (e.g., any of the top three) is likely to meet the design requirements, so the probability that the library contains at least one of them is a sensible criterion for determining the library size. By using a criterion of this type, one may significantly reduce the library size and thus save costs and labor while minimally compromising the quality of the best variant discovered. We present the probabilistic tools underlying these criteria and use them to compare the efficiencies of four randomization schemes: NNN, which uses all 64 codons; NNB, which uses 48 codons; NNK, which uses 32 codons; and MAX, which assigns equal probabilities to each of the 20 amino acids. MAX was found to be the most efficient randomization scheme and NNN the least efficient. TopLib, a computer program for carrying out the related calculations, is available through a user-friendly Web server.
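The "at least one of the top variants" criterion can be sketched for the simplest case. The Python sketch below assumes every variant is equally likely and that library members are independent draws, which corresponds to the MAX scheme; NNN, NNB, and NNK give unequal amino acid probabilities and would need a more detailed calculation. The function names are hypothetical and this is not the TopLib implementation.

```python
import math


def prob_at_least_one_top(library_size, n_positions, top_k, n_amino_acids=20):
    """Probability that a library holds at least one of the top_k variants,
    assuming all n_amino_acids ** n_positions variants are equally likely."""
    n_variants = n_amino_acids ** n_positions
    return 1.0 - (1.0 - top_k / n_variants) ** library_size


def min_library_size(target_prob, n_positions, top_k, n_amino_acids=20):
    """Smallest library size reaching target_prob (0 < target_prob < 1)."""
    n_variants = n_amino_acids ** n_positions
    if top_k >= n_variants:
        return 1  # every variant counts as a top variant
    miss_one_draw = 1.0 - top_k / n_variants
    return math.ceil(math.log(1.0 - target_prob) / math.log(miss_one_draw))
```

Comparing `min_library_size(0.95, 2, 1)` with `min_library_size(0.95, 2, 3)` shows how accepting any of the top three variants, rather than only the best one, shrinks the required library.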
Arthritis & Rheumatism | 2014
Doron Rimar; Itzhak Rosner; Yuval Nov; Gleb Slobodin; Michael Rozenbaum; Katy Halasz; Tharwat Haj; Nizar Jiries; Lisa Kaly; Nina Boulman; Rula Daood; Zahava Vadasz
Fibrosis is a major cause of morbidity and mortality in systemic sclerosis (SSc). Levels of lysyl oxidase (LOX), an extracellular enzyme that stabilizes collagen fibrils, have been found to be elevated in the skin of SSc patients, but have not been evaluated in the serum or correlated with the clinical parameters. We undertook this study to evaluate serum LOX levels in SSc patients and to correlate these levels with clinical parameters of SSc.
The ISME Journal | 2008
Yoram Barak; Yuval Nov; David F. Ackerley; A. Matin
Most existing methods for improving protein activity are laborious and costly, as they either require knowledge of protein structure or involve expression and screening of a vast number of protein mutants. We describe here a successful first application of a novel approach, which requires no structural knowledge and is shown to significantly reduce the number of mutants that need to be screened. In the first phase of this study, around 7000 mutants were screened through standard directed evolution, yielding a 230-fold improvement in activity relative to the wild type. Using sequence analysis and site-directed mutagenesis, an additional single mutant was then produced, with 500-fold improved activity. In the second phase, a novel statistical method for protein improvement was used; building on data from the first phase, only 11 targeted additional mutants were produced through site-directed mutagenesis, and the best among them achieved a >1500-fold improvement in activity over the wild type. Thus, the statistical model underlying the experiment was validated, and its predictions were shown to reduce laboratory labor and resources.
Applied and Environmental Microbiology | 2010
Moran Brouk; Yuval Nov; Ayelet Fishman
Directed evolution and rational design were used to generate active variants of toluene-4-monooxygenase (T4MO) on 2-phenylethanol (PEA), with the aim of producing hydroxytyrosol, a potent antioxidant. Due to the complexity of the enzymatic system (four proteins encoded by six genes), mutagenesis is labor-intensive and time-consuming. Therefore, the statistical model of Nov and Wein (J. Comput. Biol. 12:247-282) was used to reduce the number of variants produced and evaluated in the lab. From an initial data set of 24 variants, with mutations at nine positions, seven double or triple mutants were identified through statistical analysis. The average activity of these mutants was 4.6-fold higher than the average activity of the initial data set. In an attempt to further improve the enzyme activity to obtain PEA hydroxylation, a second round of statistical analysis was performed. Nine variants were considered, with 3, 4, and 5 point mutations. The average activity of the variants obtained in the second statistical round was 1.6-fold higher than in the first round and 7.3-fold higher than that of the initial data set. The best variant discovered, TmoA I100A E214G D285Q, exhibited an initial oxidation rate of 4.4 ± 0.3 nmol/min/mg protein, which is 190-fold higher than the rate obtained by the wild type. This rate was also 2.6-fold higher than the activity of the wild type on the natural substrate toluene. By considering only 16 preselected mutants (out of ∼13,000 possible combinations), a highly active variant was discovered with minimum time and effort.
Scientific Reports | 2015
Carlos G. Acevedo-Rocha; Manfred T. Reetz; Yuval Nov
Saturation mutagenesis is a powerful technique for engineering proteins, metabolic pathways and genomes. In spite of its numerous applications, creating high-quality saturation mutagenesis libraries remains a challenge, as various experimental parameters influence the resulting diversity in a complex manner. We explore various aspects of saturation mutagenesis library preparation from an economic perspective: We introduce a cheaper and faster control for assessing library quality based on liquid media; analyze the role of primer purity and supplier in libraries with and without redundancy; compare library quality, yield, randomization efficiency, and annealing bias using traditional and emergent randomization schemes based on mixtures of mutagenic primers; and establish a methodology for choosing the most cost-effective randomization scheme given the screening costs and other experimental parameters. We show that by carefully considering these parameters, laboratory expenses can be significantly reduced.
PLOS ONE | 2013
Yuval Nov
Saturation mutagenesis is a widely used directed evolution technique, in which a large number of protein variants, each having random amino acids in certain predetermined positions, are screened in order to discover high-fitness variants among them. Several metrics for determining the library size (the number of variants screened) have been suggested in the literature, but none of them incorporates the actual fitness of the variants discovered in the experiment. We present the results of an extensive simulation study, which is based on probabilistic models for protein fitness landscape, and which investigates how the result of a saturation mutagenesis experiment – the fitness of the best variant discovered – varies as a function of the library size. In particular, we study the loss of fitness in the experiment: the difference between the fitness of the best variant discovered and the fitness of the best variant in variant space. Our results show that the existing criteria for determining the library size are conservative, so smaller libraries are often satisfactory. Reducing the library size can save labor, time, and expenses in the laboratory.
Journal of Computational Biology | 2013
Yuval Nov; Alexander Fulton; Karl-Erich Jaeger
In protein engineering, useful information may be gained from systematically generating and screening all single-point mutants of a given protein. We model and analyze an iterative two-stage procedure to generate all these mutants. At each position, L variants are generated in the first stage via saturation mutagenesis and are sequenced. In the second stage, the missing variants (out of the 19 possible single-point substitutions) are produced via site-directed mutagenesis. We study the economic tradeoff associated with varying L, and derive its optimal value given the experimental parameters.
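The tradeoff in choosing L can be illustrated with a simplified cost model. The Python sketch below is not the model from the paper: it assumes each first-stage clone carries one of 20 amino acids with equal probability, and it introduces two hypothetical cost parameters, `cost_per_clone` (generating and sequencing one first-stage clone) and `cost_per_sdm` (producing one missing variant by site-directed mutagenesis).

```python
def expected_missing(n_clones, n_amino_acids=20):
    """Expected number of the 19 substitutions still missing after
    n_clones uniform draws at one position (uniformity is an assumption)."""
    n_targets = n_amino_acids - 1
    return n_targets * (1.0 - 1.0 / n_amino_acids) ** n_clones


def expected_cost(n_clones, cost_per_clone, cost_per_sdm):
    """First-stage cost plus the expected second-stage cost for one position."""
    return n_clones * cost_per_clone + expected_missing(n_clones) * cost_per_sdm


def optimal_clones(cost_per_clone, cost_per_sdm, max_clones=1000):
    """Brute-force search for the L that minimizes the expected cost."""
    return min(
        range(max_clones + 1),
        key=lambda n: expected_cost(n, cost_per_clone, cost_per_sdm),
    )
```

Under these assumptions, the optimal L grows as site-directed mutagenesis becomes more expensive relative to sequencing, and drops to zero when the two costs are equal.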
Methods in Molecular Biology | 2014
Yuval Nov
Directed evolution has emerged as an important tool for engineering proteins with improved or novel properties. Because of their inherent reliance on randomness, directed evolution protocols are amenable to probabilistic modeling and analysis. This chapter summarizes and reviews in a nonmathematical way some of the probabilistic works related to directed evolution, with particular focus on three of the most widely used methods: saturation mutagenesis, error-prone PCR, and in vitro recombination. The ultimate aim is to provide the reader with practical information to guide the planning and design of directed evolution studies. Importantly, the applications and locations of freely available computational resources to assist with this process are described in detail.
Journal of Theoretical Biology | 2013
Yuval Nov; Danny Segev
Codon randomization via degenerate oligonucleotides is a widely used approach for generating protein libraries. We use integer programming methodology to model and solve the problem of computing the minimal mixture of oligonucleotides required to induce an arbitrary target probability over the 20 standard amino acids. We consider both randomization via conventional degenerate oligonucleotides, which incorporate certain nucleotides with equal probabilities at each position of the randomized codon, and randomization via spiked oligonucleotides, which admit an arbitrary nucleotide distribution at each of the codon's positions. Existing methods for computing such mixtures rely on various heuristics.
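To make the notion of a conventional degenerate codon concrete, the Python sketch below expands an IUPAC degenerate codon such as NNK into its concrete codons and computes the amino acid distribution it induces under the standard genetic code. It illustrates the input to the optimization problem only, not the integer programming method itself; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(degenerate_codon):
    """List the concrete codons encoded by a three-letter degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[s] for s in degenerate_codon))]


def amino_acid_distribution(degenerate_codon):
    """Exact amino acid probabilities when all encoded codons are equally likely."""
    codons = expand(degenerate_codon)
    counts = Counter(CODON_TABLE[c] for c in codons)
    return {aa: Fraction(n, len(codons)) for aa, n in counts.items()}
```

For example, `amino_acid_distribution("NNK")` covers all 20 amino acids plus one stop codon, with leucine, serine, and arginine each three times as likely as methionine or tryptophan.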