Publication


Featured research published by Rachel Kolodny.


Journal of Molecular Biology | 2002

Small Libraries of Protein Fragments Model Native Protein Structures Accurately

Rachel Kolodny; Patrice Koehl; Leonidas J. Guibas; Michael Levitt

Prediction of protein structure depends on the accuracy and complexity of the models used. Here, we represent the polypeptide chain by a sequence of rigid fragments that are concatenated without any degrees of freedom. Fragments chosen from a library of representative fragments are fit to the native structure using a greedy build-up method. This gives a one-dimensional representation of native protein three-dimensional structure whose quality depends on the nature of the library. We use a novel clustering method to construct libraries that differ in the fragment length (four to seven residues) and number of representative fragments they contain (25-300). Each library is characterized by the quality of fit (accuracy) and the number of allowed states per residue (complexity). We find that the accuracy depends on the complexity and varies from 2.9 Å for a 2.7-state model on the basis of fragments of length 7 to 0.76 Å for a 15-state model on the basis of fragments of length 5. Our goal is to find representations that are both accurate and economical (low complexity). The models defined here are substantially better in this regard: with ten states per residue we approximate native protein structure to 1 Å, compared to over 20 states per residue needed previously. For the same complexity, we find that longer fragments provide better fits. Unfortunately, libraries of longer fragments must be much larger (for ten states per residue, a seven-residue library is 100 times larger than a five-residue library). As the number of known protein native structures increases, it will be possible to construct larger libraries to better exploit this correlation between neighboring residues. Our fragment libraries, which offer a wide range of optimal fragments suited to different accuracies of fit, may prove to be useful for generating better decoy sets for ab initio protein folding and for generating accurate loop conformations in homology modeling.
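The "100 times larger" figure can be checked with a short calculation. The sketch below assumes that consecutive fragments overlap by three residues, so that each fragment of length f adds f - 3 new residues and a library of N fragments corresponds to N^(1/(f-3)) states per residue. That overlap rule is an assumption made here for illustration (the abstract does not state it), although it is consistent with the ratio the abstract quotes. The function names are hypothetical.

```python
def states_per_residue(library_size, fragment_length, overlap=3):
    """Complexity of a library, ASSUMING an overlap of `overlap` residues
    between consecutive fragments (not stated in the abstract)."""
    new_residues = fragment_length - overlap
    return library_size ** (1.0 / new_residues)


def library_size_for(states, fragment_length, overlap=3):
    """Inverse of states_per_residue: library size needed for a target complexity."""
    return states ** (fragment_length - overlap)


# Ten states per residue: seven-residue versus five-residue library.
size_len7 = library_size_for(10, 7)   # 10 ** 4
size_len5 = library_size_for(10, 5)   # 10 ** 2
ratio = size_len7 / size_len5
print(ratio)
```

Under this assumption the ratio of library sizes is 10^4 / 10^2, which matches the factor of 100 given in the abstract.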


Proceedings of the National Academy of Sciences of the United States of America | 2010

FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately

Inbal Budowski-Tal; Yuval Nov; Rachel Kolodny

Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag: an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a “bag-of-fragments” (a vector that counts the number of occurrences of each fragment) and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions only of parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state-of-the-art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: the SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
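The bag-of-fragments idea can be illustrated with a minimal sketch. It assumes each protein has already been discretized into a list of library fragment indices (the geometric step of matching backbone segments to library fragments is omitted), and it uses cosine distance as the vector comparison, which is an assumption since the abstract does not say which similarity measure is used. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def bag_of_fragments(fragment_ids, library_size):
    """Count how often each library fragment occurs in one protein."""
    counts = Counter(fragment_ids)
    return [counts.get(i, 0) for i in range(library_size)]


def cosine_distance(u, v):
    """1 - cosine similarity; an ASSUMED choice of vector comparison."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def build_inverted_index(bags):
    """Map each fragment index to the proteins that contain it."""
    index = {}
    for name, bag in bags.items():
        for frag_id, count in enumerate(bag):
            if count:
                index.setdefault(frag_id, []).append(name)
    return index


# Toy data: fragment indices into a hypothetical 4-fragment library.
bags = {
    "protA": bag_of_fragments([0, 0, 2, 1], 4),
    "protB": bag_of_fragments([0, 2, 0, 1], 4),  # same fragments, different order
    "protC": bag_of_fragments([3, 3, 3, 3], 4),
}
index = build_inverted_index(bags)
```

Because the representation only counts fragments, `protA` and `protB` get identical vectors even though the fragments appear in a different order, while `protC` shares no fragment with either.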


The International Journal of Robotics Research | 2005

Inverse Kinematics in Biology: The Protein Loop Closure Problem

Rachel Kolodny; Leonidas J. Guibas; Michael Levitt; Patrice Koehl

Assembling fragments from known protein structures is a widely used approach to construct structural models for new proteins. We describe an application of this idea to an important inverse kinematics problem in structural biology: the loop closure problem. We have developed an algorithm for generating the conformations of candidate loops that fit in a gap of given length in a protein structure framework. Our method proceeds by concatenating small fragments of protein chosen from small libraries of representative fragments. Our approach has the advantages of ab initio methods, since we are able to enumerate all candidate loops in the discrete approximation of the conformational space accessible to the loop, as well as the advantages of database search approaches, since the use of fragments of known protein structures guarantees that the backbone conformations are physically reasonable. We test our approach on a set of 427 loops, varying in length from four residues to 14 residues. The quality of the candidate loops is evaluated in terms of global coordinate root mean square (cRMS). The top predictions vary between 0.3 and 4.2 Å for four-residue loops and between 1.5 and 3.1 Å for 14-residue loops.
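The evaluation measure mentioned above can be sketched as follows. The code assumes that "global" cRMS means the root mean square deviation between corresponding atoms of a candidate loop and the native loop, computed in the coordinate frame of the surrounding framework without re-superimposing the loop itself. That reading, the function name, and the toy coordinates are assumptions for illustration.

```python
import math


def crms(coords_a, coords_b):
    """Coordinate RMS deviation between two equally long lists of (x, y, z)
    points, with no superposition applied (ASSUMED meaning of 'global')."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: a candidate loop displaced by 1 Å along z from the native loop.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
candidate = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(crms(native, candidate))
```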


Proteins | 2008

Sequence-similar, structure-dissimilar protein pairs in the PDB.

Mickey Kosloff; Rachel Kolodny

It is often assumed that in the Protein Data Bank (PDB), two proteins with similar sequences will also have similar structures. Accordingly, it has proved useful to develop subsets of the PDB from which “redundant” structures have been removed, based on a sequence‐based criterion for similarity. Similarly, when predicting protein structure using homology modeling, if a template structure for modeling a target sequence is selected by sequence alone, this implicitly assumes that all sequence‐similar templates are equivalent. Here, we show that this assumption is often not correct and that standard approaches to create subsets of the PDB can lead to the loss of structurally and functionally important information. We have carried out sequence‐based structural superpositions and geometry‐based structural alignments of a large number of protein pairs to determine the extent to which sequence similarity ensures structural similarity. We find many examples where two proteins that are similar in sequence have structures that differ significantly from one another. The source of the structural differences usually has a functional basis. The number of such protein pairs that are identified and the magnitude of the dissimilarity depend on the approach that is used to calculate the differences; in particular, sequence‐based structure superpositioning will identify a larger number of structurally dissimilar pairs than geometry‐based structural alignments. When two sequences can be aligned in a statistically meaningful way, sequence‐based structural superpositioning provides a meaningful measure of structural differences. This approach and geometry‐based structure alignments reveal somewhat different information, and one or the other might be preferable in a given application. Our results suggest that in some cases, notably homology modeling, the common use of nonredundant datasets, culled from the PDB based on sequence, may mask important structural and functional information. We have established a database of sequence‐similar, structurally dissimilar protein pairs that will help address this problem (http://luna.bioc.columbia.edu/rachel/seqsimstrdiff.htm).


Proceedings of the National Academy of Sciences of the United States of America | 2011

Maps of protein structure space reveal a fundamental relationship between protein structure and function

Margarita Osadchy; Rachel Kolodny

To study the protein structure–function relationship, we propose a method to efficiently create three-dimensional maps of structure space using a very large dataset of more than 30,000 Structural Classification of Proteins (SCOP) domains. In our maps, each domain is represented by a point, and the distance between any two points approximates the structural distance between their corresponding domains. We use these maps to study the spatial distributions of properties of proteins, and in particular those of local vicinities in structure space such as structural density and functional diversity. These maps provide a unique broad view of protein space and thus reveal previously undescribed fundamental properties thereof. At the same time, the maps are consistent with previous knowledge (e.g., domains cluster by their SCOP class) and organize previous observations concerning specific protein folds in a unified, coherent representation. To investigate the function–structure relationship, we measure the functional diversity (using the Gene Ontology controlled vocabulary) in local structural vicinities. Our most striking finding is that functional diversity varies considerably across structure space: the space has a highly diverse region, and diversity abates when moving away from it. Interestingly, the domains in this region are mostly alpha/beta structures, which are known to be the most ancient proteins. We believe that our unique perspective of structure space will open previously undescribed ways of studying proteins, their evolution, and the relationship between their structure and function.


Bioinformatics | 2007

Using an alignment of fragment strings for comparing protein structures

Iddo Friedberg; Tim Harder; Rachel Kolodny; Einat Sitbon; Zhanwen Li; Adam Godzik

MOTIVATION: Most methods that are used to compare protein structures use three-dimensional (3D) structural information. At the same time, it has been shown that a 1D string representation of local protein structure retains a degree of structural information. This type of representation can be a powerful tool for protein structure comparison and classification, given the arsenal of sequence comparison tools developed by computational biology. However, in order to do so, there is a need to first understand how much information is contained in various possible 1D representations of protein structure.
RESULTS: Here we describe the use of a particular structure fragment library, denoted here as KL-strings, for the 1D representation of protein structure. Using KL-strings, we develop an infrastructure for comparing protein structures with a 1D representation. This study focuses on the added value gained from such a description. We show the new local structure language adds resolution to the traditional three-state (helix, strand and coil) secondary structure description, and provides a high degree of accuracy in recognizing structural similarities when used with a pairwise alignment benchmark. The results of this study have immediate applications towards fast structure recognition, and for fold prediction and classification.
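Once a structure is written as a 1D string over a fragment alphabet, standard sequence alignment can be applied to it. The sketch below is a generic Needleman-Wunsch global alignment score with identity scoring and a linear gap penalty. These scoring choices are assumptions for illustration and are not taken from the abstract, which does not describe the scoring scheme. The letters stand in for hypothetical fragment labels.

```python
def global_alignment_score(s, t, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score between two fragment strings.
    Scoring values are illustrative assumptions."""
    # prev[j] holds the best score for aligning the processed prefix of s with t[:j].
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            diagonal = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Each letter stands for one library fragment (hypothetical labels).
print(global_alignment_score("ABCD", "ABCD"))  # identical strings
print(global_alignment_score("ABCD", "ABD"))   # one fragment missing
```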


Annual review of biophysics | 2013

On the Universe of Protein Folds

Rachel Kolodny; Leonid Pereyaslavets; Abraham O. Samson; Michael Levitt

In the fifty years since the first atomic structure of a protein was revealed, tens of thousands of additional structures have been solved. Like all objects in biology, protein structures show common patterns that seem to define family relationships. Classification of protein structures, which started in the 1970s with about a dozen structures, has continued with increasing enthusiasm, leading to two main fold classifications, SCOP and CATH, as well as many additional databases. Classification is complicated by deciding what constitutes a domain, the fundamental unit of structure. Also difficult is deciding when two given structures are similar. Like all of biology, fold classification is beset by exceptions to all rules. Thus, the perspectives of protein fold space that the fold classifications offer differ from each other. In spite of these ambiguities, fold classifications are useful for prediction of structure and function. Studying the characteristics of fold space can shed light on protein evolution and the physical laws that govern protein behavior.


Biopolymers | 2003

Protein decoy assembly using short fragments under geometric constraints

Rachel Kolodny; Michael Levitt

A small set of protein fragments can represent adequately all known local protein structure. This set of fragments, along with a construction scheme that assembles these fragments into structures, defines a discrete (relatively small) conformation space, which approximates protein structures accurately. We generate protein decoys by sampling geometrically valid structures from this conformation space, biased by the secondary structure prediction for the protein. Unlike other methods, secondary structure prediction is the only protein‐specific information used for generating the decoys. Nevertheless, these decoys are qualitatively similar to those found by others. The method works well for all‐α proteins, and shows promising results for α and β proteins.


Proceedings of the National Academy of Sciences of the United States of America | 2014

Global view of the protein universe

Sergey Nepomnyachiy; Nir Ben-Tal; Rachel Kolodny

Significance: To globally explore protein space, we use networks to present similarities among a representative set of all known domains. In the “domain network” edges connect domains that share “motifs,” i.e., significantly sized segments of similar sequence and structure, and in the “motif network” edges connect recurring motifs that appear in the same domain. The networks offer a way to organize protein space, and examine how the organization changes upon changing the definition of “evolutionary relatedness” among domains. For example, we use them to highlight and characterize the uniqueness of a class of domains called alpha/beta, in which the alpha and beta elements alternate. The networks can also suggest evolutionary paths between domains, and be used for protein search and design.

To explore protein space from a global perspective, we consider 9,710 SCOP (Structural Classification of Proteins) domains with up to 70% sequence identity and present all similarities among them as networks: In the “domain network,” nodes represent domains, and edges connect domains that share “motifs,” i.e., significantly sized segments of similar sequence and structure. We explore the dependence of the network on the thresholds that define the evolutionary relatedness of the domains. At excessively strict thresholds the network falls apart completely; for very lax thresholds, there are network paths between virtually all domains. Interestingly, at intermediate thresholds the network constitutes two regions that can be described as “continuous” versus “discrete.” The continuous region comprises a large connected component, dominated by domains with alternating alpha and beta elements, and the discrete region includes the rest of the domains in isolated islands, each generally corresponding to a fold. We also construct the “motif network,” in which nodes represent recurring motifs, and edges connect motifs that appear in the same domain. This network also features a large and highly connected component of motifs that originate from domains with alternating alpha/beta elements (and some all-alpha domains), and smaller isolated islands. Indeed, the motif network suggests that nature reuses such motifs extensively. The networks suggest evolutionary paths between domains and give hints about protein evolution and the underlying biophysics. They provide natural means of organizing protein space, and could be useful for the development of strategies for protein search and design.
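The threshold behavior described above (the network falling apart at strict thresholds and merging at lax ones) can be illustrated with a toy sketch. It assumes each candidate edge carries a single numeric similarity score and that a higher threshold is stricter. The real networks are defined by thresholds on shared motifs of similar sequence and structure, so the single score, the domain names, and the values below are hypothetical simplifications.

```python
def connected_components(nodes, scored_edges, threshold):
    """Group nodes into connected components, keeping only edges whose
    (hypothetical) similarity score is at least `threshold`."""
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_edges:
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    # Largest component first.
    return sorted(groups.values(), key=len, reverse=True)


# Toy domain network with made-up similarity scores.
nodes = ["d1", "d2", "d3", "d4", "d5"]
edges = [("d1", "d2", 0.9), ("d2", "d3", 0.6), ("d4", "d5", 0.3)]

strict = connected_components(nodes, edges, 0.95)        # every domain isolated
intermediate = connected_components(nodes, edges, 0.5)   # one large component plus islands
lax = connected_components(nodes, edges, 0.1)            # islands merge further
```

In this toy example the intermediate threshold yields one larger component and two isolated domains, loosely mirroring the "continuous" versus "discrete" picture in the abstract.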


Bioinformatics | 2014

Redundancy-weighting for better inference of protein structural features

Chen Yanover; Natalia Vanetik; Michael Levitt; Rachel Kolodny; Chen Keasar

MOTIVATION: Structural knowledge, extracted from the Protein Data Bank (PDB), underlies numerous potential functions and prediction methods. The PDB, however, is highly biased: many proteins have more than one entry, while entire protein families are represented by a single structure, or even not at all. The standard solution to this problem is to limit the studies to non-redundant subsets of the PDB. While alleviating biases, this solution hides the many-to-many relations between sequences and structures. That is, non-redundant datasets conceal the diversity of sequences that share the same fold and the existence of multiple conformations for the same protein. A particularly disturbing aspect of non-redundant subsets is that they hardly benefit from the rapid pace of protein structure determination, as most newly solved structures fall within existing families.
RESULTS: In this study we explore the concept of redundancy-weighted datasets, originally suggested by Miyazawa and Jernigan. Redundancy-weighted datasets include all available structures and associate them (or features thereof) with weights that are inversely proportional to the number of their homologs. Here, we provide the first systematic comparison of redundancy-weighted datasets with non-redundant ones. We test three weighting schemes and show that the distributions of structural features that they produce are smoother (having higher entropy) compared with the distributions inferred from non-redundant datasets. We further show that these smoothed distributions are both more robust and more correct than their non-redundant counterparts. We suggest that the better distributions, inferred using redundancy-weighting, may improve the accuracy of knowledge-based potentials and increase the power of protein structure prediction methods. Consequently, they may enhance model-driven molecular biology.
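The weighting idea can be sketched in a few lines. The abstract says three weighting schemes were tested but does not specify them, so the scheme below (each entry weighted by one over the size of its family, with family membership given in advance and each entry counted among its own homologs) is an assumption for illustration. Entry names, families, and features are made up.

```python
import math
from collections import Counter, defaultdict


def redundancy_weights(family_of):
    """Weight each entry by 1 / (size of its family); an ASSUMED scheme."""
    sizes = Counter(family_of.values())
    return {entry: 1.0 / sizes[family] for entry, family in family_of.items()}


def weighted_distribution(feature_of, weights):
    """Normalized distribution of a structural feature under the given weights."""
    totals = defaultdict(float)
    for entry, feature in feature_of.items():
        totals[feature] += weights[entry]
    norm = sum(totals.values())
    return {feature: total / norm for feature, total in totals.items()}


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Toy dataset: family A is over-represented with three entries.
family_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B"}
feature_of = {"a1": "helix", "a2": "helix", "a3": "helix", "b1": "strand"}

uniform = {entry: 1.0 for entry in family_of}
raw = weighted_distribution(feature_of, uniform)
weighted = weighted_distribution(feature_of, redundancy_weights(family_of))
```

In this toy case the unweighted distribution is dominated by the over-represented family, while the weighted one gives each family equal total weight and keeps every entry in the dataset.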

Collaboration


Top Co-Authors

Patrice Koehl

University of California

Chen Keasar

Ben-Gurion University of the Negev

Barry Honig

Howard Hughes Medical Institute
