Robert D. Carr | Researchain

Publication

Featured researches published by Robert D. Carr.

Journal of Computational Biology | 2004

1001 optimal PDB structure alignments: integer programming methods for finding the maximum contact map overlap.

Alberto Caprara; Robert D. Carr; Sorin Istrail; Giuseppe Lancia; Brian Walenz

Protein structure comparison is a fundamental problem for structural genomics, with applications to drug design, fold prediction, protein clustering, and evolutionary studies. Despite its importance, there are very few rigorous methods and widely accepted similarity measures known for this problem. In this paper we describe developments over the last few years in the study of an emerging measure, the contact map overlap (CMO), for protein structure comparison. A contact map is a list of pairs of residues which lie in three-dimensional proximity in the protein's native fold. Although this measure is in principle computationally hard to optimize, we show how it can in fact be computed with great accuracy for related proteins by integer linear programming techniques. These methods have the advantage of providing certificates of near-optimality by means of upper bounds to the optimal alignment value. We also illustrate effective heuristics, such as local search and genetic algorithms. We were able to obtain for the first time optimal alignments for large similar proteins (about 1,000 residues and 2,000 contacts) and used the CMO measure to cluster proteins in families. The clusters obtained were compared to the SCOP classification in order to validate the measure. Extensive computational experiments showed that alignments which are off by at most 10% from the optimal value can be computed in a short time. Further experiments showed how this measure reacts to the choice of the threshold defining a contact and how to choose this threshold in a sensible way.
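To make the contact map and overlap definitions concrete, here is a minimal Python sketch. It is not the authors' integer programming method: it only builds a contact map from coordinates and scores one given alignment. The function names, the distance threshold, and the representation of an alignment as a dictionary are illustrative assumptions.

```python
from itertools import combinations
from math import dist


def contact_map(coords, threshold):
    """Return the set of residue index pairs (i, j), i < j, whose
    coordinates lie within `threshold` of each other.

    `coords` is a list of 3D points, one per residue (assumed input format).
    """
    return {
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if dist(coords[i], coords[j]) <= threshold
    }


def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of protein A that the alignment maps onto contacts of B.

    `alignment` maps residue indices of A to residue indices of B. It is
    assumed to be one-to-one and order-preserving; this is not checked.
    """
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            u, v = alignment[i], alignment[j]
            if (min(u, v), max(u, v)) in contacts_b:
                shared += 1
    return shared
```

The maximum contact map overlap problem asks for the alignment that maximizes the value returned by `overlap`; the sketch evaluates a single candidate and does not search for the best one.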

Research in Computational Molecular Biology | 2001

101 optimal PDB structure alignments: a branch-and-cut algorithm for the maximum contact map overlap problem

Giuseppe Lancia; Robert D. Carr; Brian Walenz; Sorin Istrail

Structure comparison is a fundamental problem for structural genomics. A variety of structure comparison methods have been proposed, and several protein structure classification servers (e.g., SCOP, DALI, CATH) were designed based on them and are extensively used in practice. This area of research continues to be very active, being energized biennially by the CASP folding competitions, but despite the extraordinary international research effort devoted to it, progress is slow. A fundamental dimension of this bottleneck is the absence of rigorous algorithmic methods. A recent excellent survey on structure comparison by Taylor et al. [23] records the state of the art of the area: "In structure comparison, we do not even have an algorithm that guarantees an optimal answer for pairs of structures …" In this paper we provide the first rigorous algorithm for structure comparison. Our method is based on developing an effective integer linear programming (IP) formulation of protein structure contact map overlap (CMO), and a branch-and-cut strategy that employs lower-bounding heuristics at the branch nodes. Our algorithms identified a gallery of optimal and near-optimal structure alignments for pairs of proteins from the Protein Data Bank with up to 80 amino acids and about 150 contacts each, that is, problem instances of size about 300. Although these sizes also reflect our current limitations, these are the first provably optimal and near-optimal algorithms in the literature for a measure of structure similarity which sees extensive practical use. At the heart of our success in finding optimal alignments is a reduction of the CMO optimization to the maximum independent set (MIS) problem on special graphs. For CMO instances of size 300, the corresponding MIS graph instance contains about 10,000 nodes. While our algorithms are able to solve MIS problems of these sizes to optimality, the known optimal algorithms for the MIS on general graphs can at present only solve instances with up to a few hundred nodes. This is the first effective use of IP methods in protein structure comparison; the biomolecular structure literature contains only one other effective IP method, devoted to RNA comparison, due to Lenhof et al. [18]. The hybrid heuristic approach that worked well for providing lower bounds in the branch-and-cut algorithm was tried on large proteins in a test set suggested by Jeffrey Skolnick. It involved 33 proteins classified into four families: Flavodoxin-like fold CheY-related, Plastocyanin, TIM Barrel, and Ferritin. Out of the set of all 528 pairwise structure alignments, we have validated the clustering with 98.7% accuracy (1.3% false negatives and 0% false positives).

Random Structures & Algorithms - Probabilistic methods in combinatorial optimization archive | 2002

Randomized metarounding

Robert D. Carr; Santosh Vempala

Let P be a linear relaxation of an integer polytope Z such that the integrality gap of P with respect to Z is at most r, as verified by a polytime heuristic A, which on any positive cost function c returns an integer solution (an extreme point of Z) whose cost is at most r times the optimal cost over P. Then for any point x* in P (a fractional solution), rx* dominates some convex combination of extreme points of Z. A constructive version of this theorem is presented here, with applications to approximation algorithms, and can be viewed as a generalization of randomized rounding.
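The domination statement can be written out explicitly. The symbols λ_i and z_i below are notation introduced here for illustration and do not appear in the abstract; "dominates" is read as a componentwise inequality.

```latex
% For every fractional point x^* in P there exist extreme points
% z_1, ..., z_k of Z and convex weights lambda_1, ..., lambda_k with
\[
  r\,x^{*} \;\ge\; \sum_{i=1}^{k} \lambda_i z_i ,
  \qquad \lambda_i \ge 0 , \qquad \sum_{i=1}^{k} \lambda_i = 1 .
\]
% Sampling z_i with probability lambda_i gives, for any cost c >= 0,
\[
  \mathbb{E}\!\left[ c^{\top} z \right]
  \;=\; \sum_{i=1}^{k} \lambda_i \, c^{\top} z_i
  \;\le\; r \, c^{\top} x^{*} .
\]
```

Read this way, picking an extreme point at random according to the weights yields an integer solution whose expected cost is within a factor r of the fractional cost, which is the sense in which the result resembles randomized rounding.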

Journal of Chemical Information and Computer Sciences | 2004

The signature molecular descriptor: 4. Canonizing molecules using extended valence sequences

Jean-Loup Faulon; Michael J. Collins; Robert D. Carr

We present a new algorithm to canonize molecular graphs using the signature molecular descriptor introduced in the previous papers of this series. While developed specifically for molecular structures, the algorithm can be used for any graph and is not limited to acyclic graphs, planar graphs, bounded valence, or bounded genus graphs, for which polynomial time algorithms exist. The algorithm is tested with benzenoid hydrocarbons and a database of 126,705 organic compounds. The algorithm's performance is compared against Brendan McKay's Nauty algorithm, which is believed to be the fastest graph canonization algorithm for general graphs, with five series of graphs each comprising up to 30,000 vertices: 2D meshes (pericondensed benzenoids), 3D cages (fullerenes and nanotubes), 3D meshes (crystal lattices), 4D cages, and power law graphs (protein and gene networks). The algorithm can be downloaded as open-source code at http://www.cs.sandia.gov/~jfaulon/QSAR.

Mathematical Programming | 2006

Robust optimization of contaminant sensor placement for community water systems

Robert D. Carr; Harvey J. Greenberg; William E. Hart; Goran Konjevod; Erik Lauer; Henry Lin; Tod Morrison; Cynthia A. Phillips

We present a series of related robust optimization models for placing sensors in municipal water networks to detect contaminants that are maliciously or accidentally injected. We formulate sensor placement problems as mixed-integer programs, for which the objective coefficients are not known with certainty. We consider a restricted absolute robustness criterion that is motivated by natural restrictions on the uncertain data, and we define three robust optimization models that differ in how the coefficients in the objective vary. Under one set of assumptions there exists a sensor placement that is optimal for all admissible realizations of the coefficients. Under other assumptions, we can apply sorting to solve each worst-case realization efficiently, or we can apply duality to integrate the worst-case outcome and have one integer program. The most difficult case is where the objective parameters are bilinear, and we prove that it is NP-hard even under simplifying assumptions. We consider a relaxation that provides an approximation, giving an overall guarantee of near-optimality when used with branch-and-bound search. We present preliminary computational experiments that illustrate the computational complexity of solving these robust formulations on sensor placement applications.

Other Information: PBD: 1 Sep 2000 | 2000

Branch-and-Cut Algorithms for Independent Set Problems: Integrality Gap and An Application to Protein Structure Alignment

Robert D. Carr; Giuseppe Lancia; Sorin Istrail

Interfaces | 2009

US Environmental Protection Agency Uses Operations Research to Reduce Contamination Risks in Drinking Water

Regan Murray; William E. Hart; Cynthia A. Phillips; Jonathan W. Berry; Erik G. Boman; Robert D. Carr; Lee Ann Riesen; Jean-Paul Watson; Terra Haxton; Jonathan G. Herrmann; Robert Janke; George M. Gray; Thomas N. Taxon; James G. Uber; Kevin M. Morley

Integer Programming and Combinatorial Optimization | 1998

A New Bound for the 2-Edge Connected Subgraph Problem

Robert D. Carr; R. Ravi

Operations Research Letters | 2002

Compact vs. exponential-size LP relaxations

Robert D. Carr; Giuseppe Lancia

Archive | 2003

A Decomposition-Based Pseudoapproximation Algorithm for Network Flow Inhibition

Carl Burch; Robert D. Carr; Sven Oliver Krumke; Madhav V. Marathe; Cynthia A. Phillips; Eric Sundberg

In the network inhibition problem, we wish to expend a limited budget attacking a given edge-capacitated graph by “paying” to remove edge capacity from some subset of the edges. We wish to minimize the resulting maximum flow between two designated vertices s and t. The problem is strongly NP-hard. Previous approximation algorithms applied only to planar graphs. In this chapter, we give a polynomial-time algorithm, based on a linear-programming relaxation of an integer program, that finds an attack with cost B_a and residual network capacity (max flow) C_a such that

$$\frac{B_a}{B} + \epsilon \frac{C_a}{C^*} \leq 1 + \epsilon,$$

where ε > 0 is a given error parameter, B is the given budget (the amount of resources to expend damaging the network), and C* is the minimum (optimal) residual capacity for any attack with budget B. For example, our algorithm returns a (1, 1+1/ε)-approximation or a (1+ε, 1)-pseudoapproximation, but we do not know which a priori. The parameter ε biases the nature of the solution, but does not affect the running time.
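The two outcomes quoted for the network inhibition algorithm follow from its guarantee by a short case split. The derivation below is a sketch that writes the error parameter as ε, takes the guarantee to be B_a/B + ε C_a/C* ≤ 1 + ε, and assumes B > 0, C* > 0, and nonnegative B_a and C_a.

```latex
% Starting point: the guarantee for attack cost B_a and residual capacity C_a.
\[
  \frac{B_a}{B} + \epsilon \frac{C_a}{C^{*}} \le 1 + \epsilon .
\]
% Case 1: the attack stays within budget, B_a <= B.
% Dropping the nonnegative term B_a / B and dividing by epsilon:
\[
  \epsilon \frac{C_a}{C^{*}} \le 1 + \epsilon
  \quad\Longrightarrow\quad
  C_a \le \left( 1 + \frac{1}{\epsilon} \right) C^{*} .
\]
% Case 2: the attack exceeds the budget, B_a > B.
% Then epsilon * C_a / C^* < epsilon, so C_a < C^*, and dropping the
% nonnegative term epsilon * C_a / C^* bounds the cost:
\[
  \frac{B_a}{B} \le 1 + \epsilon
  \quad\Longrightarrow\quad
  B_a \le (1 + \epsilon)\, B , \qquad C_a < C^{*} .
\]
```

Case 1 corresponds to the (1, 1+1/ε) outcome and Case 2 to the (1+ε, 1) outcome. Which case occurs depends on the attack the algorithm returns, which is consistent with the remark that it is not known a priori.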
