Ryan H. Lilien
Dartmouth College
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Ryan H. Lilien.
Journal of Computational Biology | 2003
Ryan H. Lilien; Hany Farid; Bruce Randall Donald
We have developed an algorithm called Q5 for probabilistic classification of healthy versus disease whole serum samples using mass spectrometry. The algorithm employs principal components analysis (PCA) followed by linear discriminant analysis (LDA) on whole spectrum surface-enhanced laser desorption/ionization time of flight (SELDI-TOF) mass spectrometry (MS) data and is demonstrated on four real datasets from complete, complex SELDI spectra of human blood serum. Q5 is a closed-form, exact solution to the problem of classification of complete mass spectra of a complex protein mixture. Q5 employs a probabilistic classification algorithm built upon a dimension-reduced linear discriminant analysis. Our solution is computationally efficient; it is noniterative and computes the optimal linear discriminant using closed-form equations. The optimal discriminant is computed and verified for datasets of complete, complex SELDI spectra of human blood serum. Replicate experiments of different training/testing splits of each dataset are employed to verify robustness of the algorithm. The probabilistic classification method achieves excellent performance. We achieve sensitivity, specificity, and positive predictive values above 97% on three ovarian cancer datasets and one prostate cancer dataset. The Q5 method outperforms previous full-spectrum complex sample spectral classification techniques and can provide clues as to the molecular identities of differentially expressed proteins and peptides.
Journal of Computational Chemistry | 2008
Ivelin S. Georgiev; Ryan H. Lilien; Bruce Randall Donald
One of the main challenges for protein redesign is the efficient evaluation of a combinatorial number of candidate structures. The modeling of protein flexibility, typically by using a rotamer library of commonly‐observed low‐energy side‐chain conformations, further increases the complexity of the redesign problem. A dominant algorithm for protein redesign is dead‐end elimination (DEE), which prunes the majority of candidate conformations by eliminating rigid rotamers that provably are not part of the global minimum energy conformation (GMEC). The identified GMEC consists of rigid rotamers (i.e., rotamers that have not been energy‐minimized) and is thus referred to as the rigid‐GMEC. As a postprocessing step, the conformations that survive DEE may be energy‐minimized. When energy minimization is performed after pruning with DEE, the combined protein design process becomes heuristic, and is no longer provably accurate: a conformation that is pruned using rigid‐rotamer energies may subsequently minimize to a lower energy than the rigid‐GMEC. That is, the rigid‐GMEC and the conformation with the lowest energy among all energy‐minimized conformations (the minimized‐GMEC) are likely to be different. While the traditional DEE algorithm succeeds in not pruning rotamers that are part of the rigid‐GMEC, it makes no guarantees regarding the identification of the minimized‐GMEC. In this paper we derive a novel, provable, and efficient DEE‐like algorithm, called minimized‐DEE (MinDEE), that guarantees that rotamers belonging to the minimized‐GMEC will not be pruned, while still pruning a combinatorial number of conformations. We show that MinDEE is useful not only in identifying the minimized‐GMEC, but also as a filter in an ensemble‐based scoring and search algorithm for protein redesign that exploits energy‐minimized conformations. We compare our results both to our previous computational predictions of protein designs and to biological activity assays of predicted protein mutants. Our provable and efficient minimized‐DEE algorithm is applicable in protein redesign, protein‐ligand binding prediction, and computer‐aided drug design.
Journal of Computational Biology | 2005
Ryan H. Lilien; Brian W. Stevens; Amy C. Anderson; Bruce Randall Donald
Realization of novel molecular function requires the ability to alter molecular complex formation. Enzymatic function can be altered by changing enzyme-substrate interactions via modification of an enzymes active site. A redesigned enzyme may either perform a novel reaction on its native substrates or its native reaction on novel substrates. A number of computational approaches have been developed to address the combinatorial nature of the protein redesign problem. These approaches typically search for the global minimum energy conformation among an exponential number of protein conformations. We present a novel algorithm for protein redesign, which combines a statistical mechanics-derived ensemble-based approach to computing the binding constant with the speed and completeness of a branch-and-bound pruning algorithm. In addition, we developed an efficient deterministic approximation algorithm, capable of approximating our scoring function to arbitrary precision. In practice, the approximation algorithm decreases the execution time of the mutation search by a factor of ten. To test our method, we examined the Phe-specific adenylation domain of the nonribosomal peptide synthetase gramicidin synthetase A (GrsA-PheA). Ensemble scoring, using a rotameric approximation to the partition functions of the bound and unbound states for GrsA-PheA, is first used to predict binding of the wildtype protein and a previously described mutant (selective for leucine), and second, to switch the enzyme specificity toward leucine, using two novel active site sequences computationally predicted by searching through the space of possible active site mutations. The top scoring in silico mutants were created in the wetlab and dissociation/binding constants were determined by fluorescence quenching. These tested mutations exhibit the desired change in specificity from Phe to Leu. Our ensemble-based algorithm, which flexibly models both protein and ligand using rotamer-based partition functions, has application in enzyme redesign, the prediction of protein-ligand binding, and computer-aided drug design.
Methods in Enzymology | 2013
Pablo Gainza; Kyle E. Roberts; Ivelin S. Georgiev; Ryan H. Lilien; Daniel A. Keedy; Cheng-Yu Chen; Faisal Reza; Amy C. Anderson; David C. Richardson; Jane S. Richardson; Bruce Randall Donald
UNLABELLED We have developed a suite of protein redesign algorithms that improves realistic in silico modeling of proteins. These algorithms are based on three characteristics that make them unique: (1) improved flexibility of the protein backbone, protein side-chains, and ligand to accurately capture the conformational changes that are induced by mutations to the protein sequence; (2) modeling of proteins and ligands as ensembles of low-energy structures to better approximate binding affinity; and (3) a globally optimal protein design search, guaranteeing that the computational predictions are optimal with respect to the input model. Here, we illustrate the importance of these three characteristics. We then describe OSPREY, a protein redesign suite that implements our protein design algorithms. OSPREY has been used prospectively, with experimental validation, in several biomedically relevant settings. We show in detail how OSPREY has been used to predict resistance mutations and explain why improved flexibility, ensembles, and provability are essential for this application. AVAILABILITY OSPREY is free and open source under a Lesser GPL license. The latest version is OSPREY 2.0. The program, user manual, and source code are available at www.cs.duke.edu/donaldlab/software.php. CONTACT [email protected].
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1999
Daniel P. Huttenlocher; Ryan H. Lilien; Clark F. Olson
View-based recognition methods, such as those using eigenspace techniques, have been successful for a number of recognition tasks. Such approaches, however, are somewhat limited in their ability to recognize objects that are partly hidden from view or occur against cluttered backgrounds. In order to address these limitations, we have developed a view matching technique based on an eigenspace approximation to the generalized Hausdorff measure. This method achieves compact storage and fast indexing that are the main advantages of eigenspace view matching techniques, while also being tolerant of partial occlusion and background clutter. The method applies to binary feature maps, such as intensity edges, rather than directly to intensity images.
research in computational molecular biology | 2004
Ryan H. Lilien; Brian W. Stevens; Amy C. Anderson; Bruce Randall Donald
Realization of novel molecular function requires the ability to alter molecular complex formation. Enzymatic function can be altered by changing enzyme-substrate interactions via modification of an enzymes active site. A redesigned enzyme may either perform a novel reaction on its native substrates or its native reaction on novel substrates. A number of computational approaches have been developed to address the combinatorial nature of the protein redesign problem. These approaches typically search for the global minimum energy conformation among an exponential number of protein conformations. We present a novel algorithm for protein redesign, which combines a statistical mechanics-derived ensemble-based approach to computing the binding constant with the speed and completeness of a branch-and-bound pruning algorithm. In addition, we developed an efficient deterministic approximation algorithm, capable of approximating our scoring function to arbitrary precision. In practice, the approximation algorithm decreases the execution time of the mutation search by a factor of ten. To test our method, we examined the Phe-specific adenylation domain of the non-ribosomal peptide synthetase gramicidin synthetase A (GrsA-PheA). Ensemble scoring, using a rotameric approximation to the partition functions of the bound and unbound states for GrsA-PheA, is first used to predict binding of the wildtype protein and a previously described mutant (selective for leucine), and second, to switch the enzyme specificity toward leucine, using two novel active site sequences computationally predicted by searching through the space of possible active site mutations. The top scoring in silico mutants were created in the wetlab and dissociation / binding constants were determined by fluorescence quenching. These tested mutations exhibit the desired change in specificity from Phe to Leu. Our ensemble-based algorithm which flexibly models both protein and ligand using rotamer-based partition functions, has application in enzyme redesign, the prediction of protein-ligand binding, and computer-aided drug design.
Journal of Computational Biology | 2004
Christopher James Langmead; Anthony K. Yan; Ryan H. Lilien; Lincong Wang; Bruce Randall Donald
High-throughput NMR structural biology can play an important role in structural genomics. We report an automated procedure for high-throughput NMR resonance assignment for a protein of known structure, or of a homologous structure. These assignments are a prerequisite for probing protein-protein interactions, protein-ligand binding, and dynamics by NMR. Assignments are also the starting point for structure determination and refinement. A new algorithm, called Nuclear Vector Replacement (NVR) is introduced to compute assignments that optimally correlate experimentally measured NH residual dipolar couplings (RDCs) to a given a priori whole-protein 3D structural model. The algorithm requires only uniform( 15)N-labeling of the protein and processes unassigned H(N)-(15)N HSQC spectra, H(N)-(15)N RDCs, and sparse H(N)-H(N) NOEs (d(NN)s), all of which can be acquired in a fraction of the time needed to record the traditional suite of experiments used to perform resonance assignments. NVR runs in minutes and efficiently assigns the (H(N),(15)N) backbone resonances as well as the d(NN)s of the 3D (15)N-NOESY spectrum, in O(n(3)) time. The algorithm is demonstrated on NMR data from a 76-residue protein, human ubiquitin, matched to four structures, including one mutant (homolog), determined either by x-ray crystallography or by different NMR experiments (without RDCs). NVR achieves an assignment accuracy of 92-100%. We further demonstrate the feasibility of our algorithm for different and larger proteins, using NMR data for hen lysozyme (129 residues, 97-100% accuracy) and streptococcal protein G (56 residues, 100% accuracy), matched to a variety of 3D structural models. Finally, we extend NVR to a second application, 3D structural homology detection, and demonstrate that NVR is able to identify structural homologies between proteins with remote amino acid sequences using a database of structural models.
Journal of Biological Chemistry | 2003
Robert H. O'Neil; Ryan H. Lilien; Bruce Randall Donald; Robert M. Stroud; Amy C. Anderson
We have determined the crystal structure of dihydrofolate reductase-thymidylate synthase (DHFR-TS) from Cryptosporidium hominis, revealing a unique linker domain containing an 11-residue α-helix that has extensive interactions with the opposite DHFR-TS monomer of the homodimeric enzyme. Analysis of the structure of DHFR-TS from C. hominis and of previously solved structures of DHFR-TS from Plasmodium falciparum and Leishmania major reveals that the linker domain primarily controls the relative orientation of the DHFR and TS domains. Using the tertiary structure of the linker domains, we have been able to place a number of protozoa in two distinct and dissimilar structural families corresponding to two evolutionary families and provide the first structural evidence validating the use of DHFR-TS as a tool of phylogenetic classification. Furthermore, the structure of C. hominis DHFR-TS calls into question surface electrostatic channeling as the universal means of dihydrofolate transport between TS and DHFR in the bifunctional enzyme.
Journal of Chemical Information and Modeling | 2011
Izhar Wallach; Ryan H. Lilien
Virtual docking algorithms are often evaluated on their ability to separate active ligands from decoy molecules. The current state-of-the-art benchmark, the Directory of Useful Decoys (DUD), minimizes bias by including decoys from a library of synthetically feasible molecules that are physically similar yet chemically dissimilar to the active ligands. We show that by ignoring synthetic feasibility, we can compile a benchmark that is comparable to the DUD and less biased with respect to physical similarity.
intelligent systems in molecular biology | 2006
Ivelin S. Georgiev; Ryan H. Lilien; Bruce Randall Donald
MOTIVATION Structure-based protein redesign can help engineer proteins with desired novel function. Improving computational efficiency while still maintaining the accuracy of the design predictions has been a major goal for protein design algorithms. The combinatorial nature of protein design results both from allowing residue mutations and from the incorporation of protein side-chain flexibility. Under the assumption that a single conformation can model protein folding and binding, the goal of many algorithms is the identification of the Global Minimum Energy Conformation (GMEC). A dominant theorem for the identification of the GMEC is Dead-End Elimination (DEE). DEE-based algorithms have proven capable of eliminating the majority of candidate conformations, while guaranteeing that only rotamers not belonging to the GMEC are pruned. However, when the protein design process incorporates rotameric energy minimization, DEE is no longer provably-accurate. Hence, with energy minimization, the minimized-DEE (MinDEE) criterion must be used instead. RESULTS In this paper, we present provably-accurate improvements to both the DEE and MinDEE criteria. We show that our novel enhancements result in a speedup of up to a factor of more than 1000 when applied in redesign for three different proteins: Gramicidin Synthetase A, plastocyanin, and protein G. AVAILABILITY Contact authors for source code.