Chris Thachuk

California Institute of Technology

Publication

Featured research published by Chris Thachuk.

BMC Bioinformatics | 2007

A replica exchange Monte Carlo algorithm for protein folding in the HP model

Chris Thachuk; Alena Shmygelska; Holger H. Hoos

Background: The ab initio protein folding problem consists of predicting protein tertiary structure from a given amino acid sequence by minimizing an energy function; it is one of the most important and challenging problems in biochemistry, molecular biology and biophysics. The ab initio protein folding problem is computationally challenging and has been shown to be NP-hard even when conformations are restricted to a lattice. In this work, we implement and evaluate the replica exchange Monte Carlo (REMC) method, which has already been applied very successfully to more complex protein models and other optimization problems with complex energy landscapes, in combination with the highly effective pull move neighbourhood in two widely studied Hydrophobic Polar (HP) lattice models.

Results: We demonstrate that REMC is highly effective for solving instances of the square (2D) and cubic (3D) HP protein folding problem. When using the pull move neighbourhood, REMC outperforms current state-of-the-art algorithms for most benchmark instances. Additionally, we show that this new algorithm provides a larger ensemble of ground-state structures than the existing state-of-the-art methods. Furthermore, it scales well with sequence length, and it finds significantly better conformations on long biological sequences and sequences with a provably unique ground-state structure, which is believed to be a characteristic of real proteins. We also present evidence that our REMC algorithm can fold sequences which exhibit significant interaction between termini in the hydrophobic core relatively easily.

Conclusion: We demonstrate that REMC utilizing the pull move neighbourhood significantly outperforms current state-of-the-art methods for protein structure prediction in the HP model on 2D and 3D lattices. This is particularly noteworthy, since so far, the state-of-the-art methods for 2D and 3D HP protein folding (in particular, the pruned-enriched Rosenbluth method (PERM) and, to some extent, Ant Colony Optimisation (ACO)) were based on chain growth mechanisms. To the best of our knowledge, this is the first application of REMC to HP protein folding on the cubic lattice, and the first extension of the pull move neighbourhood to a 3D lattice.
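As a rough illustration of the two ingredients this abstract names, the sketch below computes the energy of a conformation in the 2D HP model and the standard acceptance probability for exchanging two replicas. It is not the authors' implementation: the function names `hp_energy` and `swap_probability` are hypothetical, the conformation is assumed to be given as a list of lattice coordinates with consecutive residues already adjacent, and the pull move neighbourhood is not shown.

```python
import math


def hp_energy(sequence, coords):
    """Energy of a 2D HP conformation: -1 per non-consecutive H-H lattice contact.

    sequence is a string over {"H", "P"}; coords is a list of (x, y) lattice
    points, one per residue (assumed to form a connected chain).
    """
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    position = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Probability of exchanging the conformations held by replicas i and j.

    Uses the usual replica exchange rule min(1, exp((b_i - b_j) * (E_i - E_j)))
    with inverse temperature b = 1 / T.
    """
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    if delta >= 0.0:
        return 1.0
    return math.exp(delta)
```

For example, the sequence `HPPH` folded around a unit square brings its two H residues into contact, so `hp_energy` reports a lower energy than for the same sequence laid out in a straight line. A swap that moves the lower-energy conformation to the colder replica is always accepted; the reverse swap is accepted only with a probability below one.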

BMC Bioinformatics | 2009

Core Hunter: an algorithm for sampling genetic resources based on multiple genetic measures

Chris Thachuk; José Crossa; Jorge Franco; Susanne Dreisigacker; Marilyn L. Warburton; Guy Davenport

Background: Existing algorithms and methods for forming diverse core subsets currently address either allele representativeness (breeders' preference) or allele richness (taxonomists' preference). The main objective of this paper is to propose a powerful yet flexible algorithm capable of selecting core subsets that have high average genetic distance between accessions, or rich genetic diversity overall, or a combination of both.

Results: We present Core Hunter, an advanced stochastic local search algorithm for selecting core subsets. Core Hunter is able to find core subsets having more genetic diversity and better average genetic distance than the current state-of-the-art algorithms for all genetic distance and diversity measures we evaluated. Furthermore, Core Hunter can attempt to optimize any number of genetic measures simultaneously, based on the preference of the user. Notably, Core Hunter is able to select significantly smaller core subsets, which retain all unique alleles from a reference collection, than state-of-the-art algorithms.

Conclusion: Core Hunter is a highly effective and flexible tool for sampling genetic resources and establishing core subsets. Our implementation, documentation, and source code for Core Hunter are available at http://corehunter.org
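To make the idea of selecting a core subset by stochastic local search concrete, here is a minimal sketch that maximises the average pairwise distance of a fixed-size subset by random single swaps. It is a plain hill climber written for illustration, not Core Hunter's algorithm or API: the names `mean_distance` and `local_search` are hypothetical, the distance matrix is assumed to be precomputed and symmetric, and only one measure is optimised.

```python
import itertools
import random


def mean_distance(subset, dist):
    """Average pairwise distance between the accessions in subset (size >= 2)."""
    pairs = list(itertools.combinations(sorted(subset), 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def local_search(dist, size, steps, rng):
    """Random-swap hill climbing over subsets of the given size.

    dist is a square matrix of distances between accessions, indexed 0..n-1.
    Requires 2 <= size < n. Returns (subset, score).
    """
    n = len(dist)
    current = set(rng.sample(range(n), size))
    score = mean_distance(current, dist)
    for _ in range(steps):
        # Propose swapping one selected accession for one unselected accession.
        removed = rng.choice(sorted(current))
        added = rng.choice(sorted(set(range(n)) - current))
        candidate = (current - {removed}) | {added}
        candidate_score = mean_distance(candidate, dist)
        if candidate_score >= score:  # accept improvements and ties
            current, score = candidate, candidate_score
    return current, score
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible. Combining several measures, as the abstract describes, could be approximated here by replacing `mean_distance` with a weighted sum of objective functions, though that is an assumption about how one might extend the sketch rather than a description of the published tool.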

Natural Computing | 2015

DNA walker circuits: computational potential, design, and verification

Frits Dannenberg; Marta Z. Kwiatkowska; Chris Thachuk; Andrew J. Turberfield

Unlike their traditional, silicon counterparts, DNA computers have natural interfaces with both chemical and biological systems. These can be used for a number of applications, including the precise arrangement of matter at the nanoscale and the creation of smart biosensors. Like silicon circuits, DNA strand displacement systems (DSD) can evaluate non-trivial functions. However, these systems can be slow and are susceptible to errors. It has been suggested that localised hybridization reactions could overcome some of these challenges. Localised reactions occur in DNA ‘walker’ systems which were recently shown to be capable of navigating a programmable track tethered to an origami tile. We investigate the computational potential of these systems for evaluating Boolean functions and forming composable circuits. We find that systems of multiple walkers have severely limited potential for parallel circuit evaluation. DNA walkers, like DSDs, are also susceptible to errors. We develop a discrete stochastic model of DNA walker ‘circuits’ based on experimental data, and demonstrate the merit of using probabilistic model checking techniques to analyse their reliability, performance and correctness. This analysis aids in the design of reliable and efficient DNA walker circuits.

International Workshop on DNA-Based Computers | 2012

Space and Energy Efficient Computation with DNA Strand Displacement Systems

Chris Thachuk; Anne Condon

Chemical reaction networks (CRNs) are important models of molecular programming that can be realized by logically reversible, and thus energy efficient, DNA strand displacement systems (DSDs). Qian et al. [12] showed that energy efficient DSDs are Turing-universal; however, their simulation of a computation requires space (or volume) proportional to the number of steps of the computation. Here we show that polynomially space bounded computations can be simulated in both a space and energy efficient manner using logically reversible CRNs and DSDs. A consequence of our proofs is that determining whether a particular molecular species can be produced from an initial pool of molecules of a CRN or DSD is PSPACE-hard, and thus also verifying the correctness of CRNs and DSDs is PSPACE-hard.

International Conference on DNA Computing | 2015

Leakless DNA Strand Displacement Systems

Chris Thachuk; Erik Winfree; David Soloveichik

While current experimental demonstrations have been limited to small computational tasks, DNA strand displacement systems (DSD systems) [25] hold promise for sophisticated information processing within chemical or biological environments. A DSD system encodes designed reactions that are facilitated by three-way or four-way toehold-mediated strand displacement. However, such systems are capable of spurious displacement events that lead to leak: incorrect signal production. We have identified sources of leak pathways in typical existing DSD schemes that rely on toehold sequestration and are susceptible to toeless strand displacement, i.e., displacement reactions that occur despite the absence of a toehold. Based on this understanding, we propose a simple, domain-level motif for the design of leak-resistant DSD systems. This motif forms the basis of a number of DSD schemes that do not rely on toehold sequestration alone to prevent spurious displacements. Spurious displacements are still possible in our systems, but require multiple, low probability events to occur. Our schemes can implement combinatorial Boolean logic formulas and can be extended to implement arbitrary chemical reaction networks.

International Conference on DNA Computing | 2013

Stochastic Simulation of the Kinetics of Multiple Interacting Nucleic Acid Strands

Joseph M. Schaeffer; Chris Thachuk; Erik Winfree

DNA nanotechnology is an emerging field which utilizes the unique structural properties of nucleic acids in order to build nanoscale devices, such as logic gates, motors, walkers, and algorithmic structures. Predicting the structure and interactions of a DNA device requires good modeling of both the thermodynamics and the kinetics of the DNA strands within the system. The kinetics of a set of DNA strands can be modeled as a continuous time Markov process through the state space of all secondary structures. The primary means of exploring the kinetics of a DNA system is by simulating trajectories through the state space and aggregating data over many such trajectories. We expand on previous work by extending the thermodynamics and kinetics models to handle multiple strands in a fixed volume, and show that the new models are consistent with previous models. We developed data structures and algorithms that allow us to take advantage of local properties of secondary structure, improving the efficiency of the simulator so that we can handle larger systems. The new kinetic parameters in our model were calibrated by analyzing simulator results on experimental systems that measure basic kinetic rates of various processes. Finally, we apply the new simulator to explore a case study on toehold-mediated four-way branch migration.
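The abstract models kinetics as a continuous-time Markov process explored by sampling trajectories. A generic way to sample one such trajectory is sketched below. This is not the simulator described in the paper: `simulate_ctmc` is a hypothetical name, the state space is assumed to be small enough to list explicitly as a dictionary of rates, and real secondary-structure state spaces are far too large for that.

```python
import random


def simulate_ctmc(start, rates, stop_time, rng):
    """Sample one trajectory of a continuous-time Markov chain.

    rates maps each state to a dict {next_state: rate}. The trajectory is
    returned as a list of (time, state) pairs, beginning at time 0.
    """
    time, state = 0.0, start
    trajectory = [(time, state)]
    while True:
        moves = rates.get(state, {})
        total = sum(moves.values())
        if total <= 0.0:
            break  # absorbing state: nothing can happen next
        # The waiting time in a state is exponential with the total exit rate.
        time += rng.expovariate(total)
        if time > stop_time:
            break
        # Choose the next state with probability proportional to its rate.
        state = rng.choices(list(moves), weights=list(moves.values()))[0]
        trajectory.append((time, state))
    return trajectory
```

With an illustrative toy model such as `{"open": {"closed": 2.0}, "closed": {"open": 1.0, "bound": 0.5}, "bound": {}}` (state names and rates invented for this example), repeated calls with different seeds give the kind of trajectory ensemble over which statistics can be aggregated.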

Nature Communications | 2017

Compiler-aided systematic construction of large-scale DNA strand displacement circuits using unpurified components

Anupama J. Thubagere; Chris Thachuk; Joseph Berleant; Robert F. Johnson; Diana A. Ardelean; Kevin M. Cherry; Lulu Qian

Biochemical circuits made of rationally designed DNA molecules are proofs of concept for embedding control within complex molecular environments. They hold promise for transforming the current technologies in chemistry, biology, medicine and material science by introducing programmable and responsive behaviour to diverse molecular systems. As the transformative power of a technology depends on its accessibility, two main challenges are an automated design process and simple experimental procedures. Here we demonstrate the use of circuit design software, combined with the use of unpurified strands and simplified experimental procedures, for creating a complex DNA strand displacement circuit that consists of 78 distinct species. We develop a systematic procedure for overcoming the challenges involved in using unpurified DNA strands. We also develop a model that takes synthesis errors into consideration and semi-quantitatively reproduces the experimental data. Our methods now enable even novice researchers to successfully design and construct complex DNA strand displacement circuits.

International Workshop on DNA-Based Computers | 2014

Fast Algorithmic Self-assembly of Simple Shapes Using Random Agitation

Ho-Lin Chen; David Doty; Dhiraj Holden; Chris Thachuk; Damien Woods; Chun-Tao Yang

We study the power of uncontrolled random molecular movement in a model of self-assembly called the nubots model. The nubots model is an asynchronous nondeterministic cellular automaton augmented with rigid-body movement rules (push/pull, deterministically and programmatically applied to specific monomers) and random agitations (nondeterministically applied to every monomer and direction with equal probability all of the time). Previous work on nubots showed how to build simple shapes such as lines and squares quickly, in expected time that is merely logarithmic in their size. These results crucially make use of the programmable rigid-body movement rule: the ability for a single monomer to push or pull large objects quickly, and only at a time and place of the programmers’ choosing. However, in engineered molecular systems, molecular motion is largely uncontrolled and fundamentally random. This raises the question of whether similar results can be achieved in a more restrictive, and perhaps easier to justify, model where uncontrolled random movements, or agitations, are happening throughout the self-assembly process and are the only form of rigid-body movement. We show that this is indeed the case: we give a polylogarithmic expected time construction for squares using agitation, and a sublinear expected time construction to build a line. Such results are impossible in an agitation-free (and movement-free) setting and thus show the benefits of exploiting uncontrolled random movement.

International Workshop on DNA-Based Computers | 2009

NP-Completeness of the Direct Energy Barrier Problem without Pseudoknots

Ján Maňuch; Chris Thachuk; Ladislav Stacho; Anne Condon

Knowledge of energy barriers between pairs of secondary structures for a given DNA or RNA molecule is useful, both in understanding RNA function in biological settings and in design of programmed molecular systems. Current heuristics are not guaranteed to find the exact energy barrier, raising the question whether the energy barrier can be calculated efficiently. In this paper, we study the computational complexity of a simple formulation of the energy barrier problem, in which each base pair contributes an energy of −1 and only base pairs in the initial and final structures may be used on a folding pathway from initial to final structure. We show that this problem is NP-complete.

Combinatorial Pattern Matching | 2011

Succincter text indexing with wildcards

Chris Thachuk

We study the problem of indexing text with wildcard positions, motivated by the challenge of aligning sequencing data to large genomes that contain millions of single nucleotide polymorphisms (SNPs), positions known to differ between individuals. SNPs modeled as wildcards can lead to more informed and biologically relevant alignments. We improve the space complexity of previous approaches by giving a succinct index requiring (2 + o(1))n log σ + O(n) + O(d log n) + O(k log k) bits for a text of length n over an alphabet of size σ containing d groups of k wildcards. The new index is particularly favourable for larger alphabets and comparable for smaller alphabets, such as DNA. A key to the space reduction is a result we give showing how any compressed suffix array can be supplemented with auxiliary data structures occupying O(n) + O(d log n/d) bits to also support efficient dictionary matching queries. We present a new query algorithm for our wildcard index that greatly reduces the query working space to O(dm + m log n) bits, where m is the length of the query. We note that compared to previous results this reduces the working space by two orders of magnitude when aligning short read data to the human genome.
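The matching problem that such an index answers can be stated with a naive scan, shown below purely to fix the semantics: a wildcard position in the text is compatible with any query character. This is a quadratic-time baseline, not the succinct index from the paper; the function name `find_matches` and the choice of `*` as the wildcard symbol are assumptions made for the example.

```python
def find_matches(text, pattern, wildcard="*"):
    """Return every start position where pattern aligns with text.

    A wildcard character in text matches any pattern character; all other
    positions must match exactly.
    """
    hits = []
    for start in range(len(text) - len(pattern) + 1):
        window = text[start:start + len(pattern)]
        if all(t == wildcard or t == p for t, p in zip(window, pattern)):
            hits.append(start)
    return hits
```

For the text `AC*TAGT`, the query `GT` aligns both at the wildcard (position 2) and at the literal occurrence (position 5), whereas `CAT` aligns only across the wildcard (position 1). An index avoids rescanning the text for every query, which is where the space bounds quoted in the abstract come in.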
