David Allouche

Institut national de la recherche agronomique

Publication

Featured research published by David Allouche.

PLOS ONE | 2011

Gene Regulatory Network Reconstruction Using Bayesian Networks, the Dantzig Selector, the Lasso and Their Meta-Analysis

Matthieu Vignes; Jimmy Vandel; David Allouche; Nidal Ramadan-Alban; Christine Cierco-Ayrolles; Thomas Schiex; Brigitte Mangin; Simon de Givry

Modern technologies, especially next-generation sequencing facilities, are giving cheaper access to genotype and genomic data measured on the same sample at once. This creates an ideal situation for multifactorial experiments designed to infer gene regulatory networks. The fifth “Dialogue for Reverse Engineering Assessments and Methods” (DREAM5) challenges are aimed at assessing methods and associated algorithms devoted to the inference of biological networks. Challenge 3 on “Systems Genetics” proposed to infer causal gene regulatory networks from different genetical genomics data sets. We investigated a wide panel of methods ranging from Bayesian networks to penalised linear regressions to analyse such data, and proposed a simple yet very powerful meta-analysis, which combines these inference methods. We present results of the Challenge as well as a more in-depth analysis of predicted networks in terms of structure and reliability. The developed meta-analysis was ranked first among the teams participating in Challenge 3A. It paves the way for future extensions of our inference method and more accurate gene network estimates in the context of genetical genomics.

Bioinformatics | 2013

A new framework for computational protein design through cost function network optimization

Seydou Traoré; David Allouche; Isabelle André; Simon de Givry; George Katsirelos; Thomas Schiex; Sophie Barbe

MOTIVATION: The main challenge for structure-based computational protein design (CPD) remains the combinatorial nature of the search space. Even in its simplest fixed-backbone formulation, CPD encompasses a computationally difficult NP-hard problem that prevents the exact exploration of complex systems defining large sequence-conformation spaces. RESULTS: We present here a CPD framework, based on cost function network (CFN) solving, a recent exact combinatorial optimization technique, to efficiently handle highly complex combinatorial spaces encountered in various protein design problems. We show that the CFN-based approach is able to solve to optimality a variety of complex designs that could often not be solved using a usual CPD-dedicated tool or state-of-the-art exact operations research tools. Beyond the identification of the optimal solution, the global minimum-energy conformation, the CFN-based method is also able to quickly enumerate large ensembles of suboptimal solutions of interest to rationally build experimental enzyme mutant libraries. AVAILABILITY: The combined pipeline used to generate energetic models (based on a patched version of the open source solver Osprey 2.0), the conversion to CFN models (based on Perl scripts) and CFN solving (based on the open source solver toulbar2) are all available at http://genoweb.toulouse.inra.fr/~tschiex/CPD

Artificial Intelligence | 2014

Computational protein design as an optimization problem

David Allouche; Isabelle André; Sophie Barbe; Jessica Davies; Simon de Givry; George Katsirelos; Barry O'Sullivan; Steven David Prestwich; Thomas Schiex; Seydou Traoré

Proteins are chains of simple molecules called amino acids. The three-dimensional shape of a protein and its amino acid composition define its biological function. Over millions of years, living organisms have evolved a large catalog of proteins. By exploring the space of possible amino acid sequences, protein engineering aims at similarly designing tailored proteins with specific desirable properties. In Computational Protein Design (CPD), the challenge of identifying a protein that performs a given task is defined as the combinatorial optimization of a complex energy function over amino acid sequences. In this paper, we introduce the CPD problem and some of the main approaches that have been used by structural biologists to solve it, with an emphasis on the exact method embodied in the dead-end elimination/A* algorithm (DEE/A*). The CPD problem is a specific form of binary Cost Function Network (CFN, aka Weighted CSP). We show how DEE algorithms can be incorporated and suitably modified to be maintained during search, at reasonable computational cost. We then evaluate the efficiency of CFN algorithms as implemented in our solver toulbar2, on a set of real CPD instances built in collaboration with structural biologists. The CPD problem can be easily reduced to 0/1 Linear Programming, 0/1 Quadratic Programming, 0/1 Quadratic Optimization, Weighted Partial MaxSAT and Graphical Model optimization problems. We compare toulbar2 with these different approaches using a variety of solvers. We observe tremendous differences in the difficulty that each approach has on these instances. Overall, the CFN approach shows the best efficiency on these problems, improving by several orders of magnitude on the exact DEE/A* approach. The introduction of dead-end elimination before or during search further improves these results.
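To make the "binary Cost Function Network" formulation concrete, here is a minimal sketch in Python. It is an illustration only, not the toulbar2 model or API: the instance (three positions, two candidate rotamers each) and the names `unary`, `binary`, `energy` and `brute_force_optimum` are hypothetical. Each position is a variable, each rotamer a value, and the energy is a sum of unary and pairwise cost tables. Exhaustive enumeration is only feasible on toy sizes, which is why exact solvers rely on bounds and pruning instead.

```python
from itertools import product

# Hypothetical toy instance: 3 positions, 2 candidate rotamers per position.
# unary[i][a] is the cost of choosing rotamer a at position i.
unary = [
    [1, 3],
    [2, 0],
    [0, 2],
]
# binary[(i, j)][a][b] is the pairwise cost of rotamer a at i and b at j (i < j).
binary = {
    (0, 1): [[0, 4], [1, 0]],
    (1, 2): [[0, 2], [3, 0]],
}

def energy(assignment):
    """Total cost of a complete assignment: unary terms plus pairwise terms."""
    total = sum(unary[i][a] for i, a in enumerate(assignment))
    total += sum(t[assignment[i]][assignment[j]] for (i, j), t in binary.items())
    return total

def brute_force_optimum():
    """Enumerate every assignment and return one with minimum cost."""
    domains = [range(len(costs)) for costs in unary]
    return min(product(*domains), key=energy)

best = brute_force_optimum()
print(best, energy(best))  # (0, 0, 0) 3
```

The search space here has only 2^3 assignments; the abstracts on this page report instances many orders of magnitude larger, where enumeration is not an option.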

Journal of Chemical Theory and Computation | 2015

Guaranteed Discrete Energy Optimization on Large Protein Design Problems

David Simoncini; David Allouche; Simon de Givry; Céline Delmas; Sophie Barbe; Thomas Schiex

In Computational Protein Design (CPD), assuming a rigid backbone and amino-acid rotamer library, the problem of finding a sequence with an optimal conformation is NP-hard. In this paper, using Dunbrack's rotamer library and the Talaris2014 decomposable energy function, we use an exact deterministic method combining branch and bound, arc consistency, and tree-decomposition to provably identify the global minimum energy sequence-conformation on full-redesign problems, defining search spaces of size up to 10^234. This is achieved on a single core of a standard computing server, requiring a maximum of 66 GB RAM. A variant of the algorithm is able to exhaustively enumerate all sequence-conformations within an energy threshold of the optimum. These proven optimal solutions are then used to evaluate the frequencies and amplitudes, in energy and sequence, at which an existing CPD-dedicated simulated annealing implementation may miss the optimum on these full redesign problems. The probability of finding an optimum drops close to 0 very quickly. In the worst case, despite 1,000 repeats, the annealing algorithm remained more than 1 Rosetta unit away from the optimum, leading to design sequences that could differ from the optimal sequence by more than 30% of their amino acids.

Principles and Practice of Constraint Programming | 2015

Anytime Hybrid Best-First Search with tree decomposition for weighted CSP

David Allouche; Simon de Givry; Georgios Katsirelos; Thomas Schiex; Matthias Zytnicki

We propose Hybrid Best-First Search (HBFS), a search strategy for optimization problems that combines Best-First Search (BFS) and Depth-First Search (DFS). Like BFS, HBFS provides an anytime global lower bound on the optimum, while also providing anytime upper bounds, like DFS. Hence, it provides feedback on the progress of search and solution quality in the form of an optimality gap. In addition, it exhibits highly dynamic behavior that allows it to perform on par with methods like limited discrepancy search and frequent restarting in terms of quickly finding good solutions. We also use the lower bounds reported by HBFS in problems with small treewidth, by integrating it into Backtracking with Tree Decomposition (BTD). BTD-HBFS exploits the lower bounds reported by HBFS in individual clusters to improve the anytime behavior and global pruning lower bound of BTD. In an extensive empirical evaluation on optimization problems from a variety of application domains, we show that both HBFS and BTD-HBFS improve both anytime and overall performance compared to their counterparts.
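The combination described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the toy instance, the function name `hbfs`, the fixed `backtrack_limit` and the naive lower bound (assigned costs plus the cheapest unary cost of each unassigned variable, valid only because all costs are non-negative) are all hypothetical choices. The open list is ordered best-first; each popped node is explored depth-first until the backtrack budget is spent, and the unexplored DFS frontier is then pushed back onto the open list. The smallest bound on the open list gives the anytime global lower bound, the best solution found gives the upper bound.

```python
import heapq

# Hypothetical toy cost function network (non-negative costs, i < j in keys).
unary = [[1, 3], [2, 0], [0, 2]]
binary = {
    (0, 1): [[0, 4], [1, 0]],
    (1, 2): [[0, 2], [3, 0]],
}

def hbfs(unary, binary, backtrack_limit=2):
    n = len(unary)

    def lower_bound(prefix):
        """Cost of the assigned prefix plus the cheapest unary cost of the rest."""
        k = len(prefix)
        lb = sum(unary[i][a] for i, a in enumerate(prefix))
        lb += sum(t[prefix[i]][prefix[j]]
                  for (i, j), t in binary.items() if i < k and j < k)
        return lb + sum(min(unary[i]) for i in range(k, n))

    best_cost, best_sol = float("inf"), None
    open_nodes = [(lower_bound(()), ())]   # best-first priority queue
    gaps = []                              # (global lower bound, upper bound) history
    while open_nodes and open_nodes[0][0] < best_cost:
        _, root = heapq.heappop(open_nodes)
        stack, backtracks = [root], 0
        while stack:
            if backtracks > backtrack_limit:
                # Budget spent: hand the DFS frontier back to the open list.
                for node in stack:
                    heapq.heappush(open_nodes, (lower_bound(node), node))
                break
            node = stack.pop()
            if lower_bound(node) >= best_cost:
                backtracks += 1            # pruned by the current upper bound
                continue
            if len(node) == n:
                best_cost, best_sol = lower_bound(node), node
                backtracks += 1
                continue
            i = len(node)
            children = [node + (a,) for a in range(len(unary[i]))]
            # Push the most promising child last so it is explored first.
            stack.extend(sorted(children, key=lower_bound, reverse=True))
        global_lb = min(open_nodes[0][0], best_cost) if open_nodes else best_cost
        gaps.append((global_lb, best_cost))
    return best_sol, best_cost, gaps

solution, cost, gaps = hbfs(unary, binary)
print(solution, cost)  # (0, 0, 0) 3
```

Each entry of `gaps` is the optimality gap available after one bounded DFS, which is the kind of progress feedback the abstract refers to. A production solver would use much stronger lower bounds and an adaptive backtrack limit.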

Journal of Computational Chemistry | 2016

Fast search algorithms for computational protein design.

Seydou Traoré; Kyle E. Roberts; David Allouche; Bruce Randall Donald; Isabelle André; Thomas Schiex; Sophie Barbe

One of the main challenges in computational protein design (CPD) is the huge size of the protein sequence and conformational space that has to be computationally explored. Recently, we showed that state-of-the-art combinatorial optimization technologies based on Cost Function Network (CFN) processing allow speeding up provable rigid backbone protein design methods by several orders of magnitude. Building on this, we improved and injected CFN technology into the well-established CPD package Osprey to allow all Osprey CPD algorithms to benefit from associated speedups. Because Osprey fundamentally relies on the ability of A* to produce conformations in increasing order of energy, we defined new A* strategies combining CFN lower bounds with a new side-chain positioning-based branching scheme. Beyond the speedups obtained in the new A*-CFN combination, this novel branching scheme enables a much faster enumeration of suboptimal sequences, far beyond what is reachable without it. Together with the immediate and important speedups provided by CFN technology, these developments directly benefit all the algorithms that previously relied on the DEE/A* combination inside Osprey and make it possible to solve larger CPD problems with provable algorithms.

Principles and Practice of Constraint Programming | 2012

Computational protein design as a cost function network optimization problem

David Allouche; Seydou Traore; Isabelle Andre; Simon de Givry; Georgios Katsirelos; Sophie Barbe; Thomas Schiex

Constraints - An International Journal | 2016

Multi-language evaluation of exact solvers in graphical model discrete optimization

Barry Hurley; Barry O'Sullivan; David Allouche; George Katsirelos; Thomas Schiex; Matthias Zytnicki; Simon de Givry

By representing the constraints and objective function in factorized form, graphical models can concisely define various NP-hard optimization problems. They are therefore extensively used in several areas of computer science and artificial intelligence. Graphical models can be deterministic or stochastic, optimize a sum or product of local functions, defining a joint cost or probability distribution. Simple transformations exist between these two types of models, but also with MaxSAT or linear programming. In this paper, we report on a large comparison of exact solvers which are all state-of-the-art for their own target language. These solvers are all evaluated on deterministic and probabilistic graphical models coming from the Probabilistic Inference Challenge 2011, the Computer Vision and Pattern Recognition OpenGM2 benchmark, the Weighted Partial MaxSAT Evaluation 2013, the MaxCSP 2008 Competition, the MiniZinc Challenge 2012 & 2013, and the CFLib (a library of Cost Function Networks). All 3026 instances are made publicly available in five different formats and seven formulations. To our knowledge, this is the first evaluation that encompasses such a large set of related NP-complete optimization frameworks, despite their tight connections. The results show that a small number of evaluated solvers are able to perform well on multiple areas. By exploiting the variability and complementarity of solver performances, we show that a simple portfolio approach can be very effective. This portfolio won the last UAI Evaluation 2014 (MAP task).

Principles and Practice of Constraint Programming | 2010

Towards parallel non serial dynamic programming for solving hard weighted CSP

David Allouche; Simon de Givry; Thomas Schiex

We introduce a parallelized version of tree-decomposition based dynamic programming for solving difficult weighted CSP instances on many cores. A tree decomposition organizes cost functions in a tree of collections of functions called clusters. By processing the tree from the leaves up to the root, we solve each cluster concurrently, for each assignment of its separator, using a state-of-the-art exact sequential algorithm. The grain of parallelism obtained in this way is directly related to the tree decomposition used. We use a dedicated strategy for building suitable decompositions. We present preliminary results of our prototype running on a cluster with hundreds of cores on different decomposable real problems. This implementation allowed us to solve the last open CELAR radio link frequency assignment instance to optimality.
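The leaves-to-root scheme can be illustrated on the simplest decomposable structure, a chain, where each cluster holds two consecutive variables and each separator is a single variable. The Python sketch below is an assumption-laden toy, not the prototype described in the abstract: the instance and the names `solve_cluster` and `chain_optimum` are hypothetical, the per-cluster "exact sequential algorithm" is replaced by a plain minimum, and a thread pool merely stands in for a many-core cluster (CPython threads give no real speedup on this workload).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chain-structured instance: 3 variables, 2 values each.
unary = [[1, 3], [2, 0], [0, 2]]
# pairwise[i][a][b] is the cost between variable i (value a) and i + 1 (value b).
pairwise = [
    [[0, 4], [1, 0]],
    [[0, 2], [3, 0]],
]

def solve_cluster(i, a, message):
    """Best cost of variables i+1..n-1 given that separator variable i takes value a."""
    return min(
        pairwise[i][a][b] + unary[i + 1][b] + message[b]
        for b in range(len(unary[i + 1]))
    )

def chain_optimum():
    n = len(unary)
    # Message from the leaf: nothing lies beyond the last variable.
    message = [0] * len(unary[n - 1])
    with ThreadPoolExecutor() as pool:
        # Process clusters from the leaf up to the root.
        for i in range(n - 2, -1, -1):
            values = range(len(unary[i]))
            # Separator assignments are independent, so they run concurrently.
            message = list(pool.map(lambda a: solve_cluster(i, a, message), values))
    return min(unary[0][a] + message[a] for a in range(len(unary[0])))

print(chain_optimum())  # 3
```

On a general tree decomposition the separators contain several variables, so the number of independent subproblems per cluster grows with the separator size; this is the link between the decomposition and the grain of parallelism mentioned in the abstract.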

Modelling, Computation and Optimization in Information Systems and Management Sciences | 2015

Approximate Counting with Deterministic Guarantees for Affinity Computation

Clément Viricel; David Simoncini; David Allouche; Simon de Givry; Sophie Barbe; Thomas Schiex

Computational Protein Design aims at rationally designing amino-acid sequences that fold into a given three-dimensional structure and that will bestow the designed protein with desirable properties/functions. Usual criteria for design include stability of the designed protein and affinity between it and a ligand of interest. However, estimating the affinity between two molecules requires computing the partition function, a #P-complete problem.
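For reference, the partition function is the sum of Boltzmann weights exp(-E(x)/kT) over all conformations x. The Python sketch below computes its logarithm by brute force on a toy instance. It only shows what quantity is being counted, not the approximate counting method of the paper: the instance, the function name `log_partition` and the unit value of `kT` are assumptions made for illustration.

```python
import math
from itertools import product

# Hypothetical toy energy model: 3 positions, 2 rotamers each.
unary = [[1, 3], [2, 0], [0, 2]]
binary = {
    (0, 1): [[0, 4], [1, 0]],
    (1, 2): [[0, 2], [3, 0]],
}

def energy(x):
    """Energy of a complete conformation as a sum of unary and pairwise terms."""
    total = sum(unary[i][a] for i, a in enumerate(x))
    return total + sum(t[x[i]][x[j]] for (i, j), t in binary.items())

def log_partition(kT=1.0):
    """log Z with Z = sum over all conformations of exp(-E / kT)."""
    domains = [range(len(costs)) for costs in unary]
    energies = [energy(x) for x in product(*domains)]
    # Shift by the minimum energy (log-sum-exp) to avoid numerical underflow.
    e_min = min(energies)
    shifted = sum(math.exp(-(e - e_min) / kT) for e in energies)
    return -e_min / kT + math.log(shifted)

print(log_partition())
```

Unlike optimization, where one good bound can discard a whole subtree, every conformation contributes to the sum, which is why exact computation does not scale and bounded approximations are of interest.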
