Chris Whidden
Dalhousie University
Publication
Featured research published by Chris Whidden.
workshop on algorithms in bioinformatics | 2009
Chris Whidden; Norbert Zeh
We provide a unifying view on the structure of maximum (acyclic) agreement forests of rooted and unrooted phylogenies. This enables us to obtain linear- or O(n log n)-time 3-approximation and improved fixed-parameter algorithms for the subtree prune and regraft distance between two rooted phylogenies, the tree bisection and reconnection distance between two unrooted phylogenies, and the hybridization number of two rooted phylogenies.
SIAM Journal on Computing | 2013
Chris Whidden; Robert G. Beiko; Norbert Zeh
We present new and improved fixed-parameter algorithms for computing maximum agreement forests of pairs of rooted binary phylogenetic trees. The size of such a forest for two trees corresponds to their subtree prune-and-regraft distance and, if the agreement forest is acyclic, to their hybridization number. These distance measures are essential tools for understanding reticulate evolution. Our algorithm for computing maximum acyclic agreement forests is the first depth-bounded search algorithm for this problem. Our algorithms substantially outperform the best previous algorithms for these problems.
Systematic Biology | 2014
Chris Whidden; Norbert Zeh; Robert G. Beiko
Supertree methods reconcile a set of phylogenetic trees into a single structure that is often interpreted as a branching history of species. A key challenge is combining conflicting evolutionary histories that are due to artifacts of phylogenetic reconstruction and phenomena such as lateral gene transfer (LGT). Many supertree approaches use optimality criteria that do not reflect underlying processes, have known biases, and may be unduly influenced by LGT. We present the first method to construct supertrees by using the subtree prune-and-regraft (SPR) distance as an optimality criterion. Although calculating the rooted SPR distance between a pair of trees is NP-hard, our new maximum agreement forest-based methods can reconcile trees with hundreds of taxa and > 50 transfers in fractions of a second, which enables repeated calculations during the course of an iterative search. Our approach can accommodate trees in which uncertain relationships have been collapsed to multifurcating nodes. Using a series of benchmark datasets simulated under plausible rates of LGT, we show that SPR supertrees are more similar to correct species histories than supertrees based on parsimony or Robinson–Foulds distance criteria. We successfully constructed an SPR supertree from a phylogenomic dataset of 40,631 gene trees that covered 244 genomes representing several major bacterial phyla. Our SPR-based approach also allowed direct inference of highways of gene transfer between bacterial classes and genera. A small number of these highways connect genera in different phyla and can highlight specific genes implicated in long-distance LGT. [Lateral gene transfer; matrix representation with parsimony; phylogenomics; prokaryotic phylogeny; Robinson–Foulds; subtree prune-and-regraft; supertrees.]
symposium on experimental and efficient algorithms | 2010
Chris Whidden; Robert G. Beiko; Norbert Zeh
We improve on earlier FPT algorithms for computing a rooted maximum agreement forest (MAF) or a maximum acyclic agreement forest (MAAF) of a pair of phylogenetic trees. Their sizes give the subtree-prune-and-regraft (SPR) distance and the hybridization number of the trees, respectively. We introduce new branching rules that reduce the running time of the algorithms from O(3^k n) and O(3^k n log n) to O(2.42^k n) and O(2.42^k n log n), respectively. In practice, the speedup may be much more than predicted by the worst-case analysis. We confirm this intuition experimentally by computing MAFs for simulated trees and trees inferred from protein sequence data. We show that our algorithm is orders of magnitude faster and can handle much larger trees and SPR distances than the best previous methods, treeSAT and sprdist.
Systematic Biology | 2015
Chris Whidden; Frederick A. Matsen
In order to gain an understanding of the effectiveness of phylogenetic Markov chain Monte Carlo (MCMC), it is important to understand how quickly the empirical distribution of the MCMC converges to the posterior distribution. In this article, we investigate this problem on phylogenetic tree topologies with a metric that is especially well suited to the task: the subtree prune-and-regraft (SPR) metric. This metric directly corresponds to the minimum number of MCMC rearrangements required to move between trees in common phylogenetic MCMC implementations. We develop a novel graph-based approach to analyze tree posteriors and find that the SPR metric is much more informative than simpler metrics that are unrelated to MCMC moves. In doing so, we show conclusively that topological peaks do occur in Bayesian phylogenetic posteriors from real data sets as sampled with standard MCMC approaches, investigate the efficiency of Metropolis-coupled MCMC (MCMCMC) in traversing the valleys between peaks, and show that conditional clade distribution (CCD) can have systematic problems when there are multiple peaks.
Algorithmica | 2016
Chris Whidden; Robert G. Beiko; Norbert Zeh
We present efficient fixed-parameter and approximation algorithms for the NP-hard problem of computing a maximum agreement forest (MAF) of a pair of multifurcating (nonbinary) rooted trees. Multifurcating trees arise naturally as a result of statistical uncertainty in current tree construction methods. The size of an MAF corresponds to the subtree prune-and-regraft distance of the two trees and is intimately connected to their hybridization number. These distance measures are essential tools for understanding reticulate evolution, such as lateral gene transfer, recombination, and hybridization. Our algorithms nearly match the running times of the currently best algorithms for the binary case. This is achieved using a combination of efficient branching rules (similar to but more complex than in the binary case) and a novel edge protection scheme that further reduces the size of the search space the algorithms need to explore.
Theoretical Computer Science | 2017
Chris Whidden; Frederick A. Matsen
Statistical phylogenetic inference methods use tree rearrangement operations such as subtree–prune–regraft (SPR) to perform Markov chain Monte Carlo (MCMC) across tree topologies. The structure of the graph induced by tree rearrangement operations is an important determinant of the mixing properties of MCMC, motivating the study of the underlying SPR graph in greater detail. In this paper, we investigate the SPR graph of rooted trees (rSPR graph) in a new way: by calculating the Ricci–Ollivier curvature with respect to uniform and Metropolis–Hastings random walks. This value quantifies the degree to which a pair of random walkers from specified points move towards each other; negative curvature means that they move away from one another on average, while positive curvature means that they move towards each other. In order to calculate this curvature, we develop fast new algorithms for rSPR graph computation. We then develop formulas characterizing how the number of rSPR neighbors of a tree changes after an rSPR operation is applied to that tree. These give bounds on the curvature, as well as a flatness-in-the-limit theorem indicating that paths of small topology changes are easy to traverse. However, we find that large topology changes (i.e., moving a large subtree) give pairs of trees with negative curvature. We show using simulation that mean access time distributions depend on distance, degree, and curvature, demonstrating the relevance of these results to stochastic tree search.
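For readers unfamiliar with the curvature mentioned in this abstract, the standard definition of Ollivier's coarse Ricci curvature can be written as follows. The symbols are not taken from the abstract and are assumptions for illustration: x and y are two trees, d(x, y) is their rSPR distance, m_x is the distribution of the random walk after one step from x, and W_1 is the Wasserstein (earth mover's) distance between two such distributions.

```latex
\[
  \kappa(x, y) \;=\; 1 - \frac{W_1(m_x, m_y)}{d(x, y)}
\]
```

Under this definition, κ(x, y) > 0 exactly when the one-step distributions are closer together than the starting trees, which matches the description of walkers moving towards each other on average.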
SIAM Journal on Discrete Mathematics | 2016
Leo van Iersel; Steven Kelk; Nela Lekić; Chris Whidden; Norbert Zeh
Phylogenetic networks are leaf-labelled directed acyclic graphs that are used to describe non-treelike evolutionary histories and are thus a generalization of phylogenetic trees. The hybridization number of a phylogenetic network is the sum of all indegrees minus the number of nodes plus one. The Hybridization Number problem takes as input a collection of phylogenetic trees and asks to construct a phylogenetic network that contains an embedding of each of the input trees and has a smallest possible hybridization number. We present an algorithm for the Hybridization Number problem on three binary trees on n leaves, which runs in time …
knowledge discovery and data mining | 2009
Vlado Keselj; Haibin Liu; Norbert Zeh; Christian Blouin; Chris Whidden
Journal of Mathematical Biology | 2018
Alex Gavryushkin; Chris Whidden; Frederick A. Matsen
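The SIAM Journal on Discrete Mathematics abstract above defines the hybridization number of a phylogenetic network as the sum of all indegrees minus the number of nodes plus one. A minimal Python sketch of that formula follows. The function name `hybridization_number`, the edge-list input format, and the example node labels are assumptions made for illustration; this only evaluates the definition on a given network and is not the algorithm from the paper.

```python
from collections import defaultdict


def hybridization_number(edges):
    """Evaluate (sum of indegrees) - (number of nodes) + 1.

    `edges` is an iterable of (parent, child) pairs describing a rooted,
    connected directed acyclic graph (hypothetical input format).
    """
    nodes = set()
    indegree = defaultdict(int)
    for parent, child in edges:
        nodes.add(parent)
        nodes.add(child)
        indegree[child] += 1  # each edge contributes one to its head's indegree
    return sum(indegree.values()) - len(nodes) + 1


# A plain tree: root r with two leaves, so no reticulation nodes.
tree = [("r", "a"), ("r", "b")]

# A network in which node h has two parents (u and v), i.e. one reticulation.
network = [
    ("r", "u"), ("r", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "b"),
    ("h", "c"),
]

print(hybridization_number(tree))
print(hybridization_number(network))
```

For a tree, every non-root node has indegree one, so the formula evaluates to zero; each extra parent edge into a reticulation node raises the value by one.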