Jakub Truszkowski
University of Waterloo
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Jakub Truszkowski.
Algorithms for Molecular Biology | 2012
Jakub Truszkowski; Yanqi Hao; Daniel G. Brown
BackgroundRecently, we have identified a randomized quartet phylogeny algorithm that has O(n logn) runtime with high probability, which is asymptotically optimal. Our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O(loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of Neighbour Joining , which requires Θ(n3) runtime in existing implementations. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost. An early prototype implementation of our software is available athttp://www.cs.uwaterloo.ca/jmtruszk/qtree.tar.gz.
combinatorial pattern matching | 2011
Daniel G. Brown; Jakub Truszkowski
We present a quartet-based phylogeny algorithm that returns the correct topology for n taxa in O(n logn) time with high probability, assuming each quartet is inconsistent with the true tree topology with constant probability, independent of other quartets. Our incremental algorithm relies upon a search tree structure for the phylogeny that is balanced, with high probability, no matter the true topology. In experiments, our prototype was as fast as the fastest heuristics, but because real data do not typically satisfy our probabilistic assumptions, its overall performance is not as good as our theoretical results predict.
pacific symposium on biocomputing | 2012
Daniel G. Brown; Jakub Truszkowski
We consider the problem of phylogenetic placement, in which large numbers of sequences (often next-generation sequencing reads) are placed onto an existing phylogenetic tree. We adapt our recent work on phylogenetic tree inference, which uses ancestral sequence reconstruction and locality-sensitive hashing, to this domain. With these ideas, new sequences can be placed onto trees with high fidelity in strikingly fast runtimes. Our results are two orders of magnitude faster than existing programs for this domain, and show a modest accuracy tradeoff. Our results offer the possibility of analyzing many more reads in a next-generation sequencing project than is currently possible.
BMC Bioinformatics | 2011
Jakub Truszkowski; Daniel G. Brown
BackgroundIdentifying recombinations in HIV is important for studying the epidemiology of the virus and aids in the design of potential vaccines and treatments. The previous widely-used tool for this task uses the Viterbi algorithm in a hidden Markov model to model recombinant sequences.ResultsWe apply a new decoding algorithm for this HMM that improves prediction accuracy. Exactly locating breakpoints is usually impossible, since different subtypes are highly conserved in some sequence regions. Our algorithm identifies these sites up to a certain error tolerance. Our new algorithm is more accurate in predicting the location of recombination breakpoints. Our implementation of the algorithm is available at http://www.cs.uwaterloo.ca/~jmtruszk/jphmm_balls.tar.gz.ConclusionsBy explicitly accounting for uncertainty in breakpoint positions, our algorithm offers more reliable predictions of recombination breakpoints in HIV-1. We also document a new domain of use for our new decoding approach in HMMs.
workshop on algorithms in bioinformatics | 2011
Daniel G. Brown; Jakub Truszkowski
Recently, we have identified a quartet phylogeny algorithm with O(n log n) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O(log log n) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost.
BMC Bioinformatics | 2012
Andre P. Masella; Andrea K. Bartram; Jakub Truszkowski; Daniel G. Brown; Josh D. Neufeld
uncertainty in artificial intelligence | 2010
Farheen Omar; Mathieu Sinn; Jakub Truszkowski; Pascal Poupart; James Tung; Allen Caine
BMC Bioinformatics | 2010
Daniel G. Brown; Jakub Truszkowski
workshop on algorithms in bioinformatics | 2012
Daniel G. Brown; Jakub Truszkowski
arXiv: Populations and Evolution | 2011
Daniel G. Brown; Jakub Truszkowski