Jotun Hein
University of Oxford
Publication
Featured research published by Jotun Hein.
Nucleic Acids Research | 2003
Bjarne Knudsen; Jotun Hein
RNA secondary structures are important in many biological processes and efficient structure prediction can give vital directions for experimental investigations. Many available programs for RNA secondary structure prediction only use a single sequence at a time. This may be sufficient in some applications, but often it is possible to obtain related RNA sequences with conserved secondary structure. These should be included in structural analyses to give improved results. This work presents a practical way of predicting RNA secondary structure that is especially useful when related sequences can be obtained. The method improves a previous algorithm based on an explicit evolutionary model and a probabilistic model of structures. Predictions can be done on a web server at http://www.daimi.au.dk/~compbio/pfold.
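As a rough illustration of single-sequence structure prediction, the sketch below uses Nussinov-style base-pair maximization. This is a deliberately simpler technique than the evolutionary and probabilistic models described in the abstract, and it is not the method behind the web server. The function name `nussinov`, the allowed pair set, and the `min_loop` parameter are assumptions made for this example.

```python
def nussinov(seq, min_loop=3):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Illustrative only: counts pairs, ignores energies and evolution.
    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]            # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: position j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


hairpin_pairs = nussinov("GGGAAAUCC")
print(hairpin_pairs)
```

Methods that use several aligned sequences, as the abstract proposes, would replace this simple pair count with a score that also reflects how well each candidate pair is supported across the related sequences.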
Mathematical Biosciences | 1990
Jotun Hein
The parsimony principle states that a history of a set of sequences that minimizes the amount of evolution is a good approximation to the real evolutionary history of the sequences. This principle is applied to the reconstruction of the evolution of homologous sequences where recombinations or horizontal transfer can occur. First it is demonstrated that the appropriate structure to represent the evolution of sequences with recombinations is a family of trees each describing the evolution of a segment of the sequence. Two trees for neighboring segments will differ by exactly the transfer of a subtree within the whole tree. This leads to a metric between trees based on the smallest number of such operations needed to convert one tree into the other. An algorithm is presented that calculates this metric. This metric is used to formulate a dynamic programming algorithm that finds the most parsimonious history that fits a given set of sequences. The algorithm is potentially very practical, since many groups of sequences defy analysis by methods that ignore recombinations. These methods give ambiguous or contradictory results because the sequence history cannot be described by one phylogeny, but only a family of phylogenies that each describe the history of a segment of the sequences. The generalization of the algorithm to reconstruct gene conversions and the possibility for heuristic versions of the algorithm for larger data sets are discussed.
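The idea that different segments can favor different trees can be illustrated with a small parsimony count. The sketch below uses Fitch's algorithm for a single site on a fixed rooted binary tree; it is not the dynamic programming algorithm of the paper, and the nested-tuple tree encoding and the function name `fitch` are assumptions made for this example.

```python
def fitch(tree):
    """Return (candidate_states, changes) for one site on a rooted binary tree.

    A tree is either a leaf (a one-character string holding the observed
    state) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)
    common = left_states & right_states
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_changes + right_changes
    # The children disagree: one substitution is charged at this node.
    return left_states | right_states, left_changes + right_changes + 1


# Four sequences w, x, y, z with states A, C, A, C at one site.
cost_tree_1 = fitch((("A", "C"), ("A", "C")))[1]  # tree ((w,x),(y,z))
cost_tree_2 = fitch((("A", "A"), ("C", "C")))[1]  # tree ((w,y),(x,z))
print(cost_tree_1, cost_tree_2)
```

The same site pattern is cheaper on the second tree than on the first. When neighboring segments consistently prefer different trees in this way, a family of trees, one per segment, can describe the data with less total evolution than any single tree.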
Journal of Molecular Evolution | 1993
Jotun Hein
Summary: Sequences subject to recombination and gene conversion defy phylogenetic analysis by traditional methods since their evolutionary history cannot be adequately summarized by a tree. This study investigates ways to describe their evolutionary history and proposes a method giving a partial reconstruction of this history. Multigene families, viruses, and alleles from within populations experience recombinations/gene conversions, so the questions studied here are relevant for a large body of data and the suggested solutions should be very practical. The method employed was implemented in a program, RecPars, written in C and was used to analyze nine retroviruses.
Discrete Applied Mathematics | 1996
Jotun Hein; Tao Jiang; Lusheng Wang; Kaizhong Zhang
We study the computational complexity and approximation of several problems arising in the comparison of evolutionary trees. It is shown that the maximum agreement subtree (MAST) problem for three trees with unbounded degree cannot be approximated within ratio $2^{\log^{\delta} n}$ in polynomial time for any $\delta < 1$, unless $\mathrm{NP} \subseteq \mathrm{DTIME}[2^{\mathrm{polylog}\, n}]$, and MAST with edge contractions for two binary trees is NP-hard. This answers two open questions posed in [1]. For the maximum refinement subtree (MRST) problem involving two trees, we show that it is polynomial-time solvable when both trees have bounded degree and is NP-hard when one of the trees can have an arbitrary degree. Finally, we consider the problem of optimally transforming a tree into another by transferring subtrees around. It is shown that computing the subtree-transfer distance is NP-hard and an approximation algorithm with performance ratio 3 is given.
Bioinformatics | 2003
Jakob Skou Pedersen; Jotun Hein
MOTIVATION: A growing number of genomes are being sequenced. The differences in evolutionary pattern between functional regions can thus be observed genome-wide in a whole set of organisms. The diverse evolutionary patterns of different functional regions can be exploited in the process of genomic annotation. The modelling of evolution by the existing comparative gene finders leaves room for improvement. RESULTS: A probabilistic model of both genome structure and evolution is designed. This type of model is called an Evolutionary Hidden Markov Model (EHMM), being composed of an HMM and a set of region-specific evolutionary models based on a phylogenetic tree. All parameters can be estimated by maximum likelihood, including the phylogenetic tree. It can handle any number of aligned genomes, using their phylogenetic tree to model the evolutionary correlations. The time complexity of all algorithms used for handling the model is linear in alignment length and genome number. The model is applied to the problem of gene finding. The benefit of modelling sequence evolution is demonstrated both in a range of simulations and on a set of orthologous human/mouse gene pairs. AVAILABILITY: Freely available over the Internet on the www server: http://www.birc.dk/Software/evogene.
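The HMM half of such a model can be sketched with a toy Viterbi decoder that labels alignment columns as coding or noncoding. Everything specific here is an assumption made for illustration: the two states, the reduction of each human/mouse column to "same" or "diff", and all probabilities are invented. In an EHMM the emission term would instead be the likelihood of the column under a region-specific evolutionary model on the phylogenetic tree.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state path for a non-empty observation list."""
    # best[s] = log-probability of the best path that ends in state s
    best = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []
    for symbol in obs[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            # Best predecessor of s; each step does constant work per column.
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[p] + log_trans[p][s] + log_emit[s][symbol]
            pointers[s] = p
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def logs(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {a: {b: math.log(p) for b, p in row.items()} for a, row in table.items()}


# Hypothetical toy parameters, not taken from the paper.
states = ["coding", "noncoding"]
log_start = {s: math.log(0.5) for s in states}
log_trans = logs({"coding": {"coding": 0.9, "noncoding": 0.1},
                  "noncoding": {"coding": 0.1, "noncoding": 0.9}})
log_emit = logs({"coding": {"same": 0.9, "diff": 0.1},
                 "noncoding": {"same": 0.5, "diff": 0.5}})

columns = ["same"] * 5 + ["diff"] * 5
path = viterbi(columns, states, log_start, log_trans, log_emit)
print(path)
```

The decoder visits each column once and does a fixed amount of work per column for a fixed state set, which is consistent with the linear dependence on alignment length stated in the abstract.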
Journal of Computational Biology | 2005
Yun S. Song; Jotun Hein
By viewing the ancestral recombination graph as defining a sequence of trees, we show how possible evolutionary histories consistent with given data can be constructed using the minimum number of recombination events. In contrast to previously known methods, which yield only estimated lower bounds, our method of detecting recombination always gives the minimum number of recombination events if the right kind of rooted trees are used in our algorithm. A new lower bound can be defined if rooted trees with fewer constraints are used. As well as studying how often it actually is equal to the minimum, we test how this new lower bound performs in comparison to some other lower bounds. Our study indicates that the new lower bound is an improvement on earlier bounds. Also, using simulated data, we investigate how well our method can recover the actual site-specific evolutionary relationships. In the presence of recombination, using a single tree to describe the evolution of the entire locus clearly leads to lower average recovery percentages than does our method. Our study shows that recovering the actual local tree topologies can be done more accurately than estimating the actual number of recombination events.
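For context, one classical lower bound on the number of recombination events is the Hudson-Kaplan bound, built on the four-gamete test. The abstract does not name the bounds it compares against, so using this one is an assumption, and the code is not the authors' method. It assumes binary sequences of equal length, at most one mutation per site, and the hypothetical function names shown.

```python
from itertools import combinations


def incompatible_pairs(seqs):
    """Return site pairs (i, j), i < j, that show all four gametes."""
    n_sites = len(seqs[0])
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(s[i], s[j]) for s in seqs}
        if len(gametes) == 4:  # 00, 01, 10 and 11 all present
            pairs.append((i, j))
    return pairs


def hudson_kaplan_bound(seqs):
    """Lower bound on recombination events from disjoint incompatible intervals.

    Each incompatible pair (i, j) implies at least one recombination
    somewhere between sites i and j. Greedily choosing intervals by earliest
    right end yields the largest set of intervals that share no interior.
    """
    intervals = sorted(incompatible_pairs(seqs), key=lambda p: (p[1], p[0]))
    count, last_end = 0, 0
    for start, end in intervals:
        if start >= last_end:
            count += 1
            last_end = end
    return count


sample = ["000", "011", "101", "110"]
print(hudson_kaplan_bound(sample))
```

A bound of this kind can fall short of the true minimum, which is the gap the abstract's exact method and its new lower bound aim to close.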
Intelligent Systems in Molecular Biology | 2004
Gerton Lunter; Jotun Hein
MOTIVATION: It is well known that neighbouring nucleotides in DNA sequences do not mutate independently of each other. In this paper, we introduce a context-dependent substitution model and derive an algorithm to calculate the likelihood of sequences evolving under this model. We use this algorithm to estimate neighbour-dependent substitution rates, as well as rates for dinucleotide substitutions, using a Bayesian sampling procedure. The model is irreversible, giving an arrow to time, and allowing the position of the root between a pair of sequences to be inferred without using out-groups. RESULTS: We applied the model to aligned human-mouse non-coding data. Clear neighbour dependencies were observed, including 17- to 18-fold increased CpG to TpG/CpA rates compared with other substitutions. Root inference positioned the root halfway between the mouse and human tips, suggesting an approximately clock-like behaviour of the irreversible part of the substitution process.
Workshop on Algorithms in Bioinformatics | 2005
Rune B. Lyngsø; Yun S. Song; Jotun Hein
Recombination plays an important role in creating genetic diversity within species, and inferring past recombination events is central to many problems in genetics. Given a set M of sampled sequences, finding an evolutionary history for M with the minimum number of recombination events is a computationally very challenging problem. In this paper, we present a novel branch and bound algorithm for tackling that problem. Our method is shown to be far more efficient than the only preexisting exact method, described in [1]. Our software implementing the algorithm discussed in this paper is publicly available.
Journal of Computational Biology | 2003
Gerton Lunter; István Miklós; Yun S. Song; Jotun Hein
We present an efficient algorithm for statistical multiple alignment based on the TKF91 model of Thorne, Kishino, and Felsenstein (1991) on an arbitrary $k$-leaved phylogenetic tree. The existing algorithms use a hidden Markov model approach, which requires at least $O(\sqrt{5}^{\,k})$ states and leads to a time complexity of $O(5^{k} L^{k})$, where $L$ is the geometric mean sequence length. Using a combinatorial technique reminiscent of inclusion/exclusion, we are able to sum away the states, thus improving the time complexity to $O(2^{k} L^{k})$ and considerably reducing memory requirements. This makes statistical multiple alignment under the TKF91 model a definite practical possibility in the case of a phylogenetic tree with a modest number of leaves.
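To get a feel for the size of the improvement, the two running times quoted above can be compared directly. This is only a back-of-the-envelope ratio that treats the hidden constants of the two algorithms as equal, which is an assumption; the factor $L^{k}$ cancels.

```latex
\frac{5^{k} L^{k}}{2^{k} L^{k}} = \left(\frac{5}{2}\right)^{k},
\qquad
\left(\frac{5}{2}\right)^{3} = 15.625,
\qquad
\left(\frac{5}{2}\right)^{5} \approx 97.7 .
```

Under that assumption the saving grows exponentially with the number of leaves $k$, while the remaining $L^{k}$ factor still limits the approach to trees with a modest number of leaves.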
Bulletin of Mathematical Biology | 1989
Jotun Hein
In this article the question of reconstructing a phylogeny from additive distance data is addressed. Previous algorithms used the complete distance matrix of the $n$ operational taxonomic units (OTUs), which correspond to the tips of the tree. This used $O(n^2)$ computing time. It is shown that this is wasteful for biologically reasonable trees. If the tree has internal nodes with bounded degrees, an $O(n \log n)$ algorithm is possible. It is also shown that if the nodes can have unbounded degrees, the problem has $n^2$ as a lower bound.
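"Additive" distance data means the distances can be realized exactly as path lengths on a tree, which is equivalent to the four-point condition. The sketch below checks that condition by brute force over all quartets. It is a definition check, not the reconstruction algorithm of the article, and it is far slower than the bounds discussed; the function name `is_additive` and the tolerance parameter are assumptions made for this example.

```python
from itertools import combinations


def is_additive(d, tol=1e-9):
    """Return True if the symmetric distance matrix d meets the four-point condition.

    For every quartet i, j, k, l, the two largest of the three sums
    d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k] must be equal.
    """
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Path lengths on the tree ((A,B),(C,D)) with leaf edges 1, 2, 3, 4
# and an internal edge of length 5.
tree_distances = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
print(is_additive(tree_distances))
```

Because additive data are this strongly constrained, most entries of the matrix are redundant once part of the tree is known, which is the intuition behind reconstructing the tree without reading the whole matrix.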