Jotun Hein | Researchain

Publication

Featured research published by Jotun Hein.

Nucleic Acids Research | 2003

Pfold: RNA secondary structure prediction using stochastic context-free grammars

Bjarne Knudsen; Jotun Hein

RNA secondary structures are important in many biological processes and efficient structure prediction can give vital directions for experimental investigations. Many available programs for RNA secondary structure prediction only use a single sequence at a time. This may be sufficient in some applications, but often it is possible to obtain related RNA sequences with conserved secondary structure. These should be included in structural analyses to give improved results. This work presents a practical way of predicting RNA secondary structure that is especially useful when related sequences can be obtained. The method improves a previous algorithm based on an explicit evolutionary model and a probabilistic model of structures. Predictions can be done on a web server at http://www.daimi.au.dk/~compbio/pfold.

Mathematical Biosciences | 1990

Reconstructing evolution of sequences subject to recombination using parsimony

Jotun Hein

The parsimony principle states that a history of a set of sequences that minimizes the amount of evolution is a good approximation to the real evolutionary history of the sequences. This principle is applied to the reconstruction of the evolution of homologous sequences where recombinations or horizontal transfer can occur. First it is demonstrated that the appropriate structure to represent the evolution of sequences with recombinations is a family of trees each describing the evolution of a segment of the sequence. Two trees for neighboring segments will differ by exactly the transfer of a subtree within the whole tree. This leads to a metric between trees based on the smallest number of such operations needed to convert one tree into the other. An algorithm is presented that calculates this metric. This metric is used to formulate a dynamic programming algorithm that finds the most parsimonious history that fits a given set of sequences. The algorithm is potentially very practical, since many groups of sequences defy analysis by methods that ignore recombinations. These methods give ambiguous or contradictory results because the sequence history cannot be described by one phylogeny, but only a family of phylogenies that each describe the history of a segment of the sequences. The generalization of the algorithm to reconstruct gene conversions and the possibility for heuristic versions of the algorithm for larger data sets are discussed.

Journal of Molecular Evolution | 1993

A heuristic method to reconstruct the history of sequences subject to recombination

Jotun Hein

Summary: Sequences subject to recombination and gene conversion defy phylogenetic analysis by traditional methods since their evolutionary history cannot be adequately summarized by a tree. This study investigates ways to describe their evolutionary history and proposes a method giving a partial reconstruction of this history. Multigene families, viruses, and alleles from within populations experience recombinations/gene conversions, so the questions studied here are relevant for a large body of data and the suggested solutions should be very practical. The method employed was implemented in a program, RecPars, written in C and was used to analyze nine retroviruses.

Discrete Applied Mathematics | 1996

On the complexity of comparing evolutionary trees

Jotun Hein; Tao Jiang; Lusheng Wang; Kaizhong Zhang

We study the computational complexity and approximation of several problems arising in the comparison of evolutionary trees. It is shown that the maximum agreement subtree (MAST) problem for three trees with unbounded degree cannot be approximated within ratio $2^{\log^{\delta} n}$ in polynomial time for any $\delta < 1$, unless $NP \subseteq DTIME[2^{\mathrm{polylog}\, n}]$, and MAST with edge contractions for two binary trees is NP-hard. This answers two open questions posed in [1]. For the maximum refinement subtree (MRST) problem involving two trees, we show that it is polynomial-time solvable when both trees have bounded degree and is NP-hard when one of the trees can have an arbitrary degree. Finally, we consider the problem of optimally transforming a tree into another by transferring subtrees around. It is shown that computing the subtree-transfer distance is NP-hard and an approximation algorithm with performance ratio 3 is given.

Bioinformatics | 2003

Gene finding with a hidden Markov model of genome structure and evolution

Jakob Skou Pedersen; Jotun Hein

MOTIVATION: A growing number of genomes are being sequenced. The differences in evolutionary pattern between functional regions can thus be observed genome-wide in a whole set of organisms. The diverse evolutionary pattern of different functional regions can be exploited in the process of genomic annotation. The modelling of evolution by the existing comparative gene finders leaves room for improvement.

RESULTS: A probabilistic model of both genome structure and evolution is designed. This type of model is called an Evolutionary Hidden Markov Model (EHMM), being composed of an HMM and a set of region-specific evolutionary models based on a phylogenetic tree. All parameters can be estimated by maximum likelihood, including the phylogenetic tree. It can handle any number of aligned genomes, using their phylogenetic tree to model the evolutionary correlations. The time complexity of all algorithms used for handling the model is linear in alignment length and genome number. The model is applied to the problem of gene finding. The benefit of modelling sequence evolution is demonstrated both in a range of simulations and on a set of orthologous human/mouse gene pairs.

AVAILABILITY: Freely available over the Internet on the www server: http://www.birc.dk/Software/evogene.

Journal of Computational Biology | 2005

Constructing Minimal Ancestral Recombination Graphs

Yun S. Song; Jotun Hein

By viewing the ancestral recombination graph as defining a sequence of trees, we show how possible evolutionary histories consistent with given data can be constructed using the minimum number of recombination events. In contrast to previously known methods, which yield only estimated lower bounds, our method of detecting recombination always gives the minimum number of recombination events if the right kind of rooted trees are used in our algorithm. A new lower bound can be defined if rooted trees with fewer constraints are used. As well as studying how often it actually is equal to the minimum, we test how this new lower bound performs in comparison to some other lower bounds. Our study indicates that the new lower bound is an improvement on earlier bounds. Also, using simulated data, we investigate how well our method can recover the actual site-specific evolutionary relationships. In the presence of recombination, using a single tree to describe the evolution of the entire locus clearly leads to lower average recovery percentages than does our method. Our study shows that recovering the actual local tree topologies can be done more accurately than estimating the actual number of recombination events.
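As a rough illustration of why a single tree can fail to explain a data set, the sketch below applies the classical four-gamete test to binary sequences. This is a standard compatibility check, not the algorithm described in the abstract above; the function name `incompatible_site_pairs` and the sample data are hypothetical, and the code assumes binary characters and at most one mutation per site.

```python
from itertools import combinations


def incompatible_site_pairs(sequences):
    """Return site pairs (i, j) at which all four gametes 00, 01, 10, 11 occur.

    Assumes every sequence is a string over {'0', '1'} of equal length.
    Under the one-mutation-per-site assumption, such a pair cannot be
    explained by a single tree without recombination.
    """
    n_sites = len(sequences[0])
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct two-site haplotypes seen at columns i and j.
        gametes = {(seq[i], seq[j]) for seq in sequences}
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


# Hypothetical samples: the first fits one tree, the second does not.
tree_like = ["00", "01", "11"]
needs_recombination = ["00", "01", "10", "11"]

print(incompatible_site_pairs(tree_like))
print(incompatible_site_pairs(needs_recombination))
```

A non-empty result only signals that at least one recombination is needed somewhere between the flagged sites; it does not by itself give the minimum number of events.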

Intelligent Systems in Molecular Biology | 2004

A nucleotide substitution model with nearest-neighbour interactions

Gerton Lunter; Jotun Hein

MOTIVATION: It is well known that neighbouring nucleotides in DNA sequences do not mutate independently of each other. In this paper, we introduce a context-dependent substitution model and derive an algorithm to calculate the likelihood of sequences evolving under this model. We use this algorithm to estimate neighbour-dependent substitution rates, as well as rates for dinucleotide substitutions, using a Bayesian sampling procedure. The model is irreversible, giving an arrow to time, and allowing the position of the root between a pair of sequences to be inferred without using out-groups.

RESULTS: We applied the model to aligned human-mouse non-coding data. Clear neighbour dependencies were observed, including 17-18-fold increased CpG to TpG/CpA rates compared with other substitutions. Root inference positioned the root halfway between the mouse and human tips, suggesting an approximately clock-like behaviour of the irreversible part of the substitution process.

Workshop on Algorithms in Bioinformatics | 2005

Minimum recombination histories by branch and bound

Rune B. Lyngsø; Yun S. Song; Jotun Hein

Recombination plays an important role in creating genetic diversity within species, and inferring past recombination events is central to many problems in genetics. Given a set M of sampled sequences, finding an evolutionary history for M with the minimum number of recombination events is a computationally very challenging problem. In this paper, we present a novel branch and bound algorithm for tackling that problem. Our method is shown to be far more efficient than the only preexisting exact method, described in [1]. Our software implementing the algorithm discussed in this paper is publicly available.

Journal of Computational Biology | 2003

An Efficient Algorithm for Statistical Multiple Alignment on Arbitrary Phylogenetic Trees

Gerton Lunter; István Miklós; Yun S. Song; Jotun Hein

We present an efficient algorithm for statistical multiple alignment based on the TKF91 model of Thorne, Kishino, and Felsenstein (1991) on an arbitrary k-leaved phylogenetic tree. The existing algorithms use a hidden Markov model approach, which requires at least $O(\sqrt{5}^{\,k})$ states and leads to a time complexity of $O(5^{k} L^{k})$, where L is the geometric mean sequence length. Using a combinatorial technique reminiscent of inclusion/exclusion, we are able to sum away the states, thus improving the time complexity to $O(2^{k} L^{k})$ and considerably reducing memory requirements. This makes statistical multiple alignment under the TKF91 model a definite practical possibility in the case of a phylogenetic tree with a modest number of leaves.
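To give a feel for the size of the improvement, the two time bounds quoted in the abstract can be compared directly. This is only a back-of-the-envelope ratio of the asymptotic bounds: it ignores constant factors, and the choice of k = 4 leaves is an arbitrary illustration.

```latex
\[
\frac{5^{k} L^{k}}{2^{k} L^{k}} = \left(\frac{5}{2}\right)^{k},
\qquad
k = 4:\quad \frac{5^{4}}{2^{4}} = \frac{625}{16} \approx 39 .
\]
```

The sequence-length factor cancels, so the gain from the bounds alone depends only on the number of leaves k and grows exponentially with it.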

Bulletin of Mathematical Biology | 1989

An optimal algorithm to reconstruct trees from additive distance data

Jotun Hein

In this article the question of reconstructing a phylogeny from additive distance data is addressed. Previous algorithms used the complete distance matrix of the n OTUs (Operational Taxonomic Units), which correspond to the tips of the tree. This used $O(n^2)$ computing time. It is shown that this is wasteful for biologically reasonable trees. If the tree has internal nodes with bounded degrees, an $O(n \log n)$ algorithm is possible. It is also shown that if the nodes can have unbounded degrees, the problem has $n^2$ as a lower bound.
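For readers unfamiliar with the term, "additive" distance data are distances that fit a tree exactly, which is equivalent to the well-known four-point condition. The sketch below checks that condition by brute force over all quartets. It is not the reconstruction algorithm from the abstract and is far slower; the function name `is_additive`, the tolerance, and the example matrices are hypothetical, and the input is assumed to be a symmetric metric on at least four tips.

```python
from itertools import combinations


def is_additive(dist, tol=1e-9):
    """Check the four-point condition on a symmetric distance matrix.

    For every quartet of tips, the two largest of the three pairwise
    sums must be equal (up to tol) if the distances fit a tree.
    """
    n = len(dist)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([
            dist[i][j] + dist[k][l],
            dist[i][k] + dist[j][l],
            dist[i][l] + dist[j][k],
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Hypothetical tree ((A,B),(C,D)): pendant edges 1, 2, 3, 4 and internal edge 1.
tree_distances = [
    [0, 3, 5, 6],
    [3, 0, 6, 7],
    [5, 6, 0, 7],
    [6, 7, 7, 0],
]

# Same matrix with d(A, D) changed to 9, which no tree can realise.
perturbed = [
    [0, 3, 5, 9],
    [3, 0, 6, 7],
    [5, 6, 0, 7],
    [9, 7, 7, 0],
]

print(is_additive(tree_distances))
print(is_additive(perturbed))
```

This check reads every entry of the matrix many times, which is exactly the kind of cost the abstract argues can be avoided when internal node degrees are bounded.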
