Laura Kubatko
Ohio State University
Publication
Featured research published by Laura Kubatko.
Systematic Biology | 2007
Laura Kubatko; James H. Degnan
Although multiple gene sequences are becoming increasingly available for molecular phylogenetic inference, the analysis of such data has largely relied on inference methods designed for single genes. One of the common approaches to analyzing data from multiple genes is concatenation of the individual gene data to form a single supergene to which traditional phylogenetic inference procedures - e.g., maximum parsimony (MP) or maximum likelihood (ML) - are applied. Recent empirical studies have demonstrated that concatenation of sequences from multiple genes prior to phylogenetic analysis often results in inference of a single, well-supported phylogeny. Theoretical work, however, has shown that the coalescent can produce substantial variation in single-gene histories. Using simulation, we combine these ideas to examine the performance of the concatenation approach under conditions in which the coalescent produces a high level of discord among individual gene trees and show that it leads to statistically inconsistent estimation in this setting. Furthermore, use of the bootstrap to measure support for the inferred phylogeny can result in moderate to strong support for an incorrect tree under these conditions. These results highlight the importance of incorporating variation in gene histories into multilocus phylogenetics.
Bioinformatics | 2009
Laura Kubatko; Bryan C. Carstens; L. Lacey Knowles
STEM is a software package written in the C language to obtain maximum likelihood (ML) estimates for phylogenetic species trees given a sample of gene trees under the coalescent model. It includes options to compute the ML species tree, search the space of all species trees for the k trees of highest likelihood, and compute ML branch lengths for a user-input species tree. AVAILABILITY: The STEM package, including source code, is freely available at http://www.stat.osu.edu/~lkubatko/software/STEM/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Molecular Phylogenetics and Evolution | 2009
Liang Liu; Lili Yu; Laura Kubatko; Dennis K. Pearl; Scott V. Edwards
We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.
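The second likelihood mentioned above, the probability distribution of gene trees given a species tree, has a simple closed form in the smallest case. The following is a minimal sketch for three species with species tree ((A,B),C), using the standard coalescent result that each discordant topology has probability exp(-t)/3, where t is the internal branch length in coalescent units. The function name and the dictionary keys are illustrative choices, not taken from any of the software reviewed.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for the species tree ((A,B),C).

    t is the length of the internal branch in coalescent units
    (hypothetical parameter name). The two lineages from A and B fail to
    coalesce on that branch with probability exp(-t); if they fail, all
    three topologies are equally likely in the ancestral population.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    p_each_discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_each_discordant,  # matches the species tree
        "((A,C),B)": p_each_discordant,
        "((B,C),A)": p_each_discordant,
    }


if __name__ == "__main__":
    # Shorter internal branches give more gene tree discordance.
    for t in (0.1, 1.0, 5.0):
        print(t, three_taxon_gene_tree_probs(t))
```

With a very short internal branch the three topologies approach equal probability, which is the regime in which deep coalescence makes individual gene trees poor guides to the species tree.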
PLOS Neglected Tropical Diseases | 2009
Sandra D. Melman; Michelle L. Steinauer; Charles Cunningham; Laura Kubatko; Ibrahim N. Mwangi; Nirvana Barker Wynn; Martin W. Mutuku; Diana M. S. Karanja; Daniel G. Colley; Carla L. Black; William Evan Secor; Gerald M. Mkoji; Eric S. Loker
Background: The near-exclusive use of praziquantel (PZQ) for treatment of human schistosomiasis has raised concerns about the possible emergence of drug-resistant schistosomes. Methodology/Principal Findings: We measured susceptibility to PZQ of isolates of Schistosoma mansoni obtained from patients from Kisumu, Kenya continuously exposed to infection as a consequence of their occupations as car washers or sand harvesters. We used a) an in vitro assay with miracidia, b) an in vivo assay targeting adult worms in mice and c) an in vitro assay targeting adult schistosomes perfused from mice. In the miracidia assay, in which miracidia from human patients were exposed to PZQ in vitro, reduced susceptibility was associated with previous treatment of the patient with PZQ. One isolate (“KCW”) that was less susceptible to PZQ and had been derived from a patient who had never been fully cured despite multiple treatments was studied further. In an in vivo assay of adult worms, the KCW isolate was significantly less susceptible to PZQ than two other isolates from natural infections in Kenya and two lab-reared strains of S. mansoni. The in vitro adult assay, based on measuring length changes of adults following exposure to and recovery from PZQ, confirmed that the KCW isolate was less susceptible to PZQ than the other isolates tested. A sub-isolate of KCW maintained separately and tested after three years was susceptible to PZQ, indicating that the trait of reduced sensitivity could be lost if selection was not maintained. Conclusions/Significance: Isolates of S. mansoni from some patients in Kisumu have lower susceptibility to PZQ, including one from a patient who was never fully cured after repeated rounds of treatment administered over several years. As use of PZQ continues, continued selection for worms with diminished susceptibility is possible, and the probability of emergence of resistance will increase as large reservoirs of untreated worms diminish.
The potential for rapid emergence of resistance should be an important consideration of treatment programs.
Bioinformatics | 2014
Julia Chifman; Laura Kubatko
MOTIVATION: Increasing attention has been devoted to estimation of species-level phylogenetic relationships under the coalescent model. However, existing methods either use summary statistics (gene trees) to carry out estimation, ignoring an important source of variability in the estimates, or involve computationally intensive Bayesian Markov chain Monte Carlo algorithms that do not scale well to whole-genome datasets. RESULTS: We develop a method to infer relationships among quartets of taxa under the coalescent model using techniques from algebraic statistics. Uncertainty in the estimated relationships is quantified using the nonparametric bootstrap. The performance of our method is assessed with simulated data. We then describe how our method could be used for species tree inference in larger taxon samples, and demonstrate its utility using datasets for Sistrurus rattlesnakes and for soybeans. AVAILABILITY AND IMPLEMENTATION: The method to infer the phylogenetic relationship among quartets is implemented in the software SVDquartets, available at www.stat.osu.edu/~lkubatko/software/SVDquartets.
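The nonparametric bootstrap mentioned in the abstract is usually applied to a sequence alignment by resampling sites (columns) with replacement. The sketch below shows only that generic resampling step; it is not the SVDquartets implementation, and the function name, the dictionary-based alignment format, and the toy sequences are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment maps taxon name -> aligned sequence (all the same length).
    Sites are drawn with replacement, and the same site indices are used
    for every taxon so that each resampled column stays intact.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    if any(len(alignment[name]) != n_sites for name in names):
        raise ValueError("sequences must have equal length")
    sites = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][i] for i in sites) for name in names}


if __name__ == "__main__":
    # Toy four-taxon alignment (made-up data).
    toy = {"t1": "ACGTAC", "t2": "ACGTTC", "t3": "ATGTAC", "t4": "ATGAAC"}
    replicate = bootstrap_alignment(toy, random.Random(1))
    for name, seq in replicate.items():
        print(name, seq)
```

Each replicate would then be analyzed in the same way as the original data, and support for a relationship is summarized as the fraction of replicates that recover it.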
Systematic Biology | 2009
Laura Kubatko
As DNA sequences have become more readily available, it has become increasingly desirable to infer species phylogenies from multigene data sets. Much recent work has centered around the recognition that substantial incongruence in single-gene phylogenies necessitates the development of statistical procedures to estimate species phylogenies that appropriately model the process of evolution at the level of the individual genes. One process that gives rise to variation in the histories of individual genes is incomplete lineage sorting, which is commonly modeled by the coalescent, and thus much current work is focused on proper estimation of species phylogenies under the coalescent model. A second common source of discord in single-gene phylogenies is hybridization, a process that is ubiquitous in many groups of plants and animals. Although methods to incorporate hybridization into phylogenetic estimation have also been developed, only a handful of methods that address both coalescence and hybridization have been proposed. Here, I propose an extension of an existing model that incorporates both of these processes simultaneously by utilizing gene trees for inference in a likelihood framework. The model allows examination of the evidence for hybridization in the presence of incomplete lineage sorting due to deep coalescence via model selection using standard information criteria (e.g., Akaike information criterion and Bayesian information criterion). The potential of the method is evaluated using simulated data.
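Model selection with the Akaike and Bayesian information criteria, as referred to above, can be sketched in a few lines. The formulas are the standard ones (AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L). The model names, log-likelihood values, parameter counts, and the choice of the number of gene trees as the sample size n are all hypothetical and are not results from the paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def best_model(models, n_obs, criterion="aic"):
    """Pick the model with the lowest criterion value.

    models maps a model name -> (maximized log-likelihood, number of
    free parameters).
    """
    if criterion == "aic":
        score = lambda ll, k: aic(ll, k)
    elif criterion == "bic":
        score = lambda ll, k: bic(ll, k, n_obs)
    else:
        raise ValueError("criterion must be 'aic' or 'bic'")
    return min(models, key=lambda name: score(*models[name]))


if __name__ == "__main__":
    # Made-up values: a tree without hybridization versus the same tree
    # with one extra hybridization parameter.
    candidates = {
        "no_hybridization": (-1052.3, 4),
        "hybridization": (-1047.1, 5),
    }
    for criterion in ("aic", "bic"):
        print(criterion, best_model(candidates, n_obs=50, criterion=criterion))
```

Because BIC penalizes each extra parameter by ln n rather than 2, it tends to favor the simpler (no-hybridization) model more often than AIC when the sample is large and the gain in likelihood is small.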
Theoretical Population Biology | 2009
Chen Meng; Laura Kubatko
The application of phylogenetic inference methods to data for a set of independent genes sampled randomly throughout the genome often results in substantial incongruence in the single-gene phylogenetic estimates. Among the processes known to produce discord between single-gene phylogenies, two of the best studied in a phylogenetic context are hybridization and incomplete lineage sorting. Much recent attention has focused on the development of methods for estimating species phylogenies in the presence of incomplete lineage sorting, but phylogenetic models that allow for hybridization have been more limited. Here we propose a model that allows incongruence in single-gene phylogenies to be due to both hybridization and incomplete lineage sorting, with the goal of determining the contribution of hybridization to observed gene tree incongruence in the presence of incomplete lineage sorting. Using our model, we propose methods for estimating the extent of the role of hybridization in both a likelihood and a Bayesian framework. The performance of our methods is examined using both simulated and empirical data.
Evolutionary Bioinformatics | 2012
Laura M. Boykin; Karen F. Armstrong; Laura Kubatko; Paul J. De Barro
Species delimitation directly impacts on global biosecurity. It is a critical element in the decisions made by national governments in regard to the flow of trade and to the biosecurity measures imposed to protect countries from the threat of invasive species. Here we outline a novel approach to species delimitation, “tip to root”, for two highly invasive insect pests, Bemisia tabaci (sweetpotato whitefly) and Lymantria dispar (Asian gypsy moth). Both species are of concern to biosecurity, but illustrate the extremes of phylogenetic resolution that present the most complex delimitation issues for biosecurity; B. tabaci having extremely high intra-specific genetic variability and L. dispar composed of relatively indistinct subspecies. This study tests a series of analytical options to determine their applicability as tools to provide more rigorous species delimitation measures and consequently more defensible species assignments and identification of unknowns for biosecurity. Data from established DNA barcode datasets (COI), which are becoming increasingly considered for adoption in biosecurity, were used here as an example. The analytical approaches included the commonly used Kimura two-parameter (K2P) inter-species distance plus four more stringent measures of taxon distinctiveness: (1) Rosenberg's reciprocal monophyly, P(AB); (2) Rodrigo's P(randomly distinct); (3) the genealogical sorting index (gsi); and (4) the general mixed Yule-coalescent (GMYC). For both insect datasets, a comparative analysis of the methods revealed that the K2P distance method does not capture the same level of species distinctiveness revealed by the other three measures; in B. tabaci there are more distinct groups than previously identified using the K2P distances and for L. dispar far less variation is apparent within the predefined subspecies.
A consensus for the results from P(AB), P(randomly distinct) and gsi offers greater statistical confidence as to where genetic limits might be drawn. In the species cases here, the results clearly indicate that there is a need for more gene sampling to substantiate either the new cohort of species indicated for B. tabaci or to detect the established subspecies taxonomy of L. dispar. Given the ease of use through the Geneious species delimitation plugins, similar analysis of such multi-gene datasets would be easily accommodated. Overall, the tip to root approach described here is recommended where careful consideration of species delimitation is required to support crucial biosecurity decisions based on accurate species identification.
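For readers unfamiliar with the Kimura two-parameter (K2P) distance used as the baseline above, the following minimal sketch computes it for a pair of aligned sequences using the standard formula d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. The function name and the decision to skip any site containing a gap or ambiguity code are assumptions for illustration; this is not the procedure used in the study or in the Geneious plugin.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a character other than A, C, G or T
    are skipped (an assumption made for this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n_sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        n_sites += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if n_sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / n_sites
    q = transversions / n_sites
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


if __name__ == "__main__":
    # Made-up sequences with one transition (A/G) and one transversion (C/A).
    print(k2p_distance("ACGTACGTAC", "GCGTACGTAA"))
```

A single pairwise distance of this kind is what threshold-based barcoding compares against a cutoff, whereas the other measures named in the abstract assess distinctiveness on a tree.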
Systematic Biology | 2010
Huateng Huang; Qixin He; Laura Kubatko; L. Lacey Knowles
Discord in the estimated gene trees among loci can be attributed to both the process of mutation and incomplete lineage sorting. Effectively modeling these two sources of variation (mutational and coalescent variance) provides two distinct challenges for phylogenetic studies. Despite extensive investigation of mutational models for gene-tree estimation over the past two decades and recent attention to modeling of the coalescent process for phylogenetic estimation, the effects of these two variances have yet to be evaluated simultaneously. Here, we partition the effects of mutational and coalescent processes on phylogenetic accuracy by comparing the accuracy of species trees estimated from gene trees (i.e., the actual coalescent genealogies) with that of species trees estimated from estimated gene trees (i.e., trees estimated from nucleotide sequences, which contain both coalescent and mutational variance). Not only is there a significant contribution of both mutational and coalescent variance to errors in species-tree estimates, but the relative magnitude of the effects on the accuracy of species-tree estimation also differs systematically depending on 1) the timing of divergence, 2) the sampling design, and 3) the method used for species-tree estimation. These findings explain why using more information contained in gene trees (e.g., topology and branch lengths as opposed to just topology) does not necessarily translate into pronounced gains in accuracy, highlighting the strengths and limits of different methods for species-tree estimation. Differences in accuracy scores between methods for different sampling regimes also emphasize that it would be a mistake to assume that more computationally intensive species-tree estimation procedures will always provide better estimates of species trees.
To the contrary, the performance of a method depends not only on the method per se but also on the compatibilities between the input genetic data and the method as determined by the relative impact of mutational and coalescent variance.
Systematic Biology | 2011
Laura Kubatko; H. Lisle Gibbs; Erik W. Bloomquist
Phylogenetic relationships and taxonomic distinctiveness of closely related species and subspecies are most accurately inferred from data derived from multiple independent loci. Here, we apply several approaches for understanding species-level relationships using data from 18 nuclear DNA loci and 1 mitochondrial DNA locus within currently described species and subspecies of Sistrurus rattlesnakes. Collectively, these methods provide evidence that a currently described species, the massasauga rattlesnake (Sistrurus catenatus), consists of two well-supported clades, one composed of the two western subspecies (S. c. tergeminus and S. c. edwardsii) and the other the eastern subspecies (S. c. catenatus). Within pigmy rattlesnakes (S. miliarius), however, there is not strong support across methods for any particular grouping at the subspecific level. Monophyly-based tests for taxonomic distinctiveness show evidence for distinctiveness of all subspecies, but this support is strongest by far for the S. c. catenatus clade. Because support for the distinctiveness of S. c. catenatus is both strong and consistent across methods, and due to its morphological distinctiveness and allopatric distribution, we suggest that this subspecies be elevated to full species status, which has significant conservation implications. Finally, most divergence time estimates based upon a fossil-calibrated species tree are > 50% younger than those from a concatenated gene tree analysis and suggest that an active period of speciation within Sistrurus occurred within the late Pliocene/Pleistocene eras.