Publication


Featured research published by Alexey Kozlov.


Bioinformatics | 2015

ExaML version 3: a tool for phylogenomic analyses on supercomputers

Alexey Kozlov; Andre J. Aberer; Alexandros Stamatakis

Motivation: Phylogenies are increasingly used in all fields of medical and biological research. Because of the next-generation sequencing revolution, datasets used for conducting phylogenetic analyses grow at an unprecedented pace. We present ExaML version 3, a dedicated production-level code for inferring phylogenies on whole-transcriptome and whole-genome alignments using supercomputers. Results: We introduce several improvements and extensions to ExaML: extensions of substitution models and supported data types, the integration of a novel load balance algorithm as well as a parallel I/O optimization that significantly improve parallel efficiency, and a production-level implementation for Intel MIC-based hardware platforms. Availability and implementation: The code is available under GNU GPL at https://github.com/stamatak/ExaML. Supplementary information: Supplementary data are available at Bioinformatics online.


Nature Ecology and Evolution | 2017

Parasites dominate hyperdiverse soil protist communities in Neotropical rainforests

Frédéric Mahé; Colomban de Vargas; David Bass; Lucas Czech; Alexandros Stamatakis; Enrique Lara; David Singer; Jordan Mayor; John Bunge; Sarah Sernaker; Tobias Siemensmeyer; Isabelle Trautmann; Sarah Romac; Cédric Berney; Alexey Kozlov; Edward A. D. Mitchell; Christophe V. W. Seppey; Elianne Sirnæs Egge; Guillaume Lentendu; Rainer Wirth; Gabriel Trueba; Micah Dunthorn

High animal and plant richness in tropical rainforest communities has long intrigued naturalists. It is unknown if similar hyperdiversity patterns are reflected at the microbial scale with unicellular eukaryotes (protists). Here we show, using environmental metabarcoding of soil samples and a phylogeny-aware cleaning step, that protist communities in Neotropical rainforests are hyperdiverse and dominated by the parasitic Apicomplexa, which infect arthropods and other animals. These host-specific parasites potentially contribute to the high animal diversity in the forests by reducing population growth in a density-dependent manner. By contrast, too few operational taxonomic units (OTUs) of Oomycota were found to broadly drive high tropical tree diversity in a host-specific manner under the Janzen-Connell model. Extremely high OTU diversity and high heterogeneity between samples within the same forests suggest that protists, not arthropods, are the most diverse eukaryotes in tropical rainforests. Our data show that protists play a large role in tropical terrestrial ecosystems long viewed as being dominated by macroorganisms.


Nucleic Acids Research | 2016

Phylogeny-aware identification and correction of taxonomically mislabeled sequences

Alexey Kozlov; Jiajie Zhang; Pelin Yilmaz; Frank Oliver Glöckner; Alexandros Stamatakis

Molecular sequences in public databases are mostly annotated by the submitting authors without further validation. This procedure can generate erroneous taxonomic sequence labels. Mislabeled sequences are hard to identify, and they can induce downstream errors because new sequences are typically annotated using existing ones. Furthermore, taxonomic mislabelings in reference sequence databases can bias metagenetic studies which rely on the taxonomy. Despite significant efforts to improve the quality of taxonomic annotations, the curation rate is low because of the labor-intensive manual curation process. Here, we present SATIVA, a phylogeny-aware method to automatically identify taxonomically mislabeled sequences (‘mislabels’) using statistical models of evolution. We use the Evolutionary Placement Algorithm (EPA) to detect and score sequences whose taxonomic annotation is not supported by the underlying phylogenetic signal, and automatically propose a corrected taxonomic classification for those. Using simulated data, we show that our method attains high accuracy for identification (96.9% sensitivity/91.7% precision) as well as correction (94.9% sensitivity/89.9% precision) of mislabels. Furthermore, an analysis of four widely used microbial 16S reference databases (Greengenes, LTP, RDP and SILVA) indicates that they currently contain between 0.2% and 2.5% mislabels. Finally, we use SATIVA to perform an in-depth evaluation of alternative taxonomies for Cyanobacteria. SATIVA is freely available at https://github.com/amkozlov/sativa.


Molecular Phylogenetics and Evolution | 2018

Transcriptome sequence-based phylogeny of chalcidoid wasps (Hymenoptera: Chalcidoidea) reveals a history of rapid radiations, convergence, and evolutionary success

Ralph S. Peters; Oliver Niehuis; Simon Gunkel; Marcel Bläser; Christoph Mayer; Lars Podsiadlowski; Alexey Kozlov; Alexander Donath; Simon van Noort; Shanlin Liu; Xin Zhou; Bernhard Misof; Lars Krogmann

Chalcidoidea are a megadiverse group of mostly parasitoid wasps of major ecological and economic importance that are omnipresent in almost all extant terrestrial habitats. The timing and pattern of chalcidoid diversification is so far poorly understood and has left many important questions on the evolutionary history of Chalcidoidea unanswered. In this study, we infer the early divergence events within Chalcidoidea and address the question of whether or not ancestral chalcidoids were small egg parasitoids. We also trace the evolution of some key traits: jumping ability, development of enlarged hind femora, and associations with figs. Our phylogenetic inference is based on the analysis of 3,239 single-copy genes across 48 chalcidoid wasps and outgroup representatives. We applied an innovative a posteriori evaluation approach to molecular clock-dating based on nine carefully validated fossils, resulting in the first molecular clock-based estimation of deep Chalcidoidea divergence times. Our results suggest a late Jurassic origin of Chalcidoidea, with a first divergence of morphologically and biologically distinct groups in the early to mid Cretaceous, between 129 and 81 million years ago (mya). Diversification of most extant lineages happened rapidly after the Cretaceous in the early Paleogene, between 75 and 53 mya. The inferred Chalcidoidea tree suggests a transition from ancestral minute egg parasitoids to larger-bodied parasitoids of other host stages during the early history of chalcidoid evolution. The ability to jump evolved independently at least three times, namely in Eupelmidae, Encyrtidae, and Tanaostigmatidae. Furthermore, the large-bodied strongly sclerotized species with enlarged hind femora in Chalcididae and Leucospidae are not closely related. Finally, the close association of some chalcidoid wasps with figs, either as pollinators, or as inquilines/gallers or as parasitoids, likely evolved at least twice independently: in the Eocene, giving rise to fig pollinators, and in the Oligocene or Miocene, resulting in non-pollinating fig-wasps, including gallers and parasitoids. The origins of very speciose lineages (e.g., Mymaridae, Eulophidae, Pteromalinae) are evenly spread across the period of chalcidoid evolution from early Cretaceous to the late Eocene. Several shifts in biology and morphology (e.g., in host exploitation, body shape and size, life history), each followed by rapid radiations, have likely enabled the evolutionary success of Chalcidoidea.


American Journal of Botany | 2018

A roadmap for global synthesis of the plant tree of life

Wolf L. Eiserhardt; Alexandre Antonelli; Dominic J. Bennett; Laura R. Botigué; J. Gordon Burleigh; Steven Dodsworth; Brian J. Enquist; Félix Forest; Jan T. Kim; Alexey Kozlov; Ilia J. Leitch; Brian S. Maitner; Siavash Mirarab; William H. Piel; Oscar Alejandro Pérez-Escobar; Lisa Pokorny; Carsten Rahbek; Brody Sandel; Stephen A. Smith; Alexandros Stamatakis; Rutger A. Vos; Tandy J. Warnow; William J. Baker

Providing science and society with an integrated, up-to-date, high quality, open, reproducible and sustainable plant tree of life would be a huge service that is now coming within reach. However, synthesizing the growing body of DNA sequence data in the public domain and disseminating the trees to a diverse audience are often not straightforward due to numerous informatics barriers. While big synthetic plant phylogenies are being built, they remain static and become quickly outdated as new data are published and tree-building methods improve. Moreover, the body of existing phylogenetic evidence is hard to navigate and access for non-experts. We propose that our community of botanists, tree builders, and informaticians should converge on a modular framework for data integration and phylogenetic analysis, allowing easy collaboration, updating, data sourcing and flexible analyses. With support from major institutions, this pipeline should be re-run at regular intervals, storing trees and their metadata long-term. Providing the trees to a diverse global audience through user-friendly front ends and application development interfaces should also be a priority. Interactive interfaces could be used to solicit user feedback and thus improve data quality and to coordinate the generation of new data. We conclude by outlining a number of steps that we suggest the scientific community should take to achieve global phylogenetic synthesis.


BMC Evolutionary Biology | 2018

Phylogenomic analysis of Apoidea sheds new light on the sister group of bees

Manuela Sann; Oliver Niehuis; Ralph S. Peters; Christoph Mayer; Alexey Kozlov; Lars Podsiadlowski; Sarah Bank; Karen Meusemann; Bernhard Misof; Christoph Bleidorn; Michael Ohl

Background: Apoid wasps and bees (Apoidea) are an ecologically and morphologically diverse group of Hymenoptera, with some species of bees having evolved eusocial societies. Major problems for our understanding of the evolutionary history of Apoidea have been the difficulty to trace the phylogenetic origin and to reliably estimate the geological age of bees. To address these issues, we compiled a comprehensive phylogenomic dataset by simultaneously analyzing target DNA enrichment and transcriptomic sequence data, comprising 195 single-copy protein-coding genes and covering all major lineages of apoid wasps and bee families. Results: Our compiled data matrix comprised 284,607 nucleotide sites that we phylogenetically analyzed by applying a combination of domain- and codon-based partitioning schemes. The inferred results confirm the polyphyletic status of the former family “Crabronidae”, which comprises nine major monophyletic lineages. We found the former subfamily Pemphredoninae to be polyphyletic, comprising three distantly related clades. One of them, Ammoplanina, constituted the sister group of bees in all our analyses. We estimate the origin of bees to be in the Early Cretaceous (ca. 128 million years ago), a time period during which angiosperms rapidly radiated. Finally, our phylogenetic analyses revealed that within the Apoidea, (eu)social societies evolved exclusively in a single clade that comprises pemphredonine and philanthine wasps as well as bees. Conclusion: By combining transcriptomic sequences with those obtained via target DNA enrichment, we were able to include an unprecedentedly large number of apoid wasps in a phylogenetic study for tracing the phylogenetic origin of bees. Our results confirm the polyphyletic nature of the former wasp family Crabronidae, which we here suggest splitting into eight families. Of these, the family Ammoplanidae possibly represents the extant sister lineage of bees. Species of Ammoplanidae are known to hunt thrips, of which some aggregate on flowers and feed on pollen. The specific biology of Ammoplanidae as predators indicates how the transition from a predatory to pollen-collecting life style could have taken place in the evolution of bees. This insight plus the finding that (eu)social societies evolved exclusively in a single subordinated lineage of apoid wasps provides new perspectives for future comparative studies.


International Parallel and Distributed Processing Symposium | 2014

Efficient Computation of the Phylogenetic Likelihood Function on the Intel MIC Architecture

Alexey Kozlov; Christian Goll; Alexandros Stamatakis

Phylogenetic inference is the process of reconstructing the evolutionary history of species based on their traits, nowadays mostly using molecular sequence data. Current state-of-the-art inference methods, like Bayesian and Maximum Likelihood (ML) inference, rely on the Phylogenetic Likelihood Function (PLF) as their computational core. Due to the large number of floating-point operations involved, the PLF evaluation is the major bottleneck for large-scale phylogenetic analyses comprising thousands of genes or even whole genomes. Here, we describe an optimized implementation of the PLF kernel for the novel Intel Many Integrated Core (MIC) architecture. Using a MIC-based accelerator (Xeon Phi 5110P), we were able to achieve speedups ranging from 1.9× to 2.8× for different PLF kernels, compared to a highly optimized AVX implementation running on a dual-socket Xeon E5-2680 system. By integrating the optimized PLF into the phylogenetic inference program RAxML-Light, we reduced the overall execution times by up to a factor of two. To assess the scalability on multiple Xeon Phi cards, we also developed a hybrid MPI/OpenMP version of the ExaML code. When ExaML is executed on two coprocessors on the same node, we obtain speedups of up to a factor of 3.7 (vs. a CPU baseline) and 1.8 (vs. a single MIC). As expected, speedups increase with growing dataset size and become stable for alignments that require processing 1-2 million sites per MIC card.


bioRxiv | 2016

Soil Protists in Three Neotropical Rainforests are Hyperdiverse and Dominated by Parasites

Frédéric Mahé; Colomban de Vargas; David Bass; Lucas Czech; Alexandros Stamatakis; Enrique Lara; Jordan Mayor; John Bunge; Sarah Sernaker; Tobias Siemensmeyer; Isabelle Trautmann; Sarah Romac; Cédric Berney; Alexey Kozlov; Edward A. D. Mitchell; Christophe V. W. Seppey; David Singer; Elianne Sirnæs Egge; Rainer Wirth; Gabriel Trueba; Micah Dunthorn

Animal and plant richness in tropical rainforests has long intrigued naturalists. More recent work has revealed that parasites contribute to high tropical tree diversity (Bagchi et al., 2014; Terborgh, 2012) and that arthropods are the most diverse eukaryotes in these forests (Erwin, 1982; Basset et al., 2012). It is unknown if similar patterns are reflected at the microbial scale with unicellular eukaryotes or protists. Here we show, using environmental metabarcoding and a novel phylogeny-aware cleaning step, that protists inhabiting Neotropical rainforest soils are hyperdiverse and dominated by the parasitic Apicomplexa, which infect arthropods and other animals. These host-specific protist parasites potentially contribute to the high animal diversity in the forests by reducing population growth in a density-dependent manner. By contrast, we found too few Oomycota to broadly drive high tropical tree diversity in a host-specific manner under the Janzen-Connell model (Janzen, 1970; Connell, 1970). Extremely high OTU diversity and high heterogeneity between samples within the same forests suggest that protists, not arthropods, are the most diverse eukaryotes in tropical rainforests. Our data show that microbes play a large role in tropical terrestrial ecosystems long viewed as being dominated by macro-organisms.


bioRxiv | 2018

RAxML-NG: A fast, scalable, and user-friendly tool for maximum likelihood phylogenetic inference

Alexey Kozlov; Diego Darriba; Tomáš Flouri; Benoit Morel; Alexandros Stamatakis

Motivation: Phylogenies are important for fundamental biological research, but also have numerous applications in biotechnology, agriculture, and medicine. Finding the optimal tree under the popular maximum likelihood (ML) criterion is known to be NP-hard. Thus, highly optimized and scalable codes are needed to analyze constantly growing empirical datasets. Results: We present RAxML-NG, a from-scratch re-implementation of the established greedy tree search algorithm of RAxML/ExaML. RAxML-NG offers improved accuracy, flexibility, speed, scalability, and usability. It compares favorably to IQ-Tree, an increasingly popular recent tool for ML-based phylogenetic inference. Finally, RAxML-NG introduces several new features, such as the detection of terraces in tree space and the recently introduced transfer bootstrap support metric. Availability: The code is available under GNU GPL at https://github.com/amkozlov/raxml-ng. A RAxML-NG web service (maintained by Vital-IT) is available at https://raxml-ng.vital-it.ch/.


Systematic Entomology | 2018

New data, same story: phylogenomics does not support Syrphoidea (Diptera: Syrphidae, Pipunculidae)

Thomas Pauli; Trevor O. Burt; Karen Meusemann; Keith M. Bayless; Alexander Donath; Lars Podsiadlowski; Christoph Mayer; Alexey Kozlov; Alexandros Vasilikopoulos; Shanlin Liu; Xin Zhou; David K. Yeates; Bernhard Misof; Ralph S. Peters; Ximo Mengual

The Syrphoidea (families Pipunculidae and Syrphidae) has been suggested to be the sister group of the Schizophora, the largest species radiation of true flies. A major challenge in dipterology is inferring the phylogenetic relationship between Syrphoidea and Schizophora in order to understand the evolutionary history of flies. Using newly sequenced transcriptomic data of Syrphidae, Pipunculidae and closely related lineages, we were able to fully resolve phylogenetic relationships of Syrphoidea using a supermatrix approach with more than 1 million amino acid positions derived from 3145 genes, including 19 taxa across nine families. Platypezoidea were inferred as a sister group to Eumuscomorpha, which was recovered monophyletic. While Syrphidae were also found to be monophyletic, the superfamily Syrphoidea was not recovered as a monophyletic group, as Pipunculidae were inferred as sister group to Schizophora. Within Syrphidae, the subfamily Microdontinae was resolved as sister group to the remaining taxa, Syrphinae and Pipizinae were placed as sister groups, and the monophyly of Eristalinae was not recovered. Although our results are consistent with previously established hypotheses on Eumuscomorphan evolution, our approach is new to dipteran phylogeny, using larger‐scale transcriptomic data for the first time for this insect group.

Collaboration


Dive into Alexey Kozlov's collaborations.

Top Co-Authors

Alexandros Stamatakis

Karlsruhe Institute of Technology

Xin Zhou

China Agricultural University

Lars Krogmann

American Museum of Natural History
