Bioinformatics | 2021

MNHN-Tree-Tools: A toolbox for tree inference using multi-scale clustering of a set of sequences.


Abstract


SUMMARY
Genomic sequences are widely used to infer the evolutionary history of a given group of individuals. Many methods have been developed for sequence clustering and tree building. In the early days of genome sequencing, these were often limited to hundreds of sequences, but with the surge of high-throughput sequencing, it is now common to have millions of sampled sequences at hand. We introduce MNHN-Tree-Tools, a high-performance set of algorithms that builds multi-scale, nested clusters of sequences found in a FASTA file. MNHN-Tree-Tools does not rely on sequence alignment and can thus be used on large datasets to infer a sequence tree. Herein we outline two applications: a classification of human alpha-satellite repeats and a tree-of-life derivation from 16S/18S rDNA sequences.

CODE AVAILABILITY
Open source under the Zlib License, available via Git: https://gitlab.in2p3.fr/mnhn-tools/mnhn-tree-tools

SUPPLEMENTARY INFORMATION
An in-depth discussion of the algorithm with numerical simulations: https://gitlab.in2p3.fr/mnhn-tools/tree-tools-algorithms-document/-/raw/master/article.pdf

MANUAL
A detailed user's guide and tutorial: https://gitlab.in2p3.fr/mnhn-tools/mnhn-tree-tools-manual/-/raw/master/manual.pdf

WEBSITE AND FAQ
http://treetools.haschka.net
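
To make the idea of alignment-free, multi-scale nested clustering concrete, the following is a minimal sketch and not the MNHN-Tree-Tools implementation. It assumes a simple stand-in technique: each sequence is summarized by its normalized k-mer frequencies, and clusters are the connected components of sequences whose profiles lie within a distance threshold. Growing the threshold can only merge clusters, so the levels nest and can be read as a tree. The function names (`kmer_profile`, `clusters_at`, `nested_clusters`) and the parameters `k` and `eps_values` are hypothetical, chosen for illustration only.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=2):
    """Normalized k-mer frequency vector over the ACGT alphabet."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def clusters_at(profiles, eps):
    """Connected components of the graph linking profiles within eps."""
    n = len(profiles)
    parent = list(range(n))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(profiles[i], profiles[j]) <= eps:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def nested_clusters(seqs, eps_values, k=2):
    """One clustering per threshold, from finest to coarsest scale."""
    profiles = [kmer_profile(s, k) for s in seqs]
    return [clusters_at(profiles, eps) for eps in sorted(eps_values)]


if __name__ == "__main__":
    # Toy sequences (illustrative only, not data from the article).
    seqs = ["AAAAAAAA", "AAAAAAAT", "CGCGCGCG", "CGCGCGCA"]
    for level in nested_clusters(seqs, [0.1, 0.3, 2.0]):
        print(level)
```

Each returned level lists clusters as groups of sequence indices. Because no pairwise alignment is computed, the cost per comparison depends only on the profile length, not on the sequence length. The pairwise loop here is quadratic in the number of sequences, so it is suitable only for small illustrations.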

DOI 10.1093/bioinformatics/btab430
Language English
Journal Bioinformatics

Full Text