Publication


Featured research published by Shamsuzzoha Bayzid.


BMC Genomics | 2014

Disk covering methods improve phylogenomic analyses

Shamsuzzoha Bayzid; Tyler Hunt; Tandy J. Warnow

Motivation: With the rapid growth rate of newly sequenced genomes, species tree inference from multiple genes has become a basic bioinformatics task in comparative and evolutionary biology. However, accurate species tree estimation is difficult in the presence of gene tree discordance, which is often due to incomplete lineage sorting (ILS), modelled by the multi-species coalescent. Several highly accurate coalescent-based species tree estimation methods have been developed over the last decade, including MP-EST. However, the running time for MP-EST increases rapidly as the number of species grows. Results: We present divide-and-conquer techniques that improve the scalability of MP-EST so that it can run efficiently on large datasets. Surprisingly, this technique also improves the accuracy of species trees estimated by MP-EST, as our study shows on a collection of simulated and biological datasets.


Algorithms for Molecular Biology | 2018

Gene tree parsimony for incomplete gene trees: addressing true biological loss

Shamsuzzoha Bayzid; Tandy J. Warnow

Motivation: Species tree estimation from gene trees can be complicated by gene duplication and loss, and “gene tree parsimony” (GTP) is one approach for estimating species trees from multiple gene trees. In its standard formulation, the objective is to find a species tree that minimizes the total number of gene duplications and losses with respect to the input set of gene trees. Although much is known about GTP, little is known about how to treat inputs containing some incomplete gene trees (i.e., gene trees lacking one or more of the species). Results: We present new theory for GTP considering whether the incompleteness is due to gene birth and death (i.e., true biological loss) or taxon sampling, and present dynamic programming algorithms that can be used for an exact but exponential time solution for small numbers of taxa, or as a heuristic for larger numbers of taxa. We also prove that the “standard” calculations for duplications and losses exactly solve GTP when incompleteness results from taxon sampling, although they can be incorrect when incompleteness results from true biological loss. The software for the DP algorithm is freely available as open source code at https://github.com/smirarab/DynaDup.
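To make the duplication part of the GTP objective concrete, here is a minimal sketch of the textbook LCA-mapping count of duplications for a single rooted gene tree against a fixed species tree. It is not the DynaDup implementation and does not count losses or search over species trees. The nested-tuple tree encoding and the names `clusters`, `lca_cluster`, `count_duplications`, and `species_of` are assumptions made for illustration, and every gene is assumed to belong to a species present in the species tree.

```python
def clusters(tree):
    """Return the set of leaf labels below a nested-tuple tree (a leaf is a str)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(clusters(child) for child in tree))


def lca_cluster(species_tree, leaves):
    """Return the leaf set of the smallest species-tree clade containing `leaves`."""
    node = species_tree
    while not isinstance(node, str):
        for child in node:
            if leaves <= clusters(child):
                node = child  # descend while one child still holds every leaf
                break
        else:
            break  # leaves are split across children: this node is the LCA
    return clusters(node)


def count_duplications(gene_tree, species_tree, species_of=lambda gene: gene):
    """Count gene-tree nodes that map to the same species-tree node as a child."""
    duplications = 0

    def visit(node):
        nonlocal duplications
        if isinstance(node, str):
            return lca_cluster(species_tree, frozenset([species_of(node)]))
        child_maps = [visit(child) for child in node]
        here = lca_cluster(species_tree, frozenset().union(*child_maps))
        if here in child_maps:  # standard duplication criterion
            duplications += 1
        return here

    visit(gene_tree)
    return duplications


# Hypothetical toy input: species tree ((A,B),C) and a discordant gene tree.
species = (("A", "B"), "C")
print(count_duplications((("A", "C"), "B"), species))
```

Summing this count (plus the corresponding loss count) over all input gene trees gives the score that GTP seeks to minimize over candidate species trees.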


Archive | 2014

Biological Datasets based on Salichos and Rokas for Mirarab et al.

Siavash Mirarab; Shamsuzzoha Bayzid; Bastien Boussau; Tandy J. Warnow

The three biological datasets are all from Salichos and Rokas, 2013, Nature (doi:10.1038/nature12130). The authors kindly provided us with both the alignments and their gene trees. We make the supergene alignments and trees (which we estimated) available in this record. Please see the README file for more information.


Archive | 2014

Mammalian Model Species Trees for 1X Model Condition for Mirarab et al.

Siavash Mirarab; Shamsuzzoha Bayzid; Bastien Boussau; Tandy J. Warnow

This record contains the model mammalian species trees for the 1X model condition (for the reduced or increased ILS model conditions, we simply multiply or divide the branch lengths by 2 or 5). Please see the README file for more information.
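The branch-length rescaling described above can be sketched as follows. This is an illustration only, not code from the record: it assumes the trees are stored as Newick strings with plain numeric branch lengths and no colons inside taxon labels, and the function name `scale_newick` and the toy tree are hypothetical.

```python
import re

# A branch length is the number that follows a colon in a Newick string.
_BRANCH = re.compile(r":((?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][-+]?[0-9]+)?)")


def scale_newick(newick, factor):
    """Return a copy of `newick` with every branch length multiplied by `factor`."""
    def rescale(match):
        return ":" + repr(float(match.group(1)) * factor)
    return _BRANCH.sub(rescale, newick)


# Hypothetical 1X tree; multiply for longer branches, divide for shorter ones.
base = "((A:1.0,B:2.0):0.5,C:3.0);"
print(scale_newick(base, 2))      # branch lengths multiplied by 2
print(scale_newick(base, 1 / 2))  # branch lengths divided by 2
```

Longer branches in coalescent units leave less room for incomplete lineage sorting, so multiplying corresponds to the reduced-ILS conditions and dividing to the increased-ILS conditions.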


Archive | 2014

Binning Code for Mirarab et al.

Siavash Mirarab; Shamsuzzoha Bayzid; Bastien Boussau; Tandy J. Warnow

This is the pipeline used for performing the binning step; the exact code that we used for binning in the Science paper is available here. For the latest version of this code, refer to the GitHub repository at https://github.com/smirarab/binning. Once the file from the IDEALS page is unzipped, see the README file for usage and installation guidelines. Note that this pipeline works on *nix-like systems (including Mac) but not on Windows. However, the main code that performs vertex coloring and compatibility checks is written in Java and can run on Windows if gluing scripts are developed. Please see the README file for more information, and refer to the README on GitHub for a more detailed explanation.


Archive | 2014

Biological Mammalian Dataset for Mirarab et al.

Siavash Mirarab; Shamsuzzoha Bayzid; Bastien Boussau; Tandy J. Warnow

The mammalian dataset was provided to us by Song et al. from their 2013 PNAS paper (doi: 10.1073/pnas.1211733109). We provide the alignments (from Song et al.), the gene trees, and the supergene trees that we estimated on those alignments in this record. Please see the README file for more information.


Archive | 2014

Bin Definition for Super Gene Trees for Mirarab et al.

Siavash Mirarab; Shamsuzzoha Bayzid; Bastien Boussau; Tandy J. Warnow

Definition of bins for all our super gene trees for our Avian and Mammalian datasets. These files contain a pairwise/R*/[50/75]/bin.*.txt file for each model condition. These are simple text files that list the gene ids placed into each bin. Please see the README file for more information.


Bioinformatics | 2013

Naive Binning Improves Phylogenomic Analyses

Shamsuzzoha Bayzid; Tandy J. Warnow


PLOS ONE | 2015

Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses

Shamsuzzoha Bayzid; Siavash Mirarab; Bastien Boussau; Tandy J. Warnow


PLOS ONE | 2014

The 25 species avian phylogeny, representing 4 genera of birds from Maluridae family, estimated by QFM using the 227,700 embedded quartets in 18 gene trees.

Reaz Rezwana; Shamsuzzoha Bayzid; Sohel Rahman M.

Collaboration


Dive into Shamsuzzoha Bayzid's collaborations.

Top Co-Authors

Tyler Hunt

University of Texas at Austin
