Mahul Chakraborty
University of California, Irvine
Publication
Featured research published by Mahul Chakraborty.
Nucleic Acids Research | 2016
Mahul Chakraborty; James G. Baldwin-Brown; Anthony D. Long; J.J. Emerson
Genome assemblies that are accurate, complete and contiguous are essential for identifying important structural and functional elements of genomes and for identifying genetic variation. Nevertheless, most recent genome assemblies remain incomplete and fragmented. While long molecule sequencing promises to deliver more complete genome assemblies with fewer gaps, concerns about error rates, low yields, stringent DNA requirements and uncertainty about best practices may discourage many investigators from adopting this technology. Here, in conjunction with the platinum standard Drosophila melanogaster reference genome, we analyze recently published long molecule sequencing data to identify what governs completeness and contiguity of genome assemblies. We also present a hybrid meta-assembly approach that achieves remarkable assembly contiguity for both Drosophila and human assemblies with only modest long molecule sequencing coverage. Our results motivate a set of preliminary best practices for obtaining accurate and contiguous assemblies, a ‘missing manual’ that guides key decisions in building high quality de novo genome assemblies, from DNA isolation to polishing the assembly.
Genome Biology and Evolution | 2014
Ramray Bhat; Mahul Chakraborty; I.S. Mian; Stuart A. Newman
Prototype galectins, endogenously expressed animal lectins with a single carbohydrate recognition domain, are well-known regulators of tissue properties such as growth and adhesion. The earliest discovered and best studied of the prototype galectins is Galectin-1 (Gal-1). In the Gallus gallus (chicken) genome, Gal-1 is represented by two homologs: Gal-1A and Gal-1B, with distinct biochemical properties, tissue expression, and developmental functions. We investigated the origin of the Gal-1A/Gal-1B divergence to gain insight into when their developmental functions originated and how they could have contributed to vertebrate phenotypic evolution. Sequence alignment and phylogenetic tree construction showed that the Gal-1A/Gal-1B divergence can be traced back to the origin of the sauropsid lineage (consisting of extinct and extant reptiles and birds) although lineage-specific duplications also occurred in the amphibian and actinopterygian genomes. Gene synteny analysis showed that sauropsid gal-1b (the gene for Gal-1B) and its frog and actinopterygian gal-1 homologs share a similar chromosomal location, whereas sauropsid gal-1a has translocated to a new position. Surprisingly, we found that chicken Gal-1A, encoded by the translocated gal-1a, was more similar in its tertiary folding pattern than Gal-1B, encoded by the untranslocated gal-1b, to experimentally determined and predicted folds of nonsauropsid Gal-1s. This inference is consistent with our finding of a lower proportion of conserved residues in sauropsid Gal-1Bs, and evidence for positive selection of sauropsid gal-1b, but not gal-1a genes. We propose that the duplication and structural divergence of Gal-1B away from Gal-1A led to specialization in both expression and function in the sauropsid lineage.
Chemico-Biological Interactions | 2011
Mahul Chakraborty; James D. Fry
Little is known about the roles of aldehyde dehydrogenases in non-vertebrate animals. We recently showed that in Drosophila melanogaster, an enzyme with ∼70% amino acid identity to mammalian ALDH2 is necessary for detoxification of dietary ethanol. To investigate other functions of this enzyme, DmALDH, encoded by the gene Aldh, we compared two strains homozygous for Aldh-null mutations to two closely related wild type strains in measures of fitness and stress resistance in the absence of ethanol. Aldh-null strains have lower total reproductive rate, pre-adult viability, resistance to starvation, and possibly longevity than wild-type strains. When maintained under hyperoxia, Aldh nulls die more quickly and accumulate higher levels of protein carbonyls than wild-types, thereby providing evidence that DmALDH is important for detoxifying reactive aldehydes generated by lipid peroxidation. However, no effect of Aldh was seen on protein carbonyl levels in flies maintained under normoxia. It is possible that Aldh nulls experience elevated rates of protein carbonylation under normoxia, but this is compensated (at a fitness cost) by increased rates of degradation of the defective proteins. Alternatively, the fitness defects of Aldh nulls under normoxia may result from the absence of one or more other functions of DmALDH, unrelated to protection against protein carbonylation.
BMC Evolutionary Biology | 2016
Ramray Bhat; Mahul Chakraborty; Tilmann Glimm; Thomas A. Stewart; Stuart A. Newman
Background: A multiscale network of two galectins Galectin-1 (Gal-1) and Galectin-8 (Gal-8) patterns the avian limb skeleton. Among vertebrates with paired appendages, chondrichthyan fins typically have one or more cartilage plates and many repeating parallel endoskeletal elements, actinopterygian fins have more varied patterns of nodules, bars and plates, while tetrapod limbs exhibit tandem arrays of few, proximodistally increasing numbers of elements. We applied a comparative genomic and protein evolution approach to understand the origin of the galectin patterning network. Having previously observed a phylogenetic constraint on Gal-1 structure across vertebrates, we asked whether evolutionary changes of Gal-8 could have critically contributed to the origin of the tetrapod pattern. Results: Translocations, duplications, and losses of Gal-8 genes in Actinopterygii established them in different genomic locations from those that the Sarcopterygii (including the tetrapods) share with chondrichthyans. The sarcopterygian Gal-8 genes acquired a potentially regulatory non-coding motif and underwent purifying selection. The actinopterygian Gal-8 genes, in contrast, did not acquire the non-coding motif and underwent positive selection. Conclusion: These observations interpreted through the lens of a reaction-diffusion-adhesion model based on avian experimental findings can account for the distinct endoskeletal patterns of cartilaginous, ray-finned, and lobe-finned fishes, and the stereotypical limb skeletons of tetrapods.
bioRxiv | 2015
Mahul Chakraborty; James G. Baldwin-Brown; Anthony D. Long; J.J. Emerson
Genome assemblies that are accurate, complete, and contiguous are essential for identifying important structural and functional elements of genomes and for identifying genetic variation. Nevertheless, most recent genome assemblies remain incomplete and fragmented. While long molecule sequencing promises to deliver more complete genome assemblies with fewer gaps, concerns about error rates, low yields, stringent DNA requirements, and uncertainty about best practices may discourage many investigators from adopting this technology. Here, in conjunction with the platinum standard Drosophila melanogaster reference genome, we analyze recently published long molecule sequencing data to identify what governs completeness and contiguity of genome assemblies. We also present a hybrid meta-assembly approach that achieves remarkable assembly contiguity for both Drosophila and human assemblies with only modest long molecule sequencing coverage. Our results motivate a set of preliminary best practices for obtaining accurate and contiguous assemblies, a “missing manual” that guides key decisions in building high quality de novo genome assemblies, from DNA isolation to polishing the assembly.
bioRxiv | 2017
Mahul Chakraborty; Roy Zhao; Xinwen Zhang; Shannon Kalsow; J.J. Emerson
Mutations that add, subtract, rearrange, or otherwise refashion genome structure often affect phenotypes, though the fragmented nature of most contemporary assemblies obscures them. To discover such mutations, we assembled the first reference quality genome of Drosophila melanogaster since its initial sequencing. By comparing this genome to the existing D. melanogaster assembly, we create a structural variant map of unprecedented resolution, revealing extensive genetic variation that has remained hidden until now. Many of these variants constitute strong candidates underlying phenotypic variation, including tandem duplications and a transposable element insertion that dramatically amplifies the expression of detoxification genes associated with nicotine resistance. The abundance of important genetic variation that still evades discovery highlights how crucial high quality references are to deciphering phenotypes.
Molecular Biology and Evolution | 2015
Mahul Chakraborty; James D. Fry
A large proportion of duplicates, originating from ubiquitously expressed genes, acquire testis-biased expression. Identifying the underlying cause of this observation requires determining whether the duplicates have altered functions relative to the parental genes. Typically, statistical methods are used to test for positive selection, signature of which in protein sequence of duplicates implies functional divergence. When assumptions are violated, however, such tests can lead to false inference of positive selection. More convincing evidence for naturally selected functional changes would be the occurrence of structural changes with similar functional consequences in independent duplicates of the same gene. We investigated two testis-specific duplicates of the broadly expressed enzyme gene Aldehyde dehydrogenase (Aldh) that arose in different Drosophila lineages. The duplicates show a typical pattern of accelerated amino acid substitutions relative to their broadly expressed paralogs, with statistical evidence for positive selection in both cases. Importantly, in both duplicates, width of the entrance to the substrate binding site, known a priori to influence substrate specificity, and otherwise conserved throughout the genus Drosophila, has been reduced, resulting in narrowing of the entrance. Protein structure modeling suggests that the reduction of the size of the enzyme's substrate entry channel, which is likely to shift substrate specificity toward smaller aldehydes, is accounted for by the positively selected parallel substitutions in one duplicate but not the other. Evolution of the testis-specific duplicates was accompanied by reduction in expression of the ancestral Aldh in males, supporting the hypothesis that the duplicates may have helped resolve intralocus sexual conflict over Aldh function.
bioRxiv | 2018
Mahul Chakraborty; J.J. Emerson; Stuart J. Macdonald; Anthony D. Long
Despite extensive effort to reveal the genetic basis of complex phenotypic variation, studies typically explain only a fraction of trait heritability. It has been hypothesized that individually rare hidden structural variants (SVs) could account for a significant fraction of variation in complex traits. To investigate this hypothesis, we assembled 14 Drosophila melanogaster genomes and systematically identified more than 20,000 euchromatic SVs, of which ∼40% are invisible to high specificity short read genotyping approaches. SVs are common in Drosophila genes, with almost one third of diploid individuals harboring an SV in genes larger than 5kb, and nearly a quarter harboring multiple SVs in genes larger than 10kb. We show that SV alleles are rarer than amino acid polymorphisms, implying that they are more strongly deleterious. A number of functionally important genes harbor previously hidden structural variants that likely affect complex phenotypes (e.g., Cyp6g1, Drsl5, Cyp28d1&2, InR, and Gss1&2). Furthermore, SVs are overrepresented in quantitative trait locus candidate genes from eight Drosophila Synthetic Population Resource (DSPR) mapping experiments. We conclude that SVs are pervasive in genomes, are frequently present as heterogeneous allelic series, and can act as rare alleles of large effect.
G3: Genes, Genomes, Genetics | 2018
Edwin A. Solares; Mahul Chakraborty; Danny E. Miller; Shannon Kalsow; Kate Hall; Anoja Perera; J.J. Emerson; R. Scott Hawley
Accurate and comprehensive characterization of genetic variation is essential for deciphering the genetic basis of diseases and other phenotypes. A vast amount of genetic variation stems from large-scale sequence changes arising from the duplication, deletion, inversion, and translocation of sequences. In the past 10 years, high-throughput short reads have greatly expanded our ability to assay sequence variation due to single nucleotide polymorphisms. However, a recent de novo assembly of a second Drosophila melanogaster reference genome has revealed that short read genotyping methods miss hundreds of structural variants, including those affecting phenotypes. While genomes assembled using high-coverage long reads can achieve high levels of contiguity and completeness, concerns about cost, errors, and low yield have limited widespread adoption of such sequencing approaches. Here we resequenced the reference strain of D. melanogaster (ISO1) on a single Oxford Nanopore MinION flow cell run for 24 hr. Using only reads longer than 1 kb or with at least 30x coverage, we assembled a highly contiguous de novo genome. The addition of inexpensive paired reads and subsequent scaffolding using an optical map technology achieved an assembly with completeness and contiguity comparable to the D. melanogaster reference assembly. Comparison of our assembly to the reference assembly of ISO1 uncovered a number of structural variants (SVs), including novel LTR transposable element insertions and duplications affecting genes with developmental, behavioral, and metabolic functions. Collectively, these SVs provide a snapshot of the dynamics of genome evolution. Furthermore, our assembly and comparison to the D. melanogaster reference genome demonstrates that high-quality de novo assembly of reference genomes and comprehensive variant discovery using such assemblies are now possible by a single lab for under $1,000 (USD).
Current Biology | 2016
Mahul Chakraborty; James D. Fry