Malay Kumar Basu
National Institutes of Health
Publication
Featured research published by Malay Kumar Basu.
Science | 2011
John K. Colbourne; Michael E. Pfrender; Donald L. Gilbert; W. Kelley Thomas; Abraham Tucker; Todd H. Oakley; Shin-ichi Tokishita; Andrea Aerts; Georg J. Arnold; Malay Kumar Basu; Darren J Bauer; Carla E. Cáceres; Liran Carmel; Claudio Casola; Jeong Hyeon Choi; John C. Detter; Qunfeng Dong; Serge Dusheyko; Brian D. Eads; Thomas Fröhlich; Kerry A. Geiler-Samerotte; Daniel Gerlach; Phil Hatcher; Sanjuro Jogdeo; Jeroen Krijgsveld; Evgenia V. Kriventseva; Dietmar Kültz; Christian Laforsch; Erika Lindquist; Jacqueline Lopez
The Daphnia genome reveals a multitude of genes and shows adaptation through gene family expansions. We describe the draft genome of the microcrustacean Daphnia pulex, which is only 200 megabases and contains at least 30,907 genes. The high gene count is a consequence of an elevated rate of gene duplication resulting in tandem gene clusters. More than a third of Daphnia’s genes have no detectable homologs in any other available proteome, and the most amplified gene families are specific to the Daphnia lineage. The coexpansion of gene families interacting within metabolic pathways suggests that the maintenance of duplicated genes is not random, and the analysis of gene expression under different environmental conditions reveals that numerous paralogs acquire divergent expression patterns soon after duplication. Daphnia-specific genes, including many additional loci within sequenced regions that are otherwise devoid of annotations, are the most responsive genes to ecological challenges.
Nucleic Acids Research | 2012
Daniel H. Haft; Jeremy D. Selengut; Roland A. Richter; Derek M. Harkins; Malay Kumar Basu; Erin Beck
TIGRFAMs, available online at http://www.jcvi.org/tigrfams, is a database of protein family definitions. Each entry features a seed alignment of trusted representative sequences, a hidden Markov model (HMM) built from that alignment, cutoff scores that let automated annotation pipelines decide which proteins are members, and annotations for transfer onto member proteins. Most TIGRFAMs models are designated equivalog, meaning they assign a specific name to proteins conserved in function from a common ancestral sequence. Models describing more functionally heterogeneous families are designated subfamily or domain, and assign less specific but more widely applicable annotations. The Genome Properties database, available at http://www.jcvi.org/genome-properties, specifies how computed evidence, including TIGRFAMs HMM results, should be used to judge whether an enzymatic pathway, a protein complex or another type of molecular subsystem is encoded in a genome. TIGRFAMs and Genome Properties content are developed in concert because subsystems reconstruction for large numbers of genomes guides selection of seed alignment sequences and cutoff values during protein family construction. Both databases specialize heavily in bacterial and archaeal subsystems. At present, 4284 models appear in TIGRFAMs, while 628 systems are described by Genome Properties. Content derives both from subsystem discovery work and from biocuration of the scientific literature.
Cell Cycle | 2005
Igor B. Rogozin; Malay Kumar Basu; I. King Jordan; Youri I. Pavlov; Eugene V. Koonin
Using iterative database searches, we identified a new subfamily of the AID/APOBEC family of RNA/DNA editing cytidine deaminases. The new subfamily, which is represented by readily identifiable orthologs in mammals, chicken, and frog, but not fishes, was designated APOBEC4. The zinc-coordinating motifs involved in catalysis and the secondary structure of the APOBEC4 deaminase domain are evolutionarily conserved, suggesting that APOBEC4 proteins are active polynucleotide (deoxy)cytidine deaminases. In reconstructed maximum likelihood phylogenetic trees, APOBEC4 forms a distinct clade with high statistical support. APOBEC4 and APOBEC1 are joined in a moderately supported cluster clearly separated from the AID, APOBEC2 and APOBEC3 subfamilies. In mammals, APOBEC4 is expressed primarily in testis, which suggests the possibility that it is an editing enzyme for mRNAs involved in spermatogenesis.
BMC Biology | 2010
Daniel H. Haft; Malay Kumar Basu; Douglas A. Mitchell
Background: A new family of natural products has been described in which cysteine, serine and threonine from ribosomally-produced peptides are converted to thiazoles, oxazoles and methyloxazoles, respectively. These metabolites and their biosynthetic gene clusters are now referred to as thiazole/oxazole-modified microcins (TOMM). As exemplified by microcin B17 and streptolysin S, TOMM precursors contain an N-terminal leader sequence and C-terminal core peptide. The leader sequence contains binding sites for the posttranslational modifying enzymes which subsequently act upon the core peptide. TOMM peptides are small and highly variable, frequently missed by gene-finders and occasionally situated far from the thiazole/oxazole forming genes. Thus, locating a substrate for a particular TOMM pathway can be a challenging endeavor.
Results: Examination of candidate TOMM precursors has revealed a subclass with an uncharacteristically long leader sequence closely related to the enzyme nitrile hydratase. Members of this nitrile hydratase leader peptide (NHLP) family lack the metal-binding residues required for catalysis. Instead, NHLP sequences display the classic Gly-Gly cleavage motif and have C-terminal regions rich in heterocyclizable residues. The NHLP family exhibits a correlated species distribution and local clustering with an ABC transport system. This study also provides evidence that a separate family, annotated as Nif11 nitrogen-fixing proteins, can serve as natural product precursors (N11P), but not always of the TOMM variety. Indeed, a number of cyanobacterial genomes show extensive N11P paralogous expansion, such as Nostoc, Prochlorococcus and Cyanothece, which replace the TOMM cluster with lanthionine biosynthetic machinery.
Conclusions: This study has united numerous TOMM gene clusters with their cognate substrates. These results suggest that two large protein families, the nitrile hydratases and Nif11, have been retailored for secondary metabolism. Precursors for TOMMs and lanthionine-containing peptides, derived from larger proteins to which other functions are attributed, may be widespread. The functions of these natural products have yet to be elucidated, but it is probable that some will display valuable industrial or medical activities.
Genome Biology and Evolution | 2009
Igor B. Rogozin; Malay Kumar Basu; Miklos Csuros; Eugene V. Koonin
The deep phylogeny of eukaryotes is an important but extremely difficult problem of evolutionary biology. Five eukaryotic supergroups are relatively well established but the relationship between these supergroups remains elusive, and their divergence seems to best fit a “Big Bang” model. Attempts were made to root the tree of eukaryotes by using potential derived shared characters such as unique fusions of conserved genes. One popular model of eukaryotic evolution that emerged from this type of analysis is the unikont–bikont phylogeny: The unikont branch consists of Metazoa, Choanozoa, Fungi, and Amoebozoa, whereas bikonts include the rest of eukaryotes, namely, Plantae (green plants, Chlorophyta, and Rhodophyta), Chromalveolata, excavates, and Rhizaria. We reexamine the relationships between the eukaryotic supergroups using a genome-wide analysis of rare genomic changes (RGCs) associated with multiple, conserved amino acids (RGC_CAMs and RGC_CAs), to resolve trifurcations of major eukaryotic lineages. The results do not support the basal position of Chromalveolata with respect to Plantae and unikonts or the monophyly of the bikont group and appear to be best compatible with the monophyly of unikonts and Chromalveolata. Chromalveolata show a distinct, additional signal of affinity with Plantae, conceivably, owing to genes transferred from the secondary, red algal symbiont. Excavates are derived forms, with extremely long branches that complicate phylogenetic inference; nevertheless, the RGC analysis suggests that they are significantly more likely to cluster with the unikont–Chromalveolata assemblage than with the Plantae. Thus, the first split in eukaryotic evolution might lie between photosynthetic and nonphotosynthetic forms and so could have been triggered by the endosymbiosis between an ancestral unicellular eukaryote and a cyanobacterium that gave rise to the chloroplast.
Journal of Bacteriology | 2011
Daniel H. Haft; Malay Kumar Basu
Data mining methods in bioinformatics and comparative genomics commonly rely on working definitions of protein families from prior computation. Partial phylogenetic profiling (PPP), by contrast, optimizes family sizes during its searches for the cooccurring protein families that serve different roles in the same biological system. In a large-scale investigation of the incredibly diverse radical S-adenosylmethionine (SAM) enzyme superfamily, PPP aided in building a collection of 68 TIGRFAMs hidden Markov models (HMMs) that define nonoverlapping and functionally distinct subfamilies. Many identify radical SAM enzymes as molecular markers for multicomponent biological systems; HMMs defining their partner proteins also were constructed. Newly found systems include five groupings of protein families in which at least one marker is a radical SAM enzyme while another, encoded by an adjacent gene, is a short peptide predicted to be its substrate for posttranslational modification. The most prevalent, in over 125 genomes, featuring a peptide that we designate SCIFF (six cysteines in forty-five residues), is conserved throughout the class Clostridia, a distribution inconsistent with putative bacteriocin activity. A second novel system features a tandem pair of putative peptide-modifying radical SAM enzymes associated with a highly divergent family of peptides in which the only clearly conserved feature is a run of His-Xaa-Ser repeats. A third system pairs a radical SAM domain peptide maturase with selenocysteine-containing targets, suggesting a new biological role for selenium. These and several additional novel maturases that cooccur with predicted target peptides share a C-terminal additional 4Fe4S-binding domain with PqqE, the subtilosin A maturase AlbA, and the predicted mycofactocin and Nif11-class peptide maturases as well as with activators of anaerobic sulfatases and quinohemoprotein amine dehydrogenases. 
Radical SAM enzymes with this additional domain, as detected by TIGR04085, significantly outnumber lantibiotic synthases and cyclodehydratases combined in reference genomes while being highly enriched for members whose apparent targets are small peptides. Interpretation of comparative genomics evidence suggests unexpected (nonbacteriocin) roles for natural products from several of these systems.
Journal of Biosciences | 1998
Malay K. Ray; G. Seshu Kumar; Kamala L Janiyani; K. Kannan; Pratik Jagtap; Malay Kumar Basu; S. Shivaji
Exposure to temperature extremes causes stresses that are sometimes lethal to living cells. Microorganisms in nature, however, are extremely diverse, and some of them thrive in the freezing cold of Antarctica. Among the cold-adapted psychrotrophs and psychrophiles, the psychrotrophic bacteria are the predominant forms in continental Antarctica. Despite living in a permanently cold area, Antarctic bacteria exhibit, like mesophiles, a ‘cold-shock’ response, albeit at much lower temperatures, e.g., at 0–5°C. However, because of the permanently cold conditions and the long isolation of the continent, the microorganisms have acquired new adaptive features in their membranes, enzymes and macromolecular synthesis. These adaptive modifications have only recently come to light through the efforts of various laboratories around the world. However, far more is known about the adaptive response to low temperature in mesophilic bacteria than in Antarctic bacteria. Combined knowledge from the two systems is providing useful clues to the basic biology of organisms that grow at low temperature. This article provides an overview of this area of research, with special reference to the sensing of temperature and the regulation of gene expression at lower temperature.
Cell Cycle | 2005
Malay Kumar Basu; Eugene V. Koonin
Sulfiredoxin (Srx) is a sulfinic acid reductase, a recently identified eukaryotic enzyme, which is involved in the reduction of the hyperoxidized sulfinic acid form of the catalytic cysteine of 2-Cys peroxiredoxins (Prx). This reaction contributes to the oxidative stress response and H2O2-mediated signaling. We show that Srx has significant sequence and structural similarity to a functionally unrelated protein, ParB, a DNA-binding protein with a helix-turn-helix (HTH) domain which is involved in chromosome partitioning in bacteria. Sequence comparison and phylogenetic analysis of the Srx and ParB protein families suggest that Srx evolved via truncation of ParB, which removed the entire C-terminal half of the protein, including the HTH domain, and a substitution of cysteine for a glutamic acid in a highly conserved structural motif of ParB. The latter substitution apparently created the sulfinic acid reductase catalytic site. Evolution of a redox enzyme from a DNA-binding protein, with retention of highly significant sequence similarity, is unusual, even when compared to functional switches accompanying recruitment of other prokaryotic proteins for new functions in eukaryotes.
Trends in Genetics | 2008
Malay Kumar Basu; Igor B. Rogozin; Eugene V. Koonin
The two types of eukaryotic spliceosomal introns, U2 and U12, possess different splice signals and are excised by distinct spliceosomes. The nature of the primordial introns remains uncertain. A comparison of the amino acid distributions at insertion sites of introns that retained their positions throughout eukaryotic evolution with the distributions for human and Arabidopsis thaliana U2 and U12 introns reveals close similarity with U2 but not U12. Thus, the primordial spliceosomal introns were, most likely, U2-type.
Bioinformatics | 2001
Malay Kumar Basu
Summary: Sequence analysis using Web Resources (SeWeR) is an integrated, Dynamic HTML (DHTML) interface to commonly used bioinformatics services available on the World Wide Web. It is highly customizable, extendable, platform neutral, completely server-independent and can be hosted as a web page as well as being used as stand-alone software running within a web browser.