Supratim Mukherjee
Joint Genome Institute
Publications
Featured research published by Supratim Mukherjee.
Nucleic Acids Research | 2017
Supratim Mukherjee; Dimitri Stamatis; Jon Bertsch; Galina Ovchinnikova; Olena Verezemska; Michelle Isbandi; Alex D. Thomas; Rida Ali; Kaushal Sharma; Nikos C. Kyrpides; T. B. K. Reddy
The Genomes Online Database (GOLD) (https://gold.jgi.doe.gov) is a manually curated data management system that catalogs sequencing projects with associated metadata from around the world. In the current version of GOLD (v.6), all projects are organized based on a four-level classification system in the form of a Study, Organism (for isolates) or Biosample (for environmental samples), Sequencing Project and Analysis Project. Currently, GOLD provides information for 26 117 Studies, 239 100 Organisms, 15 887 Biosamples, 97 212 Sequencing Projects and 78 579 Analysis Projects. These are integrated with over 312 metadata fields, of which 58 are controlled vocabularies with 2067 terms. The web interface facilitates submission of a diverse range of Sequencing Projects (such as isolate genome, single-cell genome, metagenome, metatranscriptome) and complex Analysis Projects (such as genome from metagenome, or combined assembly from multiple Sequencing Projects). GOLD provides a seamless interface with the Integrated Microbial Genomes (IMG) system and supports and promotes the Genomic Standards Consortium (GSC) Minimum Information standards. This paper describes the data updates and additional features added during the last two years.
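The four-level organization described in this abstract can be pictured as a nested data structure. The following is only an illustrative sketch: the class names, field names, and the `count_levels` helper are hypothetical and are not taken from GOLD's actual schema or API. It also simplifies the hierarchy to a strict tree, whereas the abstract notes that an Analysis Project may combine several Sequencing Projects.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnalysisProject:
    # Level 4 (hypothetical fields): an analysis of sequence data.
    name: str


@dataclass
class SequencingProject:
    # Level 3: e.g. isolate genome, metagenome, metatranscriptome.
    name: str
    analysis_projects: List[AnalysisProject] = field(default_factory=list)


@dataclass
class Sample:
    # Level 2: an Organism (isolates) or a Biosample (environmental samples).
    name: str
    kind: str  # "organism" or "biosample"
    sequencing_projects: List[SequencingProject] = field(default_factory=list)


@dataclass
class Study:
    # Level 1: the top-level grouping.
    name: str
    samples: List[Sample] = field(default_factory=list)


def count_levels(study: Study) -> Dict[str, int]:
    """Count the records at each level beneath one Study."""
    seq_projects = [sp for s in study.samples for sp in s.sequencing_projects]
    return {
        "samples": len(study.samples),
        "sequencing_projects": len(seq_projects),
        "analysis_projects": sum(len(sp.analysis_projects) for sp in seq_projects),
    }
```

A real catalog would need a many-to-many link between Sequencing Projects and Analysis Projects to represent combined assemblies.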
Nature Biotechnology | 2017
Supratim Mukherjee; Rekha Seshadri; Neha Varghese; Emiley A. Eloe-Fadrosh; Jan P. Meier-Kolthoff; Markus Göker; R. Cameron Coates; Michalis Hadjithomas; Georgios A. Pavlopoulos; David Paez-Espino; Yasuo Yoshikuni; Axel Visel; William B. Whitman; George M Garrity; Jonathan A. Eisen; Philip Hugenholtz; Amrita Pati; Natalia Ivanova; Tanja Woyke; Hans-Peter Klenk; Nikos C. Kyrpides
We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.
Standards in Genomic Sciences | 2015
Supratim Mukherjee; Marcel Huntemann; Natalia Ivanova; Nikos C. Kyrpides; Amrita Pati
With the rapid growth and development of sequencing technologies, genomes have become the new go-to for exploring solutions to some of the world’s biggest challenges such as searching for alternative energy sources and exploration of genomic dark matter. However, progress in sequencing has been accompanied by its share of errors that can occur during template or library preparation, sequencing, imaging or data analysis. In this study we screened over 18,000 publicly available microbial isolate genome sequences in the Integrated Microbial Genomes database and identified more than 1000 genomes that are contaminated with PhiX, a control frequently used during Illumina sequencing runs. Approximately 10% of these genomes have been published in the literature and 129 contaminated genomes were sequenced under the Human Microbiome Project. Raw sequence reads are prone to contamination from various sources, and such contaminants are usually eliminated during downstream quality control steps. Detection of PhiX-contaminated genomes indicates a lapse in either the application or effectiveness of proper quality control measures. The presence of PhiX contamination in several publicly available isolate genomes can result in additional errors when such data are used in comparative genomics analyses. Such contamination of public databases has far-reaching consequences in the form of erroneous data interpretation and analyses, and necessitates better measures to proofread raw sequences before releasing them to the broader scientific community.
Frontiers in Microbiology | 2016
Richard L. Hahnke; Jan P. Meier-Kolthoff; Marina García-López; Supratim Mukherjee; Marcel Huntemann; Natalia Ivanova; Tanja Woyke; Nikos C. Kyrpides; Hans-Peter Klenk; Markus Göker
The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.
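As a reminder of what "G+C content directly calculated from genome sequences" means, here is a minimal sketch. It is not the authors' pipeline: the function name `gc_content` is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption made for this example.

```python
def gc_content(sequence: str) -> float:
    """Return G+C content as a percentage of the unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total
```

For a multi-contig genome, the counts would be summed over all contigs before dividing, rather than averaging per-contig percentages.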
Applied and Environmental Microbiology | 2015
Kyria Boundy-Mills; Matthias Hess; A. Rick Bennett; Matthew J. Ryan; Seogchan Kang; David R. Nobles; Jonathan A. Eisen; Patrik Inderbitzin; Irnayuli R. Sitepu; Tamas Torok; Daniel R. Brown; Juliana Cho; John E. Wertz; Supratim Mukherjee; Sherry L. Cady; Kevin McCluskey
The mission of the United States Culture Collection Network (USCCN; http://usccn.org) is “to facilitate the safe and responsible utilization of microbial resources for research, education, industry, medicine, and agriculture for the betterment of human kind.” Microbial culture collections are a key component of life science research, biotechnology, and emerging global biobased economies. Representatives and users of several microbial culture collections from the United States and Europe gathered at the University of California, Davis, to discuss how collections of microorganisms can better serve users and stakeholders and to showcase existing resources available in public culture collections.
Genome Announcements | 2017
Kirill K. Miroshnikov; Alena Didriksen; Daniil G. Naumoff; Marcel Huntemann; Alicia Clum; Manoj Pillay; Krishnaveni Palaniappan; Neha Varghese; Natalia Mikhailova; Supratim Mukherjee; T. B. K. Reddy; Chris Daum; Nicole Shapiro; Natalia Ivanova; Nikos C. Kyrpides; Tanja Woyke; Svetlana N. Dedysh; Mette M. Svenning
Methylocapsa palsarum NE2T is an aerobic, mildly acidophilic, obligate methanotroph. Similar to other Methylocapsa species, it possesses only a particulate methane monooxygenase and is capable of atmospheric nitrogen fixation. The genome sequence of this typical inhabitant of subarctic wetlands and soils also contains genes indicative of aerobic anoxygenic photosynthesis.
Genome Announcements | 2017
Dwi Susanti; Eric F. Johnson; Alla Lapidus; James Han; T. B. K. Reddy; Supratim Mukherjee; Manoj Pillay; Anna A. Perevalova; Natalia Ivanova; Tanja Woyke; Nikos C. Kyrpides; Biswarup Mukhopadhyay
Desulfurococcus amylolyticus Z-533T, a hyperthermophilic crenarchaeon, ferments peptide and starch, generating acetate, isobutyrate, isovalerate, CO2, and hydrogen. Unlike D. amylolyticus Z-1312, it cannot use cellulose and is inhibited by hydrogen. The reported draft genome sequence of D. amylolyticus Z-533T will help to understand the molecular basis for these differences.
Genome Announcements | 2017
Marlen C. Rice; Jeanette M. Norton; Lisa Y. Stein; Jessica A. Kozlowski; Annette Bollmann; Martin G. Klotz; Luis A. Sayavedra-Soto; Nicole Shapiro; Lynne Goodwin; Marcel Huntemann; Alicia Clum; Manoj Pillay; Neha Varghese; Natalia Mikhailova; Krishna Palaniappan; Natalia Ivanova; Supratim Mukherjee; T. B. K. Reddy; Chew Yee Ngan; Chris Daum; Nikos C. Kyrpides; Tanja Woyke
Nitrosomonas cryotolerans ATCC 49181 is a cold-tolerant marine ammonia-oxidizing bacterium isolated from seawater collected in the Gulf of Alaska. The high-quality complete genome contains a 2.87-Mbp chromosome and a 56.6-kbp plasmid. Chemolithoautotrophic modules encoding ammonia oxidation and CO2 fixation were identified.
Nucleic Acids Research | 2018
Supratim Mukherjee; Dimitri Stamatis; Jon Bertsch; Galina Ovchinnikova; Hema Y Katta; Alejandro Mojica; I-Min A. Chen; Nikos C. Kyrpides; T. B. K. Reddy
The Genomes Online Database (GOLD) (https://gold.jgi.doe.gov) is an open online resource, which maintains an up-to-date catalog of genome and metagenome projects in the context of a comprehensive list of associated metadata. Information in GOLD is organized into four levels: Study, Biosample/Organism, Sequencing Project and Analysis Project. Currently GOLD hosts information on 33 415 Studies, 49 826 Biosamples, 313 324 Organisms, 215 881 Sequencing Projects and 174 454 Analysis Projects with a total of 541 metadata fields, of which 80 are based on controlled vocabulary (CV) terms. GOLD provides a user-friendly web interface to browse sequencing projects and launch advanced search tools across four classification levels. Users submit metadata on a wide range of Sequencing and Analysis Projects in GOLD before depositing sequence data to the Integrated Microbial Genomes (IMG) system for analysis. GOLD conforms with and supports the rules set by the Genomic Standards Consortium (GSC) Minimum Information standards. The current version of GOLD (v.7) has seen the number of projects and associated metadata increase exponentially over the years. This paper provides an update on the current status of GOLD and highlights the new features added over the last two years.
Frontiers in Microbiology | 2018
Richard L. Hahnke; Jan P. Meier-Kolthoff; Marina García-López; Supratim Mukherjee; Marcel Huntemann; Natalia Ivanova; Tanja Woyke; Nikos C. Kyrpides; Hans-Peter Klenk; Markus Göker
This corrects the article on p. 2003 in vol. 7, PMID: 28066339.