Publication


Featured research published by Shibu Yooseph.


PLOS Biology | 2007

The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific

Douglas B. Rusch; Aaron L. Halpern; Granger Sutton; Karla B. Heidelberg; Shannon J. Williamson; Shibu Yooseph; Dongying Wu; Jonathan A. Eisen; Jeff Hoffman; Karin A. Remington; Karen Beeson; Bao Duc Tran; Hamilton O. Smith; Holly Baden-Tillson; Clare Stewart; Joyce Thorpe; Jason Freeman; Cynthia Andrews-Pfannkoch; Joseph E. Venter; Kelvin Li; Saul Kravitz; John F. Heidelberg; Terry Utterback; Yu-Hui Rogers; Luisa I. Falcón; Valeria Souza; Germán Bonilla-Rosso; Luis E. Eguiarte; David M. Karl; Shubha Sathyendranath

The world's oceans contain a complex mixture of micro-organisms that are, for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific, yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed “fragment recruitment,” addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed “extreme assembly,” made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content, some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference. We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS.


PLOS Biology | 2007

The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families

Shibu Yooseph; Granger Sutton; Douglas B. Rusch; Aaron L. Halpern; Shannon J. Williamson; Karin A. Remington; Jonathan A. Eisen; Karla B. Heidelberg; Gerard Manning; Weizhong Li; Lukasz Jaroszewski; Piotr Cieplak; Christopher S. Miller; Huiying Li; Susan T. Mashiyama; Marcin P Joachimiak; Christopher van Belle; John-Marc Chandonia; David A W Soergel; Yufeng Zhai; Kannan Natarajan; Shaun W. Lee; Benjamin J. Raphael; Vineet Bafna; Robert Friedman; Steven E. Brenner; Adam Godzik; David Eisenberg; Jack E. Dixon; Susan S. Taylor

Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.


The ISME Journal | 2012

Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage

Chris L. Dupont; Douglas B. Rusch; Shibu Yooseph; Mary-Jane Lombardo; R. Alexander Richter; Ruben E. Valas; Mark Novotny; Joyclyn Yee-Greenbaum; Jeremy D. Selengut; Daniel H. Haft; Aaron L. Halpern; Roger S. Lasken; Kenneth H. Nealson; Robert M. Friedman; J. Craig Venter

Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25–1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously-derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of tonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition.


PLOS ONE | 2008

The Sorcerer II Global Ocean Sampling Expedition: Metagenomic Characterization of Viruses within Aquatic Microbial Samples

Shannon J. Williamson; Douglas B. Rusch; Shibu Yooseph; Aaron L. Halpern; Karla B. Heidelberg; John I. Glass; Cynthia Andrews-Pfannkoch; Douglas W. Fadrosh; Christopher S. Miller; Granger Sutton; Marvin Frazier; J. Craig Venter

Viruses are the most abundant biological entities on our planet. Interactions between viruses and their hosts impact several important biological processes in the world's oceans such as horizontal gene transfer, microbial diversity and biogeochemical cycling. Interrogation of microbial metagenomic sequence data collected as part of the Sorcerer II Global Ocean Sampling Expedition (GOS) revealed a high abundance of viral sequences, representing approximately 3% of the total predicted proteins. Cluster analyses of the viral sequences revealed hundreds to thousands of viral genes encoding various metabolic and cellular functions. Quantitative analyses of viral genes of host origin performed on the viral fraction of aquatic samples confirmed the viral nature of these sequences and suggested that significant portions of aquatic viral communities behave as reservoirs of such genetic material. Distributional and phylogenetic analyses of these host-derived viral sequences also suggested that viral acquisition of environmentally relevant genes of host origin is a more abundant and widespread phenomenon than previously appreciated. The predominant viral sequences identified within microbial fractions originated from tailed bacteriophages and exhibited varying global distributions according to viral family. Recruitment of GOS viral sequence fragments against 27 complete aquatic viral genomes revealed that only one reference bacteriophage genome was highly abundant and was closely related, but not identical, to the cyanomyovirus P-SSM4. The co-distribution across all sampling sites of P-SSM4-like sequences with the dominant ecotype of its host, Prochlorococcus, supports the classification of the viral sequences as P-SSM4-like and suggests that this virus may influence the abundance, distribution and diversity of one of the most dominant components of picophytoplankton in oligotrophic oceans. In summary, the abundance and broad geographical distribution of viral sequences within microbial fractions, the prevalence of genes among viral sequences that encode microbial physiological function and their distinct phylogenetic distribution lend strong support to the notion that viral-mediated gene acquisition is a common and ongoing mechanism for generating microbial diversity in the marine environment.


Nature | 2010

Genomic and functional adaptation in surface ocean planktonic prokaryotes

Shibu Yooseph; Kenneth H. Nealson; Douglas B. Rusch; John P. McCrow; Christopher L. Dupont; Maria Kim; Justin Johnson; Robert Montgomery; Steve Ferriera; Karen Beeson; Shannon J. Williamson; Andrey Tovchigrechko; Andrew E. Allen; Lisa Zeigler; Granger Sutton; Eric Eisenstadt; Yu-Hui Rogers; Robert Friedman; Marvin Frazier; J. Craig Venter

The understanding of marine microbial ecology and metabolism has been hampered by the paucity of sequenced reference genomes. To this end, we report the sequencing of 137 diverse marine isolates collected from around the world. We analysed these sequences, along with previously published marine prokaryotic genomes, in the context of marine metagenomic data, to gain insights into the ecology of the surface ocean prokaryotic picoplankton (0.1–3.0 μm size range). The results suggest that the sequenced genomes define two microbial groups: one composed of only a few taxa that are nearly always abundant in picoplanktonic communities, and the other consisting of many microbial taxa that are rarely abundant. The genomic content of the second group suggests that these microbes are capable of slow growth and survival in energy-limited environments, and rapid growth in energy-rich environments. By contrast, the abundant and cosmopolitan picoplanktonic prokaryotes for which there is genomic representation have smaller genomes, are probably capable of only slow growth and seem to be relatively unable to sense or rapidly acclimate to energy-rich conditions. Their genomic features also lead us to propose that one method used to avoid predation by viruses and/or bacterivores is by means of slow growth and the maintenance of low biomass.


PLOS ONE | 2012

Evaluation of 16S rDNA-based community profiling for human microbiome research

Doyle V. Ward; Dirk Gevers; Georgia Giannoukos; Ashlee M. Earl; Barbara A. Methé; Erica Sodergren; Michael Feldgarden; Dawn Ciulla; Diana Tabbaa; Cesar Arze; Elizabeth L. Appelbaum; Leigh Aird; Scott Anderson; Tulin Ayvaz; Edward A. Belter; Monika Bihan; Toby Bloom; Jonathan Crabtree; Laura Courtney; Lynn K. Carmichael; David J. Dooling; Rachel L. Erlich; Candace N. Farmer; Lucinda Fulton; Robert S. Fulton; Hongyu Gao; John Gill; Brian J. Haas; Lisa Hemphill; Otis Hall

The Human Microbiome Project will establish a reference data set for analysis of the microbiome of healthy adults by surveying multiple body sites from 300 people and generating data from over 12,000 samples. To characterize these samples, the participating sequencing centers evaluated and adopted 16S rDNA community profiling protocols for ABI 3730 and 454 FLX Titanium sequencing. In the course of establishing protocols, we examined the performance and error characteristics of each technology, and the relationship of sequence error to the utility of 16S rDNA regions for classification- and OTU-based analysis of community structure. The data production protocols used for this work are those used by the participating centers to produce 16S rDNA sequence for the Human Microbiome Project. Thus, these results can be informative for interpreting the large body of clinical 16S rDNA data produced for this project.


Proceedings of the National Academy of Sciences of the United States of America | 2004

Whole-genome shotgun assembly and comparison of human genome assemblies

Sorin Istrail; Granger Sutton; Liliana Florea; Aaron L. Halpern; Clark M. Mobarry; Ross A. Lippert; Brian Walenz; Hagit Shatkay; Ian M. Dew; Jason R. Miller; Michael Flanigan; Nathan Edwards; Randall Bolanos; Daniel Fasulo; Bjarni V. Halldórsson; Sridhar Hannenhalli; Russell Turner; Shibu Yooseph; Fu Lu; Deborah Nusskern; Bixiong Shue; Xiangqun Holly Zheng; Fei Zhong; Arthur L. Delcher; Daniel H. Huson; Saul Kravitz; Laurent Mouchard; Knut Reinert; Karin A. Remington; Andrew G. Clark

We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 1304–1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860–921]. The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats.


Cell Metabolism | 2014

Diet and Feeding Pattern Affect the Diurnal Dynamics of the Gut Microbiome

Amir Zarrinpar; Amandine Chaix; Shibu Yooseph; Satchidananda Panda

The gut microbiome and daily feeding/fasting cycle influence host metabolism and contribute to obesity and metabolic diseases. However, fundamental characteristics of this relationship between the feeding/fasting cycle and the gut microbiome are unknown. Our studies show that the gut microbiome is highly dynamic, exhibiting daily cyclical fluctuations in composition. Diet-induced obesity dampens the daily feeding/fasting rhythm and diminishes many of these cyclical fluctuations. Time-restricted feeding (TRF), in which feeding is consolidated to the nocturnal phase, partially restores these cyclical fluctuations. Furthermore, TRF, which protects against obesity and metabolic diseases, affects bacteria shown to influence host metabolism. Cyclical changes in the gut microbiome from feeding/fasting rhythms contribute to the diversity of gut microflora and likely represent a mechanism by which the gut microbiome affects host metabolism. Thus, feeding pattern and time of harvest, in addition to diet, are important parameters when assessing the microbiome's contribution to host metabolism.


PLOS ONE | 2012

Analyses of the Microbial Diversity across the Human Microbiome

Kelvin Li; Monika Bihan; Shibu Yooseph; Barbara A. Methé

Analysis of human body microbial diversity is fundamental to understanding community structure, biology and ecology. The National Institutes of Health Human Microbiome Project (HMP) has provided an unprecedented opportunity to examine microbial diversity within and across body habitats and individuals through pyrosequencing-based profiling of 16S rRNA gene sequences (16S) from habitats of the oral, skin, distal gut, and vaginal body regions from over 200 healthy individuals, enabling the application of statistical techniques. In this study, two approaches were applied to elucidate the nature and extent of human microbiome diversity. First, bootstrap and parametric curve fitting techniques were evaluated to estimate the maximum number of unique taxa, Smax, and taxa discovery rate for habitats across individuals. Next, our results demonstrated that the variation of diversity within low abundant taxa across habitats and individuals was not sufficiently quantified with standard ecological diversity indices. This impact from low abundant taxa motivated us to introduce a novel rank-based diversity measure, the Tail statistic (“τ”), based on the standard deviation of the rank abundance curve if made symmetric by reflection around the most abundant taxon. Due to τ’s greater sensitivity to low abundant taxa, its application to diversity estimation of taxonomic units using taxonomic dependent and independent methods revealed a greater range of values recovered between individuals versus body habitats, and different patterns of diversity within habitats. The greatest range of τ values within and across individuals was found in stool, which also exhibited the most undiscovered taxa. Oral and skin habitats revealed variable diversity patterns, while vaginal habitats were consistently the least diverse. Collectively, these results demonstrate the importance, and motivate the introduction, of several visualization and analysis methods tuned specifically for next-generation sequence data, further revealing that low abundant taxa serve as an important reservoir of genetic diversity in the human microbiome.
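The abstract describes τ only in words, so the following is a minimal sketch of one reading of that description, not the authors' implementation. It assumes that taxa are ranked by decreasing abundance, that the most abundant taxon sits at offset 0, and that reflecting the curve around that point makes the mean 0, so the standard deviation reduces to the square root of the abundance-weighted squared rank offsets. The function name `tail_statistic` and the choice to drop zero-count taxa are illustrative assumptions.

```python
import math


def tail_statistic(counts):
    """Rank-based tail statistic for a list of per-taxon abundance counts.

    Assumed reading of the abstract: the rank abundance curve is mirrored
    around the most abundant taxon, which puts the mean at offset 0, and
    tau is the standard deviation of that mirrored curve.
    """
    # Drop absent taxa and rank the remainder from most to least abundant.
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    total = sum(ranked)
    if total == 0:
        raise ValueError("at least one taxon must have a positive count")
    # enumerate() yields offsets 0, 1, 2, ... so the top taxon contributes 0.
    variance = sum((c / total) * offset ** 2 for offset, c in enumerate(ranked))
    return math.sqrt(variance)


if __name__ == "__main__":
    even_pair = [50, 50]
    long_tail = [50] + [1] * 50  # same dominant taxon, many rare taxa
    print(tail_statistic(even_pair))
    print(tail_statistic(long_tail))
```

Under this reading, a community with a single taxon scores 0, and adding many rare taxa raises the value sharply because each rare taxon is weighted by the square of its rank offset. That matches the abstract's point that τ is more sensitive to low abundant taxa than standard diversity indices.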


Proceedings of the National Academy of Sciences of the United States of America | 2013

Candidate phylum TM6 genome recovered from a hospital sink biofilm provides genomic insights into this uncultivated phylum

Jeffrey S. McLean; Mary-Jane Lombardo; Jonathan H. Badger; Anna Edlund; Mark Novotny; Joyclyn Yee-Greenbaum; Nikolay Vyahhi; Adam P Hall; Youngik Yang; Christopher L. Dupont; Michael G. Ziegler; Hamidreza Chitsaz; Andrew E. Allen; Shibu Yooseph; Glenn Tesler; Pavel A. Pevzner; Robert Friedman; Kenneth H. Nealson; J. C. Venter; Roger S. Lasken

Significance: This research highlights the discovery and genome reconstruction of a member of the globally distributed yet uncultivated candidate phylum TM6 (designated TM6SC1). Apart from the 16S rRNA gene, no other genomic information is available for this cosmopolitan phylum. This report also introduces a mini-metagenomic approach based on the use of high-throughput single-cell genomics techniques and assembly tools that address a widely recognized issue: how to effectively capture and sequence the currently uncultivated bacterial species that make up the “dark matter of life.” Amplification and sequencing of random pools of 100 events enabled an estimated 90% recovery of the TM6SC1 genome.

The “dark matter of life” describes microbes and even entire divisions of bacterial phyla that have evaded cultivation and have yet to be sequenced. We present a genome from the globally distributed but elusive candidate phylum TM6 and uncover its metabolic potential. TM6 was detected in a biofilm from a sink drain within a hospital restroom by analyzing cells using a highly automated single-cell genomics platform. We developed an approach for increasing throughput and effectively improving the likelihood of sampling rare events based on forming small random pools of single-flow–sorted cells, amplifying their DNA by multiple displacement amplification and sequencing all cells in the pool, creating a “mini-metagenome.” A recently developed single-cell assembler, SPAdes, in combination with contig binning methods, allowed the reconstruction of genomes from these mini-metagenomes. A total of 1.07 Mb was recovered in seven contigs for this member of TM6 (JCVI TM6SC1), estimated to represent 90% of its genome. High nucleotide identity between a total of three TM6 genome drafts generated from pools that were independently captured, amplified, and assembled provided strong confirmation of a correct genomic sequence. TM6 is likely a Gram-negative organism and possibly a symbiont of an unknown host (nonfree living) in part based on its small genome, low-GC content, and lack of biosynthesis pathways for most amino acids and vitamins. Phylogenomic analysis of conserved single-copy genes confirms that TM6SC1 is a deeply branching phylum.
