Publication


Featured research published by Nathan Mih.


BMC Systems Biology | 2016

Systems biology of the structural proteome

Elizabeth Brunk; Nathan Mih; Jonathan M. Monk; Zhen Zhang; Edward J. O’Brien; Spencer Bliven; Ke Chen; Roger L. Chang; Philip E. Bourne; Bernhard O. Palsson

Background: The success of genome-scale models (GEMs) can be attributed to the high-quality, bottom-up reconstructions of metabolic, protein synthesis, and transcriptional regulatory networks on an organism-specific basis. Such reconstructions are biochemically, genetically, and genomically structured knowledge bases that can be converted into a mathematical format to enable a myriad of computational biological studies. In recent years, genome-scale reconstructions have been extended to include protein structural information, which has opened up new vistas in systems biology research and empowered applications in structural systems biology and systems pharmacology.
Results: Here, we present the generation, application, and dissemination of genome-scale models with protein structures (GEM-PRO) for Escherichia coli and Thermotoga maritima. We show the utility of integrating molecular scale analyses with systems biology approaches by discussing several comparative analyses on the temperature dependence of growth, the distribution of protein fold families, substrate specificity, and characteristic features of whole cell proteomes. Finally, to aid in the grand challenge of big data to knowledge, we provide several explicit tutorials of how protein-related information can be linked to genome-scale models in a public GitHub repository (https://github.com/SBRG/GEMPro/tree/master/GEMPro_recon/).
Conclusions: Translating genome-scale, protein-related information to structured data in the format of a GEM provides a direct mapping of gene to gene-product to protein structure to biochemical reaction to network states to phenotypic function. Integration of molecular-level details of individual proteins, such as their physical, chemical, and structural properties, further expands the description of biochemical network-level properties, and can ultimately influence how to model and predict whole cell phenotypes as well as perform comparative systems biology approaches to study differences between organisms. GEM-PRO offers insight into the physical embodiment of an organism’s genotype, and its use in this comparative framework enables exploration of adaptive strategies for these organisms, opening the door to many new lines of research. With these provided tools, tutorials, and background, the reader will be in a position to run GEM-PRO for their own purposes.
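
The gene to gene-product to protein structure to reaction mapping described in this abstract can be pictured as a small data structure. The following is only a minimal sketch, not the GEM-PRO implementation: the `GeneRecord` class, the `reactions_with_structure` helper, and every identifier in the toy data are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GeneRecord:
    """One gene's entry in a toy GEM-PRO-style mapping (all IDs are hypothetical)."""
    gene_id: str
    protein_id: str
    structure_ids: list = field(default_factory=list)  # mapped protein structures
    reaction_ids: list = field(default_factory=list)   # reactions the gene product catalyzes


def reactions_with_structure(records):
    """Return sorted reaction IDs catalyzed by at least one gene that has a mapped structure."""
    covered = set()
    for rec in records:
        if rec.structure_ids:
            covered.update(rec.reaction_ids)
    return sorted(covered)


# Placeholder data only; these are not real genes, structures, or reactions.
records = [
    GeneRecord("gene_A", "protein_A", ["structure_1"], ["RXN_1", "RXN_2"]),
    GeneRecord("gene_B", "protein_B", [], ["RXN_3"]),
    GeneRecord("gene_C", "protein_C", ["structure_2", "structure_3"], ["RXN_2"]),
]

print(reactions_with_structure(records))
```

Keeping the structure list on the gene record, rather than on the reaction, mirrors the direction of the mapping stated in the abstract: structural coverage of a reaction is derived by walking from genes to their reactions.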


Nature Biotechnology | 2017

iML1515, a knowledgebase that computes Escherichia coli traits

Jonathan M. Monk; Colton J. Lloyd; Elizabeth Brunk; Nathan Mih; Anand Sastry; Zachary A. King; Rikiya Takeuchi; Wataru Nomura; Zhen Zhang; Hirotada Mori; Adam M. Feist; Bernhard O. Palsson

To the Editor: Extracting knowledge from the many types of big data produced by high-throughput methods remains a challenge, even when data are from Escherichia coli, the best characterized bacterial species. Here, we present iML1515, the most complete genome-scale reconstruction of the metabolic network in E. coli K-12 MG1655 to date, and we demonstrate how it can be used to address this challenge. Enabling analysis of several data types, including transcriptomes, proteomes, and metabolomes, iML1515 accounts for 1,515 open reading frames and 2,719 metabolic reactions involving 1,192 unique metabolites. The iML1515 knowledgebase is linked to 1,515 protein structures to provide an integrated modeling framework bridging systems and structural biology. We apply iML1515 to build metabolic models of E. coli human gut microbiome strains from metagenomic sequencing data. We then use iML1515 to build metabolic models for E. coli clinical isolates and predict their metabolic capabilities. Finally, we use iML1515 to carry out a comparative structural proteome analysis of 1,122 E. coli strains and identify multi-strain sequence variations.


Proceedings of the National Academy of Sciences of the United States of America | 2016

Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis

Jared T. Broddrick; Benjamin E. Rubin; David G. Welkie; Niu Du; Nathan Mih; Spencer Diamond; Jenny J. Lee; Susan S. Golden; Bernhard O. Palsson

Significance: Genome-scale models of metabolism are important tools for metabolic engineering and production strain development. We present an experimentally validated and manually curated model of metabolism in Synechococcus elongatus PCC 7942 that (i) leads to discovery of unique metabolic characteristics, such as the importance of a truncated, linear TCA pathway, (ii) highlights poorly understood areas of metabolism as exemplified by knowledge gaps in nucleotide salvage, and (iii) accurately quantifies light input and self-shading. We now have a metabolic model that can be used as a basis for metabolic design in S. elongatus.
Abstract: The model cyanobacterium, Synechococcus elongatus PCC 7942, is a genetically tractable obligate phototroph that is being developed for the bioproduction of high-value chemicals. Genome-scale models (GEMs) have been successfully used to assess and engineer cellular metabolism; however, GEMs of phototrophic metabolism have been limited by the lack of experimental datasets for model validation and the challenges of incorporating photon uptake. Here, we develop a GEM of metabolism in S. elongatus using random barcode transposon site sequencing (RB-TnSeq) essential gene and physiological data specific to photoautotrophic metabolism. The model explicitly describes photon absorption and accounts for shading, resulting in the characteristic linear growth curve of photoautotrophs. GEM predictions of gene essentiality were compared with data obtained from recent dense-transposon mutagenesis experiments. This dataset allowed major improvements to the accuracy of the model. Furthermore, discrepancies between GEM predictions and the in vivo dataset revealed biological characteristics, such as the importance of a truncated, linear TCA pathway, low flux toward amino acid synthesis from photorespiration, and knowledge gaps within nucleotide metabolism. Coupling of strong experimental support and photoautotrophic modeling methods thus resulted in a highly accurate model of S. elongatus metabolism that highlights previously unknown areas of S. elongatus biology.


Nature Biotechnology | 2018

Recon3D enables a three-dimensional view of gene variation in human metabolism

Elizabeth Brunk; Swagatika Sahoo; Daniel C. Zielinski; Ali Altunkaya; Andreas Dräger; Nathan Mih; Francesco Gatto; Avlant Nilsson; German Preciat Gonzalez; Maike Kathrin Aurich; Andreas Prlić; Anand Sastry; Anna Dröfn Daníelsdóttir; Almut Katrin Heinken; Alberto Noronha; Peter W. Rose; Stephen K. Burley; Ronan M. T. Fleming; Jens Nielsen; Ines Thiele; Bernhard O. Palsson

Genome-scale network reconstructions have helped uncover the molecular basis of metabolism. Here we present Recon3D, a computational resource that includes three-dimensional (3D) metabolite and protein structure data and enables integrated analyses of metabolic functions in humans. We use Recon3D to functionally characterize mutations associated with disease, and identify metabolic response signatures that are caused by exposure to certain drugs. Recon3D represents the most comprehensive human metabolic network model to date, accounting for 3,288 open reading frames (representing 17% of functionally annotated human genes), 13,543 metabolic reactions involving 4,140 unique metabolites, and 12,890 protein structures. These data provide a unique resource for investigating molecular mechanisms of human metabolism. Recon3D is available at http://vmh.life.


Proceedings of the National Academy of Sciences of the United States of America | 2017

Thermosensitivity of growth is determined by chaperone-mediated proteome reallocation

Ke Chen; Ye Gao; Nathan Mih; Edward J. O’Brien; Laurence Yang; Bernhard O. Palsson

Significance: How do bacteria adapt to the diverse thermal niches on earth? Evidence accumulates in the protein sequence and structural determinants of thermosensitivity and mechanisms by which molecular chaperones aid protein folding. However, a comprehensive understanding of how thermoadaptation is achieved at the systems level is still missing. Here we reconstruct an integrated genome-scale protein-folding network for Escherichia coli, termed FoldME, that couples both contributing factors to the metabolic state of a cell. FoldME simulations reproduce the asymmetrical bacterial temperature response and delineate the multiscale strategies cells use to resist unfolding stresses induced by high temperature and destabilizing mutations in a single gene. The results highlight how global proteome allocation regulates thermoadaptation through balance between chaperones for folding and translational machinery for biosynthesis.
Abstract: Maintenance of a properly folded proteome is critical for bacterial survival at notably different growth temperatures. Understanding the molecular basis of thermoadaptation has progressed in two main directions, the sequence and structural basis of protein thermostability and the mechanistic principles of protein quality control assisted by chaperones. Yet we do not fully understand how structural integrity of the entire proteome is maintained under stress and how it affects cellular fitness. To address this challenge, we reconstruct a genome-scale protein-folding network for Escherichia coli and formulate a computational model, FoldME, that provides statistical descriptions of multiscale cellular response consistent with many datasets. FoldME simulations show (i) that the chaperones act as a system when they respond to unfolding stress rather than achieving efficient folding of any single component of the proteome, (ii) how the proteome is globally balanced between chaperones for folding and the complex machinery synthesizing the proteins in response to perturbation, (iii) how this balancing determines growth rate dependence on temperature and is achieved through nonspecific regulation, and (iv) how thermal instability of the individual protein affects the overall functional state of the proteome. Overall, these results expand our view of cellular regulation, from targeted specific control mechanisms to global regulation through a web of nonspecific competing interactions that modulate the optimal reallocation of cellular resources. The methodology developed in this study enables genome-scale integration of environment-dependent protein properties and a proteome-wide study of cellular stress responses.


Proceedings of the National Academy of Sciences of the United States of America | 2017

Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities

Xin Fang; Anand Sastry; Nathan Mih; Donghyuk Kim; Justin Tan; James T. Yurkovich; Colton J. Lloyd; Ye Gao; Laurence Yang; Bernhard O. Palsson

Significance: While the transcriptional regulatory network (TRN) of Escherichia coli has expanded considerably in recent years through new chromatin immunoprecipitation (ChIP) data, an open question remains: Does the global TRN, reconstructed by combining ChIP data for individual transcription factors, consistently explain observed differential gene expression? We have reconstructed a high-confidence TRN, determined its consistency with transcriptomics and predictive capabilities across multiple conditions, extracted 10 functional regulatory modules, and characterized this network at the sequence and structural levels. Our multiomics algorithmic pipeline is expected to facilitate rigorous validation and prioritization of experiments to elucidate TRNs in other bacteria.
Abstract: Transcriptional regulatory networks (TRNs) have been studied intensely for more than 25 years. Yet, even for the Escherichia coli TRN, probably the best characterized TRN, several questions remain. Here, we address three questions: (i) How complete is our knowledge of the E. coli TRN; (ii) how well can we predict gene expression using this TRN; and (iii) how robust is our understanding of the TRN? First, we reconstructed a high-confidence TRN (hiTRN) consisting of 147 transcription factors (TFs) regulating 1,538 transcription units (TUs) encoding 1,764 genes. The 3,797 high-confidence regulatory interactions were collected from published, validated chromatin immunoprecipitation (ChIP) data and RegulonDB. For 21 different TF knockouts, up to 63% of the differentially expressed genes in the hiTRN were traced to the knocked-out TF through regulatory cascades. Second, we trained supervised machine learning algorithms to predict the expression of 1,364 TUs given TF activities using 441 samples. The algorithms accurately predicted condition-specific expression for 86% (1,174 of 1,364) of the TUs, while 193 TUs (14%) were predicted better than random TRNs. Third, we identified 10 regulatory modules whose definitions were robust against changes to the TRN or expression compendium. Using surrogate variable analysis, we also identified three unmodeled factors that systematically influenced gene expression. Our computational workflow comprehensively characterizes the predictive capabilities and systems-level functions of an organism’s TRN from disparate data types.
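
Tracing differentially expressed genes back to a knocked-out TF "through regulatory cascades" amounts to asking which genes are reachable from that TF in a directed regulatory graph. The sketch below shows one plausible way to compute that with a breadth-first search. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the toy network, and the gene list are all hypothetical.

```python
from collections import deque


def downstream_targets(network, tf):
    """Breadth-first search: every node reachable from `tf` along regulatory edges."""
    seen = set()
    queue = deque([tf])
    while queue:
        node = queue.popleft()
        for target in network.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def fraction_traced(network, knocked_out_tf, de_genes):
    """Fraction of differentially expressed genes reachable from the knocked-out TF."""
    de_genes = set(de_genes)
    if not de_genes:
        return 0.0
    reachable = downstream_targets(network, knocked_out_tf)
    return len(de_genes & reachable) / len(de_genes)


# Hypothetical toy network: regulator -> list of direct targets.
toy_network = {
    "TF1": ["TF2", "gene_a"],
    "TF2": ["gene_b", "gene_c"],
    "TF3": ["gene_d"],
}
# Hypothetical differentially expressed genes after knocking out TF1.
toy_de_genes = ["gene_a", "gene_b", "gene_d", "gene_e"]

print(fraction_traced(toy_network, "TF1", toy_de_genes))
```

In this toy case `gene_a` is a direct target of `TF1` and `gene_b` is reached through the intermediate regulator `TF2`, while `gene_d` and `gene_e` are not downstream of `TF1`, so half of the listed genes are traced.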


PLOS Computational Biology | 2016

A Multi-scale Computational Platform to Mechanistically Assess the Effect of Genetic Variation on Drug Responses in Human Erythrocyte Metabolism

Nathan Mih; Elizabeth Brunk; Aarash Bordbar; Bernhard O. Palsson

Progress in systems medicine brings promise to addressing patient heterogeneity and individualized therapies. Recently, genome-scale models of metabolism have been shown to provide insight into the mechanistic link between drug therapies and systems-level off-target effects while being expanded to explicitly include the three-dimensional structure of proteins. The integration of these molecular-level details, such as the physical, structural, and dynamical properties of proteins, notably expands the computational description of biochemical network-level properties and the possibility of understanding and predicting whole cell phenotypes. In this study, we present a multi-scale modeling framework that describes biological processes which range in scale from atomistic details to an entire metabolic network. Using this approach, we can understand how genetic variation, which impacts the structure and reactivity of a protein, influences both native and drug-induced metabolic states. As a proof-of-concept, we study three enzymes (catechol-O-methyltransferase, glucose-6-phosphate dehydrogenase, and glyceraldehyde-3-phosphate dehydrogenase) and their respective genetic variants which have clinically relevant associations. Using all-atom molecular dynamics simulations enables the sampling of long timescale conformational dynamics of the proteins (and their mutant variants) in complex with their respective native metabolites or drug molecules. We find that changes in a protein’s structure due to a mutation influence protein binding affinity to metabolites and/or drug molecules, and induce large-scale changes in metabolism.


Bioinformatics | 2018

ssbio: a Python framework for structural systems biology

Nathan Mih; Elizabeth Brunk; Ke Chen; Edward Catoiu; Anand Sastry; Erol S. Kavvas; Jonathan M. Monk; Zhen Zhang; Bernhard O. Palsson

Summary: Working with protein structures at the genome‐scale has been challenging in a variety of ways. Here, we present ssbio, a Python package that provides a framework to easily work with structural information in the context of genome‐scale network reconstructions, which can contain thousands of individual proteins. The ssbio package provides an automated pipeline to construct high quality genome‐scale models with protein structures (GEM‐PROs), wrappers to popular third‐party programs to compute associated protein properties, and methods to visualize and annotate structures directly in Jupyter notebooks, thus lowering the barrier of linking 3D structural data with established systems workflows. Availability and implementation: ssbio is implemented in Python and available to download under the MIT license at http://github.com/SBRG/ssbio. Documentation and Jupyter notebook tutorials are available at http://ssbio.readthedocs.io/en/latest/. Interactive notebooks can be launched using Binder at https://mybinder.org/v2/gh/SBRG/ssbio/master?filepath=Binder.ipynb. Supplementary information: Supplementary data are available at Bioinformatics online.


bioRxiv | 2017

Multi-scale model of the proteomic and metabolic consequences of reactive oxygen species

Laurence Yang; Nathan Mih; James T. Yurkovich; Joon Ho Park; Sangwoo Seo; Donghyuk Kim; Jonathan M. Monk; Colton J. Lloyd; Justin Tan; Ye Gao; Jared T. Broddrick; Ke Chen; David Heckmann; Adam M. Feist; Bernhard O. Palsson

All aerobically growing microbes must deal with oxidative stress from intrinsically-generated reactive oxygen species (ROS), or from external ROS in the context of infection. To study the systems biology of microbial ROS response, we developed a genome-scale model of proteome damage and maintenance in response to ROS, by extending a genome-scale metabolism and macro-molecular expression (ME) model of E. coli. This OxidizeME model recapitulated measured microbial oxidative stress response including metalloenzyme inactivation by ROS and amino acid auxotrophies. OxidizeME also correctly predicted differential expression under ROS stress. We used OxidizeME to investigate how environmental context affects the flexibility of ROS stress response. The context-dependency of microbial stress response has important implications for infectious disease. OxidizeME provides a computational resource for model-driven experiment design in this direction.
Catalysis using iron-sulfur clusters and transition metals can be traced back to the last universal common ancestor. The damage to metalloproteins caused by reactive oxygen species (ROS) can completely inhibit cell growth when unmanaged and thus elicits an essential stress response that is universal and fundamental in biology. We develop a computable multi-scale description of the ROS stress response in Escherichia coli. We show that this quantitative framework allows for the understanding and prediction of ROS stress responses at three levels: 1) pathways: amino acid auxotrophies, 2) networks: the systemic response to ROS stress, and 3) genetic basis: adaptation to ROS stress during laboratory evolution. These results show that we can now develop fundamental and quantitative genotype-phenotype relationships for stress responses on a genome-wide basis.


Nucleic Acids Research | 2018

Systematic discovery of uncharacterized transcription factors in Escherichia coli K-12 MG1655

Ye Gao; James T. Yurkovich; Sang Woo Seo; Ilyas Kabimoldayev; Andreas Dräger; Ke Chen; Anand Sastry; Xin Fang; Nathan Mih; Laurence Yang; Johannes Eichner; Byung-Kwan Cho; Donghyuk Kim; Bernhard O. Palsson

Transcriptional regulation enables cells to respond to environmental changes. Of the estimated 304 candidate transcription factors (TFs) in Escherichia coli K-12 MG1655, 185 have been experimentally identified, but ChIP methods have been used to fully characterize only a few dozen. Identifying these remaining TFs is key to improving our knowledge of the E. coli transcriptional regulatory network (TRN). Here, we developed an integrated workflow for the computational prediction and comprehensive experimental validation of TFs using a suite of genome-wide experiments. We applied this workflow to (i) identify 16 candidate TFs from over a hundred uncharacterized genes; (ii) capture a total of 255 DNA binding peaks for ten candidate TFs resulting in six high-confidence binding motifs; (iii) reconstruct the regulons of these ten TFs by determining gene expression changes upon deletion of each TF and (iv) identify the regulatory roles of three TFs (YiaJ, YdcI, and YeiE) as regulators of L-ascorbate utilization, proton transfer and acetate metabolism, and iron homeostasis under iron-limited conditions, respectively. Together, these results demonstrate how this workflow can be used to discover, characterize, and elucidate regulatory functions of uncharacterized TFs in parallel.

Collaboration


Top Co-Authors

Anand Sastry, University of California
Ke Chen, University of California
Laurence Yang, University of California
Erol S. Kavvas, University of California
Ye Gao, University of California