Publication


Featured research published by Arshan Nasir.


BMC Evolutionary Biology | 2012

Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup along with superkingdoms Archaea, Bacteria and Eukarya

Arshan Nasir; Kyung Mo Kim; Gustavo Caetano-Anollés

Background: The discovery of giant viruses with genome and physical size comparable to cellular organisms, remnants of protein translation machinery and virus-specific parasites (virophages) has raised intriguing questions about their origin. Evidence advocates for their inclusion into global phylogenomic studies and their consideration as a distinct and ancient form of life. Results: Here we reconstruct phylogenies describing the evolution of proteomes and protein domain structures of cellular organisms and double-stranded DNA viruses with medium-to-very-large proteomes (giant viruses). Trees of proteomes define viruses as a ‘fourth supergroup’ along with superkingdoms Archaea, Bacteria, and Eukarya. Trees of domains indicate they have evolved via massive and primordial reductive evolutionary processes. The distribution of domain structures suggests giant viruses harbor a significant number of protein domains including those with no cellular representation. The genomic and structural diversity embedded in the viral proteomes is comparable to the cellular proteomes of organisms with parasitic lifestyles. Since viral domains are widespread among cellular species, we propose that viruses mediate gene transfer between cells and crucially enhance biodiversity. Conclusions: Results call for a change in the way viruses are perceived. They likely represent a distinct form of life that either predated or coexisted with the last universal common ancestor (LUCA) and constitute a very crucial part of our planet’s biosphere.


Science Advances | 2015

A phylogenomic data-driven exploration of viral origins and evolution

Arshan Nasir; Gustavo Caetano-Anollés

A study of the evolution of the proteomic makeup of cells and viruses using protein structural and functional data. The origin of viruses remains mysterious because of their diverse and patchy molecular and functional makeup. Although numerous hypotheses have attempted to explain viral origins, none is backed by substantive data. We take full advantage of the wealth of available protein structural and functional data to explore the evolution of the proteomic makeup of thousands of cells and viruses. Despite the extremely reduced nature of viral proteomes, we established an ancient origin of the “viral supergroup” and the existence of widespread episodes of horizontal transfer of genetic information. Viruses harboring different replicon types and infecting distantly related hosts shared many metabolic and informational protein structural domains of ancient origin that were also widespread in cellular proteomes. Phylogenomic analysis uncovered a universal tree of life and revealed that modern viruses reduced from multiple ancient cells that harbored segmented RNA genomes and coexisted with the ancestors of modern cells. The model for the origin and evolution of viruses and cells is backed by strong genomic and structural evidence and can be reconciled with existing models of viral evolution if one considers viruses to have originated from ancient cells and not from modern counterparts.


PLOS Computational Biology | 2014

Global patterns of protein domain gain and loss in superkingdoms.

Arshan Nasir; Kyung Mo Kim; Gustavo Caetano-Anollés

Domains are modules within proteins that can fold and function independently and are evolutionarily conserved. Here we compared the usage and distribution of protein domain families in the free-living proteomes of Archaea, Bacteria and Eukarya and reconstructed species phylogenies while tracing the history of domain emergence and loss in proteomes. We show that both gains and losses of domains occurred frequently during proteome evolution. The rate of domain discovery increased approximately linearly in evolutionary time. Remarkably, gains generally outnumbered losses and the gain-to-loss ratios were much higher in akaryotes compared to eukaryotes. Functional annotations of domain families revealed that both Archaea and Bacteria gained and lost metabolic capabilities during the course of evolution while Eukarya acquired a number of diverse molecular functions including those involved in extracellular processes, immunological mechanisms, and cell regulation. Results also highlighted significant contemporary sharing of informational enzymes between Archaea and Eukarya and metabolic enzymes between Bacteria and Eukarya. Finally, the analysis provided useful insights into the evolution of species. The archaeal superkingdom appeared first in evolution by gradual loss of ancestral domains, bacterial lineages were the first to gain superkingdom-specific domains, and eukaryotes (likely) originated when an expanding proto-eukaryotic stem lineage gained organelles through endosymbiosis of already diversified bacterial lineages. The evolutionary dynamics of domain families in proteomes and the increasing number of domain gains are predicted to redefine the persistence strategies of organisms in superkingdoms, influence the makeup of molecular functions, and enhance organismal complexity by the generation of new domain architectures. These dynamics highlight ongoing secondary evolutionary adaptations in akaryotic microbes, especially Archaea.


PLOS Genetics | 2017

Lokiarchaea are close relatives of Euryarchaeota, not bridging the gap between prokaryotes and eukaryotes

Violette Da Cunha; Morgan Gaia; Danièle Gadelle; Arshan Nasir; Patrick Forterre

The eocyte hypothesis, in which Eukarya emerged from within Archaea, has been boosted by the description of a new candidate archaeal phylum, “Lokiarchaeota”, from metagenomic data. Eukarya branch within Lokiarchaeota in a tree reconstructed from the concatenation of 36 universal proteins. However, individual phylogenies revealed that lokiarchaeal protein sequences have different evolutionary histories. The individual marker phylogenies revealed at least two subsets of proteins, either supporting the Woese or the Eocyte tree of life. Strikingly, removal of a single protein, the elongation factor EF2, is sufficient to break the Eukaryotes-Lokiarchaea affiliation. Our analysis suggests that the three lokiarchaeal EF2 proteins have a chimeric organization that could be due to contamination and/or homologous recombination with patches of eukaryotic sequences. A robust phylogenetic analysis of RNA polymerases with a new dataset indicates that Lokiarchaeota and related phyla of the Asgard superphylum are a sister group to Euryarchaeota, not to Eukarya, and supports the monophyly of Archaea with their rooting in the branch leading to Thaumarchaeota.


Frontiers in Genetics | 2012

Benefits of using molecular structure and abundance in phylogenomic analysis.

Gustavo Caetano-Anollés; Arshan Nasir

Molecular structure is eminently modular and expresses complexity at different levels of molecular organization (Caetano-Anolles et al., 2009). At high levels, evolutionary change occurs at an extraordinarily slow pace. A new protein fold can take millions of years to materialize in sequence space while new sequences develop in less than microseconds. Structural cores are generally orders of magnitude more conserved than sequences. Consequently, they carry durable phylogenetic information useful for deep exploration of biological history. Unfortunately, the complexities of structural alignments, in which similarities of two sets of atoms with unknown correspondences are sought with no restriction on the correspondences, make global phylogenetic analysis of structure an enormous bioinformational challenge (Taylor, 2007). In recent years, however, a shift of focus from molecules to molecular repertoires, advances in bioinformatics implementations, and an expanded census of structure and function provided new avenues of evolutionary exploration. Developments include: (i) the almost complete experimental acquisition of protein fold structures (∼1,200 out of 1,500 expected; Levitt, 2009) and wide coverage of the modern RNA world (Leontis et al., 2006); (ii) functional ontologies with the potential to unify biological knowledge [e.g., gene ontology (GO); Ashburner et al., 2000]; (iii) widespread and robust assignment of known structures to genomic sequences (Chothia and Gough, 2009); and (iv) the development of phylogenomic methods that embed structure and function directly into phylogenetic analysis (Caetano-Anolles et al., 2009). Genomic abundances derived from structural and functional censuses have been used to build trees of proteomes (ToPs; Gerstein, 1998), trees of domains (ToDs; Caetano-Anolles and Caetano-Anolles, 2003), and trees of functions (ToFs; Kim and Caetano-Anolles, 2010).
While the branches of ToPs encase proteomic history and resemble traditional “trees of species” built by systematic biologists, ToDs describe how components of the system (domains in proteomes) change as the entire system evolves. These rooted phylogenomic trees establish an “evolutionary arrow,” without resorting to outgroup hypotheses, defining a chronology of architectural innovation (Figure 1A). Trees are not phenetic statements. While they are built from multistate or quantitative valued characters, speciation in trees fulfills a molecular clock that is compatible with paleobiology and the geological record (Wang et al., 2011). In sharp contrast to standard phylogenetic methods that generate trees of genes and genomes (ToGs) from the occurrence of genomic features (e.g., nucleotides or amino acid residues in sequence sites, presence/absence of a gene), ToDs and ToPs reap the benefit of processes occurring at higher and more conserved levels of the structural hierarchy that are responsible for the accumulation of modules in biology (Caetano-Anolles et al., 2010; Mittenthal et al., 2012). The systematic study of “abundance” of molecular parts rather than their “occurrence” offers several advantages over ToGs and standard phylogenetic analysis of sequence that we here highlight:


Genes | 2011

Annotation of Protein Domains Reveals Remarkable Conservation in the Functional Make up of Proteomes Across Superkingdoms

Arshan Nasir; Aisha Naeem; M.J. Khan; Horacio D. Lopez Nicora; Gustavo Caetano-Anollés

The functional repertoire of a cell is largely embodied in its proteome, the collection of proteins encoded in the genome of an organism. The molecular functions of proteins are the direct consequence of their structure, and structure can be inferred from sequence using hidden Markov models of structural recognition. Here we analyze the functional annotation of protein domain structures in almost a thousand sequenced genomes, exploring the functional and structural diversity of proteomes. We find there is a remarkable conservation in the distribution of domains with respect to the molecular functions they perform in the three superkingdoms of life. In general, most of the protein repertoire is spent in functions related to metabolic processes but there are significant differences in the usage of domains for regulatory and extra-cellular processes both within and between superkingdoms. Our results support the hypotheses that the proteomes of superkingdom Eukarya evolved via genome expansion mechanisms that were directed towards innovating new domain architectures for regulatory and extra/intracellular process functions needed, for example, to maintain the integrity of multicellular structure or to interact with environmental biotic and abiotic factors (e.g., cell signaling and adhesion, immune responses, and toxin production). Proteomes of microbial superkingdoms Archaea and Bacteria retained fewer numbers of domains and maintained simple and smaller protein repertoires. Viruses appear to play an important role in the evolution of superkingdoms. We finally identify a few genomic outliers that deviate significantly from the conserved functional design. These include Nanoarchaeum equitans, proteobacterial symbionts of insects with extremely reduced genomes, Tenericutes and Guillardia theta.
These organisms spend most of their domains on information functions, including translation and transcription, rather than on metabolism and harbor a domain repertoire characteristic of parasitic organisms. In contrast, the functional repertoire of the proteomes of the Planctomycetes-Verrucomicrobia-Chlamydiae superphylum was no different than the rest of bacteria, failing to support claims of them representing a separate superkingdom. In turn, Protista and Bacteria shared similar functional distribution patterns suggesting an ancestral evolutionary link between these groups.


Frontiers in Microbiology | 2014

The distribution and impact of viral lineages in domains of life.

Arshan Nasir; Patrick Forterre; Kyung Mo Kim; Gustavo Caetano-Anollés

Living organisms can be conveniently classified into three domains, Archaea, Bacteria, and Eukarya (Woese et al., 1990). The three domains are united by several features that support the common origin of life including the presence of ribosomes, double-stranded DNA genomes, a nearly universal genetic code, physical compartments (i.e., membranes), and the ability to carry out metabolism and oxidation-reduction reactions. In comparison, other types of genetic material and particles (e.g., viruses, plasmids, and other selfish genetic elements) are often excluded from the definition of “life” (for opposing views see Raoult and Forterre, 2008; Forterre, 2011, 2012a). However, they can still influence the evolution of cellular organisms, and in conjunction, establish complex life cycles.


Archaea | 2014

Archaea: The First Domain of Diversified Life

Gustavo Caetano-Anollés; Arshan Nasir; Kaiyue Zhou; Derek Caetano-Anollés; Jay E. Mittenthal; Feng Jie Sun; Kyung Mo Kim

The study of the origin of diversified life has been plagued by technical and conceptual difficulties, controversy, and apriorism. It is now popularly accepted that the universal tree of life is rooted in the akaryotes and that Archaea and Eukarya are sister groups to each other. However, evolutionary studies have overwhelmingly focused on nucleic acid and protein sequences, which partially fulfill only two of the three main steps of phylogenetic analysis: formulation of realistic evolutionary models and optimization of tree reconstruction. In the absence of character polarization, that is, the ability to identify ancestral and derived character states, any statement about the rooting of the tree of life should be considered suspect. Here we show that macromolecular structure and a new phylogenetic framework of analysis that focuses on the parts of biological systems instead of the whole provide both deep and reliable phylogenetic signal and enable us to put forth hypotheses of origin. We review over a decade of phylogenomic studies, which mine information in a genomic census of millions of encoded proteins and RNAs. We show how the use of process models of molecular accumulation that comply with Weston's generality criterion supports a consistent phylogenomic scenario in which the origin of diversified life can be traced back to the early history of Archaea.


Frontiers in Genetics | 2013

A General Framework of Persistence Strategies for Biological Systems Helps Explain Domains of Life

Liudmila S. Yafremava; Monica Wielgos; Suravi Thomas; Arshan Nasir; Minglei Wang; Jay E. Mittenthal; Gustavo Caetano-Anollés

The nature and cause of the division of organisms in superkingdoms is not fully understood. Assuming that environment shapes physiology, here we construct a novel theoretical framework that helps identify general patterns of organism persistence. This framework is based on Jacob von Uexküll’s organism-centric view of the environment and James G. Miller’s view of organisms as matter-energy-information processing molecular machines. Three concepts describe an organism’s environmental niche: scope, umwelt, and gap. Scope denotes the entirety of environmental events and conditions to which the organism is exposed during its lifetime. Umwelt encompasses an organism’s perception of these events. The gap is the organism’s blind spot, the scope that is not covered by umwelt. These concepts bring organisms of different complexity to a common ecological denominator. Ecological and physiological data suggest organisms persist using three strategies: flexibility, robustness, and economy. All organisms use umwelt information to flexibly adapt to environmental change. They implement robustness against environmental perturbations within the gap generally through redundancy and reliability of internal constituents. Both flexibility and robustness improve survival. However, they also incur metabolic matter-energy processing costs, which otherwise could have been used for growth and reproduction. Lineages evolve unique tradeoff solutions among strategies in the space of what we call “a persistence triangle.” Protein domain architecture and other evidence support the preferential use of flexibility and robustness properties. Archaea and Bacteria gravitate toward the triangle’s economy vertex, with Archaea biased toward robustness. Eukarya trade economy for survivability. Protista occupy a saddle manifold separating akaryotes from multicellular organisms. Plants and the more flexible Fungi share an economic stratum, and Metazoa are locked in a positive feedback loop toward flexibility.


Trends in Microbiology | 2015

Lokiarchaeota: eukaryote-like missing links from microbial dark matter?

Arshan Nasir; Kyung Mo Kim; Gustavo Caetano-Anollés

Identification and genome sequencing of novel organismal groups can reduce the gap between the sequenced minority and the unexplored majority. The recent discovery of phylum Lokiarchaeota promises understanding of biological history. Here we inquire if Lokiarchaeota truly represent ancient eukaryotic ancestors or just microbial dark matter of expanding archaeal diversity.

Collaboration


Dive into Arshan Nasir's collaborations.

Top Co-Authors

Kyung Mo Kim

Korea Research Institute of Bioscience and Biotechnology

Hanna Choe

Korea Research Institute of Bioscience and Biotechnology

Sang-Heon Lee

Korea Research Institute of Bioscience and Biotechnology

Doo-Sang Park

Korea Research Institute of Bioscience and Biotechnology

Song-Gun Kim

Korea Research Institute of Bioscience and Biotechnology

Hyeonsoo Jeong

Seoul National University

Jeongsu Oh

Korea Research Institute of Bioscience and Biotechnology