Sara Light
Stockholm University
Publication
Featured research published by Sara Light.
Genome Biology | 2006
Diana Ekman; Sara Light; Åsa K. Björklund; Arne Elofsson
Background: Most proteins interact with only a few other proteins while a small number of proteins (hubs) have many interaction partners. Hub proteins and non-hub proteins differ in several respects; however, it is not fully understood which properties characterize the hubs and set them apart from proteins of low connectivity. Therefore, we have investigated what differentiates hubs from non-hubs and static hubs (party hubs) from dynamic hubs (date hubs) in the protein-protein interaction network of Saccharomyces cerevisiae. Results: The many interactions of hub proteins can only partly be explained by binding to similar proteins or domains. It is evident that domain repeats, which are associated with binding, are enriched in hubs. Moreover, there is an overrepresentation of multi-domain proteins and long proteins among the hubs. In addition, there are clear differences between party hubs and date hubs. Fewer of the party hubs contain long disordered regions compared to date hubs, indicating that these regions are important for flexible binding but less so for static interactions. Furthermore, party hubs interact to a large extent with each other, supporting the idea of party hubs as the cores of highly clustered functional modules. In addition, hub proteins, and in particular party hubs, are more often ancient. Finally, the more recent paralogs of party hubs are underrepresented. Conclusion: Our results indicate that multiple and repeated domains are enriched in hub proteins and, further, that long disordered regions, which are common in date hubs, are particularly important for flexible binding.
BMC Genomics | 2005
Sara Light; Per Kraulis; Arne Elofsson
Background: Many biological networks show some characteristics of scale-free networks. Scale-free networks can evolve through preferential attachment, where new nodes are preferentially attached to well-connected nodes. In networks which have evolved through preferential attachment, older nodes should have a higher average connectivity than younger nodes. Here we have investigated preferential attachment in the context of metabolic networks. Results: The connectivities of the enzymes in the metabolic network of Escherichia coli were determined and representatives for these enzymes were located in 11 eukaryotes, 17 archaea and 46 bacteria. E. coli enzymes which have representatives in eukaryotes have a higher average connectivity while enzymes which are represented only in the prokaryotes, and especially the enzymes only present in βγ-proteobacteria, have lower connectivities than expected by chance. Interestingly, the enzymes which have been proposed as candidates for horizontal gene transfer have a higher average connectivity than the other enzymes. Furthermore, it was found that new edges are added to the highly connected enzymes at a faster rate than to enzymes with low connectivities, which is consistent with preferential attachment. Conclusion: Here, we have found indications of preferential attachment in the metabolic network of E. coli. A possible biological explanation for preferential attachment growth of metabolic networks is that novel enzymes created through gene duplication maintain some of the compounds involved in the original reaction throughout their subsequent evolution. In addition, we found that enzymes which are candidates for horizontal gene transfer have a higher average connectivity than other enzymes. This indicates that while new enzymes are attached preferentially to highly connected enzymes, these highly connected enzymes have sometimes been introduced into the E. coli genome by horizontal gene transfer. We speculate that E. coli has adjusted its metabolic network to a changing environment by replacing the relatively central enzymes with better-adapted orthologs from other prokaryotic species.
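The expectation stated in the abstract above, that older nodes end up with a higher average connectivity under preferential attachment, can be illustrated with a toy growth model. This is a minimal sketch and not the authors' analysis: the function `grow_network`, the one-edge-per-new-node rule and the network size are all assumptions chosen for illustration.

```python
import random


def grow_network(n_nodes, seed=0):
    """Grow a toy network by preferential attachment (hypothetical model).

    Each new node attaches with a single edge to an existing node chosen
    with probability proportional to that node's current degree.
    Returns a list where index = node age rank (0 is oldest) and
    value = final degree.
    """
    rng = random.Random(seed)
    degree = [1, 1]      # start from two nodes joined by one edge
    endpoints = [0, 1]   # every node appears once per edge it touches
    for new_node in range(2, n_nodes):
        # Picking uniformly from the endpoint list is equivalent to
        # degree-proportional sampling of nodes.
        target = rng.choice(endpoints)
        degree[target] += 1
        degree.append(1)
        endpoints.extend([target, new_node])
    return degree


def mean(values):
    return sum(values) / len(values)


degrees = grow_network(2000)
half = len(degrees) // 2
older_mean = mean(degrees[:half])    # nodes added in the first half
younger_mean = mean(degrees[half:])  # nodes added in the second half
print(f"older: {older_mean:.2f}, younger: {younger_mean:.2f}")
```

In a real metabolic network the "age" of an enzyme is not observed directly; the abstract infers it from the phylogenetic distribution of the enzyme's representatives.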
BMC Bioinformatics | 2004
Sara Light; Per Kraulis
Background: The two most common models for the evolution of metabolism are the patchwork evolution model, where enzymes are thought to diverge from broad to narrow substrate specificity, and the retrograde evolution model, according to which enzymes evolve in response to substrate depletion. Analysis of the distribution of homologous enzyme pairs in the metabolic network can shed light on the respective importance of the two models. We here investigate the evolution of the metabolism in E. coli viewed as a single network using EcoCyc. Results: Sequence comparison between all enzyme pairs was performed and the minimal path length (MPL) between all enzyme pairs was determined. We find a strong over-representation of homologous enzymes at MPL 1. We show that the functionally similar and functionally undetermined enzyme pairs are responsible for most of the over-representation of homologous enzyme pairs at MPL 1. Conclusions: The retrograde evolution model predicts that homologous enzyme pairs are at short metabolic distances from each other. In general agreement with previous studies we find that homologous enzymes occur close to each other in the network more often than expected by chance, which lends some support to the retrograde evolution model. However, we show that the homologous enzyme pairs which may have evolved through retrograde evolution, namely the pairs that are functionally dissimilar, show a weaker over-representation at MPL 1 than the functionally similar enzyme pairs. Our study indicates that, while the retrograde evolution model may have played a small part, the patchwork evolution model is the predominant process of metabolic enzyme evolution.
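The MPL analysis described in the abstract above can be sketched as a breadth-first search over an enzyme network, followed by a tally of homologous pairs at each distance. This is an illustrative sketch only: the network representation (an undirected adjacency map between enzymes), the enzyme names `E1` to `E4` and the homolog set are hypothetical, and the paper's actual network construction from EcoCyc is not reproduced here.

```python
from collections import deque


def minimal_path_lengths(adjacency):
    """Return {(a, b): MPL} for every connected pair with a < b.

    `adjacency` maps each enzyme to the set of enzymes it is directly
    linked to (assumed undirected and unweighted).
    """
    mpl = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # standard breadth-first search
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        for target, d in dist.items():
            if source < target:  # store each unordered pair once
                mpl[(source, target)] = d
    return mpl


def homolog_fraction_by_mpl(mpl, homologous_pairs):
    """Fraction of enzyme pairs at each MPL that are homologous."""
    counts = {}
    for pair, d in mpl.items():
        total, homologs = counts.get(d, (0, 0))
        counts[d] = (total + 1, homologs + (pair in homologous_pairs))
    return {d: h / t for d, (t, h) in counts.items()}


# Hypothetical four-enzyme chain: E1 - E2 - E3 - E4
toy_network = {
    "E1": {"E2"},
    "E2": {"E1", "E3"},
    "E3": {"E2", "E4"},
    "E4": {"E3"},
}
toy_homologs = {("E1", "E2"), ("E1", "E4")}  # assumed homologous pairs

toy_mpl = minimal_path_lengths(toy_network)
fractions = homolog_fraction_by_mpl(toy_mpl, toy_homologs)
```

Judging over-representation at MPL 1 would additionally require a null expectation, for example from shuffled homology labels, which this sketch omits.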
Molecular Biology and Evolution | 2013
Sara Light; Rauan Sagit; Oxana Sachenkova; Diana Ekman; Arne Elofsson
Proteins evolve not only through point mutations but also by insertion and deletion events, which affect the length of the protein. It is well known that such indel events most frequently occur in surface-exposed loops. However, detailed analysis of indel events in distantly related and fast-evolving proteins is hampered by the difficulty involved in correctly aligning such sequences. Here, we circumvent this problem by first only analyzing homologous proteins based on length variation rather than pairwise alignments. Using this approach, we find a surprisingly strong relationship between difference in length and difference in the number of intrinsically disordered residues, where up to three quarters of the length variation can be explained by changes in the number of intrinsically disordered residues. Further, we find that disorder is common in both insertions and deletions. A more detailed analysis reveals that indel events do not induce disorder but rather that already disordered regions accrue indels, suggesting that there is a lowered selective pressure for indels to occur within intrinsically disordered regions.
Journal of Molecular Biology | 2010
Åsa K. Björklund; Sara Light; Rauan Sagit; Arne Elofsson
Protein domain repeats are common in proteins that are central to the organization of a cell, in particular in eukaryotes. They are known to evolve through internal tandem duplications. However, the understanding of the underlying mechanisms is incomplete. To shed light on repeat expansion mechanisms, we have studied the evolution of the muscle protein Nebulin, a protein that contains a large number of actin-binding nebulin domains. Nebulin proteins have evolved from an invertebrate precursor containing two nebulin domains. Repeat regions have expanded through duplications of single domains, as well as duplications of a super repeat (SR) consisting of seven nebulins. We show that the SR has evolved independently into large regions in at least three instances: twice in the invertebrate Branchiostoma floridae and once in vertebrates. In-depth analysis reveals several recent tandem duplications in the Nebulin gene. The events involve both single-domain and multidomain SR units or several SR units. There are single events, but frequently the same unit is duplicated multiple times. For instance, an ancestor of human and chimpanzee underwent two tandem duplications. The duplication junction coincides with an Alu transposon, thus suggesting duplication through Alu-mediated homologous recombination. Duplications in the SR region consistently involve multiples of seven domains. However, the exact unit that is duplicated varies both between species and within species. Thus, multiple tandem duplications of the same motif did not create the large Nebulin protein. Finally, analysis of segmental duplications in the human genome reveals that duplications are more common in genes containing domain repeats than in those coding for nonrepeated proteins. In fact, segmental duplications are found three to six times more often in long repeated genes than expected by chance.
FEBS Letters | 2013
Morten H. H. Nørholm; Stephen Toddo; Minttu T.I. Virkki; Sara Light; Gunnar von Heijne; Daniel O. Daley
Membrane proteins are extremely challenging to produce in sufficient quantities for biochemical and structural analysis and there is a growing demand for solutions to this problem. In this study we attempted to improve expression of two difficult‐to‐express coding sequences (araH and narK) for membrane transporters. For both coding sequences, synonymous codon substitutions in the region adjacent to the AUG start led to significant improvements in expression, whereas multi‐parameter sequence optimization of codons throughout the coding sequence failed. We conclude that coding sequences can be re‐wired for high‐level protein expression by selective engineering of the 5′ coding sequence with synonymous codons, thus circumventing the need to consider whole sequence optimization.
Proteomics | 2008
Åsa K. Björklund; Sara Light; Linnea E. Hedin; Arne Elofsson
With recent publications of several large‐scale protein–protein interaction (PPI) studies, the realization of the full yeast interaction network is getting closer. Here, we have analysed several yeast protein interaction datasets to understand their strengths and weaknesses. In particular, we investigate the effect of experimental biases on some of the protein properties suggested to be enriched in highly connected proteins. Finally, we use support vector machines (SVM) to assess the contribution of these properties to protein interactivity. We find that protein abundance is the most important factor for detecting interactions in tandem affinity purifications (TAP), while it is of less importance for Yeast Two Hybrid (Y2H) screens. Consequently, sequence conservation and/or essentiality of hubs may be related to their high abundance. Further, proteins with disordered structure are over‐represented in Y2H screens and in one, but not the other, large‐scale TAP assay. Hence, disordered regions may be important both in transient interactions and interactions in complexes. Finally, a few domain families seem to be responsible for a large part of all interactions. Most importantly, we show that there are method‐specific biases in PPI experiments. Thus, care should be taken before drawing strong conclusions based on a single dataset.
Biochimica et Biophysica Acta | 2013
Sara Light; Rauan Sagit; Diana Ekman; Arne Elofsson
Proteins evolve through point mutations as well as by insertions and deletions (indels). During the last decade it has become apparent that protein regions that do not fold into three-dimensional structures, i.e. intrinsically disordered regions, are quite common. Here, we have studied the relationship between protein disorder and indels using HMM-HMM pairwise alignments in two sets of orthologous eukaryotic protein pairs. First, we show that disordered residues are much more frequent among indel residues than among aligned residues and are also more prevalent among indels than in coils. Second, we observed that disordered residues are particularly common in longer indels. Disordered indels of short-to-medium size are prevalent in the non-terminal regions of proteins while the longest indels, ordered and disordered alike, occur toward the termini of the proteins where new structural units are comparatively well tolerated. Finally, while disordered regions often evolve faster than ordered regions and disorder is common in indels, there are some previously recognized protein families where the disordered region is more conserved than the ordered region. We find that these rare proteins are often involved in information processes, such as RNA processing and translation. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.
Biochimica et Biophysica Acta | 2012
Morten H.H. Nørholm; Sara Light; Minttu T.I. Virkki; Arne Elofsson; Gunnar von Heijne; Daniel O. Daley
With synthetic gene services, molecular cloning is as easy as ordering a pizza. However, choosing the right RNA code for efficient protein production is less straightforward, more akin to deciding on the pizza toppings. The possibility to choose synonymous codons in the gene sequence has ignited a discussion that dates back 50 years: Does synonymous codon use matter? Recent studies indicate that replacement of particular codons with synonymous codons can improve expression in homologous or heterologous hosts; however, it is not always successful. Furthermore, it is increasingly apparent that membrane protein biogenesis can be codon-sensitive. Single synonymous codon substitutions can influence mRNA stability, mRNA structure, translational initiation, translational elongation and even protein folding. Synonymous codon substitutions therefore need to be carefully evaluated when membrane proteins are engineered for higher production levels, and further studies are needed to fully understand how to select the codons that are optimal for higher production. This article is part of a Special Issue entitled: Protein Folding in Membranes.
Current Opinion in Structural Biology | 2014
Sara Light; Walter Basile; Arne Elofsson
The frequency of de novo creation of proteins has been debated. Early on, it was assumed that de novo creation should be extremely rare and that the vast majority of all protein-coding genes were created in the early history of life. However, the early genomics era led to the insight that protein-coding genes do appear to be lineage-specific. Today, with thousands of completely sequenced genomes, this impression remains. It has even been proposed that the creation of novel genes, a continuous process where most de novo genes are short-lived, is as frequent as gene duplications. There exist reports with strongly indicative evidence for de novo gene emergence in many organisms ranging from Bacteria, sometimes generated through bacteriophages, to humans, where orphans appear to be overexpressed in brain and testis. In contrast, research on protein evolution indicates that many very distantly related proteins appear to share partial homology. Here, we discuss recent results on de novo gene emergence, as well as important technical challenges limiting our ability to get a definite answer to the extent of de novo protein creation.