Sungsam Gong
University of Cambridge
Publication
Featured research published by Sungsam Gong.
Science Translational Medicine | 2015
Angharad M. Roberts; James S. Ware; Daniel S. Herman; Sebastian Schafer; John Baksi; Alexander G. Bick; Rachel Buchan; Roddy Walsh; Shibu John; Samuel Wilkinson; Francesco Mazzarotto; Leanne E. Felkin; Sungsam Gong; Jacqueline A. L. MacArthur; Fiona Cunningham; Jason Flannick; Stacey B. Gabriel; David Altshuler; P. Macdonald; Matthias Heinig; Anne Keogh; Christopher S. Hayward; Nicholas R. Banner; Dudley J. Pennell; Declan P. O’Regan; Tan Ru San; Antonio de Marvao; Timothy Dawes; Ankur Gulati; Emma J. Birks
Truncating variants of the giant protein titin cause dilated cardiomyopathy when they occur toward the protein’s carboxyl terminus and in highly expressed exons.
What Happens When Titins Are Trimmed? The most common form of inherited heart failure, dilated cardiomyopathy, can be caused by mutations in a mammoth heart protein, appropriately called titin. Now, Roberts et al. sort out which titin mutations cause disease and why some people can carry certain titin mutations but remain perfectly healthy. In an exhaustive survey of more than 5200 people, with and without cardiomyopathy, the authors sequenced the titin gene and measured its corresponding RNA and protein levels. The alterations in titin were truncating mutations, which cause short nonfunctional versions of the RNA or protein. These defects produced cardiomyopathy when they occurred closer to the protein’s carboxyl terminus and in exons that were abundantly transcribed. The titin-truncating mutations that occur in the general population tended not to have these characteristics and were usually benign. This new detailed understanding of the molecular basis of dilated cardiomyopathy penetrance will promote better disease management and accelerate rational patient stratification.
The recent discovery of heterozygous human mutations that truncate full-length titin (TTN, an abundant structural, sensory, and signaling filament in muscle) as a common cause of end-stage dilated cardiomyopathy (DCM) promises new prospects for improving heart failure management. However, realization of this opportunity has been hindered by the burden of TTN-truncating variants (TTNtv) in the general population and uncertainty about their consequences in health or disease. To elucidate the effects of TTNtv, we coupled TTN gene sequencing with cardiac phenotyping in 5267 individuals across the spectrum of cardiac physiology and integrated these data with RNA and protein analyses of human heart tissues. We report diversity of TTN isoform expression in the heart, define the relative inclusion of TTN exons in different isoforms (using the TTN transcript annotations available at http://cardiodb.org/titin), and demonstrate that these data, coupled with the position of the TTNtv, provide a robust strategy to discriminate pathogenic from benign TTNtv. We show that TTNtv is the most common genetic cause of DCM in ambulant patients in the community, identify clinically important manifestations of TTNtv-positive DCM, and define the penetrance and outcomes of TTNtv in the general population. By integrating genetic, transcriptome, and protein analyses, we provide evidence for a length-dependent mechanism of disease. These data inform diagnostic criteria and management strategies for TTNtv-positive DCM patients and for TTNtv that are identified as incidental findings.
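The idea of combining exon inclusion with variant position can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Exon` fields, the toy coordinates in `EXONS`, and the `min_inclusion` cutoff are assumptions made for illustration, not values or code from the study, and the output is a descriptive summary rather than a pathogenicity call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    number: int
    start: int        # first transcript position covered (toy coordinates)
    end: int          # last transcript position covered, inclusive
    inclusion: float  # fraction of cardiac transcripts including the exon (0..1)


# Made-up annotation table; real annotations would come from a curated resource.
EXONS = [
    Exon(1, 1, 100, 1.00),
    Exon(2, 101, 200, 0.20),
    Exon(3, 201, 400, 0.95),
]


def describe_truncating_variant(exons, position, min_inclusion=0.9):
    """Summarise where a truncating variant falls.

    `min_inclusion` is an assumed cutoff for calling an exon highly included.
    """
    transcript_length = max(e.end for e in exons)
    hit = next((e for e in exons if e.start <= position <= e.end), None)
    if hit is None:
        return {"exon": None, "label": "outside annotated exons"}
    label = (
        "highly included exon"
        if hit.inclusion >= min_inclusion
        else "low-inclusion exon"
    )
    return {
        "exon": hit.number,
        "inclusion": hit.inclusion,
        # 0.0 is the amino-terminal end, 1.0 the carboxyl-terminal end.
        "relative_position": position / transcript_length,
        "label": label,
    }
```

Calling `describe_truncating_variant(EXONS, 350)` reports a variant in a highly included exon in the distal part of the toy transcript, while position 150 falls in a low-inclusion exon.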
Nature Reviews Molecular Cell Biology | 2009
Catherine L. Worth; Sungsam Gong; Tom L. Blundell
High-throughput genomic sequencing has focused attention on understanding differences between species and between individuals. When this genetic variation affects protein sequences, the rate of amino acid substitution reflects both Darwinian selection for functionally advantageous mutations and selectively neutral evolution operating within the constraints of structure and function. During neutral evolution, whereby mutations accumulate by random drift, amino acid substitutions are constrained by factors such as the formation of intramolecular and intermolecular interactions and the accessibility to water or lipids surrounding the protein. These constraints arise from the need to conserve a specific architecture and to retain interactions that mediate functions in protein families and superfamilies.
Journal of Bioinformatics and Computational Biology | 2007
Catherine L. Worth; G. Richard J. Bickerton; Adrian Schreyer; Julia R. Forman; Tammy M. K. Cheng; Semin Lee; Sungsam Gong; David F. Burke; Tom L. Blundell
The prediction of the effects of nonsynonymous single nucleotide polymorphisms (nsSNPs) on function depends critically on exploiting all information available on the three-dimensional structures of proteins. We describe software and databases for the analysis of nsSNPs that allow a user to move from SNP to sequence to structure to function. In both structure prediction and the analysis of the effects of nsSNPs, we exploit information about protein evolution, in particular, that derived from investigations on the relation of sequence to structure gained from the study of amino acid substitutions in divergent evolution. The techniques developed in our laboratory have allowed fast and automated sequence-structure homology recognition to identify templates and to perform comparative modeling; as well as simple, robust, and generally applicable algorithms to assess the likely impact of amino acid substitutions on structure and interactions. We describe our strategy for approaching the relationship between SNPs and disease, and the results of benchmarking our approach on human proteins of known structure and recognized mutation.
Nucleic Acids Research | 2012
Dan Bolser; Pierre-Yves Chibon; Nicolas Palopoli; Sungsam Gong; Daniel Jacob; Victoria Dominguez Del Angel; Dan Swan; Sebastian Bassi; Virginia González; Prashanth Suravajhala; Seungwoo Hwang; Paolo Romano; Robert Edwards; Bryan Bishop; John Eargle; Timur Shtatland; Nicholas J. Provart; Dave Clements; Daniel P. Renfro; Daeui Bhak; Jong Bhak
Biology is generating more data than ever. As a result, there is an ever increasing number of publicly available databases that analyse, integrate and summarize the available data, providing an invaluable resource for the biological community. As this trend continues, there is a pressing need to organize, catalogue and rate these resources, so that the information they contain can be most effectively exploited. MetaBase (MB) (http://MetaDatabase.Org) is a community-curated database containing more than 2000 commonly used biological databases. Each entry is structured using templates and can carry various user comments and annotations. Entries can be searched, listed, browsed or queried. The database was created using the same MediaWiki technology that powers Wikipedia, allowing users to contribute on many different levels. The initial release of MB was derived from the content of the 2007 Nucleic Acids Research (NAR) Database Issue. Since then, approximately 100 databases have been manually collected from the literature, and users have added information for over 240 databases. MB is synchronized annually with the static Molecular Biology Database Collection provided by NAR. To date, there have been 19 significant contributors to the project; each one is listed as an author here to highlight the community aspect of the project.
PLOS ONE | 2010
Sungsam Gong; Tom L. Blundell
Human genetic variation is the incarnation of diverse evolutionary history, which reflects both selectively advantageous and selectively neutral change. In this study, we catalogue structural and functional features of proteins that restrain genetic variation leading to single amino acid substitutions. Our variation dataset is divided into three categories: i) Mendelian disease-related variants, ii) neutral polymorphisms and iii) cancer somatic mutations. We characterize structural environments of the amino acid variants by the following properties: i) side-chain solvent accessibility, ii) main-chain secondary structure, and iii) hydrogen bonds from a side chain to a main chain or other side chains. To address functional restraints, amino acid substitutions in proteins are examined to see whether they are located at functionally important sites involved in protein-protein interactions, protein-ligand interactions or catalytic activity of enzymes. We also measure the likelihood of amino acid substitutions and the degree of residue conservation where variants occur. We show that various types of variants are under different degrees of structural and functional restraints, which affect their occurrence in the human proteome.
Biochemical Society Transactions | 2009
Sungsam Gong; Catherine L. Worth; G. Richard J. Bickerton; Semin Lee; Duangrudee Tanramluk; Tom L. Blundell
Divergent evolution of proteins reflects both selectively advantageous and neutral amino acid substitutions. In the present article, we examine restraints on sequence, which arise from selectively advantageous roles for structure and function and which lead to the conservation of local sequences and structures in families and superfamilies. We analyse structurally aligned members of protein families and superfamilies in order to investigate the importance of the local structural environment of amino acid residues in the acceptance of amino acid substitutions during protein evolution. We show that solvent accessibility is the most important determinant, followed by the existence of hydrogen bonds from the side-chain to main-chain functions and the nature of the element of secondary structure to which the amino acid contributes. Polar side chains whose hydrogen-bonding potential is satisfied tend to be more conserved than their unsatisfied or non-hydrogen-bonded counterparts, and buried and satisfied polar residues tend to be significantly more conserved than buried hydrophobic residues. Finally, we discuss the importance of functional restraints in the form of interactions of proteins with other macromolecules in assemblies or with substrates, ligands or allosteric regulators. We show that residues involved in such functional interactions are significantly more conserved and have differing amino acid substitution patterns.
Nucleic Acids Research | 2006
Areum Han; Hyo Jin Kang; Yoo-Bok Cho; Sunghoon Lee; Youngjoo Kim; Sungsam Gong
Single nucleotide polymorphisms (SNPs) in conserved protein regions are thought to be strong candidates for altering protein function. Thus, we have developed SNP@Domain, a web resource, to identify SNPs within human protein domains. We annotated SNPs from dbSNP with protein structure-based as well as sequence-based domains: (i) structure-based using SCOP and (ii) sequence-based using Pfam to avoid conflicts from two domain assignment methodologies. Users can investigate SNPs within protein domains with 2D and 3D maps. We expect this visual annotation of SNPs within protein domains will help scientists select and interpret SNPs associated with diseases. A web interface for the SNP@Domain is freely available at and from .
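Annotating SNPs with the domains that contain them comes down to an interval lookup, which the following minimal sketch illustrates. The `Domain` record, the domain names, and the coordinates in `DOMAINS` are invented for illustration; this is not the SNP@Domain implementation or its data model.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    source: str  # assignment method, e.g. "SCOP" or "Pfam"
    name: str    # placeholder domain name
    start: int   # first residue of the domain (1-based, inclusive)
    end: int     # last residue of the domain (inclusive)


# Toy assignments for a single protein; the two methods may overlap.
DOMAINS = [
    Domain("SCOP", "domain_A", 10, 120),
    Domain("Pfam", "family_B", 100, 250),
]


def domains_containing(position, domains):
    """Return every domain whose residue range covers `position`."""
    return [d for d in domains if d.start <= position <= d.end]


def annotate_snps(snp_positions, domains):
    """Map each SNP residue position to the domains it falls within."""
    return {pos: domains_containing(pos, domains) for pos in snp_positions}
```

Keeping the structure-based and sequence-based assignments as separate records, tagged by `source`, lets a position be reported under both methods without merging or reconciling them.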
Journal of Cardiovascular Translational Research | 2011
Sungsam Gong; Catherine L. Worth; Tammy M. K. Cheng; Tom L. Blundell
The DNA sequencing technology developed by Frederick Sanger in the 1970s established genomics as the basis of comparative genetics. The recent invention of next-generation sequencing (NGS) platforms has added a new dimension to genome research by generating ultra-fast and high-throughput sequencing data in an unprecedented manner. The advent of NGS technology also provides the opportunity to study genetic diseases where sequence variants or mutations are sought to establish a causal relationship with disease phenotypes. However, it is not a trivial task to seek genetic variants responsible for genetic diseases and even harder for complex diseases such as diabetes and cancers. In such polygenic diseases, multiple genes and alleles, which can exist in healthy individuals, come together to contribute to common disease phenotypes in a complex manner. Hence, it is desirable to have an approach that integrates omics data with both knowledge of protein structure and function and an understanding of networks/pathways, i.e. functional genomics and systems biology; in this way, genotype–phenotype relationships can be better understood. In this review, we bring this ‘bottom-up’ approach alongside the current NGS-driven genetic study of genetic variations and disease aetiology. We describe experimental and computational techniques for assessing genetic variants and their deleterious effects on protein structure and function.
Proceedings of the National Academy of Sciences of the United States of America | 2017
Tereza Cindrova-Davies; Eric Jauniaux; Michael G. Elliot; Sungsam Gong; Graham J. Burton; D. Stephen Charnock-Jones
Significance: The human yolk sac is often considered vestigial. Here, we report RNA-sequencing analysis of the human and murine yolk sacs and compare with that of the chicken. We relate the human RNA-sequencing data to coelomic fluid proteomic data. Conservation of transcripts across the species indicates the human secondary yolk sac likely performs key functions early in development, particularly uptake and processing of macro- and micronutrients, many of which are found in coelomic fluid. More generally, our findings shed light on evolutionary mechanisms giving rise to complex structures such as the placenta. We propose that although a choriovitelline placenta is never established physically in the human, the placental villi, exocoelomic cavity, and secondary yolk sac function together as a physiological equivalent.
The yolk sac is phylogenetically the oldest of the extraembryonic membranes. The human embryo retains a yolk sac, which goes through primary and secondary phases of development, but its importance is controversial. Although it is known to synthesize proteins, its transport functions are widely considered vestigial. Here, we report RNA-sequencing (RNA-seq) data for the human and murine yolk sacs and compare those data with data for the chicken. We also relate the human RNA-seq data to proteomic data for the coelomic fluid bathing the yolk sac. Conservation of transcriptomes across the species indicates that the human secondary yolk sac likely performs key functions early in development, particularly uptake and processing of macro- and micronutrients, many of which are found in coelomic fluid. More generally, our findings shed light on evolutionary mechanisms that give rise to complex structures such as the placenta. We identify genetic modules that are conserved across mammals and birds, suggesting these modules are part of the core amniote genetic repertoire and are the building blocks for both oviparous and viviparous reproductive modes. We propose that although a choriovitelline placenta is never established physically in the human, the placental villi, the exocoelomic cavity, and the secondary yolk sac function together as a physiological equivalent.
Epigenetics | 2018
Sungsam Gong; Michelle D. Johnson; Justyna Dopierala; Francesca Gaccioli; Ulla Sovio; Miguel Constância; Gordon C. S. Smith; D. Stephen Charnock-Jones
DNA methylation is an important regulator of gene function. Fetal sex is associated with the risk of several specific pregnancy complications related to placental function. However, the association between fetal sex and placental DNA methylation remains poorly understood. We carried out whole-genome oxidative bisulfite sequencing in the placentas of two healthy female and two healthy male pregnancies generating an average genome depth of coverage of 25x. Most highly ranked differentially methylated regions (DMRs) were located on the X chromosome but we identified a 225 kb sex-specific DMR in the body of the CUB and Sushi Multiple Domains 1 (CSMD1) gene on chromosome 8. The sex-specific differential methylation pattern observed in this region was validated in additional placentas using in-solution target capture. In a new RNA-seq data set from 64 female and 67 male placentas, CSMD1 mRNA was 1.8-fold higher in male than in female placentas (P value = 8.5 × 10⁻⁷, Mann-Whitney test). Exon-level quantification of CSMD1 mRNA from these 131 placentas suggested a likely placenta-specific CSMD1 isoform not detected in the 21 somatic tissues analyzed. We show that the gene body of an autosomal gene, CSMD1, is differentially methylated in a sex- and placental-specific manner, displaying sex-specific differences in placental transcript abundance.
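For readers unfamiliar with the statistics quoted above, the following minimal sketch shows how a Mann-Whitney comparison and a fold change between two groups might be computed. It is an illustration under stated assumptions, not the authors' pipeline: the p-value uses the normal approximation with tie correction and no continuity correction, `fold_change` is assumed to be a ratio of group medians (the abstract does not say how the 1.8-fold figure was derived), and both inputs are assumed to be non-empty lists of expression values.

```python
from math import erfc, sqrt
from statistics import median


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value).

    Normal approximation with midranks for ties and a tie-corrected
    variance; small samples therefore give only a rough p-value.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        t = j - i
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every value identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - n1 * n2 / 2) / sqrt(variance)
    return u_x, erfc(abs(z) / sqrt(2))


def fold_change(a, b):
    """Ratio of group medians (an assumed definition of fold change)."""
    return median(a) / median(b)
```

With real data, an established implementation such as `scipy.stats.mannwhitneyu` would be preferable; the hand-written version here only keeps the example dependency-free.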