Guan Ning Lin
University of California, San Diego
Publication
Featured research published by Guan Ning Lin.
Cell | 2012
Jacob J. Michaelson; Yujian Shi; Madhusudan Gujral; Hancheng Zheng; Dheeraj Malhotra; Xin Jin; Minghan Jian; Guangming Liu; Douglas S. Greer; Abhishek Bhandari; Wenting Wu; Roser Corominas; Aine Peoples; Amnon Koren; Athurva Gore; Shuli Kang; Guan Ning Lin; Jasper Estabillo; Therese Gadomski; Balvindar Singh; Kun Zhang; Natacha Akshoomoff; Christina Corsello; Steven A. McCarroll; Lilia M. Iakoucheva; Yingrui Li; Jun Wang; Jonathan Sebat
De novo mutation plays an important role in autism spectrum disorders (ASDs). Notably, pathogenic copy number variants (CNVs) are characterized by high mutation rates. We hypothesize that hypermutability is a property of ASD genes and may also include nucleotide-substitution hot spots. We investigated global patterns of germline mutation by whole-genome sequencing of monozygotic twins concordant for ASD and their parents. Mutation rates varied widely throughout the genome (by 100-fold) and could be explained by intrinsic characteristics of DNA sequence and chromatin structure. Dense clusters of mutations within individual genomes were attributable to compound mutation or gene conversion. Hypermutability was a characteristic of genes involved in ASD and other diseases. In addition, genes impacted by mutations in this study were associated with ASD in independent exome-sequencing data sets. Our findings suggest that regional hypermutation is a significant factor shaping patterns of genetic variation and disease risk in humans.
Genome Biology | 2008
Lourdes Peña-Castillo; Murat Tasan; Chad L. Myers; Hyunju Lee; Trupti Joshi; Chao Zhang; Yuanfang Guan; Michele Leone; Andrea Pagnani; Wan-Kyu Kim; Chase Krumpelman; Weidong Tian; Guillaume Obozinski; Yanjun Qi; Guan Ning Lin; Gabriel F. Berriz; Francis D. Gibbons; Gert R. G. Lanckriet; Jian-Ge Qiu; Charles E. Grant; Zafer Barutcuoglu; David P. Hill; David Warde-Farley; Chris Grouios; Debajyoti Ray; Judith A. Blake; Minghua Deng; Michael I. Jordan; William Stafford Noble; Quaid Morris
Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated. Results: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%. Conclusion: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allows predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized.
Genetics | 2007
James O. Allen; Christiane M.-R. Fauron; Patrick Minx; Leah M. Roark; Swetha Oddiraju; Guan Ning Lin; Louis Meyer; Hui Sun; Kyung Won Kim; Chunyan Wang; Feiyu Du; Dong Xu; Michael Gibson; Jill Cifrese; Sandra W. Clifton; Kathleen J. Newton
We have sequenced five distinct mitochondrial genomes in maize: two fertile cytotypes (NA and the previously reported NB) and three cytoplasmic-male-sterile cytotypes (CMS-C, CMS-S, and CMS-T). Their genome sizes range from 535,825 bp in CMS-T to 739,719 bp in CMS-C. Large duplications (0.5–120 kb) account for most of the size increases. Plastid DNA accounts for 2.3–4.6% of each mitochondrial genome. The genomes share a minimum set of 51 genes for 33 conserved proteins, three ribosomal RNAs, and 15 transfer RNAs. Numbers of duplicate genes and plastid-derived tRNAs vary among cytotypes. A high level of sequence conservation exists both within and outside of genes (1.65–7.04 substitutions/10 kb in pairwise comparisons). However, sequence losses and gains are common: integrated plastid and plasmid sequences, as well as noncoding “native” mitochondrial sequences, can be lost with no phenotypic consequence. The organization of the different maize mitochondrial genomes varies dramatically; even between the two fertile cytotypes, there are 16 rearrangements. Comparing the finished shotgun sequences of multiple mitochondrial genomes from the same species suggests which genes and open reading frames are potentially functional, including which chimeric ORFs are candidate genes for cytoplasmic male sterility. This method identified the known CMS-associated ORFs in CMS-S and CMS-T, but not in CMS-C.
Nature Communications | 2014
Roser Corominas; Xinping Yang; Guan Ning Lin; Shuli Kang; Yun Shen; Lila Ghamsari; Martin P. Broly; Maria J. Rodriguez; Stanley Tam; Shelly A. Trigg; Changyu Fan; Song Yi; Murat Tasan; Irma Lemmens; Xingyan Kuang; Nan Zhao; Dheeraj Malhotra; Jacob J. Michaelson; Vladimir Vacic; Michael A. Calderwood; Frederick P. Roth; Jan Tavernier; Steve Horvath; Kourosh Salehi-Ashtiani; Dmitry Korkin; Jonathan Sebat; David E. Hill; Tong Hao; Marc Vidal; Lilia M. Iakoucheva
Increased risk for autism spectrum disorders (ASD) is attributed to hundreds of genetic loci. The convergence of ASD variants has been investigated using various approaches, including protein interactions extracted from the published literature. However, these datasets are frequently incomplete, carry biases and are limited to interactions of a single splicing isoform, which may not be expressed in the disease-relevant tissue. Here we introduce a new interactome mapping approach by experimentally identifying interactions between brain-expressed alternatively spliced variants of ASD risk factors. The Autism Spliceform Interaction Network reveals that almost half of the detected interactions and about 30% of the newly identified interacting partners represent contribution from splicing variants, emphasizing the importance of isoform networks. Isoform interactions greatly contribute to establishing direct physical connections between proteins from the de novo autism CNVs. Our findings demonstrate the critical role of spliceform networks for translating genetic knowledge into a better understanding of human diseases.
Neuron | 2015
Guan Ning Lin; Roser Corominas; Irma Lemmens; Xinping Yang; Jan Tavernier; David E. Hill; Marc Vidal; Jonathan Sebat; Lilia M. Iakoucheva
The psychiatric disorders autism and schizophrenia have a strong genetic component, and copy number variants (CNVs) are firmly implicated. Recurrent deletions and duplications of chromosome 16p11.2 confer a high risk for both diseases, but the pathways disrupted by this CNV are poorly defined. Here we investigate the dynamics of the 16p11.2 network by integrating physical interactions of 16p11.2 proteins with spatiotemporal gene expression from the developing human brain. We observe profound changes in protein interaction networks throughout different stages of brain development and/or in different brain regions. We identify the late mid-fetal period of cortical development as most critical for establishing the connectivity of 16p11.2 proteins with their co-expressed partners. Furthermore, our results suggest that the regulation of the KCTD13-Cul3-RhoA pathway in layer 4 of the inner cortical plate is crucial for controlling brain size and connectivity and that its dysregulation by de novo mutations may be a potential determinant of 16p11.2 CNV deletion and duplication phenotypes.
American Journal of Human Genetics | 2016
William M. Brandler; Danny Antaki; Madhusudan Gujral; Amina Noor; Gabriel Rosanio; Timothy R. Chapman; Daniel J. Barrera; Guan Ning Lin; Dheeraj Malhotra; Amanda C. Watts; Lawrence C. Wong; Jasper Estabillo; Therese Gadomski; Oanh Hong; Karin V. Fuentes Fajardo; Abhishek Bhandari; Renius Owen; Michael Baughn; Jeffrey Yuan; Terry Solomon; Alexandra G Moyzis; Michelle S. Maile; Stephan J. Sanders; Gail Reiner; Keith K. Vaux; Charles M. Strom; Kang Zhang; Alysson R. Muotri; Natacha Akshoomoff; Suzanne M. Leal
Genetic studies of autism spectrum disorder (ASD) have established that de novo duplications and deletions contribute to risk. However, ascertainment of structural variants (SVs) has been restricted by the coarse resolution of current approaches. By applying a custom pipeline for SV discovery, genotyping, and de novo assembly to genome sequencing of 235 subjects (71 affected individuals, 26 healthy siblings, and their parents), we compiled an atlas of 29,719 SV loci (5,213/genome), comprising 11 different classes. We found a high diversity of de novo mutations, the majority of which were undetectable by previous methods. In addition, we observed complex mutation clusters where combinations of de novo SVs, nucleotide substitutions, and indels occurred as a single event. We estimate a high rate of structural mutation in humans (20%) and propose that genetic risk for ASD is attributable to an elevated frequency of gene-disrupting de novo SVs, but not an elevated rate of genome rearrangement.
BMC Bioinformatics | 2010
Guan Ning Lin; Zheng Wang; Dong Xu; Jianlin Cheng
Background: Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding protein folding process and guiding protein design. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. And most methods do not distinguish the different kinetic nature (two-state folding or multi-state folding) of the proteins. Here we developed a method, SeqRate, to predict both protein folding kinetic type (two-state versus multi-state) and real-value folding rate using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from only protein sequence with support vector machines. Results: We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (sec⁻¹) in the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification and its accuracy of fold rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional information, such as structure-based geometric contacts, as inputs. Conclusions: Both the web server and software of predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html.
bioRxiv | 2017
Vikas Pejaver; Jorge Urresti; Jose Lugo-Martinez; Kymberleigh A. Pagel; Guan Ning Lin; Hyun-Jun Nam; Matthew Mort; David Neil Cooper; Jonathan Sebat; Lilia M. Iakoucheva; Sean D. Mooney; Predrag Radivojac
We introduce MutPred2, a tool that improves the prioritization of pathogenic amino acid substitutions, generates molecular mechanisms potentially causative of disease, and returns interpretable pathogenicity score distributions on individual genomes. While its prioritization performance is state-of-the-art, a novel and distinguishing feature of MutPred2 is the probabilistic modeling of variant impact on specific aspects of protein structure and function that can serve to guide experimental studies of phenotype-altering variants. We demonstrate the utility of MutPred2 in the identification of the structural and functional mutational signatures relevant to Mendelian disorders and the prioritization of de novo mutations associated with complex neurodevelopmental disorders. We then experimentally validate the functional impact of several variants identified in patients with such disorders. We argue that mechanism-driven studies of human inherited diseases have the potential to significantly accelerate the discovery of clinically actionable variants. Availability: http://mutpred.mutdb.org/
Methods in Molecular Biology | 2008
Trupti Joshi; Chao Zhang; Guan Ning Lin; Zhao Song; Dong Xu
Characterizing gene function is one of the major challenging tasks in the postgenomic era. To address this challenge, we developed GeneFAS (gene function annotation system), a computer system with a graphical user interface for cellular function prediction by integrating information from protein-protein interactions, protein complexes, microarray gene expression profiles, and annotations of known proteins. GeneFAS can provide biologists a workspace for their organism of interest, to integrate different types of experimental data and annotation information, and facilitate biological discovery and hypothesis generation using all the information. It also provides testing and training capabilities for users to utilize and integrate their data more efficiently. GeneFAS is freely available for download at http://digbio.missouri.edu/genefas .
Bioinformatics | 2017
Kymberleigh A. Pagel; Vikas Pejaver; Guan Ning Lin; Hyun-Jun Nam; Matthew Mort; David Neil Cooper; Jonathan Sebat; Lilia M. Iakoucheva; Sean D. Mooney; Predrag Radivojac
Motivation: Loss‐of‐function genetic variants are frequently associated with severe clinical phenotypes, yet many are present in the genomes of healthy individuals. The available methods to assess the impact of these variants rely primarily upon evolutionary conservation with little to no consideration of the structural and functional implications for the protein. They further do not provide information to the user regarding specific molecular alterations potentially causative of disease. Results: To address this, we investigate protein features underlying loss‐of‐function genetic variation and develop a machine learning method, MutPred‐LOF, for the discrimination of pathogenic and tolerated variants that can also generate hypotheses on specific molecular events disrupted by the variant. We investigate a large set of human variants derived from the Human Gene Mutation Database, ClinVar and the Exome Aggregation Consortium. Our prediction method shows an area under the Receiver Operating Characteristic curve of 0.85 for all loss‐of‐function variants and 0.75 for proteins in which both pathogenic and neutral variants have been observed. We applied MutPred‐LOF to a set of 1142 de novo variants from neurodevelopmental disorders and find enrichment of pathogenic variants in affected individuals. Overall, our results highlight the potential of computational tools to elucidate causal mechanisms underlying loss of protein function in loss‐of‐function variants. Availability and Implementation: http://mutpred.mutdb.org