Publication


Featured research published by Xiangchao Gan.


Nature | 2011

Mouse genomic variation and its effect on phenotypes and gene regulation.

Thomas M. Keane; Leo Goodstadt; Petr Danecek; Michael A. White; Kim Wong; Binnaz Yalcin; Andreas Heger; Avigail Agam; Guy Slater; Martin Goodson; N A Furlotte; Eleazar Eskin; Christoffer Nellåker; H Whitley; James Cleak; Deborah Janowitz; Polinka Hernandez-Pliego; Andrew Edwards; T G Belgard; Peter L. Oliver; Rebecca E McIntyre; Amarjit Bhomra; Jérôme Nicod; Xiangchao Gan; Wei Yuan; L van der Weyden; Charles A. Steward; Sendu Bala; Jim Stalker; Richard Mott

We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism.


Nature | 2011

Multiple reference genomes and transcriptomes for Arabidopsis thaliana

Xiangchao Gan; Oliver Stegle; Jonas Behr; Joshua G. Steffen; Philipp Drewe; Katie L. Hildebrand; Rune Lyngsoe; Sebastian J. Schultheiss; Edward J. Osborne; Vipin T. Sreedharan; André Kahles; Regina Bohnert; Géraldine Jean; Paul S. Derwent; Paul J. Kersey; Eric J. Belfield; Nicholas P. Harberd; Eric Kemen; Christopher Toomajian; Paula X. Kover; Richard M. Clark; Gunnar Rätsch; Richard Mott

Genetic differences between Arabidopsis thaliana accessions underlie the plant’s extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.


Nature | 2011

Sequence-based characterization of structural variation in the mouse genome.

Binnaz Yalcin; Kim Wong; Avigail Agam; Martin Goodson; Thomas M. Keane; Xiangchao Gan; Christoffer Nellåker; Leo Goodstadt; Jérôme Nicod; Amarjit Bhomra; Polinka Hernandez-Pliego; Helen Whitley; James Cleak; Rebekah Dutton; Deborah Janowitz; Richard Mott; David J. Adams; Jonathan Flint

Structural variation is widespread in mammalian genomes and is an important cause of disease, but just how abundant and important structural variants (SVs) are in shaping phenotypic variation remains unclear. Without knowing how many SVs there are, and how they arise, it is difficult to discover what they do. Combining experimental with automated analyses, we identified 711,920 SVs at 281,243 sites in the genomes of thirteen classical and four wild-derived inbred mouse strains. The majority of SVs are less than 1 kilobase in size and 98% are deletions or insertions. The breakpoints of 160,000 SVs were mapped to base pair resolution, allowing us to infer that insertion of retrotransposons causes more than half of SVs. Yet, despite their prevalence, SVs are less likely than other sequence variants to cause gene expression or quantitative phenotypic variation. We identified 24 SVs that disrupt coding exons, acting as rare variants of large effect on gene function. One-third of the genes so affected have immunological functions.


Cell | 2016

1,135 Genomes Reveal the Global Pattern of Polymorphism in Arabidopsis thaliana

Carlos Alonso-Blanco; Jorge Andrade; Claude Becker; Felix Bemm; Joy Bergelson; Karsten M. Borgwardt; Jun Cao; Eunyoung Chae; Todd M. Dezwaan; Wei Ding; Joseph R. Ecker; Moises Exposito-Alonso; Ashley Farlow; Joffrey Fitz; Xiangchao Gan; Dominik Grimm; Angela M. Hancock; Stefan R. Henz; Svante Holm; Matthew Horton; Mike Jarsulic; Randall A. Kerstetter; Arthur Korte; Pamela Korte; Christa Lanz; Cheng-Ruei Lee; Dazhe Meng; Todd P. Michael; Richard Mott; Ni Wayan Muliyati

Arabidopsis thaliana serves as a model organism for the study of fundamental physiological, cellular, and molecular processes. It has also greatly advanced our understanding of intraspecific genome variation. We present a detailed map of variation in 1,135 high-quality re-sequenced natural inbred lines representing the native Eurasian and North African range and recently colonized North America. We identify relict populations that continue to inhabit ancestral habitats, primarily in the Iberian Peninsula. They have mixed with a lineage that has spread to northern latitudes from an unknown glacial refugium and is now found in a much broader spectrum of habitats. Insights into the history of the species and the fine-scale distribution of genetic diversity provide the basis for full exploitation of A. thaliana natural variation through integration of genomes and epigenomes with molecular and non-molecular phenotypes.


Science | 2014

Leaf shape evolution through duplication, regulatory diversification, and loss of a homeobox gene.

Daniela Vlad; Daniel Kierzkowski; M. I. Rast; Francesco Vuolo; R. Dello Ioio; Carla Galinha; Xiangchao Gan; Mohsen Hajheidari; Angela Hay; Richard S. Smith; Peter Huijser; C. D. Bailey; Miltos Tsiantis

The evolutionary trajectory leading to crucifer leaf shape in Cardamine hirsuta plants is elucidated. In this work, we investigate morphological differences between Arabidopsis thaliana, which has simple leaves, and its relative Cardamine hirsuta, which has dissected leaves comprising distinct leaflets. With the use of genetics, interspecific gene transfers, and time-lapse imaging, we show that leaflet development requires the REDUCED COMPLEXITY (RCO) homeodomain protein. RCO functions specifically in leaves, where it sculpts developing leaflets by repressing growth at their flanks. RCO evolved in the Brassicaceae family through gene duplication and was lost in A. thaliana, contributing to leaf simplification in this species. Species-specific RCO action with respect to its paralog results from its distinct gene expression pattern in the leaf base. Thus, regulatory evolution coupled with gene duplication and loss generated leaf shape diversity by modifying local growth patterns during organogenesis.

Developmental Complexity: Although related, the plants Arabidopsis thaliana and Cardamine hirsuta have different sorts of leaves: one, a rather plain oval, and the other, a complicated multipart construction. Comparing the development of the two leaf types, Vlad et al. (p. 780) uncovered a gene that regulates developmental growth. The C. hirsuta gene encoding the REDUCED COMPLEXITY (RCO) homeodomain protein arose through gene duplication and neofunctionalization, but was lost in the A. thaliana lineage. In C. hirsuta, RCO suppresses growth in domains around the perimeter of the developing leaf, yielding complex-shaped leaves. A. thaliana, lacking RCO, produces simple leaves. When RCO was expressed in A. thaliana, the leaves became more complex. Thus, the capacity to produce complex leaves remains, despite loss of the initiator.


Current Biology | 2015

Molecular Signatures of Major Depression

Na Cai; Simon Chang; Yihan I Li; Qibin Li; Jingchu Hu; Jieqin Liang; Li Song; Warren W. Kretzschmar; Xiangchao Gan; Jérôme Nicod; Margarita Rivera; Hongxin Deng; B Du; K Li; Wenhu Sang; J Gao; S Gao; B Ha; Hung-Yao Ho; C Hu; Jian Hu; Zhenfei Hu; Guoping Huang; G Jiang; Tao Jiang; Wei Jin; G Li; Kan Li; Yi Hao Li; Yingrui Li

Adversity, particularly in early life, can cause illness. Clues to the responsible mechanisms may lie with the discovery of molecular signatures of stress, some of which include alterations to an individual’s somatic genome. Here, using genome sequences from 11,670 women, we observed a highly significant association between a stress-related disease, major depression, and the amount of mtDNA (p = 9.00 × 10^−42, odds ratio 1.33 [95% confidence interval [CI] = 1.29–1.37]) and telomere length (p = 2.84 × 10^−14, odds ratio 0.85 [95% CI = 0.81–0.89]). While both telomere length and mtDNA amount were associated with adverse life events, conditional regression analyses showed the molecular changes were contingent on the depressed state. We tested this hypothesis with experiments in mice, demonstrating that stress causes both molecular changes, which are partly reversible and can be elicited by the administration of corticosterone. Together, these results demonstrate that changes in the amount of mtDNA and telomere length are consequences of stress and entering a depressed state. These findings identify increased amounts of mtDNA as a molecular marker of MD and have important implications for understanding how stress causes the disease.


BMC Bioinformatics | 2008

Discovering biclusters in gene expression data based on high-dimensional linear geometries

Xiangchao Gan; Alan Wee-Chung Liew; Hong Yan

Background: In DNA microarray experiments, discovering groups of genes that share similar transcriptional characteristics is instrumental in functional annotation, tissue classification and motif identification. However, in many situations a subset of genes exhibits a consistent pattern only over a subset of conditions. Conventional clustering algorithms that deal with the entire row or column in an expression matrix would therefore fail to detect these useful patterns in the data. Recently, biclustering has been proposed to detect a subset of genes exhibiting a consistent pattern over a subset of conditions. However, most existing biclustering algorithms are based on searching for sub-matrices within a data matrix by optimizing certain heuristically defined merit functions. Moreover, most of these algorithms can only detect a restricted set of bicluster patterns.

Results: In this paper, we present a novel geometric perspective for the biclustering problem. The biclustering process is interpreted as the detection of linear geometries in a high dimensional data space. Such a new perspective views biclusters with different patterns as hyperplanes in a high dimensional space, and allows us to handle different types of linear patterns simultaneously by matching a specific set of linear geometries. This geometric viewpoint also inspires us to propose a generic bicluster pattern, i.e. the linear coherent model that unifies the seemingly incompatible additive and multiplicative bicluster models. As a particular realization of our framework, we have implemented a Hough transform-based hyperplane detection algorithm. The experimental results on a human lymphoma gene expression dataset show that our algorithm can find biologically significant subsets of genes.

Conclusion: We have proposed a novel geometric interpretation of the biclustering problem. We have shown that many common types of bicluster are just different spatial arrangements of hyperplanes in a high dimensional data space. An implementation of the geometric framework using the Fast Hough transform for hyperplane detection can be used to discover biologically significant subsets of genes under subsets of conditions for microarray data analysis.
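
The geometric idea in this abstract can be illustrated with a toy example. The sketch below is not the authors' Fast Hough transform for hyperplanes; it is a simplified two-dimensional slope-intercept Hough vote, assuming that the rows of a bicluster satisfy y = k·x + c between two conditions. The function name `hough_line`, the slope grid, the bin width `c_step`, and the data are all hypothetical.

```python
from collections import Counter


def hough_line(points, slopes, c_step=0.1):
    """Vote in (slope, intercept) space and return the winning line.

    points : list of (x, y) pairs, one per gene (row), for two conditions
    slopes : candidate slopes k to try (an assumed, hand-picked grid)
    c_step : width of an intercept bin
    """
    votes = Counter()
    for x, y in points:
        for k in slopes:
            # Each point votes for the intercept that would put it on a line of slope k.
            c_bin = round((y - k * x) / c_step)
            votes[(k, c_bin)] += 1
    (k, c_bin), _ = votes.most_common(1)[0]
    c = c_bin * c_step
    # Rows lying within half a bin of the winning line form the candidate bicluster.
    members = [i for i, (x, y) in enumerate(points)
               if abs(y - (k * x + c)) <= c_step / 2 + 1e-9]
    return k, c, members


# Rows 0-4 follow y = 2x + 1 (a linear coherent pattern); rows 5-7 are unrelated.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (1, 0.2), (2, 8.3), (3, 1.4)]
k, c, members = hough_line(points, slopes=[0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
print(k, c, members)
```

With k = 1 this pattern reduces to an additive relation between the two conditions and with c = 0 to a multiplicative one, which is one way to read the abstract's claim that a linear model covers both cases.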


Nucleic Acids Research | 2006

Microarray missing data imputation based on a set theoretic framework and biological knowledge

Xiangchao Gan; Alan Wee-Chung Liew; Hong Yan

Gene expressions measured using microarrays usually suffer from the missing value problem. However, in many data analysis methods, a complete data matrix is required. Although existing missing value imputation algorithms have shown good performance in dealing with missing values, they also have their limitations. For example, some algorithms perform well only when strong local correlation exists in the data, while some provide the best estimate when the data is dominated by global structure. In addition, these algorithms do not take into account any biological constraint in their imputation. In this paper, we propose a set theoretic framework based on projection onto convex sets (POCS) for missing data imputation. POCS allows us to incorporate different types of a priori knowledge about missing values into the estimation process. The main idea of POCS is to formulate every piece of prior knowledge into a corresponding convex set and then use a convergence-guaranteed iterative procedure to obtain a solution in the intersection of all these sets. In this work, we design several convex sets, taking into consideration the biological characteristics of the data: the first set mainly exploits the local correlation structure among genes in microarray data, while the second set captures the global correlation structure among arrays. The third set (actually a series of sets) exploits the biological phenomenon of synchronization loss in microarray experiments. In cyclic systems, synchronization loss is a common phenomenon and we construct a series of sets based on this phenomenon for our POCS imputation algorithm. Experiments show that our algorithm can achieve a significant reduction of error compared to the KNNimpute, SVDimpute and LSimpute methods.
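
The iterative procedure described here can be sketched in a few lines. The example below is a generic alternating-projection loop, not the paper's imputation algorithm: the two convex sets (a hyperplane and a box) are stand-ins chosen for simplicity, and the function names and numbers are hypothetical.

```python
def project_hyperplane(x, a, b):
    """Project x onto the convex set {v : a . v = b}."""
    dot = sum(ai * xi for ai, xi in zip(a, x))
    norm2 = sum(ai * ai for ai in a)
    step = (dot - b) / norm2
    return [xi - step * ai for ai, xi in zip(a, x)]


def project_box(x, lo, hi):
    """Project x onto the convex set {v : lo <= v_i <= hi for all i}."""
    return [min(max(v, lo), hi) for v in x]


def pocs(x0, a, b, lo, hi, iters=200):
    """Alternate projections onto the two sets, starting from x0."""
    x = list(x0)
    for _ in range(iters):
        x = project_hyperplane(x, a, b)
        x = project_box(x, lo, hi)
    return x


# Assumed toy constraints: the entries sum to 1.5 and each lies in [0, 1].
result = pocs([5.0, -3.0, 0.0], a=[1.0, 1.0, 1.0], b=1.5, lo=0.0, hi=1.0)
print(result)
```

Because both sets are convex and their intersection is non-empty, the iterates settle on a point satisfying both constraints. In an imputation setting, each set would instead encode one piece of prior knowledge about the missing entries.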


Cell | 2014

The Architecture of Parent-of-Origin Effects in Mice

Richard Mott; Wei Yuan; Pamela J. Kaisaki; Xiangchao Gan; James Cleak; Andrew Edwards; Amelie Baud; Jonathan Flint

The number of imprinted genes in the mammalian genome is predicted to be small, yet we show here, in a survey of 97 traits measured in outbred mice, that most phenotypes display parent-of-origin effects that are partially confounded with family structure. To address this contradiction, using reciprocal F1 crosses, we investigated the effects of knocking out two nonimprinted candidate genes, Man1a2 and H2-ab1, that reside at nonimprinted loci but that show parent-of-origin effects. We show that expression of multiple genes becomes dysregulated in a sex-, tissue-, and parent-of-origin-dependent manner. We provide evidence that nonimprinted genes can generate parent-of-origin effects by interaction with imprinted loci and deduce that the importance of the number of imprinted genes is secondary to their interactions. We propose that this gene network effect may account for some of the missing heritability seen when comparing sibling-based to population-based studies of the phenotypic effects of genetic variants.


Current Biology | 2011

Regenerant Arabidopsis Lineages Display a Distinct Genome-Wide Spectrum of Mutations Conferring Variant Phenotypes

Caifu Jiang; Aziz Mithani; Xiangchao Gan; Eric J. Belfield; John P. Klingler; Jian-Kang Zhu; Jiannis Ragoussis; Richard Mott; Nicholas P. Harberd

Multicellular organisms can be regenerated from totipotent differentiated somatic cell or nuclear founders [1–3]. Organisms regenerated from clonally related isogenic founders might a priori have been expected to be phenotypically invariant. However, clonal regenerant animals display variant phenotypes caused by defective epigenetic reprogramming of gene expression [2], and clonal regenerant plants exhibit poorly understood heritable phenotypic (“somaclonal”) variation [4–7]. Here we show that somaclonal variation in regenerant Arabidopsis lineages is associated with genome-wide elevation in DNA sequence mutation rate. We also show that regenerant mutations comprise a distinctive molecular spectrum of base substitutions, insertions, and deletions that probably results from decreased DNA repair fidelity. Finally, we show that while regenerant base substitutions are a likely major genetic cause of the somaclonal variation of regenerant Arabidopsis lineages, transposon movement is unlikely to contribute substantially to that variation. We conclude that the phenotypic variation of regenerant plants, unlike that of regenerant animals, is substantially due to DNA sequence mutation.

Collaboration


Dive into Xiangchao Gan's collaborations.

Top Co-Authors

Richard Mott

University College London

Hong Yan

City University of Hong Kong

Amarjit Bhomra

Wellcome Trust Centre for Human Genetics

James Cleak

Wellcome Trust Centre for Human Genetics
