Publication


Featured research published by Xiang-Sun Zhang.


Bioinformatics | 2006

Inferring gene regulatory networks from multiple microarray datasets

Yong Wang; Trupti Joshi; Xiang-Sun Zhang; Dong Xu; Luonan Chen

MOTIVATION: Microarray gene expression data have increasingly become a common data source that can provide insights into biological processes at a system-wide level. One of the major problems with microarrays is that a dataset consists of relatively few time points with respect to a large number of genes, which makes the problem of inferring a gene regulatory network ill-posed. On the other hand, gene expression data generated by different groups worldwide are accumulating for many species and can be accessed from public databases or individual websites, although each experiment has only a limited number of time points. RESULTS: This paper proposes a novel method to combine multiple time-course microarray datasets from different conditions for inferring gene regulatory networks. The proposed method, called GNR (Gene Network Reconstruction tool), is based on linear programming and a decomposition procedure. The method theoretically ensures the derivation of the most consistent network structure with respect to all of the datasets, thereby not only significantly alleviating the problem of data scarcity but also remarkably improving the prediction reliability. We tested GNR using both simulated data and experimental data in yeast and Arabidopsis. The results demonstrate the effectiveness of GNR in predicting new gene regulatory relationships in yeast and Arabidopsis. AVAILABILITY: The software is available from http://zhangorup.aporc.org/bioinfo/grninfer/, http://digbio.missouri.edu/grninfer/ and http://intelligent.eic.osaka-sandai.ac.jp or upon request from the authors.


Bioinformatics | 2005

Haplotype reconstruction from SNP fragments by minimum error correction

Rui-Sheng Wang; Ling-Yun Wu; Zhenping Li; Xiang-Sun Zhang

MOTIVATION: Haplotype reconstruction based on aligned single nucleotide polymorphism (SNP) fragments aims to infer a pair of haplotypes from localized polymorphism data gathered through short genome fragment assembly. An important computational model of this problem is the minimum error correction (MEC) model, which has been mentioned several times in the literature. The model retrieves a pair of haplotypes by correcting the minimum number of SNPs in given genome fragments coming from an individual's DNA. RESULTS: In the first part of this paper, an exact algorithm for the MEC model is presented. Owing to the NP-hardness of the MEC model, we also design a genetic algorithm (GA). The designed GA is intended to solve large problems and has very good performance. The strengths and weaknesses of the MEC model are shown using experimental results on real data and simulation data. In the second part of this paper, to improve the MEC model for haplotype reconstruction, a new computational model is proposed, which simultaneously employs genotype information of an individual in the process of SNP correction, and is called MEC with genotype information (MEC/GI for short). Computational results on extensive datasets show that the new model has much higher accuracy in haplotype reconstruction than the pure MEC model.
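To make the MEC objective concrete, the following minimal sketch scores a candidate haplotype pair and finds an optimum by exhaustive search on a toy instance. It is not the exact algorithm or the genetic algorithm from the paper. The binary allele encoding, the `-` gap symbol, the function names, and the example fragments are all assumptions made for illustration.

```python
from itertools import product

GAP = "-"  # assumed symbol for a SNP site that a fragment does not cover


def distance(fragment, haplotype):
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_cost(fragments, h1, h2):
    """Charge each fragment the corrections needed to match its closer haplotype."""
    return sum(min(distance(f, h1), distance(f, h2)) for f in fragments)


def brute_force_mec(fragments):
    """Try every haplotype pair; exponential in the number of sites, toy use only."""
    n_sites = len(fragments[0])
    candidates = ["".join(bits) for bits in product("01", repeat=n_sites)]
    best = None
    for i, h1 in enumerate(candidates):
        for h2 in candidates[i:]:
            cost = mec_cost(fragments, h1, h2)
            if best is None or cost < best[0]:
                best = (cost, h1, h2)
    return best


# Hypothetical aligned fragments over five SNP sites (alleles encoded as 0/1).
fragments = ["0101-", "01-10", "1010-", "-0101", "0111-"]
cost, h1, h2 = brute_force_mec(fragments)
print(cost, h1, h2)
```

On this toy input the minimum cost is 1: three of the fragments conflict pairwise, so at least one SNP must be corrected whichever two haplotypes are chosen. Several haplotype pairs reach that cost, which is one illustration of why extra information such as the genotype can help select among MEC optima.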


Bioinformatics | 2007

Alignment of molecular networks by integer quadratic programming

Li Zhenping; Shihua Zhang; Yong Wang; Xiang-Sun Zhang; Luonan Chen

MOTIVATION: With more and more data on molecular networks (e.g. protein interaction networks, gene regulatory networks and metabolic networks) available, the discovery of conserved patterns or signaling pathways by comparing various kinds of networks among different species or within a species becomes an increasingly important problem. However, most of the conventional approaches either restrict comparative analysis to special structures, such as pathways, or adopt heuristic algorithms because of the computational burden. RESULTS: In this article, to find the conserved substructures, we develop an efficient algorithm for aligning molecular networks based on both molecule similarity and architecture similarity, by using integer quadratic programming (IQP). Such an IQP can be relaxed into the corresponding quadratic programming (QP) problem, which almost always ensures an integer solution, thereby making molecular network alignment tractable without any approximation. The proposed framework is very flexible and can be applied to many kinds of molecular networks including weighted and unweighted, directed and undirected networks with or without loops. AVAILABILITY: Matlab code and data are available from http://zhangroup.aporc.org/bioinfo/MNAligner or http://intelligent.eic.osaka-sandai.ac.jp/chenen/software/MNAligner, or upon request from the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
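As a rough illustration of an alignment objective that combines molecule similarity with architecture similarity, the sketch below scores a one-to-one node mapping as a weighted sum of node similarities and conserved edges, then searches all mappings on a toy instance. The exact objective form, the weight `lam`, the function names, and the example matrices are assumptions; the abstract does not spell them out, and the brute-force search stands in for the QP relaxation used in the paper.

```python
from itertools import permutations


def alignment_score(A, B, S, mapping, lam=0.5):
    """Weighted sum of node similarity and conserved edges for a node mapping."""
    node_term = sum(S[i][j] for i, j in mapping.items())
    # Every ordered pair (i, k) with an edge in A whose images also share an edge in B.
    edge_term = sum(
        A[i][k] * B[j][l]
        for i, j in mapping.items()
        for k, l in mapping.items()
    )
    return lam * node_term + (1 - lam) * edge_term


def best_alignment(A, B, S, lam=0.5):
    """Enumerate all injective mappings from nodes of A to nodes of B (toy sizes only)."""
    best_map, best_score = None, float("-inf")
    for perm in permutations(range(len(B)), len(A)):
        mapping = dict(enumerate(perm))
        score = alignment_score(A, B, S, mapping, lam)
        if score > best_score:
            best_map, best_score = mapping, score
    return best_map, best_score


# Hypothetical toy networks: A is the path 0-1-2, B is a path whose centre is node 2.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
B = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
# Hypothetical similarity scores: only the pair (node 0 of A, node 0 of B) is similar.
S = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]

mapping, score = best_alignment(A, B, S)
print(mapping, score)
```

Here the best mapping sends the centre of one path to the centre of the other, and the single nonzero similarity entry decides which end maps to which.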


Bioinformatics | 2012

Efficient methods for identifying mutated driver pathways in cancer

Junfei Zhao; Shihua Zhang; Ling-Yun Wu; Xiang-Sun Zhang

MOTIVATION: The first step for clinical diagnostics, prognostics and targeted therapeutics of cancer is to comprehensively understand its molecular mechanisms. Large-scale cancer genomics projects are providing a large volume of data about genomic, epigenomic and gene expression aberrations in multiple cancer types. One of the remaining challenges is to identify driver mutations, driver genes and driver pathways promoting cancer proliferation and filter out the non-functional and passenger ones. RESULTS: In this study, we propose two methods to solve the so-called maximum weight submatrix problem, which is designed to identify mutated driver pathways de novo from mutation data in cancer. The first one is an exact method that can be helpful for assessing other approximate and/or heuristic algorithms. The second one is a stochastic and flexible method that can be employed to incorporate other types of information to improve the first method. In particular, we propose an integrative model to combine mutation and expression data. We first apply our methods to simulated data to show their efficiency. We further apply the proposed methods to several real biological datasets, such as the mutation profiles of 74 head and neck squamous cell carcinoma samples, 90 glioblastoma tumor samples and 313 ovarian carcinoma samples. The gene expression profiles were also considered for the latter two datasets. The results show that our integrative model can identify more biologically relevant gene sets. We have implemented all these methods and made a package called mutated driver pathway finder, which can be easily used by other researchers. AVAILABILITY: A MATLAB package of MDPFinder is available at http://zhangroup.aporc.org/ShiHuaZhang. CONTACT: [email protected]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
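The maximum weight submatrix problem can be illustrated on a toy mutation table. The sketch below assumes the weight commonly used for this problem, W(M) = 2 * (number of patients covered by gene set M) minus the sum of per-gene mutation counts, which rewards coverage and penalizes overlapping mutations. The abstract does not state this formula, so treat it as an assumption, along with the function names and the data. The exhaustive search is only a stand-in for the exact and stochastic methods of the paper.

```python
from itertools import combinations


def weight(mutations, genes):
    """W(M) = 2 * |patients covered by M| - sum of per-gene mutation counts."""
    covered = set().union(*(mutations[g] for g in genes))
    return 2 * len(covered) - sum(len(mutations[g]) for g in genes)


def best_gene_set(mutations, k):
    """Score every k-gene subset exhaustively; feasible only for small inputs."""
    return max(
        combinations(sorted(mutations), k),
        key=lambda genes: weight(mutations, genes),
    )


# Hypothetical toy data: gene -> set of patient ids carrying a mutation in that gene.
mutations = {
    "A": {1, 2, 3},
    "B": {4, 5},
    "C": {1, 2, 4},
    "D": {6},
}

best = best_gene_set(mutations, 2)
print(best, weight(mutations, best))
```

Genes A and B win in this example because together they cover five patients with no patient mutated in both, which is the high-coverage, mutually exclusive pattern the weight is meant to favor.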


Computers & Operations Research | 2006

Capacitated facility location problem with general setup cost

Lingyun Wu; Xiang-Sun Zhang; Ju-Liang Zhang

This paper presents an extension of the capacitated facility location problem (CFLP), in which general setup cost functions and multiple facilities in one site are considered. The setup costs consist of a fixed term (site setup cost) plus a second term (facility setup costs). The facility setup cost functions are generally non-linear functions of the size of the facility in the same site. Two equivalent mixed integer linear programming (MIP) models are formulated for the problem and solved by a general MIP solver. A Lagrangian heuristic algorithm (LHA) is also developed to find approximate solutions for this NP-hard problem. Extensive computational experiments are carried out on randomly generated data and on well-known existing data (with some necessary modifications). The detailed results are provided and the heuristic algorithm is shown to be efficient.


PLOS Computational Biology | 2009

Disease-Aging Network Reveals Significant Roles of Aging Genes in Connecting Genetic Diseases

Jiguang Wang; Shihua Zhang; Yong Wang; Luonan Chen; Xiang-Sun Zhang

One of the challenging problems in biology and medicine is exploring the underlying mechanisms of genetic diseases. Recent studies suggest that the relationship between genetic diseases and the aging process is important in understanding the molecular mechanisms of complex diseases. Although some intricate associations have been investigated for a long time, the studies are still in their early stages. In this paper, we construct a human disease-aging network to study the relationship among aging genes and genetic disease genes. Specifically, we integrate human protein-protein interactions (PPIs), disease-gene associations, aging-gene associations, and physiological system–based genetic disease classification information in a single graph-theoretic framework and find that (1) human disease genes are much closer to aging genes than expected by chance and (2) diseases can be categorized into two types according to their relationships with aging. Type I diseases have their genes significantly close to aging genes, while type II diseases do not. Furthermore, we examine the topological characteristics of the disease-aging network from a systems perspective. Theoretical results reveal that the genes of type I diseases occupy a central position in the PPI network while those of type II diseases do not. (3) More importantly, we define an asymmetric closeness based on the PPI network to describe relationships between diseases, and find that aging genes make a significant contribution to associations among diseases, especially among type I diseases. In conclusion, the network-based study provides not only evidence for the intricate relationship between the aging process and genetic diseases, but also biological implications for probing the nature of human diseases.


EPL | 2009

Modularity optimization in community detection of complex networks

Xiang-Sun Zhang; Rui-Sheng Wang; Yong Wang; Jiguang Wang; Yuqing Qiu; Li Wang; Luonan Chen

Detecting community structure in complex networks is a fundamental but challenging topic in network science. Modularity measures, such as the widely used modularity function Q and the recently suggested modularity density D, play critical roles as quality indices in partitioning a network into communities. In this letter, we reveal the complex behaviors of modularity optimization under different community definitions by an analytic study. Surprisingly, we find that in addition to the resolution limit of Q revealed in a recent study, both Q and D suffer from a more serious limitation, i.e. some derived communities do not satisfy the weak community definition or even the most weak community definition. In particular, the latter case, called misidentification, implies that these communities may have sparser connections within them than between them, which violates the basic intuition of what makes a subgraph a community. Using a discrete convex optimization framework, we investigate the underlying causes for these limitations and provide insights into the choice of modularity measures in applications. Numerical experiments on artificial and real-life networks confirm the theoretical analysis.
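The two quality indices and the weak community test can be computed directly for a given partition. The sketch below assumes the standard definitions for an unweighted, undirected graph: Q sums, over communities, the fraction of internal edges minus the squared fraction of total degree, and D sums (2 * internal edges - external edges) / community size. A community is taken to be weak when its total internal degree exceeds its number of external edges. These formulas are not given in the abstract and are stated here as assumptions, as are the function names and the toy graph.

```python
def internal_edges(edges, nodes):
    """Number of edges with both endpoints inside the community."""
    return sum(1 for u, v in edges if u in nodes and v in nodes)


def external_edges(edges, nodes):
    """Number of edges with exactly one endpoint inside the community."""
    return sum(1 for u, v in edges if (u in nodes) != (v in nodes))


def modularity_q(edges, communities):
    """Q = sum over communities of (m_c / m) - (d_c / 2m)^2."""
    m = len(edges)
    q = 0.0
    for nodes in communities:
        degree_sum = 2 * internal_edges(edges, nodes) + external_edges(edges, nodes)
        q += internal_edges(edges, nodes) / m - (degree_sum / (2 * m)) ** 2
    return q


def modularity_density_d(edges, communities):
    """D = sum over communities of (2 * internal - external) / |community|."""
    return sum(
        (2 * internal_edges(edges, nodes) - external_edges(edges, nodes)) / len(nodes)
        for nodes in communities
    )


def is_weak_community(edges, nodes):
    """Weak definition: total internal degree exceeds total external degree."""
    return 2 * internal_edges(edges, nodes) > external_edges(edges, nodes)


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities = [{0, 1, 2}, {3, 4, 5}]

q = modularity_q(edges, communities)
d = modularity_density_d(edges, communities)
print(q, d, [is_weak_community(edges, c) for c in communities])
```

For this partition Q equals 5/14 and D equals 10/3, and both triangles pass the weak community test. Running the same checks on the output of a modularity optimizer is one way to look for the misidentified communities the letter describes.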


Computational Biology and Chemistry | 2006

Brief communication: Identification of functional modules in a PPI network by clique percolation clustering

Shihua Zhang; Xuemei Ning; Xiang-Sun Zhang

Large-scale experiments and data integration have provided the opportunity to systematically analyze and comprehensively understand the topology of biological networks and biochemical processes in cells. Modular architecture, which encompasses groups of genes/proteins involved in elementary biological functional units, is a basic form of the organization of interacting proteins. Here we apply a graph clustering algorithm based on clique percolation clustering to detect overlapping network modules of a protein-protein interaction (PPI) network. Our analysis of the yeast Saccharomyces cerevisiae suggests that most of the detected modules correspond to one or more experimentally functional modules and half of these annotated modules match well with experimentally determined protein complexes. Our method of analysis can of course be applied to protein-protein interaction data for any species and even other biological networks.
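Clique percolation can be sketched in a few lines for small graphs. In the usual formulation, two k-cliques are adjacent when they share k-1 nodes, and a module is the union of a connected chain of adjacent k-cliques, so one node may belong to several modules. The brute-force clique enumeration, the function names, and the toy network below are illustrative assumptions and not the implementation used in the paper.

```python
from itertools import combinations


def k_cliques(adj, k):
    """All k-node subsets in which every pair of nodes is connected (brute force)."""
    return [
        frozenset(nodes)
        for nodes in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(nodes, 2))
    ]


def clique_percolation(edges, k=3):
    """Return overlapping modules as unions of k-cliques chained by k-1 shared nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    cliques = k_cliques(adj, k)

    # Union-find over clique indices.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    modules = {}
    for i, clique in enumerate(cliques):
        modules.setdefault(find(i), set()).update(clique)
    return sorted(modules.values(), key=sorted)


# Hypothetical toy PPI network: triangles abc and bcd share an edge, def shares only d.
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("b", "d"), ("c", "d"),
    ("d", "e"), ("d", "f"), ("e", "f"),
]
modules = clique_percolation(edges, k=3)
print(modules)
```

With k = 3 the toy network yields two modules, {a, b, c, d} and {d, e, f}, and protein d belongs to both. This overlap is what distinguishes clique percolation from partition-based clustering.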


Current Bioinformatics | 2006

Models and Algorithms for Haplotyping Problem

Xiang-Sun Zhang; Rui-Sheng Wang; Ling-Yun Wu; Luonan Chen

One of the main topics in genomics is to determine the relevance of DNA variations to genetic diseases. Single nucleotide polymorphism (SNP) is the most frequent and important form of genetic variation and involves a single DNA base. The values of a set of SNPs on a particular chromosome copy define a haplotype. Because of its importance in the studies of complex disease association, haplotyping is one of the central problems in bioinformatics. There are two classes of in silico haplotyping problems, i.e., single individual haplotyping and population haplotyping. In this review paper, we give an overview of the existing models and algorithms on this topic, report recent progress from the computational viewpoint and further discuss future research trends.


Journal of Proteome Research | 2008

The Knowledge-Integrated Network Biomarkers Discovery for Major Adverse Cardiac Events

Guangxu Jin; Xiaobo Zhou; Honghui Wang; Hong Zhao; Kemi Cui; Xiang-Sun Zhang; Luonan Chen; Stanley L. Hazen; King C. Li; Stephen T. C. Wong

Mass spectrometry (MS) technology in clinical proteomics is very promising for the discovery of new biomarkers for disease management. To overcome the obstacle of data noise in MS analysis, we proposed a new approach of knowledge-integrated biomarker discovery using data from Major Adverse Cardiac Events (MACE) patients. We first built a cardiovascular-related network based on protein information coming from protein annotations in Uniprot, protein-protein interaction (PPI), and a signal transduction database. Distinct from the previous machine learning methods in MS data processing, we then used statistical methods to discover biomarkers in the cardiovascular-related network. Through the tradeoff between known protein information and data noise in mass spectrometry data, we could finally identify high-confidence biomarkers. Most importantly, aided by the protein-protein interaction network, that is, the cardiovascular-related network, we proposed a new type of biomarker, the network biomarker, composed of a set of proteins and the interactions among them. The candidate network biomarkers can classify the two groups of patients more accurately than current single biomarkers that do not consider biological molecular interactions.

Collaboration

Top Co-Authors

Luonan Chen, Chinese Academy of Sciences

Yong Wang, Chinese Academy of Sciences

Shihua Zhang, Chinese Academy of Sciences

Ling-Yun Wu, Chinese Academy of Sciences

Rui-Sheng Wang, Pennsylvania State University

Zhi-Ping Liu, Chinese Academy of Sciences

Zhenping Li, Beijing Wuzi University

Qiang Huang, Chinese Academy of Sciences

Yuqing Qiu, Chinese Academy of Sciences