Publication


Featured research published by Xiaoshuai Zhang.


PLOS ONE | 2013

An integrative framework for Bayesian variable selection with informative priors for identifying genes and pathways.

Bin Peng; Dianwen Zhu; Bradley P. Ander; Xiaoshuai Zhang; Fuzhong Xue; Frank R. Sharp; Xiaowei Yang

The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses have difficulty incorporating correlational, structural, or functional relationships among the molecular measures. For microarray gene expression data, we first summarize solutions for dealing with ‘large p, small n’ problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point of view of systems biology, iBVS enables users to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilistic and biological interpretations. Both simulated data and a set of microarray data for predicting stroke status are used to validate the performance of iBVS in a probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiency are also discussed.


BMC Genetics | 2012

Detection for gene-gene co-association via kernel canonical correlation analysis

Zhongshang Yuan; Qingsong Gao; Yungang He; Xiaoshuai Zhang; Fangyu Li; Jinghua Zhao; Fuzhong Xue

Background: Currently, most methods for detecting gene-gene interaction (GGI) in genome-wide association studies (GWASs) are limited by their use of the single nucleotide polymorphism (SNP) as the unit of association. One way to address this drawback is to consider higher-level units such as genes or regions in the analysis. Earlier we proposed a statistic based on canonical correlations (CCU) as a gene-based method for detecting gene-gene co-association. However, it can capture only linear relationships, not nonlinear correlation between genes. We therefore proposed a counterpart (KCCU) based on kernel canonical correlation analysis (KCCA). Results: Through simulation, the KCCU statistic was shown to be a valid test and more powerful than the CCU statistic with respect to sample size and interaction odds ratio. Analysis of data from regions involving three genes on rheumatoid arthritis (RA) from Genetic Analysis Workshop 16 (GAW16) indicated that only the KCCU statistic was able to identify interactions reported earlier. Conclusions: The KCCU statistic is a valid and powerful gene-based method for detecting gene-gene co-association.


PLOS ONE | 2013

From interaction to co-association --a Fisher r-to-z transformation-based simple statistic for real world genome-wide association study.

Zhongshang Yuan; Hong Liu; Xiaoshuai Zhang; Fangyu Li; Jinghua Zhao; Furen Zhang; Fuzhong Xue

Currently, the genetic variants identified by genome-wide association studies (GWAS) generally account for only a small proportion of the total heritability of complex diseases. One crucial reason is the underutilization of gene-gene joint effects commonly encountered in GWAS, which include their main effects and co-association. However, gene-gene co-association is often vaguely placed within the framework of gene-gene interaction. From the causal graph perspective, we elucidate in detail the concept and rationale of gene-gene co-association as well as its relationship with traditional gene-gene interaction, and propose two Fisher r-to-z transformation-based simple statistics to detect it. Three series of simulations further highlight that gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, due not only to the traditional interaction under the nearly independent condition but also to the correlation between the two genes. The proposed statistics are more powerful than logistic regression under various situations, are not affected by linkage disequilibrium, and have an acceptable false positive rate as long as a reasonable GWAS data analysis roadmap is strictly followed. Furthermore, an application to gene pathway analysis associated with leprosy confirms in practice that the proposed gene-gene co-association concept and the corresponding statistics are strongly in line with reality.
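
As a rough illustration of the Fisher r-to-z idea named in the abstract above, the following Python sketch compares the correlation between two variables in cases with that in controls, using the textbook test for two independent correlations. It is not the pair of statistics proposed in the paper, whose exact form is not given on this page; the function names and the example numbers are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher r-to-z transformation: z = atanh(r), defined for -1 < r < 1."""
    return math.atanh(r)


def compare_correlations(r_case, n_case, r_ctrl, n_ctrl):
    """Textbook z test for the difference between two independent correlations.

    Returns the test statistic and a two-sided p value under the standard
    normal approximation. Each group needs more than 3 observations.
    """
    z_diff = fisher_z(r_case) - fisher_z(r_ctrl)
    # Approximate standard error of the difference of two Fisher z values.
    se = math.sqrt(1.0 / (n_case - 3) + 1.0 / (n_ctrl - 3))
    stat = z_diff / se
    # Two-sided p value: 2 * (1 - Phi(|stat|)) = erfc(|stat| / sqrt(2)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical correlations between two gene-level scores.
    stat, p = compare_correlations(r_case=0.6, n_case=103, r_ctrl=0.2, n_ctrl=103)
    print(stat, p)
```

A large absolute statistic suggests that the correlation between the two variables differs between cases and controls, which is the kind of signal a correlation-based co-association test looks for.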


BMJ Open | 2015

Detection for pathway effect contributing to disease in systems epidemiology with a case-control design

Jiadong Ji; Zhongshang Yuan; Xiaoshuai Zhang; Fangyu Li; Jing Xu; Ying Liu; Hongkai Li; Jia Wang; Fuzhong Xue

Objectives: Identification of pathway effects responsible for specific diseases has been one of the essential tasks in systems epidemiology. Despite some advances in procedures for distinguishing specific pathway (or network) topology between different disease statuses, statistical inference at a population level remains unsolved and further development is still needed. To identify the specific pathways contributing to diseases, we attempt to develop powerful statistics that can capture the complex relationships among risk factors. Setting and participants: Acute myeloid leukaemia (AML) data obtained from 133 adults (98 patients and 35 controls; 47% female). Results: Simulation studies indicated that the proposed Pathway Effect Measures (PEM) were stable; bootstrap-based methods outperformed the others, with the bias-corrected bootstrap CI method having the highest power. Application to real AML data successfully identified the specific pathway (Treg→TGFβ→Th17) effect contributing to AML, with p values less than 0.05 under various methods and a bias-corrected bootstrap CI of −0.214 to −0.020. This demonstrated that the Th17–Treg correlation balance was impaired in patients with AML, suggesting that Th17–Treg imbalance potentially plays a role in the pathogenesis of AML. Conclusions: The proposed bootstrap-based PEM are valid and powerful for detecting the specific pathway effect contributing to disease, thus potentially providing new insight into the underlying mechanisms and ways to study the disease effects of specific pathways more comprehensively.
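
The bias-corrected bootstrap confidence interval mentioned in the abstract above can be sketched generically in Python. This is a minimal version of the standard bias-corrected (BC, without acceleration) percentile interval for an arbitrary statistic; it does not implement the paper's Pathway Effect Measures, which are not defined on this page. The function name `bc_bootstrap_ci`, its parameters, and the toy data are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean


def bc_bootstrap_ci(data, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Bias-corrected bootstrap percentile interval for statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = statistic(data)

    # Resample with replacement and recompute the statistic each time.
    boots = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )

    # Bias-correction constant z0 from the share of replicates below theta_hat.
    prop = sum(b < theta_hat for b in boots) / n_boot
    # Clamp away from 0 and 1 so the inverse normal CDF stays finite.
    prop = min(max(prop, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    nd = NormalDist()
    z0 = nd.inv_cdf(prop)

    # Shift the nominal percentiles by 2 * z0.
    lo_p = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    hi_p = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo_idx = min(max(int(lo_p * n_boot), 0), n_boot - 1)
    hi_idx = min(max(int(hi_p * n_boot), 0), n_boot - 1)
    return boots[lo_idx], boots[hi_idx]


if __name__ == "__main__":
    # Toy data only; a real analysis would pass per-subject measurements.
    sample = [float(i) for i in range(1, 51)]
    print(bc_bootstrap_ci(sample, mean))
```

An interval that excludes zero, as reported for the Treg→TGFβ→Th17 pathway, is read as evidence of a nonzero effect at the chosen level.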


BMC Genetics | 2014

Integrative Bayesian variable selection with gene-based informative priors for genome-wide association studies.

Xiaoshuai Zhang; Fuzhong Xue; Hong Liu; Dianwen Zhu; Bin Peng; Joseph L. Wiemels; Xiaowei Yang

Background: Genome-wide association studies (GWAS) are typically designed to identify phenotype-associated single nucleotide polymorphisms (SNPs) individually using univariate analysis methods. Though providing valuable insights into genetic risks of common diseases, the genetic variants identified by GWAS generally account for only a small proportion of the total heritability for complex diseases. To solve this “missing heritability” problem, we implemented a strategy called integrative Bayesian Variable Selection (iBVS), which is based on a hierarchical model that incorporates an informative prior by considering the gene interrelationship as a network. It was applied here to both simulated and real data sets. Results: Simulation studies indicated that the iBVS method performed best, with the highest AUC in both variable selection and outcome prediction, when compared to Stepwise and LASSO based strategies. In an analysis of a leprosy case–control study, iBVS selected 94 SNPs as predictors, while LASSO selected 100 SNPs. The Stepwise regression yielded a more parsimonious model with only 3 SNPs. The prediction results demonstrated that the iBVS method performed comparably to LASSO and better than Stepwise strategies. Conclusions: The proposed iBVS strategy is a novel and valid method for genome-wide association studies, with the additional advantage that it produces more interpretable posterior probabilities for each variable, unlike LASSO and other penalized regression methods.


BMC Genetics | 2013

A powerful latent variable method for detecting and characterizing gene-based gene-gene interaction on multiple quantitative traits.

Fangyu Li; Jinghua Zhao; Zhongshang Yuan; Xiaoshuai Zhang; Jiadong Ji; Fuzhong Xue

Background: When thinking quantitatively about complex diseases, there are at least three statistical strategies for analyzing gene-gene interaction: SNP-by-SNP interaction on a single trait, gene-gene (each can involve multiple SNPs) interaction on a single trait, and gene-gene interaction on multiple traits. The third one is the most general in dissecting the genetic mechanism underlying complex diseases underpinning multiple quantitative traits. In this paper, we developed a novel statistic for this strategy by modifying Partial Least Squares Path Modeling (PLSPM), called the mPLSPM statistic. Results: Simulation studies indicated that the mPLSPM statistic was powerful and outperformed the principal component analysis (PCA) based linear regression method. Application to real data in the EPIC-Norfolk GWAS sub-cohort showed suggestive interaction (γ) between the TMEM18 gene and the BDNF gene on two composite body shape scores (γ = 0.047 and γ = 0.058, with P = 0.021, P = 0.005), and BMI (γ = 0.043, P = 0.034). This suggested that these scores (synthetically latent traits) were more suitable than a single trait for capturing the obesity-related genetic interaction effect between genes. Conclusions: The proposed novel mPLSPM statistic is a valid and powerful gene-based method for detecting gene-gene interaction on multiple quantitative phenotypes.


BMC Medical Research Methodology | 2016

Network or regression-based methods for disease discrimination: a comparison study.

Xiaoshuai Zhang; Zhongshang Yuan; Jiadong Ji; Hongkai Li; Fuzhong Xue

Background: In stark contrast to the network-centric view of complex disease, regression-based methods are preferred in disease prediction, especially by epidemiologists and clinical professionals. It remains controversial whether network-based methods perform better than regression-based methods, and to what extent. Methods: Simulations under different scenarios (the input variables are independent or in a network relationship) as well as an application were conducted to assess the prediction performance of four typical methods: Bayesian network, neural network, logistic regression and regression splines. Results: The simulation results reveal that Bayesian network showed better performance when the variables were in a network relationship or in a chain structure. For the special wheel network structure, logistic regression had a considerable performance compared to others. A further application to a GWAS of leprosy shows that Bayesian network still outperforms the other methods. Conclusion: Although regression-based methods are still popular and widely used, network-based approaches deserve more attention, since they capture the complex relationships between variables.


BMC Bioinformatics | 2016

A powerful score-based statistical test for group difference in weighted biological networks

Jiadong Ji; Zhongshang Yuan; Xiaoshuai Zhang; Fuzhong Xue

Background: Complex disease is largely determined by a number of biomolecules interwoven into networks, rather than a single biomolecule. A key but inadequately addressed issue is how to test possible differences of the networks between two groups. Group-level comparison of network properties may shed light on underlying disease mechanisms and benefit the design of drug targets for complex diseases. We therefore proposed a powerful score-based statistic to detect group difference in weighted networks, which simultaneously captures the vertex changes and edge changes. Results: Simulation studies indicated that the proposed network difference measure (NetDifM) was stable and outperformed other existing methods under various sample sizes and network topology structures. One application to real data from a GWAS of leprosy successfully identified the specific gene interaction network contributing to leprosy. For additional gene expression data of ovarian cancer, two candidate subnetworks, PI3K-AKT and Notch signaling pathways, were considered and identified respectively. Conclusions: The proposed method, accounting for the vertex changes and edge changes simultaneously, is valid and powerful for capturing the group difference of biological networks.


BMC Genetics | 2016

A powerful score-based test statistic for detecting gene-gene co-association

Jing Xu; Zhongshang Yuan; Jiadong Ji; Xiaoshuai Zhang; Hongkai Li; Xuesen Wu; Fuzhong Xue; Yanxun Liu

Background: The genetic variants identified by genome-wide association studies (GWAS) can account for only a small proportion of the total heritability for complex disease. The existence of gene-gene joint effects, which contain the main effects and their co-association, is one of the possible explanations for the “missing heritability” problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, due not only to the traditional interaction under the nearly independent condition but also to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Results: Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including the single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. Conclusions: SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.


Human Biology | 2014

Comparing partial least square approaches in a gene- or region-based association study for multiple quantitative phenotypes.

Zhongshang Yuan; Xiaoshuai Zhang; Fangyu Li; Jinghua Zhao; Fuzhong Xue

When thinking quantitatively about complex diseases, there are at least three statistical strategies for association studies: one single-nucleotide polymorphism (SNP) on a single trait, gene or region (with multiple SNPs) on a single trait, and gene or region on multiple traits. The third approach is the most general in dissecting genetic mechanisms underlying complex diseases underpinning multiple quantitative traits. Gene or region association methods based on partial least square (PLS) approaches have been shown to have an apparent power advantage. However, few approaches have been developed for multiple quantitative phenotypes or traits underlying a condition or disease, and the performance of various PLS approaches used in association studies for multiple quantitative traits has not been assessed. Here we exploit association between multiple SNPs and multiple phenotypes or traits, from a regression perspective, through exhaustive scan statistics (sliding window) using PLS and sparse PLS regressions. Simulations were conducted to assess the performance of the proposed scan statistics and compare them with existing methods. The proposed methods were applied to 12 regions of genome-wide association study data from the European Prospective Investigation of Cancer-Norfolk study.

Collaboration


Dive into Xiaoshuai Zhang's collaborations.

Top Co-Authors

Bin Peng

Chongqing Medical University

Dianwen Zhu

City University of New York

Xiaowei Yang

City University of New York
