Publication


Featured research published by Qiuying Sha.


BMC Bioinformatics | 2004

Joint analysis of two microarray gene-expression data sets to select lung adenocarcinoma marker genes

Hongying Jiang; Youping Deng; Huann Sheng Chen; Lin Tao; Qiuying Sha; Jun Chen; Chung-Jui Tsai; Shuanglin Zhang

Background: Due to the high cost and low reproducibility of many microarray experiments, it is not surprising to find a limited number of patient samples in each study, and very few common identified marker genes among different studies involving patients with the same disease. Therefore, it is of great interest and challenge to merge data sets from multiple studies to increase the sample size, which may in turn increase the power of statistical inferences. In this study, we combined two lung cancer studies using microarray GeneChip®, employed two gene shaving methods and a two-step survival test to identify genes with expression patterns that can distinguish diseased from normal samples, and to indicate patient survival, respectively.
Results: In addition to common data transformation and normalization procedures, we applied a distribution transformation method to integrate the two data sets. Gene shaving (GS) methods based on Random Forests (RF) and Fisher's Linear Discrimination (FLD) were then applied separately to the joint data set for cancer gene selection. The two methods discovered 13 and 10 marker genes (5 in common), respectively, with expression patterns differentiating diseased from normal samples. Among these marker genes, 8 and 7 were found to be cancer-related in other published reports. Furthermore, based on these marker genes, the classifiers we built from one data set predicted the other data set with more than 98% accuracy. Using the univariate Cox proportional hazard regression model, the expression patterns of 36 genes were found to be significantly correlated with patient survival (p < 0.05). Twenty-six of these 36 genes were reported as survival-related genes from the literature, including 7 known tumor-suppressor genes and 9 oncogenes. Additional principal component regression analysis further reduced the gene list from 36 to 16.
Conclusion: This study provided a valuable method of integrating microarray data sets with different origins, and new methods of selecting a minimum number of marker genes to aid in cancer diagnosis. After careful data integration, the classification method developed from one data set can be applied to the other with high prediction accuracy.


American Journal of Human Genetics | 2003

Transmission/Disequilibrium Test Based on Haplotype Sharing for Tightly Linked Markers

Shuanglin Zhang; Qiuying Sha; Huann Sheng Chen; Jianping Dong; Renfang Jiang

Studies using haplotypes of multiple tightly linked markers are more informative than those using a single marker. However, studies based on multimarker haplotypes have some difficulties. First, if we consider each haplotype as an allele and use the conventional single-marker transmission/disequilibrium test (TDT), then the rapid increase in the degrees of freedom with an increasing number of markers means that the statistical power of the conventional tests will be low. Second, the parental haplotypes cannot always be unambiguously reconstructed. In the present article, we propose a haplotype-sharing TDT (HS-TDT) for linkage or association between a disease-susceptibility locus and a chromosome region in which several tightly linked markers have been typed. This method is applicable to both quantitative traits and qualitative traits. It is applicable to any size of nuclear family, with or without ambiguous phase information, and it is applicable to any number of alleles at each of the markers. The degrees of freedom (in a broad sense) of the test increase linearly as the number of markers considered increases but do not increase as the number of alleles at the markers increases. Our simulation results show that the HS-TDT has the correct type I error rate in structured populations and that, in most cases, the power of HS-TDT is higher than the power of the existing single-marker TDTs and haplotype-based TDTs.
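
For context, the conventional single-marker TDT that this abstract contrasts with the HS-TDT can be sketched as a McNemar-type statistic on transmissions from heterozygous parents. This is a minimal illustration of the conventional biallelic test only, not of the HS-TDT itself; the function names and the counts in the usage lines are hypothetical.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """Conventional biallelic TDT statistic (chi-square, 1 df).

    transmitted / untransmitted: number of times heterozygous parents
    passed the allele of interest, or the other allele, to affected offspring.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    return (transmitted - untransmitted) ** 2 / total


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts, for illustration only.
stat = tdt_statistic(transmitted=60, untransmitted=40)
print(stat, chi2_1df_pvalue(stat))
```

With multi-allelic haplotypes treated as alleles, the analogous test has degrees of freedom that grow with the number of distinct haplotypes, which is the power problem the abstract describes.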


Genetic Epidemiology | 2012

Two Adaptive Weighting Methods to Test for Rare Variant Associations in Family‐Based Designs

Shurong Fang; Qiuying Sha; Shuanglin Zhang

Although next‐generation DNA sequencing technologies have made rare variant association studies feasible and affordable, the development of powerful statistical methods for rare variant association studies is still under way. Most of the existing methods for rare variant association studies compare the number of rare mutations in a group of rare variants (in a gene or a pathway) between cases and controls. However, these methods assume that all causal variants increase disease risk. Recently, several methods that are robust to the direction and magnitude of effects of causal variants have been proposed. However, they are applicable to unrelated individuals only, whereas family data have been shown to improve power to detect rare variants. In this article, we propose two adaptive weighting methods for rare variant association studies based on family data for quantitative traits. Using extensive simulation studies, we evaluate and compare our proposed methods with two methods based on the weights proposed by Madsen and Browning. Our results show that both proposed methods are robust to population stratification, robust to the direction and magnitude of the effects of causal variants, and more powerful than the methods using weights suggested by Madsen and Browning, especially when both risk and protective variants are present. Genet. Epidemiol. 36:499‐507, 2012.
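
The Madsen and Browning weights used as the comparison baseline can be sketched as follows. This assumes the original case-control form of the weighting (variant frequency estimated in unaffected individuals with a +1 pseudocount), not the family-based, quantitative-trait adaptation evaluated in the article; the function names and toy genotypes are hypothetical.

```python
import math


def madsen_browning_weights(genotypes, affected):
    """Per-variant weights w_j = sqrt(n * q_j * (1 - q_j)).

    genotypes: list of rows, one per individual, holding minor-allele
               counts (0, 1 or 2) for each variant.
    affected:  list of booleans, True for cases.
    q_j is estimated from unaffected individuals only, with a pseudocount
    so that q_j stays strictly between 0 and 1.
    """
    n_total = len(genotypes)
    n_variants = len(genotypes[0])
    unaffected_rows = [row for row, a in zip(genotypes, affected) if not a]
    n_u = len(unaffected_rows)
    weights = []
    for j in range(n_variants):
        m_u = sum(row[j] for row in unaffected_rows)  # minor alleles in controls
        q = (m_u + 1) / (2 * n_u + 2)
        weights.append(math.sqrt(n_total * q * (1 - q)))
    return weights


def burden_scores(genotypes, weights):
    """Weighted burden score per individual: sum_j genotype_ij / w_j."""
    return [sum(g / w for g, w in zip(row, weights)) for row in genotypes]


# Toy data, for illustration only: 4 individuals, 2 variants.
toy_genotypes = [[0, 0], [1, 0], [0, 1], [1, 1]]
toy_affected = [True, True, False, False]
toy_weights = madsen_browning_weights(toy_genotypes, toy_affected)
print(toy_weights, burden_scores(toy_genotypes, toy_weights))
```

Because every variant contributes with the same sign, a score of this kind treats all variants as acting in one direction, which is the limitation the adaptive weighting methods are meant to address.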


Annals of Human Genetics | 2006

A Combinatorial Searching Method for Detecting a Set of Interacting Loci Associated with Complex Traits

Qiuying Sha; Xiaofeng Zhu; Yijun Zuo; Richard S. Cooper; Shuanglin Zhang

Complex diseases are presumed to be the results of the interaction of several genes and environmental factors, with each gene only having a small effect on the disease. Mapping complex disease genes therefore becomes one of the greatest challenges facing geneticists. Most current approaches to association studies essentially evaluate one marker or one gene (haplotype approach) at a time. These approaches ignore the possibility that effects of multilocus functional genetic units may play a larger role than a single‐locus effect in determining trait variability. In this article, we propose a Combinatorial Searching Method (CSM) to detect a set of interacting loci (which may be unlinked) that predicts the complex trait. In the application of the CSM, a simple filter is used to filter all the possible locus‐sets and retain the candidate locus‐sets, then a new objective function based on the cross‐validation and partitions of the multi‐locus genotypes is proposed to evaluate the retained locus‐sets. The locus‐set with the largest value of the objective function is the final locus‐set and a permutation procedure is performed to evaluate the overall p‐value of the test for association between the final locus‐set and the trait. The performance of the method is evaluated by simulation studies as well as by being applied to a real data set. The simulation studies show that the CSM has reasonable power to detect high‐order interactions. When the CSM is applied to a real data set to detect the locus‐set (among the 13 loci in the ACE gene) that predicts systolic blood pressure (SBP) or diastolic blood pressure (DBP), we found that a four‐locus gene‐gene interaction model best predicts SBP with an overall p‐value = 0.033, and similarly a two‐locus gene‐gene interaction model best predicts DBP with an overall p‐value = 0.045.


Annals of Human Genetics | 2005

Tests of association between quantitative traits and haplotypes in a reduced-dimensional space

Qiuying Sha; Jianping Dong; Renfang Jiang; Shuanglin Zhang

Candidate gene association tests are currently performed using several intragenic SNPs simultaneously, by testing SNP haplotype or genotype effects in multifactorial diseases or traits. The number of haplotypes drastically increases with an increase in the number of typed SNPs. As a result, large numbers of haplotypes will introduce large degrees of freedom in haplotype‐based tests, and thus limit the power of the tests.


Genetic Epidemiology | 2008

An ensemble learning approach jointly modeling main and interaction effects in genetic association studies

Zhaogong Zhang; Shuanglin Zhang; Man Yu Wong; Nicholas J. Wareham; Qiuying Sha

Complex diseases are presumed to be the results of interactions of several genes and environmental factors, with each gene only having a small effect on the disease. Thus, the methods that can account for gene‐gene interactions to search for a set of marker loci in different genes or across genome and to analyze these loci jointly are critical. In this article, we propose an ensemble learning approach (ELA) to detect a set of loci whose main and interaction effects jointly have a significant association with the trait. In the ELA, we first search for “base learners” and then combine the effects of the base learners by a linear model. Each base learner represents a main effect or an interaction effect. The result of the ELA is easy to interpret. When the ELA is applied to analyze a data set, we can get a final model, an overall P‐value of the association test between the set of loci involved in the final model and the trait, and an importance measure for each base learner and each marker involved in the final model. The final model is a linear combination of some base learners. We know which base learner represents a main effect and which one represents an interaction effect. The importance measure of each base learner or marker can tell us the relative importance of the base learner or marker in the final model. We used intensive simulation studies as well as a real data set to evaluate the performance of the ELA. Our simulation studies demonstrated that the ELA is more powerful than the single‐marker test in all the simulation scenarios. The ELA also outperformed the other three existing multi‐locus methods in almost all cases. In an application to a large‐scale case‐control study for Type 2 diabetes, the ELA identified 11 single nucleotide polymorphisms that have a significant multi‐locus effect (P‐value=0.01), while none of the single nucleotide polymorphisms showed significant marginal effects and none of the two‐locus combinations showed significant two‐locus interaction effects. Genet. Epidemiol.


Annals of Human Genetics | 2009

A Variable-Sized Sliding-Window Approach for Genetic Association Studies via Principal Component Analysis

Rui Tang; Tao Feng; Qiuying Sha; Shuanglin Zhang

Recently, with the rapid improvements in high‐throughput genotyping techniques, researchers are facing the very challenging task of analysing large‐scale genetic associations, especially at the whole‐genome level, without an optimal solution. In this study, we propose a new approach for genetic association analysis that is based on a variable‐sized sliding‐window framework and employs principal component analysis to find the optimum window size. With the help of the bisection algorithm in window‐size searching, our method is more computationally efficient than available approaches. We evaluate the performance of the proposed method by comparing it with two other methods: a single‐marker method and a variable‐length Markov chain method. We demonstrate that, in most cases, the proposed method out‐performs the other two methods. Furthermore, since the proposed method is based on genotype data, it does not require any computationally intensive phasing program to account for uncertain haplotype phase.


Human Heredity | 2006

Analytical Correction for Multiple Testing in Admixture Mapping

Qiuying Sha; Xihuan Zhang; Xiaofeng Zhu; Shuanglin Zhang

Admixture mapping, using unrelated individuals from the admixture populations that result from recent mating between members of each parental population, is an efficient approach to localize disease-causing variants that differ in frequency between two or more historically separated populations. Recently, several methods have been proposed to test linkage between a susceptibility gene and a disease locus by using admixture-generated linkage disequilibrium (LD) for each of the genotyped markers. In a genome scan, admixture mapping usually tests 2,000 to 3,000 markers across the genome. Currently, either a very conservative Sidak (or Bonferroni) correction or a very time-consuming simulation-based method is used to correct for the multiple tests and evaluate the overall p value. In this report, we propose a computationally efficient analytical approach for correction of the multiple tests and for calculating the overall p value for an admixture genome scan. Except for the Sidak (or Bonferroni) correction, our proposed method is the first analytical approach for correction of the multiple tests and for calculating the overall p value for a genome scan. Our simulation studies show that the proposed method gives correct overall type I error rates for genome scans in all cases, and is much more computationally efficient than simulation-based methods.
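
The two baseline corrections named in this abstract are simple to state in code. The sketch below shows only the standard Sidak and Bonferroni formulas for an overall p value from the smallest single-marker p value; it does not attempt the analytical method proposed in the article. The function names and the example numbers are hypothetical.

```python
def bonferroni_overall_p(min_p: float, n_tests: int) -> float:
    """Bonferroni overall p value: n_tests * min_p, capped at 1."""
    return min(1.0, n_tests * min_p)


def sidak_overall_p(min_p: float, n_tests: int) -> float:
    """Sidak overall p value: 1 - (1 - min_p) ** n_tests.

    Exact when the n_tests tests are independent.
    """
    return 1.0 - (1.0 - min_p) ** n_tests


# Hypothetical scan: 2,000 markers, smallest single-marker p value 1e-5.
print(bonferroni_overall_p(1e-5, 2000), sidak_overall_p(1e-5, 2000))
```

Both formulas ignore correlation between tests. Markers in an admixture scan are strongly correlated through long-range admixture LD, which is why the abstract calls these corrections very conservative.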


PLOS ONE | 2016

Joint Analysis of Multiple Traits Using "Optimal" Maximum Heritability Test

Zhenchuan Wang; Qiuying Sha; Shuanglin Zhang

The joint analysis of multiple traits has recently become popular since it can increase statistical power to detect genetic variants and there is increasing evidence showing that pleiotropy is a widespread phenomenon in complex diseases. Currently, most existing methods use all of the traits for testing the association between multiple traits and a single variant. However, those methods for association studies may lose power in the presence of a large number of noise traits. In this paper, we propose an “optimal” maximum heritability test (MHT-O) to test the association between multiple traits and a single variant. MHT-O includes a procedure of deleting traits that have weak or no association with the variant. Using extensive simulation studies, we compare the performance of MHT-O with MHT, the Trait-based Association Test that uses Extended Simes procedure (TATES), SUM_SCORE and MANOVA. Our results show that, in all of the simulation scenarios, MHT-O is either the most powerful test or comparable to the most powerful test among the five tests we compared.


Genetic Epidemiology | 2011

An improved score test for genetic association studies

Qiuying Sha; Zhaogong Zhang; Shuanglin Zhang

Large‐scale genome‐wide association studies (GWAS) have become feasible recently because of the development of bead and chip technology. However, the success of GWAS partially depends on the statistical methods that are able to manage and analyze this sort of large‐scale data. Currently, the commonly used tests for GWAS include the Cochran–Armitage trend test, the allelic χ² test, the genotypic χ² test, the haplotypic χ² test, and the multi‐marker genotypic χ² test among others. From a methodological point of view, it is a great challenge to improve the power of commonly used tests, since these tests are commonly used precisely because they are already among the most powerful tests. In this article, we propose an improved score test that is uniformly more powerful than the score test based on the generalized linear model. Since the score test based on the generalized linear model includes the aforementioned commonly used tests as its special cases, our proposed improved score test is thus uniformly more powerful than these commonly used tests. We evaluate the performance of the improved score test by simulation studies and application to a real data set. Our results show that the power increases of the improved score test over the score test cannot be neglected in most cases. Genet. Epidemiol. 2011.
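
As a concrete instance of one of the commonly used tests listed above, the Cochran–Armitage trend test can be sketched from a 2 × 3 genotype table. This is a minimal illustration of the standard textbook statistic with additive scores (0, 1, 2), not the improved score test proposed in the article; the function name and the example counts are hypothetical.

```python
def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend statistic (chi-square, 1 df).

    cases, controls: genotype counts in the same genotype order as `scores`.
    """
    r_total = sum(cases)
    s_total = sum(controls)
    n_total = r_total + s_total
    columns = [r + s for r, s in zip(cases, controls)]

    # Trend numerator: sum_i x_i * (r_i * S - s_i * R)
    t = sum(x * (r * s_total - s * r_total)
            for x, r, s in zip(scores, cases, controls))

    # Variance of the numerator under the null hypothesis of no association.
    squares = sum(x * x * c * (n_total - c) for x, c in zip(scores, columns))
    cross = sum(scores[i] * scores[j] * columns[i] * columns[j]
                for i in range(len(scores))
                for j in range(i + 1, len(scores)))
    variance = (r_total * s_total / n_total) * (squares - 2 * cross)
    if variance == 0:
        raise ValueError("degenerate table: no variation in genotype or status")
    return t * t / variance


# Hypothetical genotype counts (aa, Aa, AA), for illustration only.
print(cochran_armitage_trend(cases=[10, 20, 30], controls=[30, 20, 10]))
```

With additive scores, this statistic equals the sample size times the squared Pearson correlation between genotype score and case status, which is the form in which it arises as a score test from a logistic regression model.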

Collaboration


Dive into Qiuying Sha's collaborations.

Top Co-Authors

Shuanglin Zhang
Michigan Technological University

Zhaogong Zhang
Michigan Technological University

Huann Sheng Chen
National Institutes of Health

Xuexia Wang
University of Wisconsin–Milwaukee

Zhenchuan Wang
Michigan Technological University

Adan Niu
Michigan Technological University

Jianping Dong
Michigan Technological University

Renfang Jiang
Michigan Technological University

Kui Zhang
University of Alabama at Birmingham

Rui Tang
Michigan Technological University