Publication


Featured research published by Shuangge Ma.


Annals of Statistics | 2008

Asymptotic properties of bridge estimators in sparse high-dimensional regression models

Jian Huang; Joel L. Horowitz; Shuangge Ma

We study the asymptotic properties of bridge estimators in sparse, high-dimensional, linear regression models when the number of covariates may increase to infinity with the sample size. We are particularly interested in the use of bridge estimators to distinguish between covariates whose coefficients are zero and covariates whose coefficients are nonzero. We show that under appropriate conditions, bridge estimators correctly select covariates with nonzero coefficients with probability converging to one and that the estimators of nonzero coefficients have the same asymptotic distribution that they would have if the zero coefficients were known in advance. Thus, bridge estimators have an oracle property in the sense of Fan and Li [J. Amer. Statist. Assoc. 96 (2001) 1348-1360] and Fan and Peng [Ann. Statist. 32 (2004) 928-961]. In general, the oracle property holds only if the number of covariates is smaller than the sample size. However, under a partial orthogonality condition in which the covariates of the zero coefficients are uncorrelated or weakly correlated with the covariates of nonzero coefficients, we show that marginal bridge estimators can correctly distinguish between covariates with nonzero and zero coefficients with probability converging to one even when the number of covariates is greater than the sample size.
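
For orientation, the objective functions usually associated with bridge and marginal bridge estimation can be written as below. This is a sketch in assumed notation (the symbols Y_i, x_{ij}, \lambda_n and \gamma do not appear in the abstract), and the restriction 0 < \gamma < 1 is the range commonly used when bridge penalties are meant to perform selection; it is not stated in the abstract itself.

```latex
% Bridge estimator: penalized least squares with an L_gamma penalty
\hat{\beta}_n = \arg\min_{\beta}
  \sum_{i=1}^{n} \Bigl( Y_i - \sum_{j=1}^{p_n} x_{ij}\beta_j \Bigr)^{2}
  + \lambda_n \sum_{j=1}^{p_n} |\beta_j|^{\gamma},
  \qquad 0 < \gamma < 1 .

% Marginal bridge estimator: each covariate enters its own
% univariate least-squares term, so p_n may exceed n
\tilde{\beta}_n = \arg\min_{\beta}
  \sum_{j=1}^{p_n} \sum_{i=1}^{n} \bigl( Y_i - x_{ij}\beta_j \bigr)^{2}
  + \lambda_n \sum_{j=1}^{p_n} |\beta_j|^{\gamma} .
```

Setting \gamma = 1 in the first display gives the Lasso and \gamma = 2 gives ridge regression, which is why the bridge family is often described as interpolating between the two.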


BMC Bioinformatics | 2007

Supervised group Lasso with applications to microarray data analysis

Shuangge Ma; Xiao Song; Jian Huang

Background: A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take the cluster structure into consideration. Results: We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion: We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods.
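
The cluster-selection step can be illustrated with a generic group Lasso solver. The sketch below is not the authors' implementation: it assumes a squared-error loss rather than the classification and survival losses used in the paper, omits the group-size weights that many formulations include, and uses hypothetical names (`group_soft_threshold`, `group_lasso`). It shows only the core idea, namely that the penalty shrinks the coefficients of a cluster together and sets whole clusters to zero.

```python
import numpy as np

def group_soft_threshold(beta, groups, threshold):
    """Proximal operator of threshold * sum_g ||beta_g||_2.

    Each group is shrunk toward zero as a block; a group whose
    Euclidean norm is at most `threshold` is set to zero entirely.
    """
    out = np.zeros_like(beta, dtype=float)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(beta[idx])
        if norm > threshold:
            out[idx] = (1.0 - threshold / norm) * beta[idx]
    return out

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient descent for
    (1 / 2n) * ||y - X b||^2 + lam * sum_g ||b_g||_2  (illustrative only)."""
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the gradient of the loss.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ (y - X @ beta) / n
        beta = group_soft_threshold(beta - step * grad, groups, step * lam)
    return beta

# Toy usage: two clusters of two "genes" each; only the first carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X[:, :2] @ np.array([2.0, -1.0])
groups = np.array([0, 0, 1, 1])
beta_hat = group_lasso(X, y, groups, lam=0.05)
```

In the two-step procedure described above, a solver of this kind would be applied in the second step, after the Lasso has been used within each cluster; the tuning parameter `lam` would be chosen by cross validation rather than fixed by hand.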


Statistical Science | 2012

A Selective Review of Group Selection in High-Dimensional Models

Jian Huang; Patrick Breheny; Shuangge Ma

Grouping structures arise naturally in many statistical modeling problems. Several methods have been proposed for variable selection that respect grouping structure in variables. Examples include the group LASSO and several concave group selection methods. In this article, we give a selective review of group selection concerning methodological developments, theoretical properties and computational algorithms. We pay particular attention to group selection methods involving concave penalties. We address both group selection and bi-level selection methods. We describe several applications of these methods in nonparametric additive models, semiparametric regression, seemingly unrelated regressions, genomic data analysis and genome wide association studies. We also highlight some issues that require further study.
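
As a point of reference, group selection criteria of the kind reviewed here are commonly written in the form below. The notation is assumed for illustration (J groups, X_j the design sub-matrix of group j with d_j columns, \rho a generic concave penalty) and is not taken from the abstract.

```latex
% Group LASSO: an unsquared Euclidean norm on each coefficient block
\frac{1}{2n}\Bigl\| y - \sum_{j=1}^{J} X_j \beta_j \Bigr\|_2^{2}
  + \lambda \sum_{j=1}^{J} \sqrt{d_j}\, \|\beta_j\|_2

% Concave group selection: replace the linear penalty on each
% block norm with a concave function \rho of that norm
\frac{1}{2n}\Bigl\| y - \sum_{j=1}^{J} X_j \beta_j \Bigr\|_2^{2}
  + \sum_{j=1}^{J} \rho\bigl( \|\beta_j\|_2 ;\, \sqrt{d_j}\,\lambda \bigr)
```

Both forms select or drop a group as a whole. Bi-level selection methods additionally allow individual coefficients inside a selected group to be zero.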


BMC Bioinformatics | 2009

Regularized gene selection in cancer microarray meta-analysis

Shuangge Ma; Jian Huang

Background: In cancer studies, it is common that multiple microarray experiments are conducted to measure the same clinical outcome and expressions of the same set of genes. An important goal of such experiments is to identify a subset of genes that can potentially serve as predictive markers for cancer development and progression. Analyses of individual experiments may lead to unreliable gene selection results because of the small sample sizes. Meta analysis can be used to pool multiple experiments, increase statistical power, and achieve more reliable gene selection. The meta analysis of cancer microarray data is challenging because of the high dimensionality of gene expressions and the differences in experimental settings amongst different experiments. Results: We propose a Meta Threshold Gradient Descent Regularization (MTGDR) approach for gene selection in the meta analysis of cancer microarray data. The MTGDR has many advantages over existing approaches. It allows different experiments to have different experimental settings. It can account for the joint effects of multiple genes on cancer, and it can select the same set of cancer-associated genes across multiple experiments. Simulation studies and analyses of multiple pancreatic and liver cancer experiments demonstrate the superior performance of the MTGDR. Conclusion: The MTGDR provides an effective way of analyzing multiple cancer microarray studies and selecting reliable cancer-associated genes.
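
The abstract does not spell out the algorithm, so the following is only a sketch of the generic threshold gradient descent idea that the method's name refers to, for a single study with squared-error loss. It is not the MTGDR itself, which pools several studies and is applied to cancer outcomes. The function name `tgdr_linear` and the parameters `tau` and `step` are hypothetical.

```python
import numpy as np

def tgdr_linear(X, y, tau, step=0.01, n_iter=100):
    """Threshold gradient descent for least squares (illustrative only).

    At each iteration only coordinates whose gradient magnitude is at
    least tau * (largest gradient magnitude) are updated. tau = 0 gives
    plain gradient descent; tau = 1 updates only the steepest coordinate(s).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (y - X @ beta)      # negative gradient of 0.5 * RSS
        max_abs = np.max(np.abs(grad))
        if max_abs == 0.0:               # already at a stationary point
            break
        active = np.abs(grad) >= tau * max_abs
        beta += step * grad * active     # inactive coordinates stay put
    return beta

# Toy usage: a larger tau together with early stopping gives sparser fits.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = 1.5 * X[:, 0] - 2.0 * X[:, 3]
beta_sparse = tgdr_linear(X, y, tau=0.9, step=0.005, n_iter=50)
```

Regularization here comes from the threshold `tau` and the number of iterations rather than from an explicit penalty term; both would normally be chosen by cross validation.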


Briefings in Bioinformatics | 2008

Penalized feature selection and classification in bioinformatics

Shuangge Ma; Jian Huang

In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifiers and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classification techniques, which belong to the family of embedded feature selection methods, for bioinformatics studies with high-dimensional input. Classification objective functions, penalty functions and computational algorithms are discussed. Our goal is to make interested researchers aware of these feature selection and classification methods that are applicable to high-dimensional bioinformatics data.


Nature Genetics | 2015

Exome sequencing identifies recurrent mutations in NF1 and RASopathy genes in sun-exposed melanomas

Michael Krauthammer; Yong Kong; Antonella Bacchiocchi; Perry Evans; Natapol Pornputtapong; Cen Wu; James P. McCusker; Shuangge Ma; Elaine Cheng; Robert Straub; Merdan Serin; Marcus Bosenberg; Stephan Ariyan; Deepak Narayan; Mario Sznol; Harriet M. Kluger; Shrikant Mane; Joseph Schlessinger; Richard P. Lifton; Ruth Halaban

We report on whole-exome sequencing (WES) of 213 melanomas. Our analysis established NF1, encoding a negative regulator of RAS, as the third most frequently mutated gene in melanoma, after BRAF and NRAS. Inactivating NF1 mutations were present in 46% of melanomas expressing wild-type BRAF and RAS, occurred in older patients and showed a distinct pattern of co-mutation with other RASopathy genes, particularly RASA2. Functional studies showed that NF1 suppression led to increased RAS activation in most, but not all, melanoma cases. In addition, loss of NF1 did not predict sensitivity to MEK or ERK inhibitors. The rebound pathway, as seen by the induction of phosphorylated MEK, occurred in cells both sensitive and resistant to the studied drugs. We conclude that NF1 is a key tumor suppressor lost in melanomas, and that concurrent RASopathy gene mutations may enhance its role in melanomagenesis.


BMC Bioinformatics | 2010

Detection of gene pathways with predictive power for breast cancer prognosis

Shuangge Ma; Michael R. Kosorok

Background: Prognosis is of critical interest in breast cancer research. Biomedical studies suggest that genomic measurements may have independent predictive power for prognosis. Gene profiling studies have been conducted to search for predictive genomic measurements. Genes have an inherent pathway structure, where pathways are composed of multiple genes with coordinated functions. The goal of this study is to identify gene pathways with predictive power for breast cancer prognosis. Since our goal is fundamentally different from that of existing studies, a new pathway analysis method is proposed. Results: The new method advances beyond existing alternatives in the following aspects. First, it can assess the predictive power of gene pathways, whereas existing methods tend to focus on model fitting accuracy only. Second, it can account for the joint effects of multiple genes in a pathway, whereas existing methods tend to focus on the marginal effects of genes. Third, it can accommodate multiple heterogeneous datasets, whereas existing methods analyze a single dataset only. We analyze four breast cancer prognosis studies and identify 97 pathways with significant predictive power for prognosis. Important pathways missed by alternative methods are identified. Conclusions: The proposed method provides a useful alternative to existing pathway analysis methods. Identified pathways can provide further insights into breast cancer prognosis.


Thyroid | 2009

A Birth Cohort Analysis of the Incidence of Papillary Thyroid Cancer in the United States, 1973–2004

Cairong Zhu; Tongzhang Zheng; Briseis A. Kilfoy; Xuesong Han; Shuangge Ma; Yue Ba; Yana Bai; Rong Wang; Yong Zhu; Yawei Zhang

Background: The incidence of papillary thyroid cancer has been reported to be increasing during the past three decades, with a 65-126% increase between 1975 and 2004. The reason for the increase is currently unknown. This study examined the incidence pattern of papillary thyroid cancer in the United States, and evaluated the components of birth cohort (defined based on year of birth), time period, and age as determinants of the observed time trend of the disease. Methods: Using the data from the National Cancer Institute's Surveillance, Epidemiology, and End Results program for 1973-2004, we conducted both univariate analysis and age-period-cohort modeling to evaluate birth cohort patterns and evaluate age, period, and cohort effects on incidence trends over time. Results: The increasing incidence showed a clear birth cohort pattern for both men and women. The results from age-period-cohort modeling showed that, while period effect appeared to have had an impact on the observed incidence trends, birth cohort effect may also explain part of the increasing trend in papillary thyroid carcinoma during the study period, especially among women. Conclusion: While a period effect that is likely due to advancements in diagnostic techniques and increased medical detection of small thyroid nodules may explain some of the observed increase in the incidence, we speculate that birth cohort-related changes in environmental exposures (such as increased exposure to diagnostic X-rays and polybrominated diphenyl ethers) have also contributed to the observed increase in papillary thyroid cancer during the past decades.


European Journal of Clinical Nutrition | 2011

Relationship of folate, vitamin B12 and methylation of insulin-like growth factor-II in maternal and cord blood

Yue Ba; Hebert Yu; Fudong Liu; Xue Geng; Cairong Zhu; Quan Zhu; Tongzhang Zheng; Shuangge Ma; Gang Wang; Zhiyuan Li; Yawei Zhang

Background/Objective: One of the speculated mechanisms underlying the fetal origin hypothesis of breast cancer is the possible influence of maternal environment on epigenetic regulation, such as changes in DNA methylation of the insulin-like growth factor-2 (IGF2) gene. The aim of the study is to investigate the relationship between folate, vitamin B12 and methylation of the IGF2 gene in maternal and cord blood. Subjects/Methods: We conducted a cross-sectional study to measure methylation patterns of IGF2 in promoter 2 (P2) and promoter 3 (P3). Results: The percentage of methylation in IGF2 P3 was higher in maternal blood than in cord blood (P<0.0001), whereas the methylation in P2 was higher in cord blood than in maternal blood (P=0.016). P3 methylation was correlated between maternal and cord blood (P<0.0001), but not P2 (P=0.06). The multivariate linear regression model showed that methylation patterns of both promoters in cord blood were not associated with serum folate levels in either cord or maternal blood, whereas the P3 methylation patterns were associated with serum levels of vitamin B12 in mothers' blood (mean change (MC)=−0.22, P=0.0014). Methylation patterns in P2 of maternal blood were associated with serum levels of vitamin B12 in mothers' blood (MC=−0.23, P=0.012), exposure to passive smoking (MC=0.46, P=0.034) and mothers' weight gain during pregnancy (MC=0.23, P=0.019). Conclusions: The study suggests that environment influences methylation patterns in maternal blood, and then the maternal patterns influence the methylation status and levels of folate and vitamin B12 in cord blood.


Annals of Statistics | 2011

The sparse Laplacian shrinkage estimator for high-dimensional regression

Jian Huang; Shuangge Ma; Hongzhe Li; Cun-Hui Zhang

We propose a new penalized method for variable selection and estimation that explicitly incorporates the correlation patterns among predictors. This method is based on a combination of the minimax concave penalty and Laplacian quadratic associated with a graph as the penalty function. We call it the sparse Laplacian shrinkage (SLS) method. The SLS uses the minimax concave penalty for encouraging sparsity and Laplacian quadratic penalty for promoting smoothness among coefficients associated with the correlated predictors. The SLS has a generalized grouping property with respect to the graph represented by the Laplacian quadratic. We show that the SLS possesses an oracle property in the sense that it is selection consistent and equal to the oracle Laplacian shrinkage estimator with high probability. This result holds in sparse, high-dimensional settings with p ≫ n under reasonable conditions. We derive a coordinate descent algorithm for computing the SLS estimates. Simulation studies are conducted to evaluate the performance of the SLS method and a real data example is used to illustrate its application.
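
The combination of penalties described in this abstract can be written out as follows. This is a sketch in assumed notation (b for the coefficient vector, a_{jk} for the edge weights of the graph, \lambda_1, \lambda_2 and \gamma for tuning parameters); the abstract itself gives no formulas, so the exact scaling may differ from the paper.

```latex
% Minimax concave penalty (MCP) applied to each coefficient
\rho(t; \lambda_1, \gamma)
  = \lambda_1 \int_{0}^{|t|} \Bigl( 1 - \frac{x}{\gamma \lambda_1} \Bigr)_{+} \, dx ,
  \qquad \gamma > 1 .

% SLS criterion: least squares + MCP for sparsity
% + Laplacian quadratic for smoothness over the graph
M(b; \lambda_1, \lambda_2, \gamma)
  = \frac{1}{2n} \| y - X b \|_2^{2}
  + \sum_{j=1}^{p} \rho\bigl( |b_j| ; \lambda_1, \gamma \bigr)
  + \frac{\lambda_2}{2} \sum_{1 \le j < k \le p}
      |a_{jk}| \, \bigl( b_j - s_{jk}\, b_k \bigr)^{2} ,
  \qquad s_{jk} = \operatorname{sgn}(a_{jk}) .
```

The quadratic term pulls b_j toward b_k when predictors j and k are positively connected in the graph, and toward -b_k when they are negatively connected, which is the smoothness effect the abstract refers to.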

Collaboration


Dive into Shuangge Ma's collaborations.

Top Co-Authors

Jin Liu

National University of Singapore

Yang Li

Renmin University of China

Nathaniel Rothman

National Institutes of Health

Qing Lan

National Institutes of Health

Xingjie Shi

Nanjing University of Finance and Economics
