Publication


Featured research published by Sijian Wang.


Statistics in Medicine | 2013

Variable selection for multiply-imputed data with application to dioxin exposure study

Qixuan Chen; Sijian Wang

Multiple imputation (MI) is a commonly used technique for handling missing data in large-scale medical and public health studies. However, variable selection on multiply-imputed data remains an important and longstanding statistical problem. If a variable selection method is applied to each imputed dataset separately, it may select different variables for different imputed datasets, which makes it difficult to interpret the final model or draw scientific conclusions. In this paper, we propose a novel multiple imputation-least absolute shrinkage and selection operator (MI-LASSO) variable selection method as an extension of the least absolute shrinkage and selection operator (LASSO) method to multiply-imputed data. The MI-LASSO method treats the estimated regression coefficients of the same variable across all imputed datasets as a group and applies the group LASSO penalty to yield a consistent variable selection across multiply-imputed datasets. We use a simulation study to demonstrate the advantage of the MI-LASSO method compared with the alternatives. We also apply the MI-LASSO method to the University of Michigan Dioxin Exposure Study to identify important circumstances and exposure factors that are associated with human serum dioxin concentration in Midland, Michigan.
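
The grouping idea in this abstract can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes a squared-error loss, a plain proximal-gradient solver, and hypothetical names (`mi_group_lasso`, `selected_variables`). Each column of the coefficient matrix holds one variable's coefficients across the imputed datasets, and the group penalty shrinks a whole column to zero at once, so a variable is kept or dropped in every imputed dataset together.

```python
import numpy as np


def group_soft_threshold(B, t):
    """Shrink each column of B (one variable across imputations) toward zero."""
    norms = np.linalg.norm(B, axis=0)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return B * scale


def mi_group_lasso(X_list, y_list, lam, n_iter=500):
    """Illustrative group-penalized least squares across imputed datasets.

    Minimizes sum_d ||y_d - X_d b_d||^2 / (2 n_d) + lam * sum_j ||B[:, j]||_2
    by proximal gradient descent. Row d of B holds the coefficients for
    imputed dataset d. Assumed setup, not taken from the paper.
    """
    n_datasets = len(X_list)
    n_features = X_list[0].shape[1]
    B = np.zeros((n_datasets, n_features))
    # A safe step size: inverse of the largest Lipschitz constant of the gradients.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in X_list)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        grad = np.stack([
            X.T @ (X @ b - y) / X.shape[0]
            for X, y, b in zip(X_list, y_list, B)
        ])
        B = group_soft_threshold(B - step * grad, step * lam)
    return B


def selected_variables(B):
    """Indices of variables whose coefficients are nonzero in the imputed fits."""
    return np.flatnonzero(np.linalg.norm(B, axis=0) > 0)
```

Calling `mi_group_lasso` with a list of imputed design matrices and the matching responses returns one coefficient row per imputed dataset; `selected_variables` then reports a single, shared set of selected variables.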


Biometrics | 2015

Regularized outcome weighted subgroup identification for differential treatment effects

Yaoyao Xu; Menggang Yu; Yingqi Zhao; Quefeng Li; Sijian Wang; Jun Shao

To facilitate comparative treatment selection when there is substantial heterogeneity of treatment effectiveness, it is important to identify subgroups that exhibit differential treatment effects. Existing approaches model outcomes directly and then define subgroups according to interactions between treatment and covariates. Because outcomes are affected by both the covariate-treatment interactions and covariate main effects, directly modeling outcomes can be hard due to model misspecification, especially in the presence of many covariates. Alternatively, one can work directly with differential treatment effect estimation. We propose such a method that approximates a target function whose value directly reflects correct treatment assignment for patients. The function uses patient outcomes as weights rather than modeling targets. Consequently, our method can deal with binary, continuous, time-to-event, and possibly contaminated outcomes in the same fashion. We first focus on identifying only directional estimates from linear rules that characterize important subgroups. We further consider estimation of comparative treatment effects for identified subgroups. We demonstrate the advantages of our method in simulation studies and in analyses of two real data sets.
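
To make "outcomes as weights rather than modeling targets" concrete, here is a rough sketch under stated assumptions. It is not the estimator from the paper: it assumes a two-arm trial with treatment coded as -1/+1, a nonnegative outcome where larger is better, a known propensity, a logistic surrogate loss, and an L1 penalty. The function name `fit_weighted_rule` is hypothetical. The received treatment is the classification label, and each patient's loss is weighted by outcome divided by propensity, so the fitted linear rule is pulled toward the treatment that patients with good outcomes actually received.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_weighted_rule(X, A, Y, propensity, lam=0.05, n_iter=2000):
    """Fit a sparse linear rule sign(X @ theta) by outcome-weighted classification.

    A: received treatment in {-1, +1}.
    Y: nonnegative outcome, larger is better (used only as a weight).
    propensity: probability of receiving the observed treatment.
    Illustrative assumptions throughout; not the paper's exact method.
    """
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)
    w = np.asarray(Y, dtype=float) / np.asarray(propensity, dtype=float)
    if np.any(w < 0):
        raise ValueError("weights must be nonnegative; shift the outcome first")
    n, p = X.shape
    Z = X * A[:, None]  # label-signed features
    # Upper bound on the curvature of the weighted logistic loss.
    lipschitz = w.max() * np.linalg.norm(X, 2) ** 2 / (4.0 * n)
    step = 1.0 / lipschitz
    theta = np.zeros(p)
    for _ in range(n_iter):
        margin = Z @ theta
        # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)), weighted by w.
        grad = -(Z * (w / (1.0 + np.exp(margin)))[:, None]).mean(axis=0)
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta
```

The sign of `X @ theta` is the recommended treatment, and covariates with nonzero entries in `theta` are the ones that define the subgroups.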


Biometrics | 2015

Simultaneous variable selection for joint models of longitudinal and survival outcomes

Zangdong He; Wanzhu Tu; Sijian Wang; Haoda Fu; Zhangsheng Yu

Joint models of longitudinal and survival outcomes have been used with increasing frequency in clinical investigations. Correct specification of fixed and random effects is essential for practical data analysis. Simultaneous selection of variables in both longitudinal and survival components functions as a necessary safeguard against model misspecification. However, variable selection in such models has not been studied. No existing computational tools, to the best of our knowledge, have been made available to practitioners. In this article, we describe a penalized likelihood method with adaptive least absolute shrinkage and selection operator (ALASSO) penalty functions for simultaneous selection of fixed and random effects in joint models. To perform selection in variance components of random effects, we reparameterize the variance components using a Cholesky decomposition; in doing so, a penalty function of group shrinkage is introduced. To reduce the estimation bias resulting from penalization, we propose a two-stage selection procedure in which the magnitude of the bias is ameliorated in the second stage. The penalized likelihood is approximated by Gaussian quadrature and optimized by an EM algorithm. A simulation study showed excellent selection results in the first stage and small estimation biases in the second stage. To illustrate, we analyzed a longitudinally observed clinical marker and patient survival in a cohort of patients with heart failure.
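
A small numerical sketch shows why a Cholesky reparameterization pairs naturally with group shrinkage. It assumes, as one common construction and not necessarily the paper's exact one, that the random-effects covariance is written as D = L Lᵀ with L lower triangular and that each row of L forms a penalty group. Shrinking a whole row of L to zero sets the matching variance and all of its covariances to zero, which removes that random effect while D stays positive semidefinite. The helper names are hypothetical.

```python
import numpy as np


def covariance_from_cholesky(L):
    """Random-effects covariance D = L L^T from a lower-triangular factor."""
    L = np.tril(np.asarray(L, dtype=float))
    return L @ L.T


def row_group_norms(L):
    """Euclidean norm of each row of L; a group penalty would sum these."""
    L = np.tril(np.asarray(L, dtype=float))
    return np.sqrt((L ** 2).sum(axis=1))


# Hypothetical factor for three random effects.
L_full = np.array([
    [1.0, 0.0, 0.0],
    [0.5, 0.8, 0.0],
    [0.2, -0.3, 0.6],
])

# Suppose group shrinkage has zeroed the second row (index 1).
L_selected = L_full.copy()
L_selected[1, :] = 0.0

D_selected = covariance_from_cholesky(L_selected)
# Row and column 1 of D_selected are all zero: the second random effect
# has been dropped, and D_selected is still a valid covariance matrix.
```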


Statistics in Medicine | 2013

Pathway index models for construction of patient-specific risk profiles.

Kevin H. Eng; Sijian Wang; William H. Bradley; Janet S. Rader; Christina Kendziorski

Statistical methods for variable selection, prediction, and classification have proven extremely useful in moving personalized genomics medicine forward, in particular, leading to a number of genomic-based assays now in clinical use for predicting cancer recurrence. Although invaluable in individual cases, the information provided by these assays is limited. Most often, a patient is classified into one of very few groups (e.g., recur or not), limiting the potential for truly personalized treatment. Furthermore, although these assays provide information on which individuals are at most risk (e.g., those for which recurrence is predicted), they provide no information on the aberrant biological pathways that give rise to the increased risk. We have developed an approach to address these limitations. The approach models a time-to-event outcome as a function of known biological pathways, identifies important genomic aberrations, and provides pathway-based patient-specific assessments of risk. As we demonstrate in a study of ovarian cancer from The Cancer Genome Atlas project, the patient-specific risk profiles are powerful and efficient characterizations useful in addressing a number of questions related to identifying informative patient subtypes and predicting survival.


Statistics in Medicine | 2015

Using distance covariance for improved variable selection with application to learning genetic risk models

Jing Kong; Sijian Wang; Grace Wahba

Variable selection is of increasing importance to address the difficulties of high dimensionality in many scientific areas. In this paper, we demonstrate a property for distance covariance, which is incorporated in a novel feature screening procedure together with the use of distance correlation. The approach makes no distributional assumptions for the variables and does not require the specification of a regression model and hence is especially attractive in variable selection given an enormous number of candidate attributes without much information about the true model with the response. The method is applied to two genetic risk problems, where issues including uncertainty of variable selection via cross validation, subgroup of hard-to-classify cases, and the application of a reject option are discussed.
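
For readers unfamiliar with distance correlation, the sketch below computes the standard sample version (double-centered pairwise distances) and ranks features by it. The ranking step is a generic marginal screen included for illustration only; it is not the screening procedure proposed in the paper, and the function names are hypothetical. Because no regression model is fitted, the score also picks up nonlinear dependence that an ordinary correlation would miss.

```python
import numpy as np


def _centered_distances(x):
    """Pairwise Euclidean distance matrix, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_covariance_sq(x, y):
    """Squared sample distance covariance (V-statistic form)."""
    return float((_centered_distances(x) * _centered_distances(y)).mean())


def distance_correlation(x, y):
    """Sample distance correlation, a value in [0, 1]."""
    dcov2 = distance_covariance_sq(x, y)
    denom = np.sqrt(distance_covariance_sq(x, x) * distance_covariance_sq(y, y))
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def screen_by_distance_correlation(X, y, k):
    """Rank the columns of X by distance correlation with y and keep the top k."""
    X = np.asarray(X, dtype=float)
    scores = np.array([distance_correlation(X[:, j], y) for j in range(X.shape[1])])
    top = np.argsort(scores)[::-1][:k]
    return top, scores
```

The pairwise distance matrices need memory on the order of n squared, so this direct form suits moderate sample sizes.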


Bone | 2016

Site-specific, adult bone benefits attributed to loading during youth: A preliminary longitudinal analysis.

Tamara A. Scerpella; Brittney Bernardoni; Sijian Wang; Paul J. Rathouz; Quefeng Li; Jodi N. Dowthwaite

We examined site-specific bone development in relation to childhood and adolescent artistic gymnastics exposure, comparing up to 10 years of prospectively acquired longitudinal data in 44 subjects, including 31 non-gymnasts (NON) and 13 gymnasts (GYM) who participated in gymnastics from pre-menarche to ≥1.9 years post-menarche. Subjects underwent annual regional and whole-body DXA scans; indices of bone geometry and strength were calculated. Anthropometrics, physical activity, and maturity were assessed annually, coincident with DXA scans. Non-linear mixed effect models centered growth in bone outcomes at menarche and adjusted for menarcheal age, height, and non-bone fat-free mass to evaluate GYM-NON differences. A POST-QUIT variable assessed the withdrawal effect of quitting gymnastics. Curves for bone area, mass (BMC), and strength indices were higher in GYM than NON at both distal radius metaphysis and diaphysis (p<0.0001). At the femoral neck, greater GYM BMC (p<0.01), narrower GYM endosteal diameter (p<0.02), and similar periosteal width (p=0.09) yielded GYM advantages in narrow neck cortical thickness and buckling ratio (both p<0.001; lower BR indicates lower fracture risk). Lumbar spine and sub-head BMC were greater in GYM than NON (p<0.036). Following gymnastics cessation, GYM slopes increased for distal radius diaphysis parameters (p≤0.01) and for narrow neck BR (p=0.02). At the distal radius metaphysis, GYM BMC and compressive strength slopes decreased, as did slopes for lumbar spine BMC, femoral neck BMC, and narrow neck cortical thickness (p<0.02). In conclusion, advantages in bone mass, geometry, and strength at multiple skeletal sites were noted across growth and into young adulthood in girls who participated in gymnastics loading to at least 1.9 years post-menarche. Following gymnastics cessation, advantages at cortical bone sites improved or stabilized, while advantages at corticocancellous sites stabilized or diminished. Additional longitudinal observation is necessary to determine whether residual loading benefits enhance lifelong skeletal strength.


Biostatistics | 2013

Lasso tree for cancer staging with survival data

Yunzhi Lin; Sijian Wang; Rick Chappell

The tumor-node-metastasis staging system has been the lynchpin of cancer diagnosis, treatment, and prognosis for many years. For meaningful clinical use, an orderly grouping of the T and N categories into a staging system needs to be defined, usually with respect to a time-to-event outcome. This can be reframed as a model selection problem with respect to features arranged on a partially ordered two-way grid, and a penalized regression method is proposed for selecting the optimal grouping. Instead of penalizing the L1-norm of the coefficients like lasso, in order to enforce the stage grouping, we place L1 constraints on the differences between neighboring coefficients. The underlying mechanism is the sparsity-enforcing property of the L1 penalty, which forces some estimated coefficients to be the same and hence leads to stage grouping. Partial ordering constraints are also required as both the T and N categories are ordinal. A series of optimal groupings with different numbers of stages can be obtained by varying the tuning parameter, which gives a tree-like structure offering a visual aid on how the groupings are progressively made. We hence call the proposed method the lasso tree. We illustrate the utility of our method by applying it to the staging of colorectal cancer using survival outcomes. Simulation studies are carried out to examine the finite sample performance of the selection procedure. We demonstrate that the lasso tree is able to give the right grouping with moderate sample size, is stable with regard to changes in the data, and is not affected by random censoring.
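
One way to write down the kind of criterion this abstract describes is given below. The notation is assumed for illustration and is not taken from the paper: β_{ij} is the effect of the grid cell with T category i and N category j, ℓ is a log-likelihood (or partial log-likelihood) for the time-to-event outcome, "∼" means that two cells are neighbors on the grid, and the L1 constraint is written in its penalized form with tuning parameter λ.

```latex
\min_{\beta}\; -\ell(\beta) \;+\; \lambda \sum_{(i,j)\sim(i',j')} \bigl|\beta_{ij} - \beta_{i'j'}\bigr|
\quad \text{subject to} \quad
\beta_{ij} \le \beta_{i+1,j}, \qquad \beta_{ij} \le \beta_{i,j+1}.
```

As λ grows, more neighboring differences are driven exactly to zero, so more cells share a coefficient and are merged into one stage. Tracing the merges over a range of λ values gives the tree-like display mentioned in the abstract.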


Biometrics | 2015

Group variable selection via convex log‐exp‐sum penalty with application to a breast cancer survivor study

Zhigeng Geng; Sijian Wang; Menggang Yu; Patrick O. Monahan; Victoria L. Champion; Grace Wahba

In many scientific and engineering applications, covariates are naturally grouped. When the group structures are available among covariates, people are usually interested in identifying both important groups and important variables within the selected groups. Among existing successful group variable selection methods, some methods fail to conduct the within group selection. Some methods are able to conduct both group and within group selection, but the corresponding objective functions are non-convex. Such a non-convexity may require extra numerical effort. In this article, we propose a novel Log-Exp-Sum (LES) penalty for group variable selection. The LES penalty is strictly convex. It can identify important groups as well as select important variables within the group. We develop an efficient group-level coordinate descent algorithm to fit the model. We also derive non-asymptotic error bounds and asymptotic group selection consistency for our method in the high-dimensional setting where the number of covariates can be much larger than the sample size. Numerical results demonstrate the good performance of our method in both variable selection and prediction. We applied the proposed method to an American Cancer Society breast cancer survivor dataset. The findings are clinically meaningful and may help design intervention programs to improve the quality of life for breast cancer survivors.


Statistical Methods in Medical Research | 2016

Advanced colorectal neoplasia risk stratification by penalized logistic regression.

Yunzhi Lin; Menggang Yu; Sijian Wang; Rick Chappell; Thomas F. Imperiale

Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered “average risk.” In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the L1-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance.
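
The penalty described here, an L1 norm on the coefficients plus an L1 norm on their differences, can be illustrated with a short function. This is a sketch under assumptions: each risk factor contributes one list of category coefficients, differences are taken between adjacent categories only, and two separate tuning parameters are used. The paper may define the differences and weights differently, and the function name is hypothetical.

```python
import numpy as np


def sparse_fused_penalty(coef_groups, lam_coef, lam_diff):
    """L1 penalty on category coefficients plus L1 penalty on adjacent differences.

    coef_groups: one sequence of category coefficients per risk factor.
    lam_coef: weight on |coefficient| (encourages dropping a variable).
    lam_diff: weight on |adjacent difference| (encourages merging categories).
    """
    total = 0.0
    for coefs in coef_groups:
        b = np.asarray(coefs, dtype=float)
        total += lam_coef * np.abs(b).sum()
        total += lam_diff * np.abs(np.diff(b)).sum()
    return float(total)


# Two hypothetical risk factors: the first has its first two categories
# merged (equal coefficients), the second is dropped entirely.
example_penalty = sparse_fused_penalty([[0.5, 0.5, 0.0], [0.0, 0.0]],
                                       lam_coef=1.0, lam_diff=2.0)
```

Equal adjacent coefficients add nothing to the difference term, which is why the fitted model tends to collapse neighboring categories into a single risk group.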


Pediatric Exercise Science | 2015

The influence of organized physical activity (including gymnastics) on young adult skeletal traits: Is maturity phase important?

Brittney Bernardoni; Tamara A. Scerpella; Paula F. Rosenbaum; Jill A. Kanaley; Lindsay N. Raab; Quefeng Li; Sijian Wang; Jodi N. Dowthwaite

We prospectively evaluated adolescent organized physical activity (PA) as a factor in adult female bone traits. Annual DXA scans accompanied semiannual records of anthropometry, maturity, and PA for 42 participants in this preliminary analysis (criteria: appropriately timed DXA scans at ~1 year premenarche [predictor] and ~5 years postmenarche [dependent variable]). Regression analysis evaluated total adolescent interscan PA and PA over 3 maturity subphases as predictors of young adult bone outcomes: 1) bone mineral content (BMC), geometry, and strength indices at nondominant distal radius and femoral neck; 2) subhead BMC; 3) lumbar spine BMC. Analyses accounted for baseline gynecological age (years pre- or postmenarche), baseline bone status, adult body size and interscan body size change. Gymnastics training was evaluated as a potentially independent predictor, but did not improve models for any outcomes (p > .07). Premenarcheal bone traits were strong predictors of most adult outcomes (semipartial r² = .21-.59, p ≤ .001). Adult 1/3 radius and subhead BMC were predicted by both total PA and PA 1-3 years postmenarche (p < .03). PA 3-5 years postmenarche predicted femoral narrow neck width, endosteal diameter, and buckling ratio (p < .05). Thus, participation in organized physical activity programs throughout middle and high school may reduce lifetime fracture risk in females.

Collaboration


Dive into Sijian Wang's collaborations.

Top Co-Authors

Menggang Yu

University of Wisconsin-Madison

Quefeng Li

University of North Carolina at Chapel Hill

Grace Wahba

University of Wisconsin-Madison

Brittney Bernardoni

University of Wisconsin-Madison

Jing Kong

University of Wisconsin-Madison

Tamara A. Scerpella

University of Wisconsin-Madison

Jodi N. Dowthwaite

State University of New York Upstate Medical University

Jun Shao

University of Wisconsin-Madison

Rick Chappell

University of Wisconsin-Madison
