Publication


Featured research published by Yajuan Si.


Journal of Educational and Behavioral Statistics | 2013

Nonparametric Bayesian Multiple Imputation for Incomplete Categorical Variables in Large-Scale Assessment Surveys

Yajuan Si

In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian, joint modeling approach to multiple imputation for categorical data based on Dirichlet process mixtures of multinomial distributions. The approach automatically models complex dependencies while being computationally expedient. The Dirichlet process prior distributions enable analysts to avoid fixing the number of mixture components at an arbitrary number. We illustrate repeated sampling properties of the approach using simulated data. We apply the methodology to impute missing background data in the 2007 Trends in International Mathematics and Science Study.


Statistical Science | 2013

Handling Attrition in Longitudinal Studies: The Case for Refreshment Samples

Yiting Deng; D. Sunshine Hillygus; Yajuan Si; Siyu Zheng

Panel studies typically suffer from attrition, which reduces sample size and can result in biased inferences. It is impossible to know whether or not the attrition causes bias from the observed panel data alone. Refreshment samples—new, randomly sampled respondents given the questionnaire at the same time as a subsequent wave of the panel—offer information that can be used to diagnose and adjust for bias due to attrition. We review and bolster the case for the use of refreshment samples in panel studies. We include examples of both a fully Bayesian approach for analyzing the concatenated panel and refreshment data, and a multiple imputation approach for analyzing only the original panel. For the latter, we document a positive bias in the usual multiple imputation variance estimator. We present models appropriate for three waves and two refreshment samples, including nonterminal attrition. We illustrate the three-wave analysis using the 2007-2008 Associated Press-Yahoo! News Election Poll.


Bayesian Analysis | 2015

Bayesian Nonparametric Weighted Sampling Inference

Yajuan Si; Natesh S. Pillai; Andrew Gelman

Survey weighting adjusts for known or expected differences between sample and population. Weights are constructed on design or benchmarking variables that are predictors of inclusion probability. In this paper, we assume that the only information we have about the weighting procedure is the values of the weights in the sample. We propose a hierarchical Bayesian approach in which we model the weights of the nonsampled units in the population and simultaneously include them as predictors in a nonparametric Gaussian process regression to yield valid inference for the underlying finite population and capture the uncertainty induced by sampling and the unobserved outcomes. We use simulation studies to evaluate the performance of our procedure and compare it to the classical design-based estimator. We apply our method to the Fragile Family Child Wellbeing Study. Our studies find the Bayesian nonparametric finite population estimator to be more robust than the classical design-based estimator without loss in efficiency.


Journal of Statistical Theory and Practice | 2011

A Comparison of Posterior Simulation and Inference by Combining Rules for Multiple Imputation

Yajuan Si

Multiple imputation is a common approach for handling missing data. It is also used by government agencies to protect confidential information in public use data files. One reason for the popularity of multiple imputation approaches is ease of use: analysts make inferences by combining point and variance estimates with simple rules. These combining rules are based on method of moments approximations to full Bayesian inference. With modern computing, however, it is as easy to perform the full Bayesian inference as it is to combine point and variance estimates. This begs the question: is there any advantage of using full Bayesian inference over multiple imputation combining rules? We use simulation studies to investigate this question. We find that, in general, the full Bayesian inference is not preferable to using the combining rules in multiple imputation for missing data. The full Bayesian inference can have advantages over the combining rules when using multiple imputation to protect confidential information.


Journal of Research on Educational Effectiveness | 2016

The Impact of Every Classroom, Every Day on High School Student Achievement: Results From a School-Randomized Trial

Diane M. Early; Juliette Berg; Stacey Alicea; Yajuan Si; J. Lawrence Aber; Richard M. Ryan

Every Classroom, Every Day (ECED) is a set of instructional improvement interventions designed to increase student achievement in math and English/language arts (ELA). ECED includes three primary components: (a) systematic classroom observations by school leaders, (b) intensive professional development and support for math teachers and instructional leaders to reorganize math instruction, assessment, and grading around mastery of benchmarks, and (c) a structured literacy curriculum that supplements traditional English courses, with accompanying professional development and support for teachers surrounding its use. The present study is a two-year trial, conducted by independent researchers, which employed a school-randomized design and included 20 high schools (10 treatment; 10 control) in five districts in four states. The students were ethnically diverse and most were eligible for free or reduced-price lunch. Results provided evidence that ECED improved scores on standardized tests of math achievement, but not standardized tests of ELA achievement. Findings are discussed in terms of differences between math and ELA and of implications for future large-scale school-randomized trials.


The Annals of Applied Statistics | 2016

Bayesian latent pattern mixture models for handling attrition in panel studies with refreshment samples

Yajuan Si; D. Sunshine Hillygus

Many panel studies collect refreshment samples: new, randomly sampled respondents who complete the questionnaire at the same time as a subsequent wave of the panel. With appropriate modeling, these samples can be leveraged to correct inferences for biases caused by non-ignorable attrition. We present such a model when the panel includes many categorical survey variables. The model relies on a Bayesian latent pattern mixture model, in which an indicator for attrition and the survey variables are modeled jointly via a latent class model. We allow the multinomial probabilities within classes to depend on the attrition indicator, which offers additional flexibility over standard applications of latent class models. We present results of simulation studies that illustrate the benefits of this flexibility. We apply the model to correct attrition bias in an analysis of data from the 2007-2008 Associated Press/Yahoo News election panel study.


Political Analysis | 2015

Semi-parametric Selection Models for Potentially Non-ignorable Attrition in Panel Studies with Refreshment Samples

Yajuan Si; D. Sunshine Hillygus


Archive | 2012

Nonparametric Bayesian Methods for Multiple Imputation of Large Scale Incomplete Categorical Data in Panel Studies

Yajuan Si


arXiv: Methodology | 2017

Bayesian hierarchical weighting adjustment and survey inference

Yajuan Si; Rob Trangucci; Jonah Gabry; Andrew Gelman


Archive | 2015

Graphical Visualization of Polling Results

Susanna Makela; Yajuan Si; Andrew Gelman

Collaboration


Dive into Yajuan Si's collaborations.

Top Co-Authors

Diane M. Early

University of North Carolina at Chapel Hill

Richard M. Ryan

Australian Catholic University
