Publication


Featured research published by Gongjun Xu.


Journal of the American Statistical Association | 2015

Statistical Analysis of Q-Matrix Based Diagnostic Classification Models

Yunxiao Chen; Jingchen Liu; Gongjun Xu; Zhiliang Ying

Diagnostic classification models (DCMs) have recently gained prominence in educational assessment, psychiatric evaluation, and many other disciplines. Central to the model specification is the so-called Q-matrix that provides a qualitative specification of the item-attribute relationship. In this article, we develop theories on the identifiability for the Q-matrix under the DINA and the DINO models. We further propose an estimation procedure for the Q-matrix through the regularized maximum likelihood. The applicability of this procedure is not limited to the DINA or the DINO model and it can be applied to essentially all Q-matrix based DCMs. Simulation studies show that the proposed method recovers the true Q-matrix with high probability. Furthermore, two case studies are presented. The first case is a dataset on fraction subtraction (educational application) and the second case is a subsample of the National Epidemiological Survey on Alcohol and Related Conditions concerning social anxiety disorder (psychiatric application).


Psychometrika | 2016

Identifiability of Diagnostic Classification Models

Gongjun Xu; Stephanie Zhang

Diagnostic classification models (DCMs) are important statistical tools in cognitive diagnosis. In this paper, we consider the issue of their identifiability. In particular, we focus on one basic and popular model, the DINA model. We propose sufficient and necessary conditions under which the model parameters are identifiable from the data. The consequences, in terms of the consistency of parameter estimates, of fulfilling or failing to fulfill these conditions are illustrated via simulation. The results can be easily extended to the DINO model through the duality of the DINA and DINO models. Moreover, the proposed theoretical framework could be applied to study the identifiability issue of other DCMs.


Annals of Statistics | 2017

Identifiability of restricted latent class models with binary responses

Gongjun Xu

Statistical latent class models are widely used in social and psychological research, yet it is often difficult to establish the identifiability of the model parameters. In this paper we consider the identifiability issue of a family of restricted latent class models, where the restriction structures are needed to reflect pre-specified assumptions on the related assessment. We establish the identifiability results in the strict sense and specify which types of restriction structure would give the identifiability of the model parameters. The results not only guarantee the validity of many of the popularly used models, but also provide a guideline for the related experimental design, where in the current applications the design is usually experience based and identifiability is not guaranteed. Theoretically, we develop a new technique to establish the identifiability result, which may be extended to other restricted latent class models.


Psychometrika | 2018

A Two-Stage Approach to Differentiating Normal and Aberrant Behavior in Computer Based Testing

Chun Wang; Gongjun Xu; Zhuoran Shang

Statistical methods for identifying aberrances on psychological and educational tests are pivotal to detect flaws in the design of a test or irregular behavior of test takers. Two approaches have been taken in the past to address the challenge of aberrant behavior detection: (1) modeling aberrant behavior via mixture modeling methods, and (2) flagging aberrant behavior via residual based outlier detection methods. In this paper, we propose a two-stage method that is conceived of as a combination of both approaches. In the first stage, a mixture hierarchical model is fitted to the response and response time data to distinguish normal and aberrant behaviors using a Markov chain Monte Carlo (MCMC) algorithm. In the second stage, a further distinction between rapid guessing and cheating behavior is made at a person level using a Bayesian residual index. Simulation results show that the two-stage method yields accurate item and person parameter estimates, as well as a high true detection rate and a low false detection rate, under different manipulated conditions mimicking NAEP parameters. A real data example is given at the end to illustrate the potential application of the proposed method.


Biometrika | 2016

An adaptive two-sample test for high-dimensional means

Gongjun Xu; Lifeng Lin; Peng Wei; Wei Pan

Several two-sample tests for high-dimensional data have been proposed recently, but they are powerful only against certain alternative hypotheses. In practice, since the true alternative hypothesis is unknown, it is unclear how to choose a powerful test. We propose an adaptive test that maintains high power across a wide range of situations and study its asymptotic properties. Its finite-sample performance is compared with that of existing tests. We apply it and other tests to detect possible associations between bipolar disease and a large number of single nucleotide polymorphisms on each chromosome based on data from a genome-wide association study. Numerical studies demonstrate the superior performance and high power of the proposed test across a wide spectrum of applications.


Journal of the American Statistical Association | 2017

Joint Scale-Change Models for Recurrent Events and Failure Time

Gongjun Xu; Sy Han Chiou; Chiung Yu Huang; Mei Cheng Wang; Jun Yan

Recurrent event data arise frequently in various fields such as biomedical sciences, public health, engineering, and social sciences. In many instances, the observation of the recurrent event process can be stopped by the occurrence of a correlated failure event, such as treatment failure and death. In this article, we propose a joint scale-change model for the recurrent event process and the failure time, where a shared frailty variable is used to model the association between the two types of outcomes. In contrast to the popular Cox-type joint modeling approaches, the regression parameters in the proposed joint scale-change model have marginal interpretations. The proposed approach is robust in the sense that no parametric assumption is imposed on the distribution of the unobserved frailty and that we do not need the strong Poisson-type assumption for the recurrent event process. We establish consistency and asymptotic normality of the proposed semiparametric estimators under suitable regularity conditions. To estimate the corresponding variances of the estimators, we develop a computationally efficient resampling-based procedure. Simulation studies and an analysis of hospitalization data from the Danish Psychiatric Central Register illustrate the performance of the proposed method. Supplementary materials for this article are available online.


Journal of the American Statistical Association | 2018

Identifying Latent Structures in Restricted Latent Class Models

Gongjun Xu; Zhuoran Shang

This article focuses on a family of restricted latent structure models with wide applications in psychological and educational assessment, where the model parameters are restricted via a latent structure matrix to reflect prespecified assumptions on the latent attributes. Such a latent matrix is often provided by experts and assumed to be correct upon construction, yet it may be subjective and misspecified. Recognizing this problem, researchers have been developing methods to estimate the matrix from data. However, the fundamental issue of the identifiability of the latent structure matrix has not been addressed until now. The first goal of this article is to establish identifiability conditions that ensure the estimability of the structure matrix. With the theoretical development, the second part of the article proposes a likelihood-based method to estimate the latent structure from the data. Simulation studies show that the proposed method outperforms the existing approaches. We further illustrate the method through a dataset in educational assessment. Supplementary materials for this article are available online.


British Journal of Mathematical and Statistical Psychology | 2016

On initial item selection in cognitive diagnostic computerized adaptive testing

Gongjun Xu; Chun Wang; Zhuoran Shang

There has recently been much interest in computerized adaptive testing (CAT) for cognitive diagnosis. While there exist various item selection criteria and different asymptotically optimal designs, these are mostly constructed based on the asymptotic theory assuming the test length goes to infinity. In practice, with limited test lengths, the desired asymptotic optimality may not always apply, and there are few studies in the literature concerning the optimal design of finite items. Related questions, such as how many items we need in order to be able to identify the attribute pattern of an examinee and what types of initial items provide the optimal classification results, are still open. This paper aims to answer these questions by providing non-asymptotic theory of the optimal selection of initial items in cognitive diagnostic CAT. In particular, for the optimal design, we provide necessary and sufficient conditions for the Q-matrix structure of the initial items. The theoretical development is suitable for a general family of cognitive diagnostic models. The results not only provide a guideline for the design of optimal item selection procedures, but also may be applied to guide item bank construction.


Psychometrika | 2018

The Sufficient and Necessary Condition for the Identifiability and Estimability of the DINA Model

Yuqi Gu; Gongjun Xu

Cognitive diagnosis models (CDMs) are useful statistical tools in cognitive diagnosis assessment. However, like many other latent variable models, CDMs often suffer from the non-identifiability issue. This work gives the sufficient and necessary condition for identifiability of the basic DINA model, which not only addresses the open problem in Xu and Zhang (Psychometrika 81:625–649, 2016) on the minimal requirement for identifiability, but also sheds light on the study of more general CDMs, which often cover DINA as a submodel. Moreover, we show that the identifiability condition ensures the consistent estimation of the model parameters. From a practical perspective, the identifiability condition depends only on the Q-matrix structure and is easy to verify, which would provide a guideline for designing statistically valid and estimable cognitive diagnosis tests.


Biometrics | 2018

Semiparametric estimation of the accelerated mean model with panel count data under informative examination times

Sy Han Chiou; Gongjun Xu; Jun Yan; Chiung Yu Huang

Panel count data arise when the number of recurrent events experienced by each subject is observed intermittently at discrete examination times. The examination time process can be informative about the underlying recurrent event process even after conditioning on covariates. We consider a semiparametric accelerated mean model for the recurrent event process and allow the two processes to be correlated through a shared frailty. The regression parameters have a simple marginal interpretation of modifying the time scale of the cumulative mean function of the event process. A novel estimation procedure for the regression parameters and the baseline rate function is proposed based on a conditioning technique. In contrast to existing methods, the proposed method is robust in the sense that it requires neither the strong Poisson-type assumption for the underlying recurrent event process nor a parametric assumption on the distribution of the unobserved frailty. Moreover, the distribution of the examination time process is left unspecified, allowing for arbitrary dependence between the two processes. Asymptotic consistency of the estimator is established, and the variance of the estimator is estimated by a model-based smoothed bootstrap procedure. Numerical studies demonstrated that the proposed point estimator and variance estimator perform well with practical sample sizes. The methods are applied to data from a skin cancer chemoprevention trial.

Collaboration


Top Co-Authors

Chun Wang
University of Minnesota

Jun Yan
University of Connecticut

Sy Han Chiou
University of Texas at Dallas

Kevin Leder
University of Minnesota

Wei Pan
University of Minnesota

Xianghua Luo
University of Minnesota