Publication


Featured research published by Jinming Zhang.


Psychometrika | 1999

The theoretical detect index of dimensionality and its application to approximate simple structure

Jinming Zhang; William Stout

In this paper, a theoretical index of dimensionality, called the theoretical DETECT index, is proposed to provide a theoretical foundation for the DETECT procedure. The purpose of DETECT is to assess certain aspects of the latent dimensional structure of a test, important to practitioners and researchers alike. Under reasonable modeling restrictions referred to as “approximate simple structure”, the theoretical DETECT index is proven to be maximized at the correct dimensionality-based partition of a test, where the number of item clusters in this partition corresponds to the number of substantively separate dimensions present in the test and by “correct” is meant that each cluster in this partition contains only items that correspond to the same separate dimension. It is argued that the separation into item clusters achieved by DETECT is appropriate from the applied perspective of desiring a partition into clusters that are interpretable as substantively distinct between clusters and substantively homogeneous within each cluster. Moreover, the maximum DETECT index value is a measure of the amount of multidimensionality present. The estimation of the theoretical DETECT index is discussed and a genetic algorithm is developed to effectively execute DETECT. The study of DETECT is facilitated by the recasting of two factor analytic concepts in a multidimensional item response theory setting: a dimensionally homogeneous item cluster and an approximate simple structure test.
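The core idea behind DETECT can be illustrated with a minimal sketch (not the authors' implementation): estimate each item pair's covariance conditional on the rest score, then score a candidate partition by adding within-cluster covariances and subtracting between-cluster ones. The function name and the simple rest-score pooling estimator here are illustrative assumptions.

```python
import numpy as np

def detect_index(responses, clusters):
    """Estimate a DETECT-style index for a given item partition.

    responses: (n_examinees, n_items) binary 0/1 matrix.
    clusters:  length-n_items sequence of cluster labels.
    The conditional covariance of an item pair is approximated by
    pooling examinees with the same rest score (total score on the
    remaining items), a common nonparametric estimator.
    """
    n, p = responses.shape
    total, pairs = 0.0, 0
    for i in range(p):
        for j in range(i + 1, p):
            rest = responses.sum(axis=1) - responses[:, i] - responses[:, j]
            cov = 0.0
            for s in np.unique(rest):
                grp = rest == s
                if grp.sum() > 1:
                    w = grp.sum() / n
                    cov += w * np.cov(responses[grp, i], responses[grp, j])[0, 1]
            # +1 for within-cluster pairs, -1 for between-cluster pairs
            sign = 1.0 if clusters[i] == clusters[j] else -1.0
            total += sign * cov
            pairs += 1
    return 100 * total / pairs  # DETECT values are conventionally scaled by 100
```

Under approximate simple structure, the correct partition makes within-cluster conditional covariances (positive) count positively and between-cluster ones (negative) count negatively, so it maximizes the index; the genetic algorithm in the paper searches over partitions for this maximum.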


Applied Psychological Measurement | 1996

Conditional Covariance-Based Nonparametric Multidimensionality Assessment.

William Stout; Brian Habing; Jeff Douglas; Hae Rim Kim; Louis Roussos; Jinming Zhang

According to the weak local independence approach to defining dimensionality, the fundamental quantities for determining a test's dimensional structure are the covariances of item-pair responses conditioned on examinee trait level. This paper describes three dimensionality assessment procedures, HCA/CCPROX, DIMTEST, and DETECT, that use estimates of these conditional covariances. All three procedures are nonparametric; that is, they do not depend on the functional form of the item response functions. These procedures are applied to a dimensionality study of the LSAT, which illustrates the capacity of the approaches to assess the lack of unidimensionality, identify groups of items manifesting approximate simple structure, determine the number of dominant dimensions, and measure the amount of multidimensionality.


Psychometrika | 1999

Conditional covariance structure of generalized compensatory multidimensional items

Jinming Zhang; William Stout

Some nonparametric dimensionality assessment procedures, such as DIMTEST and DETECT, use nonparametric estimates of item pair conditional covariances given an appropriately chosen subtest score as their basic building blocks. Such conditional covariances given some subtest score can be regarded as an approximation to the conditional covariances given an appropriately chosen unidimensional latent composite, where the composite is oriented in the multidimensional test space direction in which the subtest score measures best. In this paper, the structure and properties of such item pair conditional covariances given a unidimensional latent composite are thoroughly investigated, assuming a semiparametric IRT modeling framework called a generalized compensatory model. It is shown that such conditional covariances are highly informative about the multidimensionality structure of a test. The theory developed here is very useful in establishing properties of dimensionality assessment procedures, current and yet to be developed, that are based upon estimating such conditional covariances. In particular, the new theory is used to justify the DIMTEST procedure. Because of the importance of conditional covariance estimation, a new bias-reducing approach is presented. A byproduct of likely independent importance beyond the study of conditional covariances is a rigorous score-information-based definition of an item's and a score's direction of best measurement in the multidimensional test space.


Psychometrika | 2002

Hypergeometric family and item overlap rates in computerized adaptive testing

Hua Hua Chang; Jinming Zhang

A computerized adaptive test (CAT) is usually administered to small groups of examinees at frequent time intervals. It is often the case that examinees who take the test earlier share information with examinees who will take the test later, thus increasing the risk that many items may become known. Item overlap rate for a group of examinees refers to the number of overlapping items encountered by these examinees divided by the test length. For a specific item pool, different item selection algorithms may yield different item overlap rates. An important issue in designing a good CAT item selection algorithm is to keep item overlap rate below a preset level. In doing so, it is important to investigate what the lowest rate could be for all possible item selection algorithms. In this paper we rigorously prove that if every item has an equal probability of being selected from the pool in a fixed-length CAT, the number of overlapping items among any α randomly sampled examinees follows the hypergeometric distribution family for α ≥ 1. Thus, the expected values of the number of overlapping items among any randomly sampled α examinees can be calculated precisely. These values may serve as benchmarks in controlling item overlap rates for fixed-length adaptive tests.
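The two-examinee case of this result can be checked numerically: if each examinee independently receives a uniformly random m-item test from an N-item pool, the shared-item count is hypergeometric with mean m²/N. A small sketch with illustrative function names:

```python
import math

def overlap_pmf(N, m, k):
    """P(two examinees share exactly k items) when each independently
    receives a uniformly random m-item test from an N-item pool:
    hypergeometric with population N, m 'successes', and m draws."""
    return math.comb(m, k) * math.comb(N - m, m - k) / math.comb(N, m)

def expected_overlap(N, m):
    """Expected number of shared items; the closed form is m * m / N."""
    return sum(k * overlap_pmf(N, m, k) for k in range(m + 1))

# e.g., a 20-item fixed-length CAT from a 200-item pool:
# expected shared items = 20 * 20 / 200 = 2.0, so the overlap rate is 2/20 = 0.10
```

Because randomized selection attains this hypergeometric benchmark, m²/N serves as a floor against which the overlap produced by information-driven selection algorithms can be compared.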


Applied Psychological Measurement | 2012

Calibration of Response Data Using MIRT Models With Simple and Mixed Structures

Jinming Zhang

It is common to assume during a statistical analysis of a multiscale assessment that the assessment is composed of several unidimensional subtests or that it has simple structure. Under this assumption, the unidimensional and multidimensional approaches can be used to estimate item parameters. These two approaches are equivalent in parameter estimation if the joint maximum likelihood method is used. However, they are different from each other if the marginal maximum likelihood method is applied. A simulation study is conducted to further compare the unidimensional and multidimensional approaches with the marginal maximum likelihood method. The simulation results indicate that when the number of items is small, the multidimensional approach provides more accurate estimates of item parameters, whereas the unidimensional approach prevails if the number of items in each subtest is large enough. Furthermore, the impact of the violation of the simple structure assumption is also investigated theoretically and numerically. The results demonstrate that when a set of response data does not have a simple structure but is specified as such in calibration, the models will be incorrectly estimated and the correlation coefficients between abilities will be overestimated.


Psychometrika | 1997

On Holland's Dutch identity conjecture

Jinming Zhang; William Stout

The manifest probabilities of observed examinee response patterns resulting from marginalization with respect to the latent ability distribution produce the marginal likelihood function in item response theory. Under the conditions that the posterior distribution of examinee ability given some test response pattern is normal and the item logit functions are linear, Holland (1990a) gives a quadratic form for the log-manifest probabilities by using the Dutch Identity. Further, Holland conjectures that this special quadratic form is a limiting one for all “smooth” unidimensional item response models as test length tends to infinity. The purpose of this paper is to give three counterexamples to demonstrate that Holland's Dutch Identity conjecture does not hold in general. The counterexamples suggest that only under strong assumptions can it be true that the limits of log-manifest probabilities are quadratic. Three propositions giving sets of such strong conditions are given.


Applied Psychological Measurement | 2014

A Sequential Procedure for Detecting Compromised Items in the Item Pool of a CAT System

Jinming Zhang

To maintain the validity of a continuous testing system, such as computerized adaptive testing (CAT), items should be monitored to ensure that the performance of test items has not gone through any significant changes during their lifetime in an item pool. In this article, the author developed a sequential monitoring procedure based on a series of statistical hypothesis tests to examine whether the statistical characteristics of individual items have changed significantly during test administration. Simulation studies show that under the simulated setting, by choosing an appropriate cutoff point, the procedure can control the rate of Type I errors at any reasonable significance level while maintaining a very low rate of Type II errors.
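The flavor of such a procedure can be sketched as follows. This is a simplified stand-in, not the author's exact statistic: a one-sided z-test on each administration window's proportion-correct against the calibrated value, with an item flagged only after a cutoff number of consecutive rejections. All names and the choice of test are illustrative assumptions.

```python
import math
from statistics import NormalDist

def monitor_item(window_counts, p0, alpha=0.01, cutoff=3):
    """Flag an item as possibly compromised.

    window_counts: list of (correct, total) per administration window.
    p0:            proportion-correct expected from item calibration.
    Runs a one-sided z-test in each window and flags the item once
    `cutoff` consecutive windows reject H0: p <= p0 (an unusually
    easy-looking item suggests pre-knowledge). Requiring a streak of
    rejections guards the Type I error rate across repeated looks.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    streak = 0
    for correct, total in window_counts:
        p_hat = correct / total
        se = math.sqrt(p0 * (1 - p0) / total)
        streak = streak + 1 if (p_hat - p0) / se > z_crit else 0
        if streak >= cutoff:
            return True
    return False
```

For example, an item calibrated at p0 = 0.5 that suddenly shows 95/100 correct in three consecutive windows would be flagged, while ordinary sampling fluctuation (say 52/100) would not trigger the cutoff.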


Behavior Research Methods | 2012

Comparing single-pool and multiple-pool designs regarding test security in computerized testing

Jinming Zhang; Hua Hua Chang; Qing Yi

This article compares the use of single- and multiple-item pools with respect to test security against item sharing among some examinees in computerized testing. A simulation study was conducted to make a comparison among different pool designs using the item selection method of maximum item information with the Sympson–Hetter exposure control and content balance. The results from the simulation study indicate that two-pool designs are more resistant to item sharing than the single-pool design in terms of measurement precision in ability estimation. This article further characterizes the conditions under which employing a multiple-pool design is better than using a single, whole pool in terms of minimizing the number of compromised items encountered by examinees under a randomized item selection method. Although no current computerized testing program endorses the randomized item selection method, the results derived in this study can shed some light on item pool designs regarding test security for all item selection algorithms, especially those that try to equalize or balance item exposure rates by employing a randomized item selection method locally, such as the a-stratified-with-b-blocking method.


Educational Evaluation and Policy Analysis | 2010

An Investigation of Bias in Reports of the National Assessment of Educational Progress

Henry Braun; Jinming Zhang; Sailesh Vezzu

This article investigates plausible explanations for the observed heterogeneity among jurisdictions in the exclusion rates of students with disabilities and English language learners in administrations of the National Assessment of Educational Progress (NAEP). It also examines the operating characteristics of a particular class of methods, called full-population estimates (FPE), that carry out statistical adjustments to NAEP’s reported scores to address the possible bias due to differential exclusion rates. The conclusions are that for many states there is a strong likelihood of bias in the results reported and that neither the current NAEP procedure nor the FPE methodologies constitute an ideal solution to the problem. Some alternative methods and related research questions are indicated. It is suggested that the general strategy employed here may be useful in investigating similar questions in other large-scale assessment surveys.


Language Testing | 2014

Investigating correspondence between language proficiency standards and academic content standards: A generalizability theory study

Chih Kai Lin; Jinming Zhang

Research on the relationship between English language proficiency standards and academic content standards serves to provide information about the extent to which English language learners (ELLs) are expected to encounter academic language use that facilitates their content learning, such as in mathematics and science. Standards-to-standards correspondence thus contributes to validity evidence regarding ELL achievements in a standards-based assessment system. The current study aims to examine the reliability of reviewer judgments about language performance indicators associated with academic disciplines in standards-to-standards correspondence studies in US K–12 settings. Ratings of cognitive complexity germane to the language performance indicators were collected from 20 correspondence studies with over 500 reviewers, consisting of content experts and ESL specialists. Using generalizability theory, we evaluate reviewer reliability and standard errors of measurement in their ratings with respect to the number of reviewers. Results show that depending on the particular grades and subject areas, 3–6 reviewers are needed to achieve acceptable reliability and to control for reasonable measurement errors in their judgments.
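The "3–6 reviewers" style of result comes from a decision (D) study: for a one-facet persons-by-raters design, the generalizability coefficient is σ²_p / (σ²_p + σ²_res/n_r), and one solves for the smallest n_r reaching a target reliability. A sketch with hypothetical variance components (the paper's actual estimates are not reproduced here):

```python
def g_coefficient(var_object, var_residual, n_raters):
    """Generalizability (G) coefficient for a one-facet p x r design:
    how reliably the average of n_raters ratings ranks the objects of
    measurement (here, language performance indicators)."""
    return var_object / (var_object + var_residual / n_raters)

def raters_needed(var_object, var_residual, target=0.80):
    """Smallest number of raters whose average rating reaches the target G."""
    n = 1
    while g_coefficient(var_object, var_residual, n) < target:
        n += 1
    return n

# With hypothetical variance components sigma2_p = 0.40, sigma2_res = 0.60,
# about 5 reviewers reach G = 0.75: raters_needed(0.40, 0.60, target=0.75) -> 5
```

Averaging over more raters shrinks the error term σ²_res/n_r, which is why the required panel size varies with how noisy the ratings are in a given grade and subject area.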

Collaboration


Top co-authors of Jinming Zhang:

Chih Kai Lin, Center for Applied Linguistics

Ting Lu, University of Illinois at Urbana–Champaign