Frederic M. Lord
Princeton University
Publications
Featured research published by Frederic M. Lord.
Applied Psychological Measurement | 1983
Martha L. Stocking; Frederic M. Lord
A common problem arises when independent estimates of item parameters from two separate data sets must be expressed in the same metric. This problem is frequently confronted in studies of horizontal and vertical equating and in studies of item bias. This paper discusses a number of methods for finding the appropriate transformation from one metric to another metric and presents a new method. Data are given comparing this new method with a current method, and recommendations are made.
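Under the usual item response theory models, the ability scale is identified only up to a linear transformation θ* = Aθ + B, so difficulties transform as b* = Ab + B and discriminations as a* = a/A. One simple "current method" of the kind the abstract contrasts with its new approach is the mean-and-sigma method; the sketch below is illustrative only (function names are mine, not the paper's):

```python
import statistics

def mean_sigma_transform(b_source, b_target):
    """Estimate the linear rescaling theta* = A*theta + B from the
    difficulties of common items, matching means and SDs (mean-and-sigma)."""
    A = statistics.pstdev(b_target) / statistics.pstdev(b_source)
    B = statistics.mean(b_target) - A * statistics.mean(b_source)
    return A, B

def transform_item_params(a, b, A, B):
    """Put an item's parameters on the target metric: a* = a/A, b* = A*b + B."""
    return a / A, A * b + B
```

With common-item difficulties [-1, 0, 1] on the source metric and [-0.5, 0.5, 1.5] on the target metric, the method recovers A = 1 and B = 0.5.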
Educational and Psychological Measurement | 1953
Frederic M. Lord
In any consideration of the nature of the metric provided by the raw score on a mental test, one is likely to be faced with the fact that the raw score units of measurement cannot ordinarily be considered as "equal." If we administer two tests of the same trait or ability, the two tests having different distributions of item "difficulty," to the same group of examinees, we will obtain two different shapes of raw score distributions from the two tests, as illustrated in Figure I, for…
Educational and Psychological Measurement | 1968
Frederic M. Lord
…characteristic curves was first presented by Lawley (1943). Lazarsfeld (1950, 1959) developed a more general theory, mainly in the context of attitude tests and other short questionnaires. Lord (1952, 1953a, 1953b) worked on the theory for aptitude and achievement tests. He reported (1952) some encouraging empirical results, checking for agreement between the mathematical model and actual test data. In 1957-1958 Birnbaum published a series of reports using item-characteristic-curve models. He applied modern theory of statistical inference, answering many outstanding theoretical problems in test construction, scoring of responses, and interpretation of test results.
Journal of the American Statistical Association | 1960
Frederic M. Lord
When the control variable contains errors of measurement, the usual analysis of covariance fails to adjust adequately for initial differences between groups. A large-sample significance test is presented here for the case where the fallible control variable has been measured in duplicate.
Psychometrika | 1962
J. A. Keats; Frederic M. Lord
The negative hypergeometric distribution of raw scores on mental tests is derived from certain assumptions relating to test theory. This result is checked empirically in a number of examples. Further derivations lead to the bivariate distribution of parallel tests which is also verified with actual data. The bivariate distribution of raw score and true score is also derived from a further assumption. This distribution is used to set confidence limits for true scores for persons with a given raw score.
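The negative hypergeometric distribution of raw scores derived here is equivalent to what is now usually called the beta-binomial distribution, which has a simple closed-form probability mass function. A minimal sketch (the shape parameters `a` and `b` are my own notation, not the paper's):

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """log of the Beta function, computed stably via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(x, n, a, b):
    """P(X = x) for a beta-binomial (negative hypergeometric) raw score
    on an n-item test, with Beta(a, b) prior on the per-item success rate."""
    return comb(n, x) * exp(log_beta(x + a, n - x + b) - log_beta(a, b))
```

For a 10-item test with a = 2 and b = 3, the probabilities sum to 1 and the mean raw score is n·a/(a+b) = 4.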
Applied Psychological Measurement | 1977
Frederic M. Lord
Two parallel forms of a broad-range tailored test of verbal ability have been built. The test is appropriate from fifth grade through graduate school. Simulated test administrations indicate that the 25-item tailored test is at least as good as a comparable 50-item conventional test. At most ability levels, the tailored test measures much better. An offer is made to provide upon request item characteristic curve parameters for 690 widely used Cooperative Test items, in order to facilitate research.
Educational and Psychological Measurement | 1962
Frederic M. Lord
Truly representative national norms are seldom obtained for any published test. The most serious obstacle is the fact that not every school is willing, at the request of some test publisher, to suspend its accustomed activities and require its students to spend a class period or more taking tests. As a result, published "national" norms usually do not represent the nation's schools but, at best, only those willing at a particular time to cooperate with a particular test publisher. The problem of getting each school's cooperation would be less serious if only a few moments of each student's time were required, rather than an entire class period. This raises the question of whether the performance of a large group on a long test can be estimated by administering only a few items to each student. If such methods of estimating group performance were possible, they would be helpful not only for norming a single test, but even…
Psychometrika | 1955
Frederic M. Lord
Sampling fluctuations resulting from the sampling of test items rather than of examinees are discussed. It is shown that the Kuder-Richardson reliability coefficients actually are measures of this type of sampling fluctuation. Formulas for certain standard errors are derived; in particular, a simple formula is given for the standard error of measurement of an individual examinee's score. A common misapplication of the Wilks-Votaw criterion for parallel tests is pointed out. It is shown that the Kuder-Richardson formula-21 reliability coefficient should be used instead of the formula-20 coefficient in certain common practical situations.
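The two Kuder-Richardson coefficients compared here have standard closed forms: KR-20 uses each item's pass rate, while KR-21 needs only the test's mean and variance (and equals KR-20 when all items are equally difficult). A minimal illustration, not the paper's own code, using population variances on a matrix of 0/1 item responses:

```python
def kr20(scores):
    """KR-20 = k/(k-1) * (1 - sum(p_i*q_i) / var(total)).
    scores: one list of 0/1 item responses per examinee, all length k."""
    k, n = len(scores[0]), len(scores)
    totals = [sum(row) for row in scores]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # population variance
    pq = sum((p := sum(row[i] for row in scores) / n) * (1 - p)
             for i in range(k))
    return k / (k - 1) * (1 - pq / var_t)

def kr21(scores):
    """KR-21 = k/(k-1) * (1 - mu*(k - mu) / (k * var(total)))."""
    k, n = len(scores[0]), len(scores)
    totals = [sum(row) for row in scores]
    mu = sum(totals) / n
    var_t = sum((t - mu) ** 2 for t in totals) / n
    return k / (k - 1) * (1 - mu * (k - mu) / (k * var_t))
```

On any data set KR-21 cannot exceed KR-20; for four examinees answering three items of unequal difficulty, e.g. responses [[1,1,0],[1,0,0],[1,1,1],[0,0,0]], KR-20 is 0.75 and KR-21 is 0.60.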
Applied Psychological Measurement | 1984
Marilyn S. Wingersky; Frederic M. Lord
The sampling errors of maximum likelihood estimates of item response theory parameters are studied in the case when both people and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is investigated. Finally, the effect of anchor-test length on the standard error of item parameters is studied numerically for the situation, common in equating studies, when two groups of examinees each take a different test form together with the same anchor test. The results encourage the use of rectangular or bimodal ability distributions, and also the use of very short anchor tests.
Psychometrika | 1952
Frederic M. Lord
Under certain assumptions an expression, in terms of item difficulties and intercorrelations, is derived for the curvilinear correlation of test score on the “ability underlying the test,” this ability being defined as the common factor of the item tetrachoric intercorrelations corrected for guessing. It is shown that this curvilinear correlation is equal to the square root of the test reliability. Numerical values for these curvilinear correlations are presented for a number of hypothetical tests, defined in terms of their item parameters. These numerical results indicate that the reliability and the curvilinear correlation will be maximized by (1) minimizing the variability of item difficulty and (2) making the level of item difficulty somewhat easier than the halfway point between a chance percentage of correct answers and 100 per cent correct answers.