Publications


Featured research published by Susan J. Maller.


Educational and Psychological Measurement | 2007

Iterative Purification and Effect Size Use With Logistic Regression for Differential Item Functioning Detection

Brian F. French; Susan J. Maller

Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling Type I error rates. The effectiveness of such controls, especially used in combination, requires evaluation. Detection errors were evaluated through simulation across iterative purification and no purification procedures with and without the use of an effect size criterion. Sample size, DIF magnitude and percentage, and ability differences were manipulated. Purification was beneficial under certain conditions, although overall power and Type I error rates did not substantially improve. The LR statistical test without purification performed as well as other classification criteria and may be the practical choice for many situations. Continued evaluation of the effect size guidelines and purification is discussed.
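
The LR DIF procedure summarized above compares nested logistic models for each item. Below is a minimal sketch, assuming a pandas DataFrame with hypothetical columns score (0/1 item response), total (the matching criterion), and group (0 = reference, 1 = focal); McFadden's pseudo R-squared stands in here for the Nagelkerke delta R-squared often used as the effect size criterion in this literature.

```python
# Sketch of logistic regression (LR) DIF detection for one item.
# Assumes a pandas DataFrame `df` with hypothetical columns:
#   score - 0/1 response to the studied item
#   total - matching/ability criterion (e.g., total or purified rest score)
#   group - 0 = reference group, 1 = focal group
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

def lr_dif(df: pd.DataFrame):
    m1 = smf.logit("score ~ total", data=df).fit(disp=0)                        # matching only
    m3 = smf.logit("score ~ total + group + total:group", data=df).fit(disp=0)  # + uniform and nonuniform DIF

    lr_stat = 2 * (m3.llf - m1.llf)          # joint 2-df likelihood ratio test
    p_value = stats.chi2.sf(lr_stat, df=2)
    delta_r2 = m3.prsquared - m1.prsquared   # effect size: change in pseudo R^2
    return lr_stat, p_value, delta_r2

# Synthetic demonstration data (no DIF built in).
rng = np.random.default_rng(1)
n = 500
demo = pd.DataFrame({
    "total": rng.normal(0, 1, n),
    "group": rng.integers(0, 2, n),
})
demo["score"] = (rng.random(n) < 1 / (1 + np.exp(-demo["total"]))).astype(int)
print(lr_dif(demo))
```

Iterative purification would repeat this step, recomputing the matching criterion after removing items flagged in the previous pass.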


Educational and Psychological Measurement | 2001

Differential Item Functioning in the WISC-III: Item Parameters for Boys and Girls in the National Standardization Sample

Susan J. Maller

The Wechsler Intelligence Scale for Children–Third Edition (WISC-III) is the most widely used test of intelligence in the world. However, the manual for the WISC-III provides insufficient detail regarding the detection of differential item functioning (DIF). The WISC-III national standardization sample (N = 2,200) was used to investigate DIF in six WISC-III subtests. After fitting two-parameter logistic and graded response models to the data, items were tested for DIF using the item response theory likelihood ratio DIF detection method. Of the 151 items studied, 52 were found to function differently across groups. The magnitude of the DIF was also considered by examining (a) parameter differences between groups and (b) root mean squared probability differences. Because the scores of boys and girls may be composed of different items systematically scored as correct, their IQs cannot be assumed to have the same meaning. Further investigations of item content bias are recommended.
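
The "root mean squared probability differences" used above to gauge DIF magnitude can be illustrated with the two-parameter logistic model; here is a brief sketch with invented item parameters (not values from the WISC-III study):

```python
# Two-parameter logistic (2PL) item response function and the root mean squared
# difference between group-specific item characteristic curves.
# Item parameters below are invented for illustration only.
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-1.7 * a * (theta - b)))  # 1.7 = conventional scaling constant

theta = np.linspace(-3, 3, 61)              # ability grid

p_boys = p_2pl(theta, a=1.2, b=-0.3)        # hypothetical parameters for boys
p_girls = p_2pl(theta, a=1.2, b=0.2)        # hypothetical parameters for girls

rmsd = np.sqrt(np.mean((p_boys - p_girls) ** 2))
print(f"Root mean squared probability difference: {rmsd:.3f}")
```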


Educational and Psychological Measurement | 2004

Universal Nonverbal Intelligence Test Factor Invariance Across Deaf and Standardization Samples

Susan J. Maller; Brian F. French

The Universal Nonverbal Intelligence Test (UNIT) is an individually administered, nonverbal intelligence test designed for use with non-English-speaking, limited English proficient, or deaf children. The aim of this study was to assess the factor structure invariance of the UNIT across deaf and standardization samples through the use of multisample confirmatory factor analysis. Two theoretical models were evaluated: a primary two-factor (Memory and Reasoning) model and a secondary two-factor (Symbolic and Nonsymbolic) model. The general forms of both models were invariant across groups, thus supporting both theoretical models. For the primary model, partial measurement invariance was found. A follow-up analysis found that the analogic reasoning measurement intercept was not invariant, indicating that the expected subtest scores for deaf examinees were lower than those for examinees in the standardization sample. In addition, the standardization sample had a higher Memory latent factor mean than the deaf sample did. The secondary model was not invariant, with three pattern coefficients differing across groups.
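
In multisample CFA, the invariance of nested models is typically judged with a chi-square difference test. Below is a minimal sketch of that comparison; the fit statistics are placeholders, not results from the UNIT analysis.

```python
# Chi-square difference test for nested multisample CFA models
# (e.g., a model with pattern coefficients constrained equal across groups
# versus a configural model in which they are free). Values are placeholders.
from scipy import stats

def chi_square_difference(chi2_constrained, df_constrained, chi2_free, df_free):
    """Return the chi-square difference statistic and its p-value."""
    delta_chi2 = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    return delta_chi2, stats.chi2.sf(delta_chi2, delta_df)

d_chi2, p = chi_square_difference(chi2_constrained=142.8, df_constrained=96,
                                  chi2_free=128.4, df_free=88)
print(f"delta chi-square = {d_chi2:.1f} on 8 df, p = {p:.3f}")
```

A nonsignificant difference supports the added equality constraints; a significant one points to noninvariant parameters, which can then be released one at a time (partial invariance).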


Archive | 2003

Best Practices in Detecting Bias in Nonverbal Tests

Susan J. Maller

Group comparisons of performance on intelligence tests have been advanced as evidence of real similarities or differences in intellectual ability by Jensen (1980) and, more recently, by Herrnstein and Murray (1994). This purported evidence includes the mean intelligence score differences that have been reported for various ethnic groups (e.g., Jensen, 1969, 1980; Loehlin, Lindzey, & Spuhler, 1975; Lynn, 1977; Munford & Munoz, 1980) or gender (e.g., Feingold, 1993; Nelson, Arthur, Lautiger, & Smith, 1994; Smith, Edmonds, & Smith, 1989; Vance, Hankins, & Brown, 1988; Wessel & Potter, 1994; Wilkinson, 1993).


Educational and Psychological Measurement | 2010

Factor Structure Invariance of the Kaufman Adolescent and Adult Intelligence Test Across Male and Female Samples

Jason C. Immekus; Susan J. Maller

Multisample confirmatory factor analysis (MCFA) and latent mean structures analysis (LMS) were used to test measurement invariance and latent mean differences on the Kaufman Adolescent and Adult Intelligence Test™ (KAIT) across males and females in the standardization sample. MCFA found that the parameters of the KAIT two-factor model were invariant across groups. A follow-up LMS found intercept differences on the Memory for Block Designs, Famous Faces, Auditory Comprehension, and Logical Steps subtests, indicating low to moderately higher expected scores for males. Thus, latent means were not tested for invariance. Although the KAIT two-factor model met partial measurement invariance, it did not demonstrate strong factorial invariance. Implications for test score interpretation are discussed.


Educational and Psychological Measurement | 2009

Item Parameter Invariance of the Kaufman Adolescent and Adult Intelligence Test Across Male and Female Samples

Jason C. Immekus; Susan J. Maller

The Kaufman Adolescent and Adult Intelligence Test (KAIT™) is an individually administered test of intelligence for individuals ranging in age from 11 to 85+ years. The item response theory likelihood ratio (IRT-LR) procedure, based on the two-parameter logistic model, was used to detect differential item functioning (DIF) in the KAIT across males and females in the standardization sample. Root mean squared differences and item parameter differences were used to indicate the magnitude of DIF and identify which group the item parameter favored. A z test of proportion differences was conducted to determine whether the number of parameters exhibiting gender DIF exceeded the number expected by chance, estimated by randomly dividing the sample in half and repeating the analyses. Of the 176 item parameters examined, 42 (24%) lacked invariance, with most items exhibiting uniform DIF. Implications for test score interpretation and future research are discussed.
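
The z test of proportion differences described above can be set up as a two-sample proportion test. A sketch follows, using the 42 of 176 flagged parameters from the gender analysis and a placeholder count for the random-split baseline.

```python
# Two-sample z test: proportion of item parameters flagged for DIF across gender
# versus the proportion flagged when the sample is split at random.
# The random-split count (9) is a placeholder, not the reported KAIT value.
from statsmodels.stats.proportion import proportions_ztest

flagged = [42, 9]        # flagged parameters: gender split, random split
examined = [176, 176]    # parameters examined in each analysis

z_stat, p_value = proportions_ztest(count=flagged, nobs=examined, alternative="larger")
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```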


Journal of Special Education | 1993

The Effects of Residential Versus Day Placement on the Performance IQs of Children with Hearing Impairment

Jeffery P. Braden; Susan J. Maller; Marius M. Paquin

The effect of placement on the PIQs of children with hearing impairment (HI) was examined. Specifically, children in three types of placement options (commuter to a residential school, resident at a residential school, and commuter to a mainstream day program) were evaluated 3 to 4 years after enrollment to determine what, if any, changes had occurred in their Wechsler Performance IQs. The ANCOVA results demonstrate statistically significant gains for commuters and residents attending the residential program, in contrast to no PIQ change for children attending the day program. These results contradict arguments that placement in a segregated, residential setting invariably inhibits cognitive development. Additional implications for educational placement and research are discussed.
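
Below is a minimal sketch of the kind of ANCOVA reported above, with follow-up Performance IQ modeled from baseline Performance IQ (the covariate) and placement type; the data and column names are entirely synthetic.

```python
# ANCOVA sketch: follow-up PIQ by placement type, adjusting for baseline PIQ.
# All data below are synthetic; column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 90
placement = rng.choice(["residential_resident", "residential_commuter", "day_program"], size=n)
piq_baseline = rng.normal(100, 15, size=n)
gain = np.where(placement == "day_program", 0.0, 5.0)     # illustrative effect only
piq_followup = piq_baseline + gain + rng.normal(0, 8, size=n)

df = pd.DataFrame({"placement": placement,
                   "piq_baseline": piq_baseline,
                   "piq_followup": piq_followup})

model = smf.ols("piq_followup ~ piq_baseline + C(placement)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))   # Type II table: placement effect adjusted for baseline
```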


Frontiers in Education Conference | 2005

Work in Progress - A Model to Evaluate Team Effectiveness

P.K. Imbrie; Jason C. Immekus; Susan J. Maller

This work-in-progress presents instruments that have been developed to assess team effectiveness for students in engineering classrooms. The instruments include: (a) a 24-item self-report instrument (Team Effectiveness Scale) requiring students to indicate the degree to which their team worked together across the following domains: interdependency, goal-setting, potency, and learning; and (b) a 6-item measure (Peer Assessment Scale) that asks students to rate each team member's contribution toward the functionality of their team. Evidence of the scales' psychometric properties is provided, along with the relationship between peer assessment and team effectiveness scores and the degree to which scores on the Team Effectiveness Scale discriminated between functional and dysfunctional teams, consistent with instructor judgments.


Archive | 2017

Best Practices in Detecting Bias in Cognitive Tests

Susan J. Maller; Lai-Kwan Pei

Although the terms item bias and DIF are often used interchangeably, the term DIF was suggested (Holland and Thayer in Proceedings of the 27th Annual Conference of the Military Testing Association, vol 1, pp 282–287, 1988) as a somewhat neutral term to refer to differences in the statistical properties of an item between groups of examinees of equal ability. Items that exhibit DIF threaten the validity of a test and may have serious consequences for groups as well as individuals, because the probabilities of correct responses are determined not only by the trait that the test claims to measure, but also by factors specific to group membership, such as ethnicity or gender. Thus, it is critical to identify DIF items in a test. In this chapter, the more popular DIF detection methods, including Mantel–Haenszel procedure, logistic regression modeling, SIBTEST, and IRT likelihood ratio test, are described. Details of the other methods, as well as some older methods not mentioned above, can be found in the overviews given by Camilli and Shepard (Methods for identifying biased test items. Sage, Thousand Oaks, 1994), Clauser and Mazor (Educ Measure Issues Pract 17:31–44, 1998), Holland and Wainer (Differential item functioning. Erlbaum, Hillsdale, 1993), Millsap and Everson (Appl Psychol Measure 16:389–402, 1993), Osterlind and Everson (Differential item functioning. Sage, Thousand Oaks, 2009), and Penfield and Camilli (Handbook of statistics: Vol. 26. Psychometrics. Elsevier, Amsterdam, pp 125–167, 2007).
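
As a concrete illustration of the first of these methods, here is a compact Mantel–Haenszel sketch: responses are stratified by total score, 2x2 (group by correct/incorrect) tables are pooled into a common odds ratio, and the result is placed on the ETS delta scale. The tables below are synthetic.

```python
# Mantel-Haenszel DIF sketch: within each total-score stratum, a 2x2 table of
# group (reference/focal) by response (correct/incorrect) is formed; a common
# odds ratio is pooled across strata and converted to the ETS delta metric.
# The stratum tables below are synthetic placeholders.
import numpy as np

# Each stratum: [[A, B], [C, D]]
# rows: reference group, focal group; columns: correct, incorrect
strata = [
    np.array([[30, 10], [25, 15]]),
    np.array([[45, 15], [38, 22]]),
    np.array([[50, 10], [44, 16]]),
]

num = den = 0.0
for tbl in strata:
    (A, B), (C, D) = tbl
    T = tbl.sum()
    num += A * D / T
    den += B * C / T

alpha_mh = num / den                 # Mantel-Haenszel common odds ratio
delta_mh = -2.35 * np.log(alpha_mh)  # ETS delta; |delta| >= 1.5 is conventionally "large" (C-level) DIF
print(f"MH odds ratio = {alpha_mh:.2f}, ETS delta = {delta_mh:.2f}")
```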


Journal of Clinical Epidemiology | 2000

Test of item-response bias in the CES-D scale: experience from the New Haven EPESE Study

Stephen R. Cole; Ichiro Kawachi; Susan J. Maller; Lisa F. Berkman

Collaboration


Dive into Susan J. Maller's collaborations.

Top Co-Authors

Jason C. Immekus

California State University

Jeffery P. Braden

University of Wisconsin-Madison
