
The Utility of Multiplex Closeness Centrality for Predicting Item Difficulty Parameters in Anomia Tests

 

Abstract


Background: Confrontation naming tests for the assessment of aphasia are perhaps the most commonly used tests in aphasiology. Recently, such tests have been modeled using item response theory approaches. Despite their advantages, item response theory models require large sample sizes for parameter estimation that are often unrealistic when working with clinical populations. As an alternative approach, Fergadiotis, Kellough, and Hula (2015) explored automatic item calibration by regressing item difficulty parameters on word length, age of acquisition (AOA; Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012), and lexical frequency as quantified by the Log10CD index (Brysbaert & New, 2009). Although the model achieved high predictive utility, its performance was far from perfect (R = .63), which may carry implications for the accuracy of any difficulty parameters derived from it.

Purpose: This study examines the addition of a fourth psycholinguistic variable, multiplex closeness centrality (MCC; Castro & Stella, 2019), to the regression model. We hypothesized that, by capturing how well connected words are in the human lexicon, MCC would serve as an indicator of semantic processing and contribute to the predictive utility of the model.

Method: A multiple regression analysis was carried out with the Philadelphia Naming Test item difficulty parameters as the dependent variable and lexical frequency, AOA, word length, and MCC as the predictors. Item difficulty parameters were estimated using a traditional calibration approach (Fergadiotis et al., 2015).

Results & Conclusions: Our analysis showed a high correlation between MCC and item difficulty and suggested that adding MCC allowed the model to account for more variance. However, the change between the three-predictor model and the four-predictor model including MCC was not statistically significant. In other words, despite its high correlation with item difficulty, MCC did not add unique information to the regression model because its variance overlapped with that of the other predictors. The findings should be interpreted cautiously, however, because of the large number of missing MCC values. Post hoc analyses indicated that the data were missing not at random, which might have contributed to the lack of significant findings. Thus, we suggest that future research revisit this question with a complete dataset and apply missing-data theory appropriately in the analysis.
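To make the model-comparison logic above concrete, the following minimal sketch fits the two nested regressions and tests the change in R-squared. It is an illustration only: the file name pnt_items.csv and the column names difficulty, length, aoa, log10cd, and mcc are hypothetical placeholders, not the study's actual data.

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical file: one row per test item with its calibrated difficulty
items = pd.read_csv("pnt_items.csv")

# Listwise deletion so both models are fit to the same items; as noted
# above, missing MCC values make this step consequential in practice.
items = items.dropna(subset=["difficulty", "length", "aoa", "log10cd", "mcc"])

# Baseline model: the three predictors of Fergadiotis et al. (2015)
base = smf.ols("difficulty ~ length + aoa + log10cd", data=items).fit()

# Extended model adds multiplex closeness centrality (MCC)
full = smf.ols("difficulty ~ length + aoa + log10cd + mcc", data=items).fit()

# F-test of the nested models, i.e., the significance of the R-squared change
print(anova_lm(base, full))
print(f"R-squared change: {full.rsquared - base.rsquared:.3f}")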
Aphasia is an acquired neurogenic language disorder, often the result of a stroke, that affects more than 2.4 million people in the US (Simmons-Mackie, 2018). Typically, people with aphasia (PWA) present with anomia, which refers to the inability to access and retrieve words during language production (Goodglass & Wingfield, 1997; Raymer & Rothi, 2012). Further, PWA exhibit symptoms of anomia even when other symptoms of aphasia (e.g., morphosyntactic deficits) resolve during recovery from a stroke. Since anomia is the cardinal deficit in aphasia, anomia treatment has received considerable attention in aphasiology. Moreover, given that anomia is a primary diagnostic feature of aphasia, clinicians typically include evaluations of word access and retrieval in the batteries of tests administered to stroke patients.

Perhaps the most commonly used tools to assess anomia in both research and clinical contexts are confrontation naming tests (CNTs) (Brady et al., 2016; Kiran et al., 2018). These tests consist of illustrations of common objects; items are administered one at a time, and PWA are evaluated as they attempt to name them. Besides their clinical use, CNTs have served as a fundamental tool in many aphasiology studies. In particular, Dell and colleagues developed a model of the cognitive machinery underlying word production based on people's performance on CNTs (G. Dell et al., 1997; Schwartz et al., 2006). Error types drawn from CNTs have also contributed to lesion analyses, which are used to study the neural correlates of language deficits (Schwartz et al., 2009). Moreover, researchers have used CNTs to evaluate the efficacy of different treatment approaches (Kendall et al., 2015; Quique et al., 2019) and to study cortical reorganization after anomia treatment (Fridriksson et al., 2006). There is a long history of developing such tests, and different CNTs have been created over the past few decades by test developers and speech-language pathologists. Some of the most frequently used are the Philadelphia Naming Test (PNT) (Roach et al., 1996), the Boston Naming Test (Kaplan et al., 1983), and the Snodgrass and Vanderwart stimuli (Snodgrass & Vanderwart, 1980). Moreover, all major aphasia batteries include a naming subtest, as seen in the Western Aphasia Battery – R (Kertesz, 2007), the Comprehensive Aphasia Test (Swinburn et al., 2004), the Boston Diagnostic Aphasia Examination (Goodglass & Kaplan, 1972), the Preliminary Neuropsychological Battery (Cossa et al., 1999), and the Object and Action Naming Battery (Druks & Masterson, 2000).

Despite the popularity of confrontation picture naming tests, their utility for quantifying anomia is limited by at least four issues. First, patients' ability estimates collected from different tests are placed on different metrics and cannot be directly compared (Fergadiotis, Swiderski, et al., 2019). The difficulty stems in part from the varying difficulty of the items used. For instance, 20% accuracy on a test with difficult words (e.g., stethoscope) does not necessarily indicate worse naming ability than 30% accuracy on a test with easier words (e.g., cat). This prevents the direct comparison of estimates across CNTs, which (i) disrupts the flow of clinical information across healthcare settings that use different CNTs, (ii) may lead to unnecessary retesting of patients depending on the availability of CNTs at each setting, and (iii) restricts our ability to conduct meta-analytic studies. Second, currently available tests invalidly assume constant measurement error, on the basis of which 95% confidence intervals are estimated around ability scores. With the exception of recent work on the PNT (e.g., Fergadiotis, Hula, et al., 2019; Hula et al., 2020), naming tests assume equal measurement error regardless of ability level. This ignores the fact that measurement error varies as a function of how well the difficulty of the test targets the ability level of the person being tested (Embretson & Reise, 2000).
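A minimal numerical sketch of this point, under the Rasch one-parameter logistic model commonly used in item response theory, is given below; the thirty item difficulties are invented for illustration and are not the PNT's calibrated values.

import numpy as np

def rasch_p(theta, b):
    # Probability of a correct response given ability theta and item difficulty b
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# A hypothetical short form targeted at moderate severity (difficulties near 0)
difficulties = np.linspace(-1.0, 1.0, 30)

for theta in (-3.0, 0.0, 3.0):  # severe, moderate, and mild ability levels
    p = rasch_p(theta, difficulties)
    info = np.sum(p * (1.0 - p))   # test information at this ability level
    se = 1.0 / np.sqrt(info)       # conditional standard error of measurement
    print(f"theta = {theta:+.1f}: conditional SE = {se:.2f}")

The standard error is smallest where the item difficulties match the examinee's ability and grows toward the extremes, which is precisely the pattern a single averaged error term conceals.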
Assuming an average, constant measurement error leads to invalidly narrow confidence intervals for PWA at the extremes and overly wide confidence intervals for PWA in the middle of the ability distribution. Further, the assumption of equal measurement error has significant implications for the assessment of change. For example, the PNT short forms published by Walker and Schwartz (2012) are optimally targeted to, and thus most precise for, PWA with moderate anomia. If one assumes constant measurement error, then confidence intervals and the associated probabilities derived for change scores may be distorted. Confidence intervals around change score estimates for very mildly and severely impaired PWA may be misleadingly narrow, leading to an inflated Type I error rate. Conversely, the width of confidence intervals around change scores for moderately impaired PWA may be overestimated, leading to decreased power to detect real change and an increased Type II error rate.

Third, currently available tests are inefficient. Most must be administered in their entirety, leading to long administration times and increased testing burden for clinicians and patients. In addition, the items are selected a priori and, as a result, may be too challenging or too easy for a given test taker. A patient may therefore experience frustration or boredom as the test proceeds, which may affect their performance and contaminate the test scores.

Finally, with limited exceptions (Hula et al., 2020), there is a lack of tools that can generate multiple equivalent test forms with non-overlapping item content. As a result, in practice the same set of test items is often used throughout the course of treatment, creating the possibility that practice effects will influence patients' performance and lead to invalid conclusions. For example, a patient's score on a naming task may appear improved, or may even reach the benchmark of the treatment plan, when in fact the results are due to familiarity with the test items rather than to effective treatment. The clinician may then move on to other treatment targets, leaving the naming deficit unaddressed. Alternatively, a patient may be told that they have improved based on their test scores while experiencing no actual gains beyond the words included in the test.

Computerized Adaptive Version of the Philadelphia Naming Test

To address these limitations, recent studies have looked into possible improvements to CNTs. Specifically, psychometric research has focused on the Philadelphia Naming Test, which is among the most commonly used CNTs in research applications (G. S. Dell, 1986). This test contains 175 items depicted by black-and-white drawings of simple objects. All targets are nouns that range from 1 to 4 syllables in length. The items include high-, medium-, and low-frequency targets as determined by Francis and Kučera (1982). Further, the items were selected from a larger set of 277 candidate items.
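The adaptive principle can be illustrated with a toy simulation; the item bank, the simulated examinee, the grid-based ability updates, and the fixed 30-item stopping rule below are all invented for illustration and do not represent the published PNT-CAT algorithm.

import numpy as np

rng = np.random.default_rng(0)
bank = rng.normal(0.0, 1.5, size=175)   # invented item difficulties
true_theta = -0.5                        # simulated examinee ability
grid = np.linspace(-4.0, 4.0, 161)       # ability grid for Bayesian updates
posterior = np.exp(-0.5 * grid ** 2)     # standard normal prior (unnormalized)

administered = set()
for step in range(30):                   # a 30-item adaptive short form
    # Current expected-a-posteriori (EAP) ability estimate
    theta_hat = np.sum(grid * posterior) / np.sum(posterior)
    # Administer the unused item whose difficulty best targets theta_hat
    remaining = [i for i in range(len(bank)) if i not in administered]
    item = min(remaining, key=lambda i: abs(bank[i] - theta_hat))
    administered.add(item)
    # Simulate a naming response under the Rasch model
    p_true = 1.0 / (1.0 + np.exp(-(true_theta - bank[item])))
    correct = rng.random() < p_true
    # Update the posterior over ability with the response likelihood
    p_grid = 1.0 / (1.0 + np.exp(-(grid - bank[item])))
    posterior *= p_grid if correct else (1.0 - p_grid)

theta_final = np.sum(grid * posterior) / np.sum(posterior)
print(f"EAP ability estimate after 30 items: {theta_final:+.2f}")

Because each item is chosen to target the current ability estimate, measurement error shrinks faster than it would with a fixed item order, which is the efficiency argument made above.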

DOI 10.15760/honors.962
