Publication


Featured research published by Scott A. Crossley.


Written Communication | 2010

Linguistic features of writing quality

Danielle S. McNamara; Scott A. Crossley; Philip M. McCarthy

In this study, a corpus of expert-graded essays, based on a standardized scoring rubric, is computationally evaluated to distinguish essays that were rated as high from those rated as low. The automated tool, Coh-Metrix, is used to examine the degree to which high- and low-proficiency essays can be predicted by linguistic indices of cohesion (i.e., coreference and connectives), syntactic complexity (e.g., number of words before the main verb, sentence structure overlap), the diversity of words used by the writer, and characteristics of words (e.g., frequency, concreteness, imagability). The three most predictive indices of essay quality in this study were syntactic complexity (as measured by number of words before the main verb), lexical diversity (as measured by the Measure of Textual Lexical Diversity), and word frequency (as measured by Celex, logarithm for all words). Of the 26 validated Coh-Metrix indices of cohesion, none showed differences between high- and low-proficiency essays, and none correlated with essay ratings. These results indicate that the textual features that characterize good student writing are not aligned with those features that facilitate reading comprehension. Rather, essays judged to be of higher quality were more likely to contain linguistic features associated with text difficulty and sophisticated language.
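
One of the three strongest predictors in this study, lexical diversity as measured by MTLD, can be computed directly from raw text. The sketch below is not the authors' or Coh-Metrix's code and simplifies the published algorithm; it shows the core idea of counting how many "factors" it takes for the running type-token ratio to fall below a 0.72 threshold, averaging a forward and a backward pass.

```python
# Simplified MTLD (Measure of Textual Lexical Diversity) sketch.
# Illustration of the general algorithm only, not Coh-Metrix code.

def mtld_one_pass(tokens, threshold=0.72):
    factors = 0.0
    types = set()
    count = 0
    for tok in tokens:
        count += 1
        types.add(tok)
        if len(types) / count <= threshold:   # factor complete: TTR fell below threshold
            factors += 1
            types.clear()
            count = 0
    if count > 0:                              # credit the partial factor at the end
        factors += (1 - len(types) / count) / (1 - threshold)
    return len(tokens) / factors if factors else float(len(tokens))

def mtld(text, threshold=0.72):
    tokens = [w.lower() for w in text.split() if w.isalpha()]
    forward = mtld_one_pass(tokens, threshold)
    backward = mtld_one_pass(list(reversed(tokens)), threshold)
    return (forward + backward) / 2            # MTLD averages both directions

print(mtld("the quick brown fox jumps over the lazy dog " * 5))
```

Higher MTLD scores indicate that a writer sustains a high type-token ratio over longer stretches of text, which is the sense in which the study treats lexical diversity as a marker of sophisticated language.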


Written Communication | 2011

The Development of Writing Proficiency as a Function of Grade Level: A Linguistic Analysis

Scott A. Crossley; Jennifer L. Weston; Susan T. McLain Sullivan; Danielle S. McNamara

In this study, a corpus of essays stratified by level (9th grade, 11th grade, and college freshman) is analyzed computationally to discriminate differences between the linguistic features produced in essays by adolescents and young adults. The automated tool Coh-Metrix is used to examine the degree to which essays written at various grade levels can be distinguished from one another using a number of linguistic features related to lexical sophistication (i.e., word frequency, word concreteness), syntactic complexity (i.e., the number of modifiers per noun phrase), and cohesion (i.e., word overlap, incidence of connectives). The analysis demonstrates that high school and college writers develop linguistic strategies as a function of grade level. Primarily, these writers produce more sophisticated words and more complex sentence structures as grade level increases. In contrast, they produce fewer cohesive features in text as grade level increases. This analysis supports the notion that linguistic development occurs in the later stages of writing development and that this development is primarily related to producing texts that are less cohesive and more elaborate.


Language Testing | 2011

Predicting Lexical Proficiency in Language Learner Texts Using Computational Indices

Scott A. Crossley; Tom Salsbury; Danielle S. McNamara; Scott Jarvis

The authors present a model of lexical proficiency based on lexical indices related to vocabulary size, depth of lexical knowledge, and accessibility to core lexical items. The lexical indices used in this study come from the computational tool Coh-Metrix and include word length scores, lexical diversity values, word frequency counts, hypernymy values, polysemy values, semantic co-referentiality, word meaningfulness, word concreteness, word imagability, and word familiarity. Human raters evaluated a corpus of 240 written texts using a standardized rubric of lexical proficiency. To ensure a variety of text levels, the corpus comprised 60 texts each from beginning, intermediate, and advanced second language (L2) adult English learners. The L2 texts were collected longitudinally from 10 English learners. In addition, 60 texts from native English speakers were collected. The holistic scores from the trained human raters were then correlated to a variety of lexical indices. The researchers found that lexical diversity, word hypernymy values and content word frequency explain 44% of the variance of the human evaluations of lexical proficiency in the examined writing samples. The findings represent an important step in the development of a model of lexical proficiency that incorporates both vocabulary size and depth of lexical knowledge features.
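
The final model described here is a multiple regression of human lexical-proficiency ratings on a small set of lexical indices, with 44% of the variance explained. The sketch below is a generic illustration of that kind of analysis on synthetic data with assumed value ranges, not the study's corpus or its Coh-Metrix output.

```python
# Hypothetical illustration: regress human ratings on three lexical indices
# (lexical diversity, hypernymy, content-word frequency) and report R^2.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 240                                   # corpus size in the study; values below are synthetic
X = np.column_stack([
    rng.normal(70, 15, n),                # lexical diversity (e.g., MTLD)
    rng.normal(1.6, 0.3, n),              # mean hypernymy value
    rng.normal(2.9, 0.4, n),              # mean log frequency of content words
])
y = 0.03 * X[:, 0] + 1.2 * X[:, 1] - 0.8 * X[:, 2] + rng.normal(0, 1.0, n)  # synthetic ratings

model = LinearRegression().fit(X, y)
print("R^2 (variance explained):", round(model.score(X, y), 2))
```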


Language Testing | 2012

Predicting the proficiency level of language learners using lexical indices

Scott A. Crossley; Tom Salsbury; Danielle S. McNamara

This study explores how second language (L2) texts written by learners at various proficiency levels can be classified using computational indices that characterize lexical competence. For this study, 100 writing samples taken from 100 L2 learners were analyzed using lexical indices reported by the computational tool Coh-Metrix. The L2 writing samples were categorized into beginning, intermediate, and advanced groupings based on the TOEFL and ACT ESL Compass scores of the writer. A discriminant function analysis was used to predict the level categorization of the texts using lexical indices related to breadth of lexical knowledge (word frequency, lexical diversity), depth of lexical knowledge (hypernymy, polysemy, semantic co-referentiality, and word meaningfulness), and access to core lexical items (word concreteness, familiarity, and imagability). The strongest predictors of an individual’s proficiency level were word imagability, word frequency, lexical diversity, and word familiarity. In total, the indices correctly classified 70% of the texts based on proficiency level in both a training and a test set. The authors argue for the applicability of a statistical model as a method to investigate lexical competence across language levels, as a method to assess L2 lexical development, and as a method to classify L2 proficiency.
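
The classification step described here is a discriminant function analysis over lexical indices, evaluated on both a training and a test set. A minimal sketch of that workflow, using scikit-learn's LinearDiscriminantAnalysis on synthetic features rather than the study's 100 writing samples, looks like this:

```python
# Hypothetical illustration of a discriminant function analysis:
# classify texts into beginner / intermediate / advanced from lexical indices.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
levels = np.repeat([0, 1, 2], 40)         # 0 = beginner, 1 = intermediate, 2 = advanced
# Four synthetic indices standing in for imagability, word frequency,
# lexical diversity, and word familiarity.
X = rng.normal(0, 1, (120, 4)) + levels[:, None] * np.array([-0.6, -0.5, 0.7, -0.4])

X_tr, X_te, y_tr, y_te = train_test_split(X, levels, test_size=0.3,
                                          random_state=1, stratify=levels)
lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
print("training accuracy:", round(lda.score(X_tr, y_tr), 2))
print("test accuracy:", round(lda.score(X_te, y_te), 2))
```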


International Journal of Continuing Engineering Education and Life-Long Learning | 2011

Understanding expert ratings of essay quality: Coh-Metrix analyses of first and second language writing

Scott A. Crossley; Danielle S. McNamara

This article reviews recent studies in which human judgements of essay quality are assessed using Coh-Metrix, an automated text analysis tool. The goal of these studies is to better understand the relationship between linguistic features of essays and human judgements of writing quality. Coh-Metrix reports on a wide range of linguistic features, affording analyses of writing at various levels of text structure, including surface, text-base, and situation model levels. Recent studies have examined linguistic features of essay quality related to co-reference, connectives, syntactic complexity, lexical diversity, spatiality, temporality, and lexical characteristics. These studies have analysed essays written by both first language and second language writers. The results support the notion that human judgements of essay quality are best predicted by linguistic indices that correlate with measures of language sophistication such as lexical diversity, word frequency, and syntactic complexity. In contrast, human judgements of essay quality are not strongly predicted by linguistic indices related to cohesion. Overall, the studies portray high quality writing as containing more complex language that may not facilitate text comprehension.


Second Language Research | 2011

Psycholinguistic word information in second language oral discourse

Tom Salsbury; Scott A. Crossley; Danielle S. McNamara

This study uses word information scores from the Medical Research Council (MRC) Psycholinguistic Database to analyse word development in the spontaneous speech data of six adult learners of English as a second language (L2) in a one-year longitudinal study. In contrast to broad measures of lexical development, such as word frequency and lexical diversity, this study analyses L2 learners’ depth of word knowledge as measured by psycholinguistic values for concreteness, imagability, meaningfulness, and familiarity. Repeated-measures ANOVAs, with temporal interval as the independent variable and the MRC values as the dependent variables, yielded significant differences over time for concreteness, imagability, and meaningfulness. Non-significant results were found for familiarity scores. The results provide evidence that learners’ productive vocabularies become more abstract, less context dependent, and more tightly associated over time. This indicates a deeper knowledge of second language vocabulary and has important implications for how vocabulary knowledge can be measured in future studies of L2 lexical development.
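
The statistical test used here is a repeated-measures ANOVA, with each learner measured at several time points. A minimal sketch of that design, using statsmodels on fabricated scores (the column names, value ranges, and interval count are assumptions, not the study's data), is shown below.

```python
# Hypothetical illustration of a repeated-measures ANOVA on a
# psycholinguistic score tracked across observation intervals.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(2)
learners = [f"L{i}" for i in range(1, 7)]   # six learners, as in the study
intervals = range(1, 5)                     # assumed number of observation points
rows = [{"learner": s, "interval": t,
         # synthetic downward trend: vocabulary becomes less imageable (more abstract)
         "imageability": 450 - 8 * t + rng.normal(0, 5)}
        for s in learners for t in intervals]
df = pd.DataFrame(rows)

result = AnovaRM(df, depvar="imageability", subject="learner",
                 within=["interval"]).fit()
print(result)
```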


Language Teaching Research | 2012

Text simplification and comprehensible input: A case for an intuitive approach

Scott A. Crossley; David Allen; Danielle S. McNamara

Texts are routinely simplified to make them more comprehensible for second language learners. However, the effects of simplification upon the linguistic features of texts remain largely unexplored. Here we examine the effects of one type of text simplification: intuitive text simplification. We use the computational tool Coh-Metrix to examine linguistic differences between the proficiency levels of a corpus of 300 news texts that had been simplified to three levels (beginner, intermediate, advanced). The main analysis reveals significant differences between levels for a wide range of linguistic features, particularly between the beginner and advanced levels. The results show that lower-level texts are generally less lexically and syntactically sophisticated than higher-level texts, and that lower-level texts contain more cohesive features than higher-level texts. The analysis also provides strong evidence that these linguistic features can be used to classify levels of simplified reading texts. Overall, the findings support the notion that intuitively simplified texts at the beginning level contain more linguistic features related to comprehensible input than intuitively simplified texts at the advanced level.


Language Teaching | 2008

Assessing L2 reading texts at the intermediate level: An approximate replication of Crossley, Louwerse, McCarthy & McNamara (2007)

Scott A. Crossley; Danielle S. McNamara

This paper follows up on the work of Crossley, Louwerse, McCarthy & McNamara (2007), who conducted an exploratory study of the linguistic differences of simplified and authentic texts found in beginner level English as a Second Language (ESL) textbooks using the computational tool Coh-Metrix. The purpose of this study is to provide a more comprehensive study of second language (L2) reading texts than that provided by Crossley et al. (2007) by investigating the differences between the linguistic structures of a larger and more selective corpus of intermediate reading texts. This study is important because advocates of both approaches to ESL text construction cite linguistic features, syntax, and discourse structures as essential elements of text readability, but only the Crossley et al. (2007) study has measured the differences between these text types and their implications for L2 learners. This research replicates the methods of the earlier study. The findings of this study provide a more thorough understanding of the linguistic features that construct simplified and authentic texts. This work will enable material developers, publishers, and reading researchers to more accurately judge the values of simplified and authentic L2 texts as well as improve measures for matching readers to text.


Behavior Research Methods | 2017

Sentiment Analysis and Social Cognition Engine (SEANCE): An automatic tool for sentiment, social cognition, and social-order analysis

Scott A. Crossley; Kristopher Kyle; Danielle S. McNamara

This study introduces the Sentiment Analysis and Social Cognition Engine (SEANCE), a freely available text analysis tool that is easy to use, works on most operating systems (Windows, Mac, Linux), is housed on a user’s hard drive (as compared to being accessed via an Internet interface), allows for batch processing of text files, includes negation and part-of-speech (POS) features, and reports on thousands of lexical categories and 20 component scores related to sentiment, social cognition, and social order. In the study, we validated SEANCE by investigating whether its indices and related component scores can be used to classify positive and negative reviews in two well-known sentiment analysis test corpora. We contrasted the results of SEANCE with those from Linguistic Inquiry and Word Count (LIWC), a similar tool that is popular in sentiment analysis but is pay-to-use and does not include negation or POS features. The results demonstrated that both the SEANCE indices and component scores outperformed LIWC on the categorization tasks.
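
The lexical-category counts that a tool like SEANCE aggregates can be illustrated with a toy example. The sketch below is not SEANCE code; it uses a tiny made-up lexicon and a crude negation window to show how negation-aware positive/negative counts, of the kind fed into the review-classification task described above, might be derived.

```python
# Toy, negation-aware sentiment category counter (illustrative only).
POSITIVE = {"good", "great", "excellent", "enjoyable", "love"}
NEGATIVE = {"bad", "poor", "terrible", "boring", "hate"}
NEGATORS = {"not", "never", "no"}

def category_counts(text, window=3):
    """Count positive/negative words, flipping polarity when a negator
    occurs within `window` preceding tokens (a simplified negation rule)."""
    tokens = text.lower().split()
    pos = neg = 0
    for i, tok in enumerate(tokens):
        negated = any(t in NEGATORS for t in tokens[max(0, i - window):i])
        if tok in POSITIVE:
            pos, neg = (pos, neg + 1) if negated else (pos + 1, neg)
        elif tok in NEGATIVE:
            pos, neg = (pos + 1, neg) if negated else (pos, neg + 1)
    return {"positive": pos, "negative": neg}

print(category_counts("the plot was not boring and the acting was great"))
# -> {'positive': 2, 'negative': 0}
```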


Artificial Intelligence in Education | 2011

Predicting human scores of essay quality using computational indices of linguistic and textual features

Scott A. Crossley; Rod D. Roscoe; Danielle S. McNamara

This study assesses the potential for computational indices to predict human ratings of essay quality. The results demonstrate that linguistic indices related to type counts, given/new information, personal pronouns, word frequency, conclusion n-grams, and verb forms predict 43% of the variance in human scores of essay quality.

Collaboration


Dive into Scott A. Crossley's collaborations.

Top Co-Authors

Mihai Dascalu, Politehnica University of Bucharest
Laura K. Allen, Arizona State University
Stefan Trausan-Matu, Politehnica University of Bucharest
Rod D. Roscoe, University of Pittsburgh
Tom Salsbury, Washington State University
Ryan S. Baker, University of Pennsylvania