Kristopher Kyle
Georgia State University
Publication
Featured research published by Kristopher Kyle.
Behavior Research Methods | 2017
Scott A. Crossley; Kristopher Kyle; Danielle S. McNamara
This study introduces the Sentiment Analysis and Cognition Engine (SEANCE), a freely available text analysis tool that is easy to use, works on most operating systems (Windows, Mac, Linux), is housed on a user’s hard drive (as compared to being accessed via an Internet interface), allows for batch processing of text files, includes negation and part-of-speech (POS) features, and reports on thousands of lexical categories and 20 component scores related to sentiment, social cognition, and social order. In the study, we validated SEANCE by investigating whether its indices and related component scores can be used to classify positive and negative reviews in two well-known sentiment analysis test corpora. We contrasted the results of SEANCE with those from Linguistic Inquiry and Word Count (LIWC), a similar tool that is popular in sentiment analysis, but is pay-to-use and does not include negation or POS features. The results demonstrated that both the SEANCE indices and component scores outperformed LIWC on the categorization tasks.
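The negation feature mentioned above can be illustrated with a minimal lexicon-based sketch. This is illustrative only; SEANCE's actual lexicons, component scores, and negation rules are far more extensive, and the word sets below are hypothetical:

```python
POSITIVE = {"good", "great", "excellent", "enjoyable"}
NEGATIVE = {"bad", "poor", "terrible", "boring"}
NEGATORS = {"not", "never", "no"}

def sentiment_score(tokens):
    """Sum +1/-1 over lexicon hits, flipping polarity right after a negator."""
    score, negated = 0, False
    for tok in tokens:
        tok = tok.lower()
        if tok in NEGATORS:
            negated = True
            continue
        polarity = 1 if tok in POSITIVE else -1 if tok in NEGATIVE else 0
        if polarity:
            score += -polarity if negated else polarity
        negated = False  # negation scopes over the next token only
    return score

print(sentiment_score("the plot was not good".split()))       # -1
print(sentiment_score("a great and enjoyable film".split()))  # 2
```

Without negation handling, "not good" would count as positive, which is one reason a negation-aware tool can outperform a purely lexicon-counting one on review classification.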
Discourse Processes | 2014
Scott A. Crossley; Laura K. Allen; Kristopher Kyle; Danielle S. McNamara
Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a simple NLP (SiNLP) tool. The overarching goal of the article is to proliferate the use of NLP in discourse processing research. The article also provides an instantiation and empirical evaluation of the linguistic features measured by SiNLP to demonstrate their strength in investigating constructs of interest to the discourse processing community. Although relatively simple, the results of this analysis reveal that the tool is quite powerful, performing on par with a sophisticated text analysis tool, Coh-Metrix, on a common discourse processing task (i.e., predicting essay scores). Such a tool could prove useful to researchers interested in investigating features of language that affect discourse production and comprehension.
Discourse Processes | 2017
Scott A. Crossley; Stephen Skalicky; Mihai Dascalu; Danielle S. McNamara; Kristopher Kyle
Research has identified a number of linguistic features that influence the reading comprehension of young readers; yet, less is known about whether and how these findings extend to adult readers. This study examines text comprehension, processing, and familiarity judgments provided by adult readers using a number of different approaches (i.e., natural language processing, crowd-sourced ratings, and machine learning). The primary focus is on the identification of the linguistic features that predict adult text readability judgments, and how these features perform when compared to traditional text readability formulas such as the Flesch-Kincaid grade level formula. The results indicate the traditional readability formulas are less predictive than models of text comprehension, processing, and familiarity derived from advanced natural language processing tools.
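For reference, the Flesch-Kincaid grade level formula compared against in the study is computed from three surface counts only:

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# A 100-word passage in 8 sentences with 140 syllables reads at about grade 5.8:
print(round(flesch_kincaid_grade(100, 8, 140), 1))  # 5.8
```

Because the formula sees only sentence length and syllable density, it is blind to the comprehension, processing, and familiarity features that the study found more predictive of adult readability judgments.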
Language Testing | 2016
Kristopher Kyle; Scott A. Crossley; Danielle S. McNamara
This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these tasks differ with regard to linguistic characteristics. The findings lend support to using a variety of speaking tasks to assess speaking proficiency. Namely, with regard to linguistic differences, the findings suggest that responses to performance tasks can be accurately grouped based on whether a task is independent or integrated. The findings also suggest that although the independent tasks included in the TOEFL iBT may represent a single construct, responses to integrated tasks vary across task sub-type.
Language Testing | 2017
Kristopher Kyle; Scott A. Crossley
Over the past 45 years, the construct of syntactic sophistication has been assessed in L2 writing using what Bulté and Housen (2012) refer to as absolute complexity (Lu, 2011; Ortega, 2003; Wolfe-Quintero, Inagaki, & Kim, 1998). However, it has been argued that making inferences about learners based on absolute complexity indices (e.g., mean length of t-unit and mean length of clause) may be difficult, both from practical and theoretical perspectives (Norris & Ortega, 2009). Furthermore, indices of absolute complexity may not align with some prominent theories of language learning such as usage-based theories (e.g., Ellis, 2002a,b). This study introduces a corpus-based approach for measuring syntactic sophistication in L2 writing using a usage-based, frequency-driven perspective. Specifically, novel computational indices related to the frequency of verb argument constructions (VACs) and the strength of association between VACs and the verbs that fill them (i.e., verb–VAC combinations) are developed. These indices are then compared against traditional indices of syntactic complexity (e.g., mean length of T-unit and mean length of clause) with regard to their ability to model one aspect of holistic scores of writing quality in Test of English as a Foreign Language (TOEFL) independent essays. Indices related to usage-based theories of syntactic development explained greater variance (R2 = .142) in holistic scores of writing quality than traditional methods of assessing syntactic complexity (R2 = .058). The results have important implications for modeling syntactic sophistication, L2 writing assessment, and AES systems.
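As a toy illustration of verb-VAC association strength, here is one common association measure, pointwise mutual information, computed over hypothetical verb-construction counts. The study's actual indices are derived from large reference corpora and may use other association measures; the counts and construction labels below are invented for illustration:

```python
from math import log2

# Hypothetical verb + construction (VAC) pair counts from a toy corpus.
pairs = ([("give", "V-NP-NP")] * 8 + [("give", "V-NP-PP")] * 2 +
         [("put", "V-NP-PP")] * 9 + [("put", "V-NP-NP")] * 1)

def pmi(verb, vac):
    """Pointwise mutual information: log2 of P(verb, vac) / (P(verb) * P(vac))."""
    n = len(pairs)
    p_joint = sum(1 for p in pairs if p == (verb, vac)) / n
    p_verb = sum(1 for v, _ in pairs if v == verb) / n
    p_vac = sum(1 for _, c in pairs if c == vac) / n
    return log2(p_joint / (p_verb * p_vac))

# "give" is attracted to the ditransitive V-NP-NP frame (PMI > 0):
print(round(pmi("give", "V-NP-NP"), 2))  # 0.83
```

A sophistication index built this way rewards writers who pair verbs with the constructions those verbs strongly prefer in the reference corpus, rather than rewarding long T-units or clauses.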
Behavior Research Methods | 2018
Kristopher Kyle; Scott A. Crossley; Cynthia M. Berger
This study introduces the second release of the Tool for the Automatic Analysis of Lexical Sophistication (TAALES 2.0), a freely available and easy-to-use text analysis tool. TAALES 2.0 is housed on a user’s hard drive (allowing for secure data processing) and is available on most operating systems (Windows, Mac, and Linux). TAALES 2.0 adds 316 indices to the original tool. These indices are related to word frequency, word range, n-gram frequency, n-gram range, n-gram strength of association, contextual distinctiveness, word recognition norms, semantic network, and word neighbors. In this study, we validated TAALES 2.0 by investigating whether its indices could be used to model both holistic scores of lexical proficiency in free writes and word choice scores in narrative essays. The results indicated that the TAALES 2.0 indices could be used to explain 58% of the variance in lexical proficiency scores and 32% of the variance in word-choice scores. Newly added TAALES 2.0 indices, including those related to n-gram association strength, word neighborhood, and word recognition norms, featured heavily in these predictor models, suggesting that TAALES 2.0 represents a substantial upgrade.
International Review of Applied Linguistics in Language Teaching | 2018
James Garner; Scott A. Crossley; Kristopher Kyle
A common approach to analyzing phraseological knowledge in first language (L1) and second language (L2) learners is to employ raw frequency data. Several studies have also analyzed n-gram use on the basis of statistical association scores. Results from n-gram studies have found significant differences between L1 and L2 writers and between intermediate and advanced L2 writers in terms of their bigram use. The current study expands on this research by investigating the connection between bigram and trigram association measures and human judgments of L2 writing quality. Using multiple statistical association indices, it examines bigram and trigram use by beginner and intermediate L1 Korean learners of English in English placement test essays. Results of a logistic regression indicated that intermediate writers employed a greater number of strongly associated academic bigrams and spoken trigrams. These findings have important implications for understanding lexical development in L2 writers and notions of writing proficiency.
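One widely used statistical association score for bigrams in this line of research is the t-score; a minimal sketch follows. The article reports multiple association indices, not necessarily this one, and the counts below are hypothetical:

```python
from math import sqrt

def t_score(f_bigram, f_w1, f_w2, n_tokens):
    """t-score for a bigram: (observed - expected) / sqrt(observed),
    where expected = f(w1) * f(w2) / N under independence."""
    expected = f_w1 * f_w2 / n_tokens
    return (f_bigram - expected) / sqrt(f_bigram)

# A bigram seen 30 times where chance predicts 0.3 is strongly associated:
print(round(t_score(f_bigram=30, f_w1=60, f_w2=50, n_tokens=10_000), 2))  # 5.42
```

Scores like this separate conventionalized word pairings from combinations that merely co-occur by chance, which is what allows association-based indices to distinguish proficiency levels where raw frequency cannot.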
Behavior Research Methods | 2018
Scott A. Crossley; Kristopher Kyle; Mihai Dascalu
This article introduces the second version of the Tool for the Automatic Analysis of Cohesion (TAACO 2.0). Like its predecessor, TAACO 2.0 is a freely available text analysis tool that works on the Windows, Mac, and Linux operating systems; is housed on a user’s hard drive; is easy to use; and allows for batch processing of text files. TAACO 2.0 includes all the original indices reported for TAACO 1.0, but it adds a number of new indices related to local and global cohesion at the semantic level, reported by latent semantic analysis, latent Dirichlet allocation, and word2vec. The tool also includes a source overlap feature, which calculates lexical and semantic overlap between a source and a response text (i.e., cohesion between the two texts based on measures of text relatedness). In the first study in this article, we examined the effects that cohesion features, prompt, essay elaboration, and enhanced cohesion had on expert ratings of text coherence, finding that global semantic similarity as reported by word2vec was an important predictor of coherence ratings. A second study was conducted to examine the source and response indices. In this study we examined whether source overlap between the speaking samples found in the TOEFL-iBT integrated speaking tasks and the responses produced by test-takers was predictive of human ratings of speaking proficiency. The results indicated that the percentage of keywords found in both the source and response and the similarity between the source document and the response, as reported by word2vec, were significant predictors of speaking quality. Combined, these findings help validate the new indices reported for TAACO 2.0.
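The source-overlap idea can be sketched with two toy measures: keyword overlap and cosine similarity between vectors. TAACO 2.0's actual indices rest on keyword extraction and word2vec embeddings; the helpers and word lists below are hypothetical simplifications:

```python
import math

def keyword_overlap(source_words, response_words):
    """Proportion of distinct source words that also appear in the response."""
    src = set(source_words)
    return len(src & set(response_words)) / len(src)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors
    (with word2vec, u and v would be averaged word embeddings)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

source = ["lecture", "argues", "topic", "claims"]
response = ["the", "lecture", "supports", "claims"]
print(round(keyword_overlap(source, response), 2))  # 0.5
print(round(cosine([1.0, 2.0], [2.0, 4.0]), 2))     # 1.0
```

The first measure captures the "percentage of keywords found in both the source and response"; the second captures semantic relatedness even when a response paraphrases rather than repeats the source wording.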
TESOL Quarterly | 2015
Kristopher Kyle; Scott A. Crossley
Behavior Research Methods | 2016
Scott A. Crossley; Kristopher Kyle; Danielle S. McNamara