Xiaofei Lu
Pennsylvania State University
Publication
Featured research published by Xiaofei Lu.
Educational Researcher | 2013
David Gamson; Xiaofei Lu; Sarah Anne Eckert
The widely adopted Common Core State Standards (CCSS) call for raising the level of text complexity in textbooks and reading materials used by students across all grade levels in the United States; the authors of the English Language Arts component of the CCSS build their case for higher complexity in part upon a research base they say shows a steady decline in the difficulty of student reading textbooks over the past half century. In this interdisciplinary study, we offer our own independent analysis of third- and sixth-grade reading textbooks used throughout the past century. Our data set consists of books from 117 textbook series issued by 30 publishers between 1905 and 2004, resulting in a linguistic corpus of roughly 10 million words. Contrary to previous reports, we find that text complexity has either risen or stabilized over the past half century; these findings have significant implications for the justification of the CCSS as well as for our understanding of a “decline” within American schooling more generally.
International Conference on Computational Linguistics | 2011
Shibamouli Lahiri; Prasenjit Mitra; Xiaofei Lu
Formality and its converse, informality, are important dimensions of authorial style that help determine the social background a document comes from and the audience it targets. In this paper we explored the concept of formality at the sentence level from two perspectives. The first was the Formality Score (F-score): its distribution across different datasets, how those distributions compared with each other, and how the F-score could be linked to human-annotated sentences. The second was the inherent agreement between two independent judges on a sentence annotation task, which gave us an idea of how subjective the concept of formality was at the sentence level. Finally, we looked into the related issue of document readability and measured its correlation with document formality.
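The abstract does not spell out how the F-score is computed. As a rough illustration, the sketch below follows the commonly cited definition by Heylighen and Dewaele, in which part-of-speech frequencies (as percentages of all words) are combined as (nouns + adjectives + prepositions + articles - pronouns - verbs - adverbs - interjections + 100) / 2. The coarse tag names are assumptions: they mimic a Universal POS style tagset, with `ADP` standing in for prepositions and `DET` for articles, and may not match what the authors used.

```python
from collections import Counter

# Hypothetical coarse tag names; ADP approximates prepositions, DET approximates articles.
FORMAL_TAGS = ("NOUN", "ADJ", "ADP", "DET")
INFORMAL_TAGS = ("PRON", "VERB", "ADV", "INTJ")


def formality_score(tags):
    """Return the F-score for a sequence of POS tags (one tag per word)."""
    if not tags:
        raise ValueError("cannot score an empty sentence")
    counts = Counter(tags)
    total = len(tags)

    def freq(tag):
        # Frequency of a tag as a percentage of all words in the sentence.
        return 100.0 * counts[tag] / total

    formal = sum(freq(t) for t in FORMAL_TAGS)
    informal = sum(freq(t) for t in INFORMAL_TAGS)
    # The +100 shift and the division by 2 keep the score between 0 and 100.
    return (formal - informal + 100.0) / 2.0
```

Tags outside the eight listed categories count toward the sentence length but do not move the score in either direction.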
Language Testing | 2015
Matthew E. Poehner; Jie Zhang; Xiaofei Lu
Dynamic assessment (DA) derives from the sociocultural theory of mind as elaborated by Russian psychologist L. S. Vygotsky. By offering mediation when individuals experience difficulties and carefully tracing their responsiveness, Vygotsky (1998) proposed that diagnoses may uncover abilities that have fully formed as well as those still in the process of developing. This insight has led to numerous assessments, collectively referred to as DA, that have been pursued primarily in the domains of special education and general cognitive abilities measurement (Feuerstein, Feuerstein, & Falik, 2010; Haywood & Lidz, 2007). To date, L2 DA work has been primarily conducted in classroom settings (Ableeva, 2010; Lantolf & Poehner, 2011; Poehner, 2007, 2008). This paper discusses a recent project concerning the design of online multiple-choice tests of L2 reading and listening comprehension that leverage the principle that mediation is indispensable for diagnosing development. Specifically, each test item is accompanied by a set of prompts graduated from implicit to explicit. In this way, resultant diagnoses include not only whether learners answered correctly (their actual score) but also the amount of support they required (mediated score) during the test. We argue that the set of scores automatically generated by the tests, together with a breakdown of learner performance on items targeting particular component features of comprehension, provide a fine-grained diagnosis of their L2 development while also offering information relevant to subsequent teaching and learning.
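The distinction between the actual score and the mediated score can be illustrated with a small sketch. The point scheme here is purely hypothetical (four points per item, one point deducted per prompt used, no credit for unsolved items); the abstract does not state how the tests weight prompts, and the names `ItemResult`, `actual_score`, and `mediated_score` are introduced only for illustration.

```python
from dataclasses import dataclass

MAX_POINTS = 4  # hypothetical full credit per item


@dataclass
class ItemResult:
    prompts_used: int  # 0 means correct on the first, unassisted attempt
    solved: bool       # False if the learner never reached the correct answer


def actual_score(results):
    """Credit only items answered correctly without any prompt."""
    return sum(MAX_POINTS for r in results if r.solved and r.prompts_used == 0)


def mediated_score(results):
    """Deduct one point per prompt used; unsolved items earn nothing."""
    return sum(max(MAX_POINTS - r.prompts_used, 0) for r in results if r.solved)
```

Under this assumed scheme, two learners with the same actual score can still differ in mediated score, which is the extra diagnostic information the abstract describes.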
American Educational Research Journal | 2015
Robert J. Stevens; Xiaofei Lu; David P. Baker; Melissa Ray; Sarah Anne Eckert; David Gamson
This research investigated the cognitive demands of reading curricula from 1910 to 2000. We considered both the nature of the text used and the comprehension tasks asked of students in determining the cognitive demands of the curricula. Contrary to the common assumption of a trend of simplification of the texts and comprehension tasks in third- and sixth-grade curricula, the results indicate that curricular complexity declined early in the century and leveled off over the middle decades but has notably increased since the 1970s, particularly for the third-grade curricula.
Meeting of the Association for Computational Linguistics | 2005
Xiaofei Lu
This paper describes a hybrid model that combines a rule-based model with two statistical models for the task of POS guessing of Chinese unknown words. The rule-based model is sensitive to the type, length, and internal structure of unknown words, and the two statistical models utilize contextual information and the likelihood for a character to appear in a particular position of words of a particular length and POS category. By combining models that use different sources of information, the hybrid model achieves a precision of 89%, a significant improvement over the best result reported in previous studies, which was 69%.
Language Resources and Evaluation | 2017
Xiaofei Lu; Ben Pin-Yun Wang
Building on the success of the VU Amsterdam Metaphor Corpus, which comprises English texts annotated with metaphor following the Metaphor Identification Procedure Vrije Universiteit (MIPVU; Steen et al. in Cogn Linguist 21(4):765–796, 2010a; Steen et al. in A method for linguistic metaphor identification: from MIP to MIPVU. John Benjamins, Amsterdam/Philadelphia, 2010b), this study has three aims: (1) to adapt and evaluate the transferability and reliability of MIPVU for Mandarin Chinese; (2) to construct a corpus of Chinese texts annotated for metaphor using the adapted procedure; and (3) to examine the distribution of metaphor-related words across Chinese texts in three different written registers: academic discourse, fiction, and news. The results of our inter-annotator reliability test show that MIPVU can be reliably applied to linguistic metaphor identification in Chinese texts. Our metaphor-annotated corpus consists of texts randomly sampled from the Lancaster Corpus of Mandarin Chinese, totaling 30,012 words (about 10,000 for each register). Data analysis reveals that approximately one out of every nine lexical units in our Chinese corpus is related to metaphor, that there is considerable variation in metaphor density across different registers and lexical categories, and that metaphor density is significantly lower in Chinese than in English texts. Our assessment of the replicability of MIPVU for Mandarin Chinese adds to the groundbreaking methodological contribution that Steen et al. (2010a, b) have made to metaphor research. The metaphor-annotated corpus of Mandarin Chinese contributes a valuable language resource for Chinese metaphor researchers, and our analysis of the distribution of metaphor-related words in the corpus offers useful new insights into the extent and use of metaphor in Chinese discourse.
Language Testing | 2017
Xiaofei Lu
Research investigating corpora of English learners’ language raises new questions about how syntactic complexity is defined theoretically and operationally for second language (L2) writing assessment. I show that syntactic complexity is important in construct definitions and L2 writing rating scales as well as in L2 writing research. I describe the operationalizations of syntactic complexity measurement in corpus-based L2 writing research, focusing on the Biber Tagger (Biber, Johansson, Leech, Conrad, & Finegan, 1999), Coh-Metrix (McNamara, Graesser, McCarthy, & Cai, 2014), and L2 Syntactic Complexity Analyzer (Lu, 2010), which are three tools commonly used to automate syntactic complexity analysis. A review of findings from recent corpus-based L2 writing studies on the relationship of syntactic complexity to L2 writing quality follows. I conclude with a discussion of the implications of these multiple perspectives on the definition of syntactic complexity in L2 studies.
Language Teaching Research | 2018
Jinfang Peng; Chuming Wang; Xiaofei Lu
Previous studies demonstrated that the continuation task has great language learning potential and that various task-related factors may affect the extent to which the potential can be exploited (e.g. Wang & Wang, 2015). This study investigates the effect of one understudied factor, the linguistic complexity of the input text, on English as a foreign language (EFL) learners’ alignment, writing fluency, and writing accuracy in the continuation task. Two comparable groups of Chinese undergraduate EFL learners read and continued a simplified and unsimplified version of the same incomplete story whose linguistic complexity matched and exceeded their production ability, respectively. Compared to the unsimplified version, the simplified version resulted in more automatic alignment and greater improvement in writing fluency and accuracy. The implications of these findings for writing pedagogy are discussed.
Journal of Quantitative Linguistics | 2018
Jinlin Jiang; Lihua Jiang; Xiaofei Lu
This study attempts to construct automated scoring models for Chinese EFL (English as a Foreign Language) learners’ English-to-Chinese (E-C) translations in large-scale exams. Our data consisted of 900 human-scored translated texts of three source texts – an expository text, a narrative text and a mixed narrative-argumentative text – with 300 for each source text. Text features were extracted using technologies such as n-gram matching, word alignment and Latent Semantic Analysis. Computer scoring models were constructed using multiple linear regression analysis with text features as independent variables and human-assigned scores as the dependent variable. To determine the number of training texts required to yield the best results, five scoring models were developed with a training set of 50, 100, 130, 150 and 180 texts of each text type, respectively. Results indicated that the correlation coefficients between the model-computed and human-assigned scores were above 0.8 for all five models. The model trained with 130 translated texts performed the best on expository and narrative texts, while that trained with 100 translated texts performed the best on mixed narrative-argumentative texts. Therefore, it is concluded that the text features extracted in this study are effective and that the finalized models can produce reliable scores for Chinese EFL learners’ E-C translations.
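The general shape of such a scoring pipeline (fit a multiple linear regression on text features, then correlate model-computed scores with human-assigned ones) might look like the sketch below. It is a minimal standard-library illustration, not the authors' implementation: the feature extraction step is omitted, the function names are hypothetical, and the fit uses the normal equations with Gauss-Jordan elimination rather than whatever statistical package the study used.

```python
import math


def fit_linear_model(features, scores):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    Returns [intercept, coef_1, ..., coef_n].
    """
    rows = [[1.0] + list(f) for f in features]  # prepend the intercept term
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, scores))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit model")
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coeffs, feature_row):
    """Model-computed score for one text's feature values."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], feature_row))


def pearson(xs, ys):
    """Pearson correlation between model-computed and human-assigned scores."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In use, one would call `fit_linear_model` on the training texts' features and human scores, apply `predict` to held-out texts, and pass the two score lists to `pearson`; repeating this for different training-set sizes mirrors the comparison of the five models described above.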
International Journal of Corpus Linguistics | 2010
Xiaofei Lu