Øistein E. Andersen
University of Cambridge
Publication
Featured research published by Øistein E. Andersen.
Workshop on Innovative Use of NLP for Building Educational Applications | 2009
Jennifer Foster; Øistein E. Andersen
This paper explores the issue of automatically generated ungrammatical data and its use in error detection, with a focus on the task of classifying a sentence as grammatical or ungrammatical. We present an error generation tool called GenERRate and show how GenERRate can be used to improve the performance of a classifier on learner data. We describe initial attempts to replicate Cambridge Learner Corpus errors using GenERRate.
Conference on Computational Natural Language Learning | 2014
Mariano Felice; Zheng Yuan; Øistein E. Andersen; Helen Yannakoudakis; Ekaterina Kochmar
[We would like to thank] Cambridge English Language Assessment, a division of Cambridge Assessment, for supporting this research.
North American Chapter of the Association for Computational Linguistics | 2012
Ekaterina Kochmar; Øistein E. Andersen; Ted Briscoe
Previous work on automated error recognition and correction of texts written by learners of English as a Second Language has demonstrated experimentally that training classifiers on error-annotated ESL text generally outperforms training on native text alone and that adaptation of error correction models to the native language (L1) of the writer improves performance. Nevertheless, most extant models have poor precision, particularly when attempting error correction, and this limits their usefulness in practical applications requiring feedback. We experiment with various feature types, varying quantities of error-corrected data, and generic versus L1-specific adaptation to typical errors using Naive Bayes (NB) classifiers and develop one model which maximizes precision. We report and discuss the results for 8 models, 5 trained on the HOO data and 3 (partly) on the full error-coded Cambridge Learner Corpus, from which the HOO data is drawn.
English Profile Journal | 2011
Øistein E. Andersen
Manual error annotation of learner corpora is time-consuming and error-prone, whereas existing automatic techniques cannot reliably detect and correct all types of error. This paper shows that the two methods can successfully complement each other: automatic detection and partial correction of trivial errors relieves the human annotator from the laborious task of incessantly marking up oft-committed mistakes and enables him or her to focus on errors which cannot or cannot yet be handled mechanically, thus enabling more consistent annotation with considerably less manual time and effort expended.
Applied Measurement in Education | 2018
Helen Yannakoudakis; Øistein E. Andersen; Ardeshir Geranpayeh; Ted Briscoe; Diane Nicholls
There are quite a few challenges in the development of an automated writing placement model for non-native English learners, among them the fact that exams that encompass the full range of language proficiency exhibited at different stages of learning are hard to design. However, acquisition of appropriate training data that are relevant to the task at hand is essential in the development of the model. Using the Cambridge Learner Corpus writing scores, which have been subsequently benchmarked to Common European Framework of Reference for Languages (CEFR) levels, we conceptualize the task as a supervised machine learning problem, and primarily focus on developing a generic writing model. Such an approach facilitates the modeling of truly consistent, internal marking criteria regardless of the prompt delivered, which has the additional advantage of requiring smaller dataset sizes and not necessarily requiring re-training or tuning for new tasks. The system is developed to predict someone’s proficiency level on the CEFR scale, which allows learners to point to a specific standard of achievement. We furthermore integrate our model into Cambridge English Write & Improve™ (a freely available, cloud-based tool that automatically provides diagnostic feedback to non-native English language learners at different levels of granularity) and examine its use.
Language Resources and Evaluation | 2008
Øistein E. Andersen; Julien Nioche; Ted Briscoe; John A. Carroll
Archive | 2010
Ted Briscoe; Ben Medlock; Øistein E. Andersen
Workshop on Innovative Use of NLP for Building Educational Applications | 2013
Øistein E. Andersen; Helen Yannakoudakis; Fiona Barker; Tim Parish
Empirical Methods in Natural Language Processing | 2017
Helen Yannakoudakis; Marek Rei; Øistein E. Andersen; Zheng Yuan
North American Chapter of the Association for Computational Linguistics | 2018
Meng Zhang; Xie Chen; Ronan Cummins; Øistein E. Andersen; Ted Briscoe