Markus Dickinson
Indiana University
Publication
Featured research published by Markus Dickinson.
Computer Assisted Language Learning | 2008
Markus Dickinson; Soojeong Eom; Yunkyoung Kang; Chong Min Lee; Rebecca Sachs
Task-based synchronous computer-mediated communication (CMC) may represent an optimal psycholinguistic environment for form-meaning connections, but learners do not receive feedback from a trusted authority. Intelligent computer-assisted language learning (ICALL) provides feedback, but the encouragement of communicative interaction remains largely unexplored. To combine the benefits of ICALL and CMC, we are designing a parser-based system that provides feedback on particle usage for first-year L2 Korean learners while they chat in CMC. Both to facilitate the use of a new orthography by beginning learners and to make processing feasible for the ICALL system, we guide the content of the activity by using picture-based information-gap tasks and a game record, and we control the range of allowable learner input by using a word bank. These constraints enable the system to address L2 errors in meaningful, goal-driven interactions.
Meeting of the Association for Computational Linguistics | 2005
Markus Dickinson; W. Detmar Meurers
Consistency of corpus annotation is an essential property for the many uses of annotated corpora in computational and theoretical linguistics. While some research addresses the detection of inconsistencies in positional annotation (e.g., part-of-speech) and continuous structural annotation (e.g., syntactic constituency), no approach has yet been developed for automatically detecting annotation errors in discontinuous structural annotation. This is significant since the annotation of potentially discontinuous stretches of material is increasingly relevant, from treebanks for free-word order languages to semantic and discourse annotation. In this paper we discuss how the variation n-gram error detection approach (Dickinson and Meurers, 2003a) can be extended to discontinuous structural annotation. We exemplify the approach by showing how it successfully detects errors in the syntactic annotation of the German TIGER corpus (Brants et al., 2002).
Workshop on Innovative Use of NLP for Building Educational Applications | 2008
Markus Dickinson; Joshua Herring
We outline a new ICALL system for learners of Russian, focusing on the processing needed for basic morphological errors. By setting out an appropriate design for a lexicon and distinguishing the types of morphological errors to be detected, we establish a foundation for error detection across exercises.
Meeting of the Association for Computational Linguistics | 2009
Markus Dickinson
Building on work detecting errors in dependency annotation, we set out to correct local dependency errors. To do this, we outline the properties of annotation errors that make the task challenging and their existence problematic for learning. For the task, we define a feature-based model that explicitly accounts for non-relations between words, and then use ambiguities from one model to constrain a second, more relaxed model. In this way, we are successfully able to correct many errors, in a way which is potentially applicable to dependency parsing more generally.
International Conference on Computational Linguistics | 2008
Markus Dickinson
As it serves as a basis for POS tagging, category induction, and human category acquisition, we investigate the information needed to disambiguate a word in a local context, when using corpus categories. Specifically, we increase the recall of an error detection method by abstracting the word to be disambiguated to a representation containing information about some of its inherent properties, namely the set of categories it can potentially have. This work thus provides insights into the relation of corpus categories to categories derived from local contexts.
Language and Linguistics Compass | 2015
Markus Dickinson
This paper surveys methods for annotation error detection and correction. Methods can broadly be characterized by whether they detect inconsistencies with respect to a statistical model based only on the corpus data or with respect to a grammatical model or, more generally, some external information source. Two extended examples are presented, illustrating these different techniques: (1) the variation n-gram method, which searches for inconsistencies in annotation for identical strings; and (2) a method of ad hoc rule detection, for syntactic annotation, which compares treebank rules to a grammar to determine which are anomalous. Methods for detecting annotation errors have developed much over the last decade, and thus corpus practitioners can benefit greatly from them, while at the same time NLP researchers can learn more about the nuances of the annotation they use and see how error correction methods intersect with NLP techniques.
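The core idea of searching for identical strings that carry different annotations can be illustrated with a small sketch. This is a simplified reading of the variation n-gram idea for part-of-speech tags, not the authors' implementation: the function name `find_variation_ngrams`, the toy corpus, and the fixed n-gram length are all assumptions made for illustration, and the published method includes further heuristics that are not modeled here.

```python
from collections import defaultdict


def find_variation_ngrams(tagged_sentences, n=3):
    """Find word n-grams in which one position receives more than one tag.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    Returns a dict mapping (word n-gram, position) to the set of tags seen
    at that position, restricted to entries with at least two distinct tags.
    """
    # (word n-gram, offset inside the n-gram) -> tags observed at that offset
    seen = defaultdict(set)
    for sentence in tagged_sentences:
        words = [word for word, _ in sentence]
        tags = [tag for _, tag in sentence]
        for start in range(len(sentence) - n + 1):
            ngram = tuple(words[start:start + n])
            for offset in range(n):
                seen[(ngram, offset)].add(tags[start + offset])
    # Keep only positions whose annotation varies across identical contexts.
    return {key: tags for key, tags in seen.items() if len(tags) > 1}


# Hypothetical toy corpus: "fish" is tagged inconsistently in the same context.
corpus = [
    [("they", "PRP"), ("can", "MD"), ("fish", "VB"), ("here", "RB")],
    [("they", "PRP"), ("can", "MD"), ("fish", "NN"), ("here", "RB")],
    [("we", "PRP"), ("can", "MD"), ("swim", "VB")],
]
result = find_variation_ngrams(corpus, n=3)
for (ngram, offset), tags in sorted(result.items()):
    print(" ".join(ngram), "->", ngram[offset], sorted(tags))
```

Each reported entry is only a candidate: variation in identical contexts may reflect genuine ambiguity rather than an annotation error, so flagged cases still need to be inspected.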
Recent Advances in Natural Language Processing | 2017
Wen Li; Markus Dickinson
Social media provides users a platform to publish messages and socialize with others, and microblogs have gained more users than ever in recent years. With such usage, user profiling is a popular task in computational linguistics and text mining. Different approaches have been used to predict users’ gender, age, and other information, but most of this work has been done on English and other Western languages. The goal of this project is to predict the gender of users based on their posts on Weibo, a Chinese micro-blogging platform. Given issues in Chinese word segmentation, we explore character and word n-grams as features for this task, as well as using character and word embeddings for classification. Given how the data is extracted, we approach the task on a per-post basis, and we show the difficulties of the task for both humans and computers. Nonetheless, we present encouraging results and point to future improvements.
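Character n-grams sidestep word segmentation because they are read directly off the character sequence. The sketch below shows one way such features could be counted for a single post; the function name `char_ngrams` and the example string are hypothetical, and the abstract does not specify the actual feature extraction or classifier used.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in a string (no segmentation needed)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Hypothetical post text; each count could serve as one classifier feature.
features = char_ngrams("今天天气", n=2)
print(features)
```

Word n-grams would follow the same pattern over a token list, but they first require a segmenter, which is the step the abstract identifies as problematic for Chinese.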
North American Chapter of the Association for Computational Linguistics | 2016
Levi King; Markus Dickinson
We investigate questions of how to reason about learner meaning in cases where the set of correct meanings is never entirely complete, specifically for the case of picture description tasks (PDTs). To operationalize this, we explore different models of representing and scoring non-native speaker (NNS) responses to a picture, including bags of dependencies, automatically determining the relevant parts of an image from a set of native speaker (NS) responses. In more exploratory work, we examine the variability in both NS and NNS responses, and how different system parameters correlate with the variability. In this way, we hope to provide insight for future system development, data collection, and investigations into learner language.
Workshop on Innovative Use of NLP for Building Educational Applications | 2015
Scott Ledbetter; Markus Dickinson
In this paper, we describe a morphological analyzer for learner Hungarian, built upon limited grammatical knowledge of Hungarian. The rule-based analyzer requires very few resources and is flexible enough to do both morphological analysis and error detection, in addition to some unknown word handling. As this is work-in-progress, we demonstrate its current capabilities, some areas where analysis needs to be improved, and an initial foray into how the system output can support the analysis of interlanguage grammars.
Linguistic Annotation Workshop | 2015
Markus Dickinson; Marwa Ragheb
We examine some non-canonical annotation categories that license missing material (ellipses and enumerations). In extending these categories to learner data, the distinctions seem to require an annotator to determine whether a sentence is grammatical or not when deciding between particular analyses. We unpack the assumptions surrounding the annotation of learner language and how these particular phenomena compare to competing analyses, pointing out the implications for annotation practice and second language analysis.