Publication


Featured research published by Joel R. Tetreault.


International Conference on Computational Linguistics | 2014

Automated Grammatical Error Correction for Language Learners

Joel R. Tetreault; Claudia Leacock

A fast-growing area in Natural Language Processing is the use of automated tools for identifying and correcting grammatical errors made by language learners. This growth, in part, has been fueled by the needs of a large number of people in the world who are learning and using a second or foreign language. For example, it is estimated that there are currently over one billion people who are non-native writers of English. These numbers drive the demand for accurate tools that can help learners to write and speak proficiently in another language. Such demand also makes this an exciting time for those in the NLP community who are developing automated methods for grammatical error correction (GEC). Our motivation for the COLING tutorial is to make others more aware of this field and its particular set of challenges. For these reasons, we believe that the tutorial will potentially benefit a broad range of conference attendees. In general, there has been a surge in interest in using NLP to address educational needs, which, in turn, has spawned the recurring ACL/NAACL workshop “Innovative Use of Natural Language Processing for Building Educational Applications” that had its 9th edition at ACL 2014. The last three years, in particular, have been pivotal for GEC. Papers on the topic have become more commonplace at main conferences such as ACL, NAACL and EMNLP, and the topic has been covered in two editions of a Morgan & Claypool Synthesis Series book (Leacock et al., 2010; Leacock et al., 2014). In 2011 and 2012, the first shared tasks in GEC (Dale and Kilgarriff, 2011; Dale et al., 2012) were created, and dozens of teams from all over the world participated. This was followed by two successful CoNLL Shared Tasks on the topic in 2013 and 2014 (Ng et al., 2013; Ng et al., 2014).
While there have been many exciting developments in GEC over the last few years, there is still considerable room for improvement as state-of-the-art performance in detecting and correcting several important error types is still inadequate for real world applications. We hope to engage researchers from other NLP fields to develop novel and effective approaches to these problems. Our tutorial is specifically designed to:


International Conference on Computational Linguistics | 2008

The Ups and Downs of Preposition Error Detection in ESL Writing

Joel R. Tetreault; Martin Chodorow

In this paper we describe a methodology for detecting preposition errors in the writing of non-native English speakers. Our system performs at 84% precision and close to 19% recall on a large set of student essays. In addition, we address the problem of annotation and evaluation in this domain by showing how current approaches of using only one rater can skew system evaluation. We present a sampling approach to circumvent some of the issues that complicate evaluation of error detection systems.
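The reported figures follow the standard precision/recall definitions for error detection. A minimal sketch, with illustrative counts chosen only to reproduce the reported numbers (not the paper's actual confusion matrix):

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Standard detection metrics."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall

# Illustrative counts: 84 of 100 flagged prepositions are real errors,
# while 358 further errors in the essays go undetected.
p, r = precision_recall(true_pos=84, false_pos=16, false_neg=358)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.84 recall=0.19
```

The asymmetry is typical of deployed error-detection tools, which trade recall for precision so that learners rarely see false alarms.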


Meeting of the Association for Computational Linguistics | 2007

Detection of Grammatical Errors Involving Prepositions

Martin Chodorow; Joel R. Tetreault; Na-Rae Han

This paper presents ongoing work on the detection of preposition errors of non-native speakers of English. Since prepositions account for a substantial proportion of all grammatical errors by ESL (English as a Second Language) learners, developing an NLP application that can reliably detect these types of errors will provide an invaluable learning resource to ESL students. To address this problem, we use a maximum entropy classifier combined with rule-based filters to detect preposition errors in a corpus of student essays. Although our work is preliminary, we achieve a precision of 0.8 with a recall of 0.3.
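The overall shape of such a pipeline, rule-based filters in front of a statistical model that compares the writer's preposition against alternatives, can be sketched as below. The context table, filter rule, and threshold are invented stand-ins; the paper's actual model is a maximum entropy classifier over much richer features:

```python
# (prev word, next word) -> preposition counts; a toy stand-in for training data.
CONTEXT_COUNTS = {
    ("depend", "him"): {"on": 95, "of": 3, "in": 2},
    ("interested", "music"): {"in": 90, "on": 5, "at": 5},
}

def rule_filter(prev_word):
    # Hypothetical rule-based filter: skip contexts with no left neighbor.
    return prev_word is not None

def flag_preposition(prev_word, prep, next_word, threshold=10.0):
    """Flag prep as an error if an alternative is far more likely here."""
    if not rule_filter(prev_word):
        return None
    counts = CONTEXT_COUNTS.get((prev_word, next_word))
    if counts is None:
        return None  # unseen context: stay silent to protect precision
    best = max(counts, key=counts.get)
    if best != prep and counts[best] >= threshold * counts.get(prep, 1):
        return best  # suggested correction
    return None

print(flag_preposition("depend", "of", "him"))  # on
print(flag_preposition("depend", "on", "him"))  # None
```

The threshold plays the same role as a confidence cutoff in the real system: flagging only when the evidence is lopsided keeps precision high at the cost of recall.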


Computational Linguistics | 2001

A corpus-based evaluation of centering and pronoun resolution

Joel R. Tetreault

In this paper we compare pronoun resolution algorithms and introduce a centering algorithm (Left-Right Centering) that adheres to the constraints and rules of centering theory and is an alternative to Brennan, Friedman, and Pollard's (1987) algorithm. We then use the Left-Right Centering algorithm to see if two psycholinguistic claims on Cf-list ranking will actually improve pronoun resolution accuracy. Our results from this investigation lead to the development of a new syntax-based ranking of the Cf-list and corpus-based evidence that contradicts the psycholinguistic claims.
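The search order that gives Left-Right Centering its name can be sketched as follows: scan entities already realized in the current utterance left-to-right, then each prior utterance's Cf-list, taking the first agreeing candidate. The entity records and the agreement check are simplified inventions for illustration:

```python
def lrc_resolve(pronoun_features, current_prefix, prior_cf_lists):
    """current_prefix: entities already realized left of the pronoun in the
    current utterance; prior_cf_lists: Cf-lists of earlier utterances,
    most recent first. Returns the first agreeing candidate."""
    for cf_list in [current_prefix] + prior_cf_lists:
        for entity in cf_list:  # left-to-right, highest-ranked first
            if entity["features"] == pronoun_features:  # toy agreement check
                return entity["name"]
    return None

prior = [[{"name": "Sara", "features": "fem-sg"},
          {"name": "a book", "features": "neut-sg"}]]
current = [{"name": "John", "features": "masc-sg"}]
print(lrc_resolve("fem-sg", current, prior))   # Sara
print(lrc_resolve("masc-sg", current, prior))  # John
```

The paper's question about Cf-list ranking corresponds here to the order of entities within each inner list: changing that order changes which candidate wins.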


International World Wide Web Conference | 2016

Abusive Language Detection in Online User Content

Chikashi Nobata; Joel R. Tetreault; Achint Thomas; Yashar Mehdad; Yi Chang

Detection of abusive language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods make use of blacklists and regular expressions; however, these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, we develop a machine learning based method to detect hate speech in online user comments from two domains which outperforms a state-of-the-art deep learning approach. We also develop a corpus of user comments annotated for abusive language, the first of its kind. Finally, we use our detection tool to analyze abusive language over time and in different settings to further enhance our knowledge of this behavior.
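A toy illustration of why keyword blacklists and regular expressions fall short; the word list and example comments are invented:

```python
import re

# Toy blacklist; real commercial lists are much longer but share the weakness.
BLACKLIST = re.compile(r"\b(idiot|moron)\b", re.IGNORECASE)

def blacklist_flag(comment):
    return bool(BLACKLIST.search(comment))

print(blacklist_flag("you are an Idiot"))              # True: exact keyword
print(blacklist_flag("you are an 1d10t"))              # False: obfuscated spelling
print(blacklist_flag("people like you should leave"))  # False: no keyword at all
```

A learned classifier over character n-grams and other signals, as in the paper, can generalize to obfuscations and to abuse with no profanity at all, which a fixed pattern list cannot.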


Computer Assisted Language Learning | 2008

A Computational Approach to Detecting Collocation Errors in the Writing of Non-Native Speakers of English

Yoko Futagi; Paul Deane; Martin Chodorow; Joel R. Tetreault

This paper describes the first prototype of an automated tool for detecting collocation errors in texts written by non-native speakers of English. Candidate strings are extracted by pattern matching over POS-tagged text. Since learner texts often contain spelling and morphological errors, the tool attempts to automatically correct them in order to reduce noise. For a measure of collocation strength, we use the rank-ratio statistic calculated over one billion words of native-speaker texts. Two human annotators evaluated the system's performance. We report the overall results, as well as detailed error analyses, and discuss possible improvements for the future.
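One way a rank-ratio style association measure can work is to compare a collocate's rank in the distribution conditioned on the target word against its rank in the overall distribution. The sketch below uses invented counts, and the paper's exact formulation may differ:

```python
def ranks(counts):
    """Map each item to its 1-based rank by descending count."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {item: i + 1 for i, item in enumerate(ordered)}

def rank_ratio(collocate, conditional_counts, global_counts):
    # High when a word ranks much better near the target than it does overall.
    return ranks(global_counts)[collocate] / ranks(conditional_counts)[collocate]

global_counts = {"tea": 50, "coffee": 400, "decision": 900}
after_strong = {"tea": 30, "coffee": 25, "decision": 1}  # words seen after "strong"
print(rank_ratio("tea", after_strong, global_counts))  # 3.0 -> strong collocation
```

Rank-based measures like this are less sensitive to raw frequency than pointwise scores, which matters when the reference counts come from a billion-word corpus.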


International Conference on Computational Linguistics | 2008

Native Judgments of Non-Native Usage: Experiments in Preposition Error Detection

Joel R. Tetreault; Martin Chodorow

Evaluation and annotation are two of the greatest challenges in developing NLP instructional or diagnostic tools to mark grammar and usage errors in the writing of non-native speakers. Past approaches have commonly used only one rater to annotate a corpus of learner errors to compare to system output. In this paper, we show how using only one rater can skew system evaluation and then we present a sampling approach that makes it possible to evaluate a system more efficiently.
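The sampling idea can be sketched as follows: rather than exhaustively annotating the corpus, judge only samples drawn from the system-flagged and unflagged populations, then weight the judged counts back up by the sampling rates. All counts below are invented:

```python
def estimate_precision_recall(flagged_total, unflagged_total,
                              flagged_sample, unflagged_sample):
    """Each *_sample is (n_judged, n_true_errors_found_in_sample)."""
    judged_f, errors_f = flagged_sample
    judged_u, errors_u = unflagged_sample
    est_tp = errors_f / judged_f * flagged_total    # true errors among flags
    est_fn = errors_u / judged_u * unflagged_total  # errors the system missed
    precision = est_tp / flagged_total
    recall = est_tp / (est_tp + est_fn)
    return precision, recall

p, r = estimate_precision_recall(
    flagged_total=500, unflagged_total=20000,
    flagged_sample=(100, 80),    # raters confirm 80 of 100 sampled flags
    unflagged_sample=(200, 5),   # raters find 5 errors in 200 sampled non-flags
)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.80 recall=0.44
```

Only 300 judgments stand in for annotating 20,500 prepositions, which is the efficiency gain the abstract refers to; the price is sampling variance in the estimates.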


Language Testing | 2010

The utility of article and preposition error correction systems for English language learners: Feedback and assessment

Martin Chodorow; Michael Gamon; Joel R. Tetreault

In this paper, we describe and evaluate two state-of-the-art systems for identifying and correcting writing errors involving English articles and prepositions. Criterion℠, developed by Educational Testing Service, and ESL Assistant, developed by Microsoft Research, both use machine learning techniques to build models of article and preposition usage which enable them to identify errors and suggest corrections to the writer. We evaluated the effects of these systems on users in two studies. In one, Criterion provided feedback about article errors to native and non-native speakers who were writing an essay for a college-level psychology course. The results showed a significant reduction in the number of article errors in the final essays of the non-native speakers. In the second study, ESL Assistant was used by non-native speakers who were composing email messages. The results indicated that users were selective in their choices among the system’s suggested corrections and that, as a result, they were able to increase the proportion of valid corrections by making effective use of feedback.


Speech Communication | 2008

A Reinforcement Learning approach to evaluating state representations in spoken dialogue systems

Joel R. Tetreault; Diane J. Litman

Although dialogue systems have been an area of research for decades, finding accurate ways of evaluating different systems is still a very active subfield since many leading methods, such as task completion rate or user satisfaction, capture different aspects of the end-to-end human-computer dialogue interaction. In this work, we shift the focus from the complete evaluation of a dialogue system to presenting metrics for evaluating one internal component of a dialogue system: its dialogue manager. Specifically, we investigate how to create and evaluate the best state space representations for a Reinforcement Learning model to learn an optimal dialogue control strategy. We present three metrics for evaluating the impact of different state models and demonstrate their use on the domain of a spoken dialogue tutoring system by comparing the relative utility of adding three features to a model of user, or student, state. The motivation for this work is that if one knows which features are best to use, one can construct a better dialogue manager, and thus better-performing dialogue systems.
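The underlying question, whether adding a feature to the state model lets the learned policy earn more reward, can be illustrated with a one-step toy example. The states, actions, and reward numbers are invented and far simpler than the tutoring-system MDPs studied in the paper:

```python
REWARD = {  # reward(hidden student state, tutor action); numbers invented
    ("correct", "advance"): 1.0,
    ("correct", "remediate"): 0.2,
    ("incorrect", "advance"): 0.0,
    ("incorrect", "remediate"): 0.8,
}
P_CORRECT = 0.5  # chance the student answered correctly
ACTIONS = ("advance", "remediate")

def coarse_value():
    # Coarse state: correctness is hidden, so one action must serve everyone.
    return max(P_CORRECT * REWARD[("correct", a)]
               + (1 - P_CORRECT) * REWARD[("incorrect", a)] for a in ACTIONS)

def refined_value():
    # Refined state: the policy can condition its action on the feature.
    return (P_CORRECT * max(REWARD[("correct", a)] for a in ACTIONS)
            + (1 - P_CORRECT) * max(REWARD[("incorrect", a)] for a in ACTIONS))

print(coarse_value(), refined_value())  # 0.5 0.9
```

The gap between the two values is one way to quantify a feature's utility; a feature whose refinement leaves the value unchanged adds state-space size without improving the policy.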


North American Chapter of the Association for Computational Linguistics | 2007

Comparing User Simulation Models For Dialog Strategy Learning

Hua Ai; Joel R. Tetreault; Diane J. Litman

This paper explores what kind of user simulation model is suitable for developing a training corpus for using Markov Decision Processes (MDPs) to automatically learn dialog strategies. Our results suggest that with sparse training data, a model that aims to randomly explore more dialog state spaces with certain constraints actually performs as well as or better than a more complex model that simulates realistic user behaviors in a statistical way.
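A minimal sketch of a "random exploration with constraints" simulated user: the simulator picks uniformly among the responses that are legal in the current dialog state, spreading training data across more states than a statistically fitted user model would. The state and response names are invented for illustration:

```python
import random

# Constraint table: the only responses legal in each system state (invented).
LEGAL_RESPONSES = {
    "ask_question": ["answer_correct", "answer_wrong", "ask_for_hint"],
    "give_hint": ["answer_correct", "answer_wrong"],
}

def random_user(system_state, rng):
    # Uniform choice among legal responses: broad but constrained exploration.
    return rng.choice(LEGAL_RESPONSES[system_state])

rng = random.Random(0)  # seeded for repeatability
trace = [random_user("ask_question", rng) for _ in range(5)]
print(trace)
```

Because every legal response has equal probability, even rare user behaviors appear often enough in the generated corpus for the MDP learner to estimate their consequences.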

Collaboration


Dive into Joel R. Tetreault's collaborations.

Top Co-Authors

Martin Chodorow

City University of New York

Aoife Cahill

University of Stuttgart

Keisuke Sakaguchi

Nara Institute of Science and Technology
