Publication


Featured research published by Michael Heilman.


Workshop on Innovative Use of NLP for Building Educational Applications | 2008

An Analysis of Statistical Models and Features for Reading Difficulty Prediction

Michael Heilman; Kevyn Collins-Thompson; Maxine Eskenazi

A reading difficulty measure can be described as a function or model that maps a text to a numerical value corresponding to a difficulty or grade level. We describe a measure of readability that uses a combination of lexical features and grammatical features that are derived from subtrees of syntactic parses. We also tested statistical models for nominal, ordinal, and interval scales of measurement. The results indicate that a model for ordinal regression, such as the proportional odds model, using a combination of grammatical and lexical features is most effective at predicting reading difficulty.
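
As a rough illustration of the kind of model discussed above, the sketch below fits a proportional odds (cumulative logit) model with statsmodels on synthetic features; the feature names, grade bands, and data are assumptions for illustration, not the paper's.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Synthetic stand-ins for lexical/grammatical features and 4 ordered grade bands.
rng = np.random.default_rng(0)
n = 300
latent = rng.uniform(size=n)                                   # latent difficulty
X = pd.DataFrame({
    "mean_sentence_length": 8 + 15 * latent + rng.normal(0, 2, n),
    "rare_word_ratio": 0.05 + 0.30 * latent + rng.normal(0, 0.05, n),
})
grade = np.digitize(latent, [0.25, 0.5, 0.75]) + 1
y = pd.Series(pd.Categorical(grade, categories=[1, 2, 3, 4], ordered=True))

# distr="logit" gives the cumulative-logit (proportional odds) formulation.
result = OrderedModel(y, X, distr="logit").fit(method="bfgs", disp=False)

# Probability of each grade band for a new text's feature values.
new_text = pd.DataFrame({"mean_sentence_length": [14.0], "rare_word_ratio": [0.18]})
print(result.predict(new_text))
```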


Workshop on Innovative Use of NLP for Building Educational Applications | 2008

Retrieval of Reading Materials for Vocabulary and Reading Practice

Michael Heilman; Le Zhao; Juan Pino; Maxine Eskenazi

Finding appropriate, authentic reading materials is a challenge for language instructors. The Web is a vast resource of texts, but most pages are not suitable for reading practice, and commercial search engines are not well suited to finding texts that satisfy pedagogical constraints such as reading level, length, text quality, and presence of target vocabulary. We present a system that uses various language technologies to facilitate the retrieval and presentation of authentic reading materials gathered from the Web. It is currently deployed in two English as a Second Language courses at the University of Pittsburgh.
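
The filtering step can be pictured as a simple constraint check over candidate pages. The sketch below is a hypothetical simplification: the field names, thresholds, and the idea of a precomputed readability level are assumptions, not the deployed system's design.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    text: str
    predicted_level: int   # e.g., output of a readability model

def meets_constraints(doc: Candidate, target_level: int, target_words: set,
                      min_tokens: int = 200, max_tokens: int = 1000) -> bool:
    tokens = doc.text.lower().split()
    return (
        abs(doc.predicted_level - target_level) <= 1          # reading level
        and min_tokens <= len(tokens) <= max_tokens           # length
        and any(w in tokens for w in target_words)            # target vocabulary
    )

candidates = [Candidate("https://example.org/article", "sample text " * 150, 6)]
readings = [c for c in candidates if meets_constraints(c, 6, {"sample"})]
print(len(readings))
```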


Artificial Intelligence in Education | 2010

Personalization of Reading Passages Improves Vocabulary Acquisition

Michael Heilman; Kevyn Collins-Thompson; Jamie Callan; Maxine Eskenazi; Alan Juffs; Lois Wilson

The REAP tutoring system provides individualized and adaptive English as a Second Language vocabulary practice. REAP can automatically personalize instruction by providing practice readings about topics that match interests as well as domain-based, cognitive objectives. While most previous research on motivation in intelligent tutoring environments has focused on increasing extrinsic motivation, this study focused on increasing personal interest. Students were randomly split into control and treatment groups. The control-condition tutor chose texts to maximize domain-based goals such as the density of practice opportunities for target words. The treatment-condition tutor also preferred texts that matched personal interests. The results show positive effects of personalization, and also demonstrate the importance of negotiating between motivational and domain-based goals.
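
One way to picture the negotiation between domain-based and motivational goals is a weighted ranking score; the function below is a hypothetical sketch, not REAP's actual selection algorithm.

```python
def reading_score(tokens, target_words, interest_words, interest_weight=0.5):
    """Rank a candidate text by practice-opportunity density plus interest match."""
    n = len(tokens) or 1
    practice_density = sum(t in target_words for t in tokens) / n
    interest_match = sum(t in interest_words for t in tokens) / n
    return practice_density + interest_weight * interest_match

# The control condition corresponds to interest_weight = 0 (domain goals only);
# the treatment condition also rewards texts on interest-matched topics.
```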


Meeting of the Association for Computational Linguistics | 2014

Applying Argumentation Schemes for Essay Scoring

Yi Song; Michael Heilman; Beata Beigman Klebanov; Paul Deane

Under the framework of the argumentation scheme theory (Walton, 1996), we developed annotation protocols for an argumentative writing task to support identification and classification of the arguments being made in essays. Each annotation protocol defined argumentation schemes (i.e., reasoning patterns) in a given writing prompt and listed questions to help evaluate an argument based on these schemes, to make the argument structure in a text explicit and classifiable. We report findings based on an annotation of 600 essays. Most annotation categories were applied reliably by human annotators, and some categories significantly contributed to essay score. An NLP system to identify sentences containing scheme-relevant critical questions was developed based on the human annotations.
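
A sentence-level classifier of the kind described can be sketched with off-the-shelf text features; the TF-IDF plus logistic regression pipeline and toy labels below are illustrative assumptions, not the system built on the annotations.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy annotated sentences: 1 = contains scheme-relevant argumentative content.
sentences = [
    "Experts in education support this policy.",
    "The weather was pleasant that day.",
]
labels = [1, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(sentences, labels)
print(clf.predict(["Several researchers argue this claim is well supported."]))
```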


Meeting of the Association for Computational Linguistics | 2014

Predicting Grammaticality on an Ordinal Scale

Michael Heilman; Aoife Cahill; Nitin Madnani; Melissa Lopez; Matthew Mulholland; Joel R. Tetreault

Automated methods for identifying whether sentences are grammatical have various potential applications (e.g., machine translation, automated essay scoring, computer-assisted language learning). In this work, we construct a statistical model of grammaticality using various linguistic features (e.g., misspelling counts, parser outputs, n-gram language model scores). We also present a new publicly available dataset of learner sentences judged for grammaticality on an ordinal scale. In evaluations, we compare our system to the one from Post (2011) and find that our approach yields state-of-the-art performance.
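
A minimal sketch of a feature-based grammaticality scorer, assuming precomputed feature values (spell-checker counts, language model scores, parser success) and a simple ridge regression rounded onto the ordinal scale; this is not the paper's exact model.

```python
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import Ridge

# Placeholder feature dicts per sentence; in practice these would come from a
# spell checker, a syntactic parser, and an n-gram language model.
features = [
    {"misspellings": 0, "lm_logprob_per_tok": -3.1, "parse_found": 1},
    {"misspellings": 3, "lm_logprob_per_tok": -6.4, "parse_found": 0},
]
scores = [4.0, 1.0]   # ordinal grammaticality judgments, e.g., 1-4

vec = DictVectorizer()
model = Ridge(alpha=1.0).fit(vec.fit_transform(features), scores)

# Predict, then clip and round back onto the ordinal scale.
pred = model.predict(vec.transform([{"misspellings": 1,
                                      "lm_logprob_per_tok": -4.0,
                                      "parse_found": 1}]))
print(np.clip(np.rint(pred), 1, 4))
```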


Workshop on Innovative Use of NLP for Building Educational Applications | 2015

Feature Selection for Automated Speech Scoring

Anastassia Loukina; Klaus Zechner; Lei Chen; Michael Heilman

Automated scoring systems used for the evaluation of spoken or written responses in language assessments need to balance good empirical performance with the interpretability of the scoring models. We compare several methods of feature selection for such scoring systems and show that the use of shrinkage methods such as Lasso regression makes it possible to rapidly build models that satisfy the requirements of validity and interpretability, which are crucial in assessment contexts, while also achieving good empirical performance.
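
The Lasso-based selection step can be sketched directly with scikit-learn; the synthetic features below stand in for speech-scoring measures and are not the assessment data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for speech-scoring features (fluency, pronunciation,
# vocabulary measures, ...); only a few are truly informative.
X, y = make_regression(n_samples=500, n_features=40, n_informative=5,
                       noise=10.0, random_state=0)
X = StandardScaler().fit_transform(X)

lasso = LassoCV(cv=5, random_state=0).fit(X, y)
selected = np.flatnonzero(lasso.coef_)      # features the L1 penalty kept
print(f"kept {selected.size} of {X.shape[1]} features:", selected)
```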


Intelligent Tutoring Systems | 2008

Word Sense Disambiguation for Vocabulary Learning

Anagha Kulkarni; Michael Heilman; Maxine Eskenazi; Jamie Callan

Words with multiple meanings are a phenomenon inherent to any natural language. In this work, we study the effects of such lexical ambiguities on second language vocabulary learning. We demonstrate that machine learning algorithms for word sense disambiguation can induce classifiers that exhibit high accuracy at the task of disambiguating homonyms (words with multiple distinct meanings). Results from a user study that compared two versions of a vocabulary tutoring system, one that applied word sense disambiguation to support learning and another that did not, support rejection of the null hypothesis that learning outcomes with and without word sense disambiguation are equivalent, with a p-value of 0.001. To our knowledge this is the first work that investigates the efficacy of word sense disambiguation for facilitating second language vocabulary learning.
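
A toy version of such a homonym classifier, assuming bag-of-words context features and Naive Bayes; the example contexts and senses are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Context windows around the ambiguous word "bank", labeled with its sense.
contexts = [
    "deposited the check at the bank before noon",
    "opened a savings account at the local bank",
    "fished from the muddy bank of the river",
    "sat on the grassy bank watching the water",
]
senses = ["finance", "finance", "river", "river"]

wsd = make_pipeline(CountVectorizer(), MultinomialNB())
wsd.fit(contexts, senses)
print(wsd.predict(["transfer money at the bank to a savings account"]))
```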


Proceedings of the Second Workshop on Metaphor in NLP | 2014

Different Texts, Same Metaphors: Unigrams and Beyond

Beata Beigman Klebanov; Ben Leong; Michael Heilman; Michael Flor

Current approaches to supervised learning of metaphor tend to use sophisticated features and restrict their attention to constructions and contexts where these features apply. In this paper, we describe the development of a supervised learning system to classify all content words in a running text as either being used metaphorically or not. We start by examining the performance of a simple unigram baseline that achieves surprisingly good results for some of the datasets. We then show how the recall of the system can be improved over this strong baseline.
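
The unigram baseline can be approximated by a classifier whose only feature is the word identity; the sketch below uses surface tokens and invented labels rather than the datasets in the paper.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# (content word, is_metaphor) pairs from running text; the unigram baseline
# uses only the (lowercased) word itself as a feature.
train = [("devoured", 1), ("book", 0), ("bright", 1), ("student", 0),
         ("devoured", 1), ("sandwich", 0)]
X = [{"w=" + w: 1} for w, _ in train]
y = [label for _, label in train]

vec = DictVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(X), y)
# Unseen words fall back to the intercept (majority behavior).
print(clf.predict(vec.transform([{"w=devoured": 1}, {"w=table": 1}])))
```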


North American Chapter of the Association for Computational Linguistics | 2015

Effective Feature Integration for Automated Short Answer Scoring

Keisuke Sakaguchi; Michael Heilman; Nitin Madnani

A major opportunity for NLP to have a real-world impact is in helping educators score student writing, particularly content-based writing (i.e., the task of automated short answer scoring). A major challenge in this enterprise is that scored responses to a particular question (i.e., labeled data) are valuable for modeling but limited in quantity. Additional information from the scoring guidelines for humans, such as exemplars for each score level and descriptions of key concepts, can also be used. Here, we explore methods for integrating scoring guidelines and labeled responses, and we find that stacked generalization (Wolpert, 1992) improves performance, especially for small training sets.
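
Stacked generalization can be sketched as training a meta-model on out-of-fold predictions of base scorers; the toy responses and base models below are assumptions, and the guideline-derived features (exemplar similarity, key-concept matches) are omitted for brevity.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

responses = ["the mitochondria produces energy for the cell",
             "energy is made in the mitochondria",
             "the cell wall protects the plant",
             "plants have a rigid outer wall"] * 5
scores = [1, 1, 0, 0] * 5          # 1 = key concept present

# Two base scorers; in a fuller system one could use guideline-based features.
base_models = [
    make_pipeline(TfidfVectorizer(), LogisticRegression()),
    make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC()),
]

# Meta-features are out-of-fold base predictions (Wolpert-style stacking).
meta_X = np.column_stack([
    cross_val_predict(m, responses, scores, cv=5, method="decision_function")
    for m in base_models
])
meta_model = LogisticRegression().fit(meta_X, scores)

# Refit base models on all labeled data, then score a new response.
for m in base_models:
    m.fit(responses, scores)
new = ["mitochondria make energy"]
new_meta = np.column_stack([m.decision_function(new) for m in base_models])
print(meta_model.predict(new_meta))
```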


Workshop on Innovative Use of NLP for Building Educational Applications | 2015

Reducing Annotation Efforts in Supervised Short Answer Scoring

Torsten Zesch; Michael Heilman; Aoife Cahill

Automated short answer scoring is increasingly used to give students timely feedback about their learning progress. Building scoring models comes with high costs, as state-of-the-art methods using supervised learning require large amounts of hand-annotated data. We analyze the potential of recently proposed methods for semi-supervised learning based on clustering. We find that all examined methods (centroids, all clusters, selected pure clusters) are mainly effective for very short answers and do not generalize well to several-sentence responses.
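
The cluster-then-propagate idea (e.g., the "centroids" variant) can be sketched as labeling each cluster from the few scored answers it contains; the data and propagation rule below are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# A handful of scored answers plus several unscored ones (toy data).
labeled = ["energy is produced in the mitochondria",
           "the mitochondria makes energy",
           "plants are green because of chlorophyll"]
labels = np.array([1, 1, 0])
unlabeled = ["mitochondria generate the cell's energy",
             "chlorophyll gives plants their green color",
             "cells get their energy from mitochondria"]

vec = TfidfVectorizer()
X = vec.fit_transform(labeled + unlabeled)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Each cluster takes the majority label of the labeled answers assigned to it,
# and that label is propagated to its unlabeled members (None if no labeled
# answer landed in the cluster).
assignments = km.labels_
pseudo_labels = {}
for c in range(km.n_clusters):
    members = assignments[: len(labeled)] == c
    if members.any():
        pseudo_labels[c] = int(round(labels[members].mean()))
print([pseudo_labels.get(c) for c in assignments[len(labeled):]])
```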

Collaboration


Dive into Michael Heilman's collaborations.

Top Co-Authors

Noah A. Smith
University of Washington

Maxine Eskenazi
Carnegie Mellon University

Aoife Cahill
University of Stuttgart

Brendan O'Connor
Carnegie Mellon University