Na-Rae Han
University of Pennsylvania
Publications
Featured research published by Na-Rae Han.
Natural Language Engineering | 2006
Na-Rae Han; Martin Chodorow; Claudia Leacock
One of the most difficult challenges faced by non-native speakers of English is mastering the system of English articles. We trained a maximum entropy classifier to select among a/an, the, or zero article for noun phrases (NPs), based on a set of features extracted from the local context of each. When the classifier was trained on 6 million NPs, its performance on published text was about 83% correct. We then used the classifier to detect article errors in the TOEFL essays of native speakers of Chinese, Japanese, and Russian. These writers made such errors in about one out of every eight NPs, or almost once in every three sentences. The classifier's agreement with human annotators was 85% (kappa = 0.48) when it selected among a/an, the, or zero article. Agreement was 89% (kappa = 0.56) when it made a binary (yes/no) decision about whether the NP should have an article. Even with these levels of overall agreement, precision and recall in error detection were only 0.52 and 0.80, respectively. However, when the classifier was allowed to skip cases where its confidence was low, precision rose to 0.90, with 0.40 recall. Additional improvements in performance may require features that reflect general knowledge to handle phenomena such as indirect prior reference. In August 2005, the classifier was deployed as a component of Educational Testing Service's Criterion℠ Online Writing Evaluation Service.
Meeting of the Association for Computational Linguistics | 2004
Na-Rae Han
Finite State Methods and Natural Language Processing | 2005
Na-Rae Han
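The article-error abstract above reports that letting the classifier skip low-confidence cases raised precision while lowering recall. The following is a minimal sketch of that trade-off, not the authors' implementation: the `precision_recall` helper, the `min_conf` parameter, and the `TOY` data are all hypothetical and chosen only to make the effect visible.

```python
from typing import List, Tuple

# Each item: (classifier flags an error?, classifier confidence, human says error?)
# Toy, hypothetical data; none of these values come from the paper.
Prediction = Tuple[bool, float, bool]


def precision_recall(preds: List[Prediction], min_conf: float = 0.0) -> Tuple[float, float]:
    """Precision and recall of error detection, abstaining below min_conf."""
    tp = fp = fn = 0
    for flagged, conf, is_error in preds:
        # Abstaining means a low-confidence flag is treated as "no error flagged".
        flagged = flagged and conf >= min_conf
        if flagged and is_error:
            tp += 1
        elif flagged and not is_error:
            fp += 1
        elif not flagged and is_error:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


TOY: List[Prediction] = [
    (True, 0.95, True),
    (True, 0.90, True),
    (True, 0.60, False),
    (True, 0.55, True),
    (True, 0.52, False),
    (False, 0.80, True),
    (False, 0.90, False),
    (False, 0.85, False),
]

print(precision_recall(TOY))        # every flag counts
print(precision_recall(TOY, 0.8))   # only confident flags count
```

On this toy set, raising `min_conf` removes both false alarms but also drops one true detection, which is the same direction of change the abstract describes.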
Language Resources and Evaluation | 2010
Na-Rae Han; Joel R. Tetreault; Soo-Hwa Lee; Jin-Young Ha
This paper discusses an annotation scheme for Korean null pronouns, which was used in annotating three kinds of Korean text corpora, including the Penn Korean Treebank. In annotating the corpora, null pronouns and their antecedents were marked up for their type and reference, with coreference relations tracked by numeric identifiers. Based on the annotation scheme, an outline of a potential pronoun resolution strategy is also proposed. The resulting dataset of annotated text is rather small at 11,834 words; we hope the null pronoun classification and annotation scheme proposed in this study will serve as a basis for developing a large-scale annotated corpus in the future.
Language Resources and Evaluation | 2004
Na-Rae Han; Martin Chodorow; Claudia Leacock
This paper describes the implementation and system details of Klex, a finite-state transducer lexicon for the Korean language, developed using XRCE’s Xerox Finite State Tool (XFST). Klex is essentially a transducer network representing the lexicon of the Korean language with the lexical string on the upper side and the inflected surface string on the lower side. Two major applications for Klex are morphological analysis and generation: given a well-formed inflected lower string, a language-independent algorithm derives the upper lexical string from the network and vice versa. Klex was written to conform to the part-of-speech tagging standards of the Korean Treebank Project, and is currently operating as the morphological analysis engine for the project.
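The two-sided lexicon described in the Klex abstract can be illustrated with a small sketch. This is a lookup-table stand-in for a transducer network, not XFST and not Klex itself: the `ToyLexicalRelation` class, the English word forms, and the `+V+Past`-style tags are all hypothetical and do not reflect the Korean Treebank tagset.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple


class ToyLexicalRelation:
    """A finite relation between upper (lexical) and lower (surface) strings."""

    def __init__(self, pairs: Iterable[Tuple[str, str]]) -> None:
        self._down: DefaultDict[str, List[str]] = defaultdict(list)
        self._up: DefaultDict[str, List[str]] = defaultdict(list)
        for upper, lower in pairs:
            self._down[upper].append(lower)  # generation direction
            self._up[lower].append(upper)    # analysis direction

    def generate(self, upper: str) -> List[str]:
        """Map a lexical string to its inflected surface strings."""
        return list(self._down.get(upper, []))

    def analyze(self, lower: str) -> List[str]:
        """Map a surface string to all lexical strings that produce it."""
        return list(self._up.get(lower, []))


# Hypothetical entries for illustration only.
PAIRS = [
    ("walk+V+Past", "walked"),
    ("walk+V+Base", "walk"),
    ("walk+N+Sg", "walk"),
]

lexicon = ToyLexicalRelation(PAIRS)
print(lexicon.generate("walk+V+Past"))
print(lexicon.analyze("walk"))
```

The same relation serves both directions, which is the property the abstract relies on: analysis reads it from the lower side to the upper side, and generation reads it the other way. An ambiguous surface form such as the toy `walk` simply yields more than one lexical string.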
Archive | 2006
Ellen F. Prince; Martha Palmer; Na-Rae Han
Archive | 2001
Chung-hye Han; Na-Rae Han; Eon-Suk Ko
Archive | 2001
Chung-hye Han; Na-Rae Han
Language and Information | 2002
Chung-hye Han; Na-Rae Han; Eon-Suk Ko; Martha Palmer
Archive | 2005
Na-Rae Han; Shijong Ryu