Aaron Jaech
University of Washington
Publications
Featured research published by Aaron Jaech.
Conference of the International Speech Communication Association | 2016
Aaron Jaech; Larry P. Heck; Mari Ostendorf
The goal of this paper is to use multi-task learning to efficiently scale slot filling models for natural language understanding to handle multiple target tasks or domains. The key to scalability is reducing the amount of training data needed to learn a model for a new task. The proposed multi-task model delivers better performance with less data by leveraging patterns that it learns from the other tasks. The approach supports an open vocabulary, allowing the models to generalize to unseen words, which is particularly important when very little training data is available. A newly collected crowd-sourced data set covering four different domains is used to demonstrate the effectiveness of the domain adaptation and open vocabulary techniques.
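As a hedged illustration of this idea (a minimal PyTorch sketch under assumed names and sizes, not the paper's implementation), the snippet below shows a slot-filling tagger whose character-level and word-level encoders are shared across domains, with only a small per-domain output head; the domains, tag counts, and dimensions are invented.

    import torch
    import torch.nn as nn

    class MultiTaskSlotFiller(nn.Module):
        def __init__(self, n_chars, char_dim, hidden_dim, domain_tagsets):
            super().__init__()
            # Character embeddings + BiLSTM build word vectors from spelling,
            # so unseen words still get representations (open vocabulary).
            self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
            self.char_rnn = nn.LSTM(char_dim, hidden_dim // 2,
                                    batch_first=True, bidirectional=True)
            # Shared word-level BiLSTM: its parameters are reused by every
            # domain, which is where the multi-task data savings come from.
            self.word_rnn = nn.LSTM(hidden_dim, hidden_dim // 2,
                                    batch_first=True, bidirectional=True)
            # One lightweight output head per domain/task.
            self.heads = nn.ModuleDict({
                dom: nn.Linear(hidden_dim, n_tags)
                for dom, n_tags in domain_tagsets.items()
            })

        def forward(self, char_ids, domain):
            # char_ids: (n_words, max_word_len) character indices for one sentence
            _, (h, _) = self.char_rnn(self.char_emb(char_ids))
            words = torch.cat([h[0], h[1]], dim=-1).unsqueeze(0)
            states, _ = self.word_rnn(words)
            return self.heads[domain](states)  # per-word slot-tag scores

    # Toy usage: two hypothetical domains with different tag inventories.
    model = MultiTaskSlotFiller(n_chars=100, char_dim=16, hidden_dim=64,
                                domain_tagsets={"restaurants": 9, "flights": 13})
    scores = model(torch.randint(1, 100, (6, 10)), domain="flights")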
Empirical Methods in Natural Language Processing | 2015
Aaron Jaech; Victoria Zayats; Hao Fang; Mari Ostendorf; Hannaneh Hajishirzi
This paper addresses the question of how language use affects community reaction to comments in online discussion forums, and the relative importance of the message vs. the messenger. A new comment ranking task is proposed based on community annotated karma in Reddit discussions, which controls for topic and timing of comments. Experimental work with discussion threads from six subreddits shows that the importance of different types of language features varies with the community of interest.
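As a rough sketch of what such a comment-ranking setup can look like in code (toy data and a plain bag-of-words model, not the paper's feature set), one can train a pairwise classifier on feature differences between comments, predicting which of two comments earns higher karma:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # Invented comments and karma scores, standing in for one Reddit thread.
    comments = ["this is a thoughtful reply", "lol",
                "source? citation needed", "great point, thanks"]
    karma = [42, -3, 7, 15]

    vec = CountVectorizer()
    X = vec.fit_transform(comments).toarray()

    # Build pairwise examples: feature difference -> which comment won.
    pairs, labels = [], []
    for i in range(len(comments)):
        for j in range(i + 1, len(comments)):
            if karma[i] != karma[j]:
                pairs.append(X[i] - X[j])
                labels.append(int(karma[i] > karma[j]))

    clf = LogisticRegression().fit(pairs, labels)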
Workshop on Computational Approaches to Code Switching | 2016
Aaron Jaech; George Mulcaire; Mari Ostendorf; Noah A. Smith
Language identification systems degrade on short texts and in domains with unconventional spelling, such as Twitter and other social media. These challenges are explored in a shared task for Language Identification in Code-Switched Data (LICS 2016). We apply a hierarchical neural model to this task, learning character-level and contextualized word-level representations to make word-level language predictions. This approach performs well on both the 2014 and 2016 versions of the shared task.
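A minimal PyTorch sketch of this kind of hierarchical model (illustrative sizes and labels, not the shared-task system): a character CNN builds word vectors, a word-level BiLSTM adds context, and each word receives a language label.

    import torch
    import torch.nn as nn

    class HierLangID(nn.Module):
        def __init__(self, n_chars=128, char_dim=16, word_dim=32, n_langs=3):
            super().__init__()
            self.emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
            # Character convolution + max-pooling produces one vector per word.
            self.conv = nn.Conv1d(char_dim, word_dim, kernel_size=3, padding=1)
            # Word-level BiLSTM contextualizes words within the message.
            self.ctx = nn.LSTM(word_dim, word_dim // 2,
                               batch_first=True, bidirectional=True)
            self.out = nn.Linear(word_dim, n_langs)

        def forward(self, char_ids):
            # char_ids: (n_words, max_word_len) character indices for one tweet
            e = self.emb(char_ids).transpose(1, 2)          # (n_words, char_dim, len)
            w = torch.relu(self.conv(e)).max(dim=2).values  # max-pool -> word vectors
            ctx, _ = self.ctx(w.unsqueeze(0))               # contextualize across words
            return self.out(ctx.squeeze(0))                 # per-word language scores

    # Toy usage: 5 words of up to 12 characters, e.g. labels en / es / other.
    scores = HierLangID()(torch.randint(1, 128, (5, 12)))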
Empirical Methods in Natural Language Processing | 2015
Aaron Jaech; Mari Ostendorf
Usernames are ubiquitous on the Internet, and they are often suggestive of user demographics. This work examines the degree to which gender and language can be inferred from a username alone, using unsupervised morphology induction to decompose usernames into sub-units. Experimental results on the two tasks demonstrate the effectiveness of the proposed morphological features compared to a character n-gram baseline.
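For reference, the character n-gram baseline mentioned above can be expressed in a few lines of scikit-learn; the usernames and labels below are toy examples, and the paper's contribution is the morphological features layered on top of such a baseline.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented usernames and gender labels, purely for illustration.
    usernames = ["soccer_mike88", "anna_banana", "mike4ever", "princess_anna"]
    gender = ["m", "f", "m", "f"]

    clf = make_pipeline(
        CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # char n-grams
        LogisticRegression(),
    )
    clf.fit(usernames, gender)
    print(clf.predict(["mike_the_dude"]))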
North American Chapter of the Association for Computational Linguistics | 2016
Aaron Jaech; Rik Koncel-Kedziorski; Mari Ostendorf
Many puns create humor through the relationship between the pun word and a phonologically similar target. For example, in “Don’t take geologists for granite” the word “granite” is a pun on the target “granted”. Recovering the target in the mind of the listener is essential to the success of the pun. This work introduces a new model for automatic target recovery and provides the first empirical test for this task. The model draws on techniques for automatic speech recognition using weighted finite-state transducers, and leverages automatically learned phone edit probabilities that give insight into how people perceive sounds and into what makes a good pun. Evaluated on a small corpus, the model automatically recovers a large fraction of the pun targets.
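As a simplified stand-in for the WFST machinery (flat edit costs instead of learned phone edit probabilities, and hand-typed ARPAbet transcriptions), candidate targets can be scored by an edit distance over phone sequences:

    def edit_cost(src, tgt, sub_cost=1.0, ins_cost=1.0, del_cost=1.0):
        # Standard Levenshtein DP over phone symbols; the paper instead learns
        # per-phone-pair probabilities, which these flat costs approximate.
        d = [[0.0] * (len(tgt) + 1) for _ in range(len(src) + 1)]
        for i in range(1, len(src) + 1):
            d[i][0] = i * del_cost
        for j in range(1, len(tgt) + 1):
            d[0][j] = j * ins_cost
        for i in range(1, len(src) + 1):
            for j in range(1, len(tgt) + 1):
                sub = 0.0 if src[i - 1] == tgt[j - 1] else sub_cost
                d[i][j] = min(d[i - 1][j - 1] + sub,
                              d[i - 1][j] + del_cost,
                              d[i][j - 1] + ins_cost)
        return d[len(src)][len(tgt)]

    # "granite" vs. two hand-picked candidates (toy candidate set).
    pun = ["G", "R", "AE", "N", "AH", "T"]
    candidates = {"granted": ["G", "R", "AE", "N", "T", "IH", "D"],
                  "guarded": ["G", "AA", "R", "D", "IH", "D"]}
    best = min(candidates, key=lambda w: edit_cost(pun, candidates[w]))
    print(best)  # -> "granted": the phonologically closest candidate wins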
IEEE Transactions on Audio, Speech, and Language Processing | 2016
Yanzhang He; Peter Baumann; Hao Fang; Brian Hutchinson; Aaron Jaech; Mari Ostendorf; Eric Fosler-Lussier; Janet B. Pierrehumbert
Out-of-vocabulary (OOV) keywords present a challenge for keyword search (KWS) systems, especially in the low-resource setting. Previous research has centered on approaches that use a variety of subword units to recover OOV words. This paper systematically investigates morphology-based subword modeling approaches on seven low-resource languages. We show that using morphological subword units (morphs) in speech recognition decoding is substantially better than expanding word-decoded lattices into subword units, including phones, syllables, and morphs. As alternatives to grapheme-based morphs, we apply unsupervised morphology learning to sequences of phonemes, graphones, and syllables. Using one of these phone-based morphs is almost always better than using the grapheme-based morphs, but the particular choice varies with the language. Combining the different methods yields a substantial gain over the best single case for all languages, especially for OOV performance.
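To make the idea of learned subword units concrete, here is a toy BPE-style merge learner over a made-up corpus; this is only an analogy, since the paper applies unsupervised morphology learning (in the Morfessor tradition) to graphemes, phonemes, graphones, and syllables rather than frequency merges.

    from collections import Counter

    def learn_merges(words, n_merges=10):
        # Each word starts as a tuple of single characters.
        vocab = Counter(tuple(w) for w in words)
        for _ in range(n_merges):
            # Count adjacent symbol pairs, weighted by word frequency.
            pairs = Counter()
            for sym, freq in vocab.items():
                for a, b in zip(sym, sym[1:]):
                    pairs[(a, b)] += freq
            if not pairs:
                break
            best = max(pairs, key=pairs.get)
            # Merge the most frequent pair everywhere it occurs.
            new_vocab = Counter()
            for sym, freq in vocab.items():
                out, i = [], 0
                while i < len(sym):
                    if i + 1 < len(sym) and (sym[i], sym[i + 1]) == best:
                        out.append(sym[i] + sym[i + 1]); i += 2
                    else:
                        out.append(sym[i]); i += 1
                new_vocab[tuple(out)] += freq
            vocab = new_vocab
        return vocab

    # Toy corpus: segmentations like ("walk", "ed") emerge as decoding units.
    print(learn_merges(["walking", "walked", "talking", "talked"] * 5))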
Empirical Methods in Natural Language Processing | 2016
Aaron Jaech; George Mulcaire; Shobhit Hathi; Mari Ostendorf; Noah A. Smith
Transactions of the Association for Computational Linguistics | 2018
Aaron Jaech; Mari Ostendorf
North American Chapter of the Association for Computational Linguistics | 2018
Aaron Jaech; Shobhit Hathi; Mari Ostendorf
Meeting of the Association for Computational Linguistics | 2018
Aaron Jaech; Mari Ostendorf