Yejin Choi | Researchain

Publication

Featured research published by Yejin Choi.

computer vision and pattern recognition | 2011

Baby talk: Understanding and generating simple image descriptions

Girish Kulkarni; Visruth Premraj; Sagnik Dhar; Siming Li; Yejin Choi; Alexander C. Berg; Tamara L. Berg

We posit that visually descriptive language offers computer vision researchers both information about the world, and information about how people describe the world. The potential benefit from this source is made more significant due to the enormous amount of language data easily available today. We present a system to automatically generate natural language descriptions from images that exploits both statistics gleaned from parsing large quantities of text data and recognition algorithms from computer vision. The system is very effective at producing relevant sentences for images. It also generates descriptions that are notably more true to the specific image content than previous work.

empirical methods in natural language processing | 2005

OpinionFinder: A System for Subjectivity Analysis

Theresa Wilson; Paul Hoffmann; Swapna Somasundaran; Jason Kessler; Janyce Wiebe; Yejin Choi; Claire Cardie; Ellen Riloff; Siddharth Patwardhan

OpinionFinder is a system that performs subjectivity analysis, automatically identifying when opinions, sentiments, speculations, and other private states are present in text. Specifically, OpinionFinder aims to identify subjective sentences and to mark various aspects of the subjectivity in these sentences, including the source (holder) of the subjectivity and words that are included in phrases expressing positive or negative sentiments.

empirical methods in natural language processing | 2005

Identifying Sources of Opinions with Conditional Random Fields and Extraction Patterns

Yejin Choi; Claire Cardie; Ellen Riloff; Siddharth Patwardhan

Recent systems have been developed for sentiment classification, opinion recognition, and opinion analysis (e.g., detecting polarity and strength). We pursue another aspect of opinion analysis: identifying the sources of opinions, emotions, and sentiments. We view this problem as an information extraction task and adopt a hybrid approach that combines Conditional Random Fields (Lafferty et al., 2001) and a variation of AutoSlog (Riloff, 1996a). While CRFs model source identification as a sequence tagging task, AutoSlog learns extraction patterns. Our results show that the combination of these two methods performs better than either one alone. The resulting system identifies opinion sources with 79.3% precision and 59.5% recall using a head noun matching measure, and 81.2% precision and 60.6% recall using an overlap measure.
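
The overlap-based scores quoted above can be illustrated with a minimal sketch. It assumes that a predicted source counts as correct when it shares at least one token with a gold source; the abstract does not spell out the exact matching rule, so the span representation and the function names here are hypothetical.

```python
from typing import List, Tuple

# Assumption: a source is a half-open (start, end) range of token offsets.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """Return True when two half-open spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def overlap_precision_recall(predicted: List[Span], gold: List[Span]) -> Tuple[float, float]:
    """Precision: fraction of predicted spans overlapping some gold span.
    Recall: fraction of gold spans overlapped by some predicted span."""
    matched_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    return precision, recall


# Toy data (made up for illustration, not taken from the paper).
gold_sources = [(0, 2), (7, 9), (15, 16)]
predicted_sources = [(1, 2), (10, 12)]
precision, recall = overlap_precision_recall(predicted_sources, gold_sources)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A head noun matching measure would replace `overlaps` with a stricter test on the head token of each span, which is consistent with the slightly lower scores reported for that measure.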

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

BabyTalk: Understanding and Generating Simple Image Descriptions

Girish Kulkarni; Visruth Premraj; Vicente Ordonez; Sagnik Dhar; Siming Li; Yejin Choi; Alexander C. Berg; Tamara L. Berg

We present a system to automatically generate natural language descriptions from images. This system consists of two parts. The first part, content planning, smooths the output of computer vision-based detection and recognition algorithms with statistics mined from large pools of visually descriptive text to determine the best content words to use to describe an image. The second step, surface realization, chooses words to construct natural language sentences based on the predicted content and general statistics from natural language. We present multiple approaches for the surface realization step and evaluate each using automatic measures of similarity to human generated reference descriptions. We also collect forced choice human evaluations between descriptions from the proposed generation system and descriptions from competing approaches. The proposed system is very effective at producing relevant sentences for images. It also generates descriptions that are notably more true to the specific image content than previous work.

empirical methods in natural language processing | 2006

Joint Extraction of Entities and Relations for Opinion Recognition

Yejin Choi; Eric Breck; Claire Cardie

We present an approach for the joint extraction of entities and relations in the context of opinion recognition and analysis. We identify two types of opinion-related entities --- expressions of opinions and sources of opinions --- along with the linking relation that exists between them. Inspired by Roth and Yih (2004), we employ an integer linear programming approach to solve the joint opinion recognition task, and show that global, constraint-based inference can significantly boost the performance of both relation extraction and the extraction of opinion-related entities. Performance further improves when a semantic role labeling system is incorporated. The resulting system achieves F-measures of 79 and 69 for entity and relation extraction, respectively, improving substantially over prior results in the area.

empirical methods in natural language processing | 2009

Adapting a Polarity Lexicon using Integer Linear Programming for Domain-Specific Sentiment Classification

Yejin Choi; Claire Cardie

Polarity lexicons have been a valuable resource for sentiment analysis and opinion mining. There are a number of such lexical resources available, but it is often suboptimal to use them as is, because general purpose lexical resources do not reflect domain-specific lexical usage. In this paper, we propose a novel method based on integer linear programming that can adapt an existing lexicon into a new one to reflect the characteristics of the data more directly. In particular, our method collectively considers the relations among words and opinion expressions to derive the most likely polarity of each lexical item (positive, neutral, negative, or negator) for the given domain. Experimental results show that our lexicon adaptation technique improves the performance of fine-grained polarity classification.

international world wide web conferences | 2010

Using landing pages for sponsored search ad selection

Yejin Choi; Marcus Fontoura; Evgeniy Gabrilovich; Vanja Josifovski; Maurício R. Mediano; Bo Pang

We explore the use of the landing page content in sponsored search ad selection. Specifically, we compare the use of the ad's intrinsic content to augmenting the ad with the whole, or parts, of the landing page. We explore two types of extractive summarization techniques to select useful regions from the landing pages: out-of-context and in-context methods. Out-of-context methods select salient regions from the landing page by analyzing the content alone, without taking into account the ad associated with the landing page. In-context methods use the ad context (including its title, creative, and bid phrases) to help identify regions of the landing page that should be used by the ad selection engine. In addition, we introduce a simple yet effective unsupervised algorithm to enrich the ad context to further improve the ad selection. Experimental evaluation confirms that the use of landing pages can significantly improve the quality of ad selection. We also find that our extractive summarization techniques reduce the size of landing pages substantially, while retaining or even improving the performance of ad retrieval over the method that utilizes the entire landing page.
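
The in-context idea can be sketched as follows. This is only an illustration under stated assumptions: regions are scored by the fraction of their terms that also appear in the ad context, which is a stand-in heuristic and not the summarization technique used in the paper. The function names and the toy landing page are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_regions_in_context(ad_context: str, regions: List[str], k: int = 2) -> List[str]:
    """Keep the k landing-page regions whose terms best match the ad context.

    Assumption: the score is the share of a region's terms found in the ad
    context (title, creative, and bid phrases joined into one string).
    """
    ad_terms = tokenize(ad_context)

    def score(region: str) -> float:
        terms = tokenize(region)
        return len(terms & ad_terms) / len(terms) if terms else 0.0

    return sorted(regions, key=score, reverse=True)[:k]


# Toy landing page split into regions (made up for illustration).
landing_page_regions = [
    "Sign up for our newsletter",
    "Running shoes for men and women with free shipping",
    "Copyright 2010 all rights reserved",
    "Discount trail shoes",
]
selected = select_regions_in_context(
    "discount running shoes free shipping", landing_page_regions
)
print(selected)
```

An out-of-context variant would drop the `ad_context` argument and score each region from the page content alone, for example by position or term frequency.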

empirical methods in natural language processing | 2013

Where Not to Eat? Improving Public Policy by Predicting Hygiene Inspections Using Online Reviews

Jun Seok Kang; Polina Kuznetsova; Michael Luca; Yejin Choi

Restaurant hygiene inspections are often cited as a success story of public disclosure. Hygiene grades influence customer decisions and serve as an accountability system for restaurants. However, cities (which are responsible for inspections) have limited resources to dispatch inspectors, which in turn limits the number of inspections that can be performed. We argue that NLP can be used to improve the effectiveness of inspections by allowing cities to target restaurants that are most likely to have a hygiene violation. In this work, we report the first empirical study demonstrating the utility of review analysis for predicting restaurant inspection results.

empirical methods in natural language processing | 2016

Globally Coherent Text Generation with Neural Checklist Models

Chloé Kiddon; Luke Zettlemoyer; Yejin Choi

Recurrent neural networks can generate locally coherent text but often have difficulties representing what has already been generated and what still needs to be said – especially when constructing long texts. We present the neural checklist model, a recurrent neural network that models global coherence by storing and updating an agenda of text strings which should be mentioned somewhere in the output. The model generates output by dynamically adjusting the interpolation among a language model and a pair of attention models that encourage references to agenda items. Evaluations on cooking recipes and dialogue system responses demonstrate high coherence with greatly improved semantic coverage of the agenda.
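
The interpolation step described above can be sketched as a gated mixture of three next-token distributions. The sketch assumes that the gate is a softmax over three scores and that each component is already a probability distribution over tokens; the abstract does not give these details, and the names `agenda_attn_a` and `agenda_attn_b` are hypothetical placeholders for the two attention models.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def interpolate(
    gate_scores: List[float],
    lm_dist: Dict[str, float],
    agenda_attn_a: Dict[str, float],
    agenda_attn_b: Dict[str, float],
) -> Dict[str, float]:
    """Mix a language model with two agenda attention distributions.

    Assumption: gate_scores holds three unnormalized scores, one per
    component, which a real model would recompute at every step.
    """
    w_lm, w_a, w_b = softmax(gate_scores)
    vocab = set(lm_dist) | set(agenda_attn_a) | set(agenda_attn_b)
    return {
        tok: w_lm * lm_dist.get(tok, 0.0)
        + w_a * agenda_attn_a.get(tok, 0.0)
        + w_b * agenda_attn_b.get(tok, 0.0)
        for tok in vocab
    }


# Toy distributions (made up for illustration).
mixed = interpolate(
    [0.0, 0.0, 0.0],
    {"stir": 0.5, "the": 0.5},
    {"garlic": 1.0},
    {"onion": 1.0},
)
print(sorted(mixed.items()))
```

Because the gate weights sum to one and each component is a distribution, the mixture is itself a valid distribution. Raising the score of an attention component shifts probability mass toward agenda items.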

empirical methods in natural language processing | 2014

Keystroke Patterns as Prosody in Digital Writings: A Case Study with Deceptive Reviews and Essays

Ritwik Banerjee; Song Feng; Jun Seok Kang; Yejin Choi

In this paper, we explore the use of keyboard strokes as a means to access the real-time writing process of online authors, analogously to prosody in speech analysis, in the context of deception detection. We show that differences in keystroke patterns like editing maneuvers and duration of pauses can help distinguish between truthful and deceptive writing. Empirical results show that incorporating keystroke-based features leads to improved performance in deception detection in two different domains: online reviews and essays.
