Publications


Featured research published by Jacob Andreas.


North American Chapter of the Association for Computational Linguistics | 2016

Learning to Compose Neural Networks for Question Answering

Jacob Andreas; Marcus Rohrbach; Trevor Darrell; Daniel Klein

We describe a question answering model that applies to both images and structured knowledge bases. The model uses natural language strings to automatically assemble neural networks from a collection of composable modules. Parameters for these modules are learned jointly with network-assembly parameters via reinforcement learning, with only (world, question, answer) triples as supervision. Our approach, which we term a dynamic neural model network, achieves state-of-the-art results on benchmark datasets in both visual and structured domains.
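
As a rough illustration of the module-assembly idea, here is a minimal sketch (not the authors' released code) in which toy functions stand in for learned find, combine, and describe modules, and a nested-tuple layout, assumed here rather than predicted from the question, is recursively assembled into a computation over world features:

```python
# Minimal sketch of module assembly; module behaviors are toy
# stand-ins, not learned neural networks.
import numpy as np

def find(world, word_vec):
    # attend to parts of the world relevant to one word (toy: dot product)
    return world @ word_vec

def combine(att_a, att_b):
    # intersect two attentions (toy: elementwise minimum)
    return np.minimum(att_a, att_b)

def describe(world, attention):
    # map an attention over the world to answer scores
    return world.T @ attention

def assemble(layout, world):
    # layout is a nested tuple, e.g. ("describe", ("combine",
    # ("find", v1), ("find", v2)))
    op = layout[0]
    if op == "find":
        return find(world, layout[1])
    if op == "combine":
        return combine(assemble(layout[1], world), assemble(layout[2], world))
    if op == "describe":
        return describe(world, assemble(layout[1], world))
    raise ValueError(f"unknown module: {op}")

# toy usage: 5 world regions with 3-dimensional features
world = np.random.rand(5, 3)
v1, v2 = np.random.rand(3), np.random.rand(3)
scores = assemble(("describe", ("combine", ("find", v1), ("find", v2))), world)
```

In the actual model, the layout itself is predicted from the question string, and assembly decisions are trained jointly with the modules via reinforcement learning.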


Computer Vision and Pattern Recognition | 2017

Modeling Relationships in Referential Expressions with Compositional Modular Networks

Ronghang Hu; Marcus Rohrbach; Jacob Andreas; Trevor Darrell; Kate Saenko

People often refer to entities in an image in terms of their relationships with other entities. For example, the black cat sitting under the table refers to both a black cat entity and its relationship with another table entity. Understanding these relationships is essential for interpreting and grounding such natural language expressions. Most prior work focuses on either grounding entire referential expressions holistically to one region, or localizing relationships based on a fixed set of categories. In this paper we instead present a modular deep architecture capable of analyzing referential expressions into their component parts, identifying entities and relationships mentioned in the input expression and grounding them all in the scene. We call this approach Compositional Modular Networks (CMNs): a novel architecture that learns linguistic analysis and visual inference end-to-end. Our approach is built around two types of neural modules that inspect local regions and pairwise interactions between regions. We evaluate CMNs on multiple referential expression datasets, outperforming state-of-the-art approaches on all tasks.
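
The two module types can be sketched as score functions, a unary one over single regions and a pairwise one over ordered region pairs; the dot-product scorers and feature dimensions below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def unary_scores(regions, phrase_vec):
    # score every region against a phrase embedding (toy: dot product)
    return regions @ phrase_vec

def pairwise_scores(regions, rel_vec):
    # score every ordered pair of regions against a relationship
    # embedding over concatenated pair features
    n, d = regions.shape
    subj = np.repeat(regions, n, axis=0)   # (n*n, d)
    obj = np.tile(regions, (n, 1))         # (n*n, d)
    return (np.concatenate([subj, obj], axis=1) @ rel_vec).reshape(n, n)

def ground(regions, subj_vec, rel_vec, obj_vec):
    # total score for (subject region i, object region j); pick the best pair
    total = (unary_scores(regions, subj_vec)[:, None]
             + pairwise_scores(regions, rel_vec)
             + unary_scores(regions, obj_vec)[None, :])
    return np.unravel_index(np.argmax(total), total.shape)

regions = np.random.rand(6, 4)  # 6 candidate regions, 4-dim features
i, j = ground(regions, np.random.rand(4), np.random.rand(8), np.random.rand(4))
```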


Meeting of the Association for Computational Linguistics | 2014

How much do word embeddings encode about syntax?

Jacob Andreas; Daniel Klein

Do continuous word embeddings encode any useful information for constituency parsing? We isolate three ways in which word embeddings might augment a state-of-the-art statistical parser: by connecting out-of-vocabulary words to known ones, by encouraging common behavior among related in-vocabulary words, and by directly providing features for the lexicon. We test each of these hypotheses with a targeted change to a state-of-the-art baseline. Despite small gains on extremely small supervised training sets, we find that extra information from embeddings appears to make little or no difference to a parser with adequate training data. Our results support an overall hypothesis that word embeddings import syntactic information that is ultimately redundant with distinctions learned from treebanks in other ways.
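
The first hypothesis, connecting out-of-vocabulary words to known ones, can be illustrated with a small sketch that maps an unseen word onto its nearest in-vocabulary neighbor in embedding space; the cosine measure and toy vectors are assumptions for illustration:

```python
import numpy as np

def map_oov(word, embeddings, parser_vocab):
    # replace an out-of-vocabulary word with its nearest in-vocabulary
    # neighbor by cosine similarity; the embeddings (trained on
    # unlabeled text) are assumed to cover more words than the
    # parser's supervised vocabulary
    if word in parser_vocab:
        return word
    v = embeddings[word]
    def cosine(w):
        u = embeddings[w]
        return float(v @ u) / (np.linalg.norm(v) * np.linalg.norm(u))
    return max(parser_vocab, key=cosine)

embeddings = {"cat": np.array([1.0, 0.1]),
              "dog": np.array([0.9, 0.2]),
              "ocelot": np.array([0.95, 0.15])}
print(map_oov("ocelot", embeddings, {"cat", "dog"}))  # nearest known word
```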


Empirical Methods in Natural Language Processing | 2015

Alignment-Based Compositional Semantics for Instruction Following

Jacob Andreas; Daniel Klein

This paper describes an alignment-based model for interpreting natural language instructions in context. We approach instruction following as a search over plans, scoring sequences of actions conditioned on structured observations of text and the environment. By explicitly modeling both the low-level compositional structure of individual actions and the high-level structure of full plans, we are able to learn both grounded representations of sentence meaning and pragmatic constraints on interpretation. To demonstrate the model’s flexibility, we apply it to a diverse set of benchmark tasks. On every task, we outperform strong task-specific baselines, and achieve several new state-of-the-art results.
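
A schematic of the search formulation, with hypothetical scoring functions action_score (low-level, per-step compositional structure) and sequence_score (high-level plan structure) standing in for the paper's learned alignment model:

```python
def plan_score(instructions, plan, action_score, sequence_score):
    # score a candidate plan: each instruction step is aligned with an
    # action, plus a score for the plan's overall shape
    step_total = sum(action_score(instr, act)
                     for instr, act in zip(instructions, plan))
    return step_total + sequence_score(plan)

def follow(instructions, candidate_plans, action_score, sequence_score):
    # instruction following as search over plans: pick the best-scoring one
    return max(candidate_plans,
               key=lambda p: plan_score(instructions, p,
                                        action_score, sequence_score))

# toy usage with trivial scorers
plans = [["go", "turn"], ["turn", "go"]]
best = follow(["walk ahead", "turn left"], plans,
              action_score=lambda i, a: int(a[0] == i[0]),
              sequence_score=len)
```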


North American Chapter of the Association for Computational Linguistics | 2010

Corpus Creation for New Genres: A Crowdsourced Approach to PP Attachment

Mukund Jha; Jacob Andreas; Kapil Thadani; Sara Rosenthal; Kathleen R. McKeown

This paper explores the task of building an accurate prepositional phrase attachment corpus for new genres while avoiding a large investment of time and money by crowdsourcing judgments. We develop and present a system to extract prepositional phrases and their potential attachments from ungrammatical and informal sentences and pose the subsequent disambiguation tasks as multiple-choice questions to workers from Amazon's Mechanical Turk service. Our analysis shows that this two-step approach is capable of producing reliable annotations on informal and potentially noisy blog text, and this semi-automated strategy holds promise for similar annotation projects in new genres.
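
A toy sketch of the second step, turning an extracted PP and its candidate attachment sites into a multiple-choice item for crowd workers; the prompt wording is illustrative, not the paper's template:

```python
def make_question(sentence, pp, candidates):
    # render one PP-attachment decision as a multiple-choice question
    lines = [f'Sentence: "{sentence}"',
             f'What does "{pp}" describe?']
    lines += [f"  ({i}) {c}" for i, c in enumerate(candidates, 1)]
    return "\n".join(lines)

print(make_question("She ate the pizza with a fork.", "with a fork",
                    ["ate (the eating was done with a fork)",
                     "the pizza (the pizza came with a fork)"]))
```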


North American Chapter of the Association for Computational Linguistics | 2015

When and why are log-linear models self-normalizing?

Jacob Andreas; Daniel Klein

Several techniques have recently been proposed for training “self-normalized” discriminative models. These attempt to find parameter settings for which unnormalized model scores approximate the true label probability. However, the theoretical properties of such techniques (and of self-normalization generally) have not been investigated. This paper examines the conditions under which we can expect self-normalization to work. We characterize a general class of distributions that admit self-normalization, and prove generalization bounds for procedures that minimize empirical normalizer variance. Motivated by these results, we describe a novel variant of an established procedure for training self-normalized models. The new procedure avoids computing normalizers for most training examples, and decreases training time by as much as a factor of ten while preserving model quality.
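
For concreteness, a standard formulation of self-normalized training (in the style of earlier work on self-normalization; the paper's own variant may differ): the model is log-linear, and training penalizes the squared log-normalizer so that unnormalized scores can be read directly as probabilities.

```latex
% Log-linear model with explicit normalizer Z:
p(y \mid x; \theta) = \frac{\exp\{\theta^\top f(x, y)\}}{Z(x; \theta)},
\qquad
Z(x; \theta) = \sum_{y'} \exp\{\theta^\top f(x, y')\}

% Self-normalization asks that \log Z(x; \theta) \approx 0 on typical
% inputs. A common training objective adds a normalizer penalty:
\hat{\theta} = \arg\min_{\theta} \sum_{i}
  \Big[ -\log p(y_i \mid x_i; \theta)
        + \alpha \big( \log Z(x_i; \theta) \big)^2 \Big]
```

The squared-log-normalizer term is one way of driving the empirical normalizer variance toward zero, the quantity in which the paper's generalization bounds are stated.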


Proceedings of the Second Workshop on Language in Social Media | 2012

Detecting Influencers in Written Online Conversations

Or Biran; Sara Rosenthal; Jacob Andreas; Kathleen R. McKeown; Owen Rambow

It has long been established that there is a correlation between the dialog behavior of a participant and how influential he or she is perceived to be by other discourse participants. In this paper we explore the characteristics of communication that make someone an opinion leader and develop a machine learning based approach for the automatic identification of discourse participants that are likely to be influencers in online communication. Our approach relies on identification of three types of conversational behavior: persuasion, agreement/disagreement, and dialog patterns.
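
As a rough sketch of the feature-extraction step, the following uses hypothetical keyword cues in place of the paper's trained persuasion and (dis)agreement detectors, producing per-author counts that a downstream influencer classifier could consume:

```python
from collections import Counter, defaultdict

# Hypothetical keyword cues standing in for trained behavior detectors.
PERSUASION = {"should", "must", "clearly", "consider"}
AGREEMENT = {"agree", "exactly", "true"}
DISAGREEMENT = {"disagree", "wrong", "however"}

def behavior_features(posts):
    # posts: list of (author, text); count behavior cues per author
    feats = defaultdict(Counter)
    for author, text in posts:
        words = set(text.lower().split())
        feats[author]["persuasion"] += len(words & PERSUASION)
        feats[author]["agreement"] += len(words & AGREEMENT)
        feats[author]["disagreement"] += len(words & DISAGREEMENT)
        feats[author]["turns"] += 1
    return feats

feats = behavior_features([("ann", "You should clearly consider this."),
                           ("bob", "I agree, exactly.")])
```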


Conference on Computational Natural Language Learning | 2014

Grounding Language with Points and Paths in Continuous Spaces

Jacob Andreas; Daniel Klein

We present a model for generating path-valued interpretations of natural language text. Our model encodes a map from natural language descriptions to paths, mediated by segmentation variables which break the language into a discrete set of events, and alignment variables which reorder those events. Within an event, lexical weights capture the contribution of each word to the aligned path segment. We demonstrate the applicability of our model on three diverse tasks: a new color description task, a new financial news task, and an established direction-following task. On all three, the model outperforms strong baselines, and on a hard variant of the direction-following task it achieves results close to the state-of-the-art system described in Vogel and Jurafsky (2010).
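
A minimal sketch of how a (segmentation, alignment) pair scores one interpretation, with a hypothetical lex_weight function standing in for the learned lexical weights:

```python
def interpretation_score(words, path, segmentation, alignment, lex_weight):
    # segmentation: word spans [(lo, hi), ...], one per event;
    # alignment: event index -> index of the aligned path segment;
    # lex_weight(word, segment): the word's contribution to that segment
    total = 0.0
    for event, (lo, hi) in enumerate(segmentation):
        segment = path[alignment[event]]
        total += sum(lex_weight(w, segment) for w in words[lo:hi])
    return total

# toy usage: two events over a 1-D path, trivial lexical weights
words = "go up then come back down".split()
path = [(0.0, 1.0), (1.0, 0.0)]  # segments as (start, end) heights
score = interpretation_score(
    words, path, segmentation=[(0, 2), (2, 6)], alignment=[0, 1],
    lex_weight=lambda w, s: 1.0 if (w == "up") == (s[1] > s[0]) else 0.0)
```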


Language Resources and Evaluation | 2010

Towards Semi-Automated Annotation for Prepositional Phrase Attachment

Kapil Thadani; Sara Rosenthal; Kathleen R. McKeown; William Lipovsky; Jacob Andreas

This paper investigates whether high-quality annotations for tasks involving semantic disambiguation can be obtained without a major investment in time or expense. We examine the use of untrained human volunteers from Amazon's Mechanical Turk in disambiguating prepositional phrase (PP) attachment over sentences drawn from the Wall Street Journal corpus. Our goal is to compare the performance of these crowdsourced judgments to the annotations supplied by trained linguists for the Penn Treebank project in order to gauge the viability of this approach for annotation projects that involve contextual disambiguation. The results of our experiments on a sample of the Wall Street Journal corpus show that invoking majority agreement between multiple human workers can yield PP attachments with fairly high precision. This confirms that a crowdsourcing approach to syntactic annotation holds promise for generating training corpora in new domains and genres where high-quality annotations are unavailable or difficult to obtain.
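
The majority-agreement filter can be sketched in a few lines; the vote threshold here is illustrative, not the paper's setting:

```python
from collections import Counter

def majority_attachment(judgments, min_votes=3):
    # accept a PP attachment only when enough workers choose the same
    # site; otherwise withhold judgment
    site, votes = Counter(judgments).most_common(1)[0]
    return site if votes >= min_votes else None

print(majority_attachment(["ate", "ate", "pizza", "ate", "ate"]))  # ate
print(majority_attachment(["ate", "pizza", "ate", "pizza"]))       # None
```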


Meeting of the Association for Computational Linguistics | 2017

A Minimal Span-Based Neural Constituency Parser

Mitchell Stern; Jacob Andreas; Daniel Klein

In this work, we present a minimal neural model for constituency parsing based on independent scoring of labels and spans. We show that this model is not only compatible with classical dynamic programming techniques, but also admits a novel greedy top-down inference algorithm based on recursive partitioning of the input. We demonstrate empirically that both prediction schemes are competitive with recent work, and when combined with basic extensions to the scoring model are capable of achieving state-of-the-art single-model performance on the Penn Treebank (91.79 F1) and strong performance on the French Treebank (82.23 F1).
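
The greedy top-down inference can be sketched as recursive partitioning over spans; score_label and score_split below are hypothetical stand-ins for the paper's learned span and label scorers:

```python
def parse(left, right, score_label, score_split):
    # greedy top-down inference: choose the best label for the span,
    # then the best split point, and recurse on the two halves;
    # score_label(i, j) -> {label: score}, score_split(i, j, k) -> score
    labels = score_label(left, right)
    label = max(labels, key=labels.get)
    if right - left == 1:
        return (label, left)  # single-word span
    split = max(range(left + 1, right),
                key=lambda k: score_split(left, right, k))
    return (label,
            parse(left, split, score_label, score_split),
            parse(split, right, score_label, score_split))

# toy usage over a 3-word sentence with uniform scores
tree = parse(0, 3,
             score_label=lambda i, j: {"NP": 0.4, "S": 0.6},
             score_split=lambda i, j, k: -abs(k - (i + j) / 2))
```

Because labels and splits are scored independently, the same model also admits exact dynamic-programming inference, which is the compatibility the abstract highlights.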

Collaboration


Dive into Jacob Andreas's collaborations.

Top Co-Authors

Daniel Klein (University of California)
Trevor Darrell (University of California)
Ronghang Hu (University of California)