Mark Yatskar
University of Washington
Publications
Featured research published by Mark Yatskar.
computer vision and pattern recognition | 2016
Mark Yatskar; Luke Zettlemoyer; Ali Farhadi
This paper introduces situation recognition, the problem of producing a concise summary of the situation an image depicts, including: (1) the main activity (e.g., clipping), (2) the participating actors, objects, substances, and locations (e.g., man, shears, sheep, wool, and field), and, most importantly, (3) the roles these participants play in the activity (e.g., the man is clipping, the shears are his tool, the wool is being clipped from the sheep, and the clipping is in a field). We use FrameNet, a verb and role lexicon developed by linguists, to define a large space of possible situations and collect a large-scale dataset containing over 500 activities, 1,700 roles, 11,000 objects, 125,000 images, and 200,000 unique situations. We also introduce structured prediction baselines and show that, in activity-centric images, situation-driven prediction of objects and activities outperforms independent object and activity recognition.
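To make the output structure concrete, here is a minimal Python sketch of how one such situation might be represented; the Situation class and the role names (agent, tool, item, source, place) are illustrative stand-ins, not the dataset's actual schema.

from dataclasses import dataclass

@dataclass
class Situation:
    # The main activity verb, e.g. "clipping".
    activity: str
    # Mapping from semantic role name to the entity filling it.
    roles: dict

# The sheep-shearing example from the abstract, encoded as a frame.
example = Situation(
    activity="clipping",
    roles={
        "agent": "man",
        "tool": "shears",
        "item": "wool",
        "source": "sheep",
        "place": "field",
    },
)
print(example.activity, example.roles["tool"])  # clipping shears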
joint conference on lexical and computational semantics | 2014
Mark Yatskar; Michel Galley; Lucy Vanderwende; Luke Zettlemoyer
This paper studies the generation of descriptive sentences from densely annotated images. Previous work studied generation from automatically detected visual information but produced a limited class of sentences, hindered by currently unreliable recognition of activities and attributes. Instead, we collect human annotations of objects, parts, attributes, and activities in images. These annotations allow us to build a significantly more comprehensive model of language generation and to study what visual information is required to generate human-like descriptions. Experiments demonstrate high-quality output and show that activity annotations and the relative spatial locations of objects contribute most to producing high-quality sentences.
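Purely as an illustration of how such annotations could feed generation, the toy sketch below fills a single hand-written template from object, activity, and spatial-relation fields; the field names and the template are hypothetical and far simpler than the paper's model.

# Toy template-based generation from image annotations.
# The annotation keys here are assumptions for illustration only.
def describe(annotation):
    subject = annotation["object"]
    activity = annotation["activity"]
    relation = annotation["relation"]      # relative spatial location
    other = annotation["other_object"]
    return f"A {subject} is {activity} {relation} a {other}."

print(describe({
    "object": "dog",
    "activity": "sitting",
    "relation": "next to",
    "other_object": "bench",
}))
# A dog is sitting next to a bench.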
meeting of the association for computational linguistics | 2017
Ioannis Konstas; Srinivasan Iyer; Mark Yatskar; Yejin Choi; Luke Zettlemoyer
Sequence-to-sequence models have shown strong performance across a broad range of applications. However, their application to parsing and generating text using Abstract Meaning Representation (AMR) has been limited, due to the relatively limited amount of labeled data and the non-sequential nature of the AMR graphs. We present a novel training procedure that can lift this limitation using millions of unlabeled sentences and careful preprocessing of the AMR graphs. For AMR parsing, our model achieves competitive results of 62.1 SMATCH, the current best score reported without significant use of external semantic resources. For AMR generation, our model establishes a new state-of-the-art performance of BLEU 33.8. We present extensive ablative and qualitative analysis including strong evidence that sequence-based AMR models are robust against ordering variations of graph-to-sequence conversions.
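A minimal sketch of the kind of graph-to-sequence conversion the abstract alludes to: a depth-first linearization of an AMR-like graph into a bracketed token sequence that a sequence-to-sequence model can consume. The encoding and the toy graph are illustrative, not the paper's exact preprocessing.

def linearize(node, graph):
    # Depth-first linearization: emit the node's concept, then each
    # outgoing relation followed by the linearized child subgraph.
    tokens = ["(", node]
    for relation, child in graph.get(node, []):
        tokens.append(relation)
        tokens.extend(linearize(child, graph))
    tokens.append(")")
    return tokens

# Toy graph for "The boy wants to go" (the re-entrant ARG0 of go-01
# is dropped here for simplicity).
amr = {
    "want-01": [(":ARG0", "boy"), (":ARG1", "go-01")],
    "boy": [],
    "go-01": [],
}
print(" ".join(linearize("want-01", amr)))
# ( want-01 :ARG0 ( boy ) :ARG1 ( go-01 ) )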
north american chapter of the association for computational linguistics | 2016
Mark Yatskar; Vicente Ordonez; Ali Farhadi
Obtaining common sense knowledge using current information extraction techniques is extremely challenging. In this work, we instead propose to derive simple common sense statements from fully annotated object detection corpora such as the Microsoft Common Objects in Context dataset. We show that many thousands of common sense facts can be extracted from such corpora at high quality. Furthermore, using WordNet and a novel submodular k-coverage formulation, we are able to generalize our initial set of common sense assertions to unseen objects and uncover over 400k potentially useful facts.
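For intuition about the k-coverage idea, here is a generic greedy sketch for a monotone submodular coverage objective; the toy facts and the greedy_k_coverage helper are hypothetical, not the paper's exact formulation.

def greedy_k_coverage(candidates, k):
    # Pick k candidate sets, each step taking the one covering the
    # most not-yet-covered elements. For monotone submodular
    # objectives this greedy rule gives a (1 - 1/e) approximation.
    covered, chosen = set(), []
    for _ in range(k):
        best = max(candidates, key=lambda s: len(candidates[s] - covered))
        if not candidates[best] - covered:
            break  # nothing new left to cover
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered

# Toy example: each candidate fact covers a set of concepts.
facts = {
    "dogs chase balls": {"dog", "ball", "chase"},
    "cats chase mice": {"cat", "mouse", "chase"},
    "dogs eat food": {"dog", "food", "eat"},
}
print(greedy_k_coverage(facts, k=2))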
computer vision and pattern recognition | 2017
Mark Yatskar; Vicente Ordonez; Luke Zettlemoyer; Ali Farhadi
Semantic sparsity is a common challenge in structured visual classification problems: when the output space is complex, the vast majority of possible predictions are rarely, if ever, seen in the training set. This paper studies semantic sparsity in situation recognition, the task of producing structured summaries of what is happening in images, including activities, objects, and the roles objects play within the activity. For this problem, we find empirically that most substructures required for prediction are rare, and current state-of-the-art model performance drops dramatically if even one such rare substructure exists in the target output. We avoid many such errors by (1) introducing a novel tensor composition function that learns to share examples across substructures more effectively and (2) semantically augmenting our training data with automatically gathered examples of rarely observed outputs using web data. When integrated within a complete CRF-based structured prediction model, the tensor-based approach outperforms the existing state of the art by relative improvements of 2.11% and 4.40% on top-5 verb and noun-role accuracy, respectively. Adding 5 million images with our semantic augmentation techniques gives further relative improvements of 6.23% and 9.57% on top-5 verb and noun-role accuracy.
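As a rough illustration of how a tensor composition function can share parameters across rare (verb, role, noun) substructures, the sketch below scores a triple with a low-rank trilinear form, so rare triples reuse the factors learned from common ones; the rank, dimensions, and exact composition are assumptions, not the paper's model.

import numpy as np

# Per-factor embeddings; counts mirror the imSitu scales mentioned
# above, while rank and initialization are illustrative choices.
rank, n_verbs, n_roles, n_nouns = 32, 500, 1700, 11000
rng = np.random.default_rng(0)
U = rng.normal(size=(n_verbs, rank))   # verb factors
V = rng.normal(size=(n_roles, rank))   # role factors
W = rng.normal(size=(n_nouns, rank))   # noun factors

def score(verb_id, role_id, noun_id):
    # Trilinear score: elementwise product of the three factor rows,
    # summed over the shared rank dimension.
    return float(np.sum(U[verb_id] * V[role_id] * W[noun_id]))

print(score(0, 0, 0))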
north american chapter of the association for computational linguistics | 2010
Mark Yatskar; Bo Pang; Cristian Danescu-Niculescu-Mizil; Lillian Lee
empirical methods in natural language processing | 2017
Jieyu Zhao; Tianlu Wang; Mark Yatskar; Vicente Ordonez; Kai-Wei Chang
computer vision and pattern recognition | 2018
Rowan Zellers; Mark Yatskar; Sam Thomson; Yejin Choi
empirical methods in natural language processing | 2018
Eunsol Choi; He He; Mohit Iyyer; Mark Yatskar; Wen-tau Yih; Yejin Choi; Percy Liang; Luke Zettlemoyer
north american chapter of the association for computational linguistics | 2018
Jieyu Zhao; Tianlu Wang; Mark Yatskar; Vicente Ordonez; Kai-Wei Chang