Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Jiasen Lu is active.

Publication


Featured research published by Jiasen Lu.


International Conference on Computer Vision | 2015

VQA: Visual Question Answering

Stanislaw Antol; Aishwarya Agrawal; Jiasen Lu; Margaret Mitchell; Dhruv Batra; C. Lawrence Zitnick; Devi Parikh

We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs a more detailed understanding of the image and complex reasoning than a system producing generic image captions. Moreover, VQA is amenable to automatic evaluation, since many open-ended answers contain only a few words or a closed set of answers that can be provided in a multiple-choice format. We provide a dataset containing ~0.25M images, ~0.76M questions, and ~10M answers (www.visualqa.org), and discuss the information it provides. Numerous baselines for VQA are provided and compared with human performance.
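
The abstract notes that VQA is amenable to automatic evaluation because open-ended answers tend to be short. As a rough illustration (not the paper's official evaluation code), the sketch below implements a consensus-style accuracy of the kind associated with the VQA dataset, where a predicted answer earns full credit if enough of the ten human annotators agree with it; answer normalization and the averaging over annotator subsets used in the official metric are omitted, and the function name is illustrative.

```python
def vqa_accuracy(predicted: str, human_answers: list[str]) -> float:
    """Consensus-style VQA accuracy sketch: a prediction earns full credit
    when at least three of the (typically ten) human answers match it."""
    matches = sum(1 for answer in human_answers if answer == predicted)
    return min(matches / 3.0, 1.0)

# Toy example with ten human answers for one question.
humans = ["red"] * 2 + ["dark red"] * 8
print(vqa_accuracy("red", humans))       # ~0.67: only two annotators said "red"
print(vqa_accuracy("dark red", humans))  # 1.00: eight annotators agree
```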


Computer Vision and Pattern Recognition | 2017

Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning

Jiasen Lu; Caiming Xiong; Devi Parikh; Richard Socher

Attention-based neural encoder-decoder frameworks have been widely adopted for image captioning. Most methods force visual attention to be active for every generated word. However, the decoder likely requires little to no visual information from the image to predict non-visual words such as "the" and "of". Other words that may seem visual can often be predicted reliably from the language model alone, e.g., "sign" after "behind a red stop" or "phone" following "talking on a cell". In this paper, we propose a novel adaptive attention model with a visual sentinel. At each time step, our model decides whether to attend to the image (and if so, to which regions) or to the visual sentinel, in order to extract meaningful information for sequential word generation. We test our method on the COCO image captioning 2015 challenge dataset and Flickr30K. Our approach sets the new state of the art by a significant margin.
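
To make the sentinel mechanism concrete, here is a minimal sketch of the core idea: the attention weight that falls on a learned "visual sentinel" vector acts as a gate deciding how much the next word should rely on visual evidence versus the language model. The projection matrices, shapes, and function name are illustrative assumptions, not the paper's exact layers.

```python
import torch
import torch.nn.functional as F

def adaptive_context(visual_feats, sentinel, query, W_v, W_s):
    """Sketch of adaptive attention with a visual sentinel.

    visual_feats: (k, d) image region features
    sentinel:     (d,)   learned visual sentinel vector
    query:        (d,)   decoder hidden state at the current time step
    W_v, W_s:     (d, d) illustrative projection matrices
    """
    region_scores = visual_feats @ W_v @ query          # (k,) one score per region
    sentinel_score = (sentinel @ W_s @ query).view(1)   # (1,) score for the sentinel

    # Attention over image regions only.
    attended_visual = F.softmax(region_scores, dim=0) @ visual_feats  # (d,)

    # Extended softmax that also includes the sentinel; its last weight is the
    # gate beta deciding whether to "look" at the image for this word.
    beta = F.softmax(torch.cat([region_scores, sentinel_score]), dim=0)[-1]

    # Mix the language-model fallback (sentinel) with the visual evidence.
    return beta * sentinel + (1.0 - beta) * attended_visual
```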


Computer Vision and Pattern Recognition | 2015

Human action segmentation with hierarchical supervoxel consistency

Jiasen Lu; Ran Xu; Jason J. Corso

Detailed analysis of human action, such as action classification, detection, and localization, has received increasing attention from the community; datasets like JHMDB have made it feasible to conduct studies analyzing the impact that such deeper information has on the greater action-understanding problem. However, detailed automatic segmentation of human action has been comparatively unexplored. In this paper, we take a step in that direction and propose a hierarchical MRF model to bridge low-level video fragments with high-level human motion and appearance; novel higher-order potentials connect different levels of the supervoxel hierarchy to enforce the consistency of the human segmentation by pulling from different segment scales. Our single-layer model significantly outperforms the current state of the art on actionness, and our full model improves upon the single-layer baselines in action segmentation.
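
As a rough illustration of the kind of objective such a model optimizes (the potentials, features, and weights below are hypothetical stand-ins, not the paper's formulation), this sketch scores a binary human/background labeling of supervoxels with unary costs, a pairwise Potts smoothness term, and a higher-order term that penalizes fine supervoxels whose label disagrees with the majority label of their parent segment at a coarser level of the hierarchy.

```python
import numpy as np

def segmentation_energy(labels, unary, edges, parents, w_pair=1.0, w_hier=1.0):
    """Hypothetical hierarchical-MRF-style energy; lower is better.

    labels:  (n,) 0/1 label per fine-level supervoxel
    unary:   (n, 2) cost of labeling each supervoxel background (0) or human (1)
    edges:   list of (i, j) index pairs of adjacent supervoxels
    parents: (n,) index of each supervoxel's parent segment at the coarser level
    """
    labels = np.asarray(labels)
    parents = np.asarray(parents)
    unary = np.asarray(unary)

    energy = unary[np.arange(len(labels)), labels].sum()

    # Pairwise Potts term: neighboring supervoxels prefer the same label.
    energy += w_pair * sum(labels[i] != labels[j] for i, j in edges)

    # Higher-order consistency: children pay a penalty for disagreeing with the
    # majority label of their parent segment in the supervoxel hierarchy.
    for parent in np.unique(parents):
        child_labels = labels[parents == parent]
        majority = np.bincount(child_labels, minlength=2).argmax()
        energy += w_hier * int((child_labels != majority).sum())
    return float(energy)
```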


Neural Information Processing Systems | 2016

Hierarchical Question-Image Co-Attention for Visual Question Answering

Jiasen Lu; Jianwei Yang; Dhruv Batra; Devi Parikh


Empirical Methods in Natural Language Processing | 2017

ParlAI: A Dialog Research Software Platform

Alexander H. Miller; Will Feng; Dhruv Batra; Antoine Bordes; Adam Fisch; Jiasen Lu; Devi Parikh; Jason Weston


Neural Information Processing Systems | 2017

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

Jiasen Lu; Anitha Kannan; Jianwei Yang; Devi Parikh; Dhruv Batra


arXiv: Computer Vision and Pattern Recognition | 2016

Hierarchical Co-Attention for Visual Question Answering

Jiasen Lu; Jianwei Yang; Dhruv Batra; Devi Parikh


Computer Vision and Pattern Recognition | 2018

Neural Baby Talk

Jiasen Lu; Jianwei Yang; Dhruv Batra; Devi Parikh


arXiv: Robotics | 2018

Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition

Jianwei Yang; Jiasen Lu; Stefan Lee; Dhruv Batra; Devi Parikh


arXiv: Computer Vision and Pattern Recognition | 2018

Graph R-CNN for Scene Graph Generation

Jianwei Yang; Jiasen Lu; Stefan Lee; Dhruv Batra; Devi Parikh

Collaboration


Dive into Jiasen Lu's collaborations.

Top Co-Authors

Devi Parikh
Georgia Institute of Technology

Dhruv Batra
Georgia Institute of Technology

Jianwei Yang
Georgia Institute of Technology

Stefan Lee
Georgia Institute of Technology

Aishwarya Agrawal
Georgia Institute of Technology