Aishwarya Agrawal
Georgia Institute of Technology
Publications
Featured research published by Aishwarya Agrawal.
International Conference on Computer Vision | 2015
Stanislaw Antol; Aishwarya Agrawal; Jiasen Lu; Margaret Mitchell; Dhruv Batra; C. Lawrence Zitnick; Devi Parikh
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs a more detailed understanding of the image and complex reasoning than a system producing generic image captions. Moreover, VQA is amenable to automatic evaluation, since many open-ended answers contain only a few words or a closed set of answers that can be provided in a multiple-choice format. We provide a dataset containing ~0.25M images, ~0.76M questions, and ~10M answers (www.visualqa.org), and discuss the information it provides. Numerous baselines for VQA are provided and compared with human performance.
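The abstract's point about automatic evaluation is usually operationalized as a consensus accuracy over the ten human answers collected per question. Below is a minimal sketch of that metric in its commonly quoted simplified form, min(#matching human answers / 3, 1); the function name and example answers are illustrative, not taken from the paper:

```python
def vqa_accuracy(predicted: str, human_answers: list[str]) -> float:
    """Simplified consensus accuracy for open-ended VQA: a predicted
    answer counts as fully correct if at least 3 of the ~10 human
    annotators gave the same answer."""
    matches = sum(a.strip().lower() == predicted.strip().lower()
                  for a in human_answers)
    return min(matches / 3.0, 1.0)

# Hypothetical example: ten human answers for one question.
humans = ["yes"] * 7 + ["no"] * 2 + ["maybe"]
print(vqa_accuracy("yes", humans))  # 1.0 (>= 3 annotators agree)
print(vqa_accuracy("no", humans))   # ~0.67 (only 2 annotators agree)
```

Because answers are short and often repeated across annotators, this string-matching scheme makes large-scale evaluation cheap compared to judging free-form captions.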
AI Magazine | 2016
C. Lawrence Zitnick; Aishwarya Agrawal; Stanislaw Antol; Margaret Mitchell; Dhruv Batra; Devi Parikh
As machines have become more intelligent, there has been a renewed interest in methods for measuring their intelligence. A common approach is to propose tasks at which humans excel but machines find difficult. However, an ideal task should also be easy to evaluate and not be easily gameable. We begin with a case study exploring the recently popular task of image captioning and its limitations as a task for measuring machine intelligence. An alternative and more promising task is Visual Question Answering, which tests a machine's ability to reason about language and vision. We describe a dataset, unprecedented in size, created for the task that contains over 760,000 human-generated questions about images. Using around 10 million human-generated answers, machines may be easily evaluated.
Proceedings of SPIE | 2014
Ronnie Das; Aishwarya Agrawal; Melissa P. Upton; Eric J. Seibel
The pancreas is a deeply seated organ requiring endoscopically or radiologically guided biopsies for tissue diagnosis. Current approaches include fine needle aspiration biopsy (FNA) for cytologic evaluation, and core needle biopsies (CBs), which consist of tissue cores (L = 1-2 cm, D = 0.4-2.0 mm) for examination by brightfield microscopy. Between procurement and visualization, biospecimens must be processed, sectioned, and mounted on glass slides for 2D visualization. Optical information about the native tissue state can be lost with each procedural step, and a pathologist cannot appreciate 3D organization from 2D observations of tissue sections 1-8 μm in thickness. Therefore, how might histological disease assessment improve if entire, intact CBs could be imaged in both brightfield and 3D? CBs are mechanically delicate; therefore, a simple device was made to cut intact, simulated CBs (L = 1-2 cm, D = 0.2-0.8 mm) from porcine pancreas. After CBs were laid flat in a chamber, z-stack images at 20x and 40x were acquired through the sample with and without the application of an optical clearing agent (FocusClear®). Intensity of transmitted light increased by 5-15x, and islet structures unique to pancreas were clearly visualized 250-300 μm beneath the tissue surface. CBs were then placed in index-matching square capillary tubes filled with FocusClear® and a standard optical clearing agent. Brightfield z-stack images were then acquired to present 3D visualization of the CB to the pathologist.
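The reported 5-15x increase in transmitted light is, in essence, a ratio of mean intensities between z-stacks acquired with and without the clearing agent. A minimal sketch of that comparison, assuming the stacks have already been loaded as (z, y, x) NumPy arrays; the function name and synthetic data are hypothetical, not from the paper:

```python
import numpy as np

def transmitted_intensity_gain(stack_cleared: np.ndarray,
                               stack_uncleared: np.ndarray) -> float:
    """Ratio of mean transmitted-light intensity between brightfield
    z-stacks acquired with and without an optical clearing agent.
    Both stacks are (z, y, x) arrays of the same specimen region."""
    return float(stack_cleared.mean() / (stack_uncleared.mean() + 1e-12))

# Hypothetical stacks: 50 z-slices of 512x512 brightfield images.
cleared = np.random.rand(50, 512, 512) * 200.0
uncleared = np.random.rand(50, 512, 512) * 20.0
print(f"intensity gain: {transmitted_intensity_gain(cleared, uncleared):.1f}x")
```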
International Conference on Signal Processing | 2014
Aishwarya Agrawal; Shanmuganathan Raman
This paper presents a novel approach for tone mapping of high dynamic range (HDR) images, deriving inspiration from local binary patterns. We propose two operators, each based on a different local contrast measure. We show that both operators are computationally simple and successfully compress the high dynamic range while preserving details in both higher- and lower-intensity regions of the HDR image. We present tone mapped low dynamic range (LDR) images for ten existing operators and compare them with the tone mapped images of the proposed operators. We also use an online metric to compare the contrast differences between the HDR image and the LDR images produced by the proposed operators as well as the existing operators. We further show that the operators proposed in this paper are comparable to the reference operators in terms of execution time. We conclude the paper with directions for future research in tone mapping.
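The abstract does not spell out the LBP-based operators, but the general shape of a local-contrast tone mapper is easy to illustrate: smooth the log luminance to get a base layer, compress the base, and keep the detail (local contrast) layer intact. A generic sketch under that assumption; this is not the paper's method, and all names and parameters are illustrative:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def tone_map_local(hdr: np.ndarray, alpha: float = 0.7, win: int = 9) -> np.ndarray:
    """Generic local tone mapping (not the paper's LBP-based operators):
    compress a locally smoothed base layer in the log domain while
    preserving the detail layer, so structure survives in both bright
    and dark regions."""
    log_l = np.log1p(hdr)                         # log-domain luminance
    local_mean = uniform_filter(log_l, size=win)  # base layer (local mean)
    detail = log_l - local_mean                   # detail layer (local contrast)
    compressed = alpha * local_mean + detail      # compress base, keep detail
    ldr = np.expm1(compressed)
    return (ldr - ldr.min()) / (ldr.max() - ldr.min() + 1e-8)

# Hypothetical usage on a synthetic HDR luminance map:
hdr = np.random.rand(64, 64).astype(np.float32) * 1e4
ldr = tone_map_local(hdr)  # values in [0, 1], displayable as LDR
```

Compressing only the base layer (alpha < 1) is the standard trick for keeping local contrast while reducing global dynamic range; the paper's contribution lies in how the local contrast measure itself is defined.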
Empirical Methods in Natural Language Processing | 2016
Aishwarya Agrawal; Dhruv Batra; Devi Parikh
arXiv: Computer Vision and Pattern Recognition | 2017
Aishwarya Agrawal; Aniruddha Kembhavi; Dhruv Batra; Devi Parikh
Computer Vision and Pattern Recognition | 2018
Aishwarya Agrawal; Dhruv Batra; Devi Parikh; Aniruddha Kembhavi
Empirical Methods in Natural Language Processing | 2016
Gordon Christie; Ankit Laddha; Aishwarya Agrawal; Stanislaw Antol; Yash Goyal; Kevin Kochersberger; Dhruv Batra
Neural Information Processing Systems | 2018
Sainandan Ramakrishnan; Aishwarya Agrawal; Stefan Lee
Computer Vision and Image Understanding | 2017
Gordon Christie; Ankit Laddha; Aishwarya Agrawal; Stanislaw Antol; Yash Goyal; Kevin Kochersberger; Dhruv Batra