Publication


Featured research published by Gautam Singh.


Computer Vision and Pattern Recognition | 2013

Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

Gautam Singh; Jana Kosecka

This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features. We learn the relevance of individual feature channels at test time using a locally adaptive distance metric. To further improve the accuracy of the nonparametric approach, we examine the importance of the retrieval set used to compute the nearest neighbours, using a novel semantic descriptor to retrieve better candidates. The approach is validated by experiments on several datasets used for semantic parsing, demonstrating that the method outperforms state-of-the-art approaches.
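The core idea of nonparametric label transfer with a locally adaptive metric can be sketched in a few lines. This is an illustrative toy, not the paper's exact formulation: the feature layout, the per-channel weights (which the paper estimates at test time), and the data are all hypothetical.

```python
import numpy as np

def knn_label_transfer(query, train_feats, train_labels, channel_weights, k=5):
    """Transfer a label to a query patch from its k nearest training patches.

    channel_weights re-scales each feature channel before the distance is
    computed, mimicking a locally adaptive distance metric (in the paper,
    the relevance of each channel is learned at test time).
    """
    # Weighted Euclidean distance over feature channels.
    diff = (train_feats - query) * channel_weights
    dists = np.sqrt((diff ** 2).sum(axis=1))
    nn = np.argsort(dists)[:k]
    # Majority vote among the nearest neighbours.
    votes = np.bincount(train_labels[nn])
    return int(np.argmax(votes))

# Toy data: 2 classes, 3-channel features (e.g. gradient, color, location).
rng = np.random.default_rng(0)
train_feats = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(4, 1, (20, 3))])
train_labels = np.array([0] * 20 + [1] * 20)
weights = np.array([1.0, 0.5, 0.25])  # hypothetical per-channel relevance
print(knn_label_transfer(np.array([4.0, 4.0, 4.0]), train_feats, train_labels, weights))  # prints 1
```

A query near the second cluster is labeled 1; down-weighting an uninformative channel makes the vote less sensitive to noise in that channel.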


IEEE Transactions on Robotics | 2013

Localization in Urban Environments Using a Panoramic Gist Descriptor

Ana C. Murillo; Gautam Singh; Jana Kosecka; José Jesús Guerrero

Vision-based topological localization and mapping for autonomous robotic systems have received increased research interest in recent years. The need to map larger environments requires models at different levels of abstraction and additional abilities to deal with large amounts of data efficiently. Most successful approaches for appearance-based localization and mapping with large datasets typically represent locations using local image features. We study the feasibility of performing these tasks in urban environments using global descriptors instead, taking advantage of increasingly common panoramic datasets. This paper describes how to represent a panorama using the global gist descriptor while maintaining desirable invariance properties for location recognition and loop detection. We propose different gist similarity measures and algorithms for appearance-based localization, as well as an online loop-closure detection method, where the probability of loop closure is determined in a Bayesian filtering framework using the proposed image representation. The extensive experimental validation in this paper shows that the performance of the proposed methods in urban environments is comparable with local-feature-based approaches when using wide field-of-view images.
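The heading-invariance idea behind the panoramic gist comparison can be sketched as follows. This is a toy under stated assumptions: each panorama is represented as one gist vector per view, and the actual gist computation (Gabor filter energies over a spatial grid) is omitted; the dimensions and data are illustrative.

```python
import numpy as np

def panoramic_gist_distance(g1, g2):
    """Rotation-tolerant distance between two panoramic gist descriptors.

    Each descriptor is an (n_views, d) array: one gist vector per view of
    the panorama. Minimising over circular shifts of the view order makes
    the comparison invariant to the heading at which the panorama was taken.
    """
    n_views = g1.shape[0]
    best = np.inf
    for shift in range(n_views):
        d = np.linalg.norm(g1 - np.roll(g2, shift, axis=0))
        best = min(best, d)
    return best

# Toy check: a rotated copy of the same panorama matches itself exactly.
rng = np.random.default_rng(1)
pano = rng.normal(size=(4, 16))          # 4 views, 16-D gist each (illustrative)
rotated = np.roll(pano, 2, axis=0)       # same place, different heading
other = rng.normal(size=(4, 16))         # a different place
print(panoramic_gist_distance(pano, rotated))  # 0.0
print(panoramic_gist_distance(pano, other))    # large
```

In a loop-closure pipeline, such a distance would feed the observation likelihood of the Bayesian filter that tracks the probability of revisiting a mapped location.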


International Conference on Robotics and Automation | 2012

Acquiring semantics induced topology in urban environments

Gautam Singh; Jana Kosecka

Methods for the acquisition and maintenance of an environment model are central to a broad class of mobility and navigation problems. Towards this end, various metric, topological or hybrid models have been proposed. Due to recent advances in sensing and recognition, the acquisition of semantic models of the environment has gained increased interest in the community. In this work, we demonstrate the capability of using weak semantic models of the environment to induce different topological models, capturing the spatial semantics of the environment at different levels. In the first stage of the model acquisition, we propose to compute the semantic layout of street scene imagery by recognizing and segmenting buildings, roads, sky, cars and trees. Given such a semantic layout, we propose an informative feature characterizing the layout and train a classifier to recognize street intersections in challenging urban inner-city scenes. We also show how the evidence of different semantic concepts can induce a useful topological representation of the environment, which can aid navigation and localization tasks. To demonstrate the approach, we carry out experiments on a challenging dataset of omnidirectional inner-city street views and report the performance of both semantic segmentation and intersection classification.
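One simple way to turn a semantic layout into a fixed-length feature for a downstream classifier is a per-column histogram of class labels. The exact feature used in the paper is not reproduced here; this column-histogram descriptor is an illustrative stand-in, with hypothetical class indices.

```python
import numpy as np

def layout_feature(label_map, n_classes=5, n_columns=8):
    """Summarise a semantic segmentation as per-column class histograms.

    label_map: 2-D integer array of per-pixel labels (e.g. building, road,
    sky, car, tree). Splitting the image into vertical strips and counting
    label frequencies in each strip yields a fixed-length layout descriptor
    of the kind that could feed an intersection classifier.
    """
    h, w = label_map.shape
    cols = np.array_split(np.arange(w), n_columns)
    feats = []
    for c in cols:
        hist = np.bincount(label_map[:, c].ravel(), minlength=n_classes)
        feats.append(hist / hist.sum())  # normalised per-column histogram
    return np.concatenate(feats)

# Toy label map: road (class 1) in the bottom half, sky (class 2) on top.
label_map = np.zeros((10, 16), dtype=int)
label_map[5:, :] = 1
label_map[:5, :] = 2
f = layout_feature(label_map)
print(f.shape)  # (40,)
```

Where road pixels span many columns of an omnidirectional view, such a descriptor changes shape characteristically at intersections, which is what a trained classifier can pick up on.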


The International Journal of Robotics Research | 2012

Semantic parsing of street scenes from video

Branislav Micusik; Jana Kosecka; Gautam Singh

Semantic models of the environment can significantly improve the navigation and decision-making capabilities of autonomous robots and enhance the level of human-robot interaction. We present a novel approach for the semantic segmentation of street scene images into coherent regions, while simultaneously categorizing each region as one of the predefined categories representing commonly encountered object and background classes. We formulate the segmentation on small blob-based superpixels and exploit a visual vocabulary tree as an intermediate image representation. The main novelty of our approach is the introduction of an explicit model of spatial co-occurrence of visual words associated with superpixels, and the utilization of appearance, geometry and contextual cues in a probabilistic framework. We demonstrate how individual cues contribute towards global segmentation accuracy and how their combination yields performance superior to the best known method on a challenging benchmark dataset that exhibits a diversity of street scenes with varying viewpoints and a large number of categories, captured in daylight and at dusk.


Large-Scale Visual Geo-Localization | 2016

Semantically Guided Geo-location and Modeling in Urban Environments

Gautam Singh; Jana Kosecka

The problem of localization and geo-location estimation of an image has a long-standing history in both robotics and computer vision. With the availability of large amounts of geo-referenced image data, several image retrieval approaches have been deployed to tackle this problem. In this work, we show how the capability of semantically labeling both query views and the reference dataset by means of semantic segmentation can aid (1) the retrieval of views similar to, and possibly overlapping with, the query and (2) the recognition and discovery of commonly occurring scene layouts in the reference dataset. We demonstrate the effectiveness of these semantic representations on examples of localization, semantic concept discovery, and intersection recognition in images of urban scenes.


Workshop on Applications of Computer Vision | 2014

Introspective semantic segmentation

Gautam Singh; Jana Kosecka

Traditional approaches to semantic segmentation work in a supervised setting, assuming a fixed number of semantic categories, and require sufficiently large training sets. The performance of various approaches is often reported in terms of average per-pixel class accuracy and global accuracy of the final labeling. When applying the learned models in practical settings to large amounts of unlabeled data, possibly containing previously unseen categories, it is important to properly quantify their performance by measuring a classifier's introspective capability. We quantify the confidence of the region classifiers in the context of a non-parametric k-nearest neighbor (k-NN) framework for semantic segmentation using the so-called strangeness measure. The proposed measure is evaluated by introducing confidence-based image ranking and showing its feasibility on a dataset containing a large number of previously unseen categories.
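A common form of the strangeness measure in a k-NN setting is the ratio of distances to same-class versus other-class nearest neighbours; samples far from their predicted class (e.g. from an unseen category) score high. This is a minimal sketch under that assumption, with illustrative data, not the paper's exact definition.

```python
import numpy as np

def strangeness(x, label, feats, labels, k=3):
    """Strangeness of a sample under a candidate label.

    Ratio of the summed distances to the k nearest same-class neighbours
    over the summed distances to the k nearest other-class neighbours.
    Values near 0 mean the label is well supported; larger values flag
    low-confidence regions (e.g. previously unseen categories).
    """
    dists = np.linalg.norm(feats - x, axis=1)
    same = np.sort(dists[labels == label])[:k]
    other = np.sort(dists[labels != label])[:k]
    return same.sum() / other.sum()

# Toy data: two well-separated 2-D classes.
rng = np.random.default_rng(2)
feats = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(6, 1, (30, 2))])
labels = np.array([0] * 30 + [1] * 30)
print(strangeness(np.array([0.0, 0.0]), 0, feats, labels))  # small: confident
print(strangeness(np.array([3.0, 3.0]), 0, feats, labels))  # larger: ambiguous
```

Ranking images by the strangeness of their region classifications then surfaces the least trustworthy labelings for inspection, which is the introspective use the abstract describes.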


International Conference on Computer Vision | 2011

Recognizing manipulation actions in arts and crafts shows using domain-specific visual and textual cues

Benjamin Sapp; Rizwan Chaudhry; Xiaodong Yu; Gautam Singh; Ian Perera; Francis Ferraro; Evelyne Tzoukermann; Jana Kosecka; Jan Neumann

We present an approach for the automatic annotation of commercial videos from an arts-and-crafts domain with the aid of textual descriptions. The main focus is on recognizing both manipulation actions (e.g. cut, draw, glue) and the tools used to perform these actions (e.g. markers, brushes, glue bottles). We demonstrate how multiple visual cues, such as motion descriptors, object presence, and hand poses, can be combined with the help of contextual priors that are automatically extracted from associated transcripts or online instructions. Using these diverse features and linguistic information, we propose several increasingly complex computational models for recognizing elementary manipulation actions and composite activities, as well as their temporal order. The approach is evaluated on a novel dataset comprised of 27 episodes of PBS Sprout TV, each containing on average 8 manipulation actions.


Archive | 2010

Visual Loop Closing using Gist Descriptors in Manhattan World

Gautam Singh


National Conference on Artificial Intelligence | 2011

Language models for semantic extraction and filtering in video action recognition

Evelyne Tzoukermann; Jan Neumann; Jana Kosecka; Cornelia Fermüller; Ian Perera; Frank Ferraro; Benjamin Sapp; Rizwan Chaudhry; Gautam Singh


Archive | 2014

Visual Scene Understanding through Semantic Segmentation

Gautam Singh

Collaboration

Top co-authors of Gautam Singh:

Jana Kosecka, George Mason University
Ian Perera, University of Rochester
Benjamin Sapp, University of Pennsylvania
Branislav Micusik, Austrian Institute of Technology