Prithwijit Guha
Indian Institute of Technology Guwahati
Publication
Featured research published by Prithwijit Guha.
advanced video and signal based surveillance | 2011
Prithwijit Guha; Amitabha Mukerjee; Venkatesh K. Subramanian
Occlusion is often thought of as a challenge for visual algorithms, especially tracking. Existing literature, however, has identified a number of occlusion categories in the context of tracking in an ad hoc manner. We propose a systematic approach to formulate a set of occlusion cases by considering the spatial relations among object support(s) (projections on the image plane) with the detected foreground blob(s), to show that only 7 occlusion states are possible. We designate the resulting qualitative formalism as Oc-7, and show how these occlusion states can be detected and used effectively for the task of multi-object tracking under occlusion of various types. The object support is decomposed into overlapping patches which are tracked independently on the occurrence of occlusions. As a demonstration of the application of these occlusion states, we propose a reasoning scheme for selective tracker execution and object feature updates to track multiple objects in complex environments.
asian conference on computer vision | 2006
Prithwijit Guha; Dibyendu Palai; K. S. Venkatesh; Amitabha Mukerjee
Background subtraction is an essential task in several static camera based computer vision systems. Background modeling is often challenged by spatio-temporal changes occurring due to local motion and/or variations in illumination conditions. The background model is learned from an image sequence in a number of stages, viz. preprocessing, pixel/region feature extraction and statistical modeling of feature distribution. A number of algorithms, mainly focusing on feature extraction and statistical modeling have been proposed to handle the problems and comparatively little exploration has occurred at the preprocessing stage. Motivated by the fact that disturbances caused by local motions disappear at lower resolutions, we propose to represent the images at multiple scales in the preprocessing stage to learn a pyramid of background models at different resolutions. During operation, foreground pixels are detected first only at the lowest resolution, and only these pixels are further analyzed at higher resolutions to obtain a precise silhouette of the entire foreground blob. Such a scheme is also found to yield a significant reduction in computation. The second contribution in this paper involves the use of the co-linearity statistic (introduced by Mester et al. for the purpose of illumination independent change detection in consecutive frames) as a pixel neighborhood feature by assuming a linear model with a signal modulation factor and additive noise. The use of co-linearity statistic as a feature has shown significant performance improvement over intensity or combined intensity-gradient features. Experimental results and performance comparisons (ROC curves) for the proposed approach with other algorithms show significant improvements for several test sequences.
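The coarse-to-fine detection described in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not in the paper: the background model at each level is a single reference image rather than a learned statistical model, the pixel feature is plain intensity difference rather than the co-linearity statistic, and the names `downsample`, `build_pyramid`, `detect_foreground` and the threshold value are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1] +
              img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def build_pyramid(img, levels):
    """Return [full resolution, ..., coarsest] versions of img."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def detect_foreground(frame, background, levels=2, thresh=20.0):
    """Flag foreground pixels coarse-to-fine.

    Assumption: a pixel is foreground when its absolute difference from
    the background reference exceeds `thresh` (a stand-in for the
    statistical test used in the paper).
    """
    frames = build_pyramid(frame, levels)
    backgrounds = build_pyramid(background, levels)

    # Every pixel is tested only at the coarsest level.
    top = levels - 1
    mask = [[abs(f - b) > thresh for f, b in zip(frow, brow)]
            for frow, brow in zip(frames[top], backgrounds[top])]

    # At each finer level, re-test only the children of flagged pixels.
    for lvl in range(top - 1, -1, -1):
        h, w = len(frames[lvl]), len(frames[lvl][0])
        finer = [[False] * w for _ in range(h)]
        for r in range(h):
            for c in range(w):
                pr, pc = r // 2, c // 2
                if pr < len(mask) and pc < len(mask[0]) and mask[pr][pc]:
                    diff = abs(frames[lvl][r][c] - backgrounds[lvl][r][c])
                    finer[r][c] = diff > thresh
        mask = finer
    return mask
```

In this sketch a single-pixel disturbance is averaged away at the coarse level and never reaches the fine-level test, which mirrors the motivation stated in the abstract. Fine-level work is limited to the children of flagged coarse pixels, which is where the reduction in computation comes from.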
international conference on advanced robotics | 2005
Prithwijit Guha; Dibyendu Palai; Dip Goswami; Amitabha Mukerjee
Active video surveillance systems provide challenging research issues in the interface of computer vision, pattern recognition and control system analysis. A significant part of such systems is devoted toward active camera control for efficient target tracking. DynaTracker is a pan-tilt device based active camera system for maintaining continuous track of the moving target, while keeping the same at a pre-specified region (typically, the center) of the image. The significant contributions in this work are the use of mean-shift algorithm for visual tracking and the derivation of the error dynamics for a proportional-integral control action. The stability analysis and optimal controller gain selections are performed from the simulation studies of the derived error dynamics. Simulation predictions are also validated by the results of practical experiments. The present implementation of DynaTracker performs on a standard Pentium IV PC at an average speed of 10 frames per second while operating on color images of 320×240 resolution.
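The role of the proportional-integral action can be illustrated with a small simulation. This is a sketch under stated assumptions, not the error dynamics derived in the paper: the plant is a toy one-axis model in which the target's image offset grows at a constant drift rate and shrinks by the commanded pan rate, and the class `PIController`, the function `simulate` and all gain values are hypothetical.

```python
class PIController:
    """Discrete proportional-integral controller for one camera axis."""

    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.integral = 0.0

    def step(self, error, dt):
        # Accumulate the error, then combine both terms into a command.
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


def simulate(offset0, target_speed, steps, dt=0.1, kp=4.0, ki=2.0):
    """Simulate the target's pixel offset from the image center.

    Toy plant (an assumption): the offset grows with the target's
    apparent speed and shrinks with the commanded pan rate.
    """
    controller = PIController(kp, ki)
    offset = offset0
    history = []
    for _ in range(steps):
        pan_rate = controller.step(offset, dt)
        offset += (target_speed - pan_rate) * dt
        history.append(offset)
    return history
```

With these example gains the offset of a steadily moving target decays to zero, whereas setting `ki=0.0` leaves a constant residual offset of `target_speed / kp`. The integral term is what removes that residual in this toy model.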
conference on multimedia modeling | 2016
Raghvendra Kannao; Prithwijit Guha
Classification problems using multiple kernel learning (MKL) algorithms achieve superior performance on account of using a weighted combination of base kernels on feature sub-sets. Each of the base kernels is characterized by the similarity measures defined over the feature sub-sets. Existing works in MKL have mostly used fixed weights which are shown to be related to the overall discriminative capability of corresponding base kernels. We argue that this class discrimination ability of a kernel is a local phenomenon and thus advocate the necessity of using instance dependent functions for weighing the kernels. We propose a new framework for learning such weighing functions linked to the ability of kernels to discriminate in the local regions of the feature space. During training, we first identify the regions of success in the feature sub-spaces, where the base kernels have high likelihood of success. These regions are identified by evaluating the performance of support vector machines (SVM) trained using corresponding single base kernels. The weighing functions are then estimated by using support vector regression (SVR). The target for SVRs is set to 1.0 for the successfully classified patterns and to 0.0 otherwise. The second contribution of this work is the construction and public domain release of a commercial detection dataset of 150 hours, acquired from 5 different TV news channels. Empirical results on 8 standard datasets and our own TV commercial detection dataset have shown the superiority of the proposed scheme of multiple kernel learning.
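A minimal sketch of instance dependent kernel weighting follows. The regression targets (1.0 for correctly classified patterns, 0.0 otherwise) come from the abstract. The combination rule is an assumption: the abstract does not state the exact formula, so the sketch uses the form w_m(x) k_m(x, y) w_m(y), which is common in localized MKL. All function names are hypothetical, and the weight functions are placeholders for trained SVR models.

```python
import math


def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def success_targets(predicted, actual):
    """Regression targets: 1.0 where a single-kernel classifier was
    correct, 0.0 otherwise (as described in the abstract)."""
    return [1.0 if p == a else 0.0 for p, a in zip(predicted, actual)]


def combined_kernel(x, y, kernels, weight_fns):
    """Instance dependent combination of base kernels.

    Assumed form: sum_m w_m(x) * k_m(x, y) * w_m(y). Each w_m stands in
    for a regressor trained on `success_targets`.
    """
    return sum(w(x) * k(x, y) * w(y) for k, w in zip(kernels, weight_fns))
```

Under this assumed form the combined function is symmetric, and it remains positive semi-definite whenever the base kernels are. With constant weight functions it reduces to an ordinary fixed-weight sum of kernels.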
indian conference on computer vision, graphics and image processing | 2014
Apoorv Vyas; Raghvendra Kannao; Vineet Bhargava; Prithwijit Guha
Automatic identification and extraction of commercial blocks in telecast news videos find many applications in the domain of broadcast monitoring. Existing works in this domain have used channel specific assumptions, machine learning techniques and frequentist approaches for detecting commercial video segments. We note that in the Indian context, several channel specific assumptions do not hold and often news and commercials have comparable frequencies of occurrence. This motivates us to use machine learning techniques for classifying commercials in news videos. Our main contribution lies in the proposal of two features which are shown to outperform the existing audio-visual features: first, the MFCC bag of words (BoW) as audio track feature and second, overlaid text distribution as video shot feature. The shot feature space is further extended by appending contextual features which are categorized by SVM based classifiers. Additionally, we have used a post-processing stage to suppress the false positives. We have experimented with 54 hours of video acquired from three different Indian English based news channels and have obtained an F-measure of around 97%.
pacific rim international conference on artificial intelligence | 2006
Prithwijit Guha; Amitabha Mukerjee; K. S. Venkatesh
Agents entering the field of view can undergo two different forms of occlusions, either caused by crowding or due to obstructions by background objects at finite distances from the camera. This work aims at identifying the nature of occlusions encountered in multi-agent tracking by using a set of qualitative primitives derived on the basis of the Persistence Hypothesis - objects continue to exist even when hidden from view. We construct predicates describing a comprehensive set of possible occlusion primitives including entry/exit, partial or complete occlusions by background objects, crowding and algorithm failures resulting from track loss. Instantiation of these primitives followed by selective agent feature updates enables us to develop an effective scheme for tracking multiple agents in relatively unconstrained environments. The agents are primarily detected as foreground blobs and are characterized by their centroid trajectory and a non-parametric appearance model learned over the associated pixel co-ordinate and color space. The agents are tracked through a three stage process of motion based prediction, agent-blob association with occlusion primitive identification and appearance model aided agent localization for the occluded ones. The occluded agents are localized within associated foreground regions by a process of iterative foreground pixel assignment to agents followed by their centroid update. Satisfactory tracking performance is observed by employing the proposed algorithm on a traffic video sequence containing complex multi-agent interactions.
international conference on robotics and automation | 2006
Tripuresh Mishra; Prithwijit Guha; Ashish Dutta; K. S. Venkatesh
A novel approach to real-time tracking of three-finger planar grasp points for deforming objects is proposed. The search space of possible grasping configurations is reduced in two stages - firstly, by fixing one finger at the boundary point nearest to the object centroid and secondly, through a heuristic partitioning of the object boundary where the remaining two fingers are localized. The potential grasping configurations satisfying force closure conditions are evaluated through an objective function that maximizes the grasping span while minimizing the distance between the object centroid and the intersection of the contour normals at the finger contact points. A population based stochastic search strategy is adopted for computing the optimal grasping configurations and re-localizing them as the shape undergoes drastic translations, rotations, scaling and local deformations. Experimental results of grasp point tracking are presented for deforming planar shapes extracted from both real and synthetic image sequences. The current implementation of the proposed scheme operates at 10 Hz for grasp point tracking on shapes extracted through visual feedback.
international conference on pattern recognition | 2006
Prithwijit Guha; Amitabha Mukerjee; K. S. Venkatesh; Pabitra Mitra
Multi-agent interactions often result in mutual occlusion sequences which constitute a visual signature for the event. We define six qualitative occlusion primitives based on the persistence hypothesis (objects continue to exist even when hidden from view): isolated, occlude with foreground, occlude by background, disappear, enter and exit. Variable length temporal sequences of occlusion primitives are shown to be useful features for categorizing many classes of semantically significant events. Occlusion primitive labels depend on agent positions in the image, which are determined by combining foreground blob tracking and image motion. No prior knowledge of domain or camera calibration is necessary. New foreground blobs are identified as putative agents which may undergo occlusions, split into multiple agents, merge back again, etc. Transition sequences are mined to identify semantic categories (e.g., people disembarking from a vehicle involve a series of splits). Occlusion features alone may be useful for distinguishing some broad categories of interaction states, and together with features such as agent shape and motion histories, these form a rich signature for different event types that can be classified without camera calibration or any environment/agent/action model priors.
Pattern Recognition | 2017
Raghvendra Kannao; Prithwijit Guha
Base kernels have local Regions of Success (RoS) or expertise in feature space. RoS for each base kernel is identified during training from cross-validation set. These RoS are modeled using regression in terms of Success Prediction Functions (SPF). SPFs are used as instance dependent weighing functions of base kernels in MKL framework. Proposed weighing scheme maximizes alignment with Ideal Kernel. Multiple Kernel Learning (MKL) literature has mostly focused on learning weights for base kernel combiners. Recent works using instance dependent weights have resulted in better performance compared to fixed weight MKL approaches. This may be attributed to the fact that different base kernels have varying discriminative capabilities in distinct local regions of input space. We refer to the zones of classification expertise of base kernels as their Regions of Success (RoS). We propose to identify and model them (during training) through a set of instance dependent success prediction functions (SPF) having high values in RoS (and low, otherwise). During operation, the use of these SPFs as instance dependent weighing functions promotes locally discriminative base kernels while suppressing others. We have experimented with 21 benchmark datasets from various domains having large variations in terms of dataset size, interclass imbalances and number of features. Our proposal has achieved higher classification rates and balanced performance (for both positive and negative classes) compared to other instance dependent and fixed weight approaches.
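The notion of alignment with an ideal kernel can be made concrete with a short sketch. It assumes the standard kernel-target alignment definition, in which the ideal kernel for labels y in {-1, +1} is the outer product of y with itself and alignment is the normalized Frobenius inner product. The paper's exact objective may differ, and the function names are hypothetical.

```python
import math


def frobenius_inner(a, b):
    """Sum of element-wise products of two equally sized matrices."""
    return sum(x * y for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def kernel_alignment(kernel, labels):
    """Alignment between a kernel matrix and the ideal kernel.

    Assumption: the ideal kernel is labels * labels^T, so entries are +1
    for same-class pairs and -1 for different-class pairs.
    """
    ideal = [[yi * yj for yj in labels] for yi in labels]
    numerator = frobenius_inner(kernel, ideal)
    denominator = math.sqrt(frobenius_inner(kernel, kernel) *
                            frobenius_inner(ideal, ideal))
    return numerator / denominator
```

A kernel matrix equal to the ideal kernel has alignment 1.0. A kernel that only relates each sample to itself, such as the identity matrix, scores lower because it carries no class structure.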
international conference on pattern recognition | 2016
Tanmay Shankar; S.K. Dwivedy; Prithwijit Guha
Deep Reinforcement Learning has enabled the learning of policies for complex tasks in partially observable environments, without explicitly learning the underlying model of the tasks. While such model-free methods do achieve considerable performance, they often ignore the structure of the task. We present a more natural representation of the solutions to Reinforcement Learning (RL) problems, within three Recurrent Convolutional Neural Network (RCNN) architectures to better exploit this inherent structure. The forward passes of each RCNN execute an efficient Value Iteration, propagate beliefs of state in partially observable environments, and choose optimal actions respectively. Applying back-propagation to these RCNNs allows the system to explicitly learn the Transition Model and Reward Function associated with the underlying MDP, serving as an elegant alternative to classical model-based RL. We evaluate the proposed algorithms in simulation, considering a robot planning problem. We demonstrate the capability of our framework to reduce the cost of re-planning, learn accurate MDP models, and finally re-plan with learned models to achieve near-optimal policies.
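The connection between Value Iteration and a recurrent convolutional forward pass can be sketched on a small grid world. The code below is plain tabular value iteration written so that each sweep looks like a convolutional layer: one 3x3 transition filter per action is applied, followed by a maximum over the action channels. It is an illustration under assumptions, not the architecture from the paper: transitions are deterministic, the reward is collected on arrival, moves off the grid have zero value (zero padding), and all names are hypothetical.

```python
# One 3x3 transition filter per action; the 1 marks the cell reached.
ACTIONS = {
    "up":    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
    "down":  [[0, 0, 0], [0, 0, 0], [0, 1, 0]],
    "left":  [[0, 0, 0], [1, 0, 0], [0, 0, 0]],
    "right": [[0, 0, 0], [0, 0, 1], [0, 0, 0]],
}


def correlate(grid, kernel):
    """Apply a 3x3 filter to every cell, with zero padding at borders."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        total += kernel[dr + 1][dc + 1] * grid[rr][cc]
            out[r][c] = total
    return out


def value_iteration(reward, gamma=0.9, iterations=10):
    """Each sweep is one 'recurrent' pass: filter, then max over actions."""
    h, w = len(reward), len(reward[0])
    value = [[0.0] * w for _ in range(h)]
    for _ in range(iterations):
        # Backed-up quantity at the arrival cell: reward plus discounted value.
        target = [[reward[r][c] + gamma * value[r][c] for c in range(w)]
                  for r in range(h)]
        q_maps = [correlate(target, k) for k in ACTIONS.values()]
        value = [[max(q[r][c] for q in q_maps) for c in range(w)]
                 for r in range(h)]
    return value
```

Replacing the one-hot filters with general non-negative filters that sum to one would model stochastic transitions. In a learned setting, those filter entries are the kind of parameters that back-propagation could adjust.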