Sukhendu Das
Indian Institute of Technology Madras
Publications
Featured research published by Sukhendu Das.
IETE Technical Review | 2010
Utthara Gosa Mangai; Suranjana Samanta; Sukhendu Das; Pinaki Roy Chowdhury
Abstract For any pattern classification task, an increase in data size, number of classes, dimension of the feature space, and interclass separability affects the performance of any classifier. A single classifier is generally unable to handle the wide variability and scalability of the data in any problem domain. Most modern techniques of pattern classification use a combination of classifiers and fuse their decisions, often using only a selected set of features appropriate for the task. The problem of selecting a useful set of features and discarding those which do not provide class separability is addressed by feature selection and fusion tasks. This paper presents a review of the different techniques and algorithms used in decision fusion and feature fusion strategies for the task of pattern classification. The prominent techniques used for decision fusion, feature selection, and feature fusion are surveyed separately, and the fusion techniques are categorized based on their applicability and the methodology adopted for classification. We propose a novel framework combining the concepts of decision fusion and feature fusion to increase the performance of classification. Experiments on three benchmark datasets demonstrate the robustness of combining feature fusion and decision fusion techniques.
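As a loose illustration of the decision-fusion idea surveyed above (not the framework proposed in the paper), two common fusion rules, majority vote and accuracy-weighted vote, can be sketched in a few lines; the class labels and weights here are hypothetical:

```python
from collections import Counter

def majority_vote(decisions):
    """Fuse class labels from several classifiers by majority vote.
    Ties are broken by the label that was seen first."""
    return Counter(decisions).most_common(1)[0][0]

def weighted_vote(decisions, weights):
    """Weighted decision fusion: each classifier's vote counts in
    proportion to its (e.g. validation-accuracy-based) weight."""
    scores = {}
    for label, w in zip(decisions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

# Three hypothetical classifiers vote on one test sample.
print(majority_vote(["road", "road", "building"]))       # road
print(weighted_vote(["road", "building", "building"],
                    [0.9, 0.3, 0.3]))                    # road (0.9 vs 0.6)
```

In practice the weights would come from each classifier's performance on held-out data, which is one of the design choices the survey compares.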
IEEE Transactions on Geoscience and Remote Sensing | 2011
Sukhendu Das; T. T. Mirnalinee; Koshy Varghese
The process of road extraction from high-resolution satellite images is complex, and most researchers have shown results only on a small selected set of images. The type of processing varies with the data acquisition sensor and the geolocation of the region, and users tune several heuristic parameters to achieve a reasonable degree of accuracy. We exploit two salient features of roads, namely distinct spectral contrast and locally linear trajectory, to design a multistage framework to extract roads from high-resolution multispectral satellite images. We trained four Probabilistic Support Vector Machines separately, using four different categories of training samples extracted from urban/suburban areas. Dominant Singular Measure is used to detect locally linear edge segments as potential road trajectories. This complementary information is integrated using an optimization framework to obtain potential road targets, which yields good results only when the roads have few obstacles (trees, large vehicles, and tall buildings). Linking of disjoint segments uses the local gradient functions at the adjacent pair of road endings. Region part segmentation uses curvature information to remove stray nonroad structures, and Medial-Axis-Transform-based hypothesis verification eliminates connected nonroad structures to improve the accuracy of road detection. Results are evaluated on a large set of multispectral remotely sensed images and compared against a few state-of-the-art methods to validate the superior performance of our proposed method.
Computer Vision/Computer Graphics Collaboration Techniques | 2011
Himanshu Prakash Jain; Anbumani Subramanian; Sukhendu Das; Anurag Mittal
Automatic detection and pose estimation of humans is an important task in Human-Computer Interaction (HCI), user interaction, and event analysis. This paper presents a model-based approach for detecting and estimating human pose by fusing depth and RGB color data from a monocular view. The proposed system uses Haar-cascade-based detection and template matching to track the most reliably detectable parts, namely the head and torso. A stick-figure model is used to represent the detected body parts. The fitting is then performed independently for each limb, using a weighted distance transform map. Fitting each limb independently speeds up the process and makes it robust, avoiding the combinatorial complexity problems common to these types of methods. The output is a stick-figure model consistent with the pose of the person in the given input image. The algorithm works in real time, is fully automatic, and can detect multiple non-intersecting people.
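The limb fitting above relies on a distance transform map. A minimal sketch of an unweighted, Manhattan-metric multi-source distance transform over a binary edge grid, purely for illustration and not the weighted map used by the authors:

```python
from collections import deque

def distance_transform(grid):
    """Multi-source BFS distance transform of a binary grid: each cell
    receives its Manhattan distance to the nearest 1-valued cell."""
    h, w = len(grid), len(grid[0])
    inf = h * w
    dist = [[inf] * w for _ in range(h)]
    q = deque()
    for y in range(h):
        for x in range(w):
            if grid[y][x]:          # seed: silhouette/edge pixels
                dist[y][x] = 0
                q.append((y, x))
    while q:
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] > dist[y][x] + 1:
                dist[ny][nx] = dist[y][x] + 1
                q.append((ny, nx))
    return dist

edges = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
print(distance_transform(edges))
# centre is 0, edge-adjacent cells 1, corners 2
```

A limb hypothesis can then be scored by summing the map values along its candidate stick segment, with lower sums meaning a better fit to the silhouette.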
Indian Conference on Computer Vision, Graphics and Image Processing | 2007
A. Pavan Kumar; V. Kamakoti; Sukhendu Das
In this paper, the design of a parallel architecture for on-line face recognition using weighted modular principal component analysis (WMPCA), and its system-on-programmable-chip (SoPC) implementation, are discussed. The WMPCA methodology, proposed by us earlier, is based on the assumption that different regions of a face vary at different rates with expression and illumination. Given a database of sample faces for training and a query face to recognize, the WMPCA methodology divides the face into horizontal regions. Each of these regions is analyzed independently by computing its eigenfeatures and comparing them with the corresponding eigenfeatures of the faces stored in the sample database to calculate the corresponding error. The final decision of the face recognizer is based on a weighted sum of the errors computed from each region. The weights are calculated based on the extent to which the various samples of the subject are spread in the eigenspace. The WMPCA methodology has a better recognition rate than the modular PCA approach developed by Rajkiran and Vijayan [Rajkiran, G., Vijayan, K., 2004. An improved face recognition technique based on modular PCA approach. Pattern Recognition Letters, 25(4), 429-436]. The methodology also offers wide scope for parallelism. We present an architecture that exploits this parallelism and implement it as a system-on-programmable-chip on an Altera-based field-programmable gate array (FPGA) platform. The implementation achieves a processing speed of about 26 frames per second at an operating frequency of 33.33 MHz.
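The weighted decision step of WMPCA can be illustrated with a toy sketch; the error values, weights, and function name below are hypothetical, and the per-region eigenfeature errors are assumed to have been computed already:

```python
def wmpca_decision(region_errors, region_weights):
    """region_errors[s][r]: reconstruction error of region r of the
    query against subject s; region_weights[r]: confidence weight of
    region r. Returns the subject index with the smallest weighted
    error sum."""
    totals = [sum(w * e for w, e in zip(region_weights, errors))
              for errors in region_errors]
    return min(range(len(totals)), key=totals.__getitem__)

# Two subjects, three horizontal regions (e.g. forehead, eyes, mouth).
# The mouth region (weight 0.2) is down-weighted as expression-variant.
errors = [[0.1, 0.2, 0.9],   # subject 0
          [0.3, 0.3, 0.1]]   # subject 1
weights = [1.0, 1.0, 0.2]
print(wmpca_decision(errors, weights))  # 0  (0.48 vs 0.62)
```

Note how the large mouth-region error for subject 0 barely matters once that region is down-weighted, which is exactly the intuition behind weighting regions by their variability.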
IEEE Transactions on Circuits and Systems for Video Technology | 2010
A. Dyana; Sukhendu Das
We present a novel spatio-temporal descriptor to efficiently represent a video object for the purpose of content-based video retrieval. Spatial and temporal features are integrated in a unified framework for the retrieval of similar video shots. A sequence of orthogonal processing, using a pair of 1-D multiscale and multispectral filters, on the space-time volume (STV) of a video object (VOB) produces a gradually evolving (smoother) surface. Zero-crossing contours (2-D) computed using the mean curvature on this evolving surface are stacked in layers to yield a hilly (3-D) surface, giving a joint multispectro-temporal curvature scale space (MST-CSS) representation of the video object. Peaks and valleys (saddle points) are detected on the MST-CSS surface for feature representation and matching. Computing the cost of matching a query video shot with a model involves matching a pair of 3-D point sets, with their attributes (local curvature) and the 3-D orientations of the finally smoothed STV surfaces. Experiments have been performed with simulated and real-world video shots, using the precision-recall metric for our performance study. The system is compared with a few state-of-the-art methods that use shape and motion trajectory for VOB representation. Our unified approach has shown better performance than approaches that combine match costs obtained from separate shape and motion trajectory representations, as well as our previous work on a simple joint spatio-temporal descriptor (3-D-CSS).
EURASIP Journal on Advances in Signal Processing | 2007
Lalit Gupta; Vinod Pathangay; Arpita Patra; A. Dyana; Sukhendu Das
We propose a method for indoor versus outdoor scene classification using a probabilistic neural network (PNN). The scene is first segmented (unsupervised) using fuzzy c-means clustering (FCM), and features based on color, texture, and shape are extracted from each of the image segments. The image is thus represented by a feature set, with a separate feature vector for each image segment. As the number of segments differs from one scene to another, the feature-set representation of the scene is of varying dimension. Therefore, a modified PNN is used to classify the variable-dimension feature sets. The proposed technique is evaluated on two databases: IITM-SCID2 (scene classification image database) and the one used by Payne and Singh in 2005. The performance of different feature combinations is compared using the modified PNN.
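For reference, a textbook fixed-dimension PNN (not the authors' modified variable-dimension version) reduces to a Gaussian Parzen-window vote per class; the feature vectors, labels, and smoothing parameter below are made up:

```python
import math

def pnn_classify(query, train, sigma=1.0):
    """Probabilistic neural network: sum a Gaussian kernel over the
    training vectors of each class and pick the class with the largest
    summed activation (a Parzen-window density estimate)."""
    scores = {}
    for vec, label in train:
        d2 = sum((q - v) ** 2 for q, v in zip(query, vec))
        scores[label] = scores.get(label, 0.0) + math.exp(-d2 / (2 * sigma ** 2))
    return max(scores, key=scores.get)

# Hypothetical 2-D feature vectors for two scene classes.
train = [((0.1, 0.2), "indoor"), ((0.2, 0.1), "indoor"),
         ((0.9, 0.8), "outdoor"), ((0.8, 0.9), "outdoor")]
print(pnn_classify((0.15, 0.15), train))  # indoor
print(pnn_classify((0.85, 0.85), train))  # outdoor
```

The paper's modification is needed precisely because this standard formulation assumes every sample is a single fixed-length vector, whereas a segmented scene yields a variable number of segment vectors.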
IEEE International Conference on Automatic Face & Gesture Recognition | 2008
Vinod Pathangay; Sukhendu Das; Thomas Greiner
In this paper, a geometric method for estimating the face pose (roll and yaw angles) from a single uncalibrated view is presented. The symmetric structure of the human face is exploited by taking the mirror image (horizontal flip) of a test face image as a virtual second view. Facial feature point correspondences are established between the given test image and its mirror image using an active appearance model. The face pose estimation problem is thus cast as a two-view rotation estimation problem. By using bilateral symmetry, roll and yaw angles are estimated without the need for camera calibration. The proposed pose estimation method is evaluated on synthetic and natural face datasets, and the results are compared with an eigenspace-based method. The proposed symmetry-based method achieves performance comparable to the eigenspace-based method on both synthetic and real face image datasets.
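The roll component of pose can be illustrated with simple planar geometry; this toy sketch (hypothetical pixel coordinates, with image-axis orientation caveats ignored) is far simpler than the paper's two-view rotation estimation:

```python
import math

def roll_from_eye_centres(left_eye, right_eye):
    """In-plane rotation (roll) of a face, in degrees, from the line
    joining the two eye centres; an upright face gives 0 degrees.
    Points are (x, y) pixel coordinates."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

print(roll_from_eye_centres((100, 120), (160, 120)))  # 0.0  (upright)
print(roll_from_eye_centres((100, 120), (160, 180)))  # 45.0 (tilted)
```

Yaw is the harder angle, since it moves feature points in depth; that is where the mirror image as a virtual second view does the real work in the paper.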
Pattern Recognition Letters | 2009
A. Dyana; Sukhendu Das
This paper proposes a Gabor-filter-based representation of motion trajectory for the purpose of motion-based video retrieval. We propose a spectro-temporal representation of the trajectory, which involves detecting a set of salient points from the local peaks of the Gabor filter responses. Changes in trajectory direction are also represented, by observing the clockwise or anti-clockwise turn at each salient point. The feature set (formed by the frequency, temporal location, and turning direction at each salient point) provides a semantic representation of the trajectory. Ours is a global trajectory representation in which matching is performed using edit distance, and it is shown to perform well even for partial trajectory matching. The system is tested using two benchmark databases of trajectories, as well as various hand-drawn and partial trajectories; we have also experimented on real-world videos. Experimental results show better performance than existing systems based on Fourier descriptors, polynomial representation, and two state-of-the-art methods of symbolic representation based on PCA and characteristics of movement.
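The edit-distance matching mentioned above can be sketched with a standard Levenshtein distance over symbol strings; the turning-direction encoding below (C/A for clockwise/anti-clockwise) is a made-up stand-in for the paper's richer feature set:

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences, a common
    matching cost for symbolic trajectory representations."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# A partial trajectory (one missing turn) still matches cheaply,
# while a structurally different one is far away.
print(edit_distance("CACAC", "CACC"))  # 1
print(edit_distance("CACAC", "AAAA"))  # 3
```

Because insertions and deletions carry bounded cost, a query that covers only part of a stored trajectory can still score well, which is the property the paper exploits for partial matching.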
International Journal of Computer Mathematics | 2002
Raman Balasubramanian; Sukhendu Das; Swaminathan Udayabaskaran; Krishnan Swaminathan
In this paper, a stochastic analysis of the quantization error in a stereo imaging system is presented. The probability density function of the range estimation error and the expected value of the range error magnitude are derived in terms of various design parameters, and the relative range error is proposed.
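A first-order version of the stereo quantization-error relation can be worked through numerically; this uses the standard pinhole-stereo model z = f·b/d with hypothetical parameter values, not the paper's full stochastic derivation:

```python
def depth(f_px, baseline_m, disparity_px):
    """Stereo range from disparity: z = f*b/d, with focal length in
    pixels, baseline in metres, disparity in pixels."""
    return f_px * baseline_m / disparity_px

def quantization_range_error(f_px, baseline_m, disparity_px, dd=1.0):
    """First-order range error for a disparity quantization step dd:
    |dz| ~= z**2 * dd / (f * b), i.e. error grows with range squared."""
    z = depth(f_px, baseline_m, disparity_px)
    return z * z * dd / (f_px * baseline_m)

# f = 1000 px, baseline = 0.1 m, disparity = 20 px  ->  z = 5 m;
# a one-pixel disparity step then corresponds to ~0.25 m of range.
print(depth(1000, 0.1, 20))                     # 5.0
print(quantization_range_error(1000, 0.1, 20))  # 0.25
```

The quadratic growth of |dz| with z is the deterministic core of why distant points are ranged so much more coarsely; the paper's contribution is to treat the quantization error probabilistically rather than with this single-step approximation.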
International Conference on Neural Information Processing | 2004
A. Pavan Kumar; Sukhendu Das; V. Kamakoti
A method of face recognition using weighted modular principal component analysis (WMPCA) is presented in this paper. The proposed methodology has a better recognition rate than conventional PCA for faces with large variations in expression and illumination. The face is divided into horizontal sub-regions, such as the forehead, eyes, nose, and mouth, and each of these is analyzed separately using PCA. The final decision is based on a weighted sum of the errors obtained from each sub-region. A method is proposed to calculate these weights, based on the assumption that different regions of a face vary at different rates with expression, pose, and illumination.