Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Binu Muraleedharan Nair is active.

Publication


Featured researches published by Binu Muraleedharan Nair.


Proceedings of SPIE | 2014

Optical flow based Kalman filter for body joint prediction and tracking using HOG-LBP matching

Binu Muraleedharan Nair; Kimberley D. Kendricks; Vijayan K. Asari; Ronald F. Tuttle

We propose a novel real-time framework for tracking specific joints of the human body in low-resolution imagery using an optical flow based Kalman tracker, without the need for a depth sensor. Body joint tracking is necessary for a variety of surveillance applications, such as recognizing gait signatures of individuals, identifying the motion patterns associated with a particular action, and classifying an activity from the corresponding interactions with objects in the scene. The proposed framework consists of two stages: the initialization stage and the tracking stage. In the initialization stage, the joints to be tracked are either manually marked or automatically obtained from other joint detection algorithms in the first few frames within a window of interest, and appropriate image descriptions of each joint are computed. We employ a well-known image coding scheme, Local Binary Patterns (LBP), to represent the local region of each joint; this coding removes the variance to non-uniform lighting conditions and enhances the underlying edges and corners. The image description of a joint region then comprises a histogram computed from the LBP-coded ROI and a HOG (Histogram of Oriented Gradients) descriptor representing the edge information. The tracking stage can be divided into two phases: optical flow based detection of joints in corresponding frames of the sequence, and the prediction/correction phases of a Kalman tracker operating on the joint coordinates. Lucas-Kanade optical flow is used to locate the individual joints in consecutive frames of the video based on their locations in the previous frame. Mismatches, however, can occur due to rotation of the joint region and the rotation variance of the optical flow matching technique.
A mismatch is detected by comparing the joint region descriptors between a pair of frames using the chi-squared metric; depending on this statistic, either the prediction phase or the correction phase of the corresponding Kalman filter is invoked. The Kalman filter for each joint is modeled and designed as a linear approximation of the joint trajectory, whose true form is mostly sinusoidal. The framework is tested on a private dataset provided by the Air Force Institute of Technology, consisting of 21 video sequences, each containing an individual walking across the face of a building and climbing up/down a flight of stairs. The challenges in this dataset are the very low-resolution imagery and some interlacing effects. The algorithm has been successfully tested on several sequences of this dataset, and three joints, namely the shoulder, the hip, and the elbow, are tracked successfully within a window of interest. Future work will involve using these three reliably trackable joints to estimate the positions of other joints which are difficult to track due to their small size and occlusions.
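The LBP coding and the chi-squared gating between the Kalman prediction and correction phases can be sketched as follows. This is a minimal NumPy illustration; the 8-neighbor layout, 256-bin histogram, and the gating threshold are assumptions, not the paper's exact settings.

```python
import numpy as np

def lbp_code(patch):
    """Basic 8-neighbor LBP coding of a grayscale patch (interior pixels only)."""
    c = patch[1:-1, 1:-1]
    neighbors = [patch[0:-2, 0:-2], patch[0:-2, 1:-1], patch[0:-2, 2:],
                 patch[1:-1, 2:],   patch[2:, 2:],     patch[2:, 1:-1],
                 patch[2:, 0:-2],   patch[1:-1, 0:-2]]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbors):
        # set one bit per neighbor that is >= the center pixel
        code |= (n >= c).astype(np.uint8) << bit
    return code

def lbp_histogram(patch, bins=256):
    """Normalized histogram over the LBP-coded ROI."""
    h = np.bincount(lbp_code(patch).ravel(), minlength=bins).astype(float)
    return h / max(h.sum(), 1.0)

def chi_squared(h1, h2, eps=1e-10):
    """Chi-squared distance between two normalized histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def kalman_phase(dist, threshold=0.5):
    """Gating rule: a small distance means the optical-flow match is trusted
    (run the correction phase); a large one signals a mismatch (predict only).
    The threshold here is an illustrative assumption."""
    return "correct" if dist < threshold else "predict"
```

In use, the histogram of the joint ROI in the current frame would be compared against the stored descriptor from the previous frame before each filter update.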


IEEE International Conference on Systems, Man, and Cybernetics | 2013

Regression Based Learning of Human Actions from Video Using HOF-LBP Flow Patterns

Binu Muraleedharan Nair; Vijayan K. Asari

A human action recognition framework is proposed which models the motion variations corresponding to a particular class of actions without the need for sequence-length normalization. The motion descriptors used in this framework are based on the optical flow vectors computed at every point on the silhouette of the human body. A histogram of flow (HOF) is computed from the optical flow vectors, giving the motion orientation in a local neighborhood. To capture the relationship between the motion vectors at a particular instant, the magnitude and direction of the optical flow vector are coded with local binary patterns (LBP). The concatenation of these histograms (HOF-LBP) is taken as the action feature set used in the proposed framework. We illustrate that this motion descriptor is suitable for classifying various human actions when used in conjunction with the proposed recognition framework, which models the motion variations in time for each class using regression-based techniques. The feature vectors extracted from the training set are mapped to a lower-dimensional space using Empirical Orthogonal Function (EOF) analysis. A regression-based technique, Generalized Regression Neural Networks (GRNN), is used to compute the functional mapping from the action feature vectors to their reduced Eigenspace representation for each class, thereby obtaining separate action manifolds. The feature set obtained from a test sequence is compared with each of the action manifolds by comparing the test coefficients with those estimated by the GRNN for the manifold, determining the class using the Mahalanobis distance.
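The GRNN that maps action features to their reduced Eigenspace coefficients is, at its core, a Gaussian-kernel regression with one pattern unit per training sample. A minimal sketch, where the bandwidth `sigma` is an assumed free parameter rather than a value from the paper:

```python
import numpy as np

def grnn_predict(X_train, Y_train, x, sigma=1.0):
    """Generalized Regression Neural Network: a Gaussian-kernel
    Nadaraya-Watson estimate of Y at query point x.
    X_train: (n, d) input features; Y_train: (n, k) target coefficients."""
    d2 = np.sum((X_train - x) ** 2, axis=1)     # squared distance to each sample
    w = np.exp(-d2 / (2.0 * sigma ** 2))        # pattern-unit activations
    w /= w.sum()                                # normalize in the summation layer
    return w @ Y_train                          # weighted average of targets
```

Here each action class would get its own `(X_train, Y_train)` pair, yielding one such regression per action manifold.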


International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems | 2012

Time invariant gesture recognition by modelling body posture space

Binu Muraleedharan Nair; Vijayan K. Asari

We propose a framework for recognizing actions or gestures by modelling the variations of the corresponding body postures with respect to each action class, thereby removing the need to normalize for the speed of motion. The three main aspects are a shape descriptor suitable for describing the posture, the formation of a suitable posture space, and a regression mechanism to model the posture variations with respect to each action class. Histogram of Oriented Gradients (HOG) is used as the shape descriptor, with the variations mapped to a reduced Eigenspace by PCA. The mapping of each action class from the HOG space to the reduced Eigenspace is learned using a GRNN. Classification is performed by comparing points in the Eigenspace with those predicted by each action model, using the Mahalanobis distance. The framework is evaluated on the Weizmann action dataset and the Cambridge Hand Gesture dataset, yielding positive results.
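The Eigenspace construction and Mahalanobis comparison described above can be sketched as follows (PCA via SVD on the HOG feature matrix; the covariance used in the Mahalanobis distance is left to the caller, since the abstract does not specify how it is estimated):

```python
import numpy as np

def pca_basis(X, k):
    """Reduced Eigenspace from training features.
    X: (n, d) matrix whose rows are HOG descriptors; returns the mean
    and the top-k principal directions as rows of a (k, d) matrix."""
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(x, mean, basis):
    """Project a HOG vector onto the reduced Eigenspace."""
    return basis @ (x - mean)

def mahalanobis(p, q, cov_inv):
    """Mahalanobis distance between two points in coefficient space,
    given the inverse covariance of the class manifold."""
    d = p - q
    return float(np.sqrt(d @ cov_inv @ d))
```

Classification would then pick the action model whose predicted Eigenspace point lies closest, under this distance, to the projection of the test frame.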


Procedia Computer Science | 2011

Multi-Pose Face Recognition And Tracking System

Binu Muraleedharan Nair; Jacob Foytik; Richard C. Tompkins; Yakov Diskin; Theus H. Aspiras; Vijayan K. Asari

We propose a real-time system for person detection, recognition, and tracking using frontal and profile faces. The system integrates face detection, face recognition, and tracking techniques. The face detection algorithm uses both frontal-face and profile-face detectors by extracting Haar features and using them in a cascade of boosted classifiers. The pose is determined from the face detection algorithm, which uses a combination of profile and frontal face cascades; depending on the pose, the face is compared with the particular set of faces covering the same pose range for classification. The detected faces are recognized by projecting them onto the Eigenspace obtained from the training phase using modular weighted PCA, and are then tracked using a Kalman filter multiple-face tracker. In the proposed system, the pose range is divided into three bins into which the faces are sorted, and each bin is trained separately to have its own Eigenspace. This system has the advantage of recognizing and tracking an individual with minimal false positives due to pose variations.
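The pose-binned Eigenspace lookup might look like the following sketch. The three-bin yaw thresholds, the `models` layout, and the nearest-neighbor rule in coefficient space are illustrative assumptions, not details from the paper:

```python
import numpy as np

POSE_BINS = ("left-profile", "frontal", "right-profile")  # assumed binning

def pose_bin(yaw_deg):
    """Sort a detected face into one of three pose bins (thresholds assumed)."""
    if yaw_deg < -30:
        return "left-profile"
    if yaw_deg > 30:
        return "right-profile"
    return "frontal"

def recognize(face_vec, models, yaw_deg):
    """Project the face onto the Eigenspace trained for its pose bin and
    return the nearest gallery identity in coefficient space.
    models[bin] = (mean, basis, gallery), where gallery maps a name to
    that person's stored Eigenspace coefficients for this bin."""
    mean, basis, gallery = models[pose_bin(yaw_deg)]
    coeffs = basis @ (face_vec - mean)
    names = list(gallery)
    dists = [np.linalg.norm(coeffs - gallery[n]) for n in names]
    return names[int(np.argmin(dists))]
```

Each bin keeps its own mean, basis, and gallery, matching the per-bin training described above.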


IEEE Applied Imagery Pattern Recognition Workshop | 2013

Vision-based navigation system for obstacle avoidance in complex environments

Yakov Diskin; Binu Muraleedharan Nair; Andrew Braun; Solomon Duning; Vijayan K. Asari

We present a mobile system capable of autonomous navigation through complex unknown environments that contain stationary obstacles and moving targets. The intelligent system is composed of several fine-tuned computer vision algorithms running onboard in real time. The first utilizes onboard cameras for stereoscopic estimation of depths within the surrounding environment. The novelty of the approach lies in its algorithmic efficiency and the system's ability to complete a given task through scene reconstruction and real-time automated decisions. Second, the system performs human body detection and recognition using advanced local binary pattern (LBP) descriptors, which allow it to perform human identification and tracking irrespective of lighting conditions. Lastly, face detection and recognition provide an additional layer of biometrics to ensure the correct target is being tracked. The face detection algorithm utilizes Viola-Jones cascades, which are combined to create a pose-invariant face detection system. Furthermore, we utilize a modular principal component analysis technique to perform pose-invariant face recognition. In this paper, we present the results of a series of experiments designed to automate the security patrol process. Our mobile security system completes a series of tasks within varying scenarios that range in difficulty: tracking an object in an open environment, following a person of interest through a crowded environment, and following a person who disappears around a corner.
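The stereoscopic depth estimation rests on the standard pinhole-stereo relation Z = f·B/d. A minimal sketch of that relation; the focal length and baseline values below are illustrative, not the platform's calibration:

```python
def stereo_depth(disparity_px, focal_px, baseline_m):
    """Depth of a point from its stereo disparity:
    Z = focal length (pixels) * baseline (meters) / disparity (pixels)."""
    if disparity_px <= 0:
        return float("inf")  # no parallax -> point effectively at infinity
    return focal_px * baseline_m / disparity_px
```

Note that nearer objects produce larger disparities, so depth falls as disparity grows, which is what obstacle avoidance relies on.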


Proceedings of SPIE | 2013

Intrusion detection on oil pipeline right of way using monogenic signal representation

Binu Muraleedharan Nair; Varun Santhaseelan; Chen Cui; Vijayan K. Asari

We present an object detection algorithm to automatically detect and identify possible intrusions, such as construction vehicles and equipment, in the regions designated as the pipeline right-of-way (ROW) from high-resolution aerial imagery. The pipeline industry has buried millions of miles of oil pipelines throughout the country, and these regions are under constant threat from unauthorized construction activities. We propose a multi-stage framework which uses a pyramidal template matching scheme in the local phase domain, taking a single high-resolution training image to classify a construction vehicle. The proposed detection algorithm makes use of the monogenic signal representation to extract the local phase information. Computing the monogenic signal of a two-dimensional object region enables us to separate the local phase information (structural details) from the local energy (contrast), thereby achieving illumination invariance. The first stage performs local phase based template matching, using only a single high-resolution training image, in a local region at multiple scales. Then, using local phase histogram matching, the orientation of the detected region is determined, and a voting scheme assigns a weight to each resulting cluster. The final stage selects clusters based on the number of votes attained and, using the histogram of oriented phase feature descriptor, locates the object at the correct orientation and scale. The algorithm is successfully tested on four different datasets containing imagery with varying image resolution and object orientation.
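The monogenic signal can be computed with an FFT-domain Riesz transform, which is what separates local energy (contrast) from local phase (structure). A simplified sketch that omits the band-pass filtering usually applied before the Riesz transform in practice:

```python
import numpy as np

def monogenic(img):
    """Monogenic signal of a 2D image via the frequency-domain Riesz
    transform. Returns (local energy, local phase); the phase carries
    the structural detail used for illumination-invariant matching."""
    rows, cols = img.shape
    u = np.fft.fftfreq(rows)[:, None]
    v = np.fft.fftfreq(cols)[None, :]
    mag = np.sqrt(u ** 2 + v ** 2)
    mag[0, 0] = 1.0                                # avoid divide-by-zero at DC
    F = np.fft.fft2(img)
    r1 = np.real(np.fft.ifft2(1j * u / mag * F))   # first Riesz component
    r2 = np.real(np.fft.ifft2(1j * v / mag * F))   # second Riesz component
    energy = np.sqrt(img ** 2 + r1 ** 2 + r2 ** 2) # local energy (contrast)
    phase = np.arctan2(np.sqrt(r1 ** 2 + r2 ** 2), img)  # local phase
    return energy, phase
```

Template matching in the phase image, rather than in raw intensities, is what gives the framework its robustness to lighting changes.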


Proceedings of SPIE | 2012

Multi-modal low cost mobile indoor surveillance system on the Robust Artificial Intelligence-based Defense Electro Robot (RAIDER)

Binu Muraleedharan Nair; Yakov Diskin; Vijayan K. Asari

We present an autonomous system capable of performing security check routines. The surveillance machine, a Clearpath Husky robotic platform, is equipped with three IP cameras with different orientations for the surveillance tasks of face recognition, human activity recognition, autonomous navigation, and 3D reconstruction of its environment. Combining the computer vision algorithms on a robotic machine has given birth to the Robust Artificial Intelligence-based Defense Electro-Robot (RAIDER). The end purpose of the RAIDER is to conduct a patrolling routine on a single floor of a building several times a day. As the RAIDER travels down the corridors, offline algorithms use two of the RAIDER's side-mounted cameras to perform 3D reconstruction from monocular vision, updating a 3D model to the most current state of the indoor environment. Using frames from the front-mounted camera, positioned at human eye level, the system performs face recognition with real-time training of unknown subjects. A human activity recognition algorithm will also be implemented, in which each detected person is assigned to a set of action classes chosen to classify ordinary and harmful student activities in a hallway setting. The system is designed to detect changes and irregularities within an environment as well as to familiarize itself with regular faces and actions to distinguish potentially dangerous behavior. In this paper, we present the various algorithms, and their modifications, which when implemented on the RAIDER serve the purpose of indoor surveillance.


International Symposium on Visual Computing | 2014

Body Joint Tracking in Low Resolution Video Using Region-Based Filtering

Binu Muraleedharan Nair; Kimberly D. Kendricks; Vijayan K. Asari; Ronald F. Tuttle

We propose a region-based body joint tracking scheme to track and estimate continuous joint locations in low-resolution imagery, where the estimated trajectories can be analyzed for specific gait signatures. The true transition between the joint states is continuous in nature and specifically follows a sinusoidal trajectory. Recent state-of-the-art techniques enable us to estimate the pose at each frame, from which joint locations can be deduced. But these pose estimates at low resolution are often noisy and discrete, and hence not suitable for further gait analysis. Our proposed two-level region-based tracking scheme provides a good approximation to the true trajectory and obtains finer estimates. Initial joint locations are deduced from a human pose estimation algorithm, and subsequent finer locations are estimated and tracked by a Kalman filter. We test the algorithm on sequences containing individuals walking outdoors and evaluate their gait using the estimated joint trajectories.
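The Kalman filtering step, as a linear approximation to the roughly sinusoidal joint trajectory, can be sketched for one joint coordinate with a constant-velocity model. The noise levels `q` and `r` below are illustrative assumptions, not tuned values from the paper:

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-3, r=0.25):
    """Constant-velocity Kalman filter over one joint coordinate,
    smoothing noisy per-frame joint estimates into a continuous track.
    State is [position, velocity]; only position is observed."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
    H = np.array([[1.0, 0.0]])              # observation model
    Q = q * np.eye(2)                       # process noise (assumed)
    R = np.array([[r]])                     # measurement noise (assumed)
    x = np.array([measurements[0], 0.0])    # initial state
    P = np.eye(2)                           # initial state covariance
    out = []
    for z in measurements:
        x = F @ x                           # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R                 # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ (np.array([z]) - H @ x) # correct with the measurement
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0])
    return np.array(out)
```

In the paper's setting the measurements would come from the region-based detector, and the filter supplies the finer, continuous joint locations used for gait analysis.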


International Symposium on Visual Computing | 2014

Learning and Association of Features for Action Recognition in Streaming Video

Binu Muraleedharan Nair; Vijayan K. Asari

We propose a novel framework which learns and associates local motion pattern manifolds in streaming videos using generalized regression neural networks (GRNN) to facilitate real-time human action recognition. The motivation is to determine an individual's action even before the action cycle has been completed. The GRNNs are trained to model the regression function from the input local motion-shape patterns to patterns in the latent action space. This manifold learning makes the framework invariant to differing sequence lengths and varying action states. The latent action basis is computed using EOF analysis, and the association of local temporal patterns to an action class at runtime follows a probabilistic formulation, corresponding to finding the action basis to which the GRNN estimate is closest. Experimental results on two datasets, KTH and UCF Sports, show accuracies above 90% obtained from only 15 to 25 frames.


International Symposium on Visual Computing | 2016

Unsupervised Deep Networks for Temporal Localization of Human Actions in Streaming Videos

Binu Muraleedharan Nair

We propose a deep neural network which captures latent temporal features suitable for temporally localizing actions in streaming videos. The network uses unsupervised generative models, containing autoencoders and conditional restricted Boltzmann machines, to model the temporal structure present in an action. Human motions are non-linear in nature and thus require a continuous temporal model representation of motion, which is crucial for streaming videos. The generative ability helps predict features at future time steps, giving an indication of the completion of an action at any instant. To accommodate M action classes, we train an autoencoder to separate the action spaces and learn a generative model per action space. The final layer accumulates statistics from each model and estimates the action class and the percentage of completion within a segment of frames. Experimental results show that this network provides the predictive and recognition capability required for action localization in streaming videos.

Collaboration


Dive into Binu Muraleedharan Nair's collaborations.

Top Co-Authors

Chen Cui (University of Dayton)

Ronald F. Tuttle (Air Force Institute of Technology)