Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Jake K. Aggarwal is active.

Publication


Featured research published by Jake K. Aggarwal.


IEEE Nonrigid and Articulated Motion Workshop | 1997

Human motion analysis: a review

Jake K. Aggarwal; Quin Cai

Human motion analysis is receiving increasing attention from computer vision researchers. This interest is motivated by a wide spectrum of applications, such as athletic performance analysis, surveillance, man-machine interfaces, content-based image storage and retrieval, and video conferencing. The paper gives an overview of the various tasks involved in motion analysis of the human body. The authors focus on three major areas related to interpreting human motion: 1) motion analysis involving human body parts, 2) tracking of human motion using single or multiple cameras, and 3) recognizing human activities from image sequences. Motion analysis of human body parts involves the low-level segmentation of the human body into segments connected by joints, and recovers the 3D structure of the human body using its 2D projections over a sequence of images. Tracking human motion using a single or multiple camera focuses on higher-level processing, in which moving humans are tracked without identifying specific parts of the body structure. After successfully matching the moving human image from one frame to another in image sequences, understanding the human movements or activities comes naturally, which leads to a discussion of recognizing human activities. The review is illustrated by examples.


Computer Vision and Image Understanding | 1999

Human Motion Analysis

Jake K. Aggarwal; Quin Cai

Human motion analysis is receiving increasing attention from computer vision researchers. This interest is motivated by a wide spectrum of applications, such as athletic performance analysis, surveillance, man-machine interfaces, content-based image storage and retrieval, and video conferencing. This paper gives an overview of the various tasks involved in motion analysis of the human body. We focus on three major areas related to interpreting human motion: (1) motion analysis involving human body parts, (2) tracking a moving human from a single view or multiple camera perspectives, and (3) recognizing human activities from image sequences. Motion analysis of human body parts involves the low-level segmentation of the human body into segments connected by joints and recovers the 3D structure of the human body using its 2D projections over a sequence of images. Tracking human motion from a single view or multiple perspectives focuses on higher-level processing, in which moving humans are tracked without identifying their body parts. After successfully matching the moving human image from one frame to another in an image sequence, understanding the human movements or activities comes naturally, which leads to our discussion of recognizing human activities.


IEEE Transactions on Systems, Man, and Cybernetics | 1989

Structure from stereo: a review

Umesh R. Dhond; Jake K. Aggarwal

The authors review major recent developments in establishing stereo correspondence for the extraction of the 3D structure of a scene. Broad categories of stereo algorithms are identified on the basis of differences in imaging geometry, matching primitives, and the computational structure used. The performance of these stereo techniques on various classes of test images is reviewed, and possible directions of future research are indicated.


Proceedings of the IEEE | 1988

On the computation of motion from sequences of images: A review

Jake K. Aggarwal; Nagaraj Nandhakumar

Recent developments are reviewed in the computation of motion and structure of objects in a scene from a sequence of images. Two distinct paradigms are highlighted: (i) the feature-based approach and (ii) the optical-flow-based approach. The comparative merits/demerits of these approaches are discussed. The current status of research in these areas is reviewed and future research directions are indicated.


Computer Vision and Pattern Recognition | 2012

View invariant human action recognition using histograms of 3D joints

Lu Xia; Chia-Chih Chen; Jake K. Aggarwal

In this paper, we present a novel approach for human action recognition with histograms of 3D joint locations (HOJ3D) as a compact representation of postures. We extract the 3D skeletal joint locations from Kinect depth maps using Shotton et al.'s method [6]. The HOJ3D computed from the action depth sequences are reprojected using LDA and then clustered into k posture visual words, which represent the prototypical poses of actions. The temporal evolutions of those visual words are modeled by discrete hidden Markov models (HMMs). In addition, due to the design of our spherical coordinate system and the robust 3D skeleton estimation from Kinect, our method demonstrates significant view invariance on our 3D action dataset. Our dataset is composed of 200 3D sequences of 10 indoor activities performed by 10 individuals in varied views. Our method is real-time and achieves superior results on the challenging 3D action dataset. We also tested our algorithm on the MSR Action 3D dataset, where it outperforms Li et al. [25] in most cases.
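As a rough illustration of the posture representation described above, the following sketch bins 3D joint locations into a spherical-coordinate histogram around a reference joint. The bin counts, coordinate conventions, and normalization here are illustrative assumptions, not the paper's exact HOJ3D parameters.

```python
import numpy as np

def hoj3d_histogram(joints, center, n_theta=12, n_phi=6):
    """Bin 3D joint locations into a spherical-coordinate histogram.

    joints : (N, 3) array of joint positions (e.g. from a Kinect skeleton)
    center : (3,) reference point (e.g. the hip center) that anchors
             the spherical coordinate system
    Returns a flattened, normalized histogram of length n_theta * n_phi.
    """
    v = np.asarray(joints, dtype=float) - np.asarray(center, dtype=float)
    r = np.linalg.norm(v, axis=1)
    r[r == 0] = 1e-9                                   # avoid division by zero
    theta = np.arctan2(v[:, 1], v[:, 0])               # azimuth in [-pi, pi]
    phi = np.arcsin(np.clip(v[:, 2] / r, -1.0, 1.0))   # elevation in [-pi/2, pi/2]

    hist, _, _ = np.histogram2d(
        theta, phi,
        bins=[n_theta, n_phi],
        range=[[-np.pi, np.pi], [-np.pi / 2, np.pi / 2]],
    )
    return hist.ravel() / len(joints)                  # normalize to sum to 1
```

In the paper's pipeline, histograms like this would then be reprojected with LDA and clustered into posture visual words before HMM modeling; those steps are omitted here.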


International Conference on Computer Vision | 2009

Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities

Michael S. Ryoo; Jake K. Aggarwal

Human activity recognition is a challenging task, especially when the background is unknown or changing, and when scale or illumination differs in each video. Approaches utilizing spatio-temporal local features have proved able to cope with such difficulties, but they have mainly focused on classifying short videos of simple periodic actions. In this paper, we present a new activity recognition methodology that overcomes the limitations of the previous approaches using local features. We introduce a novel matching scheme, the spatio-temporal relationship match, which is designed to measure structural similarity between sets of features extracted from two videos. Our match hierarchically considers spatio-temporal relationships among feature points, thereby enabling detection and localization of complex non-periodic activities. In contrast to previous approaches that 'classify' videos, our approach is designed to 'detect and localize' all occurring activities in continuous videos where multiple actors and pedestrians are present. We implement and test our methodology on a newly-introduced dataset containing videos of multiple interacting persons and individual pedestrians. The results confirm that our system is able to recognize complex non-periodic activities (e.g. 'push' and 'hug') from sets of spatio-temporal features even when multiple activities are present in the scene.


Computer Vision and Pattern Recognition | 2011

Human detection using depth information by Kinect

Lu Xia; Chia-Chih Chen; Jake K. Aggarwal

Conventional human detection is mostly done in images taken by visible-light cameras. These methods imitate the detection process that humans use. They use features based on gradients, such as histograms of oriented gradients (HOG), or extract interest points in the image, such as the scale-invariant feature transform (SIFT). In this paper, we present a novel human detection method using depth information taken by the Kinect for Xbox 360. We propose a model-based approach, which detects humans using a 2-D head contour model and a 3-D head surface model. We propose a segmentation scheme to separate the human from his/her surroundings and extract the whole contour of the figure based on our detection point. We also explore a tracking algorithm based on our detection results. The methods are tested on our database, captured by the Kinect in our lab, and show superior results.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1999

Tracking human motion in structured environments using a distributed-camera system

Quin Cai; Jake K. Aggarwal

This paper presents a comprehensive framework for tracking coarse human models from sequences of synchronized monocular grayscale images in multiple camera coordinates. It demonstrates the feasibility of an end-to-end person tracking system using a unique combination of motion analysis on 3D geometry in different camera coordinates and other existing techniques in motion detection, segmentation, and pattern recognition. The system starts with tracking from a single camera view. When the system predicts that the active camera will no longer have a good view of the subject of interest, tracking will be switched to another camera which provides a better view and requires the least switching to continue tracking. The nonrigidity of the human body is addressed by matching points of the middle line of the human image, spatially and temporally, using Bayesian classification schemes. Multivariate normal distributions are employed to model class-conditional densities of the features for tracking, such as location, intensity, and geometric features. Limited degrees of occlusion are tolerated within the system. Experimental results using a prototype system are presented and the performance of the algorithm is evaluated to demonstrate its feasibility for real time applications.
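The Bayesian matching step described above can be sketched roughly as follows: each candidate class (for example, a tracked subject) is modeled by a multivariate normal density over its features, and a feature vector is assigned to the class with the highest posterior. The class models, priors, and two-dimensional feature layout below are hypothetical placeholders, not the paper's trained models.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of a multivariate normal N(mean, cov) at x."""
    d = len(mean)
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    _, logdet = np.linalg.slogdet(np.asarray(cov, dtype=float))
    maha = diff @ np.linalg.solve(cov, diff)   # Mahalanobis distance squared
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def classify(x, classes, priors):
    """Bayesian classifier: argmax over classes of p(x | c) * P(c).

    classes : dict mapping label -> (mean, cov) of its feature model
    priors  : dict mapping label -> prior probability P(c)
    """
    scores = {c: log_gaussian(x, m, s) + np.log(priors[c])
              for c, (m, s) in classes.items()}
    return max(scores, key=scores.get)
```

For instance, a feature vector of (location, intensity) measurements could be matched against per-track Gaussian models fitted from earlier frames; here the models are simply written down by hand.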


Computer Vision and Pattern Recognition | 2011

A large-scale benchmark dataset for event recognition in surveillance video

Sangmin Oh; Anthony Hoogs; A. G. Amitha Perera; Naresh P. Cuntoor; Chia-Chih Chen; Jong Taek Lee; Saurajit Mukherjee; Jake K. Aggarwal; Hyungtae Lee; Larry S. Davis; Eran Swears; Xiaoyang Wang; Qiang Ji; Kishore K. Reddy; Mubarak Shah; Carl Vondrick; Hamed Pirsiavash; Deva Ramanan; Jenny Yuen; Antonio Torralba; Bi Song; Anesco Fong; Amit K. Roy-Chowdhury; Mita Desai

We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms with a focus on continuous visual event recognition (CVER) in outdoor areas with wide coverage. Previous datasets for action recognition are unrealistic for real-world surveillance because they consist of short clips showing one action by one individual [15, 8]. Datasets have been developed for movies [11] and sports [12], but these actions and scene conditions do not apply effectively to surveillance videos. Our dataset consists of many outdoor scenes with actions occurring naturally by non-actors in continuously captured videos of the real world. The dataset includes large numbers of instances for 23 event types distributed throughout 29 hours of video. This data is accompanied by detailed annotations, including both moving object tracks and event examples, which will provide a solid basis for large-scale evaluation. Additionally, we propose different types of evaluation modes for visual recognition tasks and evaluation metrics, along with our preliminary experimental results. We believe that this dataset will stimulate diverse aspects of computer vision research and help advance the CVER tasks in the years ahead.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1979

Texture Analysis Using Generalized Co-Occurrence Matrices

Larry S. Davis; Steven A. Johns; Jake K. Aggarwal

We present a new approach to texture analysis based on the spatial distribution of local features in unsegmented textures. The textures are described using features derived from generalized co-occurrence matrices (GCM). A GCM is determined by a spatial constraint predicate F and a set of local features P = {(Xi, Yi, di), i = 1,..., m} where (Xi, Yi) is the location of the ith feature, and di is a description of the ith feature. The GCM of P under F, GF, is defined by GF(i, j) = number of pairs, pk, pl such that F(pk, pl) is true and di and dj are the descriptions of pk and pl, respectively. We discuss features derived from GCMs and present an experimental study using natural textures.
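The GCM definition above translates almost directly into code. The sketch below counts ordered pairs of distinct features whose locations satisfy the spatial predicate F; whether pairs are ordered and whether a feature may pair with itself are not pinned down by the abstract, so those are assumptions here, as are the distance predicate and labels in the usage example.

```python
import numpy as np

def generalized_cooccurrence(points, F, n_labels):
    """Generalized co-occurrence matrix per the definition above.

    points   : list of (x, y, d) local features, where d in {0..n_labels-1}
               is a discrete description of the feature
    F        : spatial constraint predicate F(pk, pl) -> bool over features
    n_labels : number of distinct feature descriptions
    Returns G where G[i, j] counts ordered pairs (pk, pl) of distinct
    features with descriptions i and j such that F(pk, pl) holds.
    """
    G = np.zeros((n_labels, n_labels), dtype=int)
    for i, pk in enumerate(points):
        for j, pl in enumerate(points):
            if i != j and F(pk, pl):
                G[pk[2], pl[2]] += 1
    return G
```

As a toy example, taking F to be "within Euclidean distance 2" and three labeled feature points, only the two nearby points co-occur:

```python
def near(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= 4

points = [(0, 0, 0), (1, 1, 1), (10, 10, 0)]
G = generalized_cooccurrence(points, near, 2)
```

Texture features would then be derived from statistics of G, as the abstract discusses.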

Collaboration


Dive into Jake K. Aggarwal's collaborations.

Top Co-Authors

Michael S. Ryoo
University of Texas at Austin

Amar Mitiche
Institut national de la recherche scientifique

Sangho Park
University of Texas at Austin

Chia-Chih Chen
University of Texas at Austin

Yuan-Fang Wang
University of Texas at Austin

Alan C. Bovik
University of Texas at Austin