Publications


Featured research published by Nandita M. Nayak.


Computer Vision and Pattern Recognition | 2013

Context-Aware Modeling and Recognition of Activities in Video

Yingying Zhu; Nandita M. Nayak; Amit K. Roy-Chowdhury

In this paper, rather than modeling activities in videos individually, we propose a hierarchical framework that jointly models and recognizes related activities using motion and various context features. This is motivated by the observation that activities related in space and time rarely occur independently and can serve as context for each other. Given a video, action segments are automatically detected using motion segmentation based on a nonlinear dynamical model. We aim to merge these segments into activities of interest and generate optimal labels for the activities. Towards this goal, we utilize a structural model in a max-margin framework that jointly models the underlying activities which are related in space and time. The model explicitly learns the duration, motion, and context patterns for each activity class, as well as the spatio-temporal relationships among groups of them. The learned model is then used to optimally label the activities in the testing videos using a greedy search method. We show promising results on the VIRAT Ground Dataset, demonstrating the benefit of jointly modeling and recognizing activities in a wide-area scene.
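
As a concrete illustration of the greedy labeling step, the Python sketch below runs coordinate-ascent search over segment labels using hypothetical unary scores (per-segment motion/duration evidence) and pairwise scores (spatio-temporal context between segment labels). It is a minimal stand-in under those assumptions, not the authors' implementation.

    import numpy as np

    def greedy_joint_labels(unary, pairwise, max_sweeps=100):
        """unary: (S, C) per-segment class scores.
        pairwise: (S, S, C, C) context compatibility between segment labels."""
        S, C = unary.shape
        labels = unary.argmax(axis=1)          # independent per-segment predictions
        for _ in range(max_sweeps):            # greedy coordinate-ascent sweeps
            improved = False
            for s in range(S):
                def score(c):
                    return unary[s, c] + sum(
                        pairwise[s, t, c, labels[t]] for t in range(S) if t != s)
                best = max(range(C), key=score)
                if best != labels[s]:
                    labels[s], improved = best, True
            if not improved:
                break
        return labels

    rng = np.random.default_rng(0)
    print(greedy_joint_labels(rng.normal(size=(4, 3)), rng.normal(size=(4, 4, 3, 3))))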


IEEE Journal of Selected Topics in Signal Processing | 2013

Context-Aware Activity Recognition and Anomaly Detection in Video

Yingying Zhu; Nandita M. Nayak; Amit K. Roy-Chowdhury

In this paper, we propose a mathematical framework to jointly model related activities with both motion and context information for activity recognition and anomaly detection. This is motivated by the observation that activities related in space and time rarely occur independently and can serve as context for each other. The spatial and temporal distribution of different activities provides useful cues for understanding these activities. We denote the activities occurring with high frequency in the database as normal activities. Given training data containing labeled normal activities, our model aims to automatically capture frequent motion and context patterns for each activity class, as well as for each pair of classes, from sets of predefined patterns during the learning process. The learned model is then used to generate globally optimal labels for activities in the testing videos. We show how to learn the model parameters via an unconstrained convex optimization problem and how to predict the correct labels for a testing instance consisting of multiple activities. The learned model and generated labels are used to detect anomalies whose motion and context patterns deviate from the learned patterns. We show promising results on the VIRAT Ground Dataset that demonstrate the benefit of jointly modeling and recognizing activities in a wide-area scene and the effectiveness of the proposed method in anomaly detection.
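
The anomaly-detection criterion, scoring an activity by how far its pattern deviates from frequently observed training patterns, can be sketched as below; the discrete pattern ids and the negative-log-frequency score are illustrative assumptions, not the paper's exact formulation.

    from collections import Counter
    import math

    def train_pattern_model(training_patterns):
        """Estimate how frequently each (discrete) pattern occurs in training."""
        counts = Counter(training_patterns)
        total = sum(counts.values())
        return {p: c / total for p, c in counts.items()}

    def anomaly_score(model, pattern, eps=1e-6):
        # Rare or unseen patterns receive a high negative-log-frequency score.
        return -math.log(model.get(pattern, eps))

    model = train_pattern_model(["walk", "walk", "enter", "walk", "exit"])
    print(anomaly_score(model, "walk"))   # low score: frequent pattern
    print(anomaly_score(model, "climb"))  # high score: never observed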


Visual Analysis of Humans | 2011

Modeling and Recognition of Complex Human Activities

Nandita M. Nayak; Ricky J. Sethi; Bi Song; Amit K. Roy-Chowdhury

Activity recognition is a field of computer vision which has shown great progress in the past decade. Starting from simple single-person activities, research in activity recognition is moving toward more complex scenes involving multiple objects and natural environments. The main challenges in the task include localizing and recognizing events in a video and dealing with the large variation in viewpoint, speed of movement, and scale. This chapter gives the reader an overview of the work that has taken place in activity recognition, especially in the domain of complex activities involving multiple interacting objects. We begin with a description of the challenges in activity recognition and give a broad overview of the different approaches. We go into the details of some of the feature descriptors and classification strategies commonly recognized as the state of the art in this field. We then move to more complex recognition systems, discussing the challenges in complex activity recognition and some of the work which has taken place in this respect. Finally, we provide some examples of recent work in complex activity recognition. The ability to recognize complex behaviors involving multiple interacting objects is a very challenging problem, and future work needs to study its various aspects, including features, recognition strategies, models, robustness, and context.


IEEE Transactions on Information Forensics and Security | 2013

Exploiting Spatio-Temporal Scene Structure for Wide-Area Activity Analysis in Unconstrained Environments

Nandita M. Nayak; Yingying Zhu; Amit K. Roy-Chowdhury

Surveillance videos in unconstrained environments typically consist of long-duration sequences of activities which occur at different spatio-temporal locations and can involve multiple people acting simultaneously. Often, the activities have contextual relationships with one another. Although context has been studied in the past for the purpose of activity recognition, the use of context in recognizing activities in such challenging environments is relatively unexplored. In this paper, we propose a novel method for capturing the spatio-temporal context between activities in a Markov random field (MRF). Unlike in many approaches that model the contextual relationships between activities, the structure of the MRF is not predefined but is determined at test time. Given a collection of videos and a set of weak classifiers for individual activities, the spatio-temporal relationships between activities are represented as probabilistic edge weights in the MRF. This model provides a generic representation for an activity sequence that can extend to any number of objects and interactions in a video. We show that the recognition of activities in a video can be posed as an inference problem on the graph. We conduct experiments on the publicly available UCLA office dataset and the VIRAT dataset to demonstrate the improvement in recognition accuracy using our proposed model over recognition using state-of-the-art features on individual activity regions.
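
A rough sketch of the test-time graph construction described here: nodes are detected activity segments, and edge weights decay with spatial and temporal separation. The Gaussian weighting and its bandwidths are assumptions for illustration, not the paper's learned probabilistic weights.

    import numpy as np

    def build_context_mrf(segments, sigma_s=50.0, sigma_t=30.0):
        """segments: list of (x, y, t) centroids of detected activity segments.
        Returns a dense matrix of edge weights for the spatio-temporal MRF."""
        n = len(segments)
        W = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                (xi, yi, ti), (xj, yj, tj) = segments[i], segments[j]
                d_s = np.hypot(xi - xj, yi - yj)       # spatial separation
                d_t = abs(ti - tj)                     # temporal separation
                W[i, j] = W[j, i] = np.exp(-(d_s / sigma_s) ** 2 - (d_t / sigma_t) ** 2)
        return W

    W = build_context_mrf([(10, 20, 0), (15, 22, 5), (300, 400, 90)])
    print(np.round(W, 3))  # nearby segments get strong context edges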


Image and Vision Computing | 2013

Vector field analysis for multi-object behavior modeling

Nandita M. Nayak; Yingying Zhu; Amit K. Roy-Chowdhury

This paper proposes an end-to-end system to recognize multi-person behaviors in video, unifying different tasks such as segmentation, modeling, and recognition within a single optical-flow-based motion analysis framework. We show how optical flow can be used to analyze the activities of individual actors, as opposed to dense crowds, which is what most of the existing literature has concentrated on. The algorithm consists of two steps: identification of motion patterns and modeling of motion patterns. Activities are analyzed using the underlying motion patterns, which are formed by the optical flow field over a period of time. Streaklines are used to capture these motion patterns via integration of the flow field. To recognize the regions of interest, we utilize the Helmholtz decomposition to compute the divergence potential. The extrema, or critical points, of this potential indicate regions of high activity in the video, which are then represented as motion patterns by clustering the streaklines. We then present a method to compare two videos by measuring the similarity between their motion patterns using a combination of shape theory and subspace analysis. Such an analysis allows us to represent, compare, and recognize a wide range of activities. We perform experiments on state-of-the-art datasets and show that the proposed method is suitable for natural videos in the presence of noise, background clutter, and high intra-class variation. Our method has two significant advantages over recent related approaches: it provides a single framework that takes care of both low-level and high-level visual analysis tasks, and it is computationally efficient.
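
One standard way to realize the divergence-potential computation mentioned above is an FFT-based Poisson solve on the flow's divergence; the periodic-boundary solver below is an illustrative choice, and the authors' exact decomposition may differ.

    import numpy as np

    def divergence_potential(u, v):
        """u, v: (H, W) optical-flow components. Solves laplacian(phi) = div(u, v)
        under periodic boundaries; extrema of phi mark sources/sinks of motion."""
        div = np.gradient(u, axis=1) + np.gradient(v, axis=0)
        H, W = div.shape
        ky = 2 * np.pi * np.fft.fftfreq(H)[:, None]
        kx = 2 * np.pi * np.fft.fftfreq(W)[None, :]
        k2 = kx ** 2 + ky ** 2
        k2[0, 0] = 1.0                          # avoid division by zero at DC
        phi_hat = -np.fft.fft2(div) / k2
        phi_hat[0, 0] = 0.0                     # potential is defined up to a constant
        return np.fft.ifft2(phi_hat).real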


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Context-Aware Activity Modeling Using Hierarchical Conditional Random Fields

Yingying Zhu; Nandita M. Nayak; Amit K. Roy-Chowdhury

In this paper, rather than modeling activities in videos individually, we jointly model and recognize related activities in a scene using both motion and context features. This is motivated by the observation that activities related in space and time rarely occur independently and can serve as context for each other. We propose a two-layer conditional random field model that represents the action segments and activities in a hierarchical manner. The model allows the integration of both motion and various context features at different levels and automatically learns the statistics that capture the patterns of the features. With weakly labeled training data, the learning problem is formulated as a max-margin problem and is solved by an iterative algorithm. Rather than generating activity labels for individual activities, our model simultaneously predicts an optimal structural label for the related activities in the scene. We show promising results on the UCLA Office Dataset and the VIRAT Ground Dataset that demonstrate the benefit of hierarchical modeling of related activities using both motion and context features.
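
The max-margin learning step can be sketched as repeated loss-augmented inference followed by a subgradient update, in the usual structured-SVM style; the toy enumeration over candidate labelings and the Hamming loss below are placeholder assumptions, not the paper's hierarchical CRF inference.

    import numpy as np

    def loss_augmented_inference(w, feats, y_true):
        """feats maps each candidate labeling (a tuple) to its joint feature vector.
        Returns the labeling maximizing score + Hamming loss (most violated)."""
        def value(y):
            return w @ feats[y] + sum(a != b for a, b in zip(y, y_true))
        return max(feats, key=value)

    def max_margin_epoch(w, examples, lam=1e-3, lr=0.1):
        """One pass of subgradient descent on the structured hinge loss."""
        for feats, y_true in examples:
            y_hat = loss_augmented_inference(w, feats, y_true)
            g = lam * w - (feats[y_true] - feats[y_hat])  # hinge-loss subgradient
            w = w - lr * g
        return w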


IEEE Transactions on Image Processing | 2015

Hierarchical Graphical Models for Simultaneous Tracking and Recognition in Wide-Area Scenes

Nandita M. Nayak; Yingying Zhu; Amit K. Roy-Chowdhury

We present a unified framework to track multiple people, as well as localize and label their activities, in complex long-duration video sequences. To do this, we focus on two aspects: 1) the influence of tracks on the activities performed by the corresponding actors and 2) the structural relationships across activities. We propose a two-level hierarchical graphical model, which learns the relationships between tracks, between tracks and their corresponding activity segments, as well as the spatio-temporal relationships across activity segments. Such contextual relationships between tracks and activity segments are exploited at both levels in the hierarchy for increased robustness. An L1-regularized structure learning approach is proposed for this purpose. While it is well known that the availability of the labels and locations of activities can help in determining tracks more accurately and vice versa, most current approaches have dealt with these problems separately. Inspired by research in the area of biological vision, we propose a bidirectional approach that integrates both bottom-up and top-down processing, i.e., bottom-up recognition of activities using computed tracks and top-down computation of tracks using the obtained recognition. We demonstrate our results on the recent and publicly available UCLA and VIRAT datasets, consisting of realistic indoor and outdoor surveillance sequences.
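
For the L1-regularized structure learning step, a common realization is an iterative soft-thresholding (ISTA) solve of a lasso objective, where zeroed weights prune candidate edges from the graph; the least-squares objective below is a generic stand-in for the paper's formulation.

    import numpy as np

    def soft_threshold(x, t):
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def ista_lasso(X, y, lam=0.1, iters=500):
        """Solve min_w 0.5*||Xw - y||^2 + lam*||w||_1 via ISTA.
        Zeros in the returned w correspond to pruned graph edges."""
        step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L, L = gradient Lipschitz const
        w = np.zeros(X.shape[1])
        for _ in range(iters):
            grad = X.T @ (X @ w - y)
            w = soft_threshold(w - step * grad, step * lam)
        return w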


International Conference on Image Processing | 2014

Learning a sparse dictionary of video structure for activity modeling

Nandita M. Nayak; Amit K. Roy-Chowdhury

We present an approach which incorporates spatio-temporal features, as well as the relationships between them, into a sparse dictionary learning framework for activity recognition. We propose that the dictionary learning framework can be adapted to learn complex relationships between features in an unsupervised manner. From a set of training videos, a dictionary is learned for individual features, as well as for the relationships between them, using a stacked predictive sparse decomposition framework. This combined dictionary provides a representation of the structure of the video and is spatio-temporally pooled in a local manner to obtain descriptors. The descriptors are then combined using a multiple kernel learning framework to design classifiers. Experiments have been conducted on two popular activity recognition datasets to demonstrate the superior performance of our approach on single-person as well as multi-person activities.
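
As a rough stand-in for the dictionary learning stage, the snippet below fits a sparse dictionary over descriptor vectors with scikit-learn; note the paper's stacked predictive sparse decomposition additionally trains a feed-forward encoder, which this sketch omits, and the sizes shown are arbitrary.

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.default_rng(0)
    descriptors = rng.normal(size=(500, 64))   # stand-in spatio-temporal features

    dico = MiniBatchDictionaryLearning(n_components=128, alpha=1.0, batch_size=32)
    codes = dico.fit_transform(descriptors)    # sparse code per input descriptor
    print(codes.shape, float(np.mean(codes != 0)))  # dimensions and code sparsity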


Indian Conference on Computer Vision, Graphics and Image Processing | 2012

The role of spatial context in activity recognition

Yingying Zhu; Nandita M. Nayak; Amit K. Roy-Chowdhury

In this paper, we propose a mathematical framework to model activities with both motion and context information for activity recognition. This is motivated by the observation that an activity depends not only on the motion of the objects of interest; the surrounding objects also provide useful cues for understanding the activity and can thus serve as context for it. Given training data, our model aims to automatically capture and weigh motion and context patterns for each activity class, from sets of predefined attributes, during the learning process. The learned model is then used to generate optimal labels for activities in the testing videos based on the motion and context features of these activities. We show how to learn the model parameters via an unconstrained convex optimization methodology and how to predict the correct label for a testing instance. We show promising results on the publicly available VIRAT Ground Dataset, demonstrating the benefit of modeling the surrounding context in recognizing activities in a wide-area scene.
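
The unconstrained convex learning step can be illustrated with plain gradient descent on a logistic loss over attribute features; the features, labels, and loss here are toy assumptions rather than the paper's objective.

    import numpy as np

    def fit_logistic(X, y, lr=0.1, iters=1000):
        """X: (N, D) attribute features; y: (N,) binary labels in {0, 1}.
        The loss is convex, so gradient descent reaches the global optimum."""
        w = np.zeros(X.shape[1])
        for _ in range(iters):
            p = 1.0 / (1.0 + np.exp(-(X @ w)))      # predicted probabilities
            w -= lr * X.T @ (p - y) / len(y)        # average logistic gradient
        return w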


International Conference on Image Processing | 2011

Vector field analysis for motion pattern identification in video

Nandita M. Nayak; Ahmed Tashrif Kamal; Amit K. Roy-Chowdhury

Identification of motion patterns in video is an important problem because it is the first step towards the analysis of complex multi-person behaviors and the construction of long-term interaction models. In this paper, we present a flow-based technique to identify spatio-temporal motion patterns in a multi-object video. We use the Helmholtz decomposition of optical flow and compute singular points corresponding to the component fields. We show that the optical flow can be used to identify regions which correspond to different moving entities in the video. The singular points in these regions capture the characteristics of the field around them and can be used to identify these regions. This representation provides a framework to analyze the activities of individual entities in the scene as well as the global interactions between them. We demonstrate our algorithm on a dataset composed of multi-object videos recorded in a realistic environment.
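
Locating singular points can be sketched as finding local extrema of a component field's potential (for example, the divergence potential from the Helmholtz step); the 3x3 neighborhood test here is an illustrative choice, not the paper's exact detector.

    import numpy as np
    from scipy.ndimage import maximum_filter, minimum_filter

    def critical_points(phi):
        """Return boolean masks of local maxima and minima of a potential phi
        (H, W); flat plateaus are also flagged by this simple equality test."""
        maxima = phi == maximum_filter(phi, size=3)
        minima = phi == minimum_filter(phi, size=3)
        return maxima, minima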

Collaboration


Dive into Nandita M. Nayak's collaborations.

Top Co-Authors

Yingying Zhu, University of California
Bi Song, University of California
Ricky J. Sethi, Fitchburg State University
Utkarsh Gaur, University of California