Kishore K. Reddy
University of Central Florida
Publications
Featured research published by Kishore K. Reddy.
machine vision applications | 2013
Kishore K. Reddy; Mubarak Shah
Action recognition across a large number of categories of unconstrained videos taken from the web is a very challenging problem compared to datasets like KTH (6 actions), IXMAS (13 actions), and Weizmann (10 actions). Challenges like camera motion, different viewpoints, large interclass variations, cluttered backgrounds, occlusions, bad illumination conditions, and the poor quality of web videos cause the majority of state-of-the-art action recognition approaches to fail. Also, an increased number of categories and the inclusion of actions with high confusion add to the challenges. In this paper, we propose using the scene context information obtained from moving and stationary pixels in the key frames, in conjunction with motion features, to solve the action recognition problem on a large (50-action) dataset with videos from the web. We perform a combination of early and late fusion on multiple features to handle the very large number of categories. We demonstrate that scene context is a very important feature for performing action recognition on very large datasets. The proposed method does not require any kind of video stabilization, person detection, or tracking and pruning of features. Our approach gives good performance on a large number of action categories; it has been tested on the UCF50 dataset with 50 action categories, which is an extension of the UCF YouTube Action (UCF11) dataset containing 11 action categories. We also tested our approach on the KTH and HMDB51 datasets for comparison.
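The abstract combines motion features with scene-context features through early and late fusion. The sketch below illustrates that general fusion pattern on placeholder data; the SVC classifier, feature dimensions, and random histograms are illustrative assumptions, not the authors' pipeline.

```python
# Hedged sketch of early + late fusion of motion and scene-context features
# for multi-class action classification. Feature values are random
# placeholders; in the paper's setting they would be histograms built from
# motion descriptors and from moving/stationary scene pixels in key frames.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_train, n_test, n_classes = 200, 50, 5
motion_dim, scene_dim = 64, 32

# Placeholder bag-of-features histograms for two feature channels.
X_motion_tr = rng.random((n_train, motion_dim))
X_scene_tr = rng.random((n_train, scene_dim))
y_tr = rng.integers(0, n_classes, n_train)
X_motion_te = rng.random((n_test, motion_dim))
X_scene_te = rng.random((n_test, scene_dim))

# Early fusion: concatenate channels before training a single classifier.
early_clf = SVC(kernel="rbf", probability=True).fit(
    np.hstack([X_motion_tr, X_scene_tr]), y_tr)
p_early = early_clf.predict_proba(np.hstack([X_motion_te, X_scene_te]))

# Late fusion: train one classifier per channel and average their
# class-probability scores at test time.
motion_clf = SVC(kernel="rbf", probability=True).fit(X_motion_tr, y_tr)
scene_clf = SVC(kernel="rbf", probability=True).fit(X_scene_tr, y_tr)
p_late = 0.5 * (motion_clf.predict_proba(X_motion_te)
                + scene_clf.predict_proba(X_scene_te))

# Combine the two fusion schemes and pick the highest-scoring action class.
pred = np.argmax(p_early + p_late, axis=1)
print(pred[:10])
```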
computer vision and pattern recognition | 2011
Sangmin Oh; Anthony Hoogs; A. G. Amitha Perera; Naresh P. Cuntoor; Chia-Chih Chen; Jong Taek Lee; Saurajit Mukherjee; Jake K. Aggarwal; Hyungtae Lee; Larry S. Davis; Eran Swears; Xiaoyang Wang; Qiang Ji; Kishore K. Reddy; Mubarak Shah; Carl Vondrick; Hamed Pirsiavash; Deva Ramanan; Jenny Yuen; Antonio Torralba; Bi Song; Anesco Fong; Amit K. Roy-Chowdhury; Mita Desai
We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms, with a focus on continuous visual event recognition (CVER) in outdoor areas with wide coverage. Previous datasets for action recognition are unrealistic for real-world surveillance because they consist of short clips showing one action by one individual [15, 8]. Datasets have been developed for movies [11] and sports [12], but these actions and scene conditions do not translate effectively to surveillance videos. Our dataset consists of many outdoor scenes with actions occurring naturally, performed by non-actors, in continuously captured videos of the real world. The dataset includes large numbers of instances for 23 event types distributed throughout 29 hours of video. This data is accompanied by detailed annotations, which include both moving object tracks and event examples, providing a solid basis for large-scale evaluation. Additionally, we propose different types of evaluation modes for visual recognition tasks and evaluation metrics, along with our preliminary experimental results. We believe that this dataset will stimulate diverse aspects of computer vision research and help advance CVER tasks in the years ahead.
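The abstract mentions track and event annotations and proposed evaluation metrics without defining them here. The sketch below is only a hedged illustration of one generic way such annotations could be scored: temporal intersection-over-union between detected and ground-truth event intervals. The interval values and the 0.5 threshold are assumptions, not the paper's protocol.

```python
# Hypothetical scoring of detected event intervals against annotations.
def temporal_iou(a, b):
    """a, b: (start_frame, end_frame) event intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

ground_truth = [(100, 220), (400, 480)]   # annotated event intervals
detections = [(110, 200), (700, 760)]     # hypothetical detector output

matched = sum(
    any(temporal_iou(d, g) >= 0.5 for g in ground_truth) for d in detections)
print(f"{matched}/{len(detections)} detections matched at IoU >= 0.5")
```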
international conference on computer vision | 2009
Kishore K. Reddy; Jingen Liu; Mubarak Shah
Action recognition methods suffer from many drawbacks in practice, including (1) the inability to cope with incremental recognition problems; (2) the requirement of an intensive training stage to obtain good performance; (3) the inability to recognize multiple simultaneous actions; and (4) difficulty in performing recognition frame by frame. In order to overcome all of these drawbacks with a single method, we propose a novel framework involving the feature-tree, which indexes large-scale motion features using a Sphere/Rectangle-tree (SR-tree). Recognition consists of two steps: 1) recognizing the local features by non-parametric nearest neighbor (NN) matching, and 2) using a simple voting strategy to label the action. The proposed method can also provide the localization of the action. Since our method does not require feature quantization, the feature-tree can be efficiently grown by adding features from new training examples of actions or categories. Our method provides an effective way to perform practical incremental action recognition. Furthermore, it can handle large-scale datasets because the SR-tree is a disk-based data structure. We have tested our approach on two publicly available datasets, the KTH and IXMAS multi-view datasets, and obtained promising results.
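The recognition scheme described above is essentially "index local features, label each by nearest neighbors, then vote." The following is a minimal sketch of that pattern; sklearn's NearestNeighbors (ball tree) stands in for the disk-based SR-tree, and the random descriptors are placeholders for real spatiotemporal interest-point features.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
n_feats, dim, n_actions = 1000, 72, 6

# Training: local features from labeled videos, indexed without quantization.
train_feats = rng.random((n_feats, dim))
train_labels = rng.integers(0, n_actions, n_feats)
index = NearestNeighbors(n_neighbors=5, algorithm="ball_tree").fit(train_feats)

# Incremental update: adding features from a new training video only requires
# growing the indexed feature set (no codebook to rebuild).
new_feats = rng.random((20, dim))
new_labels = rng.integers(0, n_actions, 20)
train_feats = np.vstack([train_feats, new_feats])
train_labels = np.concatenate([train_labels, new_labels])
index = NearestNeighbors(n_neighbors=5, algorithm="ball_tree").fit(train_feats)

def classify_video(query_feats):
    """Label each local feature by its nearest neighbors, then vote."""
    _, nn_idx = index.kneighbors(query_feats)
    votes = np.bincount(train_labels[nn_idx].ravel(), minlength=n_actions)
    return int(np.argmax(votes))

print(classify_video(rng.random((50, dim))))
```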
advanced video and signal based surveillance | 2012
Kishore K. Reddy; Naresh P. Cuntoor; A. G. Amitha Perera; Anthony Hoogs
Research in human action recognition has advanced along multiple fronts in recent years to address various types of actions, including simple, isolated actions in staged data (e.g., the KTH dataset), complex actions (e.g., the Hollywood dataset), and naturally occurring actions in surveillance videos (e.g., the VIRAT dataset). Several techniques, including those based on gradients, flow, and interest points, have been developed for their recognition. Most perform very well on standard action recognition datasets but fail to produce similar results on more complex, large-scale datasets. Here we analyze the reasons for this limited generalization, using a state-of-the-art technique, histograms of oriented gradients in spatiotemporal volumes, as an example. This analysis may prove useful in developing robust and effective techniques for action recognition.
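For readers unfamiliar with the descriptor family analyzed here, the sketch below is a simplified illustration of a gradient-orientation histogram computed over a spatiotemporal volume (a cuboid of stacked frames). It is not the exact descriptor studied in the paper; it only shows the general construction of per-cell, magnitude-weighted orientation histograms.

```python
import numpy as np

def st_volume_hog(volume, n_bins=8, cells=(2, 2, 2)):
    """volume: (t, y, x) grayscale cuboid -> concatenated per-cell histograms."""
    gt, gy, gx = np.gradient(volume.astype(float))
    mag = np.sqrt(gx**2 + gy**2 + gt**2)
    ori = np.arctan2(gy, gx) % (2 * np.pi)          # spatial orientation only
    bins = np.minimum((ori / (2 * np.pi) * n_bins).astype(int), n_bins - 1)

    t_parts = np.array_split(np.arange(volume.shape[0]), cells[0])
    y_parts = np.array_split(np.arange(volume.shape[1]), cells[1])
    x_parts = np.array_split(np.arange(volume.shape[2]), cells[2])
    hist = []
    for ts in t_parts:
        for ys in y_parts:
            for xs in x_parts:
                b = bins[np.ix_(ts, ys, xs)].ravel()
                w = mag[np.ix_(ts, ys, xs)].ravel()
                hist.append(np.bincount(b, weights=w, minlength=n_bins))
    hist = np.concatenate(hist)
    return hist / (np.linalg.norm(hist) + 1e-8)

print(st_volume_hog(np.random.rand(8, 32, 32)).shape)   # 2*2*2 cells * 8 bins = (64,)
```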
international symposium on biomedical imaging | 2012
Kishore K. Reddy; Berkan Solmaz; Pingkun Yan; Nicholas G. Avgeropoulos; David J. Rippe; Mubarak Shah
Segmenting enhancing brain tumors for accurate tumor volume measurement is a challenging task due to the large variation in tumor appearance and shape, which makes it difficult to incorporate the prior knowledge commonly used in other medical image segmentation tasks. In this paper, the novel idea of a confidence surface is proposed to guide the segmentation of enhancing brain tumors using information across multi-parametric magnetic resonance imaging (MRI). Texture information, along with the typical intensity information from pre-contrast T1-weighted (T1 pre), post-contrast T1-weighted (T1 post), T2-weighted (T2), and fluid-attenuated inversion recovery (FLAIR) MRI images, is used to train a discriminative classifier at the pixel level. The classifier is used to generate a confidence surface, which gives the likelihood of each pixel being tumor or non-tumor. The obtained confidence surface is then incorporated into two classical methods to guide segmentation. The proposed approach was evaluated on 19 sets of MRI images with tumors, and promising results have been demonstrated.
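As a rough sketch of the confidence-surface idea, the code below trains a pixel-level classifier on multi-channel intensities plus a crude texture measure and outputs a per-pixel tumor likelihood map. The synthetic arrays, the random-forest classifier, the local-standard-deviation texture feature, and training and predicting on the same slice are all simplifying assumptions for illustration, not the paper's method.

```python
import numpy as np
from scipy.ndimage import generic_filter
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
h, w = 64, 64
t1_pre, t1_post, t2, flair = (rng.random((h, w)) for _ in range(4))
labels = (rng.random((h, w)) > 0.9).astype(int)      # placeholder tumor mask

def pixel_features(*channels):
    """Stack per-pixel intensities plus a crude 3x3 texture measure."""
    feats = [c.ravel() for c in channels]
    feats += [generic_filter(c, np.std, size=3).ravel() for c in channels]
    return np.column_stack(feats)

X = pixel_features(t1_pre, t1_post, t2, flair)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels.ravel())

# The confidence surface: tumor probability per pixel, reshaped to image size.
confidence_surface = clf.predict_proba(X)[:, 1].reshape(h, w)
print(confidence_surface.shape, confidence_surface.max())
```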
ieee high performance extreme computing conference | 2015
Michael Giering; Vivek Venugopalan; Kishore K. Reddy
The ability to simultaneously leverage multiple modes of sensor information is critical for perception of an automated vehicle's physical surroundings. Spatio-temporal alignment and registration of the incoming information is often a prerequisite to analyzing the fused data. The persistence and reliability of multi-modal registration is therefore key to the stability of decision support systems ingesting the fused information. LiDAR-video systems, like those on many driverless cars, are a common example where keeping the LiDAR and video channels registered to common physical features is important. We develop a deep learning method that takes multiple channels of heterogeneous data to detect the misalignment of the LiDAR-video inputs. A number of variations were tested on the Ford LiDAR-video driving test dataset and will be discussed. To the best of our knowledge, the use of multi-modal deep convolutional neural networks for dynamic real-time LiDAR-video registration has not been presented before.
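To make the multi-channel idea concrete, here is a small PyTorch sketch of a CNN that takes stacked RGB and LiDAR-derived depth patches and classifies the misalignment between them as one of a few candidate offsets. The architecture, patch size, and nine offset classes are assumptions for illustration, not the network described in the paper.

```python
import torch
import torch.nn as nn

N_OFFSETS = 9          # e.g. a 3x3 grid of candidate pixel shifts (assumed)

class MisalignmentNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, N_OFFSETS)

    def forward(self, x):                 # x: (batch, 4, 32, 32) RGB + depth
        f = self.features(x)
        return self.classifier(f.flatten(1))

model = MisalignmentNet()
rgb = torch.rand(8, 3, 32, 32)            # video patches
depth = torch.rand(8, 1, 32, 32)          # projected LiDAR depth patches
logits = model(torch.cat([rgb, depth], dim=1))
print(logits.shape)                       # torch.Size([8, 9])
```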
advanced video and signal based surveillance | 2011
Sangmin Oh; Anthony Hoogs; A. G. Amitha Perera; Naresh P. Cuntoor; Chia-Chih Chen; Jong Taek Lee; Saurajit Mukherjee; Jake K. Aggarwal; Hyungtae Lee; Larry S. Davis; Eran Swears; Xiaoyang Wang; Qiang Ji; Kishore K. Reddy; Mubarak Shah; Carl Vondrick; Hamed Pirsiavash; Deva Ramanan; Jenny Yuen; Antonio Torralba; Bi Song; Anesco Fong; Amit K. Roy-Chowdhury; Mita Desai
We present a concept for automatic construction site monitoring that takes into account 4D information (3D over time) acquired from highly overlapping digital aerial images. On the one hand, today's maturity of flying micro aerial vehicles (MAVs) enables low-cost and efficient acquisition of high-quality imagery that maps construction sites entirely from many varying viewpoints. On the other hand, due to low-noise sensors and high redundancy in the image data, recent developments in 3D reconstruction workflows have benefited the automatic computation of accurate and dense 3D scene information. Having both an inexpensive, high-quality image acquisition and an efficient 3D analysis workflow enables monitoring, documentation, and visualization of observed sites over time at short intervals. Relating acquired 4D site observations, composed of color, texture, and geometry over time, largely supports automated methods toward full scene understanding and the capture of both scene change and construction site progress.
advanced video and signal based surveillance | 2015
Anthony Hoogs; A. G. Amitha Perera; Roderic Collins; Arslan Basharat; Keith Fieldhouse; Chuck Atkins; Linus Sherrill; Benjamin Boeckel; Russell Blue; Matthew Woehlke; C. Greco; Zhaohui Sun; Eran Swears; Naresh P. Cuntoor; J. Luck; B. Drew; D. Hanson; D. Rowley; J. Kopaz; T. Rude; D. Keefe; A. Srivastava; S. Khanwalkar; A. Kumar; Chia-Chih Chen; Jake K. Aggarwal; Larry S. Davis; Yaser Yacoob; Arpit Jain; Dong Liu
We describe a system for content-based retrieval from large surveillance video archives using the behavior, action, and appearance of objects. Objects are detected, tracked, and classified into broad categories. Their behavior and appearance are characterized by action detectors and descriptors, which are indexed in an archive. Queries can be posed as video exemplars, and the results can be refined through relevance feedback. The contributions of our system include the fusion of behavior and action detectors with appearance for matching; the improvement of query results through interactive query refinement (IQR), which learns a discriminative classifier online based on user feedback; and reasonable performance on low-resolution, poor-quality video. The system operates on video from ground cameras and aerial platforms, both RGB and IR. Performance is evaluated on publicly available surveillance datasets, showing that subtle actions can be detected under difficult conditions, with reasonable improvement from IQR.
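The interactive query refinement loop described above can be summarized as: retrieve archive items nearest to an exemplar descriptor, collect user relevance labels, fit a discriminative classifier online, and re-rank. The sketch below illustrates that loop under stated assumptions; random descriptors and a linear SVM stand in for the system's real features and model.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(3)
archive = rng.random((500, 128))           # descriptors indexed in the archive
query = rng.random(128)                    # descriptor of the video exemplar

# Initial ranking: plain nearest-neighbor similarity to the exemplar.
ranking = np.argsort(np.linalg.norm(archive - query, axis=1))

# Simulated user feedback on the top results (1 = relevant, 0 = not).
feedback_idx = ranking[:20]
feedback_lbl = rng.integers(0, 2, feedback_idx.size)

# Online refinement: fit a discriminative model on the labeled results and
# re-rank the whole archive by its decision score.
if len(set(feedback_lbl)) > 1:             # need both classes to train
    clf = LinearSVC().fit(archive[feedback_idx], feedback_lbl)
    ranking = np.argsort(-clf.decision_function(archive))

print(ranking[:10])
```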
international conference on pervasive and embedded computing and communication systems | 2012
Corey McCall; Kishore K. Reddy; Mubarak Shah
arXiv: Computer Vision and Pattern Recognition | 2014
Soumik Sarkar; Vivek Venugopalan; Kishore K. Reddy; Michael Giering; Julian Ryde; Navdeep Jaitly