Publication


Featured research published by Eran Swears.


Computer Vision and Pattern Recognition | 2011

A large-scale benchmark dataset for event recognition in surveillance video

Sangmin Oh; Anthony Hoogs; A. G. Amitha Perera; Naresh P. Cuntoor; Chia-Chih Chen; Jong Taek Lee; Saurajit Mukherjee; Jake K. Aggarwal; Hyungtae Lee; Larry S. Davis; Eran Swears; Xiaoyang Wang; Qiang Ji; Kishore K. Reddy; Mubarak Shah; Carl Vondrick; Hamed Pirsiavash; Deva Ramanan; Jenny Yuen; Antonio Torralba; Bi Song; Anesco Fong; Amit K. Roy-Chowdhury; Mita Desai

We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms with a focus on continuous visual event recognition (CVER) in outdoor areas with wide coverage. Previous datasets for action recognition are unrealistic for real-world surveillance because they consist of short clips showing one action by one individual [15, 8]. Datasets have been developed for movies [11] and sports [12], but these actions and scene conditions do not apply effectively to surveillance videos. Our dataset consists of many outdoor scenes with actions occurring naturally by non-actors in continuously captured videos of the real world. The dataset includes large numbers of instances for 23 event types distributed throughout 29 hours of video. This data is accompanied by detailed annotations that include both moving object tracks and event examples, providing a solid basis for large-scale evaluation. Additionally, we propose different types of evaluation modes for visual recognition tasks and evaluation metrics, along with our preliminary experimental results. We believe that this dataset will stimulate diverse aspects of computer vision research and help advance CVER tasks in the years ahead.
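
The paper defines its own evaluation modes and metrics, which are not reproduced here. As a generic illustration of interval-based event scoring, the sketch below matches detected event intervals to ground-truth annotations by temporal intersection-over-union; the 0.5 threshold and the greedy matching order are illustrative assumptions, not the paper's protocol.

```python
def iou(a, b):
    """Temporal intersection-over-union of two (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def match_events(detections, annotations, thresh=0.5):
    """Greedy one-to-one matching of detections to annotated intervals.

    Returns (true positives, false positives, false negatives).
    """
    unmatched = list(annotations)
    tp = 0
    for det in sorted(detections):
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= thresh:
            tp += 1
            unmatched.remove(best)
    return tp, len(detections) - tp, len(unmatched)

# One detected event against two annotated ones:
tp, fp, fn = match_events([(10.0, 15.0)], [(9.0, 14.0), (30.0, 35.0)])
precision, recall = tp / (tp + fp), tp / (tp + fn)  # 1.0, 0.5
```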


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Modeling Temporal Interactions with Interval Temporal Bayesian Networks for Complex Activity Recognition

Yongmian Zhang; Yifan Zhang; Eran Swears; Natalia Larios; Ziheng Wang; Qiang Ji

Complex activities typically consist of multiple primitive events happening in parallel or sequentially over a period of time. Understanding such activities requires recognizing not only each individual event but, more importantly, capturing their spatiotemporal dependencies over different time intervals. Most of the current graphical model-based approaches have several limitations. First, time-sliced graphical models such as hidden Markov models (HMMs) and dynamic Bayesian networks are typically based on points of time, and hence they can only capture three temporal relations: precedes, follows, and equals. Second, HMMs are probabilistic finite-state machines that grow exponentially as the number of parallel events increases. Third, other approaches such as syntactic and description-based methods, while rich in modeling temporal relationships, do not have the expressive power to capture uncertainties. To address these issues, we introduce the interval temporal Bayesian network (ITBN), a novel graphical model that combines the Bayesian network with interval algebra to explicitly model temporal dependencies over time intervals. Advanced machine learning methods are introduced to learn the ITBN model structure and parameters. Experimental results show that by reasoning with spatiotemporal dependencies, the proposed model leads to significantly improved performance when modeling and recognizing complex activities involving both parallel and sequential events.
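
The interval algebra referenced here is Allen's: two time intervals can stand in exactly thirteen qualitative relations (six, their inverses, and equals), which is what lets an ITBN distinguish, say, "overlaps" from "during" where a point-based model sees only precedes/follows/equals. A minimal sketch of classifying the relation between two event intervals:

```python
def allen_relation(a, b):
    """Return the Allen interval-algebra relation of interval a to interval b.

    Intervals are (start, end) pairs; the 13 relations cover every
    possible configuration of two intervals.
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:  return "precedes"
    if e2 < s1:  return "preceded-by"
    if e1 == s2: return "meets"
    if e2 == s1: return "met-by"
    if s1 == s2 and e1 == e2: return "equals"
    if s1 == s2: return "starts" if e1 < e2 else "started-by"
    if e1 == e2: return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2: return "during"
    if s1 < s2 and e2 < e1: return "contains"
    return "overlaps" if s1 < s2 else "overlapped-by"

# A "reach" event that overlaps the start of a "grasp" event:
print(allen_relation((0.0, 2.0), (1.5, 4.0)))  # -> overlaps
```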


IEEE Workshop on Motion and Video Computing | 2008

Learning Motion Patterns in Surveillance Video using HMM Clustering

Eran Swears; Anthony Hoogs; A. G. Amitha Perera

We present a novel approach to learning motion behavior in video, and detecting abnormal behavior, using hierarchical clustering of hidden Markov models (HMMs). A continuous stream of track data is used for online and on-demand creation and training of HMMs, where tracks may be of highly variable length and scenes may be very complex with an unknown number of motion patterns. We show how these HMMs can be used for online clustering of tracks that represent normal behavior and for detection of deviant tracks. The track clustering algorithm uses a hierarchical agglomerative HMM clustering technique that jointly determines all the HMM parameters (including the number of states) via an expectation maximization (EM) algorithm and the Akaike information criterion. Results are demonstrated on a highly complex scene containing dozens of routes, significant occlusions, and hundreds of moving objects.
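
As a rough illustration of agglomerative HMM clustering (not the paper's joint EM/AIC procedure, which also selects the number of states), here is a minimal sketch using the hmmlearn library; the fixed state count, the symmetrized cross-likelihood distance, and the merge threshold are all illustrative assumptions.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def fit_hmm(tracks, n_states=3):
    """Fit one Gaussian HMM to a list of tracks (arrays of (x, y) points)."""
    model = GaussianHMM(n_components=n_states, covariance_type="diag")
    model.fit(np.vstack(tracks), lengths=[len(t) for t in tracks])
    return model

def hmm_distance(m_i, m_j, rep_i, rep_j):
    """Symmetrized cross-likelihood distance between two track models."""
    d_i = (m_i.score(rep_i) - m_j.score(rep_i)) / len(rep_i)
    d_j = (m_j.score(rep_j) - m_i.score(rep_j)) / len(rep_j)
    return 0.5 * (d_i + d_j)

def cluster_tracks(tracks, threshold=2.0):
    """Merge the closest pair of cluster HMMs until none is within threshold."""
    clusters = [[t] for t in tracks]
    models = [fit_hmm(c) for c in clusters]
    while len(clusters) > 1:
        d, i, j = min((hmm_distance(models[i], models[j],
                                    clusters[i][0], clusters[j][0]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        models = [m for k, m in enumerate(models) if k not in (i, j)]
        clusters.append(merged)
        models.append(fit_hmm(merged))    # refit on the pooled tracks
    return clusters, models
```

A new track can then be scored against every cluster model; a low likelihood under all of them flags it as deviant.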


Computer Vision and Pattern Recognition | 2014

Complex Activity Recognition Using Granger Constrained DBN (GCDBN) in Sports and Surveillance Video

Eran Swears; Anthony Hoogs; Qiang Ji; Kim L. Boyer

Modeling interactions of multiple co-occurring objects in a complex activity is becoming increasingly popular in the video domain. The Dynamic Bayesian Network (DBN) has been applied to this problem in the past due to its natural ability to statistically capture complex temporal dependencies. However, standard DBN structure learning algorithms learn generatively, require manual structure definitions, and/or are computationally complex or restrictive. We propose a novel structure learning solution that fuses the Granger Causality statistic, a direct measure of temporal dependence, with the Adaboost feature selection algorithm to automatically constrain the temporal links of a DBN in a discriminative manner. This approach enables us to completely define the DBN structure prior to parameter learning, which reduces computational complexity in addition to providing a more descriptive structure. We refer to this modeling approach as the Granger Constrained DBN (GCDBN). Our experiments show how the GCDBN outperforms two of the most relevant state-of-the-art graphical models in complex activity classification on handball video data, surveillance data, and synthetic data.
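
The Granger causality statistic itself is a standard regression F-test: do past values of one signal improve the prediction of another beyond that signal's own past? Below is a minimal sketch of the pairwise statistic that could rank candidate temporal links; the lag order is an illustrative choice, and the Adaboost feature-selection stage is not reproduced.

```python
import numpy as np

def granger_f(y, x, lag=2):
    """F-statistic testing whether past values of x help predict y.

    Compares a restricted AR model of y on its own lags against an
    unrestricted model that also includes lags of x.
    """
    n = len(y) - lag
    Y = y[lag:]
    own = np.column_stack([y[lag - k:-k] for k in range(1, lag + 1)])
    other = np.column_stack([x[lag - k:-k] for k in range(1, lag + 1)])
    ones = np.ones((n, 1))
    X_r = np.hstack([ones, own])            # restricted: y's past only
    X_u = np.hstack([ones, own, other])     # unrestricted: plus x's past
    rss_r = np.sum((Y - X_r @ np.linalg.lstsq(X_r, Y, rcond=None)[0]) ** 2)
    rss_u = np.sum((Y - X_u @ np.linalg.lstsq(X_u, Y, rcond=None)[0]) ** 2)
    return ((rss_r - rss_u) / lag) / (rss_u / (n - 2 * lag - 1))

# Toy example: x drives y with a one-step delay.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.roll(x, 1) + 0.1 * rng.normal(size=500)
print(granger_f(y, x))   # large F: keep the temporal link x -> y
print(granger_f(x, y))   # small F: drop the link y -> x
```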


Workshop on Applications of Computer Vision | 2012

Learning and recognizing complex multi-agent activities with applications to American football plays

Eran Swears; Anthony Hoogs

We are interested in modeling and recognizing complex behaviors in video, where multiple agents are interacting in a time-varying manner and in a spatially localized domain such as American football. Our approach pushes the model complexity onto the observations by using a multivariate kernel density while maintaining a simple HMM model. The temporal interactions of objects are captured by coupling the kernel observation distributions with a time-varying state-transition matrix, producing a Non-Stationary Kernel HMM (NSK-HMM). This modeling philosophy specifically addresses several issues that plague more complex stationary models with simple observations, i.e., the Dynamic Multi-Linked HMM (DML-HMM) and the Time-Delayed Probabilistic Graphical Model (TDPGM): smaller training datasets, sensitivity to intra-class variability, and/or dense, uninformative clutter tracks. Experiments are performed in the American football video domain, where the offensive plays are the activities. Comparisons are made to the DML-HMM and an extension of the TDPGM to DBNs (TDDBN). The NSK-HMM achieves 57.7% classification accuracy across seven activities, while the DML-HMM achieves 26.7% and the TDDBN 21.3%. When tested on four activities, the NSK-HMM achieves 76.0% accuracy.
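
To make the modeling idea concrete, here is a minimal sketch of a scaled forward pass in which the transition matrix is indexed by time step and the per-state observation likelihoods come from kernel density estimates; scipy's gaussian_kde stands in for the paper's multivariate kernel observation model, and the training procedure is not shown.

```python
import numpy as np
from scipy.stats import gaussian_kde

def nsk_forward_log_likelihood(obs, kdes, trans_per_t, prior):
    """Log-likelihood of an observation sequence under a non-stationary
    HMM with kernel observation densities.

    obs:         (T, d) observation sequence
    kdes:        one fitted gaussian_kde per hidden state
    trans_per_t: (T-1, S, S) transition matrix for each time step
    prior:       (S,) initial state distribution
    """
    b = np.array([kde(obs.T) for kde in kdes]).T   # (T, S) state likelihoods
    alpha = prior * b[0]
    log_like = np.log(alpha.sum())
    alpha /= alpha.sum()
    for t in range(1, len(obs)):
        alpha = (alpha @ trans_per_t[t - 1]) * b[t]
        c = alpha.sum()
        log_like += np.log(c)
        alpha /= c
    return log_like

# Toy usage: two states, 2-D observations, uniform time-varying transitions.
rng = np.random.default_rng(0)
kdes = [gaussian_kde(rng.normal(loc=m, size=(2, 50))) for m in (0.0, 3.0)]
obs = rng.normal(size=(20, 2))
trans = np.full((19, 2, 2), 0.5)
print(nsk_forward_log_likelihood(obs, kdes, trans, np.array([0.5, 0.5])))
```

Classifying a play then amounts to evaluating its tracks under each activity's model and taking the argmax over the resulting log-likelihoods.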


IEEE Workshop on Motion and Video Computing | 2009

Functional scene element recognition for video scene analysis

Eran Swears; Anthony Hoogs

We present a method to detect and recognize functional scene elements in video scenes. A functional scene element is a location or object that is primarily defined by its specific function or purpose, rather than its appearance or shape. Our method combines techniques from video scene analysis with functional recognition to decompose a video scene into its functional elements such as parking spots, building entrances, roads and sidewalks. Existing techniques for functional object recognition in video [2,3] are designed for high-resolution video with little clutter and constrained situations, while our approach is designed for real-world video surveillance scenes where there are many movers, and detection and tracking can be poor because of low resolution and frame rates. Video scene analysis methods are focused on motion pattern learning and anomaly detection [4][8][11][12][13][14], whereas we take a recognition approach and develop motion pattern models for specific functional categories. The movements of objects such as vehicles and pedestrians are exploited to detect and classify functional scene elements in an online process that probabilistically accumulates evidence over many tracks to compensate for noisy and partial observations. Results are shown on simulated and real data of complex, busy scenes containing multiple instances of different functional objects. The detected elements are then used to demonstrate that building activity profiles can be extracted and used to distinguish different types of buildings.
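
The online evidence accumulation can be illustrated with a per-cell log-odds update: each track whose behavior supports or contradicts a functional hypothesis (say, a parking spot) nudges that cell's posterior. The grid, the binary behavioral cue, and the hit/false-alarm rates below are assumptions for illustration, not the paper's category-specific motion-pattern models.

```python
import numpy as np

def accumulate_evidence(grid_shape, observations, p_hit=0.8, p_false=0.3):
    """Per-cell log-odds accumulation for a functional hypothesis.

    observations: iterable of ((row, col), supports) pairs, one per track,
    where `supports` is True if the track's behavior (e.g., stopping,
    then a pedestrian appearing) supports the hypothesis at that cell.
    """
    log_odds = np.zeros(grid_shape)
    for (r, c), supports in observations:
        if supports:
            log_odds[r, c] += np.log(p_hit / p_false)
        else:
            log_odds[r, c] += np.log((1 - p_hit) / (1 - p_false))
    return 1.0 / (1.0 + np.exp(-log_odds))    # posterior per cell

# Three stop-and-dismount tracks at cell (4, 7), one drive-through:
obs = [((4, 7), True)] * 3 + [((4, 7), False)]
posterior = accumulate_evidence((10, 10), obs)
print(posterior[4, 7])   # rises toward 1 as supporting tracks accumulate
```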


Video Analytics for Business Intelligence | 2012

Automatic Activity Profile Generation from Detected Functional Regions for Video Scene Analysis

Eran Swears; Matthew W. Turek; Roderic Collins; A. G. Amitha Perera; Anthony Hoogs

The potential applications of video surveillance to the Business Intelligence domain continue to grow. For example, automatic computer vision algorithms can provide a fast, efficient process to screen hundreds of hours of video for activity patterns that potentially impact the business. Two such algorithms and their variants are discussed in this chapter. These algorithms analyze surveillance video in order to automatically recognize various functional elements, such as walkways, roadways, parking spots, and doorways, through their interactions with pedestrian and vehicle detections. The recognized functional element regions provide a means of capturing statistics related to particular businesses. For example, the owner may be interested in the number of people that enter or exit their business versus the number of people that walk past. Results are shown on functional element recognition and business-related activity profiles that demonstrate the effectiveness of these algorithms. Experiments are performed using webcam video of a downtown main street in Ocean City, NJ, and surveillance video from the CAVIAR shopping center dataset.
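
The entered-versus-walked-past statistic reduces to testing pedestrian tracks against the recognized region polygons. A minimal sketch follows, with hand-specified polygons standing in for the automatically recognized doorway and walkway regions.

```python
from matplotlib.path import Path

def profile_counts(tracks, door_poly, walkway_poly):
    """Count tracks that end at the door vs. tracks that only pass by."""
    door, walk = Path(door_poly), Path(walkway_poly)
    entered = passed = 0
    for track in tracks:
        if door.contains_point(track[-1]):        # track terminates at door
            entered += 1
        elif any(walk.contains_point(p) for p in track):
            passed += 1
    return entered, passed

door = [(10, 0), (12, 0), (12, 2), (10, 2)]
walkway = [(0, 0), (20, 0), (20, 4), (0, 4)]
tracks = [[(0, 1), (5, 1), (11, 1)],              # walks in the door
          [(0, 3), (10, 3), (20, 3)]]             # walks past
print(profile_counts(tracks, door, walkway))      # -> (1, 1)
```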


Advanced Video and Signal Based Surveillance | 2011

AVSS 2011 demo session: A large-scale benchmark dataset for event recognition in surveillance video

Sangmin Oh; Anthony Hoogs; A. G. Amitha Perera; Naresh P. Cuntoor; Chia-Chih Chen; Jong Taek Lee; Saurajit Mukherjee; Jake K. Aggarwal; Hyungtae Lee; Larry S. Davis; Eran Swears; Xiaoyang Wang; Qiang Ji; Kishore K. Reddy; Mubarak Shah; Carl Vondrick; Hamed Pirsiavash; Deva Ramanan; Jenny Yuen; Antonio Torralba; Bi Song; Anesco Fong; Amit K. Roy-Chowdhury; Mita Desai

Summary form only given.


Advanced Video and Signal Based Surveillance | 2015

An end-to-end system for content-based video retrieval using behavior, actions, and appearance with interactive query refinement

Anthony Hoogs; A. G. Amitha Perera; Roderic Collins; Arslan Basharat; Keith Fieldhouse; Chuck Atkins; Linus Sherrill; Benjamin Boeckel; Russell Blue; Matthew Woehlke; C. Greco; Zhaohui Sun; Eran Swears; Naresh P. Cuntoor; J. Luck; B. Drew; D. Hanson; D. Rowley; J. Kopaz; T. Rude; D. Keefe; A. Srivastava; S. Khanwalkar; A. Kumar; Chia-Chih Chen; Jake K. Aggarwal; Larry S. Davis; Yaser Yacoob; Arpit Jain; Dong Liu

We describe a system for content-based retrieval from large surveillance video archives, using behavior, action, and appearance of objects. Objects are detected, tracked, and classified into broad categories. Their behavior and appearance are characterized by action detectors and descriptors, which are indexed in an archive. Queries can be posed as video exemplars, and the results can be refined through relevance feedback. The contributions of our system include the fusion of behavior and action detectors with appearance for matching; the improvement of query results through interactive query refinement (IQR), which learns a discriminative classifier online based on user feedback; and reasonable performance on low-resolution, poor-quality video. The system operates on video from ground cameras and aerial platforms, both RGB and IR. Performance is evaluated on publicly available surveillance datasets, showing that subtle actions can be detected under difficult conditions, with reasonable improvement from IQR.
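
Interactive query refinement of this general kind can be sketched as online training of a linear classifier on user-labeled descriptors, followed by re-scoring the archive. scikit-learn's SGDClassifier is used below as a generic stand-in for the system's actual discriminative classifier.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

class InteractiveQueryRefiner:
    """Re-rank archive descriptors from user relevance feedback."""

    def __init__(self, archive):
        self.archive = archive              # (N, d) indexed descriptors
        self.clf = SGDClassifier()          # linear model trained online

    def feedback(self, indices, labels):
        """labels: 1 = marked relevant by the user, 0 = irrelevant."""
        self.clf.partial_fit(self.archive[indices], labels, classes=[0, 1])

    def rerank(self, top_k=20):
        """Score every archive entry and return the top-k indices."""
        scores = self.clf.decision_function(self.archive)
        return np.argsort(-scores)[:top_k]

# One feedback round over a toy archive of 100 descriptors:
rng = np.random.default_rng(0)
iqr = InteractiveQueryRefiner(rng.normal(size=(100, 16)))
iqr.feedback(np.array([0, 1, 2, 3]), np.array([1, 1, 0, 0]))
print(iqr.rerank(5))
```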


International Conference on Computer Vision | 2013

Pyramid Coding for Functional Scene Element Recognition in Video Scenes

Eran Swears; Anthony Hoogs; Kim L. Boyer

Recognizing functional scene elements in video scenes based on the behaviors of moving objects that interact with them is an emerging problem of interest. Existing approaches have a limited ability to characterize elements such as crosswalks, intersections, and buildings that have low activity, are multi-modal, or have indirect evidence. Our approach recognizes the low-activity and multi-modal elements (crosswalks/intersections) by introducing a hierarchy of descriptive clusters to form a pyramid of codebooks that is sparse in the number of clusters and dense in content. The incorporation of local behavioral context such as person-enter-building and vehicle-parking nearby enables the detection of elements that do not have direct motion-based evidence, e.g., buildings. These two contributions significantly improve scene element recognition when compared against three state-of-the-art approaches. Results are shown on typical ground-level surveillance video and, for the first time, on the more complex Wide Area Motion Imagery.
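
A pyramid of codebooks can be sketched as recursive k-means: each level re-clusters the members of every parent cell, so the hierarchy stays sparse in clusters while every descriptor remains assigned. Plain hierarchical k-means below is a generic stand-in for the paper's descriptive clusters; the branching factor and depth are illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook_pyramid(features, branching=4, levels=3):
    """Coarse-to-fine pyramid of codebooks via recursive k-means."""
    pyramid = [[features]]                  # level 0: one cell with everything
    for _ in range(levels):
        next_level = []
        for cell in pyramid[-1]:
            if len(cell) < branching:       # too few samples to split further
                next_level.append(cell)
                continue
            labels = KMeans(n_clusters=branching, n_init=10).fit_predict(cell)
            next_level.extend(cell[labels == k] for k in range(branching))
        pyramid.append(next_level)
    return pyramid

# Toy behavioral descriptors (e.g., track speed/heading statistics):
rng = np.random.default_rng(0)
pyr = build_codebook_pyramid(rng.normal(size=(200, 8)))
print([len(level) for level in pyr])        # cells per level, e.g. [1, 4, 16, 64]
```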

Collaboration


Dive into Eran Swears's collaborations.

Top Co-Authors

Qiang Ji (Rensselaer Polytechnic Institute)
Chia-Chih Chen (University of Texas at Austin)
Jake K. Aggarwal (University of Texas at Austin)
Antonio Torralba (Massachusetts Institute of Technology)
Bi Song (University of California)