
Publication


Featured research published by Bi Song.


Computer Vision and Pattern Recognition | 2011

A large-scale benchmark dataset for event recognition in surveillance video

Sangmin Oh; Anthony Hoogs; A. G. Amitha Perera; Naresh P. Cuntoor; Chia-Chih Chen; Jong Taek Lee; Saurajit Mukherjee; Jake K. Aggarwal; Hyungtae Lee; Larry S. Davis; Eran Swears; Xiaoyang Wang; Qiang Ji; Kishore K. Reddy; Mubarak Shah; Carl Vondrick; Hamed Pirsiavash; Deva Ramanan; Jenny Yuen; Antonio Torralba; Bi Song; Anesco Fong; Amit K. Roy-Chowdhury; Mita Desai

We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms, with a focus on continuous visual event recognition (CVER) in outdoor areas with wide coverage. Previous datasets for action recognition are unrealistic for real-world surveillance because they consist of short clips showing one action by one individual [15, 8]. Datasets have been developed for movies [11] and sports [12], but these actions and scene conditions do not transfer well to surveillance videos. Our dataset consists of many outdoor scenes with actions performed naturally by non-actors in continuously captured videos of the real world. The dataset includes large numbers of instances for 23 event types distributed throughout 29 hours of video. This data is accompanied by detailed annotations, including both moving object tracks and event examples, which provide a solid basis for large-scale evaluation. Additionally, we propose different types of evaluation modes for visual recognition tasks and evaluation metrics, along with our preliminary experimental results. We believe this dataset will stimulate diverse aspects of computer vision research and help advance CVER in the years ahead.


IEEE Transactions on Image Processing | 2010

Tracking and Activity Recognition Through Consensus in Distributed Camera Networks

Bi Song; Ahmed Tashrif Kamal; Cristian Soto; Chong Ding; Jay A. Farrell; Amit K. Roy-Chowdhury

Camera networks are being deployed for various applications like security and surveillance, disaster response, and environmental modeling. However, there is little automated processing of the data. Moreover, most methods for multicamera analysis are centralized schemes that require the data to be present at a central server. In many applications, this is prohibitively expensive, both technically and economically. In this paper, we investigate distributed scene analysis algorithms by leveraging concepts of consensus that have been studied in the context of multiagent systems but have seen little application in video analysis. Each camera estimates certain parameters based upon its own sensed data, which is then shared locally with the neighboring cameras in an iterative fashion, and a final estimate is arrived at in the network using consensus algorithms. We specifically focus on two basic problems: tracking and activity recognition. For multitarget tracking in a distributed camera network, we show how the Kalman-Consensus algorithm can be adapted to take into account the directional nature of video sensors and the network topology. For the activity recognition problem, we derive a probabilistic consensus scheme that combines the similarity scores of neighboring cameras to come up with a probability for each action at the network level. Thorough experimental results are shown on real data along with a quantitative analysis.
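The iterative neighbor-sharing step described above can be sketched in a few lines. This is a generic average-consensus toy on scalar estimates, not the authors' full Kalman-Consensus filter; the line topology, update rate, and initial estimates are illustrative assumptions.

```python
def consensus_step(estimates, neighbors, rate=0.3):
    """One synchronous consensus iteration: each camera moves its estimate
    toward those of its neighbors, weighted by the update rate."""
    return [e + rate * sum(estimates[j] - e for j in neighbors[i])
            for i, e in enumerate(estimates)]

# Three cameras in a line topology: 0 -- 1 -- 2
neighbors = {0: [1], 1: [0, 2], 2: [1]}
estimates = [10.0, 12.0, 17.0]  # noisy local estimates of one target's position

for _ in range(100):
    estimates = consensus_step(estimates, neighbors)

print(estimates)  # each camera is now essentially at the network average, 13.0
```

Because the update is symmetric across each edge, the network average is preserved at every step, so the cameras agree on the mean of their initial estimates without any central server.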


International Conference on Computer Vision | 2011

A “string of feature graphs” model for recognition of complex activities in natural videos

Utkarsh Gaur; Yingying Zhu; Bi Song; Amit K. Roy-Chowdhury

Videos usually consist of activities involving interactions between multiple actors, sometimes referred to as complex activities. Recognition of such activities requires modeling the spatio-temporal relationships between the actors and their individual variabilities. In this paper, we consider the problem of recognizing complex activities in a video given a query example. We propose a new feature model based on a string representation of the video which respects the spatio-temporal ordering. The local collections of features (e.g., cuboids, STIP), which are the characters of the string, are initially matched using graph-based spectral techniques. Final recognition is obtained by matching the string representations of the query and the test videos in a dynamic programming framework which allows for variability in sampling rates and speed of activity execution. The method does not require tracking or recognition of body parts, is able to identify the region of interest in a cluttered scene, and gives reasonable performance with even a single query example. We test our approach in an example-based video retrieval framework with two publicly available complex activity datasets and provide comparisons against other methods that have studied this problem.
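The dynamic programming matching that tolerates differing sampling rates and execution speeds is in the spirit of dynamic time warping. A minimal DTW sketch on toy 1-D score sequences (the real method matches graphs of spatio-temporal features, not scalars):

```python
import math

def dtw(query, test):
    """Classic dynamic time warping cost between two sequences."""
    n, m = len(query), len(test)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - test[j - 1])
            # insertion / deletion / match moves absorb speed differences
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

slow = [0, 0, 1, 1, 2, 2, 3, 3]  # same activity, executed at half speed
fast = [0, 1, 2, 3]
print(dtw(slow, fast))  # 0.0: DTW aligns the two despite the speed difference
```

The warping path is exactly the kind of alignment that lets a query clip match a test video recorded at a different frame rate or performed at a different pace.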


European Conference on Computer Vision | 2010

A stochastic graph evolution framework for robust multi-target tracking

Bi Song; Ting-Yueh Jeng; Elliot Staudt; Amit K. Roy-Chowdhury

Maintaining the stability of tracks on multiple targets in video over extended time periods remains a challenging problem. A few methods which have recently shown encouraging results in this direction rely on learning context models or the availability of training data. However, this may not be feasible in many application scenarios. Moreover, tracking methods should be able to work across different scenarios (e.g., multiple resolutions of the video), which makes such context models hard to obtain. In this paper, we consider the problem of long-term tracking in video in application domains where context information is not available a priori, nor can it be learned online. We build our solution on the hypothesis that most existing trackers can obtain reasonable short-term tracks (tracklets). By analyzing the statistical properties of these tracklets, we develop associations between them so as to come up with longer tracks. This is achieved through a stochastic graph evolution step that considers the statistical properties of individual tracklets, as well as the statistics of the targets along each proposed long-term track. On multiple real-life video sequences spanning low and high resolution data, we show the ability to accurately track over extended time periods (results are shown on many minutes of continuous video).
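A heavily simplified sketch of the tracklet-association idea, using toy 1-D tracklets and a greedy motion-consistency linker in place of the paper's stochastic graph evolution; all field names and thresholds are made-up assumptions:

```python
def link_cost(a, b):
    """Cost of appending tracklet b after tracklet a (lower is better)."""
    gap = b["start_t"] - a["end_t"]
    if gap <= 0:
        return float("inf")  # b must begin strictly after a ends
    # extrapolate a's last position at its last velocity; compare to b's start
    predicted = a["end_x"] + a["vel"] * gap
    return abs(predicted - b["start_x"])

def link_tracklets(tracklets, max_cost=2.0):
    """Greedily chain tracklets whose best link cost is below max_cost."""
    tracks = [[t] for t in tracklets]
    while True:
        best = (max_cost, None, None)
        for i, ta in enumerate(tracks):
            for j, tb in enumerate(tracks):
                if i != j:
                    c = link_cost(ta[-1], tb[0])
                    if c < best[0]:
                        best = (c, i, j)
        if best[1] is None:
            return tracks
        _, i, j = best
        tracks[i].extend(tracks[j])
        del tracks[j]

# one target moving right at unit speed, seen as two tracklets,
# plus a stationary target elsewhere in the scene
t1 = {"start_t": 0, "end_t": 5, "start_x": 0.0, "end_x": 5.0, "vel": 1.0}
t2 = {"start_t": 7, "end_t": 10, "start_x": 7.0, "end_x": 10.0, "vel": 1.0}
t3 = {"start_t": 3, "end_t": 9, "start_x": 50.0, "end_x": 50.0, "vel": 0.0}

tracks = link_tracklets([t1, t2, t3])
print(len(tracks))  # 2: t1 and t2 are joined into one long track, t3 stays apart
```

The paper's method replaces this greedy, deterministic cost with a stochastic evaluation of tracklet statistics over candidate graphs, but the underlying structure is the same: short tracks become nodes, plausible continuations become edges.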


IEEE Transactions on Image Processing | 2012

Collaborative Sensing in a Distributed PTZ Camera Network

Chong Ding; Bi Song; Akshay A. Morye; Jay A. Farrell; Amit K. Roy-Chowdhury

The performance of dynamic scene algorithms often suffers because of the inability to effectively acquire features on the targets, particularly when they are distributed over a wide field of view. In this paper, we propose an integrated analysis and control framework for a pan-tilt-zoom (PTZ) camera network in order to maximize various scene understanding performance criteria (e.g., tracking accuracy, best shot, and image resolution) through dynamic camera-to-target assignment and efficient feature acquisition. Moreover, we consider the situation where processing is distributed across the network since it is often unrealistic to have all the image data at a central location. In such situations, the cameras, although autonomous, must collaborate among themselves because each camera's PTZ parameters entail constraints on the others'. Motivated by recent work in cooperative control of sensor networks, we propose a distributed optimization strategy, which can be modeled as a game involving the cameras and targets. The cameras gain by reducing the error covariance of the tracked targets or through higher resolution feature acquisition, which, however, comes at the risk of losing the dynamic target. Through the optimization of this reward-versus-risk tradeoff, we are able to control the PTZ parameters of the cameras and assign them to targets dynamically. The tracks, upon which the control algorithm is dependent, are obtained through a consensus estimation algorithm whereby cameras can arrive at a consensus on the state of each target through a negotiation strategy. We analyze the performance of this collaborative sensing strategy in active camera networks in a simulation environment, as well as a real-life camera network.


Computer Vision and Pattern Recognition | 2009

Distributed multi-target tracking in a self-configuring camera network

Cristian Soto; Bi Song; Amit K. Roy-Chowdhury

This paper deals with the problem of tracking multiple targets in a distributed network of self-configuring pan-tilt-zoom cameras. We focus on applications where events unfold over a large geographic area and need to be analyzed by multiple overlapping and non-overlapping active cameras without a central unit accumulating and analyzing all the data. The overall goal is to keep track of all targets in the region of deployment of the cameras, while selectively focusing at a high resolution on some particular target features. To acquire all the targets at the desired resolutions while keeping the entire scene in view, we use cooperative network control ideas based on multi-player learning in games. For tracking the targets as they move through the area covered by the cameras, we propose a special application of the distributed estimation algorithm known as Kalman-Consensus filter through which each camera comes to a consensus with its neighboring cameras about the actual state of the target. This leads to a camera network topology that changes with time. Combining these ideas with single-view analysis, we have a completely distributed approach for multi-target tracking and camera network self-configuration. We show performance analysis results with real-life experiments on a network of 10 cameras.


IEEE Signal Processing Magazine | 2011

Distributed Camera Networks

Bi Song; Chong Ding; Ahmed Tashrif Kamal; Jay A. Farrell; Amit K. Roy-Chowdhury

Over the past decade, large-scale camera networks have become increasingly prevalent in a wide range of applications, such as security and surveillance, disaster response, and environmental modeling. In many applications, bandwidth constraints, security concerns, and difficulty in storing and analyzing large amounts of data centrally at a single location necessitate the development of distributed camera network architectures. Thus, the development of distributed scene-analysis algorithms has received much attention lately. However, the performance of these algorithms often suffers because of the inability to effectively acquire the desired images, especially when the targets are dispersed over a wide field of view (FOV). In this article, we show how to develop an end-to-end framework for integrated sensing and analysis in a distributed camera network so as to maximize various scene-understanding performance criteria (e.g., tracking accuracy, best shot, and image resolution).


IEEE Journal of Selected Topics in Signal Processing | 2008

Robust Tracking in A Camera Network: A Multi-Objective Optimization Framework

Bi Song; Amit K. Roy-Chowdhury

We address the problem of tracking multiple people in a network of nonoverlapping cameras. This introduces certain challenges that are unique to this particular application scenario, in addition to existing challenges in tracking like pose and illumination variations, occlusion, clutter, and sensor noise. For this purpose, we propose a novel multi-objective optimization framework by combining short-term feature correspondences across the cameras with long-term feature dependency models. The overall solution strategy involves adapting the similarities between features observed at different cameras based on the long-term models and finding the stochastically optimal path for each person. For modeling the long-term interdependence of the features over space and time, we propose a novel method based on discriminant analysis models. The entire process allows us to adaptively evolve the feature correspondences by observing the system performance over a time window, and correct for errors in the similarity estimations. We show results on data collected by two large camera networks. These experiments prove that incorporation of the long-term models enables us to hold tracks of objects over extended periods of time, including situations where there are large “blind” areas. The proposed approach is implemented by distributing the processing over the entire network.
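Finding the stochastically optimal path for each person, given pairwise similarity scores between observations at successive cameras, is at heart a dynamic program. A Viterbi-style sketch with made-up similarity values follows; the actual framework additionally adapts these similarities over time using discriminant analysis models:

```python
def best_path(layers):
    """layers[k][i][j]: similarity between candidate i at step k and
    candidate j at step k+1. Returns (path, max-product path score)."""
    score = [1.0] * len(layers[0])
    back = []
    for layer in layers:
        new_score, choice = [], []
        for j in range(len(layer[0])):
            # best predecessor for candidate j at the next step
            i = max(range(len(score)), key=lambda i: score[i] * layer[i][j])
            new_score.append(score[i] * layer[i][j])
            choice.append(i)
        back.append(choice)
        score = new_score
    # backtrack from the best final candidate
    j = max(range(len(score)), key=lambda j: score[j])
    path = [j]
    for choice in reversed(back):
        j = choice[j]
        path.append(j)
    path.reverse()
    return path, max(score)

# two camera-to-camera hops, two candidate identities at each step
layers = [[[0.9, 0.1], [0.2, 0.8]],
          [[0.7, 0.3], [0.4, 0.6]]]
path, score = best_path(layers)
print(path)  # [0, 0, 0]: stay with candidate 0 across all three observations
```

Treating the similarities as random variables, as the paper does, replaces these fixed numbers with learned distributions, but the optimal-path computation keeps this shape.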


International Conference on Computer Vision | 2007

Stochastic Adaptive Tracking In A Camera Network

Bi Song; Amit K. Roy-Chowdhury

We present a novel stochastic, adaptive strategy for tracking multiple people in a large network of video cameras. Similarities between features (appearance and biometrics) observed at different cameras are continuously adapted and the stochastically optimal path for each person computed. The following are the major contributions of the proposed approach. First, we consider situations where the feature similarities are uncertain and treat them as random variables. We show how the distributions of these random variables can be learned and how to compute the tracks in a stochastically optimal manner. Second, we consider the possibility of long-term interdependence of the features over space and time. This allows us to adaptively evolve the feature correspondences by observing the system performance over a time window, and correct for errors in the similarity computations. Third, we show that the above two conditions can be addressed by treating the issue of tracking in a camera network as an optimization problem in a stochastic adaptive system. We show results on data collected by a large camera network. The proposed approach is particularly suitable for distributed processing over the entire network.


International Conference on Distributed Smart Cameras | 2008

Decentralized camera network control using game theory

Bi Song; Cristian Soto; Amit K. Roy-Chowdhury; Jay A. Farrell

This paper deals with the problem of decentralized, cooperative control of a camera network. We focus on applications where events unfold over a large geographic area and need to be analyzed by multiple cameras or other kinds of imaging sensors. There is no central unit accumulating and analyzing all the data. The overall goal is to keep track of all objects (i.e., targets) in the region of deployment of the cameras, while selectively focusing at a high resolution on some particular target features based on application requirements. Efficient usage of resources in such a scenario requires that the cameras be active. However, this control cannot be based on separate analysis of the sensed video in each camera. They must act collaboratively to be able to acquire multiple targets at different resolutions. Our research focuses on developing accurate and efficient target acquisition and camera control algorithms in such scenarios using game theory. We show simulated experimental results of the approach.
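A toy sketch of the game-theoretic flavor of this control problem: each camera repeatedly best-responds by choosing the target that most improves a shared utility. The utility values and the best-response loop below are illustrative assumptions, not the paper's formulation:

```python
cameras = ["cam0", "cam1"]
targets = ["t0", "t1"]

# utility[(camera, target)]: how well that camera can image that target
utility = {("cam0", "t0"): 0.9, ("cam0", "t1"): 0.3,
           ("cam1", "t0"): 0.5, ("cam1", "t1"): 0.8}

def global_utility(assignment):
    """Credit each target once, at the best utility among cameras covering it,
    so redundant coverage of one target earns nothing extra."""
    covered = {}
    for cam, tgt in assignment.items():
        covered[tgt] = max(covered.get(tgt, 0.0), utility[(cam, tgt)])
    return sum(covered.values())

def best_response(rounds=5):
    assignment = {cam: "t0" for cam in cameras}  # every camera starts on t0
    for _ in range(rounds):
        for cam in cameras:
            # each camera picks the target that maximizes the shared utility,
            # holding the other cameras' choices fixed
            assignment[cam] = max(
                targets, key=lambda t: global_utility({**assignment, cam: t}))
    return assignment

print(best_response())  # {'cam0': 't0', 'cam1': 't1'}
```

Because duplicated coverage earns no extra reward, the cameras spread themselves over the targets without any central coordinator, which is the essence of the decentralized control the paper develops.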

Collaboration

Bi Song's top co-authors:

Jay A. Farrell, University of California
Chong Ding, University of California
Chia-Chih Chen, University of Texas at Austin
Cristian Soto, University of California
Ertem Tuncel, University of California
Jake K. Aggarwal, University of Texas at Austin