Robert T. Collins | Researchain

Publication

Featured research published by Robert T. Collins.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

Online selection of discriminative tracking features

Robert T. Collins; Yanxi Liu; Marius Leordeanu

This paper presents an online feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance. Our hypothesis is that the features that best discriminate between object and background are also best for tracking the object. Given a set of seed features, we compute log likelihood ratios of class conditional sample densities from object and background to form a new set of candidate features tailored to the local object/background discrimination task. The two-class variance ratio is used to rank these new features according to how well they separate sample distributions of object and background pixels. This feature evaluation mechanism is embedded in a mean-shift tracking system that adaptively selects the top-ranked discriminative features for tracking. Examples are presented that demonstrate how this method adapts to changing appearances of both tracked object and scene background. We note susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter and develop an additional approach that seeks to minimize the likelihood of distraction.
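The ranking step described in this abstract can be illustrated with a short sketch. It assumes each candidate feature has already been reduced to a pair of normalized histograms (object and background); the function names, the clipping constant `delta`, and the small `eps` guard are illustrative choices, not details taken from the abstract.

```python
import numpy as np

def log_likelihood_ratio(p_obj, p_bg, delta=1e-3):
    """Per-bin log likelihood ratio of object vs. background histograms.

    delta (an assumed constant) keeps empty bins from producing log(0).
    """
    p = np.maximum(np.asarray(p_obj, dtype=float), delta)
    q = np.maximum(np.asarray(p_bg, dtype=float), delta)
    return np.log(p / q)

def variance_under(L, a):
    """Variance of the values L(i) under the discrete distribution a(i)."""
    mean = np.sum(a * L)
    return np.sum(a * L ** 2) - mean ** 2

def variance_ratio(p_obj, p_bg, delta=1e-3, eps=1e-12):
    """Two-class variance ratio of the log likelihood ratio feature.

    High values mean the two classes are far apart (large total variance)
    while each class is tightly clustered (small within-class variance).
    """
    p = np.asarray(p_obj, dtype=float)
    q = np.asarray(p_bg, dtype=float)
    L = log_likelihood_ratio(p, q, delta)
    total = variance_under(L, 0.5 * (p + q))
    within = variance_under(L, p) + variance_under(L, q)
    return total / (within + eps)  # eps avoids division by zero

def rank_features(candidates):
    """Sort feature names by descending variance ratio.

    candidates maps a feature name to (object_hist, background_hist).
    """
    scores = {name: variance_ratio(p, q) for name, (p, q) in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

A tracker built along these lines would recompute the histograms periodically and keep only the top few names returned by `rank_features`.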

Computer Vision and Pattern Recognition | 2003

Mean-shift blob tracking through scale space

Robert T. Collins

The mean-shift algorithm is an efficient technique for tracking 2D blobs through an image. Although the scale of the mean-shift kernel is a crucial parameter, there is presently no clean mechanism for choosing or updating scale while tracking blobs that are changing in size. We adapt Lindeberg's (1998) theory of feature scale selection based on local maxima of differential scale-space filters to the problem of selecting kernel scale for mean-shift blob tracking. We show that a difference of Gaussian (DOG) mean-shift kernel enables efficient tracking of blobs through scale space. Using this kernel requires generalizing the mean-shift algorithm to handle images that contain negative sample weights.
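For readers unfamiliar with the basic iteration, here is a minimal sketch of a weighted mean-shift update in the spatial domain only. It uses a plain Gaussian kernel rather than the DOG kernel of the paper, and it normalizes by the sum of absolute weights so that the step stays well defined if some weights are negative. That normalization, the function names, and the parameters are assumptions of this sketch, not a statement of the paper's exact formulation.

```python
import numpy as np

def mean_shift_step(x, points, weights, bandwidth):
    """One weighted mean-shift update from position x.

    points:  (n, d) array of sample locations
    weights: (n,) array of sample weights, which may be negative
    """
    diffs = points - x
    kernel = np.exp(-0.5 * np.sum(diffs ** 2, axis=1) / bandwidth ** 2)
    kw = kernel * weights
    denom = np.sum(np.abs(kw))  # absolute values: assumed handling of signed weights
    if denom == 0.0:
        return x
    return x + (kw[:, None] * diffs).sum(axis=0) / denom

def mean_shift(x0, points, weights, bandwidth, tol=1e-6, max_iter=200):
    """Iterate mean-shift steps until the update is smaller than tol."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = mean_shift_step(x, points, weights, bandwidth)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

With all weights non-negative, the update reduces to the standard kernel-weighted mean of the samples.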

Proceedings of the IEEE | 2001

Algorithms for cooperative multisensor surveillance

Robert T. Collins; Alan J. Lipton; Hironobu Fujiyoshi; Takeo Kanade

The Video Surveillance and Monitoring (VSAM) team at Carnegie Mellon University (CMU) has developed an end-to-end, multicamera surveillance system that allows a single human operator to monitor activities in a cluttered environment using a distributed network of active video sensors. Video understanding algorithms have been developed to automatically detect people and vehicles, seamlessly track them using a network of cooperating active sensors, determine their three-dimensional locations with respect to a geospatial site model, and present this information to a human operator who controls the system through a graphical user interface. The goal is to automatically collect and disseminate real-time information to improve the situational awareness of security providers and decision makers. The feasibility of real-time video surveillance has been demonstrated within a multicamera testbed system developed on the campus of CMU. This paper presents an overview of the issues and algorithms involved in creating this semiautonomous, multicamera surveillance system.

IEEE International Conference on Automatic Face and Gesture Recognition | 2002

Silhouette-based human identification from body shape and gait

Robert T. Collins; Ralph Gross; Jianbo Shi

Our goal is to establish a simple baseline method for human identification based on body shape and gait. This baseline recognition method provides a lower bound against which to evaluate more complicated procedures. We present a viewpoint-dependent technique based on template matching of body silhouettes. Cyclic gait analysis is performed to extract key frames from a test sequence. These frames are compared to training frames using normalized correlation, and subject classification is performed by nearest-neighbor matching among correlation scores. The approach implicitly captures biometric shape cues such as body height, width, and body-part proportions, as well as gait cues such as stride length and amount of arm swing. We evaluate the method on four databases with varying viewing angles, background conditions (indoors and outdoors), walking styles and pixels on target.
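The matching stage can be sketched as follows, assuming key frames have already been extracted and cropped to silhouettes of equal size. How per-frame scores are combined into a subject score is not specified in the abstract; averaging each test frame's best correlation is an assumption made here, and all names are illustrative.

```python
import numpy as np

def normalized_correlation(a, b):
    """Zero-mean normalized correlation between two equally sized images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(test_frames, gallery):
    """Nearest-neighbor subject classification over correlation scores.

    gallery maps a subject label to a list of training key frames.
    Score per subject (assumed rule): mean over test frames of the best
    correlation against any of that subject's training frames.
    """
    best_subject, best_score = None, -np.inf
    for subject, train_frames in gallery.items():
        score = np.mean([
            max(normalized_correlation(t, g) for g in train_frames)
            for t in test_frames
        ])
        if score > best_score:
            best_subject, best_score = subject, score
    return best_subject
```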

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2000

Introduction to the special section on video surveillance

Robert T. Collins; Alan J. Lipton; Takeo Kanade

Automated video surveillance addresses real-time observation of people and vehicles within a busy environment, leading to a description of their actions and interactions. The technical issues include moving object detection and tracking, object classification, human motion analysis, and activity understanding, touching on many of the core topics of computer vision, pattern analysis, and artificial intelligence. Video surveillance has spawned large research projects in the United States, Europe, and Japan, and has been the topic of several international conferences and workshops in recent years. There are immediate needs for automated surveillance systems in commercial, law enforcement, and military applications. Mounting video cameras is cheap, but finding available human resources to observe the output is expensive. Although surveillance cameras are already prevalent in banks, stores, and parking lots, video data currently is used only “after the fact” as a forensic tool, thus losing its primary benefit as an active, real-time medium. What is needed is continuous 24-hour monitoring of surveillance video to alert security officers to a burglary in progress or to a suspicious individual loitering in the parking lot, while there is still time to prevent the crime. In addition to the obvious security applications, video surveillance technology has been proposed to measure traffic flow, detect accidents on highways, monitor pedestrian congestion in public spaces, compile consumer demographics in shopping malls and amusement parks, log routine maintenance tasks at nuclear facilities, and count endangered species. The numerous military applications include patrolling national borders, measuring the flow of refugees in troubled areas, monitoring peace treaties, and providing secure perimeters around bases and embassies. The 11 papers in this special section illustrate topics and techniques at the forefront of video surveillance research.
These papers can be loosely organized into three categories. Detection and tracking involves real-time extraction of moving objects from video and continuous tracking over time to form persistent object trajectories. C. Stauffer and W.E.L. Grimson introduce unsupervised statistical learning techniques to cluster object trajectories produced by adaptive background subtraction into descriptions of normal scene activity. Viewpoint-specific trajectory descriptions from multiple cameras are combined into a common scene coordinate system using a calibration technique described by L. Lee, R. Romano, and G. Stein, who automatically determine the relative exterior orientation of overlapping camera views by observing a sparse set of moving objects on flat terrain. Two papers address the accumulation of noisy motion evidence over time. R. Pless, T. Brodský, and Y. Aloimonos detect and track small objects in aerial video sequences by first compensating for the self-motion of the aircraft, then accumulating residual normal flow to acquire evidence of independent object motion. L. Wixson notes that motion in the image does not always signify purposeful travel by an independently moving object (examples of such “motion clutter” are wind-blown tree branches and sun reflections off rippling water) and devises a flow-based salience measure to highlight objects that tend to move in a consistent direction over time.
Human motion analysis is concerned with detecting periodic motion signifying a human gait and acquiring descriptions of human body pose over time. R. Cutler and L.S. Davis plot an object’s self-similarity across all pairs of frames to form distinctive patterns that classify bipedal, quadripedal, and rigid object motion. Y. Ricquebourg and P. Bouthemy track apparent contours in XT slices of an XYT sequence volume to robustly delineate and track articulated human body structure. I. Haritaoglu, D. Harwood, and L.S. Davis present W4, a surveillance system specialized to the task of looking at people. The W4 system can locate people and segment their body parts, build simple appearance models for tracking, disambiguate between and separately track multiple individuals in a group, and detect carried objects such as boxes and backpacks.
Activity analysis deals with parsing temporal sequences of object observations to produce high-level descriptions of agent actions and multiagent interactions. In our opinion, this will be the most important area of future research in video surveillance. N.M. Oliver, B. Rosario, and A.P. Pentland introduce Coupled Hidden Markov Models (CHMMs) to detect and classify interactions consisting of two interleaved agent action streams and present a training method based on synthetic agents to address the problem of parameter estimation from limited real-world training examples. M. Brand and V. Kettnaker present an entropy-minimization approach to estimating HMM topology and

International Journal of Computer Vision | 1988

The Schema System

Bruce A. Draper; Robert T. Collins; John Brolio; Allen R. Hanson; Edward M. Riseman

The schema system embodies a knowledge-based approach to scene interpretation. Low-level routines are applied to extract image descriptors called tokens, and these tokens are further organized by intermediate-level routines into more abstract structures that can be associated with object instances. The thousands of tokens that are extracted from an image can be grouped in a combinatorially explosive manner. Therefore, knowledge in the schema system is not limited to the descriptions of objects; it includes information about how each object can be recognized. Object schemas control the invocation and execution of the low-level and intermediate-level routines with the goal of forming hypotheses about objects in the scene. The system described produces image interpretations based on two-dimensional reasoning, although nothing in the system organization and control strategies precludes the inclusion of three-dimensional information. The schema framework exploits coarse-grained parallelism in a cooperative interpretation process. Schema instances run concurrently, and an object schema often has available a variety of strategies for identification, each one invoking knowledge sources to gather support for the presence of a hypothesized object. Inter-schema communication is carried out asynchronously through a global blackboard. In this way schema instances cooperate to identify and locate the significant objects present in the scene.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012

Vision-Based Analysis of Small Groups in Pedestrian Crowds

Weina Ge; Robert T. Collins; R. B. Ruback

Building upon state-of-the-art algorithms for pedestrian detection and multi-object tracking, and inspired by sociological models of human collective behavior, we automatically detect small groups of individuals who are traveling together. These groups are discovered by bottom-up hierarchical clustering using a generalized, symmetric Hausdorff distance defined with respect to pairwise proximity and velocity. We validate our results quantitatively and qualitatively on videos of real-world pedestrian scenes. Where human-coded ground truth is available, we find substantial statistical agreement between our results and the human-perceived small group structure of the crowd. Results from our automated crowd analysis also reveal interesting patterns governing the shape of pedestrian groups. These discoveries complement current research in crowd dynamics, and may provide insights to improve evacuation planning and real-time situation awareness during public disturbances.
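The grouping idea can be sketched with bottom-up agglomerative clustering. This sketch uses the classic symmetric Hausdorff distance (not the generalized variant the abstract mentions), and the pairwise measure, a position gap plus a weighted velocity gap, along with the `threshold` and `velocity_weight` parameters, are assumptions chosen for illustration.

```python
import numpy as np

def pairwise_distance(a, b, velocity_weight=1.0):
    """Distance between two tracked people given as (x, y, vx, vy)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    position_gap = np.linalg.norm(a[:2] - b[:2])
    velocity_gap = np.linalg.norm(a[2:] - b[2:])
    return position_gap + velocity_weight * velocity_gap

def hausdorff(group_a, group_b, states, velocity_weight=1.0):
    """Classic symmetric Hausdorff distance between two groups of indices."""
    d = np.array([[pairwise_distance(states[i], states[j], velocity_weight)
                   for j in group_b] for i in group_a])
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def cluster_groups(states, threshold, velocity_weight=1.0):
    """Merge the closest pair of groups until none is within threshold."""
    groups = [[i] for i in range(len(states))]
    while len(groups) > 1:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = hausdorff(groups[i], groups[j], states, velocity_weight)
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups
```

In this sketch, two people walking side by side in the same direction merge into one group, while a nearby person walking the opposite way stays separate because of the velocity term.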

International Conference on Computer Vision | 1999

Three-dimensional scene flow

Sundar Vedula; Simon Baker; Peter Rander; Robert T. Collins; Takeo Kanade

Scene flow is the three-dimensional motion field of points in the world, just as optical flow is the two-dimensional motion field of points in an image. Any optical flow is simply the projection of the scene flow onto the image plane of a camera. We present a framework for the computation of dense, non-rigid scene flow from optical flow. Our approach leads to straightforward linear algorithms and a classification of the task into three major scenarios: complete instantaneous knowledge of the scene structure; knowledge only of correspondence information; and no knowledge of the scene structure. We also show that multiple estimates of the normal flow cannot be used to estimate dense scene flow directly without some form of smoothing or regularization.
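The relationship stated above, that optical flow is the projection of scene flow, leads to a small linear system when the 3D point is known. The sketch below assumes calibrated pinhole cameras given as 3x4 projection matrices and a known 3D point, roughly the scenario with full knowledge of scene structure. Function names and the least-squares formulation are illustrative assumptions, not the paper's code.

```python
import numpy as np

def projection_jacobian(P, X):
    """2x3 Jacobian of the image point u = proj(P, X) with respect to X."""
    P = np.asarray(P, dtype=float)
    Xh = np.append(np.asarray(X, dtype=float), 1.0)
    num = P[:2] @ Xh          # numerators of the two image coordinates
    w = P[2] @ Xh             # projective depth
    return (P[:2, :3] * w - np.outer(num, P[2, :3])) / w ** 2

def scene_flow(cameras, X, optical_flows):
    """Least-squares 3D motion of point X from its 2D flow in each camera.

    Each camera contributes two linear equations: flow_i = J_i @ dX/dt,
    so at least two cameras are needed to determine the three unknowns.
    """
    A = np.vstack([projection_jacobian(P, X) for P in cameras])
    b = np.concatenate([np.asarray(f, dtype=float) for f in optical_flows])
    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    return solution
```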

First ACM SIGMM International Workshop on Video Surveillance | 2003

A master-slave system to acquire biometric imagery of humans at distance

Xuhui Zhou; Robert T. Collins; Takeo Kanade; Peter Metes

The Distant Human Identification (DHID) system is a master-slave, real-time surveillance system designed to acquire biometric imagery of humans at distance. A stationary wide field of view master camera is used to monitor an environment at distance. When the master camera detects a moving person, a narrow field of view slave camera is commanded to turn to that direction, acquire the target human, and track them while recording zoomed-in images. These zoomed-in views provide meaningful biometric imagery of the distant humans, who are not recognizable in the master view. Based on the lenses we currently use, the system can detect and track moving people at distances up to 50 meters, within a 60° field of regard.

Computer Vision and Pattern Recognition | 2013

Multi-target Tracking by Lagrangian Relaxation to Min-cost Network Flow

Asad A. Butt; Robert T. Collins

We propose a method for global multi-target tracking that can incorporate higher-order track smoothness constraints such as constant velocity. Our problem formulation readily lends itself to path estimation in a trellis graph, but unlike previous methods, each node in our network represents a candidate pair of matching observations between consecutive frames. Extra constraints on binary flow variables in the graph result in a problem that can no longer be solved by min-cost network flow. We therefore propose an iterative solution method that relaxes these extra constraints using Lagrangian relaxation, resulting in a series of problems that ARE solvable by min-cost flow, and that progressively improve towards a high-quality solution to our original optimization problem. We present experimental results showing that our method outperforms the standard network-flow formulation as well as other recent algorithms that attempt to incorporate higher-order smoothness constraints.
