Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Aaron F. Bobick is active.

Publication


Featured research published by Aaron F. Bobick.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2001

The recognition of human movement using temporal templates

Aaron F. Bobick; James W. Davis

A view-based approach to the representation and recognition of human movement is presented. The basis of the representation is a temporal template: a static vector-image where the vector value at each point is a function of the motion properties at the corresponding spatial location in an image sequence. Using aerobics exercises as a test domain, we explore the representational power of a simple, two-component version of the templates: the first value is a binary value indicating the presence of motion, and the second value is a function of the recency of motion in a sequence. We then develop a recognition method that matches temporal templates against stored instances of views of known actions. The method automatically performs temporal segmentation, is invariant to linear changes in speed, and runs in real time on standard platforms.
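
The two template components described above are often called the motion-energy image (MEI) and the motion-history image (MHI). Below is a minimal per-frame update in Python, assuming simple frame differencing; the threshold and decay values are illustrative stand-ins, not the paper's exact choices.

```python
import numpy as np

def update_templates(prev_frame, frame, mhi, tau=30, diff_thresh=25):
    """One step of the two-component temporal template update.

    Returns the binary motion-energy image (MEI) for this step and the
    updated motion-history image (MHI), where brighter pixels mark more
    recent motion.
    """
    # Binary motion mask from simple frame differencing (an assumption;
    # any background-subtraction scheme could supply this mask).
    motion = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)) > diff_thresh
    # MHI: moving pixels are set to tau; all others decay by one per frame,
    # making each pixel value a function of the recency of motion.
    mhi = np.where(motion, tau, np.maximum(mhi - 1, 0))
    # MEI: any pixel that moved within the last tau frames.
    mei = mhi > 0
    return mei, mhi
```

Recognition then compares moment-based descriptors of the accumulated templates (the paper uses Hu moments) against stored views of known actions.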


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2000

Recognition of visual activities and interactions by stochastic parsing

Yuri A. Ivanov; Aaron F. Bobick

This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple interacting objects.
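
To make the two-level structure concrete, here is a toy Python sketch with a hypothetical "drop-off" grammar and made-up detector likelihoods. The paper builds on a more general Earley-style probabilistic parser; this sketch uses CYK over a grammar in Chomsky normal form purely to stay short.

```python
from collections import defaultdict

# Hypothetical surveillance grammar in Chomsky normal form: a DROPOFF
# is an ENTER event, then a STOP, then a LEAVE. Probabilities are made up.
RULES_BINARY = {            # A -> B C, with probability p
    ("DROPOFF", ("APPROACH", "LEAVE")): 1.0,
    ("APPROACH", ("ENTER", "STOP")): 1.0,
}
RULES_TERMINAL = {          # A -> terminal, with probability p
    ("ENTER", "enter"): 1.0,
    ("STOP", "stop"): 1.0,
    ("LEAVE", "leave"): 1.0,
}

def cyk_parse(events):
    """Parse a stream of (symbol, likelihood) detector outputs.

    Detector likelihoods are multiplied into the rule probabilities, so an
    uncertain low-level detection lowers the score of every higher-level
    activity that depends on it.
    """
    n = len(events)
    # chart[i][span] maps nonterminal -> best probability over [i, i + span)
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n)]
    for i, (symbol, likelihood) in enumerate(events):
        for (lhs, term), p in RULES_TERMINAL.items():
            if term == symbol:
                chart[i][1][lhs] = p * likelihood
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                for (lhs, (b, c)), p in RULES_BINARY.items():
                    score = p * chart[i][split][b] * chart[i + split][span - split][c]
                    chart[i][span][lhs] = max(chart[i][span][lhs], score)
    return dict(chart[0][n])

# cyk_parse([("enter", 0.9), ("stop", 0.6), ("leave", 0.8)])
# -> {'DROPOFF': 0.432}
```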


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1999

Parametric hidden Markov models for gesture recognition

Andrew D. Wilson; Aaron F. Bobick

A method for the representation, recognition, and interpretation of parameterized gesture is presented. By parameterized gesture we mean gestures that exhibit a systematic spatial variation; one example is a point gesture where the relevant parameter is the two-dimensional direction. Our approach is to extend the standard hidden Markov model method of gesture recognition by including a global parametric variation in the output probabilities of the HMM states. Using a linear model of dependence, we formulate an expectation-maximization (EM) method for training the parametric HMM. During testing, a similar EM algorithm simultaneously maximizes the output likelihood of the PHMM for the given sequence and estimates the quantifying parameters. Using visually derived and directly measured three-dimensional hand position measurements as input, we present results that demonstrate the recognition superiority of the PHMM over standard HMM techniques, as well as greater robustness in parameter estimation with respect to noise in the input features. Finally, we extend the PHMM to handle arbitrary smooth (nonlinear) dependencies. The nonlinear formulation requires the use of a generalized expectation-maximization (GEM) algorithm for both training and the simultaneous recognition of the gesture and estimation of the value of the parameter. We present results on a pointing gesture, where the nonlinear approach permits the natural spherical coordinate parameterization of pointing direction.
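
The core of the linear PHMM is that one global parameter vector theta shifts every state's output mean: mu_j(theta) = W_j theta + mu_j. A minimal Python sketch of the state output likelihoods follows, with assumed array shapes; the full method wraps these likelihoods in EM for training and for test-time estimation of theta.

```python
import numpy as np
from scipy.stats import multivariate_normal

def phmm_output_likelihoods(obs, theta, W, mu0, cov):
    """Per-state output likelihoods of a linear parametric HMM.

    Assumed shapes: obs (T, d), theta (k,), W (n_states, d, k),
    mu0 (n_states, d), cov (n_states, d, d).
    """
    n_states = mu0.shape[0]
    b = np.empty((obs.shape[0], n_states))
    for j in range(n_states):
        mean_j = W[j] @ theta + mu0[j]  # parameter-dependent state mean
        b[:, j] = multivariate_normal.pdf(obs, mean=mean_j, cov=cov[j])
    # Feed b into the standard forward-backward recursions; testing
    # alternates forward-backward with re-estimation of theta (EM).
    return b
```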


International Conference on Computer Graphics and Interactive Techniques | 2005

Texture optimization for example-based synthesis

Vivek Kwatra; Irfan A. Essa; Aaron F. Bobick; Nipun Kwatra

We present a novel technique for texture synthesis using optimization. We define a Markov Random Field (MRF)-based similarity metric for measuring the quality of synthesized texture with respect to a given input sample. This allows us to formulate the synthesis problem as minimization of an energy function, which is optimized using an Expectation Maximization (EM)-like algorithm. In contrast to most example-based techniques that do region-growing, ours is a joint optimization approach that progressively refines the entire texture. Additionally, our approach is ideally suited to allow for controllable synthesis of textures. Specifically, we demonstrate controllability by animating image textures using flow fields. We allow for general two-dimensional flow fields that may dynamically change over time. Applications of this technique include dynamic texturing of fluid animations and texture-based flow visualization.
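
A compact Python sketch of one E/M alternation on a grayscale image. The patch size, step, and brute-force nearest-neighbor search are simplifications; the paper works on color textures at multiple scales with accelerated neighbor search.

```python
import numpy as np

def texture_optimization_step(output, exemplar, patch=8, step=4):
    """One alternation of the EM-like texture optimization loop.

    E-step: each overlapping output patch finds its nearest exemplar patch
    under the L2 metric. M-step: the least-squares minimizer of the summed
    patch energies sets every output pixel to the average of all matched
    patch pixels that cover it.
    """
    H, W = output.shape
    eh, ew = exemplar.shape
    cands = np.array([exemplar[i:i + patch, j:j + patch]
                      for i in range(eh - patch + 1)
                      for j in range(ew - patch + 1)])
    flat = cands.reshape(len(cands), -1).astype(np.float64)
    acc = np.zeros(output.shape)
    cnt = np.zeros(output.shape)
    for i in range(0, H - patch + 1, step):
        for j in range(0, W - patch + 1, step):
            q = output[i:i + patch, j:j + patch].reshape(-1)
            best = cands[np.argmin(((flat - q) ** 2).sum(axis=1))]
            acc[i:i + patch, j:j + patch] += best
            cnt[i:i + patch, j:j + patch] += 1
    # Pixels never covered by a patch keep their current value.
    return np.where(cnt > 0, acc / np.maximum(cnt, 1), output)
```

Iterating this step progressively refines the whole texture at once, which is what distinguishes the approach from region-growing synthesis.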


International Journal of Computer Vision | 1999

Large Occlusion Stereo

Aaron F. Bobick; Stephen S. Intille

A method for solving the stereo matching problem in the presence of large occlusion is presented. A data structure—the disparity space image—is defined to facilitate the description of the effects of occlusion on the stereo matching process and in particular on dynamic programming (DP) solutions that find matches and occlusions simultaneously. We significantly improve upon existing DP stereo matching methods by showing that while some cost must be assigned to unmatched pixels, sensitivity to occlusion-cost and algorithmic complexity can be significantly reduced when highly-reliable matches, or ground control points, are incorporated into the matching process. The use of ground control points eliminates both the need for biasing the process towards a smooth solution and the task of selecting critical prior probabilities describing image formation. Finally, we describe how the detection of intensity edges can be used to bias the recovered solution such that occlusion boundaries will tend to be proposed along such edges, reflecting the observation that occlusion boundaries usually cause intensity discontinuities.
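
A minimal scanline version of the dynamic-programming formulation in Python. The squared-difference match cost and the fixed occlusion penalty are illustrative; ground control points, which the paper uses to pin reliable matches and reduce sensitivity to the occlusion cost, are omitted for brevity.

```python
import numpy as np

def scanline_stereo_cost(left, right, occ=5.0):
    """DP over one scanline: pixels either match or are occluded.

    `left` and `right` are 1-D intensity arrays. Builds the disparity-space
    costs, then runs a sequence-alignment DP where skipping a pixel in
    either image incurs the occlusion penalty `occ`.
    """
    n, m = len(left), len(right)
    # dsi[i, j]: cost of matching left pixel i to right pixel j
    dsi = (left[:, None].astype(np.float64) - right[None, :]) ** 2
    cost = np.empty((n + 1, m + 1))
    cost[0, :] = occ * np.arange(m + 1)
    cost[:, 0] = occ * np.arange(n + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = min(cost[i - 1, j - 1] + dsi[i - 1, j - 1],  # match
                             cost[i - 1, j] + occ,   # left pixel occluded
                             cost[i, j - 1] + occ)   # right pixel occluded
    # Backtracking from cost[n, m] recovers matches and occlusion runs.
    return cost
```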


Computer Vision and Pattern Recognition | 1997

The representation and recognition of human movement using temporal templates

James W. Davis; Aaron F. Bobick

A new view-based approach to the representation and recognition of action is presented. The basis of the representation is a temporal template: a static vector-image where the vector value at each point is a function of the motion properties at the corresponding spatial location in an image sequence. Using 18 aerobics exercises as a test domain, we explore the representational power of a simple, two-component version of the templates: the first value is a binary value indicating the presence of motion, and the second value is a function of the recency of motion in a sequence. We then develop a recognition method which matches these temporal templates against stored instances of views of known actions. The method automatically performs temporal segmentation, is invariant to linear changes in speed, and runs in real time on a standard platform. We recently incorporated this technique into the KidsRoom: an interactive, narrative play-space for children.


Presence: Teleoperators and Virtual Environments | 1999

The KidsRoom: A Perceptually-Based Interactive and Immersive Story Environment

Aaron F. Bobick; Stephen S. Intille; James W. Davis; Freedom Baird; Claudio S. Pinhanez; Lee W. Campbell; Yuri A. Ivanov; Arjan Schütte; Andrew D. Wilson

The KidsRoom is a perceptually-based, interactive, narrative playspace for children. Images, music, narration, light, and sound effects are used to transform a normal child's bedroom into a fantasy land where children are guided through a reactive adventure story. The fully automated system was designed with the following goals: (1) to keep the focus of user action and interaction in the physical and not virtual space; (2) to permit multiple, collaborating people to simultaneously engage in an interactive experience combining both real and virtual objects; (3) to use computer-vision algorithms to identify activity in the space without requiring the participants to wear any special clothing or devices; (4) to use narrative to constrain the perceptual recognition, and to use perceptual recognition to allow participants to drive the narrative; and (5) to create a truly immersive and interactive room environment. We believe the KidsRoom is the first multi-person, fully-automated, interactive, narrative environment ever constructed using non-encumbering sensors. This paper describes the KidsRoom, the technology that makes it work, and the issues that were raised during the system's development. A demonstration of the project, which complements the material presented here and includes videos, images, and sounds from each part of the story, is available online.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1997

A state-based approach to the representation and recognition of gesture

Aaron F. Bobick; Andrew D. Wilson

A state-based technique for the representation and recognition of gesture is presented. We define a gesture to be a sequence of states in a measurement or configuration space. For a given gesture, these states are used to capture both the repeatability and variability evidenced in a training set of example trajectories. Using techniques for computing a prototype trajectory of an ensemble of trajectories, we develop methods for defining configuration states along the prototype and for recognizing gestures from an unsegmented, continuous stream of sensor data. The approach is illustrated by application to a range of gesture-related sensory data: the two-dimensional movements of a mouse input device, the movement of the hand measured by a magnetic spatial position and orientation sensor, and, lastly, the changing eigenvector projection coefficients computed from an image sequence.
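
A toy Python recognizer in the spirit of the description above. The states are modeled here as fixed balls (center, radius) along the prototype, which is a deliberate simplification of the states the paper derives from ensembles of training trajectories.

```python
import numpy as np

class StateSequenceRecognizer:
    """Fires when streaming samples traverse the gesture's states in order."""

    def __init__(self, centers, radii):
        self.centers = np.asarray(centers, dtype=np.float64)
        self.radii = np.asarray(radii, dtype=np.float64)
        self.cursor = 0  # index of the next state to visit

    def update(self, sample):
        """Feed one sample from the unsegmented, continuous sensor stream."""
        d = np.linalg.norm(np.asarray(sample) - self.centers[self.cursor])
        if d <= self.radii[self.cursor]:
            self.cursor += 1
            if self.cursor == len(self.centers):
                self.cursor = 0
                return True  # full state sequence traversed: gesture detected
        return False

# Hypothetical 2-D "swipe right" gesture: three states along the x-axis.
swipe = StateSequenceRecognizer(centers=[(0, 0), (5, 0), (10, 0)],
                                radii=[2.0, 2.0, 2.0])
```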


Computer Vision and Pattern Recognition | 2001

Gait recognition using static, activity-specific parameters

Aaron F. Bobick; Amos Y. Johnson

A gait-recognition technique that recovers static body and stride parameters of subjects as they walk is presented. This approach is an example of an activity-specific biometric: a method of extracting identifying properties of an individual or of an individual's behavior that is applicable only when a person is performing that specific action. To evaluate our parameters, we derive an expected confusion metric (related to mutual information), as opposed to reporting a percent correct with a limited database. This metric predicts how well a given feature vector will filter identity in a large population. We test the utility of a variety of body and stride parameters recovered in different viewing conditions on a database consisting of 15 to 20 subjects walking at both an angled and frontal-parallel view with respect to the camera, both indoors and out. We also analyze motion-capture data of the subjects to discover whether confusion in the parameters is inherently a physical or a visual measurement-error property.
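
As a rough illustration of what an expected-confusion score measures, the Python sketch below assumes Gaussian within-subject and population densities and compares their generalized variances; the exact formula is an assumed reconstruction, not the paper's.

```python
import numpy as np

def expected_confusion(features, labels):
    """Ratio of within-subject to population generalized variance.

    `features` is (n_samples, d); `labels` gives the subject identity of
    each row (several samples per subject are assumed). Values near 0 mean
    the features filter identity well; values near 1 mean a single
    individual's variation spans most of the population.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    pop_cov = np.cov(features, rowvar=False)
    subjects = np.unique(labels)
    within = np.zeros_like(pop_cov)
    for s in subjects:
        within += np.cov(features[labels == s], rowvar=False)
    within /= len(subjects)
    # Ratio of Gaussian density volumes (square root of determinant ratio).
    return np.sqrt(np.linalg.det(within) / np.linalg.det(pop_cov))
```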


Computer Vision and Pattern Recognition | 1997

Real-time closed-world tracking

Stephen S. Intille; James W. Davis; Aaron F. Bobick

A real-time tracking algorithm that uses contextual information is described. The method is capable of simultaneously tracking multiple, non-rigid objects when erratic movement and object collisions are common. A closed-world assumption is used to adaptively select and weight image features used for correspondence. Results of algorithm testing and the limitations of the method are discussed. The algorithm has been used to track children in an interactive, narrative playspace.
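
A small Python sketch of the context-driven idea: features are weighted by how well they separate the objects currently known to be in the closed world, so the tracker leans on whatever distinguishes these particular objects. The variance-based weighting is an assumed stand-in for the paper's selection scheme.

```python
import numpy as np

def contextual_feature_weights(object_features):
    """Weight matching features using only the current closed world.

    `object_features` is (n_objects, n_features), e.g. mean color and size
    for each object known to be in the scene. Features that spread the
    current objects apart get high weight; shared features get low weight.
    """
    spread = np.var(object_features, axis=0)
    return spread / spread.sum()

def match_score(candidate, tracked, weights):
    """Weighted distance used for frame-to-frame correspondence."""
    diff = np.asarray(candidate, dtype=np.float64) - np.asarray(tracked, dtype=np.float64)
    return np.sqrt(np.sum(weights * diff ** 2))
```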

Collaboration


Dive into Aaron F. Bobick's collaborations.

Top Co-Authors

Amos Y. Johnson, Georgia Institute of Technology
Irfan A. Essa, Georgia Institute of Technology
Yuri A. Ivanov, Massachusetts Institute of Technology
James M. Rehg, Georgia Institute of Technology
Christopher F. Barnes, Georgia Institute of Technology
Lee W. Campbell, Massachusetts Institute of Technology