
Publication


Featured research published by Yuri A. Ivanov.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2000

Recognition of visual activities and interactions by stochastic parsing

Yuri A. Ivanov; Aaron F. Bobick

This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple interacting objects.
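The two-level idea above can be illustrated with a toy sketch: independent detectors propose low-level events with confidences, and a stochastic grammar rescores candidate event sequences, letting rule probabilities disambiguate an uncertain detection. All symbols, probabilities, and grammar rules below are invented for illustration; the paper's actual system uses a full stochastic context-free parser, not this flat enumeration.

```python
# Detector outputs: for each time slot, candidate events with confidences.
detections = [
    {"APPROACH": 0.9, "PASS_BY": 0.1},   # slot 0: clear
    {"MEET": 0.55, "PASS_BY": 0.45},     # slot 1: ambiguous on its own
]

# Stochastic grammar, flattened to terminal sequences for brevity:
#   INTERACTION -> APPROACH MEET     [0.7]
#   INTERACTION -> PASS_BY PASS_BY   [0.3]
grammar = {
    ("APPROACH", "MEET"): 0.7,
    ("PASS_BY", "PASS_BY"): 0.3,
}

def best_parse(detections, grammar):
    """Score each production as rule probability x detector likelihoods."""
    best, best_score = None, 0.0
    for seq, rule_p in grammar.items():
        score = rule_p
        for slot, sym in zip(detections, seq):
            score *= slot.get(sym, 0.0)
        if score > best_score:
            best, best_score = seq, score
    return best, best_score

seq, score = best_parse(detections, grammar)
print(seq, score)   # the grammar's prior disambiguates the uncertain slot 1
```

Note how the ambiguous second slot (0.55 vs. 0.45) is resolved by the longer-range structural prior, which is the core benefit the paper claims for grammar-level constraints.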


Presence: Teleoperators and Virtual Environments | 1999

The KidsRoom: A Perceptually-Based Interactive and Immersive Story Environment

Aaron F. Bobick; Stephen S. Intille; James W. Davis; Freedom Baird; Claudio S. Pinhanez; Lee W. Campbell; Yuri A. Ivanov; Arjan Schütte; Andrew D. Wilson

The KidsRoom is a perceptually-based, interactive, narrative playspace for children. Images, music, narration, light, and sound effects are used to transform a normal child's bedroom into a fantasy land where children are guided through a reactive adventure story. The fully automated system was designed with the following goals: (1) to keep the focus of user action and interaction in the physical and not virtual space; (2) to permit multiple, collaborating people to simultaneously engage in an interactive experience combining both real and virtual objects; (3) to use computer-vision algorithms to identify activity in the space without requiring the participants to wear any special clothing or devices; (4) to use narrative to constrain the perceptual recognition, and to use perceptual recognition to allow participants to drive the narrative; and (5) to create a truly immersive and interactive room environment. We believe the KidsRoom is the first multi-person, fully-automated, interactive, narrative environment ever constructed using non-encumbering sensors. This paper describes the KidsRoom, the technology that makes it work, and the issues that were raised during the system's development. A demonstration of the project, which complements the material presented here and includes videos, images, and sounds from each part of the story, is available online.


International Journal of Computer Vision | 1998

Fast lighting independent background subtraction

Yuri A. Ivanov; Aaron F. Bobick; John Liu

This paper describes a simple method of fast background subtraction based upon disparity verification that is invariant to arbitrarily rapid run-time changes in illumination. Using two or more cameras, the method requires the off-line construction of disparity fields mapping the primary background image to each of the additional auxiliary background images. At runtime, segmentation is performed by checking color intensity values at corresponding pixels. If more than two cameras are available, more robust segmentation can be achieved and, in particular, occlusion shadows can generally be eliminated as well. Because the method only assumes fixed background geometry, the technique allows for illumination variation at runtime. Since no disparity search is performed, the algorithm is easily implemented in real-time on conventional hardware.
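A minimal sketch of the disparity-verification idea, under simplifying assumptions the paper does not make: grayscale images, a one-dimensional integer horizontal disparity, and an auxiliary view that sees only background. The point is that the warp uses a precomputed field (no runtime disparity search), and pixels whose primary/auxiliary values disagree are declared foreground. All image contents, sizes, and thresholds here are invented.

```python
import numpy as np

H, W = 4, 6
# Off-line calibration product: per-pixel shift mapping primary -> auxiliary.
# Assumed constant here; in general it is a full disparity field.
disparity = -np.ones((H, W), dtype=int)

def segment(primary, auxiliary, disparity, tau=30):
    """Foreground = pixels whose primary and disparity-warped auxiliary
    intensities disagree by more than tau (no disparity search at runtime)."""
    cols = np.clip(np.arange(W) + disparity, 0, W - 1)   # per-pixel warp
    rows = np.arange(H)[:, None]
    warped = auxiliary[rows, cols]
    return np.abs(primary.astype(int) - warped.astype(int)) > tau

# Synthetic background texture; auxiliary camera sees it shifted by one pixel.
bg = np.tile(np.arange(W) * 20, (H, 1)).astype(np.uint8)
auxiliary = np.roll(bg, -1, axis=1)
primary = bg.copy()
primary[1:3, 2:4] = 200          # runtime foreground object in the primary view

mask = segment(primary, auxiliary, disparity)
```

Because only intensities at precomputed correspondences are compared, a global illumination change that affects both views similarly leaves the mask unchanged, which is the invariance the abstract claims.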


Computer Vision and Pattern Recognition | 1998

Action recognition using probabilistic parsing

Aaron F. Bobick; Yuri A. Ivanov

A new approach to the recognition of temporal behaviours and activities is presented. The fundamental idea, inspired by work in speech recognition, is to divide the inference problem into two levels. The lower level is performed using standard independent probabilistic temporal event detectors such as hidden Markov models (HMMs) to propose candidate detections of low level temporal features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer range temporal constraints, disambiguate uncertain low level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. To achieve such a system we provide techniques for generating a discrete symbol stream from continuous low level detectors and for enforcing temporal exclusion constraints during parsing. We demonstrate the approach in several experiments using both visual and other sensing data.
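The step of "generating a discrete symbol stream from continuous low level detectors" can be sketched as follows: each detector produces a running score per frame, and a symbol is emitted only at a local peak above a threshold, with at most one symbol per frame (the strongest detector wins). Detector names, scores, and the threshold are invented; the paper's actual mechanism uses HMM likelihoods and richer temporal exclusion during parsing.

```python
scores = {
    "wave":  [0.1, 0.4, 0.9, 0.5, 0.2, 0.1],
    "point": [0.2, 0.3, 0.7, 0.8, 0.3, 0.1],
}

def symbol_stream(scores, thresh=0.6):
    """Emit (frame, symbol, confidence) at local score peaks above thresh,
    keeping only the strongest detector per frame (temporal exclusion)."""
    n = len(next(iter(scores.values())))
    stream = []
    for t in range(1, n - 1):
        # candidates: detectors at a local maximum above the threshold
        cands = [(s[t], name) for name, s in scores.items()
                 if s[t] >= thresh and s[t - 1] < s[t] >= s[t + 1]]
        if cands:
            p, name = max(cands)       # exclusion: one symbol per frame
            stream.append((t, name, p))
    return stream

print(symbol_stream(scores))
```

The resulting stream of discrete, time-stamped symbols is exactly the kind of input the stochastic parser described above consumes.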


IEEE Workshop on Visual Surveillance | 1999

Video surveillance of interactions

Yuri A. Ivanov; Chris Stauffer; Aaron F. Bobick; W.E.L. Grimson

This paper describes an automatic surveillance system, which performs labeling of events and interactions in an outdoor environment. The system is designed to monitor activities in an open parking lot. It consists of three components: an adaptive tracker; an event generator, which maps object tracks onto a set of pre-determined discrete events; and a stochastic parser. The system performs segmentation and labeling of surveillance video of a parking lot and identifies person-vehicle interactions, such as pick-up and drop-off. The system presented in this paper was developed jointly by the MIT Media Lab and the MIT Artificial Intelligence Lab.
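The event-generator component can be illustrated with a small sketch: raw tracker output (time-stamped positions plus an object class) is reduced to discrete, parser-friendly events. The event names, thresholds, and track data here are invented, not the paper's actual event vocabulary.

```python
def track_events(track, cls, still_eps=0.5):
    """Map a track of (t, x, y) samples to discrete events:
    enter at the first sample, stop when speed drops below still_eps,
    leave at the last sample."""
    events = [(track[0][0], f"{cls}-enter")]
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        speed = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 / (t1 - t0)
        if speed < still_eps:
            events.append((t1, f"{cls}-stop"))
            break                      # report only the first stop, for brevity
    events.append((track[-1][0], f"{cls}-leave"))
    return events

# A car drives in, then comes to rest (e.g., parking before a pick-up).
car = [(0, 0, 0), (1, 10, 0), (2, 10.2, 0), (3, 10.2, 0)]
print(track_events(car, "car"))
```

Sequences of such events (car-stop followed by person-leave near the same location, say) are what the stochastic parser then labels as higher-level interactions like pick-up and drop-off.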


International Conference on Pattern Recognition | 2004

Probabilistic combination of multiple modalities to detect interest

Ashish Kapoor; Rosalind W. Picard; Yuri A. Ivanov

This paper describes a new approach to combining multiple modalities and applies it to the problem of affect recognition. The problem is posed as a combination of classifiers in a probabilistic framework that naturally explains the concepts of experts and critics. Each channel of data has an associated expert that generates beliefs about the correct class. Probabilistic models of error and the critics, which predict the performance of the expert on the current input, are used to combine the experts' beliefs about the correct class. The method is applied to detect the affective state of interest using information from the face, postures, and the task the subjects are performing. The classification using multiple modalities achieves a recognition accuracy of 67.8%, outperforming the classification using individual modalities. Further, the proposed combination scheme achieves the greatest reduction in error when compared with other classifier combination methods.
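The expert-critic structure can be sketched as a reliability-weighted mixture: each modality's expert outputs a class posterior, a critic predicts that expert's reliability on the current input, and the fused belief is the critic-weighted average. This is a simplification, assuming a plain mixture rather than the paper's full probabilistic error model; the modality names, posteriors, and critic values are invented.

```python
import numpy as np

experts = {                    # each expert's P(class | modality) over
    "face":    np.array([0.8, 0.2]),   # {interest, boredom}
    "posture": np.array([0.4, 0.6]),
    "task":    np.array([0.7, 0.3]),
}
critics = {"face": 0.9, "posture": 0.3, "task": 0.6}  # predicted reliability

def combine(experts, critics):
    """Fuse expert posteriors as a critic-weighted mixture."""
    w = np.array([critics[m] for m in experts], dtype=float)
    w /= w.sum()                              # normalize critic weights
    beliefs = np.stack([experts[m] for m in experts])
    return w @ beliefs                        # weighted mixture of posteriors

p = combine(experts, critics)
print(p)   # face and task carry more weight, so "interest" wins
```

The critic matters when modalities disagree: here the posture expert votes for boredom, but its low predicted reliability keeps it from overturning the face and task experts.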


International Conference on Computer Vision | 1999

Recognition of multi-agent interaction in video surveillance

Yuri A. Ivanov; Aaron F. Bobick

This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. A complete system consisting of an adaptive tracker, an event generator and the parser performs segmentation and labelling of a surveillance video of a parking lot; the system correctly identifies activities such as pick-up and drop-off, which involve person-vehicle interactions. The main contributions of this paper are extending the parsing algorithm to handle multi-agent interactions within a single parser, providing a general mechanism for consistency-based pruning, and developing an efficient incremental parsing algorithm.


Communications of the ACM | 2000

Perceptual user interfaces: the KidsRoom

Aaron F. Bobick; Stephen S. Intille; James W. Davis; Freedom Baird; Claudio S. Pinhanez; Lee W. Campbell; Yuri A. Ivanov; Arjan Schütte; Andrew D. Wilson

The KidsRoom is a fully automated and interactive narrative playspace for children developed at the MIT Media Laboratory. Built to explore the design of perceptually based interactive interfaces, the KidsRoom uses computer vision action recognition simultaneously with computerized control of images, video, light, music, sound, and narration to guide children through a storybook adventure. Unlike most previous work in interactive environments, the KidsRoom does not require people in the space to wear any special clothing or hardware, and it can accommodate up to four people simultaneously. The system was designed to use computational perception to keep most interaction in the real, physical space even as participants interacted with virtual characters and scenes.

The KidsRoom, designed in the spirit of several popular children's books, is an interactive child's bedroom that stimulates imagination by responding to actions with images and sound to transform itself into a storybook world. Two of the bedroom walls resemble the real walls in a child's room, complete with real furniture, posters, and windows. The other two walls are large, back-projected video screens used to transform the appearance of the room environment. Four speakers and one amplifier project steerable sound effects, music, and narration into the space. Three video cameras overlooking the space provide input to computer vision people-tracking and action recognition algorithms. Computer-controlled theatrical lighting illuminates the space, and a microphone detects the volume of enthusiastic screams. The room is fully automated.

During the story, children interact with objects in the room, with one another, and with virtual creatures projected onto the walls. Perceptual recognition makes it possible for the room to respond to the physical actions of the children by appropriately moving the story forward, thereby creating a compelling interactive narrative experience. Conversely, the narrative context of the story makes it easier to develop context-dependent (and therefore more robust) action recognition algorithms.

The story developed for the KidsRoom begins with a normal-looking bedroom. Children enter after being told to find out the magic word by asking the talking furniture that speaks when approached. When the children scream the magic word loudly, sounds and images transform the room into a mystical forest. The story narration prods the children to stay in a group and follow a path to a river (see the stone path (a) in the figure). Along the way, they encounter roaring monsters and must hide behind the bed to make the roars …


Robotics and Autonomous Systems | 2002

Solving weak transduction with EM

Yuri A. Ivanov; Bruce Blumberg

In this paper we describe an algorithm designed for learning the perceptual organization of an autonomous agent. The learning algorithm performs incremental clustering of a perceptual input under reward. The distribution of the input samples is modeled by a Gaussian mixture density, which serves as a state space for the policy learning algorithm. The agent learns to select actions in response to the presented stimuli simultaneously with estimating the parameters of the input mixture density. The feedback from the environment is given to the agent in the form of a scalar value, or reward, which represents the utility of a particular clustering configuration for the action selection. The setting of the learning task makes it impossible to use supervised or partially supervised techniques to estimate the parameters of the input density. The paper introduces the notion of weak transduction and shows a solution to it using an EM-based framework.
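A minimal sketch of the flavor of this setup: EM for a one-dimensional Gaussian mixture in which a scalar reward rescales the E-step responsibilities, so high-reward stimuli carry more weight in the M-step updates. The data, rewards, and the specific reward-weighting rule below are invented for illustration; the paper's weak-transduction formulation is considerably richer.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated stimulus clusters; stimuli on the right pay more reward.
x = np.concatenate([rng.normal(-2, 0.5, 50), rng.normal(2, 0.5, 50)])
reward = np.where(x > 0, 1.0, 0.5)

# Initial mixture parameters (two components).
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

for _ in range(20):                                   # EM iterations
    # E-step: Gaussian responsibilities, then scaled by per-sample reward
    d = (x[:, None] - mu) ** 2
    g = pi * np.exp(-d / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    r = g / g.sum(axis=1, keepdims=True)
    r = r * reward[:, None]                           # reward-weighted credit
    # M-step: reward-weighted mean / variance / mixing updates
    n = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / n
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n)
    pi = n / n.sum()

print(np.round(mu, 2))   # components settle near the two stimulus modes
```

Because the reward enters the expected sufficient statistics rather than the labels, the update needs no class supervision at all, which is the point of the weak-transduction setting.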


International Conference on Computer Graphics and Interactive Techniques | 2002

Integrated learning for interactive synthetic characters

Bruce Blumberg; Marc Downie; Yuri A. Ivanov; Matt Berlin; Michael Patrick Johnson; Bill Tomlinson

Collaboration


Dive into Yuri A. Ivanov's collaborations.

Top Co-Authors

Aaron F. Bobick
Georgia Institute of Technology

Bruce Blumberg
Massachusetts Institute of Technology

Freedom Baird
Massachusetts Institute of Technology

Lee W. Campbell
Massachusetts Institute of Technology

Alex Pentland
Massachusetts Institute of Technology

Christopher R. Wren
Mitsubishi Electric Research Laboratories