Publications


Featured research published by Neal Checka.


International Conference on Acoustics, Speech, and Signal Processing | 2004

Multiple person and speaker activity tracking with a particle filter

Neal Checka; Kevin W. Wilson; Michael R. Siracusa; Trevor Darrell

In this paper, we present a system that combines sound and vision to track multiple people. In a cluttered or noisy scene, multi-person tracking estimates have a distinctly non-Gaussian distribution. We apply a particle filter with audio and video state components, and derive observation likelihood methods based on both audio and video measurements. Our state includes the number of people present, their positions, and whether each person is talking. We show experiments in an environment with sparse microphones and monocular cameras. Our results show that our system can accurately track the locations and speech activity of a varying number of people.
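The core loop of such a tracker is compact. Below is a minimal sketch of a particle filter with joint audio/video weighting; the Gaussian likelihoods, noise scales, and detections are invented stand-ins for the paper's observation models, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500  # particle count

# Each particle carries a planar position (x, y) and a "speaking" flag.
pos = rng.uniform(0.0, 5.0, size=(N, 2))
speaking = rng.random(N) < 0.5
weights = np.full(N, 1.0 / N)

def video_likelihood(pos, visual_fix):
    # Illustrative stand-in: Gaussian score around a visual detection.
    d2 = np.sum((pos - visual_fix) ** 2, axis=1)
    return np.exp(-0.5 * d2 / 0.3 ** 2)

def audio_likelihood(pos, speaking, audio_fix):
    # Illustrative stand-in: audio informs only particles marked speaking.
    d2 = np.sum((pos - audio_fix) ** 2, axis=1)
    return np.where(speaking, np.exp(-0.5 * d2 / 0.8 ** 2), 0.5)

for visual_fix, audio_fix in [((2.0, 2.1), (2.2, 1.9)),
                              ((2.1, 2.3), (2.0, 2.2))]:
    pos = pos + rng.normal(scale=0.1, size=pos.shape)   # random-walk dynamics
    speaking ^= rng.random(N) < 0.05                    # occasional state flips
    weights = weights * video_likelihood(pos, np.array(visual_fix)) \
                      * audio_likelihood(pos, speaking, np.array(audio_fix))
    weights /= weights.sum()
    idx = rng.choice(N, size=N, p=weights)              # importance resampling
    pos, speaking = pos[idx], speaking[idx]
    weights = np.full(N, 1.0 / N)

print("estimated position:", pos.mean(axis=0))
```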


IEEE International Conference on Automatic Face and Gesture Recognition | 2002

Fast stereo-based head tracking for interactive environments

Louis-Philippe Morency; Ali Rahimi; Neal Checka; Trevor Darrell

We present a robust implementation of stereo-based head tracking designed for interactive environments with uncontrolled lighting. We integrate fast face detection and drift reduction algorithms with a gradient-based stereo rigid motion tracking technique. Our system can automatically segment and track a user's head under large rotation and illumination variations. Precision and usability of our approach are compared with previous tracking methods for cursor control and target selection in both desktop and interactive room environments.
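The rigid-motion component can be illustrated with the classic least-squares alignment of two 3-D point sets (the Kabsch/Procrustes solution). This is a simplified stand-in: the paper's tracker is gradient-based and works on image and depth data directly rather than on explicit point correspondences.

```python
import numpy as np

def rigid_motion(P, Q):
    """Least-squares R, t with Q_i ~ R @ P_i + t (Kabsch/Procrustes).
    A simplified stand-in for gradient-based rigid motion estimation
    between stereo frames."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t

# Toy usage: recover a known small head rotation from 3-D points.
rng = np.random.default_rng(1)
P = rng.normal(size=(100, 3))
a = 0.1
Rz = np.array([[np.cos(a), -np.sin(a), 0],
               [np.sin(a),  np.cos(a), 0],
               [0, 0, 1]])
Q = P @ Rz.T + np.array([0.02, 0.0, 0.01])
R, t = rigid_motion(P, Q)
print(np.allclose(R, Rz, atol=1e-6))  # True
```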


Computer Vision and Pattern Recognition | 2003

A Probabilistic Framework for Multi-modal Multi-Person Tracking

Neal Checka; Kevin W. Wilson; Vibhav Rangarajan; Trevor Darrell

In this paper, we present a probabilistic tracking framework that combines sound and vision to achieve more robust and accurate tracking of multiple objects. In a cluttered or noisy scene, our measurements have a non-Gaussian, multi-modal distribution. We apply a particle filter to track multiple people using combined audio and video observations. We have applied our algorithm to the domain of tracking people with a stereo-based visual foreground detection algorithm and audio localization using a beamforming technique. Our model also accurately reflects the number of people present. We test the efficacy of our system on a sequence of multiple people moving and speaking in an indoor environment.
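On the audio side, beamforming-based localization is often implemented as a steered-response-power search: delay-and-sum the microphone signals toward each candidate position and keep the position with the most energy. A minimal one-dimensional sketch follows; the geometry, sample rate, and signals are invented for illustration.

```python
import numpy as np

fs, c = 16000, 343.0                       # sample rate (Hz), speed of sound (m/s)
mics = np.array([0.0, 0.5, 1.0, 1.5])     # mic x-positions on a line (m)

# Invented data: a source at x = 1.2 m, i.e. the same burst arriving
# at each microphone with a distance-dependent delay.
src_x = 1.2
burst = np.random.default_rng(2).normal(size=512)
sig = np.zeros((len(mics), 1024))
for i, m in enumerate(mics):
    d = int(round(abs(src_x - m) / c * fs))
    sig[i, d:d + 512] = burst

def srp(x):
    """Steered response power at candidate position x: undo each
    microphone's propagation delay, sum the channels, measure energy."""
    out = np.zeros(1024)
    for i, m in enumerate(mics):
        d = int(round(abs(x - m) / c * fs))
        out[:1024 - d] += sig[i, d:]
    return np.sum(out ** 2)

grid = np.linspace(0.0, 2.0, 201)
print("SRP peak at x =", grid[np.argmax([srp(x) for x in grid])])
```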


Workshop on Applications of Computer Vision | 2002

Activity maps for location-aware computing

David Demirdjian; Konrad Tollmar; Kimberle Koile; Neal Checka; Trevor Darrell

Location-based context is important for many applications. Previous systems offered only coarse room-level features or used manually specified room regions to determine fine-scale features. We propose a location context mechanism based on activity maps, which define regions of similar context based on observations of 3-D patterns of location and motion in an environment. We describe an algorithm for obtaining activity maps using the spatio-temporal clustering of visual tracking data. We show how the recovered maps correspond to regions for common tasks in the environment and describe their use in some applications.
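A toy version of the clustering step: treat each tracked observation as an (x, y, t) point and cluster, so that dense spatio-temporal clusters become map regions. The sketch below uses scikit-learn's DBSCAN and invented data as an illustrative stand-in for the paper's clustering method.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(3)

# Invented tracking data: dwell near a desk and near a couch, plus
# scattered walking points. Columns: x (m), y (m), t (s).
desk  = rng.normal([1.0, 1.0, 10.0], [0.1, 0.1, 2.0], size=(200, 3))
couch = rng.normal([4.0, 2.5, 40.0], [0.1, 0.1, 2.0], size=(200, 3))
walk  = rng.uniform([0, 0, 0], [5, 3, 60], size=(60, 3))
obs = np.vstack([desk, couch, walk])

# Scale time so one second of separation counts like ~0.1 m.
features = obs * np.array([1.0, 1.0, 0.1])

labels = DBSCAN(eps=0.3, min_samples=10).fit_predict(features)
for k in sorted(set(labels) - {-1}):
    region = obs[labels == k]
    print(f"activity region {k}: centroid {region[:, :2].mean(axis=0)}")
```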


Ubiquitous Computing | 2002

Face-Responsive Interfaces: From Direct Manipulation to Perceptive Presence

Trevor Darrell; Konrad Tollmar; Frank Bentley; Neal Checka; Louis-Philippe Morency; Ali Rahimi; Alice H. Oh

Systems for tracking faces using computer vision have recently become practical for human-computer interface applications. We are developing prototype systems for face-responsive interaction, exploring three different interface paradigms: direct manipulation, gaze-mediated agent dialog, and perceptually-driven remote presence. We consider the characteristics of these types of interactions, and assess the performance of our system on each application. We have found that face pose tracking is a potentially accurate means of cursor control and selection, is seen by users as a natural way to guide agent dialog interaction, and can be used to create perceptually-driven presence artefacts which convey real-time awareness of a remote space.


Workshop on Perceptive User Interfaces | 2001

Audio-video array source separation for perceptual user interfaces

Kevin W. Wilson; Neal Checka; David Demirdjian; Trevor Darrell

Steerable microphone arrays provide a flexible infrastructure for audio source separation. In order for them to be used effectively in perceptual user interfaces, there must be a mechanism in place for steering the focus of the array to the sound source. Audio-only steering techniques often perform poorly in the presence of multiple sound sources or strong reverberation. Video-only techniques can achieve high spatial precision but require that the audio and video subsystems be accurately calibrated to preserve this precision. We present an audio-video localization technique that combines the benefits of the two modalities. We implement our technique in a test environment containing multiple stereo cameras and a room-sized microphone array. Our technique achieves an 8.9 dB improvement over a single far-field microphone and a 6.7 dB improvement over source separation based on video-only localization.
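The reported gains come from coherent summation: an aligned signal adds linearly across M microphones while independent noise adds only in power, giving up to 10*log10(M) dB of improvement. The sketch below shows that arithmetic with invented, idealized signals; the paper's measured gains reflect real room acoustics and localization error.

```python
import numpy as np

rng = np.random.default_rng(4)
M, n = 8, 16000                      # mic count and samples, illustrative
signal = rng.normal(size=n)          # common source, already time-aligned
noise = rng.normal(size=(M, n))      # independent noise at each mic

mic = signal + noise                 # M noisy copies of the source
single_snr = 10 * np.log10(np.mean(signal**2) / np.mean(noise[0]**2))

beam = mic.mean(axis=0)              # delay-and-sum (delays already applied)
resid = beam - signal                # what survives of the noise
beam_snr = 10 * np.log10(np.mean(signal**2) / np.mean(resid**2))

print(f"gain: {beam_snr - single_snr:.1f} dB "
      f"(ideal: {10 * np.log10(M):.1f} dB)")
```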


International Conference on Multimodal Interfaces | 2002

Audiovisual arrays for untethered spoken interfaces

Kevin W. Wilson; Vibhav Rangarajan; Neal Checka; Trevor Darrell

When faced with a distant speaker at a known location in a noisy environment, a microphone array can provide a significantly improved audio signal for speech recognition. Estimating the location of a speaker in a reverberant environment from audio information alone can be quite difficult, so we use an array of video cameras to aid localization. Stereo processing techniques are used on pairs of cameras, and foreground 3-D points are grouped to estimate the trajectory of people as they move in an environment. These trajectories are used to guide a microphone array beamformer. Initial results using this system for speech recognition demonstrate increased recognition rates compared to non-array processing techniques.
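Once the tracker supplies a 3-D position, steering the array amounts to computing each microphone's propagation delay to that point and advancing its signal accordingly. A sketch with invented geometry and integer-sample delays for simplicity:

```python
import numpy as np

c, fs = 343.0, 16000                       # speed of sound (m/s), sample rate (Hz)

def steering_delays(src, mic_pos):
    """Per-microphone delays (in samples) for a tracked 3-D source
    position, relative to the nearest microphone."""
    dist = np.linalg.norm(mic_pos - src, axis=1)
    return np.round((dist - dist.min()) / c * fs).astype(int)

def delay_and_sum(signals, delays):
    """Align each channel by its delay and average (delay-and-sum)."""
    n = signals.shape[1] - delays.max()
    return np.mean([s[d:d + n] for s, d in zip(signals, delays)], axis=0)

# Invented 4-mic array and a tracked position from the vision system.
mic_pos = np.array([[0.0, 0, 0], [0.5, 0, 0], [1.0, 0, 0], [1.5, 0, 0]])
src = np.array([1.2, 2.0, 1.5])
print("per-mic delays (samples):", steering_delays(src, mic_pos))
```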


Human-Robot Interaction | 2012

Handheld operator control unit

Neal Checka; Shawn Schaffert; David Demirdjian; Jan Falkowski; Daniel H. Grollman

Currently, unmanned vehicles support soldiers in a variety of military applications. Typically, a specially trained user teleoperates these platforms using a large and bulky Operator Control Unit (OCU). The operator's total attention is required to control the tedious, low-level aspects of the platform, dramatically reducing their personal situational awareness. Furthermore, these OCUs are both platform and mission specific. Ideally, a soldier could instead carry lightweight, portable, multi-purpose devices that act as OCUs for multiple platform/mission scenarios. These devices would support a standard set of OCU functionality (e.g., driving a ground robot) and additional higher-level task operations (e.g., autonomously patrolling an area). This extended abstract presents the development of apps for a handheld platform that enable both low- and high-level control of an unmanned vehicle.


IEEE International Conference on Technologies for Homeland Security | 2015

AESOP: Adaptive Event detection SOftware using Programming by example

Ashwin Thangali; Harsha Prasad; Sai Kethamakka; David Demirdjian; Neal Checka

This paper presents AESOP, a software tool for automatic event detection in video. AESOP employs a supervised learning approach for constructing event models, given training examples from different event classes. A trajectory-based formulation is used for modeling events, with an aim towards invariance to changes in camera location and orientation. The proposed formulation is designed to accommodate events that involve interactions between two or more entities over an extended period of time. AESOP's event models are formulated as HMMs to improve the event detection algorithm's robustness to noise in the input data and to achieve computationally efficient algorithms for event model training and event detection. AESOP's performance is demonstrated on a wide range of scenarios, including stationary camera surveillance and aerial video footage captured in land and maritime environments.
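Detection with trained models then reduces to evaluating each event class's HMM on a trajectory feature sequence and picking the best log-likelihood. Below is a minimal forward-algorithm sketch with discrete observation symbols; the two event models and the quantized features are invented for illustration, since AESOP's models are learned from examples.

```python
import numpy as np

def log_forward(obs, log_pi, log_A, log_B):
    """Log-likelihood of an observation sequence under a discrete HMM,
    via the forward algorithm computed in log space for stability."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        alpha = log_B[:, o] + \
            np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
    return np.logaddexp.reduce(alpha)

# Two invented 2-state event models over 3 quantized trajectory
# symbols (0 = "approach", 1 = "stop", 2 = "depart").
def model(A, B):
    pi = np.array([0.9, 0.1])
    return np.log(pi), np.log(np.array(A)), np.log(np.array(B))

meet  = model([[0.7, 0.3], [0.1, 0.9]], [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]])
pass_ = model([[0.9, 0.1], [0.4, 0.6]], [[0.5, 0.1, 0.4], [0.2, 0.2, 0.6]])

seq = np.array([0, 0, 1, 1, 1, 1])   # approach then stop: looks like "meet"
scores = {name: log_forward(seq, *m)
          for name, m in [("meet", meet), ("pass", pass_)]}
print(max(scores, key=scores.get), scores)
```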


International Conference on Computer Vision | 2001

Plan-view trajectory estimation with dense stereo background models

Trevor Darrell; David Demirdjian; Neal Checka; Pedro F. Felzenszwalb

Collaboration


Dive into Neal Checka's collaborations.

Top Co-Authors

Trevor Darrell, University of California
David Demirdjian, Massachusetts Institute of Technology
Kevin W. Wilson, Massachusetts Institute of Technology
Vibhav Rangarajan, Massachusetts Institute of Technology
Konrad Tollmar, Royal Institute of Technology
Kimberle Koile, Massachusetts Institute of Technology