Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Camille Monnier is active.

Publication


Featured research published by Camille Monnier.


International Conference on Computer Vision | 2013

Randomized Ensemble Tracking

Qinxun Bai; Zheng Wu; Stan Sclaroff; Margrit Betke; Camille Monnier

We propose a randomized ensemble algorithm to model the time-varying appearance of an object for visual tracking. In contrast with previous online methods for updating classifier ensembles in tracking-by-detection, the weight vector that combines weak classifiers is treated as a random variable and the posterior distribution for the weight vector is estimated in a Bayesian manner. In essence, the weight vector is treated as a distribution that reflects the confidence among the weak classifiers used to construct and adapt the classifier ensemble. The resulting formulation models the time-varying discriminative ability among weak classifiers so that the ensembled strong classifier can adapt to the varying appearance, backgrounds, and occlusions. The formulation is tested in a tracking-by-detection implementation. Experiments on 28 challenging benchmark videos demonstrate that the proposed method can achieve results comparable to and often better than those of state-of-the-art approaches.
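For intuition, here is a minimal NumPy sketch of the general idea of keeping a distribution over the ensemble weights rather than a point estimate; the Dirichlet posterior and the self-supervised update below are simplifying assumptions for illustration, not the paper's actual Bayesian formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_score(weak_scores, alpha, n_samples=50):
    """Average ensemble response over weight vectors drawn from an assumed
    Dirichlet posterior, so the combination reflects confidence in each weak
    classifier rather than a single fixed weighting."""
    w = rng.dirichlet(alpha, size=n_samples)     # (n_samples, n_weak)
    return (w @ weak_scores).mean()              # expected strong-classifier score

def update_posterior(alpha, weak_scores, label, lr=1.0):
    """Reward weak classifiers that agreed with the tracker's own decision by
    increasing their Dirichlet pseudo-counts (a crude self-supervised update)."""
    agreement = (np.sign(weak_scores) == label).astype(float)
    return alpha + lr * agreement

# toy usage: five weak classifiers scoring one candidate window per frame
alpha = np.ones(5)                               # uniform prior over weights
for frame in range(3):
    weak_scores = rng.uniform(-1, 1, size=5)     # stand-in weak-classifier outputs
    score = ensemble_score(weak_scores, alpha)
    label = 1 if score > 0 else -1               # tracking-by-detection decision
    alpha = update_posterior(alpha, weak_scores, label)
    print(f"frame {frame}: score={score:+.3f}, alpha={alpha}")
```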


European Conference on Computer Vision | 2014

A Multi-scale Boosted Detector for Efficient and Robust Gesture Recognition

Camille Monnier; Stan German; Andrey Ost

We present an approach to detecting and recognizing gestures in a stream of multi-modal data. Our approach combines a sliding-window gesture detector with features drawn from skeleton data, color imagery, and depth data produced by a first-generation Kinect sensor. The detector consists of a set of one-versus-all boosted classifiers, each tuned to a specific gesture. Features are extracted at multiple temporal scales, and include descriptive statistics of normalized skeleton joint positions, angles, and velocities, as well as image-based hand descriptors. The full set of gesture detectors may be trained in under two hours on a single machine, and is extremely efficient at runtime, operating at 1700 fps using only skeletal data, or at 100 fps using fused skeleton and image features. Our method achieved a Jaccard Index score of 0.834 on the ChaLearn-2014 Gesture Recognition Test dataset, and was ranked 2nd overall in the competition.
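As a rough illustration of the sliding-window, one-versus-all scheme, the sketch below trains toy per-gesture classifiers on window-level statistics of synthetic skeleton data and scans a stream at two temporal scales; the feature set and the scikit-learn gradient-boosting classifier are simplified stand-ins for the boosted detectors described above, and all data are synthetic.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def window_features(joints):
    """Descriptive statistics of joint positions and velocities over one
    temporal window (joints: frames x joints x 3); length is scale-invariant."""
    vel = np.diff(joints, axis=0)
    stats = [joints.mean(0), joints.std(0), vel.mean(0), vel.std(0)]
    return np.concatenate([s.ravel() for s in stats])

rng = np.random.default_rng(0)
n_gestures, window = 3, 20

# toy training windows: labels 0..2 are gestures, -1 marks background motion
X, y = [], []
for label in list(range(n_gestures)) + [-1] * n_gestures:
    for _ in range(10):
        X.append(window_features(rng.normal(size=(window, 15, 3)) + max(label, 0)))
        y.append(label)
X, y = np.array(X), np.array(y)

# one one-versus-all boosted classifier per gesture
detectors = {g: GradientBoostingClassifier(n_estimators=50).fit(X, (y == g).astype(int))
             for g in range(n_gestures)}

# runtime: slide windows over a new stream at two temporal scales
stream = rng.normal(size=(100, 15, 3))
for scale in (20, 40):
    for start in range(0, len(stream) - scale + 1, scale // 2):
        f = window_features(stream[start:start + scale]).reshape(1, -1)
        scores = {g: clf.predict_proba(f)[0, 1] for g, clf in detectors.items()}
        # a gesture would be reported when max(scores.values()) exceeds a threshold
```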


International Conference on Document Analysis and Recognition | 2005

Sequential correction of perspective warp in camera-based documents

Camille Monnier; Vitaly Ablavsky; Steve Holden; Magnus Snorrason

Documents captured with hand-held devices, such as digital cameras, often exhibit perspective warp artifacts. These artifacts pose problems for OCR systems, which at best can only handle in-plane rotation. We propose a method for recovering the planar appearance of an input document image by examining the vertical rate of change in scale of features in the document. Our method makes fewer assumptions about the document structure than do previously published algorithms.
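A toy version of the underlying idea, assuming feature scale (for example, text-line height) varies linearly with image row, might look like the following; the trapezoid model and the OpenCV warp are illustrative simplifications, not the paper's algorithm.

```python
import numpy as np
import cv2

def correct_vertical_warp(image, scale_top, scale_bottom):
    """If features near the top of the page appear scale_top times their true
    size and features near the bottom appear scale_bottom times, undo the
    corresponding keystone distortion with a perspective warp (assumed model:
    scale varies linearly with row)."""
    h, w = image.shape[:2]
    # apparent half-widths of the page at top and bottom rows
    top_half = 0.5 * w * (scale_top / max(scale_top, scale_bottom))
    bot_half = 0.5 * w * (scale_bottom / max(scale_top, scale_bottom))
    cx = w / 2.0
    src = np.float32([[cx - top_half, 0], [cx + top_half, 0],
                      [cx + bot_half, h], [cx - bot_half, h]])
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, M, (w, h))

# usage on a synthetic page image: features at the top appear 20% smaller
page = np.full((400, 300), 255, dtype=np.uint8)
flattened = correct_vertical_warp(page, scale_top=0.8, scale_bottom=1.0)
```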


IEEE International Conference on Automatic Face and Gesture Recognition | 2015

A random forest approach to segmenting and classifying gestures

Ajjen Joshi; Camille Monnier; Margrit Betke; Stan Sclaroff

This work investigates a gesture segmentation and recognition scheme that employs a random forest classification model. Our method trains a random forest model to recognize gestures from a given vocabulary, as presented in a training dataset of video plus 3D body joint locations, as well as out-of-vocabulary (non-gesture) instances. Given an input video stream, our trained model is applied to candidate gestures using sliding windows at multiple temporal scales. The class label with the highest classifier confidence is selected, and its corresponding scale is used to determine the segmentation boundaries in time. We evaluated our formulation in segmenting and recognizing gestures from two different benchmark datasets: the NATOPS dataset of 9,600 gesture instances from a vocabulary of 24 aircraft handling signals, and the ChaLearn dataset of 7,754 gesture instances from a vocabulary of 20 Italian communication gestures. The performance of our method compares favorably with state-of-the-art methods that employ Hidden Markov Models or Hidden Conditional Random Fields on the NATOPS dataset.
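The sketch below illustrates the multi-scale sliding-window spotting scheme with a scikit-learn random forest on synthetic data: every window is classified, non-gesture windows are ignored, and the scale of the most confident window sets the segmentation boundaries. Features, data, and the number of classes are toy stand-ins, not the paper's descriptors.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

def describe(window):
    """Fixed-length descriptor of a variable-length window of joint data."""
    return np.concatenate([window.mean(0), window.std(0)])

# toy training set: classes 0..4 are gestures, class 5 is the non-gesture class
X = rng.normal(size=(600, 30))
y = rng.integers(0, 6, size=600)
forest = RandomForestClassifier(n_estimators=100).fit(X, y)

# scan a joint-data stream with sliding windows at several temporal scales
stream = rng.normal(size=(300, 15))
best = (None, 0.0, None)                       # (label, confidence, (start, end))
for scale in (20, 40, 60):
    for start in range(0, len(stream) - scale, scale // 4):
        probs = forest.predict_proba(describe(stream[start:start + scale]).reshape(1, -1))[0]
        label = int(np.argmax(probs))
        if label != 5 and probs[label] > best[1]:
            best = (label, probs[label], (start, start + scale))

label, conf, bounds = best
if bounds is not None:   # the scale of the winning window determines the boundaries
    print(f"gesture {label} with confidence {conf:.2f} at frames {bounds}")
```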


Image and Vision Computing | 2017

Comparing random forest approaches to segmenting and classifying gestures

Ajjen Joshi; Camille Monnier; Margrit Betke; Stan Sclaroff

A complete gesture recognition system should localize and classify each gesture from a given gesture vocabulary, within a continuous video stream. In this work, we compare two approaches: a method that performs the tasks of temporal segmentation and classification simultaneously with another that performs the tasks sequentially. The first method trains a single random forest model to recognize gestures from a given vocabulary, as presented in a training dataset of video plus 3D body joint locations, as well as out-of-vocabulary (non-gesture) instances. The second method employs a cascaded approach, training a binary random forest model to distinguish gestures from background and a multi-class random forest model to classify segmented gestures. Given a test input video stream, both frameworks are applied using sliding windows at multiple temporal scales. We evaluated our formulation in segmenting and recognizing gestures from two different benchmark datasets: the NATOPS dataset of 9,600 gesture instances from a vocabulary of 24 aircraft handling signals, and the ChaLearn dataset of 7,754 gesture instances from a vocabulary of 20 Italian communication gestures. The performance of our method compares favorably with state-of-the-art methods that employ Hidden Markov Models or Hidden Conditional Random Fields on the NATOPS dataset. We conclude with a discussion of the advantages of using our model for the task of gesture recognition and segmentation, and outline weaknesses which need to be addressed in the future.

Highlights: Sequential and simultaneous random forest frameworks are compared. Fusing skeletal and appearance features enables accurate gesture representations. Uniform descriptors are created for gestures to account for variability in length.
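For contrast with the simultaneous model sketched under the earlier paper, here is a toy sketch of the cascaded (sequential) variant: a binary forest first gates out background windows, and a multi-class forest then labels whatever passes. The data, features, and gating threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 30))
y = rng.integers(0, 21, size=600)              # 0..19 gestures, 20 = background

binary = RandomForestClassifier(n_estimators=100).fit(X, (y < 20).astype(int))
multiclass = RandomForestClassifier(n_estimators=100).fit(X[y < 20], y[y < 20])

def classify_window(features, gate_threshold=0.5):
    """Stage 1 gates out background windows; stage 2 assigns a gesture label."""
    f = features.reshape(1, -1)
    if binary.predict_proba(f)[0, 1] < gate_threshold:
        return None                            # rejected as non-gesture
    return int(multiclass.predict(f)[0])

print(classify_window(rng.normal(size=30)))
```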


Proceedings of SPIE, the International Society for Optical Engineering | 2005

Parameter adaptation for target recognition in LADAR

Mark R. Stevens; Camille Monnier; Magnus Snorrason

Automatic Target Recognition (ATR) algorithms are extremely sensitive to differences between the operating conditions under which they are trained and the extended operating conditions in which the fielded algorithms operate. For ATR algorithms to robustly recognize targets while retaining low false alarm rates, they must be able to identify the conditions under which they are operating and tune their parameters on the fly. In this paper, we present a method for tuning the parameters of a model-based ATR algorithm using estimates of the current operating conditions. The problem has two components: 1) identifying the current operating conditions and 2) using that information to tune parameters to improve performance. We explore the use of a reinforcement learning technique called tile coding for parameter adaptation. In tile coding, we first define a set of valid states describing the world (the operating conditions of interest, such as the level of obscuration). Next, actions (or parameter settings used by the ATR) are defined that are applied when in that state. Parameter settings for each operating condition are learned using an off-line reinforcement learning feedback loop. The result is a lookup table to select the optimal parameter settings for each operating condition. We present results on real LADAR imagery based on parameter tuning learned off-line using synthetic imagery.
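The flavor of the off-line learning loop can be shown with a plain tabular estimate standing in for tile coding: discretized obscuration levels act as states, candidate detection thresholds act as actions, and a synthetic reward stands in for scoring the ATR on synthetic imagery. This is only a simplified illustration of the state/action/lookup-table structure described above.

```python
import numpy as np

rng = np.random.default_rng(3)
obscuration_bins = 5                   # discretized operating condition (state)
thresholds = np.linspace(0.1, 0.9, 9)  # candidate detection thresholds (actions)
Q = np.zeros((obscuration_bins, len(thresholds)))
counts = np.zeros_like(Q)

def simulated_reward(state, threshold):
    """Stand-in for running the ATR on synthetic imagery and scoring detections
    minus false alarms; heavier obscuration is assumed to favor a lower threshold."""
    best = 0.8 - 0.1 * state
    return -abs(threshold - best) + rng.normal(scale=0.05)

# off-line feedback loop over synthetic trials
for _ in range(5000):
    s = rng.integers(obscuration_bins)
    a = rng.integers(len(thresholds))
    r = simulated_reward(s, thresholds[a])
    counts[s, a] += 1
    Q[s, a] += (r - Q[s, a]) / counts[s, a]   # running average of observed reward

# deployment: look up the best parameter setting for the estimated condition
policy = thresholds[Q.argmax(axis=1)]
print("threshold per obscuration level:", np.round(policy, 2))
```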


Proceedings of SPIE | 2013

Robust leader tracking from an unmanned ground vehicle

Camille Monnier; Stan German; Andrey Ostapchenko

While many leader-follower technologies for robotic mules have been developed in recent years, the problem of reliably tracking and re-acquiring a human leader through cluttered environments continues to pose a challenge to widespread acceptance of these systems. Recent approaches to leader tracking rely on leader-worn equipment that may be damaged, hidden from view, or lost, such as radio transmitters or special clothing, as well as specialized sensing hardware such as high-resolution LIDAR. We present a vision-based approach for robustly tracking a leader using a simple monocular camera. The proposed method requires no modification to the leader’s equipment, nor any specialized sensors on board the host platform. The system learns a discriminative model of the leader’s appearance to robustly track him or her through long occlusions, changing lighting conditions, and cluttered environments. We demonstrate the system’s tracking capabilities on publicly available benchmark datasets, as well as in representative scenarios captured using a small unmanned ground vehicle (SUGV).
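A minimal sketch of the online discriminative appearance idea, using a color histogram and an incrementally updated linear classifier as stand-ins for the actual features and model, might look like this:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def color_histogram(patch, bins=8):
    """Concatenated per-channel histogram as a simple appearance descriptor."""
    return np.concatenate([np.histogram(patch[..., c], bins=bins,
                                        range=(0, 255), density=True)[0]
                           for c in range(3)])

rng = np.random.default_rng(4)
model = SGDClassifier()                         # linear classifier, updated online
classes = np.array([0, 1])                      # 0 = background, 1 = leader

for frame in range(5):                          # stand-in for a video loop
    leader_patch = rng.integers(100, 160, size=(64, 32, 3))   # leader's clothing colors
    background = [rng.integers(0, 255, size=(64, 32, 3)) for _ in range(4)]
    X = np.vstack([color_histogram(p) for p in [leader_patch] + background])
    y = np.array([1] + [0] * len(background))
    model.partial_fit(X, y, classes=classes)    # incremental appearance update

# re-acquisition after an occlusion: score candidate patches, pick the most leader-like
candidates = [rng.integers(0, 255, size=(64, 32, 3)) for _ in range(6)]
scores = model.decision_function(np.vstack([color_histogram(p) for p in candidates]))
print("best candidate index:", int(np.argmax(scores)))
```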


Proceedings of SPIE | 2012

A monocular leader-follower system for small mobile robots

Camille Monnier; Stan German; Andrey Ostapchenko

Current generation UGV control systems typically require operators to physically control a platform through teleoperation, even for simple tasks such as travelling from one location to another. While vision-based control technologies promise to significantly reduce the burden on UGV operators, most schemes rely on specialized sensing hardware, such as LIDAR or stereo cameras, or require additional operator-worn equipment or markers to differentiate the leader from nearby pedestrians. We present a system for robust leader-follower control of small UGVs using only a single monocular camera, which is ubiquitous on mobile platforms. The system allows a user to control a mobile robot by leading the way and issuing commands through arm/hand gestures, and differentiates between the leader and nearby pedestrians. The software achieves this by integrating efficient algorithms for pedestrian detection, online appearance learning, and kinematic tracking with a lightweight technique for camera-based gesture recognition.
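A structural sketch of such a per-frame loop is shown below; every helper function is a hypothetical placeholder (not the authors' code), and only the conversion from the leader's image position and apparent size into a turn/speed command is spelled out.

```python
import numpy as np

rng = np.random.default_rng(5)

def detect_pedestrians(frame):
    """Placeholder: return candidate boxes as (x_center, width) in pixels."""
    return rng.uniform([0, 40], [640, 120], size=(3, 2))

def appearance_score(frame, box):
    """Placeholder for the online-learned leader appearance classifier."""
    return rng.uniform()

def follow_command(box, image_width=640, desired_width=80):
    """Steer toward the leader and regulate distance from apparent size."""
    x_center, width = box
    turn = (x_center - image_width / 2) / (image_width / 2)   # -1 .. 1
    speed = np.clip((desired_width - width) / desired_width, 0.0, 1.0)
    return turn, speed

frame = np.zeros((480, 640, 3), dtype=np.uint8)               # stand-in camera image
boxes = detect_pedestrians(frame)
leader = boxes[np.argmax([appearance_score(frame, b) for b in boxes])]
turn, speed = follow_command(leader)
print(f"turn={turn:+.2f}, speed={speed:.2f}")
```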


International Conference on Applied Human Factors and Ergonomics | 2017

A Multi-modal Interface for Natural Operator Teaming with Autonomous Robots (MINOTAUR)

Stephanie Kane; Kevin McGurgan; Martin Voshell; Camille Monnier; Stan German; Andrey Ost

Dismounted squads face logistical problems, including the management of physical burdens in complex operating environments. Autonomous unmanned ground vehicles (UGVs) can help transport equipment and supplies, but require active remote control or teleoperation, even for mundane tasks such as long-distance travel. This requires heads-down attention, causing fatigue and reducing situational awareness. To address these needs, we designed and prototyped a Multi-modal Interface for Natural Operator Teaming with Autonomous Robots (MINOTAUR). The MINOTAUR human-robot interface (HRI) provides observability and directability of UGV behavior through a multi-modal interface that leverages gesture input, touch/physical input through a watch-based operator control unit (OCU), and voice input. MINOTAUR's multi-modal approach enables operators to leverage the strengths of each modality, while the OCU enables quick control inputs through lightweight interactions and at-a-glance information status summaries. This paper describes the requirements and use case analysis that informed MINOTAUR designs and provides detailed descriptions of design concepts.


Nuclear Science Symposium and Medical Imaging Conference | 2015

Advanced algorithm development for detection, tracking, and identification of vehicle-borne radiation sources in a multi-sensor, distributed testbed

Daniel A. Cooper; Robert J. Ledoux; Krzysztof Kamieniecki; Stephen E. Korbly; James Costales; Rustam Niyazov; David Hempstead; Michael Gallagher; Lauren Janney; Nathan D'Olympia; Camille Monnier; Richard Wronski

A robust network of distributed sensors has been proposed in response to the Radiation Awareness and Interdiction Network (RAIN) Broad Agency Announcement (BAA) issued by DHS/DNDO in March of 2014. The testbed system is designed to detect, track, and identify potentially threatening radiation sources in moving vehicles without interrupting the flow of traffic in typical highway scenarios. The algorithmic basis for the system depends on a number of data fusion methodologies to optimally combine and exploit multi-sensor, multi-modal data. Specifically, data-level fusion of radiation measurements is used to enhance detection and identification of radiation sources, while extracted feature-level data from auxiliary video sensors is used both to improve computational speed and accuracy and to provide operationally relevant source attribution information. An overview of the current development work is provided in two parts: 1) a theoretical description of the data fusion algorithms and their expected utility, and 2) performance results from both simulated and real measurements that demonstrate the efficacy of a system using the proposed data fusion algorithms. The expected value of the testbed system using the advanced algorithms is also discussed.
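As a toy illustration of why data-level fusion across distributed sensors can help, the sketch below sums gross counts from several detectors over the interval when the (video-derived) track places the vehicle in view: each sensor alone sees only a marginal excess over Poisson background, while the combined excess is clearer. All numbers are synthetic and this is not the testbed's actual fusion algorithm.

```python
import numpy as np

rng = np.random.default_rng(6)
n_sensors = 4
background_counts = 400          # expected background counts per sensor per vehicle pass
source_counts = 40               # weak source contribution seen by each sensor

observed = rng.poisson(background_counts + source_counts, size=n_sensors)

# per-sensor decision: excess over expected background, in sigma
per_sensor_z = (observed - background_counts) / np.sqrt(background_counts)

# data-level fusion: sum counts across sensors first, then test the combined excess
fused_z = (observed.sum() - n_sensors * background_counts) / np.sqrt(n_sensors * background_counts)

print("per-sensor z:", np.round(per_sensor_z, 2))   # individually marginal
print("fused z     :", round(fused_z, 2))           # clearer after fusion
```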

Collaboration


Dive into Camille Monnier's collaborations.

Top Co-Authors

Stan German
Charles River Laboratories

Andrey Ost
Charles River Laboratories

Magnus Snorrason
Charles River Laboratories

Mark R. Stevens
Charles River Laboratories

Curt Wu
Charles River Laboratories

Duane Setterdahl
Science Applications International Corporation