Andrea Fossati
ETH Zurich
Publications
Featured research published by Andrea Fossati.
International Journal of Computer Vision | 2013
Gabriele Fanelli; Matthias Dantone; Juergen Gall; Andrea Fossati; Luc Van Gool
We present a random forest-based framework for real-time head pose estimation from depth images and extend it to localize a set of facial features in 3D. Our algorithm takes a voting approach, where each patch extracted from the depth image can directly cast a vote for the head pose or for each of the facial features. Our system proves capable of handling large rotations, partial occlusions, and the noisy depth data acquired using commercial sensors. Moreover, the algorithm works on each frame independently and achieves real-time performance without resorting to parallel computations on a GPU. We report extensive experiments on publicly available, challenging datasets and introduce a new annotated head pose database recorded using a Microsoft Kinect.
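The voting scheme above can be illustrated with a minimal sketch. Here the forest itself is not trained; the votes each depth patch would cast for the head centre are simulated as noisy offsets around a hypothetical ground truth, and the aggregation step is a crude stand-in for the paper's clustering of votes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: the head centre in camera coordinates (metres).
true_head = np.array([0.10, -0.05, 0.90])

# In the real system every depth patch runs through the trained forest and
# casts a 3-D vote for the head centre; here we simulate 200 noisy votes.
votes = true_head + rng.normal(0.0, 0.01, (200, 3))

# Aggregation: discard votes far from the densest region (a crude stand-in
# for the paper's vote-clustering step), then average the inliers.
median = np.median(votes, axis=0)
inliers = votes[np.linalg.norm(votes - median, axis=1) < 0.03]
estimate = inliers.mean(axis=0)
```

The inlier filter is what makes the scheme robust: stray votes from occluded or noisy patches are simply excluded from the final average.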
British Machine Vision Conference | 2011
Marco Cristani; Loris Bazzani; Giulia Paggetti; Andrea Fossati; Diego Tosato; Alessio Del Bue; Gloria Menegaz; Vittorio Murino
We present a novel approach for detecting social interactions in a crowded scene by employing solely visual cues. The detection of social interactions in unconstrained scenarios is a valuable and important task, especially for surveillance purposes. Our proposal is inspired by the social signaling literature, and in particular it considers the sociological notion of F-formation. An F-formation is a set of possible configurations in space that people may assume while participating in a social interaction. Our system takes as input the positions of the people in a scene and their (head) orientations; then, employing a voting strategy based on the Hough transform, it recognizes F-formations and the individuals associated with them. Experiments on both simulated and real data support our approach.
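A toy version of the voting idea: each person casts one vote for the centre of a shared "o-space" some fixed distance along their head orientation, and people whose votes cluster together are grouped. The stride and radius constants are invented for the demo, and the greedy clustering is a simplification of the paper's Hough accumulator.

```python
import numpy as np

STRIDE = 1.0   # assumed distance from a person to the o-space centre (metres)
RADIUS = 0.5   # vote-clustering radius (assumed)

def f_formations(positions, orientations):
    """Group people whose o-space votes land close together.

    Each person votes STRIDE metres along their head orientation; greedy
    clustering of the votes stands in for the Hough accumulator."""
    votes = positions + STRIDE * np.stack(
        [np.cos(orientations), np.sin(orientations)], axis=1)
    groups, used = [], set()
    for i in range(len(votes)):
        if i in used:
            continue
        members = [j for j in range(len(votes))
                   if np.linalg.norm(votes[j] - votes[i]) < RADIUS]
        if len(members) > 1:           # an F-formation needs at least two people
            groups.append(members)
            used.update(members)
    return groups

# Three people around a circle facing its centre, plus a lone walker.
pos = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
ori = np.array([np.pi, 0.0, -np.pi / 2, 0.0])
```

Calling `f_formations(pos, ori)` groups the three facing people and leaves the walker out, since their vote lands far from the shared centre.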
Computer Vision and Pattern Recognition | 2011
Juergen Gall; Andrea Fossati; Luc Van Gool
Unsupervised categorization of objects is a fundamental problem in computer vision. While appearance-based methods have become popular recently, other important cues like functionality are largely neglected. Motivated by psychological studies giving evidence that human demonstration has a facilitative effect on categorization in infancy, we propose an approach for object categorization from depth video streams. To this end, we have developed a method for capturing human motion in real time. The captured data is then used to temporally segment the depth streams into actions. The segmented actions are then categorized in an unsupervised manner, through a novel descriptor for motion capture data that is robust to subject variations. Furthermore, we automatically localize the object that is manipulated within a video segment, and categorize it using the corresponding action. For evaluation, we have recorded a dataset that comprises depth data with registered video sequences for 6 subjects, 13 action classes, and 174 object manipulations.
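The unsupervised grouping of segmented actions can be sketched with a greedy leader-clustering pass over action descriptors. Both the 2-D toy descriptors and the distance threshold are invented here; the paper's motion-capture descriptor and clustering method are more involved.

```python
import numpy as np

def cluster_actions(descriptors, threshold=0.5):
    """Greedy leader clustering: assign each descriptor to the nearest
    existing cluster centre, or open a new cluster if none is close
    enough (an illustrative stand-in for unsupervised categorization)."""
    centres, labels = [], []
    for d in descriptors:
        dists = [np.linalg.norm(d - c) for c in centres]
        if centres and min(dists) < threshold:
            labels.append(int(np.argmin(dists)))
        else:
            centres.append(d)
            labels.append(len(centres) - 1)
    return labels

# Two synthetic action classes in a toy 2-D descriptor space.
desc = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]])
```

No labels are used anywhere: category identities emerge purely from descriptor proximity, which mirrors the unsupervised setting of the paper.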
Archive | 2013
Andrea Fossati; Juergen Gall; Helmut Grabner; Xiaofeng Ren; Kurt Konolige
We analyze Kinect as a 3D measuring device, experimentally investigate depth measurement resolution and error properties, and make a quantitative comparison of Kinect accuracy with stereo reconstruction from SLR cameras and a 3D TOF camera. We propose a Kinect geometrical model and its calibration procedure providing an accurate calibration of Kinect 3D measurement and Kinect cameras. We compare our Kinect calibration procedure with its alternatives available on the Internet, and integrate it into an SfM pipeline where 3D measurements from a moving Kinect are transformed into a common coordinate system, by computing relative poses from matches in its color camera.
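A common ingredient of Kinect depth calibration is the observation that inverse depth is approximately affine in the raw disparity value, 1/z = c1·d + c0. The sketch below fits those two coefficients by least squares from simulated (disparity, depth) pairs; the coefficient values are made up for illustration and do not come from the chapter.

```python
import numpy as np

# Assumed Kinect-style model: inverse depth is affine in raw disparity,
# 1/z = c1*d + c0 (coefficients invented for the demo).
c1_true, c0_true = -0.00285, 3.33

d = np.linspace(400, 1000, 50)            # raw disparity units
z = 1.0 / (c1_true * d + c0_true)         # simulated ground-truth depth (m)

# Calibration: least-squares fit of (c1, c0) from (disparity, depth) pairs,
# solving the linear system [d 1] @ [c1 c0]^T = 1/z.
A = np.stack([d, np.ones_like(d)], axis=1)
c1_est, c0_est = np.linalg.lstsq(A, 1.0 / z, rcond=None)[0]
```

Because the model is linear in its coefficients, a handful of measurements against a known target suffices to calibrate it; real calibration must additionally handle noise and per-pixel distortion.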
Machine Vision and Applications | 2011
Andrea Fossati; Patrick Schönmann; Pascal Fua
Detecting car taillights at night is a task which can nowadays be accomplished very fast on cheap hardware. We rely on such detections, coupling them in a rule-based fashion, to build a vision-based system that detects and tracks vehicles. This allows the generation of an interface that informs the driver of the relative distance and velocity of other vehicles in real time and triggers a warning when a potentially dangerous situation arises. We demonstrate the system on sequences shot with a camera mounted behind a car's windshield.
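The rule-based coupling and the distance estimate can be sketched as follows: detections on a similar image row are paired as the two taillights of one vehicle, and the pixel span of the pair gives range through the pinhole relation Z = f·W/w. The focal length, taillight span, and pairing thresholds are all assumed values, not the paper's.

```python
FOCAL_PX = 800.0        # assumed focal length in pixels
TAILLIGHT_SPAN_M = 1.5  # assumed real-world distance between taillights

def pair_and_range(detections):
    """Pair taillight detections lying on a similar image row and estimate
    each vehicle's distance from the pixel span of the pair (pinhole
    model: Z = f * W / w). The rule set is a deliberate simplification."""
    pairs = []
    for i in range(len(detections)):
        for j in range(i + 1, len(detections)):
            (xi, yi), (xj, yj) = detections[i], detections[j]
            if abs(yi - yj) < 5 and abs(xi - xj) > 20:  # same row, well apart
                span_px = abs(xi - xj)
                pairs.append((i, j, FOCAL_PX * TAILLIGHT_SPAN_M / span_px))
    return pairs

# Two taillights of one car plus a spurious bright spot higher in the image.
lights = [(300, 240), (420, 241), (50, 100)]
```

The spurious detection is never paired because no other light shares its row, which is exactly the kind of false positive the rule-based coupling is meant to reject.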
International Conference on Robotics and Automation | 2014
Matteo Munaro; Alberto Basso; Andrea Fossati; Luc Van Gool; Emanuele Menegatti
In this work, we describe a novel method for creating 3D models of persons freely moving in front of a consumer depth sensor, and we show how they can be used for long-term person re-identification. To overcome the problem of the different poses a person can assume, we exploit the information provided by skeletal tracking algorithms to warp every point cloud frame to a standard pose in real time. The warped point clouds are then merged together to compose the model. Re-identification is performed by matching body shapes in terms of whole point clouds warped to a standard pose with the described method. We compare this technique with a classification method based on a descriptor of skeleton features and with a mixed approach that exploits both skeleton and shape features. We report experiments on two datasets we acquired for RGB-D re-identification, which use different skeletal tracking algorithms and which are made publicly available to foster research in this emerging area.
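A translation-only toy version of the pose-normalising warp: each cloud point is moved by the offset of its nearest skeleton joint between the current pose and the standard pose. The real method applies per-bone rigid transforms; this sketch and its joint layout are simplifications.

```python
import numpy as np

def warp_to_standard(points, joints_now, joints_std):
    """Move every point by the offset of its nearest skeleton joint,
    a translation-only simplification of the per-bone rigid warping
    used to normalise a point cloud to a standard pose."""
    out = np.empty_like(points)
    for k, p in enumerate(points):
        j = np.argmin(np.linalg.norm(joints_now - p, axis=1))
        out[k] = p + (joints_std[j] - joints_now[j])
    return out
```

After warping, clouds of the same person captured in different poses become directly comparable, which is what allows whole-cloud shape matching for re-identification.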
Person Re-Identification | 2014
Matteo Munaro; Andrea Fossati; Alberto Basso; Emanuele Menegatti; Luc Van Gool
In this chapter, we propose a comparison between two techniques for one-shot person re-identification from soft biometric cues. One is based upon a descriptor composed of features provided by a skeleton estimation algorithm; the other compares body shapes in terms of whole point clouds. This second approach relies on a novel technique we propose to warp the subject's point cloud to a standard pose, which allows us to disregard the problem of the different poses a person can assume. This technique is also used for composing 3D models which are then used at testing time for matching unseen point clouds. We test the proposed approaches on an existing RGB-D re-identification dataset and on the newly built BIWI RGBD-ID dataset. This dataset provides sequences of RGB, depth, and skeleton data for 50 people in two different scenarios, and it has been made publicly available to foster advancement in this emerging area.
Computer Vision and Pattern Recognition | 2009
Andrea Fossati; Mathieu Salzmann; Pascal Fua
The articulated body models used to represent human motion typically have many degrees of freedom, usually expressed as joint angles that are highly correlated. The true range of motion can therefore be represented by latent variables that span a low-dimensional space. This has often been used to make motion tracking easier. However, learning the latent space in a problem-independent way makes it nontrivial to initialize the tracking process by picking appropriate initial values for the latent variables, and thus for the pose. In this paper, we show that by directly using observable quantities as our latent variables, we eliminate this problem and achieve full automation given only modest amounts of training data. More specifically, we exploit the fact that the trajectory of a person's feet or hands strongly constrains body pose in motions such as skating, skiing, or golfing. These trajectories are easy to compute and to parameterize using a few variables. We treat these as our latent variables and learn a mapping between them and sequences of body poses. In this manner, by simply tracking the feet or the hands, we can reliably guess initial poses over whole sequences and, then, refine them.
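The core of the approach above is a learned mapping from an observable, low-dimensional trajectory descriptor to full body poses. A minimal linear sketch, assuming a synthetic 3-number descriptor per sequence and an invented ground-truth map (the paper learns a richer mapping from real motion data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training set: a low-dimensional trajectory descriptor (3 numbers
# per sequence) mapped to a 10-dimensional pose vector of joint angles.
# The true linear map W_true is invented for the demo.
W_true = rng.normal(size=(3, 10))
traj = rng.normal(size=(50, 3))        # observable latent variables
poses = traj @ W_true                  # corresponding body poses

# Learn the latent-to-pose mapping by least squares, then predict the pose
# for a new, tracked trajectory -- this is the automatic initialization.
W_est = np.linalg.lstsq(traj, poses, rcond=None)[0]
new_traj = rng.normal(size=(1, 3))
predicted_pose = new_traj @ W_est
```

Because the latent variables are directly observable (feet or hand tracks), initialization reduces to evaluating the learned map, with no search in an abstract latent space.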
Computer Vision and Pattern Recognition | 2008
Andrea Fossati; Elise Arnaud; Radu Horaud; Pascal Fua
A generalized expectation maximization (GEM) algorithm is used to retrieve the pose of a person from a monocular video sequence shot with a moving camera. After embedding the set of possible poses in a low-dimensional space using principal component analysis, the configuration that gives the best match to the input image is retained as the estimate for the current frame. This match is computed by iterating GEM to assign edge pixels to the correct body part and to find the body pose that maximizes the likelihood of the assignments.
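The alternation between assigning pixels to body parts and re-estimating the pose can be reduced to a tiny EM loop. In this sketch the "body parts" are just two isotropic 2-D blobs with a fixed, invented bandwidth; the paper works with full articulated poses in a PCA space.

```python
import numpy as np

def gem_fit(points, centres, iters=20, sigma=0.5):
    """Tiny GEM analogue: softly assign edge points to body parts (E-step)
    and re-estimate the part locations (M-step). Here each 'part' is an
    isotropic 2-D blob with fixed bandwidth sigma."""
    centres = np.asarray(centres, dtype=float)
    for _ in range(iters):
        # E-step: responsibilities of each part for each point.
        d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        resp = np.exp(-d2 / (2 * sigma ** 2))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: move each part to the responsibility-weighted mean.
        centres = (resp.T @ points) / resp.sum(axis=0)[:, None]
    return centres
```

Each iteration increases the likelihood of the assignments, which is the guarantee GEM provides even when the M-step only improves, rather than maximizes, the objective.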
International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission | 2012
Andrea Fossati; Helmut Grabner; Luc Van Gool
Reliable 3D object tracking can provide strong cues for scene understanding. In this paper, we exploit inconsistencies between measured 3D trajectories and their predictions under a physical model. In a set of proof-of-concept experiments we show how to retrieve the camera rotation and translation, and how to detect surfaces that are hard to discern visually, simply by tracking a rigid object. Furthermore, we introduce the class distinction between active and passive objects. Prototype examples demonstrate the usability of the visual input for this type of classification. In all the presented experiments, additional information and a deeper understanding of the scene are obtained, beyond what would be possible by analyzing the image measurements alone.
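A one-dimensional toy version of the inconsistency idea: a tracked object is predicted to be in free fall, and the first frame where the measured height departs from the ballistic prediction reveals an invisible supporting surface. Drop height, surface height, and frame rate are all invented for the demo.

```python
import numpy as np

G = 9.81     # gravitational acceleration (m/s^2)
dt = 0.02    # assumed frame interval (50 fps)

# Simulated 1-D height track: free fall from 2 m until the object lands on
# an invisible surface at 0.5 m and stays there.
t = np.arange(60) * dt
z = np.maximum(2.0 - 0.5 * G * t ** 2, 0.5)

# Physical prediction: pure free fall. Frames where the measurement
# deviates from the model expose the unseen supporting surface.
z_pred = 2.0 - 0.5 * G * t ** 2
impact = int(np.argmax(np.abs(z - z_pred) > 0.01))   # first inconsistent frame
surface_height = z[impact]
```

The surface is never observed directly; its height is read off from where the measured trajectory stops agreeing with the physics, which is the kind of inference the paper's experiments demonstrate.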