Davide Baltieri
University of Modena and Reggio Emilia
Publication
Featured research published by Davide Baltieri.
ACM Computing Surveys | 2013
Roberto Vezzani; Davide Baltieri; Rita Cucchiara
The field of surveillance and forensics research is currently shifting focus and is now showing an ever increasing interest in the task of people reidentification. This is the task of assigning the same identifier to all instances of a particular individual captured in a series of images or videos, even after the occurrence of significant gaps over time or space. People reidentification can be a useful tool for people analysis in security as a data association method for long-term tracking in surveillance. However, current identification techniques being utilized present many difficulties and shortcomings. For instance, they rely solely on the exploitation of visual cues such as color, texture, and the object’s shape. Despite the many advances in this field, reidentification is still an open problem. This survey aims to tackle all the issues and challenging aspects of people reidentification while simultaneously describing the previously proposed solutions for the encountered problems. This begins with the first attempts of holistic descriptors and progresses to the more recently adopted 2D and 3D model-based approaches. The survey also includes an exhaustive treatise of all the aspects of people reidentification, including available datasets, evaluation metrics, and benchmarking.
Proceedings of the 2011 joint ACM workshop on Human gesture and behavior understanding | 2011
Davide Baltieri; Roberto Vezzani; Rita Cucchiara
The interest of the research community in creating reference datasets for performance analysis is always very high. Although new datasets collecting large amounts of video footage are spreading in surveillance and forensics, few benchmarks with annotation data are available for testing specific tasks, especially for 3D/multi-view analysis. In this paper we present 3DPeS, a new dataset for 3D/multi-view surveillance and forensic applications. It has been designed for discussing and evaluating research results in people re-identification and other related activities (people detection, people segmentation and people tracking). The new assessed version of the dataset contains hundreds of video sequences of 200 people taken from a multi-camera distributed surveillance system over several days, with different light conditions; each person is detected multiple times and from different points of view. In surveillance scenarios, the dataset can be exploited to evaluate people reacquisition, 3D body models and people activity reconstruction algorithms. It can also be adopted in forensics by relaxing some constraints (e.g. real time) and neglecting some information (e.g. calibration). Some results on this new dataset are presented using state of the art methods for people re-identification as a benchmark for future comparisons.
international conference on pattern recognition | 2010
Roberto Vezzani; Davide Baltieri; Rita Cucchiara
Hidden Markov Models (HMM) have been widely used for action recognition, since they make it easy to model the temporal evolution of one or more numeric features extracted from the data. The selection of the feature set and the related emission probability function are the key issues to be defined. In particular, if the training set is not sufficiently large, a manual or automatic feature selection and reduction is mandatory. In this paper we propose to model the emission probability function as a Mixture of Gaussians, with the feature set obtained from the projection histograms of the foreground mask. The projection histograms contain the number of moving pixels for each row and for each column of the frame, and they provide sufficient information to infer the instantaneous posture of the person. Then, the HMM framework recovers the temporal evolution of the postures, thereby recognizing the global action. The proposed method has been successfully tested on the UT-Tower and Weizmann datasets.
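As an illustration of the projection histograms described in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes the foreground mask is a binary grid stored as a list of rows; the function name `projection_histograms` and the toy mask are hypothetical.

```python
def projection_histograms(mask):
    """Count foreground (moving) pixels per row and per column.

    mask: list of equal-length rows holding 0 (background) or 1 (foreground).
    Returns (row_counts, col_counts).
    """
    if not mask:
        return [], []
    width = len(mask[0])
    if any(len(row) != width for row in mask):
        raise ValueError("all rows must have the same length")
    # One count per row: horizontal projection.
    row_counts = [sum(1 for v in row if v) for row in mask]
    # One count per column: vertical projection.
    col_counts = [sum(1 for row in mask if row[c]) for c in range(width)]
    return row_counts, col_counts


# Toy 4x5 mask (hypothetical data, for illustration only).
mask = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
]
rows, cols = projection_histograms(mask)
```

In a pipeline of this kind, the two histograms would typically be concatenated (and possibly normalized) into the per-frame feature vector fed to the HMM; that step is an assumption here and is not shown.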
international conference on image analysis and processing | 2011
Davide Baltieri; Roberto Vezzani; Rita Cucchiara
We propose a new simplified 3D body model (called SARC3D) for surveillance application, which can be created, updated and compared in real-time. People are detected and tracked in each calibrated camera, with their silhouette, appearance, position and orientation extracted and used to place, scale and orientate a 3D body model. For each vertex of the model a signature (color features, reliability and saliency) is computed from 2D appearance images and exploited for matching. This approach achieves robustness against partial occlusions, pose and viewpoint changes. The complete proposal and a full experimental evaluation are presented, using a new benchmark suite and the PETS2009 dataset.
european conference on computer vision | 2012
Davide Baltieri; Roberto Vezzani; Rita Cucchiara
The recognition of people's orientation in single images is still an open issue in several real cases, when the image resolution is poor, body parts cannot be distinguished and localized, or motion cannot be exploited. However, the estimation of a person's orientation, even an approximate one, could be very useful to improve people tracking and re-identification systems, or to provide a coarse alignment of body models on the input images. In these situations, holistic features seem to be more effective and faster than model-based 3D reconstructions. In this paper we propose to describe people's appearance with multi-level HoG feature sets and to classify their orientation using an array of Extremely Randomized Trees classifiers trained on quantized directions. The outputs of the classifiers are then integrated into a global continuous probability density function using a Mixture of Approximated Wrapped Gaussian distributions. Experiments on the TUD Multiview Pedestrians, the Sarc3D, and the 3DPeS datasets confirm the efficacy of the method and the improvement with respect to state of the art approaches.
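The idea of merging per-direction classifier outputs into one continuous density over the angle can be sketched as follows. This is only an illustration under stated assumptions, not the paper's formulation: the wrapped Gaussian is approximated by truncating its infinite sum to three terms, the quantized directions are assumed evenly spaced, all components share one hypothetical `sigma`, and the scores are made-up numbers.

```python
import math


def wrapped_gaussian(theta, mu, sigma, wraps=1):
    """Wrapped normal density on [0, 2*pi), truncated to 2*wraps + 1 terms."""
    total = 0.0
    for k in range(-wraps, wraps + 1):
        d = theta - mu + 2.0 * math.pi * k
        total += math.exp(-0.5 * (d / sigma) ** 2)
    return total / (sigma * math.sqrt(2.0 * math.pi))


def orientation_density(theta, scores, sigma=0.5):
    """Mixture density at angle theta from per-direction classifier scores.

    scores[i] is the (non-negative) output for direction 2*pi*i/len(scores);
    scores are normalized to sum to one and used as mixture weights.
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    n = len(scores)
    density = 0.0
    for i, s in enumerate(scores):
        mu = 2.0 * math.pi * i / n
        density += (s / total) * wrapped_gaussian(theta, mu, sigma)
    return density


# Hypothetical outputs of 8 direction classifiers, strongest at 90 degrees.
scores = [0.05, 0.10, 0.60, 0.10, 0.05, 0.03, 0.04, 0.03]
front = orientation_density(math.pi / 2, scores)
back = orientation_density(3 * math.pi / 2, scores)
```

With these scores the density is higher near 90 degrees than near 270 degrees, and it integrates to approximately one over a full turn because the truncated terms are negligible for this `sigma`.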
international conference on computer vision | 2011
Davide Baltieri; Roberto Vezzani; Rita Cucchiara; Ákos Utasi; Csaba Benedek; Tamás Szirányi
In this paper we introduce a novel surveillance system, which uses 3D information extracted from multiple cameras to detect, track and re-identify people. The detection method is based on a 3D Marked Point Process model using two pixel-level features extracted from multi-plane projections of binary foreground masks, and uses a stochastic optimization framework to estimate the position and the height of each person. We apply a rule based Kalman-filter tracking on the detection results to find the object-to-object correspondence between consecutive time steps. Finally, a 3D body model based long-term tracking module connects broken tracks and is also used to re-identify people.
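For readers unfamiliar with Kalman-filter tracking, the predict/update cycle can be sketched in one dimension. This is a generic constant-velocity filter, not the rule-based tracker of the paper: the state layout (position, velocity), the noise values `q` and `r`, and the function name `kalman_step` are all assumptions chosen for illustration.

```python
def kalman_step(state, cov, z, dt=1.0, q=1e-3, r=0.25):
    """One predict + update step of a 1D constant-velocity Kalman filter.

    state: (position, velocity); cov: 2x2 covariance as nested tuples;
    z: measured position; q: process noise added to the covariance diagonal;
    r: measurement noise variance.
    """
    p, v = state
    (c00, c01), (c10, c11) = cov

    # Predict with F = [[1, dt], [0, 1]]: x' = F x, P' = F P F^T + q*I.
    p_pred = p + dt * v
    v_pred = v
    a00 = c00 + dt * (c10 + c01) + dt * dt * c11 + q
    a01 = c01 + dt * c11
    a10 = c10 + dt * c11
    a11 = c11 + q

    # Update with H = [1, 0]: only the position is observed.
    s = a00 + r                    # innovation variance
    k0, k1 = a00 / s, a10 / s      # Kalman gain
    y = z - p_pred                 # innovation
    new_state = (p_pred + k0 * y, v_pred + k1 * y)
    new_cov = (
        ((1.0 - k0) * a00, (1.0 - k0) * a01),
        (a10 - k1 * a00, a11 - k1 * a01),
    )
    return new_state, new_cov


# Track a target moving one unit per step, starting from an uninformed prior.
state = (0.0, 0.0)
cov = ((100.0, 0.0), (0.0, 100.0))
for t in range(20):
    state, cov = kalman_step(state, cov, float(t))
```

After the loop the estimated position is close to the last measurement and the estimated velocity is close to one unit per step. A multi-target tracker would additionally need an association step between detections and tracks, which is where rules such as those mentioned in the abstract would come in.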
advanced video and signal based surveillance | 2010
Davide Baltieri; Roberto Vezzani; Rita Cucchiara
In this paper, we present a new and fast technique for background estimation from cluttered image sequences. Most of the background initialization approaches developed so far collect a number of initial frames and then require a slow estimation step which introduces a delay whenever it is applied. Conversely, the proposed technique redistributes the computational load among all the frames by means of a patch by patch preprocessing, which makes the overall algorithm more suitable for real-time applications. For each patch location a prototype set is created and maintained. The background is then iteratively estimated by choosing from each set the most appropriate candidate patch, which should verify a sort of frequency coherence with its neighbors. To this aim, the Hadamard transform has been adopted, which requires less computation time than the commonly used DCT. Finally, a refinement step exploits spatial continuity constraints along the patch borders to prevent erroneous patch selections. The approach has been compared with the state of the art on videos from available datasets (ViSOR and CAVIAR), showing a speed up of about 10 times and an improved accuracy.
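The Hadamard transform mentioned in this abstract is cheap because it needs only additions and subtractions. The sketch below shows a standard fast Walsh-Hadamard transform (natural ordering, unnormalized) and its separable 2D form on a square patch. It illustrates the transform only; how the paper measures frequency coherence between neighboring patches is not reproduced, and the function names are hypothetical.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform of a power-of-two-length list."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly stage: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def hadamard_2d(patch):
    """Separable 2D transform: transform every row, then every column."""
    rows = [fwht(row) for row in patch]
    cols = [fwht(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


spectrum = fwht([1, 0, 1, 0])
flat_patch = hadamard_2d([[1, 1], [1, 1]])
```

Applying `fwht` twice returns the input scaled by its length, so the inverse is the same routine followed by a division by `n`. A constant patch, as in `flat_patch`, puts all of its energy in the first coefficient.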
International Journal of Computer Vision | 2015
Davide Baltieri; Roberto Vezzani; Rita Cucchiara
People Re-identification aims at associating multiple instances of a person’s appearance acquired from different points of view, different cameras, or after a spatial or a limited temporal gap to the same identifier. The basic hypothesis is that the person’s appearance is mostly constant. Many appearance descriptors have been adopted in the past, but they are often subject to severe perspective and view-point issues. In this paper, we propose a complete re-identification framework which exploits non-articulated 3D body models to spatially map appearance descriptors (color and gradient histograms) into the vertices of a regularly sampled 3D body surface. The matching and the shot integration steps are directly handled in the 3D body model, reducing the effects of occlusions, partial views or pose changes, which normally afflict 2D descriptors. A fast and effective model to image alignment is also proposed. It allows operation on common surveillance cameras or image collections. A comprehensive experimental evaluation is presented using the benchmark suite 3DPeS.
eurographics, italian chapter conference | 2010
Davide Baltieri; Roberto Vezzani; Rita Cucchiara
Wide area video surveillance always requires extracting and integrating information coming from different cameras and views. Re-identification of people captured from different cameras or different views is one of the most challenging problems. In this paper, we present a novel approach for people matching with vertices-based 3D human models. People are detected and tracked in each calibrated camera, and their silhouette, appearance, position and orientation are extracted and used to place, scale and orientate a 3D body model. Colour features are computed from the 2D appearance images and mapped to the 3D model vertices, generating the 3D model for each tracked person. A distance function between 3D models is defined in order to find matches among models belonging to the same person. This approach achieves robustness against partial occlusions, pose and viewpoint changes. A first experimental evaluation is conducted using images extracted from a real camera set-up.
acm multimedia | 2013
Davide Baltieri; Roberto Vezzani; Rita Cucchiara
People re-identification is a challenging problem in surveillance and forensics and it aims at associating multiple instances of the same person which have been acquired from different points of view and after a temporal gap. Image-based appearance features are usually adopted but, in addition to their intrinsically low discriminability, they are subject to perspective and view-point issues. We propose to completely change the approach by mapping local descriptors extracted from RGB-D sensors on a 3D body model for creating a view-independent signature. An original bone-wise color descriptor is generated and reduced with PCA to compute the person signature. The virtual bone set used to map appearance features is learned using a recursive splitting approach. Finally, people matching for re-identification is performed using the Relaxed Pairwise Metric Learning, which simultaneously provides feature reduction and weighting. Experiments on a specific dataset created with the Microsoft Kinect sensor and the OpenNi libraries prove the advantages of the proposed technique with respect to state of the art methods based on 2D or non-articulated 3D body models.