Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Matia Pizzoli is active.

Publication


Featured research published by Matia Pizzoli.


international conference on robotics and automation | 2014

SVO: Fast semi-direct monocular visual odometry

Christian Forster; Matia Pizzoli; Davide Scaramuzza

We propose a semi-direct monocular visual odometry algorithm that is precise, robust, and faster than current state-of-the-art methods. The semi-direct approach eliminates the need for costly feature extraction and robust matching techniques for motion estimation. Our algorithm operates directly on pixel intensities, which results in subpixel precision at high frame-rates. A probabilistic mapping method that explicitly models outlier measurements is used to estimate 3D points, which results in fewer outliers and more reliable points. Precise and high frame-rate motion estimation brings increased robustness in scenes with little, repetitive, or high-frequency texture. The algorithm is applied to micro-aerial-vehicle state estimation in GPS-denied environments and runs at 55 frames per second on the onboard embedded computer and at more than 300 frames per second on a consumer laptop. We call our approach SVO (Semi-direct Visual Odometry) and release our implementation as open-source software.
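The direct, intensity-based alignment underlying this class of methods can be illustrated with a minimal 1-D sketch (illustrative only, not the SVO implementation, which aligns image patches over full 6-DoF poses): Gauss-Newton minimization of the photometric error recovers a sub-pixel shift between two intensity signals.

```python
import numpy as np

def direct_align_1d(ref, cur, shift0=0.0, iters=30):
    """Estimate the sub-pixel shift s minimizing the photometric error
    sum_i (cur(x_i + s) - ref(x_i))^2 via Gauss-Newton iterations."""
    x = np.arange(len(ref), dtype=float)
    s = shift0
    for _ in range(iters):
        warped = np.interp(x + s, x, cur)   # cur resampled at shifted coordinates
        grad = np.gradient(warped)          # Jacobian of the residual w.r.t. s
        r = warped - ref                    # photometric residuals
        h = grad @ grad                     # 1x1 Gauss-Newton "Hessian"
        if h < 1e-12:
            break
        s -= (grad @ r) / h                 # Gauss-Newton update
    return s

# A smooth intensity bump and a copy shifted right by 2.5 pixels
x = np.arange(100, dtype=float)
ref = np.exp(-0.5 * ((x - 50.0) / 5.0) ** 2)
cur = np.exp(-0.5 * ((x - 52.5) / 5.0) ** 2)
est = direct_align_1d(ref, cur)
```

Because the optimization works directly on interpolated intensities rather than matched features, the recovered shift is accurate well below one pixel.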


international conference on robotics and automation | 2014

REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time

Matia Pizzoli; Christian Forster; Davide Scaramuzza

In this paper, we solve the problem of estimating dense and accurate depth maps from a single moving camera. A probabilistic depth measurement is carried out in real time on a per-pixel basis and the computed uncertainty is used to reject erroneous estimations and provide live feedback on the reconstruction progress. Our contribution is a novel approach to depth map computation that combines Bayesian estimation and recent developments in convex optimization for image processing. We demonstrate that our method outperforms state-of-the-art techniques in terms of accuracy, while exhibiting high efficiency in memory usage and computing power. We call our approach REMODE (REgularized MOnocular Depth Estimation). Our CUDA-based implementation runs at 30Hz on a laptop computer and is released as open-source software.


international symposium on safety, security, and rescue robotics | 2012

Rescue robots at earthquake-hit Mirandola, Italy: A field report

G-J M. Kruijff; Viatcheslav Tretyakov; Thorsten Linder; Fiora Pirri; Mario Gianni; Panagiotis Papadakis; Matia Pizzoli; Arnab Sinha; E. Pianese; S. Corrao; F. Priori; S. Febrini; S. Angeletti

In May 2012, two major earthquakes occurred in the Emilia-Romagna region, Northern Italy, followed by further aftershocks and earthquakes in June 2012. This sequence of earthquakes and shocks caused multiple casualties and widespread damage to numerous historical buildings in the region. The Italian National Fire Corps deployed for disaster response and the recovery of people and buildings. In June 2012, they requested the aid of the EU-funded project NIFTi to assess damage to historical buildings, and cultural artifacts located therein. To this end, NIFTi deployed a team of humans and robots (UGV, UAV) in the red area of Mirandola, Emilia-Romagna, from Tuesday July 24 until Friday July 27, 2012. The team worked closely together with the members of the Italian National Fire Corps involved in the red area. This paper describes the deployment and the experience gained.


intelligent robots and systems | 2013

Air-ground localization and map augmentation using monocular dense reconstruction

Christian Forster; Matia Pizzoli; Davide Scaramuzza

We propose a new method for the localization of a Micro Aerial Vehicle (MAV) with respect to a ground robot. We solve the problem of registering the 3D maps computed by the robots using different sensors: a dense 3D reconstruction from the MAV monocular camera is aligned with the map computed from the depth sensor on the ground robot. Once aligned, the dense reconstruction from the MAV is used to augment the map computed by the ground robot, by extending it with the information conveyed by the aerial views. The overall approach is novel, as it builds on recent developments in live dense reconstruction from moving cameras to address the problem of air-ground localization. The core of our contribution is a novel algorithm integrating dense reconstructions from monocular views, Monte Carlo localization, and an iterative pose refinement. In spite of the radically different vantage points from which the maps are acquired, the proposed method achieves high accuracy whereas appearance-based, state-of-the-art approaches fail. Experimental validation in indoor and outdoor scenarios showed a position-estimation accuracy of 0.08 meters and real-time performance. This demonstrates that our new approach effectively overcomes the limitations imposed by the difference in sensors and vantage points that negatively affect previous techniques relying on matching visual features.
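The Monte Carlo localization component can be illustrated with a toy 1-D analogue (hypothetical maps and noise figures, not the paper's 3-D pipeline): particles hypothesize the unknown alignment offset between an "aerial" and a "ground" map and are resampled according to how well they predict the aerial observations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D "ground map": an elevation profile along a corridor. The aerial
# reconstruction sees the same profile shifted by an unknown offset.
x = np.linspace(0.0, 10.0, 200)
ground = np.sin(x) + 0.3 * np.sin(3 * x)
true_offset = 1.7

def observe():
    """One aerial sample: a position in aerial coordinates plus noisy height."""
    xi = rng.uniform(2.0, 8.0)
    return xi, np.interp(xi - true_offset, x, ground) + rng.normal(0.0, 0.05)

particles = rng.uniform(0.0, 3.0, size=500)         # offset hypotheses
for _ in range(20):
    logw = np.zeros(particles.size)
    for _k in range(5):                             # weigh 5 observations jointly
        xi, z = observe()
        pred = np.interp(xi - particles, x, ground) # expected reading per particle
        logw += -0.5 * ((z - pred) / 0.05) ** 2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(particles.size, size=particles.size, p=w)
    particles = particles[idx] + rng.normal(0.0, 0.02, particles.size)  # resample + jitter

offset_estimate = particles.mean()
```

Weighting several observations jointly before resampling keeps the filter from collapsing onto a spurious offset that happens to explain a single measurement.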


computer vision and pattern recognition | 2011

A general method for the point of regard estimation in 3D space

Fiora Pirri; Matia Pizzoli; Alessandro Rudi

A novel approach to 3D gaze estimation for wearable multi-camera devices is proposed and its effectiveness is demonstrated both theoretically and empirically. The proposed approach, firmly grounded in the geometry of the multiple views, introduces a calibration procedure that is efficient, accurate, and highly innovative, yet practical and easy, so it can run online with little intervention from the user. The overall gaze estimation model is general, as no particularly complex model of the human eye is assumed in this work. This is made possible by a novel approach that can be sketched as follows: each eye is imaged by a camera; two conics are fitted to the imaged pupils; and a calibration sequence, consisting of the subject gazing at a known 3D point while moving his/her head, provides information to 1) estimate the optical axis in the 3D world; 2) compute the geometry of the multi-camera system; 3) estimate the Point of Regard in the 3D world. The resulting model is used effectively to study visual attention by means of gaze estimation experiments involving people performing natural tasks in wide-field, unstructured scenarios.
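Once the two optical axes are estimated, a 3D Point of Regard can be obtained by triangulating the two gaze rays. A minimal sketch, assuming the rays are given as origin-plus-direction lines (the standard midpoint of their shortest connecting segment, not necessarily the paper's exact estimator):

```python
import numpy as np

def point_of_regard(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two gaze rays x = o_i + t_i d_i
    (closed-form least squares; assumes the rays are not parallel)."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    b = d1 @ d2
    w = o1 - o2
    t1 = (b * (d2 @ w) - (d1 @ w)) / (1.0 - b * b)  # parameter along ray 1
    t2 = ((d2 @ w) - b * (d1 @ w)) / (1.0 - b * b)  # parameter along ray 2
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))

# Two eyes 6 cm apart gazing at a target half a metre away (units: metres)
target = np.array([0.10, 0.20, 0.50])
o_left = np.array([-0.03, 0.0, 0.0])
o_right = np.array([0.03, 0.0, 0.0])
por = point_of_regard(o_left, target - o_left, o_right, target - o_right)
```

With noisy axis estimates the two rays are skew rather than intersecting, and the midpoint gives a least-squares compromise between them.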


robotics science and systems | 2014

Appearance-based Active, Monocular, Dense Reconstruction for Micro Aerial Vehicles

Christian Forster; Matia Pizzoli; Davide Scaramuzza

In this paper, we investigate the following problem: given the image of a scene, what is the trajectory that a robot-mounted camera should follow to allow optimal dense depth estimation? The solution we propose is based on maximizing the information gain over a set of candidate trajectories. In order to estimate the information that we expect from a camera pose, we introduce a novel formulation of the measurement uncertainty that accounts for the scene appearance (i.e., texture in the reference view), the scene depth and the vehicle pose. We successfully demonstrate our approach in the case of real-time, monocular reconstruction from a micro aerial vehicle and validate the effectiveness of our solution in both synthetic and real experiments. To the best of our knowledge, this is the first work on active, monocular dense reconstruction, which chooses motion trajectories that minimize perceptual ambiguities inferred by the texture in the scene.
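The selection step can be sketched as picking, among candidate trajectories, the one with maximal expected information gain over per-pixel Gaussian depth estimates. The candidate names, uncertainty values, and one-measurement-per-trajectory simplification below are purely illustrative, not the paper's formulation.

```python
import numpy as np

def expected_info_gain(prior_var, meas_var):
    """Expected information gain (nats) for a Gaussian depth estimate after
    fusing one measurement: 0.5 * log(prior variance / posterior variance)."""
    post_var = prior_var * meas_var / (prior_var + meas_var)
    return 0.5 * np.log(prior_var / post_var)

# One uncertainty value per pixel (prior) and, per candidate trajectory, the
# expected measurement variance: low over textured regions, high over
# textureless ones.
prior = np.full(100, 1.0)
candidates = {
    "hover":        np.full(100, 0.50),
    "over_texture": np.where(np.arange(100) < 60, 0.05, 1.00),
    "over_blank":   np.where(np.arange(100) < 60, 1.00, 0.05),
}
gains = {name: expected_info_gain(prior, mv).sum() for name, mv in candidates.items()}
best = max(gains, key=gains.get)     # trajectory with maximal expected gain
```

Making the expected measurement variance depend on scene appearance is what steers the camera toward views where texture makes depth estimation well-conditioned.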


computer vision and pattern recognition | 2011

3D Saliency maps

Fiora Pirri; Matia Pizzoli; Daniele Rigato; Redjan Shabani

Eye tracking devices have been extensively used to study human selection mechanisms and promoted the development of computational models of visual attention, whose well known outcomes are saliency maps. Among eye trackers, wearable ones have the advantage of allowing the estimation of the Point of Regard (POR) during natural tasks, instead of in experimental, static lab settings. The motion of the viewer makes localization necessary to collect data in a coherent reference frame. In this work we present a framework for the estimation and mapping of the sequence of 3D PORs collected by a wearable device in unstructured, experimental settings. The result is a three-dimensional map of gazed objects, which we call a 3D Saliency Map and which constitutes the novel contribution of this work.


international symposium on visual computing | 2011

From saliency to eye gaze: embodied visual selection for a pan-tilt-based robotic head

Matei Mancas; Fiora Pirri; Matia Pizzoli

This paper introduces a model of gaze behavior suitable for robotic active vision. Built upon a saliency map taking into account motion saliency, the presented model estimates the dynamics of different eye movements, allowing it to switch from fixational movements to saccades and to smooth pursuit. We investigate the effect of embodying attentive visual selection in a pan-tilt camera system. The constrained physical system is unable to follow the large fluctuations characterizing the maxima of a saliency map, so a strategy is required to dynamically select what is worth attending to and which behavior, fixation or target pursuit, to adopt. The main contributions of this work are a novel approach toward real-time, motion-based saliency computation in video sequences, a dynamic model for gaze prediction from the saliency map, and the embodiment of the modeled dynamics to control active visual sensing.


international conference on pattern recognition applications and methods | 2012

Constraint-free topological mapping and path planning by maxima detection of the kernel spatial clearance density

Panagiotis Papadakis; Mario Gianni; Matia Pizzoli; Fiora Pirri

Asserting the inherent topology of the environment perceived by a robot is a key prerequisite of high-level decision making. This is achieved through the construction of a concise representation of the environment that endows a robot with the ability to operate in a coarse-to-fine strategy. In this paper, we propose a novel topological segmentation method for generic metric maps that operates concurrently as a path-planning algorithm. First, we apply a Gaussian Distance Transform on the map that weighs points belonging to free space according to the proximity of the surrounding free area in a noise-resilient manner. We define a region as the set of all the points that locally converge to a common point of maximum space clearance and employ a weighted mean-shift gradient ascent on the kernel space clearance density in order to detect the maxima that characterize the regions. The spatial intra-connectivity of each cluster is ensured by allowing only linearly unobstructed mean-shifts, which in parallel serves as a path-planning algorithm by concatenating the consecutive mean-shift vectors of the convergence paths. Experiments on structured and unstructured environments demonstrate the effectiveness and potential of the proposed approach.
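The mean-shift gradient ascent at the core of the segmentation can be illustrated in 1-D: each sample climbs a Gaussian-kernel density until it reaches a local maximum, and samples converging to the same maximum form one region. This sketch omits the clearance weighting and the linear-obstruction constraint of the paper.

```python
import numpy as np

def mean_shift_1d(points, bandwidth=0.5, iters=50):
    """Gaussian-kernel mean-shift: each query point climbs the kernel density
    by moving to the weighted mean of its neighbours until convergence."""
    modes = points.astype(float).copy()
    for _ in range(iters):
        for i, m in enumerate(modes):
            w = np.exp(-0.5 * ((points - m) / bandwidth) ** 2)  # kernel weights
            modes[i] = (w @ points) / w.sum()                   # mean-shift step
    return modes

# Toy clearance samples concentrated around two "room centres" at 0 and 5
rng = np.random.default_rng(0)
points = np.concatenate([rng.normal(0.0, 0.3, 50), rng.normal(5.0, 0.3, 50)])
modes = mean_shift_1d(points)
labels = (modes > 2.5).astype(int)   # points sharing a maximum form one region
```

Recording the sequence of mean-shift steps for each sample is what lets the same machinery double as a path planner in the paper: the convergence path is itself a collision-aware route to the region's clearance maximum.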


asian conference on computer vision | 2010

Linear solvability in the viewing graph

Alessandro Rudi; Matia Pizzoli; Fiora Pirri

The Viewing Graph [1] represents several views linked by the corresponding fundamental matrices, estimated pairwise. Given a Viewing Graph, the tuples of consistent camera matrices form a family that we call the Solution Set. This paper provides a theoretical framework that formalizes different properties of the topology, linear solvability, and number of solutions of multi-camera systems. We systematically characterize the topology of the Viewing Graph in terms of its solution set by means of the associated algebraic bilinear system. Based on this characterization, we provide conditions on the linearity and the number of solutions and define an inductively constructible set of topologies which admit a unique linear solution. Camera matrices can thus be retrieved efficiently and large viewing graphs can be handled in a recursive fashion. The results apply to problems such as projective reconstruction from multiple views and the calibration of camera networks.

Collaboration


Dive into Matia Pizzoli's collaborations.

Top Co-Authors


Fiora Pirri

Sapienza University of Rome


Mario Gianni

Sapienza University of Rome


Arnab Sinha

Sapienza University of Rome


Alessandro Rudi

École Normale Supérieure


Bruno Cafaro

Sapienza University of Rome


Valsamis Ntouskos

Sapienza University of Rome
