Luigi Nardi | Researchain

Publication

Featured research published by Luigi Nardi.

International Conference on Robotics and Automation | 2015

Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM

Luigi Nardi; Bruno Bodin; M. Zeeshan Zia; John Mawer; Andy Nisbet; Paul H. J. Kelly; Andrew J. Davison; Mikel Luján; Michael F. P. O'Boyle; Graham D. Riley; Nigel P. Topham; Stephen B. Furber

Real-time dense computer vision and SLAM offer great potential for a new level of scene modelling, tracking and real environmental interaction for many types of robot, but their high computational requirements mean that use on mass market embedded platforms is challenging. Meanwhile, trends in low-cost, low-power processing are towards massive parallelism and heterogeneity, making it difficult for robotics and vision researchers to implement their algorithms in a performance-portable way. In this paper we introduce SLAMBench, a publicly available software framework which represents a starting point for quantitative, comparable and validatable experimental research to investigate trade-offs in performance, accuracy and energy consumption of a dense RGB-D SLAM system. SLAMBench provides a KinectFusion implementation in C++, OpenMP, OpenCL and CUDA, and harnesses the ICL-NUIM dataset of synthetic RGB-D sequences with trajectory and scene ground truth for reliable accuracy comparison of different implementations and algorithms. We present an analysis and breakdown of the constituent algorithmic elements of KinectFusion, and experimentally investigate their execution time on a variety of multicore and GPU-accelerated platforms. For a popular embedded platform, we also present an analysis of energy efficiency for different configuration alternatives.

International Conference on Computational Science and Its Applications | 2009

YAO: A Software for Variational Data Assimilation Using Numerical Models

Luigi Nardi; Charles Sorror; Fouad Badran; Sylvie Thiria

Variational data assimilation consists in estimating control parameters of a numerical model in order to minimize the misfit between the forecast values and some actual observations. Gradient-based minimization methods require multiplying the transposed Jacobian matrix (the adjoint model), which is of huge dimension, by the derivative vector of the cost function at the observation points. We present a method based on a modular graph concept and two algorithms to avoid these expensive multiplications. The first step of the method is a propagation algorithm on the graph that computes the output of the numerical model and its tangent linear model; the second is a backpropagation on the graph that computes the adjoint model. The YAO software implements these two steps using appropriate algorithms. We present a brief description of YAO functionalities.
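The two passes described in this abstract can be illustrated with a small Python sketch. It is not YAO code: the module classes (`Scale`, `Mix`, `Tanh`), the chain layout, and the quadratic misfit cost are all assumptions chosen for illustration. Each module exposes a `forward` method and an `adjoint` method that applies its transposed Jacobian to a vector, so the full Jacobian is never formed.

```python
import math

# Hypothetical modules; each one knows its forward map and how to apply
# the transpose of its local Jacobian (its adjoint) to a vector g.
class Scale:
    """y_i = a * x_i"""
    def __init__(self, a):
        self.a = a
    def forward(self, x):
        return [self.a * xi for xi in x]
    def adjoint(self, x, g):
        return [self.a * gi for gi in g]

class Mix:
    """y_0 = x_0 + x_1, y_1 = x_0 * x_1"""
    def forward(self, x):
        return [x[0] + x[1], x[0] * x[1]]
    def adjoint(self, x, g):
        # Local Jacobian is [[1, 1], [x_1, x_0]]; return its transpose times g.
        return [g[0] + x[1] * g[1], g[0] + x[0] * g[1]]

class Tanh:
    """y_i = tanh(x_i)"""
    def forward(self, x):
        return [math.tanh(xi) for xi in x]
    def adjoint(self, x, g):
        return [(1.0 - math.tanh(xi) ** 2) * gi for xi, gi in zip(x, g)]

def forward_pass(modules, x0):
    """Propagate x0 through the chain, recording each module's input."""
    inputs, x = [], x0
    for m in modules:
        inputs.append(x)
        x = m.forward(x)
    return x, inputs

def cost(modules, x0, obs):
    """Quadratic misfit between model output and observations."""
    out, _ = forward_pass(modules, x0)
    return 0.5 * sum((o - y) ** 2 for o, y in zip(out, obs))

def gradient(modules, x0, obs):
    """Backpropagate the misfit through the chain in reverse order."""
    out, inputs = forward_pass(modules, x0)
    g = [o - y for o, y in zip(out, obs)]  # derivative of the cost at the output
    for m, x in zip(reversed(modules), reversed(inputs)):
        g = m.adjoint(x, g)
    return g

# Illustrative values only.
modules = [Scale(0.5), Mix(), Tanh()]
x0 = [0.3, -0.7]
obs = [0.1, 0.2]
print(gradient(modules, x0, obs))
```

The cost of `gradient` is one forward and one backward sweep over the modules, regardless of how many control parameters there are, which is the reason adjoint methods are preferred over explicit Jacobian products.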

International Conference on Parallel Architectures and Compilation Techniques | 2016

Integrating Algorithmic Parameters into Benchmarking and Design Space Exploration in 3D Scene Understanding

Bruno Bodin; Luigi Nardi; M. Zeeshan Zia; Harry Wagstaff; Govind Sreekar Shenoy; Murali Emani; John Mawer; Christos Kotselidis; Andy Nisbet; Mikel Luján; Björn Franke; Paul H. J. Kelly; Michael F. P. O'Boyle

System designers typically use well-studied benchmarks to evaluate and improve new architectures and compilers. We design tomorrow's systems based on yesterday's applications. In this paper we investigate an emerging application, 3D scene understanding, likely to be significant in the mobile space in the near future. Until now, this application could only run in real-time on desktop GPUs. In this work, we examine how it can be mapped to power-constrained embedded systems. Key to our approach is the idea of incremental co-design exploration, where optimization choices that concern the domain layer are incrementally explored together with low-level compiler and architecture choices. The goal of this exploration is to reduce execution time while minimizing power and meeting our quality of result objective. As the design space is too large to exhaustively evaluate, we use active learning based on a random forest predictor to find good designs. We show that our approach can, for the first time, achieve dense 3D mapping and tracking in the real-time range within a 1W power budget on a popular embedded device. This is a 4.8× execution time improvement and a 2.8× power reduction compared to the state-of-the-art.

International Conference on Robotics and Automation | 2016

Comparative design space exploration of dense and semi-dense SLAM

M. Zeeshan Zia; Luigi Nardi; Andrew Jack; Emanuele Vespa; Bruno Bodin; Paul H. J. Kelly; Andrew J. Davison

SLAM has matured significantly over the past few years, and is beginning to appear in serious commercial products. While new SLAM systems are being proposed at every conference, evaluation is often restricted to qualitative visualizations or accuracy estimation against a ground truth. This is due to the lack of benchmarking methodologies which can holistically and quantitatively evaluate these systems. Furthermore, investigation at the level of individual kernels and parameter spaces of SLAM pipelines, which is essential for systems research and integration, is non-existent. We extend the recently introduced SLAMBench framework to allow comparing two state-of-the-art SLAM pipelines, namely KinectFusion and LSD-SLAM, along the metrics of accuracy, energy consumption, and processing frame rate on two different hardware platforms, namely a desktop and an embedded device. We also analyze the pipelines at the level of individual kernels and explore their algorithmic and hardware design spaces for the first time, yielding valuable insights.

IEEE International Conference on High Performance Computing, Data and Analytics | 2012

YAO: A Generator of Parallel Code for Variational Data Assimilation Applications

Luigi Nardi; Fouad Badran; Pierre Fortin; Sylvie Thiria

Variational data assimilation consists in estimating control parameters of a numerical model in order to minimize the misfit between the forecast values and the actual observations. The YAO framework is a code generator that facilitates, especially for the adjoint model, the writing and the generation of a variational data assimilation program for a given numerical application. In this paper we present how the modular graph specific to YAO enables the automatic and efficient parallelization of the generated code with OpenMP on shared memory architectures. Thanks to this modular graph we are also able to completely avoid data race conditions (write/write conflicts). Performance tests with actual applications demonstrate good speedups on a multicore CPU.

International Parallel and Distributed Processing Symposium | 2017

Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications Using HyperMapper

Luigi Nardi; Bruno Bodin; Sajad Saeedi; Emanuele Vespa; Andrew J. Davison; Paul H. J. Kelly

In this paper we investigate an emerging application, 3D scene understanding, likely to be significant in the mobile space in the near future. The goal of this exploration is to reduce execution time while meeting our quality of result objectives. In previous work, we showed for the first time that it is possible to map this application to power-constrained embedded systems, highlighting that decisions made at the algorithmic design level have the most significant impact. As the algorithmic design space is too large to be exhaustively evaluated, we use a previously introduced multi-objective random forest active learning prediction framework, dubbed HyperMapper, to find good algorithmic designs. We show that HyperMapper generalizes to a recent cutting-edge 3D scene understanding algorithm and to a modern GPU-based computer architecture. HyperMapper automatically outperforms an expert human hand-tuning the algorithmic parameters of the class of computer vision applications considered in this paper. In addition, we use crowd-sourcing via a 3D scene understanding Android app to show that the Pareto front obtained on an embedded system can be used to accelerate the same application on all 83 crowd-sourced smartphones and tablets, with speedups ranging from 2x to over 12x.
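The Pareto front mentioned in this abstract is the set of designs that no other design beats on every objective at once. A minimal Python sketch of that filtering step follows. It is not HyperMapper's implementation and involves no random forest or active learning; the objective pairs (runtime, error) and their values are hypothetical, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if design a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates (quadratic-time scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (runtime in seconds, trajectory error) pairs for four designs.
designs = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(designs)
print(front)  # (3.0, 4.0) is dropped because (2.0, 3.0) dominates it
```

A designer would then pick one point from `front` according to the accuracy or runtime limit of the target application.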

International Conference on Robotics and Automation | 2017

Application-oriented design space exploration for SLAM algorithms

Sajad Saeedi; Luigi Nardi; Edward Johns; Bruno Bodin; Paul H. J. Kelly; Andrew J. Davison

In visual SLAM, there are many software and hardware parameters, such as algorithmic thresholds and GPU frequency, that need to be tuned; however, this tuning should also take into account the structure and motion of the camera. In this paper, we determine the complexity of the structure and motion with a few parameters calculated using information theory. Depending on this complexity and the desired performance metrics, suitable parameters are explored and determined. Additionally, based on the proposed structure and motion parameters, several applications are presented, including a novel active SLAM approach which guides the camera in such a way that the SLAM algorithm achieves the desired performance metrics. Real-world and simulated experimental results demonstrate the effectiveness of the proposed design space and its applications.

Modeling, Analysis and Simulation on Computer and Telecommunication Systems | 2016

Diplomat: Mapping of Multi-kernel Applications Using a Static Dataflow Abstraction

Bruno Bodin; Luigi Nardi; Paul H. J. Kelly; Michael F. P. O'Boyle

In this paper we propose Diplomat, a novel task-graph based framework for programming heterogeneous embedded systems. Diplomat exploits the potential of static dataflow modeling and analysis to deliver performance estimation and CPU/GPU mapping. An application has to be specified once, and then the framework can automatically propose good mappings. We evaluate Diplomat with a computer vision application on two embedded platforms. Using the Diplomat generation we observed a 16% performance improvement on average and up to a 30% improvement over the best existing hand-coded implementation.

Geoscientific Model Development | 2016