Publication


Featured research published by Matteo Leonetti.


AI Magazine | 2015

The 2014 International Planning Competition: Progress and Trends

Stefano V. Albrecht; J. Christopher Beck; David L. Buckeridge; Adi Botea; Cornelia Caragea; Chi-Hung Chi; Theodoros Damoulas; Bistra Dilkina; Eric Eaton; Pooyan Fazli; Sam Ganzfried; C. Lee Giles; Sébastien Guillet; Robert C. Holte; Frank Hutter; Thorsten Koch; Matteo Leonetti; Marius Lindauer; Marlos C. Machado; Yuri Malitsky; Gary F. Marcus; Sebastiaan Meijer; Francesca Rossi; Arash Shaban-Nejad; Sylvie Thiébaux; Manuela M. Veloso; Toby Walsh; Can Wang; Jie Zhang; Yu Zheng

We review the 2014 International Planning Competition (IPC-2014), the eighth in a series of competitions starting in 1998. IPC-2014 was held in three separate parts to assess the state of the art in three prominent areas of planning research: the deterministic (classical) part (IPCD), the learning part (IPCL), and the probabilistic part (IPPC). Each part evaluated planning systems in ways that pushed the edge of existing planner performance by introducing new challenges, novel tasks, or both. The competition again surpassed its predecessor in the number of competitors, highlighting its central role in shaping the landscape of ongoing developments in evaluating planning systems.


intelligent robots and systems | 2013

On-line identification of autonomous underwater vehicles through global derivative-free optimization

George C. Karras; Charalampos P. Bechlioulis; Matteo Leonetti; Narcís Palomeras; Petar Kormushev; Kostas J. Kyriakopoulos; Darwin G. Caldwell

We describe the design and implementation of an on-line identification scheme for Autonomous Underwater Vehicles (AUVs). The proposed method estimates the dynamic parameters of the vehicle based on a global derivative-free optimization algorithm. It is not sensitive to initial conditions, unlike other on-line identification schemes, and does not depend on the differentiability of the model with respect to the parameters. The identification scheme consists of three distinct modules: a) System Excitation, b) Metric Calculator, and c) Optimization Algorithm. The System Excitation module sends excitation inputs to the vehicle. The Optimization Algorithm module calculates a candidate parameter vector, which is fed to the Metric Calculator module. The Metric Calculator module evaluates the candidate parameter vector, using a metric based on the residual of the actual and the predicted commands. The predicted commands are calculated utilizing the candidate parameter vector and the vehicle state vector, which is available via a complete navigation module. Then, the metric is directly fed back to the Optimization Algorithm module, and it is used to correct the estimated parameter vector. The procedure continues iteratively until the convergence properties are met. The proposed method is generic, demonstrates quick convergence and does not require a linear formulation of the model with respect to the parameter vector. The applicability and performance of the proposed algorithm are experimentally verified using the AUV Girona 500.
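The excite-evaluate-correct loop between the three modules can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the vehicle dynamics are reduced to a hypothetical linear map, and a plain random search with a shrinking step stands in for the paper's global derivative-free optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in dynamics: the command is a linear function of
# the state, parameterised by theta (illustrating the AUV parameters).
true_theta = np.array([1.5, -0.7, 0.3])

def excite(t):
    """System Excitation module (sketch): produce a state vector."""
    return np.array([np.sin(t), np.cos(t), 1.0])

def metric(theta, states, commands):
    """Metric Calculator: residual of actual vs. predicted commands."""
    return float(np.mean((commands - states @ theta) ** 2))

# Excitation data (gathered on-line in the real scheme).
states = np.stack([excite(0.1 * k) for k in range(100)])
commands = states @ true_theta

# Optimization Algorithm module: a simple global derivative-free random
# search with a shrinking step, standing in for the paper's optimizer.
best_theta = rng.normal(size=3)
best_cost = metric(best_theta, states, commands)
for i in range(5000):
    scale = max(0.01, 0.3 * 0.999 ** i)
    candidate = best_theta + rng.normal(scale=scale, size=3)
    cost = metric(candidate, states, commands)
    if cost < best_cost:  # the metric feeds back to correct the estimate
        best_theta, best_cost = candidate, cost
```

Because only metric evaluations are used, the sketch, like the scheme it illustrates, never needs derivatives of the model with respect to the parameters.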


The International Journal of Robotics Research | 2017

BWIBots: A platform for bridging the gap between AI and human–robot interaction research

Piyush Khandelwal; Shiqi Zhang; Jivko Sinapov; Matteo Leonetti; Jesse Thomason; Fangkai Yang; Ilaria Gori; Maxwell Svetlik; Priyanka Khante; Vladimir Lifschitz; Jake K. Aggarwal; Raymond J. Mooney; Peter Stone

Recent progress in both AI and robotics have enabled the development of general purpose robot platforms that are capable of executing a wide variety of complex, temporally extended service tasks in open environments. This article introduces a novel, custom-designed multi-robot platform for research on AI, robotics, and especially human–robot interaction for service robots. Called BWIBots, the robots were designed as a part of the Building-Wide Intelligence (BWI) project at the University of Texas at Austin. The article begins with a description of, and justification for, the hardware and software design decisions underlying the BWIBots, with the aim of informing the design of such platforms in the future. It then proceeds to present an overview of various research contributions that have enabled the BWIBots to better (a) execute action sequences to complete user requests, (b) efficiently ask questions to resolve user requests, (c) understand human commands given in natural language, and (d) understand human intention from afar. The article concludes with a look forward towards future research opportunities and applications enabled by the BWIBot platform.


international conference on robotics and automation | 2014

Online Discovery of AUV Control Policies to Overcome Thruster Failures

Seyed Reza Ahmadzadeh; Matteo Leonetti; Arnau Carrera; Marc Carreras; Petar Kormushev; Darwin G. Caldwell

We investigate methods to improve fault-tolerance of Autonomous Underwater Vehicles (AUVs) to increase their reliability and persistent autonomy. We propose a learning-based approach that is able to discover new control policies to overcome thruster failures as they happen. The proposed approach is a model-based direct policy search that learns on an on-board simulated model of the AUV. The model is adapted to a new condition when a fault is detected and isolated. Since the approach generates an optimal trajectory, the learned fault-tolerant policy is able to navigate the AUV towards a specified target with minimum cost. Finally, the learned policy is executed on the real robot in closed loop using the state feedback of the AUV. Unlike most existing methods which rely on the redundancy of thrusters, our approach is also applicable when the AUV becomes under-actuated in the presence of a fault. To validate the feasibility and efficiency of the presented approach, we evaluate it with three learning algorithms and three policy representations with increasing complexity. The proposed method is tested on a real AUV, Girona 500.


International Conference on Computing, Networking and Communications (ICNC) | 2012

Self-tuning batching in total order broadcast protocols via analytical modelling and reinforcement learning

Paolo Romano; Matteo Leonetti

Batching is a well known technique to boost the throughput of Total Order Broadcast (TOB) protocols. Unfortunately, its manual configuration is not only a time consuming process, but also a very delicate one, as incorrect settings of the batching parameter can lead to severe performance degradation. In this paper we address precisely this issue, by presenting an innovative mechanism for self-tuning the batching level in TOB protocols. Our solution combines analytical modeling and reinforcement learning techniques, taking the best of these two worlds: drastic reductions of the learning time and the ability to correct inaccurate predictions by accumulating feedback from the operation of the system.
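The core idea, using an analytical model to initialise the batching level and feedback to correct it, can be illustrated with a toy sketch. The throughput model, its constants, and the learning rule below are all assumptions made for illustration, not the paper's actual model or protocol.

```python
import random
random.seed(0)

# Hypothetical throughput model: per-message ordering overhead amortised
# over the batch; all constants are illustrative assumptions.
def analytical_throughput(batch, arrival_rate=100.0, order_cost=0.02, per_msg=0.001):
    latency = order_cost + per_msg * batch
    return min(arrival_rate, batch / latency)

candidates = range(1, 65)

# Step 1: the analytical model picks an initial batching level,
# drastically cutting the learning time.
batch = max(candidates, key=analytical_throughput)

# The "real" system, which the model mispredicts (higher ordering cost).
def observed_throughput(b):
    return analytical_throughput(b, order_cost=0.05)

# Step 2: reinforcement-learning-style feedback corrects the model.
estimates = {b: analytical_throughput(b) for b in candidates}
for _ in range(200):
    reward = observed_throughput(batch)
    estimates[batch] += 0.5 * (reward - estimates[batch])
    # mostly greedy over the corrected estimates, with some exploration
    if random.random() < 0.1:
        batch = random.choice(list(candidates))
    else:
        batch = max(candidates, key=estimates.get)

best = max(candidates, key=estimates.get)
```

Starting from the model's optimistic predictions, each observation pulls the estimate for the current batching level toward the measured throughput, so the tuner settles on a level that performs well on the real system rather than in the model.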


Cybernetics and Information Technologies | 2012

Combining Local and Global Direct Derivative-Free Optimization for Reinforcement Learning

Matteo Leonetti; Petar Kormushev; Simone Sagratella

We consider the problem of optimization in policy space for reinforcement learning. While a plethora of methods have been applied to this problem, only a narrow category of them proved feasible in robotics. We consider the peculiar characteristics of reinforcement learning in robotics, and devise a combination of two algorithms from the literature of derivative-free optimization. The proposed combination is well suited for robotics, as it involves both off-line learning in simulation and on-line learning in the real environment. We demonstrate our approach on a real-world task, where an Autonomous Underwater Vehicle has to survey a target area under potentially unknown environment conditions. We start from a given controller, which can perform the task under foreseeable conditions, and make it adaptive to the actual environment.
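The off-line/on-line combination can be illustrated with a toy sketch. The two quadratic objectives, and the choice of uniform global sampling followed by compass search, are assumptions standing in for the actual algorithms combined in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative objectives: the simulator is a shifted copy of the real
# environment (model mismatch); both are assumptions for this sketch.
def simulated_cost(x):
    return float(np.sum((x - np.array([1.0, -2.0])) ** 2))

def real_cost(x):
    return float(np.sum((x - np.array([1.2, -1.8])) ** 2))

# Off-line phase: global derivative-free search in simulation.
samples = rng.uniform(-5, 5, size=(2000, 2))
x = samples[np.argmin([simulated_cost(s) for s in samples])]

# On-line phase: local derivative-free (compass) search on the real
# system, refining the policy parameters found in simulation.
step, fx = 0.5, real_cost(x)
while step > 1e-3:
    improved = False
    for d in np.vstack([np.eye(2), -np.eye(2)]):
        cand = x + step * d
        fc = real_cost(cand)
        if fc < fx:
            x, fx, improved = cand, fc, True
    if not improved:
        step *= 0.5
```

The global phase absorbs the expensive exploration in simulation, so that only cheap local refinement remains to be done on the real system, which is exactly why the combination suits robotics.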


european conference on machine learning | 2011

Reinforcement learning through global stochastic search in N-MDPs

Matteo Leonetti; Luca Iocchi; Subramanian Ramamoorthy

Reinforcement Learning (RL) in either fully or partially observable domains usually poses a requirement on the knowledge representation in order to be sound: the underlying stochastic process must be Markovian. In many applications, including those involving interactions between multiple agents (e.g., humans and robots), sources of uncertainty affect rewards and transition dynamics in such a way that a Markovian representation would be computationally very expensive. An alternative formulation of the decision problem involves partially specified behaviors with choice points. While this reduces the complexity of the policy space that must be explored - something that is crucial for realistic autonomous agents that must bound search time - it does render the domain Non-Markovian. In this paper, we present a novel algorithm for reinforcement learning in Non-Markovian domains. Our algorithm, Stochastic Search Monte Carlo, performs a global stochastic search in policy space, shaping the distribution from which the next policy is selected by estimating an upper bound on the value of each action. We experimentally show how, in challenging domains for RL, high-level decisions in Non-Markovian processes can lead to a behavior that is at least as good as the one learned by traditional algorithms, and can be achieved with significantly fewer samples.


artificial intelligence methodology systems applications | 2012

Automatic generation and learning of finite-state controllers

Matteo Leonetti; Luca Iocchi; Fabio Patrizi

We propose a method for generating and learning agent controllers, which combines techniques from automated planning and reinforcement learning. An incomplete description of the domain is first used to generate a non-deterministic automaton able to act (sub-optimally) in the given environment. Such a controller is then refined through experience, by learning choices at non-deterministic points. On the one hand, the incompleteness of the model, which would make a pure-planning approach ineffective, is overcome through learning. On the other hand, the available portion of the domain description drives the learning process, which otherwise would be excessively expensive. Our method makes it possible to adapt the behavior of a given planner to the environment, coping with the unavoidable discrepancies between the model and the environment. We provide quantitative experiments with a simulator of a mobile robot to assess the performance of the proposed method.
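The generate-then-learn scheme can be sketched with a single non-deterministic choice point refined by a simple value update. The choice point, rewards, and epsilon-greedy rule below are illustrative assumptions, not the paper's construction.

```python
import random

random.seed(0)

# Sketch: the generated controller leaves one non-deterministic choice
# point (a "junction"); the routes and rewards are assumptions.
CHOICES = ["left", "right"]

def execute(choice):
    """Simulated environment: 'right' works better, which the
    incomplete model could not predict."""
    return (1.0 if choice == "right" else 0.4) + random.gauss(0, 0.05)

q = {c: 0.0 for c in CHOICES}  # learned value of each choice
for episode in range(500):
    # epsilon-greedy selection at the non-deterministic point
    c = random.choice(CHOICES) if random.random() < 0.2 else max(q, key=q.get)
    q[c] += 0.1 * (execute(c) - q[c])  # incremental update from experience

refined = max(q, key=q.get)  # the choice resolved by learning
```

The planner constrains learning to the few points it could not resolve, so experience is spent only where the model is incomplete.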


robot soccer world cup | 2011

LearnPNP: a tool for learning agent behaviors

Matteo Leonetti; Luca Iocchi

High-level programming of robotic agents requires the use of a representation formalism able to cope with several sources of complexity (e.g. parallel execution, partial observability, exogenous events, etc.) and the ability of the designer to model the domain in a precise way. Reinforcement Learning has proved promising in improving the performance, adaptability and robustness of plans in under-specified domains, although it does not scale well with the complexity of common robotic applications. In this paper we propose to combine an extremely expressive plan representation formalism (Petri Net Plans), with Reinforcement Learning over a stochastic process derived directly from such a plan. The derived process has a significantly reduced search space, so the framework scales well with the complexity of the domain and makes it possible to improve the performance of complex behaviors from experience. To prove the effectiveness of the system, we show how to model and learn the behavior of the robotic agents in the context of Keepaway Soccer (a widely accepted benchmark for RL) and the RoboCup Standard Platform League.


measurement and modeling of computer systems | 2011

Poster: self-tuning batching in total order broadcast via analytical modelling and reinforcement learning

Paolo Romano; Matteo Leonetti

Total order broadcast [2] (TOB) is a fundamental problem in distributed computing, which requires a set of processes to reach agreement on the delivery order of concurrently broadcast messages. Batching is a well known technique that allows boosting the throughput of Total Order Broadcast (TOB) protocols by amortizing the per-message ordering overhead across a set of incoming messages. Unfortunately, the manual configuration of the optimal batching level is a time consuming and delicate process, as incorrect tuning can lead to severe performance degradation. In this paper, we overview an innovative mechanism for self-tuning the batching level of TOB protocols (a detailed description of which can be found in [3]), which combines analytical modeling and Reinforcement Learning (RL) techniques, to take the best of the two worlds: minimizing learning time and accumulating feedback from the operation of the system to enhance the self-tuning accuracy over time.

Collaboration


An overview of Matteo Leonetti's collaborations.

Top Co-Authors

Luca Iocchi, Sapienza University of Rome
Peter Stone, University of Texas at Austin
Petar Kormushev, Istituto Italiano di Tecnologia
Daniele Nardi, Sapienza University of Rome
Fangkai Yang, University of Texas at Austin
Piyush Khandelwal, University of Texas at Austin
Darwin G. Caldwell, Istituto Italiano di Tecnologia