Borja Fernandez-Gauna
University of the Basque Country
Publications
Featured research published by Borja Fernandez-Gauna.
Robotics and Autonomous Systems | 2013
Borja Fernandez-Gauna; Jose Manuel Lopez-Guede; Manuel Graña
Transfer learning is a hierarchical approach to reinforcement learning of complex tasks modeled as Markov Decision Processes. The learning results on the source task are used as the starting point for learning on the target task. In this paper we deal with a hierarchy of constrained systems, where the source task is an under-constrained system, hence called the Partially Constrained Model (PCM). Constraints in the framework of reinforcement learning are dealt with by state-action veto policies. We propose a theoretical background for the hierarchy of training refinements, showing that the effective action repertoires learnt on the PCM are maximal, and that the PCM-optimal policy gives maximal state value functions. We apply the approach to learn the control of Linked Multicomponent Robotic Systems using Reinforcement Learning. The paradigmatic example is the transportation of a hose. The system has strong physical constraints and a large state space. Learning experiments in the target task are run on an accurate but computationally expensive simulation of the hose dynamics. The PCM is obtained by simplifying the hose model. Learning results with PCM transfer learning show a spectacular improvement over conventional Q-learning on the target task.
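A minimal sketch of the veto idea described above, assuming a tabular Q-learner, a hypothetical environment interface (reset/step/actions) and a veto(state, action) predicate; it illustrates state-action veto policies in general, not the authors' implementation.

import random
from collections import defaultdict

def q_learning_with_veto(env, veto, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    # Q[(state, action)] -> estimated value; actions failing the veto test are
    # removed from the choice set, which is how constraints are enforced.
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            allowed = [a for a in env.actions(s) if not veto(s, a)] or list(env.actions(s))
            a = random.choice(allowed) if random.random() < eps else max(allowed, key=lambda x: Q[(s, x)])
            s2, r, done = env.step(a)
            nxt = [b for b in env.actions(s2) if not veto(s2, b)]
            best = max((Q[(s2, b)] for b in nxt), default=0.0)
            Q[(s, a)] += alpha * (r + (0.0 if done else gamma * best) - Q[(s, a)])
            s = s2
    return Q

In the transfer setting above, the table learned on the PCM would simply initialize Q before training resumes on the fully constrained target task.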
Cybernetics and Systems | 2012
Jose Manuel Lopez-Guede; Borja Fernandez-Gauna; Manuel Graña; Ekaitz Zulueta
Single robot hose transport is a limit case of linked multicomponent robotic systems, where one robot moves the tip of a hose to a desired position. The interaction between the passive, flexible hose and the robot introduces highly nonlinear effects in the system dynamics, requiring innovative control design approaches, such as reinforcement learning. This article improves previous approaches to this problem by introducing a novel reinforcement learning algorithm (TRQ-learning) and a new system state definition for the autonomous derivation of the hose–robot control algorithm. Computational experiments based on accurate Geometrically Exact Dynamic Splines simulations of the hose dynamics show the improvement obtained.
Information Sciences | 2013
Borja Fernandez-Gauna; Ion Marqués; Manuel Graña
The paper deals with the problem of learning the control of Multi-Component Robotic Systems (MCRSs) applying Multi-Agent Reinforcement Learning (MARL) algorithms. Modeling Linked MCRSs usually leads to over-constrained environments, posing great difficulties for efficient learning with conventional single- and multi-agent reinforcement learning algorithms. In this paper, we propose a hybrid learning algorithm composed of a modified Q-Learning algorithm embedding an Undesired State-Action Prediction (USAP) module, trained by supervised learning to predict transitions into states that break physical constraints. The USAP module's output is used by the Q-Learning algorithm to prevent these undesired transitions, thereby boosting learning efficiency. This hybrid approach is extended to the multi-agent case by embedding the USAP module in Distributed Round-Robin Q-Learning (D-RR-QL), which requires very little communication among agents. We present results of computational experiments conducted in the classical multi-agent taxi scheduling task and a hose transportation task. Results show a considerable learning gain, both in time and accuracy, compared to the state-of-the-art Distributed Q-Learning approach in the deterministic taxi scheduling task. In the hose transportation task, the USAP module introduces a significant improvement in learning convergence speed.
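An illustrative sketch of the hybrid idea, assuming a scikit-learn logistic-regression classifier as the USAP model and a hypothetical encode(s, a) feature map; the classifier choice, threshold and interface are assumptions for illustration, not the paper's implementation.

import random
from sklearn.linear_model import LogisticRegression

def train_usap(transitions):
    # transitions: list of (encode(s, a), violated) pairs, where violated is 1
    # if the observed transition ended in a state breaking a physical constraint.
    X, y = zip(*transitions)
    return LogisticRegression(max_iter=1000).fit(list(X), list(y))

def select_action(Q, usap, s, actions, encode, eps=0.1, threshold=0.5):
    # Filter out actions the USAP model flags as likely constraint violations,
    # then act epsilon-greedily on the remaining ones.
    safe = [a for a in actions if usap.predict_proba([encode(s, a)])[0][1] < threshold]
    pool = safe or list(actions)    # fall back if every action looks unsafe
    return random.choice(pool) if random.random() < eps else max(pool, key=lambda a: Q[(s, a)])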
Engineering Applications of Artificial Intelligence | 2014
Borja Fernandez-Gauna; Igor Ansoategui; Ismael Etxeberria-Agiriano; Manuel Graña
Feedback controllers for ball screw feed drives may provide great accuracy in positioning, but there is no closed-form analytical solution for deriving the desired controller. Reinforcement Learning (RL) is proposed to adapt and learn such controllers autonomously. The RL paradigm allows different approaches, which are tested in this paper looking for the one best suited to ball screw feed drives. Specifically, five algorithms are compared on an accurate simulation model of a commercial device, with and without a noisy disturbance on the state observation values. Benchmark results are provided by a double-loop PID controller, whose parameters have been tuned by a random search optimization. Actor-critic methods with continuous action space (Policy-Gradient and CACLA) outperform the PID controller in the computational experiments, encouraging future research.
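For context, a compact sketch of a CACLA-style update with linear function approximation; the feature map, scalar action and learning rates are illustrative assumptions, and this is not the paper's experimental code.

import numpy as np

def cacla_step(theta_actor, w_critic, features, s, s2, a, r, gamma=0.99, alpha=0.01, beta=0.01):
    # Critic: one-step TD update of a linear state-value estimate.
    phi, phi2 = features(s), features(s2)
    delta = r + gamma * (w_critic @ phi2) - (w_critic @ phi)
    w_critic = w_critic + beta * delta * phi
    # Actor (CACLA rule): move the policy mean toward the explored action only
    # when that action turned out better than expected (positive TD error).
    if delta > 0:
        mu = theta_actor @ phi
        theta_actor = theta_actor + alpha * (a - mu) * phi
    return theta_actor, w_critic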
hybrid artificial intelligence systems | 2010
Borja Fernandez-Gauna; Jose Manuel Lopez-Guede; Ekaitz Zulueta
Linked Multicomponent Robotic Systems are characterized by the existence of a non-rigid linking element. This linking element can produce many dynamical effects that introduce perturbations of the basic system behavior, different from uncoupled systems. We show, through a simulation of a distributed control of a hose transportation system, that even a minimal dynamical feature of the hose (elastic forces opposing stretching) can produce significant behavior perturbations.
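A toy illustration of the single dynamical feature mentioned above, modeling the elastic force that opposes stretching as a linear spring between consecutive hose nodes; the node discretization, stiffness and rest length are hypothetical and unrelated to the paper's simulator.

import numpy as np

def elastic_forces(nodes, rest_len=1.0, k=50.0):
    # nodes: (N, 2) array of hose node positions; returns per-node forces that
    # pull over-stretched segments back toward their rest length.
    forces = np.zeros_like(nodes, dtype=float)
    for i in range(len(nodes) - 1):
        d = nodes[i + 1] - nodes[i]
        length = np.linalg.norm(d)
        if length > rest_len:               # only oppose stretching, not compression
            f = k * (length - rest_len) * d / length
            forces[i] += f
            forces[i + 1] -= f
    return forces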
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems | 2013
Jose Manuel Lopez-Guede; Borja Fernandez-Gauna; Manuel Graña
This paper addresses the problem of efficiency in reinforcement learning of Single Robot Hose Transport (SRHT) by training an Extreme Learning Machine (ELM) from the state-action value Q-table, obtaining a large reduction in data space requirements because the number of ELM parameters is much smaller than the Q-table's size. Moreover, the ELM implements a continuous map which can produce compact representations of the Q-table, and generalizations to increased space resolution and unknown situations. In this paper we empirically evaluate three strategies to formulate ELM learning to provide approximations to the Q-table, namely as classification, multi-variate regression and several independent regression problems.
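A hedged sketch of the general ELM compression idea, assuming the multi-variate regression formulation with an encoded state as input and one Q-value per action as output; sizes, encoding and activation are illustrative choices, not the paper's settings.

import numpy as np

def elm_fit(X, Y, hidden=200, seed=0):
    # X: (n, d) encoded states; Y: (n, |A|) corresponding rows of the Q-table.
    # Input weights are random and fixed; only the output weights are trained,
    # in closed form via the pseudoinverse of the hidden-layer activations.
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.pinv(H) @ Y
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta        # approximate Q-values, one column per action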
hybrid artificial intelligence systems | 2011
Jose Manuel Lopez-Guede; Borja Fernandez-Gauna; Manuel Graña; Ekaitz Zulueta
Non-rigid physical elements attached to robotic systems introduce non-linear dynamics that require innovative control approaches. This paper describes some of our results applying Q-Learning to learn the control commands that solve a hose transportation problem. The learning process is developed in a simulated environment. Computationally expensive but dynamically accurate Geometrically Exact Dynamic Splines (GEDS) have been used to model the hose transported by a single robot, showing the difficulties of controlling flexible, elastic, passive linking elements.
hybrid artificial intelligence systems | 2011
Borja Fernandez-Gauna; Jose Manuel Lopez-Guede; Manuel Graña
When conventional Q-Learning is applied to Multi-Component Robotic Systems (MCRS), increasing the number of components produces an exponential growth of state storage requirements. Modular approaches make the state size grow polynomially with the number of components, making its representation and manipulation more manageable. In this article, we take the first steps towards a modular Q-learning approach to learn the distributed control of a Linked MCRS, a specific type of MCRS in which the individual robots are linked by a passive element. We have chosen a paradigmatic application of this kind of system: a set of robots carrying the tip of a hose from some initial position to a desired goal. The hose dynamics is simplified to a distance constraint on the robots' positions.
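A minimal sketch of the modular idea discussed above, keeping one Q-table per component over its local state and combining module values additively to pick a joint greedy action; the shared reward, local-state extraction and additive combination are assumptions made for illustration, not the article's algorithm.

from collections import defaultdict

class ModularQ:
    def __init__(self, n_modules, actions, alpha=0.1, gamma=0.95):
        # One small table per module instead of one table over the joint state,
        # so storage grows with the number of modules rather than exponentially.
        self.Q = [defaultdict(float) for _ in range(n_modules)]
        self.actions, self.alpha, self.gamma = list(actions), alpha, gamma

    def greedy(self, local_states):
        # Score each action by the sum of per-module values.
        return max(self.actions,
                   key=lambda a: sum(q[(s, a)] for q, s in zip(self.Q, local_states)))

    def update(self, local_states, a, r, next_states):
        # Each module performs its own Q-learning update on its local view,
        # here assuming all modules receive the same scalar reward.
        for q, s, s2 in zip(self.Q, local_states, next_states):
            best = max(q[(s2, b)] for b in self.actions)
            q[(s, a)] += self.alpha * (r + self.gamma * best - q[(s, a)])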
Paladyn | 2011
Manuel Graña; Borja Fernandez-Gauna; Jose Manuel Lopez-Guede
Reinforcement Learning (RL) as a paradigm aims to develop algorithms that allow training an agent to optimally achieve a goal with minimal feedback information about the desired behavior, which is not precisely specified. Scalar rewards are returned to the agent in response to its actions, endorsing or opposing them. RL algorithms have been successfully applied to robot control design. The extension of the RL paradigm to cope with the design of control systems for Multi-Component Robotic Systems (MCRS) poses new challenges, mainly related to coping with the scaling-up of complexity due to exponential state space growth, coordination issues, and the propagation of rewards among agents. In this paper, we identify the main issues which offer opportunities to develop innovative solutions towards fully-scalable cooperative multi-agent systems.
Neurocomputing | 2015
Jose Manuel Lopez-Guede; Borja Fernandez-Gauna; Jose Antonio Ramos-Hernanz
Autonomous task learning for Linked Multicomponent Robotic Systems (L-MCRS) is an open research issue. Pilot studies applying Reinforcement Learning (RL) to the Single Robot Hose Transport (SRHT) task need extensive simulations of the L-MCRS involved in the task. The Geometrically Exact Dynamic Spline (GEDS) simulator used for accurate simulation of the dynamics of the overall system is computationally expensive, so it is infeasible to carry out extensive learning experiments based on it. In this paper we address the problem of learning the dynamics of the L-MCRS encapsulated in the GEDS simulator using an Extreme Learning Machine (ELM) approach. Taking advantage of the adaptability and flexibility of ELMs, we have formalized the problem of learning the hose geometry as a multi-variate regression problem. Empirical evaluation of this strategy achieves remarkably accurate approximation results.