Ismael Etxeberria-Agiriano

Publication

Featured research published by Ismael Etxeberria-Agiriano.

Engineering Applications of Artificial Intelligence | 2014

Reinforcement learning of ball screw feed drive controllers

Borja Fernandez-Gauna; Igor Ansoategui; Ismael Etxeberria-Agiriano; Manuel Graña

Feedback controllers for ball screw feed drives may provide great positioning accuracy, but there is no closed-form analytical solution from which to derive the desired controller. Reinforcement Learning (RL) is proposed to provide autonomous adaptation and learning of such controllers. The RL paradigm allows different approaches, which are tested in this paper to find the one best suited to ball screw drives. Specifically, five algorithms are compared on an accurate simulation model of a commercial device, with and without a noisy disturbance on the state observation values. Benchmark results are provided by a double-loop PID controller whose parameters have been tuned by random search optimization. Actor-critic methods with a continuous action space (Policy-Gradient and CACLA) outperform the PID controller in the computational experiments, encouraging future research.

Information Sciences | 2015

Reinforcement Learning endowed with safe veto policies to learn the control of Linked-Multicomponent Robotic Systems

Borja Fernandez-Gauna; Manuel Graña; Jose Manuel Lopez-Guede; Ismael Etxeberria-Agiriano; Igor Ansoategui

Reinforcement learning-based control of systems whose state space has many Undesired Terminal States (UTS) suffers from severe convergence problems. We define UTS as terminal states without associated positive reward information. They appear in the training of over-constrained systems, where breaking a constraint means that all the effort invested during a learning episode is lost without gathering any constructive information about how to achieve the target task. The random exploration performed by RL algorithms is unfruitful until the system reaches a final state bearing some reward that can be used to update the state-action value functions; hence UTS seriously impede the convergence of the learning process. The most efficient learning strategies avoid reaching any UTS, ensuring that each learning episode provides useful reward information. Safe Modular State-Action Veto (Safe-MSAV) policies learn specifically how to avoid state transitions leading to a UTS. The application of MSAV makes state space exploration much more efficient. A larger ratio of UTS to the total number of states provides greater improvements. Safe-MSAV uses independent concurrent modules, each dealing with a separate kind of UTS. We report experiments on the control of Linked Multicomponent Robotic Systems (L-MCRS) showing a dramatic decrease in the computational resources required, with faster and more accurate results than conventional exploration strategies that do not implement explicit mechanisms to avoid falling into UTS.
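The veto idea in this abstract can be illustrated with a minimal sketch. It is not the authors' Safe-MSAV implementation: the toy corridor environment, the `VetoModule` class, the `safe_actions` helper and all parameter values are assumptions made for illustration only. Each module remembers the state-action pairs that led into one kind of UTS, and action selection skips any pair vetoed by any module.

```python
import random

class VetoModule:
    """Hypothetical module: remembers state-action pairs seen to end in one kind of UTS."""
    def __init__(self):
        self.vetoed = set()

    def record(self, state, action):
        self.vetoed.add((state, action))

    def allows(self, state, action):
        return (state, action) not in self.vetoed

def safe_actions(state, actions, modules):
    """Actions that no module vetoes; fall back to all actions if every one is vetoed."""
    allowed = [a for a in actions if all(m.allows(state, a) for m in modules)]
    return allowed or list(actions)

def step(state, action, n_states):
    """Toy corridor (an assumption). Returns (next_state, reward, done, uts_kind)."""
    nxt = state + action
    if nxt < 0:
        return state, 0.0, True, "left_edge"   # UTS: fell off the left end
    if nxt > n_states - 1:
        return state, 0.0, True, "overshoot"   # UTS: jumped past the goal
    if nxt == n_states - 1:
        return nxt, 1.0, True, None            # goal state, the only rewarded terminal
    return nxt, 0.0, False, None

def train(episodes=200, n_states=6, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    actions = (-1, 1, 2)
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    modules = {"left_edge": VetoModule(), "overshoot": VetoModule()}
    uts_hits = 0
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 100:
            allowed = safe_actions(state, actions, modules.values())
            if rng.random() < epsilon:
                action = rng.choice(allowed)
            else:
                # Greedy choice among non-vetoed actions, ties broken at random.
                action = max(allowed, key=lambda a: (q[(state, a)], rng.random()))
            nxt, reward, done, uts = step(state, action, n_states)
            if uts is not None:
                modules[uts].record(state, action)  # learn the veto, no value update
                uts_hits += 1
            else:
                nxt_allowed = safe_actions(nxt, actions, modules.values())
                best_next = 0.0 if done else max(q[(nxt, a)] for a in nxt_allowed)
                q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            steps += 1
    return q, modules, uts_hits
```

In this toy setting each UTS-inducing pair can be tried at most once, because it is vetoed immediately afterwards, so almost every episode ends with usable reward information.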

International Journal of Neural Systems | 2015

Arm Orthosis/Prosthesis Movement Control Based on Surface EMG Signal Extraction

Aaron Suberbiola; Ekaitz Zulueta; Jose Manuel Lopez-Guede; Ismael Etxeberria-Agiriano; Manuel Graña

This paper shows experimental results on electromyography (EMG)-based system control applied to motorized orthoses. Biceps and triceps EMG signals are captured through two biometric sensors and then filtered and processed by an acquisition system. Finally, an output/control signal is produced using algorithms based on autoregressive (AR) models and neural networks, among others, and sent to the actuators, which then perform the actual movement. The research goal is to predict the desired movement of the lower arm through the analysis of EMG signals, so that the movement can be reproduced by an arm orthosis powered by two linear actuators. In this experiment, the best accuracy reached 91%, using a fourth-order AR model and a 100 ms block length.
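As a rough illustration of what a fourth-order AR model over 100 ms blocks involves, the sketch below fits AR coefficients to each block with the Yule-Walker equations, solved by the Levinson-Durbin recursion. The abstract does not state the estimation method or the sampling rate, so both are assumptions here (1000 Hz is a placeholder), and `fit_ar` and `ar_features` are hypothetical helper names.

```python
import math

def autocorrelation(block, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of a mean-removed block."""
    n = len(block)
    mean = sum(block) / n
    centered = [x - mean for x in block]
    return [sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]

def fit_ar(block, order=4):
    """Return a[0..order-1] such that x[t] is approximated by sum_k a[k] * x[t-1-k]."""
    r = autocorrelation(block, order)
    if r[0] == 0:
        return [0.0] * order                  # flat block: nothing to model
    coeffs, err = [], r[0]
    for m in range(1, order + 1):
        if err <= 1e-12:                      # numerically perfect fit: stop early
            break
        acc = r[m] - sum(coeffs[j] * r[m - 1 - j] for j in range(m - 1))
        refl = acc / err                      # reflection coefficient of stage m
        coeffs = [coeffs[j] - refl * coeffs[m - 2 - j] for j in range(m - 1)] + [refl]
        err *= 1.0 - refl * refl              # updated prediction error power
    return coeffs + [0.0] * (order - len(coeffs))

def ar_features(signal, sample_rate_hz=1000, block_ms=100, order=4):
    """Split a signal into consecutive blocks and fit one AR model per block."""
    block_len = int(sample_rate_hz * block_ms / 1000)
    return [fit_ar(signal[i:i + block_len], order)
            for i in range(0, len(signal) - block_len + 1, block_len)]

if __name__ == "__main__":
    # Synthetic stand-in for one EMG channel (an assumption, not real data).
    demo = [math.sin(2 * math.pi * 50 * t / 1000) for t in range(300)]
    for block_coeffs in ar_features(demo):
        print([round(c, 3) for c in block_coeffs])
```

The per-block coefficient vectors would then serve as features for a downstream classifier or predictor, such as the neural networks mentioned in the abstract.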

International Conference on Remote Engineering and Virtual Instrumentation | 2012

Distribution middleware technologies for Cyber Physical Systems

Isidro Calvo; Ismael Etxeberria-Agiriano; Adrian Noguero

Cyber-Physical Systems (CPS) are integrations of computation and physical processes. Systems of this kind are increasingly used in domains such as healthcare, transportation, process control, manufacturing and electric power grids. CPS interact with the physical world and must operate dependably, safely, securely, efficiently and, frequently, in real time. Consequently, they require new computing and networking technologies capable of supporting them adequately in environments qualitatively different from those found in general-purpose computing. This paper analyzes the applicability of different middleware technologies as data distribution means for CPS.

Iberian Conference on Information Systems and Technologies | 2014

The challenge of building a cyber physical system as an educational experience

Pablo González-Nalda; Isidro Calvo; Ismael Etxeberria-Agiriano; Alejandro Garcia-Ruiz; Sergio Martinez-Lesta; Daniel Caballero-Martin

Building Cyber-Physical Systems (CPS) is highly multidisciplinary, as it requires mastering several disciplines such as embedded computing, control and communications theory. It therefore seems appropriate to set challenges in this area for students in their final years of study. This paper presents an educational experience developed by the students of the Computer Engineering in Management and Information Systems Degree at the University College of Engineering of Vitoria-Gasteiz. As part of an elective course in the last year, students developed a small project where they could apply and reinforce concepts learned in previous courses. This experience allowed the students to reinforce skills already acquired in previous courses (such as programming, operating systems, and communication networks) and other generic skills such as working effectively in groups, integrating different technologies, proactive problem solving, and coping with the complexity of the systems.

International Conference on Remote Engineering and Virtual Instrumentation | 2012

Configurable cooperative middleware for the next generation of CPS

Ismael Etxeberria-Agiriano; Isidro Calvo; Adrian Noguero; Ekaitz Zulueta

Cyber-Physical Systems (CPS) form an emerging discipline that integrates embedded computers with the physical processes under control. Typically, cyber-physical applications include low-profile computing components, such as sensors and actuators, that must communicate to carry out complex tasks. They may be found in different application domains, e.g. intelligent buildings, industrial automation or critical infrastructure control. Applications of this kind require certain features such as autonomy, fault tolerance, energy efficiency, and the resolution of heterogeneity and configurability issues. However, managing communication in such applications can be relatively complex. In this scenario, middleware technologies can help developers in the design of the next generation of CPS. This work describes the design principles of a type of CPS that requires cooperation. More specifically, it presents a generic family of logical cooperation topologies capable of adapting dynamically to changes in the environment.

PLOS ONE | 2015

Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning

Borja Fernandez-Gauna; Ismael Etxeberria-Agiriano; Manuel Graña

Multi-Agent Reinforcement Learning (MARL) algorithms face two main difficulties: the curse of dimensionality, and environment non-stationarity due to the independent learning processes carried out by the agents concurrently. In this paper we formalize and prove the convergence of a Distributed Round-Robin Q-learning (D-RR-QL) algorithm for cooperative systems. The computational complexity of this algorithm increases linearly with the number of agents. Moreover, it eliminates environment non-stationarity by carrying out a round-robin scheduling of the action selection and execution. This learning scheme allows the implementation of Modular State-Action Vetoes (MSAV) in cooperative multi-agent systems, which speeds up learning convergence in over-constrained systems by vetoing state-action pairs that lead to undesired termination states (UTS) in the relevant state-action subspace. Each agent’s local state-action value function learning is an independent process, including the MSAV policies. Coordination of locally optimal policies to obtain the global optimal joint policy is achieved by a greedy selection procedure using message passing. We show that D-RR-QL improves over state-of-the-art approaches, such as Distributed Q-Learning, Team Q-Learning and Coordinated Reinforcement Learning, in a paradigmatic Linked Multi-Component Robotic System (L-MCRS) control problem: the hose transportation task. L-MCRS are over-constrained systems with many UTS induced by the interaction of the passive linking element and the active mobile robots.

International Journal of Advanced Robotic Systems | 2013

Designing High Performance Factory Automation Applications on Top of DDS

Isidro Calvo; Federico Pérez; Ismael Etxeberria-Agiriano; Oier García de Albéniz

DDS is a recent specification aimed at providing high-performance publisher/subscriber middleware solutions. Despite being a powerful and flexible technology, it may prove complex to use, especially for the inexperienced. This work provides some guidelines for connecting software components that represent a new generation of automation devices (such as PLCs, IPCs and robots) using Data Distribution Service (DDS) as a virtual software bus. More specifically, it presents the design of a DDS-based component, the so-called Automation Component, and discusses how to map different traffic patterns onto DDS entities, exploiting the wealth of QoS management mechanisms provided by the DDS specification. A case study demonstrates the creation of factory automation applications out of software components that encapsulate independent stations.

Iberian Conference on Information Systems and Technologies | 2015

Flexible, modular, standard, free and affordable model for CPS control applied to mobile robotics

Pablo González-Nalda; Ismael Etxeberria-Agiriano; Isidro Calvo

Cyber-physical systems (CPS) are well suited for both research and teaching. This paper presents an architecture for the design of CPS controllers based on free hardware and software. The advantages of this system include its flexibility, its modular structure, which facilitates concurrent development, and its robustness, with easy evolution of existing systems. An application case study that highlights the potential of this model is detailed. Low-cost modules have been chosen. Because they are widely used, they have a large community of developers, and plenty of related technical information can be found. They are based on free hardware and widely available operating systems. Modularity allows them to be implemented on existing systems, enabling easy evolution. Another significant advantage of the proposed model for both research and teaching is that, due to its low cost, it allows more mobile robots to be built.

International Work-Conference on the Interplay Between Natural and Artificial Computation | 2013

An Empirical Study of Actor-Critic Methods for Feedback Controllers of Ball-Screw Drivers

Borja Fernandez-Gauna; Igor Ansoategui; Ismael Etxeberria-Agiriano; Manuel Graña

In this paper we study the use of Reinforcement Learning Actor-Critic methods to learn the control of a ball-screw feed drive. We have tested three different actors: Q-value based, Policy Gradient and CACLA actors. We have paid special attention to the sensitivity to suboptimal learning gain tuning. As a benchmark, we have used randomly initialized PID controllers. CACLA provides a stable control comparable to the best heuristically tuned PID controller, despite its lack of knowledge of the actual error value.
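For readers unfamiliar with CACLA (Continuous Actor-Critic Learning Automaton), the sketch below shows a single update step with linear function approximators. It follows the general CACLA rule, in which the actor moves toward the executed action only when the temporal-difference error is positive; it is not the implementation used in the paper. The feature vectors, the learning gains `alpha` and `beta`, the discount `gamma` and the function name `cacla_update` are illustrative assumptions.

```python
def dot(weights, features):
    """Inner product of two equal-length lists."""
    return sum(w * f for w, f in zip(weights, features))

def cacla_update(actor_w, critic_w, features, action, reward, next_features,
                 done, alpha=0.1, beta=0.1, gamma=0.95):
    """One CACLA step with a linear actor and a linear critic.

    Returns (new_actor_w, new_critic_w, td_error). All gains are placeholder values.
    """
    value = dot(critic_w, features)
    next_value = 0.0 if done else dot(critic_w, next_features)
    td_error = reward + gamma * next_value - value

    # Critic: standard TD(0) update of the state-value estimate.
    new_critic = [w + beta * td_error * f for w, f in zip(critic_w, features)]

    if td_error > 0:
        # Actor: pull the predicted action toward the action actually executed.
        # The step depends on the sign of the TD error, not on its magnitude.
        predicted = dot(actor_w, features)
        new_actor = [w + alpha * (action - predicted) * f
                     for w, f in zip(actor_w, features)]
    else:
        # Outcome no better than expected: leave the actor unchanged.
        new_actor = list(actor_w)
    return new_actor, new_critic, td_error
```

With zero initial weights, a single feature of 1.0, an executed action of 0.5 and a reward of 1.0 on a terminal step, the TD error is 1.0, so the actor weight moves to 0.05 and the critic weight to 0.1. The same call with a reward of -1.0 leaves the actor untouched.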
