
Publication


Featured research published by Francisco S. Melo.


International Conference on Machine Learning | 2008

An analysis of reinforcement learning with function approximation

Francisco S. Melo; Sean P. Meyn; M. Isabel Ribeiro

We address the problem of computing the optimal Q-function in Markov decision problems with infinite state-space. We analyze the convergence properties of several variations of Q-learning when combined with function approximation, extending the analysis of TD-learning in (Tsitsiklis & Van Roy, 1996a) to stochastic control settings. We identify conditions under which such approximate methods converge with probability 1. We conclude with a brief discussion on the general applicability of our results and compare them with several related works.
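For reference, the generic semi-gradient template behind such methods can be written as follows; this is the standard textbook form, not the specific variants or convergence conditions established in the paper:

```latex
\theta_{t+1} = \theta_t + \alpha_t \left( r_t + \gamma \max_{a'} Q_{\theta_t}(s_{t+1}, a') - Q_{\theta_t}(s_t, a_t) \right) \nabla_\theta Q_\theta(s_t, a_t) \Big|_{\theta = \theta_t}
```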


Intelligent Robots and Systems | 2007

Affordance-based imitation learning in robots

Manuel Lopes; Francisco S. Melo; Luis Montesano

In this paper we build an imitation learning algorithm for a humanoid robot on top of a general world model provided by learned object affordances. We consider that the robot has previously learned a task independent affordance-based model of its interaction with the world. This model is used to recognize the demonstration by another agent (a human) and infer the task to be learned. We discuss several important problems that arise in this combined framework, such as the influence of an inaccurate model in the recognition of the demonstration. We illustrate the ideas in the paper with some experimental results obtained with a real robot.
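To make the recognition step concrete, here is a minimal sketch of how a learned affordance model P(effect | action) can be used to infer a demonstrator's action from observed effects; the actions, effects, and probability table are invented stand-ins, not the paper's affordance network:

```python
# Hypothetical sketch: Bayesian recognition of a demonstrated action using a
# previously learned affordance model P(effect | action). All entries below
# are made-up placeholders for illustration.

actions = ["grasp", "tap", "touch"]
# P(effect | action), learned beforehand through self-exploration
p_effect_given_action = {
    "grasp": {"object_moves": 0.8, "object_stays": 0.2},
    "tap":   {"object_moves": 0.6, "object_stays": 0.4},
    "touch": {"object_moves": 0.1, "object_stays": 0.9},
}

def recognize(observed_effects, prior=None):
    """Posterior over actions given a sequence of observed effects."""
    prior = prior or {a: 1.0 / len(actions) for a in actions}
    post = dict(prior)
    for e in observed_effects:
        for a in actions:
            post[a] *= p_effect_given_action[a][e]
    z = sum(post.values())
    return {a: p / z for a, p in post.items()}

print(recognize(["object_moves", "object_moves"]))  # 'grasp' is most likely
```

An inaccurate affordance model distorts exactly this posterior, which is one of the failure modes the paper discusses.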


Artificial Intelligence | 2011

Decentralized MDPs with sparse interactions

Francisco S. Melo; Manuela M. Veloso

Creating coordinated multiagent policies in environments with uncertainty is a challenging problem, which can be greatly simplified if the coordination needs are known to be limited to specific parts of the state space. In this work, we explore how such local interactions can simplify coordination in multiagent systems. We focus on problems in which the interaction between the agents is sparse and contribute a new decision-theoretic model for decentralized sparse-interaction multiagent systems, Dec-SIMDPs, that explicitly distinguishes the situations in which the agents in the team must coordinate from those in which they can act independently. We relate our new model to other existing models such as MMDPs and Dec-MDPs. We then propose a solution method that takes advantage of the particular structure of Dec-SIMDPs and provide theoretical error bounds on the quality of the obtained solution. Finally, we show a reinforcement learning algorithm in which independent agents learn both individual policies and when and how to coordinate. We illustrate the application of the algorithms throughout the paper in several multiagent navigation scenarios.
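A minimal sketch of the execution-time idea, assuming a two-agent grid world with a hand-picked interaction set; the state encoding, policies, and interaction states below are illustrative assumptions, not the Dec-SIMDP formalism:

```python
from typing import Dict, Tuple

State = Tuple[int, int]                 # (agent 1 position, agent 2 position)
interaction_states = {(3, 4), (4, 3)}   # e.g., both agents near a doorway

def act(joint_state: State,
        individual_policies: Dict[int, Dict[int, str]],
        joint_policy: Dict[State, Tuple[str, str]]) -> Tuple[str, str]:
    if joint_state in interaction_states:
        # coordinate: consult the joint policy over the full joint state
        return joint_policy[joint_state]
    # act independently: each agent only needs its own local state
    s1, s2 = joint_state
    return individual_policies[0][s1], individual_policies[1][s2]

# e.g., two agents in a corridor whose positions 3 and 4 flank a shared doorway
pi1 = {s: "move_right" for s in range(6)}
pi2 = {s: "move_left" for s in range(6)}
joint = {(3, 4): ("wait", "move_left"), (4, 3): ("move_right", "wait")}
print(act((3, 4), {0: pi1, 1: pi2}, joint))   # -> ('wait', 'move_left')
```

Outside the interaction set each agent consults only its own local state, which is what makes sparse interactions cheap to exploit.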


From Motor Learning to Interaction Learning in Robots | 2010

Abstraction Levels for Robotic Imitation: Overview and Computational Approaches

Manuel Lopes; Francisco S. Melo; Luis Montesano; José Santos-Victor

This chapter reviews several approaches to the problem of learning by imitation in robotics. We start by describing several cognitive processes identified in the literature as necessary for imitation. We then proceed by surveying different approaches to this problem, placing particular emphasis on methods whereby an agent first learns about its own body dynamics by means of self-exploration and then uses this knowledge about its own body to recognize the actions being performed by other agents. This general approach is related to the motor theory of perception, particularly to the mirror neurons found in primates. We distinguish three fundamental classes of methods, corresponding to three abstraction levels at which imitation can be addressed. As such, the methods surveyed herein exhibit behaviors that range from raw sensory-motor trajectory matching to high-level abstract task replication. We also discuss the impact that knowledge about the world and/or the demonstrator can have on the particular behaviors exhibited.


Affective Computing and Intelligent Interaction | 2011

Emotion-based intrinsic motivation for reinforcement learning agents

Pedro Sequeira; Francisco S. Melo; Ana Paiva

In this paper, we propose an adaptation of four common appraisal dimensions that evaluate the relation of an agent with its environment into reward features within an intrinsically motivated reinforcement learning framework. We show that, by optimizing the relative weights of such features for a given environment, the agents attain a greater degree of fitness while overcoming some of their perceptual limitations. This optimization process resembles the evolutionary adaptive process that living organisms are subject to. We illustrate the application of our method in several simulated foraging scenarios.
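A minimal sketch of the reward shape involved, assuming a linear combination of four appraisal-like features; the feature names and values here are placeholders for the appraisal dimensions adapted in the paper, and the weights are what the paper's evolutionary-style optimization would tune per environment:

```python
import numpy as np

def intrinsic_reward(weights, features):
    """Linear combination of appraisal-based reward features."""
    return float(np.dot(weights, features))

# Placeholder feature vector, e.g. [extrinsic payoff, novelty, goal
# relevance, control]; the actual dimensions are defined in the paper.
w = np.array([1.0, 0.3, 0.2, 0.1])   # relative weights, optimized offline
f = np.array([0.0, 0.8, 0.5, 0.2])   # appraisal features at the current step
print(intrinsic_reward(w, f))        # -> 0.36
```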


Conference on Learning Theory | 2007

Q-learning with linear function approximation

Francisco S. Melo; M. Isabel Ribeiro

In this paper, we analyze the convergence of Q-learning with linear function approximation. We identify a set of conditions that implies the convergence of this method with probability 1, when a fixed learning policy is used. We discuss the differences and similarities between our results and those obtained in several related works. We also discuss the applicability of this method when a changing policy is used. Finally, we describe how this approximate method can be applied in partially observable scenarios.
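A runnable sketch of this setting, assuming a toy chain MDP, random linear features, and a uniform (fixed) learning policy; none of these choices are claimed to satisfy the paper's convergence conditions:

```python
import numpy as np

# Q-learning with a linear approximator Q(s, a) = theta . phi(s, a), driven
# by a fixed learning policy. The chain MDP, feature map, and step sizes are
# illustrative assumptions.

n_states, n_actions, n_feats = 6, 2, 3
rng = np.random.default_rng(1)
phi = rng.normal(size=(n_states, n_actions, n_feats))  # fixed basis functions
theta = np.zeros(n_feats)
gamma = 0.9

def q(s, a):
    return phi[s, a] @ theta

def step(s, a):
    # chain dynamics: action 0 moves left, action 1 moves right
    s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    return s_next, float(s_next == n_states - 1)   # reward at the right end

s = 0
for t in range(1, 50001):
    a = int(rng.integers(n_actions))               # fixed (uniform) policy
    s_next, r = step(s, a)
    delta = r + gamma * max(q(s_next, b) for b in range(n_actions)) - q(s, a)
    theta += (1.0 / t) * delta * phi[s, a]         # decaying step size
    s = s_next
```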


Adaptive Behavior | 2009

A Computational Model of Social-Learning Mechanisms

Manuel Lopes; Francisco S. Melo; Ben Kenward; José Santos-Victor

In this article we propose a computational model that describes how observed behavior can influence an observer’s own behavior, including the acquisition of new task descriptions. The sources of influence on our model’s behavior are: beliefs about the world’s possible states and actions causing transitions between them; baseline preferences for certain actions; a variable tendency to infer and share goals in observed behavior; and a variable tendency to act efficiently to reach rewarding states. Acting on these premises, our model is able to replicate key empirical studies of social learning in children and chimpanzees. We demonstrate how a simple artificial system can account for a variety of biological social transfer phenomena, such as goal-inference and over-imitation, by taking into account action constraints and incomplete knowledge about the world dynamics.


International Conference on Social Robotics | 2015

Personalized Assistance for Dressing Users

Steven D. Klee; Beatriz Quintino Ferreira; Rui F. M. Silva; João Paulo Costeira; Francisco S. Melo; Manuela M. Veloso

In this paper, we present an approach for a robot to provide personalized assistance for dressing a user. In particular, given a dressing task, our approach finds a solution involving manipulator motions and also user repositioning requests. Specifically, the solution allows the robot and user to take turns moving in the same space and is cognizant of the user's limitations. To accomplish this, a vision module monitors the user's motion, determines whether they are following the repositioning requests, and infers mobility limitations when they cannot. The learned constraints are used during future dressing episodes to personalize the repositioning requests. Our contributions include a turn-taking approach to human-robot coordination for the dressing problem and a vision module capable of learning user limitations. After presenting the technical details of our approach, we provide an evaluation with a Baxter manipulator.
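As a hypothetical sketch, the turn-taking loop might look as follows, with `can_comply` standing in for the vision module's judgment and `limitations` for the learned constraint set; the names and structure are assumptions, not the authors' implementation:

```python
def dressing_episode(requests, can_comply, limitations):
    """Turn-taking sketch: issue repositioning requests, skip those known to
    exceed the user's limitations, and record new limitations whenever the
    vision module reports that the user could not comply."""
    for req in requests:
        if req in limitations:       # personalize using past episodes
            continue
        if not can_comply(req):      # user did not follow the request
            limitations.add(req)
            continue
        yield req                    # user repositioned; robot's turn to move
```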


European Conference on Machine Learning | 2010

Learning from demonstration using MDP induced metrics

Francisco S. Melo; Manuel Lopes

In this paper we address the problem of learning a policy from demonstration. Assuming that the policy to be learned is the optimal policy for an underlying MDP, we propose a novel way of leveraging the underlying MDP structure in a kernel-based approach. Our proposed approach rests on the insight that the MDP structure can be encapsulated into an adequate state-space metric. In particular we show that, using MDP metrics, we are able to cast the problem of learning from demonstration as a classification problem and attain generalization performance similar to that of methods based on inverse reinforcement learning, at a much lower online computational cost. Our method also attains better generalization than other supervised learning methods that fail to consider the MDP structure.
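A small sketch of the recipe, with a plain Euclidean metric standing in for the MDP-induced metric the paper develops; the demonstration data and kernel bandwidth are invented:

```python
import numpy as np

# Learning from demonstration as classification: nearby demonstration states
# (nearby under the metric) vote for their demonstrated action.

demos_s = np.array([[0, 0], [0, 1], [3, 3]], dtype=float)  # demo states
demos_a = np.array([1, 1, 0])                              # demo actions

def d(s, s2):
    return np.linalg.norm(s - s2)   # placeholder for an MDP-induced metric

def predict_action(s, bandwidth=1.0):
    """Kernel-weighted vote over the demonstrated actions."""
    w = np.exp(-(np.array([d(s, sd) for sd in demos_s]) / bandwidth) ** 2)
    scores = np.bincount(demos_a, weights=w)
    return int(np.argmax(scores))

print(predict_action(np.array([0.0, 0.5])))   # -> 1, matching nearby demos
```

The paper's point is that substituting an MDP-aware metric for `d` gives the classifier the structure that generic supervised methods miss.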


European Conference on Machine Learning | 2008

Fitted natural actor-critic: a new algorithm for continuous state-action MDPs

Francisco S. Melo; Manuel Lopes

In this paper we address reinforcement learning problems with continuous state-action spaces. We propose a new algorithm, fitted natural actor-critic (FNAC), that extends the work in [1] to allow for general function approximation and data reuse. We combine the natural actor-critic architecture [1] with a variant of fitted value iteration using importance sampling. The method thus obtained combines the appealing features of both approaches while overcoming their main weaknesses: the use of a gradient-based actor readily overcomes the difficulties found in regression methods with policy optimization in continuous action spaces; in turn, the use of a regression-based critic allows for efficient use of data and avoids convergence problems that TD-based critics often exhibit. We establish the convergence of our algorithm and illustrate its application in a simple continuous-state, continuous-action problem.
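A schematic sketch of the two components described above, assuming a linear critic fitted by batch regression and a Gaussian policy for the actor; this illustrates the regression-critic/gradient-actor split, not the FNAC algorithm itself:

```python
import numpy as np

def fit_critic(phi, rewards, phi_next, gamma=0.95, iters=50):
    """Fitted evaluation on a batch: repeatedly regress bootstrap targets
    onto the features (phi: state features, shape (N, k))."""
    w = np.zeros(phi.shape[1])
    for _ in range(iters):
        targets = rewards + gamma * (phi_next @ w)   # bootstrap targets
        w, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return w

def actor_step(theta, states, actions, advantages, sigma=0.5, lr=0.01):
    """Policy-gradient step for a Gaussian policy a ~ N(theta . s, sigma^2)."""
    grad = ((actions - states @ theta) / sigma**2)[:, None] * states
    return theta + lr * (advantages[:, None] * grad).mean(axis=0)
```

Fitting the critic by regression on a batch is what enables data reuse, while the gradient-based actor handles the continuous actions that a purely regression-based policy-optimization step would struggle with.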

Collaboration


Dive into Francisco S. Melo's collaborations.

Top Co-Authors

Pedro Sequeira, Instituto Superior Técnico
Manuela M. Veloso, Carnegie Mellon University
Filipa Correia, Instituto Superior Técnico
Samuel Mascarenhas, Instituto Superior Técnico
Tiago Ribeiro, Instituto Superior Técnico
M. Isabel Ribeiro, Instituto Superior Técnico
Rui Prada, Instituto Superior Técnico
Stefan J. Witwicki, École Polytechnique Fédérale de Lausanne