Publication


Featured research published by Francisco Martinez-Gil.


Simulation Modelling Practice and Theory | 2014

MARL-Ped: a Multi-Agent Reinforcement Learning Based Framework to Simulate Pedestrian Groups

Francisco Martinez-Gil; Miguel Lozano; Fernando Fernández

Pedestrian simulation is complex because behavior must be modeled at different levels. At the lowest level, local interactions between agents occur; at the middle level, strategic and tactical behaviors such as overtaking or route choice appear; and at the highest level, path planning is necessary. Agent-based pedestrian simulators either focus on a specific level (mainly the lowest one) or use strategies such as layered architectures to manage the different behavioral levels independently. In our Multi-Agent Reinforcement-Learning-based Pedestrian simulation framework (MARL-Ped), the situation is addressed as a whole. Each embodied agent uses a model-free Reinforcement Learning (RL) algorithm to learn autonomously to navigate in the virtual environment. The main goal of this work is to demonstrate empirically that MARL-Ped generates learned behaviors adapted to the level required by the pedestrian scenario. Three different experiments, described in the pedestrian modeling literature, are presented to test our approach: (i) choice of the shortest path vs. the quickest path; (ii) a crossing between two groups of pedestrians walking in opposite directions inside a narrow corridor; (iii) two agents that move in opposite directions inside a maze. The results show that MARL-Ped solves the different problems, learning individual behaviors with the characteristics of pedestrians (local control that produces adequate fundamental diagrams, route-choice capability, emergence of collective behaviors, and path planning). In addition, we compared our model with Helbing's social force model, a well-known pedestrian model, and found similarities between the pedestrian dynamics generated by both approaches. These results demonstrate empirically that MARL-Ped generates varied, plausible behaviors, producing human-like macroscopic pedestrian flow.
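The abstract names a model-free RL algorithm but gives no implementation details. As an illustration only, the core update such an agent relies on can be sketched as tabular Q-learning; the `q_learning` function, the toy corridor task, and all parameters below are hypothetical, and the paper works with continuous states and function approximation rather than a plain table.

```python
import random

random.seed(0)  # for a reproducible toy run

def q_learning(env_step, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.95, epsilon=0.1, start=0):
    """Tabular Q-learning: learn Q(s, a) from sampled transitions."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = start, False
        while not done:
            # epsilon-greedy exploration
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = env_step(s, a)
            # model-free TD update toward r + gamma * max_a' Q(s', a')
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

# Toy corridor: 5 cells; action 1 moves right, action 0 moves left; goal at cell 4.
def step(s, a):
    s2 = min(4, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == 4 else -0.01), s2 == 4

Q = q_learning(step, n_states=5, n_actions=2)
```

After learning, the greedy policy at every non-goal cell prefers the action that moves toward the goal, which is the kind of autonomously learned navigation the abstract describes at a much larger scale.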


Autonomous Agents and Multi-Agent Systems | 2015

Strategies for simulating pedestrian navigation with multiple reinforcement learning agents

Francisco Martinez-Gil; Miguel Lozano; Fernando Fernández

In this paper, a new multi-agent reinforcement learning approach is introduced for the simulation of pedestrian groups. Unlike other solutions, where the behaviors of the pedestrians are hard-coded in the system, in our approach the agents learn by interacting with the environment. The embodied agents must learn to control their velocity, avoiding obstacles and other pedestrians, to reach a goal inside the scenario. The main contribution of this paper is a new methodology that uses different iterative learning strategies, combining vector quantization (for state-space generalization) with the Q-learning algorithm (VQQL). Two algorithmic schemas, Iterative VQQL and Incremental, which differ in the way they address the problems, have been designed and used with and without transfer of knowledge. These algorithms are tested and compared with the VQQL algorithm as a baseline in two scenarios where agents need to solve well-known problems in pedestrian modeling. In the first, agents in a closed room need to reach the single exit, producing and resolving a bottleneck. In the second, two groups of agents inside a corridor need to reach goals placed at opposite ends (they need to solve the crossing). In the first scenario, we focus on scalability, use metrics from the pedestrian modeling field, and compare with Helbing's social force model. The emergence of collective behaviors, namely the shell-shaped clogging in front of the exit in the first scenario and lane formation as a solution to the crossing problem, has been obtained and analyzed. The results demonstrate that the proposed schemas find policies that carry out the tasks, suggesting that they are applicable and generalizable to the simulation of pedestrian groups.
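The VQQL idea named in the abstract pairs a vector-quantization codebook with tabular Q-learning: continuous state vectors are mapped to the index of their nearest centroid, and that index is the discrete state the learner sees. A minimal sketch, assuming plain k-means as the quantizer (the paper's actual codebook construction and feature set are not given here; the 2-D toy states below are hypothetical):

```python
import random

random.seed(1)  # reproducible toy data

def nearest(centroids, p):
    """Quantize a continuous state to the index of its closest centroid."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(centroids[i], p)))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: returns k centroids (the vector-quantization codebook)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        for i, c in enumerate(clusters):
            if c:  # keep the old centroid if its cluster emptied
                centroids[i] = tuple(sum(x) / len(c) for x in zip(*c))
    return centroids

# Hypothetical continuous pedestrian states (e.g. position features in a 10x10 m room).
states = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(200)]
codebook = kmeans(states, k=8)
discrete_state = nearest(codebook, (1.0, 2.0))  # index usable as a Q-table row
```

The iterative and incremental schemas the abstract mentions would rebuild or grow this codebook between learning rounds; that refinement loop is omitted here.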


Motion in Games | 2012

Calibrating a Motion Model Based on Reinforcement Learning for Pedestrian Simulation

Francisco Martinez-Gil; Miguel Lozano; Fernando Fernández

In this paper, the calibration of a framework based on Multi-agent Reinforcement Learning (RL) for generating motion simulations of pedestrian groups is presented. The framework sets up a group of autonomous embodied agents that each learn to control their instantaneous velocity vector in scenarios with collisions and friction forces. The result of the process is a different learned motion controller for each agent. The calibration of both the physical properties involved in the motion of our embodied agents and the corresponding dynamics is an important issue for a realistic simulation. The physics engine used has been calibrated with values taken from real pedestrian dynamics. Two experiments have been carried out to test this approach. The results of the experiments are compared with databases of real pedestrians in similar scenarios. As a comparison tool, the diagram of speed versus density, known in the literature as the fundamental diagram, is used.
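The fundamental diagram used as the comparison tool is, in its simplest form, mean walking speed binned by local density. A minimal sketch of that computation (the binning scheme and the sample values below are hypothetical, not taken from the paper's data):

```python
def fundamental_diagram(densities, speeds, bin_width=0.5):
    """Bin (density, speed) samples and return mean speed per density bin.

    densities: local density around each pedestrian (persons/m^2) per time step
    speeds:    matching instantaneous speeds (m/s)
    """
    bins = {}
    for rho, v in zip(densities, speeds):
        b = int(rho / bin_width)
        bins.setdefault(b, []).append(v)
    # bin centre -> mean speed; real pedestrian data shows speed decaying with density
    return {(b + 0.5) * bin_width: sum(vs) / len(vs)
            for b, vs in sorted(bins.items())}

# Hypothetical samples: speed drops as density grows, as in real pedestrian flows.
rho = [0.2, 0.3, 1.1, 1.2, 2.6, 2.7]
vel = [1.4, 1.3, 1.0, 0.9, 0.4, 0.5]
fd = fundamental_diagram(rho, vel)
```

Plotting the returned mapping gives the speed-versus-density curve that is compared against real pedestrian databases.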


Simulation Modelling Practice and Theory | 2017

Emergent behaviors and scalability for multi-agent reinforcement learning-based pedestrian models

Francisco Martinez-Gil; Miguel Lozano; Fernando Fernández

This paper analyzes the emergent behaviors of pedestrian groups that learn through the multi-agent reinforcement learning model developed by our group. Five scenarios studied in the pedestrian modeling literature, with different levels of complexity, were simulated in order to analyze the robustness and scalability of the model. First, a reduced group of agents must learn by interacting with the environment in each scenario. In this phase, each agent learns its own kinematic controller, which will drive it at simulation time. Second, the number of simulated agents is increased in each scenario where agents have previously learned, to test the appearance of emergent macroscopic behaviors without additional learning. This strategy allows us to evaluate the robustness, consistency, and quality of the learned behaviors. For this purpose, several tools from pedestrian dynamics, such as fundamental diagrams and density maps, are used. The results reveal that the developed model is capable of simulating human-like microscopic and macroscopic pedestrian behaviors for the simulation scenarios studied, including those where the number of pedestrians has been scaled up by one order of magnitude with respect to the situation learned.
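Of the two evaluation tools named, the density map is simply a spatial histogram of agent positions accumulated over the simulation. A minimal sketch (grid resolution and the short position trace below are hypothetical):

```python
def density_map(positions, width, height, cell=1.0):
    """Accumulate agent positions into a grid of visit counts (a density map)."""
    nx, ny = int(width / cell), int(height / cell)
    grid = [[0] * nx for _ in range(ny)]
    for x, y in positions:
        # clamp to the last cell so points on the far boundary are counted
        i, j = min(nx - 1, int(x / cell)), min(ny - 1, int(y / cell))
        grid[j][i] += 1
    return grid

# Hypothetical trace: agents crowding near a point at (4.5, 2.5) in a 5x5 m room.
trace = [(4.4, 2.4), (4.6, 2.6), (4.5, 2.5), (0.5, 0.5)]
grid = density_map(trace, width=5.0, height=5.0)
```

Hotspots in the grid (here the cell around (4.5, 2.5)) reveal where agents concentrate, e.g. clogging in front of an exit.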


ACM Computing Surveys | 2017

Modeling, Evaluation, and Scale on Artificial Pedestrians: A Literature Review

Francisco Martinez-Gil; Miguel Lozano; Ignacio García-Fernández; Fernando Fernández

Modeling pedestrian dynamics and implementing such models in a computer are challenging and important issues in the knowledge areas of transportation and computer simulation. The aim of this article is to provide a bibliographic outlook so that the reader may have quick access to the most relevant works related to this problem. We have used three main axes to organize the article's contents: pedestrian models, validation techniques, and multiscale approaches. The backbone of this work is the classification of existing pedestrian models; we have organized the works in the literature under five categories, according to the techniques used for implementing the operational level in each pedestrian model. Then the main existing validation methods, oriented to evaluating the behavioral quality of the simulation systems, are reviewed. Furthermore, we review the key issues that arise when facing multiscale pedestrian modeling, focusing first on the behavioral scale (combinations of micro and macro pedestrian models) and second on the scale size (from individuals to crowds). The article begins by introducing the main characteristics of walking dynamics and its analysis tools, and concludes with a discussion of the contributions that different knowledge fields can make to this exciting area in the near future.


Multi-Agent Systems and Agent-Based Simulation | 2014

Emergent Collective Behaviors in a Multi-agent Reinforcement Learning Pedestrian Simulation: A Case Study

Francisco Martinez-Gil; Miguel Lozano; Fernando Fernández

In this work, a Multi-agent Reinforcement Learning framework is used to generate simulations of groups of virtual pedestrians. The aim is to study the influence of two different learning approaches on the quality of the generated simulations. The case study consists of simulating the crossing of two groups of embodied virtual agents inside a narrow corridor. This scenario is a classic experiment in the pedestrian modeling area, because a collective behavior, specifically lane formation, emerges with real pedestrians. The paper studies the influence of different learning algorithms, function approximation approaches, and knowledge transfer mechanisms on the performance of the learned pedestrian behaviors. Specifically, two different RL-based schemas are analyzed. The first one, Iterative Vector Quantization with Q-Learning (ITVQQL), iteratively improves a state-space generalizer based on vector quantization. The second schema, named TS, uses tile coding as the generalization method with the Sarsa(λ) algorithm. The knowledge transfer approach is based on Probabilistic Policy Reuse to incorporate previously acquired knowledge into current learning processes; additionally, value function transfer is used in the ITVQQL schema to carry the value function between consecutive iterations. The results demonstrate empirically that our RL framework generates individual behaviors from which the expected collective behavior emerges, as it does with real pedestrians. This collective behavior appears independently of the learning algorithm and the generalization method used, but depends strongly on whether knowledge transfer was applied. In addition, the use of transfer techniques has a remarkable influence on the final performance (measured as the number of times the task was solved) of the learned behaviors.
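Tile coding, the generalization method of the TS schema, overlays several offset grids ("tilings") on a continuous variable so that nearby states share some active tiles and distant states share none. A minimal one-dimensional sketch (the tiling counts and ranges below are hypothetical, not the paper's settings):

```python
def tile_indices(x, n_tilings=4, n_tiles=8, lo=0.0, hi=1.0):
    """Tile coding: map a scalar in [lo, hi] to one active tile per tiling.

    Each tiling is shifted by a fraction of the tile width, so close values
    overlap in most tilings (generalization) but not all (discrimination).
    """
    width = (hi - lo) / n_tiles
    active = []
    for t in range(n_tilings):
        offset = t * width / n_tilings
        idx = int((x - lo + offset) / width)
        idx = max(0, min(n_tiles, idx))       # offsets may reach one extra tile
        active.append(t * (n_tiles + 1) + idx)  # unique index per (tiling, tile)
    return active

# Two nearby states share most active tiles; a distant state shares none.
a, b, c = tile_indices(0.40), tile_indices(0.42), tile_indices(0.90)
overlap_near = len(set(a) & set(b))
overlap_far = len(set(a) & set(c))
```

In Sarsa(λ) with tile coding, the value of a state-action pair is the sum of the weights of its active tiles, so updates to one state generalize to its neighbors through the shared tiles.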


International Conference on Algorithms and Architectures for Parallel Processing | 2016

MARL-Ped+Hitmap: Towards Improving Agent-Based Simulations with Distributed Arrays

Eduardo Rodriguez-Gutiez; Francisco Martinez-Gil; Juan M. Orduña; Arturo Gonzalez-Escribano

Multi-agent systems allow the modelling of complex, heterogeneous, and distributed systems in a realistic way. MARL-Ped is a multi-agent system tool, based on the MPI standard, for simulating different pedestrian scenarios in which agents autonomously learn the best behavior by Reinforcement Learning. MARL-Ped uses one MPI process per agent by design, with a fixed fine granularity. This requirement limits the performance of the simulations when the number of available processors is smaller than the number of agents. Hitmap, on the other hand, is a library that eases the programming of parallel applications based on distributed arrays. It includes abstractions for the automatic partition and mapping of arrays at runtime with arbitrary granularity, as well as functionalities to build flexible communication patterns that transparently adapt to the data partitions.


Journal of Experimental and Theoretical Artificial Intelligence | 2008

Agent's actions as a classification criteria for the state space in a learning from rewards system

Francisco Martinez-Gil

In this paper we focus on the problem of learning an autonomous agent's policy when the state space is very large and the set of available actions is comparatively small. To this end, we use a non-parametric decision rule (concretely, a nearest-neighbour strategy) to cluster the state space according to the action that leads to a successful situation. Using an exploration strategy to avoid greedy behaviour, the agent builds clusters of positively classified states through trial-and-error learning. In this paper, we implement a 3D synthetic agent that plays an 'avoid the asteroid' game that suits our assumptions. Using as the state space a feature-vector space extracted from a visual navigation system, we test two exploration strategies with the trial-and-error learning method. The experiment shows that the agent is a good classifier over the state space, and therefore exhibits good behaviour in its synthetic world.
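The nearest-neighbour decision rule described above can be sketched very compactly: store positively classified (state, action) examples and, at decision time, copy the action of the closest stored state. The two-feature states and action labels below are hypothetical illustrations, not the paper's actual feature vectors:

```python
def nn_policy(memory, state):
    """Nearest-neighbour decision rule: act as in the closest stored state.

    memory: list of (state_vector, action) pairs gathered by trial and error
    """
    best, best_d = None, float("inf")
    for s, a in memory:
        d = sum((x - y) ** 2 for x, y in zip(s, state))  # squared Euclidean
        if d < best_d:
            best, best_d = a, d
    return best

# Hypothetical memory: dodge left when the hazard is to the right, and vice versa.
memory = [((1.0, 0.2), "left"), ((-1.0, 0.1), "right")]
action = nn_policy(memory, (0.8, 0.3))
```

The exploration strategy the abstract mentions would decide when to ignore this rule and try a different action, so that new positively classified states keep entering `memory`.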


International Conference on Agents and Artificial Intelligence | 2010

A Reinforcement Learning Approach for Multiagent Navigation

Francisco Martinez-Gil; Fernando Barber; Miguel Lozano; Francisco Grimaldo; Fernando Fernández


Archive | 2016

MARL-Ped+Hitmap: Aumentando la productividad de simulaciones basadas en agentes con una herramienta de arrays distribuidos [MARL-Ped+Hitmap: Increasing the productivity of agent-based simulations with a distributed-array tool]

Eduardo Rodriguez-Gutiez; Francisco Martinez-Gil; Juan M. Orduña; Arturo Gonzalez-Escribano

Collaboration



Top Co-Authors


Fernando Fernández

Instituto de Salud Carlos III
