Computer Science Multiagent Systems - Researchain

Featured Researches

A Regularized Opponent Model with Maximum Entropy Objective

In a single-agent setting, reinforcement learning (RL) tasks can be cast as an inference problem by introducing a binary random variable o, which stands for "optimality". In this paper, we redefine the binary random variable o in the multi-agent setting and formalize multi-agent reinforcement learning (MARL) as probabilistic inference. We derive a variational lower bound of the likelihood of achieving optimality and name it the Regularized Opponent Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel perspective on opponent modeling and show, theoretically and empirically, how it can improve the performance of training agents in cooperative games. To optimize ROMMEO, we first introduce a tabular Q-iteration method, ROMMEO-Q, with a proof of convergence. We extend the exact algorithm to complex environments by proposing an approximate version, ROMMEO-AC. We evaluate these two algorithms on a challenging iterated matrix game and a differential game, respectively, and show that they can outperform strong MARL baselines.

Multiagent Systems

A Reputation System for Multi-Agent Marketplaces

We explore a reputation system based on explicit ratings weighted by the values of the corresponding financial transactions. We examine its ability to grant market participants "security", by protecting them from scams, and "equity", by correctly assessing the participants' real qualities. We present a simulation modeling approach based on the selected reputation system and discuss the results of the simulation.
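
As a rough illustration of ratings weighted by transaction value, the sketch below computes a value-weighted mean rating. The abstract does not give the paper's actual formula, so the function name `weighted_reputation`, the (rating, value) pair layout, and the plain weighted mean are all assumptions made for this example.

```python
from typing import Iterable, Tuple


def weighted_reputation(ratings: Iterable[Tuple[float, float]]) -> float:
    """Return the mean rating, weighting each rating by its transaction value.

    Hypothetical helper: each item is a (rating, transaction_value) pair.
    This is an assumed aggregation rule, not the paper's formula.
    """
    weighted_sum = 0.0
    total_value = 0.0
    for rating, value in ratings:
        if value < 0:
            raise ValueError("transaction value must be non-negative")
        weighted_sum += rating * value
        total_value += value
    if total_value == 0:
        raise ValueError("no transaction value to weight by")
    return weighted_sum / total_value


if __name__ == "__main__":
    # One large well-rated transaction outweighs one tiny badly-rated one.
    print(weighted_reputation([(5.0, 100.0), (1.0, 1.0)]))
```

Under this assumed rule, a rating attached to a large transaction moves the score more than one attached to a small transaction, which makes it costlier to inflate a reputation with many cheap trades.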

Multiagent Systems

A Review of Platforms for the Development of Agent Systems

Agent-based computing is an active field of research with the goal of building autonomous software or hardware entities. This task is often facilitated by the use of dedicated, specialized frameworks. Over almost thirty years, many such agent platforms have been developed. Meanwhile, some of them have been abandoned, others continue their development, and new platforms are released. This paper presents an up-to-date review of the existing agent platforms and also a historical perspective of this domain. It aims to serve as a reference point for people interested in developing agent systems. This work details the main characteristics of the included agent platforms, together with links to specific projects where they have been used. It distinguishes between the active platforms and those no longer under development or with unclear status. It also classifies the agent platforms as general-purpose ones, free or commercial, and specialized ones, which can be used for particular types of applications.

Multiagent Systems

A Self-Integration Testbed for Decentralized Socio-technical Systems

The Internet of Things comes along with new challenges for experimenting, testing, and operating decentralized socio-technical systems at large scale. In such systems, autonomous agents interact locally with their users, and remotely with other agents to make intelligent collective choices. Via these interactions they self-regulate the consumption and production of distributed resources. While such complex systems are often deployed and operated using centralized computing infrastructures, the socio-technical nature of these decentralized systems requires new value-sensitive design paradigms that empower trust, transparency, and alignment with citizens' social values, such as privacy preservation, autonomy, and fairness among citizens' choices. Currently, instruments and tools to study such systems and guide the prototyping process from simulation to live deployment are missing, or not practical in this distributed socio-technical context. This paper bridges this gap by introducing a novel testbed architecture for decentralized socio-technical systems running on IoT. This new architecture is designed for seamless reusability of (i) application-independent decentralized services by an IoT application, and (ii) different IoT applications by the same decentralized service. This dual self-integration promises IoT applications that are simpler to prototype, and can interoperate with decentralized services during runtime to self-integrate more complex functionality. Such integration provides stronger validation of IoT applications, and improves resource utilization. Pressure and crash tests during continuous operations of several weeks, with more than 80K network joins and leaves by agents, 2.4M parameter changes, and 100M communicated messages, confirm the robustness and practicality of the testbed architecture.

Multiagent Systems

A Separation-Based Methodology to Consensus Tracking of Switched High-Order Nonlinear Multi-Agent Systems

This work investigates a reduced-complexity adaptive methodology for consensus tracking for a team of uncertain high-order nonlinear systems with switched (possibly asynchronous) dynamics. It is well known that high-order nonlinear systems are intrinsically challenging, as the feedback linearization and backstepping methods successfully developed for low-order systems fail to work. At the same time, even the adding-one-power-integrator methodology, well explored for the single-agent high-order case, presents some complexity issues and is unsuited for distributed control. At the core of the proposed distributed methodology is a newly proposed definition for separable functions: this definition allows the formulation of a separation-based lemma to handle the high-order terms with reduced complexity in the control design. Complexity is reduced in a twofold sense: the control gain of each virtual control law does not have to be incorporated in the next virtual control law iteratively, thus leading to a simpler expression of the control laws; and the order of the virtual control gains increases only proportionally (rather than exponentially) with the order of the systems, dramatically reducing high-gain issues.

Multiagent Systems

A Simulation Model for Pedestrian Crowd Evacuation Based on Various AI Techniques

This paper attempts to design an intelligent simulation model for pedestrian crowd evacuation. For this purpose, a cellular automaton (CA) was fully integrated with fuzzy logic, k-nearest neighbors (KNN), and some statistical equations. In this model, each pedestrian was assigned a specific speed according to his or her physical, biological, and emotional features. The emergency behavior and evacuation efficiency of each pedestrian were evaluated by coupling his or her speed with various elements, such as the environment, pedestrian distribution, and familiarity with the exits. These elements all have a great impact on the evacuation process. Several experiments were carried out to verify the performance of the model in different emergency scenarios. The results show that the proposed model can predict the evacuation time and emergency behavior in various types of building interiors and pedestrian distributions. The research provides a good reference for the design of building evacuation systems.

Multiagent Systems

A Study on Accelerating Average Consensus Algorithms Using Delayed Feedback

In this paper, we study accelerating a Laplacian-based dynamic average consensus algorithm by splitting the conventional delay-free disagreement feedback into a weighted sum of a current and an outdated term. We determine for which weights there exists a range of time delays that results in a higher rate of convergence for the algorithm. For such weights, using the Lambert W function, we obtain the rate-increasing range of the time delay and the maximum reachable rate, and comment on the value of the corresponding maximizer delay. We also study the effect of using outdated feedback on the control effort of the agents and show that only for some specific affine combinations of the immediate and outdated feedback does the control effort of the agents not exceed that of the delay-free algorithm. Additionally, we demonstrate that using outdated feedback does not increase the steady-state tracking error of the average consensus algorithm. Lastly, we determine the optimal combination of the current and outdated feedback weights to achieve the maximum increase in the rate of convergence without increasing the control effort of the agents. We demonstrate our results through a numerical example.
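
The idea of mixing current and outdated disagreement feedback can be sketched with a small simulation. This is a simplified static average consensus (fixed initial values, no reference signals to track) integrated with forward Euler, not the dynamic algorithm analyzed in the paper. The function name `delayed_consensus`, the 4-node path graph, and all parameter values are assumptions chosen for illustration.

```python
from collections import deque


def delayed_consensus(x0, laplacian, weight_delayed=0.5, delay_steps=10,
                      dt=0.01, steps=3000):
    """Euler-integrate x' = -L[(1 - w) x(t) + w x(t - tau)], tau = delay_steps * dt.

    Hypothetical sketch: the state is assumed constant (equal to x0) for t <= 0.
    """
    n = len(x0)
    # history[0] is the state delay_steps steps ago, history[-1] is the current state.
    history = deque([list(x0) for _ in range(delay_steps + 1)],
                    maxlen=delay_steps + 1)
    for _ in range(steps):
        current, outdated = history[-1], history[0]
        # Affine combination of the immediate and the outdated state.
        mixed = [(1.0 - weight_delayed) * c + weight_delayed * o
                 for c, o in zip(current, outdated)]
        nxt = [current[i] - dt * sum(laplacian[i][j] * mixed[j] for j in range(n))
               for i in range(n)]
        history.append(nxt)  # maxlen drops the oldest entry automatically
    return history[-1]


# Laplacian of an undirected path graph 0 - 1 - 2 - 3 (assumed example topology).
PATH_LAPLACIAN = [
    [1, -1, 0, 0],
    [-1, 2, -1, 0],
    [0, -1, 2, -1],
    [0, 0, -1, 1],
]

if __name__ == "__main__":
    print(delayed_consensus([1.0, 2.0, 3.0, 6.0], PATH_LAPLACIAN))
```

Because the Laplacian of an undirected graph has zero column sums, the sum of the agents' states is preserved at every step, so the agents can only agree on the average of the initial values. Setting `weight_delayed=0` recovers the delay-free algorithm, which gives a baseline for comparing convergence speed under different weights and delays.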

Multiagent Systems

A Survey of Deep Reinforcement Learning in Video Games

Deep reinforcement learning (DRL) has made great achievements since it was proposed. Generally, DRL agents receive high-dimensional inputs at each step and take actions according to deep-neural-network-based policies. This learning mechanism updates the policy to maximize the return in an end-to-end manner. In this paper, we survey the progress of DRL methods, including value-based, policy gradient, and model-based algorithms, and compare their main techniques and properties. In addition, DRL plays an important role in game artificial intelligence (AI). We also review the achievements of DRL in various video games, including classical arcade games, first-person perspective games, and multi-agent real-time strategy games, from 2D to 3D, and from single-agent to multi-agent. A large number of video game AIs with DRL have achieved super-human performance, while there are still some challenges in this domain. Therefore, we also discuss some key points when applying DRL methods to this field, including exploration-exploitation, sample efficiency, generalization and transfer, multi-agent learning, imperfect information, and delayed sparse rewards, as well as some research directions.

Multiagent Systems

A Versatile Multi-Robot Monte Carlo Tree Search Planner for On-Line Coverage Path Planning

Mobile robots hold great promise in reducing the need for humans to perform jobs such as vacuuming, seeding, harvesting, painting, search and rescue, and inspection. In practice, these tasks must often be done without an exact map of the area and could be completed more quickly through the use of multiple robots working together. The task of simultaneously covering and mapping an area with multiple robots is known as multi-robot on-line coverage and is a growing area of research. Many multi-robot on-line coverage path planning algorithms have been developed as extensions of well-established off-line coverage algorithms. In this work we introduce a novel approach to multi-robot on-line coverage path planning based on a method borrowed from game theory and machine learning: Monte Carlo Tree Search (MCTS). We implement an MCTS planner and compare completion times against a Boustrophedon-based on-line multi-robot planner. The MCTS planner is shown to perform on par with the conventional Boustrophedon algorithm in simulations varying the number of robots and the density of obstacles in the map. The versatility of the MCTS planner is demonstrated by incorporating secondary objectives, such as turn minimization, while performing the same coverage task. This versatility suggests it is well suited to many multi-objective tasks that arise in mobile robotics.
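
For readers unfamiliar with the Boustrophedon baseline, the sketch below generates a back-and-forth sweep over a grid. It assumes a single robot, a fully known obstacle-free rectangular grid, and a hypothetical function name `boustrophedon_path`. The on-line multi-robot planner used in the paper must also handle unknown maps, obstacles, and coordination, none of which is modeled here.

```python
from typing import List, Tuple


def boustrophedon_path(rows: int, cols: int) -> List[Tuple[int, int]]:
    """Visit every cell of a rows x cols grid, reversing direction on each row."""
    path: List[Tuple[int, int]] = []
    for r in range(rows):
        # Even rows sweep left to right, odd rows sweep right to left.
        columns = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in columns)
    return path


if __name__ == "__main__":
    print(boustrophedon_path(2, 3))
    # -> [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
```

Each consecutive pair of cells in the result is adjacent, so the sweep covers the grid without revisiting a cell, at the cost of one turn-around per row.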

Multiagent Systems

A generic and density-sensitive method for multi-scale pedestrian dynamics

Microscopic approaches to the simulation of pedestrian dynamics rely on modelling the behaviour of individual agents and their mutual interactions. Regarding the spatial resolution, microscopic simulators are based on either continuous (SpaceCont) or discrete (SpaceDisc) approaches. To combine the advantages of both approaches, we propose to integrate SpaceCont and SpaceDisc into a hybrid simulation model. Such a hybrid approach allows simulating critical regions with a continuous spatial resolution and uncritical ones with a discrete spatial resolution while enabling consistent information exchange between the two simulation models. We introduce a generic approach that provides consistent solutions for the challenges resulting from coupling diverging time steps and spatial resolutions. Furthermore, we present a dynamic and density-sensitive approach to detect dense areas during the simulation run. If a critical region is detected, the simulation model used in this area is dynamically switched to a space-continuous one. The correctness of the hybrid model is evaluated by comparison with an established simulator. Its superior computational efficiency is shown by runtime comparison with a standard microscopic simulation.
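
One simple way to picture density-sensitive detection is to bin pedestrian positions into square cells and flag the cells whose density reaches a threshold. The abstract does not describe the paper's detection rule, so the function name `dense_cells`, the square binning, and the threshold test below are assumptions for illustration only.

```python
from collections import Counter
from math import floor
from typing import Iterable, Set, Tuple


def dense_cells(positions: Iterable[Tuple[float, float]],
                cell_size: float,
                threshold: float) -> Set[Tuple[int, int]]:
    """Return grid cells whose pedestrian density (persons per unit area)
    is at least `threshold`. Hypothetical rule, not the paper's method."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    # Map each continuous position to the index of the square cell containing it.
    counts = Counter((floor(x / cell_size), floor(y / cell_size))
                     for x, y in positions)
    area = cell_size * cell_size
    return {cell for cell, n in counts.items() if n / area >= threshold}


if __name__ == "__main__":
    crowd = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.2), (3.5, 3.5)]
    print(dense_cells(crowd, cell_size=1.0, threshold=2.0))  # -> {(0, 0)}
```

In a hybrid simulator, cells flagged this way would be candidates for switching to the space-continuous model, while the remaining cells stay on the cheaper discrete model.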
