Featured Researches

Multiagent Systems

Act to Reason: A Dynamic Game Theoretical Model of Driving

The focus of this paper is to propose a driver model that incorporates human reasoning levels as actions during interactions with other drivers. Unlike earlier work that uses game-theoretic human reasoning levels, we propose a dynamic approach in which the actions are the levels themselves, rather than conventional driving actions such as accelerating or braking. This results in dynamic behavior, where the agent adapts to its environment by exploiting different behavior models as available moves to choose from, depending on the requirements of the traffic situation. The bounded rationality assumption is preserved, since the selectable strategies are designed in accordance with the fact that humans are cognitively limited in their understanding and decision making. Using a highway merging scenario, we demonstrate that the proposed dynamic approach produces more realistic outcomes than the conventional method that employs fixed human reasoning levels.
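As a minimal sketch of the idea (not the paper's model), consider an agent whose selectable moves are reasoning levels rather than accelerate/brake commands; the payoff table and the belief over the other driver are hypothetical placeholders.

```python
# Hedged sketch: the agent's moves are reasoning levels (level-0..level-2),
# not driving actions. The payoff function below is an illustrative
# placeholder, not the paper's actual traffic payoff.

LEVELS = [0, 1, 2]

def payoff(own_level, opponent_level):
    # Hypothetical payoff: the best response is to reason exactly one
    # level above the opponent (bounded rationality caps it at level 2).
    target = min(opponent_level + 1, max(LEVELS))
    return 1.0 if own_level == target else 0.0

def choose_level(belief_over_opponent):
    # Pick the level with the highest expected payoff under the agent's
    # belief about the other driver's current reasoning level.
    def expected(level):
        return sum(p * payoff(level, l) for l, p in belief_over_opponent.items())
    return max(LEVELS, key=expected)

belief = {0: 0.2, 1: 0.7, 2: 0.1}  # mostly believes the other driver is level-1
print(choose_level(belief))  # -> 2: level-2 best-responds to a level-1 driver
```

Under this sketch, the "dynamic" aspect is simply that the belief, and hence the chosen level, can change from one traffic situation to the next.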

Read more
Multiagent Systems

Action Semantics Network: Considering the Effects of Actions in Multiagent Systems

In multiagent systems (MASs), each agent makes individual decisions, but all of them contribute globally to the system evolution. Learning in MASs is difficult since each agent's selection of actions must take place in the presence of other co-learning agents. Moreover, environmental stochasticity and uncertainty increase exponentially with the number of agents. Previous works incorporate various multiagent coordination mechanisms into deep learning architectures to facilitate multiagent coordination. However, none of them explicitly considers action semantics between agents, that is, the fact that different actions have different influences on other agents. In this paper, we propose a novel network architecture, named Action Semantics Network (ASN), that explicitly represents such action semantics between agents. ASN characterizes the influence of different actions on other agents using neural networks based on the action semantics between them. ASN can be easily combined with existing deep reinforcement learning (DRL) algorithms to boost their performance. Experimental results on StarCraft II micromanagement and Neural MMO show that ASN significantly improves the performance of state-of-the-art DRL approaches compared with several baseline network architectures.
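A toy sketch of the core idea, under stated assumptions (the layer shapes, the bilinear scoring, and the action split are illustrative, not the paper's architecture): Q-values for self-directed actions come from one branch, while each "act on agent j" action gets its own score from a pairwise interaction between observations.

```python
import numpy as np

# Hedged sketch of the ASN idea: actions directed at another agent are
# scored by a pairwise "semantics" branch, while actions with no direct
# effect on others use a plain branch. Shapes and weights are illustrative.

rng = np.random.default_rng(0)
OBS_DIM, N_AGENTS, N_SELF_ACTIONS = 8, 3, 4

W_self = rng.standard_normal((N_SELF_ACTIONS, OBS_DIM))
W_pair = rng.standard_normal((OBS_DIM, OBS_DIM))

def asn_q_values(own_obs, other_obs_list):
    # Branch 1: Q-values for actions that affect only the agent itself.
    q_self = W_self @ own_obs
    # Branch 2: one Q-value per "act on agent j" action, scored by a
    # bilinear interaction between own and other observations.
    q_interact = np.array([own_obs @ W_pair @ obs_j for obs_j in other_obs_list])
    return np.concatenate([q_self, q_interact])

own = rng.standard_normal(OBS_DIM)
others = [rng.standard_normal(OBS_DIM) for _ in range(N_AGENTS - 1)]
q = asn_q_values(own, others)
print(q.shape)  # (6,) = 4 self actions + 2 agent-directed actions
```

Because the output is just a Q-vector, such a module slots in as the value head of an off-the-shelf DRL learner.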

Read more
Multiagent Systems

Active Deception using Factored Interactive POMDPs to Recognize Cyber Attacker's Intent

This paper presents an intelligent and adaptive agent that employs deception to recognize a cyber adversary's intent. Unlike previous approaches to cyber deception, which mainly focus on delaying or confusing the attackers, we focus on engaging with them to learn their intent. We model cyber deception as a sequential decision-making problem in a two-agent context. We introduce factored finitely nested interactive POMDPs (I-POMDPx) and use this framework to model the problem with multiple attacker types. Our approach models cyber attacks on a single honeypot host across multiple phases from the attacker's initial entry to reaching its adversarial objective. The defending I-POMDPx-based agent uses decoys to engage with the attacker at multiple phases to form increasingly accurate predictions of the attacker's behavior and intent. The use of I-POMDPs also enables us to model the adversary's mental state and investigate how deception affects their beliefs. Our experiments in both simulation and on a real host show that the I-POMDPx-based agent performs significantly better at intent recognition than commonly used deception strategies on honeypots.
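The intent-recognition component can be illustrated with a minimal Bayesian sketch (the attacker types, actions, and likelihood table below are hypothetical, and this strips away the full I-POMDP machinery): each observed action on the honeypot sharpens the defender's belief over attacker types.

```python
# Hedged sketch: the defender maintains a belief over attacker types and
# updates it from observed actions on the honeypot via Bayes' rule.
# Types, actions, and likelihoods are illustrative, not from the paper.

TYPES = ["data_exfil", "vandal"]

# P(observed action | attacker type), hypothetical numbers.
LIKELIHOOD = {
    "scan_db":  {"data_exfil": 0.7, "vandal": 0.2},
    "rm_files": {"data_exfil": 0.1, "vandal": 0.6},
}

def update_belief(belief, action):
    posterior = {t: belief[t] * LIKELIHOOD[action][t] for t in TYPES}
    z = sum(posterior.values())
    return {t: p / z for t, p in posterior.items()}

belief = {"data_exfil": 0.5, "vandal": 0.5}
for action in ["scan_db", "scan_db"]:
    belief = update_belief(belief, action)
print(belief)  # belief concentrates on the data-exfiltration intent
```

In the paper's setting, the defender additionally chooses decoy actions to elicit more informative attacker behavior, rather than passively observing.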

Read more
Multiagent Systems

Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning

In cooperative stochastic games, multiple agents work toward learning joint optimal actions in an unknown environment to achieve a common goal. In many real-world applications, however, constraints are often imposed on the actions that can be jointly taken by the agents. In such scenarios, the agents aim to learn joint actions that achieve the common goal (minimizing a specified cost function) while meeting the given constraints (specified via certain penalty functions). In this paper, we consider a relaxation of the constrained optimization problem obtained by constructing the Lagrangian of the cost and penalty functions. We propose a nested actor-critic solution approach to solve this relaxed problem. In this approach, an actor-critic scheme is employed to improve the policy for a given Lagrange parameter on a faster timescale, as in the classical actor-critic architecture. A meta actor-critic scheme using these faster-timescale policy updates is then employed to improve the Lagrange parameters on the slower timescale. Utilizing the proposed nested actor-critic schemes, we develop three Nested Actor-Critic (N-AC) algorithms. Through experiments on constrained cooperative tasks, we show the effectiveness of the proposed algorithms.
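The two-timescale Lagrangian idea can be shown on a toy deterministic problem (a sketch only: the inner actor-critic is replaced here by a plain gradient step, and the problem and step sizes are illustrative): minimize J(x) = x² subject to G(x) = 1 - x ≤ 0.

```python
# Hedged sketch of the nested two-timescale scheme on a toy problem:
# minimize J(x) = x^2 subject to G(x) = 1 - x <= 0 (i.e., x >= 1).
# The fast loop descends the Lagrangian L(x, lam) = x^2 + lam * (1 - x)
# in x (standing in for the inner actor-critic); the slow loop ascends
# in the Lagrange multiplier lam.

def nested_lagrangian_descent(steps=5000, fast_lr=0.05, slow_lr=0.005):
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Fast timescale: gradient descent on L(x, lam) in x.
        grad_x = 2 * x - lam
        x -= fast_lr * grad_x
        # Slow timescale: projected gradient ascent on lam (kept >= 0).
        lam = max(0.0, lam + slow_lr * (1 - x))
    return x, lam

x, lam = nested_lagrangian_descent()
print(round(x, 2))  # converges near the constrained optimum x = 1
```

The separation of step sizes is the essential point: the policy (here, x) equilibrates for the current multiplier before the multiplier moves appreciably.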

Read more
Multiagent Systems

Adaptable and Verifiable BDI Reasoning

Long-term autonomy requires autonomous systems to adapt when their capabilities no longer perform as expected. To achieve this, a system must first be capable of detecting such changes. In this position paper, we describe a system architecture for BDI autonomous agents capable of adapting to changes in a dynamic environment, and we outline the required research. Specifically, we describe an agent-maintained self-model with accompanying theories of durative actions, and the learning of new action descriptions in BDI systems.

Read more
Multiagent Systems

Adaptation and learning over networks under subspace constraints -- Part I: Stability Analysis

This paper considers optimization problems over networks where agents have individual objectives to meet, or individual parameter vectors to estimate, subject to subspace constraints that require the objectives across the network to lie in low-dimensional subspaces. This constrained formulation includes consensus optimization as a special case, and allows for more general task relatedness models such as smoothness. While such formulations can be solved via projected gradient descent, the resulting algorithm is not distributed. Starting from the centralized solution, we propose an iterative and distributed implementation of the projection step, which runs in parallel with the stochastic gradient descent update. We establish in this Part I of the work that, for small step-sizes μ , the proposed distributed adaptive strategy leads to small estimation errors on the order of μ . We examine in the accompanying Part II [2] the steady-state performance. The results will reveal explicitly the influence of the gradient noise, data characteristics, and subspace constraints, on the network performance. The results will also show that in the small step-size regime, the iterates generated by the distributed algorithm achieve the centralized steady-state performance.
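A toy instance of the consensus special case may help (a sketch under stated assumptions: quadratic per-agent costs, a hand-built ring combination matrix, and a single combine step standing in for the iterative distributed projection): each agent alternates a local stochastic-gradient update on streaming data with a neighborhood average.

```python
import numpy as np

# Hedged toy instance: the consensus special case of the subspace
# constraint. Agent k minimizes E[(d_k - w)^2]; one distributed step
# combines a stochastic-gradient update with a neighborhood average
# that plays the role of the (iterative) subspace projection.

rng = np.random.default_rng(1)
N, MU, STEPS = 5, 0.01, 20000
targets = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # agent objectives; mean = 3

# Doubly stochastic combination matrix for a ring of 5 agents.
A = np.zeros((N, N))
for k in range(N):
    A[k, k] = 0.5
    A[k, (k - 1) % N] = 0.25
    A[k, (k + 1) % N] = 0.25

w = np.zeros(N)
for _ in range(STEPS):
    data = targets + rng.standard_normal(N)       # streaming noisy data
    psi = w - MU * (w - data)                     # adapt: stochastic gradient
    w = A @ psi                                   # combine: projection step
print(w.round(2))  # all agents settle near the network optimum of 3
```

Consistent with the stated result, shrinking MU here shrinks both the residual bias across agents and the steady-state fluctuation, at the cost of slower adaptation.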

Read more
Multiagent Systems

Adaptation and learning over networks under subspace constraints -- Part II: Performance Analysis

Part I of this paper considered optimization problems over networks where agents have individual objectives to meet, or individual parameter vectors to estimate, subject to subspace constraints that require the objectives across the network to lie in low-dimensional subspaces. Starting from the centralized projected gradient descent, an iterative and distributed solution was proposed that responds to streaming data and employs stochastic approximations in place of actual gradient vectors, which are generally unavailable. We examined the second-order stability of the learning algorithm and we showed that, for small step-sizes μ , the proposed strategy leads to small estimation errors on the order of μ . This Part II examines steady-state performance. The results reveal explicitly the influence of the gradient noise, data characteristics, and subspace constraints, on the network performance. The results also show that in the small step-size regime, the iterates generated by the distributed algorithm achieve the centralized steady-state performance.

Read more
Multiagent Systems

Adaptive Online Distributed Optimal Control of Very-Large-Scale Robotic Systems

This paper presents an adaptive online distributed optimal control approach that is applicable to optimal planning for very-large-scale robotic systems in highly uncertain environments. The approach is developed based on optimal mass transport theory. It can also be viewed as an online reinforcement learning and approximate dynamic programming approach in the Wasserstein-GMM space, where a novel value functional is defined based on the probability density functions of the robots and the time-varying obstacle map functions describing the changing environmental information. The proposed approach is demonstrated on the path planning problem for very-large-scale robotic systems, where the approximate layout of obstacles in the workspace is incrementally updated by the robots' observations, and it is compared with existing state-of-the-art approaches. The numerical simulation results show that the proposed approach outperforms these approaches in terms of average traveling distance and energy cost.
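The optimal-mass-transport view can be illustrated in its simplest form (a sketch, far removed from the paper's Wasserstein-GMM machinery): the swarm is treated as a density to be transported to a goal density, and in 1-D the optimal transport map with quadratic cost is simply the monotone matching between sorted positions.

```python
import numpy as np

# Hedged illustration of the mass-transport view of swarm planning:
# in 1-D, optimal transport with quadratic cost pairs robots and goals
# in sorted order. The positions below are illustrative.

def monotone_transport(robots, goals):
    # Sort both sides; 1-D OT with quadratic cost is the order-preserving map.
    r_idx = np.argsort(robots)
    g_idx = np.argsort(goals)
    assignment = np.empty(len(robots), dtype=int)
    assignment[r_idx] = g_idx
    return assignment

robots = np.array([0.9, 0.1, 0.5])
goals  = np.array([1.0, 0.0, 0.4])
plan = monotone_transport(robots, goals)
cost = np.sum((robots - goals[plan]) ** 2)
print(plan, cost)  # each robot moves to its order-matched goal
```

The appeal for very large teams is that the density-level formulation sidesteps per-robot combinatorial assignment as the team grows.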

Read more
Multiagent Systems

Adaptive Social Learning

This work proposes a novel strategy for social learning by introducing the critical feature of adaptation. In social learning, several distributed agents continually update their belief about a phenomenon of interest through: i) direct observation of streaming data that they gather locally; and ii) diffusion of their beliefs through local cooperation with their neighbors. Traditional social learning implementations are known to learn the underlying hypothesis well (meaning that the belief of every individual agent peaks at the true hypothesis), achieving steady improvement in learning accuracy under stationary conditions. However, these algorithms do not perform well under the nonstationary conditions commonly encountered in online learning, exhibiting significant inertia in tracking drifts in the streaming data. To address this gap, we propose an Adaptive Social Learning (ASL) strategy, which relies on a small step-size parameter to tune the degree of adaptation. We provide a detailed characterization of the learning performance by means of a steady-state analysis. Focusing on the small step-size regime, we establish that the ASL strategy achieves consistent learning under standard global identifiability assumptions. We derive reliable Gaussian approximations for the probability of error (i.e., of choosing a wrong hypothesis) at each individual agent. We also carry out a large deviations analysis revealing the universal behavior of the adaptive social learner: the error probabilities decrease exponentially fast with the inverse of the step-size, and we characterize the resulting exponential learning rate.
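A small sketch of the adapt-then-combine structure (the two-agent network, likelihood values, and the exact form of the update are illustrative assumptions, not a verified reproduction of the paper's ASL recursion): each agent damps its past log-belief by (1 - delta), injects fresh log-likelihood evidence scaled by delta, then geometrically averages with its neighbors.

```python
import numpy as np

# Hedged sketch of an adaptive social learning step: the step-size DELTA
# discounts memory so beliefs can track drifts. Two agents, two
# hypotheses; likelihood values are illustrative.

DELTA = 0.1                                   # small adaptation step-size
A = np.array([[0.6, 0.4], [0.4, 0.6]])        # doubly stochastic combining

def asl_step(log_beliefs, log_likelihoods):
    # Adapt: discount past belief, inject fresh evidence.
    psi = (1 - DELTA) * log_beliefs + DELTA * log_likelihoods
    # Combine: geometric (log-domain) average over neighbors.
    return A @ psi

# Hypothesis 1 is true: both agents see data more likely under it.
log_like = np.log(np.array([[0.3, 0.7], [0.2, 0.8]]))
logb = np.log(np.full((2, 2), 0.5))
for _ in range(200):
    logb = asl_step(logb, log_like)
beliefs = np.exp(logb) / np.exp(logb).sum(axis=1, keepdims=True)
print(beliefs.round(2))  # both agents' beliefs favor the true hypothesis
```

The trade-off the abstract describes is visible here: a larger DELTA forgets faster (better tracking of drifts), while a smaller DELTA averages more evidence (lower steady-state error probability).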

Read more
Multiagent Systems

Adaptive Workload Allocation for Multi-human Multi-robot Teams for Independent and Homogeneous Tasks

Multi-human multi-robot (MH-MR) systems combine the potential advantages of robotic systems with those of having humans in the loop. Robotic systems contribute precision and long operation on repetitive tasks without tiring, while humans in the loop improve situational awareness and enhance decision-making abilities. A system's ability to adapt the allocated workload to changing conditions and to the performance of each individual (human and robot) during the mission is vital to maintaining overall system performance. Previous works, including market-based and optimization approaches, have attempted to address the task/workload allocation problem with a focus on maximizing system output; they disregard individual agent conditions, lack real-time processing, and have mostly focused exclusively on multi-robot systems. Given the variety of possible team combinations (autonomous robots and human-operated robots: any number of human operators operating any number of robots at a time) and the operational scale of MH-MR systems, developing a generalized workload allocation framework has been particularly challenging. In this paper, we present such a framework for independent homogeneous missions, capable of adaptively allocating the system workload in relation to the health conditions and work performance of human-operated and autonomous robots in real time. The framework consists of removable, modular function blocks, ensuring its applicability to different MH-MR scenarios. A new workload transition function block ensures smooth transitions without the workload change having adverse effects on individual agents. The effectiveness and scalability of the system's workload adaptability are validated by experiments applying the proposed framework in an MH-MR patrolling scenario with changing human and robot conditions and failing robots.
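A minimal sketch of the adaptive allocation plus smooth transition idea (the scores, the proportional rule, and the smoothing factor are illustrative assumptions, not the paper's function blocks): workload shares track each agent's health/performance score, and a first-order filter prevents abrupt load jumps.

```python
# Hedged sketch: workload is re-split in proportion to each agent's
# current health/performance score, and a transition filter smooths the
# change so no agent's load jumps abruptly. All values are illustrative.

SMOOTH = 0.5  # fraction of the gap to the target closed per cycle

def target_shares(scores):
    # Allocate workload proportionally to score.
    total = sum(scores.values())
    return {a: s / total for a, s in scores.items()}

def smooth_reallocate(current, scores):
    # Move each agent's share a fraction of the way toward its target.
    target = target_shares(scores)
    return {a: current[a] + SMOOTH * (target[a] - current[a]) for a in current}

current = {"human1": 0.5, "robot1": 0.25, "robot2": 0.25}
scores  = {"human1": 0.2, "robot1": 0.4, "robot2": 0.4}  # human1 fatigued
for _ in range(3):
    current = smooth_reallocate(current, scores)
print({a: round(s, 3) for a, s in current.items()})
```

A robot failure fits the same loop: setting its score to zero drives its share smoothly to zero while the others absorb the load.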

Read more