IEEE Transactions on Neural Networks and Learning Systems | 2021

Hierarchical Optimal Synchronization for Linear Systems via Reinforcement Learning: A Stackelberg–Nash Game Perspective


Abstract


Considering that, in the real world, a certain agent may have the advantage of acting before others, a novel hierarchical optimal synchronization problem for linear systems, composed of one major agent and multiple minor agents, is formulated and studied in this article from a Stackelberg–Nash game perspective. The major agent makes its decision prior to the others, and then all the minor agents determine their actions simultaneously. To seek the optimal controllers, coupled Hamilton–Jacobi–Bellman (HJB) equations are established, whose solutions are further proven to be stable and to constitute the Stackelberg–Nash equilibrium. Due to the asymmetric roles introduced for the agents, the established HJB equations are more strongly coupled and more difficult to solve than those in most existing works. Therefore, we propose a new reinforcement learning (RL) algorithm, a two-level value iteration (VI) algorithm, which does not rely on complete knowledge of the system matrices. Furthermore, the proposed algorithm is shown to be convergent, and the converged values are exactly the optimal ones. To implement the VI algorithm, neural networks (NNs) are employed to approximate the value functions, and gradient descent is used to update the NN weights. Finally, an illustrative example verifies the effectiveness of the proposed algorithm.
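To make the two-level structure concrete, the following is a minimal Python sketch of the idea described in the abstract: an outer loop that updates the major agent (leader) and an inner loop in which the minor agents (followers) iterate toward a Nash response, with each agent's value function taken as a quadratic form V_i(x) = x' P_i x whose weights P_i are updated by gradient descent on a Bellman residual. This is not the authors' algorithm: it assumes known, low-dimensional dynamics in place of the paper's model-free setting, uses quadratic parameterizations in place of general NNs, and all matrices, step sizes, and iteration counts below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n = 2                                   # state dimension (assumed)
A = np.array([[0.9, 0.1], [0.0, 0.8]])  # drift matrix (assumed)
B = [np.array([[1.0], [0.0]]),          # leader (major agent) input matrix
     np.array([[0.0], [1.0]]),          # follower 1 input matrix
     np.array([[0.5], [0.5]])]          # follower 2 input matrix
Q = [np.eye(n) * q for q in (1.0, 0.5, 0.5)]  # per-agent state weights (assumed)
R = 1.0                                 # common control weight (assumed)

P = [np.eye(n) for _ in range(3)]       # quadratic value-function weights
K = [np.zeros((1, n)) for _ in range(3)]  # feedback gains, u_i = -K_i x

def greedy_gain(i):
    # One-step greedy policy improvement for agent i, other gains held fixed.
    Acl = A - sum(B[j] @ K[j] for j in range(3) if j != i)
    S = R + B[i].T @ P[i] @ B[i]
    return np.linalg.solve(S, B[i].T @ P[i] @ Acl)

for outer in range(200):                # outer level: leader's update
    K[0] = greedy_gain(0)
    for inner in range(20):             # inner level: followers iterate to a Nash response
        K[1] = greedy_gain(1)
        K[2] = greedy_gain(2)
    Acl = A - sum(B[j] @ K[j] for j in range(3))
    for i in range(3):
        for _ in range(50):             # semi-gradient descent on sampled Bellman residuals
            x = rng.standard_normal((n, 1))
            u = -K[i] @ x
            cost = (x.T @ Q[i] @ x + R * (u.T @ u)).item()
            xn = Acl @ x
            resid = (x.T @ P[i] @ x).item() - cost - (xn.T @ P[i] @ xn).item()
            P[i] -= 0.01 * resid * (x @ x.T)   # d(x'Px)/dP = x x'
        P[i] = 0.5 * (P[i] + P[i].T)           # keep P_i symmetric

print("leader gain K0:", K[0])

Nesting the followers' updates inside each leader update mirrors the Stackelberg ordering stated in the abstract: the leader commits first, and the followers settle their simultaneous play against that commitment before the leader moves again.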

Volume 32
Pages 1600–1611
DOI 10.1109/TNNLS.2020.2985738
Language English
Journal IEEE Transactions on Neural Networks and Learning Systems

Full Text