Finite-time Safe Reinforcement Learning Control of Multi-player Nonzero-Sum Game for Quadcopter Systems
Published in Information Sciences, 2025
This paper investigates a finite-time safe reinforcement learning control algorithm for multi-player nonzero-sum games (FT-SRL-NZS). To address the finite-time safe optimal control problem, value functions incorporating designated barrier functions for the involved players are established within the transformed finite-time stable space. The finite-time safe optimal controller is derived from the solution of the transformed Nash equilibrium condition. An actor-critic structure is proposed to solve the Hamilton-Jacobi-Bellman (HJB) equation in the finite-time stable space, approximating the finite-time optimal value function and its corresponding controller with a novel finite-time concurrent learning update law. A dynamic event-trigger rule adjusts the trigger condition in real time, thereby reducing the computational and communication burden of computing the Nash equilibrium. Lyapunov stability analysis is employed to establish finite-time stability of the closed-loop equilibrium. Numerical simulations and unmanned aerial vehicle (UAV) hardware tests are carried out to illustrate the efficacy of the proposed finite-time safe control algorithm.
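The abstract above outlines the main ingredients: a barrier-based transformation for safety, a critic approximation of the value function, a concurrent-learning update that reuses recorded data, and a dynamic event-trigger that limits how often the control is recomputed. The sketch below is only a rough Python illustration of how such pieces can fit together in a single-controller toy loop; it is not the paper's multi-player algorithm, and the barrier choice, basis functions, dynamics, trigger parameters, and learning rates are all assumptions made for the example.

```python
import numpy as np

# Everything below is an illustrative assumption, not the paper's algorithm.

def barrier(x, bound):
    """Odd, log-type barrier map: finite inside |x| < bound, blows up at the edges."""
    xc = np.clip(x, -0.99 * bound, 0.99 * bound)      # keep the toy example numerically safe
    return np.log((bound + xc) / (bound - xc))

def phi(s):
    """Quadratic critic basis for V(s) ~ W^T phi(s) (a hypothetical choice)."""
    return np.array([s[0] ** 2, s[0] * s[1], s[1] ** 2])

def grad_phi(s):
    """Jacobian of phi with respect to the transformed state s."""
    return np.array([[2 * s[0], 0.0],
                     [s[1], s[0]],
                     [0.0, 2 * s[1]]])

def critic_update(W, dphi_now, r_now, memory, lr=0.01):
    """Concurrent-learning step: normalized gradient descent on the squared
    HJB-style residual at the current point plus all recorded points."""
    def grad_term(dphi, r):
        err = W @ dphi + r                            # residual with the current weights
        return lr * err * dphi / (1.0 + dphi @ dphi)  # normalized gradient direction
    dW = grad_term(dphi_now, r_now)
    for dphi_k, r_k in memory:
        dW = dW + grad_term(dphi_k, r_k)
    return W - dW

def should_trigger(x, x_held, eta, c0=0.05, c1=0.5):
    """Dynamic event-trigger: fire when the squared sampling gap exceeds a
    state-dependent threshold relaxed by the internal variable eta."""
    gap = np.linalg.norm(x - x_held)
    return gap ** 2 > c0 * np.linalg.norm(x) ** 2 + c1 * eta

# Toy single-controller loop on a double integrator (not the multi-player game).
dt, bound = 0.01, 2.0
x = np.array([1.0, -0.5])     # position (safety-constrained) and velocity
x_held = x.copy()             # last state at which the control was recomputed
W = np.zeros(3)               # critic weights
eta, u = 1.0, 0.0             # dynamic trigger variable and held control input
memory = []                   # recorded (basis derivative, running cost) samples

for _ in range(2000):
    s = np.array([barrier(x[0], bound), x[1]])        # barrier-transformed state
    if should_trigger(x, x_held, eta):
        grad_V = grad_phi(s).T @ W                    # critic estimate of dV/ds
        u = -0.5 * grad_V[1]                          # input acts on the velocity channel
        x_held = x.copy()                             # control held until the next trigger
    x_next = x + dt * np.array([x[1], u])             # double-integrator Euler step
    s_next = np.array([barrier(x_next[0], bound), x_next[1]])
    dphi = (phi(s_next) - phi(s)) / dt                # finite-difference basis derivative
    r = x @ x + u ** 2                                # running cost: state plus control effort
    if len(memory) < 25:
        memory.append((dphi, r))                      # store data for concurrent learning
    W = critic_update(W, dphi, r, memory)
    eta += dt * (-eta)                                # simplified decay of the trigger variable
    x = x_next

print("final state:", x, "critic weights:", W)
```

In this sketch the control law is re-evaluated only when the trigger fires, which mirrors the abstract's stated goal of cutting the computation and communication spent on recomputing the equilibrium control; all tuning constants here are placeholders rather than values from the paper.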
Recommended citation: Tan, Junkai, Xue, Shuangsi, Guan, Qingshu, Qu, Kai, and Cao, Hui (2025). "Finite-time Safe Reinforcement Learning Control of Multi-player Nonzero-Sum Game for Quadcopter Systems." Information Sciences.
Download Paper