arxiv:2305.17568

Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities

Published on May 27, 2023
AI-generated summary

A primal-dual method with shadow reward and neighbor truncation is proposed for safe multi-agent reinforcement learning, achieving convergence to a first-order stationary point and demonstrating effectiveness through experiments.

Abstract

We investigate safe multi-agent reinforcement learning, where agents seek to collectively maximize an aggregate sum of local objectives while satisfying their own safety constraints. The objective and constraints are described by general utilities, i.e., nonlinear functions of the long-term state-action occupancy measure, which encompass broader decision-making goals such as risk, exploration, or imitation. The exponential growth of the state-action space size with the number of agents presents challenges for global observability, further exacerbated by the global coupling arising from agents' safety constraints. To tackle this issue, we propose a primal-dual method utilizing shadow reward and κ-hop neighbor truncation under a form of correlation decay property, where κ is the communication radius. In the exact setting, our algorithm converges to a first-order stationary point (FOSP) at the rate of O(T^{-2/3}). In the sample-based setting, we demonstrate that, with high probability, our algorithm requires O(ε^{-3.5}) samples to achieve an ε-FOSP with an approximation error of O(φ_0^{2κ}), where φ_0 ∈ (0,1). Finally, we demonstrate the effectiveness of our model through extensive numerical experiments.
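
To make the algorithmic idea concrete, below is a minimal, hypothetical Python sketch of one ingredient the abstract describes: each agent performs a primal gradient step computed only from its κ-hop neighborhood (the truncation), while dual variables for the safety constraints are updated by projected gradient ascent. It is not the authors' implementation; the helper functions local_grad and constraint_violation, the toy parameter shapes, and the step sizes are placeholder assumptions standing in for the shadow-reward policy gradients and constraint estimates of the paper.

# Illustrative sketch (not the paper's code): one primal-dual loop with
# kappa-hop truncated local gradients and projected dual ascent.
import numpy as np

rng = np.random.default_rng(0)

n_agents, kappa = 6, 1                    # agents on a line graph; communication radius
theta = rng.normal(size=(n_agents, 4))    # toy local policy parameters
lam = np.zeros(n_agents)                  # dual variables, one safety constraint per agent
eta_theta, eta_lam = 0.1, 0.05            # primal / dual step sizes

def local_grad(i, theta, lam):
    # Hypothetical placeholder: a stochastic gradient for agent i's Lagrangian
    # term, using information from the kappa-hop neighborhood only. In the
    # paper this would be a policy gradient under the "shadow reward"
    # (derivative of the general utility w.r.t. the occupancy measure).
    nbrs = range(max(0, i - kappa), min(n_agents, i + kappa + 1))
    return sum(rng.normal(size=theta[i].shape) - 0.01 * lam[j] for j in nbrs)

def constraint_violation(i, theta):
    # Hypothetical placeholder for agent i's estimated safety-constraint slack.
    return float(np.tanh(theta[i].sum()))

for t in range(100):
    # Primal step: each agent updates its parameters with its truncated gradient.
    theta += eta_theta * np.stack([local_grad(i, theta, lam) for i in range(n_agents)])
    # Dual step: projected gradient ascent keeps the multipliers nonnegative.
    lam = np.maximum(0.0, lam + eta_lam * np.array(
        [constraint_violation(i, theta) for i in range(n_agents)]))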
