On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization

Gaur, Mudit; Bedi, Amrit Singh; Wang, Di; Aggarwal, Vaneet

arXiv:2306.10486 (cs)

[Submitted on 18 Jun 2023]

Abstract: Actor-critic algorithms have shown remarkable success in solving state-of-the-art decision-making problems. However, despite their empirical effectiveness, their theoretical underpinnings remain relatively unexplored, especially with neural network parametrization. In this paper, we delve into the study of a natural actor-critic algorithm that utilizes neural networks to represent the critic. Our aim is to establish sample complexity guarantees for this algorithm, achieving a deeper understanding of its performance characteristics. To achieve that, we propose a Natural Actor-Critic algorithm with 2-Layer critic parametrization (NAC2L). Our approach involves estimating the $Q$-function in each iteration through a convex optimization problem. We establish that our proposed approach attains a sample complexity of $\tilde{\mathcal{O}}\left(\frac{1}{\epsilon^{4}(1-\gamma)^{4}}\right)$. In contrast, the existing sample complexity results in the literature only hold for a tabular or linear MDP. Our result, on the other hand, holds for countable state spaces and does not require a linear or low-rank structure on the MDP.
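
To give a feel for the stated bound, the following is a minimal sketch that evaluates only the leading term $1/(\epsilon^{4}(1-\gamma)^{4})$. It assumes that constants and the logarithmic factors hidden by the $\tilde{\mathcal{O}}$ notation are dropped, so the values are relative scalings and not actual sample counts. The function name `nac2l_bound_scaling` is hypothetical and does not come from the paper.

```python
def nac2l_bound_scaling(epsilon: float, gamma: float) -> float:
    """Leading term 1 / (epsilon^4 * (1 - gamma)^4) of the stated bound.

    Assumption: constants and log factors are ignored, so the result is
    only meaningful as a ratio between two settings.
    """
    if epsilon <= 0.0:
        raise ValueError("epsilon must be positive")
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return 1.0 / (epsilon ** 4 * (1.0 - gamma) ** 4)


# Halving the target accuracy epsilon scales the term by 2^4.
ratio_eps = nac2l_bound_scaling(0.05, 0.9) / nac2l_bound_scaling(0.1, 0.9)

# Halving the effective horizon gap (1 - gamma), i.e. gamma 0.9 -> 0.95,
# scales the term by the same factor.
ratio_gamma = nac2l_bound_scaling(0.1, 0.95) / nac2l_bound_scaling(0.1, 0.9)

print(f"epsilon halved: x{ratio_eps:.1f}")
print(f"(1 - gamma) halved: x{ratio_gamma:.1f}")
```

Because both quantities enter with the fourth power, halving either $\epsilon$ or $1-\gamma$ multiplies the leading term by roughly 16 under these assumptions.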

Comments:	arXiv admin note: text overlap with arXiv:2211.07675
Subjects:	Machine Learning (cs.LG)
ACM classes:	F.2.1
Cite as:	arXiv:2306.10486 [cs.LG]
	(or arXiv:2306.10486v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.10486

Submission history

From: Mudit Gaur Mr.
[v1] Sun, 18 Jun 2023 06:22:04 UTC (608 KB)
