A PDE-Based Analysis of the Symmetric Two-Armed Bernoulli Bandit

Kobzar, Vladimir A.; Kohn, Robert V.

arXiv:2202.05767 (cs)

[Submitted on 11 Feb 2022 (v1), last revised 15 Jul 2023 (this version, v5)]

Abstract: This work addresses a version of the two-armed Bernoulli bandit problem in which the means of the two arms sum to one (the symmetric two-armed Bernoulli bandit). In a regime where the gap between these means goes to zero as the number of prediction periods approaches infinity, i.e., where the difficulty of detecting the gap increases with the sample size, we obtain the leading order terms of the minmax optimal regret and pseudoregret for this problem by associating each of them with a solution of a linear heat equation. Our results improve upon previously known results; specifically, we explicitly compute these leading order terms in three different scaling regimes for the gap. Additionally, we obtain new non-asymptotic bounds for any given time horizon. Although optimal player strategies are not known for more general bandit problems, there is significant interest in how regret accumulates under specific player strategies, even when they are not known to be optimal. We expect that the methods of this paper will be useful in settings of that type.
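To make the setting concrete, the following is a minimal simulation sketch of the symmetric two-armed Bernoulli bandit, written for illustration only. It is not taken from the paper: the function name `simulate_pseudoregret`, the parameters `gap`, `horizon` and `seed`, and the leader-following rule used by the player are all assumptions made here. The sketch takes the arm means to be 1/2 + gap/2 and 1/2 - gap/2, so that they sum to one, and measures pseudoregret as the gap times the number of pulls of the worse arm.

```python
import random


def simulate_pseudoregret(gap, horizon, seed=0):
    """Run one simulated game and return the pseudoregret.

    Illustrative assumptions (not from the paper): arm 0 has mean
    1/2 + gap/2, arm 1 has mean 1/2 - gap/2, and the player follows
    a simple leader-following rule based on a signed evidence count.
    """
    rng = random.Random(seed)
    means = (0.5 + gap / 2, 0.5 - gap / 2)  # the two means sum to one
    evidence = 0          # positive values favour arm 0
    suboptimal_pulls = 0  # number of times arm 1 (the worse arm) is played

    for _ in range(horizon):
        # Play the arm currently favoured by the evidence (ties go to arm 0).
        arm = 0 if evidence >= 0 else 1
        reward = 1 if rng.random() < means[arm] else 0

        # Because the means sum to one, a success on one arm carries the
        # same information as a failure on the other arm.
        signed = 1 if reward == 1 else -1
        evidence += signed if arm == 0 else -signed

        if arm == 1:
            suboptimal_pulls += 1

    # Pseudoregret: expected reward lost relative to always playing arm 0.
    return gap * suboptimal_pulls
```

A call such as `simulate_pseudoregret(gap=horizon ** -0.5, horizon=horizon)` would correspond to one possible way of letting the gap shrink as the horizon grows; the specific scaling regimes analysed in the paper are not reproduced here.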

Comments:	Improved results in the large gap regime
Subjects:	Machine Learning (cs.LG); Analysis of PDEs (math.AP); Machine Learning (stat.ML)
MSC classes:	68W27 (Primary) 93E20 49L20 35Q93 35K05 (Secondary)
ACM classes:	I.2.8
Cite as:	arXiv:2202.05767 [cs.LG]
	(or arXiv:2202.05767v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2202.05767

Submission history

From: Vladimir Kobzar
[v1] Fri, 11 Feb 2022 17:03:18 UTC (96 KB)
[v2] Fri, 9 Sep 2022 15:01:51 UTC (126 KB)
[v3] Sat, 12 Nov 2022 08:09:09 UTC (102 KB)
[v4] Mon, 10 Jul 2023 17:06:02 UTC (101 KB)
[v5] Sat, 15 Jul 2023 02:11:47 UTC (104 KB)
