Weighted Gaussian Process Bandits for Non-stationary Environments

Deng, Yuntian; Zhou, Xingyu; Kim, Baekjin; Tewari, Ambuj; Gupta, Abhishek; Shroff, Ness

arXiv:2107.02371 (cs)

[Submitted on 6 Jul 2021 (v1), last revised 28 Mar 2022 (this version, v4)]

Abstract: In this paper, we consider the Gaussian process (GP) bandit optimization problem in a non-stationary environment. To capture external changes, the black-box function is allowed to be time-varying within a reproducing kernel Hilbert space (RKHS). To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression. A key challenge is how to cope with infinite-dimensional feature maps. To that end, we leverage kernel approximation techniques to prove a sublinear regret bound, which is the first (frequentist) sublinear regret guarantee on weighted time-varying bandits with general nonlinear rewards. This result generalizes both non-stationary linear bandits and standard GP-UCB algorithms. Further, a novel concentration inequality is achieved for weighted Gaussian process regression with general weights. We also provide universal upper bounds and weight-dependent upper bounds for weighted maximum information gains. These results are of independent interest for applications such as news ranking and adaptive pricing, where weights can be adopted to capture the importance or quality of data. Finally, we conduct experiments to highlight the favorable gains of the proposed algorithm in many cases when compared to existing methods.
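
As a rough illustration of the idea of a UCB rule built on weighted GP regression, the sketch below fits a weighted kernel ridge regressor and picks the candidate with the largest upper confidence bound. It is not the paper's WGP-UCB: the exponential weights `gamma ** (t - s)`, the fixed exploration constant `beta`, the RBF kernel and its lengthscale, the regularizer `lam`, and all function names are assumptions made for this example, and the variance used here is the standard GP posterior variance with per-observation noise `lam / weight`, which may differ from the confidence width analyzed in the paper.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D point arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def weighted_gp_posterior(x_obs, y_obs, weights, x_query, lam=1.0):
    """Weighted GP / kernel ridge posterior (illustrative assumption).

    Observation s is treated as having noise variance lam / weights[s],
    so low-weight (for example, old) observations influence the fit less.
    """
    K = rbf_kernel(x_obs, x_obs)
    k_q = rbf_kernel(x_obs, x_query)            # shape (n_obs, n_query)
    A = K + lam * np.diag(1.0 / weights)
    mean = k_q.T @ np.linalg.solve(A, y_obs)
    # The RBF kernel has k(x, x) = 1, hence the leading 1.0.
    var = 1.0 - np.sum(k_q * np.linalg.solve(A, k_q), axis=0)
    return mean, np.maximum(var, 0.0)

def ucb_choice(x_obs, y_obs, candidates, gamma=0.9, beta=2.0):
    """Pick the candidate maximizing mean + beta * std under discounting."""
    t = len(x_obs)
    # Hypothetical exponential discounting: the newest observation has weight 1.
    weights = gamma ** np.arange(t - 1, -1, -1)
    mean, var = weighted_gp_posterior(x_obs, y_obs, weights, candidates)
    return candidates[np.argmax(mean + beta * np.sqrt(var))]
```

With `gamma = 1` all weights are equal and the posterior reduces to ordinary GP regression with noise variance `lam`; smaller `gamma` forgets old observations faster, which is the intuition behind using weights in a time-varying environment.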

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2107.02371 [cs.LG]
	(or arXiv:2107.02371v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2107.02371

Submission history

From: Yuntian Deng
[v1] Tue, 6 Jul 2021 03:37:33 UTC (838 KB)
[v2] Sun, 6 Feb 2022 16:57:34 UTC (676 KB)
[v3] Tue, 15 Feb 2022 15:36:24 UTC (677 KB)
[v4] Mon, 28 Mar 2022 17:34:27 UTC (678 KB)
