Stochastic Submodular Bandits with Delayed Composite Anonymous Bandit Feedback

Pedramfar, Mohammad; Aggarwal, Vaneet

arXiv:2303.13604 (cs)

[Submitted on 23 Mar 2023]


Abstract: This paper investigates combinatorial multi-armed bandits with stochastic rewards that are submodular in expectation and full-bandit delayed feedback, where the delayed feedback is assumed to be composite and anonymous. In other words, the feedback observed at each round is composed of components of rewards from past actions, and how it divides among those actions is unknown. Three models of delayed feedback are studied: bounded adversarial, stochastic independent, and stochastic conditionally independent, and regret bounds are derived for each. Ignoring problem-dependent parameters, we show that the regret bound for all three delay models is $\tilde{O}(T^{2/3} + T^{1/3} \nu)$ for time horizon $T$, where $\nu$ is a delay parameter defined differently in each of the three cases. Delay thus contributes an additive term to the regret in all three models. The considered algorithm is demonstrated to outperform other full-bandit approaches with delayed composite anonymous feedback.
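To make the composite anonymous feedback model concrete, here is a minimal Python sketch. It is not the paper's algorithm or its exact delay model. It assumes a bounded delay of at most `max_delay` rounds and a random split of each reward across those rounds, and the names `spread_reward` and `simulate_feedback` are hypothetical. It shows only that the learner sees one aggregated number per round and cannot tell which past action produced which part.

```python
import random


def spread_reward(reward, max_delay, rng):
    """Split one reward into max_delay + 1 non-negative parts summing to reward.

    The random split is an illustrative assumption, not the paper's model.
    """
    weights = [rng.random() for _ in range(max_delay + 1)]
    total = sum(weights)
    return [reward * w / total for w in weights]


def simulate_feedback(rewards, max_delay, seed=0):
    """Return the aggregated observation per round (composite and anonymous)."""
    rng = random.Random(seed)
    # Rewards from the last rounds may still arrive after the horizon ends.
    observed = [0.0] * (len(rewards) + max_delay)
    for t, reward in enumerate(rewards):
        parts = spread_reward(reward, max_delay, rng)
        for lag, part in enumerate(parts):
            # Parts from different actions are summed, so their origin is lost.
            observed[t + lag] += part
    return observed


# Hypothetical per-round rewards of the actions played at rounds 0..3.
rewards = [1.0, 0.5, 0.0, 2.0]
observed = simulate_feedback(rewards, max_delay=2)
print(observed)
```

The total observed feedback equals the total reward, but the observation at any single round mixes contributions from up to `max_delay + 1` past actions, which is what makes attributing reward to individual actions difficult.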

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:2303.13604 [cs.LG]
	(or arXiv:2303.13604v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2303.13604

Submission history

From: Mohammad Pedramfar
[v1] Thu, 23 Mar 2023 18:38:33 UTC (139 KB)
