A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

We present a modified tuning of the algorithm of Zimmert and Seldin [2020] for adversarial multiarmed bandits with delayed feedback. In addition to the minimax optimal adversarial regret guarantee shown by Zimmert and Seldin, it simultaneously achieves a near-optimal regret guarantee in the stochastic setting with fixed delays. Specifically, the adversarial regret guarantee is $O(\sqrt{TK} + \sqrt{dT \log K})$, where $T$ is the time horizon, $K$ is the number of arms, and $d$ is the fixed delay, whereas the stochastic regret guarantee is O (equation presented), where $\Delta_i$ are the suboptimality gaps. We also present an extension of the algorithm to the case of arbitrary delays, which relies on oracle knowledge of the maximal delay $d_{\max}$ and achieves $O(\sqrt{TK} + \sqrt{D \log K} + d_{\max} K^{1/3} \log K)$ regret in the adversarial regime, where $D$ is the total delay, and O (equation presented) regret in the stochastic regime, where $\sigma_{\max}$ is the maximal number of outstanding observations. Finally, we present a lower bound that matches the refined adversarial regret upper bound achieved by the skipping technique of Zimmert and Seldin [2020].
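
The delay quantities named in the abstract (the total delay D, the maximal delay d_max, and the maximal number of outstanding observations σ_max) can be illustrated with a short Python sketch. It is not the authors' algorithm and does not come from the paper: the function names are hypothetical, and the convention that a loss from round t with delay d_t counts as outstanding in rounds t+1 through t+d_t is an assumption made here. The two bound functions only evaluate the adversarial rates stated above with all constants dropped, so their values indicate growth rates, not actual regret.

```python
import math


def delay_statistics(delays):
    """Return (D, d_max, sigma_max) for a list of per-round delays.

    Assumed convention (not from the paper): rounds are 0-indexed and the
    loss of round t, with delay d_t, is still outstanding when the learner
    acts in rounds t+1, ..., t+d_t.
    """
    horizon = len(delays)
    total_delay = sum(delays)
    d_max = max(delays, default=0)

    # outstanding[s] counts observations still missing when acting in round s.
    outstanding = [0] * (horizon + 1)
    for t, d in enumerate(delays):
        last_round = min(t + d, horizon - 1)
        for s in range(t + 1, last_round + 1):
            outstanding[s] += 1

    sigma_max = max(outstanding, default=0)
    return total_delay, d_max, sigma_max


def adversarial_rate_fixed(T, K, d):
    """sqrt(TK) + sqrt(dT log K), the fixed-delay rate with constants dropped."""
    return math.sqrt(T * K) + math.sqrt(d * T * math.log(K))


def adversarial_rate_arbitrary(T, K, D, d_max):
    """sqrt(TK) + sqrt(D log K) + d_max K^(1/3) log K, constants dropped."""
    return (
        math.sqrt(T * K)
        + math.sqrt(D * math.log(K))
        + d_max * K ** (1 / 3) * math.log(K)
    )


if __name__ == "__main__":
    # Fixed delay d = 2 over T = 5 rounds, so D = d * T.
    D, d_max, sigma_max = delay_statistics([2, 2, 2, 2, 2])
    print(D, d_max, sigma_max)
    print(adversarial_rate_fixed(T=5, K=4, d=2))
    print(adversarial_rate_arbitrary(T=5, K=4, D=D, d_max=d_max))
```

With a fixed delay d, the total delay is D = dT, so the first two terms of the arbitrary-delay rate coincide with the fixed-delay rate and the two expressions differ only by the additive d_max K^{1/3} log K term.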

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Number of pages: 26
Publisher: NeurIPS Proceedings
Publication date: 2022
ISBN (Electronic): 9781713871088
Publication status: Published - 2022
Event: 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States
Duration: 28 Nov 2022 to 9 Dec 2022

Conference

Conference: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country: United States
City: New Orleans
Period: 28/11/2022 to 09/12/2022
Series: Advances in Neural Information Processing Systems
Volume: 35
ISSN: 1049-5258

Bibliographical note

Publisher Copyright:
© 2022 Neural information processing systems foundation. All rights reserved.

ID: 383431352