Computer Science > Machine Learning

arXiv:2205.06811 (cs)

[Submitted on 13 May 2022 (v1), last revised 10 Jul 2022 (this version, v2)]

Title: Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions

Authors: Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu


Abstract: We study the linear contextual bandit problem in the presence of adversarial corruption, where the reward at each round is corrupted by an adversary, and the corruption level (i.e., the sum of corruption magnitudes over the horizon) is $C\geq 0$. The best-known algorithms in this setting are limited: they are either computationally inefficient, require a strong assumption on the corruption, or incur a regret at least $C$ times worse than the regret without corruption. To overcome these limitations, we propose a new algorithm based on the principle of optimism in the face of uncertainty. At the core of our algorithm is a weighted ridge regression in which the weight of each chosen action depends on its confidence up to some threshold. We show that, for both the known-$C$ and unknown-$C$ cases, our algorithm with a proper choice of hyperparameter achieves a regret that nearly matches the lower bounds. Thus, our algorithm is nearly optimal up to logarithmic factors in both cases. Notably, it achieves near-optimal regret for the corrupted and uncorrupted ($C=0$) cases simultaneously.
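The abstract names a weighted ridge regression with confidence-dependent weights but does not state the weighting rule. As a rough illustration only, the sketch below assumes one natural rule: a chosen action $x$ receives weight $\min\{1, \alpha / \|x\|_{\Sigma^{-1}}\}$, so that highly uncertain actions are down-weighted and cannot let a corrupted reward dominate the estimate. The class name `WeightedRidge`, the helper `solve`, and the parameters `lam` and `alpha` are hypothetical and chosen for this example; it is written in plain Python to stay dependency-free and is not the authors' implementation.

```python
import math


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


class WeightedRidge:
    """Ridge regression whose per-sample weight is capped by a threshold.

    Assumption (not stated in the abstract): weight = min(1, alpha / bonus),
    where bonus = sqrt(x^T Sigma^{-1} x) measures how uncertain action x is.
    """

    def __init__(self, dim, lam=1.0, alpha=0.5):
        # Sigma starts as lam * I (the ridge regularizer); b starts at zero.
        self.Sigma = [[lam if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim
        self.alpha = alpha

    def weight(self, x):
        z = solve(self.Sigma, x)  # z = Sigma^{-1} x
        bonus = math.sqrt(sum(xi * zi for xi, zi in zip(x, z)))
        # Equivalent to min(1, alpha / bonus), safe when bonus == 0.
        return 1.0 if bonus <= self.alpha else self.alpha / bonus

    def update(self, x, reward):
        """Add one (action, reward) pair with its confidence-based weight."""
        w = self.weight(x)
        d = len(x)
        for i in range(d):
            self.b[i] += w * reward * x[i]
            for j in range(d):
                self.Sigma[i][j] += w * x[i] * x[j]
        return w

    def estimate(self):
        """Return the weighted ridge estimate Sigma^{-1} b."""
        return solve(self.Sigma, self.b)


model = WeightedRidge(dim=2, lam=1.0, alpha=0.5)
first_weight = model.update([1.0, 0.0], reward=1.0)
theta_hat = model.estimate()
```

Under this assumed rule, an action observed for the first time in a poorly explored direction gets a weight below 1, and the weight rises toward 1 as that direction accumulates data.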

Comments:	25 pages, 1 table. This version simplifies the proof of the regret upper bound in Version 1, and provides a stronger result for the lower bound
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2205.06811 [cs.LG]
	(or arXiv:2205.06811v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2205.06811

Submission history

From: Quanquan Gu
[v1] Fri, 13 May 2022 17:58:58 UTC (29 KB)
[v2] Sun, 10 Jul 2022 02:02:58 UTC (25 KB)
