Computer Science > Machine Learning

arXiv:2210.00895 (cs)

[Submitted on 30 Sep 2022 (v1), last revised 6 Feb 2023 (this version, v2)]

Title: On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits

Authors: Antoine Barrier (UMPA-ENSL, LMO, CELESTE), Aurélien Garivier (UMPA-ENSL, LIP), Gilles Stoltz (LMO, CELESTE)

Abstract: We lay the foundations of a non-parametric theory of best-arm identification in multi-armed bandits with a fixed budget T. We consider general, possibly non-parametric, models D for distributions over the arms; an overarching example is the model D = P(0,1) of all probability distributions over [0,1]. We propose upper bounds on the average log-probability of misidentifying the optimal arm based on information-theoretic quantities that correspond to infima over Kullback-Leibler divergences between some distributions in D and a given distribution. This is made possible by a refined analysis of the successive-rejects strategy of Audibert, Bubeck, and Munos (2010). We finally provide lower bounds on the same average log-probability, also in terms of the same new information-theoretic quantities; these lower bounds are larger when the (natural) assumptions on the considered strategies are stronger. All these new upper and lower bounds generalize existing bounds based, e.g., on gaps between distributions.
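
For readers unfamiliar with the successive-rejects strategy mentioned in the abstract, the following is a minimal sketch of its phase-based allocation as commonly stated for Audibert, Bubeck, and Munos (2010). It is not taken from this paper and does not reflect its refined analysis. The function name `successive_rejects`, the arm interface (a callable taking a random generator and returning a reward in [0,1]), and the Bernoulli example arms are all illustrative assumptions.

```python
import math
import random


def successive_rejects(arms, budget, rng):
    """Sketch of successive rejects with a fixed budget.

    arms   : list of callables, each taking `rng` and returning a reward
             in [0, 1] (hypothetical interface chosen for this sketch)
    budget : total number of pulls allowed (the T of the abstract)
    rng    : a random.Random instance passed to every arm
    Returns (index of the recommended arm, list of pull counts per arm).
    """
    K = len(arms)
    if K < 2 or budget <= K:
        raise ValueError("need at least 2 arms and budget > number of arms")

    # Normalizing constant: 1/2 + sum_{i=2}^{K} 1/i
    log_bar = 0.5 + sum(1.0 / i for i in range(2, K + 1))

    active = list(range(K))
    sums = [0.0] * K
    counts = [0] * K
    n_prev = 0

    # K - 1 phases; one arm is rejected at the end of each phase.
    for k in range(1, K):
        # Cumulative number of pulls each surviving arm has after phase k.
        n_k = math.ceil((budget - K) / (log_bar * (K + 1 - k)))
        for a in active:
            for _ in range(n_k - n_prev):
                sums[a] += arms[a](rng)
                counts[a] += 1
        n_prev = n_k
        # Reject the active arm with the lowest empirical mean.
        worst = min(active, key=lambda a: sums[a] / counts[a])
        active.remove(worst)

    return active[0], counts


def bernoulli(p):
    """Hypothetical helper: an arm with Bernoulli(p) rewards."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


if __name__ == "__main__":
    example_arms = [bernoulli(0.3), bernoulli(0.5), bernoulli(0.7), bernoulli(0.4)]
    best, pulls = successive_rejects(example_arms, budget=2000, rng=random.Random(0))
    print("recommended arm:", best, "pulls per arm:", pulls)
```

The phase lengths are chosen so that the total number of pulls never exceeds the budget, and arms that survive longer are sampled more often. The sketch uses only empirical means for rejection, which applies to any distributions over [0,1], in line with the non-parametric model D = P(0,1) named in the abstract.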

Subjects:	Machine Learning (cs.LG); Information Theory (cs.IT); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2210.00895 [cs.LG]
	(or arXiv:2210.00895v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2210.00895
Journal reference:	ALT 2023 - The 34th International Conference on Algorithmic Learning Theory, Feb 2023, Singapore

Submission history

From: Gilles Stoltz [via CCSD proxy]
[v1] Fri, 30 Sep 2022 10:55:40 UTC (291 KB)
[v2] Mon, 6 Feb 2023 14:56:11 UTC (108 KB)