Regret to the best vs. regret to the average

Eyal Even-Dar¹,
Michael Kearns²,
Yishay Mansour^1,3 &
…
Jennifer Wortman²

799 Accesses
13 Citations
1 Altmetric
Explore all metrics

Abstract

We study online regret minimization algorithms in an experts setting. In this setting, the algorithm chooses a distribution over experts at each time step and receives a gain that is a weighted average of the experts’ instantaneous gains. We consider a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the average of all experts, the regret to any given fixed mixture of experts, or the regret to the worst expert. This study leads both to new understanding of the limitations of existing no-regret algorithms, and to new algorithms with novel performance guarantees. More specifically, we show that any algorithm that achieves only \(O(\sqrt{T})\) cumulative regret to the best expert on a sequence of T trials must, in the worst case, suffer regret \(\varOmega(\sqrt{T})\) to the average, and that for a wide class of update rules that includes many existing no-regret algorithms (such as Exponential Weights and Follow the Perturbed Leader), the product of the regret to the best and the regret to the average is, in the worst case, Ω(T). We then describe and analyze two alternate new algorithms that both achieve cumulative regret only \(O(\sqrt{T}\log T)\) to the best expert and have only constant regret to any given fixed distribution over experts (that is, with no dependence on either T or the number of experts N). The key to the first algorithm is the gradual increase in the “aggressiveness” of updates in response to observed divergences in expert performances. The second algorithm is a simple twist on standard exponential-update algorithms.

Article PDF

An Efficient Algorithm for Learning with Semi-bandit Feedback

Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence

Exploiting problem structure in optimization under uncertainty via online convex optimization

Article 30 March 2018

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

References

Auer, P., Cesa-Bianchi, N., & Gentile, C. (2002). Adaptive and self-confident on-line learning algorithms. Journal of Computer and System Sciences, 64, 48–75.
Article MATH MathSciNet Google Scholar
Cesa-Bianchi, N., & Lugosi, G. (2006). Prediction, learning, and games. Cambridge: Cambridge University Press.
MATH Google Scholar
Cesa-Bianchi, N., Mansour, Y., & Stoltz, G. (2007). Improved second-order bounds for prediction with expert advice. Machine Learning, 66(2/3), 321–352.
Article Google Scholar
Cover, T. (1991). Universal portfolios. Mathematical Finance, 1(1), 1–19.
Article MATH MathSciNet Google Scholar
Even-Dar, E., Kearns, M., Mansour, Y., & Wortman, J. (2007). Regret to the best versus regret to the average. In Twentieth annual conference on learning theory (pp. 233–247).
Freund, Y. (2003). Predicting a binary sequence almost as well as the optimal biased coin. Information and Computation, 182(2), 73–94.
Article MATH MathSciNet Google Scholar
Helmbold, D., Schapire, R., Singer, Y., & Warmuth, M. (1998). On-line portfolio selection using multiplicative updates. Mathematical Finance, 8(4), 325–347.
Article MATH Google Scholar
Kalai, A., & Vempala, S. (2005). Efficient algorithms for on-line optimization. Journal of Computer and System Sciences, 71(3), 291–307.
Article MATH MathSciNet Google Scholar
Littlestone, N., & Warmuth, M. K. (1994). The weighted majority algorithm. Information and Computation, 108(2), 212–261.
Article MATH MathSciNet Google Scholar
Vovk, V. (1998). A game of prediction with expert advice. Journal of Computer and System Sciences, 56(2), 153–173.
Article MATH MathSciNet Google Scholar
Zhang, T. (2007). Personal communication.

Download references

Author information

Authors and Affiliations

Google Research, 76 Ninth Avenue, New York, NY, 10011, USA
Eyal Even-Dar & Yishay Mansour
Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
Michael Kearns & Jennifer Wortman
Department of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
Yishay Mansour

Authors

Eyal Even-Dar
View author publications
You can also search for this author in PubMed Google Scholar
Michael Kearns
View author publications
You can also search for this author in PubMed Google Scholar
Yishay Mansour
View author publications
You can also search for this author in PubMed Google Scholar
Jennifer Wortman
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jennifer Wortman.

Additional information

Editors: Claudio Gentile, Nader H. Bshouty.

A preliminary version of this work appeared in the Proceedings of the Twentieth Annual Conference on Learning Theory, 2007 (Even-Dar et al. 2007).

This work was done while E. Even-Dar was a post doctoral researcher at the University of Pennsylvania.

Y. Mansour was supported in part by grant no. 1079/04 from the Israel Science Foundation, a grant from BSF, an IBM faculty award, and the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778. This paper reflects only the authors’ views.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Even-Dar, E., Kearns, M., Mansour, Y. et al. Regret to the best vs. regret to the average. Mach Learn 72, 21–37 (2008). https://doi.org/10.1007/s10994-008-5060-z

Download citation

Received: 30 August 2007
Revised: 28 March 2008
Accepted: 04 April 2008
Published: 13 May 2008
Issue Date: August 2008
DOI: https://doi.org/10.1007/s10994-008-5060-z

Regret to the best vs. regret to the average

Abstract

Article PDF

Similar content being viewed by others

An Efficient Algorithm for Learning with Semi-bandit Feedback

Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence

Exploiting problem structure in optimization under uncertainty via online convex optimization

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Regret to the best vs. regret to the average

Abstract

Article PDF

Similar content being viewed by others

An Efficient Algorithm for Learning with Semi-bandit Feedback

Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence

Exploiting problem structure in optimization under uncertainty via online convex optimization

Explore related subjects

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation