Non-stationary Projection-Free Online Learning with Dynamic and Adaptive Regret Guarantees

Authors

  • Yibo Wang — National Key Laboratory for Novel Software Technology, Nanjing University; School of Artificial Intelligence, Nanjing University
  • Wenhao Yang — National Key Laboratory for Novel Software Technology, Nanjing University; School of Artificial Intelligence, Nanjing University
  • Wei Jiang — National Key Laboratory for Novel Software Technology, Nanjing University
  • Shiyin Lu — Alibaba Group
  • Bing Wang — Alibaba Group
  • Haihong Tang — Alibaba Group
  • Yuanyu Wan — School of Software Technology, Zhejiang University
  • Lijun Zhang — National Key Laboratory for Novel Software Technology, Nanjing University; School of Artificial Intelligence, Nanjing University

DOI:

https://doi.org/10.1609/aaai.v38i14.29495

Keywords:

ML: Online Learning & Bandits, ML: Optimization, ML: Time-Series/Data Streams

Abstract

Projection-free online learning has drawn increasing interest due to its efficiency in solving high-dimensional problems with complicated constraints. However, most existing projection-free online methods focus on minimizing the static regret, which unfortunately fails to capture the challenge of changing environments. In this paper, we investigate non-stationary projection-free online learning, and choose dynamic regret and adaptive regret to measure the performance. Specifically, we first provide a novel dynamic regret analysis for an existing projection-free method named BOGD_IP, and establish an O(T^¾ (1+P_T)) dynamic regret bound, where P_T denotes the path-length of the comparator sequence. Then, we improve the upper bound to O(T^¾ (1+P_T)^¼) by running multiple BOGD_IP algorithms with different step sizes in parallel, and tracking the best one on the fly. Our results are the first general-case dynamic regret bounds for projection-free online learning, and can recover the existing O(T^¾) static regret by setting P_T = 0. Furthermore, we propose a projection-free method to attain an O(τ^¾) adaptive regret bound for any interval with length τ, which nearly matches the static regret over that interval. The essential idea is to maintain a set of BOGD_IP algorithms dynamically, and combine them by a meta algorithm. Moreover, we demonstrate that it is also equipped with an O(T^¾ (1+P_T)^¼) dynamic regret bound. Finally, empirical studies verify our theoretical findings.
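The improved dynamic regret bound above comes from a two-level scheme: several copies of the base algorithm run in parallel with different step sizes, and a meta algorithm tracks the best copy on the fly. The sketch below illustrates that expert-tracking structure only; it is not the paper's BOGD_IP. Each expert is plain projected online gradient descent on a Euclidean ball, the meta algorithm is Hedge over the experts' linearized losses, and the function name, step-size grid, and meta learning rate are all illustrative assumptions.

```python
import numpy as np

def track_best_step_size(grads, etas, lr_meta=0.5, dim=2, radius=1.0):
    """Illustrative two-level online learner (NOT the paper's BOGD_IP):
    one projected-OGD expert per candidate step size, combined by a
    Hedge (multiplicative-weights) meta algorithm."""
    experts = [np.zeros(dim) for _ in etas]   # one iterate per step size
    weights = np.ones(len(etas)) / len(etas)  # meta weights over experts
    decisions = []
    for g in grads:                           # g: loss gradient at round t
        # meta decision: weighted combination of the experts' iterates
        x = sum(w * e for w, e in zip(weights, experts))
        decisions.append(x)
        # linearized loss of each expert drives the Hedge update
        losses = np.array([e @ g for e in experts])
        weights *= np.exp(-lr_meta * (losses - losses.min()))
        weights /= weights.sum()
        # each expert takes its own OGD step, projected back onto the ball
        for i, eta in enumerate(etas):
            e = experts[i] - eta * g
            nrm = np.linalg.norm(e)
            experts[i] = e if nrm <= radius else e * (radius / nrm)
    return decisions
```

Since every expert stays inside the ball and the meta decision is a convex combination, the combined iterate is always feasible; the actual method replaces the Euclidean projection in each expert with a projection-free update suited to complicated constraint sets.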

Published

2024-03-24

How to Cite

Wang, Y., Yang, W., Jiang, W., Lu, S., Wang, B., Tang, H., Wan, Y., & Zhang, L. (2024). Non-stationary Projection-Free Online Learning with Dynamic and Adaptive Regret Guarantees. Proceedings of the AAAI Conference on Artificial Intelligence, 38(14), 15671-15679. https://doi.org/10.1609/aaai.v38i14.29495

Section

AAAI Technical Track on Machine Learning V