Towards understanding why lookahead generalizes better than SGD and beyond
Abstract
Information
Published In
Publisher
Curran Associates Inc.
Red Hook, NY, United States
Qualifiers
- Research-article
- Research
- Refereed limited