Scaling up stochastic gradient descent for non-convex optimisation
Abstract
Recommendations
Asynchronous parallel stochastic gradient descent: a numeric core for scalable distributed machine learning algorithms
MLHPC '15: Proceedings of the Workshop on Machine Learning in High-Performance Computing Environments. The implementation of a vast majority of machine learning (ML) algorithms boils down to solving a numerical optimization problem. In this context, Stochastic Gradient Descent (SGD) methods have long proven to provide good results, both in terms of ...
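For context, the sketch below shows the basic idea behind lock-free asynchronous SGD (in the style of Hogwild!), where several workers update a shared parameter vector without synchronisation. The least-squares objective, worker count, and step size are illustrative assumptions and are not taken from the recommended paper.

```python
# Minimal lock-free asynchronous SGD sketch (Hogwild!-style).
# The synthetic least-squares problem, 4 workers and the fixed step size are
# illustrative assumptions, not the scheme of the recommended paper.
import threading
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 10))            # synthetic design matrix
w_true = rng.normal(size=10)
b = A @ w_true + 0.01 * rng.normal(size=1000)

w = np.zeros(10)                           # shared parameters, written without locks
step = 0.01

def worker(num_steps, seed):
    local_rng = np.random.default_rng(seed)
    for _ in range(num_steps):
        i = local_rng.integers(len(b))     # draw one example
        grad = (A[i] @ w - b[i]) * A[i]    # gradient of 0.5 * (A[i] @ w - b[i])**2
        w[:] = w - step * grad             # lock-free write; occasional races are tolerated

threads = [threading.Thread(target=worker, args=(5000, s)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("squared parameter error:", float(np.sum((w - w_true) ** 2)))
```

In CPython the global interpreter lock serialises these updates, so the sketch only illustrates the update pattern; genuinely parallel lock-free SGD relies on updates that touch mostly disjoint coordinates.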
Stochastic gradient descent as approximate Bayesian inference
Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a Markov chain with a stationary distribution. With this perspective, we derive several new results. (1) We show that constant SGD can be used as an approximate Bayesian ...
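A minimal numerical illustration of that observation (not the paper's derivation) is sketched below: with a fixed learning rate, the iterates stop converging to a point and instead keep fluctuating around the optimum, like samples from a stationary distribution. The one-dimensional quadratic loss and the learning rate are assumptions made for the sketch.

```python
# Sketch: constant-learning-rate SGD iterates behave like a Markov chain that
# settles into a stationary distribution centred near the minimiser. The 1-D
# quadratic toy problem and the learning rate are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=100_000)   # minimiser of the average loss is 3.0

theta, lr = 0.0, 0.05
iterates = []
for x in data:
    grad = theta - x            # stochastic gradient of 0.5 * (theta - x)**2
    theta -= lr * grad          # next state depends only on the current state
    iterates.append(theta)

tail = np.array(iterates[10_000:])        # discard burn-in
print("mean of iterates:", tail.mean())   # close to the optimum 3.0
print("std of iterates :", tail.std())    # non-zero spread: the chain keeps sampling
```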
Stochastic Gradient Descent with Polyak’s Learning Rate
Stochastic gradient descent (SGD) for strongly convex functions converges at the rate O(1/k). However, achieving good results in practice requires tuning the parameters (for example the learning rate) of the algorithm. In this paper we propose a ...
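One common form of Polyak's rule sets the step size from the current loss value and gradient norm; the sketch below applies a per-sample version under an interpolation assumption (zero optimal per-sample loss). This is an illustrative reading of the idea, not necessarily the exact rule analysed in the recommended paper.

```python
# Sketch of SGD with a Polyak-type step size on a least-squares problem.
# eta = f_i(w) / ||grad f_i(w)||^2 assumes the optimal per-sample loss is zero
# (interpolation); an illustrative assumption, not the paper's exact rule.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(500, 20))
w_true = rng.normal(size=20)
b = A @ w_true                     # noiseless targets, so interpolation holds

w = np.zeros(20)
for _ in range(20_000):
    i = rng.integers(len(b))
    residual = A[i] @ w - b[i]
    loss_i = 0.5 * residual ** 2                 # per-sample loss f_i(w)
    grad_i = residual * A[i]                     # its gradient
    eta = loss_i / (grad_i @ grad_i + 1e-12)     # Polyak-type step size
    w -= eta * grad_i

print("parameter error:", float(np.linalg.norm(w - w_true)))
```

The appeal of such a rule is that no learning-rate schedule has to be hand-tuned: the step size adapts to how far the current loss is from its optimum.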
Information
Published In
Publisher
Kluwer Academic Publishers
United States
Qualifiers
- Research-article
Bibliometrics & Citations
Article Metrics
- Total Citations: 0
- Total Downloads: 0
- Downloads (Last 12 months): 0
- Downloads (Last 6 weeks): 0