Abstract
A fast, adaptive, heuristic stochastic gradient descent method is proposed. The algorithm is shown to converge faster on practical problems than currently popular optimization methods. A justification of the method is also given, and the difficulties that prevent obtaining optimal theoretical estimates for the proposed algorithm are discussed.
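For readers unfamiliar with adaptive stochastic gradient schemes, the sketch below illustrates the general idea of adapting the step size from observed stochastic gradients. It uses an AdaGrad-style per-coordinate rule on a toy least-squares problem; this is a generic illustration of the algorithm class only, not the heuristic adaptive fast gradient method proposed in the paper.

```python
# Illustrative sketch only: a generic adaptive stochastic gradient step
# (AdaGrad-style per-coordinate step sizes). NOT the method of the paper.
import numpy as np

rng = np.random.default_rng(0)

# Toy stochastic objective: least squares f(x) = E[(a^T x - b)^2] over random rows.
A = rng.standard_normal((1000, 20))
x_true = rng.standard_normal(20)
b = A @ x_true + 0.1 * rng.standard_normal(1000)

def stochastic_grad(x, batch=32):
    """Mini-batch gradient of the least-squares loss at x."""
    idx = rng.integers(0, A.shape[0], size=batch)
    Ab, bb = A[idx], b[idx]
    return 2.0 * Ab.T @ (Ab @ x - bb) / batch

x = np.zeros(20)
g_accum = np.zeros(20)          # running sum of squared gradients
eta, eps = 0.5, 1e-8            # base step size and numerical safeguard

for t in range(2000):
    g = stochastic_grad(x)
    g_accum += g * g
    # Per-coordinate adaptive step: coordinates with large accumulated
    # gradient magnitude get proportionally smaller steps.
    x -= eta * g / (np.sqrt(g_accum) + eps)

print("final error:", np.linalg.norm(x - x_true))
```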
Funding
The work by A. I. Tyurin was supported by the Russian Foundation for Basic Research, project no. 19-31-90062 Aspiranty.
Additional information
Translated by A. Klimontovich
Cite this article
Ogal’tsov, A.V., Tyurin, A.I. A Heuristic Adaptive Fast Gradient Method in Stochastic Optimization Problems. Comput. Math. and Math. Phys. 60, 1108–1115 (2020). https://doi.org/10.1134/S0965542520070088