Learning rate schedules for faster stochastic gradient search

C Darken, J Chang, J Moody - Neural networks for signal processing, 1992
Abstract
Stochastic gradient descent is a general algorithm that includes LMS, on-line backpropagation, and adaptive k-means clustering as special cases. The standard choices of the learning rate (both adaptive and fixed functions of time) often perform quite poorly. In contrast, our recently proposed class of "search then converge" (STC) learning rate schedules (Darken and Moody, 1990b, 1991) displays the theoretically optimal asymptotic convergence rate and a superior ability to escape from poor local minima. However, the user is responsible for setting a key parameter. We propose here a new methodology for creating the first automatically adapting learning rates that achieve the optimal rate of convergence.
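
A minimal sketch of a search-then-converge schedule of the kind described above, assuming the commonly cited Darken-Moody form eta(t) = eta0 / (1 + t/tau): the rate stays near eta0 while t << tau (the "search" phase) and decays roughly like eta0 * tau / t afterwards (the "converge" phase), which yields the 1/t asymptotic rate. The exact parameterization in the paper may differ, and the parameter values and toy SGD loop below are illustrative assumptions, not taken from the paper.

import numpy as np

def stc_learning_rate(t, eta0=0.1, tau=100.0):
    # Search-then-converge schedule: approximately constant (eta0) for t << tau,
    # decaying like eta0 * tau / t for t >> tau (1/t asymptotic rate).
    return eta0 / (1.0 + t / tau)

# Illustrative use in plain SGD on a noisy linear regression problem
# (all names and values here are assumptions made for this sketch).
rng = np.random.default_rng(0)
w = np.zeros(2)
w_true = np.array([1.0, -2.0])
for t in range(1, 10_001):
    x = rng.normal(size=2)              # random input sample
    y = x @ w_true + rng.normal(scale=0.1)  # noisy target
    grad = (x @ w - y) * x              # stochastic gradient of squared error
    w -= stc_learning_rate(t) * grad
print(w)  # should approach w_true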