Abstract
A wide collection of popular statistical learning methods, ranging from K-means to Support Vector Machines and Neural Networks, can be formulated as a stochastic gradient descent (SGD) algorithm in a specific setup. In practice, the main limitation of this incremental optimization technique is the stochastic noise induced by selecting uniformly at random the data involved in the gradient estimate at each iteration. The purpose of this paper is to introduce a novel implementation of the SGD algorithm in which the data subset used at a given step is not picked uniformly at random among all possible subsets, but drawn from a specific adaptive sampling scheme that depends on the past iterations in a Markovian manner, in order to refine the current statistical estimate of the gradient. Beyond an algorithmic description of the proposed approach, rate bounds are established and illustrative numerical results are displayed, providing theoretical and empirical evidence of its statistical performance compared to more “naive” SGD implementations. Computational issues are also discussed at length, revealing the practical advantages of the proposed method.
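As a rough illustration of the idea sketched in the abstract, the snippet below implements a generic importance-sampling variant of SGD for a least-squares objective; it is a minimal sketch, not the authors' scheme. Each example i is drawn with an adaptively updated probability p[i], and its gradient is reweighted by 1/(n p[i]), which keeps the update unbiased since E[∇f_I(w)/(n p_I)] = (1/n) Σ_i ∇f_i(w). The function name adaptive_sampling_sgd, the mixing parameter mix, and the gradient-norm scoring rule are illustrative assumptions, chosen in the spirit of importance-sampling SGD rather than taken from the paper.

import numpy as np

def adaptive_sampling_sgd(X, y, n_steps=1000, lr=0.1, mix=0.5, seed=0):
    """SGD for least squares with non-uniform, adaptively updated sampling.

    Illustrative sketch only: example i is drawn with probability p[i]
    and its gradient is reweighted by 1 / (n * p[i]) so the update stays
    unbiased; p is refreshed from per-example gradient norms, so the
    sampling law depends on past iterates.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    p = np.full(n, 1.0 / n)                # start from uniform sampling
    for t in range(1, n_steps + 1):
        i = rng.choice(n, p=p)
        grad = (X[i] @ w - y[i]) * X[i]    # gradient of 0.5 * (x_i^T w - y_i)^2
        w -= (lr / np.sqrt(t)) * grad / (n * p[i])   # importance-weighted step
        # Adapt the sampling law: favour examples with large gradient norms,
        # mixed with the uniform law so every probability stays bounded away from 0.
        score = np.abs(X @ w - y) * np.linalg.norm(X, axis=1) + 1e-12
        p = mix * score / score.sum() + (1 - mix) / n
    return w

# Example usage on synthetic data:
# rng = np.random.default_rng(1)
# X = rng.normal(size=(200, 5))
# y = X @ rng.normal(size=5) + 0.01 * rng.normal(size=200)
# w_hat = adaptive_sampling_sgd(X, y)

Recomputing the full score vector at every step costs O(nd) and is done here only for clarity; a practical scheme would update the scores of visited examples incrementally, which is closer in spirit to the Markovian adaptation described in the abstract.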
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Papa, G., Bianchi, P., Clémençon, S. (2015). Adaptive Sampling for Incremental Optimization Using Stochastic Gradient Descent. In: Chaudhuri, K., Gentile, C., Zilles, S. (eds) Algorithmic Learning Theory. ALT 2015. Lecture Notes in Computer Science, vol 9355. Springer, Cham. https://doi.org/10.1007/978-3-319-24486-0_21
DOI: https://doi.org/10.1007/978-3-319-24486-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24485-3
Online ISBN: 978-3-319-24486-0