Computer Science > Machine Learning

arXiv:2410.16523 (cs)

[Submitted on 21 Oct 2024 (v1), last revised 29 Oct 2024 (this version, v2)]

Title: Efficient Neural Network Training via Subset Pretraining

Authors: Jan Spörer, Bernhard Bermeitinger, Tomas Hrycej, Niklas Limacher, Siegfried Handschuh

Abstract: When training neural networks, it is common practice to use partial gradients computed over batches, which are mostly very small subsets of the training set. This approach is motivated by the argument that such a partial gradient is close to the true one, with precision growing only with the square root of the batch size. Its theoretical justification relies on stochastic approximation theory. However, the conditions under which this theory is valid are not satisfied by the usual learning rate schedules. Batch processing is also difficult to combine with efficient second-order optimization methods. This proposal is based on another hypothesis: the loss minimum of the training set can be expected to be well approximated by the minima of its subsets. Such subset minima can be computed in a fraction of the time needed to optimize over the whole training set. This hypothesis has been tested on the MNIST, CIFAR-10, and CIFAR-100 image classification benchmarks, optionally extended by training data augmentation. The experiments have confirmed that results equivalent to conventional training can be reached. In summary, even small subsets are representative if the overdetermination ratio for the given model parameter set sufficiently exceeds unity. The computing expense can be reduced to a tenth or less.
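
The subset hypothesis in the abstract can be illustrated on a toy problem. The sketch below is not the authors' method or experimental setup: it swaps the neural network for a linear least-squares model on synthetic data, where the loss minimum can be computed exactly, and compares the minimum found on a 10% subset with the minimum on the full set. The function `overdetermination_ratio` assumes the ratio means the number of scalar training constraints (samples times outputs) divided by the number of free parameters; the abstract does not spell out the definition, so treat this as an assumption. All names, sizes, and the noise level are hypothetical.

```python
import numpy as np


def overdetermination_ratio(n_samples, n_outputs, n_params):
    """Assumed definition: scalar training constraints per free parameter."""
    return n_samples * n_outputs / n_params


def least_squares_minimum(X, y):
    """Exact minimiser of the squared error for a linear model y ~ X @ w."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def mse(X, y, w):
    """Mean squared error of weights w on the data (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


# Synthetic "training set" (hypothetical sizes): 2000 samples, 10 parameters.
rng = np.random.default_rng(0)
n_samples, n_params = 2000, 10
X = rng.normal(size=(n_samples, n_params))
w_true = rng.normal(size=n_params)
y = X @ w_true + 0.1 * rng.normal(size=n_samples)

# Minimum over a random 10% subset versus minimum over the full set.
subset_size = 200
idx = rng.choice(n_samples, size=subset_size, replace=False)
w_subset = least_squares_minimum(X[idx], y[idx])
w_full = least_squares_minimum(X, y)

q_subset = overdetermination_ratio(subset_size, 1, n_params)
gap = float(np.linalg.norm(w_subset - w_full))
loss_full = mse(X, y, w_full)
loss_subset_on_full = mse(X, y, w_subset)

print(f"overdetermination ratio of the subset: {q_subset:.1f}")
print(f"distance between subset and full minima: {gap:.4f}")
print(f"full-set loss at full minimum:   {loss_full:.5f}")
print(f"full-set loss at subset minimum: {loss_subset_on_full:.5f}")
```

In this toy setting the subset has 20 constraints per parameter, and the subset minimum lands close to the full-set minimum while using a tenth of the data. Shrinking `subset_size` toward `n_params` pushes the ratio toward unity and the gap grows, which is the regime the abstract warns about. Whether the same behaviour carries over to nonlinear networks is what the paper's experiments address; this sketch does not show it.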

Comments:	To appear in KDIR 2024
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computation (stat.CO); Methodology (stat.ME); Machine Learning (stat.ML)
Cite as:	arXiv:2410.16523 [cs.LG]
	(or arXiv:2410.16523v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.16523
Journal reference:	Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR2024
Related DOI:	https://doi.org/10.5220/0012893600003838

Submission history

From: Bernhard Bermeitinger
[v1] Mon, 21 Oct 2024 21:31:12 UTC (991 KB)
[v2] Tue, 29 Oct 2024 14:18:51 UTC (983 KB)
