Evolving Adaptive Neural Network Optimizers for Image Classification

  • Conference paper
Genetic Programming (EuroGP 2022)

Abstract

Advances in hardware have made Artificial Neural Networks a staple solution to many modern Artificial Intelligence problems, such as natural language processing and computer vision. A neural network's effectiveness depends heavily on the optimizer used during training, which has motivated significant research into the design of neural network optimizers. Current research focuses on creating optimizers that perform well across different topologies and network types. While there is evidence that fine-tuning optimizer parameters for specific networks is desirable, the benefits of designing optimizers specialized for single networks remain mostly unexplored.

In this paper, we propose an evolutionary framework called Adaptive AutoLR (ALR) to evolve adaptive optimizers for specific neural networks in an image classification task. The evolved optimizers are then compared with state-of-the-art, human-made optimizers on two popular image classification problems. The results show that some evolved optimizers perform competitively in both tasks, even achieving the best average test accuracy on one dataset. An analysis of the best evolved optimizer also reveals that it functions differently from human-made approaches. The results suggest that ALR can evolve novel, high-quality optimizers, motivating further research and applications of the framework.
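To make the idea of evolving an optimizer for one specific problem concrete, the following is a toy sketch only and not the ALR framework itself. It assumes a small hand-picked set of update rules (`sgd`, `momentum`, `rmsprop`), a three-weight quadratic standing in for a network's training loss, and a simple (1+λ) mutation scheme; all of these names and choices are illustrative assumptions rather than details taken from the paper.

```python
import math
import random

# Assumption: an ill-conditioned quadratic stands in for a real training loss.
CURVATURES = [1.0, 10.0, 100.0]
RULES = ["sgd", "momentum", "rmsprop"]  # hypothetical candidate update rules

def loss(w):
    return sum(a * x * x for a, x in zip(CURVATURES, w))

def grad(w):
    return [2.0 * a * x for a, x in zip(CURVATURES, w)]

def step(rule, lr, beta, w, state):
    """Apply one update; `state` is the per-weight auxiliary memory."""
    g = grad(w)
    if rule == "sgd":
        delta = [-lr * gi for gi in g]
    elif rule == "momentum":
        state = [beta * s + gi for s, gi in zip(state, g)]
        delta = [-lr * s for s in state]
    else:  # rmsprop-style scaling by a running average of squared gradients
        state = [beta * s + (1.0 - beta) * gi * gi for s, gi in zip(state, g)]
        delta = [-lr * gi / (math.sqrt(s) + 1e-8) for s, gi in zip(state, g)]
    return [x + d for x, d in zip(w, delta)], state

def fitness(cand, steps=50):
    """Final loss after a short run; lower is better, divergence maps to inf."""
    w, state = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
    for _ in range(steps):
        w, state = step(cand["rule"], cand["lr"], cand["beta"], w, state)
    final = loss(w)
    return final if math.isfinite(final) else math.inf

def mutate(cand, rng):
    """Occasionally switch the rule, and always perturb its parameters."""
    child = dict(cand)
    if rng.random() < 0.3:
        child["rule"] = rng.choice(RULES)
    child["lr"] = cand["lr"] * math.exp(rng.gauss(0.0, 0.5))
    child["beta"] = min(0.99, max(0.0, cand["beta"] + rng.gauss(0.0, 0.1)))
    return child

def evolve(generations=30, offspring=8, seed=0):
    """(1+lambda) search: keep the parent unless a child is at least as good."""
    rng = random.Random(seed)
    parent = {"rule": "sgd", "lr": 1e-3, "beta": 0.9}
    parent_fit = fitness(parent)
    for _ in range(generations):
        children = [mutate(parent, rng) for _ in range(offspring)]
        scored = [(fitness(c), c) for c in children]
        best_fit, best = min(scored, key=lambda fc: fc[0])
        if best_fit <= parent_fit:
            parent, parent_fit = best, best_fit
    return parent, parent_fit

if __name__ == "__main__":
    best, best_fit = evolve()
    print(best, best_fit)
```

Because the parent is only replaced by a child that scores at least as well, the returned candidate is never worse on this toy objective than the starting `sgd` configuration. A real setup would replace `fitness` with an actual training run of the target network, which is far more expensive than this quadratic.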

References

  1. Bottou, L.: On-Line Learning and Stochastic Approximations, pp. 9–42. Cambridge University Press, Cambridge (1999)

  2. Carvalho, P., Lourenço, N., Assunção, F., Machado, P.: AutoLR: an evolutionary approach to learning rate policies. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference, GECCO 2020, pp. 672–680. Association for Computing Machinery, New York (2020). https://doi.org/10.1145/3377930.3390158

  3. Chollet, F., et al.: Keras CIFAR10 architecture (2015). https://keras.io/examples/cifar10_cnn_tfaugment2d/

  4. Chollet, F., et al.: Keras MNIST architecture (2015). https://keras.io/examples/mnist_cnn/

  5. Farabet, C., Couprie, C., Najman, L., LeCun, Y.: Learning hierarchical features for scene labeling. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1915–1929 (2012)

  6. Hinton, G., Srivastava, N., Swersky, K.: Overview of mini-batch gradient descent. University Lecture (2015). https://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf

  7. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  8. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  9. Jacobs, R.A.: Increased rates of convergence through learning rate adaptation. Neural Netw. 1(4), 295–307 (1988)

  10. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  11. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)

  12. Lopez, M.M., Kalita, J.: Deep learning applied to NLP. arXiv preprint arXiv:1703.03091 (2017)

  13. Mockus, J., Tiesis, V., Zilinskas, A.: The application of Bayesian methods for seeking the extremum, vol. 2, pp. 117–129 (2014)

  14. Nesterov, Y.: A method for unconstrained convex minimization problem with the rate of convergence \(O(1/k^{2})\). In: Doklady AN USSR, vol. 269, pp. 543–547 (1983)

  15. Pedro, C.: Adaptive AutoLR grammar (2020). https://github.com/soren5/autolr/blob/master/grammars/adaptive_autolr_grammar.txt

  16. Pedro, C.: AutoLR (2020). https://github.com/soren5/autolr

  17. Pedro, C.: Keras CIFAR model (2020). https://github.com/soren5/autolr/blob/benchmarks/models/json/cifar_model.json

  18. Pedro, C.: Keras MNIST model (2020). https://github.com/soren5/autolr/blob/benchmarks/models/json/mnist_model.json

  19. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)

  20. Senior, A., Heigold, G., Ranzato, M., Yang, K.: An empirical study of learning rates in deep neural networks for speech recognition. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6724–6728. IEEE (2013)

  21. Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, NIPS 2012, vol. 2, pp. 2951–2959. Curran Associates Inc., Red Hook (2012)

  22. Suganuma, M., Shirakawa, S., Nagao, T.: A genetic programming approach to designing convolutional neural network architectures. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2017, pp. 497–504. Association for Computing Machinery, New York (2017). https://doi.org/10.1145/3071178.3071229

Acknowledgments

This work is partially funded by: Fundação para a Ciência e Tecnologia (FCT), Portugal, under the grant UI/BD/151053/2021, and by national funds through the FCT - Foundation for Science and Technology, I.P., within the scope of the project CISUC - UID/CEC/00326/2020 and by European Social Fund, through the Regional Operational Program Centro 2020.

Author information

Corresponding author

Correspondence to Pedro Carvalho.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Carvalho, P., Lourenço, N., Machado, P. (2022). Evolving Adaptive Neural Network Optimizers for Image Classification. In: Medvet, E., Pappa, G., Xue, B. (eds) Genetic Programming. EuroGP 2022. Lecture Notes in Computer Science, vol 13223. Springer, Cham. https://doi.org/10.1007/978-3-031-02056-8_1

  • DOI: https://doi.org/10.1007/978-3-031-02056-8_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-02055-1

  • Online ISBN: 978-3-031-02056-8

  • eBook Packages: Computer Science (R0)
