DOI: 10.1007/978-3-030-67664-3_15
Article

XferNAS: Transfer Neural Architecture Search

Published: 14 September 2020

Abstract

The term Neural Architecture Search (NAS) refers to the automatic optimization of network architectures for a new, previously unseen task. Since evaluating an architecture is computationally very expensive, many optimizers need days or even weeks to find suitable architectures. This search time can be reduced significantly, however, if knowledge from previous searches on different tasks is reused. In this work, we propose a generally applicable framework that requires only minor changes to existing optimizers to leverage this knowledge. As an example, we select an existing optimizer and demonstrate both the effort of integrating the framework and its impact. In experiments on CIFAR-10 and CIFAR-100, we observe a reduction in search time from 200 to only 6 GPU days, a speed-up by a factor of 33. In addition, we observe new record error rates of 1.99 and 14.06 for NAS optimizers on CIFAR-10 and CIFAR-100, respectively. In a separate study, we analyze the impact of the amount of source and target data. Empirically, we demonstrate that the proposed framework generally yields better results and, in the worst case, matches the unmodified optimizer.
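The core idea described above — warm-starting a search on a new task with observations collected on earlier tasks — can be illustrated with a minimal surrogate-model sketch. This is an assumption-laden toy, not the paper's actual method or API: all class and method names below are hypothetical. It decomposes a performance prediction into a universal component fitted on source-task evaluations plus a small task-specific residual fitted on the few evaluations available for the target task, so that predictions are sensible even before much target data exists.

```python
class TransferSurrogate:
    """Toy transfer surrogate: universal score + target-specific residual.

    Illustrative only; names and structure are assumptions based on the
    abstract, not the framework proposed in the paper.
    """

    def __init__(self):
        self.universal = {}  # architecture id -> mean accuracy over source tasks
        self.residual = {}   # architecture id -> target-task correction

    def fit_source(self, source_observations):
        # source_observations: {arch: [accuracy on task 1, accuracy on task 2, ...]}
        for arch, accs in source_observations.items():
            self.universal[arch] = sum(accs) / len(accs)

    def fit_target(self, target_observations):
        # target_observations: {arch: accuracy measured on the new target task}
        for arch, acc in target_observations.items():
            self.residual[arch] = acc - self.universal.get(arch, 0.0)

    def predict(self, arch):
        # For architectures not yet evaluated on the target task, fall back
        # to the mean residual observed so far (zero if none observed).
        if self.residual:
            mean_res = sum(self.residual.values()) / len(self.residual)
        else:
            mean_res = 0.0
        return self.universal.get(arch, 0.0) + self.residual.get(arch, mean_res)


surrogate = TransferSurrogate()
surrogate.fit_source({"arch_a": [0.90, 0.92], "arch_b": [0.80, 0.84]})
surrogate.fit_target({"arch_a": 0.95})  # a single cheap target evaluation
estimate_b = surrogate.predict("arch_b")  # source knowledge + learned offset
```

Even this toy shows why transfer helps: with a single target evaluation, the ranking of unevaluated architectures is already informed by source-task results, so far fewer expensive target-task trainings are needed.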



Published In

Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Part III
September 2020, 782 pages
ISBN: 978-3-030-67663-6
DOI: 10.1007/978-3-030-67664-3

Publisher

Springer-Verlag, Berlin, Heidelberg


Author Tags

  1. Neural Architecture Search
  2. AutoML
  3. Metalearning

