DOI: 10.1007/978-3-030-67664-3_15
Article

XferNAS: Transfer Neural Architecture Search

Published: 14 September 2020

Abstract

The term Neural Architecture Search (NAS) refers to the automatic optimization of network architectures for a new, previously unseen task. Since evaluating an architecture is computationally very expensive, many optimizers need days or even weeks to find suitable architectures. This search time can be reduced significantly, however, if knowledge from previous searches on different tasks is reused. In this work, we propose a generally applicable framework that requires only minor changes to existing optimizers to leverage this knowledge. As an example, we select an existing optimizer and demonstrate both the effort of integrating the framework and its impact. In experiments on CIFAR-10 and CIFAR-100, we observe a reduction in search time from 200 to only 6 GPU days, a speed-up by a factor of 33. In addition, we observe new record error rates of 1.99 and 14.06 for NAS optimizers on CIFAR-10 and CIFAR-100, respectively. In a separate study, we analyze the impact of the amount of source and target data. Empirically, we demonstrate that the proposed framework generally yields better results and, in the worst case, matches the unmodified optimizer.
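The core idea described above — warm-starting a search on a new task with observations collected on earlier tasks — can be illustrated with a minimal surrogate-model sketch. This is an assumption-laden toy, not the paper's actual method or API: all class and method names below are hypothetical. It decomposes a performance prediction into a universal component fitted on source-task evaluations plus a small task-specific residual fitted on the few evaluations available for the target task, so that predictions are sensible even before much target data exists.

```python
class TransferSurrogate:
    """Toy transfer surrogate: universal score + target-specific residual.

    Illustrative only; names and structure are assumptions based on the
    abstract, not the framework proposed in the paper.
    """

    def __init__(self):
        self.universal = {}  # architecture id -> mean accuracy over source tasks
        self.residual = {}   # architecture id -> target-task correction

    def fit_source(self, source_observations):
        # source_observations: {arch: [accuracy on task 1, accuracy on task 2, ...]}
        for arch, accs in source_observations.items():
            self.universal[arch] = sum(accs) / len(accs)

    def fit_target(self, target_observations):
        # target_observations: {arch: accuracy measured on the new target task}
        for arch, acc in target_observations.items():
            self.residual[arch] = acc - self.universal.get(arch, 0.0)

    def predict(self, arch):
        # For architectures not yet evaluated on the target task, fall back
        # to the mean residual observed so far (zero if none observed).
        if self.residual:
            mean_res = sum(self.residual.values()) / len(self.residual)
        else:
            mean_res = 0.0
        return self.universal.get(arch, 0.0) + self.residual.get(arch, mean_res)


surrogate = TransferSurrogate()
surrogate.fit_source({"arch_a": [0.90, 0.92], "arch_b": [0.80, 0.84]})
surrogate.fit_target({"arch_a": 0.95})  # a single cheap target evaluation
estimate_b = surrogate.predict("arch_b")  # source knowledge + learned offset
```

Even this toy shows why transfer helps: with a single target evaluation, the ranking of unevaluated architectures is already informed by source-task results, so far fewer expensive target-task trainings are needed.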



Published In

Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Part III
September 2020, 782 pages
ISBN: 978-3-030-67663-6
DOI: 10.1007/978-3-030-67664-3

Publisher

Springer-Verlag, Berlin, Heidelberg


Author Tags

  1. Neural Architecture Search
  2. AutoML
  3. Metalearning

