
AutoGrow: Automatic Layer Growing in Deep Convolutional Networks

Published: 20 August 2020

Abstract

Depth is a key attribute of deep neural networks (DNNs); however, choosing a depth is heuristic and requires substantial human effort. We propose AutoGrow to automate depth discovery in DNNs: starting from a shallow seed architecture, AutoGrow grows new layers as long as the growth improves accuracy; otherwise, it stops growing and thereby discovers the depth. We propose robust growing and stopping policies that generalize across network architectures and datasets. Our experiments show that, applying the same policy to different network architectures, AutoGrow consistently discovers near-optimal depth on MNIST, FashionMNIST, SVHN, CIFAR10, CIFAR100, and ImageNet. For example, in terms of the accuracy-computation trade-off, AutoGrow discovers a better depth combination in ResNets than human experts. AutoGrow is also efficient: it discovers depth within a time comparable to training a single DNN. Our code is available at https://github.com/wenwei202/autogrow.
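The grow-and-stop loop described above can be captured in a few lines. The following is a minimal sketch assuming PyTorch; it is illustrative only, not the released implementation, and the names used here (GrowableNet, make_block, train_for, accuracy) are hypothetical placeholders. The paper's actual growing and stopping policies are richer, e.g. deciding where in the network to grow and how to initialize new layers.

    # Minimal sketch of the grow-until-no-improvement loop (assumed PyTorch;
    # all helper names are illustrative, not from the AutoGrow codebase).
    import torch
    import torch.nn as nn

    def make_block(channels: int) -> nn.Module:
        # One growable conv block (hypothetical building unit).
        return nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    class GrowableNet(nn.Module):
        # A shallow seed architecture: stem -> one block -> classifier head.
        def __init__(self, channels: int = 16, num_classes: int = 10):
            super().__init__()
            self.channels = channels
            self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
            self.blocks = nn.ModuleList([make_block(channels)])
            self.head = nn.Linear(channels, num_classes)

        def grow(self) -> None:
            # Deepen the network by appending one new layer (block).
            self.blocks.append(make_block(self.channels))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.stem(x)
            for block in self.blocks:
                x = block(x)
            return self.head(x.mean(dim=(2, 3)))  # global average pooling

    def autogrow(net: GrowableNet, train_for, accuracy) -> int:
        # Grow while a new layer improves validation accuracy; stop otherwise.
        # `train_for(net)` trains for a growing interval (and must register any
        # newly grown parameters with its optimizer); `accuracy(net)` returns
        # validation accuracy. Both are user-supplied placeholders.
        train_for(net)
        best = accuracy(net)
        while True:
            net.grow()
            train_for(net)  # continue training the deepened network
            acc = accuracy(net)
            if acc <= best:
                # Growth stopped helping; in practice the last, unhelpful
                # layer would be discarded. Report the discovered depth.
                return len(net.blocks) - 1
            best = acc

A caller would supply its own training and validation routines, e.g. autogrow(GrowableNet(), train_for=my_train_epochs, accuracy=my_val_accuracy); the return value is the discovered number of blocks. Again, this only conveys the overall control flow, not the per-sub-network growing and initialization policies studied in the paper.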




Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN:9781450379984
DOI:10.1145/3394486
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. automated machine learning
  2. depth
  3. growing
  4. neural architecture search
  5. neural networks

Qualifiers

  • Research-article

Conference

KDD '20

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)

Article Metrics

  • Downloads (last 12 months): 157
  • Downloads (last 6 weeks): 22
Reflects downloads up to 18 Nov 2024


Cited By

  • (2024) MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2788-2797. DOI: 10.1109/WACV57701.2024.00278. Online publication date: 3-Jan-2024.
  • (2024) Neuro-Symbolic Computing: Advancements and Challenges in Hardware–Software Co-Design. IEEE Transactions on Circuits and Systems II: Express Briefs, 71(3), 1683-1689. DOI: 10.1109/TCSII.2023.3336251. Online publication date: Mar-2024.
  • (2024) A General and Efficient Training for Transformer via Token Expansion. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 15783-15792. DOI: 10.1109/CVPR52733.2024.01494. Online publication date: 16-Jun-2024.
  • (2023) Accelerated training via incrementally growing neural networks using variance transfer and learning rate adaptation. Proceedings of the 37th International Conference on Neural Information Processing Systems, 16673-16692. DOI: 10.5555/3666122.3666851. Online publication date: 10-Dec-2023.
  • (2023) Exploring Structural Sparsity of Deep Networks Via Inverse Scale Spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(2), 1749-1765. DOI: 10.1109/TPAMI.2022.3168881. Online publication date: 1-Feb-2023.
  • (2022) Automated Progressive Learning for Efficient Training of Vision Transformers. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12476-12486. DOI: 10.1109/CVPR52688.2022.01216. Online publication date: Jun-2022.
  • (2022) DyRep: Bootstrapping Training with Dynamic Re-parameterization. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 578-587. DOI: 10.1109/CVPR52688.2022.00067. Online publication date: Jun-2022.
  • (2022) Automatic Searching of Deep Neural Networks for Medical Imaging Diagnostic. Advanced Technologies for Humanity, 129-140. DOI: 10.1007/978-3-030-94188-8_13. Online publication date: 29-Jan-2022.
  • (2021) Pruning of generative adversarial neural networks for medical imaging diagnostics with evolution strategy. Information Sciences, 558, 91-102. DOI: 10.1016/j.ins.2020.12.086. Online publication date: May-2021.
  • (2020) Efficient Evolutionary Deep Neural Architecture Search (NAS) by Noisy Network Morphism Mutation. Bio-inspired Computing: Theories and Applications, 497-508. DOI: 10.1007/978-981-15-3415-7_41. Online publication date: 2-Apr-2020.
