
AutoGrow: Automatic Layer Growing in Deep Convolutional Networks

Published: 20 August 2020

Abstract

Depth is a key attribute of deep neural networks (DNNs); however, choosing a depth is heuristic and requires substantial human effort. We propose AutoGrow to automate depth discovery in DNNs: starting from a shallow seed architecture, AutoGrow grows new layers as long as the growth improves accuracy; otherwise, it stops growing and thereby discovers the depth. We propose robust growing and stopping policies that generalize across network architectures and datasets. Our experiments show that, applying the same policy to different network architectures, AutoGrow consistently discovers near-optimal depth on MNIST, FashionMNIST, SVHN, CIFAR10, CIFAR100, and ImageNet. For example, in terms of the accuracy-computation trade-off, AutoGrow discovers a better depth combination in ResNets than human experts. AutoGrow is also efficient: it discovers depth within a time comparable to training a single DNN. Our code is available at https://github.com/wenwei202/autogrow.
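The grow-and-stop loop described above can be captured in a few lines. The following is a minimal sketch assuming PyTorch; it is illustrative only, not the released implementation, and the names used here (GrowableNet, make_block, train_for, accuracy) are hypothetical placeholders. The paper's actual growing and stopping policies are richer, e.g. deciding where in the network to grow and how to initialize new layers.

    # Minimal sketch of the grow-until-no-improvement loop (assumed PyTorch;
    # all helper names are illustrative, not from the AutoGrow codebase).
    import torch
    import torch.nn as nn

    def make_block(channels: int) -> nn.Module:
        # One growable conv block (hypothetical building unit).
        return nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    class GrowableNet(nn.Module):
        # A shallow seed architecture: stem -> one block -> classifier head.
        def __init__(self, channels: int = 16, num_classes: int = 10):
            super().__init__()
            self.channels = channels
            self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
            self.blocks = nn.ModuleList([make_block(channels)])
            self.head = nn.Linear(channels, num_classes)

        def grow(self) -> None:
            # Deepen the network by appending one new layer (block).
            self.blocks.append(make_block(self.channels))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.stem(x)
            for block in self.blocks:
                x = block(x)
            return self.head(x.mean(dim=(2, 3)))  # global average pooling

    def autogrow(net: GrowableNet, train_for, accuracy) -> int:
        # Grow while a new layer improves validation accuracy; stop otherwise.
        # `train_for(net)` trains for a growing interval (and must register any
        # newly grown parameters with its optimizer); `accuracy(net)` returns
        # validation accuracy. Both are user-supplied placeholders.
        train_for(net)
        best = accuracy(net)
        while True:
            net.grow()
            train_for(net)  # continue training the deepened network
            acc = accuracy(net)
            if acc <= best:
                # Growth stopped helping; in practice the last, unhelpful
                # layer would be discarded. Report the discovered depth.
                return len(net.blocks) - 1
            best = acc

A caller would supply its own training and validation routines, e.g. autogrow(GrowableNet(), train_for=my_train_epochs, accuracy=my_val_accuracy); the return value is the discovered number of blocks. Again, this only conveys the overall control flow, not the per-sub-network growing and initialization policies studied in the paper.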




Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN:9781450379984
DOI:10.1145/3394486
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. automated machine learning
  2. depth
  3. growing
  4. neural architecture search
  5. neural networks

Qualifiers

  • Research-article

Conference

KDD '20

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)

Article Metrics

  • Downloads (last 12 months): 157
  • Downloads (last 6 weeks): 22
Reflects downloads up to 18 Nov 2024


Cited By

  • (2024) MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2788-2797. DOI: 10.1109/WACV57701.2024.00278. Online publication date: 3-Jan-2024.
  • (2024) Neuro-Symbolic Computing: Advancements and Challenges in Hardware–Software Co-Design. IEEE Transactions on Circuits and Systems II: Express Briefs, 71(3), 1683-1689. DOI: 10.1109/TCSII.2023.3336251. Online publication date: Mar-2024.
  • (2024) A General and Efficient Training for Transformer via Token Expansion. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 15783-15792. DOI: 10.1109/CVPR52733.2024.01494. Online publication date: 16-Jun-2024.
  • (2023) Accelerated training via incrementally growing neural networks using variance transfer and learning rate adaptation. Proceedings of the 37th International Conference on Neural Information Processing Systems, 16673-16692. DOI: 10.5555/3666122.3666851. Online publication date: 10-Dec-2023.
  • (2023) Exploring Structural Sparsity of Deep Networks Via Inverse Scale Spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(2), 1749-1765. DOI: 10.1109/TPAMI.2022.3168881. Online publication date: 1-Feb-2023.
  • (2022) Automated Progressive Learning for Efficient Training of Vision Transformers. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12476-12486. DOI: 10.1109/CVPR52688.2022.01216. Online publication date: Jun-2022.
  • (2022) DyRep: Bootstrapping Training with Dynamic Re-parameterization. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 578-587. DOI: 10.1109/CVPR52688.2022.00067. Online publication date: Jun-2022.
  • (2022) Automatic Searching of Deep Neural Networks for Medical Imaging Diagnostic. Advanced Technologies for Humanity, 129-140. DOI: 10.1007/978-3-030-94188-8_13. Online publication date: 29-Jan-2022.
  • (2021) Pruning of generative adversarial neural networks for medical imaging diagnostics with evolution strategy. Information Sciences, 558, 91-102. DOI: 10.1016/j.ins.2020.12.086. Online publication date: May-2021.
  • (2020) Efficient Evolutionary Deep Neural Architecture Search (NAS) by Noisy Network Morphism Mutation. Bio-inspired Computing: Theories and Applications, 497-508. DOI: 10.1007/978-981-15-3415-7_41. Online publication date: 2-Apr-2020.
