Abstract
The hardware platform is a significant consideration in efficient CNN model design. Most lightweight networks are designed for GPUs and mobile devices; however, they are usually neither efficient nor fast enough on desktop CPU platforms. In this paper, we aim to explore the design of highly efficient convolutional architectures for desktop CPU platforms. To achieve our goal, we first derive a series of CNN model design guidelines for CPU-based devices by comparing different computing platforms. Based on these proposed guidelines, we further present a Desktop CPU-Aware network architecture search (DcaNAS) to search for the optimal network structure with lower CPU latency. By combining automatic search and manual design, our DcaNAS achieves better flexibility and efficiency. On the ImageNet benchmark, we employ DcaNAS to produce two CPU-based lightweight CNN models: DcaNAS-L for higher accuracy and DcaNAS-S for faster speed. On a single CPU core, DcaNAS-L achieves 78.8% Top-1 (94.6% Top-5) accuracy at 13.6 FPS (73.5 ms), and our DcaNAS-S achieves extremely low CPU latency (43.1 ms). The results show that our DcaNAS method can obtain new state-of-the-art CPU-based networks.
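To make the platform-aware search concrete: latency-constrained NAS methods typically score each candidate architecture by trading measured on-device latency against accuracy. The abstract does not state DcaNAS's exact objective, so the sketch below uses a generic soft-constraint scoring function in the style of platform-aware NAS (e.g. MnasNet); the target latency, exponent, and function name are illustrative assumptions, not the paper's formulation.

```python
def latency_aware_score(accuracy: float, latency_ms: float,
                        target_ms: float = 73.5, w: float = -0.07) -> float:
    """Score a candidate architecture by accuracy and measured CPU latency.

    accuracy   -- top-1 accuracy in [0, 1]
    latency_ms -- single-core CPU latency in milliseconds, measured on the
                  target desktop CPU rather than estimated from FLOPs
    target_ms  -- latency budget for the target platform (assumed value)
    w          -- negative exponent penalizing candidates over budget
    """
    return accuracy * (latency_ms / target_ms) ** w

# At equal accuracy, a candidate that is 2x over the latency budget
# scores strictly lower than one that meets it.
on_budget = latency_aware_score(0.788, 73.5)
too_slow = latency_aware_score(0.788, 147.0)
```

Measuring latency directly on the target CPU, instead of using FLOPs as a proxy, is the key point: operations with identical FLOP counts can differ substantially in wall-clock time on desktop CPUs.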
Chen, D., Shen, H. & Shen, Y. DcaNAS: efficient convolutional network Design for Desktop CPU platforms. Appl Intell 51, 4353–4366 (2021). https://doi.org/10.1007/s10489-020-02133-0