research-article
Open access
Just Accepted

RaNAS: Resource-Aware Neural Architecture Search for Edge Computing

Online AM: 05 November 2024

Abstract

Neural architecture search (NAS) for edge devices is often time-consuming because deploying and testing candidate networks on the target hardware incurs long latency. Accurately predicting the computation cost and memory requirement of convolutional neural networks (CNNs) in advance is therefore of substantial value. Existing work relies primarily on analytical models, which can result in high prediction errors. This paper proposes a resource-aware NAS (RaNAS) model based on a variety of features. In addition, a new graph neural network is introduced to predict the inference latency and maximum memory requirement of CNNs on edge devices. Experimental results show that, within an error bound of ±1%, RaNAS improves prediction accuracy over state-of-the-art approaches by approximately 8% for inference latency and about 25% for maximum memory occupancy.
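To make the abstract's idea concrete, the sketch below shows the general shape of a graph-based cost predictor: a CNN is represented as a layer graph with per-node features, message passing produces node embeddings, and a pooled graph embedding is regressed onto latency and peak memory. This is only a minimal illustration, not the authors' RaNAS model; the feature set, the two-layer propagation scheme, and all names are assumptions introduced for this example.

```python
# Minimal sketch (assumed, not the paper's implementation): a graph-based
# regressor mapping a CNN's layer graph to [inference latency, peak memory].
import torch
import torch.nn as nn


class GraphCostPredictor(nn.Module):
    def __init__(self, num_node_features: int = 8, hidden: int = 64):
        super().__init__()
        self.encode = nn.Linear(num_node_features, hidden)
        self.msg1 = nn.Linear(hidden, hidden)
        self.msg2 = nn.Linear(hidden, hidden)
        # Two regression targets: predicted latency and peak memory.
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 2))

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: [num_layers, num_node_features], e.g. FLOPs, parameter
        #             count, and output size per layer (illustrative choices).
        # adj: [num_layers, num_layers] normalized adjacency with self-loops.
        h = torch.relu(self.encode(node_feats))
        h = torch.relu(adj @ self.msg1(h))   # propagate along layer-graph edges
        h = torch.relu(adj @ self.msg2(h))
        g = h.mean(dim=0)                    # pool nodes into a graph embedding
        return self.head(g)                  # [latency, peak memory]


if __name__ == "__main__":
    # Toy 5-layer chain network with placeholder node features.
    n = 5
    feats = torch.rand(n, 8)
    adj = torch.eye(n)
    for i in range(n - 1):                   # layer i feeds layer i + 1
        adj[i + 1, i] = 1.0
    adj = adj / adj.sum(dim=1, keepdim=True)  # simple row normalization
    print(GraphCostPredictor()(feats, adj))   # tensor of shape [2]
```

In practice such a predictor would be trained on measured latency and memory figures collected from the target edge device, then queried during the search instead of deploying every candidate network.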


Information

Published In

ACM Transactions on Architecture and Code Optimization, Just Accepted
EISSN: 1544-3973
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Online AM: 05 November 2024
Accepted: 10 October 2024
Revised: 21 August 2024
Received: 31 May 2024

Author Tags

  1. Neural architecture search
  2. edge devices
  3. graph neural network
  4. computational resource

Qualifiers

  • Research-article
