DOI: 10.1145/3400286.3418245
research-article

PerfNet: Platform-Aware Performance Modeling for Deep Neural Networks

Published: 25 November 2020

Abstract

Deep learning has advanced rapidly and is now widely used in industry. Beyond the accuracy of deep learning (DL) models, system developers also need to understand their performance characteristics to ensure that the hardware design and the deployed systems meet application demands. However, developing a performance model for this purpose is complex, because it must account for many factors, e.g., the DL model, the runtime software, and the system architecture. In this work, we propose a multi-layer regression network, called PerfNet, to predict the performance of DL models on heterogeneous systems. To train PerfNet, we develop a tool that collects the performance features and characteristics of DL models on a set of heterogeneous systems, including key hyper-parameters such as loss functions, network shapes, and dataset size, as well as the hardware specifications. Our experiments show that our approach is more accurate than previously published methods. In the case of VGG16 on GTX1080Ti, PerfNet yields a mean absolute percentage error of 20%, while the referenced work consistently overestimates, with errors larger than 200%.
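
To make the error metric quoted in the abstract concrete, the following is a minimal sketch of how a mean absolute percentage error (MAPE) could be computed. The function name `mape` and all timing values are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's exact evaluation procedure may differ.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every measured value is non-zero (true for execution times).
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Hypothetical execution times in milliseconds (illustrative, not from the paper).
measured = [10.0, 20.0, 40.0]
close_model = [12.0, 16.0, 48.0]   # every prediction is off by 20% of the measurement
over_model = [30.0, 60.0, 120.0]   # every prediction is 3x the measurement

print(f"close model MAPE: {mape(measured, close_model):.1f}%")
print(f"over model MAPE:  {mape(measured, over_model):.1f}%")
```

Under this definition, a predictor that always reports three times the measured time has a MAPE of 200%, which gives a sense of scale for an error "larger than 200%" caused by overestimation.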

Published In

RACS '20: Proceedings of the International Conference on Research in Adaptive and Convergent Systems
October 2020
300 pages
ISBN: 9781450380256
DOI: 10.1145/3400286
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Benchmark
  2. Heterogeneous Systems
  3. Machine Learning
  4. Performance Prediction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

RACS '20
Acceptance Rates

RACS '20 Paper Acceptance Rate: 42 of 148 submissions, 28%
Overall Acceptance Rate: 393 of 1,581 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 40
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 13 Nov 2024

Cited By

  • (2024) Using Benchmarking and Regression Models for Predicting CNN Training Time on a GPU. Proceedings of the 4th Workshop on Performance EngineeRing, Modelling, Analysis, and VisualizatiOn STrategy, 8-15. DOI: 10.1145/3660317.3660323. Online publication date: 3-Jun-2024
  • (2024) LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 1-18. DOI: 10.1109/SC41406.2024.00022. Online publication date: 17-Nov-2024
  • (2024) PerfTop: Towards Performance Prediction of Distributed Learning over General Topology. Journal of Parallel and Distributed Computing, 104922. DOI: 10.1016/j.jpdc.2024.104922. Online publication date: May-2024
  • (2022) AI-Driven Performance Modeling for AI Inference Workloads. Electronics 11:15, 2316. DOI: 10.3390/electronics11152316. Online publication date: 26-Jul-2022
  • (2022) Scenario Based Run-Time Switching for Adaptive CNN-Based Applications at the Edge. ACM Transactions on Embedded Computing Systems 21:2, 1-33. DOI: 10.1145/3488718. Online publication date: 8-Feb-2022
  • (2022) CONTINUER: Maintaining Distributed DNN Services During Edge Failures. 2022 IEEE International Conference on Edge Computing and Communications (EDGE), 143-152. DOI: 10.1109/EDGE55608.2022.00029. Online publication date: Jul-2022
  • (2021) Toward accurate platform-aware performance modeling for deep neural networks. ACM SIGAPP Applied Computing Review 21:1, 50-61. DOI: 10.1145/3477133.3477137. Online publication date: 20-Jul-2021
  • (2021) ALOHA: A Unified Platform-Aware Evaluation Method for CNNs Execution on Heterogeneous Systems at the Edge. IEEE Access 9, 133289-133308. DOI: 10.1109/ACCESS.2021.3115243. Online publication date: 2021
  • (2020) Performance Analysis and Optimization for Federated Learning Applications with PySyft-based Secure Aggregation. 2020 International Computer Symposium (ICS), 191-196. DOI: 10.1109/ICS51289.2020.00046. Online publication date: Dec-2020
  • (2020) PerfNetRT: Platform-Aware Performance Modeling for Optimized Deep Neural Networks. 2020 International Computer Symposium (ICS), 153-158. DOI: 10.1109/ICS51289.2020.00039. Online publication date: Dec-2020
