PerfNet: Platform-Aware Performance Modeling for Deep Neural Networks

Authors:

Shih-Hao Hung

RACS '20: Proceedings of the International Conference on Research in Adaptive and Convergent Systems

Pages 90 - 95

https://doi.org/10.1145/3400286.3418245

Published: 25 November 2020

Abstract

Deep learning has grown rapidly and is now widely used in industry. Beyond the accuracy of deep learning (DL) models, system developers also need to understand their performance characteristics to ensure that the hardware design and the deployed systems meet application demands. However, developing a performance model for this purpose is complex, because it must account for many factors, e.g., the DL model, the runtime software, and the system architecture. In this work, we propose a multi-layer regression network, called PerfNet, to predict the performance of DL models on heterogeneous systems. To train PerfNet, we develop a tool that collects the performance features and characteristics of DL models on a set of heterogeneous systems, including key hyper-parameters such as loss functions, network shapes, and dataset size, as well as the hardware specifications. Our experiments show that our approach is more accurate than previously published methods. In the case of VGG16 on a GTX1080Ti, PerfNet yields a mean absolute percentage error of 20%, while the referenced work consistently overestimates, with errors larger than 200%.
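The error metric quoted above, mean absolute percentage error (MAPE), can be illustrated with a minimal sketch. The function name `mape` and all timing values below are hypothetical and chosen only for illustration; they are not measurements from the paper, and this is not the authors' evaluation code.

```python
def mape(measured, predicted):
    """Return the mean absolute percentage error, in percent."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for m, p in zip(measured, predicted):
        if m == 0:
            raise ValueError("measured values must be non-zero")
        # Relative error of one prediction against its measurement.
        total += abs(p - m) / abs(m)
    return 100.0 * total / len(measured)


# Hypothetical execution times in milliseconds (illustrative only).
measured = [10.0, 20.0, 40.0]
close_model = [12.0, 16.0, 48.0]   # every prediction is off by 20%
over_model = [30.0, 60.0, 120.0]   # every prediction is 3x the measurement

print(f"close model MAPE: {mape(measured, close_model):.1f}%")
print(f"overestimating model MAPE: {mape(measured, over_model):.1f}%")
```

In this sketch, a model that always predicts three times the measured time has a MAPE of 200%, which gives a sense of the scale of the overestimation the abstract attributes to the referenced work.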

References

[1] 2014. NVIDIA TensorRT. (2014). https://developer.nvidia.com/tensorrt

[2] 2018. mlperf. (2018). https://mlperf.org/

[3] 2018. OpenVINO. (2018). https://software.intel.com/en-us/openvino-toolkit

[4] 2020. Tensorflow Profiler. (2020). https://www.tensorflow.org/tensorboard

[5] Cody Coleman, Deepak Narayanan, Daniel Kang, Tian Zhao, Jian Zhang, Luigi Nardi, Peter Bailis, Kunle Olukotun, Chris Ré, and Matei Zaharia. 2017. Dawn-bench: An end-to-end deep learning benchmark and competition. Training 100, 101 (2017), 102.

[6] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Deep residual learning for image recognition. In Proceedings of the 28th IEEE Conference on Computer Vision and Pattern Recognition, CVPR.

[7] Daniel Justus, John Brennan, Stephen Bonner, and Andrew Stephen McGough. 2018. Predicting the computational cost of deep learning models. In 2018 IEEE International Conference on Big Data (Big Data). IEEE, 3873–3882.

[8] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097–1105.

[9] Cheng-Kung Lai, Chih-Wei Yeh, Chia-Heng Tu, and Shih-Hao Hung. 2017. Fast profiling framework and race detection for heterogeneous system. Journal of Systems Architecture 81 (2017), 83–91.

[10] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 11 (1998), 2278–2324.

[11] Cheng-Yueh Liu, Po-Yao Huang, Chia-Heng Tu, and Shih-Hao Hung. 2018. A Fast and Scalable Cluster Simulator for Network Performance Projection of HPC Applications. In 2018 International Conference on High Performance Computing & Simulation (HPCS). IEEE, 970–977.

[12] Hang Qi, Evan R Sparks, and Ameet Talwalkar. 2017. Paleo: A performance model for deep neural networks. (2017).

[13] Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).

[14] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. 2016. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2818–2826.

Index Terms

PerfNet: Platform-Aware Performance Modeling for Deep Neural Networks
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Heterogeneous (hybrid) systems
2. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        1. Supervised learning by regression
  2. Modeling and simulation
    1. Simulation types and techniques
      1. Massively parallel and high-performance simulations

Information & Contributors

Information

Published In

RACS '20: Proceedings of the International Conference on Research in Adaptive and Convergent Systems

October 2020

300 pages

ISBN:9781450380256

DOI:10.1145/3400286

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

RACS '20

Sponsor:

SIGAPP

RACS '20: International Conference on Research in Adaptive and Convergent Systems

October 13 - 16, 2020

Gwangju, Republic of Korea

Acceptance Rates

RACS '20 Paper Acceptance Rate 42 of 148 submissions, 28%;

Overall Acceptance Rate 393 of 1,581 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 10
Total Downloads: 223
Downloads (Last 12 months): 40
Downloads (Last 6 weeks): 4

Reflects downloads up to 13 Nov 2024

Citations

Cited By

Bryzgalov P, Maeda T, Liem R, Afzal A, Kousha P, Zhu Z, Lee J. (2024). Using Benchmarking and Regression Models for Predicting CNN Training Time on a GPU. Proceedings of the 4th Workshop on Performance EngineeRing, Modelling, Analysis, and VisualizatiOn STrategy, pp. 8-15. DOI: 10.1145/3660317.3660323. Online publication date: 3-Jun-2024.
https://dl.acm.org/doi/10.1145/3660317.3660323

Lazuka M, Anghel A, Parnell T. (2024). LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp. 1-18. DOI: 10.1109/SC41406.2024.00022. Online publication date: 17-Nov-2024.
https://dl.acm.org/doi/10.1109/SC41406.2024.00022

Yan C, Zhu Z, Niu Y, Wang C, Zhuo C, Xu J. (2024). PerfTop: Towards Performance Prediction of Distributed Learning over General Topology. Journal of Parallel and Distributed Computing, article 104922. DOI: 10.1016/j.jpdc.2024.104922. Online publication date: May-2024.
https://doi.org/10.1016/j.jpdc.2024.104922

Sponner M, Waschneck B, Kumar A. (2022). AI-Driven Performance Modeling for AI Inference Workloads. Electronics 11:15, article 2316. DOI: 10.3390/electronics11152316. Online publication date: 26-Jul-2022.
https://doi.org/10.3390/electronics11152316

Minakova S, Sapra D, Stefanov T, Pimentel A. (2022). Scenario Based Run-Time Switching for Adaptive CNN-Based Applications at the Edge. ACM Transactions on Embedded Computing Systems 21:2, pp. 1-33. DOI: 10.1145/3488718. Online publication date: 8-Feb-2022.
https://dl.acm.org/doi/10.1145/3488718

Majeed A, Kilpatrick P, Spence I, Varghese B. (2022). CONTINUER: Maintaining Distributed DNN Services During Edge Failures. 2022 IEEE International Conference on Edge Computing and Communications (EDGE), pp. 143-152. DOI: 10.1109/EDGE55608.2022.00029. Online publication date: Jul-2022.
https://doi.org/10.1109/EDGE55608.2022.00029

Wang C, Liao Y, Kao M, Liang W, Hung S. (2021). Toward accurate platform-aware performance modeling for deep neural networks. ACM SIGAPP Applied Computing Review 21:1, pp. 50-61. DOI: 10.1145/3477133.3477137. Online publication date: 20-Jul-2021.
https://dl.acm.org/doi/10.1145/3477133.3477137

Busia P, Minakova S, Stefanov T, Raffo L, Meloni P. (2021). ALOHA: A Unified Platform-Aware Evaluation Method for CNNs Execution on Heterogeneous Systems at the Edge. IEEE Access 9, pp. 133289-133308. DOI: 10.1109/ACCESS.2021.3115243. Online publication date: 2021.
https://doi.org/10.1109/ACCESS.2021.3115243

Lin P, Kao M, Liang W, Hung S. (2020). Performance Analysis and Optimization for Federated Learning Applications with PySyft-based Secure Aggregation. 2020 International Computer Symposium (ICS), pp. 191-196. DOI: 10.1109/ICS51289.2020.00046. Online publication date: Dec-2020.
https://doi.org/10.1109/ICS51289.2020.00046

Liao Y, Wang C, Tu C, Kao M, Liang W, Hung S. (2020). PerfNetRT: Platform-Aware Performance Modeling for Optimized Deep Neural Networks. 2020 International Computer Symposium (ICS), pp. 153-158. DOI: 10.1109/ICS51289.2020.00039. Online publication date: Dec-2020.
https://doi.org/10.1109/ICS51289.2020.00039
