Dynamic Early Exit Scheduling for Deep Neural Network Inference through Contextual Bandits

Authors:

Dong Yuan

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management

Pages 823 - 832

https://doi.org/10.1145/3459637.3482335

Published: 30 October 2021

Abstract

Recent advances in Deep Neural Networks (DNNs) have dramatically improved the accuracy of DNN inference, but they have also increased inference latency. In this paper, we investigate how to utilize early exit, a novel method that allows inference to stop at an earlier exit point at the cost of an acceptable loss of accuracy. Scheduling the optimal exit point on a per-instance basis is challenging because the realized performance (i.e., confidence and latency) of each exit point is random and its statistics vary across scenarios. Moreover, the performance of the exit points is mutually dependent, which further complicates the problem. Therefore, the optimal exit scheduling decision cannot be known in advance and must be learned online. To this end, we propose Dynamic Early Exit (DEE), a real-time online learning algorithm based on contextual bandit analysis. DEE observes the performance at each exit point as context and decides whether to exit or keep processing. Unlike in standard contextual bandit analyses, the rewards of the decisions in our problem are temporally dependent. Furthermore, the earlier exit points are inevitably explored more than the later ones, which creates an unbalanced exploration-exploitation trade-off. DEE addresses these challenges, and its regret per inference asymptotically approaches zero. We compare DEE with four benchmark schemes in a real-world experiment. The experimental results show that DEE can improve the overall performance by up to 98.1% compared to the best benchmark scheme.
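
The exit-or-continue decision described in the abstract can be illustrated with a minimal sketch. This is not the DEE algorithm: it is a plain epsilon-greedy learner over discretized confidence values, and it handles the temporal dependence crudely by crediting every "continue" decision with the reward realized at the eventual exit. The class and function names, the reward definition (confidence at the exit minus weighted cumulative latency), the number of confidence bins, and the synthetic data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class EpsilonGreedyExitPolicy:
    """Toy exit-or-continue learner (hypothetical; NOT the paper's DEE)."""

    EXIT, CONTINUE = 0, 1

    def __init__(self, num_exits, num_bins=5, epsilon=0.1, seed=0):
        self.num_exits = num_exits
        self.num_bins = num_bins
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # (exit_index, confidence_bin, action) -> [count, mean reward]
        self.stats = defaultdict(lambda: [0, 0.0])

    def _bin(self, confidence):
        # Discretize a confidence in [0, 1] into one of num_bins contexts.
        return min(int(confidence * self.num_bins), self.num_bins - 1)

    def decide(self, exit_index, confidence):
        # The final exit point has no "continue" option.
        if exit_index == self.num_exits - 1:
            return self.EXIT
        # Explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.choice((self.EXIT, self.CONTINUE))
        b = self._bin(confidence)
        exit_mean = self.stats[(exit_index, b, self.EXIT)][1]
        cont_mean = self.stats[(exit_index, b, self.CONTINUE)][1]
        return self.EXIT if exit_mean >= cont_mean else self.CONTINUE

    def update(self, trajectory, reward):
        # Every decision on the path is credited with the final reward.
        for exit_index, confidence, action in trajectory:
            entry = self.stats[(exit_index, self._bin(confidence), action)]
            entry[0] += 1
            entry[1] += (reward - entry[1]) / entry[0]  # running mean


def run_inference(policy, confidences, latencies, latency_weight=0.5):
    """Visit exits in order; assumed reward = confidence - weight * latency."""
    trajectory = []
    elapsed = 0.0
    for k, (conf, lat) in enumerate(zip(confidences, latencies)):
        elapsed += lat
        action = policy.decide(k, conf)
        trajectory.append((k, conf, action))
        if action == policy.EXIT:
            reward = conf - latency_weight * elapsed
            policy.update(trajectory, reward)
            return k, reward
    raise AssertionError("unreachable: the final exit always exits")


def simulate(num_rounds=2000, seed=1):
    """Average reward on synthetic confidences (assumed, not paper data)."""
    rng = random.Random(seed)
    policy = EpsilonGreedyExitPolicy(num_exits=3, seed=seed)
    latencies = [0.1, 0.1, 0.1]
    total = 0.0
    for _ in range(num_rounds):
        # Later exits are at least as confident as earlier ones.
        confidences = sorted(rng.random() for _ in range(3))
        _, reward = run_inference(policy, confidences, latencies)
        total += reward
    return total / num_rounds


if __name__ == "__main__":
    print(f"average reward: {simulate():.3f}")
```

Because an inference can only reach exit k by continuing past every earlier exit, the statistics for early exits fill in faster than those for later ones. This is the unbalanced exploration the abstract mentions, and a fixed epsilon does nothing to correct it.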

Supplementary Material

MP4 File (CIKM21-rgfp0678.mp4)

Presentation video


Index Terms

1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Neural networks
2. Computing methodologies
  1. Artificial intelligence
    1. Planning and scheduling
      1. Planning under uncertainty
  2. Machine learning
    1. Learning paradigms
      1. Reinforcement learning

Information & Contributors

Information

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management

October 2021

4966 pages

ISBN:9781450384469

DOI:10.1145/3459637

General Chairs: Gianluca Demartini (The University of Queensland, Australia), Guido Zuccon (The University of Queensland, Australia)

Program Chairs: J. Shane Culpepper (RMIT University, Australia), Zi Huang (The University of Queensland, Australia), Hanghang Tong (University of Illinois at Urbana-Champaign, USA)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CIKM '21

CIKM '21: The 30th ACM International Conference on Information and Knowledge Management

November 1 - 5, 2021

Queensland, Virtual Event, Australia

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics

Total Citations: 11
Total Downloads: 514
Downloads (Last 12 months): 114
Downloads (Last 6 weeks): 9

Reflects downloads up to 06 Jan 2025

Cited By

Rahmath P H, Srivastava V, Chaurasia K, Pacheco R, Couto R (2024). Early-Exit Deep Neural Network - A Comprehensive Survey. ACM Computing Surveys 57:3 (1-37). Online publication date: 22-Nov-2024.
https://dl.acm.org/doi/10.1145/3698767
Yan G, Liu K, Liu C, Zhang J (2024). Edge Intelligence for Internet of Vehicles: A Survey. IEEE Transactions on Consumer Electronics 70:2 (4858-4877). Online publication date: May-2024.
https://doi.org/10.1109/TCE.2024.3378509
Ayyat M, Nadeem T, Krawczyk B (2024). ClassyNet: Class-Aware Early-Exit Neural Networks for Edge Devices. IEEE Internet of Things Journal 11:9 (15113-15127). Online publication date: 1-May-2024.
https://doi.org/10.1109/JIOT.2023.3344120
Bajpai D, Jaiswal A, Hanawal M (2024). I-SplitEE: Image Classification in Split Computing DNNs with Early Exits. ICC 2024 - IEEE International Conference on Communications (2658-2663). Online publication date: 9-Jun-2024.
https://doi.org/10.1109/ICC51166.2024.10622954
Bajpai D, Trivedi V, Yadav S, Hanawal M (2023). SplitEE: Early Exit in Deep Neural Networks with Split Computing. Proceedings of the Third International Conference on AI-ML Systems (1-9). Online publication date: 25-Oct-2023.
https://dl.acm.org/doi/10.1145/3639856.3639873
Kannan T, Feamster N, Hoffmann H, Meng W, Jensen C, Cremers C, Kirda E (2023). Prediction Privacy in Distributed Multi-Exit Neural Networks: Vulnerabilities and Solutions. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (1123-1137). Online publication date: 15-Nov-2023.
https://dl.acm.org/doi/10.1145/3576915.3623069
Xu W, Yin Y, Chen N, Tu H (2023). Collaborative Inference Acceleration Integrating DNN Partitioning and Task Offloading in Mobile Edge Computing. International Journal of Software Engineering and Knowledge Engineering 33:11n12 (1835-1863). Online publication date: 29-Nov-2023.
https://doi.org/10.1142/S0218194023410085
Dong F, Wang H, Shen D, Huang Z, He Q, Zhang J, Wen L, Zhang T (2023). Multi-Exit DNN Inference Acceleration Based on Multi-Dimensional Optimization for Edge Intelligence. IEEE Transactions on Mobile Computing 22:9 (5389-5405). Online publication date: 1-Sep-2023.
https://dl.acm.org/doi/10.1109/TMC.2022.3172402
Pacheco R, Shifrin M, Couto R, Menasché D, Hanawal M, Campista M (2023). AdaEE: Adaptive Early-Exit DNN Inference Through Multi-Armed Bandits. ICC 2023 - IEEE International Conference on Communications (3726-3731). Online publication date: 28-May-2023.
https://doi.org/10.1109/ICC45041.2023.10279243
Kanduri A, Shahhosseini S, Naeini E, Alikhani H, Liljeberg P, Dutt N, Rahmani A (2023). Edge-Centric Optimization of Multi-modal ML-Driven eHealth Applications. Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing (95-125). Online publication date: 7-Oct-2023.
https://doi.org/10.1007/978-3-031-40677-5_5