Abstract
With the rapid development of artificial intelligence (AI), intelligent applications based on deep neural networks (DNNs) have changed people’s lifestyles and improved production efficiency. However, the large amount of computation and data generated at the network edge has become the major bottleneck, and the traditional cloud-based computing mode cannot meet the requirements of real-time processing tasks. To solve these problems, edge intelligence (EI), which embeds AI model training and inference capabilities into the network edge, has become a cutting-edge direction in the field of AI. Furthermore, collaborative DNN inference among the cloud, edge, and end devices provides a promising way to boost EI. Nevertheless, EI-oriented collaborative DNN inference is still in its early stage and lacks a systematic classification and discussion of existing research efforts. Motivated by this, we comprehensively investigate recent studies on EI-oriented collaborative DNN inference. In this paper, we first review the background and motivation of EI. Then, we classify four typical collaborative DNN inference paradigms for EI and analyse their characteristics and key technologies. Finally, we summarize the current challenges of collaborative DNN inference, discuss future development trends, and provide future research directions.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Nos. 61931011, 62072303 and 61872310), the Key-area Research and Development Program of Guangdong Province, China (No. 2021B0101400003), the Hong Kong Research Grants Council (RGC) Research Impact Fund, China (No. R5060-19), the General Research Fund (Nos. 152221/19E, 152203/20E and 152244/21E), and the Shenzhen Science and Technology Innovation Commission, China (No. JCYJ20200109142008673).
Author information
Additional information
Wei-Qing Ren received the B. Sc. degree in electronic science and technology from Nanjing University of Aeronautics and Astronautics, China in 2021. He is currently a master's student in electronic information at the College of Electronic Information Engineering, Nanjing University of Aeronautics and Astronautics, China.
His research interests include deep learning, UAV-based target detection and collaborative inference in UAV swarms.
Yu-Ben Qu received the B. Sc. degree in mathematics and applied mathematics from Nanjing University, China in 2009, the M. Sc. degree in communication and information systems, and the Ph. D. degree in computer science and technology from Nanjing Institute of Communications, China in 2012 and 2016, respectively. From June 2019 to June 2022, he was a postdoctoral fellow with Department of Computer Science and Engineering, Shanghai Jiao Tong University, China. He is currently an associate research fellow in College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, and also with the Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, Ministry of Industry and Information Technology, China. From October 2015 to January 2016, he was a visiting research associate in School of Computer Science and Engineering, University of Aizu, Japan. He was a recipient of the Best Paper Awards of GPC 2020 and IEEE SAGC 2021.
His research interests include mobile edge computing, edge intelligence and UAVs collaborative intelligence.
Chao Dong received the Ph. D. degree in communication engineering from PLA University of Science and Technology, China in 2007. From 2008 to 2011, he worked as a postdoctoral researcher at Department of Computer Science and Technology, Nanjing University, China. From 2011 to 2017, he was an associate professor with Institute of Communications Engineering, PLA University of Science and Technology, China. He is now a full professor with College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, China. He is a member of IEEE, ACM and IEICE.
His research interests include D2D communications, UAV swarm networking and anti-jamming network protocols.
Yu-Qian Jing received the B. Sc. degree in electronic science and technology from Nanjing University of Aeronautics and Astronautics, China in 2020. He is currently a master's student in information and communication engineering at Nanjing University of Aeronautics and Astronautics, China.
His research interest is edge network intelligence.
Hao Sun received the B. Sc. degree in electronic science and technology from Nanjing University of Aeronautics and Astronautics, China in 2022. He is currently a master's student in information and communication engineering at College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, China.
His research interests include deep learning, UAV cluster intelligence and UAV collaborative inference.
Qi-Hui Wu received the B. Sc. degree in communications engineering, and the M. Sc. and Ph. D. degrees in communications and information systems from Institute of Communications Engineering, China in 1994, 1997 and 2000, respectively. From 2003 to 2005, he was a postdoctoral research associate at Southeast University, China. From 2005 to 2007, he was an associate professor with Institute of Communications Engineering, PLA University of Science and Technology, China, where he is currently a full professor. From March 2011 to September 2011, he was an advanced visiting scholar in Stevens Institute of Technology, USA. Since 2016, he has been with Nanjing University of Aeronautics and Astronautics and appointed a distinguished professor.
His research interests include wireless communications and statistical signal processing, with emphasis on system design of software defined radio, cognitive radio, and smart radio.
Song Guo received the Ph. D. degree in computer science from University of Ottawa, Canada in 2005. He is a full professor at Department of Computing, Hong Kong Polytechnic University, China. He also holds a Changjiang Chair Professorship awarded by the Ministry of Education of China. He is a Fellow of the Canadian Academy of Engineering and a Fellow of the IEEE (Computer Society). He has published many papers in top venues with wide impact in his research areas and was recognized as a Highly Cited Researcher (Clarivate Web of Science). He is the recipient of over a dozen Best Paper Awards from IEEE/ACM conferences, journals, and technical committees. He is the Editor-in-Chief of IEEE Open Journal of the Computer Society and the Chair of IEEE Communications Society (ComSoc) Space and Satellite Communications Technical Committee. He was an IEEE ComSoc Distinguished Lecturer and a Member of IEEE ComSoc Board of Governors. He has served on the Fellow Evaluation Committee of the IEEE Computer Society and on the editorial boards of a number of prestigious international journals, such as IEEE TPDS, IEEE TCC and IEEE TETC. He has also served as chair of the organizing and technical committees of many international conferences.
His research interests include big data, edge AI, mobile computing and distributed systems.
Cite this article
Ren, WQ., Qu, YB., Dong, C. et al. A Survey on Collaborative DNN Inference for Edge Intelligence. Mach. Intell. Res. 20, 370–395 (2023). https://doi.org/10.1007/s11633-022-1391-7
DOI: https://doi.org/10.1007/s11633-022-1391-7