research-article

LiteFlow: towards high-performance adaptive neural networks for kernel datapath

Authors:

Chaoliang Zeng,

Kai Chen

SIGCOMM '22: Proceedings of the ACM SIGCOMM 2022 Conference

Pages 414 - 427

https://doi.org/10.1145/3544216.3544229

Published: 22 August 2022

Abstract

Adaptive neural networks (NNs) have been used to optimize OS kernel datapath functions because they can achieve superior performance under changing environments. However, how to deploy these NNs remains a challenge. One approach is to deploy them in userspace, but such deployments suffer from either high cross-space communication overhead or low responsiveness, which significantly compromises function performance. Pure kernel-space deployments, on the other hand, also incur a large performance degradation: the computation logic of the model tuning algorithm is typically complex and interferes with normal datapath execution.

This paper presents LiteFlow, a hybrid solution for building high-performance adaptive NNs for the kernel datapath. At its core, LiteFlow decouples the control path of adaptive NNs into (1) a kernel-space fast path for efficient model inference, and (2) a userspace slow path for effective model tuning. We have implemented LiteFlow with the Linux kernel datapath and evaluated it with three popular datapath functions: congestion control, flow scheduling, and load balancing. Compared to prior works, LiteFlow achieves 44.4% better goodput for congestion control, and improves the completion time for long flows by 33.7% and 56.7% for flow scheduling and load balancing, respectively.
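The fast path / slow path split described above can be illustrated with a small user-space toy. This is only a sketch under stated assumptions, not LiteFlow's implementation: the class names (`Snapshot`, `FastPath`, `SlowPath`), the linear model, the batch size, and the SGD tuning rule are all hypothetical choices made for illustration. The fast path only evaluates an immutable model snapshot, while the slow path collects samples in batches, tunes its own copy of the model, and installs a new snapshot with a single reference swap.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Immutable model parameters read by the fast path (hypothetical)."""
    weights: Tuple[float, ...]
    bias: float


class FastPath:
    """Stands in for the kernel-space side: inference only, no tuning."""

    def __init__(self, snapshot: Snapshot) -> None:
        self._snapshot = snapshot

    def infer(self, features: List[float]) -> float:
        snap = self._snapshot  # read the reference once per call
        return sum(w * x for w, x in zip(snap.weights, features)) + snap.bias

    def install(self, snapshot: Snapshot) -> None:
        # One reference swap, so inference never sees a half-updated model.
        self._snapshot = snapshot


class SlowPath:
    """Stands in for the userspace side: batches samples, tunes the model."""

    def __init__(self, fast_path: FastPath, snapshot: Snapshot,
                 batch_size: int = 4, lr: float = 0.1) -> None:
        self.fast_path = fast_path
        self.weights = list(snapshot.weights)
        self.bias = snapshot.bias
        self.batch_size = batch_size
        self.lr = lr
        self.batch: List[Tuple[List[float], float]] = []
        self.updates = 0

    def observe(self, features: List[float], target: float) -> None:
        # Samples are buffered so that cross-space traffic is amortized.
        self.batch.append((features, target))
        if len(self.batch) >= self.batch_size:
            self._tune()

    def _tune(self) -> None:
        # One SGD step per sample on squared error (an assumed tuning rule).
        for features, target in self.batch:
            pred = sum(w * x for w, x in zip(self.weights, features)) + self.bias
            err = pred - target
            self.weights = [w - self.lr * err * x
                            for w, x in zip(self.weights, features)]
            self.bias -= self.lr * err
        self.batch.clear()
        self.updates += 1
        self.fast_path.install(Snapshot(tuple(self.weights), self.bias))


def demo() -> Tuple[float, float, int]:
    initial = Snapshot(weights=(0.0,), bias=0.0)
    fast = FastPath(initial)
    slow = SlowPath(fast, initial)
    before = fast.infer([1.0])
    for _ in range(40):
        x = [1.0]
        fast.infer(x)         # every event takes the cheap fast path
        slow.observe(x, 2.0)  # tuning happens off the fast path, in batches
    return before, fast.infer([1.0]), slow.updates
```

Calling `demo()` returns the fast-path prediction before tuning, the prediction after 40 samples, and the number of snapshot installs. In this toy the fast path's output changes only when the slow path installs a new snapshot, which is the property the decoupling is meant to provide.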

Supplementary Material

PDF File (p414-zhang-supp.pdf): supplemental material, 84.97 KB.

Cited By

Tian H, Liao X, Zeng C, Sun D, Zhang J, Chen K. (2024). Efficient DRL-Based Congestion Control With Ultra-Low Overhead. IEEE/ACM Transactions on Networking 32:3 (1888-1903). Online publication date: Jun-2024. https://doi.org/10.1109/TNET.2023.3330737

Wang W, Zhang Y, Jin Y, Tian H, Chen L. (2023). MDP: Model Decomposition and Parallelization of Vision Transformer for Distributed Edge Inference. 2023 19th International Conference on Mobility, Sensing and Networking (MSN) (570-578). Online publication date: 14-Dec-2023. https://doi.org/10.1109/MSN60784.2023.00086

Index Terms

1. Networks
  1. Network algorithms
    1. Data path algorithms

Published In

SIGCOMM '22: Proceedings of the ACM SIGCOMM 2022 Conference

August 2022

858 pages

ISBN:9781450394208

DOI:10.1145/3544216

General Chairs: Fernando Kuipers (Delft University of Technology), Ariel Orda (Technion Israel Institute of Technology)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGCOMM: ACM Special Interest Group on Data Communication

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Key-Area Research and Development Program of Guangdong Province
NSFC
Hong Kong RGC TRS

Conference

SIGCOMM '22: ACM SIGCOMM 2022 Conference

Sponsor: SIGCOMM

August 22 - 26, 2022

Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 462 of 3,389 submissions, 14%

Bibliometrics

Total Citations: 2
Total Downloads: 1,328
Downloads (Last 12 months): 279
Downloads (Last 6 weeks): 34

Reflects downloads up to 20 Nov 2024
