Open access research article

Toward reconfigurable kernel datapaths with learned optimizations

Published: 03 June 2021 · DOI: 10.1145/3458336.3465288

Abstract

Today's computing systems pay a heavy "OS tax": kernel execution accounts for a significant share of their resource footprint. This is not least because today's kernels abound with hardcoded heuristics built on unstated assumptions, which rarely generalize well to increasingly diverse applications and device technologies.
We propose the concept of reconfigurable kernel datapaths that enables kernels to self-optimize dynamically. In this architecture, optimizations are computed from empirical data using machine learning (ML), and they are integrated into the kernel in a safe and systematic manner via an in-kernel virtual machine. This virtual machine implements the reconfigurable match table (RMT) abstraction, where tables are installed into the kernel at points where performance-critical events occur, matches look up the current execution context, and actions encode context-specific optimizations computed by ML, which may further vary from application to application. Our envisioned architecture will support both offline and online learning algorithms, as well as varied kernel subsystems. An RMT verifier will check program well-formedness and model efficiency before admitting an RMT program to the kernel. An admitted program can be interpreted as bytecode or just-in-time compiled to optimize the kernel datapaths.
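
To make the match-action abstraction above concrete, the sketch below mocks up a single reconfigurable match table in userspace C. It is illustrative only and not the authors' in-kernel design: the struct names, the context fields (pid_class, file_type), and the readahead action are hypothetical stand-ins for whatever a learned pipeline would actually install.

    /* Illustrative sketch only (not the authors' implementation): a userspace
     * mock-up of one RMT-style table. All names below are hypothetical. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Execution context captured at a performance-critical event,
     * e.g. on the page-cache read path. */
    struct exec_ctx {
        uint32_t pid_class;   /* workload class assigned by an ML model */
        uint32_t file_type;   /* e.g. 0 = sequential log, 1 = random-access DB */
    };

    /* Action: a context-specific tuning parameter computed by ML
     * (here, a readahead window in pages). */
    struct rmt_action {
        uint32_t readahead_pages;
    };

    /* One row of a reconfigurable match table: match fields plus an action. */
    struct rmt_entry {
        struct exec_ctx   match;
        struct rmt_action action;
    };

    /* A tiny table that a learning pipeline might have produced and an
     * RMT verifier admitted. */
    static const struct rmt_entry table[] = {
        { { .pid_class = 0, .file_type = 0 }, { .readahead_pages = 256 } },
        { { .pid_class = 1, .file_type = 1 }, { .readahead_pages = 8   } },
    };

    /* On a table miss, fall back to the kernel's existing hardcoded heuristic. */
    static const struct rmt_action default_action = { .readahead_pages = 32 };

    static const struct rmt_action *rmt_lookup(const struct exec_ctx *ctx)
    {
        for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
            if (memcmp(&table[i].match, ctx, sizeof(*ctx)) == 0)
                return &table[i].action;
        return &default_action;
    }

    int main(void)
    {
        struct exec_ctx ctx = { .pid_class = 1, .file_type = 1 };
        printf("readahead window: %u pages\n",
               rmt_lookup(&ctx)->readahead_pages);
        return 0;
    }

In the envisioned architecture, the table contents would come from an offline or online learning pipeline, pass through the RMT verifier, and then be interpreted or just-in-time compiled inside the kernel, rather than compiled into a userspace binary as in this sketch.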

Cited By

  • Towards a Machine Learning-Assisted Kernel with LAKE. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pages 846-861, 27 Jan 2023. DOI: 10.1145/3575693.3575697
  • Improving Storage Systems Using Machine Learning. ACM Transactions on Storage, 19(1):1-30, 19 Jan 2023. DOI: 10.1145/3568429
  • DongTing. Journal of Systems and Software, 203(C), 1 Sep 2023. DOI: 10.1016/j.jss.2023.111745


Information

Published In

HotOS '21: Proceedings of the Workshop on Hot Topics in Operating Systems
June 2021
251 pages
ISBN: 9781450384384
DOI: 10.1145/3458336
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 June 2021

Author Tags

  1. RMT
  2. machine learning
  3. operating system kernels

Qualifiers

  • Research-article

Conference

HotOS '21

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 282
  • Downloads (last 6 weeks): 54

Reflects downloads up to 19 Nov 2024

