18th OSDI 2024: Santa Clara, CA, USA
- Ada Gavrilovska, Douglas B. Terry:
18th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2024, Santa Clara, CA, USA, July 10-12, 2024. USENIX Association 2024
Memory Management
- Nikita Lazarev, Varun Gohil, James Tsai, Andy Anderson, Bhushan Chitlur, Zhiru Zhang, Christina Delimitrou:
Sabre: Hardware-Accelerated Snapshot Compression for Serverless MicroVMs. 1-18
- Lingfeng Xiang, Zhen Lin, Weishu Deng, Hui Lu, Jia Rao, Yifan Yuan, Ren Wang:
Nomad: Non-Exclusive Memory Tiering via Transactional Page Migration. 19-35
- Yuhong Zhong, Daniel S. Berger, Carl A. Waldspurger, Ryan Wee, Ishwar Agarwal, Rajat Agarwal, Frank Hady, Karthik Kumar, Mark D. Hill, Mosharaf Chowdhury, Asaf Cidon:
Managing Memory Tiers with CXL in Virtualized Environments. 37-56
- Zhihong Luo, Sam Son, Sylvia Ratnasamy, Scott Shenker:
Harvesting Memory-bound CPU Stall Cycles in Software with MSH. 57-75
- Lei Chen, Shi Liu, Chenxi Wang, Haoran Ma, Yifan Qiao, Zhe Wang, Chenggang Wu, Youyou Lu, Xiaobing Feng, Huimin Cui, Shan Lu, Harry Xu:
A Tale of Two Paths: Toward a Hybrid Data Plane for Efficient Far-Memory Applications. 77-95
- Haoran Ma, Yifan Qiao, Shi Liu, Shan Yu, Yuanjiang Ni, Qingda Lu, Jiesheng Wu, Yiying Zhang, Miryung Kim, Harry Xu:
DRust: Language-Guided Distributed Shared Memory with Fine Granularity, Full Transparency, and Ultra Efficiency. 97-115
Low-Latency LLM Serving
- Amey Agrawal, Nitin Kedia, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav S. Gulavani, Alexey Tumanov, Ramachandran Ramjee:
Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve. 117-134
- Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai:
ServerlessLLM: Low-Latency Serverless Inference for Large Language Models. 135-153
- Wonbeom Lee, Jungi Lee, Junghwan Seo, Jaewoong Sim:
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management. 155-172
- Biao Sun, Ziming Huang, Hanyu Zhao, Wencong Xiao, Xinyi Zhang, Yong Li, Wei Lin:
Llumnix: Dynamic Scheduling for Large Language Model Serving. 173-191
- Yinmin Zhong, Shengyu Liu, Junda Chen, Jianbo Hu, Yibo Zhu, Xuanzhe Liu, Xin Jin, Hao Zhang:
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving. 193-210
Distributed Systems
- Zhenhao He, Dario Korolija, Yu Zhu, Benjamin Ramhorst, Tristan Laan, Lucian Petrica, Michaela Blott, Gustavo Alonso:
ACCL+: an FPGA-Based Collective Engine for Distributed Applications. 211-231
- Liangcheng Yu, Xiao Zhang, Haoran Zhang, John Sonchack, Dan R. K. Ports, Vincent Liu:
Beaver: Practical Partial Snapshots for Distributed Cloud Services. 233-249
- Hanze Zhang, Ke Cheng, Rong Chen, Haibo Chen:
Fast and Scalable In-network Lock Management Using Lock Fission. 251-268
- Martina Camaioni, Rachid Guerraoui, Matteo Monti, Pierre-Louis Roman, Manuel Vidigueira, Gauthier Voron:
Chop Chop: Byzantine Atomic Broadcast to the Network Limit. 269-287
Deep Learning
- Yi Zhai, Sijia Yang, Keyu Pan, Renwei Zhang, Shuo Liu, Chao Liu, Zichun Ye, Jianmin Ji, Jie Zhao, Yu Zhang, Yanyong Zhang:
Enabling Tensor Language Model to Assist in Generating High-Performance Tensor Programs for Deep Learning. 289-305
- Lei Wang, Lingxiao Ma, Shijie Cao, Quanlu Zhang, Jilong Xue, Yining Shi, Ningxin Zheng, Ziming Miao, Fan Yang, Ting Cao, Yuqing Yang, Mao Yang:
Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation. 307-323
- Qizheng Zhang, Ali Imran, Enkeleda Bardhi, Tushar Swamy, Nathan Zhang, Muhammad Shahbaz, Kunle Olukotun:
Caravan: Practical Online Learning of In-Network ML Models with Labeling Agents. 325-345
- Zhiqi Lin, Youshan Miao, Quanlu Zhang, Fan Yang, Yi Zhu, Cheng Li, Saeed Maleki, Xu Cao, Ning Shang, Yilei Yang, Weijiang Xu, Mao Yang, Lintao Zhang, Lidong Zhou:
nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training. 347-363
- Yuhan Liu, Chengcheng Wan, Kuntai Du, Henry Hoffmann, Junchen Jiang, Shan Lu, Michael Maire:
ChameleonAPI: Automatic and Efficient Customization of Neural Networks for ML Applications. 365-386
Operating Systems
- Hayley LeBlanc, Nathan Taylor, James Bornholt, Vijay Chidambaram:
SquirrelFS: using the Rust compiler to check file-system crash consistency. 387-404
- Athinagoras Skiadopoulos, Zhiqiang Xie, Mark Zhao, Qizhe Cai, Saksham Agarwal, Jacob Adelmann, David Ahern, Carlo Contavalli, Michael D. Goldflam, Vitaly Mayatskikh, Raghu Raja, Daniel Walton, Rachit Agarwal, Shrijeet Mukherjee, Christos Kozyrakis:
High-throughput and Flexible Host Networking for Accelerated Computing. 405-423
- Yilun Wu, Byounguk Min, Mohannad Ismail, Wenjie Xiong, Changhee Jung, Dongyoon Lee:
IntOS: Persistent Embedded Operating System and Language Support for Multi-threaded Intermittent Computing. 425-443
- Ao Li, Ning Zhang:
Data-flow Availability: Achieving Timing Assurance in Autonomous Systems. 445-463
- Haibo Chen, Xie Miao, Ning Jia, Nan Wang, Yu Li, Nian Liu, Yutao Liu, Fei Wang, Qiang Huang, Kun Li, Hongyang Yang, Hui Wang, Jie Yin, Yu Peng, Fengwei Xu:
Microkernel Goes General: Performance and Compatibility in the HongMeng Production Microkernel. 465-485
Cloud Computing
- Abdullah Bin Faisal, Noah Martin, Hafiz Mohsin Bashir, Swaminathan Lamelas, Fahad R. Dogar:
When will my ML Job finish? Toward providing Completion Time Estimates through Predictability-Centric Scheduling. 487-505
- Neeraj Kumar, Pol Mauri Ruiz, Vijay Menon, Igor Kabiljo, Mayank Pundir, Andrew Newell, Daniel Lee, Liyuan Wang, Chunqiang Tang:
Optimizing Resource Allocation in Hyperscale Datacenters: Scalability, Usability, and Experiences. 507-528
- Rui Wang, Devin Gibson, Kirk Rodrigues, Yu Luo, Yun Zhang, Kaibo Wang, Yupeng Fu, Ting Chen, Ding Yuan:
μSlope: High Compression and Fast Search on Semi-Structured Logs. 529-544
- Mike Chow, Yang Wang, William Wang, Ayichew Hailu, Rohan Bopardikar, Bin Zhang, Jialiang Qu, David Meisner, Santosh Sonawane, Yunqi Zhang, Rodrigo Paim, Mack Ward, Ivor Huang, Matt McNally, Daniel Hodges, Zoltan Farkas, Caner Gocmen, Elvis Huang, Chunqiang Tang:
ServiceLab: Preventing Tiny Performance Regressions at Hyperscale through Pre-Production Testing. 545-562
- Arnab Choudhury, Yang Wang, Tuomas Pelkonen, Kutta Srinivasan, Abha Jain, Shenghao Lin, Delia David, Siavash Soleimanifard, Michael Chen, Abhishek Yadav, Ritesh Tijoriwala, Denis Samoylov, Chunqiang Tang:
MAST: Global Scheduling of ML Training across Geo-Distributed Datacenters at Hyperscale. 563-580
Formal Verification
- Rishabh R. Iyer, Katerina J. Argyraki, George Candea:
Automatically Reasoning About How Systems Code Uses the CPU Cache. 581-598
- Ziqiao Zhou, Anjali, Weiteng Chen, Sishuai Gong, Chris Hawblitzel, Weidong Cui:
VeriSMo: A Verified Security Module for Confidential VMs. 599-614
- Hao Sun, Zhendong Su:
Validating the eBPF Verifier via State Embedding. 615-628
- Mo Zou, Dong Du, Mingkai Dong, Haibo Chen:
Using Dynamically Layered Definite Releases for Verifying the RefFS File System. 629-648
- Xudong Sun, Wenjie Ma, Jiawei Tyler Gu, Zicheng Ma, Tej Chajed, Jon Howell, Andrea Lattuada, Oded Padon, Lalith Suresh, Adriana Szekeres, Tianyin Xu:
Anvil: Verifying Liveness of Cluster Management Controllers. 649-666
Cloud Security
- Marcos K. Aguilera, Clément Burgelin, Rachid Guerraoui, Antoine Murat, Athanasios Xygkis, Igor Zablotchi:
DSig: Breaking the Barrier of Signatures in Data Centers. 667-685
- Zhongyu Wang, Yaheng Song, Erci Xu, Haonan Wu, Guangxun Tong, Shizhuo Sun, Haoran Li, Jincheng Liu, Lijun Ding, Rong Liu, Jiaji Zhu, Jiesheng Wu:
Ransom Access Memories: Achieving Practical Ransomware Protection in Cloud with DeftPunk. 687-702
- Graeme Connell, Vivian Fang, Rolfe Schmidt, Emma Dauterman, Raluca Ada Popa:
Secret Key Recovery in a Global-Scale End-to-End Encryption System. 703-719
- Darya Kaviani, Sijun Tan, Pravein Govindan Kannan, Raluca Ada Popa:
Flock: A Framework for Deploying On-Demand Distributed Trust. 721-743
Data Management
- Sara McAllister, Yucong Wang, Benjamin Berg, Daniel S. Berger, George Amvrosiadis, Nathan Beckmann, Gregory R. Ganger:
FairyWREN: A Sustainable Cache for Emerging Write-Read-Erase Flash Interfaces. 745-764
- Shujian Qian, Ashvin Goel:
Massively Parallel Multi-Versioned Transaction Processing. 765-781
- Junyi Shu, Kun Qian, Ennan Zhai, Xuanzhe Liu, Xin Jin:
Burstable Cloud Block Storage with Data Processing Units. 783-799
- Ming Zhang, Yu Hua, Zhijun Yang:
Motor: Enabling Multi-Versioning for Distributed Transactions on Disaggregated Memory. 801-819
Analysis of Correctness
- Zu-Ming Jiang, Zhendong Su:
Detecting Logic Bugs in Database Engines via Equivalent Expression Transformation. 821-835
- Tony Nuda Zhang, Travis Hance, Manos Kapritsos, Tej Chajed, Bryan Parno:
Inductive Invariants That Spark Joy: Using Invariant Taxonomies to Streamline Distributed Protocol Proofs. 837-853
- Jiacheng Ma, Rishabh R. Iyer, Sahand Kashani, Mahyar Emami, Thomas Bourgeat, George Candea:
Performance Interfaces for Hardware Accelerators. 855-874
- Eli Goldweber, Weixin Yu, Seyed Armin Vakil-Ghahani, Manos Kapritsos:
IronSpec: Increasing the Reliability of Formal Specifications. 875-891
- Minwoo Ahn, Jeongmin Han, Youngjin Kwon, Jinkyu Jeong:
Identifying On-/Off-CPU Bottlenecks Together with Blocked Samples. 893-910
ML Scheduling
- Bingyang Wu, Ruidong Zhu, Zili Zhang, Peng Sun, Xuanzhe Liu, Xin Jin:
dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving. 911-927
- Chaofan Lin, Zhenhua Han, Chengruidong Zhang, Yuqing Yang, Fan Yang, Chen Chen, Lili Qiu:
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable. 929-945
- Sudipta Saha Shubha, Haiying Shen, Anand Iyer:
USHER: Holistic Interference Avoidance for Resource Optimized ML Inference. 947-964
- Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica:
Fairness in Serving Large Language Models. 965-988
- Donglin Zhuang, Zhen Zheng, Haojun Xia, Xiafei Qiu, Junjie Bai, Wei Lin, Shuaiwen Leon Song:
MonoNN: Enabling a New Monolithic Optimization Space for Neural Network Inference Tasks on Modern GPU-Centric Architectures. 989-1005