Cited By
- Prabhu R, Nayak A, Mohan J, Ramjee R, Panwar A, Eeckhout L, Smaragdakis G, Liang K, Sampson A, Kim M, Rossbach C (2025). vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention. Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, pp. 1133-1150. DOI: 10.1145/3669940.3707256. Online publication date: 3-Feb-2025.
- Li C, Sha S, Zeng Y, Yang X, Luo Y, Wang X, Wang Z, Zhou D, Bagchi S, Zhang Y (2024). Taming hot bloat under virtualization with HUGESCOPE. Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference, pp. 999-1012. DOI: 10.5555/3691992.3692053. Online publication date: 10-Jul-2024.
- Zhou Z, Gogte V, Vaish N, Kennelly C, Xia P, Kanev S, Moseley T, Delimitrou C, Ranganathan P, Tsafrir D, Musuvathi M, Gupta R, Abu-Ghazaleh N (2024). Characterizing a Memory Allocator at Warehouse Scale. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pp. 192-206. DOI: 10.1145/3620666.3651350. Online publication date: 27-Apr-2024.