Cited By
- Piao X, Kim J (2024). GMM: An Efficient GPU Memory Management-based Model Serving System for Multiple DNN Inference Models. Proceedings of the 53rd International Conference on Parallel Processing, pp. 660-668. doi:10.1145/3673038.3673122. Online publication date: 12-Aug-2024.
- Sinha S, Dwivedi S, Azizian M (2024). Towards Deterministic End-to-end Latency for Medical AI Systems in NVIDIA Holoscan. 2024 ACM/IEEE 15th International Conference on Cyber-Physical Systems (ICCPS), pp. 235-246. doi:10.1109/ICCPS61052.2024.00028. Online publication date: 13-May-2024.