Research article · DOI: 10.1145/3523227.3546760

TinyKG: Memory-Efficient Training Framework for Knowledge Graph Neural Recommender Systems

Published: 13 September 2022

Abstract

There has been an explosion of interest in designing Knowledge Graph Neural Networks (KGNNs), which achieve state-of-the-art performance and offer strong explainability for recommendation. Their promising performance mainly results from their ability to capture high-order proximity messages over knowledge graphs. However, training KGNNs at scale is challenging due to high memory usage: in the forward pass, automatic differentiation engines (e.g., TensorFlow/PyTorch) generally cache all intermediate activation maps in order to compute gradients in the backward pass, which leads to a large GPU memory footprint. Existing work sidesteps this problem with multi-GPU distributed frameworks. Nonetheless, this poses a practical challenge when seeking to deploy KGNNs in memory-constrained environments, especially for industry-scale graphs.
Here we present TinyKG, a memory-efficient GPU-based training framework for KGNNs for recommendation tasks. Specifically, TinyKG uses exact activations in the forward pass while storing a quantized version of the activations in the GPU buffers. During the backward pass, these low-precision activations are dequantized back to full-precision tensors to compute gradients. To reduce quantization error, TinyKG applies a simple yet effective quantization algorithm to compress the activations, one that guarantees unbiasedness with low variance. As such, the training memory footprint of KGNNs is greatly reduced with negligible accuracy loss. To evaluate TinyKG, we conduct comprehensive experiments on real-world datasets. We find that TinyKG with INT2 quantization aggressively reduces the memory footprint of activation maps by 7×, with only a 2% loss in accuracy, allowing us to deploy KGNNs on memory-constrained devices.
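To make the described mechanism concrete, the sketch below illustrates the general activation-compressed-training pattern from the abstract: compute the forward pass on exact activations, cache only a stochastically quantized copy, then dequantize that copy in the backward pass to form gradients. This is a minimal PyTorch sketch under our own assumptions, not TinyKG's actual implementation; the names quantize_stochastic and QuantizedLinear are hypothetical. Stochastic rounding makes the cached tensor unbiased (E[x̂] = x), the property the abstract attributes to TinyKG's quantizer.

```python
import torch

def quantize_stochastic(x, num_bits=2):
    # Per-tensor affine quantization with stochastic rounding.
    # Stochastic rounding is unbiased: E[q] equals the normalized value.
    qmax = 2 ** num_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo).clamp(min=1e-8) / qmax
    v = (x - lo) / scale                      # normalized to [0, qmax]
    floor = v.floor()
    # Round up with probability equal to the fractional part.
    q = floor + (torch.rand_like(v) < (v - floor)).to(v.dtype)
    # A real INT2 buffer would bit-pack 4 values per byte; uint8 keeps the sketch simple.
    return q.to(torch.uint8), lo, scale

def dequantize(q, lo, scale):
    return q.to(torch.float32) * scale + lo

class QuantizedLinear(torch.autograd.Function):
    """Linear op that caches compressed activations for the backward pass."""

    @staticmethod
    def forward(ctx, x, weight):
        out = x @ weight.t()                  # forward pass uses exact activations
        q, lo, scale = quantize_stochastic(x)
        ctx.save_for_backward(q, lo, scale, weight)  # store the low-precision copy only
        return out

    @staticmethod
    def backward(ctx, grad_out):
        q, lo, scale, weight = ctx.saved_tensors
        x_hat = dequantize(q, lo, scale)      # recover approximate activations
        grad_x = grad_out @ weight            # gradient w.r.t. the input
        grad_w = grad_out.t() @ x_hat         # gradient w.r.t. the weight, from the cache
        return grad_x, grad_w
```

Calling QuantizedLinear.apply(x, W) in place of x @ W.t() leaves the forward result unchanged but swaps the FP32 activation cache for a 2-bit one, which is where the roughly 7× activation-memory saving reported above would come from (a small overhead remains for the per-tensor scale and offset).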



Published In

RecSys '22: Proceedings of the 16th ACM Conference on Recommender Systems, September 2022, 743 pages.

Publisher: Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Knowledge Graph Neural Network
  2. Memory Compression
  3. Quantization
  4. Tiny Machine Learning

Acceptance Rates

Overall Acceptance Rate: 254 of 1,295 submissions, 20%
