research-article

Sparta: high-performance, element-wise sparse tensor contraction on heterogeneous memory

Authors:

Jiawen Liu,

Jie Ren,

Roberto Gioiosa,

Dong Li,

Jiajia LiAuthors Info & Claims

PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Pages 318 - 333

https://doi.org/10.1145/3437801.3441581

Published: 17 February 2021 Publication History

Get Access

Editorial Notes

The authors have requested minor, non-substantive changes to the VoR and, in accordance with ACM policies, a Corrected VoR was published on March 1, 2021. For reference purposes the VoR may still be accessed via the Supplemental Material section on this page.

Abstract

Sparse tensor contractions appear commonly in many applications. Efficiently computing a two sparse tensor product is challenging: It not only inherits the challenges from common sparse matrix-matrix multiplication (SpGEMM), i.e., indirect memory access and unknown output size before computation, but also raises new challenges because of high dimensionality of tensors, expensive multi-dimensional index search, and massive intermediate and output data. To address the above challenges, we introduce three optimization techniques by using multi-dimensional, efficient hashtable representation for the accumulator and larger input tensor, and all-stage parallelization. Evaluating with 15 datasets, we show that Sparta brings 28 -- 576× speedup over the traditional sparse tensor contraction with sparse accumulator. With our proposed algorithm- and memory heterogeneity-aware data management, Sparta brings extra performance improvement on the heterogeneous memory with DRAM and Intel Optane DC Persistent Memory Module (PMM) over a state-of-the-art software-based data management solution, a hardware-based data management solution, and PMM-only by 30.7% (up to 98.5%), 10.7% (up to 28.3%) and 17% (up to 65.1%) respectively.

Supplementary Material

3441581-vor (3441581-vor.pdf)

Version of Record for "Sparta: high-performance, element-wise sparse tensor contraction on heterogeneous memory" by Liu et al., Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '21).

Download
1.03 MB

References

[1]

Neha Agarwal and Thomas F. Wenisch. Thermostat: Application-transparent page management for two-tiered main memory. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2017, Xi'an, China, April 8-12, 2017, pages 631--644, 2017.

Editorial Notes

Abstract

Supplementary Material

References

Cited By

Index Terms

Recommendations

Athena: high-performance sparse tensor contraction sequence on heterogeneous memory

System evaluation of the Intel optane byte-addressable NVM

H2M: Exploiting Heterogeneous Shared Memory Architectures

Comments

Information

Published In

Sponsors

Publisher

Publication History

Permissions

Check for updates

Badges

Author Tags

Qualifiers

Funding Sources

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Login options

Full Access

View options

PDF

eReader

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations