Sparse Matrix-Vector Product

Zbigniew Koza²,
Maciej Matyka²,
Łukasz Mirosław³ &
…
Jakub Poła⁴

3267 Accesses
1 Citations

Abstract

The sparse matrix-vector (SpMV) multiplication is one of the key kernels in scientific computing. We present the foundations of its implementation on CUDA- and OpenCL-enabled devices. After introducing the subject, we briefly present three most popular formats: COO, CRS and ELL. They serve as exemplary data structures on which we discuss hardware-related issues associated with efficient SpMV kernel design, such as matrix size, ordering of data, memory boundedness, storage overhead, thread divergence, and coalescence of memory transfers. Next, we present three widely available libraries with stable and validated SpMV kernels: cuSPARSE, CUSP and Paralution. We present and discuss complete codes of several SpMV kernels for both basic SpMV formats and some of its derivatives, including CMRS, and briefly discuss the principles beyond other popular format extensions.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 119.00; Price excludes VAT (USA)

Hardcover Book: USD 159.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Performance Characteristics for Sparse Matrix-Vector Multiplication on GPUs

LightSpMV: Faster CUDA-Compatible Sparse Matrix-Vector Multiplication Using Compressed Sparse Rows

Article 10 January 2017

Block-wise dynamic mixed-precision for sparse matrix-vector multiplication on GPUs

Article Open access 11 March 2024

References

Bell, N., Garland, M.: Implementing sparse matrix-vector multiplication on throughput-oriented processors. In: SC’09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pp. 1–11. ACM, New York (2009). doi:http://doi.acm.org/10.1145/1654059.1654078
Choi, J.W., Singh, A., Vuduc, R.W.: Model-driven autotuning of sparse matrix-vector multiply on GPUs. SIGPLAN Not. 45(5), 115–126 (2010). http://doi.acm.org/10.1145/1837853.1693471
Dziekonski, A., Lamecki, A., Mrozowski, M.: A memory efficient and fast sparse matrix vector product on a GPU. Prog. Electromagn. Res. 116, 49–63 (2011). http://www.jpier.org/PIER/pier116/03.11031607.pdf
Feng, X., Jin, H., Zheng, R., Hu, K., Zeng, J., Shao, Z.: Optimization of sparse matrix-vector multiplication with variant CSR on GPUs. In: 2011 IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS), pp. 165–172. IEEE, Tainan (2011)
Google Scholar
Koza, Z., Matyka, M., Szkoda, S., Mirosław, Ł.: Compressed multirow storage format for sparse matrices on graphics processing units. SIAM J. Sci. Comput. 36(2), 219–239 (2014). http://dx.doi.org/10.1137/120900216
Monakov, A., Lokhmotov, A., Avetisyan, A.: Automatically tuning sparse matrix-vector multiplication for GPU architectures. In: Patt, Y., Foglia, P., Duesterwald, E., Faraboschi, P., Martorell, X. (eds.) High Performance Embedded Architectures and Compilers. Lecture Notes in Computer Science, vol. 5952, pp. 111–125. Springer, Heidelberg (2010). http://dx.doi.org/10.1007/978-3-642-11515-8_10.
Mukunoki, D., Takahashi, D.: Optimization of sparse matrix-vector multiplication for CRS format on NVIDIA Kepler architecture GPUs. In: Murgante, B., Misra, S., Carlini, M., Torre, C.M., Nguyen, H.Q., Taniar, D., Apduhan, B.O., Gervasi, O. (eds.) Computational Science and Its Applications – ICCSA 2013. Lecture Notes in Computer Science, vol. 7975, pp. 211–223. Springer, Heidelberg (2013). http://dx.doi.org/10.1007/978-3-642-39640-3_15
Su, B.Y., Keutzer, K.: clSpMV: a cross-platform opencl spmv framework on GPUs. In: Proceedings of the International Conference on Supercomputing, ICS ’12 (2012)
Google Scholar
Vázquez, F., Garzón, E.M., Martınez, J.A., Fernández, J.: Accelerating sparse matrix vector product with GPUs. In: Proceedings of the International Conference on Computational and Mathematical Methods in Science and Engineering (CMMSE 2009), pp. 1081–1092. CMMSE, Gijón (2009)
Google Scholar
Vázquez, F., Fernández, J.J., Garzón, E.M.: Automatic tuning of the sparse matrix vector product on GPUs based on the ELLR-T approach. Parallel Comput. 38(8), 408–420 (2012). doi:10.1016/j.parco.2011.08.003. http://www.sciencedirect.com/science/article/pii/S0167819111001050
Vuduc, R.W.: Automatic performance tuning of sparse matrix kernels. Ph.D. thesis, University of California (2003)
Google Scholar
Yang, X., Parthasarathy, S., Sadayappan, P.: Fast sparse matrix-vector multiplication on GPUs: implications for graph mining. Proc. VLDB Endowment 4(4), 231–242 (2011)
Article Google Scholar
Yoshizawa, H., Takahashi, D.: Automatic tuning of sparse matrix-vector multiplication for CRS format on GPUs. In: 2012 IEEE 15th International Conference on Computational Science and Engineering, pp. 130–136 (2012). doi:http://doi.ieeecomputersociety.org/10.1109/ICCSE.2012.28

Download references

Author information

Authors and Affiliations

Faculty of Physics & Astronomy, University of Wrocław, Wrocław, Poland
Zbigniew Koza & Maciej Matyka
Institute of Informatics, Wrocław University of Technology, Wrocław, Poland
Łukasz Mirosław
Institute of Physics, University of Silesia, Katowice, Poland
Jakub Poła

Authors

Zbigniew Koza
View author publications
You can also search for this author in PubMed Google Scholar
Maciej Matyka
View author publications
You can also search for this author in PubMed Google Scholar
Łukasz Mirosław
View author publications
You can also search for this author in PubMed Google Scholar
Jakub Poła
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Zbigniew Koza .

Editor information

Editors and Affiliations

National Center for Supercomputing Applications, University of Illinois, Urbana, Illinois, USA
Volodymyr Kindratenko

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Koza, Z., Matyka, M., Mirosław, Ł., Poła, J. (2014). Sparse Matrix-Vector Product. In: Kindratenko, V. (eds) Numerical Computations with GPUs. Springer, Cham. https://doi.org/10.1007/978-3-319-06548-9_6

Download citation

DOI: https://doi.org/10.1007/978-3-319-06548-9_6
Published: 09 June 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06547-2
Online ISBN: 978-3-319-06548-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Sparse Matrix-Vector Product

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Performance Characteristics for Sparse Matrix-Vector Multiplication on GPUs

LightSpMV: Faster CUDA-Compatible Sparse Matrix-Vector Multiplication Using Compressed Sparse Rows

Block-wise dynamic mixed-precision for sparse matrix-vector multiplication on GPUs

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this chapter

Cite this chapter

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Sparse Matrix-Vector Product

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Performance Characteristics for Sparse Matrix-Vector Multiplication on GPUs

LightSpMV: Faster CUDA-Compatible Sparse Matrix-Vector Multiplication Using Compressed Sparse Rows

Block-wise dynamic mixed-precision for sparse matrix-vector multiplication on GPUs

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this chapter

Cite this chapter

Download citation

Share this chapter

Publish with us

Search

Navigation