Sparse Matrix-Vector Multiplication Design on FPGAs

Authors: Junqing Sun, Gregory Peterson, Olaf Storaasli

FCCM '07: Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines

Pages 349 - 352

https://doi.org/10.1109/FCCM.2007.60

Published: 23 April 2007

Abstract

Creating a high-throughput sparse matrix-vector multiplication (SpMxV) implementation depends on a balanced system design. In this paper, we introduce the innovative SpMxV Solver designed for FPGAs (SSF). Besides high computational throughput, system performance is optimized by reducing initialization time and overheads, minimizing and overlapping I/O operations, and increasing scalability. SSF accepts any matrix size and can be easily adapted to different data formats. SSF minimizes the control logic by taking advantage of the data flow via an innovative accumulation circuit that uses pipelined floating-point adders. Compared to optimized software codes on a Pentium 4 microprocessor, our design achieves up to 20x speedup.
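
For readers unfamiliar with the kernel, the following is a minimal software sketch of SpMxV, y = A·x, that touches only the nonzero entries of A. It is not the SSF design: the abstract does not say which storage formats SSF uses, so the compressed sparse row (CSR) layout, the function name `spmxv_csr`, and the example matrix are all assumptions made for illustration. The per-row running sum `acc` is the software counterpart of the accumulation step that a hardware design would have to pipeline.

```python
from typing import List, Sequence


def spmxv_csr(
    values: Sequence[float],
    col_idx: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form (assumed format).

    values  : nonzero entries of A, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits row i inside `values`
    """
    y: List[float] = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0  # running sum for this row
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]  # multiply-accumulate on a nonzero
        y.append(acc)
    return y


# Hypothetical 3x3 example matrix:
# [[2, 0, 1],
#  [0, 3, 0],
#  [4, 0, 5]]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = spmxv_csr(values, col_idx, row_ptr, x)
print(y)
```

Each row's result depends on a chain of additions, and the reads of `x` follow the irregular pattern given by `col_idx`. Both properties make the kernel sensitive to adder latency and memory behavior, which is why the abstract stresses balancing computation against I/O.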

Index Terms

Hardware

Information

Published In: FCCM '07: Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, April 2007, 354 pages

ISBN: 0769529402

Publisher: IEEE Computer Society, United States

Cited By

Yavits L, Ginosar R (2018). Accelerator for Sparse Machine Learning. IEEE Computer Architecture Letters 17:1 (21-24). doi:10.1109/LCA.2017.2714667. Online publication date: 1-Jan-2018
https://dl.acm.org/doi/10.1109/LCA.2017.2714667
Gomez-Cornejo J, Zuloaga A, Villalta I, Del Ser J, Kretzschmar U, Lazaro J (2017). A novel BRAM content accessing and processing method based on FPGA configuration bitstream. Microprocessors & Microsystems 49:C (64-76). doi:10.1016/j.micpro.2017.01.009. Online publication date: 1-Mar-2017
https://dl.acm.org/doi/10.1016/j.micpro.2017.01.009
Songhori E, Mirhoseini A, Lu X, Koushanfar F, Nebel W, Atienza D (2015). AHEAD. Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition (942-947). doi:10.5555/2755753.2757032. Online publication date: 9-Mar-2015
https://dl.acm.org/doi/10.5555/2755753.2757032
Mahdavikhah B, Mafi R, Sirouspour S, Nicolici N (2014). A Multiple-FPGA parallel computing architecture for real-time simulation of soft-object deformation. ACM Transactions on Embedded Computing Systems 13:4 (1-23). doi:10.1145/2560031. Online publication date: 10-Mar-2014
https://dl.acm.org/doi/10.1145/2560031
Halstead R, Najjar W, Rabbah R, Raghunathan A (2013). Compiled multithreaded data paths on FPGAs for dynamic workloads. Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems (1-10). doi:10.5555/2555729.2555732. Online publication date: 29-Sep-2013
https://dl.acm.org/doi/10.5555/2555729.2555732
Hoe D, Comer J, Cerda J, Martinez C, Shirvaikar M (2012). Cellular automata-based parallel random number generators using FPGAs. International Journal of Reconfigurable Computing 2012 (4-4). doi:10.1155/2012/219028. Online publication date: 1-Jan-2012
https://dl.acm.org/doi/10.1155/2012/219028
Mahdavikhah B, Mafi R, Sirouspour S, Nicolici N, Cheung P, Wawrzynek J (2010). Haptic rendering of deformable objects using a multiple FPGA parallel computing architecture. Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays (189-198). doi:10.1145/1723112.1723147. Online publication date: 21-Feb-2010
https://dl.acm.org/doi/10.1145/1723112.1723147
Dubois D, Dubois A, Boorman T, Connor C, Poole S (2010). Sparse Matrix-Vector Multiplication on a Reconfigurable Supercomputer with Application. ACM Transactions on Reconfigurable Technology and Systems 3:1 (1-31). doi:10.1145/1661438.1661440. Online publication date: 1-Jan-2010
https://dl.acm.org/doi/10.1145/1661438.1661440
Nagar K, Zhang Y, Bakos J, Kindratenko V, El-Ghazawi T (2009). An integrated reduction technique for a double precision accumulator. Proceedings of the Third International Workshop on High-Performance Reconfigurable Computing Technology and Applications (11-18). doi:10.1145/1646461.1646463. Online publication date: 15-Nov-2009
https://dl.acm.org/doi/10.1145/1646461.1646463
