research-article

Architecture for dense matrix multiplication on a high-performance reconfigurable system

Authors:

Viviane L. S. de Souza,

Victor W. C. de Medeiros,

Manoel E. de Lima

SBCCI '09: Proceedings of the 22nd Annual Symposium on Integrated Circuits and System Design: Chip on the Dunes

Article No.: 42, Pages 1 - 6

https://doi.org/10.1145/1601896.1601950

Published: 31 August 2009

Abstract

The recent evolution of programmable logic devices such as FPGAs (Field Programmable Gate Arrays), together with the growing demand for performance improvements in scientific computing applications, has attracted the attention of supercomputer vendors. They have been developing hybrid platforms that link general-purpose processors with FPGA-based co-processors, aiming to accelerate computation.

In this work we present the analysis and development of an architecture for an important scientific computing operation, matrix multiplication, targeting the commercial hybrid platform RASC (Reconfigurable Application-Specific Computing), developed by Silicon Graphics.

The proposed architecture aims to reach better performance than conventional architectures while dissipating less power. To achieve this goal, we investigated the opportunities for parallel implementation and data reuse that are intrinsic to the algorithm. Based on this investigation, we propose a case study that uses the resources available in the target platform to exploit these features.
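
The parallelism and data reuse mentioned above can be illustrated in software with a tiled (blocked) matrix multiplication. This is only a minimal sketch and not the architecture proposed in the paper: the function name `blocked_matmul` and the `block` parameter are hypothetical, and the tile size merely stands in for whatever on-chip storage a hardware design might provide.

```python
def blocked_matmul(a, b, block=2):
    """Multiply dense matrices a (n x k) and b (k x m), given as lists of lists.

    Illustrative sketch only: the tiling below is an assumption chosen to show
    data reuse, not the scheme used by the paper's hardware.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):          # tile of rows of C
        for j0 in range(0, m, block):      # tile of columns of C
            for k0 in range(0, k, block):  # tile along the shared dimension
                for i in range(i0, min(i0 + block, n)):
                    for p in range(k0, min(k0 + block, k)):
                        a_ip = a[i][p]     # loaded once, reused across the tile's columns
                        for j in range(j0, min(j0 + block, m)):
                            # Each c[i][j] accumulates independently of the others,
                            # which is what allows many multiply-accumulates in parallel.
                            c[i][j] += a_ip * b[p][j]
    return c


result = blocked_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Each element of `a` loaded into `a_ip` is reused for every column of the current tile, and the updates to distinct `c[i][j]` entries do not depend on one another. These are the two properties of the algorithm that a parallel design can exploit.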

Cited By

Holanda B, Pimentel R, Barbosa J, Camarotti R, Silva-Filho A, Joao L, Souza V, Ferraz J, Lima M (2011). An FPGA-Based Accelerator to Speed-Up Matrix Multiplication of Floating Point Operations. Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, pp. 306-309. doi:10.1109/IPDPS.2011.165. Online publication date: 16-May-2011.
https://dl.acm.org/doi/10.1109/IPDPS.2011.165
do A. Ferreira A, da S. Barros E (2010). A high performance full pipelined arquitecture of MLP Neural Networks in FPGA. 2010 17th IEEE International Conference on Electronics, Circuits and Systems, pp. 742-745. doi:10.1109/ICECS.2010.5724619. Online publication date: Dec-2010.
https://doi.org/10.1109/ICECS.2010.5724619

Information & Contributors

Information

Published In

SBCCI '09: Proceedings of the 22nd Annual Symposium on Integrated Circuits and System Design: Chip on the Dunes

August 2009

325 pages

ISBN:9781605587059

DOI:10.1145/1601896

General Chair: Ivan Saraiva (UFRN, Brazil)

Program Chairs: Renato Perez Ribas (UFRGS, Brazil), Calvin Plett (Carleton University, Canada)

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SBC: Brazilian Computer Society
SIGDA: ACM Special Interest Group on Design Automation
SBMICRO: Brazilian Microelectronics Society
IEEE Circuits & Systems Society

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SBCCI '09

Sponsor:

SBC
SIGDA
SBMICRO

SBCCI '09: 22nd Symposium on Integrated Circuits and System Design

August 31 - September 3, 2009

Natal, Brazil

Acceptance Rates

SBCCI '09 Paper Acceptance Rate 50 of 119 submissions, 42%;

Overall Acceptance Rate 133 of 347 submissions, 38%

Bibliometrics

Total Citations: 2
Total Downloads: 135
Downloads (Last 12 months): 3
Downloads (Last 6 weeks): 0

Reflects downloads up to 03 Mar 2025
