Efficient datapath merging for the overhead reduction of run-time reconfigurable systems

Published: 01 February 2012

Abstract

High FPGA reconfiguration latency is a major source of overhead in run-time reconfigurable systems. This overhead can be reduced by merging multiple data flow graphs, each representing a different kernel of the original program, into a single (merged) datapath that needs to be configured less often than separate datapaths would. However, the additional hardware introduced by this technique increases the kernels' execution time. In this paper, we present a novel datapath merging technique that reduces both the configuration and execution times of kernels mapped on the reconfigurable fabric. Experimental results show up to 13% reduction in the configuration and execution times of kernels from MediaBench workloads, compared to prior art on datapath merging. When compared to conventional high-level synthesis algorithms, our proposal reduces kernel configuration and execution times by up to 48%.
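
The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's technique: it assumes a merged datapath is configured once and that its extra hardware slows every kernel invocation by a fixed factor. The names `Kernel`, `separate_cost`, `merged_cost`, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    config_time: float  # time to load this kernel's own datapath (hypothetical units)
    exec_time: float    # time for one invocation on its own datapath


def separate_cost(trace, kernels):
    """Total time with one datapath per kernel: reconfigure whenever the kernel changes."""
    total = 0.0
    loaded = None
    for name in trace:
        kernel = kernels[name]
        if name != loaded:
            total += kernel.config_time
            loaded = name
        total += kernel.exec_time
    return total


def merged_cost(trace, kernels, merged_config_time, slowdown):
    """Total time with one merged datapath: configured once, every invocation slowed
    by an assumed constant factor that stands in for the added hardware."""
    return merged_config_time + sum(kernels[name].exec_time * slowdown for name in trace)


# Hypothetical kernels and parameters, not taken from the paper.
kernels = {
    "A": Kernel("A", config_time=10.0, exec_time=2.0),
    "B": Kernel("B", config_time=10.0, exec_time=3.0),
}

alternating = ["A", "B", "A", "B"]  # kernels alternate, so separate datapaths reconfigure each time
repeated = ["A", "A", "A", "A"]     # one kernel only, so a separate datapath is configured once

print(separate_cost(alternating, kernels))           # 50.0
print(merged_cost(alternating, kernels, 15.0, 1.25)) # 27.5
print(separate_cost(repeated, kernels))              # 18.0
print(merged_cost(repeated, kernels, 15.0, 1.25))    # 25.0
```

Under these assumed numbers, merging pays off when kernels alternate and reconfiguration dominates, and loses when a single kernel runs repeatedly and only the execution slowdown remains. That execution-time penalty is the cost the abstract says the proposed technique aims to reduce.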

Published In

The Journal of Supercomputing  Volume 59, Issue 2
February 2012
532 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Datapath merging
  2. Reconfigurable computing
  3. Run-time reconfigurable systems

Qualifiers

  • Article

Cited By

  • (2023) A fast MILP solver for high-level synthesis based on heuristic model reduction and enhanced branch and bound algorithm. The Journal of Supercomputing 79:11 (12042-12073). DOI: 10.1007/s11227-023-05109-2. Online publication date: 6-Mar-2023
  • (2021) A symbiosis between population based incremental learning and LP-relaxation based parallel genetic algorithm for solving integer linear programming models. Computing 105:5 (1121-1139). DOI: 10.1007/s00607-021-01004-x. Online publication date: 3-Sep-2021
  • (2018) ADAM. Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (189-198). DOI: 10.1145/3174243.3174247. Online publication date: 15-Feb-2018
  • (2015) A fast placement algorithm for embedded just-in-time reconfigurable extensible processing platform. The Journal of Supercomputing 71:1 (121-143). DOI: 10.1007/s11227-014-1290-y. Online publication date: 1-Jan-2015
  • (2013) Thermal-aware datapath merging for coarse-grained reconfigurable processors. Proceedings of the Conference on Design, Automation and Test in Europe (1649-1654). DOI: 10.5555/2485288.2485679. Online publication date: 18-Mar-2013
