
Performance and Area Modeling of Complete FPGA Designs in the Presence of Loop Transformations

Published: 01 November 2004

Abstract

Selecting which program transformations to apply when mapping computations to FPGA-based computing architectures can lead to prohibitively long design space exploration cycles. An alternative is to develop fast yet accurate performance and area models that quickly expose the impact and interaction of the transformations. In this paper, we present a combined analytical performance and area modeling approach for complete FPGA designs in the presence of loop transformations. Our approach takes into account the impact of input/output memory bandwidth and memory interface resources, which are often the limiting factors in the effective implementation of computations. Our preliminary results show that the modeling is very accurate and is therefore suitable for use in a compiler tool that quickly explores very large design spaces.
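
To make the idea of a combined performance and area model concrete, the following is a minimal sketch and not the model from the paper. It assumes a single loop, a fully pipelined body that retires one unrolled iteration group per cycle, a fixed external memory bandwidth, and area that grows linearly with the unroll factor. All names (`LoopKernel`, `Platform`, `estimate`, `best_unroll`) and all parameters are hypothetical.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class LoopKernel:
    iterations: int       # total loop iterations
    ops_per_iter: int     # operators instantiated per iteration of the body
    words_per_iter: int   # memory words read or written per iteration
    area_per_op: int      # assumed area units (e.g. slices) per operator


@dataclass
class Platform:
    words_per_cycle: int  # assumed aggregate external memory bandwidth
    interface_area: int   # assumed fixed area of the memory interface
    total_area: int       # area budget of the device


def estimate(kernel, platform, unroll):
    """Return (cycles, area) for one unroll factor under the toy model."""
    # Compute bound: one group of `unroll` iterations completes per cycle.
    compute_cycles = ceil(kernel.iterations / unroll)
    # Memory bound: all words must cross the memory interface.
    memory_cycles = ceil(
        kernel.iterations * kernel.words_per_iter / platform.words_per_cycle
    )
    cycles = max(compute_cycles, memory_cycles)
    # Area: fixed interface cost plus one copy of the body per unrolled iteration.
    area = platform.interface_area + unroll * kernel.ops_per_iter * kernel.area_per_op
    return cycles, area


def best_unroll(kernel, platform, factors):
    """Pick the feasible factor with the fewest cycles, then the least area."""
    candidates = []
    for u in factors:
        cycles, area = estimate(kernel, platform, u)
        if area <= platform.total_area:
            candidates.append((cycles, area, u))
    return min(candidates) if candidates else None
```

Under these assumptions, unrolling stops paying off once the memory bound dominates: larger factors add area without reducing the cycle count, so the selection falls back to the smallest design that reaches the memory-bound cycle count.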



Published In

IEEE Transactions on Computers, Volume 53, Issue 11
November 2004
144 pages

Publisher

IEEE Computer Society

United States


Author Tags

  1. FPGAs
  2. Field-Programmable Gate Arrays (FPGAs)
  3. performance analysis and modeling
  4. configurable computing
  5. loop transformations and high-level synthesis

Qualifiers

  • Research-article


Cited By

  • (2019) Loop Unrolling for Energy Efficiency in Low-Cost Field-Programmable Gate Arrays. ACM Transactions on Reconfigurable Technology and Systems 11:4 (1-23). DOI: 10.1145/3289186. Online publication date: 21-Jan-2019
  • (2019) SpExSim: assessing kernel suitability for C-based high-level hardware synthesis. The Journal of Supercomputing 75:8 (4062-4077). DOI: 10.1007/s11227-017-2101-z. Online publication date: 1-Aug-2019
  • (2019) Type-Driven Automated Program Transformations and Cost Modelling for Optimising Streaming Programs on FPGAs. International Journal of Parallel Programming 47:1 (114-136). DOI: 10.1007/s10766-018-0572-z. Online publication date: 1-Feb-2019
  • (2017) HLscope+. Proceedings of the 36th International Conference on Computer-Aided Design (691-698). DOI: 10.5555/3199700.3199792. Online publication date: 13-Nov-2017
  • (2017) HLScope+: Fast and accurate performance estimation for FPGA HLS. 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (691-698). DOI: 10.1109/ICCAD.2017.8203844. Online publication date: 13-Nov-2017
  • (2016) A Lost Cycles Analysis for Performance Prediction using High-Level Synthesis. Proceedings of the 12th International Symposium on Applied Reconfigurable Computing - Volume 9625 (334-342). DOI: 10.1007/978-3-319-30481-6_28. Online publication date: 22-Mar-2016
  • (2013) Performance modeling for FPGAs. International Journal of Reconfigurable Computing 2013 (7-7). DOI: 10.1155/2013/428078. Online publication date: 1-Jan-2013
  • (2008) Achieving programming model abstractions for reconfigurable computing. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 16:1 (34-44). DOI: 10.1109/TVLSI.2007.912106. Online publication date: 1-Jan-2008
  • (2007) Partial data reuse for windowing computations. Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications (97-109). DOI: 10.5555/1764631.1764644. Online publication date: 27-Mar-2007
  • (2007) A Novel Application-specific Instruction-set Processor Design Approach for Video Processing Acceleration. Journal of VLSI Signal Processing Systems 47:3 (297-315). DOI: 10.1007/s11265-007-0050-0. Online publication date: 1-Jun-2007
