Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Dataflow Graph Partitioning for Area-Efficient High-Level Synthesis with Systems Perspective

Published: 18 November 2014 Publication History

Abstract

Area efficiency in datapath synthesis is a widely accepted goal of high-level synthesis. Applications represented by their dataflow graphs are synthesized using resource sharing principles to reduce the area. However, existing resource sharing algorithms focus on absolute area reduction and maximal resource sharing. This kind of a design approach leads to constraints on how often, in terms of number of clock cycles, a new set of input data can be fed to an application. It also leads to very large multiplexers in case of very big dataflow graphs with hundreds of nodes. An adaptive dataflow graph partitioning algorithm is proposed that partitions a graph taking into account a user-defined constraint on how often a new set of input data (generally referred to as data initiation interval) is available. At the same time, a resource sharing algorithm is applied to such partitions in order to reduce area. Multiple design points are generated for a given dataflow graph with different area and time measures to enable a designer to make decisions. We demonstrate our graph partitioning algorithm using synthetically generated large dataflow graphs and on some benchmark applications.

References

[1]
Calypto Catapult. 2012. Catapult product family. http://www.calypto.com/products/catapult/overview.
[2]
A. Canis, J. Choi, M. Aldham, V. Zhang, A. Kammoona, J. H. Anderson, and S. Brown. 2011. LegUp: High-level synthesis for FPGA-based processor/accelerator systems. In Proceedings of the 19th ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA'11). ACM Press, New York, 33--36.
[3]
J. M. P. Cardoso. 2001. Novel algorithm combining temporal partitioning and sharing of functional units. In Proceedings of the 9th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'01). IEEE, 31--40.
[4]
D. Chen and J. Cong. 2004. Register binding and port assignment for multiplexer optimization. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC'04). IEEE, 68--73.
[5]
J. Cong and J. Xu. 2008. Simultaneous FU and register binding based on network flow method. In Proceedings of the Design, Automation, and Test in Europe Conference (DATE'08). ACM Press, New York, 1057--1062.
[6]
T. Cormen, C. Leiserson, R. Rivest, and C. Stein. 2001. Introduction to Algorithms 2nd Ed. The MIT Press.
[7]
U. Dhawan, S. Sinha, S. K. Lam, and T. Srikanthan. 2010. Extended compatibility path based hardware binding algorithm for area-time efficient designs. In Proceedings of the 2nd Asia Symposium on Quality Electronics Design (ASQED'10). IEEE, 151--156.
[8]
Express DFG Benchmarks. 2014. Express specifications. http://express.ece.ucsb.edu/benchmark.
[9]
D. Gajski, A. Wu, N. Dutt, and S. Lin., 1992. High Level Synthesis: Introduction to Chip and System Design . Kluwer Academic.
[10]
R. Gupta, S. Gupta, N. D. Dutt, and A. Nicolau. 2004. SPARK: A Parallelizing Approach to the High Level Synthesis of Digital Circuits. Springer.
[11]
C.-Y. Huang, Y.-S. Chen, Y.-L. Lin, and Y.-C. Hsu. 1990. Datapath allocation based on bipartite weighted matching. In Proceedings of the 27th ACM/IEEE Design Automation Conference (DAC'90). ACM Press, New York, 499--504.
[12]
M. Inagi, Y. Takashima, Y. Nakamura, and Y. Kajitani. 2006. A performance driven circuit bipartitioning algorithm for multi-FPGA implementation with time-multiplexed I/Os. In Proceedings of the IEEE International Conference on Field Programmable Technology (FPT'06). 361--364.
[13]
Y. C. Jiang and J. F. Wang. 2007. Temporal dataflow graphs for dynamically reconfigurable computing. IEEE Trans. VLSI 15, 12, 1351--1361.
[14]
T. Kim and X. Liu. 2007.Compatibility path based binding algorithm for interconnect reduction in high level synthesis. In Proceedings of the IEEE/ACM International Conference on Computer Aided Design (ICCAD'07). ACM Press, New York, 435--441.
[15]
F. J. Kurdahi and A. C. Parker. 1987. Real: A program for register allocation. In Proceedings of the 24th ACM/IEEE Design Automation Conference (DAC'87). ACM Press, New York, 210--215.
[16]
C. Lee, M. Potkonjak, and W. Mangione-Smith. 1997. MediaBench: A tool for evaluating and synthesizing multimedia and communication systems. In Proceedings of the 30th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'97). IEEE, 330--335.
[17]
R. Mohan, A. Dasu, and S. Pamchnathan. 2001. Temporal partitioning of circuits for advanced partially reconfigurable systems. Proc. SPIE 4525, 27--35.
[18]
Netbeans IDE 7.1.2. 2013. https://netbeans.org/features/index.html.
[19]
B. Pangrle. 1988. Splicer: A heuristic approach to connectivity binding. In Proceedings of the 25th ACM/IEEE Design Automation Conference (DAC'88). ACM Press, New York, 536--541.
[20]
K. M. G. Purna and D. Bhatia. 1999. Temporal partitioning and scheduling dataflow graphs for reconfigurable computers. IEEE Trans. Comput. 48, 6, 579--590.
[21]
Z. Roncheng, T. Jiarong, and T. Pushan. 1998. FPART: A multi-way fpga partitioning procedure based on the improved FM algorithm. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC'98). IEEE, 513--518.
[22]
B. C. Schaffer. 2013. Automatic partitioning of behavioral descriptions for high level synthesis with multiple internal throughputs. In Proceedings of the Electronic System Level Synthesis Conference (ESLsyn'13). IEEE, 1--6.
[23]
M. Sivaraman and S. Aditya. 2002. Cycle-time aware architecture synthesis of custom hardware accelerators. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis of Embedded Systems (CASES'02). ACM Press, New York, 35--42.
[24]
V. Srinivasan, V. Govindarajan, and R. Vemuri. 2001. Fine grained and coarse grained behavioral partitioning with effective utilization of memory and design space exploration for multi-FPGA architectures. IEEE Trans. VLSI 9, 1, 140--158.
[25]
A. Trimaran. 2007. Research compiler specifications. www.trimaran.org.
[26]
C.-J. Tseng and D. Siewiorek. 1986. Automated synthesis of datapath in digital systems. IEEE Trans. Comput.-Aided Des. Integr. Circ. Syst. 5, 3, 379--395.
[27]
Vivadohls/Xilinx Autoesl. 2012. Vivado hls (formerly autoesl) specifications. http://www.xilinx.com/products/design-tools/vivado/index.htm.
[28]
C. Yin, S. Yin, L. Liu, and S. Wei. 2009. Temporal partitioning algorithm for a coarse-grained reconfigurable computing architecture. In Proceedings of the 12th International Symposium on Integrated Circuits (ISIC'09). IEEE, 659--662.
[29]
D. C. Zaretsky, G. Mittal, R. P. Dick, and P. Banerjee. 2007. Balanced scheduling and operation chaining in high-level synthesis for FPGA designs. In Proceedings of the 8th International Symposium on Quality Electronic Design (ISQED '07). IEEE, 595--601.

Cited By

View all
  • (2022)Reuse-Aware Partitioning of Dataflow Graphs on a Tightly-Coupled CGRA2022 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00065(458-467)Online publication date: Dec-2022
  • (2019)Performance Modeling and Directives Optimization for High Level Synthesis on FPGAIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2019.2912916(1-1)Online publication date: 2019
  • (2017)COMBAProceedings of the 36th International Conference on Computer-Aided Design10.5555/3199700.3199757(430-437)Online publication date: 13-Nov-2017
  • Show More Cited By

Index Terms

  1. Dataflow Graph Partitioning for Area-Efficient High-Level Synthesis with Systems Perspective

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Design Automation of Electronic Systems
    ACM Transactions on Design Automation of Electronic Systems  Volume 20, Issue 1
    November 2014
    377 pages
    ISSN:1084-4309
    EISSN:1557-7309
    DOI:10.1145/2690851
    Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Journal Family

    Publication History

    Published: 18 November 2014
    Accepted: 01 July 2014
    Revised: 01 September 2013
    Received: 01 November 2012
    Published in TODAES Volume 20, Issue 1

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Automatic synthesis
    2. FPGA
    3. computer-aided design
    4. datapath synthesis
    5. design-space exploration
    6. high-level synthesis

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)7
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 16 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2022)Reuse-Aware Partitioning of Dataflow Graphs on a Tightly-Coupled CGRA2022 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00065(458-467)Online publication date: Dec-2022
    • (2019)Performance Modeling and Directives Optimization for High Level Synthesis on FPGAIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2019.2912916(1-1)Online publication date: 2019
    • (2017)COMBAProceedings of the 36th International Conference on Computer-Aided Design10.5555/3199700.3199757(430-437)Online publication date: 13-Nov-2017
    • (2017)A Bitwidth-Aware High-Level Synthesis Algorithm Using Operation Chainings for Tiled-DR ArchitecturesIEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences10.1587/transfun.E100.A.2911E100.A:12(2911-2924)Online publication date: 2017
    • (2017)COMBA: A comprehensive model-based analysis framework for high level synthesis of real applications2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)10.1109/ICCAD.2017.8203809(430-437)Online publication date: Nov-2017
    • (2017)LP-HLS: Automatic power-intent generation for high-level synthesis based hardware implementation flowMicroprocessors and Microsystems10.1016/j.micpro.2017.02.00250(26-38)Online publication date: May-2017

    View Options

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media