TBES: Template-Based Exploration and Synthesis of Heterogeneous Multiprocessor Architectures on FPGA

Published: 13 January 2016

Abstract

This article describes TBES, an end-to-end software environment for synthesizing multitask applications on FPGAs. The implementation follows a template-based approach to creating heterogeneous multiprocessor architectures. Heterogeneity stems from the use of general-purpose processors along with custom accelerators. Experimental results demonstrate substantial speedup for several classes of applications.
Furthermore, this work reduces development cost and development time for the software architect, the domain expert, and the optimization expert. It provides a framework that brings together various existing tools and optimization algorithms. The advantages are manifold: modularity and flexibility, easy customization for best-fit algorithm selection, durability and evolution over time, and legacy preservation, including domain experts' know-how.
In addition to the use of architecture templates for the overall system, a second contribution lies in using high-level synthesis (HLS) to promote exploration of hardware IPs. The domain expert, who best knows which tasks are good candidates for hardware implementation, selects parts of the initial application to be potentially synthesized as dedicated accelerators. As a consequence, the general HLS problem becomes a constrained and more tractable one, and automation capabilities eliminate the need for tedious and error-prone manual processes during domain space exploration.
The automation only takes place once the application has been broken down into concurrent tasks by the designer, who can then drive the synthesis process with a set of parameters provided by TBES to balance tradeoffs between optimization efforts and quality of results.
The approach is demonstrated step by step, up to FPGA implementation and execution, with an MJPEG benchmark and a complex Viola-Jones face detection application. We show that TBES makes it possible to achieve speedups of up to 10 times, to reduce development time, and to widen design space exploration.
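
As a rough illustration of why the choice of tasks to offload matters, the sketch below applies Amdahl's law to a hypothetical task profile. It is not part of TBES: the function name `overall_speedup`, the runtime fractions, and the accelerator gain are all assumptions made for this example, and the abstract does not state how its reported speedups were obtained.

```python
def overall_speedup(accelerated_fraction: float, accelerator_gain: float) -> float:
    """Amdahl's law: overall speedup when a fraction of the runtime is sped up.

    accelerated_fraction: share of the original runtime spent in the tasks
        moved to hardware accelerators (hypothetical value, between 0 and 1).
    accelerator_gain: how many times faster those tasks run once accelerated
        (hypothetical value, greater than 0).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if accelerator_gain <= 0.0:
        raise ValueError("accelerator_gain must be positive")
    remaining = 1.0 - accelerated_fraction            # part left on the processors
    accelerated = accelerated_fraction / accelerator_gain  # part run in hardware
    return 1.0 / (remaining + accelerated)


if __name__ == "__main__":
    # Hypothetical profiles: the share of runtime offloaded to accelerators.
    for fraction in (0.50, 0.90, 0.95):
        gain = overall_speedup(fraction, accelerator_gain=50.0)
        print(f"offloaded fraction {fraction:.2f}: overall speedup {gain:.2f}x")
```

Under these assumptions, the overall speedup is bounded by the portion of the application that stays in software, which is one reason to let the domain expert pick the hardware candidates before synthesis.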

    Published In

    ACM Transactions on Embedded Computing Systems  Volume 15, Issue 1
    February 2016
    530 pages
    ISSN:1539-9087
    EISSN:1558-3465
    DOI:10.1145/2872313

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 January 2016
    Accepted: 01 August 2015
    Revised: 01 July 2015
    Received: 01 January 2015
    Published in TECS Volume 15, Issue 1

    Author Tags

    1. Electronic system level
    2. high-level synthesis
    3. multiprocessor
    4. system-on-chip

    Qualifiers

    • Research-article
    • Research
    • Refereed
