DOI: 10.1145/3174243.3174255

Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs

Published: 15 February 2018

Abstract

Modern high-level synthesis (HLS) tools greatly reduce the turn-around time of designing and implementing complex FPGA-based accelerators. They also expose various optimization opportunities that cannot be easily explored at the register-transfer level. With the increasing adoption of the HLS design methodology and continued advances in synthesis optimization, there is a growing need for realistic benchmarks to (1) facilitate comparisons between tools, (2) evaluate and stress-test new synthesis techniques, and (3) establish meaningful performance baselines to track the progress of HLS technology. While several HLS benchmark suites already exist, they consist primarily of small textbook-style function kernels rather than complete and complex applications. To address this limitation, we introduce Rosetta, a realistic benchmark suite for software programmable FPGAs. Designs in Rosetta are fully developed applications. They are associated with realistic performance constraints and optimized with advanced features of modern HLS tools. We believe that Rosetta is not only useful for the HLS research community but can also serve as a set of design tutorials for non-expert HLS users. In this paper, we describe the characteristics of our benchmarks and the optimization techniques applied to them. We further report experimental results on an embedded FPGA device as well as a cloud FPGA platform.




Published In

FPGA '18: Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
February 2018
310 pages
ISBN:9781450356145
DOI:10.1145/3174243

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. benchmarking
  2. fpga
  3. heterogeneous computing
  4. high-level synthesis
  5. reconfigurable computing

Qualifiers

  • Research-article

Conference

FPGA '18

Acceptance Rates

FPGA '18 Paper Acceptance Rate 10 of 116 submissions, 9%;
Overall Acceptance Rate 125 of 627 submissions, 20%
