A Flexible-Granularity Task Graph Representation and Its Generation from C Applications (WIP)

Open access
Published: 20 June 2024
DOI: 10.1145/3652032.3657580

Abstract

Modern hardware accelerators, such as FPGAs, allow large regions of C/C++ code to be offloaded in order to improve the execution time and/or energy consumption of software applications. An outstanding challenge with this approach, however, is solving the Hardware/Software (Hw/Sw) partitioning problem. Given the increasing complexity of both the accelerators and the potential code regions, one needs to adopt a holistic approach when selecting an offloading region, exploring the interplay between communication costs, data usage patterns, and target-specific optimizations. To this end, we propose representing a C application as an extended task graph (ETG) with flexible granularity, which can be manipulated through the merging and splitting of tasks. This approach involves generating a task graph overlay on the program's Abstract Syntax Tree (AST) that maps tasks to functions and the flexible-granularity operations to inlining/outlining operations. This overlay maintains the integrity and readability of the original source code, which is paramount for targeting different accelerators and enabling code optimizations, while allowing the offloading of code regions of arbitrary complexity based on the data patterns of their tasks. To evaluate the ETG representation and its compiler, we use the compiler to generate ETGs for the programs in the Rosetta and MachSuite benchmark suites and extract several metrics regarding data communication, task-level parallelism, and dataflow patterns between pairs of tasks. These metrics provide important information that can be used by Hw/Sw partitioning methods.
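The correspondence between tasks and functions described in the abstract can be illustrated with a minimal C sketch. It assumes a toy computation in which one coarse-grained task is split into two finer-grained tasks by outlining, and merging would be the reverse step (inlining). The function names `scale_and_sum`, `scale`, `sum`, and `scale_and_sum_split` are hypothetical and are not taken from the paper or its compiler.

```c
#include <stddef.h>

/* Coarse-grained task: a single function that performs two
 * logically distinct steps (hypothetical example). */
int scale_and_sum(const int *in, int *tmp, size_t n, int factor) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        tmp[i] = in[i] * factor;   /* step 1: scale each element */
    }
    for (size_t i = 0; i < n; i++) {
        total += tmp[i];           /* step 2: reduce to a sum */
    }
    return total;
}

/* Finer-grained task obtained by outlining step 1. */
void scale(const int *in, int *out, size_t n, int factor) {
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * factor;
    }
}

/* Finer-grained task obtained by outlining step 2. */
int sum(const int *in, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += in[i];
    }
    return total;
}

/* The original task after splitting: it now calls two smaller
 * tasks, and the buffer tmp is the data communicated between
 * them. Inlining both calls would merge them back into one task. */
int scale_and_sum_split(const int *in, int *tmp, size_t n, int factor) {
    scale(in, tmp, n, factor);
    return sum(tmp, n);
}
```

In this sketch, either `scale` or `sum` could be considered for offloading on its own after the split, and the size of `tmp` hints at the communication cost between the two tasks. Both versions compute the same result, so the change in granularity leaves program behavior unchanged.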

Published In

LCTES 2024: Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems
June 2024
182 pages
ISBN: 9798400706165
DOI: 10.1145/3652032
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. FPGA
  2. Hardware Accelerators
  3. Hardware/Software Partitioning
  4. Source-to-Source Compiler
  5. Task Graph

Qualifiers

  • Research-article

Funding Sources

  • Fundação para a Ciência e Tecnologia

Conference

LCTES '24

Acceptance Rates

Overall Acceptance Rate 116 of 438 submissions, 26%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 91
  • Downloads (Last 12 months): 91
  • Downloads (Last 6 weeks): 23

Reflects downloads up to 13 Nov 2024
