Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Tÿcho: A Framework for Compiling Stream Programs

Published: 14 December 2019 Publication History

Abstract

Many application areas for embedded systems, such as DSP, media coding, and image processing, are based on stream processing. Stream programs in these areas are often naturally described as graphs, where nodes are computational kernels that send data over the edges. This structure also exhibits large amounts of concurrency, because the kernels can execute independently as long as there are data to process on the edges. The explicit data dependencies also help making efficient sequential implementations of such programs, allowing programs to be more portable between platforms with various degrees of parallelism.
The kernels can be expressed in many different ways; for example, as imperative programs with read and write statements for the communication or as a set of actions that can be performed and conditions for when these actions can be executed. Traditionally, there has been a tension between how the kernels are expressed and how efficiently they can be implemented. There are very efficient implementation techniques for stream programs with restricted expressiveness, such as synchronous dataflow.
In this article, we present a framework for building stream program compilers that we call Tÿcho. At the core of this framework is a common kernel representation, based on a machine model for stream program kernels called actor machine, on which transformations and optimizations are performed. Both imperative and action-based kernels are translated to this common representation, making the same optimizations applicable to different kinds of kernels, and even across source language boundaries. An actor machine is described by the steps of execution that a kernel can take, and the conditions for taking them, together with a controller that decides how the conditions are tested and the steps are taken.
We outline how kernels of an imperative process language and an action-based language are decomposed and translated to the common kernel representation, and we describe a simple backend that generates sequential C code from this representation. We present optimization heuristics of the decision process in the controller that we evaluate using a few dozen kernels from a video decoder with various degrees of complexity. We also present kernel fusion, by merging the controllers of actor machines, as a way of scheduling kernels on the same processor, which we compare to prior art.

References

[1]
Marianne Baudinet and David MacQueen. 1985. Tree pattern matching for ML. (1985). http://www.smlnj.org/compiler-notes/85-note-baudinet.ps.
[2]
E. Bezati, M. Mattavelli, and J. W. Janneck. 2013. High-level synthesis of dataflow programs for signal processing systems. In Proceedings of the 2013 8th International Symposium on Image and Signal Processing and Analysis (ISPA’13). 750--754.
[3]
Greet Bilsen, Marc Engels, Rud Lauwereins, and Jean Peperstraete. 1996. Cycle-static dataflow. IEEE Trans. Sign. Process. 44, 2 (1996), 397--408.
[4]
Jani Boutellier, Johan Ersfolk, Johan Lilius, Marco Mattavelli, Ghislain Roquier, and Olli Silven. 2015. Actor merging for dataflow process networks. IEEE Trans. Sign. Process. 63, 10 (2015), 2496--2508.
[5]
J. T. Buck and E. A. Lee. 1993. Scheduling dynamic dataflow graphs with bounded memory using the token flow model. In Proceedings of the 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1. 429--432.
[6]
Gustav Cedersjö and Jörn W. Janneck. 2014. Software code generation for dynamic dataflow programs. In Proceedings of the 17th International Workshop on Software and Compilers for Embedded Systems. ACM, 31--39.
[7]
Gustav Cedersjö and Jörn W. Janneck. 2016. Processes and actors: Translating kahn processes to dataflow with firing. In Proceedings of the 2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS’16). IEEE, 21--30.
[8]
Gustav Cedersjö, Jörn W. Janneck, and Jonas Skeppstedt. 2014. Finding fast action selectors for dataflow actors. In Proceedings of the 2014 48th Asilomar Conference on Signals, Systems and Computers. IEEE, 1435--1439.
[9]
G. Cedersjö and J. W. Janneck. 2012. Toward efficient execution of dataflow actors. In Proceedings of the 2012 Conference Record of the 46th Asilomar Conference on Signals, Systems and Computers (ASILOMAR’12). 1465--1469.
[10]
G. Cedersjö and J. W. Janneck. 2013. Actor classification using actor machines. In Proceedings of the 2013 Conference on Signals, Systems and Computers (ASILOMAR’13). 1801--1804.
[11]
Jack B. Dennis. 1974. First version of a data flow procedure language. In Proceedings of the Programming Symposium. Springer, 362--376.
[12]
Stephen A. Edwards. 2003. Tutorial: Compiling concurrent languages for sequential processors. ACM Trans. Des. Autom. Electron. Syst. 8, 2 (Apr. 2003), 141--187.
[13]
Johan Eker and Jörn W. Janneck. 2003. CAL Language Report Specification of the CAL Actor Language. EECS Department, University of California, Berkeley. http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/4186.html.
[14]
Joachim Falk, Christian Haubelt, and Jürgen Teich. 2006. Efficient representation and simulation of model-based designs in systemc. In Proceedings of the Forum on Specification 8 Design Languages (FDL’06), Vol. 6.
[15]
Joachim Falk, Christian Zebelein, Joachim Keinert, Christian Haubelt, Juergen Teich, and Shuvra S. Bhattacharyya. 2010. Analysis of systemc actor networks for efficient synthesis. ACM Trans. Embed. Comput. Syste. 10, 2 (2010), 18.
[16]
Essayas Gebrewahid, Mingkun Yang, Gustav Cedersjö, Zain Ul Abdin, Veronica Gaspes, Jörn W. Janneck, and Bertil Svensson. 2014. Realizing efficient execution of dataflow actors on manycores. In Proceedings of the 2014 12th IEEE International Conference on Embedded and Ubiquitous Computing (EUC’14). IEEE, 321--328.
[17]
Ruirui Gu, Jörn W. Janneck, Mickaël Raulet, and Shuvra S. Bhattacharyya. 2011. Exploiting statically schedulable regions in dataflow programs. J. Sign. Process. Syst. 63, 1 (2011), 129--142.
[18]
Rajiv Gupta, Eduard Mehofer, and Youtao Zhang. 2002. Profile-guided compiler optimizations. In The Compiler Design Handbook: Optimizations and Machine Code Generation. CRC Press.
[19]
Chia-Jui Hsu, Fuat Keceli, Ming-Yung Ko, Shahrooz Shahparnia, and Shuvra S. Bhattacharyya. 2004. DIF: An interchange format for dataflow-based design tools. In Proceedings of the International Workshop on Embedded Computer Systems. Springer, 423--432.
[20]
Huynh Phung Huynh, Andrei Hagiescu, Weng-Fai Wong, and Rick Siow Mong Goh. 2012. Scalable framework for mapping streaming applications onto multi-GPU systems. SIGPLAN Not. 47, 8 (Feb. 2012), 1--10.
[21]
ISO/IEC 23001-4:2009 2009. Information Technology—MPEG Systems Technologies—Part 4: Codec Configuration Representation. Standard.
[22]
Jorn W. Janneck. 2011. A machine model for dataflow actors and its applications. In Proceedings of the 2011 Conference Record of the 45th Asilomar Conference on Signals, Systems and Computers (ASILOMAR’11). IEEE, 756--760.
[23]
Gilles Kahn. 1974. The semantics of a simple language for parallel programming. In Proceedings of the IFIP Congress on Information Processing’74, Vol. 74. 471--475.
[24]
Bart Kienhuis and Ed F. Deprettere. 2003. Modeling stream-based applications using the SBF model of computation. J. VLSI Sign. Process. Syst. Sign. Image Vid. Technol. 34, 3 (2003), 291--300.
[25]
Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis 8 transformation. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization. IEEE Computer Society, 75.
[26]
Edward Lee, David G. Messerschmitt, et al. 1987. Synchronous data flow. Proc. IEEE 75, 9 (1987), 1235--1245.
[27]
Edward A. Lee. 1997. A Denotational Semantics for Dataflow with Firing. Electronics Research Laboratory, College of Engineering, University of California. http://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/3167.html.
[28]
William Plishker, Nimish Sane, and Shuvra S. Bhattacharyya. 2009. A generalized scheduling approach for dynamic dataflow applications. In Proceedings of the Conference on Design, Automation and Test in Europe. European Design and Automation Association, 111--116.
[29]
Jonas Skeppstedt. 2016. The ASIM Power Architecture simulator.
[30]
Robert Soulé, Michael I. Gordon, Saman Amarasinghe, Robert Grimm, and Martin Hirzel. 2013. Dynamic expressivity with static optimization for streaming languages. In Proceedings of the 7th ACM International Conference on Distributed Event-based Systems. ACM, 159--170.
[31]
K. Strehl, L. Thiele, M. Gries, D. Ziegenbein, R. Ernst, and J. Teich. 2001. FunState-an internal design representation for codesign. IEEE Trans. VLSI Syst. 9, 4 (Aug. 2001), 524--544.
[32]
Sander Stuijk, Marc Geilen, and Twan Basten. 2006. SDF3: SDF for free. In Proceedings of the 6th International Conference on Application of Concurrency to System Design 2006 (ACSD’06). IEEE, 276--278.
[33]
William Thies, Michal Karczmarek, and Saman Amarasinghe. 2002. StreamIt: A language for streaming applications. In Proceedings of the International Conference on Compiler Construction. Springer, 179--196.
[34]
William Thies, Michal Karczmarek, Janis Sermulins, Rodric Rabbah, and Saman Amarasinghe. 2005. Teleport messaging for distributed stream programs. In Proceedings of the 10th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’05). ACM, New York, NY, 224--235.
[35]
Matthieu Wipliez, Ghislain Roquier, and Jean-François Nezan. 2011. Software code generation for the RVC-CAL language. J. Sign. Process. Syst. 63, 2 (2011), 203--213.
[36]
Herve Yviquel, Antoine Lorence, Khaled Jerbi, Gildas Cocherel, Alexandre Sanchez, and Mickael Raulet. 2013. Orcc: Multimedia development made easy. In Proceedings of the 21st ACM International Conference on Multimedia (MM’13). ACM, 863--866.

Cited By

View all
  • (2024)Hardware and Software Generation from Large Actor Machines in Streaming ApplicationsProceedings of the 39th ACM/SIGAPP Symposium on Applied Computing10.1145/3605098.3635930(142-150)Online publication date: 8-Apr-2024
  • (2023)Scalable Actor Networks with CALProceedings of the 21st ACM-IEEE International Conference on Formal Methods and Models for System Design10.1145/3610579.3611074(169-179)Online publication date: 21-Sep-2023
  • (2023)Design Space Exploration for Partitioning Dataflow Program on CPU-GPU Heterogeneous SystemJournal of Signal Processing Systems10.1007/s11265-023-01884-695:10(1219-1229)Online publication date: 1-Oct-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Embedded Computing Systems
ACM Transactions on Embedded Computing Systems  Volume 18, Issue 6
November 2019
266 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3372377
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

Publication History

Published: 14 December 2019
Accepted: 01 September 2019
Revised: 01 June 2018
Received: 01 March 2017
Published in TECS Volume 18, Issue 6

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Compiler
  2. Kahn process networks
  3. dataflow with firing
  4. stream program

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)27
  • Downloads (Last 6 weeks)3
Reflects downloads up to 14 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Hardware and Software Generation from Large Actor Machines in Streaming ApplicationsProceedings of the 39th ACM/SIGAPP Symposium on Applied Computing10.1145/3605098.3635930(142-150)Online publication date: 8-Apr-2024
  • (2023)Scalable Actor Networks with CALProceedings of the 21st ACM-IEEE International Conference on Formal Methods and Models for System Design10.1145/3610579.3611074(169-179)Online publication date: 21-Sep-2023
  • (2023)Design Space Exploration for Partitioning Dataflow Program on CPU-GPU Heterogeneous SystemJournal of Signal Processing Systems10.1007/s11265-023-01884-695:10(1219-1229)Online publication date: 1-Oct-2023
  • (2022)Dynamic SIMD Parallel Execution on GPU from High-Level Dataflow SynthesisJournal of Low Power Electronics and Applications10.3390/jlpea1203004012:3(40)Online publication date: 17-Jul-2022
  • (2022)Performance Estimation of High-Level Dataflow Program on Heterogeneous Platforms by Dynamic Network ExecutionJournal of Low Power Electronics and Applications10.3390/jlpea1203003612:3(36)Online publication date: 23-Jun-2022
  • (2022)Auto-Partitioning Heterogeneous Task-Parallel Programs with StreamBlocksProceedings of the International Conference on Parallel Architectures and Compilation Techniques10.1145/3559009.3569659(398-411)Online publication date: 8-Oct-2022
  • (2022)VR-PRUNE: Decidable Variable-Rate Dataflow for Signal Processing SystemsIEEE Transactions on Signal Processing10.1109/TSP.2022.316238870(1819-1833)Online publication date: 2022
  • (2021)Generating hardware and software for RISC-V cores generated with Rocket Chip generator2021 IEEE 34th International System-on-Chip Conference (SOCC)10.1109/SOCC52499.2021.9739411(89-94)Online publication date: 14-Sep-2021
  • (2021)Methodologies for Synthesizing and Analyzing Dynamic Dataflow Programs in Heterogeneous Systems for Edge ComputingIEEE Open Journal of Circuits and Systems10.1109/OJCAS.2021.31163422(769-781)Online publication date: 2021
  • (2021)Performance Estimation of High-Level Dataflow Program on Heterogeneous Platforms2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)10.1109/MCSoC51149.2021.00018(69-76)Online publication date: Dec-2021
  • Show More Cited By

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media