DOI: 10.1145/3075564.3075568
research-article

Stream Drive: A Dynamic Dataflow Framework For Clustered Embedded Architectures

Published: 15 May 2017

Abstract

In this paper, we present StreamDrive, a dynamic dataflow framework for programming clustered embedded multicore architectures. StreamDrive simplifies the development of dynamic dataflow applications starting from sequential reference C code and allows seamless handling of heterogeneous and application-specific processing elements at the application level. We address the efficient implementation of a dynamic dataflow runtime system in constrained embedded environments, an issue that previous research has not sufficiently addressed. We conducted a detailed performance evaluation of the StreamDrive implementation on our Application Specific Multiprocessor (ASMP) cluster using the Oriented FAST and Rotated BRIEF (ORB) algorithm, which is typical of the image processing domain. Our implementation has less than 10% parallelization overhead and near-linear speed-up as the number of processors increases from 1 to 8. It achieves 15 VGA frames per second with a small cluster configuration of 4 processing elements and 64KB of shared memory, and 30 VGA frames per second with 8 processors and 128KB of shared memory.
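
The two throughput figures quoted above can be checked against the near-linear speed-up claim with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, VGA is assumed to mean 640x480 pixels, and the comparison is not a pure processor-scaling measurement because the shared memory also doubles (64KB to 128KB) between the two configurations.

```python
# Figures quoted in the abstract: processing elements -> VGA frames per second.
REPORTED = {4: 15.0, 8: 30.0}

# Assumption: "VGA" refers to the usual 640x480 resolution.
VGA_PIXELS = 640 * 480


def scaling_ratio(pe_a, pe_b, results=REPORTED):
    """Measured throughput gain from pe_a to pe_b, divided by the ideal linear gain.

    A value of 1.0 means throughput grew exactly in proportion to the
    number of processing elements.
    """
    throughput_gain = results[pe_b] / results[pe_a]
    ideal_gain = pe_b / pe_a
    return throughput_gain / ideal_gain


def fps_per_pe(pe, results=REPORTED):
    """Frames per second contributed by each processing element."""
    return results[pe] / pe


def pixels_per_second(pe, results=REPORTED):
    """Pixel throughput implied by the reported frame rate."""
    return results[pe] * VGA_PIXELS


if __name__ == "__main__":
    print("scaling ratio 4 -> 8 PEs:", scaling_ratio(4, 8))
    for pe in sorted(REPORTED):
        print(pe, "PEs:", fps_per_pe(pe), "fps per PE,",
              pixels_per_second(pe), "pixels/s")
```

Under these assumptions, both configurations work out to the same per-element frame rate, which is consistent with the near-linear scaling the abstract reports between 4 and 8 processing elements.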

Cited By

  • (2019) StreamDrive. Journal of Signal Processing Systems 91:3-4 (275-301). DOI: 10.1007/s11265-018-1351-1. Online publication date: 1-Mar-2019
  • (2018) Embedded Runtime for Reconfigurable Dataflow Graphs on Manycore Architectures. Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (51-56). DOI: 10.1145/3183767.3183780. Online publication date: 23-Jan-2018

      Published In

      CF'17: Proceedings of the Computing Frontiers Conference
      May 2017
      450 pages
      ISBN:9781450344876
      DOI:10.1145/3075564
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. dataflow
      2. embedded multicore
      3. image processing
      4. programming model

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      CF '17
      CF '17: Computing Frontiers Conference
      May 15 - 17, 2017
      Siena, Italy

      Acceptance Rates

      CF'17 Paper Acceptance Rate 43 of 87 submissions, 49%;
      Overall Acceptance Rate 273 of 785 submissions, 35%
