Implementation-Aware Model Analysis: The Case of Buffer-Throughput Tradeoff in Streaming Applications

Authors:

Kamyar Mirzazad Barijough,

Volodymyr Khibin,

Soheil Ghiasi

LCTES'15: Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems 2015 CD-ROM

Article No.: 11, Pages 1 - 10

https://doi.org/10.1145/2670529.2754968

Published: 04 June 2015

Abstract

Models of computation abstract away a number of implementation details in favor of well-defined semantics. While this has unquestionable benefits, we argue that analysis of models solely based on operational semantics (implementation-oblivious analysis) is unfit to drive implementation design space exploration. Specifically, we study the tradeoff between buffer size and streaming throughput in applications modeled as synchronous data flow (SDF) graphs. We demonstrate the inherent inaccuracy of the implementation-oblivious approach, which considers only SDF operational semantics. We propose a rigorous transformation that equips the state-of-the-art buffer-throughput tradeoff analysis technique with implementation awareness. Extensive empirical evaluation shows that our approach results in significantly more accurate estimates of streaming throughput at the model level, while running two orders of magnitude faster than cycle-accurate simulation of implementations.

Published In

LCTES'15: Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems 2015 CD-ROM

June 2015

149 pages

ISBN: 9781450332576

DOI: 10.1145/2670529

General Chair: Sam H. Noh, Hongik University, Republic of Korea

Program Chairs: Sebastian Fischmeister, University of Waterloo, Canada; Jason Xue, City University of Hong Kong, China

ACM SIGPLAN Notices, Volume 50, Issue 5 (LCTES '15)

May 2015

141 pages

ISSN: 0362-1340

EISSN: 1558-1160

DOI: 10.1145/2808704

Editor: Andy Gill, University of Kansas, Lawrence, KS

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

LCTES'15: SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems 2015

June 18 - 19, 2015

Portland, OR, USA

Acceptance Rates

Overall Acceptance Rate 116 of 438 submissions, 26%
