Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2660193.2660236acmconferencesArticle/Chapter ViewAbstractPublication PagessplashConference Proceedingsconference-collections
research-article
Open access

StreamJIT: a commensal compiler for high-performance stream programming

Published: 15 October 2014 Publication History

Abstract

There are many domain libraries, but despite the performance benefits of compilation, domain-specific languages are comparatively rare due to the high cost of implementing an optimizing compiler. We propose commensal compilation, a new strategy for compiling embedded domain-specific languages by reusing the massive investment in modern language virtual machine platforms. Commensal compilers use the host language's front-end, use host platform APIs that enable back-end optimizations by the host platform JIT, and use an autotuner for optimization selection. The cost of implementing a commensal compiler is only the cost of implementing the domain-specific optimizations. We demonstrate the concept by implementing a commensal compiler for the stream programming language StreamJIT atop the Java platform. Our compiler achieves performance 2.8 times better than the StreamIt native code (via GCC) compiler with considerably less implementation effort.

References

[1]
Expression Trees. URL http://msdn.microsoft.com/en-us/library/bb397951%28v=vs.110%29.aspx.
[2]
Emitting Dynamic Methods and Assemblies. URL http://msdn.microsoft.com/en-us/library/8ffc3x75%28v=vs.110%29.aspx.
[3]
E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users' Guide. Society for Industrial and Applied Mathematics, Philadelphia, PA, third edition, 1999. ISBN 0-89871-447-8 (paperback).
[4]
J. Ansel, C. Chan, Y. L. Wong, M. Olszewski, Q. Zhao, A. Edelman, and S. Amarasinghe. PetaBricks: A Language and Compiler for Algorithmic Choice. In Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '09, pages 38--49, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-392-1.
[5]
J. Ansel, M. Pacula, Y. L. Wong, C. Chan, M. Olszewski, U.-M. O'Reilly, and S. Amarasinghe. SiblingRivalry: Online Autotuning Through Local Competitions. In Proceedings of the 2012 International Conference on Compilers, Architectures and Synthesis for Embedded Systems, CASES '12, pages 91--100, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1424-4.
[6]
J. Ansel, S. Kamil, K. Veeramachaneni, U.-M. O'Reilly, and S. Amarasinghe. OpenTuner: An Extensible Framework for Program Autotuning. 2013. URL http://hdl.handle.net/1721.1/81958.
[7]
ASM. ASM bytecode transformation library. http://asm.ow2.org/.
[8]
J. Auerbach, D. F. Bacon, P. Cheng, and R. Rabbah. Lime: A Java-compatible and Synthesizable Language for Heterogeneous Architectures. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA '10, pages 89--108, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0203-6.
[9]
J. Auerbach, D. F. Bacon, I. Burcea, P. Cheng, S. J. Fink, R. Rabbah, and S. Shukla. A Compiler and Runtime for Heterogeneous Computing. In Proceedings of the 49th Annual Design Automation Conference, DAC '12, pages 271--276, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1199-1.
[10]
E. Axelsson, K. Claessen, G. Devai, Z. Horvath, K. Keijzer, B. Lyckegrd, A. Persson, M. Sheeran, J. Svenningsson, and A. Vajda. Feldspar: A domain specific language for digital signal processing algorithms. In Formal Methods and Models for Codesign (MEMOCODE), 2010 8th IEEE/ACM International Conference on, pages 169--178, July 2010.
[11]
S. Bharadwaj. invokedynamic and Jython. In JVM Language Summit 2011, JVMLS 2011, 2011.
[12]
K. Brown, A. Sujeeth, H. J. Lee, T. Rompf, H. Chafi, M. Odersky, and K. Olukotun. A Heterogeneous Parallel Framework for Domain-Specific Languages. In Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on, pages 89--100, Oct 2011.
[13]
W. Dietl, S. Dietzel, M. D. Ernst, K. Muşlu, and T. Schiller. Building and using pluggable type-checkers. In ICSE'11, Proceedings of the 33rd International Conference on Software Engineering, pages 681--690, Waikiki, Hawaii, USA, May 25-27, 2011.
[14]
M. Frigo and S. Johnson. FFTW: an adaptive software architecture for the FFT. In Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, volume 3, pages 1381--1384 vol.3, May 1998.
[15]
M. Garcia. Runtime metaprogramming via java.lang.invoke.MethodHandle. May 2012. URL http://lamp.epfl.ch/~magarcia/ScalaCompilerCornerReloaded/2012Q2/RuntimeMP.pdf.
[16]
M. Gordon. Compiler Techniques for Scalable Performance of Stream Programs on Multicore Architectures. PhD thesis, Massachusetts Institute of Technology, 2010.
[17]
D. Heidinga. MethodHandle Implementation Tips and Tricks. In JVM Language Summit 2011, JVMLS 2011, 2011.
[18]
ImageMagick. ImageMagick. URL http://www.imagemagick.org/.
[19]
M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: Distributed Data-parallel Programs from Sequential Building Blocks. In Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, EuroSys '07, pages 59--72, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-636-3.
[20]
M. Karczmarek. Constrained and Phased Scheduling of Synchronous Data Flow Graphs for StreamIt Language. PhD thesis, Massachusetts Institute of Technology, 2002.
[21]
A. Kulkarni and R. R. Newton. Embrace, Defend, Extend: A Methodology for Embedding Preexisting DSLs. In Proceedings of the 1st Annual Workshop on Functional Programming Concepts in Domain-specific Languages, FPCDSL '13, pages 27--34, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2380-2.
[22]
E. Meijer, B. Beckman, and G. Bierman. LINQ: Reconciling Object, Relations and XML in the .NET Framework. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, SIGMOD '06, pages 706--706, New York, NY, USA, 2006. ACM. ISBN 1-59593-434-0.
[23]
C. Nutter. Adding invokedynamic Support to JRuby. In JVM Language Summit 2011, JVMLS 2011, 2011.
[24]
F. Otto, V. Pankratius, and W. Tichy. High-level Multicore Programming with XJava. In Software Engineering - Companion Volume, 2009. ICSE-Companion 2009. 31st International Conference on, pages 319--322, May 2009.
[25]
F. Otto, V. Pankratius, and W. Tichy. XJava: Exploiting Parallelism with Object-Oriented Stream Programming. In H. Sips, D. Epema, and H.-X. Lin, editors, Euro-Par 2009 Parallel Processing, volume 5704 of Lecture Notes in Computer Science, pages 875--886. Springer Berlin Heidelberg, 2009. ISBN 978-3-642-03868-6.
[26]
F. Otto, C. Schaefer, M. Dempe, and W. Tichy. A Language-Based Tuning Mechanism for Task and Pipeline Parallelism. In P. DAmbra, M. Guarracino, and D. Talia, editors, Euro-Par 2010 - Parallel Processing, volume 6272 of Lecture Notes in Computer Science, pages 328--340. Springer Berlin Heidelberg, 2010. ISBN 978-3-642-15290-0.
[27]
J. Ponge and F. L. Mouël. JooFlux: Hijacking Java 7 InvokeDynamic To Support Live Code Modifications. CoRR, abs/1210.1039, 2012.
[28]
J. Ragan-Kelley, C. Barnes, A. Adams, S. Paris, F. Durand, and S. Amarasinghe. Halide: A Language and Compiler for Optimizing Parallelism, Locality, and Recomputation in Image Processing Pipelines. In ACM SIGPLAN Conference on Programming Language Design and Implementation, Seattle, WA, June 2013.
[29]
B. R. Rau and C. D. Glaeser. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proceedings of the 14th Annual Workshop on Microprogramming, MICRO 14, pages 183--198, Piscataway, NJ, USA, 1981. IEEE Press.
[30]
T. Rompf and M. Odersky. Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs. In Proceedings of the Ninth International Conference on Generative Programming and Component Engineering, GPCE '10, pages 127--136, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0154-1.
[31]
J. Rose. Bytecodes meet combinators: invokedynamic on the JVM. In Proceedings of the Third Workshop on Virtual Machines and Intermediate Languages, page 2. ACM, 2009.
[32]
J. H. Spring, J. Privat, R. Guerraoui, and J. Vitek. Stream-Flex: High-throughput Stream Programming in Java. In Proceedings of the 22Nd Annual ACM SIGPLAN Conference on Object-oriented Programming Systems and Applications, OOPSLA '07, pages 211--228, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-786-5.
[33]
A. K. Sujeeth, A. Gibbons, K. J. Brown, H. Lee, T. Rompf, M. Odersky, and K. Olukotun. Forge: Generating a High Performance DSL Implementation from a Declarative Specification. In Proceedings of the 12th International Conference on Generative Programming: Concepts & Experiences, GPCE '13, pages 145--154, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2373-4.
[34]
C. Thalinger and J. Rose. Optimizing Invokedynamic. In Proceedings of the 8th International Conference on the Principles and Practice of Programming in Java, PPPJ '10, pages 1--9, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0269-2.
[35]
The Java Tutorials. Aggregate Operations. URL http://docs.oracle.com/javase/tutorial/collections/streams/index.html.
[36]
The Java Tutorials. Type Annotations and Pluggable Type Systems. URL http://docs.oracle.com/javase/tutorial/java/annotations/type_annotations.html.
[37]
W. Thies. Language and Compiler Support for Stream Programs. PhD thesis, Massachusetts Institute of Technology, 2009.
[38]
W. Thies and S. Amarasinghe. An Empirical Characterization of Stream Programs and Its Implications for Language and Compiler Design. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT '10, pages 365--376, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0178-7.
[39]
A. Tiwari, C. Chen, J. Chame, M. Hall, and J. Hollingsworth. A Scalable Auto-tuning Framework for Compiler Optimization. In Parallel Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on, pages 1--12, May 2009.
[40]
TypeToken. TypeToken. URL https://code.google.com/p/guava-libraries/wiki/ReflectionExplained#TypeToken.
[41]
Z. Wang and M. F. P. O'boyle. Using Machine Learning to Partition Streaming Programs. ACM Trans. Archit. Code Optim., 10(3):20:1--20:25, Sept. 2008. ISSN 1544-3566.
[42]
R. C. Whaley and J. J. Dongarra. Automatically Tuned Linear Algebra Software. In Proceedings of the 1998 ACM/IEEE Conference on Supercomputing, Supercomputing '98, pages 1--27, Washington, DC, USA, 1998. IEEE Computer Society. ISBN 0-89791-984-X.
[43]
C. Wimmer and T. Wüurthinger. Truffle: A Self-optimizing Runtime System. In Proceedings of the 3rd Annual Conference on Systems, Programming, and Applications: Software for Humanity, SPLASH '12, pages 13--14, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1563-0.
[44]
T. Würthinger. Extending the Graal Compiler to Optimize Libraries. In Proceedings of the ACM International Conference Companion on Object Oriented Programming Systems Languages and Applications Companion, SPLASH '11, pages 41--42, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0942-4.
[45]
J. Xiong, J. Johnson, R. Johnson, and D. Padua. SPL: A Language and Compiler for DSP Algorithms. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation, PLDI '01, pages 298--308, New York, NY, USA, 2001. ACM. ISBN 1-58113-414-2.
[46]
Y. Yu, M. Isard, D. Fetterly, M. Budiu, U. Erlingsson, P. K. Gunda, and J. Currey. DryadLINQ: A System for General-purpose Distributed Data-parallel Computing Using a High-level Language. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI'08, pages 1--14, Berkeley, CA, USA, 2008. USENIX Association.
[47]
M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster Computing with Working Sets. In Proceedings of the 2Nd USENIX Conference on Hot Topics in Cloud Computing, HotCloud'10, pages 10--10, Berkeley, CA, USA, 2010. USENIX Association.

Cited By

View all
  • (2017)UpsortableProceedings of the VLDB Endowment10.14778/3137765.313779710:12(1873-1876)Online publication date: 1-Aug-2017
  • (2017)SPLACM Transactions on Programming Languages and Systems10.1145/303920739:1(1-39)Online publication date: 6-Mar-2017
  • (2017)Generalize or Die: Operating Systems Support for Memristor-Based Accelerators2017 IEEE International Conference on Rebooting Computing (ICRC)10.1109/ICRC.2017.8123649(1-8)Online publication date: Nov-2017
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications
October 2014
946 pages
ISBN:9781450325851
DOI:10.1145/2660193
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 49, Issue 10
    OOPSLA '14
    October 2014
    907 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2714064
    • Editor:
    • Andy Gill
    Issue’s Table of Contents
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 October 2014

Check for updates

Author Tags

  1. domain-specific languages
  2. embedded domain-specific languages

Qualifiers

  • Research-article

Funding Sources

Conference

SPLASH '14
Sponsor:

Acceptance Rates

OOPSLA '14 Paper Acceptance Rate 52 of 186 submissions, 28%;
Overall Acceptance Rate 268 of 1,244 submissions, 22%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)159
  • Downloads (Last 6 weeks)20
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2017)UpsortableProceedings of the VLDB Endowment10.14778/3137765.313779710:12(1873-1876)Online publication date: 1-Aug-2017
  • (2017)SPLACM Transactions on Programming Languages and Systems10.1145/303920739:1(1-39)Online publication date: 6-Mar-2017
  • (2017)Generalize or Die: Operating Systems Support for Memristor-Based Accelerators2017 IEEE International Conference on Rebooting Computing (ICRC)10.1109/ICRC.2017.8123649(1-8)Online publication date: Nov-2017
  • (2017)Autotuning CUDA compiler parameters for heterogeneous applications using the OpenTuner frameworkConcurrency and Computation: Practice and Experience10.1002/cpe.397329:22Online publication date: 6-Mar-2017
  • (2016)RaftLib: A C++ template library for high performance stream parallel processingThe International Journal of High Performance Computing Applications10.1177/109434201667254231:5(391-404)Online publication date: 19-Oct-2016
  • (2015)LaminarIR: compile-time queues for structured streamsACM SIGPLAN Notices10.1145/2813885.273799450:6(121-130)Online publication date: 3-Jun-2015
  • (2015)LaminarIR: compile-time queues for structured streamsProceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/2737924.2737994(121-130)Online publication date: 3-Jun-2015
  • (2015)RaftLibProceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores10.1145/2712386.2712400(96-105)Online publication date: 7-Feb-2015
  • (2015)A Backend Extension Mechanism for PQL/Java with Free Run-Time OptimisationCompiler Construction10.1007/978-3-662-46663-6_6(111-130)Online publication date: 2015
  • (2017)SPLACM Transactions on Programming Languages and Systems10.1145/303920739:1(1-39)Online publication date: 6-Mar-2017
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media