Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2370816.2370825acmconferencesArticle/Chapter ViewAbstractPublication PagespactConference Proceedingsconference-collections
research-article

Riposte: a trace-driven compiler and parallel VM for vector code in R

Published: 19 September 2012 Publication History

Abstract

There is a growing utilization gap between modern hardware and modern programming languages for data analysis.Due to power and other constraints, recent processor design has sought improved performance through increased SIMD and multi-core parallelism. At the same time, high-level, dynamically-typed languages for data analysis have become popular. These languages emphasize ease of use and high productivity, but have, in general, low performance and limited support for exploiting hardware parallelism.
In this paper, we describe Riposte, a new runtime for the R language, which bridges this gap. Riposte uses tracing, a technique commonly used to accelerate scalar code, to dynamically discover and extract sequences of vector operations from arbitrary R code. Once extracted, we can fuse traces to eliminate unnecessary memory traffic, compile them to use hardware SIMD units, and schedule them to run across multiple cores, allowing us to fully utilize the available parallelism on modern shared-memory machines. Our evaluation shows that Riposte can run vector R code near the speed of hand-optimized C, 5--50x faster than the open source implementation of R, and can also linearly scale to 32 cores for some tasks. Across 12 different workloads we achieve an overall average speed-up of over 150x without explicit programmer parallelization.

References

[1]
Google V8 Javascript engine. http://code.google.com/p/v8/.
[2]
The LuaJIT project. http://http://luajit.org/.
[3]
The Ra extension to R. http://www.milbo.users.sonic.net/ra/.
[4]
P. S. Abrams. An APL Machine. PhD thesis, Stanford Linear Accelerator Center, Stanford University, Stanford, CA, USA, 1970.
[5]
A. Aslam and L. Hendren. McFLAT: a profile-based framework for Matlab loop analysis and transformations. In Proceedings of the 23rd international conference on Languages and compilers for parallel computing, LCPC'10, pages 1--15, Berlin, Heidelberg, 2011. Springer-Verlag.
[6]
V. Bala, E. Duesterwald, and S. Banerjia. Dynamo: a transparent dynamic optimization system. In Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation, PLDI '00, pages 1--12, New York, NY, USA, 2000. ACM.
[7]
P. A. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-Pipelining Query Execution. In CIDR, pages 225--237, 2005.
[8]
S. Brunthaler. Inline caching meets quickening. In Proceedings of the 24th European conference on Object-oriented programming, ECOOP'10, pages 429--451, Berlin, Heidelberg, 2010. Springer-Verlag.
[9]
B. Catanzaro, M. Garland, and K. Keutzer. Copperhead: Compiling an embedded data parallel language. In Proceedings of the 16th ACM symposium on Principles and practice of parallel programming, PPoPP '11, pages 47--56, New York, NY, USA, 2011. ACM.
[10]
M. Chevalier-Boisvert, L. Hendren, and C. Verbrugge. Optimizing Matlab through just-in-time specialization. In R. Gupta, editor, Compiler Construction, volume 6011 of Lecture Notes in Computer Science, pages 46--65. Springer Berlin / Heidelberg, 2010. 10.1007/978-3-642-11970-5_4.
[11]
D. Coutts, R. Leshchinskiy, and D. Stewart. Stream fusion: From lists to streams to nothing at all. In Proceedings of the 12th ACM SIGPLAN international conference on Functional programming, ICFP '07, pages 315--326, New York, NY, USA, 2007. ACM.
[12]
A. Das, W. J. Dally, and P. Mattson. Compiling for stream processing. In Proceedings of the 15th international conference on Parallel architectures and compilation techniques, PACT '06, pages 33--42, New York, NY, USA, 2006. ACM.
[13]
A. Gal, B. Eich, M. Shaver, D. Anderson, D. Mandelin, M. R. Haghighat, B. Kaplan, G. Hoare, B. Zbarsky, J. Orendorff, J. Ruderman, E. W. Smith, R. Reitmaier, M. Bebenita, M. Chang, and M. Franz. Trace-based just-in-time type specialization for dynamic languages. In Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation, PLDI '09, pages 465--478, New York, NY, USA, 2009. ACM.
[14]
A. Gal, C. W. Probst, and M. Franz. HotpathVM: An effective JIT compiler for resource-constrained devices. In Proceedings of the 2nd international conference on Virtual execution environments, VEE '06, pages 144--153, New York, NY, USA, 2006. ACM.
[15]
V. Grover and Y. Lin. Compiling CUDA and other languages for GPUs. In GPU Technology Conference (GTC), 2012.
[16]
L. J. Guibas and D. K. Wyatt. Compilation and delayed evaluation in APL. In Proceedings of the 5th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, POPL '78, pages 1--8, New York, NY, USA, 1978. ACM.
[17]
G. Keller, M. M. Chakravarty, R. Leshchinskiy, S. Peyton Jones, and B. Lippmeier. Regular, shape-polymorphic, parallel arrays in Haskell. In Proceedings of the 15th ACM SIGPLAN international conference on Functional programming, ICFP '10, pages 261--272, New York, NY, USA, 2010. ACM.
[18]
N. Lameed and L. Hendren. Staged static techniques to efficiently implement array copy semantics in a Matlab JIT compiler. In Proceedings of the 20th international conference on Compiler construction: part of the joint European conferences on theory and practice of software, CC'11/ETAPS'11, pages 22--41, Berlin, Heidelberg, 2011. Springer-Verlag.
[19]
T. C. Miller. Tentative compilation: A design for an APL compiler. SIGAPL APL Quote Quad, 9:88--95, May 1979.
[20]
F. Morandat, B. Hill, L. Osvald, and J. Vitek. Evaluating the design of the R language. In ECOOP 2012 Object-Oriented Programming, Lecture Notes in Computer Science, 2012.
[21]
C. Newburn, B. So, Z. Liu, M. McCool, A. Ghuloum, S. Toit, Z. G. Wang, Z. H. Du, Y. Chen, G. Wu, P. Guo, Z. Liu, and D. Zhang. Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language. In Code Generation and Optimization (CGO) 2011, pages 224--235, April 2011.
[22]
M. Papakipos. The PeakStream platform: High productivity software development for multi-core processors. Technical report, 2006.
[23]
S. Peyton Jones. Harnessing the multicores: Nested data parallelism in Haskell. In Proceedings of the 6th Asian Symposium on Programming Languages and Systems, APLAS '08, pages 138--138, Berlin, Heidelberg, 2008. Springer-Verlag.
[24]
M. Pharr and W. R. Mark. ispc: A SPMD compiler for high-performance CPU programming. In Proceedings of the 2012 Innovative Parallel Computing: Foundations & Applications of GPU, Manycore, and Heterogeneous Systems, InPar '12, 2012.
[25]
R. Pike, S. Dorward, R. Griesemer, and S. Quinlan. Interpreting the data: Parallel analysis with Sawzall. Sci. Program., 13(4):277--298, Oct. 2005.
[26]
M. Poletto and V. Sarkar. Linear scan register allocation. ACM Trans. Program. Lang. Syst., 21(5):895--913, Sept. 1999.
[27]
R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2011. ISBN 3-900051-07-0.
[28]
A. R. Runnalls and C. A. Silles. CXXR: An ideas hatchery for future R development. In Proceedings of the 2011 Joint Statistical Meetings (JSM), 2011.
[29]
M. Schmidberger, M. Morgan, D. Eddelbuettel, H. Yu, L. Tierney, and U. Mansmann. State of the art in parallel computing with R. Journal of Statistical Software, 31(1):1--27, 8 2009.
[30]
L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan. Larrabee: A many-core x86 architecture for visual computing. In ACM SIGGRAPH 2008 papers, SIGGRAPH '08, pages 18:1--18:15, New York, NY, USA, 2008. ACM.
[31]
L. Tierney. Code analysis and parallelizing vector operations in R. Computational Statistics, 24:217--223, 2009. 10.1007/s00180-008-0117-9.
[32]
L. Tierney. A byte code compiler for R. Technical report, 2012.
[33]
A. Tzannes, G. C. Caragea, R. Barua, and U. Vishkin. Lazy binary-splitting: A run-time adaptive work-stealing scheduler. In Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '10, pages 179--190, New York, NY, USA, 2010. ACM.
[34]
D. Wentzlaff and A. Agarwal. Factored operating systems (FOS): The case for a scalable operating system for multicores. SIGOPS Oper. Syst. Rev., 43(2):76--85, Apr. 2009.
[35]
M. Wolfe. More iteration space tiling. In Proceedings of the 1989 ACM/IEEE conference on Supercomputing, Supercomputing '89, pages 655--664, New York, NY, USA, 1989. ACM.
[36]
Y. Ye, K. A. Ross, and N. Vesdapunt. Scalable aggregation on multicore processors. In Proceedings of the Seventh International Workshop on Data Management on New Hardware, DaMoN '11, pages 1--9, New York, NY, USA, 2011. ACM.
[37]
Y. Zhao, M. Hategan, B. Clifford, I. Foster, G. von Laszewski, V. Nefedova, I. Raicu, T. Stef-Praun, and M. Wilde. Swift: Fast, reliable, loosely coupled parallel computation. In Services, 2007 IEEE Congress on, pages 199--206, July 2007.

Cited By

View all
  • (2023)Reusing Just-in-Time Compiled CodeProceedings of the ACM on Programming Languages10.1145/36228397:OOPSLA2(1176-1197)Online publication date: 16-Oct-2023
  • (2023)Artificial Intelligence Application for Effective Customer Relationship Management2023 International Conference on Computer Communication and Informatics (ICCCI)10.1109/ICCCI56745.2023.10128360(1-7)Online publication date: 23-Jan-2023
  • (2022)Data Management and Visual Information Processing in Financial Organization using Machine Learning2022 International Conference on Augmented Intelligence and Sustainable Systems (ICAISS)10.1109/ICAISS55157.2022.10011053(467-473)Online publication date: 24-Nov-2022
  • Show More Cited By

Index Terms

  1. Riposte: a trace-driven compiler and parallel VM for vector code in R

        Recommendations

        Comments

        Please enable JavaScript to view thecomments powered by Disqus.

        Information & Contributors

        Information

        Published In

        cover image ACM Conferences
        PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques
        September 2012
        512 pages
        ISBN:9781450311823
        DOI:10.1145/2370816
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Sponsors

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 19 September 2012

        Permissions

        Request permissions for this article.

        Check for updates

        Author Tags

        1. data parallel
        2. just-in-time compilation
        3. r language
        4. tracing

        Qualifiers

        • Research-article

        Conference

        PACT '12
        Sponsor:
        • IFIP WG 10.3
        • SIGARCH
        • IEEE CS TCPP
        • IEEE CS TCAA

        Acceptance Rates

        Overall Acceptance Rate 121 of 471 submissions, 26%

        Contributors

        Other Metrics

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months)6
        • Downloads (Last 6 weeks)2
        Reflects downloads up to 18 Nov 2024

        Other Metrics

        Citations

        Cited By

        View all
        • (2023)Reusing Just-in-Time Compiled CodeProceedings of the ACM on Programming Languages10.1145/36228397:OOPSLA2(1176-1197)Online publication date: 16-Oct-2023
        • (2023)Artificial Intelligence Application for Effective Customer Relationship Management2023 International Conference on Computer Communication and Informatics (ICCCI)10.1109/ICCCI56745.2023.10128360(1-7)Online publication date: 23-Jan-2023
        • (2022)Data Management and Visual Information Processing in Financial Organization using Machine Learning2022 International Conference on Augmented Intelligence and Sustainable Systems (ICAISS)10.1109/ICAISS55157.2022.10011053(467-473)Online publication date: 24-Nov-2022
        • (2021)Run-time data analysis to drive compiler optimizationsCompanion Proceedings of the 2021 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity10.1145/3484271.3484974(9-12)Online publication date: 17-Oct-2021
        • (2020)Contextual dispatch for function specializationProceedings of the ACM on Programming Languages10.1145/34282884:OOPSLA(1-24)Online publication date: 13-Nov-2020
        • (2019)Reflections on the compatibility, performance, and scalability of parallel PythonProceedings of the 15th ACM SIGPLAN International Symposium on Dynamic Languages10.1145/3359619.3359747(91-103)Online publication date: 20-Oct-2019
        • (2019)R melts brains: an IR for first-class environments and lazy effectful argumentsProceedings of the 15th ACM SIGPLAN International Symposium on Dynamic Languages10.1145/3359619.3359744(55-66)Online publication date: 20-Oct-2019
        • (2018)Automatic Hierarchical Parallelization of Linear RecurrencesACM SIGPLAN Notices10.1145/3296957.317316853:2(128-138)Online publication date: 19-Mar-2018
        • (2018)Julia: dynamism and performance reconciled by designProceedings of the ACM on Programming Languages10.1145/32764902:OOPSLA(1-23)Online publication date: 24-Oct-2018
        • (2018)FlashRACM SIGPLAN Notices10.1145/3200691.317850153:1(183-194)Online publication date: 10-Feb-2018
        • Show More Cited By

        View Options

        Login options

        View options

        PDF

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        Media

        Figures

        Other

        Tables

        Share

        Share

        Share this Publication link

        Share on social media