Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3617651.3622994acmconferencesArticle/Chapter ViewAbstractPublication PagesisstaConference Proceedingsconference-collections
research-article

Diagnosing Compiler Performance by Comparing Optimization Decisions

Published: 19 October 2023 Publication History

Abstract

Modern compilers apply a set of optimization passes aiming to speed up the generated code. The combined effect of individual optimizations is difficult to predict. Thus, changes to a compiler's code may hinder the performance of generated code as an unintended consequence.
Performance regressions in compiled code are often related to misapplied optimizations. The regressions are hard to investigate, considering the vast number of compilation units and applied optimizations. A compilation unit consists of a root method and inlined methods. Thus, a method may be part of several compilation units and may be optimized differently in each. Moreover, inlining decisions are not invariant across runs of the virtual machine (VM).
We propose to solve the problem of diagnosing performance regressions by capturing the compiler's optimization decisions. We do so by representing the applied optimization phases, optimization decisions, and inlining decisions in the form of trees. This paper introduces an approach utilizing tree edit distance (TED) to detect optimization differences in a semi-automated way. We present an approach to compare optimization decisions in differently inlined methods. We employ these techniques to pinpoint the causes of performance problems in various benchmarks of the Graal compiler.

References

[1]
Milad Abdullah, Lubomír Bulej, Tomáš Bureš, Petr Hnětynka, Vojtěch Horký, and Petr Tůma. 2022. Reducing Experiment Costs in Automated Software Performance Regression Detection. In 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). 56–59. https://doi.org/10.1109/SEAA56994.2022.00017
[2]
Gergö Barany. 2018. Finding Missed Compiler Optimizations by Differential Testing. CC 2018. Association for Computing Machinery, New York, NY, USA. 82–92. isbn:9781450356442 https://doi.org/10.1145/3178372.3179521
[3]
Edd Barrett, Carl Friedrich Bolz-Tereick, Rebecca Killick, Sarah Mount, and Laurence Tratt. 2017. Virtual Machine Warmup Blows Hot and Cold. Proc. ACM Program. Lang., 1, OOPSLA (2017), Article 52, 10, 27 pages. https://doi.org/10.1145/3133876
[4]
Philip Bille. 2005. A survey on tree edit distance and related problems. Theoretical Computer Science, 337, 1 (2005), 217–239. issn:0304-3975 https://doi.org/10.1016/j.tcs.2004.12.030
[5]
Stephen M. Blackburn, Robin Garner, Chris Hoffmann, Asjad M. Khang, Kathryn S. McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z. Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, J. Eliot B. Moss, Aashish Phansalkar, Darko Stefanović, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. 2006. The DaCapo Benchmarks: Java Benchmarking Development and Analysis. SIGPLAN Not., 41, 10 (2006), 10, 169–190. issn:0362-1340 https://doi.org/10.1145/1167515.1167488
[6]
Lubomír Bulej, François Farquet, Vojtěch Horký, and Petr Tůma. 2021. Tracking Performance of Graal on Public Benchmarks. Presentation at International Workshop on Load Testing and Benchmarking of Software Systems (LTB) 2021. https://doi.org/10.6084/m9.figshare.14447823
[7]
Sudarshan S. Chawathe, Anand Rajaraman, Hector Garcia-Molina, and Jennifer Widom. 1996. Change Detection in Hierarchically Structured Information. SIGMOD Rec., 25, 2 (1996), 6, 493–504. issn:0163-5808 https://doi.org/10.1145/235968.233366
[8]
Sudarshan S. Chawathe, Anand Rajaraman, Hector Garcia-Molina, and Jennifer Widom. 1996. Change Detection in Hierarchically Structured Information. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data (SIGMOD ’96). Association for Computing Machinery, New York, NY, USA. 493–504. isbn:0897917944 https://doi.org/10.1145/233269.233366
[9]
Nullstone Corporation. 2012. NULLSTONE for Java. http://www.nullstone.com/htmls/ns-java.htm
[10]
Charlie Curtsinger and Emery D. Berger. 2015. Coz: Finding Code That Counts with Causal Profiling. In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP ’15). Association for Computing Machinery, New York, NY, USA. 184–197. isbn:9781450338349 https://doi.org/10.1145/2815400.2815409
[11]
Luca Della Toffola, Michael Pradel, and Thomas R. Gross. 2015. Performance Problems You Can Fix: A Dynamic Analysis of Memoization Opportunities. SIGPLAN Not., 50, 10 (2015), 10, 607–622. issn:0362-1340 https://doi.org/10.1145/2858965.2814290
[12]
Gilles Duboscq, Lukas Stadler, Thomas Wuerthinger, Doug Simon, Christian Wimmer, and Hanspeter Mössenböck. 2013. Graal IR: An Extensible Declarative Intermediate Representation. In Proceedings of the Asia-Pacific Programming Languages and Compilers Workshop. http://ssw.jku.at/General/Staff/GD/APPLC-2013-paper_12.pdf
[13]
Gilles Duboscq, Thomas Würthinger, Lukas Stadler, Christian Wimmer, Doug Simon, and Hanspeter Mössenböck. 2013. An Intermediate Representation for Speculative Optimizations in a Dynamic Compiler. In Proceedings of the 7th ACM Workshop on Virtual Machines and Intermediate Languages (VMIL ’13). Association for Computing Machinery, New York, NY, USA. 1–10. isbn:9781450326018 https://doi.org/10.1145/2542142.2542143
[14]
Andy Georges, Dries Buytaert, and Lieven Eeckhout. 2007. Statistically Rigorous Java Performance Evaluation. In Proceedings of the 22nd Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA ’07). Association for Computing Machinery, New York, NY, USA. 57–76. isbn:9781595937865 https://doi.org/10.1145/1297027.1297033
[15]
Andy Georges, Lieven Eeckhout, and Dries Buytaert. 2008. Java Performance Evaluation through Rigorous Replay Compilation. In Proceedings of the 23rd ACM SIGPLAN Conference on Object-Oriented Programming Systems Languages and Applications (OOPSLA ’08). Association for Computing Machinery, New York, NY, USA. 367–384. isbn:9781605582153 https://doi.org/10.1145/1449764.1449794
[16]
Xue Han, Tingting Yu, and Gongjun Yan. 2023. A systematic mapping study of software performance research. Software: Practice and Experience, 53, 5 (2023), 1249–1270. https://doi.org/10.1002/spe.3185
[17]
Atsushi Hashimoto and Nagisa Ishiura. 2016. Detecting Arithmetic Optimization Opportunities for C Compilers by Randomly Generated Equivalent Programs. IPSJ Transactions on System and LSI Design Methodology, 9 (2016), 21–29. https://doi.org/10.2197/ipsjtsldm.9.21
[18]
Urs Hölzle, Craig Chambers, and David Ungar. 1992. Debugging Optimized Code with Dynamic Deoptimization. In Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation (PLDI ’92). Association for Computing Machinery, New York, NY, USA. 32–43. isbn:0897914759 https://doi.org/10.1145/143095.143114
[19]
Urs Hölzle and David Ungar. 1994. Optimizing Dynamically-Dispatched Calls with Run-Time Type Feedback. SIGPLAN Not., 29, 6 (1994), 6, 326–336. issn:0362-1340 https://doi.org/10.1145/773473.178478
[20]
Xianglong Huang, Stephen M. Blackburn, Kathryn S. McKinley, J Eliot B. Moss, Zhenlin Wang, and Perry Cheng. 2004. The Garbage Collection Advantage: Improving Program Locality. SIGPLAN Not., 39, 10 (2004), 10, 69–80. issn:0362-1340 https://doi.org/10.1145/1035292.1028983
[21]
Kota Kitaura and Nagisa Ishiura. 2018. Random Testing of Compilers’ Performance Based on Mixed Static and Dynamic Code Comparison. In Proceedings of the 9th ACM SIGSOFT International Workshop on Automating TEST Case Design, Selection, and Evaluation (A-TEST 2018). Association for Computing Machinery, New York, NY, USA. 38–44. isbn:9781450360531 https://doi.org/10.1145/3278186.3278192
[22]
Thomas Kotzmann, Christian Wimmer, Hanspeter Mössenböck, Thomas Rodriguez, Kenneth Russell, and David Cox. 2008. Design of the Java HotSpot ™ Client Compiler for Java 6. ACM Trans. Archit. Code Optim., 5, 1 (2008), Article 7, 5, 32 pages. issn:1544-3566 https://doi.org/10.1145/1369396.1370017
[23]
David Leopoldseder, Lukas Stadler, Thomas Würthinger, Josef Eisl, Doug Simon, and Hanspeter Mössenböck. 2018. Dominance-Based Duplication Simulation (DBDS): Code Duplication to Enable Compiler Optimizations. In Proceedings of the 2018 International Symposium on Code Generation and Optimization (CGO 2018). Association for Computing Machinery, New York, NY, USA. 126–137. isbn:9781450356176 https://doi.org/10.1145/3168811
[24]
Raphael Mosaner, David Leopoldseder, Wolfgang Kisling, Lukas Stadler, and Hanspeter Mössenböck. 2022. Machine-Learning-Based Self-Optimizing Compiler Heuristics. In Proceedings of the 19th International Conference on Managed Programming Languages and Runtimes (MPLR ’22). Association for Computing Machinery, New York, NY, USA. 98–111. isbn:9781450396967 https://doi.org/10.1145/3546918.3546921
[25]
Tipp Moseley, Dirk Grunwald, and Ramesh Peri. 2009. OptiScope: Performance Accountability for Optimizing Compilers. In 2009 International Symposium on Code Generation and Optimization. 254–264. https://doi.org/10.1109/CGO.2009.26
[26]
Adrian Nistor, Po-Chun Chang, Cosmin Radoi, and Shan Lu. 2015. Caramel: Detecting and Fixing Performance Problems That Have Non-Intrusive Fixes. In Proceedings of the 37th International Conference on Software Engineering - Volume 1 (ICSE ’15). IEEE Press, 902–912. isbn:9781479919345
[27]
Kazunori Ogata, Tamiya Onodera, Kiyokuni Kawachiya, Hideaki Komatsu, and Toshio Nakatani. 2006. Replay Compilation: Improving Debuggability of a Just-in-Time Compiler. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA ’06). Association for Computing Machinery, New York, NY, USA. 241–252. isbn:1595933484 https://doi.org/10.1145/1167473.1167493
[28]
Oracle. 2019. Replay Compilation in HotSpot JVM. https://github.com/openjdk/jdk/blob/master/src/hotspot/share/ci/ciReplay.hpp
[29]
Oracle. 2023. Graal Compiler. https://www.graalvm.org/latest/reference-manual/java/compiler/
[30]
Oracle. 2023. GraalVM. https://www.graalvm.org/
[31]
Oracle. 2023. Profile Replay in GraalVM. https://github.com/oracle/graal/blob/master/compiler/src/jdk.internal.vm.compiler/src/org/graalvm/compiler/hotspot/ProfileReplaySupport.java
[32]
Oracle. 2023. proftool. https://github.com/graalvm/mx/blob/master/README-proftool.md
[33]
Michael Paleczny, Christopher Vick, and Cliff Click. 2001. The Java HotSpot ™ Server Compiler. In Proceedings of the 2001 Symposium on Java ™ Virtual Machine Research and Technology Symposium - Volume 1 (JVM’01). USENIX Association, USA. 1.
[34]
Aleksandar Prokopec, Gilles Duboscq, David Leopoldseder, and Thomas Wïrthinger. 2019. An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers. In 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). 164–179. https://doi.org/10.1109/CGO.2019.8661171
[35]
Aleksandar Prokopec, Andrea Rosà, David Leopoldseder, Gilles Duboscq, Petr Tůma, Martin Studener, Lubomír Bulej, Yudi Zheng, Alex Villazón, Doug Simon, Thomas Würthinger, and Walter Binder. 2019. Renaissance: Benchmarking Suite for Parallel Applications on the JVM. In Proc. 40th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 17. https://doi.org/10.1145/3314221.3314637
[36]
John Rose. 2013. MethodData. https://wiki.openjdk.org/display/HotSpot/MethodData
[37]
Narendran Sachindran and J. Eliot B. Moss. 2003. Mark-Copy: Fast Copying GC with Less Space Overhead. SIGPLAN Not., 38, 11 (2003), 10, 326–343. issn:0362-1340 https://doi.org/10.1145/949343.949335
[38]
Stanley M. Selkow. 1977. The tree-to-tree editing problem. Inform. Process. Lett., 6, 6 (1977), 184–186. issn:0020-0190 https://doi.org/10.1016/0020-0190(77)90064-3
[39]
Linhai Song and Shan Lu. 2017. Performance Diagnosis for Inefficient Loops. In 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). 370–380. https://doi.org/10.1109/ICSE.2017.41
[40]
Lukas Stadler, Gilles Duboscq, Hanspeter Mössenböck, Thomas Würthinger, and Doug Simon. 2013. An Experimental Study of the Influence of Dynamic Compiler Optimizations on Scala Performance. In Proceedings of the 4th Workshop on Scala (SCALA ’13). Association for Computing Machinery, New York, NY, USA. Article 9, 8 pages. isbn:9781450320641 https://doi.org/10.1145/2489837.2489846
[41]
Lukas Stadler, Thomas Würthinger, and Hanspeter Mössenböck. 2018. Partial Escape Analysis and Scalar Replacement for Java. In Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO ’14). Association for Computing Machinery, New York, NY, USA. 165–174. isbn:9781450326704 https://doi.org/10.1145/2544137.2544157
[42]
Sun Microsystems, Inc. 2006. The Java HotSpot ™ Performance Engine Architecture. https://www.oracle.com/java/technologies/whitepaper.html
[43]
Jialiang Tan, Shuyin Jiao, Milind Chabbi, and Xu Liu. 2020. What Every Scientific Programmer Should Know about Compiler Optimizations? In Proceedings of the 34th ACM International Conference on Supercomputing (ICS ’20). Association for Computing Machinery, New York, NY, USA. Article 42, 12 pages. isbn:9781450379830 https://doi.org/10.1145/3392717.3392754
[44]
Jubi Taneja, Zhengyang Liu, and John Regehr. 2020. Testing Static Analyses for Precision and Soundness. In Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization (CGO 2020). Association for Computing Machinery, New York, NY, USA. 81–93. isbn:9781450370479 https://doi.org/10.1145/3368826.3377927
[45]
Theodoros Theodoridis, Manuel Rigger, and Zhendong Su. 2022. Finding Missed Optimizations through the Lens of Dead Code Elimination. ASPLOS ’22. Association for Computing Machinery, New York, NY, USA. 697–709. isbn:9781450392051 https://doi.org/10.1145/3503222.3507764
[46]
Igor Veresov. 2013. https://www.slideshare.net/maddocig/tiered
[47]
April W. Wade, Prasad A. Kulkarni, and Michael R. Jantz. 2020. Exploring Impact of Profile Data on Code Quality in the HotSpot JVM. ACM Transactions on Embedded Computing Systems, 19, 6 (2020), Article 48, 10, 26 pages. issn:1539-9087 https://doi.org/10.1145/3391894
[48]
Shasha Wen, Xu Liu, and Milind Chabbi. 2015. Runtime Value Numbering: A Profiling Technique to Pinpoint Redundant Computations. In 2015 International Conference on Parallel Architecture and Compilation (PACT). 254–265. https://doi.org/10.1109/PACT.2015.29
[49]
Thomas Würthinger. 2007. Visualization of Program Dependence Graphs. Johannes Kepler University Linz. https://ssw.jku.at/Research/Papers/Wuerthinger07Master/Wuerthinger07Master.pdf
[50]
Tingting Yu and Michael Pradel. 2018. Pinpointing and repairing performance bottlenecks in concurrent programs. Empirical Software Engineering, 23, 5 (2018), 01 10, 3034–3071. issn:1573-7616 https://doi.org/10.1007/s10664-017-9578-1

Cited By

View all
  • (2024)Online and Offline Compiler Performance Comparison and Optimization2024 International Conference on Intelligent Systems for Cybersecurity (ISCS)10.1109/ISCS61804.2024.10581084(1-6)Online publication date: 3-May-2024

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
MPLR 2023: Proceedings of the 20th ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes
October 2023
184 pages
ISBN:9798400703805
DOI:10.1145/3617651
  • General Chair:
  • Rodrigo Bruno,
  • Program Chair:
  • Eliot Moss
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 October 2023

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Compiler Optimizations
  2. Just-In-Time Compilation
  3. Tree Matching
  4. Virtual Machines

Qualifiers

  • Research-article

Conference

MPLR '23
Sponsor:

Upcoming Conference

ISSTA '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)93
  • Downloads (Last 6 weeks)6
Reflects downloads up to 20 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Online and Offline Compiler Performance Comparison and Optimization2024 International Conference on Intelligent Systems for Cybersecurity (ISCS)10.1109/ISCS61804.2024.10581084(1-6)Online publication date: 3-May-2024

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media