Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1346281.1346293acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article

Improving the performance of object-oriented languages with dynamic predication of indirect jumps

Published: 01 March 2008 Publication History

Abstract

Indirect jump instructions are used to implement increasingly-common programming constructs such as virtual function calls, switch-case statements, jump tables, and interface calls. The performance impact of indirect jumps is likely to increase because indirect jumps with multiple targets are difficult to predict even with specialized hardware.
This paper proposes a new way of handling hard-to-predict indirect jumps: dynamically predicating them. The compiler (static or dynamic) identifies indirect jumps that are suitable for predication along with their control-flow merge (CFM) points. The hardware predicates theinstructions between different targets of the jump and its CFM point if the jump turns out to be hard-to-predict at run time. If the jump would actually have been mispredicted, its dynamic predication eliminates a pipeline flush, thereby improving performance.
Our evaluations show that Dynamic Indirect jump Predication (DIP) improves the performance of a set of object-oriented applications including the Java DaCapo benchmark suite by 37.8% compared to a commonly-used branch target buffer based predictor, while also reducing energy consumption by 24.8%. We compare DIP to three previously proposed indirect jump predictors and find that it provides the best performance and energy-efficiency.

Supplementary Material

JPG File (1346293.jpg)
index.html (index.html)
Slides from the presentation
ZIP File (p80-joao-slides.zip)
Supplemental material for Improving the performance of object-oriented languages with dynamic predication of indirect jumps
Audio only (1346293.mp3)
Video (1346293.mp4)

References

[1]
J.R. Allen, K. Kennedy, C. Porterfield, and J. Warren. Conversion of control dependence to data dependence. In POPL-10, 1983.
[2]
B. Alpern, A. Cocchi, S. Fink, D. Grove, and D. Lieber. Efficient implementation of Java interfaces: Invokeinterface considered harmless. In OOPSLA, 2001.
[3]
S. Bhansali, W.-K. Chen, S.D. Jong, A. Edwards, and M. Drinic. Framework for instruction-level tracing and analysis of programs. In VEE, 2006.
[4]
N.L. Binkert, R.G. Dreslinski, L.R. Hsu, K.T. Lim, A.G. Saidi, and S.K. Reinhardt. The M5 simulator: Modeling networked systems. IEEE Micro, 26(4):52--60, 2006.
[5]
S.M. Blackburn et al. The DaCapo benchmarks: Java benchmarking development and analysis. In OOPSLA'06, 2006.
[6]
D. Brooks, V. Tiwari, and M. Martonosi. Wattch: a framework for architectural-level power analysis and optimizations. In ISCA-27, 2000.
[7]
B. Calder, D. Grunwald, and B. Zorn. Quantifying behavioral differences between C and C++ programs. Journal of Programming Languages, 2(4):323--351, 1995.
[8]
L. Cardelli and P. Wegner. On understanding types, data abstraction, and polymorphism. ACM Computing Surveys, 17(4):471--523, Dec. 1985.
[9]
P. Chang, E. Hao, and Y.N. Patt. Target prediction for indirect jumps. In ISCA, 1997.
[10]
R. Cytron et al. Efficiently computing static single assignment form and the control dependence graph. ACM TOPLAS, 13(4):451--490, Oct. 1991.
[11]
L.P. Deutsch and A.M. Schiffman. Efficient implementation of the Smalltalk-80 system. In POPL, 1984.
[12]
K. Driesen and U. Holzle. Accurate indirect branch prediction. In ISCA-25, 1998.
[13]
K. Driesen and U. Holzle. Multi-stage cascaded prediction. In Euro-Par, 1999.
[14]
M.A. Ertl and D. Gregg. Optimizing indirect branch prediction accuracy in virtual machine interpreters. In PLDI, 2003.
[15]
M. Farrens, T. Heil, J.E. Smith, and G. Tyson. Restricted dual path execution. Technical Report CSE-97-18, University of California at Davis, Nov. 1997.
[16]
S. Gochman, R. Ronen, I. Anati, A. Berkovits, T. Kurts, A. Naveh, A. Saeed, Z. Sperber, and R.C. Valentine. The Intel PentiumM processor: Microarchitecture and performance. Intel Technology Journal, 7(2), May 2003.
[17]
T. Heil and J.E. Smith. Selective dual path execution. Technical report, University of Wisconsin-Madison, Nov. 1996.
[18]
G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel. The microarchitecture of the Pentium 4 processor. Intel Technology Journal, Feb. 2001.
[19]
U. Holzle, C. Chambers, and D. Ungar. Optimizing dynamically-typed object-oriented languages with polymorphic inline caches. In ECOOP, 1991.
[20]
U. Holzle and D. Ungar. Optimizing dynamically-dispatched calls with run-time type feedback. In PLDI, 1994.
[21]
Intel Corp. ICC 9.1 for Linux. http://www.intel.com/cd/software/products/asmo-na/eng/compilers/284264.htm.
[22]
Intel Corp. Intel Core2 Duo Desktop Processor E6600. http://processorfinder.intel.com/details.aspx?sSpec=SL9ZL.
[23]
Intel Corp. Intel VTune Performance Analyzers. http://www.intel.com/vtune/.
[24]
K. Ishizaki, M. Kawahito, T. Yasue, H. Komatsu, and T. Nakatani. A study of devirtualization techniques for a java just-in-time compiler. In OOPSLA-15, 2000.
[25]
E. Jacobsen, E. Rotenberg, and J.E. Smith. Assigning confidence to conditional branch predictions. In MICRO-29, 1996.
[26]
D. Jimenez and C. Lin. Dynamic branch prediction with perceptrons. In HPCA, 2001.
[27]
J.A. Joao, O. Mutlu, H. Kim, and Y.N. Patt. Dynamic predication of indirect jumps. IEEE Computer Architecture Letters, May 2007.
[28]
J. Kalamatianos and D.R. Kaeli. Predicting indirect branches via data compression. In MICRO-31.
[29]
R.E. Kessler. The Alpha 21264 microprocessor. IEEE Micro, 19(2):24--36, 1999.
[30]
H. Kim, J.A. Joao, O. Mutlu, C.J. Lee, Y.N. Patt, and R.S. Cohn. VPC Prediction: Reducing the cost of indirect branches via hardware-based dynamic devirtualization. In ISCA-34, 2007.
[31]
H. Kim, J.A. Joao, O. Mutlu, and Y.N. Patt. Diverge-merge processor (DMP): Dynamic predicated execution of complex control-flow graphs based on frequently executed paths. In MICRO-39, 2006.
[32]
H. Kim, J.A. Joao, O. Mutlu, and Y.N. Patt. Diverge-merge processor: Generalized and energy-efficient dynamic predication. IEEE Micro, 27(1):94--104, 2007.
[33]
H. Kim, J.A. Joao, O. Mutlu, and Y.N. Patt. Profile-assisted compiler support for dynamic predication in diverge-merge processors. In CGO-5, 2007.
[34]
A. Klauser, T. Austin, D. Grunwald, and B. Calder. Dynamic hammock predication for non-predicated instruction set architectures. In PACT, 1998.
[35]
A. Klauser, A. Paithankar, and D. Grunwald. Selective eager execution on the polypath architecture. In ISCA-25, 1998.
[36]
B.R. Rau, D.W.L. Yen, W. Yen, and R.A. Towle. The Cydra 5 departmental supercomputer. IEEE Computer, 22:12--35, Jan. 1989.
[37]
E.M. Riseman and C.C. Foster. The inhibition of potential parallelism by conditional jumps. IEEE Transactions on Computers, C-21(12):1405--1411, 1972.
[38]
A. Roth, A. Moshovos, and G.S. Sohi. Improving virtual function call target prediction via dependence-based pre-computation. In ICS-13, 1999.
[39]
A. Seznec and P. Michaud. A case for (partially) TAgged GEometric history length branch prediction. JILP, Feb. 2006.
[40]
E.H. Sussenguth. Instruction Control Sequence. U.S. Patent 3559183, Jan. 26, 1971.
[41]
D. Tarditi, July 2007. Personal communication.
[42]
J. Tendler, S. Dodson, S. Fields, H. Le, and B. Sinharoy. POWER4 system microarchitecture. IBM Technical White Paper, Oct. 2001.
[43]
P.H. Wang, H. Wang, R.M. Kling, K. Ramakrishnan, and J.P. Shen. Register renaming and scheduling for dynamic execution of predicated code. In HPCA-7, 2001.
[44]
M. Wolczko. Benchmarking Java with the Richards benchmark. http://research.sun.com/people/mario/java_benchmarking/richards/richards.html.

Cited By

View all
  • (2021)Characterizing Massively Parallel Polymorphism2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)10.1109/ISPASS51385.2021.00037(205-216)Online publication date: Mar-2021
  • (2019)JumpswitchesProceedings of the 2019 USENIX Conference on Usenix Annual Technical Conference10.5555/3358807.3358832(285-299)Online publication date: 10-Jul-2019
  • (2018)Cost effective speculation with the omnipredictorProceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques10.1145/3243176.3243208(1-13)Online publication date: 1-Nov-2018
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ASPLOS XIII: Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
March 2008
352 pages
ISBN:9781595939586
DOI:10.1145/1346281
  • cover image ACM SIGARCH Computer Architecture News
    ACM SIGARCH Computer Architecture News  Volume 36, Issue 1
    ASPLOS '08
    March 2008
    339 pages
    ISSN:0163-5964
    DOI:10.1145/1353534
    Issue’s Table of Contents
  • cover image ACM SIGOPS Operating Systems Review
    ACM SIGOPS Operating Systems Review  Volume 42, Issue 2
    ASPLOS '08
    March 2008
    339 pages
    ISSN:0163-5980
    DOI:10.1145/1353535
    Issue’s Table of Contents
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 43, Issue 3
    ASPLOS '08
    March 2008
    339 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1353536
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 March 2008

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. dynamic predication
  2. indirect jumps
  3. object-oriented languages
  4. predicated execution
  5. virtual functions

Qualifiers

  • Research-article

Conference

ASPLOS08

Acceptance Rates

ASPLOS XIII Paper Acceptance Rate 31 of 127 submissions, 24%;
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)23
  • Downloads (Last 6 weeks)1
Reflects downloads up to 25 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2021)Characterizing Massively Parallel Polymorphism2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)10.1109/ISPASS51385.2021.00037(205-216)Online publication date: Mar-2021
  • (2019)JumpswitchesProceedings of the 2019 USENIX Conference on Usenix Annual Technical Conference10.5555/3358807.3358832(285-299)Online publication date: 10-Jul-2019
  • (2018)Cost effective speculation with the omnipredictorProceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques10.1145/3243176.3243208(1-13)Online publication date: 1-Nov-2018
  • (2018)Quantitative Overhead Analysis for Python2018 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC.2018.8573512(36-47)Online publication date: Sep-2018
  • (2017)ShortCutACM SIGARCH Computer Architecture News10.1145/3140659.308023745:2(494-506)Online publication date: 24-Jun-2017
  • (2017)ShortCutProceedings of the 44th Annual International Symposium on Computer Architecture10.1145/3079856.3080237(494-506)Online publication date: 24-Jun-2017
  • (2016)YakProceedings of the 12th USENIX conference on Operating Systems Design and Implementation10.5555/3026877.3026905(349-365)Online publication date: 2-Nov-2016
  • (2015)Bungee jumpsProceedings of the 48th International Symposium on Microarchitecture10.1145/2830772.2830781(370-382)Online publication date: 5-Dec-2015
  • (2014)On-device objective-C application optimization framework for high-performance mobile processorsProceedings of the conference on Design, Automation & Test in Europe10.5555/2616606.2616711(1-6)Online publication date: 24-Mar-2014
  • (2014)A General Low-Cost Indirect Branch Prediction Using Target Address PointersJournal of Computer Science and Technology10.1007/s11390-014-1480-329:6(929-946)Online publication date: 17-Nov-2014
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media