Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Open access

Data-driven context-sensitivity for points-to analysis

Published: 12 October 2017 Publication History

Abstract

We present a new data-driven approach to achieve highly cost-effective context-sensitive points-to analysis for Java. While context-sensitivity has greater impact on the analysis precision and performance than any other precision-improving techniques, it is difficult to accurately identify the methods that would benefit the most from context-sensitivity and decide how much context-sensitivity should be used for them. Manually designing such rules is a nontrivial and laborious task that often delivers suboptimal results in practice. To overcome these challenges, we propose an automated and data-driven approach that learns to effectively apply context-sensitivity from codebases. In our approach, points-to analysis is equipped with a parameterized and heuristic rules, in disjunctive form of properties on program elements, that decide when and how much to apply context-sensitivity. We present a greedy algorithm that efficiently learns the parameter of the heuristic rules. We implemented our approach in the Doop framework and evaluated using three types of context-sensitive analyses: conventional object-sensitivity, selective hybrid object-sensitivity, and type-sensitivity. In all cases, experimental results show that our approach significantly outperforms existing techniques.

References

[1]
Ole Agesen. 1994. Constraint-based type inference and parametric polymorphism. Springer Berlin Heidelberg, Berlin, Heidelberg, 78–100.
[2]
Stephen M. Blackburn, Robin Garner, Chris Hoffmann, Asjad M. Khang, Kathryn S. McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z. Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, J. Eliot B. Moss, Aashish Phansalkar, Darko Stefanović, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. 2006. The DaCapo Benchmarks: Java Benchmarking Development and Analysis. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-oriented Programming Systems, Languages, and Applications (OOPSLA ’06). ACM, New York, NY, USA, 169–190.
[3]
Martin Bravenboer and Yannis Smaragdakis. 2009. Strictly Declarative Specification of Sophisticated Points-to Analyses. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA ’09). ACM, New York, NY, USA, 243–262.
[4]
Sooyoung Cha, Sehun Jeong, and Hakjoo Oh. 2016. Learning a Strategy for Choosing Widening Thresholds from a Large Codebase. Springer International Publishing, Cham, 25–41.
[5]
Kwonsoo Chae, Hakjoo Oh, Kihong Heo, and Hongseok Yang. 2017. Automatically Generating Features for Learning Program Analysis Heuristics. Proceedings of the ACM on Programming Languages 1, OOPSLA (2017).
[6]
Ramkrishna Chatterjee, Barbara G. Ryder, and William A. Landi. 1999. Relevant Context Inference. In Proceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’99). ACM, New York, NY, USA, 133–146.
[7]
David Grove, Greg DeFouw, Jeffrey Dean, and Craig Chambers. 1997. Call Graph Construction in Object-oriented Languages. In Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA ’97). ACM, New York, NY, USA, 108–124.
[8]
Samuel Z. Guyer and Calvin Lin. 2003. Client-driven Pointer Analysis. In Proceedings of the 10th International Conference on Static Analysis (SAS’03). Springer-Verlag, Berlin, Heidelberg, 214–236. http://dl.acm.org/citation.cfm?id=1760267.1760284
[9]
Nevin Heintze and Olivier Tardieu. 2001. Demand-driven Pointer Analysis. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation (PLDI ’01). ACM, New York, NY, USA, 24–34.
[10]
Kihong Heo, Hakjoo Oh, and Hongseok Yang. 2016. Learning a Variable-Clustering Strategy for Octagon from Labeled Data Generated by a Static Analysis. Springer Berlin Heidelberg, Berlin, Heidelberg, 237–256.
[11]
Kihong Heo, Hakjoo Oh, and Kwangkeun Yi. 2017. Machine-Learning-Guided Selectively Unsound Static Analysis. In Proceedings of the 39th International Conference on Software Engineering. ACM.
[12]
Michael Hind. 2001. Pointer Analysis: Haven’t We Solved This Problem Yet?. In Proceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE ’01). ACM, New York, NY, USA, 54–61.
[13]
George Kastrinis and Yannis Smaragdakis. 2013a. Efficient and Effective Handling of Exceptions in Java Points-to Analysis. In Proceedings of the 22Nd International Conference on Compiler Construction (CC’13). Springer-Verlag, Berlin, Heidelberg, 41–60.
[14]
George Kastrinis and Yannis Smaragdakis. 2013b. Hybrid Context-sensitivity for Points-to Analysis. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’13). ACM, New York, NY, USA, 423–434.
[15]
George Kastrinis and Yannis Smaragdakis. 2013c. Hybrid Context-sensitivity for Points-to Analysis. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’13). ACM, New York, NY, USA, 423–434.
[16]
Ondřej Lhoták and Laurie Hendren. 2006. Context-Sensitive Points-to Analysis: Is It Worth It?. In Proceedings of the 15th International Conference on Compiler Construction (CC’06). Springer-Verlag, Berlin, Heidelberg, 47–64.
[17]
Ondřej Lhoták and Laurie Hendren. 2008. Evaluating the Benefits of Context-sensitive Points-to Analysis Using a BDD-based Implementation. ACM Trans. Softw. Eng. Methodol. 18, 1, Article 3 (Oct. 2008), 53 pages.
[18]
Donglin Liang and Mary Jean Harrold. 1999. Efficient Points-to Analysis for Whole-program Analysis. In Proceedings of the 7th European Software Engineering Conference Held Jointly with the 7th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE-7). Springer-Verlag, London, UK, UK, 199–215. http://dl.acm.org/citation. cfm?id=318773.318943
[19]
Donglin Liang, Maikel Pennings, and Mary Jean Harrold. 2005. Evaluating the Impact of Context-sensitivity on Andersen’s Algorithm for Java Programs. In Proceedings of the 6th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE ’05). ACM, New York, NY, USA, 6–12.
[20]
Percy Liang, Omer Tripp, and Mayur Naik. 2011. Learning Minimal Abstractions. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA, 31–42.
[21]
Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2005. Parameterized Object Sensitivity for Points-to Analysis for Java. ACM Trans. Softw. Eng. Methodol. 14, 1 (Jan. 2005), 1–41.
[22]
Hakjoo Oh, Wonchan Lee, Kihong Heo, Hongseok Yang, and Kwangkeun Yi. 2014. Selective Context-sensitivity Guided by Impact Pre-analysis. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 475–484.
[23]
Hakjoo Oh, Hongseok Yang, and Kwangkeun Yi. 2015. Learning a Strategy for Adapting a Program Analysis via Bayesian Optimisation. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2015). ACM, New York, NY, USA, 572–588.
[24]
Erik Ruf. 1995. Context-insensitive Alias Analysis Reconsidered. In Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation (PLDI ’95). ACM, New York, NY, USA, 13–22.
[25]
Erik Ruf. 2000. Effective Synchronization Removal for Java. In Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation (PLDI ’00). ACM, New York, NY, USA, 208–218.
[26]
Micha Sharir and Amir Pnueli. 1981. Two approaches to interprocedural data flow analysis. Prentice-Hall, Englewood Cliffs, NJ, Chapter 7, 189–234.
[27]
Yannis Smaragdakis and George Balatsouras. 2015. Pointer Analysis. Found. Trends Program. Lang. 2, 1 (April 2015), 1–69.
[28]
Yannis Smaragdakis, Martin Bravenboer, and Ondrej Lhoták. 2011. Pick Your Contexts Well: Understanding Object-sensitivity. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA, 17–30.
[29]
Yannis Smaragdakis, George Kastrinis, and George Balatsouras. 2014. Introspective Analysis: Context-sensitivity, Across the Board. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 485–495.
[30]
Manu Sridharan and Rastislav Bodík. 2006. Refinement-based Context-sensitive Points-to Analysis for Java. In Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’06). ACM, New York, NY, USA, 387–400.
[31]
Manu Sridharan, Denis Gopan, Lexin Shan, and Rastislav Bodík. 2005. Demand-driven Points-to Analysis for Java. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA ’05). ACM, New York, NY, USA, 59–76.
[32]
Tian Tan, Yue Li, and Jingling Xue. 2016. Making k-Object-Sensitive Pointer Analysis More Precise with Still k-Limiting. In Static Analysis - 23rd International Symposium, SAS 2016, Edinburgh, UK, September 8-10, 2016, Proceedings. 489–510.
[33]
Omer Tripp, Marco Pistoia, Stephen J. Fink, Manu Sridharan, and Omri Weisman. 2009. TAJ: Effective Taint Analysis of Web Applications. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’09). ACM, New York, NY, USA, 87–97.
[34]
Raja Vallée-Rai, Phong Co, Etienne Gagnon, Laurie Hendren, Patrick Lam, and Vijay Sundaresan. 1999. Soot - a Java Bytecode Optimization Framework. In Proceedings of the 1999 Conference of the Centre for Advanced Studies on Collaborative Research (CASCON ’99). IBM Press, 13–. http://dl.acm.org/citation.cfm?id=781995.782008
[35]
Robert P. Wilson and Monica S. Lam. 1995. Efficient Context-sensitive Pointer Analysis for C Programs. In Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation (PLDI ’95). ACM, New York, NY, USA, 1–12.
[36]
Xin Zhang, Ravi Mangal, Radu Grigore, Mayur Naik, and Hongseok Yang. 2014. On Abstraction Refinement for Program Analyses in Datalog. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 239–248.

Cited By

View all
  • (2024)Towards Effective Static Type-Error Detection for PythonProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695545(1808-1820)Online publication date: 27-Oct-2024
  • (2024)Scaling Abstraction Refinement for Program Analyses in Datalog using Graph Neural NetworksProceedings of the ACM on Programming Languages10.1145/36897658:OOPSLA2(1532-1560)Online publication date: 8-Oct-2024
  • (2024)When to Stop Going Down the Rabbit Hole: Taming Context-Sensitivity on the FlyProceedings of the 13th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis10.1145/3652588.3663321(35-44)Online publication date: 20-Jun-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 1, Issue OOPSLA
October 2017
1786 pages
EISSN:2475-1421
DOI:10.1145/3152284
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 October 2017
Published in PACMPL Volume 1, Issue OOPSLA

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. Context-sensitivity
  2. Data-driven program analysis
  3. Points-to analysis

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)143
  • Downloads (Last 6 weeks)24
Reflects downloads up to 24 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Towards Effective Static Type-Error Detection for PythonProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695545(1808-1820)Online publication date: 27-Oct-2024
  • (2024)Scaling Abstraction Refinement for Program Analyses in Datalog using Graph Neural NetworksProceedings of the ACM on Programming Languages10.1145/36897658:OOPSLA2(1532-1560)Online publication date: 8-Oct-2024
  • (2024)When to Stop Going Down the Rabbit Hole: Taming Context-Sensitivity on the FlyProceedings of the 13th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis10.1145/3652588.3663321(35-44)Online publication date: 20-Jun-2024
  • (2024)Learning Abstraction Selection for Bayesian Program AnalysisProceedings of the ACM on Programming Languages10.1145/36498458:OOPSLA1(954-982)Online publication date: 29-Apr-2024
  • (2024)Generic Sensitivity: Generics-Guided Context Sensitivity for Pointer AnalysisIEEE Transactions on Software Engineering10.1109/TSE.2024.337764550:5(1144-1162)Online publication date: May-2024
  • (2023)A Cocktail Approach to Practical Call Graph ConstructionProceedings of the ACM on Programming Languages10.1145/36228337:OOPSLA2(1001-1033)Online publication date: 16-Oct-2023
  • (2023)A Container-Usage-Pattern-Based Context Debloating Approach for Object-Sensitive Pointer AnalysisProceedings of the ACM on Programming Languages10.1145/36228327:OOPSLA2(971-1000)Online publication date: 16-Oct-2023
  • (2023)ORAQL — Optimistic Responses to Alias Queries in LLVMProceedings of the 52nd International Conference on Parallel Processing10.1145/3605573.3605644(655-664)Online publication date: 7-Aug-2023
  • (2023)Beware of the Unexpected: Bimodal Taint AnalysisProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3597926.3598050(211-222)Online publication date: 12-Jul-2023
  • (2023)Hybrid Inlining: A Framework for Compositional and Context-Sensitive Static AnalysisProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3597926.3598042(114-126)Online publication date: 12-Jul-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media