Nothing Special   »   [go: up one dir, main page]

skip to main content
10.5555/2818754.2818827acmconferencesArticle/Chapter ViewAbstractPublication PagesicseConference Proceedingsconference-collections
research-article

Database-backed program analysis for scalable error propagation

Published: 16 May 2015 Publication History

Abstract

Software is rapidly increasing in size and complexity. Static analyses must be designed to scale well if they are to be usable with realistic applications, but prior efforts have often been limited by available memory. We propose a database-backed strategy for large program analysis based on graph algorithms, using a Semantic Web database to manage representations of the program under analysis. Our approach is applicable to a variety of interprocedural finite distributive subset (IFDS) dataflow problems; we focus on error propagation as a motivating example. Our implementation analyzes multi-million-line programs quickly and in just a fraction of the memory required by prior approaches. When memory alone is insufficient, our approach falls back on disk using several hybrid configurations tuned to put all available resources to good use.

References

[1]
B. Hardekopf and C. Lin, "The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code," in PLDI, J. Ferrante and K. S. McKinley, Eds. ACM, 2007, pp. 290--299.
[2]
I. Dillig, T. Dillig, and A. Aiken, "Sound, complete and scalable path-sensitive analysis," in PLDI, R. Gupta and S. P. Amarasinghe, Eds. ACM, 2008, pp. 270--280.
[3]
K. Chen and D. Wagner, "Large-scale analysis of format string vulnerabilities in Debian Linux," in PLAS, M. W. Hicks, Ed. ACM, 2007, pp. 75--84.
[4]
H. Oh, K. Heo, W. Lee, W. Lee, and K. Yi, "Design and implementation of sparse global analyses for C-like languages," in PLDI, J. Vitek, H. Lin, and F. Tip, Eds. ACM, 2012, pp. 229--238.
[5]
L. Li, C. Cifuentes, and N. Keynes, "Boosting the performance of flow-sensitive points-to analysis using value flow," in SIGSOFT FSE, T. Gyimóthy and A. Zeller, Eds. ACM, 2011, pp. 343--353.
[6]
C. Cifuentes, N. Keynes, L. Li, N. Hawes, M. Valdiviezo, A. Browne, J. Zimmermann, A. Craik, D. Teoh, and C. Hoermann, "Static deep error checking in large system applications using Parfait," in SIGSOFT FSE, T. Gyimóthy and A. Zeller, Eds. ACM, 2011, pp. 432--435.
[7]
B. Hardekopf and C. Lin, "Semi-sparse flow-sensitive pointer analysis," in POPL, Z. Shao and B. C. Pierce, Eds. ACM, 2009, pp. 226--238.
[8]
O. Lhoták and K.-C. A. Chung, "Points-to analysis with efficient strong updates," in POPL, T. Ball and M. Sagiv, Eds. ACM, 2011, pp. 3--16.
[9]
H. Yu, J. Xue, W. Huo, X. Feng, and Z. Zhang, "Level by level: making flow- and context-sensitive pointer analysis scalable for millions of lines of code," in CGO, A. Moshovos, J. G. Steffan, K. M. Hazelwood, and D. R. Kaeli, Eds. ACM, 2010, pp. 218--229.
[10]
B. Hardekopf and C. Lin, "Flow-sensitive pointer analysis for millions of lines of code," in CGO. IEEE, 2011, pp. 289--298.
[11]
M. Bravenboer and Y. Smaragdakis, "Strictly declarative specification of sophisticated points-to analyses," in OOPSLA, S. Arora and G. T. Leavens, Eds. ACM, 2009, pp. 243--262.
[12]
E. Hajiyev, M. Verbaere, O. de Moor, and K. D. Voider, "CodeQuest: Querying source code with DataLog," in OOPSLA Companion, R. E. Johnson and R. P. Gabriel, Eds. ACM, 2005, pp. 102--103.
[13]
E. Hajiyev, M. Verbaere, and O. de Moor, "CodeQuest: Scalable source code queries with Datalog," in ECOOP, ser. Lecture Notes in Computer Science, D. Thomas, Ed., vol. 4067. Springer, 2006, pp. 2--27.
[14]
Y. Smaragdakis, M. Bravenboer, and O. Lhoták, "Pick your contexts well: understanding object-sensitivity," in POPL, T. Ball and M. Sagiv, Eds. ACM, 2011, pp. 17--30.
[15]
H. S. Gunawi, C. Rubio-González, A. C. Arpaci-Dusseau, R. H. Arpaci-Dusseau, and B. Liblit, "EIO: Error handling is occasionally correct," in FAST, M. Baker and E. Riedel, Eds. USENIX, 2008, pp. 207--222.
[16]
C. Rubio-González, H. S. Gunawi, B. Liblit, R. H. Arpaci-Dusseau, and A. C. Arpaci-Dusseau, "Error propagation analysis for file systems," in PLDI, M. Hind and A. Diwan, Eds. ACM, 2009, pp. 270--280.
[17]
F. Cristian, "Exception handling," in Dependability of Resilient Computers, 1989, pp. 68--97.
[18]
M. Lippert and C. V. Lopes, "A study on exception detection and handling using aspect-oriented programming," in ICSE, 2000, pp. 418--427.
[19]
R. P. L. Buse and W. Weimer, "Automatic documentation inference for exceptions," in ISSTA, B. G. Ryder and A. Zeller, Eds. ACM, 2008, pp. 273--282.
[20]
R. Miller and A. Tripathi, "Issues with exception handling in object-oriented systems," in In Object-Oriented Programming, 11th European Conference (ECOOP). Springer-Verlag, 1997, pp. 85--103.
[21]
M. P. Robillard and G. C. Murphy, "Regaining control of exception handling," University of British Columbia, Vancouver, BC, Canada, Tech. Rep., 1999.
[22]
GCC team, "GCC coding conventions," Sep. 2013. {Online}. Available: http://gcc.gnu.org/codingconventions.html
[23]
LLVM Project, "LLVM coding standards," Oct. 2013. {Online}. Available: http://llvm.org/docs/CodingStandards.html
[24]
P. Terdiman, "Exceptions: just say no," Mar. 2008. {Online}. Available: http://www.codercorner.com/blog/?p=33
[25]
B. Weinberger, C. Silverstein, G. Eitzmann, M. Mentovai, and T. Landray, "Google C++ style guide," Sep. 2013, revision 3.274. {Online}. Available: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
[26]
G. Candea, M. Delgado, M. Chen, and A. Fox, "Automatic failure-path inference: A generic introspection technique for Internet applications," in Proceedings of the The Third IEEE Workshop on Internet Applications (WIAPP '03). San Jose, California: IEEE, Jun. 2003, pp. 132--141.
[27]
C. A. Flanagan and M. Burrows, "System and method for dynamically detecting unchecked error condition values in computer programs," United States Patent #6,378,081 B1, Apr. 2002.
[28]
T. Goradia, "Dynamic impact analysis: A cost-effective technique to enforce error-propagation," in ISSTA, 1993, pp. 171--181.
[29]
M. Hiller, A. Jhumka, and N. Suri, "An approach for analysing the propagation of data errors in software," in DSN. IEEE Computer Society, 2001, pp. 161--172.
[30]
M. Hiller, A. Jhumka, and N. Suri, "Propane: an environment for examining the propagation of errors in software," in ISSTA, 2002, pp. 81--85.
[31]
M. Hiller, A. Jhumka, and N. Suri, "Epic: Profiling the propagation and effect of data errors in software," IEEE Trans. Computers, vol. 53, no. 5, pp. 512--530, 2004.
[32]
A. Jhumka, M. Hiller, and N. Suri, "Assessing inter-modular error propagation in distributed software," in SRDS. IEEE Computer Society, 2001, pp. 152--161.
[33]
A. Johansson and N. Suri, "Error propagation profiling of operating systems," in DSN. IEEE Computer Society, 2005, pp. 86--95.
[34]
K. G. Shin and T.-H. Lin, "Modeling and measurement of error propagation in a multimodule computing system," IEEE Trans. Computers, vol. 37, no. 9, pp. 1053--1066, 1988.
[35]
M. W. Bigrigg and J. J. Vos, "The set-check-use methodology for detecting error propagation failures in I/O routines," in Workshop on Dependability Benchmarking, Washington, DC, Jun. 2002.
[36]
C. Rubio-González and B. Liblit, "Expect the unexpected: error code mismatches between documentation and the real world," in PASTE, S. Lerner and A. Rountev, Eds. ACM, 2010, pp. 73--80.
[37]
C. Rubio-González and B. Liblit, "Defective error/pointer interactions in the Linux kernel," in ISSTA, M. B. Dwyer and F. Tip, Eds. ACM, 2011, pp. 111--121.
[38]
T. W. Reps, S. Horwitz, and S. Sagiv, "Precise interprocedural dataflow analysis via graph reachability," in POPL, R. K. Cytron and P. Lee, Eds. ACM Press, 1995, pp. 49--61.
[39]
C. Weiss, P. Karras, and A. Bernstein, "Hexastore: sextuple indexing for Semantic Web data management," PVLDB, vol. 1, no. 1, pp. 1008--1019, 2008.
[40]
C. Weiss and A. Bernstein, "On-disk storage techniques for Semantic Web data - Are B-trees always the optimal solution?" in The 5th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS2009), Oct. 2009, pp. 49--64.
[41]
G. C. Necula, S. McPeak, S. P. Rahul, and W. Weimer, "CIL: Intermediate language and tools for analysis and transformation of C programs," in CC, ser. Lecture Notes in Computer Science, R. N. Horspool, Ed., vol. 2304. Springer, 2002, pp. 213--228.
[42]
"clang: a C language family frontend for LLVM," Nov. 2013. {Online}. Available: http://clang.llvm.org/
[43]
C. Lattner and V. S. Adve, "LLVM: A compilation framework for lifelong program analysis & transformation," in CGO. IEEE Computer Society, 2004, pp. 75--88.
[44]
C. Rubio-González, "Finding error-propagation bugs in large software systems using static analysis," Ph.D. dissertation, University of Wisconsin--Madison, Aug. 2012.
[45]
J.-S. Möller, B. Webster, and M. Charlebois, "LLVMLinux," Feb. 2014. {Online}. Available: http://llvm.linuxfoundation.org/
[46]
N. Kidd, T. Reps, and A. Lal, "WALi: A C++ library for weighted pushdown systems," Aug. 2012. {Online}. Available: http://www.cs.wisc.edu/wpis/wpds/download.php
[47]
R. E. Bryant, "Binary decision diagrams and beyond: enabling technologies for formal verification," in ICCAD, R. L. Rudell, Ed. IEEE Computer Society, 1995, pp. 236--243.
[48]
J. Lind-Nielsen and H. Cohen, "BuDDy: A BDD package," Apr. 2013. {Online}. Available: http://sourceforge.net/projects/buddy
[49]
M. Stonebraker and A. Kumar, "Operating system support for data management," IEEE Database Eng. Bull, vol. 9, no. 3, pp. 43--50, 1986.
[50]
"Amazon best sellers: Best sellers in laptop computers," Jan. 2015. {Online}. Available: http://www.amazon.com/Best-Sellers-Electronics-Laptop-Computers/zgbs/electronics/565108/
[51]
W. R. Bush, J. D. Pincus, and D. J. Sielaff, "A static analyzer for finding dynamic programming errors," Softw., Pract. Exper., vol. 30, no. 7, pp. 775--802, 2000.
[52]
P. Cousot and R. Cousot, "Modular static program analysis," in CC, ser. Lecture Notes in Computer Science, R. N. Horspool, Ed., vol. 2304. Springer, 2002, pp. 159--178.
[53]
A. Aiken, S. Bugrara, I. Dillig, T. Dillig, B. Hackett, and P. Hawkins, "An overview of the Saturn project," in PASTE, M. Das and D. Grossman, Eds. ACM, 2007, pp. 43--48.
[54]
I. Dillig, T. Dillig, A. Aiken, and M. Sagiv, "Precise and compact modular procedure summaries for heap manipulating programs," in PLDI, M. W. Hall and D. A. Padua, Eds. ACM, 2011, pp. 567--577.
[55]
J.-D. Choi, R. Cytron, and J. Ferrante, "Automatic construction of sparse data flow evaluation graphs," in POPL, D. S. Wise, Ed. ACM Press, 1991, pp. 55--66.
[56]
G. Ramalingam, "On sparse evaluation representations," in SAS, ser. Lecture Notes in Computer Science, P. V. Hentenryck, Ed., vol. 1302. Springer, 1997, pp. 1--15.
[57]
H. Chen, D. Dean, and D. Wagner, "Model checking one million lines of C code," in NDSS. The Internet Society, 2004.
[58]
J. Whaley and M. S. Lam, "Cloning-based context-sensitive pointer alias analysis using binary decision diagrams," in PLDI, W. Pugh and C. Chambers, Eds. ACM, 2004, pp. 131--144.
[59]
M. S. Lam, J. Whaley, V. B. Livshits, M. C. Martin, D. Avots, M. Carbin, and C. Unkel, "Context-sensitive program analysis as database queries," in PODS, C. Li, Ed. ACM, 2005, pp. 1--12.
[60]
Y. Smaragdakis and M. Bravenboer, "Using Datalog for fast and easy program analysis," in Datalog, ser. Lecture Notes in Computer Science, O. de Moor, G. Gottlob, T. Furche, and A. J. Sellers, Eds., vol. 6702. Springer, 2010, pp. 245--251.
[61]
G. Kastrinis and Y. Smaragdakis, "Hybrid context-sensitivity for points-to analysis," in PLDI, H.-J. Boehm and C. Flanagan, Eds. ACM, 2013, pp. 423--434.
[62]
T. W. Reps, "Demand interprocedural program analysis using logic databases," in Workshop on Programming with Logic Databases (Book), ILPS, ser. The Kluwer International Series in Engineering and Computer Science 296, R. Ramakrishnan, Ed. Kluwer, 1993, pp. 163--196.
[63]
J. Whaley, D. Avots, M. Carbin, and M. S. Lam, "Using Datalog with binary decision diagrams for program analysis," in APLAS, ser. Lecture Notes in Computer Science, K. Yi, Ed., vol. 3780. Springer, 2005, pp. 97--118.
[64]
M. Eichberg, S. Kloppenburg, K. Klose, and M. Mezini, "Defining and continuous checking of structural program dependencies," in ICSE, W. Schäfer, M. B. Dwyer, and V. Gruhn, Eds. ACM, 2008, pp. 391--400.
[65]
T. Neumann and G. Weikum, "RDF-3X: a RISC-style engine for RDF," PVLDB, vol. 1, no. 1, pp. 647--659, 2008.
[66]
T. W. Reps, "Solving demand versions of interprocedural analysis problems," in CC, ser. Lecture Notes in Computer Science, P. Fritzson, Ed., vol. 786. Springer, 1994, pp. 389--403.
[67]
J. Rohmer, R. Lescoeur, and J.-M. Kerisit, "The Alexander method - a technique for the processing of recursive axioms in deductive databases," New Generation Comput., vol. 4, no. 3, pp. 273--285, 1986.
[68]
C. Beeri and R. Ramakrishnan, "On the power of magic," J. Log. Program., vol. 10, no. 3&4, pp. 255--299, 1991.
[69]
F. Bancilhon, D. Maier, Y. Sagiv, and J. D. Ullman, "Magic sets and other strange ways to implement logic programs," in PODS, A. Silberschatz, Ed. ACM, 1986, pp. 1--15.
[70]
H. Alani, L. Kagal, A. Fokoue, P. T. Groth, C. Biemann, J. X. Parreira, L. Aroyo, N. F. Noy, C. Welty, and K. Janowicz, Eds., The Semantic Web - ISWC 2013 - 12th International Semantic Web Conference, Sydney, NSW, Australia, October 21--25, 2013, Proceedings, ser. Lecture Notes in Computer Science, vol. 8218. Springer, 2013.
[71]
I. Keivanloo and J. Rilling, "Clone detection meets semantic web-based transitive closure computation," in Proc. RAISE. IEEE, 2012, pp. 12--16.
[72]
O. Erling, "Virtuoso, a hybrid RDBMS/graph column store," IEEE Data Eng. Bull., vol. 35, no. 1, pp. 3--8, 2012.
[73]
D. J. Abadi, A. Marcus, S. Madden, and K. J. Hollenbach, "Scalable semantic web data management using vertical partitioning," in VLDB, C. Koch, J. Gehrke, M. N. Garofalakis, D. Srivastava, K. Aberer, A. Deshpande, D. Florescu, C. Y. Chan, V. Ganti, C.-C. Kanne, W. Klas, and E. J. Neuhold, Eds. ACM, 2007, pp. 411--422.
[74]
J. Urbani, J. Maassen, and H. E. Bal, "Massive semantic web data compression with MapReduce," in HPDC, S. Hariri and K. Keahey, Eds. ACM, 2010, pp. 795--802.
[75]
J. Urbani, S. Kotoulas, J. Maassen, F. van Harmelen, and H. E. Bal, "OWL reasoning with WebPIE: Calculating the closure of 100 billion triples," in ESWC (1), ser. Lecture Notes in Computer Science, L. Aroyo, G. Antoniou, E. Hyvönen, A. ten Teije, H. Stuckenschmidt, L. Cabral, and T. Tudorache, Eds., vol. 6088. Springer, 2010, pp. 213--227.

Cited By

View all
  • (2023)DStream: A Streaming-Based Highly Parallel IFDS FrameworkProceedings of the 45th International Conference on Software Engineering10.1109/ICSE48619.2023.00208(2488-2500)Online publication date: 14-May-2023
  • (2021)Sustainable SolvingProceedings of the 43rd International Conference on Software Engineering10.1109/ICSE43902.2021.00102(1098-1110)Online publication date: 22-May-2021
  • (2017)Automatically detecting integrity violations in database-centric applicationsProceedings of the 25th International Conference on Program Comprehension10.1109/ICPC.2017.37(251-262)Online publication date: 20-May-2017

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ICSE '15: Proceedings of the 37th International Conference on Software Engineering - Volume 1
May 2015
999 pages
ISBN:9781479919345

Sponsors

Publisher

IEEE Press

Publication History

Published: 16 May 2015

Check for updates

Qualifiers

  • Research-article

Conference

ICSE '15
Sponsor:

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Upcoming Conference

ICSE 2025

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)3
  • Downloads (Last 6 weeks)0
Reflects downloads up to 21 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2023)DStream: A Streaming-Based Highly Parallel IFDS FrameworkProceedings of the 45th International Conference on Software Engineering10.1109/ICSE48619.2023.00208(2488-2500)Online publication date: 14-May-2023
  • (2021)Sustainable SolvingProceedings of the 43rd International Conference on Software Engineering10.1109/ICSE43902.2021.00102(1098-1110)Online publication date: 22-May-2021
  • (2017)Automatically detecting integrity violations in database-centric applicationsProceedings of the 25th International Conference on Program Comprehension10.1109/ICPC.2017.37(251-262)Online publication date: 20-May-2017

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media