Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Static specification inference using predicate mining

Published: 10 June 2007 Publication History

Abstract

The reliability and correctness of complex software systems can be significantly enhanced through well-defined specifications that dictate the use of various units of abstraction (e.g., modules, or procedures). Often times, however, specifications are either missing, imprecise, or simply too complex to encode within a signature, necessitating specification inference. The process of inferring specifications from complex software systems forms the focus of this paper. We describe a static inference mechanism for identifying the preconditions that must hold whenever a procedure is called. These preconditions may reflect both data flow properties (e.g., whenever p is called, variable x must be non-null) as well as control-flow properties (e.g., every call to p must bepreceded by a call to q). We derive these preconditions using a ninter-procedural path-sensitive dataflow analysis that gathers predicates at each program point. We apply mining techniques to these predicates to make specification inference robust to errors. This technique also allows us to derive higher-level specifications that abstract structural similarities among predicates (e.g., procedure p is called immediately after a conditional test that checks whether some variable v is non-null.) We describe an implementation of these techniques, and validate the effectiveness of the approach on a number of large open-source benchmarks. Experimental results confirm that our mining algorithms are efficient, and that the specifications derived are both precise and useful-the implementation discovers several critical, yet previously, undocumented preconditions for well-tested libraries.

References

[1]
R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 20th Int. Conf. Very Large Data Bases, VLDB, pages 487--499, 1994.
[2]
R. Agrawal and R. Srikant. Mining sequential patterns. In Eleventh International Conference on Data Engineering, pages 3--14, 1995.
[3]
R. Alur, P. Cerny, P. Madhusudan, and W. Nam. Synthesis of interface specidications for java classes. In POPL '05: Proceedings of the 32nd ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages, pages 98--109, 2005.
[4]
G. Ammons, R. Bodik, and J. Larus. Mining specifications. In POPL'02: Proceedings of the 29th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages, pages 4--16, 2002.
[5]
G. Ammons, D. Mandelin, R. Bodik, and J. Larus. Debugging temporal specifications with concept analysis. In PLDI '03: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, pages 182--195, 2003.
[6]
P. Anderson, T. Reps, and T. Teitelbaum. Design and implementation of a fine-grained software inspection tool. IEEE Trans. on Software Engineering, 29(8):721--733, August 2003.
[7]
T. Ball and S.K. Rajamani. Automatically validating temporal safety properties of interfaces. In SPIN 2001, Workshop on Model Checking of Software, LNCS 2057, pages 103--122, May 2001.
[8]
B. Blanchet. A computationally sound mechanized prover for security protocols. In IEEE Symposium on Security and Privacy, pages 140--154, 2006.
[9]
D. Burdick, M. Calimlim, J. Flannick, J. Gehrke, and T. Yiu. Mafia: A performance study of mining maximal frequent itemsets. In Workshop on Frequent Itemset Mining Implementations (FIMI'03), 2003.
[10]
M. Castro, M. Costa, and T. Harris. Securing software by enforcing data-flow integrity. In OSDI '06: Proceedings of the 7th Usenix Symposium on Operating Systems Design and Implementation, pages 147--160, 2006.
[11]
B. Chin, S. Markstrum, and T. Millstein. Semantic type qualifiers. In PLDI '05: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 85--95, 2005.
[12]
C. Cortes, K. Fisher, D. Pregibon, and A. Rogers. Hancock: a language for extracting signatures from data streams. In KDD '00: Proceedings of the sixth ACM SIGKDD International Conference on Knowledge Discovery and Data mining, pages 9--17, 2000.
[13]
D. Engler, D. Chen, and A. Chou. Bugs as inconsistent behavior: A general approach to inferring errors in systems code. In SOSP'01: Proceedings of the 18th ACM Symposium on Operating Systems Principles, pages 57--72, 2001.
[14]
M. Ernst, J. Cockrell, W. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE TSE, 27(2):1--25, February 2001.
[15]
J. Fischer, R. Jhala, and R. Majumdar. Joining dataflow with predicates. In ESEC--FSE '05: 10th European Software Engineering Conference and 13th ACM SIGSOFT international symposium on Foundations of Software Engineering, pages 227--236, 2005.
[16]
J. Foster, T. Terauchi, and A. Aiken. Flow-sensitive type qualifiers. In PLDI '02: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 1--12, 2002.
[17]
M. Furr and J. Foster. Checking type safety of foreign function calls. In PLDI '05: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 62--72, 2005.
[18]
P. Godefroid, N. Klarslund, and K. Sen. Dart: Directed automated random testing. In PLDI '05: Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 213--223, Chicago, Il, 2005.
[19]
D. Gunopulos, R. Khardon, H. Mannila, S. Saluja, H. Toivonen, and R. Sharma. Discovering all most specific sentences. ACM Transactions on Database Systems, 28:140--174, 2003.
[20]
T. Henzinger, R. Jhala, and R. Majumdar. Permissive interfaces. SIGSOFT Softw. Eng. Notes, 30(5):31--40, 2005.
[21]
G. J. Holzmann. The SPIN Model Checker: Primer and Reference Manual. Addison-Wesley, 2004.
[22]
R. Jhala and R. Majumdar. Path slicing. In PLDI '05: Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 38--47, 2005.
[23]
T. Kremenek, P. Twohey, G. Back, A. Ng, and D. Engler. From uncertainty to belief: Inferring the specification within. In OSDI'06: Proceedings of the 7th Usenix Symposium on Operating Systems Design and Implementation, pages 161--176, 2006.
[24]
P. Lam, V. Kuncak, and M. Rinard. Generalized typestate checking for data structure consistency. In VMCAI '05: Proceedings of 6th International Conference on Verification, Model Checking and Abstract Interpretation, pages 430--447, 2005.
[25]
Z. Li, S. Lu, S. Myagmar, and Y. Zhou. Cp-miner: A tool for finding copy-paste and related bugs in operating system code. In OSDI '04: Proceedings of the 6th Usenix Symposium on Operating Systems Design and Implementation, pages 289--302, 2004.
[26]
Z. Li and Y. Zhou. Pr-miner: Automatically extracting implicit programming rules and detecting violations in large software code. In ESEC-FSE '05: 10th European Software Engineering Conference and 13th ACM SIGSOFT international symposium on Foundations of Software Engineering, pages 306--315, 2005.
[27]
B. Liblit, M. Naik, A. Zheng, A. Aiken, and M. Jordan. Scalable statistical bug isolation. In PLDI '05: Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 15--26, Chicago, Illinois, 2005.
[28]
B. Livshits and T. Zimmermann. Dynamine: Finding common error patterns by mining software revision histories. In ESEC-FSE '05: 10th European Software Engineering Conference and 13th ACM SIGSOFT international symposium on Foundations of Software Engineering, pages 296--305, 2005.
[29]
D. Mandelin, L. Xu, R. Bodik, and D. Kimelman. Jungloid mining: helping to navigate the api jungle. In PLDI '05: Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 48--61, 2005.
[30]
J. Palsberg. Closure analysis in constraint form. ACM Trans. Program. Lang. Syst., 17(1):47--62, 1995.
[31]
M. K. Ramanathan, A. Grama, and S. Jagannathan. Path-sensitive inference of function precedence protocols. In ICSE '07: Proceedings of the 29th International Conference on Software Engineering, 2007.
[32]
M. Rinard, C. Cadar, D. Dumitran, D. M. Roy, T. Leu, and W.S. Beebee. Enhancing server availability and security through failureoblivious computing. In OSDI '04: Proceedings of the 6th Usenix Symposium on Operating Systems Design and Implementation, 2004.
[33]
O. Shivers. Control-Flow Analysis of Higher-Order Languages. PhD thesis, Carnegie Mellon University, May 1991.
[34]
W. Weimer and G. Necula. Mining temporal specifications for error detection. In TACAS '05: Eleventh International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 461--476, April 2005.
[35]
J. Whaley, M. Martin, and M. Lam. Automatic extraction of objectoriented component interfaces. In Proceedings of the International Symposium of Software Testing and Analysis, ISSTA, 2002.
[36]
Y. Xie and A. Aiken. Scalable error detection using boolean satisfiability. In POPL '05: Proceedings of the 32nd ACM Symposium on Principles of Programming Languages, pages 351--363, 2005.
[37]
G. Yang. The complexity of mining maximal frequent itemsets and maximal frequent patterns. In KDD '04: Proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data mining, pages 344--353, 2004.
[38]
J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M. Das. Perracotta: Mining temporal api rules from imperfect traces. In ICSE '06: Proceedings of the International Conference on Software Engineering, 2006.
[39]
J. Yang, P. Twohey, D. Engler, and M. Musuvathi. Using model checking to find serious file system errors. In OSDI '04: Proceedings of the 6th Usenix Symposium on Operating Systems Design and Implementation, pages 273--288, San Francisco, CA, 2004.

Cited By

View all
  • (2024)Discovering API usage specifications for security detection using two-stage code miningCybersecurity10.1186/s42400-024-00224-w7:1Online publication date: 3-Oct-2024
  • (2024)Boosting API Misuse Detection via Integrating API Constraints from Multiple SourcesProceedings of the 21st International Conference on Mining Software Repositories10.1145/3643991.3644904(14-26)Online publication date: 15-Apr-2024
  • (2024)Enchanting Program Specification Synthesis by Large Language Models Using Static Analysis and Program VerificationComputer Aided Verification10.1007/978-3-031-65630-9_16(302-328)Online publication date: 24-Jul-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices
ACM SIGPLAN Notices  Volume 42, Issue 6
Proceedings of the 2007 PLDI conference
June 2007
491 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1273442
Issue’s Table of Contents
  • cover image ACM Conferences
    PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2007
    508 pages
    ISBN:9781595936332
    DOI:10.1145/1250734
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 June 2007
Published in SIGPLAN Volume 42, Issue 6

Check for updates

Author Tags

  1. preconditions
  2. predicate mining
  3. program analysis
  4. specification inference

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)42
  • Downloads (Last 6 weeks)12
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Discovering API usage specifications for security detection using two-stage code miningCybersecurity10.1186/s42400-024-00224-w7:1Online publication date: 3-Oct-2024
  • (2024)Boosting API Misuse Detection via Integrating API Constraints from Multiple SourcesProceedings of the 21st International Conference on Mining Software Repositories10.1145/3643991.3644904(14-26)Online publication date: 15-Apr-2024
  • (2024)Enchanting Program Specification Synthesis by Large Language Models Using Static Analysis and Program VerificationComputer Aided Verification10.1007/978-3-031-65630-9_16(302-328)Online publication date: 24-Jul-2024
  • (2023)Python API Misuse Mining and Classification Based on Hybrid Analysis and Attention MechanismInternational Journal of Software Engineering and Knowledge Engineering10.1142/S021819402350043233:10(1567-1597)Online publication date: 7-Aug-2023
  • (2022)API Misuse Detection Method Based on Transformer2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)10.1109/QRS57517.2022.00100(958-969)Online publication date: Dec-2022
  • (2021)A Large-scale Study on API Misuses in the Wild2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST)10.1109/ICST49551.2021.00034(241-252)Online publication date: Apr-2021
  • (2021)Identifying change patterns of API misuses from code changesScience China Information Sciences10.1007/s11432-019-2745-564:3Online publication date: 7-Feb-2021
  • (2021)Unbounded Procedure Summaries from Bounded EnvironmentsVerification, Model Checking, and Abstract Interpretation10.1007/978-3-030-67067-2_14(291-324)Online publication date: 17-Jan-2021
  • (2020)API-misuse detection driven by fine-grained API-constraint knowledge graphProceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering10.1145/3324884.3416551(461-472)Online publication date: 21-Dec-2020
  • (2020)Out of Sight, Out of Place: Detecting and Assessing Swapped Arguments2020 IEEE 20th International Working Conference on Source Code Analysis and Manipulation (SCAM)10.1109/SCAM51674.2020.00031(227-237)Online publication date: Sep-2020
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media